Solal Pirelli created TEZ-3841:
----------------------------------

             Summary: Proposal: Simulator mode
                 Key: TEZ-3841
                 URL: https://issues.apache.org/jira/browse/TEZ-3841
             Project: Apache Tez
          Issue Type: New Feature
            Reporter: Solal Pirelli


Early work on a new feature proposal: a "simulator" mode in which vertices are 
not actually executed, but instead use a simplified "fake" processor (which is 
configurable, and by default does nothing) to let a developer see how certain 
workloads will be handled.

For instance, one might want to check what happens if a vertex has each of its 
1000 tasks send a bunch of events - does this scale? Or, what if a specific 
vertex fails 2% of the time - how does this impact overall graph execution? Are 
2 nodes with 10 containers per node enough, or should one invest in a third 
node?

My current implementation is pretty simple: mimic the "uber" stuff to add a new 
"fake" mode with a custom task scheduler and container launcher. It adds the 
following configuration values:
* Boolean to enable fake mode
* Number of nodes in fake mode
* Number of containers per mode in fake mode
* Class to run in fake mode - must inherit a new class `FakeProcessor`, with a 
single method `run` that takes the vertex name, task index and task attempt, 
and returns a list of events. Throwing an exception causes the task to fail.

I'm currently working on adding a "chaos monkey" kind of service which randomly 
kills tasks, pre-empts containers, etc., but would appreciate some feedback on 
what's already done first. :)

P.S.: I have zero experience with using JIRA or contributing to Apache 
projects; if there is a more formal procedure for suggesting a new feature, 
please point me to it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to