We currently don't start a fragment until we have enough input for it. The fragment manager is responsible for watching how much incoming data has arrived; once there is enough to start working, the fragment is executed. When it is started, I believe it is added to the running fragments. Imagine an n-way merge: until we get at least one batch from each of the n inputs, we don't start the fragment.
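To make that concrete, here's a minimal sketch of the "wait until every input has at least one batch" behavior. This is not Drill's actual FragmentManager or WorkManager code; the names BufferingFragmentManager, batchArrived and startFragment are made up for illustration:

import java.util.HashSet;
import java.util.Set;

// Illustrative only -- hypothetical names, not Drill's real classes. It shows the
// idea of buffering until every incoming stream has delivered at least one batch.
public class BufferingFragmentManager {

  private final int expectedInputs;                          // e.g. n for an n-way merge
  private final Set<Integer> inputsWithData = new HashSet<>();
  private boolean started = false;

  public BufferingFragmentManager(int expectedInputs) {
    this.expectedInputs = expectedInputs;
  }

  // Called whenever a batch arrives from one of the incoming streams.
  public synchronized void batchArrived(int inputId) {
    inputsWithData.add(inputId);
    // Only once every input has produced at least one batch do we hand the
    // fragment to an executor (conceptually: add it to runningFragments).
    if (!started && inputsWithData.size() == expectedInputs) {
      started = true;
      startFragment();
    }
  }

  private void startFragment() {
    System.out.println("all " + expectedInputs + " inputs have data; starting fragment");
  }

  public static void main(String[] args) {
    BufferingFragmentManager mgr = new BufferingFragmentManager(3); // 3-way merge
    mgr.batchArrived(0);
    mgr.batchArrived(1);
    mgr.batchArrived(2);                                     // fragment starts here
  }
}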
Now, this made more sense when we didn't have the fast schema behavior. Now that we do, everything starts basically immediately anyway (except when the schema is indeterminate, as in the case of CONVERT_FROM(xx, 'JSON')).

On Wed, May 6, 2015 at 3:50 PM, Abdel Hakim Deneche <adene...@maprtech.com> wrote:

> Hi all,
>
> In WorkManager there are two methods that can be used to start a fragment
> executor: addFragmentRunner(FragmentExecutor) and
> addFragmentPendingRemote(FragmentManager).
>
> Both methods will run the fragment executor, but the second one will make
> sure the fragment executor is added to WorkManager.runningFragments.
>
> Why not add all fragments that are running in a drillbit to
> "runningFragments" and limit ourselves to the fragments started through
> addFragmentRunner()?
>
> Thanks!
>
> --
> Abdelhakim Deneche
> Software Engineer