Hey Jeff,

One scenario I recently encountered was a job on about 300,000 files in 
HDFS.
The splitting alone took 21 minutes. So I thought that by the time the 
splitting has fully completed, a lot of splits could already have been 
processed… 

Thanks for your answer!
Johannes

> On 12 Mar 2015, at 10:51, Jianfeng (Jeff) Zhang <[email protected]> 
> wrote:
> 
> 
> Hi Johannes,
> 
> If the input initializer is not done, workers cannot be started.
> What's your scenario? Why do you want to start the workers before
> the splits are generated? Just to save launch time, or to let the workers
> do other work?
> 
> 
> Best Regards,
> Jeff Zhang
> 
> 
> 
> 
> 
> On 3/12/15, 5:38 PM, "Johannes Zillmann" <[email protected]> wrote:
> 
>> Hey guys,
>> 
>> Dumb question. With Tez, can I have an input initializer which doesn't
>> require creating every split before processing of the already created
>> splits starts?
>> That is, if I have a lot of splits and my splitting process takes a long
>> time, can the workers already start working while the splitting is still
>> in progress?
>> 
>> Johannes
> 
