You can use maxNumSegments to generate more than one segment. And instead of 
passing a list of segment names around, why not just loop over the entire 
segments directory and move finished segments to another directory?
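
A rough sketch of that loop, in Python and against a local directory rather
than HDFS (so the filesystem calls are stand-ins, not what an Oozie action
would run as-is). It assumes a segment counts as "finished" once it contains
a crawl_fetch subdirectory; the function names and the "done" directory are
made up for illustration.

```python
from pathlib import Path
import shutil

# Subdirectory the fetcher writes into a segment. Treating its presence as
# the "finished" signal is an assumption of this sketch.
FETCH_MARKER = "crawl_fetch"


def split_segments(segments_dir):
    """Return (unfetched, fetched) segment paths, each sorted by name."""
    unfetched, fetched = [], []
    for seg in sorted(Path(segments_dir).iterdir()):
        if not seg.is_dir():
            continue  # ignore stray files next to the segments
        if (seg / FETCH_MARKER).is_dir():
            fetched.append(seg)
        else:
            unfetched.append(seg)
    return unfetched, fetched


def move_finished(segments_dir, done_dir):
    """Move fetched segments to done_dir; return the segments left behind."""
    done = Path(done_dir)
    done.mkdir(parents=True, exist_ok=True)
    unfetched, fetched = split_segments(segments_dir)
    for seg in fetched:
        shutil.move(str(seg), str(done / seg.name))
    return unfetched
```

Whatever move_finished returns is the work list for the next fetch step, so
no segment names need to be handed from one action to the next.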

 
 
-----Original message-----
> From:Edoardo Causarano <[email protected]>
> Sent: Monday 22nd September 2014 15:25
> To: [email protected]
> Subject: Re: get generated segments from step / fetch all empty segments
> 
> Hi Meraj,
> 
> At the moment I’m not, but in the Generator job class the method “generate” 
> returns a list of Paths, so the possibility is there (somehow). For now I’m 
> concentrating on passing at least one segment name from one step to the 
> next; then I’ll see if and how I can get more.
> 
> 
> Best,
> Edoardo
>     
> 
> On 22 september 2014 at 14:50:03, Meraj A. Khan ([email protected]) wrote:
> 
> Hi Edoardo,  
> 
> How do you generate multiple segments during the generate phase?  
> On Sep 22, 2014 6:01 AM, "Edoardo Causarano" <[email protected]>  
> wrote:  
> 
> > Hi all,  
> >  
> > I’m building an Oozie workflow to schedule the generate, fetch, etc…  
> > workflow. Right now I'm trying to feed the list of generated segments into  
> > the following fetch stage.  
> >  
> > The “crawl” script assumes that the most recently added segment is  
> > un-fetched and does some hdfs shell scripting to determine its name and  
> > stuff this into a shell variable, but I’d like to avoid this and somehow  
> > feed the list of generated segments directly into the following step.  
> >  
> > I have the feeling that I could use the Oozie “capture data from action”  
> > option but I think that will require fiddling with the Generator class  
> > source; that’s ok but I’m a bit wary of adding custom code that may not be  
> > part of the core distribution. Has anyone already done something similar,  
> > preferably without touching the source? (e.g.  
> > http://qnalist.com/questions/2330221/nutch-oozie-and-elasticsearch but it  
> > now 404s on GitHub)  
> >  
> >  
> > Best,  
> > Edoardo  
> >  
> > --  
> > Edoardo Causarano  
> > Sent with Airmail  
> -- 
> Edoardo Causarano
> Sent with Airmail
