Thanks, I was thinking about exactly the same thing the other day. The crawl script sometimes loses contact with the long-running fetch job and just keeps waiting for %complete updates even after the fetch job has completed, especially when the script is run in the background. This effectively terminates the script's execution.
I hope this can be avoided in the Oozie workflow.

On Sep 22, 2014 9:25 AM, "Edoardo Causarano" <[email protected]> wrote:

> Hi Meraj,
>
> at the moment I’m not, but in the Generator job class the method
> “generate” does return a list of Paths, therefore the possibility is there
> (somehow). For now I’m concentrating on passing at least 1 segment name
> from one step to the other, then I’ll see if and how I can get more.
>
> Best,
> Edoardo
>
> On 22 september 2014 at 14:50:03, Meraj A. Khan ([email protected]) wrote:
>
> Hi Edoardo,
>
> How do you generate the multiple segments at the time of generate phase?
>
> On Sep 22, 2014 6:01 AM, "Edoardo Causarano" <[email protected]> wrote:
>
> > Hi all,
> >
> > I’m building an Oozie workflow to schedule the generate, fetch, etc…
> > workflow. Right now I'm trying to feed the list of generated segments
> > into the following fetch stage.
> >
> > The “crawl” script assumes that the most recently added segment is
> > un-fetched and does some hdfs shell scripting to determine its name and
> > stuff this into a shell variable, but I’d like to avoid this and somehow
> > feed the list of generated segments directly into the following step.
> >
> > I have the feeling that I could use the Oozie “capture data from action”
> > option but I think that will require fiddling with the Generator class
> > source; that’s ok but I’m a bit wary of adding custom code that may not
> > be part of the core distribution. Has anyone already done something
> > similar, preferably without touching the source? (e.g.
> > http://qnalist.com/questions/2330221/nutch-oozie-and-elasticsearch but
> > it now 404s on GitHub)
> >
> > Best,
> > Edoardo
> >
> > --
> > Edoardo Causarano
> > Sent with Airmail
>
> --
> Edoardo Causarano
> Sent with Airmail
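
For what it's worth, here is a rough sketch of how the "capture data from action" idea quoted above might look without patching Generator itself. It assumes a small wrapper class run as an Oozie java action with `<capture-output/>` enabled; Oozie then reads a Java properties file whose location it passes in the `oozie.action.output.properties` system property. The class name `SegmentCapture`, the `segments` key, and taking the segment paths from the command line are all hypothetical choices for illustration. In a real wrapper the paths would come from whatever list `generate` returns.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical wrapper, not part of Nutch: publishes segment paths
// so that a later Oozie action can read them.
class SegmentCapture {

    // Hypothetical key name; any valid property name would do.
    static final String KEY = "segments";

    // Writes the segment paths as one comma-separated property value.
    static void writeSegments(List<String> segmentPaths, File outputFile)
            throws IOException {
        Properties props = new Properties();
        props.setProperty(KEY, String.join(",", segmentPaths));
        try (OutputStream out = new FileOutputStream(outputFile)) {
            props.store(out, null);
        }
    }

    public static void main(String[] args) throws IOException {
        // Set by Oozie for java actions that declare <capture-output/>.
        String outputPath = System.getProperty("oozie.action.output.properties");
        if (outputPath == null) {
            throw new IllegalStateException(
                "oozie.action.output.properties is not set; not running under Oozie?");
        }
        // Assumption: segment paths arrive as arguments. A real wrapper
        // would obtain them from the generate step instead.
        writeSegments(Arrays.asList(args), new File(outputPath));
    }
}
```

The following fetch action could then, in principle, pick the value up with something like `${wf:actionData('generate')['segments']}`, where `generate` is an assumed name for the action node. I have not tried this against a real Nutch workflow, so treat it as a starting point only.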

