Thanks, I was thinking of exactly the same thing the other day, because the
crawl script sometimes loses contact with the long-running fetch job and
just keeps waiting for %complete updates even after the fetch job has
finished, especially when the script is run in the background. This
effectively brings the script's execution to a halt.

I hope this can be avoided in the Oozie workflow.
On Sep 22, 2014 9:25 AM, "Edoardo Causarano" <[email protected]>
wrote:

> Hi Meraj,
>
> At the moment I’m not, but the “generate” method in the Generator job
> class does return a list of Paths, so the possibility is there
> (somehow). For now I’m concentrating on passing at least 1 segment name
> from one step to the other, then I’ll see if and how I can get more.
>
>
> Best,
> Edoardo
>
>
> On 22 september 2014 at 14:50:03, Meraj A. Khan ([email protected]) wrote:
>
> Hi Edoardo,
>
> How do you generate multiple segments during the generate phase?
> On Sep 22, 2014 6:01 AM, "Edoardo Causarano" <[email protected]>
> wrote:
>
> > Hi all,
> >
> > I’m building an Oozie workflow to schedule the generate, fetch, etc.
> > steps. Right now I'm trying to feed the list of generated segments
> > into the following fetch stage.
> >
> > The “crawl” script assumes that the most recently added segment is
> > un-fetched and does some HDFS shell scripting to determine its name and
> > stuff it into a shell variable, but I’d like to avoid this and somehow
> > feed the list of generated segments directly into the following step.
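
The "most recently added segment" lookup described above can also be done without shell text processing. Below is a minimal sketch in Python, assuming the segment directories carry 14-digit timestamp names (yyyyMMddHHmmss), so that the lexicographically largest name is the newest. The function name `latest_segment` and the sample paths are hypothetical, chosen only for illustration.

```python
import posixpath
import re

# Assumption: a segment directory name is a 14-digit timestamp (yyyyMMddHHmmss).
SEGMENT_NAME = re.compile(r"\d{14}")


def _name(path):
    """Final path component, ignoring any trailing slash."""
    return posixpath.basename(path.rstrip("/"))


def latest_segment(paths):
    """Return the path with the newest timestamp name, or None if there is none.

    Entries whose final component is not a 14-digit name (temporary
    directories, stray files) are ignored.
    """
    segments = [p for p in paths if SEGMENT_NAME.fullmatch(_name(p))]
    if not segments:
        return None
    # Fixed-width timestamps sort chronologically when compared as strings.
    return max(segments, key=_name)
```

This still only guesses which segment is un-fetched; it does not remove the underlying assumption, which is why passing the generated names along explicitly is the more robust route.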
> >
> > I have the feeling that I could use the Oozie “capture data from action”
> > option but I think that will require fiddling with the Generator class
> > source; that’s ok but I’m a bit wary of adding custom code that may not
> > be part of the core distribution. Has anyone already done something
> > similar, preferably without touching the source? (e.g.
> > http://qnalist.com/questions/2330221/nutch-oozie-and-elasticsearch but
> > it now 404s on GitHub)
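
For the "capture data from action" route, one option that leaves the Generator source alone might be a small wrapper step run as an Oozie shell action with `<capture-output/>`, which reads the action's stdout as `key=value` properties lines. Below is a rough sketch, assuming the wrapper already knows the segment names (how it obtains them is not covered here). The key name `segments` and the function `capture_output_line` are made up for illustration.

```python
import sys


def capture_output_line(key, segments):
    """Format segment names as one properties-style line: key=a,b,c."""
    return f"{key}={','.join(segments)}"


if __name__ == "__main__":
    # Hypothetical usage: segment names are passed as command-line arguments,
    # and the single line on stdout is what the workflow would capture.
    print(capture_output_line("segments", sys.argv[1:]))
```

A later action could then read the value with something like `${wf:actionData('generate')['segments']}`, where `generate` is an assumed action name. Captured output is size-limited by Oozie, so a very long segment list may not fit.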
> >
> >
> > Best,
> > Edoardo
> >
> > --
> > Edoardo Causarano
> > Sent with Airmail
