Hi Edoardo,

How do you generate multiple segments during the generate phase?
On Sep 22, 2014 6:01 AM, "Edoardo Causarano" <[email protected]>
wrote:

> Hi all,
>
> I’m building an Oozie workflow to schedule the generate, fetch, etc.
> steps. Right now I'm trying to feed the list of generated segments into
> the following fetch stage.
>
> The “crawl” script assumes that the most recently added segment is
> un-fetched and does some hdfs shell scripting to determine its name and
> stuff this into a shell variable, but I’d like to avoid this and somehow
> feed the list of generated segments directly into the following step.
>
> I have the feeling that I could use the Oozie “capture data from action”
> option but I think that will require fiddling with the Generator class
> source; that’s ok but I’m a bit wary of adding custom code that may not be
> part of the core distribution. Has anyone already done something similar,
> preferably without touching the source? (e.g.
> http://qnalist.com/questions/2330221/nutch-oozie-and-elasticsearch but it
> now 404s on GitHub)
>
>
> Best,
> Edoardo
>
> --
> Edoardo Causarano
> Sent with Airmail
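
For reference, here is a minimal sketch of the "capture data from action" idea that leaves the Generator source alone. It assumes the generate step runs inside an Oozie shell action with <capture-output/>, which reads key=value lines (Java properties format) from the action's stdout. The helper names, the "segments" key and the two listing files are hypothetical; the script only compares the segment names seen before and after generate and prints the difference.

```python
import sys


def new_segments(before, after):
    """Return the segment names found in `after` but not in `before`, sorted."""
    known = set(before)
    return sorted(name for name in set(after) if name not in known)


def as_oozie_property(key, values):
    """Format one key=value line in Java properties style for <capture-output/>."""
    return "{}={}".format(key, ",".join(values))


def read_names(path):
    """Read one segment name per line from a local text file, skipping blanks."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


# Hypothetical usage: python new_segments.py before.txt after.txt
# where each file holds the segment directory names listed before and
# after the generate step (how the listing is produced is left open).
if __name__ == "__main__" and len(sys.argv) == 3:
    added = new_segments(read_names(sys.argv[1]), read_names(sys.argv[2]))
    # Oozie captures this stdout line as action data.
    print(as_oozie_property("segments", added))
```

The fetch action could then read the value with something like ${wf:actionData('generate')['segments']}, where 'generate' is a placeholder for whatever the shell action is named in the workflow.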
