Hi Edoardo,

How are you generating multiple segments during the generate phase?

On Sep 22, 2014 6:01 AM, "Edoardo Causarano" <[email protected]> wrote:
> Hi all,
>
> I’m building an Oozie workflow to schedule the generate, fetch, etc.
> steps. Right now I'm trying to feed the list of generated segments into
> the following fetch stage.
>
> The “crawl” script assumes that the most recently added segment is
> un-fetched and does some HDFS shell scripting to determine its name and
> stuff it into a shell variable, but I’d like to avoid this and somehow
> feed the list of generated segments directly into the following step.
>
> I have the feeling that I could use the Oozie “capture data from action”
> option, but I think that will require fiddling with the Generator class
> source; that’s OK, but I’m a bit wary of adding custom code that may not be
> part of the core distribution. Has anyone already done something similar,
> preferably without touching the source? (e.g.
> http://qnalist.com/questions/2330221/nutch-oozie-and-elasticsearch but it
> now 404s on GitHub)
>
>
> Best,
> Edoardo
>
> --
> Edoardo Causarano
> Sent with Airmail
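
For what it's worth, the "capture data from action" idea might not need any change to the Generator source: a small script run as an Oozie shell action (with <capture-output/>) after the generate step could print the segment list itself. Below is only a sketch under assumptions, not a tested setup. It assumes the Nutch 1.x segment layout, where a generated but not yet fetched segment contains a crawl_generate subdirectory and no crawl_fetch. The function names, the "segments" key and the "list-segments" action name mentioned afterwards are made up for illustration.

```python
import posixpath
import sys


def unfetched_segments(paths):
    """Return segment directories that have crawl_generate but no crawl_fetch.

    `paths` is an iterable of directory paths, e.g. taken from a recursive
    HDFS listing of the segments directory (assumption: Nutch 1.x layout).
    """
    generated, fetched = set(), set()
    for path in paths:
        # Split ".../<segment>/crawl_generate" into the segment dir and leaf.
        segment, leaf = posixpath.split(path.rstrip("/"))
        if leaf == "crawl_generate":
            generated.add(segment)
        elif leaf == "crawl_fetch":
            fetched.add(segment)
    return sorted(generated - fetched)


def capture_output_line(segments, key="segments"):
    """Format the list as one key=value line (Java properties style),
    which is the form Oozie expects on stdout when capturing output."""
    return "{}={}".format(key, ",".join(segments))


if __name__ == "__main__":
    # Paths are passed as command-line arguments in this sketch.
    print(capture_output_line(unfetched_segments(sys.argv[1:])))
```

The fetch step could then read the value with something like ${wf:actionData('list-segments')['segments']}, where 'list-segments' is the hypothetical name of the shell action. Captured output is size-limited (2 KB by default, as far as I know), so a very long segment list may need a different hand-off, such as a file on HDFS.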

