Re: Nutch in production

2016-10-18 Thread lewis john mcgibbney
> Thank you guys for your replies. I will look into the suggestions you gave.
> But I have one more query. How can I trigger nutch from a queue system in a
> distributed environment?
Well this is a bit more tricky of course, as per my ot

Re: Nutch in production

2016-09-29 Thread Sachin Shaju
Can I have a link to this?
Regards,
Sachin Shaju
sachi...@mstack.com
On Thu, Sep 29, 2016 at 11:13 PM, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:
> Yep also check out the work that Sujen Shah just merged (also on my team
> at JPL and USC) where you can

Re: Nutch in production

2016-09-29 Thread Mattmann, Chris A (3980)
Yep, also check out the work that Sujen Shah (also on my team at JPL and USC) just merged, where you can publish events to an ActiveMQ queue from Nutch crawling. That should allow all sorts of production dashboards and analytics.
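As a rough illustration of the event-publishing idea, here is a minimal Python sketch. It does not use the merged Nutch code or ActiveMQ itself: an in-process `queue.Queue` stands in for the broker, and the event fields (`type`, `url`, `status`, `timestamp`) and all function names are hypothetical, chosen only to show how a dashboard might aggregate crawl events.

```python
import json
import queue
import time
from collections import Counter


def make_event(event_type, url, status):
    """Serialize one crawl event as JSON. The field names are hypothetical."""
    return json.dumps({
        "type": event_type,
        "url": url,
        "status": status,
        "timestamp": time.time(),
    })


def publish(broker, event_type, url, status):
    """Put one event on the broker (here an in-process stand-in queue)."""
    broker.put(make_event(event_type, url, status))


def drain_and_summarize(broker):
    """Consume every pending event and count them by (type, status)."""
    counts = Counter()
    while True:
        try:
            raw = broker.get_nowait()
        except queue.Empty:
            return counts
        event = json.loads(raw)
        counts[(event["type"], event["status"])] += 1


if __name__ == "__main__":
    broker = queue.Queue()  # stand-in for a real message broker
    publish(broker, "fetch", "http://example.com/", 200)
    publish(broker, "fetch", "http://example.com/about", 200)
    publish(broker, "fetch", "http://example.com/missing", 404)
    for key, n in sorted(drain_and_summarize(broker).items()):
        print(key, n)
```

With a real broker, only `publish` and the consuming loop would change; the aggregation by event type and status would stay the same.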

Re: Nutch in production

2016-09-29 Thread Karanjeet Singh
Hi Sachin, just a suggestion here: you can use Apache Kafka to generate and catch events that map to incoming crawl requests, crawl status, and much more. I have created a prototype for a production queue [0] which runs on top of a supercomputer (TACC Wrangler) and integrated it with
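The general pattern of mapping queue events to crawl requests and crawl status can be sketched as below. This is not the prototype referenced as [0], and it does not talk to Kafka: two in-process `queue.Queue` objects stand in for a request topic and a status topic. The request schema (`id`, `seed_url`, `rounds`), the status names, and the `run_crawl` callable are all assumptions made for illustration. In a real deployment `run_crawl` would launch Nutch (for example through its `bin/crawl` script), which is deliberately left out here.

```python
import json
import queue


def process_requests(requests, statuses, run_crawl):
    """Consume JSON crawl requests until the request queue is empty.

    For each request, emit a RUNNING status event, call run_crawl, then
    emit FINISHED or FAILED. Returns the number of requests handled.
    """
    handled = 0
    while True:
        try:
            raw = requests.get_nowait()
        except queue.Empty:
            return handled
        req = json.loads(raw)
        statuses.put(json.dumps({"id": req["id"], "status": "RUNNING"}))
        try:
            run_crawl(req["seed_url"], req["rounds"])
            final = "FINISHED"
        except Exception:
            # Any failure in the crawl runner is reported as a status event.
            final = "FAILED"
        statuses.put(json.dumps({"id": req["id"], "status": final}))
        handled += 1


def fake_run_crawl(seed_url, rounds):
    """Hypothetical runner used for the demo; it only validates its input."""
    if not seed_url:
        raise ValueError("empty seed URL")


if __name__ == "__main__":
    requests, statuses = queue.Queue(), queue.Queue()
    requests.put(json.dumps({"id": "a", "seed_url": "http://example.com/", "rounds": 2}))
    requests.put(json.dumps({"id": "b", "seed_url": "", "rounds": 1}))
    process_requests(requests, statuses, fake_run_crawl)
    while not statuses.empty():
        print(statuses.get_nowait())
```

Passing the runner in as a parameter keeps the queue handling testable without a Nutch installation.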