> Subject: Re: Nutch in production
> Thank you guys for your replies. I will look into the suggestions you gave.
> But I have one more query. How can I trigger Nutch from a queue system in a
> distributed environment?
Well this is a bit more tricky of course, as per my ot
Can I have a link to this?
Regards,
Sachin Shaju
sachi...@mstack.com
+919539887554
On Thu, Sep 29, 2016 at 11:13 PM, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:
> Yep also check out the work that Sujen Shah just merged (also on my team
> at JPL and USC) where you can
Yep, also check out the work that Sujen Shah just merged (also on my team at JPL
and USC) where you can publish events to an ActiveMQ queue from Nutch crawling.
That should allow all sorts of production dashboards and analytics.
++
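The queue-driven triggering asked about in this thread could be sketched roughly as below. This is only a minimal illustration, not how Nutch or the work mentioned above does it: the standard-library `queue.Queue` stands in for a real broker such as ActiveMQ or Kafka, the message fields `crawl_id` and `seed_dir` are hypothetical, and the job body keys (`crawlId`, `type`, `confId`, `args`) and the step names are assumptions that should be checked against the REST API of the Nutch version in use. The `submit` callable is injected so that the actual HTTP call stays outside the sketch.

```python
import json
import queue
from typing import Callable, Dict

# Assumption: one crawl round is submitted as these job types, in this order.
# The names are illustrative and may not match what a given Nutch server accepts.
CRAWL_STEPS = ["INJECT", "GENERATE", "FETCH", "PARSE", "UPDATEDB"]


def build_job(crawl_id: str, step: str, seed_dir: str) -> Dict:
    """Build a hypothetical JSON body for a single crawl step."""
    # Only the inject step needs the seed directory in this sketch.
    args = {"seedDir": seed_dir} if step == "INJECT" else {}
    return {"crawlId": crawl_id, "type": step, "confId": "default", "args": args}


def drain(requests: "queue.Queue[str]", submit: Callable[[Dict], None]) -> int:
    """Consume all queued crawl requests and submit one job per crawl step.

    Returns the number of crawl requests handled.
    """
    handled = 0
    while True:
        try:
            raw = requests.get_nowait()
        except queue.Empty:
            return handled
        req = json.loads(raw)
        for step in CRAWL_STEPS:
            submit(build_job(req["crawl_id"], step, req["seed_dir"]))
        handled += 1


if __name__ == "__main__":
    q: "queue.Queue[str]" = queue.Queue()
    q.put(json.dumps({"crawl_id": "crawl-01", "seed_dir": "/data/seeds"}))
    sent = []  # stand-in for an HTTP client posting to the Nutch server
    drain(q, sent.append)
    print(len(sent), "jobs submitted")
```

In a real deployment `submit` would post each body to the Nutch server and wait for the job to finish before sending the next step; that waiting logic is left out here on purpose.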
Hi Sachin,
Just a suggestion here - you can use Apache Kafka to generate and catch
events which are mapped to incoming crawl requests, crawl status and much
more.
I have created a prototype for a production queue [0] which runs on top of a
supercomputer (TACC Wrangler) and integrated it with
Hi,
I was experimenting with some crawl cycles in Nutch and would like to set up
a distributed crawl environment. But I wonder how I can trigger Nutch for
incoming crawl requests in a production system. I read about the Nutch REST
API. Is that the real option that I have? Or can I run Nutch as a
On 23 June 2014 01:44, Meraj A. Khan mera...@gmail.com wrote:
Gora,
Thanks for sharing your admin perspective. Rest assured, I am not trying
to circumvent any politeness requirements in any way. As I mentioned
earlier, I am within the crawl-delay limits that are being set by the web
of Nutch in research-oriented projects. I would like to
know from those of you who are using Nutch in production for large-scale
crawling (vertical or non-vertical) about what challenges to expect and how
to overcome them.
I will list a few challenges that I faced below and would like
Hello Folks,
I have noticed that Nutch resources and mailing lists are mostly geared
towards the usage of Nutch in research-oriented projects. I would like to
know from those of you who are using Nutch in production for large-scale
crawling (vertical or non-vertical) about what challenges
On 22 June 2014 22:07, Meraj A. Khan mera...@gmail.com wrote:
Hello Folks,
I have noticed that Nutch resources and mailing lists are mostly geared
towards the usage of Nutch in research-oriented projects. I would like to
know from those of you who are using Nutch in production for large