Re: crawling / data aggregation - is nutch the right tool?

no spam Mon, 16 Nov 2009 11:01:43 -0800

For now I only need to crawl hundreds of pages. Previously I wrote this
kind of thing from scratch in Perl. I want something that lets me get
started quickly and leaves room to scale in the future. I like that Droids
is a framework, so I only have to do minimal work to get started. Apache
Tika is the framework for parsing, and it looks right for the job. Parsing
and extraction is the part I have a hard time evaluating with Nutch. Some of
what I have read on the mailing list suggests it's still not all that easy
to do extraction with Nutch. Am I wrong?


Mark
