Thanks for the info, Sebastian.
Re: Why do you want to merge the data structures?
To help inform my crawl strategy, I am trying to see what is possible, and
it seems that running concurrent crawls might get around
any limitations in the software. I am currently seeding a set of
Hi Kamil,
> I was wondering if this script is advisable to use?
I haven't tried the script itself, but some of the underlying commands
- mergedb, etc.
> merge command ($nutch_dir/nutch merge $index_dir $new_indexes)
Of course, some of the commands are obsolete. A long time ago, Nutch
used Lucene
Hi,
I am testing how merging crawls works and found this script
https://cwiki.apache.org/confluence/display/NUTCH/MergeCrawl.
I was wondering if this script is advisable to use? I plan to use it for
crawls of non-overlapping URLs.
I am wary of using it since it is located under "Archive &
Hi,
Just checking if anyone could comment on my post below. :)
Thanks in advance.
Safdar
On Mon, Jun 11, 2012 at 8:10 AM, Ali Safdar Kureishy
safdar.kurei...@gmail.com wrote:
Hi,
I'm trying to build an incremental crawler, using the various Nutch
crawl tools (generate + fetch/parse +
Hi,
I'm trying to build an incremental crawler, using the various Nutch crawl
tools (generate + fetch/parse + updatedb etc.). By incremental I mean I
want crawled pages to show up quickly in the index (instead of waiting till
the end of the crawl). So, I'd like to index as soon as I have fetched