Hi All -

Is it possible to build only part of the index and do a partial update of the linkdb?

I am trying to build my index incrementally, running generate/fetch/index
cycles of 20,000 URLs or so at a time. The problem I am running into is
that every time I do another generate/fetch and want to update my linkdb,
I have to run invertlinks with crawl/segments/*, and this tries to update
the linkdb with all the segments (newly fetched as well as old ones). I
have the same problem with the index: after every new generate/fetch, I
delete the old index and run the index command again for all the segments.

Is it possible to update the linkdb and index only for a newly fetched
segment? As my crawldb gets bigger, the invertlinks and index steps are
taking much longer, and it seems unnecessary to run them over all the
segments when the db may already have the info from the older ones.
I would appreciate any help or pointers.
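
To make the question concrete, here is a rough sketch of what I have in mind. It assumes segment directories under crawl/segments are named by fetch timestamp, so the last name in sorted order is the newest one. The latest_segment helper is a hypothetical name of my own, not part of Nutch, and the invertlinks call is left commented out because I do not know whether running it on a single segment merges into the existing linkdb or replaces it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): print the newest segment
# directory under the given segments directory.
# Assumption: segment names are timestamps (e.g. 20090103000000), so the
# last entry in the shell's sorted glob expansion is the most recent.
latest_segment() {
    segments_dir=$1
    newest=""
    for seg in "$segments_dir"/*/; do
        # Skip the unexpanded pattern when the directory has no segments.
        [ -d "$seg" ] || continue
        newest=${seg%/}
    done
    # Print the result; return non-zero if no segment was found.
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Intended usage (untested against a real crawl; whether this updates the
# existing linkdb incrementally is exactly what I am asking):
# NEW_SEGMENT=$(latest_segment crawl/segments)
# bin/nutch invertlinks crawl/linkdb "$NEW_SEGMENT"
```

The idea is to pass only that one path to invertlinks (and to the index step) instead of crawl/segments/*.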

Cheers
Jha
