Hi all - Is it possible to generate part of the index and do a partial update of the linkdb?
I am trying to build my index gradually, running generate/fetch/index cycles for 20,000 URLs or so at a time. The problem I am running into is that every time I do another generate/fetch and want to update my linkdb, I have to run invertlinks with crawl/segments/*, and this tries to update the linkdb with all the segments (the newly fetched ones as well as the old ones). I have the same problem with the index: after every new generate/fetch, I delete the old index and run the index command again for all the segments.

Is it possible to update the linkdb and the index only for a newly fetched segment? As my crawldb gets bigger, the invertlinks and index processes are taking much longer, and it seems unnecessary to run invertlinks and index over all the segments when the db may already have the info on the older segments.

I would appreciate any help or pointers.

Cheers,
Jha
