Hello,

I'm trying to re-index my filesystem. First I create an index with the normal crawl command. This works, and at the end I can search the index with Luke. But when I run my re-index script against the same filesystem, Luke reports the resulting index as invalid. I have searched for the error for a while but could not find it.
Maybe someone could help me. Here is my little script:

------------------------------------------------------------------------
cd C:/eclipse_projects/nutchTrunk/

webdb_dir=c:/nutchIndexFile/crawldb
segments_dir=c:/nutchIndexFile/segments
index_dir=c:/nutchIndexFile/index
link_dir=c:/nutchIndexFile/linkdb
indexes_dir=c:/nutchIndexFile/indexes/

# The generate/fetch/update cycle with depth 2
for ((i=1; i <= 2 ; i++))
do
  bin/nutch generate $webdb_dir $segments_dir
  segment=`ls -d $segments_dir/* | tail -1`
  bin/nutch fetch $segment
  bin/nutch updatedb $webdb_dir $segment
done

# The 2 represents the depth
for segment in `ls -d $segments_dir/* | tail -2`
do
  bin/nutch index $indexes_dir $webdb_dir $link_dir $segment
done

# De-duplicate indexes
bin/nutch dedup $indexes_dir

mkdir c:/tmpNutch
bin/nutch merge -workingdir c:/tmpNutch/ $index_dir $indexes_dir
------------------------------------------------------------------------

Thanks a lot,
Alain
