I have no idea why this happened. Probably it was due to Luke not re-reading the indexes? Very strange!
Anyway, it works as it should: after a reindex, the subcollection field is populated with the latest data. Please excuse my insistence and my clumsiness, and thanks for your answers.

liv wrote:
>
> Unfortunately my Java knowledge is too poor to debug this one. However, I
> doubt that the file "subcollections.xml" from inside the nutch-xxx.job is
> used. This is because the file nutchxxx.job is old enough: it has the date
> of the day I made the Nutch installation.
>
>
> Sami Siren-2 wrote:
>>
>> liv wrote:
>>> - I reindex the db: delete folder "indexes", run the command:
>>>
>>> bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*
>>>
>>> - then I inspect the resulting db with luke again
>>>
>>> Unfortunately nothing has changed. Maybe I am missing something...
>>> Please tell me if you see anything wrong.
>>
>> If you did exactly those steps then what happens is that the
>> subcollections.xml is read from inside the .job file. You need to
>> rebuild the .job to put new file inside of it.
>>
>> simply do "ant" and rerun indexing and it should work as expected.
>>
>> --
>> Sami Siren
>>
>>
>>
>
>

--
View this message in context: http://www.nabble.com/subcollections-tf2821188.html#a7930248
Sent from the Nutch - User mailing list archive at Nabble.com.
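
For anyone finding this thread later, the sequence discussed here (rebuild the .job with "ant", delete the "indexes" folder, rerun the index command) can be sketched as a small shell script. This is only a sketch: it assumes it is run from the Nutch install directory with the crawl data in ./crawl, and the names run, rebuild_and_reindex and DRY_RUN are made up for illustration and are not part of Nutch.

```shell
#!/bin/sh
# Sketch only: assumes the current directory is the Nutch install directory
# and that the crawl data lives in ./crawl, as in the commands in this thread.

# run: hypothetical helper; executes a command, or only prints it when
# DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rebuild_and_reindex: hypothetical wrapper around the three steps.
rebuild_and_reindex() {
    # Rebuild the .job so it contains the edited subcollections.xml.
    run ant || return 1
    # Delete the old "indexes" folder.
    run rm -rf crawl/indexes || return 1
    # Rerun indexing against the existing crawldb, linkdb and segments.
    run bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*
}
```

Setting DRY_RUN=1 before calling rebuild_and_reindex only prints the three commands, which is a cheap way to check the paths before anything is deleted. After a real run, Luke probably needs to reopen the index to show the new data.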
