Hi!

I ran a crawl from a single seed for 30 rounds, and it crawled around 16k URLs. When I checked with readdb -stats, it showed 2116 URLs as unfetched. I ran the fetcher again with the 'all' option, but it does not fetch anything and the unfetched list stays the same.

I dumped only the fields baseUrl, status and protocolStatus; the output can be found at https://raw.github.com/salvager/NutchDev/master/runtime/local/table_fields/part-r-00000 .

The file clearly shows that the URLs with status 1 have the protocolStatus NOT FOUND. Those URLs are never moved to db_gone, which is status 3 if I am correct.

Has anyone had a similar problem? Any ideas on how to fix it?

PS: I have made a patch that dumps only particular fields selected on the command line, for example:

    ./bin/nutch readdb -dump table_fields -fields "status,protocolStatus"

baseUrl is dumped by default along with the requested fields. I can upload the patch if anyone is interested.

Thanks,

--
Kiran Chitturi

