Hello again,

I was inspecting the generator because it doesn't deliver all URLs for the fetch list from the crawldb, even if I set the addDays attribute to a value much higher than the max fetch interval.

When I looked at the log file, I noticed that it uses a timestamp format which I don't recognize:

2012-01-20 18:32:24,506 DEBUG org.apache.nutch.crawl.Generator: -shouldFetch rejected 'http://(...)', fetchTime=1327667923420, curTime=1327076858662

So I wanted to see what times these two values actually represent and converted them using the date command:

date -u -d @1327667923420
Do 13. Feb 21:23:40 UTC 44042

So the fetch time is in the year 44042? Quite a long time to wait. The same goes for the current time (curTime):

date -u -d @1327076858662
Di 23. Mai 20:44:22 UTC 44023

(My system is NOT set to that date!) ;-)
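
For comparison, here is a quick check under the assumption (mine, not confirmed from the Nutch source) that the logged values are milliseconds since the epoch, as Java's System.currentTimeMillis() returns, rather than seconds. The helper name ms_to_utc is made up for this sketch, and it relies on GNU date:

```shell
# Hypothetical helper: treat the value as milliseconds since the epoch,
# drop the millisecond part, and print the result as a UTC timestamp.
ms_to_utc() {
    date -u -d "@$(( $1 / 1000 ))" +%Y-%m-%dT%H:%M:%SZ
}

ms_to_utc 1327667923420   # fetchTime from the log line
ms_to_utc 1327076858662   # curTime from the log line
```

Under that assumption the two values come out as 2012-01-27T12:38:43Z and 2012-01-20T16:27:38Z, which would at least be plausible dates.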

Does the generator use a different kind of timestamp than Unix systems do? Or is something terribly wrong here?

Thanks a lot in advance
