The time on the machine is set correctly. It's an internal website. Is there anything in particular in the crawl datum you're looking for?
It's not a freshly injected URL, and it seems like all of my URLs have the long fetch times. It also seems odd that the max interval wouldn't cause a fetch. I was able to run generate -adddays 6000 and then run a fetch. I'm waiting for it to finish so I can check whether this created long fetch times as well.

On Mon, Jun 3, 2013 at 10:35 AM, Tejas Patil <[email protected]> wrote:

> On Mon, Jun 3, 2013 at 6:53 AM, feng lu <[email protected]> wrote:
>
> > I see that nutch2.x will use the underlying operating system time to set
> > the FetchTime. like this
> >
> > fit.page.setFetchTime(System.currentTimeMillis());
> >
> > The granularity of the value depends on the underlying operating system. so
> > check your current OS time using date command.
> >
> >
> > On Mon, Jun 3, 2013 at 8:57 PM, Bai Shen <[email protected]> wrote:
> >
> > > I'm using the 2.x head and even with adding 30 days I'm not getting any
> > > refetches. I did a readdb on my injected url and it says that the fetch
> > > time is in 2027.
>
> Can share the crawl datum for that url ?
>
> > > Any idea why this would occur?
>
> If it was a freshly injected url, then I would go with Fengs' advice.
>
> > > Will db.fetch.interval.max kick in and
> > > cause it to be fetched earlier?
>
> nope.
>
> > > Or will I have to manually change the
> > > fetchTime using the hbase shell?
>
> I think so.
>
> > > Thanks.
> >
> > --
> > Don't Grow Old, Grow Up... :-)
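For reference, here is a minimal sketch of why -adddays 30 would not pick up a page whose fetch time is in 2027 while -adddays 6000 would. It assumes the generator simply compares the stored fetch time against the current time shifted forward by the given number of days. The class and method names are hypothetical and are not Nutch APIs, and the exact 2027 date is a made-up stand-in, since the thread only gives the year.

```java
import java.time.Duration;
import java.time.Instant;

public class FetchDueSketch {
    // Hypothetical helper, not part of Nutch: a page counts as due when its
    // stored fetch time is at or before "now" shifted forward by addDays.
    public static boolean isDue(long fetchTimeMillis, long nowMillis, int addDays) {
        long shiftedNow = nowMillis + Duration.ofDays(addDays).toMillis();
        return fetchTimeMillis <= shiftedNow;
    }

    public static void main(String[] args) {
        // Date of the thread, used as "now".
        long now = Instant.parse("2013-06-03T00:00:00Z").toEpochMilli();
        // Assumed fetch time; the thread only says "in 2027".
        long fetchTime = Instant.parse("2027-01-01T00:00:00Z").toEpochMilli();

        System.out.println("adddays 30:   due = " + isDue(fetchTime, now, 30));
        System.out.println("adddays 6000: due = " + isDue(fetchTime, now, 6000));
    }
}
```

Under these assumptions a fetch time in 2027 is roughly 5,000 days past June 2013, so a 30-day shift falls far short and a 6000-day shift clears it.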

