Thanks a lot, Reinhard. It worked perfectly, and it now shows the correct last-modified date.

-----Original Message-----
From: reinhard schwab [mailto:reinhard.sch...@aon.at]
Sent: Monday, February 01, 2010 5:13 PM
To: nutch-user@lucene.apache.org
Subject: Re: 'readdb' and 'readseg' commands shows wrong last-modified-date

Paul Tomblin has posted a diff for handling last modified. I don't know whether an issue has been opened in JIRA.
http://www.mail-archive.com/nutch-user@lucene.apache.org/msg15056.html

Rupesh Mankar wrote:
> Hi,
>
> I am using Nutch 1.0. I have successfully crawled our intranet site. But when
> I read the properties of the crawled URLs using the 'readdb' and 'readseg'
> commands, the last modified date is shown as 'Modified time: Thu Jan 01
> 05:30:00 IST 1970' for every URL, which is incorrect.
>
> Why is Nutch setting the wrong 'last modified date'? Is there any way to fix
> this problem?
>
> Thanks,
> Rupesh
>
> DISCLAIMER
> ==========
> This e-mail may contain privileged and confidential information which is the
> property of Persistent Systems Ltd. It is intended only for the use of the
> individual or entity to which it is addressed. If you are not the intended
> recipient, you are not authorized to read, retain, copy, print, distribute or
> use this message. If you have received this communication in error, please
> notify the sender and delete all copies of this message. Persistent Systems
> Ltd. does not accept any liability for virus infected mails.
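
For anyone who finds this thread later: the reported value 'Thu Jan 01 05:30:00 IST 1970' is what a modified time of zero looks like when printed in Indian Standard Time (UTC+5:30), which suggests the field was never set rather than set to a wrong date. The following is only a minimal sketch under that assumption. The class and method names are hypothetical, and it is not Nutch code.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration, not part of Nutch.
public class EpochZeroDemo {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Renders a millisecond timestamp as local wall-clock time in the given zone.
    static String render(long modifiedTimeMillis, String zoneId) {
        return Instant.ofEpochMilli(modifiedTimeMillis)
                .atZone(ZoneId.of(zoneId))
                .format(FORMAT);
    }

    public static void main(String[] args) {
        // Assumption: an unset modified time is stored as 0 ms since the epoch.
        System.out.println(render(0L, "Asia/Kolkata")); // 1970-01-01 05:30:00
        System.out.println(render(0L, "UTC"));          // 1970-01-01 00:00:00
    }
}
```

If the same zero value shows up for every URL, the likely thing to check is whether the last-modified information from the fetch is being written to the record at all, which is what the diff linked above is meant to handle.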