to be fetched again. For example, if a page is not found (404), it should end up with a date like
that.
-- Sami Siren
Sven Wende wrote:
I am using the latest version from CVS!
When I run the following command: $NUTCH_HOME/bin/nutch readdb db/ -toppages 100
I get good looking results like this:
Page 7: Version: 4 URL: http://www.foo2.de/foo.html ID: b19e4ae246393e0b9499eadf3ab8ed4b Next fetch: Wed Dec 22 13:50:42 CET 2004 Retries since fetch: 0 Retry interval: 30 days Num outlinks: 0 Score: 121.91983 NextScore: 0.14999998
but also results like this:
Page 3: Version: 4 URL: http://www.foo.de/foo.html ID: 1e312a7a74448b86e8b6fc95e0ba7c7c Next fetch: Sun Aug 17 07:12:55 CET 292278994 Retries since fetch: 2 Retry interval: 30 days Num outlinks: 0 Score: 6.253641 NextScore: 0.14999998
I wonder about that strange "next fetch" date, which is the same for 1000s of pages in my DB.
These pages will never be refetched when I'm generating my new segments.
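A note on the strange year: 292278994 is the year you get when the largest possible Java long (Long.MAX_VALUE) is read as milliseconds since the Unix epoch. The sketch below only checks that arithmetic. The class and method names are made up for illustration, and whether Nutch really stores Long.MAX_VALUE as a "never fetch again" marker is an assumption based on the output above, not something confirmed from the Nutch source.

```java
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical helper, not part of Nutch.
class NeverRefetchDate {

    // Returns the UTC calendar year of a timestamp given in
    // milliseconds since 1970-01-01T00:00:00Z.
    static int utcYear(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis)
                      .atZone(ZoneOffset.UTC)
                      .getYear();
    }

    public static void main(String[] args) {
        // Assumption: the odd "Next fetch" value is Long.MAX_VALUE milliseconds.
        System.out.println(utcYear(Long.MAX_VALUE)); // prints 292278994
    }
}
```

If that assumption holds, such a page can never become due for fetching, because no current time will ever be later than its next fetch time.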
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John X
Sent: Thursday, 18 November 2004 20:10
To: John X
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [Nutch-dev] Strange refetch date
On Thu, Nov 18, 2004 at 11:07:50AM -0800, John X wrote:
On Thu, Nov 18, 2004 at 01:54:57PM +0100, Sven Wende wrote:
Hi,
I used the "readdb" command to list some information about the pages
in my database.
There is a strange refetch date for a great many pages:
Next fetch: Sun Aug 17 07:12:55 CET 292278994
Other pages have "normal" dates, like:
Next fetch: Wed Dec 22 13:50:42 CET 2004
I wonder about that strange year indicator. Is this a
bug, or a "feature"?
Which option did you use? A more detailed log/dump would be helpful.
Please also tell us the Nutch version number.
-------------------------------------------------------
This SF.Net email is sponsored by: InterSystems CACHE FREE OODBMS DOWNLOAD - A multidimensional database that combines robust object and relational technologies, making it a perfect match for Java, C++,COM, XML, ODBC and JDBC. www.intersystems.com/match8 _______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
