I believe this is completely normal for pages that are "gone", in other words pages that are
not to be fetched again. For example, if a page is not found (404), it should end up with a
date like that.
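
The year 292278994 is what you get when java.util.Date or java.time is handed Long.MAX_VALUE as a millisecond timestamp, so the odd date looks like a "never" marker rather than a corrupted value. Below is a minimal sketch of that arithmetic. It assumes the database marks such pages by setting the next-fetch time to Long.MAX_VALUE, which I have not checked against the Nutch source. The class name, the NEVER constant and both helper methods are made up for illustration and are not Nutch APIs.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class NeverRefetchDate {
    // Assumption: "never fetch again" is encoded as the largest possible
    // millisecond timestamp. NEVER is a hypothetical name, not a Nutch constant.
    static final long NEVER = Long.MAX_VALUE;

    // Convert milliseconds since the epoch to a calendar date in UTC.
    static ZonedDateTime toUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
    }

    // True if a next-fetch timestamp carries the assumed sentinel value.
    static boolean isNeverRefetch(long nextFetchMillis) {
        return nextFetchMillis == NEVER;
    }

    public static void main(String[] args) {
        ZonedDateTime date = toUtc(NEVER);
        System.out.println("Year of Long.MAX_VALUE ms: " + date.getYear());
        System.out.println("Never refetch? " + isNeverRefetch(NEVER));
    }
}
```

If that assumption holds, a segment generator that only selects pages whose next-fetch time has already passed would skip these pages indefinitely, which matches what Sven describes.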


--
Sami Siren



Sven Wende wrote:

I am using the latest version from CVS!

When I run the following command
        
        $NUTCH_HOME/bin/nutch readdb db/ -toppages 100

I get good looking results like this:

        Page 7: Version: 4
        URL: http://www.foo2.de/foo.html
        ID: b19e4ae246393e0b9499eadf3ab8ed4b
        Next fetch: Wed Dec 22 13:50:42 CET 2004
        Retries since fetch: 0
        Retry interval: 30 days
        Num outlinks: 0
        Score: 121.91983
        NextScore: 0.14999998

but also results like this:

        Page 3: Version: 4
        URL: http://www.foo.de/foo.html
        ID: 1e312a7a74448b86e8b6fc95e0ba7c7c
        Next fetch: Sun Aug 17 07:12:55 CET 292278994
        Retries since fetch: 2
        Retry interval: 30 days
        Num outlinks: 0
        Score: 6.253641
        NextScore: 0.14999998


I wonder about that strange "next fetch" date, which is the same for 1000s of pages in my DB.

These pages will never be refetched when I'm generating my new segments.






-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John X
Sent: Thursday, 18 November 2004 20:10
To: John X
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [Nutch-dev] Strange refetch date


On Thu, Nov 18, 2004 at 11:07:50AM -0800, John X wrote:


On Thu, Nov 18, 2004 at 01:54:57PM +0100, Sven Wende wrote:


Hi,

I used the "readdb" command to list some information about the pages
in my database.

There is a strange refetch date for a great many pages:

        Next fetch: Sun Aug 17 07:12:55 CET 292278994

Other pages have "normal" dates, like:

        Next fetch: Wed Dec 22 13:50:42 CET 2004



I wonder about that strange year indicator. Is this a bug, or a "feature"?


Which option did you use? A more detailed log/dump will be helpful.


Please also tell us the Nutch version number.


-------------------------------------------------------
This SF.Net email is sponsored by: InterSystems CACHE FREE OODBMS DOWNLOAD - A multidimensional database that combines robust object and relational technologies, making it a perfect match for Java, C++,COM, XML, ODBC and JDBC. www.intersystems.com/match8 _______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers









