Further details:

If I run strace on the process, it looks like this, over and over and over:

gettimeofday({1155249187, 999952}, NULL) = 0
gettimeofday({1155249188, 389}, NULL)   = 0
gettimeofday({1155249188, 679}, NULL)   = 0
gettimeofday({1155249188, 955}, NULL)   = 0
clock_gettime(CLOCK_REALTIME, {1155249188, 1235000}) = 0
futex(0xb1f0185c, FUTEX_WAIT, 7163, {0, 999720000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x805d250, FUTEX_WAKE, 1)         = 0
futex(0x805c378, FUTEX_WAIT, 2, NULL)   = 0
futex(0x805c378, FUTEX_WAKE, 1)         = 0

I'm afraid I don't know how to go about finding what part of the code might
be causing this...

Any ideas?

Ben

On 8/10/06, Benjamin Higgins <[EMAIL PROTECTED]> wrote:

Hello,

Nutch is stalling in the fetch process.  I've run it twice now, and it is
stopping on the *same* URL both times. I don't get what's going on!

The last status report was:
060810 145315 status: segment 20060810142649, 7900 pages, 14 errors, 98421231 bytes, 1571224 ms
060810 145315 status: 5.0279274 pages/s, 489.3738 kb/s, 12458.384 bytes/page

Then, exactly 94 documents later with no errors in between, it just
stops, on what appears to be a perfectly normal URL and a perfectly normal
page.  I don't get it.

How can I debug this situation further, to see what's going on?

I'm really frustrated since I don't know where to start looking.

Nutch is still running, taking up a lot of CPU.  I don't want to kill it
unless it's really stuck.  How can I tell?

Ben
