Hi everybody,

Reporting back on this issue.

It seems like for each crawl, I get one new line like the following in the
lsof output:

COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
java    12813 egov  720u  sock    0,5      0t0 46043464 can't identify protocol

I have a crawl server (a REST API on top of Nutch); each crawl is triggered by
an HTTP request, so the socket leak could be in the web server (I originally
used the HttpServer that comes with the JDK), in my own Jersey code, or in
Nutch itself.

HttpServer is not guilty; the same issue happens with Tomcat. To test my own
code, I exercised my exact code after commenting out the Nutch calls; the
socket leak disappeared. So it seems like the leak occurs in Nutch - one
dangling socket per crawl.
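
To confirm the one-socket-per-crawl pattern without rerunning lsof by hand,
the descriptor count of the JVM can be compared before and after each crawl.
This is only a minimal sketch, assuming Linux (it reads /proc/self/fd).
FdLeakCheck, openFdCount and leakedBy are hypothetical names, and the Runnable
stands in for whatever call triggers the crawl:

```java
import java.io.File;

public class FdLeakCheck {
    /** Number of descriptors currently open in this JVM, or -1 if /proc is unavailable. */
    static int openFdCount() {
        // Each entry in /proc/self/fd is one open descriptor (Linux only).
        String[] entries = new File("/proc/self/fd").list();
        return entries == null ? -1 : entries.length;
    }

    /** Runs the action and returns how many more descriptors are open afterwards. */
    static int leakedBy(Runnable action) {
        int before = openFdCount();
        action.run();
        return openFdCount() - before;
    }

    public static void main(String[] args) {
        // Placeholder action (assumption); replace with the call that starts a crawl.
        int delta = leakedBy(() -> { });
        System.out.println("descriptors left open: " + delta);
    }
}
```

A delta that stays positive across repeated crawls would point at the wrapped
call. A single run can be noisy, since the JVM may open files lazily the first
time a code path executes.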

This page:
http://serverfault.com/questions/153983/sockets-found-by-lsof-but-not-by-netstat

suggests that this can occur when a socket is created, but there is no
connect() or bind() associated with it.
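
The situation that page describes can be reproduced in a few lines of Java.
The sketch below is an illustration only, assuming Linux and /proc; it does not
claim that Nutch leaks sockets this way. UnconnectedSocketDemo and
socketFdCount are hypothetical names. SocketChannel.open() allocates the
underlying socket immediately, so a channel that is opened but never connected,
bound or closed holds a descriptor that, according to the page above, lsof may
report as "can't identify protocol":

```java
import java.io.IOException;
import java.nio.channels.SocketChannel;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class UnconnectedSocketDemo {
    /** Counts descriptors of this JVM whose /proc link target is a socket (Linux only). */
    static int socketFdCount() throws IOException {
        int count = 0;
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(Paths.get("/proc/self/fd"))) {
            for (Path fd : fds) {
                try {
                    // Socket descriptors resolve to a target of the form "socket:[inode]".
                    if (Files.readSymbolicLink(fd).toString().startsWith("socket:")) {
                        count++;
                    }
                } catch (IOException e) {
                    // The descriptor was closed while we were iterating; skip it.
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        int before = socketFdCount();
        SocketChannel ch = SocketChannel.open(); // socket allocated, never connected or bound
        System.out.println("extra sockets after open:  " + (socketFdCount() - before));
        ch.close();
        System.out.println("extra sockets after close: " + (socketFdCount() - before));
    }
}
```

Running lsof -p on the process between the open and the close is one way to
check how such a descriptor is reported on a given system.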

Anything I can do on my side at this point?

Thanks,

Yannick





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cannot-run-program-chmod-too-many-open-files-tp4109753p4111046.html
Sent from the Nutch - User mailing list archive at Nabble.com.
