Thanks for the tips. But I have a monster computer: 12 GB of RAM and dual 64-bit processors, and my network connection is 100 MB/s! I guess Nutch doesn't close its open sockets when a host is bad! I am still struggling with the problem.
Any other ideas?

Nima

On 10/18/05, Fuad Efendi <[EMAIL PROTECTED]> wrote:
> For comparison (in order to locate the problem...) you may also try
> http://htmlparser.sourceforge.net/
>
> - it has a web-site crawler written in Java.
>
> Also, there is some Linux-specific stuff, web-site crawlers written in C.
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 18, 2005 11:00 PM
> To: [email protected]
> Subject: Re: No buffer space available
>
>
> But I tried it on two different machines, one with CentOS Linux and the
> other with Ubuntu Linux!
>
> One example of the exception looks like this:
>
> 051018 153727 28 fetching http://perso.wanadoo.es/largo/
> java.net.SocketException: No buffer space available
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:364)
>     at java.net.Socket.connect(Socket.java:507)
>     at java.net.Socket.connect(Socket.java:457)
>     at java.net.Socket.<init>(Socket.java:365)
>     at java.net.Socket.<init>(Socket.java:238)
>     at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:79)
>     at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$1.doit(ControllerThreadSocketFactory.java:90)
>     at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$SocketTask.run(ControllerThreadSocketFactory.java:157)
>     at java.lang.Thread.run(Thread.java:595)
>
> Nima
>
>
> On 10/18/05, Fuad Efendi <[EMAIL PROTECTED]> wrote:
> >
> > java.net.SocketException - Thrown to indicate that there is an error
> > in the underlying protocol, such as a TCP error.
> >
> > "No buffer space available" - the message comes from the underlying OS...
> >
> > I think it's not Nutch or the configuration of Nutch...
> >
> > Maybe OS tuning? Maybe the JVM version/vendor?
> >
> > I don't know UNIX in depth, but it has some specific settings for
> > the protocol...
> >
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, October 18, 2005 9:29 PM
> > To: [email protected]
> > Subject: No buffer space available
> >
> >
> > Hi,
> > I was trying to fetch the DMOZ open directory using the exact
> > example from the Nutch tutorial website. So I did the following steps:
> >
> > mkdir db
> > mkdir segments
> > bin/nutch admin db -create
> > bin/nutch inject db -dmozfile ../nutch-0.7.1/content.rdf.u8 -subset 3000
> > bin/nutch generate db segments
> > s1=`ls -d segments/2* | tail -1`
> > echo $s1
> > bin/nutch fetch -showThreadID -noParsing -threads 50 $s1
> > bin/nutch updatedb db $s1
> >
> > It starts fetching the pages, but after a couple hundred pages it
> > starts giving me this exception:
> > "java.net.SocketException: No buffer space available"
> >
> > Do you have any idea why this might happen? I know it is running out of
> > available buffer space for new sockets, but why are the old sockets not
> > closed? Even if a fetch fails, its socket should be closed and its
> > buffer should be freed! I tried both 0.7 and 0.7.1.
> >
> > Thanks.
> > Nima
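
For reference, the behaviour expected in the original question (a socket is closed even when the connection attempt fails) can be sketched in plain Java as below. This is only an illustration, not code from Nutch or Commons HttpClient: the class name SocketProbe, the method probe, and the timeout value are hypothetical. It uses try/finally so that it also compiles on the Java 5 runtime shown in the stack trace.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Nutch: opens a TCP connection with a
// timeout and releases the socket on every code path.
class SocketProbe {

    /** Returns true if host:port accepted a connection within timeoutMs. */
    static boolean probe(String host, int port, int timeoutMs) {
        Socket socket = new Socket(); // unconnected socket
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Bad host, refused connection, timeout, or an OS-level error
            // such as "No buffer space available".
            return false;
        } finally {
            // Runs on success and on failure, so the descriptor is freed.
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful can be done if close itself fails.
            }
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("usage: java SocketProbe <host> <port>");
            return;
        }
        boolean ok = probe(args[0], Integer.parseInt(args[1]), 5000);
        System.out.println(args[0] + ":" + args[1] + (ok ? " reachable" : " unreachable"));
    }
}
```

If a fetcher with 50 threads followed this pattern and the error still appeared, that would point toward OS-level limits rather than leaked sockets, which is the direction suggested earlier in the thread.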
