It looks like you can indeed connect to that v4 machine from the machine 
running Nutch.  I can't tell from here why you got the error you originally 
reported.  Does it happen every time you try running Nutch?
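
One thing that may help narrow it down: the telnet session only shows that the TCP port is reachable. A telnet client connecting to port 10000 does not need a telnet service on the other end, and the stack trace shows Nutch fetching over HTTP through commons-httpclient, not telnet. Since wget succeeds, the reset may depend on what the request looks like. Below is a rough Python sketch, not anything Nutch ships, that sends one raw HTTP/1.0 GET and reports either the status line or how the connection failed. The function name `probe_http` and the default user-agent string are made up for illustration.

```python
import socket


def probe_http(host, port, path="/", user_agent="probe/0.1", timeout=10.0):
    """Send one raw HTTP/1.0 GET and return the status line, or a short
    description of how the connection failed.

    Hypothetical helper for diagnosis only; it is not part of Nutch.
    """
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}:{port}\r\n"
        f"User-Agent: {user_agent}\r\n"
        "\r\n"
    ).encode("ascii")
    data = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            # Read until the first line (the status line) is complete
            # or the peer closes the connection.
            while b"\r\n" not in data:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                data += chunk
    except ConnectionResetError:
        # The Python counterpart of java.net.SocketException: Connection reset
        return "connection reset by peer"
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures
        return f"connection failed: {exc}"
    if not data:
        return "connection closed without a response"
    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling `probe_http("v4", 10000, "/lib", user_agent=...)` from the Nutch machine with a few different user-agent values (for example the one wget sends and the one configured for Nutch) would show whether the server answers some requests and resets others. If every variant returns a normal status line, the cause is more likely intermittent or specific to how the fetcher connects.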

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: "Del Rio, Ann" <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Friday, May 30, 2008 3:23:00 PM
> Subject: RE: Indexing XML-based document format per DITA standard
> 
> Thank you for your response and help, Otis!
> I greatly appreciate it and am sure others will.
> 
> 
> I did a wget from the machine where I was running Nutch and got the
> following...
> 
> -bash-2.05b$ wget http://v4:10000/lib
> --10:37:52--  http://v4:10000/lib
>            => `lib.1'
> Resolving v4... done.
> Connecting to v4:10000... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 2,717 [text/html]
> 100%[====================================>] 2,717          2.59M/s    ETA 00:00
> 10:37:52 (2.59 MB/s) - `lib.1' saved [2717/2717]
> 
> Then I tried to telnet too and got a connection closed.
> 
> -bash-2.05b$ telnet
> telnet> open
> (to) v4 10000
> Trying xxx.xxx.231.40...
> Connected to xxxx.ebay.com (xxx.xxx.231.40).
> Escape character is '^]'.
> Connection closed by foreign host.
> 
> Doesn't the telnet service (or its port) need to be enabled on the
> server at the other end before we can telnet to it? Does the Nutch
> crawler use telnet to fetch the URL?
> 
> Apparently, we do not use proxy hosts and ports here at eBay in any of
> our APIs, so I am not sure how to get those. I will still ask around
> to see whether anyone knows which proxy hosts and ports, if any, we use.
> 
> Also, the URL loads fine when I browse to it, so I checked the LAN
> Settings in my IE browser options for a proxy address and port, and we
> are not using any there either.
> 
> 
> Thanks,
> Ann Del Rio
> 
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
> Sent: Friday, May 30, 2008 10:17 AM
> To: [email protected]
> Subject: Re: Indexing XML-based document format per DITA standard
> 
> Can you connect to it (telnet to it, for example) directly from the
> machine(s) where you are running Nutch?
> (this is a network issue, nothing to do with XML/parsing)
> 
> 
> Maybe you need to go through some eBay proxy?
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> ----- Original Message ----
> > From: "Del Rio, Ann" 
> > To: [email protected]
> > Sent: Friday, May 30, 2008 6:24:01 PM
> > Subject: Indexing XML-based document format per DITA standard
> > 
> > I added a new URL to index which is in an XML-based document format
> > per the DITA standard and I get the following error.
> > 
> > java.net.SocketException: Connection reset
> > 2008-05-27 17:56:58 ERROR Http                 at java.net.SocketInputStream.read(SocketInputStream.java:168)
> > 2008-05-27 17:56:58 ERROR Http                 at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> > 2008-05-27 17:56:58 ERROR Http                 at java.io.BufferedInputStream.read(BufferedInputStream.java:235)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:77)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:105)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1115)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1373)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1832)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1590)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:995)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:397)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:96)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:99)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:219)
> > 2008-05-27 17:56:58 ERROR Http                 at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:145)
> > 2008-05-27 17:56:58 INFO  Fetcher              fetch of http://v4:10000/lib   failed with: java.net.SocketException: Connection reset
> > 
> > I googled and found no solution so far...
> > 
> > Do I need to set up some config / host file to specify the ports?
> > The URL is an internal website.
> > 
> > Any response will be appreciated.
> > 
> > Thanks,
> > Ann Del Rio
> > Senior Developer
> > eBay, Inc
