[
https://issues.apache.org/jira/browse/NUTCH-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141334#comment-13141334
]
Zhang JinYan edited comment on NUTCH-1138 at 11/1/11 5:14 PM:
--------------------------------------------------------------
Applied the patch to branch-1.4 and rebuilt with the command "ant clean build".
Configured the following seed URLs to crawl:
{quote}
http://172.16.123.123/bbs/viewthread.php?tid=12345
http://172.16.123.123/bbs/attachment.php?aid=12345
http://www.jettycn.com/
{quote}
The first two URLs are not reachable.
Ran the crawl with this command (on Windows):
{quote}
sh.exe ./bin/nutch crawl seedurl -dir crawldev -solr http://localhost:8983/solr/
{quote}
The crawl completed successfully. A query in the Solr admin returned:
{code:xml}
<result name="response" numFound="320" start="0"></result>
{code}
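As a rough illustration of checking that count programmatically, the sketch below reads "numFound" out of a Solr XML response. It assumes the response contains a <result name="response"> element like the fragment above; the function name num_found is hypothetical and not part of Nutch or Solr.

```python
import xml.etree.ElementTree as ET


def num_found(response_xml):
    """Return the numFound attribute of the <result name="response"> element."""
    root = ET.fromstring(response_xml)
    # The element may be the root itself (as in the fragment above)
    # or nested inside a full response document.
    if root.tag == "result" and root.get("name") == "response":
        result = root
    else:
        result = root.find(".//result[@name='response']")
    if result is None or result.get("numFound") is None:
        raise ValueError("no <result name='response'> element with numFound")
    return int(result.get("numFound"))


if __name__ == "__main__":
    fragment = '<result name="response" numFound="320" start="0"></result>'
    print(num_found(fragment))
```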
Searching "hadoop.log" for "ERROR" finds 3 results, caused by:
{code}
java.net.ConnectException: Connection timed out: connect
{code}
Searching "hadoop.log" for "Exception" finds results like this:
{quote}
2011-11-02 00:39:01,821 INFO httpclient.HttpMethodDirector - I/O exception
(org.apache.commons.httpclient.NoHttpResponseException) caught when processing
request: The server www.jettycn.com failed to respond
2011-11-02 00:39:01,821 INFO httpclient.HttpMethodDirector - Retrying request
{quote}
So there is no exception related to your patch in "hadoop.log".
The patch works fine with branch-1.4 for me.
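The log checks described in this comment could be repeated with a small script along these lines. This is only a sketch: it counts lines containing "ERROR" or "Exception" using a case-sensitive substring match, and the helper names scan_log and scan_log_file are hypothetical, not part of Nutch.

```python
from collections import Counter


def scan_log(lines, keywords=("ERROR", "Exception")):
    """Count how many lines contain each keyword (case-sensitive substring)."""
    counts = Counter({keyword: 0 for keyword in keywords})
    for line in lines:
        for keyword in keywords:
            if keyword in line:
                counts[keyword] += 1
    return counts


def scan_log_file(path, keywords=("ERROR", "Exception")):
    """Apply scan_log to a log file on disk, e.g. a local copy of hadoop.log."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return scan_log(handle, keywords)


if __name__ == "__main__":
    # Lines copied from the log excerpt quoted in this comment.
    sample = [
        "2011-11-02 00:39:01,821 INFO httpclient.HttpMethodDirector - I/O exception",
        "(org.apache.commons.httpclient.NoHttpResponseException) caught when processing",
        "request: The server www.jettycn.com failed to respond",
        "2011-11-02 00:39:01,821 INFO httpclient.HttpMethodDirector - Retrying request",
    ]
    for keyword, count in scan_log(sample).items():
        print(keyword, count)
```

Because the match is case-sensitive, "I/O exception" is not counted under "Exception", while a class name such as NoHttpResponseException is.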
> remove LogUtil from trunk and nutch gora
> ----------------------------------------
>
> Key: NUTCH-1138
> URL: https://issues.apache.org/jira/browse/NUTCH-1138
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 1.4, nutchgora
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
> Fix For: nutchgora, 1.5
>
> Attachments: Document1.txt, NUTCH-1138-trunk-20111023.patch
>
>
> This should move towards the removal of the LogUtil class from both codebases
> as per comments in NUTCH-1078.