[jira] Closed: (NUTCH-651) Remove bin/{start|stop}-balancer.sh from svn tracking

2008-09-22 Thread Doğacan Güney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doğacan Güney closed NUTCH-651. --- Resolution: Fixed Fix Version/s: 1.0.0 Files removed as of rev. 697781. Remove

[jira] Commented: (NUTCH-120) one bad link on a page kills parsing

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633278#action_12633278 ] Andrzej Bialecki commented on NUTCH-120: - This has been fixed as a part of another

[jira] Closed: (NUTCH-153) TextParser is only supposed to parse plain text, but if given postscript, it can take hours and then fail

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-153. --- Resolution: Fixed Fix Version/s: 1.0.0 TextParser is only supposed to parse plain

[jira] Commented: (NUTCH-153) TextParser is only supposed to parse plain text, but if given postscript, it can take hours and then fail

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633281#action_12633281 ] Andrzej Bialecki commented on NUTCH-153: - The timeout support has been added to

[jira] Commented: (NUTCH-155) Remove web gui from the distribution to contrib and use OpenSearch Servlet

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633286#action_12633286 ] Andrzej Bialecki commented on NUTCH-155: - This has been discussed on the mailing

[jira] Closed: (NUTCH-155) Remove web gui from the distribution to contrib and use OpenSearch Servlet

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-155. --- Resolution: Won't Fix Fix Version/s: 1.0.0 Remove web gui from the distribution to

[jira] Closed: (NUTCH-255) Regular Expression for RegexUrlNormalizer to remove jsessionid

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-255. --- Resolution: Duplicate Regular Expression for RegexUrlNormalizer to remove jsessionid

[jira] Commented: (NUTCH-255) Regular Expression for RegexUrlNormalizer to remove jsessionid

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633288#action_12633288 ] Andrzej Bialecki commented on NUTCH-255: - Duplicate of NUTCH-279 . Regular

[jira] Closed: (NUTCH-330) command line tool to search a Lucene index

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-330. --- Resolution: Won't Fix Fix Version/s: 1.0.0 command line tool to search a Lucene index

[jira] Updated: (NUTCH-355) The title of query result could like the summary have the highlight??

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated NUTCH-355: Priority: Minor (was: Major) Affects Version/s: 1.0.0 The title of query

[jira] Updated: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implment

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated NUTCH-427: Priority: Minor (was: Major) protocol-smb: plugin protocol implementing the CIFS/SMB

[jira] Commented: (NUTCH-451) Tool to recover partial fetcher output

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633370#action_12633370 ] Andrzej Bialecki commented on NUTCH-451: - I'm closing this issue, as the tool is

[jira] Closed: (NUTCH-530) Add a combiner to improve performance on updatedb

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-530. --- Resolution: Won't Fix Add a combiner to improve performance on updatedb

[jira] Closed: (NUTCH-524) Generate Problem with Single Node

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-524. --- Resolution: Won't Fix Generate Problem with Single Node -

[jira] Commented: (NUTCH-524) Generate Problem with Single Node

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633373#action_12633373 ] Andrzej Bialecki commented on NUTCH-524: - Closing this issue because the requested

[jira] Commented: (NUTCH-582) Add missing type parameters

2008-09-22 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633374#action_12633374 ] Andrzej Bialecki commented on NUTCH-582: - I believe this has been addressed as a

[jira] Commented: (NUTCH-637) Add method to nutch and tika system(Code written)

2008-09-22 Thread Doğacan Güney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633376#action_12633376 ] Doğacan Güney commented on NUTCH-637: - Why do you need this method? (Content-type is

[jira] Commented: (NUTCH-653) Upgrade to hadoop 0.18

2008-09-22 Thread Doğacan Güney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633500#action_12633500 ] Doğacan Güney commented on NUTCH-653: - Does anyone object to upgrading? I want to go

Re: [jira] Commented: (NUTCH-653) Upgrade to hadoop 0.18

2008-09-22 Thread Dennis Kubes
+1 to commit this. Should provide some performance improvements with hadoop as well. Dennis Doğacan Güney (JIRA) wrote: [ https://issues.apache.org/jira/browse/NUTCH-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633500#action_12633500 ]

[jira] Commented: (NUTCH-375) Link to 0.8.x apidocs broken on website

2008-09-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633614#action_12633614 ] Hudson commented on NUTCH-375: -- Integrated in Nutch-trunk #580 (See

[jira] Commented: (NUTCH-651) Remove bin/{start|stop}-balancer.sh from svn tracking

2008-09-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633616#action_12633616 ] Hudson commented on NUTCH-651: -- Integrated in Nutch-trunk #580 (See

[jira] Commented: (NUTCH-633) ParseSegment no longer allow reparsing

2008-09-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633615#action_12633615 ] Hudson commented on NUTCH-633: -- Integrated in Nutch-trunk #580 (See