[
https://issues.apache.org/jira/browse/NUTCH-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579321#action_12579321
]
Mark DeSpain commented on NUTCH-620:
Hi Andrzej,
Though I'm very interested in using
[
https://issues.apache.org/jira/browse/NUTCH-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579371#action_12579371
]
Andrzej Bialecki commented on NUTCH-615:
-
I'll apply the parts of the current
[
https://issues.apache.org/jira/browse/NUTCH-615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki closed NUTCH-615.
---
Resolution: Fixed
Assignee: Andrzej Bialecki
I applied the relevant parts of the
[
https://issues.apache.org/jira/browse/NUTCH-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki closed NUTCH-616.
---
Resolution: Fixed
I applied the latest patch with minor changes, in rev. 637861. Thank you!
[
https://issues.apache.org/jira/browse/NUTCH-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579438#action_12579438
]
Andrzej Bialecki commented on NUTCH-620:
-
It would be interesting to see the source
We continue to run on Fetcher1. What are the benefits of moving to
Fetcher2? Not opposed to it, just hadn't thought about it yet, as
Fetcher1 seemed to be working fine for us.
Dennis
Andrzej Bialecki wrote:
Hi all,
I'd like to remove the original Fetcher in favor of Fetcher2.
Maintaining
Dennis Kubes wrote:
We continue to run on Fetcher1.
Since you're running large crawls, could you run one of them with
Fetcher2 and comment on the results? Note that Fetcher2 needs a lot
fewer threads than Fetcher - usually running a large crawl with 100
threads is more than sufficient.
Andrzej Bialecki wrote:
Dennis Kubes wrote:
We continue to run on Fetcher1.
Since you're running large crawls, could you run one of them with
Fetcher2 and comment on the results? Note that Fetcher2 needs a lot
fewer threads than Fetcher - usually running a large crawl with 100
threads
[
https://issues.apache.org/jira/browse/NUTCH-220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki closed NUTCH-220.
---
Resolution: Fixed
Fix Version/s: 1.0.0
Assignee: Andrzej Bialecki
PDF Box
[
https://issues.apache.org/jira/browse/NUTCH-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579507#action_12579507
]
Andrzej Bialecki commented on NUTCH-243:
-
Duplicate of NUTCH-255.
Some
[
https://issues.apache.org/jira/browse/NUTCH-243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki closed NUTCH-243.
---
Resolution: Duplicate
Some meta-refresh urls get ignored due to matching regular expression
[
https://issues.apache.org/jira/browse/NUTCH-610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki closed NUTCH-610.
---
Resolution: Invalid
Can't Update or modify an index while web gui is running
[
https://issues.apache.org/jira/browse/NUTCH-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579707#action_12579707
]
Mark DeSpain commented on NUTCH-620:
Sure :) I'm a bit swamped at the moment, but I'll
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/393/changes
Changes:
[ab] Add missing license file.
[ab] NUTCH-223 Crawl.java uses Integer.MAX_VALUE instead of Long.MAX_VALUE.
[ab] NUTCH-220 Upgrade to PDFBox 0.7.3.
[ab] NUTCH-616 Reset Fetch Retry counter when fetch is successful.
[
https://issues.apache.org/jira/browse/NUTCH-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579731#action_12579731
]
Hudson commented on NUTCH-616:
--
Integrated in Nutch-trunk #393 (See