[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Lihachev updated NUTCH-739:
--
Attachment: NUTCH-739_remove_optimize_on_solr_dedup.patch
This simple patch decreases dedup time
Configuration option to override default language for fetched pages.
Key: NUTCH-740
URL: https://issues.apache.org/jira/browse/NUTCH-740
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-740:
-
Attachment: AcceptLanguage.patch
The patch which allows overriding of Accept-Language
Hi Kirby,
Do you think you could add this to Nutch's JIRA?
Please see http://wiki.apache.org/nutch/HowToContribute
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message -----
From: Kirby Bohling kirby.bohl...@gmail.com
To: nutch-dev@lucene.apache.org
[
https://issues.apache.org/jira/browse/NUTCH-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-740:
---
Priority: Minor (was: Major)
Affects Version/s: (was: 0.9.0)
Fix
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714264#action_12714264
]
Dmitry Lihachev commented on NUTCH-739:
---
In my recrawl script I have the following lines
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714277#action_12714277
]
Ken Krugler commented on NUTCH-739:
---
There's another approach that works well here, and
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714286#action_12714286
]
Otis Gospodnetic commented on NUTCH-739:
Yes, external optimize calls will work, I
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714287#action_12714287
]
Dmitry Lihachev commented on NUTCH-739:
---
with this approach we still have few optimize
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714288#action_12714288
]
Dmitry Lihachev commented on NUTCH-739:
---
am I wrong?
SolrDeleteDuplications too slow
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714290#action_12714290
]
Dmitry Lihachev commented on NUTCH-739:
---
I think that optimizing solr - is not hadoop