[ https://issues.apache.org/jira/browse/NUTCH-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512201 ]
Doğacan Güney commented on NUTCH-505:
-
Andrzej, on my tests, java.util.regex is faster on both Java 1.5 and Java 1
suffix-urlfilter.txt does not have a template
-
Key: NUTCH-513
URL: https://issues.apache.org/jira/browse/NUTCH-513
Project: Nutch
Issue Type: Improvement
Affects Versions: 1.0.0
Re
[ https://issues.apache.org/jira/browse/NUTCH-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki closed NUTCH-512.
---
Resolution: Invalid
Please use mailing lists for such questions.
> Search on date range
> ---
[ https://issues.apache.org/jira/browse/NUTCH-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki closed NUTCH-511.
---
Resolution: Invalid
Assignee: Andrzej Bialecki
Please use mailing lists for such questions.
[ https://issues.apache.org/jira/browse/NUTCH-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512139 ]
Andrzej Bialecki commented on NUTCH-505:
-
Please test Java 1.5 and Java 1.6 - IIRC there are some differences
[ https://issues.apache.org/jira/browse/NUTCH-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doğacan Güney updated NUTCH-505:
Attachment: NUTCH-505-v3.patch
filtered.txt
New and final version. I shuffled some c
[ https://issues.apache.org/jira/browse/NUTCH-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512074 ]
Doğacan Güney commented on NUTCH-505:
-
Thanks for the suggestion. Automaton really looks good, but using automaton
[ https://issues.apache.org/jira/browse/NUTCH-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512071 ]
Espen Amble Kolstad commented on NUTCH-505:
---
Automaton (http://www.brics.dk/automaton/), used in AutomatonUR
[ https://issues.apache.org/jira/browse/NUTCH-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doğacan Güney updated NUTCH-505:
Attachment: NUTCH-505-v2.patch
After my last commit, I read that Sun's java.util.regex implementatio
Search on date range
Key: NUTCH-512
URL: https://issues.apache.org/jira/browse/NUTCH-512
Project: Nutch
Issue Type: Wish
Affects Versions: 0.9.0
Reporter: anuradha
Hi,
I need to search on date range.
I
Recrawling
---
Key: NUTCH-511
URL: https://issues.apache.org/jira/browse/NUTCH-511
Project: Nutch
Issue Type: Wish
Affects Versions: 0.9.0
Reporter: anuradha
Hi,
First I have crawled one website.
I added one page
[ https://issues.apache.org/jira/browse/NUTCH-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512011 ]
Doğacan Güney commented on NUTCH-506:
-
For some reason, crawl_generate is not compressed, even though crawldb,
cr