Re: [PROPOSAL] Rename branch nutchgora into 2.x

2012-07-09 Thread Ferdy Galema
+1 Makes sense. On Mon, Jul 9, 2012 at 12:37 PM, Julien Nioche lists.digitalpeb...@gmail.com wrote: Guys, Now that we've released 2.0, wouldn't it be better to rename the 'nutchgora' branch into something like 'branch-2.x'? Any thoughts on this? Julien -- Open Source Solutions for

[jira] [Updated] (NUTCH-1306) Add option to not commit and clarify existing solr.commit.size

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema updated NUTCH-1306: Attachment: NUTCH-1306-trunk-v3.patch minor bug in prev. patch. uploaded v3 of trunk patch.

[jira] [Created] (NUTCH-1423) Remove unused fields in LanguageIndexingFilter

2012-07-09 Thread Ferdy Galema (JIRA)
Ferdy Galema created NUTCH-1423: --- Summary: Remove unused fields in LanguageIndexingFilter Key: NUTCH-1423 URL: https://issues.apache.org/jira/browse/NUTCH-1423 Project: Nutch Issue Type: Bug

[jira] [Updated] (NUTCH-1423) Remove unused fields in LanguageIndexingFilter

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema updated NUTCH-1423: Attachment: NUTCH-1423.patch Remove unused fields in LanguageIndexingFilter

[jira] [Updated] (NUTCH-1424) fix fetcher timelimit logging

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema updated NUTCH-1424: Attachment: NUTCH-1424.patch fix fetcher timelimit logging

[jira] [Closed] (NUTCH-1424) fix fetcher timelimit logging

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1424. --- Resolution: Fixed Committed. fix fetcher timelimit logging

[jira] [Created] (NUTCH-1425) DbUpdaterJob declares PREV_SIGNATURE on input twice

2012-07-09 Thread Ferdy Galema (JIRA)
Ferdy Galema created NUTCH-1425: --- Summary: DbUpdaterJob declares PREV_SIGNATURE on input twice Key: NUTCH-1425 URL: https://issues.apache.org/jira/browse/NUTCH-1425 Project: Nutch Issue Type:

[jira] [Closed] (NUTCH-1425) DbUpdaterJob declares PREV_SIGNATURE on input twice

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1425. --- Resolution: Fixed Committed. DbUpdaterJob declares PREV_SIGNATURE on input twice

[jira] [Updated] (NUTCH-1425) DbUpdaterJob declares PREV_SIGNATURE on input twice

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema updated NUTCH-1425: Attachment: NUTCH-1425.patch DbUpdaterJob declares PREV_SIGNATURE on input twice

[jira] [Commented] (NUTCH-1306) Add option to not commit and clarify existing solr.commit.size

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409366#comment-13409366 ] Ferdy Galema commented on NUTCH-1306: - Committed in trunk and nutchgora. Thanks anyone

[jira] [Resolved] (NUTCH-1025) Add option not to commit to Solr

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema resolved NUTCH-1025. - Resolution: Fixed Fixed per NUTCH-1306. Add option not to commit to Solr

[jira] [Updated] (NUTCH-1426) HostDb close() should close store instead of flush

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema updated NUTCH-1426: Attachment: NUTCH-1426.patch HostDb close() should close store instead of flush

[jira] [Closed] (NUTCH-1426) HostDb close() should close store instead of flush

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1426. --- Resolution: Fixed Fix Version/s: 2.1 Committed. HostDb close() should close

[jira] [Created] (NUTCH-1426) HostDb close() should close store instead of flush

2012-07-09 Thread Ferdy Galema (JIRA)
Ferdy Galema created NUTCH-1426: --- Summary: HostDb close() should close store instead of flush Key: NUTCH-1426 URL: https://issues.apache.org/jira/browse/NUTCH-1426 Project: Nutch Issue Type:

[jira] [Commented] (NUTCH-1411) nutchgora fetcher.store.content does not work

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409395#comment-13409395 ] Ferdy Galema commented on NUTCH-1411: - +1 Nice and clean implementation. Tested with

[jira] [Closed] (NUTCH-628) Host database to keep track of host-level information

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-628. -- Resolution: Duplicate This one should be closed as it is already implemented by various related

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2012-07-09 Thread Kevin Gao (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409465#comment-13409465 ] Kevin Gao commented on NUTCH-1414: -- Hi Markus: I found that in your build.xml file, we

[jira] [Updated] (NUTCH-1414) Date extraction parse filter

2012-07-09 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1414: - Attachment: NUTCH-1414-1.6-1-testdata.patch This patch contains the files for

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2012-07-09 Thread Kevin Gao (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409497#comment-13409497 ] Kevin Gao commented on NUTCH-1414: -- Yes, that is working. Thank you very much.

[jira] [Closed] (NUTCH-1411) nutchgora fetcher.store.content does not work

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1411. --- nutchgora fetcher.store.content does not work

[jira] [Resolved] (NUTCH-1411) nutchgora fetcher.store.content does not work

2012-07-09 Thread Ferdy Galema (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema resolved NUTCH-1411. - Resolution: Fixed Committed. Thanks Alexander for the patch. nutchgora

[Nutch Wiki] Update of FAQ by JulienNioche

2012-07-09 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on Nutch Wiki for change notification. The FAQ page has been changed by JulienNioche: http://wiki.apache.org/nutch/FAQ?action=diff&rev1=133&rev2=134 I have two XML files, nutch-default.xml and nutch-site.xml, why?

[jira] [Updated] (NUTCH-1360) Suport the storing of IP address connected to when web crawling

2012-07-09 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1360: - Fix Version/s: (was: nutchgora) 2.1 Suport the storing of IP address

[jira] [Updated] (NUTCH-1087) Deprecate crawl command and replace with example script

2012-07-09 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1087: - Attachment: NUTCH-1087.patch First version of the nutch crawl script. Please test and review

Re: [PROPOSAL] Rename branch nutchgora into 2.x

2012-07-09 Thread Mattmann, Chris A (388J)
+1 from me. Cheers, Chris On Jul 9, 2012, at 3:37 AM, Julien Nioche wrote: Guys, Now that we've released 2.0, wouldn't it be better to rename the 'nutchgora' branch into something like 'branch-2.x'? Any thoughts on this? Julien -- Open Source Solutions for Text Engineering