[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-03-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1113: --- Attachment: NUTCH-1113-trunk-junit-fail.patch Also fixed a second problem in the JUnit test:

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-03-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1113: --- Attachment: (was: NUTCH-1113-trunk-junit-fail.patch) Merging segments causes URLs to

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-03-01 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1113: --- Attachment: NUTCH-1113-trunk-junit-fail.patch Merging segments causes URLs to vanish from

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-02-28 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Attachment: NUTCH-1113-trunk-junit-final.patch Final patch including the stuff mentioned by

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-02-28 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Fix Version/s: (was: 1.9) 1.8 Merging segments causes URLs to vanish

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-02-21 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Attachment: NUTCH-1113-trunk.patch Includes STATUS_FETCH_NOTMODIFIED in the check. But are you

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-01-22 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Attachment: NUTCH-1113-junit.patch Attached patch seems to completely fix the issue, finally! *

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-01-10 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Attachment: NUTCH-1113-junit.patch Slightly updated patch. I have now merged and indexed a large

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-01-10 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1113: --- Attachment: NUTCH-1113-junit.patch * extended Junit test to fail if both linked and fetch

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-01-09 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Attachment: NUTCH-1113-junit.patch Alright, manual testing did not go very well and it takes

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-01-09 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Priority: Blocker (was: Major) Merging segments causes URLs to vanish from crawldb/index?

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-01-09 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Attachment: NUTCH-1113-junit.patch New patch that actually works for Apache Nutch current trunk.

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-01-09 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Attachment: NUTCH-1113-junit.patch New patch! Previous patch had an error in the checks. With

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2014-01-08 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Attachment: NUTCH-1113-trunk.patch Patch for trunk with Edward's fix. That fix at least solves a

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2012-04-03 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Fix Version/s: (was: 1.5) 1.6 20120304-push-1.6 Merging

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2011-09-28 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Fix Version/s: (was: 1.4) 1.5 Merging segments causes URLs to vanish

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2011-09-15 Thread Edward Drapkin (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Drapkin updated NUTCH-1113: -- Attachment: merged_segment_output.txt unmerged_segment_output.txt Output for

[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index?

2011-09-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1113: - Fix Version/s: 1.4 Thanks! It's marked for 1.4 now so it, at least, doesn't slip off the radar.