[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923720#comment-13923720
]
Markus Jelsma commented on NUTCH-1113:
--
Alright! This fixes the issue! I will commit
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923773#comment-13923773
]
Hudson commented on NUTCH-1113:
---
SUCCESS: Integrated in Nutch-trunk #2553 (See
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917087#comment-13917087
]
Sebastian Nagel commented on NUTCH-1113:
Hi [~markus17], junit tests
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915831#comment-13915831
]
Sebastian Nagel commented on NUTCH-1113:
Results of tests: The number of documents
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915917#comment-13915917
]
Julien Nioche commented on NUTCH-1113:
--
Well done, thanks guys!
Merging segments
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915969#comment-13915969
]
Hudson commented on NUTCH-1113:
---
FAILURE: Integrated in Nutch-trunk #2545 (See
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908265#comment-13908265
]
Sebastian Nagel commented on NUTCH-1113:
Hi [~markus17], your patch should work
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908520#comment-13908520
]
Markus Jelsma commented on NUTCH-1113:
--
I'll get back to this next Monday, I
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907160#comment-13907160
]
Sebastian Nagel commented on NUTCH-1113:
Hi [~markus17], tried test data from
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13880026#comment-13880026
]
Markus Jelsma commented on NUTCH-1113:
--
I have tried running long sequences with
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13880007#comment-13880007
]
Sebastian Nagel commented on NUTCH-1113:
Great! I'll try to verify it within the
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878441#comment-13878441
]
Markus Jelsma commented on NUTCH-1113:
--
The last chronological index went wrong, some
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876283#comment-13876283
]
Markus Jelsma commented on NUTCH-1113:
--
Yes, I reindexed them segment for segment.
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876304#comment-13876304
]
Markus Jelsma commented on NUTCH-1113:
--
OK, I got something! A record that wasn't
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876384#comment-13876384
]
Markus Jelsma commented on NUTCH-1113:
--
I got fewer documents indexed when ignoring
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876426#comment-13876426
]
Markus Jelsma commented on NUTCH-1113:
--
I have to reindex my control cluster segment
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876036#comment-13876036
]
Sebastian Nagel commented on NUTCH-1113:
If (re)indexing multiple segments also
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873324#comment-13873324
]
Markus Jelsma commented on NUTCH-1113:
--
I'm now indexing segment for segment to a
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873464#comment-13873464
]
Markus Jelsma commented on NUTCH-1113:
--
SOLR-4260 is blocking every test I do, if
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873533#comment-13873533
]
Markus Jelsma commented on NUTCH-1113:
--
NUTCH-1706 causes some stuff not to be
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870690#comment-13870690
]
Markus Jelsma commented on NUTCH-1113:
--
This works too, but if we ditch most LINKED
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869399#comment-13869399
]
Markus Jelsma commented on NUTCH-1113:
--
Ignoring LINKED completely means around line
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869439#comment-13869439
]
Markus Jelsma commented on NUTCH-1113:
--
Sebastian's patch does solve a few problems
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869441#comment-13869441
]
Markus Jelsma commented on NUTCH-1113:
--
Another record is also missing
{code}
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867945#comment-13867945
]
Markus Jelsma commented on NUTCH-1113:
--
I have two narrowed the problem down to a
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868024#comment-13868024
]
Sebastian Nagel commented on NUTCH-1113:
Hi [~markus17], I've run your unit tests
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866635#comment-13866635
]
Markus Jelsma commented on NUTCH-1113:
--
Alright, NUTCH-1113 isn't correct as well.
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866752#comment-13866752
]
Markus Jelsma commented on NUTCH-1113:
--
There are some issues with the checks, will
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865624#comment-13865624
]
Sebastian Nagel commented on NUTCH-1113:
Isn't this fixed with NUTCH-1520?
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865631#comment-13865631
]
Markus Jelsma commented on NUTCH-1113:
--
No. We don't really seem to be losing
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865632#comment-13865632
]
Markus Jelsma commented on NUTCH-1113:
--
I'll run some more tests tomorrow, at least I
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193115#comment-13193115
]
Sebastian Nagel commented on NUTCH-1113:
I had a look at the attached segment
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105972#comment-13105972
]
Lewis John McGibbney commented on NUTCH-1113:
-
We have a pretty meaty JUnit
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105661#comment-13105661
]
Markus Jelsma commented on NUTCH-1113:
--
Can you rule out the indexer and see what you
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105693#comment-13105693
]
Edward Drapkin commented on NUTCH-1113:
---
Using this command:
nutch readseg -get
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105704#comment-13105704
]
Edward Drapkin commented on NUTCH-1113:
---
I don't have any idea what's causing this
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105715#comment-13105715
]
Markus Jelsma commented on NUTCH-1113:
--
Investigation, debug report; same stuff
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105714#comment-13105714
]
Edward Drapkin commented on NUTCH-1113:
---
Upon further inspection, it appears that
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105758#comment-13105758
]
Edward Drapkin commented on NUTCH-1113:
---
The more I look into this, the more I'm