[ https://issues.apache.org/jira/browse/NUTCH-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665889#action_12665889 ]

Todd Lipcon commented on NUTCH-676:
-----------------------------------

Hmm, I can't seem to find the bug I thought I remembered. Maybe the bug I ran 
into was actually due to the hashCode/equals issue.

If a crawl seems to go OK, I'm all for this.

> MapWritable is written inefficiently and confusingly
> ----------------------------------------------------
>
>                 Key: NUTCH-676
>                 URL: https://issues.apache.org/jira/browse/NUTCH-676
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 0.9.0
>            Reporter: Todd Lipcon
>            Priority: Minor
>         Attachments: 
> 0001-NUTCH-676-Replace-MapWritable-implementation-with-t.patch, 
> NUTCH-676_v2.patch, NUTCH-676_v3.patch
>
>
> The MapWritable implementation in o.a.n.crawl is written confusingly: it 
> maintains its own internal linked list, which I think may have a bug somewhere 
> (I'm getting an NPE in certain cases in the code, though it's hard to track 
> down).
> Can anyone comment as to why MapWritable is written the way it is, rather 
> than just using a HashMap or a LinkedHashMap if consistent ordering is 
> important? I imagine that would improve performance.
> What about just using the Hadoop MapWritable? Obviously that would break some 
> backwards compatibility, but it may be a good idea at some point to reduce 
> confusion (I didn't realize that Nutch had its own impl until a few minutes 
> ago).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
