[ 
https://issues.apache.org/jira/browse/NUTCH-530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516357
 ] 

Doğacan Güney commented on NUTCH-530:
-------------------------------------

Ehm, I am not sure about this... After this, we call updateDbScore twice, 
right? Once to 'merge' linked's together, once to pass big-merged-linked to old 
datum. This changes ScoringFilter's semantics and may not work for 
ScoringFilters if one is, say, using the number of outlinks as a factor in 
scoring.

> Add a combiner to improve performance on updatedb
> -------------------------------------------------
>
>                 Key: NUTCH-530
>                 URL: https://issues.apache.org/jira/browse/NUTCH-530
>             Project: Nutch
>          Issue Type: Improvement
>         Environment: java 1.6
>            Reporter: Emmanuel Joke
>            Assignee: Emmanuel Joke
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-530.patch
>
>
> We have a lot of similar links with status "linked" generated at the ouput of 
> the map task when we try to update the crawldb based on the segment fetched.
> We can use a combiner to improve the performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers

Reply via email to