[ https://issues.apache.org/jira/browse/NUTCH-530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516357 ]
Doğacan Güney commented on NUTCH-530:
-------------------------------------

Ehm, I am not sure about this... After this change, we call updateDbScore twice, right? Once to 'merge' the linked datums together, and once to pass the big merged linked datum to the old datum. This changes ScoringFilter's semantics and may not work for a ScoringFilter that uses, say, the number of outlinks as a factor in scoring.

> Add a combiner to improve performance on updatedb
> -------------------------------------------------
>
>                 Key: NUTCH-530
>                 URL: https://issues.apache.org/jira/browse/NUTCH-530
>             Project: Nutch
>          Issue Type: Improvement
>         Environment: java 1.6
>            Reporter: Emmanuel Joke
>            Assignee: Emmanuel Joke
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-530.patch
>
>
> We have a lot of similar links with status "linked" generated at the output of
> the map task when we try to update the crawldb based on the segment fetched.
> We can use a combiner to improve the performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
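
To make the concern in the comment concrete, here is a minimal sketch of how a count-sensitive scoring step can give a different result once a combiner pre-merges the "linked" values. It does not use the Nutch API: the class, the method names, and the averaging rule are all hypothetical stand-ins for a ScoringFilter whose updateDbScore depends on how many linked datums it is handed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CombinerSemanticsSketch {

    // Hypothetical stand-in for a count-sensitive scoring step:
    // the result depends on how many linked scores are passed in.
    static float mergeLinked(List<Float> linkedScores) {
        float sum = 0f;
        for (float s : linkedScores) {
            sum += s;
        }
        return sum / linkedScores.size();
    }

    // No combiner: the reducer sees every linked score for the URL at once.
    static float withoutCombiner(List<List<Float>> mapOutputs) {
        List<Float> all = new ArrayList<>();
        for (List<Float> part : mapOutputs) {
            all.addAll(part);
        }
        return mergeLinked(all);
    }

    // With a combiner: each map task's scores are merged first, so the
    // reducer only sees one pre-merged value per map task.
    static float withCombiner(List<List<Float>> mapOutputs) {
        List<Float> combined = new ArrayList<>();
        for (List<Float> part : mapOutputs) {
            combined.add(mergeLinked(part));
        }
        return mergeLinked(combined);
    }

    public static void main(String[] args) {
        // Two map tasks emit linked scores for the same URL.
        List<List<Float>> mapOutputs = Arrays.asList(
                Arrays.asList(1f, 2f, 3f),
                Arrays.asList(6f));
        System.out.println(withoutCombiner(mapOutputs)); // prints 3.0
        System.out.println(withCombiner(mapOutputs));    // prints 4.0
    }
}
```

A plain sum would survive the extra merge step because addition is associative; the result only drifts when the scoring rule looks at the number of values, which is the case the comment is worried about.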