[
https://issues.apache.org/jira/browse/NUTCH-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584062#comment-13584062
]
lufeng commented on NUTCH-1373:
-------------------------------
Hi Lewis,
Do you mean that in 1.x we could move URLNormalizers from
Generator#Selector#reduce to Generator#Selector#map, as GeneratorJob does in
2.x? Maybe the original purpose of putting URLNormalizers in
Generator#Selector#reduce was to reduce the cost of URL normalization, but it
no longer seems to have any effect.
If we merge URLFilters and URLNormalizers as proposed in
[NUTCH-366|https://issues.apache.org/jira/browse/NUTCH-366], this problem will
also be solved.
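To make the idea concrete, here is a rough sketch of normalizing and then
filtering each URL in the map step, so that everything downstream only sees
cleaned URLs. It is plain Java and does not use the Nutch classes: NORMALIZER,
FILTER and mapPhase are hypothetical stand-ins for URLNormalizers, URLFilters
and Generator#Selector#map, and the rules inside them are made up for the
example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

class MapSideUrlCleanup {

    // Hypothetical normalizer (NOT the Nutch URLNormalizers API):
    // trims whitespace and drops the fragment part of the URL.
    static final UnaryOperator<String> NORMALIZER = url -> {
        String u = url.trim();
        int hash = u.indexOf('#');
        return hash >= 0 ? u.substring(0, hash) : u;
    };

    // Hypothetical filter (NOT the Nutch URLFilters API):
    // accepts only http and https URLs.
    static final Predicate<String> FILTER =
        url -> url.startsWith("http://") || url.startsWith("https://");

    // Stand-in for the map step: normalize first, then filter the
    // normalized form, and emit only the URLs that pass.
    static List<String> mapPhase(List<String> urls) {
        List<String> emitted = new ArrayList<>();
        for (String url : urls) {
            String normalized = NORMALIZER.apply(url);
            if (FILTER.test(normalized)) {
                emitted.add(normalized);
            }
        }
        return emitted;
    }
}
```

With both steps in one place, the reduce step would not need to normalize at
all, and the order (normalize, then filter) is the same for every URL.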
> Implement consistent execution of normalising and filtering in Generator
> ------------------------------------------------------------------------
>
> Key: NUTCH-1373
> URL: https://issues.apache.org/jira/browse/NUTCH-1373
> Project: Nutch
> Issue Type: Improvement
> Components: generator
> Affects Versions: 1.4
> Reporter: Lewis John McGibbney
> Priority: Minor
> Fix For: 1.7
>
>
> As per discussion here [0] this issue should address the inconsistencies we
> see in the scheduled execution of normalising and filtering between Nutchgora
> Generator Mapper and trunk Generator mapper/reducer.
> Hopefully we can come to some consensus as to the best approach across both
> dists.
> [0] http://www.mail-archive.com/user%40nutch.apache.org/msg06360.html