Hi Julien,

Thanks for getting back to me on this one.

On Mon, May 21, 2012 at 8:32 PM, Julien Nioche <[email protected]> wrote:
> We could normalise before filtering in the mapper indeed.

OK, I'll open a ticket, as it seems to be OK in Nutchgora but slightly
confusing in trunk... I need to have a right good look into it and will
annotate an issue ASAP.

> Whether this is
> accidental or on purpose is not clear. Please open a JIRA for this.

> On a different subject, do you think you could take care of doing the RC2
> for trunk? I saw that you did some work on it and I assume that Chris is too
> busy to do it

Yes, I understand that. I have no issues with running the RC but would like
to wait for Chris to chime in before I fire ahead. Does this sound
reasonable?

Thank you

Lewis

>
> Thanks
>
> Julien
>
>
>> When working on some patches for both trunk and Nutchgora branch I
>> ended up doing some code analysis of the generator mappers [0] & [1]
>> respectively. With specific reference to the code blocks in trunk
>> (lines 175 - 185) and Nutchgora branch (lines 57 - 74) where in trunk
>> we initially check if filter is true whereas in Nutchgora we check
>> whether normalize is true, then check whether filter is true before
>> proceeding to catch any nasties... it seems to me that there may be a
>> bug in trunk but I am not sure and would like someone to comment.
>>
>> Thanks
>>
>> Lewis
>>
>> [0]
>> https://svn.apache.org/viewvc/nutch/trunk/src/java/org/apache/nutch/crawl/Generator.java?view=markup
>> [1]
>> https://svn.apache.org/viewvc/nutch/branches/nutchgora/src/java/org/apache/nutch/crawl/GeneratorMapper.java?view=markup
>>
>> --
>> Lewis
>>
>
>
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble

--
Lewis

