Can you please open a JIRA for this?

Thanks

Julien

On 13 July 2011 08:01, Nutch User - 1 <[email protected]> wrote:

> On 07/12/2011 08:09 PM, lewis john mcgibbney wrote:
> > Well I think in order to address the problem directly it would be better to
> > focus on getting something working with a distribution of Nutch you are most
> > comfortable working with. For the time being I would avoid working with
> > trunk 2.0 unless you can justify otherwise. I would also either make a
> > decision between Nutch 1.2 and the current 1.3 release rather than focussing
> > on previous branches, which may or may not be stable depending on when you
> > last svn updated.
> >
> > If you can try working with a fresh 1.2 or 1.3 (preferably 1.3) then we
> > could maybe get to the bottom of this one as it would be great to find
> > whether there is scope to file a JIRA with this.
> >
> > Thank you
>
> Currently I'm working with the official 1.3 distribution of Nutch
> (apache-nutch-1.3-bin.zip). I have encountered this URL redirection and
> zero scores problem in both 1.2 and 1.3.
>
> I crawled ~12k pages with the quick fix I made, and none of the URLs in
> the CrawlDB had zero as their score. Before the fix crawling the same
> pages resulted in ~1.5k of the URLs having zero scores.
>

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com

