My Nutch 0.7.1 always tries to fetch the same page twice. Today I checked the code from trunk and found that 1) the HTML parser creates an Outlink[], and 2) some code in the Nutch core also tries to create an Outlink[] from the plain (parsed) text.
I didn't have much time to check further... Another strange behavior: the "anchor text" is sometimes huge, and not the same as what I see on the web page.
