My Nutch 0.7.1 always tries to fetch the same page twice.

Today I checked the code from trunk and found that:
1) the HTML parser creates an Outlink[]
2) some code in core Nutch also tries to create an Outlink[] from the plain
(parsed) text

Didn't have much time to check...
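
To show what I suspect, here is a rough sketch (not Nutch code). It assumes both extraction passes end up feeding the same list of links, which I have not confirmed. The class and method names are made up for illustration, and plain URL strings stand in for Outlink objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names are Nutch classes or methods.
class OutlinkMergeSketch {

    // Merge URLs found by two extraction passes, keeping only the first
    // occurrence of each URL and preserving discovery order.
    static List<String> mergeUrls(List<String> fromHtmlParser,
                                  List<String> fromPlainText) {
        Set<String> seen = new LinkedHashSet<>(fromHtmlParser);
        seen.addAll(fromPlainText);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> fromHtmlParser = Arrays.asList(
            "http://example.com/a", "http://example.com/b");
        List<String> fromPlainText = Arrays.asList("http://example.com/a");

        // Naive concatenation keeps the duplicate, so /a appears twice.
        List<String> naive = new ArrayList<>(fromHtmlParser);
        naive.addAll(fromPlainText);
        System.out.println("naive:  " + naive.size() + " URLs");

        // De-duplicated merge keeps each URL once.
        List<String> merged = mergeUrls(fromHtmlParser, fromPlainText);
        System.out.println("merged: " + merged.size() + " URLs");
    }
}
```

If the two passes really are concatenated without a check like this, that could explain the double fetch, but it is only a guess until someone traces the actual code path.
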
Another strange behavior: the "anchor text" is sometimes huge, not the same
as what I see on the web page.
