Andrzej,

I am trying to reconstruct a human-oriented site tree from anchor text. For
example, a page reached via the anchor text "Motherboards" links to many
pages for specific motherboards, and so on, so in many cases we can group
pages by the anchor text that leads to them.
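
A rough sketch of the grouping idea, in Python rather than Nutch code. The
function name build_tree, the (source, anchor, target) tuple layout, and the
example URLs are all made up for illustration; only links that stay on one
host are trusted for labels.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_tree(links):
    """Group pages under the anchor text that leads to their parent page.

    links: iterable of (source_url, anchor_text, target_url) tuples
    (a hypothetical layout, not a Nutch data structure).
    """
    label_of = {}                # page URL -> anchor text of an inbound link
    children = defaultdict(set)  # page URL -> same-host pages it links to
    for source, anchor, target in links:
        # Ignore links that cross hosts; their anchor text is not used.
        if urlparse(source).netloc != urlparse(target).netloc:
            continue
        label = " ".join(anchor.split())  # normalize whitespace
        if label:
            label_of.setdefault(target, label)  # first label seen wins
        children[source].add(target)
    # Keep only pages that have a label; show children by label when known.
    return {
        label_of[page]: sorted(label_of.get(kid, kid) for kid in kids)
        for page, kids in children.items()
        if page in label_of
    }


links = [
    ("http://shop.example/", "Motherboards", "http://shop.example/mb/"),
    ("http://shop.example/mb/", "Board A", "http://shop.example/mb/a"),
    ("http://shop.example/mb/", "Board B", "http://shop.example/mb/b"),
    ("http://other.example/", "Cheap stuff", "http://shop.example/mb/"),
]
tree = build_tree(links)
```

Here tree maps "Motherboards" to the two board pages, and the link from the
other host contributes nothing.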

Anchor text does indicate the true subject of a page, but only for links
within the same domain. BTW, some pages have
<META name="keywords" content="...">, and Nutch doesn't handle it.
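
For what it's worth, pulling the keywords out of such a tag is simple. A
minimal sketch using Python's standard html.parser module (not Nutch code;
KeywordsParser and extract_keywords are names I made up, and splitting the
content on commas is an assumption about how pages fill it in):

```python
from html.parser import HTMLParser


class KeywordsParser(HTMLParser):
    """Collects the comma-separated values of <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META ...> matches.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "keywords":
            return
        content = attr.get("content") or ""
        # Assumption: keywords are separated by commas.
        self.keywords.extend(k.strip() for k in content.split(",") if k.strip())


def extract_keywords(html):
    parser = KeywordsParser()
    parser.feed(html)
    return parser.keywords


page = '<html><head><META name="keywords" content="motherboards, ATX , "></head></html>'
found = extract_keywords(page)
```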

>Anyway, that's how the PageRank is _supposed_ to work - it should give a 
>higher score to sites that are highly linked, and also it should 
>strongly consider the anchor text as an indication of the page's true 
>subject ... ;-)






