Hey,

HtmlParseFilter plugins decide what gets extracted from the HTML during parsing (example: languageidentifier). Similarly, IndexingFilter plugins decide what gets indexed (example: the index-basic plugin).

So to ignore the anchor text of outbound links, you can implement a custom HtmlParseFilter plugin.
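
Here is a rough sketch of the core logic such a filter could use. It is not a complete Nutch plugin: the plugin wiring (plugin.xml, the HtmlParseFilter interface itself, building a new parse from the result) is left out, and the class and method names (OutboundAnchorStripper, visibleText, isOutbound, parse) are made up for illustration. It assumes the filter is handed the page's DOM, and that "outbound" means the link's host differs from the page's host. It uses only the JDK's org.w3c.dom classes, so you can try it without Nutch:

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class OutboundAnchorStripper {

    /** Collects the text under root, skipping <a> elements that link to another host. */
    public static String visibleText(Node root, String pageHost) {
        StringBuilder out = new StringBuilder();
        collect(root, pageHost, out);
        // Collapse the whitespace introduced while joining text nodes.
        return out.toString().trim().replaceAll("\\s+", " ");
    }

    private static void collect(Node node, String pageHost, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(node.getNodeValue()).append(' ');
            return;
        }
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "a".equalsIgnoreCase(node.getNodeName())
                && isOutbound(((Element) node).getAttribute("href"), pageHost)) {
            return; // drop the anchor text of an outbound link
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, pageHost, out);
        }
    }

    /** Assumption: a link is outbound when it names a host other than the page's host. */
    public static boolean isOutbound(String href, String pageHost) {
        try {
            String host = URI.create(href).getHost();
            // Relative links have no host and are treated as internal.
            return host != null && !host.equalsIgnoreCase(pageHost);
        } catch (IllegalArgumentException e) {
            return false; // malformed href: treat as internal
        }
    }

    /** Test helper only: parses a well-formed XHTML snippet into a DOM node. */
    public static Node parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("not well-formed XHTML", e);
        }
    }
}
```

Inside a real plugin you would call something like visibleText on the DOM that Nutch passes to the filter, using the host of the page being parsed, and store the result as the text to be indexed. How exactly to replace the parse text depends on your Nutch version, so check the HtmlParseFilter Javadoc for the release you are running.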






grif wrote:
I would like to be able to reduce (or eliminate altogether) Nutch using a
page's outbound anchor text when determining similarity to the user's query.
For instance, if a page has an outbound link to another site with the anchor
text "new jersey", and "new jersey" isn't mentioned anywhere else on the
page, I don't want it to be considered a valid response to a query for "new
jersey". I know you can adjust the weightings of other different properties,
but I did not see anything regarding outbound anchors.

On a separate but similar note, I'd like to consider INCLUDING meta keyword
and description tags. Has anyone done that before?

