Hi,
I have a page with <meta name="robots" content="noindex,nofollow" />. I know
that Nutch obeys this tag, because I can't find the content or the title in my
index, but I expected the document not to be present in the index at all. Why
does Nutch keep the document in my index with no title and no content?
I'm using the index-basic and index-more plugins, and I want to understand why
Nutch still fills in the url, date, boost, etc. when it did not do so for the
title and content.
I was thinking that if Nutch obeys nofollow and noindex, it would skip the
whole document!
Or maybe I misunderstood something. Can you please explain this behavior to me?
Best regards.