Hi,

Every page on our website has a section containing the header and menus.
I would like to keep this content from being indexed along with the
main body.


Is there a way to:

a) specify sections to index
b) specify sections to not index
c) build a parse filter that strips out the content


Option c) seems like the most correct one, but by the time a parse filter
runs, the content, the parse and the document fragment have all already
been generated.


I couldn't find any information on how to use a parse filter at that point
to selectively remove content that is not relevant to our indexing.
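
To make c) concrete, here is a rough sketch of the kind of stripping I have
in mind. It uses only the standard org.w3c.dom API and does not touch any
Nutch classes, since wiring it into a parse filter is exactly the part I
can't work out. The class name BoilerplateStripper and the ids "header" and
"menu" are made up for illustration; our real markup may need a different
test.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not part of Nutch: removes unwanted sections from a DOM tree.
class BoilerplateStripper {

    // Assumed ids of the sections that should not be indexed.
    private static final Set<String> SKIP_IDS =
        new HashSet<>(Arrays.asList("header", "menu"));

    /** Removes every element below root whose id attribute is in SKIP_IDS. */
    static void strip(Node root) {
        // Snapshot the children first, so removing a node does not disturb the walk.
        List<Node> children = new ArrayList<>();
        for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
            children.add(c);
        }
        for (Node child : children) {
            if (child instanceof Element
                    && SKIP_IDS.contains(((Element) child).getAttribute("id"))) {
                root.removeChild(child);   // drop the whole section
            } else {
                strip(child);              // keep the node, check its descendants
            }
        }
    }

    /** Parses well-formed XHTML, strips it, and returns the remaining text. */
    static String stripAndGetText(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            strip(doc.getDocumentElement());
            return doc.getDocumentElement().getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }
}
```

The strip method accepts any Node, so in principle it could be handed the
document fragment directly. What I don't know is how to get the text that
ends up in the index to reflect the stripped tree.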

Any help would be greatly appreciated.

Thanks,
-a


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
