Hi,

Each of our pages has a section that contains the header and menus for our website. I would like this content not to be indexed along with the main body.
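
To make the goal concrete, here is a rough sketch of the kind of stripping I have in mind, written against the plain JDK DOM API rather than any Nutch API. Everything in it is an assumption for illustration: the class name `BoilerplateStripper`, the element ids `header` and `menu`, and the sample page, which is well-formed XHTML only so that the JDK XML parser can read it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class BoilerplateStripper {
    // Hypothetical ids of the elements that hold the site header and menus.
    static final Set<String> SKIP_IDS = Set.of("header", "menu");

    /** Removes every element whose id is in SKIP_IDS, together with its subtree. */
    public static void strip(Node node) {
        // Copy the children first: a NodeList is live, so removing nodes
        // while iterating over it directly would skip entries.
        NodeList live = node.getChildNodes();
        List<Node> children = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            children.add(live.item(i));
        }
        for (Node child : children) {
            if (child instanceof Element
                    && SKIP_IDS.contains(((Element) child).getAttribute("id"))) {
                node.removeChild(child);
            } else {
                strip(child); // keep this node, but look inside it
            }
        }
    }

    /** Parses a well-formed XHTML string into a DOM document. */
    public static Document parse(String xhtml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<html><body><div id=\"header\">Site name</div>"
                + "<ul id=\"menu\"><li>Home</li></ul>"
                + "<p>Main body text.</p></body></html>");
        strip(doc);
        // Only the text that survived stripping would be indexed.
        System.out.println(doc.getDocumentElement().getTextContent()); // prints "Main body text."
    }
}
```

What I cannot work out is where an operation like this would have to run so that the text handed to the indexer is extracted from the stripped tree rather than from the full page.
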
Is there a way to:

a) specify sections to index
b) specify sections not to index
c) build a parse filter that strips out the content

Option c) seems the most appropriate, but by the time a parse filter runs, the content, the parse and the document fragment have all already been generated. I couldn't find any information on how to use these to selectively remove content that is not relevant to our indexing.

Any help would be greatly appreciated.

Thanks,
-a

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
