Hi Mark,

I am not sure whether there is a simpler way, but if you want something to be fetched and processed but not indexed, you can write an index filter plugin that returns null for the documents you don't want in the index. This is relatively easy to do; just use the index-basic filter as an example.
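
To illustrate the idea, here is a minimal sketch of the decision such a filter would make. It deliberately does not use the real Nutch classes: `Doc`, `isJavaScript` and `filter` are hypothetical stand-ins, and the rule for what counts as javascript (content type or a `.js` path) is only an assumption. In an actual plugin the same check would go into the filter method of your own indexing filter, modeled on index-basic.

```java
import java.util.Locale;

public class JsIndexFilterSketch {

    // Hypothetical stand-in for the document an indexing filter receives.
    // A real plugin would work with the Nutch document and parse objects instead.
    static final class Doc {
        final String url;
        final String contentType;

        Doc(String url, String contentType) {
            this.url = url;
            this.contentType = contentType;
        }
    }

    // Assumed rule: treat a document as javascript if its content type
    // mentions "javascript" or its URL path ends in ".js".
    static boolean isJavaScript(Doc doc) {
        String type = doc.contentType == null
                ? "" : doc.contentType.toLowerCase(Locale.ROOT);
        if (type.contains("javascript")) {
            return true;
        }
        String url = doc.url == null ? "" : doc.url;
        // Drop any query string or fragment before checking the extension.
        String path = url.split("[?#]", 2)[0].toLowerCase(Locale.ROOT);
        return path.endsWith(".js");
    }

    // Returning null signals "do not index this document"; the document has
    // already been fetched and parsed by this point, so its outlinks are kept.
    static Doc filter(Doc doc) {
        return isJavaScript(doc) ? null : doc;
    }

    public static void main(String[] args) {
        Doc page = new Doc("http://example.com/index.html", "text/html");
        Doc script = new Doc("http://example.com/app.js?v=2", "application/javascript");
        System.out.println("page indexed: " + (filter(page) != null));
        System.out.println("script indexed: " + (filter(script) != null));
    }
}
```

The check is kept separate from `filter` so that the rule can be adjusted (for example, to match only on content type) without touching the part that drops the document.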
Regards,
Arkadi

>-----Original Message-----
>From: Mark Stephenson [mailto:[email protected]]
>Sent: Thursday, September 30, 2010 9:29 AM
>To: [email protected]
>Subject: Excluding javascript files from indexing and search results.
>
>Hi,
>
>I'm wondering if there's a way to prevent nutch from indexing
>javascript files. I still would like to fetch and parse javascript
>files to find valuable outlinks, but I don't want them to show up in
>my search results. Is there a good way to do this?
>
>Thanks a lot,
>Mark

