[jira] Created: (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed

2007-11-29, Andrea Spinelli (JIRA)
[PARSE-HTML plugin] Block certain parts of HTML code from being indexed
---
Key: NUTCH-585
URL: https://issues.apache.org/jira/browse/NUTCH-585
Project: Nutch
Issue Type:

Re: [jira] Created: (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed

2007-11-29, Matt Kangas
Hi Andrea, It sounds like your addition is useful only for people crawling sites under their control. It's not useful for Internet crawling. Is there a way to make this useful to a wider audience without overly complicating the code? I'm trying to think of specific scenarios where this