[jira] Created: (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed
[PARSE-HTML plugin] Block certain parts of HTML code from being indexed
---
Key: NUTCH-585
URL: https://issues.apache.org/jira/browse/NUTCH-585
Project: Nutch
Issue Type:
Re: [jira] Created: (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed
Hi Andrea,

It sounds like your addition is useful only for people crawling sites under their control. It's not useful for Internet crawling. Is there a way to make this useful to a wider audience without overly complicating the code? I'm trying to think of specific scenarios where this