Behavior of NOINDEX,FOLLOW is not intuitive
-------------------------------------------

                 Key: NUTCH-966
                 URL: https://issues.apache.org/jira/browse/NUTCH-966
             Project: Nutch
          Issue Type: Improvement
          Components: indexer, parser
    Affects Versions: 1.2
            Reporter: Josh Pavel
            Priority: Minor


If a page sets NOINDEX,FOLLOW in its ROBOTS meta tag, Nutch still creates a
document that can be found in the index via meta tag or URL matching. Instead,
Nutch should rely on doc or parse metadata, and the HTML parser should store
nothing for such a page. (Thanks to Julien Nioche for helping me understand
the issue.)
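
To make the expected behavior concrete, here is a minimal sketch in Python.
It is not Nutch code: parse_robots_meta, process_page and RobotsDirectives
are hypothetical names used only for illustration. The assumption is that
NOINDEX should keep the page out of the index entirely, while FOLLOW should
still let its outlinks be crawled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobotsDirectives:
    """Hypothetical holder for the two flags a ROBOTS meta tag controls."""
    index: bool = True
    follow: bool = True


def parse_robots_meta(content: str) -> RobotsDirectives:
    """Parse the content attribute of a ROBOTS meta tag (case-insensitive)."""
    index, follow = True, True
    for token in content.lower().split(","):
        token = token.strip()
        if token == "noindex":
            index = False
        elif token == "nofollow":
            follow = False
        elif token == "none":
            # "none" is shorthand for "noindex,nofollow"
            index, follow = False, False
    return RobotsDirectives(index, follow)


def process_page(url, robots_content, outlinks):
    """Return (documents_to_index, links_to_follow) for one page.

    Assumed behavior: a NOINDEX page contributes no document at all,
    so it cannot be matched by URL or meta tag later on.
    """
    directives = parse_robots_meta(robots_content)
    documents = [url] if directives.index else []
    links = list(outlinks) if directives.follow else []
    return documents, links
```

Under these assumptions, a page marked NOINDEX,FOLLOW yields an empty
document list but keeps all of its outlinks for crawling.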

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
