The issue seems to be caused by an H2 tag nested inside the anchor tag.

Once I remove the H2 tag, Nutch crawls that URL without any issues. Any idea
how to fix this?

Note: I don't have the rights to remove the H2 tags from all the pages that
Nutch is crawling.
    
Thanks



