Hi everyone,

I am currently indexing a single website, say www.somesite.com. However, I do
not want to crawl URLs that match a certain pattern, say "nocrawl", e.g.
www.somesite.com/nocrawl.html or www.somesite.com/apage.php?nocrawl. I want
to discard any URL that contains the pattern 'nocrawl'. How do I do that? I
am using Nutch version 7.1, and I want to use the 'crawl' command for
crawling these pages.

Thank you for your support.

--
Keep on smiling
:) Kumar
