RE: Crawl and Index specific links on specific page

Markus Jelsma Fri, 13 Dec 2013 04:16:06 -0800

I can't say. The links may point anywhere on the internet, and I have no clue 
about nutch.apache.org/downloads. 
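
To illustrate why a URL pattern alone cannot tie a link to the page it was 
found on, here is a minimal Python sketch (not Nutch code) of how ordered +/- 
rules in the style of regex-urlfilter.txt can be applied. It assumes that the 
first matching rule decides and that patterns are matched unanchored unless 
they say otherwise. The rule list and the accepts() helper are hypothetical 
and exist only for this example.

```python
import re

# Hypothetical rule set written in the style of regex-urlfilter.txt.
# A leading '+' accepts the URL, a leading '-' rejects it.
RULES = [
    r"+^http://([a-z0-9]*\.)*nutch\.apache\.org/downloads\.html$",  # the page itself
    r"+\.(txt|avi)$",                                               # any .txt or .avi URL
    r"-.",                                                          # reject everything else
]

def accepts(url, rules=RULES):
    """Return True if the first rule whose pattern matches the URL is a '+' rule."""
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        # Assumption: unanchored search, so '^' and '$' must be written explicitly.
        if re.search(pattern, url):
            return sign == "+"
    # Assumption: a URL that matches no rule at all is rejected.
    return False

if __name__ == "__main__":
    for candidate in (
        "http://nutch.apache.org/downloads.html",
        "http://example.com/files/video.avi",
        "http://nutch.apache.org/index.html",
    ):
        print(candidate, accepts(candidate))
```

Note that the filter only ever sees the candidate URL. The second rule admits a 
.txt or .avi link on any host, whichever page it was discovered on, so 
restricting by source page would need something other than a URL pattern.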
 
-----Original message-----
> From:anish_88 <[email protected]>
> Sent: Friday 13th December 2013 13:10
> To: [email protected]
> Subject: RE: Crawl and Index specific links on specific page
> 
> Thanks Markus for your reply.
> 
> Can you help me out with some regex-filter patterns?
> What would the pattern be if we want to crawl, say, .txt or .avi files linked
> from a page such as http://nutch.apache.org/downloads.html?
> 
> Would this work: +^http://([a-z0-9]*\.)*nutch.apache.org/downloads.html
> 
