Instead of the complete expression, just try http://yahoo.com

----- Original Message -----
From: Meryl Silverburgh <[EMAIL PROTECTED]>
Date: Thursday, April 19, 2007 8:34 am
Subject: Re: Crawl www.yahoo.com using nutch 0.9
To: [EMAIL PROTECTED]

> On 4/18/07, Tanmoy Kumar Mukherjee <[EMAIL PROTECTED]> wrote:
> > Did you change the regular expression in the url-filter.txt? That
> > could be the only problem.
> >
> >
> 
> Yes, I did. I have this in my crawl-urlfilter.txt:
> 
> # accept hosts in MY.DOMAIN.NAME
> +^http://([a-zA-Z0-9]*\.)*(cnn.com|yahoo.com)/
> 
> 
> 
> > Tanmoy
> >
> 
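
One quick way to see what the quoted filter line accepts is to test the
pattern against a few URLs outside of Nutch. The sketch below is only an
illustration: it uses Python's re module rather than the Java regex engine
Nutch runs on (the two should agree for this particular pattern), it assumes
the filter uses "find"-style matching, and the sample URLs and the accepted()
helper are made up for the example.

```python
import re

# Pattern copied from the quoted crawl-urlfilter.txt line, without the
# leading "+" that marks it as an accept rule.
ACCEPT = re.compile(r"^http://([a-zA-Z0-9]*\.)*(cnn.com|yahoo.com)/")


def accepted(url: str) -> bool:
    """Hypothetical helper: True if the accept pattern is found in the URL."""
    return ACCEPT.search(url) is not None


if __name__ == "__main__":
    # Sample URLs chosen for illustration only.
    samples = [
        "http://www.yahoo.com/",    # host plus trailing slash
        "http://yahoo.com/",        # no subdomain, trailing slash
        "http://www.yahoo.com",     # no trailing slash
        "http://example.org/",      # unrelated host
        "http://www.cnnXcom/",      # the unescaped "." matches any character
    ]
    for url in samples:
        print(url, "accepted" if accepted(url) else "rejected")
```

Under these assumptions the pattern only matches when a "/" follows the host
name, so a URL written without the trailing slash is not accepted by this
rule. The dots in cnn.com and yahoo.com are also unescaped, so they match any
character, which is looser than intended but should not block the crawl.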

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
