On 3/3/06, Michael Ji <[EMAIL PROTECTED]> wrote:
> hi,
>
> I tried this, actually in my case, one site ends with
> .net and the other is .org
>
> so I modified it to
>
> +^http://([a-z0-9]*\.)*(abc.net|def.org)/

I guess '.' is a metacharacter in regexp (it matches any single character), so pls try

+^http://([a-z0-9]*\.)*(abc\.net|def\.org)/
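A minimal sketch of what escaping the dots changes, using Python's re module rather than Nutch's own URL filter. The leading '+' is crawl-urlfilter.txt syntax for "accept" and is not part of the regular expression, so it is dropped here. The URLs and the helper name `accepted` are hypothetical, chosen only for illustration.

```python
import re

# Pattern from the thread with unescaped dots: '.' matches any character.
LOOSE = re.compile(r"^http://([a-z0-9]*\.)*(abc.net|def.org)/")

# Suggested pattern with escaped dots: '\.' matches only a literal dot.
STRICT = re.compile(r"^http://([a-z0-9]*\.)*(abc\.net|def\.org)/")

# Hypothetical URLs, made up for this example.
URLS = [
    "http://www.abc.net/index.html",
    "http://def.org/",
    "http://abcxnet/page",
    "http://www.other.com/",
]


def accepted(pattern, urls):
    """Return the URLs whose beginning matches the given pattern."""
    return [u for u in urls if pattern.match(u)]


loose_hits = accepted(LOOSE, URLS)
strict_hits = accepted(STRICT, URLS)

print("loose: ", loose_hits)
print("strict:", strict_hits)
```

With the unescaped dots, a host such as abcxnet also passes; with the escaped dots only the two intended domains do. The difference is narrow, so it may not by itself explain a completely unrelated site being fetched.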
Good luck!

> and I run another testing, seems doesn't work, coz I
> saw a site other than abc and def is being fetched,
>
> any hints?
>
> thanks,
>
> Michael,
>
> --- sudhendra seshachala <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > Try the following pattern
> > +^http://([a-z0-9]*\.)*(abc|def).com/
> >
> > I was able to search couple of sites using similar
> > pattern.
> > If this is what you are asking ?
> >
> > Michael Ji <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I searched on the mail-post, but still have problem
> > to run my testing.
> >
> > Actually, I want my crawling is limited to two site
> > solely.
> >
> > such as, *.abc.com/*
> > and *.def.com/*
> >
> > so I put two line in crawl-urlfilter.txt as
> > +^http://([a-z0-9]*\.)*.abc.com/
> > +^http://([a-z0-9]*\.)*.def.com/
> >
> > But after running testing, the crawling is not
> > limited to the above two sites.
> >
> > From log, I found "not found ...urlfilter-prefix"
> >
> > I wonder if the failure is due to not include
> > crawl-urlfilter.txt in my configure xml or there is
> > syntax error for my previous statement.
> >
> > thanks,
> >
> > Michael
> >
> > Sudhi Seshachala
> > http://sudhilogs.blogspot.com/

--
Keep Discovering ... ...
http://www.jroller.com/page/jmars

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
