I think your regex doesn't allow anything beyond the home page. Try extending your domain pattern with .* at the end:

+^http://([a-z0-9]*\.)sina.com.cn/.*
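
If you want to test the two patterns outside Nutch first, here is a minimal sketch in Python. It is not Nutch code: the sample URLs and the helper name check are made up for illustration, and the leading "+" is dropped because it only marks an accept rule. I am not certain whether the filter does a prefix search or a whole-string match, so the sketch reports both.

```python
import re

# Patterns copied from this thread, without the leading '+' accept marker.
ORIGINAL = r"^http://([a-z0-9]*\.)sina.com.cn/"
EXTENDED = r"^http://([a-z0-9]*\.)sina.com.cn/.*"

# Hypothetical sample URLs, chosen only for illustration.
SAMPLE_URLS = [
    "http://www.sina.com.cn/",
    "http://news.sina.com.cn/china/index.html",
    "http://sina.com.cn/",
]


def check(pattern, url):
    """Return (prefix_search_hit, whole_string_hit) for one pattern and URL."""
    compiled = re.compile(pattern)
    return (
        compiled.search(url) is not None,     # pattern found starting at '^'
        compiled.fullmatch(url) is not None,  # pattern must cover the whole URL
    )


for url in SAMPLE_URLS:
    print(url, "original:", check(ORIGINAL, url), "extended:", check(EXTENDED, url))
```

Under prefix-search semantics both patterns accept the same URLs, so if the crawl still stops at the home page the cause is probably somewhere else. Under whole-string semantics only the extended pattern accepts pages below the home page. Note also that neither pattern accepts a host with no subdomain label, such as http://sina.com.cn/, because the group ([a-z0-9]*\.) is required exactly once.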

On 15.05.2011 11:05, "Bupo Jung" <[email protected]> wrote:
> Hi,
> I use Nutch to crawl a website: http://www.sina.com.cn
> The crawl process stops at depth 0 and only fetches the homepage of the
> website.
>
> My crawl-urlfilter.txt is:
> # accept hosts in MY.DOMAIN.NAME
> +^http://([a-z0-9]*\.)sina.com.cn/
>
> # skip everything else
> -.
>
> Does anybody have an idea?
>
> --
>
> Yizhong Zhuang
> Beijing University of Posts and Telecommunications
> Email: [email protected]
> My blog: www.mikkoo.info

