Hi there,

1)
I ran several test crawls to fetch pages from two
websites, with the fetch depth set to 10.

After checking the log files, I found that the number
of links actually fetched is very different for the
two sites.

On one site, which carries a lot of news, only the
first two depths ran properly, and only 5 links were
fetched. The actual number of links on that site is
far greater than that.

On the other site, the crawl ran through all 10 rounds
and fetched hundreds of links.

I wonder if anyone has had a similar experience.
Should I set up the configuration files in /conf/?

2)
Also, in the Nutch/conf/ directory, I found several
configuration files. So far I have only modified
crawl-urlfilter.txt to make it accept all URLs
(*.*).

Is that the right approach?
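
For concreteness, an accept-all rule in that file would
look roughly like the sketch below. This assumes the
file takes one regular expression per line, prefixed
with + to accept or - to reject, with the first
matching rule deciding; it is an illustration, not a
copy of the actual file.

```
# Illustrative only: accept every URL.
# "." is a regular expression matching any character.
+.
```

If rules of this kind are checked top to bottom, any
reject (-) line placed above this one would still take
effect before the accept-all rule is reached.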

I haven't touched the other conf files. Is there a
guideline on how to use these files?

Thanks,

Michael


