Hi there,

1)
I did several test runs fetching pages from two
websites. The fetch depth was set to 10.

After checking the log files, I found that the number
of links actually fetched is very different for the
two sites.

On one site, which has a lot of news, only the first
two depth levels ran properly, and only 5 links were
fetched. The actual number of links on that site is
far higher than that.

On the other site, fetching ran for all 10 rounds and
fetched hundreds of links.

I wonder if anyone has had a similar experience.
Should I change the configuration files in /conf/?

2)
Also, in the Nutch/conf/ directory I found several
configuration files. So far I have only modified
crawl-urlfilter.txt so that it accepts all URLs
(*.*).

Is that the right way to do it?
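
To make the question concrete, here is a small Python
sketch of how I assume the filter file is evaluated.
The assumptions are mine: each rule line is a '+' or
'-' followed by a regular expression, the first rule
that matches a URL decides, and a URL that matches no
rule is rejected. The names parse_rules and accepts are
made up for illustration and are not Nutch code.

```python
import re


def parse_rules(text):
    """Parse '+regex' / '-regex' lines, skipping blanks and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        # True means "accept", False means "reject".
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the decision of the first matching rule (assumed semantics)."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    # Assumption: a URL that matches no rule is rejected.
    return False


# A catch-all rule written as a regular expression.
accept_all = parse_rules("# accept everything\n+.\n")

# A rule set limited to one (hypothetical) domain, rejecting the rest.
one_site = parse_rules("+^http://([a-z0-9]*\\.)*example\\.com/\n-.\n")
```

If that reading is right, a catch-all would be written
as "+." and not as "*.*", since "*.*" is a file-name
wildcard and Python's re module rejects it as a regular
expression.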

I really haven't touched the other conf files. Is
there a guideline on how to use these files?

Thanks,

Michael



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
