Hi Nutch community.

We are trying to solve the following task with Nutch:
 A user gives us a site URL and the number of pages to grab, for example
http://www.cnn.com/ and 100 pages.
 We start Nutch with the settings depth = 2, topN = 100.
 As a result we receive only 16 pages.
 When we start Nutch with the settings depth = 2, topN = 1000, we still
receive only 17 pages.

 But the cnn.com home page has about 50 unique links.
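
 To make our expectation concrete, here is a minimal sketch in Python (not
Nutch code) of how we understand depth and topN to bound a crawl. It assumes
that each depth step is one generate/fetch round that fetches at most topN
URLs, and that every discovered link passes the URL filters and is fetched
successfully. The function name and the frontier list are hypothetical,
made up only for this illustration.

```python
def expected_pages(depth, top_n, new_links_per_round):
    """Upper bound on pages fetched under the simplified model above.

    new_links_per_round[i] is the assumed number of unique, unfetched URLs
    available at the start of round i; round 0 holds only the seed URLs.
    """
    total = 0
    for round_index in range(depth):
        if round_index >= len(new_links_per_round):
            break  # nothing left to fetch in later rounds
        # Each round fetches at most top_n of the URLs available to it.
        total += min(top_n, new_links_per_round[round_index])
    return total


# One seed (the home page), then roughly 50 unique links found on it.
frontier = [1, 50]
print(expected_pages(2, 100, frontier))   # 51
print(expected_pages(2, 1000, frontier))  # 51
```

 Under these assumptions we would expect about 51 pages for either topN
value, not 16 or 17.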

 If anyone can explain how we can make Nutch fetch a specified number of
pages from a site, we would be very grateful.

Thanks in advance.
-------------------------------------------------
Best wishes, Artyom Shvedchikov
