I see the same problem on FreeBSD 6.0 with JDK 1.4.2.
I think it was also reported some time ago by Rod Taylor.

Switch to protocol-http.
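
A minimal sketch of that switch, assuming the protocol plugin is selected through the plugin.includes property overridden in conf/nutch-site.xml. The other plugin names in the value are only an illustrative assumption and will differ per installation; the relevant part is listing protocol-http instead of protocol-httpclient:

```xml
<property>
  <name>plugin.includes</name>
  <!-- Assumed plugin list for illustration only: keep your own entries and
       replace protocol-httpclient with protocol-http. -->
  <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)</value>
</property>
```

Segments generated afterwards should then be fetched with the plain HTTP plugin, so the sum of errors and fetched pages can be compared against topN again.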

SG> Hi there,

SG> is there someone out there that can confirm a problem we discovered?

SG> We were wondering why not all pages of a generated segment were
SG> fetched. The strangest thing was that the sum of errors and
SG> successful pages was never the same as the topN we defined when
SG> generating a segment.
SG> First we discovered a problem with the segment size, but I cannot
SG> reproduce the problem anymore with the latest trunk code. :-/
SG> Very strange, since I don't think anything changed, but I
SG> was able to reproduce that the size of the segment is around 50%
SG> of the defined size (topN) on 2 different map reduce installations.

SG> Anyway, today we noticed that when fetching with http-client the sum of
SG> errors and fetched pages is much less than the size defined when
SG> generating the segment.
SG> Changing to protocol-http solves the problem.
SG> Has anyone else noticed this behavior?

SG> Thanks for comments.
SG> Stefan

Michael



_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers