Hi Jiaqi Tan and John Mendenhall,

I have run into the same problem. I have already tried fixing the log4j bug and
applying the change from

http://www.mail-archive.com/[EMAIL PROTECTED]/msg01991.html

but it still does not work. I am working on a cluster of 4 boxes running
Red Hat AS4.

I also checked hadoop.log and found nothing else of importance.

So I think the problem is in the Generator. I saw someone say it might be
caused by bad values for mapred.map.tasks and mapred.reduce.tasks. I have 4 PCs
and, following the descriptions of mapred.map.tasks and mapred.reduce.tasks, I
set them to 17 and 7. Is that right? Can someone help me?
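
For reference, here is a sketch of how those two settings would look as site overrides. The file name conf/hadoop-site.xml is an assumption based on the usual Nutch 0.9 layout; the values 17 and 7 are the ones mentioned above for 4 machines.

```xml
<!-- conf/hadoop-site.xml (assumed location of the site overrides) -->
<configuration>
  <property>
    <!-- default number of map tasks per job -->
    <name>mapred.map.tasks</name>
    <value>17</value>
  </property>
  <property>
    <!-- default number of reduce tasks per job -->
    <name>mapred.reduce.tasks</name>
    <value>7</value>
  </property>
</configuration>
```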

Thanks,

ivannie

>> 08/02/20 15:38:09 WARN crawl.Generator: Generator: 0 records selected
>> for fetching, exiting ...
>> 08/02/20 15:38:09 INFO crawl.Crawl: Stopping at depth=0 - no more URLs to 
>> fetch.
>> 08/02/20 15:38:09 WARN crawl.Crawl: No URLs to fetch - check your seed
>> list and URL filters.
>> 
>> I've inserted code at Generator.java:424, which says:
>> if (readers == null || readers.length == 0 || !readers[0].next(new
>> FloatWritable())) {
>>    LOG.warn("Generator: 0 records selected for fetching, exiting ...");
>> 
>> essentially at the decision point to see which of the conditions
>> triggered the 0 records selected message, and the "readers" object is
>> perfectly fine, but the SequenceFileOutputFormat is reporting there
>> are no values (I suppose of URL scores) at all to be retrieved,
>> causing the generator to stop.
>
>There is a problem with the Generator.  There was a change committed
>after 0.9 was released.  I implemented this change and it fixed my
>problem:
>
>http://www.mail-archive.com/[EMAIL PROTECTED]/msg01991.html
>
>JohnM
>
>-- 
>john mendenhall
>[EMAIL PROTECTED]
>surf utopia
>internet services
