Hi Jiaqi Tan and John Mendenhall,

I have run into the same problem. I have already tried fixing the log4j
bug and applying the change described at
http://www.mail-archive.com/[EMAIL PROTECTED]/msg01991.html
and it still does not work.

I am working on a cluster of 4 boxes running Red Hat AS4. I also checked
hadoop.log and found nothing else of importance, so I think the problem
is in the Generator.

I saw someone say it might be caused by bad values for mapred.map.tasks
and mapred.reduce.tasks. I have 4 PCs and, following the descriptions of
mapred.map.tasks and mapred.reduce.tasks, I set them to 17 and 7.
Is that right?

Can someone help me?

Thanks,
ivannie

>> 08/02/20 15:38:09 WARN crawl.Generator: Generator: 0 records selected for fetching, exiting ...
>> 08/02/20 15:38:09 INFO crawl.Crawl: Stopping at depth=0 - no more URLs to fetch.
>> 08/02/20 15:38:09 WARN crawl.Crawl: No URLs to fetch - check your seed list and URL filters.
>>
>> I've inserted code at Generator.java:424, which says:
>>   if (readers == null || readers.length == 0 || !readers[0].next(new FloatWritable())) {
>>     LOG.warn("Generator: 0 records selected for fetching, exiting ...");
>>
>> essentially at the decision point to see which of the conditions
>> triggered the 0 records selected message, and the "readers" object is
>> perfectly fine, but the SequenceFileOutputFormat is reporting there
>> are no values (I suppose of URL scores) at all to be retrieved,
>> causing the generator to stop.
>
> There is a problem with the Generator. There was a change committed
> after 0.9 was released. I implemented this change and it fixed my
> problem:
>
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg01991.html
>
> JohnM
>
> --
> john mendenhall
> [EMAIL PROTECTED]
> surf utopia
> internet services
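
For reference, here is a minimal sketch of how the two settings mentioned above might look in a site configuration file. The file location (conf/hadoop-site.xml) is an assumption, since the message does not say where the values were set. The values 17 and 7 are the ones reported above for the 4-machine cluster.

```xml
<!-- Assumed location: conf/hadoop-site.xml (not stated in the message) -->
<configuration>
  <property>
    <!-- number of map tasks, as reported in the message -->
    <name>mapred.map.tasks</name>
    <value>17</value>
  </property>
  <property>
    <!-- number of reduce tasks, as reported in the message -->
    <name>mapred.reduce.tasks</name>
    <value>7</value>
  </property>
</configuration>
```

Whether these values have anything to do with the "0 records selected" message is not established in this thread. The fragment is only meant to make the reported settings easy to check.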
