Hi Manikandan,

Did you check your datastore after the inject job? Does it have any rows? Normally Gora does not support Hadoop 2.x, so you should change Gora's dependencies. I will send my patch to GORA-144.
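For the datastore check, something like the following should work. This is only a sketch assuming the default HBase backend and the crawl id `TestCrawl` from your command; the table name and the exact `readdb` options may differ in your setup, so treat them as assumptions:

```shell
# With the HBase store, Nutch 2.x creates one table per crawl id,
# named <crawlId>_webpage (assumption based on the TestCrawl id above).
echo "list" | hbase shell

# A non-zero row count means the inject job actually wrote seed URLs.
echo "count 'TestCrawl_webpage'" | hbase shell

# Alternatively, ask Nutch itself via WebTableReader
# (same tool that printed "TOTAL urls: 0" in your log):
bin/nutch readdb -stats -crawlId TestCrawl
```

If the table is empty, the problem is in the inject step (often the Gora/Hadoop 2.x incompatibility mentioned above), not in your regex or normalizer filters.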
Talat

On 27 May 2014 06:19, "Manikandan Saravanan" <[email protected]> wrote:
> Hi,
>
> I'm running Nutch 2 on a 2-node Hadoop cluster. I'm also running Solr 4 on
> a separate machine accessible by private IP. I run the crawl command as
> follows:
>
> bin/crawl urls/seed.txt TestCrawl <solrUrl> 2
>
> My problem is that no URLs are fetched, and thus nothing is indexed. When
> I run stats, this is what I get:
>
> {db_stats-job_201405261214_0043=
>   {jobID=job_201405261214_0043,
>    jobName=db_stats,
>    counters=
>     {File Input Format Counters={BYTES_READ=0},
>      Job Counters={TOTAL_LAUNCHED_REDUCES=1, SLOTS_MILLIS_MAPS=7990,
>        FALLOW_SLOTS_MILLIS_REDUCES=0, FALLOW_SLOTS_MILLIS_MAPS=0,
>        TOTAL_LAUNCHED_MAPS=1, SLOTS_MILLIS_REDUCES=9980},
>      Map-Reduce Framework={MAP_OUTPUT_MATERIALIZED_BYTES=6, MAP_INPUT_RECORDS=0,
>        REDUCE_SHUFFLE_BYTES=6, SPILLED_RECORDS=0, MAP_OUTPUT_BYTES=0,
>        COMMITTED_HEAP_BYTES=218103808, CPU_MILLISECONDS=1950,
>        SPLIT_RAW_BYTES=1017, COMBINE_INPUT_RECORDS=0, REDUCE_INPUT_RECORDS=0,
>        REDUCE_INPUT_GROUPS=0, COMBINE_OUTPUT_RECORDS=0,
>        PHYSICAL_MEMORY_BYTES=296411136, REDUCE_OUTPUT_RECORDS=0,
>        VIRTUAL_MEMORY_BYTES=2251104256, MAP_OUTPUT_RECORDS=0},
>      FileSystemCounters={FILE_BYTES_READ=6, HDFS_BYTES_READ=1017,
>        FILE_BYTES_WRITTEN=156962, HDFS_BYTES_WRITTEN=86},
>      File Output Format Counters={BYTES_WRITTEN=86}}}}
> 14/05/26 23:12:34 INFO crawl.WebTableReader: TOTAL urls: 0
> 14/05/26 23:12:34 INFO crawl.WebTableReader: WebTable statistics: done
>
> What am I missing? My regex and normalise filters are allowing all URL
> patterns. I'm trying to do a whole web crawl.
>
> --
> Manikandan Saravanan
> Architect - Technology
> TheSocialPeople

