I don't think Hadoop 2 has been mentioned at all.

On 28 May 2014 09:43, Talat Uyarer <[email protected]> wrote:

> Hi Manikandan,
>
> Did you check your datastore after InjectorJob? Does it have any rows? Normally
> Gora does not support Hadoop 2.x. You should change Gora's dependencies. I
> will send my patch to GORA-144.
>
> Talat
> On 27 May 2014 06:19, "Manikandan Saravanan" <
> [email protected]> wrote:
>
> > Hi,
> >
> > I’m running Nutch 2 on a 2-node Hadoop cluster. I’m also running Solr 4 on
> > a separate machine accessible by private IP. I run the crawl command by
> > doing the following.
> >
> > bin/crawl urls/seed.txt TestCrawl <solrUrl> 2
> >
> > My problem is that no URLs are fetched, and thus nothing is indexed. When
> > I run stats, this is what I get:
> >
> > {db_stats-job_201405261214_0043=
> >         {
> >                 jobID=job_201405261214_0043,
> >                 jobName=db_stats,
> >                 counters=
> >                         {File Input Format Counters ={BYTES_READ=0},
> >                         Job Counters ={TOTAL_LAUNCHED_REDUCES=1,
> > SLOTS_MILLIS_MAPS=7990, FALLOW_SLOTS_MILLIS_REDUCES=0,
> > FALLOW_SLOTS_MILLIS_MAPS=0, TOTAL_LAUNCHED_MAPS=1,
> > SLOTS_MILLIS_REDUCES=9980},
> >                         Map-Reduce
> > Framework={MAP_OUTPUT_MATERIALIZED_BYTES=6, MAP_INPUT_RECORDS=0,
> > REDUCE_SHUFFLE_BYTES=6, SPILLED_RECORDS=0, MAP_OUTPUT_BYTES=0,
> > COMMITTED_HEAP_BYTES=218103808, CPU_MILLISECONDS=1950,
> > SPLIT_RAW_BYTES=1017, COMBINE_INPUT_RECORDS=0, REDUCE_INPUT_RECORDS=0,
> > REDUCE_INPUT_GROUPS=0, COMBINE_OUTPUT_RECORDS=0,
> > PHYSICAL_MEMORY_BYTES=296411136, REDUCE_OUTPUT_RECORDS=0,
> > VIRTUAL_MEMORY_BYTES=2251104256, MAP_OUTPUT_RECORDS=0},
> > FileSystemCounters={FILE_BYTES_READ=6, HDFS_BYTES_READ=1017,
> > FILE_BYTES_WRITTEN=156962, HDFS_BYTES_WRITTEN=86}, File Output Format
> > Counters ={BYTES_WRITTEN=86}}}}
> > 14/05/26 23:12:34 INFO crawl.WebTableReader: TOTAL urls:        0
> > 14/05/26 23:12:34 INFO crawl.WebTableReader: WebTable statistics: done
> >
> > What am I missing? My regex and normalise filters are allowing all URL
> > patterns. I’m trying to do a whole web crawl.
> >
> > --
> > Manikandan Saravanan
> > Architect - Technology
> > TheSocialPeople
>
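
The counters in the quoted stats output can also be checked mechanically. The following is only a minimal sketch in Python, not part of Nutch or Gora: the function names `parse_counters` and `webtable_is_empty` are hypothetical, and the sample string is a shortened excerpt of the dump quoted above. It assumes that MAP_INPUT_RECORDS=0 in the db_stats job means the job read no rows from the web table, which would be consistent with "TOTAL urls: 0" and with the suggestion to check the datastore after the inject step.

```python
import re

# Shortened excerpt of the db_stats counter dump quoted in this thread.
SAMPLE = """
Framework={MAP_OUTPUT_MATERIALIZED_BYTES=6, MAP_INPUT_RECORDS=0,
REDUCE_SHUFFLE_BYTES=6, SPILLED_RECORDS=0, MAP_OUTPUT_BYTES=0,
FileSystemCounters={FILE_BYTES_READ=6, HDFS_BYTES_READ=1017,
FILE_BYTES_WRITTEN=156962, HDFS_BYTES_WRITTEN=86}
"""


def parse_counters(dump: str) -> dict[str, int]:
    """Collect every UPPER_CASE_NAME=integer pair found in the dump text."""
    pairs = re.findall(r"\b([A-Z][A-Z_]*)=(\d+)\b", dump)
    return {name: int(value) for name, value in pairs}


def webtable_is_empty(counters: dict[str, int]) -> bool:
    """Hypothetical check: the stats job saw no input rows at all."""
    return counters.get("MAP_INPUT_RECORDS", 0) == 0


counters = parse_counters(SAMPLE)
if webtable_is_empty(counters):
    print("db_stats read no rows; check the datastore after the inject step")
```

If the check reports an empty table, the problem is likely upstream of fetching (nothing was stored by the inject step, or the stats job is reading a different table), so the URL filters are probably not the first thing to look at.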



-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
