Re: Where is the crawl directory stored.

S.L Thu, 13 Mar 2014 07:36:57 -0700

I am sorry, I am not sure I understand that, Markus. I am submitting Nutch
using the following.


/opt/hadoop-2.3.0/bin/hadoop jar
/opt/dfconfig/nutch/apache-nutch-1.8-SNAPSHOT.job
org.apache.nutch.crawl.Crawl /urls -dir crawldirectory123 -depth 1000 -topN
30000

However, I don't see crawldirectory123 being stored anywhere. I looked
for it under the /tmp/hadoop-user folder but had no luck. Any idea where it is
stored? Is it part of the namenode or datanode in YARN?
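
A minimal sketch for checking this, not a confirmed answer for this cluster.
It assumes the job wrote to HDFS, where a relative path such as
crawldirectory123 is resolved against the submitting user's home directory
(by default /user/<username>). If fs.defaultFS points at the local filesystem
instead, the path would be relative to the directory the job was launched
from. The variable names are illustrative.

```shell
# Sketch: where a relative -dir argument would land, under the assumptions above.
crawl_dir="crawldirectory123"

# Default HDFS home directory for the current user (assumed, not verified here).
hdfs_home="/user/$(whoami)"
candidate="${hdfs_home}/${crawl_dir}"
echo "Candidate HDFS location: ${candidate}"

# Only query the cluster if the hadoop client is actually on PATH.
if command -v hadoop >/dev/null 2>&1; then
  # A relative path is resolved by the client against the HDFS home directory.
  hadoop fs -ls "${crawl_dir}" || echo "Not found in HDFS; check the local working directory."
fi
```

If the listing shows subdirectories such as crawldb and segments, that is the
crawl directory Nutch created.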


On Thu, Mar 13, 2014 at 4:48 AM, Markus Jelsma
<[email protected]> wrote:

> Well, there is some crawl/ dir somewhere, is it not? Segments are in there.
>
> -----Original message-----
> > From:[email protected] <[email protected]>
> > Sent: Thursday 13th March 2014 5:34
> > To: [email protected]
> > Subject: Where is the crawl directory stored.
> >
> > Hello
> >
> > While running Nutch over Hadoop we pass in the crawl directory with the
> > dir parameter, which I assume stores the segments and other data structures.
> >
> > However, I could not find this directory in the hadoop tmp directory,
> > which is /tmp/hadoop-username. Could someone let me know where it is
> > stored? Or is there some renaming going on here?
> >
> > Thanks.
>
