I am sorry, I am not sure I understand that, Markus. I am submitting Nutch using the following command:

    /opt/hadoop-2.3.0/bin/hadoop jar /opt/dfconfig/nutch/apache-nutch-1.8-SNAPSHOT.job org.apache.nutch.crawl.Crawl /urls -dir crawldirectory123 -depth 1000 topN 30000

However, I don't see crawldirectory123 being stored anywhere. I looked for it under the /tmp/hadoop-user folder, but no luck. Any idea where this is stored? Is it part of the namenode or datanode in YARN?

On Thu, Mar 13, 2014 at 4:48 AM, Markus Jelsma <[email protected]> wrote:

> Well, there is some crawl/ dir somewhere, is it not? Segments are in there.
>
> -----Original message-----
> > From: [email protected] <[email protected]>
> > Sent: Thursday 13th March 2014 5:34
> > To: [email protected]
> > Subject: Where is the crawl directory stored.
> >
> > Hello
> >
> > While running Nutch over Hadoop we pass in the crawl directory with the
> > dir parameter, which I assume stores the segment and other data structures.
> >
> > However, I could not find this directory in the hadoop tmp directory,
> > which is /tmp/hadoop-username. Could someone let me know where it is
> > stored? Or is there some renaming going on here?
> >
> > Thanks.
> >
> > Sent from my HTC

