Thank you Jason!

On Wed, Jul 1, 2009 at 5:26 PM, jason hadoop <[email protected]> wrote:
> The directory returned by getWorkOutputPath is a task-specific directory,
> to be used for files that should be part of the final output of the job.
>
> If you want to write to the task-local directory, use the local file
> system API, and paths relative to '.'. The parameter mapred.local.dir
> will contain the name of the local directory.
>
> On Wed, Jul 1, 2009 at 9:19 AM, bonito perdo <[email protected]> wrote:
>
> > Thank you for your immediate response.
> > In this case, what is the difference from the path obtained from
> > FileOutputFormat.getWorkOutputPath(job)? That path refers to HDFS...
> >
> > Thank you.
> >
> > On Wed, Jul 1, 2009 at 5:13 PM, jason hadoop <[email protected]> wrote:
> >
> > > The parameter mapred.local.dir controls the directory used by the
> > > task tracker for map/reduce jobs' local files.
> > >
> > > The dfs.data.dir parameter is for the datanode.
> > >
> > > On Wed, Jul 1, 2009 at 8:56 AM, bonito <[email protected]> wrote:
> > >
> > > > Hello,
> > > > I am a bit confused about the local directories where each
> > > > map/reduce task can store data.
> > > > According to what I have read, dfs.data.dir is the path on the
> > > > local file system in which the DataNode instance should store its
> > > > data. That is, since we have a number of individual nodes, this is
> > > > the place where each node can store its own data. Right? This data
> > > > may be part of a - let's say - file stored under the HDFS
> > > > namespace?
> > > > The value of this property for my configuration is:
> > > > /home/bon/my_hdfiles/temp_0.19.1/dfs/data
> > > > As far as I can understand, this path refers to the local "disk"
> > > > of each node.
> > > >
> > > > Moreover, calling FileOutputFormat.getWorkOutputPath(job) we obtain
> > > > the Path to the task's temporary output directory for the
> > > > map-reduce job. This path is totally different from the previous
> > > > one, which confuses me, since the temporary output of each task
> > > > should be written locally to the node's disk. The path I retrieve
> > > > is:
> > > >
> > > > hdfs://localhost:9000/user/bon/keys_fil.txt/_temporary/_attempt_200907011515_0009_m_000000_0
> > > >
> > > > Does this path refer to the local disk (node)? Or is it possible
> > > > that it may refer to another node in the cluster?
> > > >
> > > > Any clarification would be of great help.
> > > >
> > > > Thank you.
> > > > --
> > > > View this message in context:
> > > > http://www.nabble.com/local-directory-tp24292289p24292289.html
> > > > Sent from the Hadoop core-user mailing list archive at Nabble.com.
> > >
> > > --
> > > Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> > > http://www.amazon.com/dp/1430219424?tag=jewlerymall
> > > www.prohadoopbook.com a community for Hadoop Professionals
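For reference, both properties discussed above are set in the cluster's hadoop-site.xml (0.19-era configuration). The dfs.data.dir value below is the one from the original question; the mapred.local.dir value is only illustrative:

```xml
<!-- hadoop-site.xml; the mapred.local.dir value here is illustrative -->
<property>
  <name>dfs.data.dir</name>
  <value>/home/bon/my_hdfiles/temp_0.19.1/dfs/data</value>
  <!-- where the DataNode stores HDFS block data on this node -->
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/home/bon/my_hdfiles/temp_0.19.1/mapred/local</value>
  <!-- local scratch space the TaskTracker gives each map/reduce task -->
</property>
```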
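Jason's suggestion to use "the local file system API, and paths relative to '.'" can be sketched with plain Java I/O. Inside a running task, the working directory is a task-specific subdirectory under mapred.local.dir, so a relative path never touches HDFS (the file name below is made up for illustration):

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class ScratchFileDemo {
    public static void main(String[] args) throws IOException {
        // A path relative to '.' resolves against the task's working
        // directory, which the TaskTracker places under mapred.local.dir.
        File scratch = new File("scratch-demo.tmp");
        try (FileWriter w = new FileWriter(scratch)) {
            w.write("intermediate data that should not survive the task\n");
        }
        System.out.println("exists after write: " + scratch.exists());
        // Task-local files are never promoted to job output; clean up.
        scratch.delete();
        System.out.println("exists after delete: " + scratch.exists());
    }
}
```

By contrast, FileOutputFormat.getWorkOutputPath(job) returns an HDFS path (like the `_temporary/_attempt_...` path in the question); files written there are moved into the job's final output directory when the task commits, which is why the two locations serve different purposes.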
