Aayush,

You can use the following. Just play around with the pattern.
<property>
  <name>keep.task.files.pattern</name>
  <value>.*_m_123456_0</value>
  <description>Keep all files from tasks whose task names match the given regular expression. Defaults to none.</description>
</property>

Raj

>________________________________
> From: aayush <aayushgupta...@gmail.com>
>To: common-user@hadoop.apache.org
>Sent: Tuesday, March 27, 2012 5:18 AM
>Subject: Re: Separating mapper intermediate files
>
>Thanks Harsh.
>
>I set the mapred.local.dir as you suggested. It creates 4 folders in it for
>jobtracker, tasktracker, tt_private etc. I could not see an attempt directory.
>Can you let me know exactly where to look in this directory structure?
>
>Furthermore, it seems that all the intermediate spill and map output are
>cleaned up when the mapper finishes. I want to see those intermediate files
>and don't want the cleanup of these files. How can I achieve it?
>
>Thanks a lot
>
>On Mar 27, 2012, at 1:16 AM, "Harsh J-2 [via Hadoop
>Common]"<ml-node+s472056n3860389...@n3.nabble.com> wrote:
>
>> Hello Aayush,
>>
>> Three things that'd help clear your confusion:
>> 1. dfs.data.dir controls where HDFS blocks are to be stored. Set this
>> to a partition1 path.
>> 2. mapred.local.dir controls where intermediate task data go to. Set
>> this to a partition2 path.
>>
>> > Furthermore, can someone also tell me how to save intermediate mapper
>> > files (spill outputs) and where are they saved.
>>
>> Intermediate outputs are handled by the framework itself (There is no
>> user/manual work involved), and are saved inside attempt directories
>> under mapred.local.dir.
>>
>> On Tue, Mar 27, 2012 at 4:46 AM, aayush <[hidden email]> wrote:
>> > I am a newbie to Hadoop and map reduce. I am running a single node hadoop
>> > setup. I have created 2 partitions on my HDD. I want the mapper
>> > intermediate files (i.e. the spill files and the mapper output) to be
>> > sent to a file system on Partition1 whereas everything else including
>> > HDFS should be run on partition2. I am struggling to find the
>> > appropriate parameters in the conf files. I understand that there is
>> > hadoop.tmp.dir and mapred.local.dir but am not sure how to use what. I
>> > would really appreciate if someone could tell me exactly which
>> > parameters to modify to achieve the goal.
>>
>> --
>> Harsh J
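
To tie the thread together, here is a rough sketch of how the three properties discussed above might be laid out. It assumes Hadoop 1.x property names and two hypothetical mount points, /mnt/partition1 and /mnt/partition2 (neither path appears in the thread). It follows the original goal of intermediate map data on partition 1 and HDFS blocks on partition 2; swap the paths if you want the reverse. The pattern value is also only an illustration, so adjust it to the task names you actually want to keep.

```xml
<!-- hdfs-site.xml fragment: where the DataNode stores HDFS blocks.
     /mnt/partition2/hdfs/data is a hypothetical path. -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/partition2/hdfs/data</value>
</property>

<!-- mapred-site.xml fragment: where spill files and map outputs are written.
     /mnt/partition1/mapred/local is a hypothetical path. -->
<property>
  <name>mapred.local.dir</name>
  <value>/mnt/partition1/mapred/local</value>
</property>

<!-- mapred-site.xml (or per-job configuration) fragment: keep the files of
     tasks whose names match this regular expression instead of cleaning
     them up. The value below is an assumed pattern meant to match any map
     task; narrow it to a single task ID when debugging one task. -->
<property>
  <name>keep.task.files.pattern</name>
  <value>.*_m_.*</value>
</property>
```

With settings along these lines, the retained files should show up in the per-attempt directories under mapred.local.dir once a matching task has run. Keeping files for every map task can fill the partition quickly, so a narrower pattern is usually the safer starting point.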