Try passing mapred.map.tasks = 0 or set a higher min-split size?

On Thu, Jul 12, 2012 at 10:36 AM, Yang <teddyyyy...@gmail.com> wrote:
> Thanks Harsh
>
> I see
>
> Then there seems to be a small problem with the Splitter / InputFormat.
>
> I'm just reading a 1-line text file through pig:
>
> A = LOAD 'myinput.txt' ;
>
> Supposedly it should generate at most 1 mapper, but in reality pig
> generated 3 mappers and basically fed empty input to 2 of them.
>
> Thanks
> Yang
>
> On Wed, Jul 11, 2012 at 10:00 PM, Harsh J <ha...@cloudera.com> wrote:
>> Yang,
>>
>> No, those three are individual task attempts.
>>
>> This is how you may generally dissect an attempt ID when reading it:
>>
>> attempt_201207111710_0024_m_000000_0
>>
>> 1. "attempt" - indicates it's an attempt ID you'll be reading
>> 2. "201207111710" - the JobTracker timestamp ID, indicating which
>>    instance of the JT ran this job
>> 3. "0024" - the job ID for which this was a task attempt
>> 4. "m" - indicates this is a mapper (reducers are "r")
>> 5. "000000" - the task ID of the mapper (000000 is the first mapper,
>>    000001 is the second, etc.)
>> 6. "0" - the attempt number for the task ID; 0 means it is the first
>>    attempt, 1 indicates the second attempt, etc.
>>
>> On Thu, Jul 12, 2012 at 9:16 AM, Yang <teddyyyy...@gmail.com> wrote:
>>> I set the following params to false in my pig script (0.10.0):
>>>
>>> SET mapred.map.tasks.speculative.execution false;
>>> SET mapred.reduce.tasks.speculative.execution false;
>>>
>>> I also verified in the job.xml in the JobTracker UI that they are
>>> indeed set correctly.
>>>
>>> When the job finished, the JobTracker UI showed that there was only
>>> one attempt for each task (in fact I have only 1 task, too).
>>>
>>> But when I went to the tasktracker node and looked under the
>>> /var/log/hadoop/userlogs/job_id_here/ dir, there are 3 attempt dirs:
>>>
>>> job_201207111710_0024 # ls
>>> attempt_201207111710_0024_m_000000_0  attempt_201207111710_0024_m_000001_0
>>> attempt_201207111710_0024_m_000002_0  job-acls.xml
>>>
>>> So 3 attempts were indeed fired??
>>>
>>> I have to get this controlled correctly because I'm trying to debug
>>> the mappers through Eclipse, but if more than 1 mapper process is
>>> fired, they all try to connect to the same debugger port, and the end
>>> result is that nobody is able to hook to the debugger.
>>>
>>> Thanks
>>> Yang
>>
>> --
>> Harsh J
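Harsh's field-by-field breakdown of the attempt ID above can be sketched as a small parser. This is a hypothetical illustration, not a Hadoop API; the field names are invented for readability:

```python
import re

# Fields of an attempt ID, per the breakdown above:
# attempt_<jt-timestamp>_<job-id>_<m|r>_<task-id>_<attempt-number>
ATTEMPT_RE = re.compile(
    r"attempt_(?P<jt_timestamp>\d+)_(?P<job_id>\d+)"
    r"_(?P<kind>[mr])_(?P<task_id>\d+)_(?P<attempt>\d+)$"
)

def parse_attempt_id(attempt_id):
    """Split a task attempt ID into its components; raise ValueError if malformed."""
    m = ATTEMPT_RE.match(attempt_id)
    if m is None:
        raise ValueError("not an attempt ID: %r" % attempt_id)
    fields = m.groupdict()
    fields["kind"] = "map" if fields["kind"] == "m" else "reduce"
    return fields

print(parse_attempt_id("attempt_201207111710_0024_m_000000_0"))
# {'jt_timestamp': '201207111710', 'job_id': '0024', 'kind': 'map',
#  'task_id': '000000', 'attempt': '0'}
```

Applied to the three directories Yang saw, this shows they share one job ID but have three different task IDs (000000-000002), each on its first attempt (trailing 0) - i.e. three distinct mappers, not three speculative attempts of one mapper.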
-- Harsh J
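Following the suggestion at the top of the thread, one way to keep a tiny input on a single mapper (so only one process tries to attach to the Eclipse debugger port) is to raise the minimum split size in the Pig script itself. A hedged sketch, using the same pre-YARN mapred.* property style as the rest of this thread; the 1 GB value is an arbitrary choice large enough to cover a 1-line file:

```
SET mapred.map.tasks.speculative.execution false;
SET mapred.reduce.tasks.speculative.execution false;
-- make splits at least 1 GB so a 1-line file cannot be
-- divided across multiple mappers
SET mapred.min.split.size 1073741824;

A = LOAD 'myinput.txt' ;
```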