Re: speculative execution before mappers finish

2012-10-12 Thread Harsh J
Think of it in partition terms. If you know that your map-splits X, Y and Z won't emit any key of partition P, then the Pth reducer can jump ahead and run without X, Y and Z completing their processing. Otherwise, a reducer can't run until all maps have completed, for fear of losing a few key
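
The partition reasoning above can be sketched in plain Java. This is only an illustration, not Hadoop code: the arithmetic mirrors what Hadoop's default HashPartitioner does (mask the sign bit of the key's hash, then take the modulus by the number of reducers), while the class name PartitionSketch, the method splitCanReach, and the sample keys are hypothetical. It also assumes every key a split will emit is known up front, which is rarely true in practice.

```java
import java.util.List;

final class PartitionSketch {
    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the modulus by the reducer count.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // True if at least one key of the split would be routed to the given partition.
    // Assumption: all keys the split can emit are known in advance.
    static boolean splitCanReach(List<String> splitKeys, int partition, int numReducers) {
        for (String key : splitKeys) {
            if (partitionFor(key, numReducers) == partition) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int numReducers = 2;
        List<String> splitX = List.of("a", "c"); // hypothetical keys of split X
        for (int p = 0; p < numReducers; p++) {
            System.out.println("partition " + p + " depends on split X: "
                    + splitCanReach(splitX, p, numReducers));
        }
    }
}
```

If splitCanReach returns false for partition P, the Pth reducer does not depend on that split, which is the situation the message describes.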

Re: can't disable speculative execution?

2012-07-11 Thread Yang
Thanks Harsh. I did set mapred.map.tasks = 1, but I still consistently see 3 mappers being invoked, and the order is always like this: _2_0 ***_0_0 ***_1_0. The 2_0 and 1_0 tasks are the ones that consume 0 data. This does look like a bug. You could try with a simp

Re: can't disable speculative execution?

2012-07-11 Thread Harsh J
Er, sorry, I meant mapred.map.tasks = 1
On Thu, Jul 12, 2012 at 10:44 AM, Harsh J wrote:
> Try passing mapred.map.tasks = 0 or set a higher min-split size?
>
> On Thu, Jul 12, 2012 at 10:36 AM, Yang wrote:
>> Thanks Harsh
>>
>> I see
>>
>> then there seems to be some small problems with the Split

Re: can't disable speculative execution?

2012-07-11 Thread Harsh J
Try passing mapred.map.tasks = 0 or set a higher min-split size?
On Thu, Jul 12, 2012 at 10:36 AM, Yang wrote:
> Thanks Harsh
>
> I see
>
> then there seems to be some small problems with the Splitter / InputFormat.
>
> I'm just reading a 1-line text file through pig:
>
> A = LOAD 'myinput.txt' ;

Re: can't disable speculative execution?

2012-07-11 Thread Yang
Yes, let me try that. Changing the max mapper slot actually requires changing the Hadoop config, since I just found that it's a "final" param.
On Wed, Jul 11, 2012 at 10:05 PM, Harsh J wrote:
> Your problem is more from the fact that you are running > 1 map slot
> per TT, and multiple mappers are

Re: can't disable speculative execution?

2012-07-11 Thread Yang
Thanks Harsh, I see. Then there seem to be some small problems with the Splitter / InputFormat. I'm just reading a 1-line text file through Pig: A = LOAD 'myinput.txt' ; Supposedly it should generate at most 1 mapper, but in reality it seems that Pig generated 3 mappers, and basically fed em

Re: can't disable speculative execution?

2012-07-11 Thread Harsh J
Your problem is more from the fact that you are running > 1 map slot per TT, and multiple mappers are getting run at the same time, all trying to bind to the same port. Limit your TT's max map tasks to 1 when you're relying on such techniques to debug, or use the LocalJobRunner/Apache MRUnit instea

Re: can't disable speculative execution?

2012-07-11 Thread Harsh J
Yang, No, those three are individual task attempts. This is how you may generally dissect an attempt ID when reading it: attempt_201207111710_0024_m_00_0
1. "attempt" - indicates it's an attempt ID you'll be reading
2. "201207111710" - The job tracker timestamp ID, indicating which instance
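
The dissection above can be sketched as a small parser. This is a minimal illustration in plain Java, not Hadoop's own TaskAttemptID class. The message is cut off after the second field, so the meaning of the remaining fields (job number, task type, task number, attempt number) is taken from the usual underscore-separated layout of Hadoop attempt IDs and is an assumption, not something the snippet states. The class and field names are hypothetical.

```java
final class AttemptIdSketch {
    final String jobTrackerId; // e.g. "201207111710"
    final int jobNumber;       // assumed: job sequence number within that JobTracker instance
    final char taskType;       // assumed: 'm' for map, 'r' for reduce
    final int taskNumber;      // assumed: index of the task within the job
    final int attemptNumber;   // assumed: 0 for the first attempt of a task

    private AttemptIdSketch(String jobTrackerId, int jobNumber, char taskType,
                            int taskNumber, int attemptNumber) {
        this.jobTrackerId = jobTrackerId;
        this.jobNumber = jobNumber;
        this.taskType = taskType;
        this.taskNumber = taskNumber;
        this.attemptNumber = attemptNumber;
    }

    // Splits an ID of the form attempt_<jt>_<job>_<type>_<task>_<attempt> on underscores.
    static AttemptIdSketch parse(String id) {
        String[] parts = id.split("_");
        if (parts.length != 6 || !parts[0].equals("attempt") || parts[3].length() != 1) {
            throw new IllegalArgumentException("Not an attempt ID: " + id);
        }
        return new AttemptIdSketch(
                parts[1],
                Integer.parseInt(parts[2]),
                parts[3].charAt(0),
                Integer.parseInt(parts[4]),
                Integer.parseInt(parts[5]));
    }

    public static void main(String[] args) {
        AttemptIdSketch a = parse("attempt_201207111710_0024_m_00_0");
        System.out.println("task " + a.taskNumber + ", attempt " + a.attemptNumber);
    }
}
```

Under this reading, attempts that differ in the task number are separate tasks, whereas a speculative duplicate would be expected to share the task number and differ only in the attempt number.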

Re: Speculative execution side effects of files created directly in HDFS

2012-05-17 Thread Harsh J
Yes, speculative execution will affect your tasks. Please read the FAQ to understand the use of OutputCommitters: http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F
On Thu, May 17, 2012 at 2:02 PM, Abhay Ratnaparkhi wrote:
>

Speculative execution side effects of files created directly in HDFS

2012-05-17 Thread Abhay Ratnaparkhi
I have multiple reducers running simultaneously. Each reducer is supposed to output data to a different file. I'm creating a file on HDFS using the fs.create() command in each reducer. Will speculative execution of tasks affect the output, as I'm not using any provided OutputFormat? ~Abhay
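
One common way to keep directly created files safe when two attempts of the same task may run at once is to have each attempt write under an attempt-specific temporary path and to promote the file to its final name only once. The sketch below illustrates that idea with java.nio on a local filesystem as a stand-in for HDFS; it is loosely modelled on what Hadoop's OutputCommitter machinery does and is not that implementation. The class name, method names, and directory layout are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class AttemptOutputSketch {
    // Each attempt writes into its own directory, so concurrent attempts never collide.
    static Path writeAttempt(Path outputDir, String partName, String attemptId, String data)
            throws IOException {
        Path attemptDir = outputDir.resolve("_temporary").resolve(attemptId);
        Files.createDirectories(attemptDir);
        Path tmpFile = attemptDir.resolve(partName);
        Files.write(tmpFile, data.getBytes(StandardCharsets.UTF_8));
        return tmpFile;
    }

    // Promotes the attempt's file to its final name. Returns false, and discards the
    // attempt's file, if another attempt has already been promoted.
    // Files.move without REPLACE_EXISTING fails when the target already exists.
    static boolean commit(Path tmpFile, Path outputDir) throws IOException {
        Path finalFile = outputDir.resolve(tmpFile.getFileName());
        try {
            Files.move(tmpFile, finalFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            Files.deleteIfExists(tmpFile); // the losing attempt cleans up after itself
            return false;
        }
    }
}
```

In a real job the framework, not the task, decides which attempt gets to commit; here the first caller of commit simply wins.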

Re: HBase bulk loader doing speculative execution when it set to false in mapred-site.xml

2012-04-02 Thread anil gupta
. Thanks, Anil
On Fri, Mar 30, 2012 at 9:54 PM, Harsh J wrote:
> Anil,
>
> You can also disable speculative execution on a per-job basis. See
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html#setMapSpeculativeExecution(boolean)
> (Which is

Re: apache-solr-3.3.0, corrupt indexes, and speculative execution

2011-10-30 Thread Meng Mao
vmc visible 20 Oct 29 09:54 segments.gen
>
> part-4/data/spellchecker:
> total 16
> -rw-r--r-- 1 vmc visible 32 Oct 29 09:54 segments_1
> -rw-r--r-- 1 vmc visible 20 Oct 29 09:54 segments.gen
>
>
> What might cause that attempt path to be lying around at the time of

apache-solr-3.3.0, corrupt indexes, and speculative execution

2011-10-29 Thread Meng Mao
20 Oct 29 09:54 segments.gen
What might cause that attempt path to be lying around at the time of completion? Has anyone seen anything like this? My gut says if we were able to disable speculative execution, we would probably see this go away. But that might be overreacting. In this job, of the 12

Re: speculative execution

2011-06-06 Thread Shrinivas Joshi
Hi, Just wanted to post an update on this issue. I didn't spend a lot of time to verify for sure what was going wrong but speculative execution definitely was not the cause of the problem here. I was seeing job failures even with speculative execution set to ON. By recreating HDFS enviro

Re: speculative execution

2011-06-02 Thread Shrinivas Joshi
Hi Matei, Thanks for your feedback. I am trying to verify/debug whether the failures are actually due to speculative execution. I will send an update once I have more info on this. -Shrinivas
On Thu, Jun 2, 2011 at 12:40 AM, Matei Zaharia wrote:
> Usually the number of speculatively executed ta

Re: speculative execution

2011-06-01 Thread Matei Zaharia
n't think OOM errors would be caused by not having speculation though; there must be another problem causing that.
Matei
On Jun 1, 2011, at 12:42 PM, Shrinivas Joshi wrote:
> To find out whether it had any positive performance impact, I am trying with
> turning OFF speculative executio

speculative execution

2011-06-01 Thread Shrinivas Joshi
To find out whether it had any positive performance impact, I am trying with turning OFF speculative execution. Surprisingly, the job starts to fail in reduce phase with OOM errors when I disable speculative execution for both map and reduce tasks. Has anybody noticed similar behavior? Is there a

Re: Speculative execution

2011-03-03 Thread Keith Wiley
On Mar 3, 2011, at 3:29 PM, Jacob R Rideout wrote:
> On Thu, Mar 3, 2011 at 2:04 PM, Keith Wiley wrote:
>> On Mar 3, 2011, at 2:51 AM, Steve Loughran wrote:
>>
>>> yes, but the problem is determining which one will fail. Ideally you should
>>> find the root cause, which is often some race con

Re: Speculative execution

2011-03-03 Thread Jacob R Rideout
On Thu, Mar 3, 2011 at 2:04 PM, Keith Wiley wrote:
> On Mar 3, 2011, at 2:51 AM, Steve Loughran wrote:
>
>> yes, but the problem is determining which one will fail. Ideally you should
>> find the root cause, which is often some race condition or hardware fault.
>> If it's the same server every t

Re: Speculative execution

2011-03-03 Thread Keith Wiley
On Mar 3, 2011, at 2:51 AM, Steve Loughran wrote:
> yes, but the problem is determining which one will fail. Ideally you should
> find the root cause, which is often some race condition or hardware fault.
> If it's the same server every time, turn it off.
> You can play with the specex paramete

Re: Speculative execution

2011-03-03 Thread Steve Loughran
On 02/03/11 21:01, Keith Wiley wrote: I realize that the intended purpose of speculative execution is to overcome individual slow tasks...and I have read that it explicitly is *not* intended to start copies of a task simultaneously and to then race them, but rather to start copies of tasks

Speculative execution

2011-03-02 Thread Keith Wiley
I realize that the intended purpose of speculative execution is to overcome individual slow tasks...and I have read that it explicitly is *not* intended to start copies of a task simultaneously and to then race them, but rather to start copies of tasks that "seem slow" after running f

Can't turn off speculative execution

2009-07-29 Thread Mark Kerzner
> > zip files. I have to play games so that the names of the zip files don't
> > collide - and I am not sure if this is stable.
> >
> > What am I missing in my understanding?
> >
> > Thank you,
> > Mark
>
> You should take a look a