There was some confusion here: it turns out that they do turn it on. I added
Tu to this thread, and here is his response:

<quote>
We have speculative execution set to true by default.  With these settings,
about 5-7% of the tasks have speculative tasks launched; the other 90%
finish within the standard-deviation threshold, so speculative tasks are
never launched for them.  This ensures that a slow datanode does not
impact our job.

Camus is set up to consume 10 minutes' worth of offsets per topic per run.
If a topic has more than 10 minutes of offsets to be consumed, speculative
execution will also be active for that topic.  We haven't played much with
this setting.  However, if we ever get into a situation where we have to
catch up, it's good to have this setting disabled.

mapreduce.job.speculative.slownodethreshold     1.0
mapreduce.job.speculative.speculativecap        0.1

mapreduce.map.speculative       true
</quote>
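
For anyone who wants to flip this per run (for example, to disable
speculation during a catch-up), here is a minimal sketch that turns the
settings quoted above into Hadoop generic "-D key=value" options. It
assumes the job is launched in a way that honors generic options (e.g. via
ToolRunner); the helper name hadoop_d_args and the catchup override are
hypothetical, not part of Camus. Note that mapred.map.tasks.speculative.execution
from the original question is the older name for mapreduce.map.speculative.

```python
def hadoop_d_args(props):
    """Turn a dict of Hadoop properties into generic -D command-line options."""
    args = []
    for key, value in props.items():
        # Hadoop configuration files use lowercase true/false for booleans.
        if isinstance(value, bool):
            value = "true" if value else "false"
        args.extend(["-D", f"{key}={value}"])
    return args


# Settings as quoted in the message above (speculation on for map tasks).
speculation_on = {
    "mapreduce.map.speculative": True,
    "mapreduce.job.speculative.slownodethreshold": 1.0,
    "mapreduce.job.speculative.speculativecap": 0.1,
}

# Hypothetical override for a catch-up run: turn map speculation off.
catchup = {"mapreduce.map.speculative": False}

print(" ".join(hadoop_d_args(speculation_on)))
print(" ".join(hadoop_d_args(catchup)))
```

The resulting options would be placed on the hadoop command line ahead of
the job's own arguments.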

On Tue, Feb 03, 2015 at 05:14:02PM +0000, Aditya Auradkar wrote:
> Hi Bhavesh,
> 
> I just checked with one of the devs on the Camus team. We run the Camus job 
> with speculative execution disabled.
> 
> Aditya
> 
> ________________________________________
> From: Pradeep Gollakota [pradeep...@gmail.com]
> Sent: Monday, February 02, 2015 11:15 PM
> To: users@kafka.apache.org
> Subject: Re: Kafka ETL Camus Question
> 
> Hi Bhavesh,
> 
> At Lithium, we don't run Camus in our pipelines yet, though we plan to. But
> I just wanted to comment regarding speculative execution. We have it
> disabled at the cluster level and typically don't need it for most of our
> jobs. Especially with something like Camus, I don't see any need to run
> parallel copies of the same task.
> 
> On Mon, Feb 2, 2015 at 10:36 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com>
> wrote:
> 
> > Hi Jun,
> >
> > Thanks for the info.  I did not get an answer to my question there, so I
> > thought I'd try my luck here :)
> >
> > Thanks,
> >
> > Bhavesh
> >
> > On Mon, Feb 2, 2015 at 9:46 PM, Jun Rao <j...@confluent.io> wrote:
> >
> > > You can probably ask the Camus mailing list.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Jan 29, 2015 at 1:59 PM, Bhavesh Mistry <
> > > mistry.p.bhav...@gmail.com>
> > > wrote:
> > >
> > > > Hi Kafka Team or LinkedIn Team,
> > > >
> > > > I would like to know if you guys run the Camus ETL job with speculative
> > > > execution set to true or false.  Does it make sense to set this to false?
> > > > With it set to true, it creates additional load on the brokers for each
> > > > map task (a second map task is created to pull the same partition
> > > > twice).  Is there any advantage to having it on vs. off?
> > > >
> > > > mapred.map.tasks.speculative.execution
> > > >
> > > > Thanks,
> > > >
> > > > Bhavesh
> > > >
> > >
> >
