I think you just need to turn speculative execution off for that job. The speculative execution I am referring to is when the JobTracker executes multiple instances of the same task across the cluster. It will do this when the cluster isn't busy and particular tasks are taking too long, to see if it can get the task completed more quickly on another node in the cluster.
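For reference, a sketch of what turning that off looks like in configuration (these are the classic pre-2.x property names from roughly this Hadoop era; check the names against your version, and note they can also be set per job on the job conf rather than cluster-wide):

```xml
<!-- mapred-site.xml (or a per-job configuration): disable speculative
     execution so only one attempt of each task is ever launched -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```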
My fear was that if a MapReduce job was running and a reduce task was being executed, speculative execution could cause two instances of that same reduce task to run -- to see which one finishes first. That could have a different impact depending on the use case and on how the timestamp for the data being ingested into HBase was generated.

Is this an issue, or just me pretending to know more than I do?

Thanks,
Ben

On Tue, Feb 28, 2012 at 10:06 AM, T Vinod Gupta <[email protected]> wrote:

> thanks, that helps!!
>
> On Tue, Feb 28, 2012 at 7:02 AM, Tim Robertson <[email protected]> wrote:
>
> > Hi,
> >
> > You can call context.write() multiple times in the Reduce(), to emit
> > more than one row.
> >
> > If you are creating the Puts in the Map function then you need to
> > setMapSpeculativeExecution(false) on the job conf, or else Hadoop
> > *might* spawn more than 1 attempt for a given task, meaning you'll get
> > duplicate data.
> >
> > HTH,
> > Tim
> >
> > On Tue, Feb 28, 2012 at 3:51 PM, T Vinod Gupta <[email protected]> wrote:
> >
> > > Ben,
> > > I didn't quite understand your concern. What speculative execution are
> > > you referring to?
> > >
> > > thanks
> > > vinod
> > >
> > > On Tue, Feb 28, 2012 at 6:45 AM, Ben Snively <[email protected]> wrote:
> > >
> > > > I think the short answer to that is yes, but the complex portion I
> > > > would be worried about is the following:
> > > >
> > > > Along with that, how do you manage speculative execution on the
> > > > reducer (or is that only for map tasks)?
> > > >
> > > > I've always ended up creating import files and bringing them into
> > > > HBase.
> > > >
> > > > Thanks,
> > > > Ben
> > > >
> > > > On Tue, Feb 28, 2012 at 9:34 AM, T Vinod Gupta <[email protected]> wrote:
> > > >
> > > > > while doing map reduce on hbase tables, is it possible to do
> > > > > multiple puts in the reducer? what i want is a way to be able to
> > > > > write multiple rows. if its not possible, then what are the other
> > > > > alternatives? i mean like creating a wider table in that case.
> > > > >
> > > > > thanks
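To tie the thread together, here is a minimal sketch of a reducer that emits more than one HBase row per reduce key by calling context.write() several times. It assumes the HBase 0.9x-era TableReducer API that was current for this thread; the table, column family, and qualifier names are hypothetical, and the row-key scheme is just an illustration:

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class MultiRowReducer
    extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {

  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int i = 0;
    for (IntWritable v : values) {
      // One Put per output row: context.write() may be called any number of
      // times, so a single reduce key can produce several HBase rows.
      byte[] row = Bytes.toBytes(key.toString() + ":" + i++);
      Put put = new Put(row);
      // "cf" and "val" are hypothetical family/qualifier names
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("val"),
              Bytes.toBytes(v.get()));
      context.write(new ImmutableBytesWritable(row), put);
    }
  }
}
```

On the job-setup side you would wire this in with TableMapReduceUtil.initTableReducerJob(...) and, per the discussion above, disable speculative execution (e.g. by setting the speculative-execution properties to false on the job conf) so a duplicate task attempt cannot write the same rows twice.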
