Oh! I was under the impression that the timestamps depend on when the region server hosting that cell receives the write.

Further, I have tried locking the row before committing the put. I get an UnknownRowLock exception when trying to unlock. What does rowLock() return when it fails to acquire a lock on the row?

Thanks
Karthik

On Tue, Jul 27, 2010 at 3:37 PM, Jean-Daniel Cryans <[email protected]> wrote:

> If 2 reducers output the same cell at the same time, it could have the
> same timestamp.
>
> J-D
>
> On Tue, Jul 27, 2010 at 3:34 PM, Karthik Kambatla
> <[email protected]> wrote:
> > When I use my own TableOutputFormat with setAutoFlush set to true, we
> > still observe losing one or two commits. There seems to be some other
> > issue here too.
> >
> > Thanks
> > Karthik
> >
> > On Tue, Jul 27, 2010 at 10:18 AM, Jean-Daniel Cryans
> > <[email protected]> wrote:
> >
> >> Since they are in the same batch, they could end up on the same
> >> timestamp and one will hide the other. When not batched, there's
> >> always a few milliseconds between the two Puts so it ends up ok. So
> >> for your case it seems like you need to use your own HTable without
> >> the write buffer since it's enforced in TOF, that means that the
> >> overall throughput will be lower.
> >>
> >> J-D
> >>
> >> On Tue, Jul 27, 2010 at 9:57 AM, Karthik Kambatla
> >> <[email protected]> wrote:
> >> > Hi Jean
> >> >
> >> > I looked at the TableMapReduceUtil code and I implemented my own
> >> > version of TableOutputFormat to find and isolate the problem.
> >> >
> >> > In TableOutputFormat, table.setAutoFlush(false) is called so the
> >> > writes can be batch-written. In our case, there are multiple puts on
> >> > the same row in the batch and only a few of them are getting
> >> > committed. I removed that line in MyOutputFormat, and most of the
> >> > commits go through.
> >> >
> >> > What is the expected behavior in the following case?
> >> >
> >> > ArrayList<Put> puts = new ArrayList<Put>();
> >> >
> >> > Put p1 = new Put(Bytes.toBytes(0));
> >> > p1.add(family, column, Bytes.toBytes(1));
> >> > puts.add(p1);
> >> >
> >> > Put p2 = new Put(Bytes.toBytes(0));
> >> > p2.add(family, column, Bytes.toBytes(2));
> >> > puts.add(p2);
> >> >
> >> > table.put(puts);
> >> >
> >> > Thanks
> >> > Karthik
> >> >
> >> > On Tue, Jul 27, 2010 at 9:25 AM, Jean-Daniel Cryans
> >> > <[email protected]> wrote:
> >> >
> >> >> TableOutputFormat is really just a wrapper around a HTable, see for
> >> >> yourself
> >> >>
> >> >> http://github.com/apache/hbase/blob/0.20/src/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java
> >> >>
> >> >> So there must be something else about the way you use it, or the way
> >> >> you use HTable directly. Showing bits of your code could be helpful.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Mon, Jul 26, 2010 at 11:17 PM, Karthik Kambatla
> >> >> <[email protected]> wrote:
> >> >> > Hi
> >> >> >
> >> >> > I am experiencing a few problems with TableMapReduceUtil, wherein
> >> >> > only some of the puts from the reduce are written to the output
> >> >> > table. If I explicitly write to the table from within reduce
> >> >> > without using TableMapReduceUtil, all the puts are written to the
> >> >> > table.
> >> >> >
> >> >> > In our application, multiple puts could be on the same row. In
> >> >> > case two puts are on the same key, our application requires both
> >> >> > puts to be committed as two different versions.
> >> >> >
> >> >> > Am I missing something here? Is there a cleaner way to approach
> >> >> > this issue?
> >> >> >
> >> >> > Thanks for the help.
> >> >> >
> >> >> > Karthik
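
As an illustration of the point made in the thread (two Puts on the same cell that end up with the same timestamp hide one another), here is a small self-contained Java model. It is only a sketch and does not use the HBase client at all: the names VersionDemo, Cell, MonotonicClock and writeBatch are hypothetical, and the "cell" is just a map from timestamp to value. It assumes that a cell keeps at most one value per timestamp, and shows that handing each write its own strictly increasing timestamp keeps both versions.

```java
import java.util.TreeMap;

// Toy model of one cell's version history. This is NOT HBase code;
// every name here is made up for illustration.
class VersionDemo {

    // Versions of a single cell, keyed by timestamp. Writing the same
    // timestamp twice replaces the earlier value, so only one survives.
    static final class Cell {
        private final TreeMap<Long, Integer> versions = new TreeMap<>();

        void put(long timestamp, int value) {
            versions.put(timestamp, value);
        }

        int versionCount() {
            return versions.size();
        }
    }

    // Hypothetical helper: returns strictly increasing timestamps even
    // when it is called several times within the same millisecond.
    static final class MonotonicClock {
        private long last = Long.MIN_VALUE;

        synchronized long next(long wallClockMillis) {
            last = Math.max(wallClockMillis, last + 1);
            return last;
        }
    }

    // Writes all values to one cell "at the same instant", either sharing
    // a single timestamp or drawing a unique one per write.
    static Cell writeBatch(int[] values, boolean uniqueTimestamps) {
        long now = 1_000L; // arbitrary fixed "current time" keeps the demo deterministic
        MonotonicClock clock = new MonotonicClock();
        Cell cell = new Cell();
        for (int value : values) {
            long ts = uniqueTimestamps ? clock.next(now) : now;
            cell.put(ts, value);
        }
        return cell;
    }

    public static void main(String[] args) {
        int[] values = {1, 2};
        // prints "shared timestamp: 1 version(s)"
        System.out.println("shared timestamp: "
                + writeBatch(values, false).versionCount() + " version(s)");
        // prints "unique timestamps: 2 version(s)"
        System.out.println("unique timestamps: "
                + writeBatch(values, true).versionCount() + " version(s)");
    }
}
```

In a real job the same idea would presumably mean setting an explicit, distinct timestamp on each Put rather than relying on the server-assigned one. A per-process clock like the one above would not by itself prevent collisions between two different reducers.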
