I'll let the Cloudera folks speak, but I had assumed CDH4 would include HBase 
0.94.

-- Lars



________________________________
 From: Ted Yu <[email protected]>
To: [email protected] 
Sent: Thursday, July 5, 2012 11:28 AM
Subject: Re: Mixing Puts and Deletes in a single RPC
 
Take a look at HBASE-3584: Allow atomic put/delete in one call.
It is in 0.94, meaning it is not even in CDH4.

Cheers

On Thu, Jul 5, 2012 at 11:19 AM, Keith Wyss <[email protected]> wrote:

> Hi,
>
> My organization has been doing something zany to simulate atomic row
> operations in HBase.
>
> We have a converter-object model for the writables that are populated in
> an HBase table, and one of the governing assumptions
> is that if you are dealing with an Object record, you read all the columns
> that compose it out of HBase or a different data source.
>
> When we read lots of data in from a source system that we are trying to
> mirror with HBase, if a column is null that means that whatever is
> in HBase for that column is no longer valid. We have simulated what I
> believe is now called an AtomicRowMutation by using a single Put
> and populating it with blanks. The downside is the wasted space accrued by
> the metadata for the blank columns.
>
> Atomicity is not of utmost importance to us, but performance is. My
> approach has been to create a Put and Delete object for a record and
> populate the Delete with the null columns. Then we call
> HTable.batch(List<Row>) on a bunch of these. It is my impression that this
> shouldn't appreciably increase network traffic as the RPC calls will be
> bundled.
>
> Has anyone else addressed this problem? Does this seem like a reasonable
> approach?
> What sort of performance overhead should I expect?
>
> Also, I've seen some Jira tickets about making this an atomic operation in
> its own right. Is that something that
> I can expect with CDH3U4?
>
> Thanks,
>
> Keith Wyss
>
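
The Put-plus-Delete split described in the quoted message could be sketched
roughly as below. This is plain Java with no HBase dependency, so it runs
as-is. RecordSplitter, RowOps, and split are hypothetical names used only for
illustration, not part of any HBase API. The comments mark where a client of
that era would, as an assumption, build a Put and a Delete and hand both to
HTable.batch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RecordSplitter {

    /** Hypothetical holder: columns to write and columns to clear for one row. */
    static final class RowOps {
        final String rowKey;
        final Map<String, String> toPut = new LinkedHashMap<String, String>();
        final List<String> toDelete = new ArrayList<String>();

        RowOps(String rowKey) {
            this.rowKey = rowKey;
        }
    }

    /**
     * Splits a source record into non-null columns (to be written) and
     * null columns (whose stored values are no longer valid).
     */
    static RowOps split(String rowKey, Map<String, String> record) {
        RowOps ops = new RowOps(rowKey);
        for (Map.Entry<String, String> e : record.entrySet()) {
            if (e.getValue() == null) {
                // Assumed HBase step: delete.deleteColumns(family, qualifier)
                ops.toDelete.add(e.getKey());
            } else {
                // Assumed HBase step: put.add(family, qualifier, value)
                ops.toPut.put(e.getKey(), e.getValue());
            }
        }
        // Assumed HBase step: add the Put and the Delete (when non-empty)
        // to one List<Row> and pass it to HTable.batch(...).
        return ops;
    }

    public static void main(String[] args) {
        Map<String, String> record = new LinkedHashMap<String, String>();
        record.put("name", "example");
        record.put("email", null); // null in the source system
        RowOps ops = split("row1", record);
        System.out.println("put=" + ops.toPut + " delete=" + ops.toDelete);
    }
}
```

Note that the Put and the Delete for a row remain two separate actions in the
batch, so this sketch does not provide the atomicity that HBASE-3584 adds.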
