It looks like there is an HBase API called checkAndPut. If you pass null as the 
expected value, the Put is applied only when the row + column family + column 
qualifier does not already exist. Nice feature.
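
To make the semantics concrete, here is a minimal sketch of "put only if the cell is absent, otherwise read what is there". It is an in-memory stand-in built on ConcurrentHashMap, not the HBase client; the class and method names (PutIfAbsentSketch, putIfCellAbsent, insertOrGet) are made up for illustration. With the real client the conditional step would be the checkAndPut(row, family, qualifier, value, put) call with a null expected value, which returns a boolean saying whether the Put was applied.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory model of one table. Hypothetical names; this is NOT the HBase API.
class PutIfAbsentSketch {
    // Key is "row/family:qualifier", value is the cell contents.
    private final ConcurrentMap<String, String> cells = new ConcurrentHashMap<>();

    private static String key(String row, String family, String qualifier) {
        return row + "/" + family + ":" + qualifier;
    }

    // Models checkAndPut with a null expected value:
    // stores newValue only if the cell is absent, and reports whether it did.
    boolean putIfCellAbsent(String row, String family, String qualifier, String newValue) {
        return cells.putIfAbsent(key(row, family, qualifier), newValue) == null;
    }

    // Plain read of a cell; returns null when the cell does not exist.
    String get(String row, String family, String qualifier) {
        return cells.get(key(row, family, qualifier));
    }

    // The pattern from the question: try to insert, and if another
    // instance got there first, fall back to reading the existing value.
    String insertOrGet(String row, String family, String qualifier, String candidate) {
        if (putIfCellAbsent(row, family, qualifier, candidate)) {
            return candidate;
        }
        return get(row, family, qualifier);
    }
}
```

Because the check and the write happen in one atomic step, no external lock is needed in this model: the instance that loses the race simply reads back the winner's value.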

_____________________________________________
From: Ma, Ming
Sent: Wednesday, June 08, 2011 9:54 PM
To: [email protected]
Subject: Does Put support "don't put if row exists"?


Hi,

Maybe this has been asked before. I couldn't find much information on this.

We have an application where multiple instances across different machines could 
try to insert a new row with the same row key into a global HBase table at the 
same time. If the row has already been inserted by one instance, we don't want 
the other instances to insert it again; instead, they should Get the row after 
their Put fails with an "already exists" error.

It is somewhat similar to https://issues.apache.org/jira/browse/HBASE-493 , but 
here we need HBase to check for row existence rather than for a 
version/timestamp.

The insertion rate is low, say 100 requests / sec. One way to implement this is 
to do it outside HBase. We can have the client application use ZooKeeper to 
create a lock named after the row key. The program would look like this:

If (!Row.Get())
{
    Zookeeper.lock()

    // check again in case another instance has just inserted the same row
    If (!Row.Get())
    {
        // the row doesn't exist
        Row.Put();
    }
    Zookeeper.unlock()
}

Any suggestions?

Thanks.

Ming
