It looks like there is an HBase API called checkAndPut. By passing null as the expected value, you can achieve "put only when the row + column family + column qualifier doesn't exist". Nice feature.
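
For anyone who finds this thread later, here is a minimal sketch of the "insert if absent, otherwise read what is there" pattern. It is not HBase client code: the class InsertIfAbsentSketch and its methods are hypothetical names, and a ConcurrentHashMap stands in for the table so the example runs on its own. In the real client, as far as I can tell, the equivalent call is HTable.checkAndPut(row, family, qualifier, null, put), which returns true only when the Put was applied. Note that the check is per cell (row + family + qualifier), not per row.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory model of a table; NOT the HBase client API.
public class InsertIfAbsentSketch {
    // Cell coordinates (row, family, qualifier) -> cell value.
    private final ConcurrentMap<String, byte[]> cells = new ConcurrentHashMap<>();

    private static String key(String row, String family, String qualifier) {
        // NUL separator so that ("a", "bc") and ("ab", "c") do not collide.
        return row + '\u0000' + family + '\u0000' + qualifier;
    }

    // Models checkAndPut with a null expected value: the write is applied
    // atomically only if the cell does not exist yet. Returns true if applied.
    public boolean putIfCellAbsent(String row, String family, String qualifier, byte[] value) {
        return cells.putIfAbsent(key(row, family, qualifier), value) == null;
    }

    // Models a Get on a single cell; returns null if the cell does not exist.
    public byte[] get(String row, String family, String qualifier) {
        return cells.get(key(row, family, qualifier));
    }

    // The pattern from the question: try the conditional insert first and
    // fall back to reading the existing cell if another writer got there first.
    public byte[] insertOrGet(String row, String family, String qualifier, byte[] value) {
        if (putIfCellAbsent(row, family, qualifier, value)) {
            return value;
        }
        return get(row, family, qualifier);
    }

    public static void main(String[] args) {
        InsertIfAbsentSketch table = new InsertIfAbsentSketch();
        byte[] first = "from-instance-1".getBytes(StandardCharsets.UTF_8);
        byte[] second = "from-instance-2".getBytes(StandardCharsets.UTF_8);

        byte[] a = table.insertOrGet("row1", "cf", "q", first);
        byte[] b = table.insertOrGet("row1", "cf", "q", second);

        // Both callers end up with the value written by the first one.
        System.out.println(new String(a, StandardCharsets.UTF_8));
        System.out.println(new String(b, StandardCharsets.UTF_8));
    }
}
```

With this approach the client does not need an external lock: the server-side check and the write happen as one atomic step, so the loser of the race simply gets "false" back and does a Get.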
_____________________________________________
From: Ma, Ming
Sent: Wednesday, June 08, 2011 9:54 PM
To: [email protected]
Subject: Does Put support "don't put if row exists"?

Hi,

Maybe this has been asked before; I couldn't find much information on it.

We have an application where multiple instances across different machines could try to insert a new row with the same row key into a global HBase table at the same time. If the row has already been inserted by one instance, we don't want the other instances to insert it again; instead, they should Get the row after their Put fails with an "already exists" error. It is somewhat similar to https://issues.apache.org/jira/browse/HBASE-493 , but here we need HBase to check for row existence rather than for version/timestamp. The insertion rate is low, say 100 requests/sec.

One way to implement this is to do it outside HBase: the client application can use ZooKeeper to create a lock named after the row key. The program would look like this:

    if (!Row.Get()) {
        Zookeeper.lock()
        // check again in case another instance has just inserted the same row
        if (!Row.Get()) {
            // the row doesn't exist
            Row.Put();
        }
        Zookeeper.unlock()
    }

Any suggestions?

Thanks.
Ming
