I have the following scenario and would like to find the best solution for it.

Table1.Standard1['cassandra']['frequency'] is used to keep track of how many times the word "cassandra" has appeared. Let's say we have a bunch of articles stored in Hadoop. A Map/Reduce job greps all articles throughout the Hadoop cluster for matches of the pattern ^cassandra$ and updates Table1.Standard1['cassandra']['frequency']. Hence Table1.Standard1['cassandra']['frequency'] will be updated concurrently.

One of the issues I am facing is that Table1.Standard1['cassandra']['frequency'] stores the count as a String (I am using Java). So in order to update the frequency properly, the thread running the Map/Reduce has to retrieve Table1.Standard1['cassandra']['frequency'] in its native String format, hold it in temp (a Java String), convert it into an int, add the new counts, and finally execute

"SET Table1.Standard1['cassandra']['frequency'] = '" + temp.toString() + "'"

How do we guarantee that concurrent updates are handled correctly during this entire process? The CQL SET does not allow something like

SET Table1.Standard1['cassandra']['frequency'] = Table1.Standard1['cassandra']['frequency'] + newCounts

since there's only one String type.

What would be the best solution in this situation?

Thanks,
Ivan
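
For illustration, here is a minimal Java sketch of the read, convert, add, and write-back steps described above, and of how two concurrent tasks can lose an update. It does not talk to Cassandra: the ConcurrentHashMap is a hypothetical in-memory stand-in for Table1.Standard1['cassandra']['frequency'], and the class name, method names, and key are assumptions made up for this example. The two tasks are interleaved by hand so the result is reproducible.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the column; not a Cassandra client.
class LostUpdateDemo {
    // Assumed key representing Table1.Standard1['cassandra']['frequency'].
    static final String KEY = "cassandra/frequency";

    // Retrieve the stored String and convert it into an int.
    static int read(Map<String, String> store) {
        return Integer.parseInt(store.get(KEY));
    }

    // Add the new counts to the value read earlier and write it back as a String.
    static void write(Map<String, String> store, int valueReadEarlier, int newCounts) {
        store.put(KEY, Integer.toString(valueReadEarlier + newCounts));
    }

    // Two tasks, A and B, interleaved by hand: both read before either writes.
    static String runInterleaved() {
        Map<String, String> store = new ConcurrentHashMap<>();
        store.put(KEY, "10");

        int seenByA = read(store);   // A sees 10
        int seenByB = read(store);   // B also sees 10
        write(store, seenByA, 5);    // A stores "15"
        write(store, seenByB, 3);    // B stores "13", overwriting A's update

        return store.get(KEY);
    }

    public static void main(String[] args) {
        // The correct total would be 10 + 5 + 3 = 18.
        System.out.println("expected 18, stored " + runInterleaved());
    }
}
```

Each individual read and write is safe on its own, yet A's 5 counts are lost because the read-modify-write sequence as a whole is not atomic.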
