Hi Dylan,

Right now we don't perform a check (read) before performing an update. Below is a simple scenario.

The Main table is initially empty. The client then sends a request that translates to inserting the data, i.e.

Main table:
  A
  B
  C
  D

Stats table:
  A  1
  B  1
  C  1
  D  1

Let's say its next request is to delete C.

Main table:
  A
  B
  D

Stats table:
  A  1
  B  1
  C  0  (1 + -1)
  D  1

The next request is to update B and D (the request gets translated to delete B and D, then insert B and D), but let's say it somehow fails between the delete and insert operations, so the tables look like:

Main table:
  A

Stats table:
  A  1
  B  0
  C  0
  D  0

The client is fault-tolerant and retries the entire request, so now the tables look like:

Main table:
  A
  B
  D

Stats table:
  A  1
  B  0  (-1 + 1)
  C  0
  D  0  (-1 + 1)

As you can see above, the end state of the Main table is correct, because the retry completes the 'update'. Unfortunately the Stats table is not: B and D are back in the Main table, but their counts are stuck at 0 instead of 1.

The idea I mentioned last time was to have a batch job that scans the whole Main table to get the 'truth' data and updates the Stats table accordingly. But in order to update 'accordingly', it first has to read the current value in the Stats table (because the combiner sums whatever is written, the job has to write a correcting delta rather than the true count), which affects performance.

Thanks,
Z
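
P.S. In case it helps, here is a minimal sketch that replays the scenario above in plain Python. It is only a toy model, not Accumulo code: the `Tables` class, `replay`, and `repair` are hypothetical names I made up for illustration, the Stats table is modeled as a dictionary whose values are summed the way a summing combiner would sum them, and I assume every key should have a count of 1 if it is in the Main table and 0 otherwise.

```python
from collections import defaultdict


class Tables:
    """Toy in-memory stand-in for the Main table and the summed Stats table."""

    def __init__(self):
        self.main = set()
        self.stats = defaultdict(int)  # values accumulate like a summing combiner

    def insert(self, key):
        self.main.add(key)
        self.stats[key] += 1  # blind +1, no read of the Main table first

    def delete(self, key):
        self.main.discard(key)
        self.stats[key] -= 1  # blind -1, even if the key is already gone

    def update(self, keys, fail_after_delete=False):
        # An update is translated to "delete all, then insert all".
        for key in keys:
            self.delete(key)
        if fail_after_delete:
            raise RuntimeError("simulated failure between delete and insert")
        for key in keys:
            self.insert(key)


def replay():
    """Replay the requests from the message, including the failed update and its retry."""
    t = Tables()
    for key in "ABCD":
        t.insert(key)
    t.delete("C")
    try:
        t.update(["B", "D"], fail_after_delete=True)
    except RuntimeError:
        t.update(["B", "D"])  # fault-tolerant client retries the entire request
    return t


def repair(t):
    """Batch job: scan Main for the truth, then write a correcting delta to Stats."""
    for key in set(t.stats) | t.main:
        truth = 1 if key in t.main else 0
        current = t.stats[key]  # the extra read that costs performance
        if truth != current:
            t.stats[key] += truth - current
```

Calling `replay()` leaves the Main table with A, B, D and the Stats table with B and D at 0, which is the mismatch described above. `repair` fixes it, but only by reading each current Stats value first, which is exactly the cost I am worried about.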
