Here are my requirements.
We use Cassandra.
I get millions of invoice line items into the system. As I load them I
need to build up some data structures.
* Invoice line items by invoice id (each line item has an invoice id on
it), with total dollar value
* Invoice line items by customer
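
To make the two structures concrete, here is a minimal in-memory sketch in Python. It is not Cassandra code, only an illustration of what gets built up during the load. The class name, the dict keys (item_id, invoice_id, customer_id, amount) and the use of Decimal for dollar values are all assumptions made for the example.

```python
from collections import defaultdict
from decimal import Decimal


class InvoiceIndex:
    """In-memory stand-in for the two lookup structures (hypothetical names)."""

    def __init__(self):
        # invoice id -> list of line items
        self.items_by_invoice = defaultdict(list)
        # invoice id -> running dollar total (Decimal() starts at 0)
        self.total_by_invoice = defaultdict(Decimal)
        # customer id -> list of line items
        self.items_by_customer = defaultdict(list)

    def add(self, item):
        # item is a dict with the assumed keys:
        # item_id, invoice_id, customer_id, amount (amount as a string)
        self.items_by_invoice[item["invoice_id"]].append(item)
        self.total_by_invoice[item["invoice_id"]] += Decimal(item["amount"])
        self.items_by_customer[item["customer_id"]].append(item)
```

In a real deployment each dict would presumably map to its own table keyed by invoice id or customer id, with the total kept as a separate aggregate.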
1. Assuming that the majority of the line items are new, and
2. The lookup of an existing line item will dictate the performance of the
system, because reads are slower than writes in C*.
3. Assuming that you are using counters in C*.
Therefore eliminate that problem by implementing a bloom
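
A minimal sketch of how a Bloom filter could screen out new line items before any read is issued, assuming line items are identified by a string key. The class, its default sizes and the double-hashing scheme built on SHA-256 are choices made for this example, not something taken from the thread or from Cassandra's client API.

```python
import hashlib


class BloomFilter:
    """Simple Bloom filter over string keys (illustrative, sizes are assumptions)."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from one digest via double hashing.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means the key was definitely never added.
        # True means it was probably added (false positives are possible).
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )
```

If might_contain returns False, the line item is certainly new and the read can be skipped. A True result can be a false positive, so it still needs the real lookup.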
Oleg,
If you have the aggregates in counters you only need to read the current
counter when adding/removing invoice lines.
In this situation you only need to be sure that this sequence:
+ Read current counter value
+ Update current value according to newly created/updated lines
Is done safely to