Yeah… I gathered that this was a memory-for-availability tradeoff… I was
just curious how much memory was involved.
It seems a shame to waste so much memory, and I can't shake the feeling
that a lot of this is unnecessary.
In some situations I could see Cassandra using up to 4x more memory.
From my point of view, vector clocks are too much overhead. If you sync the
clocks in your cluster using NTP (which you should do anyway), you will get
clock precision of about 1/1000 s, which is good enough.
All my machines running NTP have an offset of about 1/1000 s. They are
FreeBSD; Linux is not that precise in its clock.
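
To make the timestamp approach concrete, here is a minimal sketch of
last-write-wins reconciliation in Python. The Column type, the reconcile
function and the tie-break rule are illustrative assumptions, not
Cassandra's actual code; the point is only that the winner is chosen by
comparing client-supplied wall-clock timestamps, so two writes that land
inside the clock offset between machines can be ordered the wrong way round.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    # Hypothetical column record, for illustration only.
    name: str
    value: bytes
    timestamp_us: int  # wall-clock timestamp from the writer, microseconds


def reconcile(a: Column, b: Column) -> Column:
    """Last-write-wins: keep the version carrying the higher timestamp."""
    if a.timestamp_us != b.timestamp_us:
        return a if a.timestamp_us > b.timestamp_us else b
    # Assumed tie-break: compare values so every replica picks the same one.
    return a if a.value >= b.value else b


# Two writers whose clocks agree: the later write wins, as expected.
first = Column("n", b"1", 1_000_000)
second = Column("n", b"2", 1_000_500)
winner = reconcile(first, second)

# A writer whose clock runs 1 ms behind: its write really happened 0.5 ms
# after `first`, but it is stamped earlier and silently loses.
skewed = Column("n", b"3", 1_000_500 - 1_000)
skewed_winner = reconcile(first, skewed)
```

With offsets around 1/1000 s this only matters for writes to the same column
that arrive within about a millisecond of each other.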
This is really interesting… I can track them down, but there are a number of
references to Cassandra HAVING vector clocks … so it makes sense that I
can't find out how much memory they are using :-P
Cassandra: The Definitive Guide … which I was reading the other night says
that they were
At the point that book was written (about a year ago it was finalized), vector
clocks were planned. In August or September of last year, they were removed.
0.7 was released in January. The ticket for vector clocks is here and you can
see the reasoning for not using them at the bottom.
I had a thread going the other day about vector clock memory usage and that
it is a series of (clock id, clock):ts and the ability to prune old entries
… I'm specifically curious here how often old entries are pruned.
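
For reference, a rough sketch of the structure being discussed, in Python:
a vector clock kept as a map from clock id to (counter, timestamp), where the
timestamp exists only so that old entries can be pruned. The class name, the
method names and the pruning policy (drop entries older than max_age, then
keep at most max_entries of the newest) are assumptions for illustration and
do not describe any particular database's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class VectorClock:
    # clock id -> (counter, time of last update in seconds)
    entries: Dict[str, Tuple[int, float]] = field(default_factory=dict)

    def increment(self, node_id: str, now: float) -> None:
        """Record one more write coordinated by node_id at time `now`."""
        counter, _ = self.entries.get(node_id, (0, now))
        self.entries[node_id] = (counter + 1, now)

    def descends_from(self, other: "VectorClock") -> bool:
        """True if this clock has seen every event recorded in `other`."""
        return all(
            self.entries.get(node, (0, 0.0))[0] >= counter
            for node, (counter, _) in other.entries.items()
        )

    def prune(self, now: float, max_age: float, max_entries: int) -> None:
        """Drop entries older than max_age, then keep only the newest ones."""
        fresh = {n: e for n, e in self.entries.items() if now - e[1] <= max_age}
        newest = sorted(fresh.items(), key=lambda kv: kv[1][1], reverse=True)
        self.entries = dict(newest[:max_entries])


vc = VectorClock()
vc.increment("a", now=0.0)
vc.increment("b", now=50.0)
vc.increment("a", now=100.0)

# At t=120 with a 60 second cutoff, the entry for "b" (age 70) is dropped.
vc.prune(now=120.0, max_age=60.0, max_entries=10)
```

Each surviving entry costs one clock id, one counter and one timestamp per
stored value, so how aggressively entries are pruned directly bounds the
per-value overhead. Pruning also discards causality information, which is
the price paid for that bound.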
If you're storing small columns within Cassandra. Say just an integer. The