[
https://issues.apache.org/jira/browse/CASSANDRA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918905#action_12918905
]
Sylvain Lebresne commented on CASSANDRA-1546:
---------------------------------------------
{quote}
The complexity of the context-based logic is to be space efficient. In
practice, cassandra nodes are typically I/O bound, not CPU bound.
{quote}
True in the current patches. But provided you add the timestamp back to contexts for
decrements, and provided my math doesn't suck too much, the disk overhead of the
1546 logic is 3 bytes * (#replicas - 1), because the overhead of a column is
1 byte for the 'flags' (deleted, expiring) and 2 bytes to record the length of
each value. That's 3 bytes out of the 20 you have to record for each
replica (name (the host ip) + value + timestamp). As it turns out, since for
counter columns we know the size of the value, it's super easy to optimize out
2 of those 3 bytes (I'll be happy to add it). To be fair, there is also the
overhead of using super columns, but overall I'm not totally convinced by this argument.
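Just to make the arithmetic above concrete, here's a tiny back-of-the-envelope
sketch (the 4/8/8 byte sizes for name/value/timestamp and the replica count of 3
are my assumptions for illustration, not numbers taken from the patches):
{code}
// Back-of-the-envelope check of the per-replica overhead discussed above.
// Assumed sizes (not taken from the patches): IPv4 name = 4 bytes,
// value = 8 bytes (long), timestamp = 8 bytes (long), 3 replicas.
public class CounterOverheadEstimate
{
    public static void main(String[] args)
    {
        int replicas = 3;

        // Payload recorded for each replica in either design: ~20 bytes.
        int payload = 4 /* name (host ip) */ + 8 /* value */ + 8 /* timestamp */;

        // Per-column serialization cost in the 1546 layout: flags + value length.
        int overheadPerColumn = 1 /* flags */ + 2 /* length */;

        // 1546 pays this once per replica column, 1072 roughly once for the
        // whole packed context, hence the 3 * (#replicas - 1) difference.
        System.out.println("payload per replica      : " + payload);
        System.out.println("extra bytes versus #1072 : " + overheadPerColumn * (replicas - 1));
    }
}
{code}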
I'd like to add that 1546's splitting of the context into multiple columns
offers an optimisation opportunity: after the write on the leader, when we
read the value to replicate it to the other nodes, 1546 only reads the value
for the leader parts of the counter (since that is what has been updated). This
will save I/O and network bandwidth. I'm not saying this is a crucial thing, just
that it doesn't seem so clear to me that the context-based logic is
intrinsically an I/O saver.
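To illustrate what I mean (hypothetical stand-in types only, this is not the
patch code), with one sub-column per replica the post-write read for
replication only has to touch the leader's part:
{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: the counter is modeled as a map from replica id
// to its partial count. None of these names come from the actual 1546 patch.
public class LeaderPartReadSketch
{
    private final Map<String, Long> parts = new HashMap<String, Long>();

    public void applyLocalIncrement(String leaderId, long delta)
    {
        Long current = parts.get(leaderId);
        parts.put(leaderId, (current == null ? 0L : current) + delta);
    }

    // 1546-style: read back only the sub-column named after the leader,
    // which is the only part that changed and hence needs replicating.
    public Long readLeaderPart(String leaderId)
    {
        return parts.get(leaderId);
    }

    // Packed-context style: the whole value has to be read and shipped.
    public Map<String, Long> readWholeContext()
    {
        return parts;
    }
}
{code}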
{quote}
So, I'm not confident in the statement that #1546 is clearly faster than #1072.
As a rule of thumb, it's better to directly manage your memory usage, as
opposed to relying on the runtime's GC.
{quote}
I was merely talking about the cleanContext() logic. But I'll admit, saying
that this part is faster in 1546 doesn't really matter much; that was a
stupid argument. It remains that I don't like this cleanContext logic. I find
it fragile (as in, hard to maintain) and not very clean, in that it relies on
nodes having to clean up the columns before sending them over to
other nodes. I wouldn't say this cleanContext logic is a killer for the
context-based approach, but I don't like it.
As for the creation of objects, you may be right, but I'm not even sure. The
byte array manipulations of 1072 do create a bunch of temporary byte arrays
that have to be garbage collected. So like you, I'm not very confident in any
statement about whether the context-based logic is faster or slower than
the counter-as-supercolumns one of #1546.
bq. #1546 will need to special case AES-related streaming, as well
That is true, which reminds me of a question for you, Kelvin: the changes for
AES repair in #1072 are fairly extensive and I kind of wonder why. I expect
the change needed to fix streaming in #1546 to be a few lines: that is, when you
rebuild the sstable after streaming, you deserialize and re-serialize the
rows instead of just copying the bytes directly. I don't see a reason to
differentiate based on the reason for streaming, for instance.
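For what it's worth, here is the kind of few-lines change I have in mind
(Row, RowReader and RowWriter are hypothetical stand-ins, not the actual
sstable classes): rebuild the streamed sstable by going through the row
objects instead of copying the raw bytes.
{code}
import java.util.Iterator;

// Illustrative sketch only: deserialize and re-serialize each streamed row
// instead of copying the raw bytes, so counter columns get re-interpreted on
// the receiving node. The types below are stand-ins, not the real classes.
public class StreamedSSTableRebuildSketch
{
    interface Row { }

    interface RowReader extends Iterator<Row> { }   // deserializes rows

    interface RowWriter
    {
        void append(Row row);                       // re-serializes on write
    }

    static void rebuild(RowReader incoming, RowWriter writer)
    {
        while (incoming.hasNext())
            writer.append(incoming.next());         // deserialize, then re-serialize
    }
}
{code}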
> (Yet another) approach to counting
> ----------------------------------
>
> Key: CASSANDRA-1546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1546
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Fix For: 0.7.0
>
> Attachments: 0001-Remove-IClock-from-internals.patch,
> 0001-v2-Remove-IClock-from-internals.patch,
> 0001-v3-Remove-IClock-from-internals.txt, 0002-Counters.patch,
> 0002-v2-Counters.patch, 0002-v3-Counters.txt,
> 0003-Generated-thrift-files-changes.patch, 0003-v2-Thrift-changes.patch,
> 0003-v3-Thrift-changes.txt, marker_idea.txt
>
>
> This could be described as a mix between CASSANDRA-1072 without clocks and
> CASSANDRA-1421.
> More details in the comment below.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.