[
https://issues.apache.org/jira/browse/CASSANDRA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sylvain Lebresne updated CASSANDRA-1546:
----------------------------------------
Attachment: 0002-v4-thrift-changes.patch
0001-v4-Counters.patch
Attaching a version 4 set of patches. Apart from being rebased (against svn rev.
1023326), this mainly introduces two new things:
# It replaces the use of node's IP addresses for the counter's parts by a
node UUID (generated once and saved into system tables). I think that relying
on IP addresses is broken (and that goes for #1072 as well). During column
reconciliation, a node merges the values for its own ID but keeps only the most
recent value for other IDs, so if we use IP addresses as IDs, changing the IP
address of a node could result in *losing data*. Using a node UUID solves
this. It does have a small drawback: UUIDs are 16 bytes long whereas IP
addresses are 4 bytes long. But I'll pay that any day compared to losing
correctness. And actually, maybe we could use some custom unique ID instead of
a UUID if we want to get back some of the space. For instance, I'm pretty sure
that an ID generated by taking the IP address plus 4 bytes holding the current
time in seconds would be fine. Finally, note that this node ID is not
gossiped to other nodes in this patch, because this patch only needs this
information locally (this isn't true for #1072, for instance, because of the
cleanContext logic).
# It fixes the streaming problem by forcing a
deserialization-reserialization after streaming (for counter CFs only).
Because of the change above, I don't think there is a need to distinguish the
reason for streaming. Moreover, because all this deserialization-reserialization
does is flip a bit, it is safe to do this in place on the streamed
data file while rebuilding the index (contrary to what #1072 does, where a
new sstable is created in this process). Given this, the change is mostly
localized to the SSTableWriter.Builder class.
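To make the first point concrete, here is a minimal Python sketch of the
reconciliation rule described above and of the compact 8-byte ID. It is a
simplified model and not the patch's code: the names reconcile, total and
compact_node_id, the dict layout (node ID -> (value, timestamp)), and the
reading of "merges" as "sums the values" are all assumptions made for
illustration.

```python
import ipaddress
import struct
import time


def reconcile(local_id, a, b):
    """Merge two versions of a counter (hypothetical model).

    Each version is a dict: node_id -> (value, timestamp).
    """
    merged = {}
    for node_id in a.keys() | b.keys():
        if node_id in a and node_id in b:
            (va, ta), (vb, tb) = a[node_id], b[node_id]
            if node_id == local_id:
                # Own part: assumed to hold distinct increments, so add them.
                merged[node_id] = (va + vb, max(ta, tb))
            else:
                # Part owned by another node: keep only the more recent value.
                merged[node_id] = (va, ta) if ta >= tb else (vb, tb)
        else:
            merged[node_id] = a[node_id] if node_id in a else b[node_id]
    return merged


def total(counter):
    """Counter value = sum of all per-node parts."""
    return sum(value for value, _ in counter.values())


def compact_node_id(ip, now=None):
    """Hypothetical 8-byte ID: 4 bytes of IPv4 address + 4 bytes of seconds."""
    seconds = int(time.time()) if now is None else now
    return ipaddress.IPv4Address(ip).packed + struct.pack(">I", seconds & 0xFFFFFFFF)


# Two fragments written by the same node while its IP was 10.0.0.1.
frag1 = {"10.0.0.1": (5, 1)}
frag2 = {"10.0.0.1": (3, 2)}

before = total(reconcile("10.0.0.1", frag1, frag2))  # 8: own parts are summed
after = total(reconcile("10.0.0.2", frag1, frag2))   # 3: the increment of 5 is lost
```

With the ID tied to the IP address, the same two fragments total 8 before the
address change and 3 after it. An ID that is generated once and stored keeps
the first behaviour whatever the node's address becomes.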
Now, because it isn't yet clear what we will decide for counters (this or
something closer to #1072), allow me to sum up my take on this. I'd like to
distinguish two families of differences between this patch and 1072:
# There is the fact that this patch uses a super column for the partitions of
a given counter, while 1072 puts those partitions in a context (a binary blob).
Let me first stress that the main idea is the same in both cases; they are not
completely different ideas, just two different implementations of the same idea
(and not my idea btw), each having (I think) pros and cons. My opinion is that
so far the approach of this ticket gives cleaner, simpler code that messes less
with the rest of Cassandra's code base (and I'm not saying it is a particularly
objective argument). I like that and I think it's something important. However,
the drawback of this approach is that, since we use a super column for the
counter, there is no native support for super columns of counters. The
context approach of 1072 doesn't have this drawback. Personally, I don't see
that as a big drawback, because encoding super columns into standard columns is
fairly simple. But that, too, is a matter of opinion, so I understand others may
feel differently. At the end of the day, I simply hope we'll end up choosing
what's best for Cassandra, and that we'll make that choice soon.
# This patch introduces a bunch of things that are not in the currently
attached patch of #1072 and that I believe are important: the marker strategy,
support for all CLs, decrement support. It also fixes a few (important) bugs: a
race condition in counter reads and the fragility of using IP addresses. I
think we should keep the discussion on those separate from the discussion on
the choice between the counter-as-super-columns approach and the context-based
approach, as they apply to both.
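As an illustration of the remark that encoding super columns into standard
columns is fairly simple, here is one possible client-side encoding, sketched
in Python. It is only an assumption about what such an encoding could look
like (a 2-byte length prefix, then the super column name, then the subcolumn
name); the helper names encode_name, decode_name and slice_super are
hypothetical and nothing here comes from the patch.

```python
import struct


def encode_name(super_name: bytes, sub_name: bytes) -> bytes:
    """Pack (super column, subcolumn) into a single standard column name."""
    return struct.pack(">H", len(super_name)) + super_name + sub_name


def decode_name(name: bytes):
    """Inverse of encode_name: returns (super_name, sub_name)."""
    (length,) = struct.unpack(">H", name[:2])
    return name[2:2 + length], name[2 + length:]


def slice_super(row: dict, super_name: bytes) -> dict:
    """Return the subcolumns of one emulated super column from a flat row."""
    prefix = struct.pack(">H", len(super_name)) + super_name
    return {
        decode_name(name)[1]: value
        for name, value in sorted(row.items())
        if name.startswith(prefix)
    }


# A flat row emulating two super columns, "page1" and "page10".
row = {
    encode_name(b"page1", b"views"): 10,
    encode_name(b"page1", b"clicks"): 2,
    encode_name(b"page10", b"views"): 7,
}
page1 = slice_super(row, b"page1")  # {b"clicks": 2, b"views": 10}
```

Because all names for one super column share the same byte prefix, they stay
contiguous under a bytewise column comparator, and the length prefix keeps
"page1" from matching "page10".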
> (Yet another) approach to counting
> ----------------------------------
>
> Key: CASSANDRA-1546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1546
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Fix For: 0.7.1
>
> Attachments: 0001-v2-Remove-IClock-from-internals.patch,
> 0001-v3-Remove-IClock-from-internals.txt, 0001-v4-Counters.patch,
> 0002-v2-Counters.patch, 0002-v3-Counters.txt, 0002-v4-thrift-changes.patch,
> 0003-Generated-thrift-files-changes.patch, 0003-v2-Thrift-changes.patch,
> 0003-v3-Thrift-changes.txt, marker_idea.txt
>
>
> This could be described as a mix between CASSANDRA-1072 without clocks and
> CASSANDRA-1421.
> More details in the comment below.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.