[ 
https://issues.apache.org/jira/browse/CASSANDRA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1546:
----------------------------------------

    Attachment: 0002-v4-thrift-changes.patch
                0001-v4-Counters.patch

Attaching a version 4 set of patches. Apart from being rebased (against svn rev. 
1023326), this mainly introduces two new things:
  # It replaces the use of node's IP addresses for the counter's parts by a 
node UUID (generated once and saved into system tables). I think that relying 
on IP addresses is broken (and that goes for #1072 as well). During column 
reconciliation, a node merges the values for its own ID but keeps the most 
recent value for other IDs, so if we use IP addresses as IDs, changing the IP 
address of a node could result in *losing data*. Using a node UUID solves 
this. It does have a small drawback: UUIDs are 16 bytes long where IP 
addresses are 4 bytes long. But I'll pay that any day compared to losing 
correctness. And actually, maybe we could use some custom unique ID instead of 
a UUID if we want to get back some of the space. For instance, I'm pretty sure 
that an ID generated by taking the IP address plus 4 bytes holding 
the current time in seconds would be fine. Finally, note that this node ID is not 
gossiped to other nodes in this patch, because this patch only needs this 
information locally (this isn't true for #1072, for instance, because of the 
cleanContext logic).
  # It fixes the streaming problem by forcing a 
deserialization-reserialization after streaming (for counter CFs only). 
Because of the change above, I don't think there is a need to distinguish the 
reason for streaming. Moreover, because all this deserialization-reserialization 
does is flip a bit, it is safe to do it in place on the streamed 
data file while rebuilding the index (contrary to what #1072 does, where a 
new sstable is created in the process). Given this, the change is mostly 
localized to the SSTableWriter.Builder class.
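
To make the first point concrete, here is a minimal sketch of the 
reconciliation rule and of why IP-based IDs are fragile. It is not the code 
from the patch: the names reconcile, total and compact_node_id are 
hypothetical, "merging" is assumed to mean summing, and each counter part is 
assumed to be a (value, timestamp) pair keyed by node ID.

```python
import socket
import struct
import time
import uuid


def reconcile(local_id, left, right):
    """Reconcile two versions of a counter as seen by the node `local_id`.

    Each version maps node ID -> (value, timestamp). Assumed rule: parts under
    the local node's own ID are merged (summed here); for any other ID only
    the more recent part is kept.
    """
    merged = {}
    for node_id in left.keys() | right.keys():
        a, b = left.get(node_id), right.get(node_id)
        if a is None or b is None:
            # Only one side has a part for this ID: keep it as is.
            merged[node_id] = b if a is None else a
        elif node_id == local_id:
            # Own ID: both parts are local increments, so add them up.
            merged[node_id] = (a[0] + b[0], max(a[1], b[1]))
        else:
            # Foreign ID: keep only the more recent part.
            merged[node_id] = a if a[1] >= b[1] else b
    return merged


def total(counter):
    """Value of the counter: the sum of all its parts."""
    return sum(value for value, _ in counter.values())


def compact_node_id(ip, now=None):
    """Hypothetical 8-byte ID: 4-byte IPv4 address + 4-byte time in seconds."""
    seconds = int(time.time() if now is None else now) & 0xFFFFFFFF
    return socket.inet_aton(ip) + struct.pack(">I", seconds)


# Two not-yet-merged parts written by the same node while its IP was 10.0.0.1.
old_ip, new_ip = "10.0.0.1", "10.0.0.2"
part1 = {old_ip: (3, 100)}
part2 = {old_ip: (4, 200)}

before_ip_change = total(reconcile(old_ip, part1, part2))  # own ID: 3 + 4
after_ip_change = total(reconcile(new_ip, part1, part2))   # foreign ID: newest only

# With an ID generated once and persisted, an IP change does not matter.
node_id = uuid.uuid4()
stable = total(reconcile(node_id, {node_id: (3, 100)}, {node_id: (4, 200)}))
```

Under these assumptions, the IP-keyed counter goes from 7 to 4 once the node 
changes address, because its own earlier parts are now treated as belonging to 
another node, while the UUID-keyed counter stays at 7. The compact_node_id 
helper only illustrates the size trade-off: 8 bytes instead of the 16 of a 
UUID.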

Now, because it isn't yet clear what we will decide for counters (this or 
something closer to #1072), allow me to sum up my take on this. I'd like to 
distinguish two families of differences between this patch and #1072:
  # This patch uses a super column for the partitions of 
a given counter, while #1072 puts those partitions in a context (a binary blob). 
Let me first stress that the main idea is the same in both cases: they are not 
completely different ideas, just two different implementations of the same idea 
(and not my idea btw), each having (I think) pros and cons. My opinion is that 
so far the approach of this ticket gives cleaner, simpler code that messes less 
with the rest of Cassandra's code base (and I'm not saying this is a particularly 
objective argument). I like that and I think it's important. However, 
the drawback of this approach is that, since we use a super column for the 
counter, there is no native support for super columns of counters. 
The context approach of #1072 doesn't have this drawback. Personally, I don't see 
that as a big drawback, because encoding super columns into standard columns is 
fairly simple. But that too is a matter of opinion, so I understand others may 
feel differently. At the end of the day, I simply hope we'll end up choosing 
what's best for Cassandra, and that we'll make that choice soon.
  # This patch introduces a bunch of things that are not in the currently 
attached patch of #1072 and that I believe are important: the marker strategy, 
support for all CLs, and decrement support. It also fixes a few (important) bugs: a 
race condition in counter reads and the fragility of using IP addresses. I 
think we should keep the discussion of those separate from the discussion of 
the choice between the counter-as-super-columns approach and the context-based 
approach, as they apply to both.
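
As an illustration of the remark that encoding super columns into standard 
columns is fairly simple, here is one possible client-side encoding. It is 
only a sketch under assumptions: the helper names (encode_name, decode_name, 
slice_super) are hypothetical, and the layout (a 2-byte length prefix, the 
super column name, then the subcolumn name) is just one choice among many.

```python
import struct


def encode_name(super_name: bytes, sub_name: bytes) -> bytes:
    """Build a standard column name from a (super column, subcolumn) pair."""
    # 2-byte big-endian length of the super column name, then both names.
    return struct.pack(">H", len(super_name)) + super_name + sub_name


def decode_name(name: bytes):
    """Split an encoded column name back into (super column, subcolumn)."""
    (length,) = struct.unpack_from(">H", name)
    return name[2:2 + length], name[2 + length:]


def slice_super(row: dict, super_name: bytes) -> dict:
    """Return the subcolumns of one emulated super column from a flat row."""
    prefix = struct.pack(">H", len(super_name)) + super_name
    return {decode_name(k)[1]: v for k, v in row.items() if k.startswith(prefix)}


# A flat row of counters emulating two super columns, "stats" and "other".
row = {
    encode_name(b"stats", b"hits"): 5,
    encode_name(b"stats", b"misses"): 2,
    encode_name(b"other", b"hits"): 9,
}
stats = slice_super(row, b"stats")
```

Because all columns of one emulated super column share the same byte prefix, 
they sort next to each other under a byte-wise comparator, so reading them 
back amounts to a prefix slice.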

> (Yet another) approach to counting
> ----------------------------------
>
>                 Key: CASSANDRA-1546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1546
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 0.7.1
>
>         Attachments: 0001-v2-Remove-IClock-from-internals.patch, 
> 0001-v3-Remove-IClock-from-internals.txt, 0001-v4-Counters.patch, 
> 0002-v2-Counters.patch, 0002-v3-Counters.txt, 0002-v4-thrift-changes.patch, 
> 0003-Generated-thrift-files-changes.patch, 0003-v2-Thrift-changes.patch, 
> 0003-v3-Thrift-changes.txt, marker_idea.txt
>
>
> This could be described as a mix between CASSANDRA-1072 without clocks and 
> CASSANDRA-1421.
> More details in the comment below.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
