[
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178475#comment-14178475
]
Joshua McKenzie commented on CASSANDRA-7979:
--------------------------------------------
h5. 2.0
h6. General:
# I don't see anything in there to limit the amount of sampling we're doing.
Right now it looks like we're sampling every update rather than only the min
delta across all columns, as Benedict mentioned earlier.
h6. AtomicSortedColumns
# nit: Spacing on addAllWithSizeDelta. Remove the extra space after the assignment of pair
# Update javadoc for return type
h6. ColumnFamilyStore
# nit: extra space after 'timeDelta ='
h5. trunk
h6. AtomicBTreeColumns
# In ColumnUpdater.apply, the Math.min check is redundant. Anything is always
going to be <= Long.MAX_VALUE
Looks pretty straightforward and appears to work as expected. Also, we should
probably have a 2.0 patch and a 2.1 patch, and merge 2.1 up to trunk.
Once we've limited it to min delta per column on update we should be good to go.
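As a rough illustration of the "min delta per column on update" idea, the sketch below scans the columns touched by one update, keeps only the smallest timestamp delta against the existing columns, and returns that single value for sampling. The class name MinDeltaSketch, the method minDelta, and the map-based column representation are hypothetical and are not taken from either patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: columns are modelled as name -> write timestamp.
class MinDeltaSketch {
    /**
     * Returns the smallest timestamp delta between an existing column and its
     * replacement in this update, or Long.MAX_VALUE if no existing column is
     * overwritten (in which case nothing should be sampled).
     */
    static long minDelta(Map<String, Long> existing, Map<String, Long> update) {
        long min = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : update.entrySet()) {
            Long oldTimestamp = existing.get(e.getKey());
            if (oldTimestamp == null)
                continue; // brand-new column, no delta to measure
            long delta = Math.abs(e.getValue() - oldTimestamp);
            min = Math.min(min, delta);
        }
        return min;
    }

    public static void main(String[] args) {
        Map<String, Long> existing = new HashMap<>();
        existing.put("a", 1000L);
        existing.put("b", 2000L);

        Map<String, Long> update = new HashMap<>();
        update.put("a", 1005L); // delta 5
        update.put("b", 2050L); // delta 50
        update.put("c", 3000L); // new column, ignored

        // One sample per update instead of one per column.
        System.out.println(minDelta(existing, update)); // prints 5
    }
}
```

With this shape, the caller records at most one value per update and can skip recording entirely when the result is Long.MAX_VALUE.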
> Acceptable time skew for C*
> ---------------------------
>
> Key: CASSANDRA-7979
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7979
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: sankalp kohli
> Priority: Minor
> Attachments: 2.0_7979.diff, trunk_7979.diff
>
>
> It is very hard to know the bounds on clock skew required for C* to work
> properly. Since the resolution is based on time and is at thrift column
> level, it depends on the application: how fast is the application updating
> the same column? If you update a column after, say, 5 milliseconds and the
> clock skew is more than that, you might not see the updates in the correct order.
> In this JIRA, I am proposing a change which will answer this question: "How
> much clock skew is acceptable for a given application". This will help answer
> the question whether the system needs some alternate NTP algorithms to keep
> time in sync.
> If we measure the time difference between two updates to the same column, we
> will be able to answer the question on clock skew.
> We can implement this in the memtable (AtomicSortedColumns.addColumn). If we
> find that a column is updated within, say, 100 milliseconds, add the diff to a
> histogram. Since this might have performance issues, we might want to have
> some throttling, like randomization, or only enable it for a short time via
> nodetool.
> With this histogram, we will know what is an acceptable clock skew.
> Also, apart from column resolution, is there any other area which will be
> affected by clock skew?
> Note: For the sake of argument, I am not talking about back-dated deletes or
> application-modified timestamps.
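The histogram proposed in the description could be sketched roughly as follows: deltas below a threshold are recorded into per-millisecond buckets, and a random sample rate throttles how often recording happens. Everything here is an assumption for illustration: the class UpdateDeltaHistogram, its method names, the millisecond units, and the per-millisecond bucket layout are hypothetical and not part of the attached patches.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch of a throttled histogram of same-column update gaps.
class UpdateDeltaHistogram {
    private final long thresholdMillis;    // only gaps below this are recorded
    private final double sampleRate;       // fraction of eligible updates recorded, 0.0 to 1.0
    private final AtomicLongArray buckets; // bucket i counts gaps of exactly i ms

    UpdateDeltaHistogram(int thresholdMillis, double sampleRate) {
        this.thresholdMillis = thresholdMillis;
        this.sampleRate = sampleRate;
        this.buckets = new AtomicLongArray(thresholdMillis);
    }

    /** Records the gap between two writes to the same column if it is small enough and sampled. */
    void maybeRecord(long previousTimestampMillis, long newTimestampMillis) {
        long delta = Math.abs(newTimestampMillis - previousTimestampMillis);
        if (delta >= thresholdMillis)
            return; // gap too large to matter for clock skew
        if (ThreadLocalRandom.current().nextDouble() >= sampleRate)
            return; // throttled: skip this sample
        buckets.incrementAndGet((int) delta);
    }

    long count(int deltaMillis) {
        return buckets.get(deltaMillis);
    }

    /** Smallest recorded gap in milliseconds, or -1 if nothing has been recorded. */
    long minObservedDelta() {
        for (int i = 0; i < buckets.length(); i++)
            if (buckets.get(i) > 0)
                return i;
        return -1;
    }

    public static void main(String[] args) {
        UpdateDeltaHistogram h = new UpdateDeltaHistogram(100, 1.0); // record everything under 100 ms
        h.maybeRecord(1000, 1005);
        h.maybeRecord(1000, 1005);
        h.maybeRecord(1000, 1200); // 200 ms gap, ignored
        System.out.println(h.minObservedDelta()); // prints 5
    }
}
```

Under these assumptions, the smallest populated bucket suggests an upper bound on the clock skew the application can tolerate for that workload.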