[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2015-02-03 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303273#comment-14303273
 ] 

Benedict commented on CASSANDRA-6106:
-------------------------------------

Since CASSANDRA-6108 has been shelved and the alternative follow-ups 
(including TimeUUID client-side timestamps) aren't slated for any time soon, 
it seems we should consider introducing this in 3.0 to fix (or limit) the bug 
with resolution of conflicting data with the same timestamp. We should also 
ask the driver maintainers to do this on their side for v3.0+ protocol 
implementations.



 Provide timestamp with true microsecond resolution
 --------------------------------------------------

 Key: CASSANDRA-6106
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: DSE Cassandra 3.1, but also HEAD
Reporter: Christopher Smith
Assignee: Benedict
Priority: Minor
  Labels: timestamps
 Fix For: 3.0

 Attachments: microtimstamp.patch, microtimstamp_random.patch, 
 microtimstamp_random_rev2.patch


 I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
 mentioned issues with millisecond rounding in timestamps and was able to 
 reproduce the issue. If I specify a timestamp in a mutating query, I get 
 microsecond precision, but if I don't, I get timestamps rounded to the 
 nearest millisecond, at least for my first query on a given connection, which 
 substantially increases the possibilities of collision.
 I believe I found the offending code, though I am by no means sure this is 
 comprehensive. I think we probably need a fairly comprehensive replacement of 
 all uses of System.currentTimeMillis() with System.nanoTime().
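
 A minimal sketch of the rounding described above, not the actual Cassandra 
 code: a microsecond timestamp derived from System.currentTimeMillis() always 
 has zero in its sub-millisecond digits, so two writes in the same millisecond 
 collide. The class and method names are hypothetical. Note also that 
 System.nanoTime() has an arbitrary origin, so it cannot stand in for wall-clock 
 time on its own.

```java
import java.util.concurrent.TimeUnit;

public class MillisRounding {
    // Hypothetical helper: widen a millisecond wall-clock value to microseconds.
    // The three low-order decimal digits of the result are always zero.
    static long microsFromMillis(long millis) {
        return TimeUnit.MILLISECONDS.toMicros(millis);
    }

    public static void main(String[] args) {
        long a = microsFromMillis(System.currentTimeMillis());
        long b = microsFromMillis(System.currentTimeMillis());
        // Two calls within the same millisecond yield identical timestamps.
        System.out.println(a + " " + b + " sub-ms part: " + (a % 1000));
    }
}
```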



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-06-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036168#comment-14036168
 ] 

Jonathan Ellis commented on CASSANDRA-6106:
-------------------------------------------

Missed the window to get this into 2.0, and for 3.0 we're planning to switch to 
a unique-per-client time id instead (CASSANDRA-6108).



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-26 Thread Christopher Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982089#comment-13982089
 ] 

Christopher Smith commented on CASSANDRA-6106:
----------------------------------------------

Sorry, I've been crazy busy this week. I'll look this over today or tomorrow 
and provide feedback.

 Provide timestamp with true microsecond resolution
 --------------------------------------------------

 Key: CASSANDRA-6106
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: DSE Cassandra 3.1, but also HEAD
Reporter: Christopher Smith
Assignee: Benedict
Priority: Minor
  Labels: timestamps
 Fix For: 2.1 beta2



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979644#comment-13979644
 ] 

Sylvain Lebresne commented on CASSANDRA-6106:
---------------------------------------------

Sorry, but I still find it a tad more complicated than what we really need 
(I'll note that I really don't want us to screw up timestamps due to some 
rounding error or whatnot, which is why I rather strongly care about 
simplicity). I'm probably stupid, but I still need to sit with a pen and paper 
for a few minutes to understand the arithmetic and check that no edge case is 
left unhandled, and even then I can't totally convince myself that the whole 
adjusting business really provides us practical benefits (given how timestamps 
are used) over just brute-forcing monotonicity, like we currently do in 
QueryState, in the rare cases where clocks go slightly backward. But for the 
record, I'll say on the patch that:
* QueryState#getTimestamp would need to be changed, or this isn't actually 
used by user queries.
* I can't totally reconcile saying that {{clock_gettime}} is a bit slow with 
still having it called on a query thread, even if it's only once per second 
(which is not even guaranteed, because once the validUntilNanos expires, 
multiple threads might fight over updating the spec). Especially when, if the 
call is slow for some reason, we incur an even greater cost by retrying the 
call up to 3 times. Not a huge problem I suppose, but not ideal in my book.

So anyway, I've pushed at 
https://github.com/pcmanus/cassandra/commits/6106_alternative what I think is a 
simpler yet sufficient solution, so that I'd rather go with that (unless 
there's an obvious big problem with it I've missed).  
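
For reference, brute-forcing monotonicity can be sketched as below. This is an 
illustration under assumptions, not the actual QueryState code: the class and 
method names are hypothetical, and the candidate timestamp is passed in so the 
behaviour is easy to check.

```java
import java.util.concurrent.atomic.AtomicLong;

public class MonotonicMicros {
    // Last timestamp handed out; starts below any realistic clock value.
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    // Returns candidateMicros if the clock advanced, otherwise last + 1,
    // so the returned values are strictly increasing.
    long next(long candidateMicros) {
        return last.updateAndGet(prev -> Math.max(prev + 1, candidateMicros));
    }

    public static void main(String[] args) {
        MonotonicMicros m = new MonotonicMicros();
        System.out.println(m.next(100)); // clock reading accepted as-is
        System.out.println(m.next(100)); // same reading again: bumped by one
        System.out.println(m.next(90));  // clock went backward: bumped again
    }
}
```

The trade-off is that a clock that steps backward makes timestamps drift ahead 
of wall-clock time by one microsecond per call until the clock catches up.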




[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979668#comment-13979668
 ] 

Benedict commented on CASSANDRA-6106:
-------------------------------------

Well, if we're not worried about monotonicity, we can certainly simplify the 
patch - but I would much rather not introduce a new thread for updating the 
spec; if we're incurring the micros cost on a seconds time horizon, incurring 
it a few times each second is irrelevant (even a few times per thread in the 
worst case), it's just a matter of ensuring that we don't incur those micros 
costs for _every_ query, as it would be a really noticeable percentage of total 
operation time. We already have far more threads than necessary, so let's not 
introduce one here where it's not needed (note that the 3x loop is only ever 
going to occur if we overlap with a GC pause, so we're only likely to loop 
twice at most, and almost never more than once).

Still, I'd much prefer that we just get comfortable with the math, since if we 
don't address it now it will go on the backburner and be forgotten. Correctness 
relies on integer arithmetic always truncating instead of rounding, so there's 
no floating point weirdness or edge cases to contend with. Behaviour is the 
same for all value ranges. We always apply a delta to a fixed offset, and the 
delta is calculated by multiplying by a monotonically increasing value and 
dividing by a fixed value; as such, the monotonically increasing multiplier 
either pushes the combined value over into the next whole number after 
division, or it doesn't. Since we are also guaranteed to only ever move by at 
most 10% of the elapsed time interval, the net effect is that (at most) 1 
microsecond in 10 simply appears to take 2 microseconds to elapse. 

TL;DR: every x microseconds we deduct an extra 1 microsecond when adjusting 
backwards, so we can only ever stall time (by 1 microsecond), never jump 
backwards.

Either way, since your patch applies no monotonicity guarantees, any slight 
risk I'm wrong in the analysis really doesn't seem to be important - it's still 
much better than without it, if that's your only concern?



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979722#comment-13979722
 ] 

Sylvain Lebresne commented on CASSANDRA-6106:
---------------------------------------------

bq. it's still much better than without it, if that's your only concern?

I'm not afraid of slight microsecond imprecision that wouldn't matter; I'm 
afraid of returning a timestamp that is completely broken in some edge case, 
and the more arithmetic is going on, the more risk there is. Sure, we can 
double and triple check the math to convince ourselves, it's just that I don't 
think your solution brings any real benefit in practice for 
conflict-resolution timestamps over my proposition, and I think my solution is 
conceptually simpler, and I think we should always go for simpler when we can, 
and I think we can.

Now, I've said enough about my view on the ticket itself (which I still 
halfway think could be closed as won't fix, since at the end of the day the 
real problem for which it was opened is really CASSANDRA-6123) and on your 
branch (for which I don't see the point of getting comfortable with the math 
when there is a simpler solution imo). I don't see much to add at this point. 
I'm not vetoing your solution, I just can't +1 it when I think my solution is 
a tad better (because simpler). Let's have someone else look at it and 
formulate an opinion; probably I'm just being difficult for lack of sleep.



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979758#comment-13979758
 ] 

Benedict commented on CASSANDRA-6106:
-------------------------------------

Well, what I should have made clear is that I am willing to drop the 
monotonicity guarantees, however I am -1 on your extra thread.

But I still think the monotonicity guarantees are good, and not so difficult to 
prove, so if we can get somebody who doesn't have a newborn to contend with to 
take a look maybe that wouldn't be a bad thing :)

In case it helps, here's a quick proof we can never give a whack value:

{noformat}
1. -100000 <= adjustMicros <= 100000
2. expire-adjustFrom = 1000000000
2a. expireMicros-adjustFromMicros = 1000000
3. adjustFromMicros <= micros <= expireMicros
4. delta = (adjustMicros * (micros-adjustFromMicros)) / 
(expireMicros-adjustFromMicros)
5. 2a ^ 3 ^ 4 -> expireMicros-adjustFromMicros >= micros-adjustFromMicros -> 
|delta| <= |adjustMicros|
{noformat}

i.e. the adjustment is definitely always less than adjustMicros, which is 
itself always less than 100ms per second (per 1 and 2). So we can never give a 
totally whack result. Can do more thorough proofs of other criteria, but I 
think this plus my other statement is enough to demonstrate its safety.
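
A worked instance of step 4 may help. The sketch below is not the patch 
itself: the class name is hypothetical, and the sample values assume a 
one-second window (1000000 micros) and a maximum adjustment of 100ms (100000 
micros), as the prose above describes.

```java
public class AdjustDelta {
    // Step 4 of the proof: long division truncates toward zero in Java,
    // so the result never exceeds adjustMicros in magnitude while
    // adjustFromMicros <= micros <= expireMicros.
    static long delta(long adjustMicros, long adjustFromMicros,
                      long expireMicros, long micros) {
        return (adjustMicros * (micros - adjustFromMicros))
                / (expireMicros - adjustFromMicros);
    }

    public static void main(String[] args) {
        // Halfway through the window, half of the adjustment is applied.
        System.out.println(delta(-100_000, 0, 1_000_000, 500_000));
        // At the end of the window, the full adjustment is applied.
        System.out.println(delta(-100_000, 0, 1_000_000, 1_000_000));
        // 19 micros in: -1.9 truncates to -1, it does not round to -2.
        System.out.println(delta(-100_000, 0, 1_000_000, 19));
    }
}
```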



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979769#comment-13979769
 ] 

Benedict commented on CASSANDRA-6106:
-------------------------------------

Also, just in case overflow might be considered an issue: per 1 and 2a, we have 
adjustMicros * (micros-adjustFromMicros) <= 100 billion, which is well within 
the limits of safe long values.
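
The headroom can be checked mechanically. This sketch assumes the bounds are 
100000 micros for the adjustment and 1000000 micros for the window (both 
constants are named here for illustration only); Math.multiplyExact throws 
ArithmeticException if the product overflows a long.

```java
public class OverflowCheck {
    // Assumed bounds: at most 100ms of adjustment over a one-second window.
    static final long MAX_ADJUST_MICROS = 100_000L;
    static final long MAX_ELAPSED_MICROS = 1_000_000L;

    // Largest intermediate product the delta calculation can produce.
    static long worstCaseProduct() {
        return Math.multiplyExact(MAX_ADJUST_MICROS, MAX_ELAPSED_MICROS);
    }

    public static void main(String[] args) {
        long product = worstCaseProduct();
        System.out.println("worst-case product: " + product);
        System.out.println("headroom factor:    " + (Long.MAX_VALUE / product));
    }
}
```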



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979841#comment-13979841
 ] 

Jonathan Ellis commented on CASSANDRA-6106:
-------------------------------------------

Do you have time to look at the math, [~xcbsmith]?  The code itself is a single 
class and reasonably straightforward.



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980097#comment-13980097
 ] 

Benedict commented on CASSANDRA-6106:
-------------------------------------

Just to add to the analysis, since it's quite computationally tractable I have 
uploaded a patch to the branch which brute force checks every possible 
computation to ensure the result is always monotonically increasing, and within 
the bounds of what is expected. I have run this to completion and indeed all of 
my statements above check out empirically.
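
A cut-down sketch of that kind of check follows. It is not the test uploaded to 
the branch: the names are hypothetical, the window is assumed to be 1000000 
micros with adjustments of at most 100000 micros, and main() only samples 
adjustment values rather than enumerating all of them.

```java
public class MonotonicityCheck {
    static final long INTERVAL_MICROS = 1_000_000L;   // assumed window length
    static final long MAX_ADJUST_MICROS = 100_000L;   // assumed adjustment bound

    // Elapsed time plus the truncated share of the adjustment applied so far.
    static long adjusted(long adjustMicros, long elapsedMicros) {
        return elapsedMicros + (adjustMicros * elapsedMicros) / INTERVAL_MICROS;
    }

    // True if, over every microsecond of the window, the adjusted time never
    // decreases and the applied delta never exceeds |adjustMicros|.
    static boolean holdsFor(long adjustMicros) {
        long prev = adjusted(adjustMicros, 0);
        for (long elapsed = 1; elapsed <= INTERVAL_MICROS; elapsed++) {
            long cur = adjusted(adjustMicros, elapsed);
            if (cur < prev || Math.abs(cur - elapsed) > Math.abs(adjustMicros))
                return false;
            prev = cur;
        }
        return true;
    }

    public static void main(String[] args) {
        // Sample the adjustment range, including both extremes and zero.
        for (long adjust = -MAX_ADJUST_MICROS; adjust <= MAX_ADJUST_MICROS; adjust += 12_500)
            if (!holdsFor(adjust))
                throw new AssertionError("failed for adjust=" + adjust);
        System.out.println("all sampled adjustments are monotonic and bounded");
    }
}
```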



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-23 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978014#comment-13978014
 ] 

Benedict commented on CASSANDRA-6106:
-------------------------------------

One last tweak: I wanted to make us just a little tolerant of really whack 
values caused by GC during a measurement, but I've done this in a way that 
makes the code neater rather than uglier.



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-22 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976949#comment-13976949
 ] 

Sylvain Lebresne commented on CASSANDRA-6106:
---------------------------------------------

I'd like to summarize my understanding of what we're trying to fix here.

As far as conflict resolution goes, microsecond resolution is imo rather 
useless. Given the accuracy of ntp, network latencies and whatnot, no 
application should ever rely on sub-millisecond resolution for conflicts, and 
any application that relies on fine-grained ordering of updates to a cell 
should really provide client-side timestamps. It doesn't mean we can't use 
microsecond resolution if it's easy of course, but it does mean that imo the 
bar on what complexity is worth it is rather low.

This was not the original motivation of this ticket, however. The original 
motivation was to limit the chance of 2 updates A and B getting the exact same 
timestamp, because when that happens, we could end up with some cells from A 
and some cells from B. I think we all agreed that the proper fix for that was 
more complicated and left to CASSANDRA-6123. Yet, as I said earlier, since that 
fix is much more complicated, I'm fine lowering the chances of timestamp 
conflicts in the meantime if that's easy for us (less often broken is somewhat 
better than more often broken, even if not broken is obviously better). But for 
this point, Christopher's solution of randomizing the microsecond bits was 
actually really simple and probably good enough.
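
For readers who haven't opened the attachments, the idea of randomizing the 
microsecond bits can be sketched as follows. This is an illustration only, not 
the contents of microtimstamp_random.patch; the class and method names are 
hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

public class RandomizedMicros {
    // Combine a millisecond clock reading with a sub-millisecond part (0..999).
    static long withSubMillis(long millis, int subMillisMicros) {
        return TimeUnit.MILLISECONDS.toMicros(millis) + subMillisMicros;
    }

    // Millisecond wall clock with random microsecond digits: two writes in the
    // same millisecond now collide with probability 1/1000 instead of always.
    static long now() {
        return withSubMillis(System.currentTimeMillis(),
                             ThreadLocalRandom.current().nextInt(1000));
    }

    public static void main(String[] args) {
        System.out.println(now());
        System.out.println(now());
    }
}
```

The random digits only disambiguate; they say nothing about which of two 
writes within the same millisecond actually happened first.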

And to be honest, the complexity of Benedict's branch is above what I consider 
reasonable for the concrete problem at hand. I'm surely not very smart, but it 
doesn't fit my own definition of straightforward. I'm not saying that it's the 
most complicated thing ever, but it's complicated enough to make me 
uncomfortable, given that even some simple rounding error on the timestamp 
could basically destroy user data.

I'm also not convinced we need that complexity in practice. What about just 
having a thread call clock_gettime followed by nanoTime every second or so, and 
then adding the nanoTime elapsed since that last clock_gettime call to get the 
current time? It might not give us the very best timestamp we can get, but it's 
imo largely good enough for our purpose (and for clocks going back in time, we 
already handle that in a brute-force kind of way in QueryState, which is again 
imo good enough).
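
A rough sketch of that proposal, under stated assumptions: 
System.currentTimeMillis() stands in for clock_gettime (which would need a 
native call), resync() is what the background thread would invoke every second 
or so, and all names are hypothetical rather than taken from the 
6106_alternative branch.

```java
import java.util.concurrent.TimeUnit;

public class AnchoredClock {
    // Immutable pair, so readers always see a wall-clock reading together
    // with the nanoTime value captured right after it.
    static final class Anchor {
        final long wallMicros;
        final long nanoTime;
        Anchor(long wallMicros, long nanoTime) {
            this.wallMicros = wallMicros;
            this.nanoTime = nanoTime;
        }
    }

    private volatile Anchor anchor;

    AnchoredClock() {
        resync();
    }

    // Re-read the wall clock; intended to run periodically off the query path.
    void resync() {
        long wallMicros = TimeUnit.MILLISECONDS.toMicros(System.currentTimeMillis());
        anchor = new Anchor(wallMicros, System.nanoTime());
    }

    // Wall-clock micros at the anchor plus the nanoTime elapsed since then.
    static long microsAt(Anchor a, long nanoTimeNow) {
        return a.wallMicros + TimeUnit.NANOSECONDS.toMicros(nanoTimeNow - a.nanoTime);
    }

    long currentMicros() {
        return microsAt(anchor, System.nanoTime());
    }

    public static void main(String[] args) {
        AnchoredClock clock = new AnchoredClock();
        System.out.println(clock.currentMicros());
        System.out.println(clock.currentMicros());
    }
}
```

Nothing here prevents a backward step when resync() picks up a corrected wall 
clock; the proposal leaves that to the existing handling in QueryState.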



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976965#comment-13976965
 ] 

Benedict commented on CASSANDRA-6106:
-------------------------------------

Well, I was only implementing what was asked: microsecond-accurate timestamps. 
If we just want disambiguation, that's a different matter. But since 
sub-millisecond network trips are pretty easy within a rack, I'm not sure that 
random is a good thing.



[jira] [Commented] (CASSANDRA-6106) Provide timestamp with true microsecond resolution

2014-04-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977322#comment-13977322
 ] 

Benedict commented on CASSANDRA-6106:
-------------------------------------

In case it helps at all, I've commented it heavily and simplified the logic 
quite a bit, by removing the test on time elapsed to grab the realtime 
offset, as the effect will be pretty minimal even if it gets a temporarily 
whack value. It really isn't actually super complicated, but it was a bit ugly 
to read and non-obvious without comments.

It's worth noting that having a monotonically increasing time source is 
probably a good thing in and of itself, which this also provides.

I've rebased and pushed -f
