Re: Why does `now()` produce different times within the same query?

Bruce Heath Thu, 01 Dec 2016 07:57:33 -0800

________________________________
From: Edward Capriolo <edlinuxg...@gmail.com>
Sent: Thursday, December 1, 2016 10:44:10 AM
To: user@cassandra.apache.org
Subject: Re: Why does `now()` produce different times within the same query?

On Thu, Dec 1, 2016 at 4:06 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
One can of course always open a JIRA, but I'm going to strongly disagree with a
change here (outside of a documentation one that is).

The now() function is a timeuuid generator, and it thus generates a unique
timeuuid on every call, as specified by the timeuuid spec. I'll note that the
documentation lists it under "Timeuuid functions", and has sentences like
"the value returned by now() is guaranteed to be unique", so while I'm sure the
documentation can be further clarified, I think it's pretty clear it's not the
now() of SQL, and getting unique values on every call shouldn't be *that*
surprising.

Also, now() was primarily meant for use on timeuuid clustering columns for a
time-series like table, something like:
  CREATE TABLE ts (
    k int,
    t timeuuid,
    v text,
    PRIMARY KEY (k, t)
  );
and if you use it multiple times in a batch, this would look something like:
  BEGIN BATCH
    INSERT INTO ts (k, t, v) VALUES (0, now(), 'foo');
    INSERT INTO ts (k, t, v) VALUES (0, now(), 'bar');
  APPLY BATCH;
and you definitely want that to insert 2 "events", not just one.

This is also why changing the behavior of this method *would* be a breaking
change.
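
To illustrate why the batch above depends on distinct values, here is a minimal
Python sketch (not Cassandra code). It assumes Python's standard uuid.uuid1()
as a stand-in for now(), and models one partition of the ts table as a dict
keyed by the primary key (k, t). The insert() helper is hypothetical.

```python
import uuid

def insert(table: dict, k: int, t: uuid.UUID, v: str) -> None:
    """Upsert a row; rows with the same primary key (k, t) overwrite each other."""
    table[(k, t)] = v

# A fresh version-1 UUID per call, the way now() behaves today.
per_call = {}
insert(per_call, 0, uuid.uuid1(), "foo")
insert(per_call, 0, uuid.uuid1(), "bar")

# One value reused for the whole batch (hypothetical "evaluate once" behaviour).
shared = uuid.uuid1()
once = {}
insert(once, 0, shared, "foo")
insert(once, 0, shared, "bar")

print(len(per_call))  # both events are stored
print(len(once))      # the second insert overwrote the first
```

With one shared value, the second insert lands on the same primary key and
silently replaces the first event.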

Another reason this works the way it does is that functions in CQL are just that,
functions. Each execution is unique and they have no notion of being executed in
the same statement/batch/whatever. I actually think this is sensible, assuming
one stops being obsessed with what other databases that aren't Apache Cassandra
do.

I will note that Ben seems to suggest keeping the return of now() unique across
calls while keeping the time component equal, thus varying the rest of the uuid
bytes. However:
 - I'm starting to wonder what this would buy us. Why would someone be very
   confused by the time changing across calls (in a single statement/batch), but
   not at all confused by the full return values being unequal? And how is
   that actually useful: you get different results anyway, and you're
   letting the server pick the timestamp in the first place, so you probably
   don't care about millisecond precision of that timestamp.
 - This would basically be a violation of the timeuuid spec.
 - This would be a big pain in the code and make now() a special case
   among functions. I'm unconvinced special cases make things easier
   in general.
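
For readers unfamiliar with the layout being discussed, here is a rough Python
sketch of what "same time component, different remaining bytes" could look
like. It is only an illustration of the proposal, not how Cassandra builds
timeuuids, and the objections listed above still apply. The helper
timeuuid_from_parts() and the choice to vary the clock sequence are
assumptions made for this example.

```python
import secrets
import uuid

def timeuuid_from_parts(ticks: int, clock_seq: int, node: int) -> uuid.UUID:
    """Assemble a version-1 UUID from a 60-bit timestamp (100 ns ticks since
    1582-10-15), a 14-bit clock sequence and a 48-bit node id."""
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000       # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80   # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))

# Timestamp captured once, e.g. when the statement starts executing.
statement_ticks = uuid.uuid1().time
node = uuid.getnode()

# Two "calls" share the timestamp and differ only in the clock sequence.
first_seq = secrets.randbelow(1 << 14)
second_seq = (first_seq + 1) % (1 << 14)
a = timeuuid_from_parts(statement_ticks, first_seq, node)
b = timeuuid_from_parts(statement_ticks, second_seq, node)

print(a.time == b.time, a != b)
```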

So I'm all for improving the documentation if this confuses users due to
expectations (mistakenly) carried from prior experiences, and please
feel free to open a JIRA for that. I'm a lot less in agreement that there is
something wrong with the way the function behaves in principle.

> I can see why this issue has been largely ignored and hasn't had a chance for
> the behaviour to be formally defined

Don't make too many assumptions. The behavior is perfectly well defined: now()
is a "normal" function and is evaluated whenever it's called according to the
timeuuid spec (or as close to it as we can make it).

On Thu, Dec 1, 2016 at 7:25 AM, Benjamin Roth <benjamin.r...@jaumo.com> wrote:

Great comment. +1

On 01.12.2016 at 06:29, "Ben Bromhead" <b...@instaclustr.com> wrote:
tl;dr +1 yup raise a jira to discuss how now() should behave in a single 
statement (and possibly extend to batch statements).

The values of now should be the same if you assume that now() works like it 
does in relational databases such as postgres or mysql, however at the moment 
it instead works like sysdate() in mysql. Given that CQL is supposed to be SQL 
like, I think the assumption around the behaviour of now() was a fair one to 
make.

I definitely agree that raising a jira ticket would be a great place to discuss 
what the behaviour of now() should be for Cassandra. Personally I would be in 
favour of seeing the deterministic component (the actual time part) being the 
same across multiple calls in the one statement or multiple statements in a 
batch.

Cassandra documentation does not make any claims as to how now() works within a 
single statement and reading the code it shows the intent is to work like 
sysdate() from MySQL rather than now(). One of the identified dangers of making 
cql similar to sql is that, while yes it aids adoption, users will find that 
SQL like things don't behave as expected. Of course as a user, one shouldn't 
have to read the source code to determine correct behaviour.

Given that a timeuuid is made up of deterministic and (pseudo) 
non-deterministic components I can see why this issue has been largely ignored 
and hasn't had a chance for the behaviour to be formally defined (you would 
expect now to return the same time in the one statement despite multiple calls, 
but you wouldn't expect the same behaviour for say a call to rand()).

On Wed, 30 Nov 2016 at 19:54 Cody Yancey <yan...@uber.com> wrote:
    This is not a bug, and in fact changing it would be a serious bug.

False. Absolutely no consumer would be broken by a change to guarantee an 
identical time component that isn't broken already, for the simple reason your 
code already has to handle that case, as it is in fact the majority case RIGHT 
NOW. Users can hit this bug, in production, because unit tests might not 
have experienced it! The time component should be the time that the command was 
processed by the coordinator node.

     would one expect a java/py/bash script that loops

Individual Cassandra writes (which is what OP is referring to specifically) are 
not loops. They are in almost every case atomic operations that either succeed 
completely or fail completely. Allowing a single atomic operation to witness 
multiple times in these corner cases is not only surprising, as this thread 
demonstrates, it also needlessly restricts what developers can use the 
database for, and provides NO BENEFIT.

    Calling now PRIOR to initiating multiple inserts is in most cases exactly 
what one does...the ONLY practice is to set the value before initiating the 
sequence of calls

Also false. Cassandra does not have a way of doing this on the coordinator node 
rather than the client device, and as I already showed, the client device is 
the wrong place to do it in situations where guaranteeing bounded clock-skew 
actually makes a difference one way or the other.

Thanks,
Cody

On Wed, Nov 30, 2016 at 8:02 PM, daemeon reiydelle <daeme...@gmail.com> wrote:
This is not a bug, and in fact changing it would be a serious bug.

What it is is a wonderful case of bad coding: would one expect a java/py/bash 
script that loops on a bunch of read/execute/update calls where each iteration 
calls time to return the same exact time for the duration of the execution of 
the code? Whether the code runs for 5 seconds or 5 hours?

Every call to a system call is unique, including within C*. Calling now PRIOR 
to initiating multiple inserts is in most cases exactly what one does to assure 
unique time stamps FOR THE BATCH OF INSERTS. To get a nearly identical system 
time as would be the uuid of the row, one tries to call time as close to just 
before the insert as possible. Then repeat.

You have a logic issue in your code. If you want the same value for a set of 
calls, the ONLY practice is to set the value before initiating the sequence of 
calls.

.......

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey <yan...@uber.com> wrote:
Getting the same TimeUUID values might be a major problem. Getting two 
different TimeUUIDs that at least have the same time component would not be a 
major problem, as this is the main case today. Getting different time components is 
actually the corner case, and it is a corner case that breaks 
Internet-of-Things applications. We can tightly control clock skew in our 
cluster. We most definitely CANNOT control clock skew on the thousands of 
sensors that write to our cluster.

Thanks,
Cody

On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille <rwi...@fold3.com> wrote:
In my opinion, this is not broken and "fixing" it would break existing code. 
Consider a batch that includes multiple inserts, each of which inserts the 
value returned by now(). Getting the same UUID for each insert would be a major 
problem.

Cheers

Robert

On Nov 30, 2016, at 4:46 PM, Todd Fast <t...@digitalexistence.com> wrote:

FWIW I'd suggest opening a bug--this behavior is certainly quite unexpected and 
more than just a documentation issue. In general I can't imagine any desirable 
properties of the current implementation, and there are likely a bunch of 
latent bugs sitting out there, so it should be fixed.

Todd

On Wed, Nov 30, 2016 at 12:37 PM Terry Liu <t...@turnitin.com> wrote:
Sorry for my typo. Obviously, I meant:
"It appears that a single query that calls Cassandra's `now()` time function 
multiple times may actually cause a query to write or return different times."

Less of a surprise now that I realize more about the implementation, but I 
agree that more explicit documentation around when exactly the "execution" of 
each now() statement happens and what implications it has for the resulting 
timestamps would be helpful when running into this.

Thanks for the quick responses!

-Terry

On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek <msval...@gmail.com> wrote:
Every now() call in a statement is, under the hood, "replaced" with a newly 
generated uuid.

It can happen that they belong to different milliseconds in time.

If you need to have the same timestamps, you need to set them on the client side.
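
A minimal Python sketch of the client-side approach: read the clock once and
bind that single value wherever the statement would otherwise call now().
The table and column names (events, created_at, updated_at, id) and the
bind_update() helper are made up for illustration, and the driver call is
left out because it needs a live cluster. Note that this uses the client's
clock, so it carries the clock-skew concern raised elsewhere in the thread.

```python
from datetime import datetime, timezone

# Hypothetical table and column names, for illustration only.
UPDATE_CQL = "UPDATE events SET created_at = ?, updated_at = ? WHERE id = ?"

def bind_update(row_id: int, now: datetime) -> tuple:
    """Bind the same client-side timestamp to both timestamp columns."""
    return (now, now, row_id)

# Read the client clock once, then reuse the value for every bound parameter.
client_now = datetime.now(timezone.utc)
params = [bind_update(row_id, client_now) for row_id in (1, 2, 3)]

# With a real driver, each tuple would be executed against a statement
# prepared from UPDATE_CQL; that step is omitted here.
print(all(p[0] == p[1] == client_now for p in params))
```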

@msvaljek<https://twitter.com/msvaljek>

2016-11-29 22:49 GMT+01:00 Terry Liu <t...@turnitin.com>:
It appears that a single query that calls Cassandra's `now()` time function may 
actually cause a query to write or return different times.

Is this the expected or defined behavior, and if so, why does it behave like 
this rather than evaluating `now()` once across an entire statement?

This really affects UPDATE statements but to test it more easily, you could try 
something like:

SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
FROM keyspace.table
LIMIT 100;

If you run that a few times, you should eventually see that the timestamp 
returned moves onto the next millisecond mid-query.
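
As a rough illustration of why this happens, the Python sketch below converts
version-1 UUID timestamps (100 ns ticks since 1582-10-15) to whole Unix
milliseconds, assuming the timestamp conversion truncates to millisecond
precision. Two values a single tick apart can still land in different
milliseconds. The names unix_millis(), before and after are assumptions made
for this example, not Cassandra internals.

```python
import uuid

# 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET_TICKS = 0x01B21DD213814000
TICKS_PER_MS = 10_000

def unix_millis(ticks: int) -> int:
    """Convert a version-1 UUID timestamp to whole milliseconds since 1970."""
    return (ticks - UUID_EPOCH_OFFSET_TICKS) // TICKS_PER_MS

# Pick the start of the current millisecond, expressed in UUID ticks.
now_ms = unix_millis(uuid.uuid1().time)
boundary = UUID_EPOCH_OFFSET_TICKS + now_ms * TICKS_PER_MS

# Two timestamps only 100 ns apart that straddle the millisecond boundary.
before = boundary - 1
after = boundary

print(unix_millis(after) - unix_millis(before))  # difference in whole milliseconds
```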

--
Software Engineer
Turnitin - http://www.turnitin.com
t...@turnitin.com

--
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer

I am not sure you saw my reply on the thread, but I believe everyone's needs can 
be met. I will copy it here:

"Food for thought: Hive's UDFs introduced an annotation  @UDFType(deterministic 
= false)

http://dmtolpeko.com/2014/10/15/invoking-stateful-udf-at-map-and-reduce-side-in-hive/

The effect is the query planner can see when such a UDF is in use and determine 
the value once at the start of a very long query."

Essentially, Hive had a similar if not identical problem: during a long-running 
distributed process like map/reduce, some users wanted the semantics of:

1) Each call should have a new timestamp

While other users wanted the semantics of:

2) Each call should generate the same timestamp

The solution implemented was to add an annotation to udf such that the query 
planner would pick up the annotation and act accordingly.

(Here is a related issue: https://issues.apache.org/jira/browse/HIVE-1986)

As a result, you can essentially implement two UDFs:

@UDFType(deterministic = false)
public class UDFNow

and for the other people

@UDFType(deterministic = true)
public class UDFNowOnce extends UDFNow

Both use cases are met in a sensible way.
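
To make the idea concrete, here is a toy Python sketch of a planner that
honours such a flag. It is not Hive's or Cassandra's implementation: the
udf() decorator, run_statement() and the fake counter clock are all
hypothetical names invented for this example.

```python
import itertools
from typing import Callable

def udf(deterministic: bool) -> Callable:
    """Hypothetical stand-in for an annotation like @UDFType: tags a function."""
    def mark(fn: Callable) -> Callable:
        fn.deterministic = deterministic
        return fn
    return mark

_clock = itertools.count(1)  # fake clock so the example is repeatable

@udf(deterministic=False)
def now() -> int:
    return next(_clock)

@udf(deterministic=True)
def now_once() -> int:
    return next(_clock)

def run_statement(fn: Callable, rows: int) -> list:
    """Toy planner: evaluate a deterministic function once, others per row."""
    if fn.deterministic:
        value = fn()
        return [value] * rows
    return [fn() for _ in range(rows)]

per_row = run_statement(now, 3)
once = run_statement(now_once, 3)
print(per_row, once)
```

The function bodies are identical; only the flag changes how often the
planner calls them, which is what lets both groups of users pick the
semantics they want.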
