This is not a bug, and in fact changing it would be a serious bug.

False. A change that guarantees an identical time component would not break
any consumer that isn't broken already, for the simple reason that your code
already has to handle that case: it is in fact the majority case RIGHT NOW.
Users can hit this bug in production precisely because unit tests might
never have exercised it! The time component should be the time at which the
command was processed by the coordinator node.
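
To make the idea concrete, here is a minimal Python sketch. It is not
Cassandra's implementation. The helper names (timeuuid_from,
now_values_for_statement) are hypothetical, and using the clock-sequence
field to keep the values distinct is my own assumption about one way it
could be done. The point is only that a statement can capture one timestamp
and still hand out different version-1 UUIDs that share a time component.

```python
import uuid

# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
UUID_EPOCH_OFFSET_100NS = 0x01B21DD213814000


def timeuuid_from(unix_millis: int, clock_seq: int, node: int) -> uuid.UUID:
    """Build a version-1 UUID from an explicit timestamp (hypothetical helper)."""
    ts = unix_millis * 10_000 + UUID_EPOCH_OFFSET_100NS  # 60-bit UUID timestamp
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | (1 << 12)  # version bits = 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(
        fields=(
            time_low,
            time_mid,
            time_hi_version,
            clock_seq_hi_variant,
            clock_seq_low,
            node,
        )
    )


def now_values_for_statement(count: int, unix_millis: int, node: int) -> list:
    """Return `count` distinct time UUIDs that all carry the same timestamp.

    The timestamp is captured once (passed in by the caller, standing in for
    the coordinator's clock); only the clock sequence differs between values.
    """
    if not 0 <= count <= 0x4000:
        raise ValueError("clock sequence is 14 bits; count must be <= 16384")
    return [timeuuid_from(unix_millis, seq, node) for seq in range(count)]
```

Called with one captured time, e.g. now_values_for_statement(2, millis, node),
both values are unequal as UUIDs but report the same .time, which is the
"different TimeUUIDs, same time component" behavior I am arguing for.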

     would one expect a java/py/bash script that loops

Individual Cassandra writes (which is what OP is referring to specifically)
are not loops. They are in almost every case atomic operations that either
succeed completely or fail completely. Allowing a single atomic operation
to witness multiple times in these corner cases is not only surprising, as
this thread demonstrates, it also needlessly restricts what developers can
use the database for, and provides NO BENEFIT.

    Calling now PRIOR to initiating multiple inserts is in most cases
exactly what one does...the ONLY practice is to set the value before
initiating the sequence of calls

Also false. Cassandra does not have a way of doing this on the coordinator
node rather than the client device, and as I already showed, the client
device is the wrong place to do it in situations where guaranteeing bounded
clock-skew actually makes a difference one way or the other.

Thanks,
Cody



On Wed, Nov 30, 2016 at 8:02 PM, daemeon reiydelle <daeme...@gmail.com>
wrote:

> This is not a bug, and in fact changing it would be a serious bug.
>
> What it is is a wonderful case of bad coding: would one expect a
> java/py/bash script that loops on a bunch of read/execute/update calls where
> each iteration calls time to return the same exact time for the duration of
> the execution of the code? Whether the code runs for 5 seconds or 5 hours?
>
> Every call to a system call is unique, including within C*. Calling now
> PRIOR to initiating multiple inserts is in most cases exactly what one does
> to assure unique time stamps FOR THE BATCH OF INSERTS. To get a nearly
> identical system time as would be the uuid of the row, one tries to call
> time as close to just before the insert as possible. Then repeat.
>
> You have a logic issue in your code. If you want the same value for a set
> of calls, the ONLY practice is to set the value before initiating the
> sequence of calls.
>
>
>
> *Daemeon C.M. Reiydelle*
> *USA (+1) 415.501.0198*
> *London (+44) (0) 20 8144 9872*
>
> On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey <yan...@uber.com> wrote:
>
>> Getting the same TimeUUID values might be a major problem. Getting two
>> different TimeUUIDs that at least have the same time component would not be
>> a major problem, as this is the main case today. Getting different time components
>> is actually the corner case, and it is a corner case that breaks
>> Internet-of-Things applications. We can tightly control clock skew in our
>> cluster. We most definitely CANNOT control clock skew on the thousands of
>> sensors that write to our cluster.
>>
>> Thanks,
>> Cody
>>
>> On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille <rwi...@fold3.com> wrote:
>>
>>> In my opinion, this is not broken and “fixing” it would break existing
>>> code. Consider a batch that includes multiple inserts, each of which
>>> inserts the value returned by now(). Getting the same UUID for each insert
>>> would be a major problem.
>>>
>>> Cheers
>>>
>>> Robert
>>>
>>>
>>> On Nov 30, 2016, at 4:46 PM, Todd Fast <t...@digitalexistence.com>
>>> wrote:
>>>
>>> FWIW I'd suggest opening a bug--this behavior is certainly quite
>>> unexpected and more than just a documentation issue. In general I can't
>>> imagine any desirable properties of the current implementation, and there
>>> are likely a bunch of latent bugs sitting out there, so it should be fixed.
>>>
>>> Todd
>>>
>>> On Wed, Nov 30, 2016 at 12:37 PM Terry Liu <t...@turnitin.com> wrote:
>>>
>>>> Sorry for my typo. Obviously, I meant:
>>>> "It appears that a single query that calls Cassandra's `now()` time
>>>> function *multiple times* may actually cause a query to write or
>>>> return different times."
>>>>
>>>> Less of a surprise now that I realize more about the implementation,
>>>> but I agree that more explicit documentation around when exactly the
>>>> "execution" of each now() statement happens and what implications it has
>>>> for the resulting timestamps would be helpful when running into this.
>>>>
>>>> Thanks for the quick responses!
>>>>
>>>> -Terry
>>>>
>>>>
>>>>
>>>> On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek <msval...@gmail.com>
>>>> wrote:
>>>>
>>>> Every now() call in a statement is, under the hood, "replaced" with a
>>>> newly generated UUID.
>>>>
>>>> It can happen that they fall in different milliseconds.
>>>>
>>>> If you need the same timestamps, you need to set them on the client
>>>> side.
>>>>
>>>>
>>>> @msvaljek <https://twitter.com/msvaljek>
>>>>
>>>> 2016-11-29 22:49 GMT+01:00 Terry Liu <t...@turnitin.com>:
>>>>
>>>> It appears that a single query that calls Cassandra's `now()` time
>>>> function may actually cause a query to write or return different times.
>>>>
>>>> Is this the expected or defined behavior, and if so, why does it behave
>>>> like this rather than evaluating `now()` once across an entire statement?
>>>>
>>>> This really affects UPDATE statements but to test it more easily, you
>>>> could try something like:
>>>>
>>>> SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
>>>> FROM keyspace.table
>>>> LIMIT 100;
>>>>
>>>> If you run that a few times, you should eventually see that the
>>>> timestamp returned moves onto the next millisecond mid-query.
>>>>
>>>> --
>>>> *Software Engineer*
>>>> Turnitin - http://www.turnitin.com
>>>> t...@turnitin.com
>>>>
>>>>
>>>
>>>
>>
>
