Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Benjamin Roth
Great comment. +1

On 01.12.2016 06:29, "Ben Bromhead" wrote:

> tl;dr +1 yup raise a jira to discuss how now() should behave in a single
> statement (and possibly extend to batch statements).
>
> The values of now should be the same if you assume that now() works like
> it does in relational databases such as postgres or mysql, however at the
> moment it instead works like sysdate() in mysql. Given that CQL is supposed
> to be SQL like, I think the assumption around the behaviour of now() was a
> fair one to make.
>
> I definitely agree that raising a jira ticket would be a great place to
> discuss what the behaviour of now() should be for Cassandra. Personally I
> would be in favour of seeing the deterministic component (the actual time
> part) being the same across multiple calls in the one statement or multiple
> statements in a batch.
>
> Cassandra documentation does not make any claims as to how now() works
> within a single statement and reading the code it shows the intent is to
> work like sysdate() from MySQL rather than now(). One of the identified
> dangers of making cql similar to sql is that, while yes it aids adoption,
> users will find that SQL like things don't behave as expected. Of course as
> a user, one shouldn't have to read the source code to determine correct
> behaviour.
>
> Given that a timeuuid is made up of deterministic and (pseudo)
> non-deterministic components I can see why this issue has been largely
> ignored and hasn't had a chance for the behaviour to be formally defined
> (you would expect now to return the same time in the one statement despite
> multiple calls, but you wouldn't expect the same behaviour for say a call
> to rand()).
>
> On Wed, 30 Nov 2016 at 19:54 Cody Yancey  wrote:
>
>> This is not a bug, and in fact changing it would be a serious bug.
>>
>> False. Absolutely no consumer would be broken by a change to guarantee an
>> identical time component that isn't broken already, for the simple reason
>> your code already has to handle that case, as it is in fact the majority
>> case RIGHT NOW. Users can hit this bug, in production, because unit tests
>> might not have experienced it! The time component should be the time that the
>> command was processed by the coordinator node.
>>
>>  would one expect a java/py/bash script that loops
>>
>> Individual Cassandra writes (which is what OP is referring to
>> specifically) are not loops. They are in almost every case atomic
>> operations that either succeed completely or fail completely. Allowing a
>> single atomic operation to witness multiple times in these corner cases is
>> not only surprising, as this thread demonstrates, it is also needlessly
>> restricting to what developers can use the database for, and provides NO
>> BENEFIT.
>>
>> Calling now PRIOR to initiating multiple inserts is in most cases
>> exactly what one does...the ONLY practice is to set the value before
>> initiating the sequence of calls
>>
>> Also false. Cassandra does not have a way of doing this on the
>> coordinator node rather than the client device, and as I already showed,
>> the client device is the wrong place to do it in situations where
>> guaranteeing bounded clock-skew actually makes a difference one way or the
>> other.
>>
>> Thanks,
>> Cody
>>
>>
>>
>> On Wed, Nov 30, 2016 at 8:02 PM, daemeon reiydelle 
>> wrote:
>>
>> This is not a bug, and in fact changing it would be a serious bug.
>>
>> What it is is a wonderful case of bad coding: would one expect a
>> java/py/bash script that loops on a bunch of read/execute/update calls where
>> each iteration calls time to return the same exact time for the duration of
>> the execution of the code? Whether the code runs for 5 seconds or 5 hours?
>>
>> Every call to a system call is unique, including within C*. Calling now
>> PRIOR to initiating multiple inserts is in most cases exactly what one does
>> to assure unique time stamps FOR THE BATCH OF INSERTS. To get a nearly
>> identical system time as would be the uuid of the row, one tries to call
>> time as close to just before the insert as possible. Then repeat.
>>
>> You have a logic issue in your code. If you want the same value for a set
>> of calls, the ONLY practice is to set the value before initiating the
>> sequence of calls.
>>
>>
>>
>> *...*
>>
>>
>>
>> *Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*
>>
>> On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey  wrote:
>>
>> Getting the same TimeUUID values might be a major problem. Getting two
>> different TimeUUIDs that at least have the same time component would not be a major
>> problem as this is the main case today. Getting different time components
>> is actually the corner case, and it is a corner case that breaks
>> Internet-of-Things applications. We can tightly control clock skew in our
>> cluster. We most 

Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Ben Bromhead
tl;dr +1 yup raise a jira to discuss how now() should behave in a single
statement (and possibly extend to batch statements).

The values of now should be the same if you assume that now() works like it
does in relational databases such as postgres or mysql, however at the
moment it instead works like sysdate() in mysql. Given that CQL is supposed
to be SQL like, I think the assumption around the behaviour of now() was a
fair one to make.

I definitely agree that raising a jira ticket would be a great place to
discuss what the behaviour of now() should be for Cassandra. Personally I
would be in favour of seeing the deterministic component (the actual time
part) being the same across multiple calls in the one statement or multiple
statements in a batch.

Cassandra documentation does not make any claims as to how now() works
within a single statement and reading the code it shows the intent is to
work like sysdate() from MySQL rather than now(). One of the identified
dangers of making cql similar to sql is that, while yes it aids adoption,
users will find that SQL like things don't behave as expected. Of course as
a user, one shouldn't have to read the source code to determine correct
behaviour.

Given that a timeuuid is made up of deterministic and (pseudo)
non-deterministic components I can see why this issue has been largely
ignored and hasn't had a chance for the behaviour to be formally defined
(you would expect now to return the same time in the one statement despite
multiple calls, but you wouldn't expect the same behaviour for say a call
to rand()).
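
To make the two kinds of component concrete, here is a minimal Python sketch.
It uses Python's standard uuid module as a stand-in for Cassandra's own
generator (an assumption: Cassandra's implementation is in Java and may differ
in detail). It splits a version 1 UUID into its time part and its clock
sequence and node parts. The helper name split_timeuuid is made up for this
example.

```python
import uuid

# 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def split_timeuuid(u: uuid.UUID) -> dict:
    """Split a version 1 UUID into its time part and its non-time parts."""
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    return {
        # Time part: milliseconds since the Unix epoch.
        "unix_ms": (u.time - UUID_EPOCH_OFFSET) // 10_000,
        # Non-time parts: these keep UUIDs distinct even within one millisecond.
        "clock_seq": u.clock_seq,
        "node": u.node,
    }


if __name__ == "__main__":
    # Two UUIDs generated back to back: distinct values, usually the same unix_ms.
    a, b = uuid.uuid1(), uuid.uuid1()
    print(split_timeuuid(a))
    print(split_timeuuid(b))
```

Two UUIDs generated back to back will normally report the same unix_ms and
still be distinct values, which is the distinction being drawn above.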

On Wed, 30 Nov 2016 at 19:54 Cody Yancey  wrote:

> This is not a bug, and in fact changing it would be a serious bug.
>
> False. Absolutely no consumer would be broken by a change to guarantee an
> identical time component that isn't broken already, for the simple reason
> your code already has to handle that case, as it is in fact the majority
> case RIGHT NOW. Users can hit this bug, in production, because unit tests
> might not have experienced it! The time component should be the time that the
> command was processed by the coordinator node.
>
>  would one expect a java/py/bash script that loops
>
> Individual Cassandra writes (which is what OP is referring to
> specifically) are not loops. They are in almost every case atomic
> operations that either succeed completely or fail completely. Allowing a
> single atomic operation to witness multiple times in these corner cases is
> not only surprising, as this thread demonstrates, it is also needlessly
> restricting to what developers can use the database for, and provides NO
> BENEFIT.
>
> Calling now PRIOR to initiating multiple inserts is in most cases
> exactly what one does...the ONLY practice is to set the value before
> initiating the sequence of calls
>
> Also false. Cassandra does not have a way of doing this on the coordinator
> node rather than the client device, and as I already showed, the client
> device is the wrong place to do it in situations where guaranteeing bounded
> clock-skew actually makes a difference one way or the other.
>
> Thanks,
> Cody
>
>
>
> On Wed, Nov 30, 2016 at 8:02 PM, daemeon reiydelle 
> wrote:
>
> This is not a bug, and in fact changing it would be a serious bug.
>
> What it is is a wonderful case of bad coding: would one expect a
> java/py/bash script that loops on a bunch of read/execute/update calls where
> each iteration calls time to return the same exact time for the duration of
> the execution of the code? Whether the code runs for 5 seconds or 5 hours?
>
> Every call to a system call is unique, including within C*. Calling now
> PRIOR to initiating multiple inserts is in most cases exactly what one does
> to assure unique time stamps FOR THE BATCH OF INSERTS. To get a nearly
> identical system time as would be the uuid of the row, one tries to call
> time as close to just before the insert as possible. Then repeat.
>
> You have a logic issue in your code. If you want the same value for a set
> of calls, the ONLY practice is to set the value before initiating the
> sequence of calls.
>
>
>
> *...*
>
>
>
> *Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*
>
> On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey  wrote:
>
> Getting the same TimeUUID values might be a major problem. Getting two
> different TimeUUIDs that at least have the same time component would not be a major
> problem as this is the main case today. Getting different time components
> is actually the corner case, and it is a corner case that breaks
> Internet-of-Things applications. We can tightly control clock skew in our
> cluster. We most definitely CANNOT control clock skew on the thousands of
> sensors that write to our cluster.
>
> Thanks,
> Cody
>
> On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille  wrote:
>
> In my opinion, this is not broken 

Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Edward Capriolo
On Wed, Nov 30, 2016 at 10:53 PM, Cody Yancey  wrote:

> This is not a bug, and in fact changing it would be a serious bug.
>
> False. Absolutely no consumer would be broken by a change to guarantee an
> identical time component that isn't broken already, for the simple reason
> your code already has to handle that case, as it is in fact the majority
> case RIGHT NOW. Users can hit this bug, in production, because unit tests
> might not have experienced it! The time component should be the time that the
> command was processed by the coordinator node.
>
>  would one expect a java/py/bash script that loops
>
> Individual Cassandra writes (which is what OP is referring to
> specifically) are not loops. They are in almost every case atomic
> operations that either succeed completely or fail completely. Allowing a
> single atomic operation to witness multiple times in these corner cases is
> not only surprising, as this thread demonstrates, it is also needlessly
> restricting to what developers can use the database for, and provides NO
> BENEFIT.
>
> Calling now PRIOR to initiating multiple inserts is in most cases
> exactly what one does...the ONLY practice is to set the value before
> initiating the sequence of calls
>
> Also false. Cassandra does not have a way of doing this on the coordinator
> node rather than the client device, and as I already showed, the client
> device is the wrong place to do it in situations where guaranteeing bounded
> clock-skew actually makes a difference one way or the other.
>
> Thanks,
> Cody
>
>
>
> On Wed, Nov 30, 2016 at 8:02 PM, daemeon reiydelle 
> wrote:
>
>> This is not a bug, and in fact changing it would be a serious bug.
>>
>> What it is is a wonderful case of bad coding: would one expect a
>> java/py/bash script that loops on a bunch of read/execute/update calls where
>> each iteration calls time to return the same exact time for the duration of
>> the execution of the code? Whether the code runs for 5 seconds or 5 hours?
>>
>> Every call to a system call is unique, including within C*. Calling now
>> PRIOR to initiating multiple inserts is in most cases exactly what one does
>> to assure unique time stamps FOR THE BATCH OF INSERTS. To get a nearly
>> identical system time as would be the uuid of the row, one tries to call
>> time as close to just before the insert as possible. Then repeat.
>>
>> You have a logic issue in your code. If you want the same value for a set
>> of calls, the ONLY practice is to set the value before initiating the
>> sequence of calls.
>>
>>
>>
>> *...*
>>
>>
>>
>> *Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*
>>
>> On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey  wrote:
>>
>>> Getting the same TimeUUID values might be a major problem. Getting two
>>> different TimeUUIDs that at least have the same time component would not be a major
>>> problem as this is the main case today. Getting different time components
>>> is actually the corner case, and it is a corner case that breaks
>>> Internet-of-Things applications. We can tightly control clock skew in our
>>> cluster. We most definitely CANNOT control clock skew on the thousands of
>>> sensors that write to our cluster.
>>>
>>> Thanks,
>>> Cody
>>>
>>> On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille  wrote:
>>>
>>>> In my opinion, this is not broken and “fixing” it would break existing
>>>> code. Consider a batch that includes multiple inserts, each of which
>>>> inserts the value returned by now(). Getting the same UUID for each insert
>>>> would be a major problem.
>>>>
>>>> Cheers
>>>>
>>>> Robert
>>>>
>>>>
>>>> On Nov 30, 2016, at 4:46 PM, Todd Fast wrote:
>>>>
>>>> FWIW I'd suggest opening a bug--this behavior is certainly quite
>>>> unexpected and more than just a documentation issue. In general I can't
>>>> imagine any desirable properties of the current implementation, and there
>>>> are likely a bunch of latent bugs sitting out there, so it should be fixed.
>>>>
>>>> Todd
>>>>
>>>> On Wed, Nov 30, 2016 at 12:37 PM Terry Liu  wrote:
>>>>
>>>>> Sorry for my typo. Obviously, I meant:
>>>>> "It appears that a single query that calls Cassandra's `now()` time
>>>>> function *multiple times* may actually cause a query to write or
>>>>> return different times."
>>>>>
>>>>> Less of a surprise now that I realize more about the implementation,
>>>>> but I agree that more explicit documentation around when exactly the
>>>>> "execution" of each now() statement happens and what implications it has
>>>>> for the resulting timestamps would be helpful when running into this.
>>>>>
>>>>> Thanks for the quick responses!
>>>>>
>>>>> -Terry
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek wrote:
>>>>>
>>>>> every now() call in statement is under the hood 

Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Cody Yancey
This is not a bug, and in fact changing it would be a serious bug.

False. Absolutely no consumer would be broken by a change to guarantee an
identical time component that isn't broken already, for the simple reason
your code already has to handle that case, as it is in fact the majority
case RIGHT NOW. Users can hit this bug, in production, because unit tests
might not have experienced it! The time component should be the time that the
command was processed by the coordinator node.

 would one expect a java/py/bash script that loops

Individual Cassandra writes (which is what OP is referring to specifically)
are not loops. They are in almost every case atomic operations that either
succeed completely or fail completely. Allowing a single atomic operation
to witness multiple times in these corner cases is not only surprising, as
this thread demonstrates, it is also needlessly restricting to what
developers can use the database for, and provides NO BENEFIT.

Calling now PRIOR to initiating multiple inserts is in most cases
exactly what one does...the ONLY practice is to set the value before
initiating the sequence of calls

Also false. Cassandra does not have a way of doing this on the coordinator
node rather than the client device, and as I already showed, the client
device is the wrong place to do it in situations where guaranteeing bounded
clock-skew actually makes a difference one way or the other.

Thanks,
Cody



On Wed, Nov 30, 2016 at 8:02 PM, daemeon reiydelle 
wrote:

> This is not a bug, and in fact changing it would be a serious bug.
>
> What it is is a wonderful case of bad coding: would one expect a
> java/py/bash script that loops on a bunch of read/execute/update calls where
> each iteration calls time to return the same exact time for the duration of
> the execution of the code? Whether the code runs for 5 seconds or 5 hours?
>
> Every call to a system call is unique, including within C*. Calling now
> PRIOR to initiating multiple inserts is in most cases exactly what one does
> to assure unique time stamps FOR THE BATCH OF INSERTS. To get a nearly
> identical system time as would be the uuid of the row, one tries to call
> time as close to just before the insert as possible. Then repeat.
>
> You have a logic issue in your code. If you want the same value for a set
> of calls, the ONLY practice is to set the value before initiating the
> sequence of calls.
>
>
>
> *...*
>
>
>
> *Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*
>
> On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey  wrote:
>
>> Getting the same TimeUUID values might be a major problem. Getting two
>> different TimeUUIDs that at least have the same time component would not be a major
>> problem as this is the main case today. Getting different time components
>> is actually the corner case, and it is a corner case that breaks
>> Internet-of-Things applications. We can tightly control clock skew in our
>> cluster. We most definitely CANNOT control clock skew on the thousands of
>> sensors that write to our cluster.
>>
>> Thanks,
>> Cody
>>
>> On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille  wrote:
>>
>>> In my opinion, this is not broken and “fixing” it would break existing
>>> code. Consider a batch that includes multiple inserts, each of which
>>> inserts the value returned by now(). Getting the same UUID for each insert
>>> would be a major problem.
>>>
>>> Cheers
>>>
>>> Robert
>>>
>>>
>>> On Nov 30, 2016, at 4:46 PM, Todd Fast 
>>> wrote:
>>>
>>> FWIW I'd suggest opening a bug--this behavior is certainly quite
>>> unexpected and more than just a documentation issue. In general I can't
>>> imagine any desirable properties of the current implementation, and there
>>> are likely a bunch of latent bugs sitting out there, so it should be fixed.
>>>
>>> Todd
>>>
>>> On Wed, Nov 30, 2016 at 12:37 PM Terry Liu  wrote:
>>>
>>>> Sorry for my typo. Obviously, I meant:
>>>> "It appears that a single query that calls Cassandra's `now()` time
>>>> function *multiple times* may actually cause a query to write or
>>>> return different times."
>>>>
>>>> Less of a surprise now that I realize more about the implementation,
>>>> but I agree that more explicit documentation around when exactly the
>>>> "execution" of each now() statement happens and what implications it has
>>>> for the resulting timestamps would be helpful when running into this.
>>>>
>>>> Thanks for the quick responses!
>>>>
>>>> -Terry
>>>>
>>>>
>>>>
>>>> On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek wrote:
>>>>
>>>> every now() call in statement is under the hood "replaced" with newly
>>>> generated uuid.
>>>>
>>>> It can happen that they belong to different milliseconds in time.
>>>>
>>>> If you need to have same timestamps you need to set them on the client
>>>> side.
>>>>


Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread daemeon reiydelle
This is not a bug, and in fact changing it would be a serious bug.

What it is is a wonderful case of bad coding: would one expect a
java/py/bash script that loops on a bunch of read/execute/update calls where
each iteration calls time to return the same exact time for the duration of
the execution of the code? Whether the code runs for 5 seconds or 5 hours?

Every call to a system call is unique, including within C*. Calling now
PRIOR to initiating multiple inserts is in most cases exactly what one does
to assure unique time stamps FOR THE BATCH OF INSERTS. To get a nearly
identical system time as would be the uuid of the row, one tries to call
time as close to just before the insert as possible. Then repeat.

You have a logic issue in your code. If you want the same value for a set
of calls, the ONLY practice is to set the value before initiating the
sequence of calls.
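
The "set the value before the sequence of calls" approach can be sketched in
plain Python, with no Cassandra driver involved. This is only an illustration
under stated assumptions: two_reads stands in for two now() calls that each
read the clock, one_read stands in for capturing the time once on the client,
and boundary_clock is a made-up clock whose two readings fall on either side of
a millisecond boundary. None of these names come from Cassandra.

```python
import time


def now_ms(clock=time.time_ns):
    """Read the given nanosecond clock and truncate to milliseconds."""
    return clock() // 1_000_000


def two_reads(clock=time.time_ns):
    """Like two now() calls in one statement: each call reads the clock."""
    return now_ms(clock), now_ms(clock)


def one_read(clock=time.time_ns):
    """Read the clock once up front and reuse the value for both columns."""
    t = now_ms(clock)
    return t, t


def boundary_clock():
    """Hypothetical clock whose two reads straddle a millisecond boundary."""
    readings = iter([1_999_999, 2_000_000])  # nanoseconds
    return lambda: next(readings)


if __name__ == "__main__":
    print(two_reads(boundary_clock()))  # (1, 2): the two columns disagree
    print(one_read(boundary_clock()))   # (1, 1): the two columns agree
```

With a real clock the mismatch in two_reads is rare, which is why it tends to
show up in production rather than in a quick test.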



*...*



*Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*

On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey  wrote:

> Getting the same TimeUUID values might be a major problem. Getting two
> different TimeUUIDs that at least have the same time component would not be a major
> problem as this is the main case today. Getting different time components
> is actually the corner case, and it is a corner case that breaks
> Internet-of-Things applications. We can tightly control clock skew in our
> cluster. We most definitely CANNOT control clock skew on the thousands of
> sensors that write to our cluster.
>
> Thanks,
> Cody
>
> On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille  wrote:
>
>> In my opinion, this is not broken and “fixing” it would break existing
>> code. Consider a batch that includes multiple inserts, each of which
>> inserts the value returned by now(). Getting the same UUID for each insert
>> would be a major problem.
>>
>> Cheers
>>
>> Robert
>>
>>
>> On Nov 30, 2016, at 4:46 PM, Todd Fast  wrote:
>>
>> FWIW I'd suggest opening a bug--this behavior is certainly quite
>> unexpected and more than just a documentation issue. In general I can't
>> imagine any desirable properties of the current implementation, and there
>> are likely a bunch of latent bugs sitting out there, so it should be fixed.
>>
>> Todd
>>
>> On Wed, Nov 30, 2016 at 12:37 PM Terry Liu  wrote:
>>
>>> Sorry for my typo. Obviously, I meant:
>>> "It appears that a single query that calls Cassandra's `now()` time
>>> function *multiple times* may actually cause a query to write or return
>>> different times."
>>>
>>> Less of a surprise now that I realize more about the implementation, but
>>> I agree that more explicit documentation around when exactly the
>>> "execution" of each now() statement happens and what implications it has
>>> for the resulting timestamps would be helpful when running into this.
>>>
>>> Thanks for the quick responses!
>>>
>>> -Terry
>>>
>>>
>>>
>>> On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek 
>>> wrote:
>>>
>>> every now() call in statement is under the hood "replaced" with newly
>>> generated uuid.
>>>
>>> It can happen that they belong to  different milliseconds in time.
>>>
>>> If you need to have same timestamps you need to set them on the client
>>> side.
>>>
>>>
>>> @msvaljek 
>>>
>>> 2016-11-29 22:49 GMT+01:00 Terry Liu :
>>>
>>> It appears that a single query that calls Cassandra's `now()` time
>>> function may actually cause a query to write or return different times.
>>>
>>> Is this the expected or defined behavior, and if so, why does it behave
>>> like this rather than evaluating `now()` once across an entire statement?
>>>
>>> This really affects UPDATE statements but to test it more easily, you
>>> could try something like:
>>>
>>> SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
>>> FROM keyspace.table
>>> LIMIT 100;
>>>
>>> If you run that a few times, you should eventually see that the
>>> timestamp returned moves onto the next millisecond mid-query.
>>>
>>> --
>>> *Software Engineer*
>>> Turnitin - http://www.turnitin.com
>>> t...@turnitin.com
>>>
>>>
>>>
>>>
>>>
>>> --
>>> *Software Engineer*
>>> Turnitin - http://www.turnitin.com
>>> t...@turnitin.com
>>>
>>
>>
>


Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Cody Yancey
Getting the same TimeUUID values might be a major problem. Getting two
different TimeUUIDs that at least have the same time component would not be a major
problem as this is the main case today. Getting different time components
is actually the corner case, and it is a corner case that breaks
Internet-of-Things applications. We can tightly control clock skew in our
cluster. We most definitely CANNOT control clock skew on the thousands of
sensors that write to our cluster.
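
As a sketch of what "different TimeUUIDs with the same time component" could
look like, the Python below builds version 1 UUIDs by hand from one fixed
millisecond timestamp and varies only the clock sequence. This is an
illustration, not Cassandra's implementation; the helper names timeuuid_at and
unix_ms_of and the node value are invented for the example.

```python
import uuid

# 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_at(unix_ms: int, clock_seq: int, node: int) -> uuid.UUID:
    """Build a version 1 UUID whose time component is exactly unix_ms."""
    ticks = unix_ms * 10_000 + UUID_EPOCH_OFFSET
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def unix_ms_of(u: uuid.UUID) -> int:
    """Recover the millisecond timestamp from a version 1 UUID."""
    return (u.time - UUID_EPOCH_OFFSET) // 10_000


if __name__ == "__main__":
    statement_time_ms = 1_480_550_400_000  # one timestamp for the whole statement
    node = 0x123456789ABC                  # made-up node id
    first = timeuuid_at(statement_time_ms, 1, node)
    second = timeuuid_at(statement_time_ms, 2, node)
    print(first != second)                             # distinct UUIDs
    print(unix_ms_of(first) == unix_ms_of(second))     # same time component
```

The clock sequence is only 14 bits wide, so a real generator would need its own
rule for what to do when one timestamp is reused more often than that.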

Thanks,
Cody

On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille  wrote:

> In my opinion, this is not broken and “fixing” it would break existing
> code. Consider a batch that includes multiple inserts, each of which
> inserts the value returned by now(). Getting the same UUID for each insert
> would be a major problem.
>
> Cheers
>
> Robert
>
>
> On Nov 30, 2016, at 4:46 PM, Todd Fast  wrote:
>
> FWIW I'd suggest opening a bug--this behavior is certainly quite
> unexpected and more than just a documentation issue. In general I can't
> imagine any desirable properties of the current implementation, and there
> are likely a bunch of latent bugs sitting out there, so it should be fixed.
>
> Todd
>
> On Wed, Nov 30, 2016 at 12:37 PM Terry Liu  wrote:
>
>> Sorry for my typo. Obviously, I meant:
>> "It appears that a single query that calls Cassandra's `now()` time
>> function *multiple times* may actually cause a query to write or return
>> different times."
>>
>> Less of a surprise now that I realize more about the implementation, but
>> I agree that more explicit documentation around when exactly the
>> "execution" of each now() statement happens and what implications it has
>> for the resulting timestamps would be helpful when running into this.
>>
>> Thanks for the quick responses!
>>
>> -Terry
>>
>>
>>
>> On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek 
>> wrote:
>>
>> every now() call in statement is under the hood "replaced" with newly
>> generated uuid.
>>
>> It can happen that they belong to  different milliseconds in time.
>>
>> If you need to have same timestamps you need to set them on the client
>> side.
>>
>>
>> @msvaljek 
>>
>> 2016-11-29 22:49 GMT+01:00 Terry Liu :
>>
>> It appears that a single query that calls Cassandra's `now()` time
>> function may actually cause a query to write or return different times.
>>
>> Is this the expected or defined behavior, and if so, why does it behave
>> like this rather than evaluating `now()` once across an entire statement?
>>
>> This really affects UPDATE statements but to test it more easily, you
>> could try something like:
>>
>> SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
>> FROM keyspace.table
>> LIMIT 100;
>>
>> If you run that a few times, you should eventually see that the timestamp
>> returned moves onto the next millisecond mid-query.
>>
>> --
>> *Software Engineer*
>> Turnitin - http://www.turnitin.com
>> t...@turnitin.com
>>
>>
>>
>>
>>
>> --
>> *Software Engineer*
>> Turnitin - http://www.turnitin.com
>> t...@turnitin.com
>>
>
>


Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Robert Wille
In my opinion, this is not broken and “fixing” it would break existing code. 
Consider a batch that includes multiple inserts, each of which inserts the 
value returned by now(). Getting the same UUID for each insert would be a major 
problem.

Cheers

Robert

On Nov 30, 2016, at 4:46 PM, Todd Fast wrote:

FWIW I'd suggest opening a bug--this behavior is certainly quite unexpected and 
more than just a documentation issue. In general I can't imagine any desirable 
properties of the current implementation, and there are likely a bunch of 
latent bugs sitting out there, so it should be fixed.

Todd

On Wed, Nov 30, 2016 at 12:37 PM Terry Liu wrote:

Sorry for my typo. Obviously, I meant:
"It appears that a single query that calls Cassandra's `now()` time function
multiple times may actually cause a query to write or return different times."

Less of a surprise now that I realize more about the implementation, but I 
agree that more explicit documentation around when exactly the "execution" of 
each now() statement happens and what implications it has for the resulting 
timestamps would be helpful when running into this.

Thanks for the quick responses!

-Terry



On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek wrote:

every now() call in statement is under the hood "replaced" with newly generated
uuid.

It can happen that they belong to  different milliseconds in time.

If you need to have same timestamps you need to set them on the client side.


@msvaljek

2016-11-29 22:49 GMT+01:00 Terry Liu:

It appears that a single query that calls Cassandra's `now()` time function may
actually cause a query to write or return different times.

Is this the expected or defined behavior, and if so, why does it behave like 
this rather than evaluating `now()` once across an entire statement?

This really affects UPDATE statements but to test it more easily, you could try 
something like:

SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
FROM keyspace.table
LIMIT 100;

If you run that a few times, you should eventually see that the timestamp 
returned moves onto the next millisecond mid-query.

--
Software Engineer
Turnitin - http://www.turnitin.com
t...@turnitin.com




--
Software Engineer
Turnitin - http://www.turnitin.com
t...@turnitin.com



Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Todd Fast
FWIW I'd suggest opening a bug--this behavior is certainly quite unexpected
and more than just a documentation issue. In general I can't imagine any
desirable properties of the current implementation, and there are likely a
bunch of latent bugs sitting out there, so it should be fixed.

Todd

On Wed, Nov 30, 2016 at 12:37 PM Terry Liu  wrote:

> Sorry for my typo. Obviously, I meant:
> "It appears that a single query that calls Cassandra's `now()` time
> function *multiple times* may actually cause a query to write or return
> different times."
>
> Less of a surprise now that I realize more about the implementation, but I
> agree that more explicit documentation around when exactly the "execution"
> of each now() statement happens and what implications it has for the
> resulting timestamps would be helpful when running into this.
>
> Thanks for the quick responses!
>
> -Terry
>
>
>
> On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek  wrote:
>
> every now() call in statement is under the hood "replaced" with newly
> generated uuid.
>
> It can happen that they belong to  different milliseconds in time.
>
> If you need to have same timestamps you need to set them on the client
> side.
>
>
> @msvaljek 
>
> 2016-11-29 22:49 GMT+01:00 Terry Liu :
>
> It appears that a single query that calls Cassandra's `now()` time
> function may actually cause a query to write or return different times.
>
> Is this the expected or defined behavior, and if so, why does it behave
> like this rather than evaluating `now()` once across an entire statement?
>
> This really affects UPDATE statements but to test it more easily, you
> could try something like:
>
> SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
> FROM keyspace.table
> LIMIT 100;
>
> If you run that a few times, you should eventually see that the timestamp
> returned moves onto the next millisecond mid-query.
>
> --
> *Software Engineer*
> Turnitin - http://www.turnitin.com
> t...@turnitin.com
>
>
>
>
>
> --
> *Software Engineer*
> Turnitin - http://www.turnitin.com
> t...@turnitin.com
>


full repair or incremental repair after scrub?

2016-11-30 Thread Kai Wang
Hi, do I have to do a full repair after scrub? Is it enough to just do
incremental repair? BTW I do nightly incremental repair.


Re: Which version is stable enough for production environment?

2016-11-30 Thread Benjamin Roth
Thanks. I left some comments.

LeveledCompaction: Have you checked if there were major changes in the
LeveledStrategy between 2.x and 3.x?

2016-11-30 21:04 GMT+01:00 Harikrishnan Pillai :

> https://issues.apache.org/jira/browse/CASSANDRA-12728
>
> (CASSANDRA-12728: Handling partially written hint files)
>
> https://issues.apache.org/jira/browse/CASSANDRA-12844
>
>
> Also, when I tested some of our write-heavy workloads, Leveled Compaction was
> not keeping up. With the same system settings, 2.1.16 performs better and all
> levels were properly aligned.
> --
> *From:* Benjamin Roth 
> *Sent:* Tuesday, November 29, 2016 11:20:19 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Which version is stable enough for production environment?
>
> What are the compaction issues / hint corruptions you encountered? Are
> there JIRA tickets for them?
> I am curious because I use 3.10 (trunk) in production.
>
> For anyone who is planning to use MVs:
> They basically work. We have used them in production for some months, BUT
> (and it's quite a big one) maintenance is a pain. Bootstrapping and repairs
> may be - depending on the model, config, amount of data - really, really
> painful. I'm currently investigating intensively.
>
> 2016-11-30 3:11 GMT+01:00 Harikrishnan Pillai :
>
>> 3.0 has the "off the heap memtable" implementation removed, so if you have a
>> requirement for it, it's not available. If you don't have that requirement,
>> 3.0.9 can be tried out. We did some testing on version 3.9 and found a lot of
>> issues in compaction, hint corruption, etc.
>>
>> Regards
>>
>> Hari
>>
>>
>> --
>> *From:* Discovery 
>> *Sent:* Tuesday, November 29, 2016 5:59 PM
>> *To:* user
>> *Subject:* Re: Which version is stable enough for production environment?
>>
>> Why is version 3.x not recommended? Thanks.
>>
>>
>> -- Original --
>> *From:* "Harikrishnan Pillai"
>> *Date:* Wed, Nov 30, 2016 09:57 AM
>> *To:* "user"
>> *Subject: * Re: Which version is stable enough for production
>> environment?
>>
>> Cassandra 2.1.16
>>
>>
>> --
>> *From:* Discovery 
>> *Sent:* Tuesday, November 29, 2016 5:42 PM
>> *To:* user
>> *Subject:* Which version is stable enough for production environment?
>>
>> Hi Cassandra Experts,
>>
>> We are preparing to deploy Cassandra in a production environment, but
>> we cannot confirm which version is stable and recommended. Could someone
>> on this mailing list give a suggestion? Thanks in advance!
>>
>>
>> Best Regards
>> Discovery
>> 11/30/2016
>>
>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>



-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Terry Liu
Sorry for my typo. Obviously, I meant:
"It appears that a single query that calls Cassandra's `now()` time
function *multiple times* may actually cause a query to write or return
different times."

Less of a surprise now that I realize more about the implementation, but I
agree that more explicit documentation around when exactly the "execution"
of each now() statement happens and what implications it has for the
resulting timestamps would be helpful when running into this.

Thanks for the quick responses!

-Terry



On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek  wrote:

> Every now() call in a statement is "replaced" under the hood with a newly
> generated UUID.
>
> It can happen that they fall in different milliseconds.
>
> If you need identical timestamps, you need to set them on the client
> side.
>
>
> @msvaljek 
>
> 2016-11-29 22:49 GMT+01:00 Terry Liu :
>
>> It appears that a single query that calls Cassandra's `now()` time
>> function may actually cause a query to write or return different times.
>>
>> Is this the expected or defined behavior, and if so, why does it behave
>> like this rather than evaluating `now()` once across an entire statement?
>>
>> This really affects UPDATE statements but to test it more easily, you
>> could try something like:
>>
>> SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
>> FROM keyspace.table
>> LIMIT 100;
>>
>> If you run that a few times, you should eventually see that the timestamp
>> returned moves onto the next millisecond mid-query.
>>
>> --
>> *Software Engineer*
>> Turnitin - http://www.turnitin.com
>> t...@turnitin.com
>>
>
>


-- 
*Software Engineer*
Turnitin - http://www.turnitin.com
t...@turnitin.com
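
A minimal sketch of the client-side workaround suggested above, in Python and assuming nothing beyond the standard library: generate a single version-1 UUID (the format behind CQL's timeuuid) on the client and bind that same value wherever the statement would otherwise call now(). The helper name timeuuid_to_millis and the params dictionary are illustrative only, and no driver call is shown.

```python
import uuid

# 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_millis(u: uuid.UUID) -> int:
    """Return the Unix time in milliseconds embedded in a version-1 UUID."""
    return (u.time - UUID_EPOCH_OFFSET) // 10_000


# Generate the value once on the client...
ts = uuid.uuid1()

# ...and bind that same value to every placeholder that would otherwise
# have been a separate now() call (parameter names are hypothetical).
params = {"a": ts, "b": ts}

# Both placeholders now carry the same embedded millisecond.
print(timeuuid_to_millis(params["a"]) == timeuuid_to_millis(params["b"]))
```

Because both placeholders receive the same value, the two columns cannot drift onto different milliseconds the way two separate now() calls can.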


Re: Which version is stable enough for production environment?

2016-11-30 Thread Harikrishnan Pillai
https://issues.apache.org/jira/browse/CASSANDRA-12728

(CASSANDRA-12728: Handling partially written hint files)

https://issues.apache.org/jira/browse/CASSANDRA-12844


Also, when I tested some of our write-heavy workloads, Leveled Compaction was
not keeping up. With the same system settings, 2.1.16 performs better and all
levels were properly aligned.


From: Benjamin Roth 
Sent: Tuesday, November 29, 2016 11:20:19 PM
To: user@cassandra.apache.org
Subject: Re: Which version is stable enough for production environment?

What are the compaction issues / hint corruptions you encountered? Are there
JIRA tickets for them?
I am curious because I use 3.10 (trunk) in production.

For anyone who is planning to use MVs:
They basically work. We have used them in production for some months, BUT (and
it's quite a big one) maintenance is a pain. Bootstrapping and repairs may be -
depending on the model, config, amount of data - really, really painful. I'm
currently investigating intensively.

2016-11-30 3:11 GMT+01:00 Harikrishnan Pillai:

3.0 has the "off the heap memtable" implementation removed, so if you have a
requirement for it, it's not available. If you don't have that requirement,
3.0.9 can be tried out. We did some testing on version 3.9 and found a lot of
issues in compaction, hint corruption, etc.

Regards

Hari



From: Discovery
Sent: Tuesday, November 29, 2016 5:59 PM
To: user
Subject: Re: Which version is stable enough for production environment?

Why is version 3.x not recommended? Thanks.


-- Original --
From: "Harikrishnan Pillai"
Date: Wed, Nov 30, 2016 09:57 AM
To: "user"
Subject: Re: Which version is stable enough for production environment?


Cassandra 2.1.16



From: Discovery
Sent: Tuesday, November 29, 2016 5:42 PM
To: user
Subject: Which version is stable enough for production environment?

Hi Cassandra Experts,

We are preparing to deploy Cassandra in a production environment, but we
cannot confirm which version is stable and recommended. Could someone on this
mailing list give a suggestion? Thanks in advance!


Best Regards
Discovery
11/30/2016



--
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Sanity checks to run post restore data?

2016-11-30 Thread Varun Gupta
Hi,

We are periodically backing up sstables and need to learn what sanity
checks should be performed after restoring them.

Thanks,
Varun


Save the date: ApacheCon Miami, May 15-19, 2017

2016-11-30 Thread Rich Bowen
Dear Apache enthusiast,

ApacheCon and Apache Big Data will be held at the Intercontinental in
Miami, Florida, May 16-18, 2017. Submit your talks, and register, at
http://apachecon.com/  Talks aimed at the Big Data section of the event
should go to
http://events.linuxfoundation.org/events/apache-big-data-north-america/program/cfp
while other talks should go to
http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp


ApacheCon is the best place to meet the people that develop the software
that you use and rely on. It’s also a great opportunity to deepen your
involvement in the project, and perhaps make the leap to contributing.
And we find that user case studies, showcasing how you use Apache
projects to solve real world problems, are very popular at this event.
So, do consider whether you have a use case that might make a good
presentation.

ApacheCon will have many different ways that you can participate:

Technical Content: We’ll have three days of technical sessions covering
many of the projects at the ASF. We’ll be publishing a schedule of talks
on March 9th, so that you can plan what you’ll be attending.

BarCamp: The Apache BarCamp is a standard feature of ApacheCon - an
un-conference style event, where the schedule is determined on-site by
the attendees, and anything is fair game.

Lightning Talks: Even if you don’t give a full-length talk, the
Lightning Talks are five minute presentations on any topic related to
the ASF, and can be given by any attendee. If there’s something you’re
passionate about, consider giving a Lightning Talk.

Sponsor: It costs money to put on a conference, and this is a great
opportunity for companies involved in Apache projects, or who benefit
from Apache code - your employers - to get their name and products in
front of the community. Sponsors can start at any monetary level, and
can sponsor everything from the conference badge lanyard, through larger
items such as video recordings and evening events. For more information
on sponsoring ApacheCon, see http://apachecon.com/sponsor/

So, get your tickets today at http://apachecon.com/ and submit your
talks. ApacheCon Miami is going to be our best ApacheCon yet, and you,
and your project, can’t afford to miss it.

-- 
Rich Bowen - rbo...@apache.org
VP, Conferences
http://apachecon.com
@apachecon



Re: Inserting list data

2016-11-30 Thread Andrew Baker
Sorry this is so long after the initial. I wrote a dtest to try to make
this happen here:
https://github.com/bakerag1/cassandra-dtest/blob/master/collection_update_test.py

This is my first dtest and my second python script, so I am not overly
confident that it is doing a good job of this test, so if any dtest expert
out there could look it over, I would appreciate it.

I couldn't detect this happening with 5 threads concurrently creating and
updating the same 200 records, with List and List.

We are not using batches in our case, but we are using prepared statements.
I will add that to the test and let you know.

-Andrew

On Fri, Oct 14, 2016 at 11:00 PM Russell Spitzer 
wrote:

> Are you sure you aren't using batches? These will assign the same
> timestamp to your inserts which can lead to unexpected behaviors.
>
> On Fri, Oct 14, 2016 at 9:45 PM Vladimir Yudovin 
> wrote:
>
> Did you try the same queries with the Java driver without using prepared
> statements?
>
>
> Best regards, Vladimir Yudovin,
>
>
> *Winguzone  - Hosted Cloud Cassandra on
> Azure and SoftLayer.Launch your cluster in minutes.*
>
>
> On Fri, 14 Oct 2016 15:13:38 -0400 *Aoi Kadoya* wrote:
>
> Hi Vladimir,
>
> In fact, I am having difficulty reproducing this issue with cqlsh.
> The issue was reported to me by one of our developers, who is using a
> client application built on the Cassandra Java driver 3.0.3 (we're using
> DSE 5.0.1).
>
> 
>
> app A:
> 2016-10-11 13:28:23,014 [TRACE] [core.QueryLogger.NORMAL] [cluster1]
> [HOST1/IP1:9042] Query completed normally, took 5 ms: [8 bound values]
> INSERT INTO global.table_name
> ("id","alert_to","alert_emails","created_by","created_date","alert_level","updated_by","updated_date")
>
> VALUES (?,?,?,?,?,?,?,?);
> [id:25712, alert_to:[2], alert_emails:NULL,
> created_by:'service-worker:ec45afd2-c40a-44d9-a2a1-7416409be6e2',
> created_date:1476160103007, alert_level:2, updated_by:NULL,
> updated_date:NULL]
>
> app B:
> 2016-10-11 13:28:23,014 [TRACE] [core.QueryLogger.NORMAL] [cluster1]
> [HOST2/IP2:9042] Query completed normally, took 6 ms: [8 bound values]
> INSERT INTO global.table_name
> ("alert_to","alert_emails","created_date","id","created_by","updated_by","updated_date","alert_level")
>
> VALUES (?,?,?,?,?,?,?,?);
> [alert_to:[1], alert_emails:NULL, created_date:1476160103007,
> id:25712,
> created_by:'service-worker:ec45afd2-c40a-44d9-a2a1-7416409be6e2',
> updated_by:NULL, updated_date:NULL, alert_level:1]
>
> 
> 
> id bigint,
> alert_emails list,
> alert_level int,
> alert_to list,
> created_by text,
> created_date timestamp,
> updated_by text,
> updated_date timestamp,
> PRIMARY KEY (id)
>
>
> SELECT id, alert_level, alert_to FROM global.table_name WHERE id=25712;
> | id | alert_level | alert_to |
> | 25712 | 2 | [2, 1] |
>
> But when I ran the queries below from cqlsh on different
> nodes at the same time in my testing environment, the data (alert_to)
> was just [1], which is the expected behavior.
>
> on host 1
> cqlsh> INSERT INTO global.table_name
> ("id","alert_to","alert_emails","created_by","created_date","alert_level","updated_by","updated_date")
>
> VALUES
> (25712,[2],NULL,'service-worker:ec45afd2-c40a-44d9-a2a1-7416409be6e2',1476160103007,2,NULL,NULL);
>
> on host2
> cqlsh> INSERT INTO global.table_name
> ("alert_to","alert_emails","created_date","id","created_by","updated_by","updated_date","alert_level")
>
> VALUES
> ([1],NULL,1476160103007,25712,'service-worker:ec45afd2-c40a-44d9-a2a1-7416409be6e2',NULL,NULL,1);
>
>
>
> So I wonder if something is wrong with the Java driver, but I cannot
> figure out how to break this down further.
>
>
> @Andrew
> We're not using UDTs, but I'd appreciate it if you could share your case, too.
>
> Thanks,
> Aoi
>
> 2016-10-13 11:26 GMT-07:00 Andrew Baker :
> > I saw evidence of this behavior, but when we created a test to try to make
> > it happen, it never did. We assumed it was UDT related and lost interest,
> > since it didn't have a big impact. I will try to carve out some time to look
> > into this some more and let you know if I find anything.
> >
> > On Wed, Oct 12, 2016 at 9:24 PM Vladimir Yudovin 
> > wrote:
> >>
> >> The data is actually appended, not overwritten.
> >> Strange, can you send the exact statements?
> >>
> >> Here is the example I ran:
> >> CREATE KEYSPACE events WITH replication = {'class': 'SimpleStrategy',
> >> 'replication_factor': 1};
> >> CREATE TABLE events.data (id int primary key, events list);
> >> INSERT INTO events.data (id, events) VALUES ( 0, ['a']);
> >> SELECT * FROM events.data ;
> >> id | events
> >> +
> >> 0 | ['a']
> >>
> >> (1 rows)
> >>
> >> INSERT INTO events.data (id, events) VALUES ( 0, ['b']);
> >> SELECT * FROM events.data ;
> >> id | events
> >> +
> >> 0 | ['b']
> >>
> >> (1 rows)
> >>
> >> As you see, 'a' 
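
One possible explanation for the [2, 1] result, offered here only as an assumption and not confirmed anywhere in this thread: if overwriting a list is carried out as a delete stamped just before the write timestamp, followed by appends stamped with the write timestamp itself, then two overwrites that carry the identical timestamp cannot shadow each other. The toy Python model below (class and method names are hypothetical, and this is not Cassandra's storage code) shows that under this assumption the same-timestamp case merges, while distinct timestamps overwrite as expected.

```python
from dataclasses import dataclass, field


@dataclass
class ListColumn:
    """Toy model of one list column on one row."""
    deleted_up_to: int = -1                     # newest delete marker seen
    cells: list = field(default_factory=list)   # (write_timestamp, value)

    def overwrite(self, timestamp, values):
        # ASSUMPTION: an overwrite is a delete stamped one tick before the
        # write, followed by appends stamped with the write timestamp.
        self.deleted_up_to = max(self.deleted_up_to, timestamp - 1)
        self.cells.extend((timestamp, v) for v in values)

    def read(self):
        # Only cells newer than the newest delete marker are visible.
        return [v for ts, v in self.cells if ts > self.deleted_up_to]


# Two overwrites with the same timestamp: neither delete covers the other.
same = ListColumn()
same.overwrite(100, [2])
same.overwrite(100, [1])
print(same.read())

# Two overwrites with different timestamps: the later one wins.
different = ListColumn()
different.overwrite(100, ["a"])
different.overwrite(101, ["b"])
print(different.read())
```

The order of merged elements in the toy model is simply insertion order; it makes no claim about how Cassandra orders them.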

Re: Cassandra 2.x Stability

2016-11-30 Thread Vladimir Yudovin
You should also consider the end-of-support dates. As the Cassandra page says:

Apache Cassandra 2.2 is supported until November 2016.
Apache Cassandra 2.1 is supported until November 2016 with critical fixes only.

So 2.1 actually doesn't get any fixes, not even critical ones.



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Wed, 30 Nov 2016 07:38:46 -0500 kurt Greaves k...@instaclustr.com wrote:




Latest release in 2.2. 2.1 is borderline EOL and from my experience 2.2 is 
quite stable and has some handy bugfixes that didn't actually make it into 2.1



On 30 November 2016 at 10:41, Shalom Sagges shal...@liveperson.com 
wrote:

Hi Everyone, 



I'm about to upgrade our 2.0.14 version to a newer 2.x version. 

At first I thought of upgrading to 2.2.8, but I'm not sure how stable it is, as 
I understand the 2.2 version was supposed to be a sort of beta version for 3.0 
feature-wise, whereas 3.0 upgrade will mainly handle the storage modifications 
(please correct me if I'm wrong). 



So my question is, if I need a 2.x version (can't upgrade to 3 due to client 
considerations), which one should I choose, 2.1.x or 2.2.x? (I don't require
any new features available in 2.2).



Thanks!

Shalom Sagges
DBA
T: +972-74-700-4035
We Create Meaningful Connections

This message may contain confidential and/or privileged information. 

If you are not the addressee or authorized to receive this on behalf of the 
addressee you must not use, copy, disclose or take action based on this message 
or any information herein. 

If you have received this message in error, please advise the sender 
immediately by reply email and delete this message. Thank you.










Re: Cassandra 2.x Stability

2016-11-30 Thread kurt Greaves
Latest release in 2.2. 2.1 is borderline EOL and from my experience 2.2 is
quite stable and has some handy bugfixes that didn't actually make it into
2.1

On 30 November 2016 at 10:41, Shalom Sagges  wrote:

> Hi Everyone,
>
> I'm about to upgrade our 2.0.14 version to a newer 2.x version.
> At first I thought of upgrading to 2.2.8, but I'm not sure how stable it
> is, as I understand the 2.2 version was supposed to be a sort of beta
> version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
> storage modifications (please correct me if I'm wrong).
>
> So my question is, if I need a 2.x version (can't upgrade to 3 due to
> client considerations), which one should I choose, 2.1.x or 2.2.x? (I
> don't require any new features available in 2.2).
>
> Thanks!
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
>  
>  We Create Meaningful Connections
>
> 
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>


Cassandra 2.x Stability

2016-11-30 Thread Shalom Sagges
Hi Everyone,

I'm about to upgrade our 2.0.14 version to a newer 2.x version.
At first I thought of upgrading to 2.2.8, but I'm not sure how stable it
is, as I understand the 2.2 version was supposed to be a sort of beta
version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
storage modifications (please correct me if I'm wrong).

So my question is, if I need a 2.x version (can't upgrade to 3 due to
client considerations), which one should I choose, 2.1.x or 2.2.x? (I
don't require any new features available in 2.2).

Thanks!

Shalom Sagges
DBA
T: +972-74-700-4035
 
 We Create Meaningful Connections


-- 
This message may contain confidential and/or privileged information. 
If you are not the addressee or authorized to receive this on behalf of the 
addressee you must not use, copy, disclose or take action based on this 
message or any information herein. 
If you have received this message in error, please advise the sender 
immediately by reply email and delete this message. Thank you.


Re: Which version is stable enough for production environment?

2016-11-30 Thread Brooke Jensen
Like I said,

test in a lower environment first with your data model to be sure.




*Brooke Jensen*
VP Technical Operations & Customer Services
www.instaclustr.com | support.instaclustr.com


This email has been sent on behalf of Instaclustr Limited (Australia) and
Instaclustr Inc (USA). This email and any attachments may contain
confidential and legally privileged information.  If you are not the
intended recipient, do not copy or disclose its content, but please reply
to this email immediately and highlight the error to the sender and then
immediately delete the message.

On 30 November 2016 at 19:58, Benjamin Roth  wrote:

> I didn't mean to criticise you. It was meant as a notice for all of those
> who are planning to use MVs.
>
> I already made proposals to solve these issues on the dev list and plan to
> test them on our own cluster over the next few days. I am currently working
> heavily on this as we have big trouble bootstrapping new nodes on our 3.10
> cluster - due to these known issues.
>
> JFYI.
>
> 2016-11-30 9:49 GMT+01:00 kurt Greaves :
>
>> Yes Benjamin, no one said it wouldn't. We're actively backporting things
>> as we get time, if you find something you'd like backported raise an issue
>> and let us know. We're well aware of the issues affecting MVs, but they
>> haven't really been solved anywhere yet.
>>
>> On 30 November 2016 at 07:54, Benjamin Roth 
>> wrote:
>>
>>> Hi Brooke,
>>>
>>> I just had a quick look at your code, and I can promise that your LTS
>>> version will have the same issues with MVs as any other version.
>>> For details check CASSANDRA-12905 or CASSANDRA-12888.
>>>
>>> 2016-11-30 8:35 GMT+01:00 Brooke Jensen :
>>>
 2.1 will be end of life soon.

 We have a number of customers running 3.7 in production and it's quite
 stable. However you should always test in a lower environment first with
 your data model to be sure.

 If you're interested, we have made available a patched version of 3.7
 
 which backports some key patches from 3.9.
 https://github.com/instaclustr/cassandra


 *Brooke Jensen*
 VP Technical Operations & Customer Services
 www.instaclustr.com | support.instaclustr.com
 

 This email has been sent on behalf of Instaclustr Limited (Australia)
 and Instaclustr Inc (USA). This email and any attachments may contain
 confidential and legally privileged information.  If you are not the
 intended recipient, do not copy or disclose its content, but please reply
 to this email immediately and highlight the error to the sender and then
 immediately delete the message.

 On 30 November 2016 at 18:20, Benjamin Roth 
 wrote:

> What are the compaction issues / hint corruptions you encountered? Are
> there JIRA tickets for them?
> I am curious because I use 3.10 (trunk) in production.
>
> For anyone who is planning to use MVs:
> They basically work. We have used them in production for some months, BUT
> (and it's quite a big one) maintenance is a pain. Bootstrapping and repairs
> may be - depending on the model, config, amount of data - really, really
> painful. I'm currently investigating intensively.
>
> 2016-11-30 3:11 GMT+01:00 Harikrishnan Pillai:
>
>> 3.0 has the "off the heap memtable" implementation removed, so if you have a
>> requirement for it, it's not available. If you don't have that requirement,
>> 3.0.9 can be tried out. We did some testing on version 3.9 and found a lot of
>> issues in compaction, hint corruption, etc.
>>
>> Regards
>>
>> Hari
>>
>>
>> --
>> *From:* Discovery 
>> *Sent:* Tuesday, November 29, 2016 5:59 PM
>> *To:* user
>> *Subject:* Re: Which version is stable enough for production
>> environment?
>>
>> Why is version 3.x not recommended? Thanks.
>>
>>
>> -- Original --
>> *From:* "Harikrishnan Pillai"
>> *Date:* Wed, Nov 30, 2016 09:57 AM
>> *To:* "user"
>> *Subject: * Re: Which version is stable enough for production
>> environment?
>>
>> Cassandra 2.1.16
>>
>>
>> --
>> *From:* Discovery 
>> *Sent:* Tuesday, November 29, 2016 5:42 PM
>> *To:* user
>> *Subject:* Which version is stable enough for production environment?
>>
>> Hi Cassandra Experts,
>>
>>   We prepare to deploy Cassandra in production env,
>> but we can not confirm which version is stable and 

Re: Which version is stable enough for production environment?

2016-11-30 Thread Benjamin Roth
I didn't mean to criticise you. It was meant as a notice for all of those
who are planning to use MVs.

I already made proposals to solve these issues on the dev list and plan to
test them on our own cluster over the next few days. I am currently working
heavily on this as we have big trouble bootstrapping new nodes on our 3.10
cluster - due to these known issues.

JFYI.

2016-11-30 9:49 GMT+01:00 kurt Greaves :

> Yes Benjamin, no one said it wouldn't. We're actively backporting things
> as we get time, if you find something you'd like backported raise an issue
> and let us know. We're well aware of the issues affecting MVs, but they
> haven't really been solved anywhere yet.
>
> On 30 November 2016 at 07:54, Benjamin Roth 
> wrote:
>
>> Hi Brooke,
>>
>> I just had a quick look at your code, and I can promise that your LTS
>> version will have the same issues with MVs as any other version.
>> For details check CASSANDRA-12905 or CASSANDRA-12888.
>>
>> 2016-11-30 8:35 GMT+01:00 Brooke Jensen :
>>
>>> 2.1 will be end of life soon.
>>>
>>> We have a number of customers running 3.7 in production and it's quite
>>> stable. However you should always test in a lower environment first with
>>> your data model to be sure.
>>>
>>> If you're interested, we have made available a patched version of 3.7
>>> 
>>> which backports some key patches from 3.9.
>>> https://github.com/instaclustr/cassandra
>>>
>>>
>>> *Brooke Jensen*
>>> VP Technical Operations & Customer Services
>>> www.instaclustr.com | support.instaclustr.com
>>> 
>>>
>>> This email has been sent on behalf of Instaclustr Limited (Australia)
>>> and Instaclustr Inc (USA). This email and any attachments may contain
>>> confidential and legally privileged information.  If you are not the
>>> intended recipient, do not copy or disclose its content, but please reply
>>> to this email immediately and highlight the error to the sender and then
>>> immediately delete the message.
>>>
>>> On 30 November 2016 at 18:20, Benjamin Roth 
>>> wrote:
>>>
 What are the compaction issues / hint corruptions you encountered? Are
 there JIRA tickets for them?
 I am curious because I use 3.10 (trunk) in production.

 For anyone who is planning to use MVs:
 They basically work. We have used them in production for some months, BUT
 (and it's quite a big one) maintenance is a pain. Bootstrapping and repairs
 may be - depending on the model, config, amount of data - really, really
 painful. I'm currently investigating intensively.

 2016-11-30 3:11 GMT+01:00 Harikrishnan Pillai:

> 3.0 has the "off the heap memtable" implementation removed, so if you have a
> requirement for it, it's not available. If you don't have that requirement,
> 3.0.9 can be tried out. We did some testing on version 3.9 and found a lot of
> issues in compaction, hint corruption, etc.
>
> Regards
>
> Hari
>
>
> --
> *From:* Discovery 
> *Sent:* Tuesday, November 29, 2016 5:59 PM
> *To:* user
> *Subject:* Re: Which version is stable enough for production
> environment?
>
> Why is version 3.x not recommended? Thanks.
>
>
> -- Original --
> *From:* "Harikrishnan Pillai"
> *Date:* Wed, Nov 30, 2016 09:57 AM
> *To:* "user"
> *Subject: * Re: Which version is stable enough for production
> environment?
>
> Cassandra 2.1.16
>
>
> --
> *From:* Discovery 
> *Sent:* Tuesday, November 29, 2016 5:42 PM
> *To:* user
> *Subject:* Which version is stable enough for production environment?
>
> Hi Cassandra Experts,
>
>   We prepare to deploy Cassandra in production env,
> but we can not confirm which version is stable and recommended, could
> someone in this mail list give the suggestion? Thanks in advance!
>
>
> Best Regards
> Discovery
> 11/30/2016
>



 --
 Benjamin Roth
 Prokurist

 Jaumo GmbH · www.jaumo.com
 Wehrstraße 46 · 73035 Göppingen · Germany
 Phone +49 7161 304880-6 · Fax +49 7161 304880-1
 AG Ulm · HRB 731058 · Managing Director: Jens Kammerer

>>>
>>>
>>
>>
>> --
>> Benjamin Roth
>> Prokurist
>>
>> Jaumo GmbH · www.jaumo.com
>> Wehrstraße 46 · 73035 Göppingen · Germany
>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>
>
>


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com

Re: Which version is stable enough for production environment?

2016-11-30 Thread kurt Greaves
Yes Benjamin, no one said it wouldn't. We're actively backporting things as
we get time; if you find something you'd like backported, raise an issue and
let us know. We're well aware of the issues affecting MVs, but they haven't
really been solved anywhere yet.

On 30 November 2016 at 07:54, Benjamin Roth  wrote:

> Hi Brooke,
>
> I just had a quick look at your code, and I can promise that your LTS
> version will have the same issues with MVs as any other version.
> For details check CASSANDRA-12905 or CASSANDRA-12888.
>
> 2016-11-30 8:35 GMT+01:00 Brooke Jensen :
>
>> 2.1 will be end of life soon.
>>
>> We have a number of customers running 3.7 in production and it's quite
>> stable. However you should always test in a lower environment first with
>> your data model to be sure.
>>
>> If you're interested, we have made available a patched version of 3.7
>> 
>> which backports some key patches from 3.9.
>> https://github.com/instaclustr/cassandra
>>
>>
>> *Brooke Jensen*
>> VP Technical Operations & Customer Services
>> www.instaclustr.com | support.instaclustr.com
>> 
>>
>> This email has been sent on behalf of Instaclustr Limited (Australia) and
>> Instaclustr Inc (USA). This email and any attachments may contain
>> confidential and legally privileged information.  If you are not the
>> intended recipient, do not copy or disclose its content, but please reply
>> to this email immediately and highlight the error to the sender and then
>> immediately delete the message.
>>
>> On 30 November 2016 at 18:20, Benjamin Roth 
>> wrote:
>>
>>> What are the compaction issues / hint corruptions you encountered? Are
>>> there JIRA tickets for them?
>>> I am curious because I use 3.10 (trunk) in production.
>>>
>>> For anyone who is planning to use MVs:
>>> They basically work. We have used them in production for some months, BUT
>>> (and it's quite a big one) maintenance is a pain. Bootstrapping and repairs
>>> may be - depending on the model, config, amount of data - really, really
>>> painful. I'm currently investigating intensively.
>>>
>>> 2016-11-30 3:11 GMT+01:00 Harikrishnan Pillai:
>>>
 3.0 has the "off the heap memtable" implementation removed, so if you have a
 requirement for it, it's not available. If you don't have that requirement,
 3.0.9 can be tried out. We did some testing on version 3.9 and found a lot of
 issues in compaction, hint corruption, etc.

 Regards

 Hari


 --
 *From:* Discovery 
 *Sent:* Tuesday, November 29, 2016 5:59 PM
 *To:* user
 *Subject:* Re: Which version is stable enough for production
 environment?

 Why is version 3.x not recommended? Thanks.


 -- Original --
 *From:* "Harikrishnan Pillai"
 *Date:* Wed, Nov 30, 2016 09:57 AM
 *To:* "user"
 *Subject: * Re: Which version is stable enough for production
 environment?

 Cassandra 2.1.16


 --
 *From:* Discovery 
 *Sent:* Tuesday, November 29, 2016 5:42 PM
 *To:* user
 *Subject:* Which version is stable enough for production environment?

 Hi Cassandra Experts,

   We prepare to deploy Cassandra in production env, but
 we can not confirm which version is stable and recommended, could someone
 in this mail list give the suggestion? Thanks in advance!


 Best Regards
 Discovery
 11/30/2016

>>>
>>>
>>>
>>> --
>>> Benjamin Roth
>>> Prokurist
>>>
>>> Jaumo GmbH · www.jaumo.com
>>> Wehrstraße 46 · 73035 Göppingen · Germany
>>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>>
>>
>>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>