Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-03 Thread Dan Kinder
Per Aleksey Yeschenko's comment on that ticket, it does seem like a
timestamp granularity issue, but it should work properly if it is within
the same session. gocql by default uses 2 connections and 128 streams per
connection. If you set it to 1 connection with 1 stream this problem goes
away. I suppose that'll take care of it in testing.

At least one interesting conclusion here: a gocql.Session does not map to
one Cassandra session. This makes some sense given that gocql says a
Session should be shared concurrently (so it had better not be just one
Cassandra session), but it is a bit concerning that there is no way to make
this 100% safe short of cutting the gocql.Session down to 1 connection and
1 stream.
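
As a toy model only, here is one way to picture why a single connection behaves while two do not. It assumes the server stamps each write from a millisecond-resolution clock and only nudges the timestamp forward relative to the previous write on the same connection. That per-connection behaviour is an assumption made for illustration and is not confirmed in this thread; connClock and next are hypothetical names, not part of gocql or Cassandra.

```go
package main

import "fmt"

// connClock models one connection's timestamp source (a hypothetical
// simplification): millisecond wall-clock resolution expressed in
// microseconds, bumped by one only to stay ahead of this connection's
// previous timestamp.
type connClock struct{ last int64 }

func (c *connClock) next(nowMillis int64) int64 {
	ts := nowMillis * 1000
	if ts <= c.last {
		ts = c.last + 1
	}
	c.last = ts
	return ts
}

func main() {
	const nowMillis = 1425345839939 // both writes land in the same millisecond

	// One connection: the second write gets a strictly larger timestamp.
	single := &connClock{}
	fmt.Println(single.next(nowMillis), single.next(nowMillis))

	// Two connections: the clocks are independent, so the timestamps can tie.
	connA, connB := &connClock{}, &connClock{}
	fmt.Println(connA.next(nowMillis), connB.next(nowMillis))
}
```

Under this model, back-to-back writes that are spread over two connections can end up with identical timestamps, which is the situation the rest of the thread digs into.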

On Mon, Mar 2, 2015 at 5:34 PM, Peter Sanford psanf...@retailnext.net
wrote:

 The more I think about it, the more this feels like a column timestamp
 issue. If two inserts have the same timestamp then the values are compared
 lexically to decide which one to keep (which I think explains the
 99/100 999/1000 mystery).
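
To make the tie-break concrete, here is a minimal Go sketch of last-write-wins with a byte-wise comparison on equal timestamps. The cell type and reconcile function are hypothetical names for illustration, and the rule that the greater value wins a tie is an assumption consistent with the results reported in this thread, not a copy of Cassandra's code.

```go
package main

import (
	"bytes"
	"fmt"
)

// cell is a simplified model of a stored column value and its write timestamp.
type cell struct {
	value     string
	timestamp int64 // microseconds since the epoch
}

// reconcile picks the surviving cell: the higher timestamp wins, and on a tie
// the value that compares greater byte-wise wins (assumed tie-break rule).
func reconcile(a, b cell) cell {
	if a.timestamp != b.timestamp {
		if a.timestamp > b.timestamp {
			return a
		}
		return b
	}
	if bytes.Compare([]byte(a.value), []byte(b.value)) >= 0 {
		return a
	}
	return b
}

func main() {
	const ts = 1425345839939000 // both writes carry the same timestamp
	pairs := [][2]string{{"9", "10"}, {"99", "100"}, {"999", "1000"}, {"5", "6"}}
	for _, p := range pairs {
		prev, next := cell{p[0], ts}, cell{p[1], ts}
		fmt.Printf("wrote %s after %s, read back %s\n", p[1], p[0], reconcile(prev, next).value)
	}
}
```

Because "9" sorts after "1" byte-wise, "99" beats "100" on a tie, while "6" still beats "5". A tie only shows up as a wrong read when the new value sorts lower than the old one, which for increasing decimal strings happens when the number of digits grows.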

 We can verify this by also selecting out the WRITETIME of the column:

 ...
 var prevTS int
 // The loop bound was lost in the archive; 10000 is assumed here so that
 // the run covers the iterations shown in the output below.
 for i := 0; i < 10000; i++ {
 	val := fmt.Sprintf("%d", i)
 	db.Query("UPDATE ut.test SET val = ? WHERE key = 'foo'", val).Exec()

 	var result string
 	var ts int
 	db.Query("SELECT val, WRITETIME(val) FROM ut.test WHERE key = 'foo'").Scan(&result, &ts)
 	if result != val {
 		fmt.Printf("Expected %v but got: %v; (prevTS:%d, ts:%d)\n", val, result, prevTS, ts)
 	}
 	prevTS = ts
 }


 When I run it with this change I see that the timestamps are in fact the
 same:

 Expected 10 but got: 9; (prevTS:1425345839903000, ts:1425345839903000)
 Expected 100 but got: 99; (prevTS:1425345839939000, ts:1425345839939000)
 Expected 101 but got: 99; (prevTS:1425345839939000, ts:1425345839939000)
 Expected 1000 but got: 999; (prevTS:1425345840296000, ts:1425345840296000)


 It looks like we're only getting millisecond precision instead of
 microsecond for the column timestamps?! If you explicitly set the timestamp
 value when you do the insert, you can get actual microsecond precision and
 the issue should go away.
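
A minimal sketch of that workaround, assuming the client generates its own timestamps and passes them with CQL's USING TIMESTAMP clause. The microClock type is a hypothetical helper, not part of gocql, and the statement is only printed here rather than sent to a cluster.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// microClock hands out strictly increasing microsecond timestamps, even when
// it is called faster than the wall clock advances. Safe for concurrent use.
type microClock struct {
	mu   sync.Mutex
	last int64
}

// Next returns the current time in microseconds, bumped by one if needed so
// that it is always greater than the previously returned value.
func (c *microClock) Next() int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	now := time.Now().UnixNano() / int64(time.Microsecond)
	if now <= c.last {
		now = c.last + 1
	}
	c.last = now
	return now
}

func main() {
	clock := &microClock{}
	// ut.test is the table from this thread; the values would be bound to the
	// two placeholders by whichever driver executes the statement.
	const stmt = "UPDATE ut.test USING TIMESTAMP ? SET val = ? WHERE key = 'foo'"
	for i := 0; i < 3; i++ {
		fmt.Printf("%s  -- bind: %d, %q\n", stmt, clock.Next(), fmt.Sprint(i))
	}
}
```

Sharing one such clock across every connection in the process means two consecutive writes never carry the same timestamp, regardless of which connection or stream they travel on. It does nothing for writers in other processes.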

 -psanford

-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com


Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Robert Coli
On Mon, Mar 2, 2015 at 11:44 AM, Dan Kinder dkin...@turnitin.com wrote:

 I had been having the same problem as in those older posts:
 http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E


As I said on that thread :

It sounds unreasonable/unexpected to me, if you have a trivial repro case,
I would file a JIRA.

=Rob


Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
Yeah I thought that was suspicious too, it's mysterious and fairly
consistent. (By the way I had error checking but removed it for email
brevity, but thanks for verifying :) )

On Mon, Mar 2, 2015 at 4:13 PM, Peter Sanford psanf...@retailnext.net
wrote:

 Hmm. I was able to reproduce the behavior with your go program on my dev
 machine (C* 2.0.12). I was hoping it was going to just be an unchecked
 error from the .Exec() or .Scan(), but that is not the case for me.

 The fact that the issue seems to happen on loop iteration 10, 100 and 1000
 is pretty suspicious. I took a tcpdump to confirm that the gocql was in
 fact sending the write 100 query and then on the next read Cassandra
 responded with 99.

 I'll be interested to see what the result of the jira ticket is.

 -psanford




-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com


Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Peter Sanford
Hmm. I was able to reproduce the behavior with your go program on my dev
machine (C* 2.0.12). I was hoping it was going to just be an unchecked
error from the .Exec() or .Scan(), but that is not the case for me.

The fact that the issue seems to happen on loop iteration 10, 100 and 1000
is pretty suspicious. I took a tcpdump to confirm that the gocql was in
fact sending the write 100 query and then on the next read Cassandra
responded with 99.

I'll be interested to see what the result of the jira ticket is.

-psanford


Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
Done: https://issues.apache.org/jira/browse/CASSANDRA-8892

On Mon, Mar 2, 2015 at 3:26 PM, Robert Coli rc...@eventbrite.com wrote:

 On Mon, Mar 2, 2015 at 11:44 AM, Dan Kinder dkin...@turnitin.com wrote:

 I had been having the same problem as in those older posts:
 http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E


 As I said on that thread :

 It sounds unreasonable/unexpected to me, if you have a trivial repro
 case, I would file a JIRA.

 =Rob




-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com


Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
Hey all,

I had been having the same problem as in those older posts:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E

To summarize it, on my local box with just one cassandra node I can update
and then select the updated row and get an incorrect response.

My understanding is this may have to do with not having fine-grained enough
timestamp resolution, but regardless I'm wondering: is this actually a bug
or is there any way to mitigate it? It causes sporadic failures in our unit
tests, and having to Sleep() between tests isn't ideal. At least confirming
it's a bug would be nice though.

For those interested, here's a little go program that can reproduce the
issue. When I run it I typically see:
Expected 100 but got: 99
Expected 1000 but got: 999

--- main.go: ---

package main

import (
	"fmt"

	"github.com/gocql/gocql"
)

func main() {
	cf := gocql.NewCluster("localhost")
	db, err := cf.CreateSession()
	if err != nil {
		panic(err.Error())
	}
	defer db.Close()

	// Keyspace ut = update test
	err = db.Query(`CREATE KEYSPACE IF NOT EXISTS ut
		WITH REPLICATION = {'class': 'SimpleStrategy',
		'replication_factor': 1 }`).Exec()
	if err != nil {
		panic(err.Error())
	}
	err = db.Query("CREATE TABLE IF NOT EXISTS ut.test (key text, val text, PRIMARY KEY(key))").Exec()
	if err != nil {
		panic(err.Error())
	}
	err = db.Query("TRUNCATE ut.test").Exec()
	if err != nil {
		panic(err.Error())
	}
	err = db.Query("INSERT INTO ut.test (key) VALUES ('foo')").Exec()
	if err != nil {
		panic(err.Error())
	}

	// The loop bound was lost in the archive; 10000 is assumed here so that
	// the run covers the iterations reported above.
	for i := 0; i < 10000; i++ {
		val := fmt.Sprintf("%d", i)
		db.Query("UPDATE ut.test SET val = ? WHERE key = 'foo'", val).Exec()

		// Read the value straight back and report any mismatch.
		var result string
		db.Query("SELECT val FROM ut.test WHERE key = 'foo'").Scan(&result)
		if result != val {
			fmt.Printf("Expected %v but got: %v\n", val, result)
		}
	}
}


read after write inconsistent even on a one node cluster

2014-11-06 Thread Brian Tarbox
We're doing development on a single node cluster (and yes of course we're
not really deploying that way), and we're getting inconsistent behavior on
reads after writes.

We write values to our keyspaces and then immediately read the values back
(in our Cucumber tests). About 20% of the time we get the old value. If
we wait 1 second and redo the query (within the same java method) we get
the new value.

This is all happening on a single node...how is this possible?

We're using 2.0.9 and the java client. Though it shouldn't matter given a
single node cluster I set the consistency level to ALL with no effect.

I've read CASSANDRA-876 which seems spot-on but it was closed as
won't-fix...and I don't see what the solution is.

Thanks in advance for any help.

Brian Tarbox

-- 
http://about.me/BrianTarbox


Re: read after write inconsistent even on a one node cluster

2014-11-06 Thread Eric Stevens
If this is just for doing tests to make sure you get back the data you
expect, I would recommend looking into some sort of "eventually" construct
in your testing. We use Specs2 as our testing framework, and our
write-then-read tests look something like this:

someDAO.write(someObject)

eventually {
  someDAO.read(someObject.id) mustEqual someObject
}

This will retry the read repeatedly over a short duration.

Just in case you are trying to do write-then-read outside of tests, you
should be aware that it's a Bad Idea™, but your email reads like you
already know that =)

On Thu Nov 06 2014 at 7:16:25 AM Brian Tarbox briantar...@gmail.com wrote:

 We're doing development on a single node cluster (and yes of course we're
 not really deploying that way), and we're getting inconsistent behavior on
 reads after writes.

 We write values to our keyspaces and then immediately read the values back
 (in our Cucumber tests). About 20% of the time we get the old value. If
 we wait 1 second and redo the query (within the same java method) we get
 the new value.

 This is all happening on a single node...how is this possible?

 We're using 2.0.9 and the java client. Though it shouldn't matter given
 a single node cluster I set the consistency level to ALL with no effect.

 I've read CASSANDRA-876 which seems spot-on but it was closed as
 won't-fix...and I don't see what the solution is.

 Thanks in advance for any help.

 Brian Tarbox

 --
 http://about.me/BrianTarbox



Re: read after write inconsistent even on a one node cluster

2014-11-06 Thread Brian Tarbox
Thanks. Right now it's just for testing, but in general we can't guard
against multiple users ending up in a situation where one writes and then
another reads.

It would be one thing if the read just got old data but we're seeing it
return wrong data...i.e. data that doesn't correspond to any particular
version of the object.

Brian

On Thu, Nov 6, 2014 at 10:30 AM, Eric Stevens migh...@gmail.com wrote:

 If this is just for doing tests to make sure you get back the data you
 expect, I would recommend looking into some sort of "eventually" construct
 in your testing. We use Specs2 as our testing framework, and our
 write-then-read tests look something like this:

 someDAO.write(someObject)

 eventually {
 someDAO.read(someObject.id) mustEqual someObject
 }

 This will retry the read repeatedly over a short duration.

 Just in case you are trying to do write-then-read outside of tests, you
 should be aware that it's a Bad Idea™, but your email reads like you
 already know that =)

-- 
http://about.me/BrianTarbox


Re: read after write inconsistent even on a one node cluster

2014-11-06 Thread Robert Coli
On Thu, Nov 6, 2014 at 6:14 AM, Brian Tarbox briantar...@gmail.com wrote:

 We write values to our keyspaces and then immediately read the values back
 (in our Cucumber tests). About 20% of the time we get the old value. If
 we wait 1 second and redo the query (within the same java method) we get
 the new value.

 This is all happening on a single node...how is this possible?


It sounds unreasonable/unexpected to me, if you have a trivial repro case,
I would file a JIRA.

=Rob


Re: read after write inconsistent even on a one node cluster

2014-11-06 Thread Jonathan Haddad
For cqlengine we do quite a bit of write then read to ensure data was
written correctly, across 1.2, 2.0, and 2.1.  For what it's worth,
I've never seen this issue come up.  On a single node, Cassandra only
acks the write after it's been written into the memtable.  So, you'd
expect to see the most recent data.

A possibility - if you're running in a VM, it's possible the clock
isn't incrementing in real time?  I've seen this happen with uuid1
generation - I was getting duplicates if I generated them fast enough.
Perhaps you're writing 2 values one right after the other and they're
getting the same millisecond precision timestamp.

On Thu, Nov 6, 2014 at 10:26 AM, Robert Coli rc...@eventbrite.com wrote:
 On Thu, Nov 6, 2014 at 6:14 AM, Brian Tarbox briantar...@gmail.com wrote:

 We write values to our keyspaces and then immediately read the values back
 (in our Cucumber tests). About 20% of the time we get the old value. If
 we wait 1 second and redo the query (within the same java method) we get the
 new value.

 This is all happening on a single node...how is this possible?


 It sounds unreasonable/unexpected to me, if you have a trivial repro case, I
 would file a JIRA.

 =Rob




-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade