Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster
Per Aleksey Yeschenko's comment on that ticket, it does seem like a timestamp granularity issue, but it should work properly if it is within the same session. gocql by default uses 2 connections and 128 streams per connection. If you set it to 1 connection with 1 stream, this problem goes away. I suppose that'll take care of it in testing.

At least one interesting conclusion here: a gocql.Session does not map to one Cassandra session. This makes some sense given that gocql says a Session should be shared concurrently (so it had better not be just one Cassandra session), but it is a bit concerning that there is no way to make this 100% safe short of cutting the gocql.Session down to 1 connection and 1 stream.

On Mon, Mar 2, 2015 at 5:34 PM, Peter Sanford psanf...@retailnext.net wrote:

The more I think about it, the more this feels like a column timestamp issue. If two inserts have the same timestamp then the values are compared lexically to decide which one to keep (which I think explains the 99/100 999/1000 mystery).

We can verify this by also selecting out the WRITETIME of the column:

    ...
    var prevTS int
    // The loop bound is truncated in the archive (it reads "i 1");
    // the output below shows the original bound was larger than 1000.
    for i := 0; i < 1; i++ {
        val := fmt.Sprintf("%d", i)
        db.Query("UPDATE ut.test SET val = ? WHERE key = 'foo'", val).Exec()

        var result string
        var ts int
        // Scan needs pointers to the destination variables.
        db.Query("SELECT val, WRITETIME(val) FROM ut.test WHERE key = 'foo'").Scan(&result, &ts)
        if result != val {
            fmt.Printf("Expected %v but got: %v; (prevTS:%d, ts:%d)\n", val, result, prevTS, ts)
        }
        prevTS = ts
    }

When I run it with this change I see that the timestamps are in fact the same:

    Expected 10 but got: 9; (prevTS:1425345839903000, ts:1425345839903000)
    Expected 100 but got: 99; (prevTS:1425345839939000, ts:1425345839939000)
    Expected 101 but got: 99; (prevTS:1425345839939000, ts:1425345839939000)
    Expected 1000 but got: 999; (prevTS:1425345840296000, ts:1425345840296000)

It looks like we're only getting millisecond precision instead of microsecond for the column timestamps?!
If you explicitly set the timestamp value when you do the insert, you can get actual microsecond precision and the issue should go away.

-psanford

On Mon, Mar 2, 2015 at 4:21 PM, Dan Kinder dkin...@turnitin.com wrote:

Yeah I thought that was suspicious too, it's mysterious and fairly consistent. (By the way I had error checking but removed it for email brevity, but thanks for verifying :) )

--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
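A minimal sketch of the explicit-timestamp workaround described above, assuming the client generates its own microsecond timestamps and passes them through CQL's USING TIMESTAMP clause. The microClock type and its Next method are hypothetical helpers written for this illustration and are not part of gocql; the query in the comment reuses the ut.test table from the repro program.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// microClock is a hypothetical helper that hands out strictly increasing
// microsecond timestamps, even when called several times within the same
// microsecond or when the wall clock steps backwards.
type microClock struct {
	mu   sync.Mutex
	last int64
}

// Next returns microseconds since the Unix epoch, bumped by one when the
// wall clock has not advanced past the previously returned value.
func (c *microClock) Next() int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	now := time.Now().UnixNano() / 1000
	if now <= c.last {
		now = c.last + 1
	}
	c.last = now
	return now
}

func main() {
	var clock microClock
	a, b := clock.Next(), clock.Next()
	fmt.Println(b > a) // true

	// Assumed usage with gocql: bind the timestamp like any other parameter.
	//   db.Query("UPDATE ut.test USING TIMESTAMP ? SET val = ? WHERE key = 'foo'",
	//       clock.Next(), val).Exec()
}
```

Because each write then carries a distinct timestamp, the tie-break on the value should never come into play for writes issued through the same clock. Separate processes would each have their own clock, so this only addresses the single-client test scenario.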
Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster
On Mon, Mar 2, 2015 at 11:44 AM, Dan Kinder dkin...@turnitin.com wrote:

I had been having the same problem as in those older post: http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E

As I said on that thread : It sounds unreasonable/unexpected to me, if you have a trivial repro case, I would file a JIRA.

=Rob
Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster
Yeah I thought that was suspicious too, it's mysterious and fairly consistent. (By the way I had error checking but removed it for email brevity, but thanks for verifying :) )

On Mon, Mar 2, 2015 at 4:13 PM, Peter Sanford psanf...@retailnext.net wrote:

Hmm. I was able to reproduce the behavior with your go program on my dev machine (C* 2.0.12). I was hoping it was going to just be an unchecked error from the .Exec() or .Scan(), but that is not the case for me. The fact that the issue seems to happen on loop iteration 10, 100 and 1000 is pretty suspicious. I took a tcpdump to confirm that the gocql was in fact sending the write 100 query and then on the next read Cassandra responded with 99. I'll be interested to see what the result of the jira ticket is.

-psanford

--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster
Hmm. I was able to reproduce the behavior with your Go program on my dev machine (C* 2.0.12). I was hoping it was going to just be an unchecked error from the .Exec() or .Scan(), but that is not the case for me. The fact that the issue seems to happen on loop iterations 10, 100 and 1000 is pretty suspicious. I took a tcpdump to confirm that gocql was in fact sending the "write 100" query and that on the next read Cassandra responded with 99. I'll be interested to see what the result of the JIRA ticket is.

-psanford
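The pattern at iterations 10, 100 and 1000 matches the explanation given elsewhere in this thread: when two writes carry the same timestamp, the values are compared lexically and the greater one is kept. The sketch below is a simplified model of that rule, not Cassandra's actual code; the cell type and reconcile function are hypothetical names used only for illustration.

```go
package main

import "fmt"

// cell is a simplified stand-in for a stored column value.
type cell struct {
	value     string
	timestamp int64 // microseconds since the epoch
}

// reconcile picks the surviving cell: the newer timestamp wins, and on a
// timestamp tie the lexically greater value wins (byte-wise comparison).
func reconcile(a, b cell) cell {
	if a.timestamp != b.timestamp {
		if a.timestamp > b.timestamp {
			return a
		}
		return b
	}
	if a.value >= b.value {
		return a
	}
	return b
}

func main() {
	const ts = 1425345839939000 // timestamp taken from the output in this thread

	old := cell{"99", ts}

	// Same timestamp: "99" sorts after "100", so the old value survives.
	fmt.Println(reconcile(old, cell{"100", ts}).value) // 99

	// One microsecond later: the new value wins as expected.
	fmt.Println(reconcile(old, cell{"100", ts + 1}).value) // 100
}
```

Under this model a tie only produces a visibly wrong read when the new value sorts before the old one, which for a decimal counter happens when the number of digits grows (9 to 10, 99 to 100, 999 to 1000).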
Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster
Done: https://issues.apache.org/jira/browse/CASSANDRA-8892

On Mon, Mar 2, 2015 at 3:26 PM, Robert Coli rc...@eventbrite.com wrote:

On Mon, Mar 2, 2015 at 11:44 AM, Dan Kinder dkin...@turnitin.com wrote:

I had been having the same problem as in those older post: http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E

As I said on that thread : It sounds unreasonable/unexpected to me, if you have a trivial repro case, I would file a JIRA.

=Rob

--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
Reboot: Read After Write Inconsistent Even On A One Node Cluster
Hey all,

I had been having the same problem as in this older post: http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E

To summarize it, on my local box with just one Cassandra node I can update and then select the updated row and get an incorrect response. My understanding is this may have to do with not having fine-grained enough timestamp resolution, but regardless I'm wondering: is this actually a bug, or is there any way to mitigate it? It causes sporadic failures in our unit tests, and having to Sleep() between tests isn't ideal. At least confirming it's a bug would be nice though.

For those interested, here's a little Go program that can reproduce the issue. When I run it I typically see:

    Expected 100 but got: 99
    Expected 1000 but got: 999

--- main.go: ---

    package main

    import (
        "fmt"

        "github.com/gocql/gocql"
    )

    func main() {
        cf := gocql.NewCluster("localhost")
        db, _ := cf.CreateSession()

        // Keyspace ut = update test
        err := db.Query(`CREATE KEYSPACE IF NOT EXISTS ut WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 }`).Exec()
        if err != nil {
            panic(err.Error())
        }
        err = db.Query("CREATE TABLE IF NOT EXISTS ut.test (key text, val text, PRIMARY KEY(key))").Exec()
        if err != nil {
            panic(err.Error())
        }
        err = db.Query("TRUNCATE ut.test").Exec()
        if err != nil {
            panic(err.Error())
        }
        err = db.Query("INSERT INTO ut.test (key) VALUES ('foo')").Exec()
        if err != nil {
            panic(err.Error())
        }

        // The loop bound is truncated in the archive (it reads "i 1");
        // the output above shows the original bound was larger than 1000.
        for i := 0; i < 1; i++ {
            val := fmt.Sprintf("%d", i)
            db.Query("UPDATE ut.test SET val = ? WHERE key = 'foo'", val).Exec()

            // Read the value straight back; Scan needs a pointer.
            var result string
            db.Query("SELECT val FROM ut.test WHERE key = 'foo'").Scan(&result)
            if result != val {
                fmt.Printf("Expected %v but got: %v\n", val, result)
            }
        }
    }
read after write inconsistent even on a one node cluster
We're doing development on a single node cluster (and yes of course we're not really deploying that way), and we're getting inconsistent behavior on reads after writes.

We write values to our keyspaces and then immediately read the values back (in our Cucumber tests). About 20% of the time we get the old value. If we wait 1 second and redo the query (within the same Java method) we get the new value.

This is all happening on a single node... how is this possible?

We're using 2.0.9 and the Java client. Though it shouldn't matter given a single node cluster, I set the consistency level to ALL with no effect.

I've read CASSANDRA-876, which seems spot-on, but it was closed as won't-fix... and I don't see what the solution is.

Thanks in advance for any help.

Brian Tarbox

--
http://about.me/BrianTarbox
Re: read after write inconsistent even on a one node cluster
If this is just for doing tests to make sure you get back the data you expect, I would recommend looking at some sort of eventually construct in your testing. We use Specs2 as our testing framework, and our write-then-read tests look something like this:

    someDAO.write(someObject)
    eventually {
      someDAO.read(someObject.id) mustEqual someObject
    }

This will retry the read repeatedly over a short duration.

Just in case you are trying to do write-then-read outside of tests, you should be aware that it's a Bad Idea™, but your email reads like you already know that =)

On Thu Nov 06 2014 at 7:16:25 AM Brian Tarbox briantar...@gmail.com wrote:

We're doing development on a single node cluster (and yes of course we're not really deploying that way), and we're getting inconsistent behavior on reads after writes.

We write values to our keyspaces and then immediately read the values back (in our Cucumber tests). About 20% of the time we get the old value. If we wait 1 second and redo the query (within the same Java method) we get the new value.

This is all happening on a single node... how is this possible?

We're using 2.0.9 and the Java client. Though it shouldn't matter given a single node cluster, I set the consistency level to ALL with no effect.

I've read CASSANDRA-876, which seems spot-on, but it was closed as won't-fix... and I don't see what the solution is.

Thanks in advance for any help.

Brian Tarbox

--
http://about.me/BrianTarbox
Re: read after write inconsistent even on a one node cluster
Thanks. Right now it's just for testing, but in general we can't guard against multiple users ending up in a situation where one writes and then another reads.

It would be one thing if the read just got old data, but we're seeing it return wrong data, i.e. data that doesn't correspond to any particular version of the object.

Brian

On Thu, Nov 6, 2014 at 10:30 AM, Eric Stevens migh...@gmail.com wrote:

If this is just for doing tests to make sure you get back the data you expect, I would recommend looking at some sort of eventually construct in your testing. We use Specs2 as our testing framework, and our write-then-read tests look something like this:

    someDAO.write(someObject)
    eventually {
      someDAO.read(someObject.id) mustEqual someObject
    }

This will retry the read repeatedly over a short duration.

Just in case you are trying to do write-then-read outside of tests, you should be aware that it's a Bad Idea™, but your email reads like you already know that =)

--
http://about.me/BrianTarbox
Re: read after write inconsistent even on a one node cluster
On Thu, Nov 6, 2014 at 6:14 AM, Brian Tarbox briantar...@gmail.com wrote:

We write values to our keyspaces and then immediately read the values back (in our Cucumber tests). About 20% of the time we get the old value. If we wait 1 second and redo the query (within the same java method) we get the new value. This is all happening on a single node... how is this possible?

It sounds unreasonable/unexpected to me, if you have a trivial repro case, I would file a JIRA.

=Rob
Re: read after write inconsistent even on a one node cluster
For cqlengine we do quite a bit of write then read to ensure data was written correctly, across 1.2, 2.0, and 2.1. For what it's worth, I've never seen this issue come up. On a single node, Cassandra only acks the write after it's been written into the memtable. So, you'd expect to see the most recent data.

A possibility - if you're running in a VM, it's possible the clock isn't incrementing in real time? I've seen this happen with uuid1 generation - I was getting duplicates if I generated them fast enough. Perhaps you're writing 2 values one right after the other and they're getting the same millisecond precision timestamp.

On Thu, Nov 6, 2014 at 10:26 AM, Robert Coli rc...@eventbrite.com wrote:

On Thu, Nov 6, 2014 at 6:14 AM, Brian Tarbox briantar...@gmail.com wrote:

We write values to our keyspaces and then immediately read the values back (in our Cucumber tests). About 20% of the time we get the old value. If we wait 1 second and redo the query (within the same java method) we get the new value. This is all happening on a single node... how is this possible?

It sounds unreasonable/unexpected to me, if you have a trivial repro case, I would file a JIRA.

=Rob

--
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade