Re: Inconsistent Quorum Read after Quorum Write
Li,

> I did not reset repairedAt and ran repair with -pr directly. That’s probably why the inconsistency occurred.

Yes, this is a likely cause. There are enough docs out there to help you with this. Shout out if not.

> As our tables are pretty big, full repair takes many days to finish. Given the 10-day gc period, it means repair will run almost all the time. Consistency is more important to us and the cluster takes the same amount of write and read requests. Temporary outage is allowed but if a dead node can’t come back in time, we will go back to Quorum mode.

Yes, repairs can be a real headache. Install Reaper. Seriously.
http://cassandra-reaper.io/

> What’s new in 3.11.3? We’ve been running on C* for almost 2 years. The biggest pain point is about repair. Especially with 3.11.1, incremental repair doesn’t work well compared to our experience with 3.10. Maybe it’s just because our data size wasn’t that big before upgrade...

3.11.2 and 3.11.3 are just patch releases on top of 3.11.1. It's definitely recommended to always *test* and upgrade to the latest patch release. And it's kinda a prerequisite if you want help from the open source community, none of us really enjoy debugging old code :-)

regards,
Mick

--
Mick Semb Wever
Australia

The Last Pickle
Apache Cassandra Consulting
http://www.thelastpickle.com
Re: Inconsistent Quorum Read after Quorum Write
Hi Mick,

Thanks for replying!

I did not reset repairedAt and ran repair with -pr directly. That’s probably why the inconsistency occurred.

As our tables are pretty big, full repair takes many days to finish. Given the 10-day gc period, it means repair will run almost all the time. Consistency is more important to us and the cluster takes the same amount of write and read requests. Temporary outage is allowed but if a dead node can’t come back in time, we will go back to Quorum mode.

What’s new in 3.11.3? We’ve been running on C* for almost 2 years. The biggest pain point is about repair. Especially with 3.11.1, incremental repair doesn’t work well compared to our experience with 3.10. Maybe it’s just because our data size wasn’t that big before upgrade...

Thanks,
Li

> On Jul 11, 2018, at 05:32, Mick Semb Wever wrote:
>
> Li,
>
>> I’ve confirmed that the inconsistency issues disappeared after repair finished.
>>
>> Anything changed with repair in 3.11.1? One difference I noticed is that the validation step during repair could bring down the node on large tables, which never happened in 3.10. I had to throttle validation requests to let it pass. Also I switched back to -pr instead of incremental repair which is a resource killer and often hangs for the first node to be repaired.
>
> When you switched back to non-incremental did you set `repairedAt` on all sstables (on all nodes) back to zero (or unrepaired state)?
> This should have been done with `sstablerepairedset --is-unrepaired … ` while the node is stopped.
>
>> To address the inconsistency issue, I could do Write All and Read One by giving up availability and stop running repair. Any comments on that?
>
> You lose availability doing this, and at the number of reads you're doing I would not recommend it.
> You could think about using a fallback strategy that initially tries CL.ALL and falls back to CL.QUORUM. But this is a hack, could overload your cluster, and won't help if there's any correlation to dropped messages or flapping nodes.
>
> I'd also be prepared to upgrade to 3.11.3, when it does get released.
>
> regards,
> Mick
Re: Inconsistent Quorum Read after Quorum Write
Li,

> I’ve confirmed that the inconsistency issues disappeared after repair finished.
>
> Anything changed with repair in 3.11.1? One difference I noticed is that the validation step during repair could bring down the node on large tables, which never happened in 3.10. I had to throttle validation requests to let it pass. Also I switched back to -pr instead of incremental repair which is a resource killer and often hangs for the first node to be repaired.

When you switched back to non-incremental did you set `repairedAt` on all sstables (on all nodes) back to zero (or unrepaired state)?
This should have been done with `sstablerepairedset --is-unrepaired … ` while the node is stopped.

> To address the inconsistency issue, I could do Write All and Read One by giving up availability and stop running repair. Any comments on that?

You lose availability doing this, and at the number of reads you're doing I would not recommend it.
You could think about using a fallback strategy that initially tries CL.ALL and falls back to CL.QUORUM. But this is a hack, could overload your cluster, and won't help if there's any correlation to dropped messages or flapping nodes.

I'd also be prepared to upgrade to 3.11.3, when it does get released.

regards,
Mick

--
Mick Semb Wever
Australia

The Last Pickle
Apache Cassandra Consulting
http://www.thelastpickle.com
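For illustration, the CL.ALL to CL.QUORUM fallback mentioned above could look roughly like the Python sketch below. It is not driver code: `execute`, `UnavailableError` and `fake_execute` are hypothetical stand-ins, and a real client would catch its own driver's unavailable and timeout exceptions instead. The caveats above still apply, since every failed CL.ALL attempt is followed by a second request.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class UnavailableError(Exception):
    """Hypothetical stand-in for a driver's 'not enough replicas' error."""


def execute_with_fallback(
    execute: Callable[[str, str], T],
    statement: str,
    levels: Sequence[str] = ("ALL", "QUORUM"),
) -> T:
    """Try each consistency level in order and return the first result."""
    if not levels:
        raise ValueError("at least one consistency level is required")
    last_error: Optional[UnavailableError] = None
    for level in levels:
        try:
            return execute(statement, level)
        except UnavailableError as err:
            # Not enough replicas answered at this level: try the next one.
            last_error = err
    # Every level failed, so surface the error from the weakest level tried.
    assert last_error is not None
    raise last_error


def fake_execute(statement: str, level: str) -> str:
    """Toy executor that simulates one dead replica out of three."""
    if level == "ALL":
        raise UnavailableError("only 2 of 3 replicas alive")
    return f"{statement} @ {level}"


result = execute_with_fallback(fake_execute, "select pk,a,b,c from t where pk=1")
print(result)
```

With the toy executor the CL.ALL attempt fails and the statement is served at QUORUM, which is the behaviour described above. If every level fails, the last error is re-raised to the caller.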
Re: Inconsistent Quorum Read after Quorum Write
I’ve confirmed that the inconsistency issues disappeared after repair finished.

Anything changed with repair in 3.11.1? One difference I noticed is that the validation step during repair could bring down the node on large tables, which never happened in 3.10. I had to throttle validation requests to let it pass. Also I switched back to -pr instead of incremental repair which is a resource killer and often hangs for the first node to be repaired.

To address the inconsistency issue, I could do Write All and Read One by giving up availability and stop running repair. Any comments on that?

I guess I could also try downgrading the C* version, but will the data files be a problem?

Li

> On Jul 3, 2018, at 15:53, kurt greaves wrote:
>
> Shouldn't happen. Any chance you could trace the queries, or have you been able to reproduce it? Also, what version of Cassandra?
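As a rough aid for comparing Quorum/Quorum with Write All/Read One at RF=3, the replica arithmetic can be sketched in Python as below. The helper names (`quorum`, `replicas_required`, `overlaps`, `tolerated_failures`) are made up for this example. The sketch only counts replicas; it says nothing about what repair does to the data on them.

```python
def quorum(rf: int) -> int:
    """Replicas needed for QUORUM: a strict majority of the replication factor."""
    return rf // 2 + 1


def replicas_required(level: str, rf: int) -> int:
    """Replicas that must respond for a given consistency level."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def overlaps(write_level: str, read_level: str, rf: int) -> bool:
    """True when every read set must share a replica with every write set (W + R > RF)."""
    return replicas_required(write_level, rf) + replicas_required(read_level, rf) > rf


def tolerated_failures(level: str, rf: int) -> int:
    """Replicas that can be down while requests at this level still succeed."""
    return rf - replicas_required(level, rf)


RF = 3
for write_level, read_level in [("QUORUM", "QUORUM"), ("ALL", "ONE")]:
    print(
        write_level,
        read_level,
        "overlap:", overlaps(write_level, read_level, RF),
        "write tolerates:", tolerated_failures(write_level, RF),
        "read tolerates:", tolerated_failures(read_level, RF),
    )
```

Both combinations satisfy W + R > RF, so in either case a read contacts at least one replica that acknowledged the write. The difference is availability: at RF=3 a QUORUM write still succeeds with one replica down, while a write at ALL fails as soon as any replica is unreachable.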
Re: Inconsistent Quorum Read after Quorum Write
Thanks for replying!

The version is C* 3.11.1.

The quorum write and read are done in Java code (Spark Streaming), in async mode within the same session. I could not reproduce it via cqlsh yet.

With the same session, we execute the async write, and in the callback execute the async read. To be more accurate, the read is

select json pk,a,b,c from t where pk=1 limit 1

Li

> On Jul 3, 2018, at 15:53, kurt greaves wrote:
>
> Shouldn't happen. Any chance you could trace the queries, or have you been able to reproduce it? Also, what version of Cassandra?
Re: Inconsistent Quorum Read after Quorum Write
Shouldn't happen. Any chance you could trace the queries, or have you been able to reproduce it? Also, what version of Cassandra?

On Wed., 4 Jul. 2018, 06:41 Visa, wrote:

> Hi all,
>
> We recently experienced unexpected behavior with C* consistency.
>
> For example, a table t consists of 4 columns - pk, a, b and c. We perform Quorum write and then Quorum read (RF=3 / LCS compaction).
>
> The consistency seems to break while repair is running (repair -pr).
>
> Say, a record already exists in t like
> pk=1, a=1, b=1, c=1
>
> While repair is not running
>
> Quorum Write:
> update t set c = 2 where pk=1
>
> Quorum Read:
> select pk,a,b,c from t where pk=1 limit 1
>
> Returns: (1, 1, 1, 2) as expected.
>
> But if we do it while repair is running,
>
> Quorum Write:
> update t set c=3 where pk=1
>
> Quorum Read, however, returns (1, null, null, 3) w/o values of a and b.
>
> After repair is done, the same Quorum Read returns the right values (1,1,1,3).
>
> It does not happen to every row in t. The impacted rows are like 40 out of 300 million. But still, how does the consistency get broken here?
>
> Thanks for your attention!
>
> Li
Inconsistent Quorum Read after Quorum Write
Hi all,

We recently experienced unexpected behavior with C* consistency.

For example, a table t consists of 4 columns - pk, a, b and c. We perform Quorum write and then Quorum read (RF=3 / LCS compaction).

The consistency seems to break while repair is running (repair -pr).

Say, a record already exists in t like
pk=1, a=1, b=1, c=1

While repair is not running

Quorum Write:
update t set c = 2 where pk=1

Quorum Read:
select pk,a,b,c from t where pk=1 limit 1

Returns: (1, 1, 1, 2) as expected.

But if we do it while repair is running,

Quorum Write:
update t set c=3 where pk=1

Quorum Read, however, returns (1, null, null, 3) w/o values of a and b.

After repair is done, the same Quorum Read returns the right values (1,1,1,3).

It does not happen to every row in t. The impacted rows are like 40 out of 300 million. But still, how does the consistency get broken here?

Thanks for your attention!

Li

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org