Mutation dropped and Read-Repair performance issue

2020-12-19 Thread sunil pawar
Hi All, We are seeing Read-Repair stage failures with Digest Mismatch errors, at a rate of 300+ per day per node. At the same time, nodes become overloaded for a few seconds at a time due to long GC pauses (around 7-8 seconds). We are not running a repair

Re: Could a READ REPAIR really be triggered even if there avg 80 ms between calls

2020-09-02 Thread Tobias Eriksson
2020 at 17:27 To: cassandra Subject: Re: Could a READ REPAIR really be triggered even if there avg 80 ms between calls Yes, it's possible. A typical JVM GC pause for most configs is on the order of 50-200ms. If you have a host do a small collection/pause, then the read at #4 is basically racing

Re: Could a READ REPAIR really be triggered even if there avg 80 ms between calls

2020-09-01 Thread Jeff Jirsa
. Data replicated by Cassandra, but will not finish before (4) below >3. Wait 80 ms on average >4. Data read again with QUORUM, i.e. asking for at least 2 out of 3 nodes >for result, and now ONE replies with inaccurate data >5. (4) triggers a READ REPAIR >6. The RE

Re: Could a READ REPAIR really be triggered even if there avg 80 ms between calls

2020-09-01 Thread Erick Ramirez
Did you mean LOCAL_QUORUM? Because QUORUM will require 4 out of 6 replicas, not 2 out of 3. :) But it sounds like you are using QUORUM because you said it syncs to all nodes in DC2. To answer your question, RR *can* be triggered if you're reading before the replicas are *eventually* consistent.
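
The arithmetic behind the "4 out of 6, not 2 out of 3" correction can be sketched as follows. This is a minimal illustration, assuming two datacenters with a replication factor of 3 each (the thread implies this topology but the snippet does not spell it out). The function names are hypothetical and are not part of any Cassandra driver.

```python
def quorum(replicas: int) -> int:
    """Majority of `replicas`: floor(replicas / 2) + 1."""
    return replicas // 2 + 1


def required_acks(rf_per_dc: dict, local_dc: str) -> dict:
    """Replica responses needed for QUORUM vs LOCAL_QUORUM.

    rf_per_dc maps a datacenter name to its replication factor
    (an assumed topology, e.g. {"DC1": 3, "DC2": 3}).
    """
    return {
        # QUORUM counts replicas across every datacenter.
        "QUORUM": quorum(sum(rf_per_dc.values())),
        # LOCAL_QUORUM counts only the coordinator's datacenter.
        "LOCAL_QUORUM": quorum(rf_per_dc[local_dc]),
    }


if __name__ == "__main__":
    print(required_acks({"DC1": 3, "DC2": 3}, "DC1"))
```

With the assumed 3+3 layout, QUORUM needs 4 responses and LOCAL_QUORUM needs 2, which is why a read "asking for 2 out of 3 nodes" is most likely a LOCAL_QUORUM read.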

Could a READ REPAIR really be triggered even if there avg 80 ms between calls

2020-09-01 Thread Tobias Eriksson
with QUORUM, i.e. asking for at least 2 out of 3 nodes for the result, and now ONE replies with inaccurate data 5. (4) triggers a READ REPAIR 6. The READ REPAIR now syncs to ALL nodes, also in DC2. So my question is: Is it really possible that Cassandra is not able to replicate to all 3 nodes within 80 ms

Re: Why a READ REPAIR ?

2020-08-12 Thread Erick Ramirez
You can check for the string "digest mismatch" in the logs. Similarly, you can track the RR stats in nodetool netstats and the dropped mutations in nodetool tpstats. To be clear though, RRs are a side-effect of nodes either dropping mutations or being unresponsive so they miss mutations. RRs do
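
Searching the logs for "digest mismatch" can be turned into a per-day count with a few lines of Python. This is a rough sketch, assuming log lines shaped like the DEBUG line quoted in the "disable debug message on read repair" thread (level, bracketed thread name, then date and time). The function name is hypothetical, and other Cassandra versions or logback patterns may format lines differently.

```python
import re
from collections import Counter

# Assumed line shape, taken from a sample quoted in this archive:
# DEBUG [ReadRepairStage:346] 2020-03-08 08:09:12,959 ReadCallback.java:242 - Digest mismatch: ...
LINE_RE = re.compile(r"^\w+\s+\[[^\]]+\]\s+(\d{4}-\d{2}-\d{2})\s")


def count_digest_mismatches(lines):
    """Return a Counter mapping 'YYYY-MM-DD' to the number of digest-mismatch lines."""
    counts = Counter()
    for line in lines:
        if "digest mismatch" not in line.lower():
            continue  # not a read-repair digest mismatch message
        match = LINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

To use it, pass an open file handle for the node's debug.log (its location depends on the install) and compare the daily totals with the dropped-mutation counts from nodetool tpstats.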

Re: Why a READ REPAIR ?

2020-08-12 Thread Tobias Eriksson
Thanks Erick. Is there a way to turn on tracing based on certain criteria? I would like to start tracing when there is some sort of failure, i.e. in this case when a READ REPAIR is triggered, as I would like to know why we sometimes can’t reach one of the nodes. -Tobias From: Erick Ramirez Reply

Re: Why a READ REPAIR ?

2020-08-11 Thread Erick Ramirez
um : SELECT * FROM products WHERE id = ABC123 > > READ 2 with Local One : SELECT * FROM products WHERE id = ABC123 > > > > Would read (2) be blocked by the READ REPAIR that was done by read (1) > > As I understand that the read repair is working not on the whole table but > o

Re: Why a READ REPAIR ?

2020-08-11 Thread manish khandelwal
Hi Tobias READ2 will not be blocked by READ repair of READ1. Regards Manish On Tue, Aug 11, 2020 at 6:02 PM Tobias Eriksson wrote: > Thanx Erick, > > Perhaps this is super obvious but I need a confirmation as you say “…not > subsequent reads for other data unrelated to the read be

Re: Why a READ REPAIR ?

2020-08-11 Thread Tobias Eriksson
= ABC123 READ 2 with Local One : SELECT * FROM products WHERE id = ABC123 Would read (2) be blocked by the READ REPAIR that was done by read (1) As I understand that the read repair is working not on the whole table but on the partition key it had problems with -Tobias From: Erick Ramirez Reply

Re: Why a READ REPAIR ?

2020-08-11 Thread Erick Ramirez
> > If a READ triggers a READ REPAIR, and then if we do an additional READ > would then that BLOCK until the “first” READ REPAIR would be done ? > > -Tobias > Not all read repairs are blocking RRs (aka foreground RRs). There are also background RRs which by definition are no

Re: Why a READ REPAIR ?

2020-08-11 Thread Tobias Eriksson
If a READ triggers a READ REPAIR and we then do an additional READ, would that BLOCK until the “first” READ REPAIR is done? -Tobias From: Jeff Jirsa Reply to: "user@cassandra.apache.org" Date: Tuesday, 11 August 2020 at 07:30 To: cassandra Subject: Re: Why a R

Re: Why a READ REPAIR ?

2020-08-10 Thread Jeff Jirsa
Your schema may have read repair (non-blocking, background) set to 10% (0.1, for dclocal). You may have GC pauses causing writes (or reads) to be delayed. You may be hitting a cassandra bug. Would need the `TRACING` output to know for sure. On Mon, Aug 10, 2020 at 10:10 PM Tobias Eriksson

Why a READ REPAIR ?

2020-08-10 Thread Tobias Eriksson
Hi, We have a Cassandra solution with 2 DCs where each DC has >30 nodes. From time to time we see problems with READ REPAIR, but I am stuck with the analysis. We have a pattern for these faults where we do 1. INSERT with Local Quorum (2 out of 3) 2. Wait for 0.5 - 1 seconds time window

Re: disable debug message on read repair

2020-03-10 Thread Paul Chandler
Hi Gil, All the logging is controlled via logback. You can change the level of any type of message. Take a look here for some more details: https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/configuration/configLoggingLevels.html

Re: disable debug message on read repair

2020-03-10 Thread Gil Ganz
That's one option. I wish there was a way to disable just that and not the entire debug log level; there are some things there I would like to keep. On Sun, Mar 8, 2020 at 6:41 PM Jeff Jirsa wrote: > There are likely two log configs - one for debug.log and one for > system.log. Disable the

Re: disable debug message on read repair

2020-03-08 Thread Jeff Jirsa
There are likely two log configs - one for debug.log and one for system.log. Disable the debug.log one, or change org.apache.cassandra.service to log at INFO instead. Nobody needs to see every digest mismatch, and that someone thought this was a good idea is amazing to me. Someone should jira

Re: disable debug message on read repair

2020-03-08 Thread Gil Ganz
Thanks Shalom, I know why these read repairs are happening, and they will continue to happen for some time, even if I run a full repair. I would like to disable these warning messages. On Sun, Mar 8, 2020 at 10:19 AM Shalom Sagges wrote: > Hi Gil, > > You can run a full repair on your

Re: disable debug message on read repair

2020-03-08 Thread Shalom Sagges
Hi Gil, You can run a full repair on your cluster. But if these messages come back again, you need to check what's causing these data inconsistencies. On Sun, Mar 8, 2020 at 10:11 AM Gil Ganz wrote: > Hey all > I have a lot of debug message about read repairs in my debug log : > > DEBUG

disable debug message on read repair

2020-03-08 Thread Gil Ganz
Hey all, I have a lot of debug messages about read repairs in my debug log: DEBUG [ReadRepairStage:346] 2020-03-08 08:09:12,959 ReadCallback.java:242 - Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-28476014476640,

Re: Constant blocking read repair for such a tiny table

2019-10-16 Thread Patrick Lee
doesn't seem to be the same, it looks like just less than 10% of the read traffic. the query i originally posted was one that we captured and used as an example. every time i would run it at local_quorum, all, quorum... it would do a read repair. the record hasn't been updated for a long time

Re: Constant blocking read repair for such a tiny table

2019-10-16 Thread Jeff Jirsa
>> >> >> *From:* Patrick Lee [mailto:patrickclee0...@gmail.com] >> *Sent:* Wednesday, October 16, 2019 12:22 PM >> *To:* user@cassandra.apache.org >> *Subject:* Re: Constant blocking read repair for such a tiny table >> >> >> >> haven't

Re: Constant blocking read repair for such a tiny table

2019-10-16 Thread Patrick Lee
il.com] > *Sent:* Wednesday, October 16, 2019 12:22 PM > *To:* user@cassandra.apache.org > *Subject:* Re: Constant blocking read repair for such a tiny table > > > > haven't really figured this out yet. it's not a big problem but it is > annoying for sure! the cluster

RE: Constant blocking read repair for such a tiny table

2019-10-16 Thread ZAIDI, ASAD
atrick Lee [mailto:patrickclee0...@gmail.com] Sent: Wednesday, October 16, 2019 12:22 PM To: user@cassandra.apache.org Subject: Re: Constant blocking read repair for such a tiny table haven't really figured this out yet. it's not a big problem but it is annoying for sure! the cluster was upgrade

Re: Constant blocking read repair for such a tiny table

2019-10-16 Thread Patrick Lee
1 table, out of all the ones on the cluster has this behavior. repair has been run few times via reaper. even did a nodetool compact on the nodes (since this table is like 1GB per node..) . just don't see why there would be any inconsistency that would trigger read repair. any insight you may have

Re: Constant blocking read repair for such a tiny table

2019-10-15 Thread Alain RODRIGUEZ
50ms.. the only >> odd thing i see is just that there are constant read repairs that follow >> the same traffic pattern on the reads, which shows constant writes on the >> table (from the read repairs), which after read repair or just normal full >> repairs (all full t

Oversized Read Repair Mutations

2019-10-14 Thread Isaac Reath (BLOOMBERG/ 731 LEX)
Hi Cassandra users, Recently on some of our production clusters we have run into the following error: 2019-10-11 15:14:46,803 DataResolver.java:507 - Encountered an oversized (x/y) read repair mutation for table. Which is described in this jira: https://issues.apache.org/jira/browse

Re: Constant blocking read repair for such a tiny table

2019-10-03 Thread Patrick Lee
reads, which shows constant writes on the > table (from the read repairs), which after read repair or just normal full > repairs (all full through reaper, never ran any incremental repair) i would > expect it to not have any mismatches. the other 5 tables they use on the > cluster can h

Re: Constant blocking read repair for such a tiny table

2019-10-03 Thread Patrick Lee
thing i see is just that there are constant read repairs that follow the same traffic pattern on the reads, which shows constant writes on the table (from the read repairs), which after read repair or just normal full repairs (all full through reaper, never ran any incremental repair) i would expect

RE: Constant blocking read repair for such a tiny table

2019-10-03 Thread John Belliveau
PM To: user@cassandra.apache.org Subject: Constant blocking read repair for such a tiny table I have a cluster that is running 3.11.4 ( was upgraded a while back from 2.1.16 ).  what I see is a steady rate of read repair which is about 10% constantly on only this 1 table.  Repairs have been run

Constant blocking read repair for such a tiny table

2019-10-03 Thread Patrick Lee
I have a cluster that is running 3.11.4 ( was upgraded a while back from 2.1.16 ). what I see is a steady rate of read repair which is about 10% constantly on only this 1 table. Repairs have been run (actually several times). The table does not have a lot of writes to it so after repair

Re: read repair with consistency one

2018-04-25 Thread Grzegorz Pietrusza
Hi Ben Thanks a lot. From my analysis of the code it looks like you are right. When global read repair kicks in all live endpoints are queried for data, regardless of consistency level. Only EACH_QUORUM is treated differently. Cheers Grzegorz 2018-04-22 1:45 GMT+02:00 Ben Slater <ben.

Re: read repair with consistency one

2018-04-21 Thread Ben Slater
Ben On Sat, 21 Apr 2018 at 22:20 Grzegorz Pietrusza <gpietru...@gmail.com> wrote: > I haven't asked about "regular" repairs. I just wanted to know how read > repair behaves in my configuration (or is it doing anything at all). > > 2018-04-21 14:04 GMT+02:00 Rahul Sing

Re: read repair with consistency one

2018-04-21 Thread Grzegorz Pietrusza
I haven't asked about "regular" repairs. I just wanted to know how read repair behaves in my configuration (or is it doing anything at all). 2018-04-21 14:04 GMT+02:00 Rahul Singh <rahul.xavier.si...@gmail.com>: > Read repairs are one anti-entropy measure. Continuous repairs i

Re: read repair with consistency one

2018-04-21 Thread Rahul Singh
Read repairs are one anti-entropy measure. Continuous repairs are another. If you do repairs via Reaper or your own method it will resolve your discrepancies. On Apr 21, 2018, 3:16 AM -0400, Grzegorz Pietrusza <gpietru...@gmail.com>, wrote: > Hi all > > I'm a bit confused with

read repair with consistency one

2018-04-21 Thread Grzegorz Pietrusza
Hi all I'm a bit confused with how read repair works in my case, which is: - multiple DCs with RF 1 (NetworkTopologyStrategy) - reads with consistency ONE The article #1 says that read repair in fact runs RF reads for some percent of the requests. Let's say I have read_repair_chance = 0.1. Does

Re: Blocking read repair giving consistent data but not repairing existing data

2017-12-11 Thread Michael Semb Wever
y, 2 clustering key > for a row but 3 other normal values are null. > > When doing consistency level all query we get complete view of the row and > in the tracing output it says that inconsistency found in digest and read > repair is sent out to the nodes. > <*Exac

Re: Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-27 Thread Jeff Jirsa
way to know for which table the > DigestMismatchException happens? > No, the read repair stats we provide are not per table, so if it’s not in the log, it’s not apparent. Feel free to open a jira to ask for it to be added to the log message. > Can the AsyncRepairRunner be triggered if

Re: Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-27 Thread Artur Siekielski
be triggered if read and writes for all other tables are done with CL=LOCAL_QUORUM (RF=3)? I assumed in that case async read repair is not done even if dclocal_read_repair_chance > 0. Could it be that the async repair runs for that case and it's executed faster than the background syncing to meet R

Re: Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-26 Thread Artur Siekielski
It was set to the default 99PERCENTILE, I changed it to NONE but the exceptions are still logged (for the same table). I'm assuming node restarts are not required for that ALTER. On 10/26/2017 05:13 PM, Jeff Jirsa wrote: Is speculative retry enabled?

Re: Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-26 Thread Jeff Jirsa
Is speculative retry enabled? -- Jeff Jirsa > On Oct 26, 2017, at 3:19 AM, Artur Siekielski <a...@vhex.net> wrote: > > Hi, > > we have one table for which reads and writes are done with CL=ONE. The table > contains counters. We wanted to disable async read repair for

Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-26 Thread Artur Siekielski
Hi, we have one table for which reads and writes are done with CL=ONE. The table contains counters. We wanted to disable async read repair for the table (to lessen cluster load and to avoid DigestMismatchExceptions in debug.log). After altering the table with read_repair_chance=0

Re: Does async read repair happen when using CL.LOCAL_QUORUM?

2017-09-25 Thread Lutaya Shafiq Holmes
; > `If (blockfor < endpoints.size() && n == endpoints.size())` > > > Whereas n is the received data from endpoints in local datacenter. > > In that case, the async repair runner won’t be created, thus only foreground > read repair is possible to happen (when DigestM

Does async read repair happen when using CL.LOCAL_QUORUM?

2017-09-24 Thread 孟靖
he received data from endpoints in local datacenter. In that case, the async repair runner won’t be created, thus only foreground read repair is possible to happen (when DigestMismatchException is raised) when CL = LOCAL_QUORUM. Is it true, or am I missing something here? Btw, the cassan

Re: Tombstones impact on repairs both anti-entropy and read repair

2016-11-16 Thread Alain RODRIGUEZ
Hi, > My question to the community is will tombstone cause issues in data > consistency across the DCs. It might, if your repairs are not succeeding for some reason or not running fully (all the token ranges) within gc_grace_seconds (a table-level parameter). I wrote a blog post and talked

Tombstones impact on repairs both anti-entropy and read repair

2016-11-14 Thread K F
Hi Folks, I have a table that has a lot of tombstones generated, which has caused inconsistent data across various datacenters. We run anti-entropy repairs and also have read_repair_chance tuned up during our non-busy hours. But yet when we try to compare data residing in various replicas across

Can I monitor Read Repair from the logs

2016-11-04 Thread James Rothering
What should I grep for in the logs to see if read repair is happening on a table?

RE: Question on Read Repair

2016-11-03 Thread Anubhav Kale
Subject: Re: Question on Read Repair Yes: https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L286 From: Anubhav Kale <anubhav.k...@microsoft.com> Rep

Re: Scenarios when blocking read repair takes place

2016-10-17 Thread siddharth verma
> Mankapur? > > Krishna > > On Oct 14, 2016 12:15 PM, "siddharth verma" <sidd.verma29.l...@gmail.com> > wrote: > >> Hi, >> Does blocking read repair take place only when we read on the primary key >> or >> does it take place in the

Re: Scenarios when blocking read repair takes place

2016-10-15 Thread Krishna Chandra Prajapati
Hi which side is this? Mankapur? Krishna On Oct 14, 2016 12:15 PM, "siddharth verma" <sidd.verma29.l...@gmail.com> wrote: > Hi, > Does blocking read repair take place only when we read on the primary key > or > does it take place in the following scenarios as w

Scenarios when blocking read repair takes place

2016-10-14 Thread siddharth verma
Hi, Does blocking read repair take place only when we read on the primary key or does it take place in the following scenarios as well? Consistency ALL 1. select * from ks.table_name 2. select * from ks.table_name where token(pk) >= ? and token(pk) <= ? While using manual paging or aut

Re: Question on Read Repair

2016-10-11 Thread Jeff Jirsa
Date: Tuesday, October 11, 2016 at 11:45 AM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: RE: Question on Read Repair Thank you. Interesting detail. Does it work the same way for other consistency levels as well ? From: Jeff Jirsa [mailto:jeff

RE: Question on Read Repair

2016-10-11 Thread Anubhav Kale
Thank you. Interesting detail. Does it work the same way for other consistency levels as well ? From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com] Sent: Tuesday, October 11, 2016 10:29 AM To: user@cassandra.apache.org Subject: Re: Question on Read Repair If the failuredetector knows

Re: Question on Read Repair

2016-10-11 Thread Edward Capriolo
art a read process. One of the three nodes may not respond within the read timeout window. Call the end of the read timeout window time('3) Note: Anti-entropy read-repair like Read repair is set to only happen on a fraction of requests. Note: Anti-entropy read-repair is (async) not guaranteed not retried (

Re: Question on Read Repair

2016-10-11 Thread Jeff Jirsa
assandra.apache.org> Date: Tuesday, October 11, 2016 at 10:24 AM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: Question on Read Repair Hello, This is more of a theory / concept question. I set CL=ALL and do a read. Say one replica was down, will the res

Question on Read Repair

2016-10-11 Thread Anubhav Kale
Hello, This is more of a theory / concept question. I set CL=ALL and do a read. Say one replica was down, will the rest of the replicas get repaired as part of this ? (I am hoping the answer is yes). Thanks !

Blocking read repair giving consistent data but not repairing existing data

2016-05-12 Thread Bhuvan Rawal
for a row but 3 other normal values are null. When doing consistency level all query we get complete view of the row and in the tracing output it says that inconsistency found in digest and read repair is sent out to the nodes. <*Exact error in tracing : Digest misma

Re: Read Repair

2015-07-08 Thread Robert Coli
on remaining nodes. As there is no Rollback, Node1 row attributes will remain new state, State2 and rest of the nodes row will have old state, State1. If I do a Read and Cassandra detects state difference, it will issue a Read repair which will result in new state, State2 being propagated to other

Read Repair

2015-07-08 Thread Saladi Naidu
state, State2 and rest of the nodes row will have old state, State1. If I do a Read and Cassandra detects state difference, it will issue a Read repair which will result in new state, State2 being propagated to other nodes. But from an application point of view the update never happened because

RE: Read Repair

2015-07-08 Thread Ashic Mahtab
-0700 Subject: Re: Read Repair From: rc...@eventbrite.com To: user@cassandra.apache.org; naidusp2...@yahoo.com On Wed, Jul 8, 2015 at 2:07 PM, Saladi Naidu naidusp2...@yahoo.com wrote: Suppose I have a row of existing data with set of values for attributes I call this State1, and issue an update

RE: Read Repair in cassandra

2015-04-08 Thread Jan Karlsson
The request would return with the latest data. The read request would fire against node 1 and node 3. The coordinator would get answers from both and would merge the answers and return the latest. Then read repair might run to update node 3. QUORUM does not take into consideration whether
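
The "merge the answers and return the latest" step can be pictured with a toy model. This is only a sketch under simplifying assumptions: each replica answers with a single value and a write timestamp, whereas real Cassandra compares digests first and reconciles per cell. The `Response` type and `resolve` function are hypothetical names used for illustration.

```python
from typing import List, NamedTuple, Tuple


class Response(NamedTuple):
    replica: str
    value: str
    timestamp: int  # write timestamp; higher means newer


def resolve(responses: List[Response]) -> Tuple[Response, List[str]]:
    """Return the newest response and the replicas that returned older data."""
    latest = max(responses, key=lambda r: r.timestamp)
    # Replicas with an older timestamp are the ones read repair would update.
    stale = [r.replica for r in responses if r.timestamp < latest.timestamp]
    return latest, stale


if __name__ == "__main__":
    answers = [Response("node1", "new", 2), Response("node3", "old", 1)]
    latest, stale = resolve(answers)
    print(latest.value, stale)
```

In the scenario from the thread, node1 holds the newer write, so the client sees the new value and node3 is the candidate for repair.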

Read Repair in cassandra

2015-04-07 Thread ankit tyagi
Hi All, I have a doubt regarding read repair while reading data. I am using QUORUM for both read and write operations with RF 3 for strong consistency. Suppose that during a write node1 and node2 replicate the data, but it doesn't get replicated to node3 because of various factors. coordinator node

Re: read repair across DC and latency

2014-11-21 Thread Tyler Hobbs
the latency spike when we have large number of same cql hitting the server? I doubt read repair is related. I would try tracing a few of your queries. -- Tyler Hobbs DataStax http://datastax.com/

Re: read repair across DC and latency

2014-11-19 Thread Jimmy Lin
, Nov 16, 2014 at 5:13 PM, Jimmy Lin y2klyf+w...@gmail.com wrote: I have read that read repair suppose to be running as background, but does the co-ordinator node need to wait for the response(along with other normal read tasks) before return the entire result back to the caller? For the 10

Re: read repair across DC and latency

2014-11-18 Thread Tyler Hobbs
On Sun, Nov 16, 2014 at 5:13 PM, Jimmy Lin y2klyf+w...@gmail.com wrote: I have read that read repair suppose to be running as background, but does the co-ordinator node need to wait for the response(along with other normal read tasks) before return the entire result back to the caller

read repair across DC and latency

2014-11-16 Thread Jimmy Lin
I have a CF that uses the defaults, read_repair_chance (0.1) and dc_read_repair_chance (0). Our reads and writes are all local_quorum, on one of the 2 DCs, with replication of 3. So a read will have a 10% chance of triggering a read repair to the other DC. # I have read that read repair is supposed to be running

Re: Understanding about Cassandra read repair with QUORUM

2014-01-16 Thread Aaron Morton
I have following understanding about Cassandra read repair: Read Repair is an automatic process that reads from more nodes than necessary during a normal read and checks and repairs differences in the background. It’s different to “repair” or Anti Entropy that you run with nodetool repair

Understanding about Cassandra read repair with QUORUM

2014-01-11 Thread chovatia jaydeep
Hi, I have following understanding about Cassandra read repair: * If we write with QUORUM and read with QUORUM then we do not need to externally (nodetool) trigger read repair.  * Since we are reading + writing with QUORUM then it is safe to set read_repair_chance=0

Re: Read repair

2013-10-31 Thread Aaron Morton
, there is no quorum until failed rack comes back up. Hope this explains the scenario. From: Aaron Morton Sent: 10/28/2013 2:42 AM To: Cassandra User Subject: Re: Read repair As soon as it came back up, due to some human error, rack1 goes down. Now for some rows it is possible that Quorum cannot

Re: Read repair

2013-10-31 Thread Baskar Duraikannu
Yes, it helps. Thanks --- Original Message --- From: Aaron Morton aa...@thelastpickle.com Sent: October 31, 2013 3:51 AM To: Cassandra User user@cassandra.apache.org Subject: Re: Read repair (assuming RF 3 and NTS is putting a replica in each rack) Rack1 goes down and some writes happen

RE: Read repair

2013-10-29 Thread Baskar Duraikannu
hour and 30 mins, there is no quorum until failed rack comes back up. Hope this explains the scenario. From: Aaron Morton aa...@thelastpickle.com Sent: 10/28/2013 2:42 AM To: Cassandra User user@cassandra.apache.org Subject: Re: Read repair As soon

Re: manual read repair

2013-10-28 Thread Aaron Morton
We have seen read repair take very long time even for few GBs Read Repair is a process that runs during a read to repair differences in the background. It’s active on (by default) 10% of the reads. I assume you mean nodetool repair (aka anti entropy). It runs in two phases, first

Re: Read repair

2013-10-28 Thread Aaron Morton
of the nodes available and would be able to achieve a QUORUM. Just to minimize the issues, we are thinking of running read repair manually every night. If you are reading and writing at QUORUM and the cluster does not have a QUORUM of nodes available writes will not be processed. During reads any

Read repair

2013-10-25 Thread Baskar Duraikannu
for some rows it is possible that Quorum cannot be established. Just to minimize the issues, we are thinking of running read repair manually every night. Is this a good idea? How often do you perform read repair on your cluster?

manual read repair

2013-10-25 Thread Baskar Duraikannu
We have seen read repair take very long time even for few GBs of data even though we don't see disk or network bottlenecks. Do you use any specific configuration to speed up read repairs?

Re: Waiting on read repair?

2013-03-20 Thread aaron morton
. With a ConsistencyLevel of quorum, does FBUtilities.waitForFutures() wait for read repair to complete before returning? No That's just a utility method. Nothing on the read path waits for Read Repair, and controlled by read_repair_chance CF property, it's all async to the client request. There is no CL

Re: Waiting on read repair?

2013-03-19 Thread aaron morton
) Thanks again, Jasdeep On Mon, Mar 18, 2013 at 10:24 AM, aaron morton aa...@thelastpickle.com wrote: 1. With a ConsistencyLevel of quorum, does FBUtilities.waitForFutures() wait for read repair to complete before returning? No That's just a utility method. Nothing on the read

Re: Waiting on read repair?

2013-03-19 Thread Jasdeep Hundal
On Mon, Mar 18, 2013 at 10:24 AM, aaron morton aa...@thelastpickle.com wrote: 1. With a ConsistencyLevel of quorum, does FBUtilities.waitForFutures() wait for read repair to complete before returning? No That's just a utility method. Nothing on the read path waits for Read Repair

Re: Waiting on read repair?

2013-03-18 Thread aaron morton
1. With a ConsistencyLevel of quorum, does FBUtilities.waitForFutures() wait for read repair to complete before returning? No That's just a utility method. Nothing on the read path waits for Read Repair, and controlled by read_repair_chance CF property, it's all async to the client request

Re: Waiting on read repair?

2013-03-18 Thread Jasdeep Hundal
, 2013 at 10:24 AM, aaron morton aa...@thelastpickle.com wrote: 1. With a ConsistencyLevel of quorum, does FBUtilities.waitForFutures() wait for read repair to complete before returning? No That's just a utility method. Nothing on the read path waits for Read Repair, and controlled

Waiting on read repair?

2013-03-15 Thread Jasdeep Hundal
I've got a couple of questions related to issues I'm encountering using Cassandra under a heavy write load: 1. With a ConsistencyLevel of quorum, does FBUtilities.waitForFutures() wait for read repair to complete before returning? 2. When read repair applies a mutation, it needs to obtain a lock

Re: Read-repair working, repair not working?

2013-02-11 Thread Brian Fleming
. i.e. QUORUM write and read. Because you are using RF 2 per DC I assume you are not using LOCAL_QUORUM because that is 2 and you would not have any redundancy in the DC. - Would increasing logging level to ‘DEBUG’ show read-repair activity (to confirm that this is happening

Re: Read-repair working, repair not working?

2013-02-11 Thread aaron morton
level to ‘DEBUG’ show read-repair activity (to confirm that this is happening, when for what proportion of total requests)? It would, but the INFO logging for the AES is pretty good. I would hold off for now. - Is there something obvious that I could be missing here? When

Read-repair working, repair not working?

2013-02-10 Thread Brian Fleming
consistency availability: I’d request data, nothing would be returned, I would then re-request the data and it would correctly be returned: i.e. read-repair appeared to be occurring. However running repairs on the nodes didn’t resolve this (I tried general ‘*repair’* commands as well as targeted

Re: Read-repair working, repair not working?

2013-02-10 Thread aaron morton
. QUORUM write and read. Because you are using RF 2 per DC I assume you are not using LOCAL_QUORUM because that is 2 and you would not have any redundancy in the DC. - Would increasing logging level to ‘DEBUG’ show read-repair activity (to confirm that this is happening, when

Re: neither 'nodetool repair' nor 'hinted handoff/read repair' work for secondary indexes

2013-02-05 Thread Alexei Bakanov
; Authenticated to keyspace: testks [default@testks] get cf1 where 'indexedColumn'='userId_256'; 0 Row Returned. Elapsed time: 47 msec(s). $ python fetcher_repair.py (running one more time in hope that 'read repair' kicked in after the last query, but unfortunately no) 254 255 256 Traceback

neither 'nodetool repair' nor 'hinted handoff/read repair' work for secondary indexes

2013-02-01 Thread Alexei Bakanov
: missing rows for userId 256, data length is 0 $ ccm cli [default@unknown] use testks; Authenticated to keyspace: testks [default@testks] get cf1 where 'indexedColumn'='userId_256'; 0 Row Returned. Elapsed time: 47 msec(s). $ python fetcher_repair.py (running one more time in hope that 'read repair

read-repair and deletes / forgotten deletes

2012-10-01 Thread Hiller, Dean
I know there is a 10 day limit if you have a node out of the cluster where you better be running read-repair or you end up with forgotten deletes, but what about on a clean cluster with all nodes always available? Shouldn't the deletes eventually take place or does one have to keep running

Re: read-repair and deletes / forgotten deletes

2012-10-01 Thread Aaron Turner
limit if you have a node out of the cluster where you better be running read-repair or you end up with forgotten deletes, but what about on a clean cluster with all nodes always available? Shouldn't the deletes eventually take place or does one have to keep running read-repair manually all

Re: read-repair and deletes / forgotten deletes

2012-10-01 Thread Hiller, Dean
to run repair once per/gc_grace period. You won't see empty/deleted rows go away until they're compacted away. On Mon, Oct 1, 2012 at 6:32 PM, Hiller, Dean dean.hil...@nrel.gov wrote: I know there is a 10 day limit if you have a node out of the cluster where you better be running read-repair
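
The rule of thumb "run repair once per gc_grace period" can be expressed as a simple check. This is a minimal sketch: the default of 864000 seconds (10 days) corresponds to the "10 day limit" mentioned in the thread, and the function name and the way the last repair time is obtained are assumptions, not something Cassandra exposes under this name.

```python
from datetime import datetime, timedelta

# 10 days, the default gc_grace_seconds and the "10 day limit" from the thread.
DEFAULT_GC_GRACE_SECONDS = 864_000


def repair_overdue(last_repair: datetime, now: datetime,
                   gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """True if more than gc_grace_seconds have passed since the last full repair."""
    return now - last_repair > timedelta(seconds=gc_grace_seconds)


if __name__ == "__main__":
    last = datetime(2012, 10, 1)
    print(repair_overdue(last, datetime(2012, 10, 5)))   # within the window
    print(repair_overdue(last, datetime(2012, 10, 12)))  # past the window
```

If the check returns True for a table that receives deletes, tombstones may already have been collected on some replicas before every replica saw them, which is the "forgotten delete" risk discussed in this thread.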

Re: read-repair and deletes / forgotten deletes

2012-10-01 Thread Hiller, Dean
. On Mon, Oct 1, 2012 at 6:32 PM, Hiller, Dean dean.hil...@nrel.gov wrote: I know there is a 10 day limit if you have a node out of the cluster where you better be running read-repair or you end up with forgotten deletes, but what about on a clean cluster with all nodes always available? Shouldn't

Re: read-repair and deletes / forgotten deletes

2012-10-01 Thread Aaron Turner
inline... On Mon, Oct 1, 2012 at 7:46 PM, Hiller, Dean dean.hil...@nrel.gov wrote: Thanks, (actually knew it was configurable) BUT what I don't get is why I have to run a repair. IF all nodes became consistent on the delete, it should not be possible to get a forgotten delete, correct? The

Re: Question on Read Repair

2012-09-18 Thread Vijay
is if I do reads at LOCAL_QUORUM in DC1, will read repair happen on the replicas in DC2? Thanks -Raj

Question on Read Repair

2012-09-16 Thread Raj N
Hi, I have a 2 DC setup(DC1:3, DC2:3). All reads and writes are at LOCAL_QUORUM. The question is if I do reads at LOCAL_QUORUM in DC1, will read repair happen on the replicas in DC2? Thanks -Raj

Re: Node crashing during read repair

2012-07-03 Thread Robin Verlangen
Hi Aaron, This was the first error. It occurred a couple of times after this. We did a hardware upgrade on the server and increased the max heap size. Now running fine. Seems that 1.1.1 uses a little more memory, or our data set just grew ;-) Thank you for your time! 2012/7/3 aaron morton

Re: Node crashing during read repair

2012-07-02 Thread aaron morton
Is this still an issue ? It looks like something shut down the messaging service. Was there anything else in the logs ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 28/06/2012, at 3:49 AM, Robin Verlangen wrote: Hi there, Today

Node crashing during read repair

2012-06-27 Thread Robin Verlangen
Hi there, Today I found one node (running 1.1.1 in a 3 node cluster) being dead for the third time this week, it died with the following message: ERROR [ReadRepairStage:3] 2012-06-27 14:28:30,929 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ReadRepairStage:3,5,main]

Re: read-repair?

2012-02-04 Thread Mr.Quintero

Re: read-repair?

2012-02-02 Thread Guy Incognito
failed (since it only made it to one node), so this is not a violation of the contract. Once node 2 and/or 3 return their response, read repair (if it is active) will cause re-read and reconciliation followed by a row mutation being sent to the nodes to correct the column. do i get the clock 5

Re: read-repair?

2012-02-02 Thread Peter Schuller
sorry to be dense, but which is it?  do i get the old version or the new version?  or is it indeterminate? Indeterminate, depending on which nodes happen to be participating in the read. Eventually you should get the new version, unless the node that took the new version permanently crashed
