Hi All,
We are seeing Read-Repair stage failures with the error Digest
Mismatch, at a rate of 300+ per day per node.
At the same time, nodes are getting overloaded for a quick couple of
seconds due to long GC pauses (around 7-8 seconds). We
are not running a repair.
2020 at 17:27
To: cassandra
Subject: Re: Could a READ REPAIR really be triggered even if there avg 80 ms
between calls
Yes, it's possible. A typical JVM GC pause for most configs is on the order of
50-200ms. If you have a host do a small collection/pause, then the read at #4
is basically racing the replication of the write.
>2. Data replicated by Cassandra, but will not finish before (4) below
>3. Wait 80 ms on average
>4. Data read again with QUORUM i.e asking for atleast 2 out of 3 nodes
>for result, and now ONE replies with inaccurate data
>5. (4) triggers a READ REPAIR
>6. The READ REPAIR now synchs to ALL nodes also in DC2
Did you mean LOCAL_QUORUM? Because QUORUM will require 4 out of 6 replicas,
not 2 out of 3. :) But it sounds like you are using QUORUM because you said
it syncs to all nodes in DC2.
To answer your question, RR *can* be triggered if you're reading before the
replicas are *eventually* consistent.
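For reference, the quorum arithmetic being discussed can be sketched like this (a simple model of the majority rule, not Cassandra's actual implementation):

```python
def quorum(replicas: int) -> int:
    # A quorum is a majority of replicas: floor(replicas / 2) + 1.
    return replicas // 2 + 1

# QUORUM spans all replicas in all DCs; LOCAL_QUORUM only the local DC.
rf_per_dc = 3
dcs = 2
print(quorum(rf_per_dc * dcs))  # QUORUM with RF 3 in 2 DCs -> 4 of 6
print(quorum(rf_per_dc))        # LOCAL_QUORUM with RF 3    -> 2 of 3
```

This is why, with RF 3 in each of two DCs, QUORUM needs 4 of 6 replicas while LOCAL_QUORUM needs only 2 of 3.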
4. Data read again with QUORUM i.e asking for at least 2 out of 3 nodes for
result, and now ONE replies with inaccurate data
5. (4) triggers a READ REPAIR
6. The READ REPAIR now synchs to ALL nodes also in DC2
So my question is: Is it really possible that Cassandra within 80 ms is not
able to replicate to all 3 nodes?
You can check for the string "digest mismatch" in the logs. Similarly, you
can track the RR stats in nodetool netstats and the dropped mutations
in nodetool tpstats.
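Counting those "digest mismatch" lines per day, as suggested above, can be sketched in a few lines of Python (the log layout is taken from the DEBUG line shown later in this thread):

```python
from collections import Counter

def mismatches_per_day(log_lines):
    """Count lines mentioning a digest mismatch, grouped by date.

    Assumes the standard Cassandra log layout where the date is the
    third whitespace-separated field, e.g.:
    DEBUG [ReadRepairStage:346] 2020-03-08 08:09:12,959 ReadCallback.java:242 - Digest mismatch: ...
    """
    counts = Counter()
    for line in log_lines:
        if "digest mismatch" in line.lower():
            date = line.split()[2]  # e.g. '2020-03-08'
            counts[date] += 1
    return counts

lines = [
    "DEBUG [ReadRepairStage:346] 2020-03-08 08:09:12,959 ReadCallback.java:242 - Digest mismatch: ...",
    "DEBUG [ReadRepairStage:347] 2020-03-08 09:10:11,100 ReadCallback.java:242 - Digest mismatch: ...",
    "INFO  [main] 2020-03-08 09:10:12,000 StorageService.java - unrelated line",
]
print(mismatches_per_day(lines))  # Counter({'2020-03-08': 2})
```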
To be clear though, RRs are a side-effect of nodes either dropping
mutations or being unresponsive so they miss mutations. RRs do
Thanx Erick
Is there a way to turn on tracing based on certain criteria?
I would like to start tracing when there is some sort of failure, i.e. in this
case when a READ REPAIR is triggered, as I would like to know why we sometimes
can’t reach one of the nodes.
-Tobias
From: Erick Ramirez
> READ 1 with Local Quorum : SELECT * FROM products WHERE id = ABC123
>
> READ 2 with Local One : SELECT * FROM products WHERE id = ABC123
>
>
>
> Would read (2) be blocked by the READ REPAIR that was done by read (1)
>
> As I understand that the read repair is working not on the whole table but
> on the partition key it had problems with
Hi Tobias
READ 2 will not be blocked by the read repair of READ 1.
Regards
Manish
On Tue, Aug 11, 2020 at 6:02 PM Tobias Eriksson
wrote:
> Thanx Erick,
>
> Perhaps this is super obvious but I need a confirmation as you say “…not
> subsequent reads for other data unrelated to the read being repaired…”
READ 1 with Local Quorum : SELECT * FROM products WHERE id = ABC123
READ 2 with Local One : SELECT * FROM products WHERE id = ABC123
Would read (2) be blocked by the READ REPAIR that was done by read (1)
As I understand that the read repair is working not on the whole table but on
the partition key it had problems with
-Tobias
From: Erick Ramirez
>
> If a READ triggers a READ REPAIR, and then if we do an additional READ
> would then that BLOCK until the “first” READ REPAIR would be done ?
>
> -Tobias
>
Not all read repairs are blocking RRs (aka foreground RRs). There are also
background RRs which by definition are non-blocking.
If a READ triggers a READ REPAIR, and then if we do an additional READ would
then that BLOCK until the “first” READ REPAIR would be done ?
-Tobias
From: Jeff Jirsa
Reply to: "user@cassandra.apache.org"
Date: Tuesday, 11 August 2020 at 07:30
To: cassandra
Subject: Re: Why a R
Your schema may have read repair (non-blocking, background) set to 10%
(0.1, for dclocal).
You may have GC pauses causing writes (or reads) to be delayed.
You may be hitting a cassandra bug.
Would need the `TRACING` output to know for sure.
On Mon, Aug 10, 2020 at 10:10 PM Tobias Eriksson
Hi
We have a Cassandra solution with 2 DCs where each DC has >30 nodes
From time to time we see problems with READ REPAIR, but I am stuck with the
analysis
We have a pattern for these faults where we do
1. INSERT with Local Quorum (2 out of 3)
2. Wait for 0.5 - 1 seconds time window
Hi Gil,
All the logging is controlled via logback. You can change the level of any type
of message.
Take a look here for some more details:
https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/configuration/configLoggingLevels.html
That's one option, but I wish there were a way to disable just that and not
the entire debug log level; there are some things there I would like to
keep.
On Sun, Mar 8, 2020 at 6:41 PM Jeff Jirsa wrote:
> There are likely two log configs - one for debug.log and one for
> system.log. Disable the debug.log one, or change org.apache.cassandra.service
> to log at INFO instead.
There are likely two log configs - one for debug.log and one for system.log.
Disable the debug.log one, or change org.apache.cassandra.service to log at
INFO instead
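In logback.xml terms, Jeff's suggestion amounts to something like this (a sketch; the logger name is taken from the class in the stack traces in this thread, so verify it against your own log lines):

```xml
<!-- Raise org.apache.cassandra.service to INFO so the per-read
     digest-mismatch DEBUG lines no longer reach debug.log,
     while the rest of DEBUG logging stays intact. -->
<logger name="org.apache.cassandra.service" level="INFO"/>
```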
Nobody needs to see every digest mismatch and that someone thought this was a
good idea is amazing to me. Someone should jira
Thanks Shalom, I know why these read repairs are happening, and they will
continue to happen for some time, even if I will run a full repair.
I would like to disable these warning messages.
On Sun, Mar 8, 2020 at 10:19 AM Shalom Sagges
wrote:
Hi Gil,
You can run a full repair on your cluster. But if these messages come back
again, you need to check what's causing these data inconsistencies.
On Sun, Mar 8, 2020 at 10:11 AM Gil Ganz wrote:
Hey all,
I have a lot of debug messages about read repairs in my debug log:
DEBUG [ReadRepairStage:346] 2020-03-08 08:09:12,959 ReadCallback.java:242 -
Digest mismatch:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key
DecoratedKey(-28476014476640,
doesn't seem to be the same, it looks like just less than 10% of the read
traffic. the query i originally posted was one that we captured and used as
an example. every time i would run it at local_quorum, all, quorum... it
would do a read repair. the record hasn't been updated for a long time
>> From: Patrick Lee [mailto:patrickclee0...@gmail.com]
>> Sent: Wednesday, October 16, 2019 12:22 PM
>> To: user@cassandra.apache.org
>> Subject: Re: Constant blocking read repair for such a tiny table

haven't really figured this out yet. it's not a big problem but it is annoying
for sure! the cluster was upgraded a while back from 2.1.16.
only 1 table, out of all the ones on the cluster, has this behavior. repair
has been run a few times via reaper. even did a nodetool compact on the
nodes (since this table is like 1GB per node). just don't see why there
would be any inconsistency that would trigger read repair.
any insight you may have would be appreciated.
50ms. the only odd thing i see is just that there are constant read repairs
that follow the same traffic pattern on the reads, which shows constant writes
on the table (from the read repairs), which after read repair or just normal
full repairs (all full through reaper, never ran any incremental repair) i
would expect it to not have any mismatches.
Hi Cassandra users,
Recently on some of our production clusters we have run into the following
error:
2019-10-11 15:14:46,803 DataResolver.java:507 - Encountered an oversized (x/y)
read repair mutation for table.
Which is described in this jira:
https://issues.apache.org/jira/browse
> the other 5 tables they use on the cluster can h
To: user@cassandra.apache.org
Subject: Constant blocking read repair for such a tiny table
I have a cluster that is running 3.11.4 ( was upgraded a while back from
2.1.16 ). what I see is a steady rate of read repair which is about 10%
constantly on only this 1 table. Repairs have been run (actually several
times). The table does not have a lot of writes to it so after repair
Hi Ben
Thanks a lot. From my analysis of the code it looks like you are right.
When global read repair kicks in all live endpoints are queried for data,
regardless of consistency level. Only EACH_QUORUM is treated differently.
Cheers
Grzegorz
2018-04-22 1:45 GMT+02:00 Ben Slater <ben.
Ben
On Sat, 21 Apr 2018 at 22:20 Grzegorz Pietrusza <gpietru...@gmail.com>
wrote:
I haven't asked about "regular" repairs. I just wanted to know how read
repair behaves in my configuration (or is it doing anything at all).
2018-04-21 14:04 GMT+02:00 Rahul Singh <rahul.xavier.si...@gmail.com>:
Read repairs are one anti-entropy measure. Continuous repairs are another. If
you do repairs via Reaper or your own method it will resolve your discrepancies.
On Apr 21, 2018, 3:16 AM -0400, Grzegorz Pietrusza <gpietru...@gmail.com>,
wrote:
Hi all
I'm a bit confused with how read repair works in my case, which is:
- multiple DCs with RF 1 (NetworkTopologyStrategy)
- reads with consistency ONE
The article #1 says that read repair in fact runs RF reads for some percent
of the requests. Let's say I have read_repair_chance = 0.1. Does
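The chance-based behaviour being asked about can be modelled roughly like this (a simplified sketch of the pre-4.0 per-request dice roll, not the actual ReadExecutor logic):

```python
import random

def extra_replicas_to_read(rng: random.Random,
                           read_repair_chance: float,
                           dc_read_repair_chance: float) -> str:
    """Roughly model chance-based read repair in pre-4.0 Cassandra.

    On each read the coordinator rolls a die: with probability
    read_repair_chance it reads from ALL replicas (all DCs), else with
    probability dc_read_repair_chance from all replicas in the local DC,
    else only from the replicas the consistency level requires.
    """
    roll = rng.random()
    if roll < read_repair_chance:
        return "all replicas, all DCs"
    if roll < read_repair_chance + dc_read_repair_chance:
        return "all replicas, local DC"
    return "only CL-required replicas"

rng = random.Random(42)
outcomes = [extra_replicas_to_read(rng, 0.1, 0.0) for _ in range(10_000)]
print(outcomes.count("all replicas, all DCs") / len(outcomes))  # ~0.1
```

So with read_repair_chance = 0.1 roughly one read in ten compares all replicas, regardless of the consistency level of the read itself.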
y, 2 clustering key
> for a row but 3 other normal values are null.
>
> When doing consistency level all query we get complete view of the row and
> in the tracing output it says that inconsistency found in digest and read
> repair is sent out to the nodes.
> <*Exac
> Is there a way to know for which table the
> DigestMismatchException happens?
>
No, the read repair stats we provide are not per table, so if it’s not in the
log, it’s not apparent. Feel free to open a jira to ask for it to be added to
the log message.
Can the AsyncRepairRunner be triggered if reads and writes for all other
tables are done with CL=LOCAL_QUORUM (RF=3)? I assumed in that case
async read repair is not done even if dclocal_read_repair_chance > 0.
Could it be that the async repair runs for that case and it's executed
faster than the background syncing to meet R
It was set to the default 99PERCENTILE, I changed it to NONE but the
exceptions are still logged (for the same table). I'm assuming node
restarts are not required for that ALTER.
On 10/26/2017 05:13 PM, Jeff Jirsa wrote:
Is speculative retry enabled?
--
Jeff Jirsa
> On Oct 26, 2017, at 3:19 AM, Artur Siekielski <a...@vhex.net> wrote:
Hi,
we have one table for which reads and writes are done with CL=ONE. The
table contains counters. We wanted to disable async read repair for the
table (to lessen cluster load and to avoid DigestMismatchExceptions in
debug.log). After altering the table with read_repair_chance=0
`If (blockfor < endpoints.size() && n == endpoints.size())`

Whereas n is the received data from endpoints in local datacenter.
In that case, the async repair runner won’t be created, thus only foreground
read repair is possible to happen (when DigestMismatchException is raised)
when CL = LOCAL_QUORUM.
Is it true, or am I missing something here?
Btw, the cassan
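The condition quoted above can be paraphrased in a small sketch (the variable names come from the quoted snippet; the surrounding interpretation is this thread's reading, not the actual source):

```python
def starts_async_repair_runner(blockfor: int, endpoints: int, n: int) -> bool:
    """Mirror of `if (blockfor < endpoints.size() && n == endpoints.size())`.

    blockfor  - replicas the consistency level must block for
    endpoints - replicas contacted for this read
    n         - responses received

    The async repair runner only starts when more replicas were
    contacted than the CL blocks for, and all of them responded.
    """
    return blockfor < endpoints and n == endpoints

# LOCAL_QUORUM with RF 3: if only the 2 required replicas are contacted,
# blockfor == endpoints and no async runner is created.
print(starts_async_repair_runner(blockfor=2, endpoints=2, n=2))  # False
print(starts_async_repair_runner(blockfor=2, endpoints=3, n=3))  # True
```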
Hi,
> My question to the community is will tombstone cause issues in data
> consistency across the DCs.
It might, if your repairs are not succeeding for some reason or not running
fully (all the token ranges) within gc_grace_seconds (a parameter at the table
level).
I wrote a blog post and talked
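The risk described here, a delete being "forgotten" when repair doesn't complete within gc_grace_seconds, can be sketched as a simple check (illustrative only, not Cassandra code):

```python
def tombstone_resurrection_risk(seconds_since_delete: int,
                                gc_grace_seconds: int,
                                all_replicas_repaired: bool) -> bool:
    """True if a tombstone may be purged before every replica saw it.

    Once gc_grace_seconds elapses the tombstone becomes eligible for
    compaction; if some replica never received the delete by then,
    the old value can come back (a "forgotten delete").
    """
    return seconds_since_delete > gc_grace_seconds and not all_replicas_repaired

gc_grace = 10 * 24 * 3600  # default 864000 s (10 days)
print(tombstone_resurrection_risk(11 * 24 * 3600, gc_grace, False))  # True
print(tombstone_resurrection_risk(2 * 24 * 3600, gc_grace, False))   # False
```

This is why the usual advice is to finish a full repair at least once per gc_grace_seconds window.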
Hi Folks,
I have a table that has a lot of tombstones generated, which has caused
inconsistent data across various datacenters. We run anti-entropy repairs and
also have read_repair_chance tuned up during our non-busy hours. But yet when
we try to compare data residing in various replicas across
What should I grep for in the logs to see if read repair is happening on a
table?
Subject: Re: Question on Read Repair
Yes:
https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L286
From: Anubhav Kale
<anubhav.k...@microsoft.com<mailto:anubhav.k...@microsoft.com>>
Hi which side is this?
Mankapur?
Krishna
On Oct 14, 2016 12:15 PM, "siddharth verma" <sidd.verma29.l...@gmail.com>
wrote:
Hi,
Does blocking read repair take place only when we read on the primary key or
does it take place in the following scenarios as well?
Consistency ALL
1. select * from ks.table_name
2. select * from ks.table_name where token(pk) >= ? and token(pk) <= ?
While using manual paging or auto paging?
Date: Tuesday, October 11, 2016 at 11:45 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: RE: Question on Read Repair
Thank you.
Interesting detail. Does it work the same way for other consistency levels as
well ?
From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com]
Sent: Tuesday, October 11, 2016 10:29 AM
To: user@cassandra.apache.org
Subject: Re: Question on Read Repair
If the failure detector knows
start a read process.
One of the three nodes may not respond within the read timeout window. Call
the end of the read timeout window time t3.
Note: Anti-entropy read-repair, like read repair, is set to only happen on a
fraction of requests.
Note: Anti-entropy read-repair is async, not guaranteed, and not retried (
Date: Tuesday, October 11, 2016 at 10:24 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Question on Read Repair
Hello,
This is more of a theory / concept question. I set CL=ALL and do a read. Say
one replica was down, will the rest of the replicas get repaired as part of
this ? (I am hoping the answer is yes).
Thanks !
for a row but 3 other normal values are null.
When doing consistency level all query we get complete view of the row and
in the tracing output it says that inconsistency found in digest and read
repair is sent out to the nodes.
<*Exact error in tracing : Digest misma
on remaining nodes. As there is no Rollback, Node1 row attributes will
remain in the new state, State2, and the rest of the nodes' rows will have the
old state, State1. If I do a Read and Cassandra detects the state difference,
it will issue a Read repair which will result in the new state, State2, being
propagated to other nodes. But from an application point of view the update
never happened because
Subject: Re: Read Repair
From: rc...@eventbrite.com
To: user@cassandra.apache.org; naidusp2...@yahoo.com
On Wed, Jul 8, 2015 at 2:07 PM, Saladi Naidu naidusp2...@yahoo.com wrote:
Suppose I have a row of existing data with set of values for attributes I call
this State1, and issue an update
The request would return with the latest data.
The read request would fire against node 1 and node 3. The coordinator would
get answers from both and would merge the answers and return the latest.
Then read repair might run to update node 3.
QUORUM does not take into consideration whether
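The coordinator behaviour described here, merging replica answers and returning the latest, boils down to last-write-wins on the write timestamp. A toy sketch (Cassandra reconciles per cell; this treats the whole row as one value):

```python
def merge_latest(responses):
    """Pick the value with the highest write timestamp.

    responses: list of (value, write_timestamp_micros) tuples
    from the replicas that answered the read.
    """
    return max(responses, key=lambda r: r[1])[0]

# node1 has the new value; node3 missed the write and still has the old one.
replies = [("new", 1700000000_000002), ("old", 1700000000_000001)]
print(merge_latest(replies))  # new
```

The read repair then writes the winning value back to the stale replica asynchronously.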
Hi All,
I have a doubt regarding read repair while reading data. I am using
QUORUM for both read and write operations with RF 3 for strong consistency.
Suppose while writing data node1 and node2 replicate the data but it doesn't
get replicated on node3 because of various factors. The coordinator node
the latency spike when we
have large number of same cql hitting the server?
I doubt read repair is related. I would try tracing a few of your queries.
--
Tyler Hobbs
DataStax http://datastax.com/
On Sun, Nov 16, 2014 at 5:13 PM, Jimmy Lin y2klyf+w...@gmail.com wrote:
I have read that read repair suppose to be running as background, but
does the co-ordinator node need to wait for the response(along with other
normal read tasks) before return the entire result back to the caller
I have a CF that use the default, read_repair_chance (0.1) and
dc_read_repair_chance(0).
Our read and write is all local_quorum, on one of the 2 DC, replication of
3.
so a read will have a 10% chance to trigger a read repair to the other DC.
I have the following understanding about Cassandra read repair:
Read Repair is an automatic process that reads from more nodes than necessary
during a normal read and checks and repairs differences in the background. It’s
different from “repair” or Anti Entropy that you run with nodetool repair.
Hi,
I have the following understanding about Cassandra read repair:
* If we write with QUORUM and read with QUORUM then we do not need to
externally (nodetool) trigger read repair.
* Since we are reading + writing with QUORUM then it is safe to set
read_repair_chance=0.
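The reasoning in the two bullets, that QUORUM writes plus QUORUM reads always overlap, is the classic R + W > RF rule; a minimal sketch:

```python
def overlap_guaranteed(r: int, w: int, rf: int) -> bool:
    # At least one replica is in both the read set and the write set
    # whenever R + W > RF, so a read always sees the latest write.
    return r + w > rf

rf = 3
q = rf // 2 + 1  # quorum = 2
print(overlap_guaranteed(q, q, rf))  # True  (QUORUM/QUORUM)
print(overlap_guaranteed(1, 1, rf))  # False (ONE/ONE)
```

When the rule holds, read repair only fixes staleness in the background; it is not needed for read correctness.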
, there is no quorum until failed rack comes back up.
Hope this explains the scenario.
From: Aaron Morton
Sent: 10/28/2013 2:42 AM
To: Cassandra User
Subject: Re: Read repair
As soon as it came back up, due to some human error, rack1 goes down. Now
for some rows it is possible that Quorum cannot
Yes, it helps. Thanks
--- Original Message ---
From: Aaron Morton aa...@thelastpickle.com
Sent: October 31, 2013 3:51 AM
To: Cassandra User user@cassandra.apache.org
Subject: Re: Read repair
(assuming RF 3 and NTS is putting a replica in each rack)
Rack1 goes down and some writes happen
We have seen read repair take a very long time even for a few GBs.
Read Repair is a process that runs during a read to repair differences in the
background. It’s active on (by default) 10% of the reads.
I assume you mean nodetool repair (aka anti entropy). It runs in two phases,
first
of the nodes available and would be
able to achieve a QUORUM.
Just to minimize the issues, we are thinking of running read repair manually
every night.
If you are reading and writing at QUORUM and the cluster does not have a QUORUM
of nodes available writes will not be processed. During reads any
for
some rows it is possible that Quorum cannot be established. Just to minimize
the issues, we are thinking of running read repair manually every night.
Is this a good idea? How often do you perform read repair on your cluster?
We have seen read repair take very long time even for few GBs of data even
though we don't see disk or network bottlenecks. Do you use any specific
configuration to speed up read repairs?
With a ConsistencyLevel of quorum, does
FBUtilities.waitForFutures() wait for read repair to complete before
returning?
No
That's just a utility method.
Nothing on the read path waits for Read Repair; controlled by the
read_repair_chance CF property, it's all async to the client request.
There is no CL
Thanks again,
Jasdeep
I've got a couple of questions related issues I'm encountering using
Cassandra under a heavy write load:
1. With a ConsistencyLevel of quorum, does
FBUtilities.waitForFutures() wait for read repair to complete before
returning?
2. When read repair applies a mutation, it needs to obtain a lock
i.e. QUORUM write and read. Because you are using RF 2 per DC I
assume you are not using LOCAL_QUORUM because that is 2 and you would not
have any redundancy in the DC.
- Would increasing logging level to ‘DEBUG’ show read-repair
activity (to confirm that this is happening, when for what proportion of
total requests)?
It would, but the INFO logging for the AES is pretty good. I would hold off
for now.
- Is there something obvious that I could be missing here?
When
consistency availability: I’d
request data, nothing would be returned, I would then re-request the data
and it would correctly be returned: i.e. read-repair appeared to be
occurring. However running repairs on the nodes didn’t resolve this (I
tried general ‘repair’ commands as well as targeted
: missing rows for userId 256, data length is 0
$ ccm cli
[default@unknown] use testks;
Authenticated to keyspace: testks
[default@testks] get cf1 where 'indexedColumn'='userId_256';
0 Row Returned.
Elapsed time: 47 msec(s).
$ python fetcher_repair.py (running one more time in hope that 'read
repair' kicked in after the last query, but unfortunately no)
254
255
256
Traceback
I know there is a 10 day limit if you have a node out of the cluster where you
better be running read-repair or you end up with forgotten deletes, but what
about on a clean cluster with all nodes always available? Shouldn't the
deletes eventually take place or does one have to keep running read-repair
manually all the time?
to run repair once per gc_grace period.
You won't see empty/deleted rows go away until they're compacted away.
On Mon, Oct 1, 2012 at 6:32 PM, Hiller, Dean dean.hil...@nrel.gov wrote:
inline...
On Mon, Oct 1, 2012 at 7:46 PM, Hiller, Dean dean.hil...@nrel.gov wrote:
Thanks, (actually knew it was configurable) BUT what I don't get is why I
have to run a repair. IF all nodes became consistent on the delete, it
should not be possible to get a forgotten delete, correct? The
Hi,
I have a 2 DC setup(DC1:3, DC2:3). All reads and writes are at
LOCAL_QUORUM. The question is if I do reads at LOCAL_QUORUM in DC1, will
read repair happen on the replicas in DC2?
Thanks
-Raj
Hi Aaron,
This was the first error. It occurred a couple of times after this. We did
a hardware upgrade on the server and increased the max heap size. Now it is
running fine. Seems that 1.1.1 uses a little more memory, or our data set
just grew ;-)
Thank you for your time!
2012/7/3 aaron morton
Is this still an issue ?
It looks like something shut down the messaging service. Was there anything
else in the logs ?
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 28/06/2012, at 3:49 AM, Robin Verlangen wrote:
Hi there,
Today I found one node (running 1.1.1 in a 3 node cluster) being dead for
the third time this week, it died with the following message:
ERROR [ReadRepairStage:3] 2012-06-27 14:28:30,929
AbstractCassandraDaemon.java (line 134) Exception in thread
Thread[ReadRepairStage:3,5,main]
failed (since it only made it to one node),
so this is not a violation of the contract.
Once node 2 and/or 3 return their response, read repair (if it is
active) will cause a re-read and re-conciliation followed by a row
mutation being sent to the nodes to correct the column.
do i get the clock 5
sorry to be dense, but which is it? do i get the old version or the new
version? or is it indeterminate?
Indeterminate, depending on which nodes happen to be participating in
the read. Eventually you should get the new version, unless the node
that took the new version permanently crashed