Hi all …
The DataStax & Apache docs are clear: run ‘nodetool repair’ after you alter a
keyspace to change its RF or replication strategy.
However, the details are all over the place as to what type of repair to run
and on which nodes. None of the above doc authorities are clear, and what you
find on the
Typically, when a read is submitted to C*, it may complete with …
1. No errors & returns expected data
2. Errors out with UnavailableException
3. No error & returns zero rows on the first attempt, but the row is returned
on subsequent runs.
The third scenario happens as a result of cluster entropy, especially
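For reference, the quorum size behind CL=LOCAL_QUORUM is floor(RF/2) + 1, which is what separates scenario 2 from the others; a minimal sketch:

```shell
# Quorum size for a given replication factor: floor(RF/2) + 1
rf=3
quorum=$(( rf / 2 + 1 ))
echo "RF=$rf: a LOCAL_QUORUM read or write needs $quorum replica acks"
```

With RF=3 that is 2 acks, so a token range can lose one replica before UnavailableException becomes possible.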
C*: 2.2.8
Write CL = LQ
Kspace RF = 3
Three racks
A write is received by node 1 in rack 1 under the above specs. Node 1 (rack1)
& node 2 (rack2) acknowledge it to the client.
Within some unit of time, node 1 & 2 die. Either ….
- Scenario 1: C* process death: Row did not make it to sstable (it is
nsistent data, given you can tolerate a bit of latency until your repair is
complete. If you go by the recommendation, i.e. to add one node at a time,
you’ll avoid all these nuances.
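The failure window above can be sketched as a toy calculation (not C* code; the numbers come straight from the scenario):

```shell
# RF=3, CL=LOCAL_QUORUM: the write was acked by 2 replicas, and those
# same 2 replicas then die before hints or repair can propagate the row
rf=3
quorum=$(( rf / 2 + 1 ))           # 2 acks were needed and obtained
acked=2                            # replicas that persisted the row
died=2                             # the acking replicas that then died
survivors_with_row=$(( acked - died ))
echo "replicas still holding the row: $survivors_with_row of $(( rf - died )) survivor(s)"
```

With zero surviving copies, a later quorum read can be satisfied entirely by replicas that never saw the write, which is why it legitimately returns no rows until a repair re-propagates the data.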
From: Fd Habash [mailto:fmhab...@gmail.com]
Sent: Wednesday, May 01, 2019 3:12 PM
To: user@cassandra.apache.org
Sub
not apply, as the replacing node just owns the token ranges of the dead node.
I think that’s why the restriction of only replacing one node at a time does
not apply in this case.
Thanks
Alok Dwivedi
Senior Consultant
https://www.instaclustr.com/platform/
From: Fd Habash
Repl
Reviewing the documentation & based on my testing, using C* 2.2.8, I was not
able to extend the cluster by adding multiple nodes simultaneously. I got an
error message …
Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while
cassandra.consistent.rangemovement is true
I
Any ideas, please?
Thank you
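For what it’s worth, the flag named in the error above can be turned off at startup if you accept the consistency caveats of moving several token ranges at once (a sketch; the flag spelling is taken from the error message itself, typically set in cassandra-env.sh on the joining nodes):

```shell
# Allows multiple nodes to bootstrap concurrently; consistent range
# movement is the safe default, so use with care
JVM_OPTS="$JVM_OPTS -Dcassandra.consistent.rangemovement=false"
```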
From: Fd Habash
Sent: Tuesday, April 23, 2019 10:38 AM
To: user@cassandra.apache.org
Subject: A keyspace with RF=3, Cluster with 3 RACs, CL=LQ: No Data on First
Attempt, but 1 Row Afterwards
Cluster setup …
- C* 2.2.8
- Three RACs, one DC
- Keyspace with RF=3
- RS = NetworkTopologyStrategy
At CL=LQ …
I get zero rows on first attempt, and one row on the second or third. Once
found, I always get the row afterwards.
Trying to understand this behavior …
First attempt, my read request hits
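One way to observe the reconciliation directly (keyspace, table, and key here are hypothetical) is to rerun the same query at CL=ALL from cqlsh:

```shell
# At CL=ALL every replica must answer, so out-of-sync replicas are
# reconciled and the row should appear on the first attempt
cqlsh -e "CONSISTENCY ALL; SELECT * FROM my_ks.my_table WHERE id = 42;"
```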
You could have run removenode
You could have run assassinate
Also could be some new bug, but that's much less likely.
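For reference, the two commands mentioned differ mainly in whether data is re-streamed (a sketch; host ID and IP are placeholders):

```shell
nodetool removenode <host-id>   # re-streams the dead node's ranges to the remaining replicas
nodetool assassinate <ip>       # gossip-level removal only; no streaming
```

Either one removes the node from gossip, which would explain why a later replace attempt no longer finds it.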
On Thu, Mar 14, 2019 at 2:50 PM Fd Habash wrote:
I have a node which I know for certain was a cluster member last week. It
showed in nodetool status as DN. When I attempted to replace it today, I got
this message
ERROR [main] 2019-03-14 14:40:49,208 CassandraDaemon.java:654 - Exception
encountered during startup
java.lang.RuntimeException:
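For context, a replacement is normally started by passing the dead node’s address as a JVM flag on the fresh node (a sketch; the IP is a placeholder, typically added in cassandra-env.sh before first start):

```shell
# Start the new node as a replacement for the dead one
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.12"
```

This fails if the dead node was already removed from gossip, e.g. via removenode or assassinate.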
Assume you have a 30-node cluster distributed across three AZs with an RF of
3. Trying to come up with a runbook to manage multi-node failures as a result
of …
- Loss of an entire AZ1
- Loss of multiple nodes in AZ2
- AZ3 unaffected. No node loss
Is this the most optimal plan? Replacing dead
For those who are using Reaper …
Currently, I’m running repairs from crontab using ‘nodetool repair -pr’ on
2.2.8, which defaults to incremental. If I migrate to Reaper, do I have to mark
sstables as un-repaired first? Also, out of the box, does Reaper run full
parallel repair? If yes, is it not
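If sstables do need to be reset before switching to full repairs, the sstablerepairedset tool can clear the repaired flag (a hedged sketch; the node should be stopped first, and the path is a placeholder):

```shell
# Mark an sstable as unrepaired so full repairs start from a clean slate
sstablerepairedset --really-set --is-unrepaired /path/to/my_ks/my_table/lb-1-big-Data.db
```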
. Let’s wait for other experts to
comment.
Can you also check sstable count for each table just to be sure that they are
not extraordinarily high?
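A quick way to pull those counts (the subcommand is ‘cfstats’ on 2.2; later versions rename it ‘tablestats’; the keyspace name is a placeholder):

```shell
nodetool cfstats my_ks | grep -E 'Table:|SSTable count'
```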
Sent from my iPhone
On Jun 11, 2018, at 10:21 AM, Fd Habash wrote:
Yes we did after adding the three nodes back and a full cluster repair as well.
But even if we didn’t run cleanup, would the fact that some nodes still have
sstables they no longer need have impacted read latency?
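For reference, those leftover ranges can be dropped per node with cleanup (keyspace name is a placeholder; it rewrites sstables, so expect some I/O while it runs):

```shell
# Run on each node after the topology change
nodetool cleanup my_ks
```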
Thanks
Thank you
From: Nitan Kainth
Sent: Monday,
Thank you.
In regards to my second inquiry: as we plan for C* upgrades, I did not find
NEWS.txt to always spell out possible upgrade paths. Is there a rule of thumb
or maybe an official reference for upgrade paths?
Thank you
From: Alexander Dejanovski
Sent:
2018-03-02 14:42 GMT+00:00 Fd Habash <fmhab...@gmail.com>:
This is a 2.2.8 cluster with three AWS AZs, each with 4 nodes.
A few days ago, we noticed a single node’s read latency reaching 1.5 secs;
there were 8 others with read latencies going up near 900 ms.
This single node was a seed node, and it was running a ‘repair -pr’ at the time.
We intervened as
Thank you
From: Fd Habash
Sent: Thursday, February 22, 2018 9:00 AM
To: user@cassandra.apache.org
Subject: RE: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in
Read Latency After Shrinking Cluster
“ data was allowed to fully rebalance/repair/drain before the next node
<fmhab...@gmail.com> wrote:
One node at a time
On Feb 21, 2018 10:23 AM, "Carl Mueller" <carl.muel...@smartthings.com> wrote:
What is your replication factor?
Single datacenter, three availability zones, is that right?
You removed one node at a time or three at once?
On Wed, Feb 21,
We have had a 15-node cluster across three zones, and cluster repairs using
‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk the
cluster to 12. Since then, the same repair job has taken up to 12 hours to
finish, and most times it never does.
More importantly, at some point
-hosts”, how do you identify what specific hosts to repair?
Thanks
Thank you
From: Fd Habash
Sent: Thursday, December 7, 2017 12:09 PM
To: user@cassandra.apache.org
Subject: RE: When Replacing a Node, How to Force a Consistent Bootstrap
Thank you.
How do I identify what other 2
we don't support).
You'll need to repair (and you can repair before you do the replace to avoid
the window of time where you violate consistency - use the -hosts option to
allow repair with a down host, you'll repair A+C, so when B starts it'll
definitely have all of the data).
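The suggestion above can be sketched as (IPs are placeholders for the live replicas A and C; the -hosts flag is repeated per host):

```shell
# Repair only among the live replicas while B is down, then replace B
nodetool repair -hosts 10.0.0.1 -hosts 10.0.0.3 my_keyspace
```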
On Tue, D
Assume I have a cluster of 3 nodes (A, B, C). Row x was written with CL=LQ to
nodes A and B. Before it was written to C, node B crashed. I replaced B and it
bootstrapped data from node C.
Now, row x is missing from C and B. If node A crashes, it will be replaced and
it will bootstrap from either C
Hi all …
I know there is plenty of documentation on how to replace a seed node, but
some steps are contradictory, e.g. the need to remove the node from the seed
list for the entire cluster.
My cluster has 6 nodes with 3 seeds running C* 2.8. One seed node was
terminated by AWS.
I came up with this procedure.
I have a scenario where data has to be loaded into Spark nodes from two data
stores: Oracle and Cassandra. We did the initial loading of data and found a
way to do daily incremental loading from Oracle to Spark.
I’m trying to figure out how to do this from C*. What tools are available in C*
to
We are in the process of upgrading our cluster. Nodes that got upgraded are
constantly emitting these messages. No impact, but I wanted to know what they
mean and why they appear only after the upgrade.
Any feedback will be appreciated.
17-04-10 20:18:11,580 Memtable.java:352 - Writing