Re: URGENT: disable reads from node

2018-09-06 Thread Vlad
Hi,
this node isn't in system.peers on either node.


Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Hi,
> You'll need to disable the native transport
Well, this is what I did already; it seems repair is running.

I'm not sure whether repair will finish within 3 hours, but I can run it again 
(as it's incremental repair by default, right?)
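For reference, the 3 hours in question is the hint window. As far as I know this is the default max_hint_window_in_ms of 10800000 in cassandra.yaml, but check your own file. A minimal sketch, with made-up repair durations, of checking whether a repair fits inside that window:

```python
from datetime import timedelta

# Assumption: default hint window of 3 hours
# (max_hint_window_in_ms: 10800000 in cassandra.yaml). Verify on your cluster.
DEFAULT_MAX_HINT_WINDOW_MS = 10_800_000


def repair_fits_hint_window(repair_duration: timedelta,
                            hint_window_ms: int = DEFAULT_MAX_HINT_WINDOW_MS) -> bool:
    """Return True if the repair ends while hints are still being stored."""
    return repair_duration <= timedelta(milliseconds=hint_window_ms)


# Hypothetical durations, for illustration only.
for duration in (timedelta(hours=2, minutes=30), timedelta(hours=4)):
    print(duration, repair_fits_hint_window(duration))
```

If the repair runs longer than the window, writes received after the window closes are no longer hinted for the node, so a second repair would be needed after it joins.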


I'm not sure about RF=3 and QUORUM reads because of the load/disk space
constraints we have, but we'll definitely consider this.

Thanks to all for help!
 


Re: URGENT: disable reads from node

2018-08-29 Thread Alexander Dejanovski
Kurt is right.

So here are the options I can think of:
- Use the join_ring=false technique and rely on hints. You'll need to
disable the native transport on the node as well, to prevent direct
connections from being made to it. Hopefully, you can run repair in less
than 3 hours, which is the hint window (hints will be collected while the
node hasn't joined the ring). Otherwise you'll have more consistency issues
after the node joins the ring again. Maybe incremental repair could help
fix this quickly afterwards if you've been running full repairs that
involved anticompaction (if you're running at least Cassandra 2.2).
- Fully re-bootstrap the node by having it replace itself, using the
replace_address_first_boot technique (but since you have RF=2 and
read/write at ONE, that would most probably mean some data loss).
- Try to cheat the dynamic snitch to take the node out of reads. You would
then have the node join the ring normally, disable native transport and
raise Severity (in org.apache.cassandra.db:type=DynamicEndpointSnitch) to
something like 50 so the node won't be selected by the dynamic snitch. I
guess the value will reset itself over time so you may need to set it to 50
on a regular basis while repair is happening.
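As a rough sketch of the first option, the steps could be scripted as below. It assumes the node has already been restarted with -Dcassandra.join_ring=false (how you restart depends on your deployment, so that is not shown) and that nodetool is on the PATH. The function names are hypothetical, and the default is a dry run that only prints the commands.

```python
import subprocess

# JVM flag from the thread; it has to be in place before the restart.
JOIN_RING_FLAG = "-Dcassandra.join_ring=false"


def build_plan(keyspace=None):
    """Ordered nodetool commands to run on the node that lost its commit logs."""
    repair = ["nodetool", "repair"] + ([keyspace] if keyspace else [])
    return [
        ["nodetool", "disablebinary"],  # stop CQL clients connecting directly
        ["nodetool", "statusbinary"],   # confirm the native transport is off
        repair,                         # bring the missing data back
        ["nodetool", "join"],           # join the ring once repair is done
        ["nodetool", "enablebinary"],   # accept client connections again
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print("$ " + " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Keyspace name is only an example; omit it to repair all keyspaces.
run_plan(build_plan("scanrepo"))
```

The important part is the ordering: the node should only join the ring and re-enable the native transport after repair has completed.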

I would then strongly consider moving to RF=3 because RF=2 will lead you to
this type of situation again in the future and does not allow quorum reads
with fault tolerance.
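To make the RF=2 versus RF=3 point concrete, here is the quorum arithmetic as a small sketch (the function names are mine):

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


def tolerated_down_replicas(rf: int) -> int:
    """Replicas that can be down while QUORUM still succeeds."""
    return rf - quorum(rf)


for rf in (2, 3):
    print(f"RF={rf}: QUORUM needs {quorum(rf)} replicas, "
          f"tolerates {tolerated_down_replicas(rf)} down")
```

With RF=2, QUORUM needs both replicas, so a single node being down fails the query. With RF=3 it needs two of three, so one node can be down.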

Good luck,

--
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Also, after the restart with join_ring=false, C* is still accepting
connections on port 9042 (and obviously returning no data), so I ran
nodetool drain. Is that OK?
I ran nodetool repair on this node. The command hasn't returned yet, but I
see this in the log:
INFO  [Thread-6] 2018-08-29 12:16:03,954 RepairRunnable.java:125 - Starting repair command #1, repairing keyspace scanrepo with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 530)
ERROR [Thread-6] 2018-08-29 12:16:14,363 SystemDistributedKeyspace.java:306 - Error executing query INSERT INTO system_distributed.parent_repair_history (parent_id, keyspace_name, columnfamily_names, requested_ranges, started_at, options) VALUES (...)
org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.

and nodetool compactionstats shows:
pending tasks: 9
- system_schema.tables: 1
- system_schema.keyspaces: 1
- ks1.tb1: 4
- ks1.tb2: 3


 


Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
I restarted with cassandra.join_ring=false.
nodetool status on other nodes shows this node as DN, while it sees itself
as UN.


> I'd say best to just query at QUORUM until you can finish repairs.
We have RF=2, so I guess QUORUM queries will fail. Also, different
applications would have to be changed for this.


Re: URGENT: disable reads from node

2018-08-29 Thread kurt greaves
Note that you'll miss incoming writes if you do that, so you'll be
inconsistent even after the repair. I'd say best to just query at QUORUM
until you can finish repairs.



Re: URGENT: disable reads from node

2018-08-29 Thread Alexander Dejanovski
Hi Vlad, you must restart the node, but first disable joining the cluster,
as described in the second part of this blog post:
http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html

Once repaired, you'll have to run "nodetool join" to start serving reads.




Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Will it help to set read_repair_chance to 1 (compaction is 
SizeTieredCompactionStrategy)? 


URGENT: disable reads from node

2018-08-29 Thread Vlad
Hi,
Quite urgent question:
due to disk and C* startup problems we were forced to delete the commit logs
from one of the nodes.
Now repair is running, but meanwhile some reads return no data (RF=2).

Can this node be excluded from read queries, so that all reads are
redirected to the other nodes in the ring?

Thanks to All for help.