Re: unreachable nodes mystery in describecluster output

2016-08-03 Thread Aleksandr Ivanov
> > The latency is high... > It is but is it really causing the problem? Latency is high but constant and not higher than ~200ms. Regarding the ALTER, did you try to increase the timeout with "cqlsh --request-timeout=REQUEST_TIMEOUT"? Because the default is 10 seconds. > I use 25sec timeout

Re: unreachable nodes mystery in describecluster output

2016-08-03 Thread Aleksandr Ivanov
orks. > Is there a VPN between DCs? Is there room for improvement at the network > level? TCP tuning, etc. I'm not saying you won't have unreachable nodes but > it's worth it if you can. > > Romain > > Le Mercredi 3 août 2016 15h02, Aleksandr Ivanov <ale...@gm

unreachable nodes mystery in describecluster output

2016-08-03 Thread Aleksandr Ivanov
Hello, I'm running v3.0.8 in a multi-data-center deployment (6 DCs, 6 nodes per DC, maximum latency between some nodes ~200ms). After a clean cluster start I run into an issue where "nodetool describecluster" shows that some random nodes from the deployment are UNREACHABLE, however in "nodetool status" or

How to start using incremental repairs?

2016-08-25 Thread Aleksandr Ivanov
I’m new to Cassandra and trying to figure out how to _start_ using incremental repairs. I have seen an article about “Migrating to incremental repairs”, but since I haven’t used repairs at all before and I run Cassandra v3.0.8, maybe not all of the steps are needed which are mentioned in Datastax

Re: How to start using incremental repairs?

2016-08-26 Thread Aleksandr Ivanov
tions, so you should run incremental repair in all nodes in all DCs > sequentially (you should be aware that this will probably generate inter-DC > traffic), no need to disable autocompaction or stopping nodes. > > 2016-08-25 18:27 GMT-03:00 Aleksandr Ivanov <ale...@gmail.com>: >

Re: How to start using incremental repairs?

2016-08-26 Thread Aleksandr Ivanov
time > (search for validation failures in your cassandra log). > Best chance to have it succeed IMHO is to run inc repair one node at a > time. > > Le ven. 26 août 2016 à 08:02, Aleksandr Ivanov <ale...@gmail.com> a > écrit : > >> Thanks for confirmation Paulo. Then my

Re: node decommission throttled

2016-12-08 Thread Aleksandr Ivanov
Yes, I use compression. I tried without it and it gave a ~15% increase in speed, but it is still too low (~35Mbps). On the sending side there is no high CPU/IO/etc. utilization. But on the receiving node I see that one "STREAM-IN" thread takes 100% CPU and it just doesn't scale by design since "Each stream is a single

Re: node decommission throttled

2016-12-08 Thread Aleksandr Ivanov
Nope, no MVs On Thu, Dec 8, 2016 at 11:31 AM, Benjamin Roth <benjamin.r...@jaumo.com> wrote: > Just an educated guess: you have materialized Views? They are known to > Stream very slow > > Am 08.12.2016 10:28 schrieb "Aleksandr Ivanov" <ale...@gmail.com>: >

Re: node decommission throttled

2016-12-08 Thread Aleksandr Ivanov
e your System cannot Stream faster. Is your cpu or hd/ssd fully > utilized? > > Am 07.12.2016 16:07 schrieb "Eric Evans" <john.eric.ev...@gmail.com>: > >> On Tue, Dec 6, 2016 at 9:54 AM, Aleksandr Ivanov <ale...@gmail.com> >> wrote: >> > I'm trying

node decommission throttled

2016-12-06 Thread Aleksandr Ivanov
I'm trying to decommission one C* node from a 6-node cluster and see that outbound network traffic on this node doesn't go over ~30Mb/s. Looks like it is throttled somewhere in C*. Should the stream_throughput_outbound_megabits_per_sec limit apply to "decommissioning streams" as well? From my
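When comparing an observed transfer rate with this setting, it helps to remember that stream_throughput_outbound_megabits_per_sec is expressed in megabits, not megabytes. A minimal Python sketch of the conversion follows; the helper name and the sample value of 200 are illustrative assumptions and are not taken from this cluster's configuration.

```python
def megabits_to_megabytes_per_sec(megabits_per_sec: float) -> float:
    """Convert a rate in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_sec / 8.0


# Hypothetical setting value, used only for illustration.
configured_limit_mbit = 200
print(megabits_to_megabytes_per_sec(configured_limit_mbit))  # 25.0
```

Reading network graphs in the same unit as the setting avoids mistaking a megabyte figure for a megabit one.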

Node doesn't join to the ring

2016-12-09 Thread Aleksandr Ivanov
I'm trying to join a node to the ring with the nodetool join command but it fails with an error message about an inconsistent replica. My steps:
1. run cassandra with -Dcassandra.join_ring=false
2. wait a couple of minutes
3. ensure that all nodes are in UN state
4. run nodetool join
Joining fails with
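The check in step 3 (all nodes in UN state) could be scripted roughly as below. This is a sketch only: the function name is made up for illustration, the sample text is a hypothetical "nodetool status" output rather than output from this thread, and the parser assumes the usual two-letter status/state prefix on each node line.

```python
import re
from typing import List

# Node lines of `nodetool status` begin with two letters, e.g. "UN  10.0.0.1 ...".
# First letter: status (U=Up, D=Down, ?=unknown); second: state (N, L, J, M).
NODE_LINE = re.compile(r"^([UD?])([NLJM])\s+(\S+)")


def nodes_not_up_normal(status_output: str) -> List[str]:
    """Return the addresses of nodes whose status/state is not UN."""
    bad = []
    for line in status_output.splitlines():
        m = NODE_LINE.match(line.strip())
        if m and (m.group(1), m.group(2)) != ("U", "N"):
            bad.append(m.group(3))
    return bad


# Hypothetical sample output, not taken from the thread.
sample = """Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.0.0.1  1.1 GB    256     ?     aaaa     rack1
DN  10.0.0.2  1.0 GB    256     ?     bbbb     rack1
"""
print(nodes_not_up_normal(sample))  # ['10.0.0.2']
```

An empty list would mean every listed node reports UN, which is the precondition the steps rely on before running nodetool join.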

Re: Node doesn't join to the ring

2016-12-12 Thread Aleksandr Ivanov
Anomaly solved. The reason was identical TCP timeouts at the OS level and on the hardware firewall. Changing the OS TCP timeout to a smaller value than the timeout on the firewall fixed the problem. On Fri, Dec 9, 2016 at 7:10 PM Aleksandr Ivanov <ale...@gmail.com> wrote: > I'm trying to join node to the ring with
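The condition described in this fix can be written down as a one-line check. The sketch below assumes the OS timeout in question is the TCP keepalive idle time (on Linux, net.ipv4.tcp_keepalive_time, which defaults to 7200 seconds); the message does not say which setting was changed, and the firewall timeout of 3600 seconds is a hypothetical value.

```python
def keepalive_beats_firewall(keepalive_idle_s: int, firewall_idle_timeout_s: int) -> bool:
    """True if the first keepalive probe goes out before the firewall
    would drop the idle connection."""
    return keepalive_idle_s < firewall_idle_timeout_s


# 7200 s is the Linux default keepalive idle time; 3600 s is a hypothetical firewall timeout.
print(keepalive_beats_firewall(7200, 3600))  # False
print(keepalive_beats_firewall(300, 3600))   # True
```

With equal values the check is also False, which matches the situation reported here where both timeouts were the same.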

Time range for metrics histogram

2016-12-17 Thread Aleksandr Ivanov
Hi C* experts! I'm trying to understand over what time range the C* latency metrics histogram is calculated. Several sources state that max is calculated from C* start, but on graphs I see that the max latency metric jumps up and down over time. Other sources state that the histogram is calculated

Re: Help

2017-01-14 Thread Aleksandr Ivanov
Could you share a bit about your cluster setup? Do you use a cloud for your deployment or dedicated firewalls in front of the nodes? If gossip shows that everything is up, it doesn't mean that all nodes can communicate with each other. I have noticed situations when a TCP connection was killed by the firewall and

disable reads from node till hints are fully synced

2017-09-10 Thread Aleksandr Ivanov
Hello, from time to time we have situations where a node is down for a longer period (but less than max_hint_window_in_ms). After the node is up and hints are actively syncing to the affected node, clients get inconsistent data (the client uses LOCAL_ONE consistency for performance reasons). Does any way exist