Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-05-22 Thread Hiroyuki Yamada
Hi,

FYI: I created a bug ticket since I think the behavior is just not right.
https://issues.apache.org/jira/browse/CASSANDRA-15138

Thanks,
Hiro


Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-05-12 Thread Hiroyuki Yamada
Hi,

Should I file a bug?
It doesn't seem to be expected behavior,
so I think it should at least be documented somewhere.

Thanks,
Hiro



Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-04-26 Thread Hiroyuki Yamada
Hello,

Thank you for the feedback.

>Ben
Thank you.
I've tested with lower concurrency on my side, and the issue still occurs.
We are using 3 x t3.xlarge instances for C* and a small, separate instance
for the client program.
But when we ran 3 C* nodes on a single host, the issue didn't occur.

> Alok
We also suspected that and tested with hints disabled, but it doesn't make
any difference (the issue still occurs).

Thanks,
Hiro
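
For reference, "hints disabled" can be done either at runtime or in
cassandra.yaml; a minimal sketch of both, assuming a package install with
nodetool on the PATH and cassandra.yaml under /etc/cassandra/ (paths vary
by packaging):

    # disable hinted handoff at runtime, on every node
    nodetool disablehandoff
    nodetool statushandoff      # confirm the current handoff status

    # or disable it permanently and restart the node
    # in /etc/cassandra/cassandra.yaml:
    #   hinted_handoff_enabled: false
    sudo systemctl restart cassandra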





Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-04-25 Thread Alok Dwivedi
Could it be related to hinted handoffs being stored on node1 and then
replayed to node2 when it comes back, causing more load while new
mutations are also being applied by cassandra-stress at the same time?

Alok Dwivedi
Senior Consultant 
https://www.instaclustr.com/
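
One way to check whether hints are in fact piling up on node1 and being
replayed when node2 returns is to watch the hints directory and the
hint-related thread pools; a rough sketch, assuming the default
hints_directory of /var/lib/cassandra/hints on a 3.11 install:

    # on node1, while node2/node3 are down: hint files should accumulate here
    ls -lh /var/lib/cassandra/hints/
    nodetool statushandoff

    # after node2 comes back, watch hint dispatch and dropped messages
    watch -n 5 'nodetool tpstats | grep -i -e hint -e dropped'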





Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-04-25 Thread Ben Slater
In the absence of anyone else having any bright ideas - it still sounds to
me like the kind of scenario that can occur in a heavily overloaded
cluster. I would try again with a lower load.

What size machines are you using for the stress client and the nodes? Are
they all on separate machines?

Cheers
Ben
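
A quick way to tell whether the nodes are simply overloaded while stress is
running is to look at thread pool backlogs and dropped messages; a small
sketch (keyspace1 is assumed to be the default keyspace created by
cassandra-stress):

    # large Pending counts or any dropped MUTATION/READ messages suggest overload
    nodetool tpstats

    # per-keyspace latencies under load
    nodetool tablestats keyspace1 | grep -i -e 'read latency' -e 'write latency'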




Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-04-25 Thread Hiroyuki Yamada
Hello,

Sorry again.
We found yet another weird thing here:
if we stop nodes with systemctl or a plain kill (SIGTERM), the problem
occurs, but if we kill -9 them, it doesn't.

Thanks,
Hiro
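
For anyone trying to reproduce the difference, the two shutdown paths
compared above look roughly like this (the unit name and process pattern
are assumptions based on a typical systemd package install):

    # graceful stop: systemd sends SIGTERM and Cassandra runs its shutdown hook
    sudo systemctl stop cassandra
    # roughly equivalent to:
    kill $(pgrep -f CassandraDaemon)

    # hard kill: no shutdown hook runs, the process just disappears
    kill -9 $(pgrep -f CassandraDaemon)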





Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-04-24 Thread Hiroyuki Yamada
Sorry, I didn't mention the version and configuration.
I've tested with C* 3.11.4, and
the configuration is mostly default except for the replication
factor and listen_address (set for proper networking).

Thanks,
Hiro
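
For completeness, the non-default pieces described above would look
something like the following (paths, node names, and the keyspace settings
are assumptions; cassandra-stress can also set the replication factor
itself via -schema):

    # cassandra.yaml, per node:
    #   listen_address: <this node's own IP>
    #   seed_provider: ... seeds: "<node1's IP>"

    # RF=3 for the stress keyspace, either at stress time:
    cassandra-stress write n=1 -schema "replication(factor=3)" -node node1
    # ...or afterwards with CQL:
    cqlsh node1 -e "ALTER KEYSPACE keyspace1 WITH replication = \
      {'class': 'SimpleStrategy', 'replication_factor': 3};"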



Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-04-24 Thread Hiroyuki Yamada
Hello Ben,

Thank you for the quick reply.
I haven't tried that case, but it doesn't recover even if I stop the
stress.

Thanks,
Hiro
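
A simple way to see how each node currently views the cluster (and to catch
the case where node2 thinks node1 is down) is to compare the output of the
same commands run against both nodes; a small sketch, assuming JMX/nodetool
is reachable on each host:

    # membership view as seen from each node; the UN/DN columns should agree
    nodetool -h node1 status
    nodetool -h node2 status

    # gossip-level detail (generation, heartbeat, STATUS) per endpoint
    nodetool -h node1 gossipinfo
    nodetool -h node2 gossipinfo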



Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-04-24 Thread Ben Slater
Is it possible that stress is overloading node 1 so it’s not recovering
state properly when node 2 comes up? Have you tried running with a lower
load (say 2 or 3 threads)?

Cheers
Ben
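
A lower-load variant of the cassandra-stress invocation quoted below might
look like this (same workload, just a small fixed thread count instead of
the 16-256 ramp):

    cassandra-stress mixed cl=QUORUM duration=10m -errors ignore \
        -node node1,node2,node3 -rate threads=4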



On Wed, 24 Apr 2019 at 16:28, Hiroyuki Yamada  wrote:

> Hello,
>
> I faced a weird issue when recovering a cluster after two nodes are
> stopped.
> It is easily reproducible and looks like a bug or an issue to fix,
> so let me write down the steps to reproduce.
>
> === STEPS TO REPRODUCE ===
> * Create a 3-node cluster with RF=3
>- node1(seed), node2, node3
> * Start requests to the cluster with cassandra-stress (it continues
> until the end)
>- what we did: cassandra-stress mixed cl=QUORUM duration=10m
> -errors ignore -node node1,node2,node3 -rate threads\>=16
> threads\<=256
> * Stop node3 normally (with systemctl stop)
>- the system is still available because the quorum of nodes is
> still available
> * Stop node2 normally (with systemctl stop)
>- the system is NOT available after it's stopped.
>- the client gets `UnavailableException: Not enough replicas
> available for query at consistency QUORUM`
>- the client gets errors right away (so few ms)
>- so far it's all expected
> * Wait for 1 minute
> * Bring up node2
>- The issue happens here.
>- the client gets `ReadTimeoutException` or `WriteTimeoutException`,
> depending on whether the request is a read or a write, even after node2
> is up
>- the client gets errors after about 5000 ms or 2000 ms, which match
> the default read and write request timeouts
>- what node1 reports with `nodetool status` and what node2 reports
> are not consistent. (node2 thinks node1 is down)
>- It takes a very long time to recover from this state
> === STEPS TO REPRODUCE ===
>
> Is this supposed to happen?
> If we don't start cassandra-stress, it's all fine.
>
> Some workarounds we found to recover the state are the following:
> * Restarting node1; it recovers its state right after the restart
> * Setting a lower value for dynamic_snitch_reset_interval_in_ms (to 6
> or something)
>
> I don't think either of them is a really good solution.
> Can anyone explain what is going on, and what is the best way to prevent
> it or to recover?
>
> Thanks,
> Hiro
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>
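
For reference, the two workarounds above boil down to something like the
following; the exact snitch interval is a judgment call (600000 ms, i.e.
10 minutes, is the 3.11 default), and paths/unit names are assumptions:

    # workaround 1: bounce the surviving node (node1); it recovers right away
    sudo systemctl restart cassandra

    # workaround 2: make the dynamic snitch reset its scores more often,
    # in /etc/cassandra/cassandra.yaml on each node, then restart:
    #   dynamic_snitch_reset_interval_in_ms: 60000   # example; default is 600000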