Re: Join_ring=false Use Cases

2016-12-21 Thread Anuj Wadehra
Thanks, all!
I think the intent of the JIRA
https://issues.apache.org/jira/browse/CASSANDRA-6961 was primarily to deal
with stale information after outages and to give an opportunity to repair the
data before a node joins the cluster. If a node started with join_ring=false
doesn't accept writes while the repair is happening, the purpose of the JIRA is
defeated, as the node will end up with stale information anyway. This seems to
be a defect.

Thanks
Anuj


  




   

Re: Join_ring=false Use Cases

2016-12-20 Thread Carlos Rolo
Beware of the Java driver's limitations around whitelisting IPs.

It works fine with the Python driver.
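
A minimal sketch of what whitelisting could look like with the DataStax Python driver (the `cassandra-driver` package), assuming the goal is to send all requests through a fixed set of coordinator-only nodes. The function name and the IP addresses in the usage note are hypothetical; the driver imports sit inside the function so the definition loads even where the driver is not installed.

```python
def connect_via_coordinators(coordinator_ips, port=9042):
    """Open a session that only ever talks to the given coordinator IPs.

    Assumption: coordinator_ips lists nodes started with join_ring=false
    that are meant to act purely as coordinators.
    """
    # Imported here so that defining this function does not require the driver.
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import WhiteListRoundRobinPolicy

    # Round-robin across the whitelisted hosts and ignore every other host.
    profile = ExecutionProfile(
        load_balancing_policy=WhiteListRoundRobinPolicy(coordinator_ips)
    )
    cluster = Cluster(
        contact_points=coordinator_ips,
        port=port,
        execution_profiles={EXEC_PROFILE_DEFAULT: profile},
    )
    return cluster.connect()
```

Usage would be along the lines of `session = connect_via_coordinators(["10.0.0.11", "10.0.0.12"])`, with the placeholder addresses replaced by the real coordinator nodes.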



Regards,

Carlos Juzarte Rolo






Re: Join_ring=false Use Cases

2016-12-20 Thread Matija Gobec
There is a talk from Cassandra Summit 2016 about coordinator nodes by Eric
Lubow from SimpleReach. He explains how you can use join_ring=false for that.



Re: Join_ring=false Use Cases

2016-12-20 Thread kurt Greaves
It seems that you're correct in saying that writes don't propagate to a
node that has join_ring set to false, so I'd say this is a flaw. In reality
I can't see many actual use cases with regard to node outages with the
current implementation. The main use, I'd think, would be to have
additional coordinators for CPU-heavy workloads.

It seems that to make it actually useful for repairs/outages, we'd need
another option to turn on writes so that the node behaved similarly to write
survey mode (but on already-bootstrapped nodes).

Is there a reason we don't have this already? Or does it exist somewhere
I'm not aware of?



Re: Join_ring=false Use Cases

2016-12-20 Thread Anuj Wadehra
No responses yet :)
Could any C* expert help with the join_ring use case and the concern raised?
Thanks
Anuj
 
  


Re: Join_ring=false Use Cases

2016-12-14 Thread Anuj Wadehra
Can anyone help me with join_ring and address my concerns?

Thanks
Anuj 
 
  


Join_ring=false Use Cases

2016-12-13 Thread Anuj Wadehra
Hi,
I need to understand the use case of join_ring=false in case of node outages.
As per https://issues.apache.org/jira/browse/CASSANDRA-6961, you would want
join_ring=false when you have to repair a node before bringing it back after a
considerable outage. The problem I see with join_ring=false is that, unlike
auto_bootstrap, the node will NOT accept writes while you are running repair on
it. If a node was down for 5 hours and you bring it back with join_ring=false,
repair it for 7 hours and then make it join the ring, it will STILL have missed
writes, because during the 7 hours that repair was running, writes went only to
the other nodes. So, if you want to make sure that reads served by the restored
node at CL ONE return consistent data after the node has joined, you won't get
that, as writes were missed while the node was being repaired. And if you work
with Read/Write CL=QUORUM, you would get the desired consistency anyway, even
if you bring back the node without join_ring=false. So, how would join_ring
provide any additional consistency in this case?
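
The QUORUM point above rests on the usual overlap rule: a read is guaranteed to touch at least one replica that holds the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor. A minimal sketch, assuming a replication factor of 3 (the function names are illustrative and not part of any Cassandra API):

```python
def quorum(replication_factor):
    """Number of replicas that make up a QUORUM."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, read_replicas, write_replicas):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


RF = 3  # assumed replication factor
q = quorum(RF)

# QUORUM reads with QUORUM writes: 2 + 2 > 3, so the sets always overlap.
print(reads_see_latest_write(RF, q, q))   # True

# CL ONE reads with QUORUM writes: 1 + 2 is not > 3, so a single stale
# replica (such as a node that missed writes during repair) can answer alone.
print(reads_see_latest_write(RF, 1, q))   # False
```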
I can see join_ring=false being useful only when I am restoring from a snapshot
or bootstrapping and there are dropped mutations in my cluster which are not
fixed by hinted handoff.
For Example: 3 nodes A,B,C working at Read/Write CL QUORUM. Hinted
Handoff=3 hrs.
10 AM: Snapshot taken on all 3 nodes
11 AM: Node B goes down for 4 hours
3 PM: Node B comes up but data is not repaired. So, 1 hr of dropped
mutations (2-3 PM) not fixed via Hinted Handoff.
5 PM: Node A crashes.
6 PM: Node A restored from 10 AM Snapshot, Node A started with
join_ring=false, repaired and then joined the cluster.
In the above snapshot-restore example, updates from 2-3 PM were outside the
hinted handoff window of 3 hours. Thus, node B won't get those updates. Node
A's data for 2-3 PM is already lost. So the 2-3 PM updates are on only one
replica, i.e. node C, while the minimum consistency needed is QUORUM, so
join_ring=false would help. But this is a very specific use case.
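
The timeline can be checked with a small model. This is only a sketch of the scenario as described, with hours on a 24-hour clock. It assumes that hints are kept only for writes made during the first 3 hours of B's outage, and that A, once restored from the 10 AM snapshot and not yet repaired, holds nothing written after 10 AM. All names are hypothetical.

```python
from itertools import combinations

SNAPSHOT_HOUR = 10             # 10 AM snapshot on all nodes
B_DOWN_FROM, B_UP_AT = 11, 15  # B is down from 11 AM to 3 PM
HINT_WINDOW_HOURS = 3          # "Hinted Handoff=3 hrs"
A_CRASH_HOUR = 17              # A crashes at 5 PM
REPLICAS = ("A", "B", "C")
QUORUM = len(REPLICAS) // 2 + 1


def replicas_holding(write_hour):
    """Replicas holding a write from write_hour once A is restored, before repair."""
    holders = {"C"}  # C was up the whole time
    b_was_up = not (B_DOWN_FROM <= write_hour < B_UP_AT)
    b_hinted = B_DOWN_FROM <= write_hour < B_DOWN_FROM + HINT_WINDOW_HOURS
    if b_was_up or b_hinted:
        holders.add("B")
    if write_hour < SNAPSHOT_HOUR:
        holders.add("A")  # A only has what the snapshot captured
    return holders


def quorum_read_can_miss(write_hour):
    """True if some QUORUM-sized group of replicas has no copy of the write."""
    holders = replicas_holding(write_hour)
    return any(holders.isdisjoint(group)
               for group in combinations(REPLICAS, QUORUM))


at_risk = [hour for hour in range(SNAPSHOT_HOUR, A_CRASH_HOUR)
           if quorum_read_can_miss(hour)]
print(at_risk)  # [14], i.e. the 2-3 PM writes that live only on C
```

Under these assumptions, only the 2-3 PM hour can be missed by a QUORUM read that happens to hit A and B, which is why repairing A before it joins the ring matters here.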
Thanks
Anuj