Re: Re: Re: tolerate how many nodes down in the cluster

2017-07-27 Thread Anuj Wadehra
Hi Peng, 
Racks can be logical (as defined with the RAC attribute in the Cassandra
configuration files) or physical (racks in server rooms).
In my view, for leveraging racks in your case, it's important to understand the
implications of the following decisions:
1. Number of distinct logical RACs defined in Cassandra: If you want to leverage
RACs optimally for operational efficiencies (like Brooke explained), you need
to make sure that logical RACs are ALWAYS equal to RF, irrespective of whether
physical racks are equal to or greater than RF.
Keeping logical RACs = RF ensures that the nodes allocated to a logical rack have
exactly one replica of the entire 100% data set. So, if you have RF=3 and you
use QUORUM for read/write, you can bring down ALL nodes allocated to a logical
rack for maintenance activity and still achieve 100% availability. This makes
operations faster and cuts down the risk involved. For example, imagine a
Cassandra restart of the entire cluster. If one node takes 3 minutes, a rolling
restart of 30 nodes would take 90 minutes. But if you use 3 logical RACs with
RF=3 and assign 10 nodes to each logical RAC, you can restart the 10 nodes within
a RAC simultaneously (in off-peak hours, of course, so that the remaining 20
nodes can take the load). Starting Cassandra on all RACs one by one will take
just 9 minutes rather than 90. If there are any issues during
restart/maintenance, you can take all the nodes of a logical RAC down, fix them,
and bring them back without affecting availability.


2. Number of physical racks: As per historical data, there are instances when
more than one node in a physical rack fails together. When you are using VMs,
there are three levels instead of two; VMs on a single physical machine are
likely to fail together too, due to hardware failure:
Physical Racks > Physical Machines > VMs
Ensure that all VMs on a physical machine map to a single logical RAC. If you
want to tolerate the failure of physical racks in the server room, you also need
to ensure that all physical servers in a physical rack map to just one
logical RAC. This way, you can tolerate the failure of ALL VMs on ALL physical
machines mapped to a single logical RAC and still be 100% available.
For example: RF=3, 6 physical racks, 2 physical servers per physical rack, and
3 VMs per physical server. The setup would be:
Physical Rack1 = [Physical1 (3 VM) + Physical2 (3 VM)] = LogicalRAC1
Physical Rack2 = [Physical3 (3 VM) + Physical4 (3 VM)] = LogicalRAC1
Physical Rack3 = [Physical5 (3 VM) + Physical6 (3 VM)] = LogicalRAC2
Physical Rack4 = [Physical7 (3 VM) + Physical8 (3 VM)] = LogicalRAC2
Physical Rack5 = [Physical9 (3 VM) + Physical10 (3 VM)] = LogicalRAC3
Physical Rack6 = [Physical11 (3 VM) + Physical12 (3 VM)] = LogicalRAC3
The problem with this approach is scaling. What if you want to add a single
physical server? If you do that and allocate it to one existing logical RAC,
your cluster won't be balanced properly, because the logical RAC to which the
server is added will have additional capacity for the same data as the other two
logical RACs. To keep your cluster balanced, you need to add at least 3 physical
servers in 3 different physical racks and assign each physical server to a
different logical RAC. This wastes resources and is hard to justify.

If you have physical machines < logical RACs, every physical machine may have 
more than 1 replica. If entire physical machine fails, you will NOT have 100% 
availability as more than 1 replica may be unavailable. Similarly, if you have 
physical racks < logical RACs, every physical rack may have more than 1 
replica. If entire physical rack fails, you will NOT have 100% availability as 
more than 1 replica may be unavailable. 

Coming back to your example: RF=3 per DC (total RF=6), CL=QUORUM, 2 DCs, 6
physical machines, 8 VMs per physical machine.
My recommendation:
1. In each DC, assign the 3 physical machines to 3 logical RACs in the Cassandra
configuration. The 2 DCs can have the same RAC names, as RACs are uniquely
identified by their DC names. So these are 6 different logical RACs (a multiple
of RF), i.e. 1 physical machine per logical RAC.


2. To scale the cluster, add 6 physical machines at a time (3 per DC) and assign
every machine to a different logical RAC within its DC.
This way, even if you have an Active-Passive DC setup, you can tolerate the
failure of any physical machine or physical rack in the Active DC and still
ensure 100% availability. You would also get the operational benefits explained
above.
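A sketch of the corresponding keyspace definition (the keyspace and DC names are
hypothetical; clients would typically read/write at LOCAL_QUORUM per DC):

    CREATE KEYSPACE my_ks
      WITH replication = {'class': 'NetworkTopologyStrategy',
                          'DC1': 3, 'DC2': 3};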

In a multi-DC setup, you can also choose to do away with RACs and achieve the
operational benefits by doing maintenance on one entire DC at a time, leveraging
the other DC to handle client requests during that time. That will make your
life simpler.


Thanks
Anuj

Sent from Yahoo Mail on Android
  On Thu, 27 Jul 2017 at 12:03, kurt greaves wrote:   
Note that if you use more racks than RF you lose some of the 

Re: Re: Re: tolerate how many nodes down in the cluster

2017-07-27 Thread kurt greaves
Note that if you use more racks than RF you lose some of the operational
benefit, e.g. you'll still only be able to take out one rack at a time
(especially if using vnodes), despite the fact that you have more racks
than RF. As Jeff said this may be desirable, but really it comes down to
what your physical failure domains are and how/if you plan to scale.

As Jeff said, as long as you don't start with # racks < RF you should be
fine.


Re: Re: Re: tolerate how many nodes down in the cluster

2017-07-27 Thread Jeff Jirsa


On 2017-07-26 19:38 (-0700), "Peng Xiao" <2535...@qq.com> wrote: 
> Kurt/All,
> 
> 
> why should the # of racks be equal to RF?
> 
> For example, we have 2 DCs, each with 6 machines and RF=3, and each machine
> virtualized to 8 VMs.
> Can we set 6 RACs with RF=3? I mean one machine per RAC to avoid hardware
> errors. Or should we only set 3 RACs, 1 RAC with 2 machines? Which is better?
> 
> 

The guarantee you get from racks is that IF you have more racks than replicas, 
you won't have 2 replicas on the same rack. There's no requirement that # of 
racks >= # of replicas, you just leave yourself exposed to losing quorum if you 
have an outage while # racks < # replicas. 

Yes, with a rack == a hypervisor, the snitch would avoid placing 2 replicas on 
the same physical machine, and would protect you against hardware errors. 
There's nothing to gain from having 3 racks instead of 6 in that case (in fact 
6 is probably better, as you're less likely to have to skip a duplicate rack in 
getNaturalEndpoints()).
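If you want to check where replicas actually land after assigning racks, one way
is nodetool getendpoints (the keyspace/table/key below are placeholders):

    nodetool getendpoints my_ks my_table some_key
    # prints the replica nodes for that partition key; with racks configured
    # correctly, the returned nodes should sit in distinct racks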

All of this said:

BE REALLY CAREFUL WHEN USING RACKS. 

If you start with # of racks < RF, and you try to add another rack, you will 
probably be very unhappy (when you add that first node in the new rack, it'll 
take 1/RF of the ring instantly, which usually crashes everything). For that 
reason, a lot of people advise not to use racks unless you have > RF racks, or 
you REALLY know what you're doing.
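One sanity check before touching rack topology: nodetool status with a keyspace
argument shows each node's rack and its effective ownership, which makes an
unbalanced or accidental rack layout easy to spot (keyspace name hypothetical):

    nodetool status my_ks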




Re: Re: tolerate how many nodes down in the cluster

2017-07-26 Thread Peng Xiao
Kurt/All,


Why should the # of racks be equal to RF?

For example, we have 2 DCs, each with 6 machines and RF=3, and each machine
virtualized to 8 VMs.
Can we set 6 RACs with RF=3? I mean one machine per RAC to avoid hardware
errors. Or should we only set 3 RACs, 1 RAC with 2 machines? Which is better?


Thanks








-- Original Message --
From: "Anuj Wadehra";<anujw_2...@yahoo.co.in.INVALID>;
Sent: 27 July 2017 (Thursday) 1:41
To: "Brooke Thorley"<bro...@instaclustr.com>;
"user@cassandra.apache.org"<user@cassandra.apache.org>;
Cc: "Peng Xiao"<2535...@qq.com>;
Subject: Re: Re: tolerate how many nodes down in the cluster



 Hi Brooke,


 Very nice presentation: https://www.youtube.com/watch?v=QrP7G1eeQTI !! 
 Good to know that you are able to leverage Racks for gaining operational 
efficiencies. I think vnodes have made life easier. 
 

I still see some concerns with Racks:

 
 1. Usually scaling needs are driven by business requirements. Customers want 
value for every penny they spend. Adding 3 or 5 servers (because you have RF=3 
or 5) instead of 1 server costs them dearly. It's difficult to justify the 
additional cost as fault tolerance can only be improved but not guaranteed with 
racks.

 
2. You need to maintain mappings of Logical Racks (=RF) and physical racks 
(multiple of RFs) for large clusters. 
 
3.  Using racks tightly couples your hardware (rack size, rack count) / 
virtualization decisions (VM Size, VM count per physical node) with application 
RF.
 
Thanks
 Anuj
 
 


On Tuesday, 25 July 2017 3:56 AM, Brooke Thorley <bro...@instaclustr.com> 
wrote:

  

 Hello Peng. 

I think spending the time to set up your nodes into racks is worth it for the
benefits that it brings. With RF3 and NTS you can tolerate the loss of a whole
rack of nodes without losing QUORUM, as each rack will contain a full set of
data.  It makes ongoing cluster maintenance easier, as you can perform
upgrades, repairs and restarts on a whole rack of nodes at once.  Setting up
racks or adding nodes is not difficult, particularly if you are using vnodes.
You would simply add nodes in multiples of the number of racks to keep the racks
balanced.  This is how we run all our managed clusters and it works very well.


You may be interested to watch my Cassandra Summit presentation from last year 
in which I discussed this very topic: 
https://www.youtube.com/watch?v=QrP7G1eeQTI (from 4:00)



If you were to consider changing your rack topology, I would recommend that you 
do this by DC migration rather than "in place". 



Kind Regards,
Brooke Thorley
VP Technical Operations & Customer Services
supp...@instaclustr.com | support.instaclustr.com


Read our latest technical blog posts here.
 
On 25 July 2017 at 03:06, Anuj Wadehra <anujw_2...@yahoo.co.in.invalid> wrote:
Hi Peng, 

Three things are important when you are evaluating fault tolerance and 
availability for your cluster:


1. RF
2. CL
3. Topology -  how data is replicated in racks. 


If you assume that N nodes from ANY rack may fail at the same time, then you
can afford the failure of RF-CL nodes and still be 100% available. E.g. if you
are reading at QUORUM with RF=3, you can only afford one (3-2) node failure.
Thus, even if you have a 30 node cluster, a 10 node failure cannot give you
100% availability. RF impacts availability rather than the total number of nodes
in a cluster.


If you assume that N nodes failing together will ALWAYS be from the same rack,
you can spread your servers across RF physical racks and use
NetworkTopologyStrategy. While allocating replicas for any data, Cassandra will
ensure that the 3 replicas are placed in 3 different racks. E.g. you can have 10
nodes in each of 3 racks, and then even a 10 node failure within the SAME rack
still ensures that you have 100% availability, as two replicas remain for 100%
of the data and CL=QUORUM can be met. I have not tested this, but that is how
the rack concept is expected to work. I agree, using racks generally makes
operations tougher.




Thanks
Anuj




 
   On Mon, 24 Jul 2017 at 20:10, Peng Xiao
<2535...@qq.com> wrote:
 
Hi Bhuvan,
From the following link, it advises against using racks, and that looks
reasonable:
http://www.datastax.com/dev/blog/multi-datacenter-replication



Defining one rack for the entire cluster is the simplest and most common
implementation. Multiple racks should be avoided for the following reasons:
• Most users tend to ignore or forget rack requirements

Re: Re: tolerate how many nodes down in the cluster

2017-07-26 Thread Anuj Wadehra
From: "Bhuvan Rawal";<bhu1ra...@gmail.com>;
Sent: 24 July 2017 (Monday) 7:17 PM
To: "user"<user@cassandra.apache.org>;
Subject: Re: tolerate how many nodes down in the cluster
Hi Peng,
This really depends on how you have configured your topology. Say you have
segregated your DC into 3 racks with 10 servers each. With an RF of 3 you can
safely assume your data to be available if one rack goes down.
But if different servers amongst the racks fail, then I guess you are not
guaranteeing data integrity with an RF of 3; in that case you can at most lose 2
servers and stay available. The best idea would be to plan failover modes
appropriately and let Cassandra know of the same.
Regards,
Bhuvan
On Mon, Jul 24, 2017 at 3:28 PM, Peng Xiao <2535...@qq.com> wrote:

Hi,
Suppose we have a 30 node cluster in one DC with RF=3. How many nodes can be
down? Can we tolerate 10 nodes down? It seems that we are not able to avoid
having all 3 replicas of some data within the 10 nodes, so can we only tolerate
1 node down even though we have 30 nodes? Could anyone please advise?
Thanks

Re: Re: tolerate how many nodes down in the cluster

2017-07-26 Thread Peng Xiao
As per Brooke's suggestion, the number of RACs should be a multiple of RF.
https://www.youtube.com/watch?v=QrP7G1eeQTI


If we have 6 machines with RF=3, should we set up 6 RACs or 3 RACs? Which would
be better?
Could you please advise further?


Many thanks




-- Original Message --
From: "我自己的邮箱";<2535...@qq.com>;
Sent: 26 July 2017 (Wednesday) 7:31 PM
To: "user"<user@cassandra.apache.org>;
Cc: "anujw_2...@yahoo.co.in"<anujw_2...@yahoo.co.in>;
Subject: Re: Re: tolerate how many nodes down in the cluster



One more question: why should the # of racks be equal to RF?

For example, we have 4 machines, each virtualized to 8 VMs. Can we set 4 RACs
with RF=3? I mean one machine per RAC.


Thanks


-- Original Message --
From: "我自己的邮箱";<2535...@qq.com>;
Sent: 26 July 2017 (Wednesday) 10:32 AM
To: "user"<user@cassandra.apache.org>;
Cc: "anujw_2...@yahoo.co.in"<anujw_2...@yahoo.co.in>;
Subject: Re: Re: tolerate how many nodes down in the cluster



Thanks for the reminder, we will set up a new DC as suggested.




-- Original Message --
From: "kurt greaves";<k...@instaclustr.com>;
Sent: 26 July 2017 (Wednesday) 10:30 AM
To: "User"<user@cassandra.apache.org>;
Cc: "anujw_2...@yahoo.co.in"<anujw_2...@yahoo.co.in>;
Subject: Re: Re: tolerate how many nodes down in the cluster



Keep in mind that you shouldn't just enable multiple racks on an existing 
cluster (this will lead to massive inconsistencies). The best method is to 
migrate to a new DC as Brooke mentioned.

Re: Re: tolerate how many nodes down in the cluster

2017-07-26 Thread Peng Xiao
One more question: why should the # of racks be equal to RF?

For example, we have 4 machines, each virtualized to 8 VMs. Can we set 4 RACs
with RF=3? I mean one machine per RAC.


Thanks


-- Original Message --
From: "我自己的邮箱";<2535...@qq.com>;
Sent: 26 July 2017 (Wednesday) 10:32 AM
To: "user"<user@cassandra.apache.org>;
Cc: "anujw_2...@yahoo.co.in"<anujw_2...@yahoo.co.in>;
Subject: Re: Re: tolerate how many nodes down in the cluster



Thanks for the reminder, we will set up a new DC as suggested.




-- Original Message --
From: "kurt greaves";<k...@instaclustr.com>;
Sent: 26 July 2017 (Wednesday) 10:30 AM
To: "User"<user@cassandra.apache.org>;
Cc: "anujw_2...@yahoo.co.in"<anujw_2...@yahoo.co.in>;
Subject: Re: Re: tolerate how many nodes down in the cluster



Keep in mind that you shouldn't just enable multiple racks on an existing 
cluster (this will lead to massive inconsistencies). The best method is to 
migrate to a new DC as Brooke mentioned.

Re: Re: tolerate how many nodes down in the cluster

2017-07-25 Thread Peng Xiao
Thanks for the reminder, we will set up a new DC as suggested.




-- Original Message --
From: "kurt greaves";<k...@instaclustr.com>;
Sent: 26 July 2017 (Wednesday) 10:30 AM
To: "User"<user@cassandra.apache.org>;
Cc: "anujw_2...@yahoo.co.in"<anujw_2...@yahoo.co.in>;
Subject: Re: Re: tolerate how many nodes down in the cluster



Keep in mind that you shouldn't just enable multiple racks on an existing 
cluster (this will lead to massive inconsistencies). The best method is to 
migrate to a new DC as Brooke mentioned.

Re: Re: tolerate how many nodes down in the cluster

2017-07-25 Thread kurt greaves
Keep in mind that you shouldn't just enable multiple racks on an existing
cluster (this will lead to massive inconsistencies). The best method is to
migrate to a new DC as Brooke mentioned.
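For reference, a rough outline of that DC migration (a sketch only, assuming
NetworkTopologyStrategy and GossipingPropertyFileSnitch; the keyspace and DC
names are hypothetical):

    -- 1. Extend replication to the new, rack-aware DC:
    ALTER KEYSPACE my_ks
      WITH replication = {'class': 'NetworkTopologyStrategy',
                          'DC1': 3, 'DC2': 3};

    -- 2. On each node in the new DC, stream the existing data from the old DC
    --    (shell, not CQL):  nodetool rebuild -- DC1
    -- 3. Once clients point at DC2, remove 'DC1' from the replication map
    --    and decommission the old nodes.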


Re: Re: tolerate how many nodes down in the cluster

2017-07-25 Thread Peng Xiao
Thanks all for your reply. We will begin using RACs in our C* cluster.


Thanks.




-- Original Message --
From: "kurt greaves";<k...@instaclustr.com>;
Sent: 25 July 2017 (Tuesday) 6:27
To: "User"<user@cassandra.apache.org>;
"anujw_2...@yahoo.co.in"<anujw_2...@yahoo.co.in>;
Cc: "Peng Xiao"<2535...@qq.com>;
Subject: Re: Re: tolerate how many nodes down in the cluster



I've never really understood why Datastax recommends against racks. In those 
docs they make it out to be much more difficult than it actually is to 
configure and manage racks.

The important thing to keep in mind when using racks is that your # of racks 
should be equal to your RF. If you have keyspaces with different RF, then it's 
best to have the same # as the RF of your most important keyspace, but in this 
scenario you lose some of the benefits of using racks.


As Anuj has described, if you use RF # of racks, you can lose up to an entire
rack without losing availability. Note that this entirely depends on the
situation. When you take a node down, the other nodes in the cluster require
capacity to be able to handle the extra load that node is no longer handling.
What this means is that your cluster will require the other nodes to store
hints for that node (equivalent to the amount of writes made to that node) and
also handle its portion of READs. You can only take out as many nodes from a
rack as the capacity of your cluster allows.


I also strongly disagree that using racks makes operations tougher. If 
anything, it makes them considerably easier (especially when using vnodes). The 
only difficulty is the initial setup of racks, but for all the possible 
benefits it's certainly worth it. As well as the fact that you can lose up to 
an entire rack (great for AWS AZ's) without affecting availability, using racks 
also makes operations on large clusters much smoother. For example, when 
upgrading a cluster, you can now do it a rack at a time, or some portion of a 
rack at a time. Same for OS upgrades or any other operation that could happen 
in your environment. This is important if you have lots of nodes.  Also it 
makes coordinating repairs easier, as you now only need to repair a single rack 
to ensure you've repaired all the data. Basically any operation/problem where 
you need to consider the distribution of data, racks are going to help you.

Re: Re: tolerate how many nodes down in the cluster

2017-07-24 Thread kurt greaves
I've never really understood why Datastax recommends against racks. In
those docs they make it out to be much more difficult than it actually is
to configure and manage racks.

The important thing to keep in mind when using racks is that your # of
racks should be equal to your RF. If you have keyspaces with different RF,
then it's best to have the same # as the RF of your most important
keyspace, but in this scenario you lose some of the benefits of using racks.

As Anuj has described, if you use RF # of racks, you *can* lose up to an
entire rack without losing availability. Note that this entirely depends on
the situation. *When you take a node down, the other nodes in the cluster
require capacity to be able to handle the extra load that node is no longer
handling. *What this means is that your cluster will require the other
nodes to store hints for that node (equivalent to the amount of writes made
to that node) and also handle its portion of READs. You can only take out
as many nodes from a rack as the capacity of your cluster allows.

I also strongly disagree that using racks makes operations tougher. If
anything, it makes them considerably easier (especially when using vnodes).
The only difficulty is the initial setup of racks, but for all the possible
benefits it's certainly worth it. As well as the fact that you can lose up
to an entire rack (great for AWS AZ's) without affecting availability,
using racks also makes operations on large clusters much smoother. For
example, when upgrading a cluster, you can now do it a rack at a time, or
some portion of a rack at a time. Same for OS upgrades or any other
operation that could happen in your environment. This is important if you
have lots of nodes.  Also it makes coordinating repairs easier, as you now
only need to repair a single rack to ensure you've repaired all the data.
Basically any operation/problem where you need to consider the distribution
of data, racks are going to help you.
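For example, with # racks = RF, every rack holds a replica of 100% of the data,
so a full repair (without -pr) run node by node across a single rack covers
every token range (a sketch; the keyspace name is hypothetical):

    # run on each node in one rack, one node at a time
    nodetool repair my_ks
    # every token range has a replica in this rack, so repairing the whole
    # rack repairs all data (-pr would only cover each node's primary ranges)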


Re: Re: tolerate how many nodes down in the cluster

2017-07-24 Thread Brooke Thorley
Hello Peng.

I think spending the time to set up your nodes into racks is worth it for
the benefits that it brings. With RF3 and NTS you can tolerate the loss of
a whole rack of nodes without losing QUORUM, as each rack will contain a
full set of data.  It makes ongoing cluster maintenance easier, as you can
perform upgrades, repairs and restarts on a whole rack of nodes at once.
Setting up racks or adding nodes is not difficult, particularly if you are
using vnodes.  You would simply add nodes in multiples of the number of racks
to keep the racks balanced.  This is how we run all our managed clusters and
it works very well.

You may be interested to watch my Cassandra Summit presentation from last
year in which I discussed this very topic:
https://www.youtube.com/watch?v=QrP7G1eeQTI (from 4:00)

If you were to consider changing your rack topology, I would recommend that
you do this by DC migration rather than "in place".


Kind Regards,
*Brooke Thorley*
*VP Technical Operations & Customer Services*
supp...@instaclustr.com | support.instaclustr.com

Read our latest technical blog posts here
<https://www.instaclustr.com/blog/>.



On 25 July 2017 at 03:06, Anuj Wadehra <anujw_2...@yahoo.co.in.invalid>
wrote:

> Hi Peng,
>
> Three things are important when you are evaluating fault tolerance and
> availability for your cluster:
>
> 1. RF
> 2. CL
> 3. Topology -  how data is replicated in racks.
>
> If you assume that N nodes from ANY rack may fail at the same time, then
> you can afford the failure of RF-CL nodes and still be 100% available. E.g.
> if you are reading at QUORUM with RF=3, you can only afford one (3-2) node
> failure. Thus, even if you have a 30 node cluster, a 10 node failure cannot
> give you 100% availability. RF impacts availability rather than the total
> number of nodes in a cluster.
>
> If you assume that N nodes failing together will ALWAYS be from the same
> rack, you can spread your servers across RF physical racks and use
> NetworkTopologyStrategy. While allocating replicas for any data, Cassandra
> will ensure that the 3 replicas are placed in 3 different racks. E.g. you can
> have 10 nodes in each of 3 racks, and then even a 10 node failure within the
> SAME rack still ensures that you have 100% availability, as two replicas
> remain for 100% of the data and CL=QUORUM can be met. I have not tested this,
> but that is how the rack concept is expected to work. I agree, using racks
> generally makes operations tougher.
>
>
> Thanks
> Anuj
>
>
>
> On Mon, 24 Jul 2017 at 20:10, Peng Xiao
> <2535...@qq.com> wrote:
> Hi Bhuvan,
> From the following link, it advises against using racks, and that looks
> reasonable.
> http://www.datastax.com/dev/blog/multi-datacenter-replication
>
> Defining one rack for the entire cluster is the simplest and most common
> implementation. Multiple racks should be avoided for the following reasons:
> • Most users tend to ignore or forget rack requirements that state racks
> should be in an alternating order to allow the data to get distributed
> safely and appropriately.
> • Many users are not using the rack information effectively by using a
> setup with as many racks as they have nodes, or similar non-beneficial
> scenarios.
> • When using racks correctly, each rack should typically have the same
> number of nodes.
> • In a scenario that requires a cluster expansion while using racks, the
> expansion procedure can be tedious since it typically involves several node
> moves and has to ensure that racks will be distributing data
> correctly and evenly. At times when clusters need immediate expansion,
> racks should be the last things to worry about.
>
>
>
>
>
> -- Original Message --
> *From:* "Bhuvan Rawal";<bhu1ra...@gmail.com>;
> *Sent:* 24 July 2017 (Monday) 7:17 PM
> *To:* "user"<user@cassandra.apache.org>;
> *Subject:* Re: tolerate how many nodes down in the cluster
>
> Hi Peng ,
>
> This really depends on how you have configured your topology. Say you
> have segregated your DC into 3 racks with 10 servers each. With an RF of 3
> you can safely assume your data to be available if one rack goes down.
>
> But if different servers amongst the rac

Re: Re: tolerate how many nodes down in the cluster

2017-07-24 Thread Anuj Wadehra
Hi Peng, 
Three things are important when you are evaluating fault tolerance and 
availability for your cluster:
1. RF
2. CL
3. Topology - how data is replicated in racks.
If you assume that N nodes from ANY rack may fail at the same time, then you
can afford the failure of RF-CL nodes and still be 100% available. E.g. if you
are reading at QUORUM with RF=3, you can only afford one (3-2) node failure.
Thus, even if you have a 30 node cluster, a 10 node failure cannot give you
100% availability. RF impacts availability rather than the total number of nodes
in a cluster.
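
As a worked example of that arithmetic (QUORUM needs floor(RF/2)+1 replicas):

    quorum(RF) = floor(RF/2) + 1
    RF=3: quorum = 2, so at most RF - 2 = 1 replica of a partition may be down
    RF=5: quorum = 3, so at most RF - 3 = 2 replicas of a partition may be down
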
If you assume that N nodes failing together will ALWAYS be from the same rack,
you can spread your servers across RF physical racks and use
NetworkTopologyStrategy. While allocating replicas for any data, Cassandra will
ensure that the 3 replicas are placed in 3 different racks. E.g. you can have 10
nodes in each of 3 racks, and then even a 10 node failure within the SAME rack
still ensures that you have 100% availability, as two replicas remain for 100%
of the data and CL=QUORUM can be met. I have not tested this, but that is how
the rack concept is expected to work. I agree, using racks generally makes
operations tougher.

Thanks
Anuj

 
 
On Mon, 24 Jul 2017 at 20:10, Peng Xiao <2535...@qq.com> wrote:
Hi Bhuvan,
From the following link, it advises against using racks, and that looks
reasonable: http://www.datastax.com/dev/blog/multi-datacenter-replication
Defining one rack for the entire cluster is the simplest and most common
implementation. Multiple racks should be avoided for the following reasons:
• Most users tend to ignore or forget rack requirements that state racks should
be in an alternating order to allow the data to get distributed safely and
appropriately.
• Many users are not using the rack information effectively by using a setup
with as many racks as they have nodes, or similar non-beneficial scenarios.
• When using racks correctly, each rack should typically have the same number
of nodes.
• In a scenario that requires a cluster expansion while using racks, the
expansion procedure can be tedious since it typically involves several node
moves and has to ensure that racks will be distributing data correctly and
evenly. At times when clusters need immediate expansion, racks should be the
last things to worry about.




-- Original Message --
From: "Bhuvan Rawal";<bhu1ra...@gmail.com>;
Sent: 24 July 2017 (Monday) 7:17 PM
To: "user"<user@cassandra.apache.org>;
Subject: Re: tolerate how many nodes down in the cluster
Hi Peng,
This really depends on how you have configured your topology. Say you have
segregated your DC into 3 racks with 10 servers each. With an RF of 3 you can
safely assume your data to be available if one rack goes down.
But if different servers amongst the racks fail, then I guess you are not
guaranteeing data integrity with an RF of 3; in that case you can at most lose 2
servers and stay available. The best idea would be to plan failover modes
appropriately and let Cassandra know of the same.
Regards,
Bhuvan
On Mon, Jul 24, 2017 at 3:28 PM, Peng Xiao <2535...@qq.com> wrote:

Hi,
Suppose we have a 30 node cluster in one DC with RF=3. How many nodes can be
down? Can we tolerate 10 nodes down? It seems that we are not able to avoid
having all 3 replicas of some data within the 10 nodes, so can we only tolerate
1 node down even though we have 30 nodes? Could anyone please advise?
Thanks

  


Re: tolerate how many nodes down in the cluster

2017-07-24 Thread Peng Xiao
Hi Bhuvan,
From the following link, it advises against using racks, and that looks
reasonable:
http://www.datastax.com/dev/blog/multi-datacenter-replication



Defining one rack for the entire cluster is the simplest and most common 
implementation. Multiple racks should be avoided for the following reasons:
•   Most users tend to ignore or forget rack requirements that 
state racks should be in an alternating order to allow the data to get 
distributed safely and appropriately.
•   Many users are not using the rack information effectively by 
using a setup with as many racks as they have nodes, or similar non-beneficial 
scenarios.
•   When using racks correctly, each rack should typically have the 
same number of nodes.
•   In a scenario that requires a cluster expansion while using 
racks, the expansion procedure can be tedious since it typically involves 
several node moves and has to ensure that racks will be
distributing data correctly and evenly. At times when clusters need immediate 
expansion, racks should be the last things to worry about.












-- Original Message --
From: "Bhuvan Rawal";<bhu1ra...@gmail.com>;
Sent: 24 July 2017 (Monday) 7:17 PM
To: "user"<user@cassandra.apache.org>;

Subject: Re: tolerate how many nodes down in the cluster



Hi Peng,

This really depends on how you have configured your topology. Say you have
segregated your DC into 3 racks with 10 servers each. With an RF of 3 you can
safely assume your data to be available if one rack goes down.


But if different servers amongst the racks fail, then I guess you are not
guaranteeing data integrity with an RF of 3; in that case you can at most lose 2
servers and stay available. The best idea would be to plan failover modes
appropriately and let Cassandra know of the same.


Regards,
Bhuvan


On Mon, Jul 24, 2017 at 3:28 PM, Peng Xiao <2535...@qq.com> wrote:
Hi,


Suppose we have a 30 node cluster in one DC with RF=3,
how many nodes can be down? Can we tolerate 10 nodes down?
It seems that we are not able to avoid having all 3 replicas of some data within
the 10 nodes,
so can we only tolerate 1 node down even though we have 30 nodes?
Could anyone please advise?


Thanks

Re: tolerate how many nodes down in the cluster

2017-07-24 Thread Bhuvan Rawal
Hi Peng ,

This really depends on how you have configured your topology. Say you
have segregated your DC into 3 racks with 10 servers each. With an RF of 3 you
can safely assume your data to be available if one rack goes down.

But if different servers amongst the racks fail, then I guess you are not
guaranteeing data integrity with an RF of 3; in that case you can at most lose 2
servers and stay available. The best idea would be to plan failover modes
appropriately and let Cassandra know of the same.

Regards,
Bhuvan

On Mon, Jul 24, 2017 at 3:28 PM, Peng Xiao <2535...@qq.com> wrote:

> Hi,
>
> Suppose we have a 30 node cluster in one DC with RF=3,
> how many nodes can be down? Can we tolerate 10 nodes down?
> It seems that we are not able to avoid having all 3 replicas of some data
> within the 10 nodes,
> so can we only tolerate 1 node down even though we have 30 nodes?
> Could anyone please advise?
>
> Thanks
>


tolerate how many nodes down in the cluster

2017-07-24 Thread Peng Xiao
Hi,


Suppose we have a 30 node cluster in one DC with RF=3,
how many nodes can be down? Can we tolerate 10 nodes down?
It seems that we are not able to avoid having all 3 replicas of some data within
the 10 nodes,
so can we only tolerate 1 node down even though we have 30 nodes?
Could anyone please advise?


Thanks