Re: [EXTERNAL] multiple Cassandra instances per server, possible?
We run multiple nodes per host as a standard practice. In our case, we never put two nodes from a single cluster on the same host, though as mentioned before, you could potentially get away with that if you use rack awareness properly; just be careful of load.

We also do NOT use any other layer of segregation such as Docker or VMs. We just have multiple IPs per host and bind each IP to a distinct node. We have looked at VMs and containers, but they add either abstraction complexity or some kind of performance penalty.

As for system resources, we dedicate individual SSDs to each node, but CPU, memory, and network are shared. We are spoiled by good network and beefy memory, so the only place we have to be careful is CPU. As such, we pick fairly conservative cassandra.yaml settings and monitor CPU usage. If workloads get hot on a particular host, we have some flexibility to move things around.

In any case, it sounds like you will be fine running one node per host. With that many resources, be sure to tune your nodes to make use of them.

Good luck.

On Thu, Apr 18, 2019, 2:49 PM William R wrote:

> The goal of my setup is to store large data volumes on every node (~3 TB,
> 50% disk usage) with the hardware we currently have.
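The "multiple IPs per host, one SSD per node" layout described in this reply could be sketched roughly as follows. This is only an illustration: the IP addresses, mount points, and cluster names are hypothetical, and the poster did not share their actual settings. The keys used (`cluster_name`, `listen_address`, `rpc_address`, `data_file_directories`, `commitlog_directory`) are standard cassandra.yaml options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeInstance:
    cluster_name: str  # hypothetical: one cluster per co-hosted node
    ip: str            # hypothetical: one dedicated IP per node
    ssd_mount: str     # hypothetical: one dedicated SSD per node


def yaml_overrides(node):
    """Return the cassandra.yaml keys that differ between co-hosted nodes."""
    return {
        "cluster_name": node.cluster_name,
        "listen_address": node.ip,
        "rpc_address": node.ip,
        "data_file_directories": [f"{node.ssd_mount}/data"],
        "commitlog_directory": f"{node.ssd_mount}/commitlog",
    }


def check_host_layout(nodes):
    """Raise ValueError if two co-hosted nodes share an IP, an SSD, or a cluster."""
    for field in ("ip", "ssd_mount", "cluster_name"):
        values = [getattr(n, field) for n in nodes]
        if len(values) != len(set(values)):
            raise ValueError(f"co-hosted nodes must not share {field}")


# Two nodes on one physical host, each from a different cluster.
host_nodes = [
    NodeInstance("cluster_a", "10.0.0.11", "/mnt/ssd1"),
    NodeInstance("cluster_b", "10.0.0.12", "/mnt/ssd2"),
]
check_host_layout(host_nodes)
for node in host_nodes:
    print(yaml_overrides(node))
```

The check on `cluster_name` mirrors the rule stated above (never two nodes of one cluster on a host); it would have to be relaxed for the rack-aware variant discussed elsewhere in the thread.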
Re: [EXTERNAL] multiple Cassandra instances per server, possible?
Hi,

Thank you for your answers. Starting with the most important point, I understand that

"it is OK to go more than 1 TB in disk usage"

so if I use 50% of the disk capacity I will end up with around 3 TB per node. In that case I will not need a Docker solution, which is a very good use case for us.

The goal of my setup is to store large data volumes on every node (~3 TB, 50% disk usage) with the hardware we currently have. I consider high availability a given, since we are going to have 2 DCs with RF3.

I also have to note that DataStax recommends using no more than 500 GB to 1 TB.

Cheers,

Vasilis

‐‐‐ Original Message ‐‐‐
On Thursday, April 18, 2019 6:56 PM, Jacques-Henri Berthemet wrote:

> So how much data can you safely fit per node using SSDs with Cassandra 3.11?
> How much free space do you need on your disks?
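The capacity arithmetic in this message can be written out as a small sketch. The disk count, disk size, 50% fill target, and RF3 come from the thread; the figure of 5 nodes per DC is only an assumption for illustration (10 machines split across 2 DCs), since the final layout is not stated.

```python
def usable_tb_per_node(disks, disk_tb, max_fill):
    """Data a node may hold if only `max_fill` of its raw disk is used."""
    return disks * disk_tb * max_fill


def unique_data_tb(nodes_in_dc, per_node_tb, replication_factor):
    """Unreplicated data one DC can hold when every row is stored RF times."""
    return nodes_in_dc * per_node_tb / replication_factor


# 2 x 3 TB disks per machine, filled to 50% (figures from the thread).
per_node = usable_tb_per_node(disks=2, disk_tb=3.0, max_fill=0.5)
print(f"per node: {per_node} TB")

# Assumption: 5 single-instance nodes per DC, RF3 within each DC.
per_dc = unique_data_tb(nodes_in_dc=5, per_node_tb=per_node, replication_factor=3)
print(f"unique data per DC: {per_dc} TB")
```

The point of the second function is that replication divides the usable space: with RF3, the unreplicated data set a DC can hold is one third of the sum of its nodes' usable capacity.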
Re: [EXTERNAL] multiple Cassandra instances per server, possible?
So how much data can you safely fit per node using SSDs with Cassandra 3.11? How much free space do you need on your disks?

There should be some recommendations on node sizes at:

http://cassandra.apache.org/doc/latest/operating/hardware.html

From: Jon Haddad
Sent: Thursday, April 18, 2019 6:43:15 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] multiple Cassandra instances per server, possible?

> Agreed with Jeff here. The whole "community recommends no more than
> 1TB" has been around, and inaccurate, for a long time.
>
> The biggest issue with dense nodes is how long it takes to replace
> them. 4.0 should help with that under certain circumstances.
Re: [EXTERNAL] multiple Cassandra instances per server, possible?
Agreed with Jeff here. The whole "community recommends no more than 1TB" has been around, and inaccurate, for a long time.

The biggest issue with dense nodes is how long it takes to replace them. 4.0 should help with that under certain circumstances.

On Thu, Apr 18, 2019 at 6:57 AM Jeff Jirsa wrote:

> Agreed that you can go larger than 1 TB on SSD.
Re: [EXTERNAL] multiple Cassandra instances per server, possible?
Agreed that you can go larger than 1 TB on SSD.

You can do this safely with both instances in the same cluster if you guarantee two replicas aren’t on the same machine. Cassandra provides a primitive to do this: rack awareness through the network topology snitch.

The limitation (until 4.0) is that you’ll need two IPs per machine, as both instances have to run on the same port.

--
Jeff Jirsa

On Apr 18, 2019, at 6:45 AM, Durity, Sean R wrote:

> For example, if high availability is the goal, I would be very cautious
> about 2 nodes/machine. If you need the full amount of the disk, you *can*
> have larger nodes than 1 TB.
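One way to read the rack-awareness suggestion is to declare each physical machine as its own rack, so that a rack-aware replication strategy avoids placing two replicas on instances that share hardware. The sketch below assumes GossipingPropertyFileSnitch (which reads `dc` and `rack` from cassandra-rackdc.properties) and NetworkTopologyStrategy; the thread only says "network topology snitch", and the host names and IPs are hypothetical.

```python
def rackdc_properties(dc, physical_host):
    """Render cassandra-rackdc.properties text, using the machine as the rack."""
    return f"dc={dc}\nrack={physical_host}\n"


def enough_racks(instances, replication_factor):
    """True if there are at least `replication_factor` distinct machines (racks)."""
    hosts = {host for _ip, host in instances}
    return len(hosts) >= replication_factor


# Hypothetical layout: three machines, two instances (two IPs) on each.
instances = [
    ("10.0.0.11", "host01"), ("10.0.0.12", "host01"),
    ("10.0.0.21", "host02"), ("10.0.0.22", "host02"),
    ("10.0.0.31", "host03"), ("10.0.0.32", "host03"),
]
for ip, host in instances:
    print(ip, rackdc_properties("dc1", host).splitlines())

# With fewer racks than RF, replicas would have to share a machine.
print(enough_racks(instances, replication_factor=3))
```

The `enough_racks` check reflects the caveat that spreading replicas across racks only works when a DC has at least as many racks (here, machines) as its replication factor.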
RE: [EXTERNAL] multiple Cassandra instances per server, possible?
What is the data problem that you are trying to solve with Cassandra? Is it high availability? Low-latency queries? Large data volumes? Many concurrent users? I would design the solution to fit the problem(s) you are solving.

For example, if high availability is the goal, I would be very cautious about 2 nodes/machine. If you need the full amount of the disk, you *can* have larger nodes than 1 TB. I agree that administration tasks (like adding/removing nodes, etc.) are more painful with large nodes, but not impossible. For large amounts of data, I like nodes that have about 2.5 to 3 TB of usable SSD disk.

It is possible that your nodes might be under-utilized, especially at first. But if the hardware is already available, you have to use what you have.

We have done multiple nodes on single physical hardware, but they were two separate clusters (for the same application). In that case, we had a different install location and different ports for one of the clusters.

Sean Durity

From: William R
Sent: Thursday, April 18, 2019 9:14 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] multiple Cassandra instances per server, possible?

Hi all,

In our small company we have 10 nodes, each with 6 TB of disk (2 x 3 TB HD), 128 GB RAM and 64 cores, and we are thinking of using them as Cassandra nodes. From what I have read, the community recommends that a node should not hold more than 1 TB of data, so I am wondering whether it is possible to install 2 instances per node using Docker, so that each Docker instance can write to its own physical disk and use the rest of the hardware (CPU & RAM) more efficiently.

I understand that this setup risks creating a single point of failure for 2 Cassandra nodes, but apart from that, do you think it is a feasible setup to start the cluster with?

Apart from the Docker solution, do you recommend any other way to split a physical node into 2 instances? (VMware? Or maybe even 2 separate installations of Cassandra?)

Eventually we are aiming for a cluster consisting of 2 DCs with 10 nodes each (5 bare-metal nodes with 2 Cassandra instances each).

Later, when we start introducing more nodes to the cluster, we can probably decommission the "double-instanced" ones and aim for a more homogeneous solution.

Thank you,

Wil

The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.