We run multiple nodes per host as standard practice. In our case, we never
put two nodes from a single cluster on the same host, though as mentioned
before, you could potentially get away with that if you properly use rack
awareness; just be careful about load.
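
For anyone reading along, a minimal sketch of what that rack awareness can
look like with GossipingPropertyFileSnitch (the dc and rack names here are
placeholders): give both instances on one physical host the same rack, and
NetworkTopologyStrategy will keep replicas of a partition on different
machines, assuming you have at least as many racks as your RF.

  # cassandra.yaml (every instance)
  endpoint_snitch: GossipingPropertyFileSnitch

  # cassandra-rackdc.properties for both instances on physical host 1
  dc=dc1
  rack=host1

  # cassandra-rackdc.properties for both instances on physical host 2
  dc=dc1
  rack=host2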

We also do NOT use any other layer of segregation such as Docker or VMs; we
just have multiple IPs per host and bind each IP to a distinct node. We have
looked at VMs and containers, but they either add abstraction complexity or
some kind of performance penalty.
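
If it helps, here is a rough sketch of that per-instance binding; the IPs
and paths are made up, the point is just that each instance listens on its
own address so the default ports can stay the same on both.

  # /etc/cassandra-a/cassandra.yaml  (instance A)
  listen_address: 10.0.0.11
  rpc_address: 10.0.0.11

  # /etc/cassandra-b/cassandra.yaml  (instance B)
  listen_address: 10.0.0.12
  rpc_address: 10.0.0.12

  # storage_port (7000) and native_transport_port (9042) keep their defaults
  # in both files, since each instance has its own IP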

As for system resources, we dedicate individual SSDs to each node, but CPU,
memory, and network are shared. We are spoiled by good network and beefy
memory, so the only place we have to be careful is CPU. As such, we pick
fairly conservative cassandra.yaml settings and monitor CPU usage. If
workloads get hot on a particular host, we have some flexibility to move
things around.
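
As a rough illustration of the kind of settings I mean (the paths and
numbers below are placeholders, not recommendations):

  # cassandra.yaml for instance A - point it at its own SSD
  data_file_directories:
      - /mnt/ssd-a/cassandra/data
  commitlog_directory: /mnt/ssd-a/cassandra/commitlog

  # keep background work modest so the instances don't fight over CPU
  concurrent_compactors: 2
  compaction_throughput_mb_per_sec: 16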

In any case, it sounds like you will be fine running one node per host. With
that many resources, be sure to tune your nodes to make use of them.
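
A few knobs worth revisiting on hardware that size; the values below are
placeholders to show which settings to look at, not a recommendation:

  # jvm.options - size the heap for a 128 GB machine, commonly somewhere
  # in the 16-31 GB range depending on the collector
  -Xms24G
  -Xmx24G

  # cassandra.yaml - raise concurrency if the SSDs and 64 cores can keep up
  concurrent_reads: 64
  concurrent_writes: 128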

Good luck.

On Thu, Apr 18, 2019, 2:49 PM William R <[email protected]>
wrote:

> hi,
>
> Thank you for your answers. Starting with the most important point, I
> understand that
>
> "it is OK to go more than 1 TB in disk usage"
>
> so in this case, if I use 50% of the disk capacity I will end up with
> around 3 TB per node, which means I will not need to use a docker
> solution. That works very well for our use case.
>
> The goal of my setup is to store large data volumes on every node (~3 TB,
> 50% of disk usage) with the current hardware that we possess. I consider
> high availability a given, since we are going to have 2 DCs with RF 3.
>
> I also have to note that DataStax recommends using no more than 500 GB -
> 1 TB per node.
>
> Cheers,
>
> Vasilis
>
>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Thursday, April 18, 2019 6:56 PM, Jacques-Henri Berthemet <
> [email protected]> wrote:
>
> So how much data can you safely fit per node using SSDs with Cassandra
> 3.11? How much free space do you need on your disks?
>
> There should be some recommendations on node sizes on:
>
> http://cassandra.apache.org/doc/latest/operating/hardware.html
>
>
> ------------------------------
>
> *From:* Jon Haddad <[email protected]>
> *Sent:* Thursday, April 18, 2019 6:43:15 PM
> *To:* [email protected]
> *Subject:* Re: [EXTERNAL] multiple Cassandra instances per server,
> possible?
>
> Agreed with Jeff here. The whole "community recommends no more than
> 1 TB" guideline has been around, and has been inaccurate, for a long time.
>
> The biggest issue with dense nodes is how long it takes to replace
> them.  4.0 should help with that under certain circumstances.
>
>
> On Thu, Apr 18, 2019 at 6:57 AM Jeff Jirsa <[email protected]> wrote:
> >
> > Agreed that you can go larger than 1 TB on SSD
> >
> > You can do this safely with both instances in the same cluster if you
> guarantee two replicas aren’t on the same machine. Cassandra provides a
> primitive to do this - rack awareness through the network topology snitch.
> >
> > The limitation (until 4.0) is that you'll need two IPs per machine, as
> both instances have to run on the same port.
> >
> >
> > --
> > Jeff Jirsa
> >
> >
> > On Apr 18, 2019, at 6:45 AM, Durity, Sean R <[email protected]>
> wrote:
> >
> > What is the data problem that you are trying to solve with Cassandra? Is
> it high availability? Low latency queries? Large data volumes? High
> concurrent users? I would design the solution to fit the problem(s) you are
> solving.
> >
> >
> >
> > For example, if high availability is the goal, I would be very cautious
> about 2 nodes/machine. If you need the full amount of the disk – you *can*
> have larger nodes than 1 TB. I agree that administration tasks (like
> adding/removing nodes, etc.) are more painful with large nodes – but not
> impossible. For large amounts of data, I like nodes that have about 2.5 – 3
> TB of usable SSD disk.
> >
> >
> >
> > It is possible that your nodes might be under-utilized, especially at
> first. But if the hardware is already available, you have to use what you
> have.
> >
> >
> >
> > We have done multiple nodes on a single physical server, but they were
> two separate clusters (for the same application). In that case, we had a
> different install location and different ports for one of the clusters.
> >
> >
> >
> > Sean Durity
> >
> >
> >
> > From: William R <[email protected]>
> > Sent: Thursday, April 18, 2019 9:14 AM
> > To: [email protected]
> > Subject: [EXTERNAL] multiple Cassandra instances per server, possible?
> >
> >
> >
> > Hi all,
> >
> >
> >
> > In our small company we have 10 servers with 6 TB of disk each (2 x 3 TB
> HDs), 128 GB of RAM and 64 cores, and we are thinking of using them as
> Cassandra nodes. From what I am reading around, the community recommends
> that every node should not hold more than 1 TB of data, so I am wondering
> if it is possible to install 2 instances per server using docker, so each
> docker instance can write to its own physical disk and utilise the rest of
> the hardware (CPU & RAM) more efficiently.
> >
> >
> >
> > I understand that with this setup there is the danger of creating a
> single point of failure for 2 Cassandra nodes, but apart from that, do you
> think it is a feasible setup to start the cluster with?
> >
> >
> >
> > Apart from the docker solution, do you recommend any other way to split
> the physical node into 2 instances? (VMware? Or maybe even 2 separate
> installations of Cassandra?)
> >
> >
> >
> > Eventually we are aiming for a cluster consisting of 2 DCs with 10 nodes
> each (5 bare-metal servers with 2 Cassandra instances each).
> >
> >
> >
> > Probably later, when we start introducing more nodes to the cluster, we
> can decommission the "double-instanced" ones and aim for a more
> homogeneous solution.
> >
> >
> >
> > Thank you,
> >
> >
> >
> > Wil
> >
> >
>
