Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

2019-11-04 Thread Ben Mills
Thanks again Reid - another great response - you have pointed me in the
right direction.

On Mon, Nov 4, 2019 at 12:23 PM Reid Pinchback 
wrote:

> It’s not a setting I’ve played with at all.  I understand the gist of it
> though, essentially it’ll let you automatically adjust your JVM size
> relative to whatever you allocated to the cgroup.  Unfortunately I’m not a
> K8s developer (that may change shortly, but atm that's the case).  What you need
> to have a firm handle on yourself is where the memory for the O/S file
> cache lives, and whether that size is sufficient for your read/write activity.  Bare
> metal and VM tuning I understand better, so I’ll have to defer to others
> who may have specific personal experience with the details, but the essence
> of the issue should remain the same.  You want a file cache that functions
> appropriately or you’ll get excessive stalls happening on either reading
> from disk or flushing dirty pages to disk.
>
>
>
>
>
> *From: *Ben Mills 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Monday, November 4, 2019 at 12:14 PM
> *To: *"user@cassandra.apache.org" 
> *Subject: *Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC
>
>
>
> CGroup
>


Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

2019-11-04 Thread Reid Pinchback
It’s not a setting I’ve played with at all.  I understand the gist of it 
though, essentially it’ll let you automatically adjust your JVM size relative 
to whatever you allocated to the cgroup.  Unfortunately I’m not a K8s developer 
(that may change shortly, but atm that's the case).  What you need to have a firm handle on 
yourself is where the memory for the O/S file cache lives, and whether that size is 
sufficient for your read/write activity.  Bare metal and VM tuning I understand 
better, so I’ll have to defer to others who may have specific personal 
experience with the details, but the essence of the issue should remain the 
same.  You want a file cache that functions appropriately or you’ll get 
excessive stalls happening on either reading from disk or flushing dirty pages 
to disk.
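
A quick way to sanity-check what heap the JVM actually derives from the cgroup limit (assuming a Java 8 build of 8u131 or later, where these experimental flags are honored; run it inside the container so the limit applies):

java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap \
  -XX:MaxRAMFraction=2 -XX:+PrintFlagsFinal -version | grep -i maxheapsize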


From: Ben Mills 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, November 4, 2019 at 12:14 PM
To: "user@cassandra.apache.org" 
Subject: Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

CGroup


Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

2019-11-04 Thread Ben Mills
Hi Reid,

Many thanks for this thoughtful response - very helpful and much
appreciated.

No doubt some additional experimentation will pay off as you noted.

One additional question: we currently use this heap setting:

-XX:MaxRAMFraction=2

I realize every environment and its tuning goals are different; though -
just generally - what do you think of MaxRAMFraction=2 with Java 8?

If the stateful set is configured with 16Gi memory, that setting would
allocate roughly 8Gi to the heap and seems a safe balance between
heap/nonheap. No worries if you don't have enough information to answer (as
I haven't shared our tuning goals), but any feedback is, again, appreciated.
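
As a rough illustration of the arithmetic under that setting, with a 16Gi container limit (the off-heap figure below is an assumption for the sake of the example, not a measurement):

  container memory limit                        16Gi
  JVM heap (MaxRAMFraction=2)                   ~8Gi
  JVM off-heap (metaspace, direct buffers,
    chunk cache, thread stacks)                 ~1-2Gi (assumed)
  left over for the O/S page cache              ~6-7Gi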


On Mon, Nov 4, 2019 at 10:28 AM Reid Pinchback 
wrote:

> Hi Ben, just catching up over the weekend.
>
>
>
> The typical advice, per Sergio’s link reference, is an obvious starting
> point.  We use G1GC and normally I’d treat 8gig as the minimal starting
> point for a heap.  What sometimes doesn’t get talked about in the myriad of
> tunings, is that you have to have a clear goal in your mind on what you are
> tuning **for**. You could be tuning for throughput, or average latency,
> or 99’s latency, etc.  How you tune varies quite a lot according to your
> goal.  The more your goal is about latency, the more work you have ahead of
> you.
>
>
>
> I will suggest that, if your data footprint is going to stay low, that you
> give yourself permission to do some experimentation.  As you’re using K8s,
> you are in a bit of a position where if your usage is small enough, you can
> get 2x bang for the buck on your servers by sizing the pods to about 45% of
> server resources and using the C* rack metaphor to ensure you don’t
> co-locate replicas.
>
>
>
> For example, were I you, I’d start asking myself if SSTable compression
> mattered to me at all.  The reason I’d start asking myself questions like
> that is C* has multiple uses of memory, and one of the balancing acts is
> chunk cache and the O/S file cache.  If I could find a way to make my O/S
> file cache be a de facto C* cache, I’d roll up the shirt sleeves and see
> what kind of performance numbers I could squeeze out with some creative
> tuning experiments.  Now, I’m not saying **do** that, because your write
> volume also plays a role, and you said you’re expecting a relatively even
> balance in reads and writes.  I’m just saying, by way of example, I’d start
> weighing if the advice I get online was based in experience similar to my
> current circumstance, or ones that were very different.
>
>
>
> R
>
>
>
> *From: *Ben Mills 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Monday, November 4, 2019 at 8:51 AM
> *To: *"user@cassandra.apache.org" 
> *Subject: *Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC
>
>
>
> Hi (yet again) Sergio,
>
>
>
> Finally, note that we use this sidecar
> <https://github.com/Stackdriver/stackdriver-prometheus-sidecar> for
> shipping metrics to Stackdriver. It runs as a second container within our
> Prometheus stateful set.
>
>
>
>
>
> On Mon, Nov 4, 2019 at 8:46 AM Ben Mills  wrote:
>
> Hi (again) Sergio,
>
>
>
> I forgot to note that along with Prometheus, we use Grafana (with
> Prometheus as its data source) as well as Stackdriver for monitoring.
>
>
>
> As Stackdriver is still developing (i.e. does not have all the features we
> need), we tend to use it for the basics (i.e. monitoring and alerting on
> memory, cpu and disk (PVs) thresholds). More specifically, the
> Prometheus JMX exporter (noted above) scrapes all the MBeans inside
> Cassandra, exporting in the Prometheus data model. Its config map filters
> (allows) our metrics of interest, and those metrics are sent to our Grafana
> instances and to Stackdriver. We use Grafana for more advanced metric
> configs that provide deeper insight in Cassandra - e.g. read/write
> latencies and so forth. For monitoring memory utilization, we monitor both
> pod-level in Stackdriver (i.e. to avoid having a Cassandra pod oomkilled by
> kubelet) as well as inside the JVM (heap space).
>
>
>
> Hope this helps.
>
>
>
> On Mon, Nov 4, 2019 at 8:26 AM Ben Mills  wrote:
>
> Hi Sergio,
>
>
>
> Thanks for this and sorry for the slow reply.
>
>
>
> We are indeed still running Java 8 and so it's very helpful.
>
>
>
> This Cassandra cluster has been ru

Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

2019-11-04 Thread Reid Pinchback
Hi Ben, just catching up over the weekend.

The typical advice, per Sergio’s link reference, is an obvious starting point.  
We use G1GC and normally I’d treat 8gig as the minimal starting point for a 
heap.  What sometimes doesn’t get talked about in the myriad of tunings, is 
that you have to have a clear goal in your mind on what you are tuning *for*. 
You could be tuning for throughput, or average latency, or 99’s latency, etc.  
How you tune varies quite a lot according to your goal.  The more your goal is 
about latency, the more work you have ahead of you.

I will suggest that, if your data footprint is going to stay low, that you give 
yourself permission to do some experimentation.  As you’re using K8s, you are 
in a bit of a position where if your usage is small enough, you can get 2x bang 
for the buck on your servers by sizing the pods to about 45% of server 
resources and using the C* rack metaphor to ensure you don’t co-locate replicas.
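
For what it's worth, the "rack per zone" part of that is just snitch configuration; a sketch assuming GossipingPropertyFileSnitch, with placeholder names:

# cassandra-rackdc.properties (illustrative values only)
dc=us-east1
rack=us-east1-b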

For example, were I you, I’d start asking myself if SSTable compression 
mattered to me at all.  The reason I’d start asking myself questions like that 
is C* has multiple uses of memory, and one of the balancing acts is chunk cache 
and the O/S file cache.  If I could find a way to make my O/S file cache be a 
de facto C* cache, I’d roll up the shirt sleeves and see what kind of 
performance numbers I could squeeze out with some creative tuning experiments.  
Now, I’m not saying *do* that, because your write volume also plays a role, and 
you said you’re expecting a relatively even balance in reads and writes.  I’m 
just saying, by way of example, I’d start weighing if the advice I get online 
was based in experience similar to my current circumstance, or ones that were 
very different.
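
As one purely illustrative form of that experiment (keyspace and table names are hypothetical; try it on a test cluster), SSTable compression can be switched off per table so the O/S page cache ends up holding data the reads can use directly:

-- hypothetical table; not a recommendation, just the shape of the experiment
ALTER TABLE my_keyspace.my_table
  WITH compression = {'enabled': 'false'};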

R

From: Ben Mills 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, November 4, 2019 at 8:51 AM
To: "user@cassandra.apache.org" 
Subject: Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

Hi (yet again) Sergio,

Finally, note that we use this 
sidecar<https://github.com/Stackdriver/stackdriver-prometheus-sidecar>
 for shipping metrics to Stackdriver. It runs as a second container within our 
Prometheus stateful set.


On Mon, Nov 4, 2019 at 8:46 AM Ben Mills <b...@bitbrew.com> wrote:
Hi (again) Sergio,

I forgot to note that along with Prometheus, we use Grafana (with Prometheus as 
its data source) as well as Stackdriver for monitoring.

As Stackdriver is still developing (i.e. does not have all the features we 
need), we tend to use it for the basics (i.e. monitoring and alerting on 
memory, cpu and disk (PVs) thresholds). More specifically, the Prometheus JMX 
exporter (noted above) scrapes all the MBeans inside Cassandra, exporting in 
the Prometheus data model. Its config map filters (allows) our metrics of 
interest, and those metrics are sent to our Grafana instances and to 
Stackdriver. We use Grafana for more advanced metric configs that provide 
deeper insight in Cassandra - e.g. read/write latencies and so forth. For 
monitoring memory utilization, we monitor both pod-level in Stackdriver (i.e. 
to avoid having a Cassandra pod oomkilled by kubelet) as well as inside the JVM 
(heap space).

Hope this helps.

On Mon, Nov 4, 2019 at 8:26 AM Ben Mills <b...@bitbrew.com> wrote:
Hi Sergio,

Thanks for this and sorry for the slow reply.

We are indeed still running Java 8 and so it's very helpful.

This Cassandra cluster has been running reliably in Kubernetes for several 
years, and while we've had some repair-related issues, they are not related to 
container orchestration or the cloud environment. We don't use operators and 
have simply built the needed Kubernetes configs (YAML manifests) to handle 
deployment of new Docker images (when needed), and so forth. We have:

(1) ConfigMap - Cassandra environment variables
(2) ConfigMap - Prometheus configs for this JMX 
exporter<https://github.com/prometheus/jmx_exporter>,
 which is built into the image and runs as a Java agent
(3) PodDisruptionBudget - with minAvailable: 2 as the important setting
(4) Service - this is a headless service (clusterIP: None) which specifies the 
ports for cql, jmx, prometheus, intra-node
(5) StatefulSet - 3 replicas, ports, health checks, resources, etc - as you 
would expect

We store data on persistent volumes using an SSD storage class, and 

Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

2019-11-04 Thread Ben Mills
Hi (yet again) Sergio,

Finally, note that we use this sidecar
<https://github.com/Stackdriver/stackdriver-prometheus-sidecar> for
shipping metrics to Stackdriver. It runs as a second container within our
Prometheus stateful set.
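
Sketched loosely, the "second container" pattern in that stateful set's pod template looks roughly like this (image references are placeholders, not our actual tags; see the sidecar's own docs for its flags):

# pod template fragment (illustrative)
containers:
  - name: prometheus
    image: prom/prometheus                     # placeholder image/tag
  - name: stackdriver-prometheus-sidecar
    image: stackdriver-prometheus-sidecar      # placeholder; use the published image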


On Mon, Nov 4, 2019 at 8:46 AM Ben Mills  wrote:

> Hi (again) Sergio,
>
> I forgot to note that along with Prometheus, we use Grafana (with
> Prometheus as its data source) as well as Stackdriver for monitoring.
>
> As Stackdriver is still developing (i.e. does not have all the features we
> need), we tend to use it for the basics (i.e. monitoring and alerting on
> memory, cpu and disk (PVs) thresholds). More specifically, the
> Prometheus JMX exporter (noted above) scrapes all the MBeans inside
> Cassandra, exporting in the Prometheus data model. Its config map filters
> (allows) our metrics of interest, and those metrics are sent to our Grafana
> instances and to Stackdriver. We use Grafana for more advanced metric
> configs that provide deeper insight in Cassandra - e.g. read/write
> latencies and so forth. For monitoring memory utilization, we monitor both
> pod-level in Stackdriver (i.e. to avoid having a Cassandra pod oomkilled by
> kubelet) as well as inside the JVM (heap space).
>
> Hope this helps.
>
> On Mon, Nov 4, 2019 at 8:26 AM Ben Mills  wrote:
>
>> Hi Sergio,
>>
>> Thanks for this and sorry for the slow reply.
>>
>> We are indeed still running Java 8 and so it's very helpful.
>>
>> This Cassandra cluster has been running reliably in Kubernetes for
>> several years, and while we've had some repair-related issues, they are not
>> related to container orchestration or the cloud environment. We don't use
>> operators and have simply built the needed Kubernetes configs (YAML
>> manifests) to handle deployment of new Docker images (when needed), and so
>> forth. We have:
>>
>> (1) ConfigMap - Cassandra environment variables
>> (2) ConfigMap - Prometheus configs for this JMX exporter
>> <https://github.com/prometheus/jmx_exporter>, which is built into the
>> image and runs as a Java agent
>> (3) PodDisruptionBudget - with minAvailable: 2 as the important setting
>> (4) Service - this is a headless service (clusterIP: None) which
>> specifies the ports for cql, jmx, prometheus, intra-node
>> (5) StatefulSet - 3 replicas, ports, health checks, resources, etc - as
>> you would expect
>>
>> We store data on persistent volumes using an SSD storage class, and use:
>> an updateStrategy of OnDelete, some affinity rules to ensure an even
>> spread of pods across our zones, Prometheus annotations for scraping the
>> metrics port, a nodeSelector and tolerations to ensure the Cassandra pods
>> run in their dedicated node pool, and a preStop hook that runs nodetool
>> drain to help with graceful shutdown when a pod is rolled.
>>
>> I'm guessing your installation is much larger than ours and so operators
>> may be a good way to go. For our needs the above has been very reliable as
>> has GCP in general.
>>
>> We are currently updating our backup/restore implementation to provide
>> better granularity with respect to restoring a specific keyspace and also
>> exploring Velero for DR.
>>
>> Hope this helps.
>>
>>
>> On Fri, Nov 1, 2019 at 5:34 PM Sergio  wrote:
>>
>>> Hi Ben,
>>>
>>> Well, I had a similar question and Jon Haddad was preferring ParNew +
>>> CMS over G1GC for java 8.
>>> https://lists.apache.org/thread.html/283547619b1dcdcddb80947a45e2178158394e317f3092b8959ba879@%3Cuser.cassandra.apache.org%3E
>>> It depends on your JVM and in any case, I would test it based on your
>>> workload.
>>>
>>> What's your experience of running Cassandra in k8s. Are you using the
>>> Cassandra Kubernetes Operator?
>>>
>>> How do you monitor it and how do you perform disaster recovery backup?
>>>
>>>
>>> Best,
>>>
>>> Sergio
>>>
>>> On Fri, Nov 1, 2019 at 2:14 PM Ben Mills  wrote:
>>>
 Thanks Sergio - that's good advice and we have this built into the
 plan.
 Have you heard a solid/consistent recommendation/requirement as to the
 amount of memory heap requires for G1GC?

 On Fri, Nov 1, 2019 at 5:11 PM Sergio 
 wrote:

> In any case I would test with tlp-stress or Cassandra stress tool any
> configuration
>
> Sergio
>
> On Fri, Nov 1, 2019, 12:31 PM Ben Mills  wrote:
>
>> Greetings,
>>
>> We are planning a Cassandra upgrade from 3.7 to 3.11.5 and
>> considering a change to the GC config.
>>
>> What is the minimum amount of memory that needs to be allocated to
>> heap space when using G1GC?
>>
>> For GC, we currently use CMS. Along with the version upgrade, we'll
>> be running the stateful set of Cassandra pods on new machine types in a 
>> new
>> node pool with 12Gi memory per node. Not a lot of memory but an
>> improvement. We may be able to go up to 16Gi memory per node. We'd like 
>> to
>> continue using these heap 

Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

2019-11-04 Thread Ben Mills
Hi (again) Sergio,

I forgot to note that along with Prometheus, we use Grafana (with
Prometheus as its data source) as well as Stackdriver for monitoring.

As Stackdriver is still developing (i.e. does not have all the features we
need), we tend to use it for the basics (i.e. monitoring and alerting on
memory, cpu and disk (PVs) thresholds). More specifically, the
Prometheus JMX exporter (noted above) scrapes all the MBeans inside
Cassandra, exporting in the Prometheus data model. Its config map filters
(allows) our metrics of interest, and those metrics are sent to our Grafana
instances and to Stackdriver. We use Grafana for more advanced metric
configs that provide deeper insight in Cassandra - e.g. read/write
latencies and so forth. For monitoring memory utilization, we monitor both
pod-level in Stackdriver (i.e. to avoid having a Cassandra pod oomkilled by
kubelet) as well as inside the JVM (heap space).
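
Roughly what that filtering looks like in the prometheus/jmx_exporter config format (the MBean patterns here are assumptions for illustration, not our actual allow-list):

# illustrative jmx_exporter config fragment
lowercaseOutputName: true
whitelistObjectNames:
  - "org.apache.cassandra.metrics:type=ClientRequest,*"
  - "org.apache.cassandra.metrics:type=Table,*"
rules:
  - pattern: ".*"   # export whatever the allow-list lets through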

Hope this helps.

On Mon, Nov 4, 2019 at 8:26 AM Ben Mills  wrote:

> Hi Sergio,
>
> Thanks for this and sorry for the slow reply.
>
> We are indeed still running Java 8 and so it's very helpful.
>
> This Cassandra cluster has been running reliably in Kubernetes for several
> years, and while we've had some repair-related issues, they are not related
> to container orchestration or the cloud environment. We don't use operators
> and have simply built the needed Kubernetes configs (YAML manifests) to
> handle deployment of new Docker images (when needed), and so forth. We have:
>
> (1) ConfigMap - Cassandra environment variables
> (2) ConfigMap - Prometheus configs for this JMX exporter
> <https://github.com/prometheus/jmx_exporter>, which is built into the
> image and runs as a Java agent
> (3) PodDisruptionBudget - with minAvailable: 2 as the important setting
> (4) Service - this is a headless service (clusterIP: None) which specifies
> the ports for cql, jmx, prometheus, intra-node
> (5) StatefulSet - 3 replicas, ports, health checks, resources, etc - as
> you would expect
>
> We store data on persistent volumes using an SSD storage class, and use:
> an updateStrategy of OnDelete, some affinity rules to ensure an even
> spread of pods across our zones, Prometheus annotations for scraping the
> metrics port, a nodeSelector and tolerations to ensure the Cassandra pods
> run in their dedicated node pool, and a preStop hook that runs nodetool
> drain to help with graceful shutdown when a pod is rolled.
>
> I'm guessing your installation is much larger than ours and so operators
> may be a good way to go. For our needs the above has been very reliable as
> has GCP in general.
>
> We are currently updating our backup/restore implementation to provide
> better granularity with respect to restoring a specific keyspace and also
> exploring Velero for DR.
>
> Hope this helps.
>
>
> On Fri, Nov 1, 2019 at 5:34 PM Sergio  wrote:
>
>> Hi Ben,
>>
>> Well, I had a similar question and Jon Haddad was preferring ParNew + CMS
>> over G1GC for java 8.
>> https://lists.apache.org/thread.html/283547619b1dcdcddb80947a45e2178158394e317f3092b8959ba879@%3Cuser.cassandra.apache.org%3E
>> It depends on your JVM and in any case, I would test it based on your
>> workload.
>>
>> What's your experience of running Cassandra in k8s. Are you using the
>> Cassandra Kubernetes Operator?
>>
>> How do you monitor it and how do you perform disaster recovery backup?
>>
>>
>> Best,
>>
>> Sergio
>>
>> On Fri, Nov 1, 2019 at 2:14 PM Ben Mills  wrote:
>>
>>> Thanks Sergio - that's good advice and we have this built into the plan.
>>> Have you heard a solid/consistent recommendation/requirement as to the
>>> amount of memory heap requires for G1GC?
>>>
>>> On Fri, Nov 1, 2019 at 5:11 PM Sergio  wrote:
>>>
 In any case I would test with tlp-stress or Cassandra stress tool any
 configuration

 Sergio

 On Fri, Nov 1, 2019, 12:31 PM Ben Mills  wrote:

> Greetings,
>
> We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering
> a change to the GC config.
>
> What is the minimum amount of memory that needs to be allocated to
> heap space when using G1GC?
>
> For GC, we currently use CMS. Along with the version upgrade, we'll be
> running the stateful set of Cassandra pods on new machine types in a new
> node pool with 12Gi memory per node. Not a lot of memory but an
> improvement. We may be able to go up to 16Gi memory per node. We'd like to
> continue using these heap settings:
>
> -XX:+UnlockExperimentalVMOptions
> -XX:+UseCGroupMemoryLimitForHeap
> -XX:MaxRAMFraction=2
>
> which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half
> of total available).
>
> Here are some details on the environment and configs in the event that
> something is relevant.
>
> Environment: Kubernetes
> Environment Config: Stateful set of 3 replicas
> 

Re: ***UNCHECKED*** Re: Memory Recommendations for G1GC

2019-11-04 Thread Ben Mills
Hi Sergio,

Thanks for this and sorry for the slow reply.

We are indeed still running Java 8 and so it's very helpful.

This Cassandra cluster has been running reliably in Kubernetes for several
years, and while we've had some repair-related issues, they are not related
to container orchestration or the cloud environment. We don't use operators
and have simply built the needed Kubernetes configs (YAML manifests) to
handle deployment of new Docker images (when needed), and so forth. We have:

(1) ConfigMap - Cassandra environment variables
(2) ConfigMap - Prometheus configs for this JMX exporter
<https://github.com/prometheus/jmx_exporter>, which is built into the image
and runs as a Java agent
(3) PodDisruptionBudget - with minAvailable: 2 as the important setting
(4) Service - this is a headless service (clusterIP: None) which specifies
the ports for cql, jmx, prometheus, intra-node
(5) StatefulSet - 3 replicas, ports, health checks, resources, etc - as you
would expect

We store data on persistent volumes using an SSD storage class, and use: an
updateStrategy of OnDelete, some affinity rules to ensure an even spread of
pods across our zones, Prometheus annotations for scraping the metrics
port, a nodeSelector and tolerations to ensure the Cassandra pods run in
their dedicated node pool, and a preStop hook that runs nodetool drain to
help with graceful shutdown when a pod is rolled.
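
The preStop piece is a standard container lifecycle hook; the relevant StatefulSet container fragment looks roughly like this (exact grace period and shell path are up to you):

# container spec fragment (illustrative)
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "nodetool drain"]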

I'm guessing your installation is much larger than ours and so operators
may be a good way to go. For our needs the above has been very reliable as
has GCP in general.

We are currently updating our backup/restore implementation to provide
better granularity with respect to restoring a specific keyspace and also
exploring Velero for DR.

Hope this helps.


On Fri, Nov 1, 2019 at 5:34 PM Sergio  wrote:

> Hi Ben,
>
> Well, I had a similar question and Jon Haddad was preferring ParNew + CMS
> over G1GC for java 8.
> https://lists.apache.org/thread.html/283547619b1dcdcddb80947a45e2178158394e317f3092b8959ba879@%3Cuser.cassandra.apache.org%3E
> It depends on your JVM and in any case, I would test it based on your
> workload.
>
> What's your experience of running Cassandra in k8s. Are you using the
> Cassandra Kubernetes Operator?
>
> How do you monitor it and how do you perform disaster recovery backup?
>
>
> Best,
>
> Sergio
>
> On Fri, Nov 1, 2019 at 2:14 PM Ben Mills  wrote:
>
>> Thanks Sergio - that's good advice and we have this built into the plan.
>> Have you heard a solid/consistent recommendation/requirement as to the
>> amount of memory heap requires for G1GC?
>>
>> On Fri, Nov 1, 2019 at 5:11 PM Sergio  wrote:
>>
>>> In any case I would test with tlp-stress or Cassandra stress tool any
>>> configuration
>>>
>>> Sergio
>>>
>>> On Fri, Nov 1, 2019, 12:31 PM Ben Mills  wrote:
>>>
 Greetings,

 We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering
 a change to the GC config.

 What is the minimum amount of memory that needs to be allocated to heap
 space when using G1GC?

 For GC, we currently use CMS. Along with the version upgrade, we'll be
 running the stateful set of Cassandra pods on new machine types in a new
 node pool with 12Gi memory per node. Not a lot of memory but an
 improvement. We may be able to go up to 16Gi memory per node. We'd like to
 continue using these heap settings:

 -XX:+UnlockExperimentalVMOptions
 -XX:+UseCGroupMemoryLimitForHeap
 -XX:MaxRAMFraction=2

 which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half
 of total available).

 Here are some details on the environment and configs in the event that
 something is relevant.

 Environment: Kubernetes
 Environment Config: Stateful set of 3 replicas
 Storage: Persistent Volumes
 Storage Class: SSD
 Node OS: Container-Optimized OS
 Container OS: Ubuntu 16.04.3 LTS
 Data Centers: 1
 Racks: 3 (one per zone)
 Nodes: 3
 Tokens: 4
 Replication Factor: 3
 Replication Strategy: NetworkTopologyStrategy (all keyspaces)
 Compaction Strategy: STCS (all tables)
 Read/Write Requirements: Blend of both
 Data Load: <1GB per node
 gc_grace_seconds: default (10 days - all tables)

 GC Settings: (CMS)

 -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC
 -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=8
 -XX:MaxTenuringThreshold=1
 -XX:CMSInitiatingOccupancyFraction=75
 -XX:+UseCMSInitiatingOccupancyOnly
 -XX:CMSWaitDuration=3
 -XX:+CMSParallelInitialMarkEnabled
 -XX:+CMSEdenChunksRecordAlways

 Any ideas are much appreciated.

>>>


***UNCHECKED*** Re: Memory Recommendations for G1GC

2019-11-01 Thread Sergio
Hi Ben,

Well, I had a similar question and Jon Haddad was preferring ParNew + CMS
over G1GC for java 8.
https://lists.apache.org/thread.html/283547619b1dcdcddb80947a45e2178158394e317f3092b8959ba879@%3Cuser.cassandra.apache.org%3E
It depends on your JVM and in any case, I would test it based on your
workload.

What's your experience of running Cassandra in k8s. Are you using the
Cassandra Kubernetes Operator?

How do you monitor it and how do you perform disaster recovery backup?


Best,

Sergio

On Fri, Nov 1, 2019 at 2:14 PM Ben Mills  wrote:

> Thanks Sergio - that's good advice and we have this built into the plan.
> Have you heard a solid/consistent recommendation/requirement as to the
> amount of memory heap requires for G1GC?
>
> On Fri, Nov 1, 2019 at 5:11 PM Sergio  wrote:
>
>> In any case I would test with tlp-stress or Cassandra stress tool any
>> configuration
>>
>> Sergio
>>
>> On Fri, Nov 1, 2019, 12:31 PM Ben Mills  wrote:
>>
>>> Greetings,
>>>
>>> We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering a
>>> change to the GC config.
>>>
>>> What is the minimum amount of memory that needs to be allocated to heap
>>> space when using G1GC?
>>>
>>> For GC, we currently use CMS. Along with the version upgrade, we'll be
>>> running the stateful set of Cassandra pods on new machine types in a new
>>> node pool with 12Gi memory per node. Not a lot of memory but an
>>> improvement. We may be able to go up to 16Gi memory per node. We'd like to
>>> continue using these heap settings:
>>>
>>> -XX:+UnlockExperimentalVMOptions
>>> -XX:+UseCGroupMemoryLimitForHeap
>>> -XX:MaxRAMFraction=2
>>>
>>> which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half of
>>> total available).
>>>
>>> Here are some details on the environment and configs in the event that
>>> something is relevant.
>>>
>>> Environment: Kubernetes
>>> Environment Config: Stateful set of 3 replicas
>>> Storage: Persistent Volumes
>>> Storage Class: SSD
>>> Node OS: Container-Optimized OS
>>> Container OS: Ubuntu 16.04.3 LTS
>>> Data Centers: 1
>>> Racks: 3 (one per zone)
>>> Nodes: 3
>>> Tokens: 4
>>> Replication Factor: 3
>>> Replication Strategy: NetworkTopologyStrategy (all keyspaces)
>>> Compaction Strategy: STCS (all tables)
>>> Read/Write Requirements: Blend of both
>>> Data Load: <1GB per node
>>> gc_grace_seconds: default (10 days - all tables)
>>>
>>> GC Settings: (CMS)
>>>
>>> -XX:+UseParNewGC
>>> -XX:+UseConcMarkSweepGC
>>> -XX:+CMSParallelRemarkEnabled
>>> -XX:SurvivorRatio=8
>>> -XX:MaxTenuringThreshold=1
>>> -XX:CMSInitiatingOccupancyFraction=75
>>> -XX:+UseCMSInitiatingOccupancyOnly
>>> -XX:CMSWaitDuration=3
>>> -XX:+CMSParallelInitialMarkEnabled
>>> -XX:+CMSEdenChunksRecordAlways
>>>
>>> Any ideas are much appreciated.
>>>
>>


Re: Memory Recommendations for G1GC

2019-11-01 Thread Ben Mills
Thanks Sergio - that's good advice and we have this built into the plan.
Have you heard a solid/consistent recommendation/requirement as to the
amount of memory heap requires for G1GC?

On Fri, Nov 1, 2019 at 5:11 PM Sergio  wrote:

> In any case I would test with tlp-stress or Cassandra stress tool any
> configuration
>
> Sergio
>
> On Fri, Nov 1, 2019, 12:31 PM Ben Mills  wrote:
>
>> Greetings,
>>
>> We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering a
>> change to the GC config.
>>
>> What is the minimum amount of memory that needs to be allocated to heap
>> space when using G1GC?
>>
>> For GC, we currently use CMS. Along with the version upgrade, we'll be
>> running the stateful set of Cassandra pods on new machine types in a new
>> node pool with 12Gi memory per node. Not a lot of memory but an
>> improvement. We may be able to go up to 16Gi memory per node. We'd like to
>> continue using these heap settings:
>>
>> -XX:+UnlockExperimentalVMOptions
>> -XX:+UseCGroupMemoryLimitForHeap
>> -XX:MaxRAMFraction=2
>>
>> which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half of
>> total available).
>>
>> Here are some details on the environment and configs in the event that
>> something is relevant.
>>
>> Environment: Kubernetes
>> Environment Config: Stateful set of 3 replicas
>> Storage: Persistent Volumes
>> Storage Class: SSD
>> Node OS: Container-Optimized OS
>> Container OS: Ubuntu 16.04.3 LTS
>> Data Centers: 1
>> Racks: 3 (one per zone)
>> Nodes: 3
>> Tokens: 4
>> Replication Factor: 3
>> Replication Strategy: NetworkTopologyStrategy (all keyspaces)
>> Compaction Strategy: STCS (all tables)
>> Read/Write Requirements: Blend of both
>> Data Load: <1GB per node
>> gc_grace_seconds: default (10 days - all tables)
>>
>> GC Settings: (CMS)
>>
>> -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC
>> -XX:+CMSParallelRemarkEnabled
>> -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=1
>> -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:CMSWaitDuration=3
>> -XX:+CMSParallelInitialMarkEnabled
>> -XX:+CMSEdenChunksRecordAlways
>>
>> Any ideas are much appreciated.
>>
>


Re: Memory Recommendations for G1GC

2019-11-01 Thread Ben Mills
Thanks Reid,

We currently only have ~1GB data per node with a replication factor of 3.
The amount of data will certainly grow, though I have no solid projections
at this time. The current memory and CPU resources are quite low (for
Cassandra) and so along with the upgrade we plan to increase both. This
seems to be the strong recommendation from this user group.

On Fri, Nov 1, 2019 at 4:52 PM Reid Pinchback 
wrote:

> Maybe I’m missing something.  You’re expecting less than 1 gig of data per
> node?  Unless this is some situation of super-high data churn/brief TTL, it
> sounds like you’ll end up with your entire database in memory.
>
>
>
> *From: *Ben Mills 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Friday, November 1, 2019 at 3:31 PM
> *To: *"user@cassandra.apache.org" 
> *Subject: *Memory Recommendations for G1GC
>
>
>
> Greetings,
>
>
>
> We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering a
> change to the GC config.
>
>
>
> What is the minimum amount of memory that needs to be allocated to heap
> space when using G1GC?
>
>
>
> For GC, we currently use CMS. Along with the version upgrade, we'll be
> running the stateful set of Cassandra pods on new machine types in a new
> node pool with 12Gi memory per node. Not a lot of memory but an
> improvement. We may be able to go up to 16Gi memory per node. We'd like to
> continue using these heap settings:
>
>
> -XX:+UnlockExperimentalVMOptions
> -XX:+UseCGroupMemoryLimitForHeap
> -XX:MaxRAMFraction=2
>
>
>
> which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half of
> total available).
>
>
>
> Here are some details on the environment and configs in the event that
> something is relevant.
>
>
>
> Environment: Kubernetes
> Environment Config: Stateful set of 3 replicas
> Storage: Persistent Volumes
> Storage Class: SSD
> Node OS: Container-Optimized OS
> Container OS: Ubuntu 16.04.3 LTS
> Data Centers: 1
> Racks: 3 (one per zone)
> Nodes: 3
> Tokens: 4
> Replication Factor: 3
> Replication Strategy: NetworkTopologyStrategy (all keyspaces)
> Compaction Strategy: STCS (all tables)
> Read/Write Requirements: Blend of both
> Data Load: <1GB per node
> gc_grace_seconds: default (10 days - all tables)
>
> GC Settings: (CMS)
>
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=3
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways
>
>
>
> Any ideas are much appreciated.
>


Re: Memory Recommendations for G1GC

2019-11-01 Thread Sergio
In any case I would test with tlp-stress or Cassandra stress tool any
configuration

Sergio

On Fri, Nov 1, 2019, 12:31 PM Ben Mills  wrote:

> Greetings,
>
> We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering a
> change to the GC config.
>
> What is the minimum amount of memory that needs to be allocated to heap
> space when using G1GC?
>
> For GC, we currently use CMS. Along with the version upgrade, we'll be
> running the stateful set of Cassandra pods on new machine types in a new
> node pool with 12Gi memory per node. Not a lot of memory but an
> improvement. We may be able to go up to 16Gi memory per node. We'd like to
> continue using these heap settings:
>
> -XX:+UnlockExperimentalVMOptions
> -XX:+UseCGroupMemoryLimitForHeap
> -XX:MaxRAMFraction=2
>
> which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half of
> total available).
>
> Here are some details on the environment and configs in the event that
> something is relevant.
>
> Environment: Kubernetes
> Environment Config: Stateful set of 3 replicas
> Storage: Persistent Volumes
> Storage Class: SSD
> Node OS: Container-Optimized OS
> Container OS: Ubuntu 16.04.3 LTS
> Data Centers: 1
> Racks: 3 (one per zone)
> Nodes: 3
> Tokens: 4
> Replication Factor: 3
> Replication Strategy: NetworkTopologyStrategy (all keyspaces)
> Compaction Strategy: STCS (all tables)
> Read/Write Requirements: Blend of both
> Data Load: <1GB per node
> gc_grace_seconds: default (10 days - all tables)
>
> GC Settings: (CMS)
>
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=3
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways
>
> Any ideas are much appreciated.
>


Re: Memory Recommendations for G1GC

2019-11-01 Thread Reid Pinchback
Maybe I’m missing something.  You’re expecting less than 1 gig of data per 
node?  Unless this is some situation of super-high data churn/brief TTL, it 
sounds like you’ll end up with your entire database in memory.

From: Ben Mills 
Reply-To: "user@cassandra.apache.org" 
Date: Friday, November 1, 2019 at 3:31 PM
To: "user@cassandra.apache.org" 
Subject: Memory Recommendations for G1GC

Greetings,

We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering a change 
to the GC config.

What is the minimum amount of memory that needs to be allocated to heap space 
when using G1GC?

For GC, we currently use CMS. Along with the version upgrade, we'll be running 
the stateful set of Cassandra pods on new machine types in a new node pool with 
12Gi memory per node. Not a lot of memory but an improvement. We may be able to 
go up to 16Gi memory per node. We'd like to continue using these heap settings:

-XX:+UnlockExperimentalVMOptions
-XX:+UseCGroupMemoryLimitForHeap
-XX:MaxRAMFraction=2

which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half of total 
available).

Here are some details on the environment and configs in the event that 
something is relevant.

Environment: Kubernetes
Environment Config: Stateful set of 3 replicas
Storage: Persistent Volumes
Storage Class: SSD
Node OS: Container-Optimized OS
Container OS: Ubuntu 16.04.3 LTS
Data Centers: 1
Racks: 3 (one per zone)
Nodes: 3
Tokens: 4
Replication Factor: 3
Replication Strategy: NetworkTopologyStrategy (all keyspaces)
Compaction Strategy: STCS (all tables)
Read/Write Requirements: Blend of both
Data Load: <1GB per node
gc_grace_seconds: default (10 days - all tables)

GC Settings: (CMS)

-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=3
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways

Any ideas are much appreciated.


Memory Recommendations for G1GC

2019-11-01 Thread Ben Mills
Greetings,

We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering a
change to the GC config.

What is the minimum amount of memory that needs to be allocated to heap
space when using G1GC?

For GC, we currently use CMS. Along with the version upgrade, we'll be
running the stateful set of Cassandra pods on new machine types in a new
node pool with 12Gi memory per node. Not a lot of memory but an
improvement. We may be able to go up to 16Gi memory per node. We'd like to
continue using these heap settings:

-XX:+UnlockExperimentalVMOptions
-XX:+UseCGroupMemoryLimitForHeap
-XX:MaxRAMFraction=2

which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half of
total available).

Here are some details on the environment and configs in the event that
something is relevant.

Environment: Kubernetes
Environment Config: Stateful set of 3 replicas
Storage: Persistent Volumes
Storage Class: SSD
Node OS: Container-Optimized OS
Container OS: Ubuntu 16.04.3 LTS
Data Centers: 1
Racks: 3 (one per zone)
Nodes: 3
Tokens: 4
Replication Factor: 3
Replication Strategy: NetworkTopologyStrategy (all keyspaces)
Compaction Strategy: STCS (all tables)
Read/Write Requirements: Blend of both
Data Load: <1GB per node
gc_grace_seconds: default (10 days - all tables)

GC Settings: (CMS)

-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=3
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways

Any ideas are much appreciated.
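
For reference, the G1 block that ships commented out in Cassandra 3.11's jvm.options is roughly the following (a sketch, so verify against your own jvm.options; heap size still has to come from somewhere, e.g. -Xms/-Xmx or the MaxRAMFraction approach above):

-XX:+UseG1GC
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:MaxGCPauseMillis=500
## optional and workload-dependent (assumption, not a recommendation):
# -XX:InitiatingHeapOccupancyPercent=70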