Hello,
our top contributor from a data volume perspective is time series data. We have
been running STCS since our initial production deployment in 2014, with several
clusters and a varying number of nodes, currently with a maximum of 9 nodes per
single cluster per region in AWS with
Hello,
we have a test (regression) environment hosted in AWS, which is used for
auto-deploying our software on a daily basis and attaching constant load across
all deployments, basically to allow us to detect any regressions in our
software on a daily basis.
On the Cassandra-side, this is
Hi,
usually automatic minor compactions are fine, but you may need much more free
disk space headroom for automatic minor compactions to reclaim space,
especially in a time series use case with the size-tiered compaction strategy
(possibly with leveled as well, I'm not familiar with this strategy
Hello,
we are currently in the process of upgrading from 2.1.18 to 3.0.14. After
upgrading a few test environments, we started to see some suspicious log
entries regarding repair issues.
We have a cron job on all nodes basically executing the following repair call
on a daily basis:
nodetool
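A minimal sketch of such a daily cron entry, with hypothetical keyspace/table
names:

    # /etc/cron.d/cassandra-repair (hypothetical): full repair of this node's
    # primary ranges, every night at 02:00
    0 2 * * * cassandra nodetool repair -full -pr myks mycf1 mycf2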
when nodetool or the logs show that repair is over (which will include the
anticompaction phase).
Cheers,
On Fri, Sep 15, 2017 at 8:42 AM Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Hello,
we are currently in the process of upgrading from 2.1.18 to 3.0.14.
in 3.0 in context of CPU/GC and not disk savings?
Thanks,
Thomas
From: Steinmaurer, Thomas [mailto:thomas.steinmau...@dynatrace.com]
Sent: Friday, 15 September 2017 13:51
To: user@cassandra.apache.org
Subject: RE: GC/CPU increase after upgrading to 3.0.14 (from 2.1.18)
Hi Jeff,
we are using
Cheers,
On Fri, Sep 15, 2017 at 10:27 AM Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Hi Alex,
thanks a lot. Somehow missed that incremental repairs are the default now.
We have been
Hi,
additionally, with saved (key) caches we once had some sort of corruption (for
whatever reason). So, if you see something like this upon Cassandra startup:
INFO [main] 2017-01-04 15:38:58,772 AutoSavingCache.java (line 114) reading
saved cache
users may want to keep
running full repairs without the additional cost of anti-compaction.
Would you mind opening a ticket for this?
2017-09-19 1:33 GMT-05:00 Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>:
> Hi Kurt,
>
> thanks for the link!
In addition to Kurt's reply: double disk usage is really the worst case. Most of
the time you are fine having more free disk than your largest column family.
Also take local snapshots into account. Even after a finished major compaction,
disk space may not have been reclaimed, if snapshot hard links
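If stale snapshots are what is pinning the disk space, they can be listed and
removed at runtime (snapshot tag hypothetical):

    nodetool listsnapshots                 # per-snapshot true disk usage
    nodetool clearsnapshot -t hourly-0200  # drop one snapshot by tag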
Nandan,
you may find the following useful.
Slideshare:
https://www.slideshare.net/DataStax/apache-cassandra-multidatacenter-essentials-julien-anguenot-iland-internet-solutions-c-summit-2016
Youtube:
https://www.youtube.com/watch?v=G6od16YKSsA
From a client perspective, if you are targeting
wish to do this, you'll have to mark back all your sstables to
unrepaired, using nodetool
sstablerepairedset<https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsSSTableRepairedSet.html>.
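A sketch of that procedure, assuming hypothetical paths and that Cassandra is
stopped on the node while the tool runs:

    # collect the table's sstables, then mark them back to unrepaired
    find /var/lib/cassandra/data/myks -name '*-Data.db' > /tmp/sstables.txt
    sstablerepairedset --really-set --is-unrepaired -f /tmp/sstables.txt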
Hi,
within the default hint window of 3 hours, the hinted handoff mechanism should
take care of that, but we have seen that failing from time to time (depending
on the load) in 2.1, with some sort of tombstone-related issues causing failing
requests on the system hints table. So, watch out for any
QUORUM should succeed with a RF=3 and 2 of 3 nodes available.
Modern client drivers also have ways to “downgrade” the CL of requests, in case
they fail. E.g. for the Java driver:
G1 suggested settings
http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMTAvOC8tLWdjLmxvZy4wLmN1cnJlbnQtLTE5LTExLTE3
@Steinmaurer, Thomas If this happens very frequently within a very short time,
then depending on your allocation rate in MB/s, a combination of the G1 bug and
a small heap might result
https://issues.apache.org/jira/browse/CASSANDRA-13900. Feel free
to request any further information on the ticket.
Unfortunately this is a real show-stopper for us upgrading to 3.0.
Thanks for your attention.
Thomas
From: Steinmaurer, Thomas [mailto:thomas.steinmau...@dynatrace.com]
Sent
Dan,
do you see any major GC? We have been hit by the following memory leak in our
loadtest environment with 3.11.0.
https://issues.apache.org/jira/browse/CASSANDRA-13754
So, depending on the heap size and uptime, you might get into heap troubles.
Thomas
From: Dan Kinder
Hello,
we were facing a memory leak with 3.11.0
(https://issues.apache.org/jira/browse/CASSANDRA-13754) thus upgraded our
loadtest environment to a snapshot build of 3.11.1. Having it running for > 48
hrs now, we still see a steady increase on heap utilization.
Eclipse memory analyzer shows
Hi,
half of the free space does not make sense. Imagine your SSTables need 100G of
space and you have 20G of free disk: compaction won't be able to do its job
with 10G. Half of the total disk makes more sense, and is what you need for a
major compaction in the worst case.
Thomas
From: Peng Xiao
Hello Justin,
yes, but in the real world this is hard to accomplish for high-volume column
families >= 3-digit GB. Even with the default 10-day grace period, completing a
full repair can become a real challenge. ☺
Possibly back again to the discussion that incremental repair has some flaws,
full
situation after upgrading from 2.1.14 to 3.11 in our
production.
Have you already tried G1GC instead of CMS? Our timeouts were mitigated after
replacing CMS with G1GC.
Thanks.
2017-09-25 20:01 GMT+09:00 Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>:
Hello,
I have now some concrete numbers from our 9 node loadtest cluster with constant
load, same infrastructure aft
Side-note: At least with 2.1 (or even later), be aware that you might run into
the following issue:
https://issues.apache.org/jira/browse/CASSANDRA-11155
We are doing cron-job based hourly snapshots in production and have tried to
also run cleanup after extending a cluster from 6 to 9 nodes.
Marshall,
-pr should not be used with incremental repairs, which is the default since
2.2. But even when used with full repairs (-full option), this will cause
trouble when running nodetool repair -pr from several nodes concurrently. So,
unfortunately, this does not seem to work anymore and
Hi,
although not happening here with Cassandra (due to using CMS), we had some
weird problems with our server application, e.g. being hit by the following
JVM/G1 bugs:
https://bugs.openjdk.java.net/browse/JDK-8140597
https://bugs.openjdk.java.net/browse/JDK-8141402 (more or less a duplicate of
above)
read requests while running nodetool repair
You can accomplish this by manually tweaking the values in the dynamic snitch
mbean so other nodes won’t select it for reads
--
Jeff Jirsa
On Oct 18, 2017, at 3:24 AM, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
Hello,
due to performance/latency reasons, we are currently reading and writing time
series data at consistency level ONE/ANY.
In case of a node being down and recovering after the default hinted handoff
window of 3 hrs, we may potentially read stale data from the recovering node.
Of course,
Hello,
I know that Cassandra is built for scale out on commodity hardware, but I
wonder if anyone can share some experience when running Cassandra on rather
capable machines.
Let's say we have a 3 node cluster with 128G RAM, 32 physical cores (16 per CPU
socket), Large Raid with Spinning
Latest DSE is based on 3.11 (possibly due to CASSANDRA-12269, but just a guess).
For us (only), to be honest, none of 3.0+/3.11+ qualifies for production when
you are familiar with having 2.1 in production.
- 3.0 needs more hardware resources to handle the same load =>
Hi Sam,
in our pre-production stages, we are running Cassandra 3.0 and 3.11 with 8u172
(previously u102 then u162) without any visible troubles/regressions.
In case of Cassandra 3.11, you need 3.11.2 due to:
https://issues.apache.org/jira/browse/CASSANDRA-14173. Cassandra 3.0 is not
affected
Hello,
on a 3 node loadtest cluster with very capable machines (32 physical cores,
512G RAM, 20T storage (26 disk RAID)), I'm trying to max out compaction, thus
currently testing with:
concurrent_compactors: 16
compaction_throughput_mb_per_sec: 0
With our simulated incoming load + compaction
Sorry, should have first looked at the source code. In case of 0, it is set to
Double.MAX_VALUE.
Thomas
From: Steinmaurer, Thomas [mailto:thomas.steinmau...@dynatrace.com]
Sent: Monday, 11 June 2018 08:53
To: user@cassandra.apache.org
Subject: compaction_throughput: Difference between 0
Explicitly setting Xmn with G1 basically overrides the target pause-time goal
and thus should be avoided.
http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
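Illustrative jvm.options / cassandra-env.sh lines (values are examples only);
give G1 a pause goal and let it size the young generation itself:

    -XX:+UseG1GC
    -XX:MaxGCPauseMillis=500
    # do not add -Xmn / -XX:NewSize here; with G1 it overrides the pause goal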
Thomas
From: rajpal reddy [mailto:rajpalreddy...@gmail.com]
Sent: Wednesday, 13 June 2018 17:27
To:
Hi Kurt,
thanks for pointing me to the Xmx issue.
JIRA + patch (for Linux only based on C* 3.11) for the parallel GC thread issue
is available here: https://issues.apache.org/jira/browse/CASSANDRA-14475
Thanks,
Thomas
From: kurt greaves [mailto:k...@instaclustr.com]
Sent: Tuesday, 29 May
heap size by default will be 256 MB, which isn't hugely
problematic, and it's unlikely more than that would get allocated.
On 29 May 2018 at 09:29, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Hello,
most likely obvious and perhaps already answered in the past, but just want to
be sure ...
E.g. I have set:
concurrent_compactors: 4
compaction_throughput_mb_per_sec: 16
I guess this will lead to ~4 MB/s per thread if I have 4 compactions running in
parallel?
So, in case of upscaling
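For what it's worth, the cap is applied per node across all compactor threads
(hence the ~4 MB/s per thread above), and it can be inspected/changed at
runtime, e.g. after upscaling (value as an example):

    nodetool getcompactionthroughput     # current cap in MB/s
    nodetool setcompactionthroughput 64  # raise the cap without a restart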
Jeff,
FWIW, when talking about https://issues.apache.org/jira/browse/CASSANDRA-13929,
there is a patch available since March without getting further attention.
Regards,
Thomas
From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Tuesday, 5 June 2018 00:51
To: cassandra
Subject: Re: 3.11.2
Hello,
on a quite capable machine with 32 physical cores (64 vCPUs) we see sporadic
CPU usage of up to 50% caused by nodetool on this box, thus I dug a bit further.
A few observations:
1) nodetool is reusing the $MAX_HEAP_SIZE environment variable, thus if we are
running Cassandra with e.g.
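Assuming nodetool really inherits $MAX_HEAP_SIZE from cassandra-env.sh as
described above, a hypothetical workaround is overriding the variable just for
the nodetool invocation:

    # give nodetool a small dedicated heap instead of the server-sized one
    MAX_HEAP_SIZE=128M nodetool compactionstats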
Hello,
has anybody already some experience/results if a patched Linux kernel regarding
Meltdown/Spectre is affecting performance of Cassandra negatively?
In production, all nodes running in AWS with m4.xlarge, we see up to a 50%
relative (e.g. AVG CPU from 40% => 60%) CPU increase since Jan 4,
AM, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Quick follow up.
Others in AWS reporting/seeing something similar, e.g.:
https://twitter.com/BenBromhead/status/950245250504601600
So, while we have seen a relative CPU incr
and not production though), thus more or less double patched now. The
additional CPU impact of OS/VM level kernel patching is more or less
negligible, so this looks highly hypervisor-related.
Regards,
Thomas
From: Steinmaurer, Thomas [mailto:thomas.steinmau...@dynatrace.com]
Sent: Friday, 5 January 2018
Hello,
we are running 2.1.18 with vnodes in production and due to
(https://issues.apache.org/jira/browse/CASSANDRA-11155) we can't run cleanup
e.g. after extending the cluster without blocking our hourly snapshots.
What options do we have to get rid of partitions a node does not own anymore?
On Thu, Jan 18, 2018 at 2:32 AM, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Sam,
thanks for the confirmation. Going back to u152 then.
Thomas
From: li...@beobal.com
Hello,
after switching from JDK8u152 to JDK8u162, Cassandra fails with the following
stack trace upon startup.
ERROR [main] 2018-01-18 07:33:18,804 CassandraDaemon.java:706 - Exception
encountered during startup
java.lang.AbstractMethodError:
For what it’s worth, we (TLP) just posted some results comparing pre and post
meltdown statistics:
http://thelastpickle.com/blog/2018/01/10/meltdown-impact-on-latency.html
On Jan 10, 2018, at 1:57 AM, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
m4.xlarge instances do have PCID to my knowledge, but possibly we need a rather
new kernel (4.14). But I fail to see how this
on in 2.1 that triggered this and it wasn't worth fixing. If you
are triggering it easily maybe it is worth fixing in 2.1 as well. Does this
happen consistently? Can you provide some more logs on the JIRA or better yet a
way to reproduce?
On 14 January 2018 at 16:12, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
Did you start with a 9 node cluster from the beginning, or did you extend /
scale out your cluster (with vnodes) beyond the replication factor?
If the latter applies, and if you are deleting by explicit deletes and not via
TTL, then nodes might not see the deletes anymore, as a node might not own
of 3, then added another 3 nodes
and again another 3 nodes. So it is a good guess :)
But I have run both repair and cleanup against the table on all nodes, would
that not have removed any stray partitions?
On Thu, 1 Feb 2018 at 22:31, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com> wrote:
Stick with 31G in your case. Another article on compressed Oops:
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/
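Whether compressed oops are actually in effect for a given heap size can be
checked directly:

    java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
    # UseCompressedOops = true  -> still within the compressed-oops range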
Thomas
From: Eunsu Kim [mailto:eunsu.bil...@gmail.com]
Sent: Tuesday, 13 February 2018 08:09
To: user@cassandra.apache.org
Subject: if the heap
Jon,
eager to try it out. Just FYI: I followed the installation instructions on
http://cassandra-reaper.io/docs/download/install/ (Debian-based).
1) Importing the key results in:
XXX:~$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys
2895100917357435
Executing:
Hello,
we are running Cassandra in AWS and On-Premise at customer sites, currently 2.1
in production with 3.11 in loadtest.
In a migration path from 2.1 to 3.11.x, I'm afraid that at some point in time
we end up with incremental repairs being enabled / run for the first time
unintentionally, because:
Hello,
with 2.1, in case a second Cassandra process/instance is started on a host (by
accident), may this result in some sort of corruption, although Cassandra will
exit at some point in time due to not being able to bind TCP ports already in
use?
What we have seen in this scenario is
have gone to 2.1 in the first place, but
it just got missed. Very simple patch so I think a backport should be accepted.
On 7 August 2018 at 15:57, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Hello,
is it a known issue / limitation that cleanup compactions aren't counted in the
compaction remaining time?
nodetool compactionstats -H
pending tasks: 1
compaction type   keyspace   table   completed   total   unit   progress
Cleanup           XXX        YYY
incremental repair?
No flag currently exists. Probably a good idea considering the serious issues
with incremental repairs since forever, and the change of defaults since 3.0.
On 7 August 2018 at 16:44, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Hello,
we are r
Probably worth a JIRA (especially if you can repro in 3.0 or higher, since 2.1
is critical fixes only)
On Wed, Sep 5, 2018 at 10:46 PM Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
2.1 processes?
New ticket for backporting, referencing the existing.
On Mon., 13 Aug. 2018, 22:50 Steinmaurer, Thomas,
<thomas.steinmau...@dynatrace.com>
wrote:
Thanks Kurt.
What is the proper workflow here to get this accepted? Create a new ticket
dedicated for the backport refer
From: Jeff Jirsa
Sent: Monday, 10 September 2018 19:40
To: cassandra
Subject: Re: Drop TTLd rows: upgradesstables -a or scrub?
I think it's important to describe exactly what's going on for people who just
read the list but who don't have context. This blog does a really good job:
Hello,
is there a way to online-scrub a particular SSTable file only, and not the
entire column family?
According to the Cassandra logs, we have a corrupted SSTable that is smallish
compared to the entire data volume of the column family in question.
To my understanding, both nodetool scrub and
As far as I remember, in newer Cassandra versions with STCS, nodetool compact
offers a '-s' command-line option to split the output into files of 50%, 25%,
... in size, so in this case you don't end up with a single largish SSTable
anymore. By default, without -s, it is a single SSTable though.
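For example (keyspace/table names hypothetical):

    nodetool compact -s myks mycf   # major compaction with split output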
Thomas
From:
. September 2018 09:47
To: User
Subject: Re: Drop TTLd rows: upgradesstables -a or scrub?
On Tue, Sep 11, 2018 at 9:31 AM Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
o downgrade back to 152 then !
On 18 January 2018 at 08:34, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Hello,
when triggering a "nodetool cleanup" with Cassandra 3.11, the nodetool call
returns almost instantly and I see the following INFO log.
INFO [CompactionExecutor:54] 2018-01-22 12:59:53,903
CompactionManager.java:1777 - Compaction interrupted:
Hello,
Production, 9 node cluster with Cassandra 2.1.18, vnodes, default 256 tokens,
RF=3, compaction throttling = 16, concurrent compactors = 4, running in AWS
using m4.xlarge at ~ 35% CPU AVG
We have a nightly cronjob starting a "nodetool repair -pr ks cf1 cf2"
concurrently on all nodes,
Hi Kurt,
our provisioning layer allows extending a cluster only one-by-one, thus we
didn’t add multiple nodes at the same time.
What we did have was some sort of overlap between our daily repair cronjob and
the newly added node still being in the process of joining. Don't know if this
sort of
Hello,
yet another question/issue with repair.
Cassandra 2.1.18, 3 nodes, RF=3, vnode=256, data volume ~ 5G per node only. A
repair (nodetool repair -par) issued on a single node at this data volume takes
around 36min with an AVG of ~ 15MByte/s disk throughput (read+write) for the
entire
Hello,
while bootstrapping a new node into an existing cluster, a node which is acting
as source for streaming got restarted unfortunately. Since then, from nodetool
netstats I don't see any progress for this particular node anymore.
E.g.:
/X.X.X.X
Receiving 94 files, 260.09 GB total.
> From: Michael Shuler On Behalf Of Michael Shuler
> Sent: Friday, 21 September 2018 15:49
> To: user@cassandra.apache.org
> Subject: Re: Cassandra 2.1.21 ETA?
Hello,
is there an ETA for 2.1.21 containing the logback update (security
vulnerability fix)?
Thanks,
Thomas
> On 9/21/18 3:28 AM, Steinmaurer, Thomas wrote:
> >
> > is there an ETA for 2.1.21 containing the logback update (security
> > vulnerability fix)?
>
> Are you using SocketServer? Is your cluster firewalled?
>
> Feb 2018 2.1->3.11 commits noting this in NEWS.txt:
> ht
Hello,
is there a JMX metric for monitoring dropped hints as a counter/rate,
equivalent to what we see in Cassandra log, e.g.:
WARN [HintedHandoffManager:1] 2018-11-13 13:28:46,991
HintedHandoffMetrics.java:79 - /XXX has 18180 dropped hints, because node is
down past configured hint window.
org.apache.cassandra.metrics/DroppedMessage/HINT/Attributes/FiveMinuteRate
org.apache.cassandra.metrics/DroppedMessage/HINT/Attributes/FifteenMinuteRate
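Those attributes can be polled from the bean behind these paths, e.g. via the
third-party jmxterm CLI (jar name and JMX port are assumptions):

    echo 'get -b org.apache.cassandra.metrics:type=DroppedMessage,scope=HINT,name=Dropped Count FiveMinuteRate' \
      | java -jar jmxterm-uber.jar -l localhost:7199 -n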
Hayato
On Tue, 22 Jan 2019 at 07:45, Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
Hi,
I remember something about a client using the native protocol getting notified
too early about Cassandra being ready, due to the following issue:
https://issues.apache.org/jira/browse/CASSANDRA-8236
which looks similar, but above was marked as fixed in 2.2.
Thomas
From: Riccardo Ferrari
Sent:
Hello,
a Blackduck security scan of our product detected a security vulnerability in
the Apache Thrift library 0.9.2, which is shipped in Cassandra up to 3.11
(haven't checked 4.0), also pointed out here:
Alex,
any indications in Cassandra log about insufficient disk space during
compactions?
Thomas
From: Oleksandr Shulgin
Sent: Tuesday, 18 September 2018 10:01
To: User
Subject: Major compaction ignoring one SSTable? (was Re: Fresh SSTable files
(due to repair?) in a static table (was Re:
Hello,
any ideas regarding below, cause it happened again on a different node.
Thanks
Thomas
From: Steinmaurer, Thomas
Sent: Tuesday, 5 February 2019 23:03
To: user@cassandra.apache.org
Subject: Cassandra 2.1.18 - NPE during startup
Hello,
at a particular customer location, we are seeing the following NPE during
startup with Cassandra 2.1.18.
INFO [SSTableBatchOpen:2] 2019-02-03 13:32:56,131 SSTableReader.java:475 -
Opening
Per-second metrics might show CPU cores getting pegged.
I’m not sure that GC tuning eliminates this problem, but if it isn’t being
caused by that, GC tuning may at least improve the visibility of the underlying
problem.
From: "Steinmaurer, Thomas"
<thomas.steinmau...@dynatrace.com>
Hello,
after moving from 2.1.18 to 3.0.18, we are facing OOM situations after several
hours a node has successfully joined a cluster (via auto-bootstrap).
I have created the following ticket trying to describe the situation, including
hprof / MAT screens:
Hello,
sorry, I know, 3.0.19 has been released just recently. Any ETA for 3.0.20?
Reason is that we are having quite some pain with on-heap pressure after moving
from 2.1.18 to 3.0.18.
https://issues.apache.org/jira/browse/CASSANDRA-15400
Thanks a lot,
Thomas
Hello,
looks like 3.0.18 can't handle the same write ingest as 2.1.18 on the same
hardware. Basically, it looks like the write path, processing batch messages,
shows 10x higher numbers in regard to on-heap allocations.
I've tried to summarize the findings on the following ticket:
Hello,
using 2.1.8, 3 nodes (m4.10xlarge, EBS SSD-based), vnodes=256, RF=3, we are
trying to add a 4th node.
The two options I know of that mainly affect throughput, namely stream output
and compaction throttling, have been set to very high values (e.g. stream
output = 800 Mbit/s resp.
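Both can also be adjusted at runtime (values as examples):

    nodetool setstreamthroughput 800    # Mbit/s cap for streaming; 0 = unthrottled
    nodetool setcompactionthroughput 0  # MB/s cap for compaction; 0 = unthrottled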
e fighting against then. It is easy to have a box that looks unused but
in reality it's struggling. Given that you've opened up the floodgates on
compaction, that would seem quite plausible to be what you are experiencing.
From: "Steinmaurer, Thomas"
<thomas.steinmau...@dynatrace.com>
From: Oleksandr Shulgin
Sent: Tuesday, 22 October 2019 16:35
To: User
Subject: Re: Cassandra 2.1.18 - Question on stream/bootstrap throughput
On Tue, Oct 22, 2019 at 12:47 PM Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com>
wrote:
using 2.1.8, 3 nodes (m4.10xlarge, EBS SSD-based),
If possible, prefer m5 over m4, because they run on a newer hypervisor
(KVM-based); single-core performance is ~10% better compared to m4, with m5
even being slightly cheaper than m4.
Thomas
From: Erick Ramirez
Sent: Thursday, 30 January 2020 03:00
To: user@cassandra.apache.org
Hello,
https://issues.apache.org/jira/browse/CASSANDRA-15426. According to the ticket,
changes in https://issues.apache.org/jira/browse/CASSANDRA-15053 likely being
the root cause.
Will this be fixed in 3.0.20 and 3.11.6?
Thanks,
Thomas
Leon,
we had an awful performance/throughput experience with 3.x coming from 2.1.
3.11 is simply a memory hog if you are using batch statements on the client
side. If so, you are likely affected by
https://issues.apache.org/jira/browse/CASSANDRA-16201
Regards,
Thomas