[
https://issues.apache.org/jira/browse/CASSANDRA-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brandon Williams updated CASSANDRA-13096:
-----------------------------------------
Resolution: Not A Problem
Status: Resolved (was: Open)
> Snapshots slow down jmx scraping
> --------------------------------
>
> Key: CASSANDRA-13096
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13096
> Project: Cassandra
> Issue Type: Bug
> Components: Observability/Metrics
> Reporter: Maxime Fouilleul
> Priority: Normal
> Attachments: CPU Load.png, Clear Snapshots.png, JMX Scrape Duration.png
>
>
> Hello,
> We are scraping JMX metrics through a Prometheus exporter and noticed
> that some nodes became very slow to respond (more than 20 seconds). After
> some investigation, we did not find any hardware problem or overload on
> these "slow" nodes. It happens on different clusters, some with only a few
> gigabytes of data, and it does not seem to be tied to a specific version
> either, as it happens on 2.1, 2.2 and 3.0 nodes.
> After several unsuccessful attempts, one of our ideas was to clear the
> snapshots remaining on one problematic node:
> {code}
> nodetool clearsnapshot
> {code}
> And the magic happened: as you can see in the attached diagrams, the moment
> we cleared the snapshots, CPU activity dropped immediately and the
> time to scrape the JMX metrics went from more than 20 seconds to instantaneous.
> Can you enlighten us on this issue? Once again, it appears on all three of
> our versions (2.1, 2.2 and 3.0) and with different data volumes, and it is
> not systematically linked to the snapshots, as we have some nodes with the
> same snapshot volume that are running fine.