[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7402:

Component/s: Metrics

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
>  Labels: ops, performance, stability
> Fix For: 4.x
>
> Attachments: 7402.txt
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2015-11-10 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7402:
-
Reviewer:   (was: Aleksey Yeschenko)

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
>  Labels: ops, performance, stability
> Fix For: 3.x
>
> Attachments: 7402.txt
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2015-03-30 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7402:
--
Reviewer: Aleksey Yeschenko
Priority: Minor  (was: Major)

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
>  Labels: ops, performance, stability
> Fix For: 2.1.4
>
> Attachments: 7402.txt
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2015-01-22 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7402:
-
Reviewer:   (was: Aleksey Yeschenko)

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>  Labels: ops, performance, stability
> Fix For: 2.1.3
>
> Attachments: 7402.txt
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2014-11-10 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-7402:
--
Fix Version/s: (was: 2.1.2)
   2.1.3

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>  Labels: ops, performance, stability
> Fix For: 2.1.3
>
> Attachments: 7402.txt
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2014-10-07 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7402:
-
Fix Version/s: (was: 3.0)
   2.1.1

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>  Labels: ops, performance, stability
> Fix For: 2.1.1
>
> Attachments: 7402.txt
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2014-10-01 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7402:
--
Reviewer: Aleksey Yeschenko

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>  Labels: ops, performance, stability
> Fix For: 3.0
>
> Attachments: 7402.txt
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2014-09-19 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-7402:
--
Attachment: 7402.txt

Patch to add a histogram and meter for reads and writes.  These metrics exist 
per column family and are rolled up to the keyspace level.

For reads, the histogram tracks the heap size of query responses, both per 
partition and across partitions (for range queries).

For writes, the histogram tracks the heap size of individual mutations (we 
already track and warn users about large batches).

The meters track the aggregate heap usage of reads and writes per node. This is 
valuable to track because it shows when too many operations in aggregate are 
generating garbage at once.

I changed nodetool cfstats to expose these per column family.  Most operators 
would want to track these stats in their monitoring system and pick values to 
alert on.

{code}
Average read response bytes per query (last five minutes): 620
Maximum read response bytes per query (last five minutes): 620
Total read response rate bytes/sec (past minute): 7836749
Total read response rate bytes/sec (past five minutes): 2027754
Average write bytes per partition (last five minutes): 620
Maximum write bytes per partition  (last five minutes): 620
Total write rate bytes/sec (past minute): 2391983
Total write rate bytes/sec (past five minutes): 2940078
{code}
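
As a rough illustration of the approach described above, the sketch below 
registers a per-column-family histogram and meter with the Dropwizard (Codahale) 
Metrics library that Cassandra's metrics framework builds on. The class, metric 
names, and update methods here are assumptions for illustration only, not the 
identifiers used in 7402.txt.

{code}
import com.codahale.metrics.Histogram;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

// Illustrative sketch only: names are hypothetical, not taken from the patch.
public class RequestSizeMetrics
{
    private final Histogram readResponseHeapSize;  // per-response heap size
    private final Histogram mutationHeapSize;      // per-mutation heap size
    private final Meter readResponseBytes;         // aggregate read bytes per node
    private final Meter mutationBytes;             // aggregate write bytes per node

    public RequestSizeMetrics(MetricRegistry registry, String scope)
    {
        readResponseHeapSize = registry.histogram(scope + ".ReadResponseHeapSize");
        mutationHeapSize     = registry.histogram(scope + ".MutationHeapSize");
        readResponseBytes    = registry.meter(scope + ".ReadResponseBytes");
        mutationBytes        = registry.meter(scope + ".MutationBytes");
    }

    // Record the estimated on-heap size of one query response.
    public void onRead(long heapSize)
    {
        readResponseHeapSize.update(heapSize);
        readResponseBytes.mark(heapSize);
    }

    // Record the estimated on-heap size of one mutation.
    public void onWrite(long heapSize)
    {
        mutationHeapSize.update(heapSize);
        mutationBytes.mark(heapSize);
    }
}
{code}

Rolling up to the keyspace level would then amount to updating a second instance 
registered at keyspace scope alongside the per-column-family one.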

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>  Labels: ops, performance, stability
> Fix For: 3.0
>
> Attachments: 7402.txt
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2014-09-19 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-7402:
--
Description: 
When running a production cluster, one common operational issue is quantifying 
GC pauses caused by ongoing requests.

Since different queries return varying amounts of data, you can easily get 
yourself into a situation where a couple of bad actors in the system trigger a 
stop-the-world pause. More likely, the aggregate garbage generated on a single 
node across all in-flight requests causes a GC.

It would be very useful for operators to see how much garbage the system is 
generating to handle in-flight mutations and queries.

It would also be nice to have a log of the queries that generate the most 
garbage, or a histogram, so operators can track this.


  was:
When running a production cluster, one common operational issue is quantifying 
GC pauses caused by ongoing requests.

Since different queries return varying amounts of data, you can easily get 
yourself into a situation where a couple of bad actors in the system trigger a 
stop-the-world pause. More likely, the aggregate garbage generated on a single 
node across all in-flight requests causes a GC.

We should be able to set a limit on the max heap we can allocate to all 
outstanding requests and track the garbage per request to stop this from 
happening. It should increase a single node's availability substantially.

In the yaml this would be:

{code}
total_request_memory_space_mb: 400
{code}

It would also be nice to have a log of the queries that generate the most 
garbage, or a histogram, so operators can track this.
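
The original scope above proposed capping the on-heap memory available to 
in-flight requests via total_request_memory_space_mb. Below is a minimal, 
hypothetical sketch of how such a cap might be enforced: a counter of estimated 
in-flight request memory that rejects new work once the configured limit is 
reached. None of this is code from Cassandra or the attached patch; it only 
illustrates the idea.

{code}
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the originally proposed limit; the setting name
// total_request_memory_space_mb comes from the ticket, but this enforcement
// logic is an illustration, not code from Cassandra or 7402.txt.
public final class RequestMemoryLimiter
{
    private final long limitBytes;
    private final AtomicLong inFlightBytes = new AtomicLong();

    public RequestMemoryLimiter(long totalRequestMemorySpaceMb)
    {
        this.limitBytes = totalRequestMemorySpaceMb * 1024 * 1024;
    }

    // Returns true if the request may proceed; false if admitting it would
    // push the estimated in-flight request memory over the configured cap.
    public boolean tryAcquire(long estimatedHeapSize)
    {
        while (true)
        {
            long current = inFlightBytes.get();
            if (current + estimatedHeapSize > limitBytes)
                return false;
            if (inFlightBytes.compareAndSet(current, current + estimatedHeapSize))
                return true;
        }
    }

    // Must be called when the request completes, whether it succeeds or fails.
    public void release(long estimatedHeapSize)
    {
        inFlightBytes.addAndGet(-estimatedHeapSize);
    }
}
{code}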



> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>  Labels: ops, performance, stability
> Fix For: 3.0
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries.
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2014-09-15 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-7402:
--
Summary: Add metrics to track memory used by client requests  (was: limit 
the on heap memory available to requests)

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>  Labels: ops, performance, stability
> Fix For: 3.0
>
>
> When running a production cluster, one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system trigger a 
> stop-the-world pause. More likely, the aggregate garbage generated on a single 
> node across all in-flight requests causes a GC.
> We should be able to set a limit on the max heap we can allocate to all 
> outstanding requests and track the garbage per request to stop this from 
> happening. It should increase a single node's availability substantially.
> In the yaml this would be:
> {code}
> total_request_memory_space_mb: 400
> {code}
> It would also be nice to have a log of the queries that generate the most 
> garbage, or a histogram, so operators can track this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)