[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-12-12 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15741849#comment-15741849
 ] 

Benjamin Lerer commented on CASSANDRA-12649:


Thanks everybody.

> Add BATCH metrics
> -
>
> Key: CASSANDRA-12649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12649
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Alwyn Davis
>Assignee: Alwyn Davis
>Priority: Minor
> Fix For: 3.x
>
> Attachments: 12649-3.x-v2.patch, 12649-3.x.patch, 
> stress-batch-metrics.tar.gz, stress-trunk.tar.gz, trunk-12649.txt
>
>
> To identify causes of load on a cluster, it would be useful to have some 
> additional metrics:
> * *Mutation size distribution:* I believe this would be relevant when 
> tracking the performance of unlogged batches.
> * *Logged / Unlogged Partitions per batch distribution:* This would also give 
> a count of batch types processed. Multiple distinct tables in batch would 
> just be considered as separate partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-12-06 Thread Alwyn Davis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727484#comment-15727484
 ] 

Alwyn Davis commented on CASSANDRA-12649:
-

[~blerer] No concerns from me, thanks for identifying the CAS issue!



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-12-06 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727459#comment-15727459
 ] 

Aleksey Yeschenko commented on CASSANDRA-12649:
---

Looks all good to me. Could fix if/else brackets formatting in 
{{updatePartitionsPerBatchMetrics()}} if you feel like it, but the logic seems 
correct.



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-12-05 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722250#comment-15722250
 ] 

Benjamin Lerer commented on CASSANDRA-12649:


@Alwyn Davis do you have any concerns about the changes I made to your patch? 

[~iamaleksey] could you have a look at the patch to check that I did not miss 
anything?



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-11-24 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15692984#comment-15692984
 ] 

Benjamin Lerer commented on CASSANDRA-12649:


Thanks for the patch [~alwyn].

I spent over a day looking at it and at the problem.

My first concern was that the benchmark did not really tell me the cost of 
measuring the total size of the batch mutations. I ended up doing some 
profiling to get a feel for that cost. For a batch of 100 inserts, on a table 
with 1 clustering column and 1 other column, the computation took around 0.1% 
of the CPU time needed to process the request (bearing in mind that the JIT was 
able to profile more aggressively than it would in a real production system and 
that the CPU incurred only a few branch mispredictions). In my opinion this 
number is still reasonable if we execute that operation only once per write 
request.

Looking at the patch, I realized that the {{MutationSizeHistogram}} would only 
be computed for batches without conditions. The problem is that for CAS writes 
the mutations are only created after the condition has been checked in 
{{StorageProxy}}. I decided to try moving the {{MutationSizeHistogram}} metric 
to the {{StorageProxy}} level. The result is more consistent, and less 
surprising for operators, as it gives the mutation size distribution for all 
write requests. The disadvantage is obviously that for some batches we will 
compute the data size twice. I guess that we should address that problem at 
some point.

For the partitions per batch metrics, I decided to ignore the CAS batches. As 
they do not really belong to the logged or unlogged categories, we would have 
needed another histogram and, as they cannot be performed on more than one 
partition, that histogram would not provide any interesting information.
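
As a rough illustration of the decision described above, the following Java 
sketch records the number of partitions per batch separately for logged and 
unlogged batches and skips CAS batches. All class, enum and field names are 
hypothetical and are not taken from the patch; plain lists stand in for real 
histograms so that the example has no dependencies.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch, not the code from the patch: tracks how many
 * partitions each batch touches, split by batch type.
 */
final class PartitionsPerBatchSketch
{
    enum BatchType { LOGGED, UNLOGGED, CAS }

    // Plain lists stand in for real (thread-safe) histograms.
    final List<Integer> partitionsPerLoggedBatch = new ArrayList<>();
    final List<Integer> partitionsPerUnloggedBatch = new ArrayList<>();

    void update(BatchType type, int partitionCount)
    {
        switch (type)
        {
            case LOGGED:
                partitionsPerLoggedBatch.add(partitionCount);
                break;
            case UNLOGGED:
                partitionsPerUnloggedBatch.add(partitionCount);
                break;
            default:
                // CAS batches always target a single partition, so they
                // are intentionally not recorded.
                break;
        }
    }

    public static void main(String[] args)
    {
        PartitionsPerBatchSketch metrics = new PartitionsPerBatchSketch();
        metrics.update(BatchType.LOGGED, 3);
        metrics.update(BatchType.UNLOGGED, 1);
        metrics.update(BatchType.CAS, 1);
        System.out.println("logged samples: " + metrics.partitionsPerLoggedBatch);
        System.out.println("unlogged samples: " + metrics.partitionsPerUnloggedBatch);
    }
}
```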

The results of my experiments on top of your patch are 
[here|https://github.com/apache/cassandra/compare/trunk...blerer:12649-3.X].
|[utest|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-12649-3.X-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-12649-3.X-dtest/]|
As the {{StorageProxy}} metrics cannot be easily unit tested, I checked them 
manually via the JMX console. I also made sure that the changes did not break 
{{nodeTool}}. 

[~alwyn] could you check the patch and tell me if you are fine with the changes 
I made? If it looks good to you, I will ask [~iamaleksey] to have a look at it 
to make sure that I did not miss anything.







[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-11-14 Thread Alwyn Davis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665310#comment-15665310
 ] 

Alwyn Davis commented on CASSANDRA-12649:
-

I've updated the patch - [^12649-3.x-v2.patch] based on your points.

I've also added a {{BatchBench}} JMH class and ran it against trunk, the first 
implementation of this patch and this current implementation:

*BATCH METRICS*
{noformat}
Benchmark                            Mode  Cnt   Score   Error   Units
o.a.c.t.m.BatchBench.batchLogged    thrpt   50  38.784 ± 6.790  ops/ms
o.a.c.t.m.BatchBench.batchUnlogged  thrpt   50  39.278 ± 7.572  ops/ms
{noformat}

*BATCH METRICS (v2)*
{noformat}
Benchmark                            Mode  Cnt   Score   Error   Units
o.a.c.t.m.BatchBench.batchLogged    thrpt   50  40.340 ± 7.856  ops/ms
o.a.c.t.m.BatchBench.batchUnlogged  thrpt   50  38.198 ± 8.116  ops/ms
{noformat}

*TRUNK*
{noformat}
Benchmark                            Mode  Cnt   Score   Error   Units
o.a.c.t.m.BatchBench.batchLogged    thrpt   50  38.737 ± 7.661  ops/ms
o.a.c.t.m.BatchBench.batchUnlogged  thrpt   50  38.455 ± 7.111  ops/ms
{noformat}

As {{update.dataSize()}} is not checked on single partition batches, 
{{mutationSizeHistogram}} will only measure multi-partition batches.



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-11-07 Thread Alwyn Davis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645629#comment-15645629
 ] 

Alwyn Davis commented on CASSANDRA-12649:
-

Thanks for the awesome feedback! I'll try moving the metrics to 
`BatchStatement` and verifying the performance impact with some JMH benchmarks.



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-11-07 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645550#comment-15645550
 ] 

Benjamin Lerer commented on CASSANDRA-12649:


I had a discussion earlier today with [~tjake], who has more experience than me 
with Cassandra performance and Stress. He pointed out to me that we already use 
{{PartitionUpdate::dataSize}} to issue batch size warnings in 
{{BatchStatement::verifyBatchSize}}. His suggestion was to move the metric to 
{{BatchStatement}}, as that will allow the patch to reuse the result of the 
existing call and avoid having to compute the size on each replica.
The idea would then be to add that metric to your {{BatchMetrics}} instead of 
adding it to {{TableMetrics}}.
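
A minimal Java sketch of that suggestion follows: the total size is summed 
once and the same value feeds both the size warning and the metric. The class 
name, the fields and the threshold value are assumptions made for 
illustration; the list of longs merely stands in for the per-update results of 
{{PartitionUpdate::dataSize}}, and this is not the actual {{BatchStatement}} 
code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: one size computation shared by the warning and the metric. */
final class BatchSizeSketch
{
    // Illustrative threshold only; not read from any real configuration.
    static final long WARN_THRESHOLD_BYTES = 5 * 1024;

    // A plain list stands in for a real mutation size histogram.
    final List<Long> recordedSizes = new ArrayList<>();
    int warnings = 0;

    /** updateSizes: data size in bytes of each partition update in the batch. */
    long verifyBatchSize(List<Long> updateSizes)
    {
        long total = 0;
        for (long size : updateSizes)
            total += size;

        // The existing check and the new metric both reuse the same total.
        if (total > WARN_THRESHOLD_BYTES)
            warnings++;
        recordedSizes.add(total);
        return total;
    }

    public static void main(String[] args)
    {
        BatchSizeSketch sketch = new BatchSizeSketch();
        long total = sketch.verifyBatchSize(Arrays.asList(2048L, 4096L));
        System.out.println("total bytes: " + total + ", warnings: " + sketch.warnings);
    }
}
```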

I looked at {{BatchStatement}} and it seems that {{verifyBatchSize}} is only 
called for batches without conditions. I opened CASSANDRA-12885 to address 
that problem. If you can fix it first, it would be great :-).

Regarding your initial patch, I also have the following nits:
* {{BatchMetrics}} has a start method that does nothing and should be removed 
(including the call to it in {{StorageService}})
* Instead of making {{BatchMetrics}} a singleton you could have a public static 
variable in {{BatchStatement}} (similar to {{CQLMetrics}} in 
{{QueryProcessor}}) as it is more in line with the rest of the code.
* It would be nice if you could add the documentation for those new metrics in 
{{doc/source/operating/metrics.rst}} 
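
The second nit can be pictured with the small Java sketch below, in which the 
statement class exposes its metrics through a public static field instead of 
going through a singleton accessor. Both class names and the counter are 
hypothetical stand-ins, not the real {{BatchMetrics}} or {{BatchStatement}}.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a metrics holder; not the real BatchMetrics. */
final class BatchMetricsHolderSketch
{
    final AtomicLong batchesRecorded = new AtomicLong();
}

/** Hypothetical stand-in for the statement class that owns the metrics. */
final class BatchStatementSketch
{
    // One shared instance exposed as a static field rather than a singleton.
    public static final BatchMetricsHolderSketch metrics = new BatchMetricsHolderSketch();

    void execute()
    {
        metrics.batchesRecorded.incrementAndGet();
    }

    public static void main(String[] args)
    {
        new BatchStatementSketch().execute();
        new BatchStatementSketch().execute();
        System.out.println("batches recorded: " + metrics.batchesRecorded.get());
    }
}
```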





[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-11-07 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15643663#comment-15643663
 ] 

Benjamin Lerer commented on CASSANDRA-12649:


[~appodictic] If you have questions or concerns about review feedback, you 
should not hesitate to write them down on the ticket.

I realize now that my feedback was maybe not really helpful. Sorry for that.

My main concern is that the measurement will be done on every mutation and 
that it involves going through the mutation to sum up its size. On nodes with 
heavy writes this might have a non-negligible CPU cost. It is true that we 
compute the data size when we serialize the data but, there, we do not have 
another choice.

Setting up a benchmark for that takes some time and unfortunately I do not have 
the time to do it myself.
If I had to do it, I would try to set up a JMH benchmark to check the 
throughput of the {{dataSize}} method on random data. It is too difficult to 
assess that type of thing on a running Cassandra due to the JIT and the overall 
fluctuation in response time of the whole database (the impact of the change 
might end up being lost in the noise).

Regarding,
{quote}
I am also not fully convinced by the usefulness of Logged / Unlogged 
Partitions per batch distribution. Could you explain in more detail how it 
will be useful for you?
{quote}
I just wanted to understand why such a metric would be useful.

What I did not see was the end of [~KurtG]'s comment:
bq. We have seen a lot of users mistakenly batch against multiple partitions.

Then my question is: Is there not a better way of doing it? Should we not have 
a setting for rejecting multi-partition batches, or a warning?

[~appodictic] My only concern is the quality of what goes into Cassandra. I am 
not trying to prevent anybody from contributing. If you do not understand my 
comments or do not agree with them, feel free to write it down on the ticket. 
My patches do not receive better treatment (have a look at CASSANDRA-10707, 
which took me 8 months); they rarely go in on the first attempt.



 


 





[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-11-07 Thread Alwyn Davis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15643580#comment-15643580
 ] 

Alwyn Davis commented on CASSANDRA-12649:
-

Sure.  I ran cassandra-stress on a separate EC2 instance against a 3-node 
cluster with trunk and then with the batch metrics patch.  The results summary 
is (I've also attached the full stress output):
TRUNK

LOGGED
Row rate: 13,965 row/s, 95th latency: 1.4ms
Row rate: 14,351 row/s, 95th latency: 1.3ms
Row rate: 14,359 row/s, 95th latency: 1.7ms

UNLOGGED
Row rate: 14,674 row/s, 95th latency: 1.3ms
Row rate: 14,168 row/s, 95th latency: 1.7ms
Row rate: 14,128 row/s, 95th latency: 1.7ms


BATCH METRICS

LOGGED
Row rate: 13,531 row/s, 95th latency: 1.9ms
Row rate: 13,943 row/s, 95th latency: 1.7ms
Row rate: 14,210 row/s, 95th latency: 1.7ms

UNLOGGED
Row rate: 14,552 row/s, 95th latency: 1.3ms
Row rate: 14,568 row/s, 95th latency: 1.3ms
Row rate: 14,627 row/s, 95th latency: 1.2ms


I also ran the BatchMetrics test class with the metrics patch and just trunk 
for comparison (10,000 iterations of single queries, logged, unlogged and CAS 
batches):
WITH BATCH METRICS
query: 378410 ms, loggedBatch: 31638 ms, unloggedBatch: 21466 ms, cas: 33880 ms
query: 392960 ms, loggedBatch: 26284 ms, unloggedBatch: 21114 ms, cas: 30566 ms
query: 386964 ms, loggedBatch: 28724 ms, unloggedBatch: 22257 ms, cas: 33323 ms

TRUNK
query: 395683 ms, loggedBatch: 29994 ms, unloggedBatch: 21638 ms, cas: 33096 ms
query: 379503 ms, loggedBatch: 29911 ms, unloggedBatch: 21346 ms, cas: 32364 ms
query: 396935 ms, loggedBatch: 42124 ms, unloggedBatch: 21679 ms, cas: 34353 ms


Regarding the usefulness of Logged / Unlogged Partitions per batch, I believe 
that it would allow correlation of batch size to performance.  Standard advice 
is to process a single partition per unlogged batch, so comparing the number of 
partitions per batch against throughput should highlight poor usage of batches.



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-11-04 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636401#comment-15636401
 ] 

Benjamin Lerer commented on CASSANDRA-12649:


I discussed the patch with [~iamaleksey] and we both have some concerns about 
the performance impact of measuring the mutation size. Could you provide a 
benchmark to assess the performance impact of that measurement?

I am also not fully convinced by the usefulness of Logged / Unlogged 
Partitions per batch distribution. Could you explain in more detail how it 
will be useful for you?



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-10-31 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623715#comment-15623715
 ] 

Kurt Greaves commented on CASSANDRA-12649:
--

I rebased the patch on 3.X, and one of the added unit tests actually exposed a 
bug that was just introduced in CASSANDRA-12060. I have attached the new, 
rebased patch here; however, it is doubtful that it will make it into 3.10.

Also raised CASSANDRA-12867 to cover the bug.



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-10-28 Thread Romain Hardouin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614759#comment-15614759
 ] 

Romain Hardouin commented on CASSANDRA-12649:
-

Yes, these metrics are useful.
Unfortunately the patch no longer merges into either the cassandra-3.X branch 
or trunk.



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-10-27 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613856#comment-15613856
 ] 

Kurt Greaves commented on CASSANDRA-12649:
--

Would be nice to see this committed. We have seen a lot of users mistakenly 
batch against multiple partitions. 



[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-10-18 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585794#comment-15585794
 ] 

Edward Capriolo commented on CASSANDRA-12649:
-

+1 (non binding) users with unpredictable batch sizes tend to also have gc 
problems and this would aid in insight.
