[
https://issues.apache.org/jira/browse/CASSANDRA-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuki Morishita updated CASSANDRA-4894:
--------------------------------------
Attachment: 4894-1.2.txt
The attached patch tracks a count for each number of merged rows.
To log the counters, I simply append a dump of them to the end of the
compaction log message.
{code}
INFO [CompactionExecutor:1] 2012-12-17 15:22:53,528 CompactionTask.java (line
238) Compacted to
[/Users/yuki/.ccm/1.2/node1/data/system/local/system-local-ia-18-Data.db,].
957 to 629 (~65% of original) bytes for 1 keys at 0.017139MB/s. Time: 35ms.
Merged row stats: [0, 0, 0, 1].
{code}
The 'Merged row stats' part is the new addition. If there is a better format,
please let me know.
> log number of combined/merged rows during a compaction
> ------------------------------------------------------
>
> Key: CASSANDRA-4894
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4894
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Matthew F. Dennis
> Assignee: Yuki Morishita
> Priority: Minor
> Fix For: 1.2.1
>
> Attachments: 4894-1.2.txt
>
>
> We already log some details about compactions, but it would be useful to know
> how many rows were merged (resulting in "useful" work) and how many were
> unique (representing "wasted work").
> The simple approach requires two additional counters (one for unique rows,
> one for merged rows). As the merge join progresses, if two or more rows
> are combined, tick the merged counter. If a row is simply copied, tick the
> unique counter.
> A more complete solution would be to keep a separate count for each number of
> merges. This would require number_of_files_being_merged counters. If no
> rows were merged, tick counters[0]; if two rows were merged, tick counters[1];
> and so on.
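
For illustration only, the per-merge-count scheme described above could be sketched roughly as follows. This is not the code from 4894-1.2.txt: the class name MergeCounters and the methods tick and toLogSuffix are hypothetical, and the indexing (counters[i] counts output rows built from i + 1 input rows) is an assumption taken from the description and from the sample log line.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not the class used in the attached patch.
class MergeCounters {
    // counts[i] = number of output rows assembled from (i + 1) input rows,
    // so counts[0] holds rows that were simply copied (no merge).
    private final long[] counts;

    // One counter per input file, matching number_of_files_being_merged.
    MergeCounters(int inputFileCount) {
        if (inputFileCount < 1)
            throw new IllegalArgumentException("need at least one input file");
        counts = new long[inputFileCount];
    }

    // Record one output row that was built from rowsMerged input rows.
    void tick(int rowsMerged) {
        if (rowsMerged < 1 || rowsMerged > counts.length)
            throw new IllegalArgumentException("rowsMerged out of range: " + rowsMerged);
        counts[rowsMerged - 1]++;
    }

    // Render the counters in the same shape as the sample log line.
    String toLogSuffix() {
        return "Merged row stats: " + Arrays.toString(counts) + ".";
    }

    public static void main(String[] args) {
        // Four input files, one key present in all four of them.
        MergeCounters counters = new MergeCounters(4);
        counters.tick(4);
        System.out.println(counters.toLogSuffix());
    }
}
```

Under these assumptions, a compaction of four files in which the single key appears in every file ticks only the last counter, which is consistent with the sample log line in this comment.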