[ https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246079#comment-14246079 ]

Jonathan Shook edited comment on CASSANDRA-8371 at 12/14/14 8:16 PM:
---------------------------------------------------------------------

[~Bj0rn],
The phrase "ideal scheduling" was meant to describe the case in which the data 
for each sstable is compacted exactly once per window. In other words, only one 
coalescing compaction is needed for all data in the new interval once a set of 
smaller intervals is grouped into a single larger interval. You describe some 
of the scenarios that make this more of an ideal than an actuality in your 
response above. I understand that the windows are anchored at fixed points 
using modulo against the timestamp. The rationale I used above actually depends 
on that as an assumption; otherwise, you couldn't achieve ideal compaction 
scheduling of "once per interval".
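
To make the anchoring concrete, here is a minimal sketch (illustrative only, 
not the DTCS code itself; the class, method, and parameter names are 
hypothetical) of how a write timestamp maps onto a fixed, modulo-anchored 
window:

{code}
// Illustrative sketch: anchor a non-negative microsecond write timestamp to
// the start of a fixed window. The boundaries depend only on the window size,
// not on when compaction happens to run, so window edges never drift.
public final class WindowAnchor
{
    public static long windowStart(long timestampMicros, long windowSizeMicros)
    {
        // Every timestamp in [start, start + windowSizeMicros) lands in the same window.
        return timestampMicros - (timestampMicros % windowSizeMicros);
    }
}
{code}

Since the boundaries come purely from the modulo, every node derives the same 
window edges for the same data, which is what makes "once per interval" 
possible in the first place.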

I guess we need to be careful about the terms we use here. I'd favor "fixed 
intervals" and "coalescing of fixed intervals". I believe my rationale on 
compaction load still makes sense, unless someone has a counter-example or 
clarification.
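
To illustrate that rationale, here is a rough sketch (a sketch only, under 
assumed inputs: a hypothetical base window size and a constant coalescing 
factor, not DTCS's actual options). In the ideal case, the number of coalescing 
compactions a given piece of data sees grows with the number of tiers its 
window has been absorbed into, i.e. roughly logarithmically in its age, rather 
than with the raw number of sstables:

{code}
// Hedged sketch of the "compacted once per interval" rationale, not DTCS itself.
// Each time a datum's fixed window is coalesced into a larger one (here by a
// constant factor), that datum is rewritten exactly once in the ideal case.
public final class CoalescingLoad
{
    // Assumes baseWindowMicros > 0 and factor >= 2.
    public static int idealCompactionsForAge(long ageMicros, long baseWindowMicros, int factor)
    {
        int compactions = 0;
        long window = baseWindowMicros;
        while (window <= ageMicros)
        {
            compactions++;      // one coalescing compaction per tier the data has aged through
            window *= factor;   // windows grow geometrically, so this is ~log_factor(age / base)
        }
        return compactions;
    }
}
{code}

Under the ideal schedule that works out to roughly log_factor(age / baseWindow) 
rewrites per cell.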





> DateTieredCompactionStrategy is always compacting 
> --------------------------------------------------
>
>                 Key: CASSANDRA-8371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: mck
>            Assignee: Björn Hegerfors
>              Labels: compaction, performance
>         Attachments: java_gc_counts_rate-month.png, 
> read-latency-recommenders-adview.png, read-latency.png, 
> sstables-recommenders-adviews.png, sstables.png, vg2_iad-month.png
>
>
> Running 2.0.11 and having switched a table to 
> [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602], we've seen that 
> disk IO and gc count increase, along with the number of reads happening in 
> the "compaction" hump of cfhistograms.
> Data, and generally performance, looks good, but compactions are always 
> happening, and pending compactions are building up.
> The schema for this is 
> {code}CREATE TABLE search (
>   loginid text,
>   searchid timeuuid,
>   description text,
>   searchkey text,
>   searchurl text,
>   PRIMARY KEY ((loginid), searchid)
> );{code}
> We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
> CQL executed against this keyspace, and traffic patterns, can be seen in 
> slides 7+8 of https://prezi.com/b9-aj6p2esft/
> Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
> screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
> to DTCS (week ~46).
> These screenshots are also found in the prezi on slides 9-11.
> [~pmcfadin], [~Bj0rn], 
> Can this be a consequence of occasional deleted rows, as is described under 
> (3) in the description of CASSANDRA-6602?


