[
https://issues.apache.org/jira/browse/CASSANDRA-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334128#comment-15334128
]
Paulo Motta commented on CASSANDRA-10862:
-----------------------------------------
Thanks for the patch, and sorry for the delay [[email protected]]. Overall I
like your approach, since it mitigates the impact with relatively few changes.
Some suggestions for improvement below:
* I'm a bit uncomfortable with the unbounded busy wait, so we should probably
add a time bound to the loop to avoid hanging indefinitely if compactions have
a problem catching up.
* While a synchronization block on {{CFS}} would guarantee that only one
producer adds new sstables at a time, I think this might be a premature
optimization that could be a source of problems later, so I'd prefer to take a
best-effort approach initially: we're trying to protect against an abysmal
number of sstables, and since we don't allow concurrent repairs on the same
tables anyway, there shouldn't be many concurrent {{OnCompletionRunnable}}s
running for the same {{CFS}}.
* With that said, I think we could have something like
{{CompactionManager.waitForL0Leveling(ColumnFamilyStore cfs, int
maxSSTableCount, long maxWaitTime)}}, similar to the {{waitForCessation}}
method but waiting for L0 leveling instead and without taking a {{Callable}}
as argument.
* I think waiting for leveling on validation will probably cause overstreaming
during repair, since different replicas will flush at different times, causing
digest mismatches, so we should probably avoid that.
* {{compaction_max_l0_sstable_count}} is quite an advanced lever, so I don't
think it should be exposed as a {{cassandra.yaml}} attribute, but rather as a
system property (similar to {{cassandra.disable_stcs_in_l0}}). We could maybe
also expose it as a dynamic JMX attribute to facilitate tuning.
* We could probably also add another property for the {{max_wait_time}} of the
L0 leveling, and maybe even provide conservative defaults for both properties
on trunk that would already bring some benefit to the average user while still
allowing more advanced users to tune them according to usage; something like
{{streaming.max_L0_count=1000}} and {{streaming.max_L0_wait_time=1min}}. I'm
not really sure about these values, so it would be nice if you have any
suggestions based on your tests so far.
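To illustrate the time-bounded wait suggested above, here is a rough
self-contained sketch. The class name {{L0LevelingWait}}, the {{IntSupplier}}
standing in for the L0 sstable count, and the 100ms poll interval are all
hypothetical placeholders, not the actual Cassandra API:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

public class L0LevelingWait
{
    /**
     * Polls the current L0 sstable count until it drops to maxSSTableCount,
     * giving up once maxWaitMillis has elapsed so a stalled compaction
     * backlog cannot hang the caller indefinitely.
     *
     * @return true if L0 leveled within the time bound, false otherwise
     */
    public static boolean waitForL0Leveling(IntSupplier l0Count, int maxSSTableCount, long maxWaitMillis)
    {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (l0Count.getAsInt() > maxSSTableCount)
        {
            if (System.nanoTime() >= deadline)
                return false; // time bound hit: proceed best-effort instead of hanging

            try
            {
                Thread.sleep(100); // back off between checks rather than busy-spinning
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args)
    {
        // Simulated L0 count that drains by one on each poll, as if compactions were catching up
        int[] count = { 5 };
        System.out.println(waitForL0Leveling(() -> count[0]--, 2, 5000));
    }
}
```

The point of returning {{false}} instead of throwing is that the caller can
fall back to the current behavior (adding the sstables anyway) when leveling
doesn't catch up in time, keeping the change best-effort.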
Anything else to add here or any caveat I might be missing [~krummas]?
> LCS repair: compact tables before making available in L0
> --------------------------------------------------------
>
> Key: CASSANDRA-10862
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10862
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction, Streaming and Messaging
> Reporter: Jeff Ferland
> Assignee: Chen Shen
>
> When doing repair on a system with lots of mismatched ranges, the number of
> tables in L0 goes up dramatically, as correspondingly goes the number of
> tables referenced for a query. Latency increases dramatically in tandem.
> Eventually all the copied tables are compacted down in L0, then copied into
> L1 (which may be a very large copy), finally reducing the number of SSTables
> per query into the manageable range.
> It seems to me that the cleanest answer is to compact after streaming, then
> mark tables available rather than marking available when the file itself is
> complete.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)