[
https://issues.apache.org/jira/browse/CASSANDRA-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327961#comment-14327961
]
Joshua McKenzie commented on CASSANDRA-8833:
--------------------------------------------
Regarding Windows: early re-open with memory-mapped index files isn't
technically feasible due to NTFS restrictions, and I suspect that the gains
from memory mapping will eclipse the performance gains from early re-open for
most use cases. CASSANDRA-8709 will make it an option without mmap, and we can
leave that as the default path.

I'm familiar with what it has cost so far to stabilize this feature and the
other systems it stressed. Code changes to stabilize it have been increasing
quite a bit recently, so I don't see evidence that we're near the end of
stabilization for this feature yet, and as you stated:
bq. every line touched is a new bug in waiting

Also, the performance gains from early re-open look to vary greatly depending
on test setup; the graphs referenced in CASSANDRA-6916 range from a
night-and-day comparison down to a 9% total ops improvement on the regression
case. In the comments above about this feature's performance improvement,
several phrases stick out to me: "can be", "Possibly introducing", "may see",
"could be". I'd prefer hard numbers from test beds.

> Stop opening compaction results early
> -------------------------------------
>
> Key: CASSANDRA-8833
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8833
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Marcus Eriksson
> Fix For: 2.1.4
>
>
> We should simplify the code base by not doing early opening of compaction
> results. It makes it very hard to reason about sstable life cycles since they
> can be in many different states, "opened early", "starts moved", "shadowed",
> "final", instead of as before, basically just one (tmp files are not really
> 'live' yet so I don't count those). The ref counting of shared resources
> between sstables in these different states is also hard to reason about. This
> has caused quite a few issues since we released 2.1.
> I think it all boils down to a performance vs code complexity issue: is
> opening compaction results early really 'worth it' wrt the performance gain?
> The results in CASSANDRA-6916 sure look like the benefits are big enough, but
> the difference should not be as big for people on SSDs (which most people who
> care about latencies are).
> WDYT [~benedict] [~jbellis] [~iamaleksey] [~JoshuaMcKenzie]?