[
https://issues.apache.org/jira/browse/CASSANDRA-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956580#comment-16956580
]
David Capwell commented on CASSANDRA-15364:
-------------------------------------------
FYI, I created a quick JMH benchmark to show the difference between the TreeSet
version and the canonical file version; here are the results (too lazy to run it
for long, so I stopped once the iterations became stable, and I only ran it on a Mac).
Number of Components: 5
{code:java}
# 1 SSTable
* Before : ~30031.749 ops/s
* After : ~42116.323 ops/s
# 20000 SSTables
* Before : ~0.328 ops/s
* After : ~4.523 ops/s
* Before w/abspath : ~3.133 ops/s
{code}
The last result ("Before w/abspath") is from when I switched from
FileUtils.getCanonicalPath to new File(baseFilename).getAbsolutePath()
(though the descriptor still calls getCanonicalFile).
LGTM.
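For readers unfamiliar with the two calls compared above, here is a minimal sketch of how they differ, using plain java.io.File rather than Cassandra's FileUtils wrapper. The class and method names (PathResolutionSketch, absolute, canonical) are hypothetical and exist only for illustration. The sketch shows behaviour only and says nothing about the timings reported above.

```java
import java.io.File;
import java.io.IOException;

final class PathResolutionSketch
{
    // Resolves a relative path against the working directory.
    // Does not remove ".." segments or resolve symbolic links.
    static String absolute(String baseFilename)
    {
        return new File(baseFilename).getAbsolutePath();
    }

    // Produces a unique form of the path: removes "." and ".." segments
    // and, on UNIX platforms, resolves symbolic links. This may query the
    // filesystem, which is why it can throw IOException.
    static String canonical(String baseFilename) throws IOException
    {
        return new File(baseFilename).getCanonicalPath();
    }
}
```

The absolute form is adequate only when the caller does not depend on symlinks or ".." segments being resolved.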
> Avoid over scanning data directories in LogFile.verify()
> --------------------------------------------------------
>
> Key: CASSANDRA-15364
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15364
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We currently list the data directory for every {{REMOVE}} record in the file
> in {{LogFile.verify()}} - this can get very expensive during startup when we
> call {{LogTransaction.removeUnfinishedLeftovers()}}. In
> {{LogRecord.getExistingFiles(Set<String> absoluteFilePaths)}} we also fully
> parse the file names of the sstables found, although a prefix match is all we need here.
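
The idea in the description (list the directory once, then prefix match) can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: PrefixMatchSketch and filesByPrefix are hypothetical names, and the sketch assumes each record can be reduced to a file-name prefix such as "na-1-big-".

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.Set;
import java.util.TreeSet;

final class PrefixMatchSketch
{
    /**
     * Lists dir exactly once and returns, for each requested prefix, the
     * names of the files in dir that start with that prefix (sorted).
     */
    static Map<String, List<String>> filesByPrefix(File dir, Set<String> prefixes)
    {
        // Single directory scan; list() returns null if dir cannot be read.
        String[] names = dir.list();
        NavigableSet<String> sorted = new TreeSet<>();
        if (names != null)
            sorted.addAll(Arrays.asList(names));

        Map<String, List<String>> result = new HashMap<>();
        for (String prefix : prefixes)
        {
            List<String> matches = new ArrayList<>();
            // All names sharing a prefix are contiguous in sorted order,
            // starting at the first name >= prefix.
            for (String name : sorted.tailSet(prefix, true))
            {
                if (!name.startsWith(prefix))
                    break;
                matches.add(name);
            }
            result.put(prefix, matches);
        }
        return result;
    }
}
```

With this shape, the cost of the directory listing is paid once per directory instead of once per record, and no file name is parsed beyond a startsWith check.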