[ 
https://issues.apache.org/jira/browse/CASSANDRA-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956580#comment-16956580
 ] 

David Capwell commented on CASSANDRA-15364:
-------------------------------------------

FYI, I wrote a quick JMH benchmark to show the difference between the TreeSet 
version and the canonical file version. Here are the results (I did not run it 
for long and stopped once the iterations became stable; I only ran it on a Mac):

 

Number of Components: 5

 
{code:java}
# 1 SSTable
* Before : ~30031.749 ops/s
* After  : ~42116.323 ops/s
# 20000 SSTables
* Before : ~0.328 ops/s
* After  : ~4.523 ops/s
* Before w/abspath : ~3.133 ops/s
{code}
 

 

The last result ("Before w/abspath") is the "Before" version after I switched 
from FileUtils.getCanonicalPath to new File(baseFilename).getAbsolutePath() 
(though the descriptor still calls getCanonicalFile).
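
To illustrate why that switch can matter, here is a minimal sketch using only 
the standard java.io.File API. It is not the Cassandra code: the class name, 
method names, and the sample path are hypothetical. getAbsolutePath() only 
resolves a relative path against the working directory, while 
getCanonicalPath() also normalizes "." and ".." and may consult the file 
system (for example to resolve symlinks), which is presumably where the extra 
cost comes from.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class PathResolutionSketch {
    // Makes the path absolute by resolving it against the working directory.
    // ".." segments are left in place.
    static String absolute(String baseFilename) {
        return new File(baseFilename).getAbsolutePath();
    }

    // Produces a unique, normalized form of the path. This removes "." and ".."
    // and may query the file system, so it is typically the more expensive call.
    static String canonical(String baseFilename) {
        try {
            return new File(baseFilename).getCanonicalPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical relative path, used only for illustration.
        String path = "data/../data/ks-tbl-Data.db";
        System.out.println("absolute : " + absolute(path));
        System.out.println("canonical: " + canonical(path));
    }
}
```

The two forms are only interchangeable when the input paths are already 
normalized and free of symlinks, so which one is safe depends on how the paths 
were produced.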

 

LGTM.

> Avoid over scanning data directories in LogFile.verify()
> --------------------------------------------------------
>
>                 Key: CASSANDRA-15364
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15364
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We currently list the data directory for every {{REMOVE}} record in the file 
> in {{LogFile.verify()}}. This can get very expensive during startup when we 
> call {{LogTransaction.removeUnfinishedLeftovers()}}. In 
> {{LogRecord.getExistingFiles(Set<String> absoluteFilePaths)}} we also fully 
> parse the file names of the sstables found, although here we only need to 
> prefix match.
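
The idea in the description above can be sketched as follows: list the 
directory once, then assign each listed file to a record by prefix matching 
instead of parsing the full sstable file name. This is an illustrative sketch 
only, not the actual patch. The class and method names and the sample paths 
are hypothetical, and it assumes that no base path is a prefix of another base 
path (for example because each one ends with a separator such as "-"), which 
is what makes the TreeSet floor() lookup sufficient.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;

public class PrefixMatchSketch {
    // Groups files from a single directory listing by the base path they start with.
    // Assumption: no element of basePaths is a prefix of another element.
    static Map<String, List<String>> groupByPrefix(NavigableSet<String> basePaths,
                                                   Collection<String> listedFiles) {
        Map<String, List<String>> existing = new HashMap<>();
        for (String file : listedFiles) {
            // Under the assumption above, the only base path that can be a prefix
            // of this file is the greatest one that sorts at or before it.
            String candidate = basePaths.floor(file);
            if (candidate != null && file.startsWith(candidate)) {
                existing.computeIfAbsent(candidate, k -> new ArrayList<>()).add(file);
            }
        }
        return existing;
    }

    public static void main(String[] args) {
        // Hypothetical base paths taken from log records.
        NavigableSet<String> basePaths = new TreeSet<>(Arrays.asList(
            "/data/ks/tbl/md-1-big-", "/data/ks/tbl/md-10-big-"));
        // Hypothetical result of listing the data directory once.
        List<String> listed = Arrays.asList(
            "/data/ks/tbl/md-1-big-Data.db",
            "/data/ks/tbl/md-10-big-Index.db",
            "/data/ks/tbl/md-2-big-Data.db");
        System.out.println(groupByPrefix(basePaths, listed));
    }
}
```

With this shape, the cost is one directory listing plus one ordered-set lookup 
per file, rather than one listing per record.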



