[jira] [Comment Edited] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768331#comment-17768331 ]

Jacek Lewandowski edited comment on CASSANDRA-18871 at 9/25/23 7:57 AM:

So the build failed, probably because it was running for too long. I looked into the logs to figure out why it takes so long and learned that there are benchmarks that take extremely long to run. There are also a number of benchmarks that simply fail for various reasons, such as assertion errors or other runtime exceptions. Those failures are unrelated to this patch and should be addressed in CASSANDRA-18873.

was (Author: jlewandowski):

So the build failed, probably because it was running for too long. I looked into the logs to figure out why it takes so long and learned that there are benchmarks that take extremely long to run:

{{btree.BTreeTransformBench}}: params = 7x9x2 = 63, x4 forks = 252, x3 methods = 756 tests x 11 s each ~= 2 h 20 m
{{btree.BTreeUpdateBench}}: params = 7x7x3x2x2x3 = 1764, x4 forks = 7056, x1 method = 7056 tests x 30 s each ~= 59 h

I'm going to exclude the longest benchmarks for now and create a ticket to fix them later - see CASSANDRA-18873. Those benchmarks are still fine to run locally with {{-Dbenchmark.name=...}}, by the way. I've implemented a test which estimates benchmark run times based on the number of forks, the number and duration of warmup and measurement iterations, and the number of parameter combinations.
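The arithmetic behind those two estimates (parameter combinations x forks x methods x seconds per test) can be sketched as below; the class and method names are hypothetical illustrations, not part of the actual patch:

```java
// Hypothetical sketch of the run-time estimate described above:
// total ~= paramCombinations * forks * methods * secondsPerTest.
public class BenchmarkTimeEstimator {
    /** Estimated wall-clock seconds for one benchmark class. */
    public static long estimateSeconds(long paramCombinations, int forks,
                                       int methods, long secondsPerTest) {
        return paramCombinations * forks * methods * secondsPerTest;
    }

    public static void main(String[] args) {
        // btree.BTreeTransformBench: 7x9x2 = 63 params, 4 forks, 3 methods, ~11 s each
        long transform = estimateSeconds(7 * 9 * 2, 4, 3, 11);          // 8316 s ~= 2 h 19 m
        // btree.BTreeUpdateBench: 7x7x3x2x2x3 = 1764 params, 4 forks, 1 method, ~30 s each
        long update = estimateSeconds(7 * 7 * 3 * 2 * 2 * 3, 4, 1, 30); // 211680 s ~= 58.8 h
        System.out.println(transform + " " + update);
    }
}
```

The numbers reproduce the hand calculation in the comment: 756 tests x 11 s and 7056 tests x 30 s.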
The results are as follows:

{noformat}
Estimated time for CacheLoaderBench: ~5 s
Estimated time for LatencyTrackingBench: ~26 s
Estimated time for SampleBench: ~30 s
Estimated time for ReadWriteBench: ~30 s
Estimated time for MutationBench: ~30 s
Estimated time for CompactionBench: ~35 s
Estimated time for DiagnosticEventPersistenceBench: ~40 s
Estimated time for ZeroCopyStreamingBench: ~44 s
Estimated time for BatchStatementBench: ~110 s
Estimated time for DiagnosticEventServiceBench: ~120 s
Estimated time for MessageOutBench: ~144 s
Estimated time for BloomFilterSerializerBench: ~144 s
Estimated time for FastThreadLocalBench: ~156 s
Estimated time for HashingBench: ~156 s
Estimated time for ChecksumBench: ~208 s
Estimated time for StreamingTombstoneHistogramBuilderBench: ~208 s
Estimated time for PendingRangesBench: ~5 m
Estimated time for DirectorySizerBench: ~5 m
Estimated time for instance.ReadSmallPartitionsBench: ~5 m
Estimated time for PreaggregatedByteBufsBench: ~7 m
Estimated time for AutoBoxingBench: ~8 m
Estimated time for OutputStreamBench: ~13 m
Estimated time for BTreeBuildBench: ~13 m
Estimated time for StringsEncodeBench: ~20 m
Estimated time for instance.ReadWidePartitionsBench: ~21 m
Estimated time for btree.BTreeBuildBench: ~30 m
Estimated time for BTreeSearchIteratorBench: ~31 m
Estimated time for btree.BTreeTransformBench: ~138 m
Estimated time for btree.AtomicBTreePartitionUpdateBench: ~288 m
Estimated time for btree.BTreeUpdateBench: ~58 h
Total estimated time: ~69 h
{noformat}

We can make it assert that no benchmark is planned to run longer than 30 minutes (but, as said, in a separate ticket).

> JMH benchmark improvements
> --
>
> Key: CASSANDRA-18871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18871
> Project: Cassandra
> Issue Type: Improvement
> Components: Build, Legacy/Tools
> Reporter: Jacek Lewandowski
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> 1. CASSANDRA-12586 introduced the {{build-jmh}} task, which builds an uber jar for JMH benchmarks that is then not used by the {{ant microbench}} task. It is used, though, by the {{test/bin/jmh}} script.
> In fact, I have no idea why we should use an uber jar if JMH can run perfectly well with a regular classpath. Maybe that had something to do with the older JMH version used at the time. Building uber jars takes time and is annoying. Since it seems to be redundant anyway, I'm going to remove it and fix {{test/bin/jmh}} to use a regular classpath.
> 2. I'll add support for async-profiler in benchmarks. That is, the {{microbench}} target automatically fetches the async-profiler binaries and adds the necessary args for JMH ({{-prof async...}} in particular) whenever we run the {{microbench-with-profiler}} task. If no additional properties are provided, some default options are applied (defined in the script; open to discussion). Otherwise, whatever is passed to the {{profiler.opts}} property is added as profiler options after the library path and target directory definitions.
> 3. If anyone wants to see any additional improvements, please comment on the ticket.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
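The 30-minute guard suggested in the comment could be a plain assertion over the per-benchmark estimates. A minimal self-contained sketch, with hypothetical class and method names (not the actual test from the patch):

```java
import java.util.Map;

// Hypothetical sketch of a check that flags any benchmark whose estimated
// run time exceeds a CI budget (30 minutes, per the comment above).
public class BenchmarkTimeLimitCheck {
    static final long LIMIT_SECONDS = 30 * 60;

    /** Returns the name of the first benchmark over budget, or null if all fit. */
    static String firstOverBudget(Map<String, Long> estimatedSeconds) {
        for (Map.Entry<String, Long> e : estimatedSeconds.entrySet())
            if (e.getValue() > LIMIT_SECONDS)
                return e.getKey();
        return null;
    }

    public static void main(String[] args) {
        // Two of the estimates from the comment, in seconds.
        System.out.println(firstOverBudget(Map.of("CacheLoaderBench", 5L)));               // null
        System.out.println(firstOverBudget(Map.of("btree.BTreeUpdateBench", 58L * 3600))); // over budget
    }
}
```

In a real test the map would be populated from the JMH annotations (forks, warmup/measurement iterations, parameter combinations) and the check would fail the build instead of returning a name.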
[jira] [Comment Edited] (CASSANDRA-18871) JMH benchmark improvements

[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768331#comment-17768331 ]

Jacek Lewandowski edited comment on CASSANDRA-18871 at 9/24/23 7:03 AM:

So the build failed, probably because it was running for too long. I looked into the logs to figure out why it takes so long and learned that there are benchmarks that take extremely long to run:

{{btree.BTreeTransformBench}}: params = 7x9x2 = 63, x4 forks = 252, x3 methods = 756 tests x 11 s each ~= 2 h 20 m
{{btree.BTreeUpdateBench}}: params = 7x7x3x2x2x3 = 1764, x4 forks = 7056, x1 method = 7056 tests x 30 s each ~= 59 h

To me, the latter is unacceptable for CI. We need to reduce the number of parameters and also set the number of forks to 1 (probably for each test). I'm going to exclude the benchmark for now and create a ticket to fix it later (I'll do that for each benchmark that causes CI to fail). Those benchmarks are still fine to run locally with {{-Dbenchmark.name=...}}, by the way. I've implemented a test which estimates benchmark run times based on the number of forks, the number and duration of warmup and measurement iterations, and the number of parameter combinations.
[jira] [Comment Edited] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767504#comment-17767504 ]

Branimir Lambov edited comment on CASSANDRA-18871 at 9/21/23 10:47 AM:

Yes, the parameter passing works great for me now. Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine.

was (Author: blambov):

Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine.