[
https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768331#comment-17768331
]
Jacek Lewandowski edited comment on CASSANDRA-18871 at 9/24/23 7:03 AM:
------------------------------------------------------------------------
So the build failed, probably because it was running for too long. I looked into
the logs to figure out why it takes so long and learned that there are
benchmarks which take extremely long to run.
{{btree.BTreeTransformBench}}, params = 7x9x2=63, x4 forks = 252, x3 methods =
756 tests x 11s each ~= 2h 20m
{{btree.BTreeUpdateBench}}, params = 7x7x3x2x2x3=1764, x4 forks = 7056, x1
method = 7056 tests x 30s each ~= 59h
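The arithmetic above can be sanity-checked with a few lines of plain Java (the counts are taken straight from this comment; the per-test durations are rough averages, so this is an estimate, not a measurement):

```java
public class BenchTimeEstimate {
    // tests = parameter combinations * forks * benchmark methods;
    // total seconds = tests * average seconds per test
    static long totalSeconds(long paramCombos, long forks, long methods, long secondsPerTest) {
        return paramCombos * forks * methods * secondsPerTest;
    }

    public static void main(String[] args) {
        // btree.BTreeTransformBench: 63 combos, 4 forks, 3 methods, ~11s each
        System.out.printf("BTreeTransformBench: ~%.1f h%n", totalSeconds(63, 4, 3, 11) / 3600.0);
        // btree.BTreeUpdateBench: 1764 combos, 4 forks, 1 method, ~30s each
        System.out.printf("BTreeUpdateBench:    ~%.1f h%n", totalSeconds(1764, 4, 1, 30) / 3600.0);
    }
}
```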
The latter is clearly unacceptable for CI. We need to reduce the number of
parameters and also set the number of forks to 1 (probably for each benchmark).
I'm going to exclude the benchmark for now and create a ticket to fix it later
(I'm going to do that for each benchmark which causes CI to fail). Those
benchmarks can still be run locally with {{-Dbenchmark.name=...}}.
Btw, I've implemented a test which estimates benchmark run times based on the
number of forks, the number and duration of warmup and measurement iterations,
and the number of parameter combinations. The results are as follows:
{noformat}
Estimated time for CacheLoaderBench: ~5 s
Estimated time for LatencyTrackingBench: ~26 s
Estimated time for SampleBench: ~30 s
Estimated time for ReadWriteBench: ~30 s
Estimated time for MutationBench: ~30 s
Estimated time for CompactionBench: ~35 s
Estimated time for DiagnosticEventPersistenceBench: ~40 s
Estimated time for ZeroCopyStreamingBench: ~44 s
Estimated time for BatchStatementBench: ~110 s
Estimated time for DiagnosticEventServiceBench: ~120 s
Estimated time for MessageOutBench: ~144 s
Estimated time for BloomFilterSerializerBench: ~144 s
Estimated time for FastThreadLocalBench: ~156 s
Estimated time for HashingBench: ~156 s
Estimated time for ChecksumBench: ~208 s
Estimated time for StreamingTombstoneHistogramBuilderBench: ~208 s
Estimated time for PendingRangesBench: ~5 m
Estimated time for DirectorySizerBench: ~5 m
Estimated time for instance.ReadSmallPartitionsBench: ~5 m
Estimated time for PreaggregatedByteBufsBench: ~7 m
Estimated time for AutoBoxingBench: ~8 m
Estimated time for OutputStreamBench: ~13 m
Estimated time for BTreeBuildBench: ~13 m
Estimated time for StringsEncodeBench: ~20 m
Estimated time for instance.ReadWidePartitionsBench: ~21 m
Estimated time for btree.BTreeBuildBench: ~30 m
Estimated time for BTreeSearchIteratorBench: ~31 m
Estimated time for btree.BTreeTransformBench: ~138 m
Estimated time for btree.AtomicBTreePartitionUpdateBench: ~288 m
Estimated time for btree.BTreeUpdateBench: ~58 h
Total estimated time: ~69 h
{noformat}
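A sketch of how such an estimate can be derived from the JMH knobs (hypothetical code, not the actual test from the patch; it simply multiplies the iteration schedule out per fork and parameter combination):

```java
public class JmhRuntimeEstimator {
    /**
     * Rough lower bound on a benchmark's wall-clock time: each parameter
     * combination runs in every fork, and each fork executes its warmup
     * and measurement iterations sequentially.
     */
    static long estimateSeconds(int forks,
                                int warmupIterations, int warmupSeconds,
                                int measurementIterations, int measurementSeconds,
                                int parameterCombinations) {
        long perFork = (long) warmupIterations * warmupSeconds
                     + (long) measurementIterations * measurementSeconds;
        return (long) forks * perFork * parameterCombinations;
    }

    public static void main(String[] args) {
        // e.g. 4 forks, 5 warmup + 5 measurement iterations of 1s each, 63 combos
        System.out.println(estimateSeconds(4, 5, 1, 5, 1, 63) + " s"); // 2520 s
    }
}
```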
We can make it assert that no benchmark is expected to run longer than 30
minutes (but as said, that's for a separate ticket).
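Such a guard could look roughly like this (an illustrative sketch only; the class and method names are made up, and the limit is the 30 minutes mentioned above):

```java
import java.util.Map;

public class BenchmarkTimeGuard {
    static final long LIMIT_SECONDS = 30 * 60;

    // Throws if any benchmark's estimated run time exceeds the limit.
    static void assertWithinLimit(Map<String, Long> estimatedSeconds) {
        for (Map.Entry<String, Long> e : estimatedSeconds.entrySet())
            if (e.getValue() > LIMIT_SECONDS)
                throw new AssertionError(e.getKey() + " estimated at " + e.getValue()
                                         + " s, over the " + LIMIT_SECONDS + " s limit");
    }

    public static void main(String[] args) {
        assertWithinLimit(Map.of("CacheLoaderBench", 5L)); // within the limit, passes
        try {
            assertWithinLimit(Map.of("btree.BTreeUpdateBench", 211680L));
        } catch (AssertionError expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```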
> JMH benchmark improvements
> --------------------------
>
> Key: CASSANDRA-18871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18871
> Project: Cassandra
> Issue Type: Improvement
> Components: Build, Legacy/Tools
> Reporter: Jacek Lewandowski
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> 1. CASSANDRA-12586 introduced the {{build-jmh}} task, which builds an uber jar
> for JMH benchmarks that is then not used by the {{ant microbench}} task. It is
> used, though, by the {{test/bin/jmh}} script.
> In fact, I have no idea why we should use an uber jar if JMH can run perfectly
> well with a regular classpath. Maybe that had something to do with the older
> JMH version used at that time. Building uber jars takes time and is annoying.
> Since it seems to be redundant anyway, I'm going to remove it and fix
> {{test/bin/jmh}} to use a regular classpath.
> 2. I'll add support for async-profiler in benchmarks. That is, the
> {{microbench}} target automatically fetches the async-profiler binaries and
> adds the necessary args for JMH ({{-prof async...}} in particular) whenever we
> run the {{microbench-with-profiler}} task. If no additional properties are
> provided, some default options will be applied (defined in the script, can be
> negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property
> will be added as profiler options after the library path and target directory
> definition.
> 3. If someone wants to see any additional improvements, please comment on the
> ticket.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)