[jira] [Comment Edited] (CASSANDRA-18871) JMH benchmark improvements

2023-09-25 Thread Jacek Lewandowski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768331#comment-17768331
 ] 

Jacek Lewandowski edited comment on CASSANDRA-18871 at 9/25/23 7:57 AM:


So the build failed, probably because it was running for too long. I looked into 
the logs to figure out why it takes so long and learned that there are benchmarks 
which take extremely long to run. 
There are also a number of benchmarks which simply fail for various reasons, 
such as assertion errors or other runtime exceptions. Those are unrelated 
to this patch and should be addressed in CASSANDRA-18873.


was (Author: jlewandowski):
So the build failed, probably because it was running for too long. I looked into 
the logs to figure out why it takes so long and learned that there are benchmarks 
which take extremely long to run. 

{{btree.BTreeTransformBench}}, params = 7x9 = 63, x4 forks = 252, x3 methods = 
756 tests x ~11s each ~= 2h 20m
{{btree.BTreeUpdateBench}}, params = 7x7x3x2x2x3 = 1764, x4 forks = 7056, x1 
method = 7056 tests x ~30s each ~= 59h
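The arithmetic above can be reproduced with a tiny script (the helper below is 
mine, for illustration only, not part of the patch):

```python
# Recompute the quoted run-time estimates: total tests and wall-clock time.
def total_runtime(param_combinations, forks, methods, seconds_per_test):
    tests = param_combinations * forks * methods
    return tests, tests * seconds_per_test

# btree.BTreeTransformBench: 63 combinations x 4 forks x 3 methods, ~11 s each
tests, secs = total_runtime(63, 4, 3, 11)
print(tests, round(secs / 3600, 1))  # → 756 2.3

# btree.BTreeUpdateBench: 1764 combinations x 4 forks x 1 method, ~30 s each
tests, secs = total_runtime(1764, 4, 1, 30)
print(tests, round(secs / 3600, 1))  # → 7056 58.8
```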

I'm going to exclude the longest benchmarks for now and create a ticket to fix 
them later - see CASSANDRA-18873. Those benchmarks are still fine to run locally 
with {{-Dbenchmark.name=...}}
 
btw. I've implemented a test which estimates benchmark run times from the 
number of forks, the number and duration of warmup and measurement iterations, 
and the number of parameter combinations. The results are as follows: 

{noformat}
Estimated time for CacheLoaderBench: ~5 s
Estimated time for LatencyTrackingBench: ~26 s
Estimated time for SampleBench: ~30 s
Estimated time for ReadWriteBench: ~30 s
Estimated time for MutationBench: ~30 s
Estimated time for CompactionBench: ~35 s
Estimated time for DiagnosticEventPersistenceBench: ~40 s
Estimated time for ZeroCopyStreamingBench: ~44 s
Estimated time for BatchStatementBench: ~110 s
Estimated time for DiagnosticEventServiceBench: ~120 s
Estimated time for MessageOutBench: ~144 s
Estimated time for BloomFilterSerializerBench: ~144 s
Estimated time for FastThreadLocalBench: ~156 s
Estimated time for HashingBench: ~156 s
Estimated time for ChecksumBench: ~208 s
Estimated time for StreamingTombstoneHistogramBuilderBench: ~208 s
Estimated time for PendingRangesBench: ~ 5 m
Estimated time for DirectorySizerBench: ~ 5 m
Estimated time for instance.ReadSmallPartitionsBench: ~ 5 m
Estimated time for PreaggregatedByteBufsBench: ~ 7 m
Estimated time for AutoBoxingBench: ~ 8 m
Estimated time for OutputStreamBench: ~ 13 m
Estimated time for BTreeBuildBench: ~ 13 m
Estimated time for StringsEncodeBench: ~ 20 m
Estimated time for instance.ReadWidePartitionsBench: ~ 21 m
Estimated time for btree.BTreeBuildBench: ~ 30 m
Estimated time for BTreeSearchIteratorBench: ~ 31 m
Estimated time for btree.BTreeTransformBench: ~ 138 m
Estimated time for btree.AtomicBTreePartitionUpdateBench: ~ 288 m
Estimated time for btree.BTreeUpdateBench: ~58 h
Total estimated time: ~69 h
{noformat}

We can make it assert that no benchmark is expected to run longer than 30 
minutes (but, as said, that's for a separate ticket)
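For illustration, the estimate-plus-cap idea could look roughly like this (the 
function and parameter names here are hypothetical, not the actual test):

```python
# Hypothetical sketch: estimate a benchmark's total run time from its JMH
# settings and assert it stays under a 30-minute cap.
def estimate_seconds(param_combinations, forks, methods,
                     warmup_iters, warmup_time_s,
                     measure_iters, measure_time_s):
    per_run = warmup_iters * warmup_time_s + measure_iters * measure_time_s
    return param_combinations * forks * methods * per_run

CAP_S = 30 * 60  # proposed per-benchmark limit

est = estimate_seconds(param_combinations=8, forks=1, methods=2,
                       warmup_iters=2, warmup_time_s=1,
                       measure_iters=3, measure_time_s=1)
assert est <= CAP_S, f"benchmark would run ~{est} s, over the {CAP_S} s cap"
print(est)  # → 80
```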

> JMH benchmark improvements
> --
>
> Key: CASSANDRA-18871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18871
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build, Legacy/Tools
>Reporter: Jacek Lewandowski
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> 1. CASSANDRA-12586 introduced the {{build-jmh}} task, which builds an uber jar 
> for JMH benchmarks that is then not used by the {{ant microbench}} task. It 
> is, however, used by the {{test/bin/jmh}} script. 
> In fact, I have no idea why we should use an uber jar if JMH can run perfectly 
> well with a regular classpath. Maybe that had something to do with the older 
> JMH version used at that time. Building uber jars takes time and is 
> annoying. Since it seems to be redundant anyway, I'm going to remove it and 
> fix {{test/bin/jmh}} to use a regular classpath. 
> 2. I'll add support for async-profiler in benchmarks. That is, the 
> {{microbench}} target automatically fetches the async-profiler binaries and 
> adds the necessary args for JMH ({{-prof async...}} in particular) whenever we 
> run the {{microbench-with-profiler}} task. If no additional properties are 
> provided, some default options are applied (defined in the script; can be 
> negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property 
> will be added as profiler options after the library path and target directory 
> definitions.
> 3. If someone wants to see any additional improvements, please comment on the 
> ticket.
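As a sketch of point 2, the profiler argument could be assembled like this (the 
{{libPath}}/{{dir}} option names follow async-profiler's JMH integration, but 
the paths and defaults here are my assumptions, not the actual ant script):

```python
# Sketch: build the JMH profiler argument, appending any user-supplied
# profiler.opts after the library path and output directory.
def jmh_profiler_args(lib_path, out_dir, profiler_opts=""):
    opts = f"libPath={lib_path};dir={out_dir}"
    if profiler_opts:  # contents of the profiler.opts property, if any
        opts += ";" + profiler_opts
    return ["-prof", f"async:{opts}"]

print(" ".join(jmh_profiler_args("/opt/async-profiler/libasyncProfiler.so",
                                 "build/profiler",
                                 "output=flamegraph;event=cpu")))
```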



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CASSANDRA-18871) JMH benchmark improvements

2023-09-24 Thread Jacek Lewandowski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768331#comment-17768331
 ] 

Jacek Lewandowski edited comment on CASSANDRA-18871 at 9/24/23 7:03 AM:


So the build failed, probably because it was running for too long. I looked into 
the logs to figure out why it takes so long and learned that there are benchmarks 
which take extremely long to run. 

{{btree.BTreeTransformBench}}, params = 7x9 = 63, x4 forks = 252, x3 methods = 
756 tests x ~11s each ~= 2h 20m
{{btree.BTreeUpdateBench}}, params = 7x7x3x2x2x3 = 1764, x4 forks = 7056, x1 
method = 7056 tests x ~30s each ~= 59h

To me, the latter is unacceptable for CI. We need to reduce the number of 
parameters and also set the number of forks to 1 (probably for each test). 

I'm going to exclude the benchmark for now and create a ticket to fix it later 
(I'm going to do that for each benchmark which causes CI to fail). Those 
benchmarks are still fine to run locally with {{-Dbenchmark.name=...}}
 
btw. I've implemented a test which estimates benchmark run times from the 
number of forks, the number and duration of warmup and measurement iterations, 
and the number of parameter combinations. The results are as follows: 

{noformat}
Estimated time for CacheLoaderBench: ~5 s
Estimated time for LatencyTrackingBench: ~26 s
Estimated time for SampleBench: ~30 s
Estimated time for ReadWriteBench: ~30 s
Estimated time for MutationBench: ~30 s
Estimated time for CompactionBench: ~35 s
Estimated time for DiagnosticEventPersistenceBench: ~40 s
Estimated time for ZeroCopyStreamingBench: ~44 s
Estimated time for BatchStatementBench: ~110 s
Estimated time for DiagnosticEventServiceBench: ~120 s
Estimated time for MessageOutBench: ~144 s
Estimated time for BloomFilterSerializerBench: ~144 s
Estimated time for FastThreadLocalBench: ~156 s
Estimated time for HashingBench: ~156 s
Estimated time for ChecksumBench: ~208 s
Estimated time for StreamingTombstoneHistogramBuilderBench: ~208 s
Estimated time for PendingRangesBench: ~ 5 m
Estimated time for DirectorySizerBench: ~ 5 m
Estimated time for instance.ReadSmallPartitionsBench: ~ 5 m
Estimated time for PreaggregatedByteBufsBench: ~ 7 m
Estimated time for AutoBoxingBench: ~ 8 m
Estimated time for OutputStreamBench: ~ 13 m
Estimated time for BTreeBuildBench: ~ 13 m
Estimated time for StringsEncodeBench: ~ 20 m
Estimated time for instance.ReadWidePartitionsBench: ~ 21 m
Estimated time for btree.BTreeBuildBench: ~ 30 m
Estimated time for BTreeSearchIteratorBench: ~ 31 m
Estimated time for btree.BTreeTransformBench: ~ 138 m
Estimated time for btree.AtomicBTreePartitionUpdateBench: ~ 288 m
Estimated time for btree.BTreeUpdateBench: ~58 h
Total estimated time: ~69 h
{noformat}

We can make it assert that no benchmark is expected to run longer than 30 
minutes (but, as said, that's for a separate ticket)



[jira] [Comment Edited] (CASSANDRA-18871) JMH benchmark improvements

2023-09-21 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767504#comment-17767504
 ] 

Branimir Lambov edited comment on CASSANDRA-18871 at 9/21/23 10:47 AM:
---

Yes, the parameter passing works great for me now.

Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine.


was (Author: blambov):
Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine.



