[jira] [Commented] (GEOMETRY-75) Performance Test Module

2020-02-04 Thread Matt Juntunen (Jira)


[ 
https://issues.apache.org/jira/browse/GEOMETRY-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17030032#comment-17030032
 ] 

Matt Juntunen commented on GEOMETRY-75:
---

[~erans], are you able to merge this?

> Performance Test Module
> ---
>
>                 Key: GEOMETRY-75
>                 URL: https://issues.apache.org/jira/browse/GEOMETRY-75
>             Project: Apache Commons Geometry
>          Issue Type: Task
>            Reporter: Matt Juntunen
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add a module for executing performance tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEOMETRY-75) Performance Test Module

2020-02-03 Thread Matt Juntunen (Jira)


[ 
https://issues.apache.org/jira/browse/GEOMETRY-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17029404#comment-17029404
 ] 

Matt Juntunen commented on GEOMETRY-75:
---

Do any more changes need to be made on this PR?



[jira] [Commented] (GEOMETRY-75) Performance Test Module

2020-02-01 Thread Matt Juntunen (Jira)


[ 
https://issues.apache.org/jira/browse/GEOMETRY-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028102#comment-17028102
 ] 

Matt Juntunen commented on GEOMETRY-75:
---

Thanks for the code review, [~aherbert]!

 
{quote}From what I know the computation takes about 75% of its runtime or more 
for the over/underflow protection. Do you need to have this functionality?
{quote}
It has been requested in GEOMETRY-50.

 
{quote}I note that your benchmarks use 1 vector. Best practice for benchmarking 
code with lots of branches is to have a lot of data.
{quote}
I've updated it to use a list of input vectors, 1000 by default. I used the 
approach given in the JMH examples 
[here|https://hg.openjdk.java.net/code-tools/jmh/file/a61ab96aafb9/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_34_SafeLooping.java]
 and directed the output to a {{Blackhole}}.

bq. In the Complex benchmark I have added unnormalized vectors using a 
NormalizedGaussianSampler for each dimension. ... It applies perfectly to what 
you are benchmarking here.

I added it in and I hope I'm using it correctly. The result numbers seem to be 
about the same.



[jira] [Commented] (GEOMETRY-75) Performance Test Module

2020-01-29 Thread Alex Herbert (Jira)


[ 
https://issues.apache.org/jira/browse/GEOMETRY-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026329#comment-17026329
 ] 

Alex Herbert commented on GEOMETRY-75:
--

I see that you did use a JDK from 9+. Here's the difference on my laptop 
between JDK 8 and 9:
{noformat}
Benchmark                      (type)  Mode  Cnt    Score   Error  Units
VectorPerformance.norm2D       random  avgt    5  291.902 ± 2.366  ns/op
VectorPerformance.norm2D         edge  avgt    5   29.761 ± 0.081  ns/op
VectorPerformance.normalize2D     N/A  avgt    5  319.804 ± 9.047  ns/op

Benchmark                      (type)  Mode  Cnt   Score   Error  Units
VectorPerformance.norm2D       random  avgt    5  20.071 ± 0.757  ns/op
VectorPerformance.norm2D         edge  avgt    5  15.815 ± 2.870  ns/op
VectorPerformance.normalize2D     N/A  avgt    5  19.371 ± 3.843  ns/op
{noformat}
Note that the hypot function does a fair bit of work to protect against
over/underflow. If you replace it with {{Math.sqrt(x*x+y*y)}} you will see a
big difference.
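
To make the over/underflow protection concrete, here is a minimal sketch
comparing the two computations on extreme inputs. The class and method names
are made up for illustration and are not part of the PR.

```java
// Sketch only: HypotOverflowDemo, naiveNorm and safeNorm are hypothetical names.
final class HypotOverflowDemo {

    /** Naive norm: the intermediate squares can overflow or underflow. */
    static double naiveNorm(double x, double y) {
        return Math.sqrt(x * x + y * y);
    }

    /** Protected norm: Math.hypot avoids intermediate overflow/underflow. */
    static double safeNorm(double x, double y) {
        return Math.hypot(x, y);
    }

    public static void main(String[] args) {
        // Large components: x * x exceeds Double.MAX_VALUE.
        System.out.println(naiveNorm(3e200, 4e200));   // prints Infinity
        System.out.println(safeNorm(3e200, 4e200));    // approximately 5e200

        // Tiny components: x * x underflows to zero.
        System.out.println(naiveNorm(3e-200, 4e-200)); // prints 0.0
        System.out.println(safeNorm(3e-200, 4e-200));  // approximately 5e-200
    }
}
```

Whether the protection is worth its cost depends on whether inputs of this
magnitude are expected in practice.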

I note that your benchmarks use 1 vector. Best practice for benchmarking code
with lots of branches is to have a lot of data. This applies to Math.hypot.
With the same input the branch prediction process in the CPU will learn the
input and choose the right branch. So the JMH result gives you the time for the
computations on the correct path, not the time it would take if the data were
unpredictable (which would be slower as some cycles are lost when an incorrect
branch is pipelined and has to be corrected).

Math.hypot uses extended precision multiplication to achieve results within 1
ULP of the exact answer. Part of this involves a branch where, given
{{|x| > |y|}}, it chooses a different calculation if {{|2y| > |x|}}, i.e. the
values are within 2-fold of each other. On most input combinations with the
same range for x and y this will occur 50% of the time. It is hard to predict
from the earlier branches in the same function, which only check which value is
larger. This impacts the performance of hypot. I've tested this using a mock
which returns an incorrect result just to make the branch do something. This
code is slower than returning {{x * x + y * y}} because the unpredictable 50%
guess on which path to take costs cycles.
{code:java}
final double w = x - y;
if (w > y) {
    return x;
}
// 2y > x > y
return y;
{code}
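The 50% figure can be checked with a short simulation. This is a rough sketch
that assumes both components are drawn uniformly from [0, 1). It only models
the within-2-fold condition described above, not the JDK source, and the class
and method names are hypothetical.

```java
import java.util.SplittableRandom;

// Sketch only: estimates how often the "within 2-fold" condition holds.
final class HypotBranchFraction {

    /** Fraction of samples where the smaller magnitude is more than half the larger. */
    static double withinTwoFold(int samples, long seed) {
        SplittableRandom rng = new SplittableRandom(seed);
        int hits = 0;
        for (int i = 0; i < samples; i++) {
            // Assumption: both components uniform in [0, 1).
            double a = rng.nextDouble();
            double b = rng.nextDouble();
            double max = Math.max(a, b);
            double min = Math.min(a, b);
            if (2 * min > max) {
                hits++;
            }
        }
        return (double) hits / samples;
    }

    public static void main(String[] args) {
        // Prints a value close to 0.5 for uniform input.
        System.out.println(withinTwoFold(1_000_000, 42L));
    }
}
```

Other input distributions will give a different ratio, but any ratio that is
not close to 0 or 1 leaves the branch hard to predict.
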
To try more data you could add {{@Setup(Level.Iteration)}}. This will generate 
a different vector for each iteration, so 5 in total with the current defaults. 
I did this and you can see the two branches being run at max speed in norm2D. 
Here it is not quite 50/50 (I get a more even ratio with more iterations) but 
one branch is 17.8ns and the other 14.6ns.
{noformat}
java -jar target/examples-jmh.jar VectorPerformance.*norm2D -i 10 -p type=random
Iteration   1: 17.831 ns/op
Iteration   2: 17.849 ns/op
Iteration   3: 17.804 ns/op
Iteration   4: 14.762 ns/op
Iteration   5: 17.905 ns/op
Iteration   6: 14.667 ns/op
Iteration   7: 18.038 ns/op
Iteration   8: 17.827 ns/op
Iteration   9: 17.857 ns/op
Iteration  10: 14.672 ns/op
{noformat}
The 1D and 3D tests with your standard random data have no branches, so 1 data
sample is enough for those.

The bigger concern here is for the edge cases. You have 9 edge case numbers in 
your array and you create a max length vector of size 3 from them. So the edge 
case test is not validating all possible edge cases.

You have two options here:
 * Add a {{@Benchmark}} for each specific edge case (NaN, Inf, 0 as documented 
in the Vectors.norm methods)
 * Alter the test to create a set of vectors of a given size and the benchmark 
has to loop over them all
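
For the second option, one way to cover every combination is to enumerate all
triples from the edge value array. This is only a sketch: the 9 values below
are my guess at a typical edge case set, not the array in the PR, and plain
{{double[]}} stands in for Vector3D to keep it self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: EdgeCaseVectors and its contents are hypothetical.
final class EdgeCaseVectors {

    /** Assumed edge values; the real array in the benchmark may differ. */
    static final double[] EDGE_VALUES = {
        Double.NaN, Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY,
        0.0, -0.0, Double.MIN_VALUE, -Double.MIN_VALUE,
        Double.MAX_VALUE, -Double.MAX_VALUE
    };

    /** Builds every (x, y, z) combination, so n values give n^3 vectors. */
    static List<double[]> allTriples(double[] values) {
        int n = values.length;
        List<double[]> out = new ArrayList<>(n * n * n);
        for (double x : values) {
            for (double y : values) {
                for (double z : values) {
                    out.add(new double[] {x, y, z});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 9 values give 9 * 9 * 9 = 729 vectors.
        System.out.println(allTriples(EDGE_VALUES).size());
    }
}
```

The benchmark method would then loop over this collection and consume each
result.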

The second method has some overhead. However, you can just mock up a similar
test with a no-op in place of your normalisation method. This encapsulates all
the same overhead of object creation without doing any computation:
{code:java}
@Benchmark
public Vector3D[] baseline(final NormalizableVectorInput3D input) {
    // Same loop and object creation as the real benchmark, but no normalisation.
    Vector3D[] data = input.getData();
    Vector3D[] result = new Vector3D[data.length];
    for (int i = 0; i < data.length; i++) {
        Vector3D v = data[i];
        result[i] = Vector3D.of(v.getX(), v.getY(), v.getZ());
    }
    return result;
}
{code}
If you subtract the time for this from your real benchmarks it provides the 
time for the operation.

I see you are using the log-uniform type distribution for producing random 
doubles. This is fine if you want to generate numbers with fully random 52-bit 
mantissas and random exponents. But these may not represent your actual 
expected data distribution. A double has a lot of values but these are not 
uniformly spread.
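
As an illustration of what such a generator looks like, here is a sketch that
assembles a double from a random sign, a random 52-bit mantissa and an exponent
chosen uniformly from a range. This is one possible way to do it and is not
taken from the PR; the class and method names are hypothetical.

```java
import java.util.SplittableRandom;

// Sketch only: builds doubles directly from random IEEE 754 bit fields.
final class RandomBitsDouble {

    /**
     * Returns a normal double with a random sign, a random 52-bit mantissa
     * and an unbiased exponent drawn uniformly from [minExp, maxExp].
     */
    static double next(SplittableRandom rng, int minExp, int maxExp) {
        if (minExp < Double.MIN_EXPONENT || maxExp > Double.MAX_EXPONENT || minExp > maxExp) {
            throw new IllegalArgumentException("exponent range must lie within [-1022, 1023]");
        }
        long sign = rng.nextLong() & Long.MIN_VALUE;          // top bit only
        long mantissa = rng.nextLong() >>> 12;                // 52 random bits
        long biased = rng.nextInt(minExp, maxExp + 1) + 1023L; // IEEE 754 exponent bias
        return Double.longBitsToDouble(sign | (biased << 52) | mantissa);
    }

    public static void main(String[] args) {
        SplittableRandom rng = new SplittableRandom(7L);
        for (int i = 0; i < 5; i++) {
            System.out.println(next(rng, -10, 10));
        }
    }
}
```

With an exponent range that is symmetric around zero, roughly half of the
generated magnitudes are below 1, which is very different from a uniform
distribution over the same interval.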

In the Complex benchmark I have added unnormalized vectors using a 
NormalizedGaussianSampler for each dimension. This puts your ND vector at a 
random 

[jira] [Commented] (GEOMETRY-75) Performance Test Module

2020-01-29 Thread Alex Herbert (Jira)


[ 
https://issues.apache.org/jira/browse/GEOMETRY-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026274#comment-17026274
 ] 

Alex Herbert commented on GEOMETRY-75:
--

I am currently working on a hypot implementation for use in Complex.
{{Math.hypot}} was changed in JDK 9 to a much faster implementation (of the
same computation) due to performance problems. Which JDK are you running for
the benchmark? If you are using JDK 8 you will see a 7x speed-up in JDK 9+.

From what I know the computation takes about 75% of its runtime or more for
the over/underflow protection. Do you need to have this functionality?

 

 



[jira] [Commented] (GEOMETRY-75) Performance Test Module

2020-01-29 Thread Matt Juntunen (Jira)


[ 
https://issues.apache.org/jira/browse/GEOMETRY-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026101#comment-17026101
 ] 

Matt Juntunen commented on GEOMETRY-75:
---

Yes, I noticed that, too. I'm wondering if it's related to the use of 
{{Math.hypot}} in the computation of the 2D norms (see 
[https://github.com/apache/commons-geometry/blob/master/commons-geometry-euclidean/src/main/java/org/apache/commons/geometry/euclidean/internal/Vectors.java#L112]).
 I was planning to investigate it further in GEOMETRY-50 after this PR is 
merged.



[jira] [Commented] (GEOMETRY-75) Performance Test Module

2020-01-29 Thread Gilles Sadowski (Jira)


[ 
https://issues.apache.org/jira/browse/GEOMETRY-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025972#comment-17025972
 ] 

Gilles Sadowski commented on GEOMETRY-75:
-

I've run the {{VectorPerformance}} benchmark:
{noformat}
Benchmark                      (type)  Mode  Cnt   Score   Error  Units
VectorPerformance.norm1D       random  avgt    5   3.399 ± 0.238  ns/op
VectorPerformance.norm1D         edge  avgt    5   3.469 ± 0.283  ns/op
VectorPerformance.norm2D       random  avgt    5  15.122 ± 0.576  ns/op
VectorPerformance.norm2D         edge  avgt    5   8.053 ± 0.098  ns/op
VectorPerformance.norm3D       random  avgt    5   4.753 ± 0.021  ns/op
VectorPerformance.norm3D         edge  avgt    5   4.604 ± 0.060  ns/op
VectorPerformance.normalize1D     N/A  avgt    5   4.455 ± 0.027  ns/op
VectorPerformance.normalize2D     N/A  avgt    5  28.252 ± 4.558  ns/op
VectorPerformance.normalize3D     N/A  avgt    5  14.061 ± 1.041  ns/op
{noformat}
The results for {{norm2D}} and {{normalize2D}} seem strange...



[jira] [Commented] (GEOMETRY-75) Performance Test Module

2020-01-27 Thread Matt Juntunen (Jira)


[ 
https://issues.apache.org/jira/browse/GEOMETRY-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024851#comment-17024851
 ] 

Matt Juntunen commented on GEOMETRY-75:
---

Added PR: https://github.com/apache/commons-geometry/pull/60
