Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19222
@hvanhovell @rednaxelafx
After running a benchmark program, I took a polymorphic approach (i.e. each
subclass has its own `getInt()`/`putInt()` methods). With it, I got better
performance than with a monomorphic approach (i.e. only the `MemoryBlock` class
has `final` `getInt()`/`putInt()` methods).
**The root cause of the better performance is that a concrete type is passed as
the first argument of `Platform.getInt()/putInt()` rather than using a virtual call.**
I ran [this benchmark program](https://gist.github.com/kiszk/94f75b506c93a663bbbc372ffe8f05de)
using [this commit](https://github.com/apache/spark/commit/0714ddcab6d83a489e791536775630e75e8fe5c6)
and got the following results:
```
OpenJDK 64-Bit Server VM 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13 on Linux 4.4.0-22-generic
Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
Memory access benchmarks:                Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
IntArrayMemoryBlock                             423 /  445        634.1           1.6       1.0X
ByteArrayMemoryBlock                            433 /  443        620.3           1.6       1.0X
Platform                                        431 /  436        622.7           1.6       1.0X
Platform Object                                1004 / 1055        267.4           3.7       0.4X
Platform copyMemory                              45 /   48       5903.9           0.2       9.3X
Platform copyMemory Object                       45 /   47       6004.0           0.2       9.5X
```
These results show three facts:
1. According to the first three results, having `getInt()/putInt()` in
subclasses of `MemoryBlock` achieves performance comparable to the current
implementation (`Platform` in the table).
2. According to the third and fourth results, even when we use
`Platform.getInt()/putInt()`, performance is more than 2x worse
(`Platform Object` in the table) when we pass `Object` as the first argument
instead of a concrete type (i.e. `byte[]`).
For example, `byte[] b; Platform.getInt(b, 0);` achieves better
performance than `Object o; Platform.getInt(o, 0);`.
3. According to the fifth and sixth results, for `Platform.copyMemory()`, passing
`Object` achieves the same performance as passing `byte[]`.
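
To make the two call shapes in fact 2 concrete, here is a minimal sketch. It is not the benchmark itself and does not reproduce the timings. `CallShapes` and its methods are hypothetical names; the static `getInt`/`putInt` only mirror the shape of `Platform.getInt(Object, long)` on top of `sun.misc.Unsafe`, which is assumed to be reachable via reflection (as on the JDK 8 used above).

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical stand-in that mirrors the shape of Platform.getInt(Object, long).
class CallShapes {
    private static final Unsafe UNSAFE;
    public static final long BYTE_ARRAY_OFFSET;

    static {
        Unsafe unsafe;
        try {
            // Assumption: the "theUnsafe" field is accessible via reflection.
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            unsafe = (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
        UNSAFE = unsafe;
        BYTE_ARRAY_OFFSET = UNSAFE.arrayBaseOffset(byte[].class);
    }

    public static int getInt(Object base, long offset) {
        return UNSAFE.getInt(base, offset);
    }

    public static void putInt(Object base, long offset, int value) {
        UNSAFE.putInt(base, offset, value);
    }

    // Call site whose first argument has the static type byte[]
    // (the "Platform" row in the table).
    public static int sumConcrete(byte[] b, int count) {
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += getInt(b, BYTE_ARRAY_OFFSET + 4L * i);
        }
        return sum;
    }

    // Same loop, but the first argument has the static type Object
    // (the "Platform Object" row in the table).
    public static int sumObject(Object o, int count) {
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += getInt(o, BYTE_ARRAY_OFFSET + 4L * i);
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        for (int i = 0; i < 4; i++) {
            putInt(data, BYTE_ARRAY_OFFSET + 4L * i, i + 1);
        }
        // Both variants read the same four ints (1, 2, 3, 4).
        System.out.println(sumConcrete(data, 4));
        System.out.println(sumObject(data, 4));
    }
}
```

Both methods return the same value; the only difference is the static type visible at the call site, which is the variable the benchmark isolates.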
Based on fact 2, I used the polymorphic approach so that each subclass of
`MemoryBlock` passes its own concrete type. As a result, this PR achieves the same
performance as the current Spark code wherever that code already passes a
concrete type as the first argument of `Platform.getInt()/putInt()`.
Where the current Spark code passes `Object` (e.g.
[here](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java#L61)),
this PR can achieve better performance.
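
The polymorphic approach can be sketched roughly as follows. This is a simplified illustration, not the code in the PR: `PolymorphicBlocks`, `Block`, `ByteArrayBlock`, and `IntArrayBlock` are hypothetical names, offsets are assumed to be byte offsets relative to the start of the array, and `sun.misc.Unsafe` is used directly in place of `Platform`.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical, simplified stand-ins; not the classes from the PR.
class PolymorphicBlocks {
    private static final Unsafe UNSAFE;
    private static final long BYTE_BASE;
    private static final long INT_BASE;

    static {
        Unsafe unsafe;
        try {
            // Assumption: the "theUnsafe" field is accessible via reflection.
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            unsafe = (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
        UNSAFE = unsafe;
        BYTE_BASE = UNSAFE.arrayBaseOffset(byte[].class);
        INT_BASE = UNSAFE.arrayBaseOffset(int[].class);
    }

    // Each subclass supplies its own getInt()/putInt().
    abstract static class Block {
        abstract int getInt(long offset);
        abstract void putInt(long offset, int value);
    }

    static final class ByteArrayBlock extends Block {
        private final byte[] array; // concrete type, not Object

        ByteArrayBlock(byte[] array) { this.array = array; }

        @Override int getInt(long offset) {
            return UNSAFE.getInt(array, BYTE_BASE + offset);
        }

        @Override void putInt(long offset, int value) {
            UNSAFE.putInt(array, BYTE_BASE + offset, value);
        }
    }

    static final class IntArrayBlock extends Block {
        private final int[] array; // concrete type, not Object

        IntArrayBlock(int[] array) { this.array = array; }

        @Override int getInt(long offset) {
            return UNSAFE.getInt(array, INT_BASE + offset);
        }

        @Override void putInt(long offset, int value) {
            UNSAFE.putInt(array, INT_BASE + offset, value);
        }
    }

    public static void main(String[] args) {
        Block block = new ByteArrayBlock(new byte[8]);
        block.putInt(4, 42);
        System.out.println(block.getInt(4));
    }
}
```

The design point is that the field passed to the unsafe accessor is declared as `byte[]` or `int[]` in each subclass, so the accessor never sees a first argument typed only as `Object`.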
Probably, @rednaxelafx can explain this very well :)