GitHub user kiszk reopened a pull request:
https://github.com/apache/spark/pull/19222
[SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks to choose several
types of memory block
## What changes were proposed in this pull request?
This PR allows us to use one of several types of `MemoryBlock`, such as
byte array, int array, long array, or `java.nio.DirectByteBuffer`. To use
`java.nio.DirectByteBuffer` allows to have off heap memory which is
automatically deallocated by JVM. `MemoryBlock` class has primitive accessors
like `Platform.getInt()`, `Platform.putint()`, or `Platform.copyMemory()`.
This PR uses `MemoryBlock` for `OffHeapColumnVector`, `UTF8String`, and
other places. This PR can improve performance of operations involving memory
accesses (e.g. `UTF8String.trim`) by 1.8x.
For now, this PR does not use `MemoryBlock` for `BufferHolder` based on
@cloud-fan's
[suggestion](https://github.com/apache/spark/pull/11494#issuecomment-309694290).
Since this PR is a successor of #11494, close #11494. Many codes were
ported from #11494. Many efforts were put here. **I think this PR should credit
to @yzotov.**
This PR can achieve **1.1-1.4x performance improvements** for operations
in `UTF8String` or `Murmur3_x86_32`. Other operations are almost comparable
performances.
Without this PR
```
OpenJDK 64-Bit Server VM 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13 on Linux
4.4.0-22-generic
Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
OpenJDK 64-Bit Server VM 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13 on Linux
4.4.0-22-generic
Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
Hash byte arrays with length 268435487: Best/Avg Time(ms) Rate(M/s)
Per Row(ns) Relative
------------------------------------------------------------------------------------------------
Murmur3_x86_32 526 / 536 0.0
131399881.5 1.0X
UTF8String benchmark: Best/Avg Time(ms) Rate(M/s)
Per Row(ns) Relative
------------------------------------------------------------------------------------------------
hashCode 525 / 552 1022.6
1.0 1.0X
substring 414 / 423 1298.0
0.8 1.3X
```
With this PR
```
OpenJDK 64-Bit Server VM 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13 on Linux
4.4.0-22-generic
Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
Hash byte arrays with length 268435487: Best/Avg Time(ms) Rate(M/s)
Per Row(ns) Relative
------------------------------------------------------------------------------------------------
Murmur3_x86_32 474 / 488 0.0
118552232.0 1.0X
UTF8String benchmark: Best/Avg Time(ms) Rate(M/s)
Per Row(ns) Relative
------------------------------------------------------------------------------------------------
hashCode 476 / 480 1127.3
0.9 1.0X
substring 287 / 291 1869.9
0.5 1.7X
```
Benchmark program
```
test("benchmark Murmur3_x86_32") {
val length = 8192 * 32768 + 31
val seed = 42L
val iters = 1 << 2
val random = new Random(seed)
val arrays = Array.fill[MemoryBlock](numArrays) {
val bytes = new Array[Byte](length)
random.nextBytes(bytes)
new ByteArrayMemoryBlock(bytes, Platform.BYTE_ARRAY_OFFSET, length)
}
val benchmark = new Benchmark("Hash byte arrays with length " + length,
iters * numArrays, minNumIters = 20)
benchmark.addCase("HiveHasher") { _: Int =>
var sum = 0L
for (_ <- 0L until iters) {
sum += HiveHasher.hashUnsafeBytesBlock(
arrays(i), Platform.BYTE_ARRAY_OFFSET, length)
}
}
benchmark.run()
}
test("benchmark UTF8String") {
val N = 512 * 1024 * 1024
val iters = 2
val benchmark = new Benchmark("UTF8String benchmark", N, minNumIters = 20)
val str0 = new java.io.StringWriter() { { for (i <- 0 until N) { write("
") } } }.toString
val s0 = UTF8String.fromString(str0)
benchmark.addCase("hashCode") { _: Int =>
var h: Int = 0
for (_ <- 0L until iters) { h += s0.hashCode }
}
benchmark.addCase("substring") { _: Int =>
var s: UTF8String = null
for (_ <- 0L until iters) { s = s0.substring(N / 2 - 5, N / 2 + 5) }
}
benchmark.run()
}
```
I run [this benchmark
program](https://gist.github.com/kiszk/94f75b506c93a663bbbc372ffe8f05de) using
[the
commit](https://github.com/apache/spark/pull/19222/commits/ee5a79861c18725fb1cd9b518cdfd2489c05b81d6).
I got the following results:
```
OpenJDK 64-Bit Server VM 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12 on Linux
4.4.0-66-generic
Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
Memory access benchmarks: Best/Avg Time(ms) Rate(M/s)
Per Row(ns) Relative
------------------------------------------------------------------------------------------------
ByteArrayMemoryBlock get/putInt() 220 / 221 609.3
1.6 1.0X
Platform get/putInt(byte[]) 220 / 236 610.9
1.6 1.0X
Platform get/putInt(Object) 492 / 494 272.8
3.7 0.4X
OnHeapMemoryBlock get/putLong() 322 / 323 416.5
2.4 0.7X
long[] 221 / 221 608.0
1.6 1.0X
Platform get/putLong(long[]) 321 / 321 418.7
2.4 0.7X
Platform get/putLong(Object) 561 / 563 239.2
4.2 0.4X
```
I also run [this benchmark
program](https://gist.github.com/kiszk/5fdb4e03733a5d110421177e289d1fb5) for
comparing performance of `Platform.copyMemory()`.
```
OpenJDK 64-Bit Server VM 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12 on Linux
4.4.0-66-generic
Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
Platform copyMemory: Best/Avg Time(ms) Rate(M/s)
Per Row(ns) Relative
------------------------------------------------------------------------------------------------
Object to Object 1961 / 1967 8.6
116.9 1.0X
System.arraycopy Object to Object 1917 / 1921 8.8
114.3 1.0X
byte array to byte array 1961 / 1968 8.6
116.9 1.0X
System.arraycopy byte array to byte array 1909 / 1937 8.8
113.8 1.0X
int array to int array 1921 / 1990 8.7
114.5 1.0X
double array to double array 1918 / 1923 8.7
114.3 1.0X
Object to byte array 1961 / 1967 8.6
116.9 1.0X
Object to short array 1965 / 1972 8.5
117.1 1.0X
Object to int array 1910 / 1915 8.8
113.9 1.0X
Object to float array 1971 / 1978 8.5
117.5 1.0X
Object to double array 1919 / 1944 8.7
114.4 1.0X
byte array to Object 1959 / 1967 8.6
116.8 1.0X
int array to Object 1961 / 1970 8.6
116.9 1.0X
double array to Object 1917 / 1924 8.8
114.3 1.0X
```
These results show three facts:
1. According to the second/third or sixth/seventh results in the first
experiment, if we use `Platform.get/putInt(Object)`, we achieve more than 2x
worse performance than `Platform.get/putInt(byte[])` with concrete type (i.e.
`byte[]`).
2. According to the second/third or fourth/fifth/sixth results in the first
experiment, the fastest way to access an array element on Java heap is
`array[]`. **Cons of `array[]` is that it is not possible to support
unaligned-8byte access.**
3. According to the first/second/third or fourth/sixth/seventh results in
the first experiment, `getInt()/putInt() or getLong()/putLong()` in subclasses
of `MemoryBlock` can achieve comparable performance to `Platform.get/putInt()`
or `Platform.get/putLong()` with concrete type (second or sixth result). There
is no overhead regarding virtual call.
4. According to results in the second experiment, for `Platform.copy()`, to
pass `Object` can achieve the same performance as to pass any type of primitive
array as source or destination.
5. According to second/fourth results in the second experiment,
`Platform.copy()` can achieve the same performance as `System.arrayCopy`. **It
would be good to use `Platform.copy()` since `Platform.copy()` can take any
types for src and dst.**
We are incrementally replace `Platform.get/putXXX` with
`MemoryBlock.get/putXXX`. This is because we have two advantages.
1) Achieve better performance due to having a concrete type for an array.
2) Use simple OO design instead of passing `Object`
It is easy to use `MemoryBlock` in `InternalRow`, `BufferHolder`,
`TaskMemoryManager`, and others that are already abstracted. It is not easy to
use `MemoryBlock` in utility classes related to hashing or others.
Other candidates are
- UnsafeRow, UnsafeArrayData, UnsafeMapData, SpecificUnsafeRowJoiner
- UTF8StringBuffer
- BufferHolder
- TaskMemoryManager
- OnHeapColumnVector
- BytesToBytesMap
- CachedBatch
- classes for hash
- others.
## How was this patch tested?
Added `UnsafeMemoryAllocator`
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kiszk/spark SPARK-10399
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19222.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19222
----
commit f427ca2a0f30d70f42b4755d0d72318cfa0e4c77
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-13T10:16:19Z
introduce ByteArrayMemoryBlock, IntArrayMemoryBlock, LongArrayMemoryBlock,
and OffheaMemoryBlock
commit 5d7ccdb0e845afcaa430bac3d21b519d35d1e6f4
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-13T17:15:25Z
OffHeapColumnVector uses UnsafeMemoryAllocator
commit 251fa09d4421085a6c97f56bb38108980a79b5c8
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-13T17:27:09Z
UTF8String uses UnsafeMemoryAllocator
commit 790bbe7f3ac52b2b1e4375684998737d6127a552
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-13T17:36:57Z
Platform.copymemory() in UsafeInMemorySorter uses new MemoryBlock
commit 93a792e7729f213fcef66f0ed9b33f45259d1ec3
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-14T15:34:48Z
address review comments
commit 0beab0308cdac57e85daf7794170a0ca899ab568
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-14T17:53:33Z
fix test failures (e.g. String in UnsafeArrayData)
commit fcf764c1aebdc847675f710c17ec8477d6022a40
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-18T16:13:13Z
fix failures
commit d2d2e50f8a2baf41d5b85127bf888da6f8bca343
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-21T18:45:55Z
minor update of UTF8String constructor
commit f5e10bb52c33856ddd3e1b1f8483b170e0167c53
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-22T11:00:12Z
rename method name
commit 1905e8ca4b3b8200fa56f5fc91899bb420a07628
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-22T11:01:13Z
remove unused code
commit 7778e586e94130749cec3f54a60b6fb24514647a
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-22T11:02:30Z
update arrayEquals
commit 4f96c82b151b78641bcfc92a65913048b055cfee
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-22T13:18:01Z
rebase master
commit d1d6ae90589c0fae6c64af0cb95696e270446228
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-22T14:51:40Z
make more methods final
commit 914dcd11d0d5ef014284868f2794cb4e5baa0958
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-22T15:39:57Z
make fill method final in MemoryBlock
commit 336e4b7bfd7edcb861edeac3ca115dead785b68a
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-23T11:35:15Z
fix test failures
commit 5be9ccb163832e1895b045730a06e79eb3b171cf
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-24T14:00:40Z
add testsuite
commit 43e6b572bd893bd42e58df930a34b7e31549a49a
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-24T18:10:49Z
pass concrete type to the first argument of Platform.get*/put* to get
better performance
commit 05f024e566f828e9c3f836430c9c7b34da5e954b
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-28T01:51:38Z
rename methods related to hash
commit 9071cf6449123400f3d774664e1709337b05c555
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-28T01:52:48Z
added methods for MemoryBlock
commit 37ee9fa07a8f8faaeb097164422bb15958fa4b1c
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-28T03:44:30Z
rebase with master
commit d0b5d59bb31fe2845477ee243008992686e2f2a2
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-09-28T04:37:25Z
fix scala style error
commit 5cdad44717ccb510d4114d14cfff304bef9f5bb4
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-10-14T07:29:14Z
use MemoryBlock in Murmur3 for performance reason
commit 91028fa2ae34bb3ae667692112b8455d4394cbbd
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-10-14T07:29:30Z
fix typo in comment
commit 0210bd1e5f46f81617a35493d2cd0b737b4cf85d
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-10-29T12:32:38Z
address review comment
commit df6dad3762f4e918d503df75ae8fce052af8bf43
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2017-11-28T06:08:52Z
rebase with master
commit 1fa47a8c291ab54a8c1e386737769f467a76a672
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2018-02-20T12:17:26Z
fix failures
commit 01f9c8e8146ff8d21d18feba59c0c2ad83299e2a
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2018-02-20T17:30:11Z
fix failures in ArrowColumnVectorSuite and FeatureHasherSuite
commit 2ed8f82f13288e00e6352bdccbac0dd890c91715
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2018-02-21T17:48:41Z
address review comments
rename LongArrayMemoryBlock to OnHeapMemoryBlock
remove intArrayMemoryBlock
separate copyMemory to copyFrom and writeTo and make them non-static
add assertion for range check
commit 5e3afd11a2dc76d2cd23264b4052abbf3f5d7e9d
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2018-02-23T16:20:21Z
address review comments
commit 95fbdee04e9137938cdc76f7f4573116720357f5
Author: Kazuaki Ishizaki <ishizaki@...>
Date: 2018-02-24T00:22:44Z
fix test failures
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]