LuciferYang commented on PR #36571:
URL: https://github.com/apache/spark/pull/36571#issuecomment-1128419927
Micro-benchmark for the `ColumnVectorUtils.populate` method:
```scala
def testPopulate(valuesPerIteration: Int, length: Int): Unit = {
  val batchSize = 4096
  val onHeapColumnVector = new OnHeapColumnVector(batchSize, StringType)
  val offHeapColumnVector = new OffHeapColumnVector(batchSize, StringType)
  // Each populate call fills the whole vector, so one iteration
  // processes valuesPerIteration * batchSize values.
  val benchmark = new Benchmark(
    s"Test ColumnVectorUtils.populate, row length = $length",
    valuesPerIteration * batchSize,
    output = output)

  // Single-field row holding a random string of `length` characters.
  val builder = new UTF8StringBuilder()
  builder.append(RandomStringUtils.random(length))
  val row = InternalRow(builder.build())

  benchmark.addCase("OnHeapColumnVector") { _: Int =>
    for (_ <- 0L until valuesPerIteration) {
      onHeapColumnVector.reset()
      ColumnVectorUtils.populate(onHeapColumnVector, row, 0)
    }
  }

  benchmark.addCase("OffHeapColumnVector") { _: Int =>
    for (_ <- 0L until valuesPerIteration) {
      offHeapColumnVector.reset()
      ColumnVectorUtils.populate(offHeapColumnVector, row, 0)
    }
  }

  benchmark.run()
}

override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
  val valuesPerIteration = 100000
  // Run the benchmark once per string length.
  Seq(1, 5, 10, 15, 20).foreach { length =>
    testPopulate(valuesPerIteration, length)
  }
}
```
**Before**
```
OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 1:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         3381           3404          32       121.2           8.3       1.0X
OffHeapColumnVector                                        3931           3968          53       104.2           9.6       0.9X

OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 5:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         4700           4767          96        87.2          11.5       1.0X
OffHeapColumnVector                                        5258           5356         139        77.9          12.8       0.9X

OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 10: Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         4920           4934          19        83.2          12.0       1.0X
OffHeapColumnVector                                        5007           5017          14        81.8          12.2       1.0X

OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 15: Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         5227           5255          40        78.4          12.8       1.0X
OffHeapColumnVector                                        5626           5731         148        72.8          13.7       0.9X

OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 20: Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         5226           5263          53        78.4          12.8       1.0X
OffHeapColumnVector                                        5526           5699         244        74.1          13.5       0.9X
```
**After**
```
OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 1:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         3734           3742          11       109.7           9.1       1.0X
OffHeapColumnVector                                        3683           3683           0       111.2           9.0       1.0X

OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 5:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         4085           4088           4       100.3          10.0       1.0X
OffHeapColumnVector                                        4770           4771           2        85.9          11.6       0.9X

OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 10: Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         4788           4789           1        85.5          11.7       1.0X
OffHeapColumnVector                                        4387           4387           0        93.4          10.7       1.1X

OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 15: Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         4669           4669           0        87.7          11.4       1.0X
OffHeapColumnVector                                        5197           5198           1        78.8          12.7       0.9X

OpenJDK 64-Bit Server VM 1.8.0_332-b09 on Linux 5.13.0-1022-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Test ColumnVectorUtils.populate, row length = 20: Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------
OnHeapColumnVector                                         4769           4769           0        85.9          11.6       1.0X
OffHeapColumnVector                                        5441           5441           1        75.3          13.3       0.9X
```
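
For reading the tables, here is a small sketch of how the derived columns relate to `Best Time(ms)`. It assumes `Rate(M/s)` and `Per Row(ns)` are computed from the best time and the `valuesPerIteration * batchSize` value count passed to `Benchmark` (100000 * 4096 in this run); the reported numbers are consistent with that up to rounding of the printed best time. `PopulateBenchmarkMath` and its methods are hypothetical helpers for illustration, not part of Spark.

```scala
// Illustrative helper only; not part of Spark or of this PR.
object PopulateBenchmarkMath {
  // Values processed per benchmark iteration: valuesPerIteration * batchSize.
  val valuesPerRun: Long = 100000L * 4096L

  // Throughput in millions of values per second for a given best time.
  def rateMPerSec(bestTimeMs: Double): Double =
    valuesPerRun / (bestTimeMs / 1000.0) / 1e6

  // Average cost per value in nanoseconds for a given best time.
  def perRowNs(bestTimeMs: Double): Double =
    bestTimeMs * 1e6 / valuesPerRun

  // Ratio of "before" to "after" best time; above 1.0 means "after" is faster.
  def speedup(beforeMs: Double, afterMs: Double): Double =
    beforeMs / afterMs

  def main(args: Array[String]): Unit = {
    // OnHeapColumnVector, row length = 1, "Before" run: best time 3381 ms.
    println(f"rate = ${rateMPerSec(3381)}%.1f M/s, per row = ${perRowNs(3381)}%.1f ns")
    // OffHeapColumnVector, row length = 1: 3931 ms before, 3683 ms after.
    println(f"speedup = ${speedup(3931, 3683)}%.2fx")
  }
}
```

The ratios come from a single run on a shared Azure runner, so small differences between Before and After may be within run-to-run noise.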