Re: RFR: 8187033: [PPC] Improve performance of ObjectStreamClass.getClassDataLayout()

Peter Levart Wed, 20 Sep 2017 06:49:41 -0700

Hi Ogata,

On 09/20/2017 12:12 PM, Kazunori Ogata wrote:

Hi Peter,


The benchmark is GradientBoostingTree of Intel HiBench [1].  HiBench is a
suite of programs using Hadoop or Spark, and GradientBoostingTree is a
Spark program.  The source code (in Scala) is [2].  To build the code, you
need Apache Spark.

The command line is equivalent to
java -Xmx10g -Dspark.master="local[4]" GradientBoostingTree <inputDir> 100,
but what I actually use is a Java program that calls the main method and
measures its execution time using System.currentTimeMillis().
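
A minimal sketch of what such a timing wrapper might look like. The class
name TimedMain and the placeholder workload are assumptions made for
illustration; the actual wrapper was not posted, and it would call
GradientBoostingTree's main method where the placeholder loop is.

```java
public class TimedMain {

    // Runs the workload once and returns the elapsed wall-clock time in ms.
    static long timeMillis(Runnable workload) {
        long start = System.currentTimeMillis();
        workload.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        // Placeholder workload (hypothetical). In the setup described above
        // this would be the call to the benchmark's main method.
        long elapsed = timeMillis(() -> {
            long sum = 0;
            for (int i = 0; i < 100_000_000; i++) {
                sum += i;
            }
            System.out.println("checksum = " + sum);
        });
        System.out.println("elapsed = " + elapsed + " ms");
    }
}
```

A single wall-clock measurement like this includes JIT warm-up, which is
acceptable for a long-running Spark job but not for a microbenchmark.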

By the way, I'm running the benchmark on a POWER8 machine.  Removing
volatile won't change the performance on x86.


[1] https://github.com/intel-hadoop/HiBench
[2]
https://github.com/intel-hadoop/HiBench/blob/master/sparkbench/ml/src/main/scala/com/intel/sparkbench/ml/GradientBoostingTree.scala


Regards,
Ogata

Huh, I thought it would be something easier to run. Am I right that the
improvement we are expecting comes from execution of Java serialization
and deserialization of some data structure? If you could extract from
the benchmark just the approximate shape of the data structure and
typical values it contains, I could create a JMH benchmark that tests
just that part, which would be appropriate for tuning the serialization
code. After some best variant is chosen, you could verify it by running
your test in your Spark setup. I think there is still room for
improvement. I have a few ideas I would like to test.
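
To make the request concrete, here is a rough sketch of the kind of
round trip such a benchmark would exercise, using only the JDK. The
Point class, its fields, and the sizes are purely hypothetical stand-ins,
since the real shape of the data in the Spark job is not known yet. A JMH
benchmark would wrap the serialize and deserialize calls in @Benchmark
methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class SerializationRoundTrip {

    // Hypothetical payload: a label plus a feature vector.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int label;
        final double[] features;

        Point(int label, double[] features) {
            this.label = label;
            this.features = features;
        }
    }

    // Builds n points with `width` features each (values are arbitrary).
    static ArrayList<Point> sampleData(int n, int width) {
        ArrayList<Point> points = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            double[] features = new double[width];
            for (int j = 0; j < width; j++) {
                features[j] = i * 0.5 + j;
            }
            points.add(new Point(i % 2, features));
        }
        return points;
    }

    // Serializes an object graph to a byte array.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Reads an object graph back from a byte array.
    static Object deserialize(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(sampleData(1000, 16));
        Object copy = deserialize(data);
        System.out.println("bytes = " + data.length
                + ", type = " + copy.getClass().getName());
    }
}
```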


Regards, Peter
