Hello,

The -XX:TieredStopAtLevel=1 flag is often used in applications (e.g. Spring 
Boot based ones) to reduce start-up time.

With this flag I've spotted a huge performance degradation of Array::newInstance 
compared to a plain constructor call.

I've used this benchmark:

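import java.lang.reflect.Array;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)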
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class ArrayInstantiationBenchmark {

  @Param({"10", "100", "1000"})
  private int length;

  @Benchmark
  public Object newInstance() {
    return Array.newInstance(Object.class, length);
  }

  @Benchmark
  public Object constructor() {
    return new Object[length];
  }

}

On C2 (JDK 11) both methods perform the same:

Benchmark                                (length)  Mode  Cnt    Score    Error  Units
ArrayInstantiationBenchmark.constructor        10  avgt   50   11,557 ±  0,316  ns/op
ArrayInstantiationBenchmark.constructor       100  avgt   50   86,944 ±  4,945  ns/op
ArrayInstantiationBenchmark.constructor      1000  avgt   50  520,722 ± 28,068  ns/op

ArrayInstantiationBenchmark.newInstance        10  avgt   50   11,899 ±  0,569  ns/op
ArrayInstantiationBenchmark.newInstance       100  avgt   50   86,805 ±  5,103  ns/op
ArrayInstantiationBenchmark.newInstance      1000  avgt   50  488,647 ± 20,829  ns/op

On C1, however, there's a huge difference (approximately 8 times!) for 
length = 10:

Benchmark                                (length)  Mode  Cnt    Score    Error  Units
ArrayInstantiationBenchmark.constructor        10  avgt   50   11,183 ±  0,168  ns/op
ArrayInstantiationBenchmark.constructor       100  avgt   50   92,215 ±  4,425  ns/op
ArrayInstantiationBenchmark.constructor      1000  avgt   50  838,303 ± 33,161  ns/op

ArrayInstantiationBenchmark.newInstance        10  avgt   50   86,696 ±  1,297  ns/op
ArrayInstantiationBenchmark.newInstance       100  avgt   50  106,751 ±  2,796  ns/op
ArrayInstantiationBenchmark.newInstance      1000  avgt   50  840,582 ± 24,745  ns/op

Note that for length = {100, 1000} the two methods perform almost the same.

I suppose it's a bug somewhere in the VM, because both methods just allocate 
memory and do zeroing elimination, so there shouldn't be such a huge 
difference between them.
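
To make the "both methods do the same thing" claim concrete, here is a minimal sketch, separate from the benchmark, that checks the two allocation paths return arrays of the same runtime class and length, filled with nulls. The class and method names (ArrayEquivalenceSketch, viaReflection, viaConstructor, sameShape) are made up for illustration; it only shows functional equivalence and says nothing about how C1 or C2 compile either path.

```java
import java.lang.reflect.Array;

public class ArrayEquivalenceSketch {

    // Reflective allocation, as in the newInstance() benchmark method.
    public static Object viaReflection(int length) {
        return Array.newInstance(Object.class, length);
    }

    // Direct allocation, as in the constructor() benchmark method.
    public static Object viaConstructor(int length) {
        return new Object[length];
    }

    // True if both arrays share runtime class and length
    // and every element of both is null (freshly zeroed).
    public static boolean sameShape(Object a, Object b) {
        if (a.getClass() != b.getClass()) {
            return false;
        }
        int length = Array.getLength(a);
        if (length != Array.getLength(b)) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (Array.get(a, i) != null || Array.get(b, i) != null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Same lengths as the @Param values in the benchmark.
        for (int length : new int[] {10, 100, 1000}) {
            Object reflective = viaReflection(length);
            Object direct = viaConstructor(length);
            System.out.println(length + ": " + sameShape(reflective, direct));
        }
    }
}
```

Running it prints "true" for each of the three lengths, so any difference between the two benchmark methods should come from how the call is compiled rather than from what it returns.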

Sergey Tsypanov

