MaxGekk opened a new pull request #28613:
URL: https://github.com/apache/spark/pull/28613


   ### What changes were proposed in this pull request?
   Re-generate results of:
   - DateTimeBenchmark
   - CSVBenchmark
   - JsonBenchmark
   
   in the environment:
   
   | Item | Description |
   | ---- | ----|
   | Region | us-west-2 (Oregon) |
   | Instance | r3.xlarge |
   | AMI | ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1 (ami-06f2f779464715dc5) |
   | Java | OpenJDK 64-Bit Server VM 1.8.0_242 and OpenJDK 64-Bit Server VM 11.0.6+10 |
   
   ### Why are the changes needed?
   1. The PR https://github.com/apache/spark/pull/28576 changed the date-time parser. `DateTimeBenchmark` should confirm that that PR didn't slow down date/timestamp parsing.
   2. The CSV/JSON datasources are affected by the above PR too. This PR updates the benchmark results in the same environment as the other benchmarks, to have a baseline for future optimizations.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   By running benchmarks via the script:
   ```python
   #!/usr/bin/env python3
   
   import os
   from sparktestsupport.shellutils import run_cmd
   
   # Each entry is [sbt project/configuration, benchmark main class].
   benchmarks = [
       ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'],
       ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'],
       ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark']
   ]
   
   print('Set SPARK_GENERATE_BENCHMARK_FILES=1')
   os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1'
   
   for b in benchmarks:
       print("Run benchmark: %s" % b[1])
       run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])])
   ```
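
For anyone who wants to reproduce the run without `sparktestsupport` on the Python path, the same loop can be sketched with only the standard library. This is a rough equivalent, not the script that was actually used: `build_sbt_command`, `run_benchmarks` and the `dry_run` flag are hypothetical names introduced here, and the sketch assumes it is started from the root of a Spark checkout so that `build/sbt` resolves.

```python
#!/usr/bin/env python3
import os
import subprocess

# Same (sbt project/configuration, main class) pairs as in the script above.
BENCHMARKS = [
    ("sql/test", "org.apache.spark.sql.execution.benchmark.DateTimeBenchmark"),
    ("sql/test", "org.apache.spark.sql.execution.datasources.csv.CSVBenchmark"),
    ("sql/test", "org.apache.spark.sql.execution.datasources.json.JsonBenchmark"),
]


def build_sbt_command(project, main_class):
    """Return the argv list that runs one benchmark main class through sbt."""
    return ["build/sbt", "%s:runMain %s" % (project, main_class)]


def run_benchmarks(benchmarks, dry_run=False):
    """Run each benchmark in order; with dry_run=True only collect the commands."""
    # Pass the flag through a copied environment instead of mutating os.environ.
    env = dict(os.environ, SPARK_GENERATE_BENCHMARK_FILES="1")
    commands = []
    for project, main_class in benchmarks:
        cmd = build_sbt_command(project, main_class)
        commands.append(cmd)
        if not dry_run:
            print("Run benchmark: %s" % main_class)
            # check=True stops at the first failing benchmark.
            subprocess.run(cmd, env=env, check=True)
    return commands


if __name__ == "__main__":
    run_benchmarks(BENCHMARKS)
```

Calling `run_benchmarks(BENCHMARKS, dry_run=True)` returns the three sbt command lines without launching anything, which is a cheap way to check the class names before committing to a long run.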


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


