Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21596#discussion_r197712172
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala ---
@@ -25,8 +25,13 @@ import org.apache.spark.util.{Benchmark, Utils}
/**
* The benchmarks aims to measure performance of JSON parsing when encoding is set and isn't.
- * To run this:
- * spark-submit --class <this class> --jars <spark sql test jar>
+ * To run:
+ * mvn clean package -pl sql/core -DskipTests
+ * ./dev/make-distribution.sh --name local-dist
+ * cd dist/
+ * ./bin/spark-submit --class org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
+ * ../sql/core/target/spark-sql_2.11-2.4.0-SNAPSHOT-tests.jar > /tmp/output.txt
--- End diff --
But wouldn't this command become invalid once we cut another release? I mean, I
believe we can build by following
https://spark.apache.org/docs/latest/building-spark.html and then run the benchmark.
Let's just say it like `bin/spark-submit --class
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks --jars
./sql/core/target/spark-sql_*.jar`, or like `spark-submit --class
org.apache.spark.sql.execution.datasources.json.JSONBenchmarks --jars <spark
sql test jar>`.
It would be better to keep the steps independent of any particular build approach.
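For illustration, a minimal sketch of how a version-independent scaladoc header
could read (the exact wording and the `<spark sql test jar>` placeholder are just
one possible phrasing, not a final proposal):

```scala
/**
 * Benchmark to measure JSON parsing performance when encoding is set and isn't.
 * To run this benchmark:
 *   1. Build Spark following https://spark.apache.org/docs/latest/building-spark.html
 *   2. bin/spark-submit --class org.apache.spark.sql.execution.datasources.json.JSONBenchmarks \
 *        --jars <spark sql test jar>
 */
```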
---