Hey Patrick,

It's Ozgun from Citus Data. We want these benchmark results to be fair, and
we have tried different config settings for SparkSQL over the past month.
We picked the best config settings we could find, and also asked the
Spark users list about running TPC-H numbers:

http://goo.gl/IU5Hw0
http://goo.gl/WQ1kML
http://goo.gl/ihLzgh

We also received advice at Spark Summit '14 to wait until v1.1, and
therefore re-ran our tests on SparkSQL 1.1. On the specific optimizations,
Marco and Samay from our team have much more context, and I'll let them
answer your questions on the different settings we tried.

Our intent is to be fair and not misrepresent SparkSQL's performance. On
that front, we used publicly available documentation and user lists, and
spent about a month trying to get the best Spark performance results. If
there are specific optimizations we should have applied and missed, we'd
love to be involved with the community in re-running the numbers.

Is this email thread the best place to continue the conversation?

Best,
Ozgun



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Surprising-Spark-SQL-benchmark-tp9041p9073.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
