I'm running a Spark SQL benchmark similar to the code in https://spark.apache.org/docs/latest/sql-programming-guide.html (section: Inferring the Schema Using Reflection). What's the simplest way to time the SQL statement itself, so that I'm not timing the `.map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))` part of the RDD creation? I'm using a few calls to `System.nanoTime()` for timing.
For reference, the code from the guide:

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD

    case class Person(name: String, age: Int)

    val people = sc.textFile("examples/src/main/resources/people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))
    people.registerTempTable("people")

    val teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
    teenagers.map(t => "Name: " + t(0)).collect().foreach(println)
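One complication worth noting: Spark evaluates RDDs lazily, so `sqlContext.sql(...)` returns almost immediately without running the query; the parsing `map`s and the query itself only execute when an action like `collect()` is called. So to isolate the query, you have to (a) force the input RDD to be materialized and cached first, and (b) time an action on the query result. Below is a minimal sketch of that approach; the `time` helper and its label strings are my own, everything else (`people`, `sqlContext`, the query) is from the question:

```scala
// Generic timing helper built on System.nanoTime, as mentioned in the question.
// Takes the code to measure as a by-name parameter, prints the elapsed time,
// and returns the block's result.
def time[R](label: String)(block: => R): R = {
  val start = System.nanoTime()
  val result = block
  val elapsedMs = (System.nanoTime() - start) / 1e6
  println(f"$label: $elapsedMs%.2f ms")
  result
}

// Usage sketch against the question's code (assumes sc, sqlContext, people
// are already defined as above):
//
//   people.cache()
//   time("materialize input") { people.count() }   // runs textFile + the two maps,
//                                                  // so this cost is paid up front
//   val teenagers = sqlContext.sql(
//     "SELECT name FROM people WHERE age >= 13 AND age <= 19")
//   time("SQL query") { teenagers.collect() }      // collect() forces the query to run
```

The `people.count()` before timing is what keeps the `.split(",")`/`Person(...)` work out of the measurement: after `cache()` plus one action, the parsed rows sit in memory, so the timed `collect()` mostly reflects the query itself.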