I'm building Spark from branch-1.6 source with mvn -DskipTests package, and
I'm running the following code in spark-shell:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
val df = sqlContext.read.json("persons.json")
val df2 = sqlContext.read.json("cars.json")
df.registerTempTable("t")
df2.registerTempTable("u")
val d3 = sqlContext.sql("select * from t join u on t.id = u.id where t.id = 1")

With the log4j root category level set to WARN, the last messages printed
refer to the execution of the Batch Resolution rules:

=== Result of Batch Resolution ===
!'Project [unresolvedalias(*)]              Project [id#0L,id#1L]
!+- 'Filter ('t.id = 1)                     +- Filter (id#0L = cast(1 as bigint))
!   +- 'Join Inner, Some(('t.id = 'u.id))      +- Join Inner, Some((id#0L = id#1L))
!      :- 'UnresolvedRelation `t`, None           :- Subquery t
!      +- 'UnresolvedRelation `u`, None           :  +- Relation[id#0L] JSONRelation
!                                                 +- Subquery u
!                                                    +- Relation[id#1L] JSONRelation

I think that only the analyser rules are being executed. Shouldn't the
optimiser rules also run in this case?

2016-05-11 19:24 GMT+01:00 Michael Armbrust <mich...@databricks.com>:

>> logical plan after optimizer execution:
>>
>> Project [id#0L,id#1L]
>> !+- Filter (id#0L = cast(1 as bigint))
>> !   +- Join Inner, Some((id#0L = id#1L))
>> !      :- Subquery t
>> !      :  +- Relation[id#0L] JSONRelation
>> !      +- Subquery u
>> !         +- Relation[id#1L] JSONRelation
>
> I think you are mistaken. If this was the optimized plan there would be
> no subqueries.
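For what it's worth, a direct way to see each planning stage is to ask the
DataFrame's QueryExecution for it, rather than relying on log output. A
sketch for a Spark 1.6 spark-shell session, assuming sqlContext and the temp
tables t and u are already set up as in the snippet above:

```scala
// Sketch: inspect the plans Catalyst produced for the query above.
// Assumes an existing spark-shell session with `sqlContext` and the
// temp tables `t` and `u` registered as in the original snippet.
val d3 = sqlContext.sql(
  "select * from t join u on t.id = u.id where t.id = 1")

// QueryExecution exposes each planning stage.
println(d3.queryExecution.analyzed)      // plan after the analyzer batches
println(d3.queryExecution.optimizedPlan) // plan after the optimizer batches

// explain(true) prints the parsed, analyzed, optimized, and physical plans.
d3.explain(true)
```

Note that the later stages are computed lazily: the optimizer batches only
run when something forces planning (an action, explain, or accessing
optimizedPlan), which would explain seeing only the Batch Resolution lines
in the log after merely defining d3.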