scala> user
res19: org.apache.spark.sql.SchemaRDD =
SchemaRDD[0] at RDD at SchemaRDD.scala:98
== Query Plan ==
ParquetTableScan [id#0,name#1], (ParquetRelation /user/hive/warehouse/user), None
scala> order
res20: org.apache.spark.sql.SchemaRDD =
SchemaRDD[72] at RDD at SchemaRDD.scala:98
== Query Plan ==
ParquetTableScan [id#8,userid#9,unit#10], (ParquetRelation /user/hive/warehouse/orders), None

Joining the SchemaRDDs user and order raises an ambiguity error, because both tables have an 'id column:

user.join(order, on = Some('id === 'userid))

How can I write a join expression that qualifies a column with its SchemaRDD name, something like 'user.'id? That syntax does not work in Spark 1.0.0 (CDH 5.1.0).
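One workaround (a minimal sketch, assuming the Catalyst DSL implicits from import sqlContext._ are in scope, as in the snippets above) is to alias the ambiguous column to a unique name before joining, so the join condition never needs a qualifier. The alias 'user_id below is a hypothetical name chosen for illustration:

// Rename user's ambiguous 'id column before joining, so the join
// condition only refers to names that are unique across both sides.
// 'user_id is a hypothetical alias, not a column from the original schema.
val userRenamed = user.select('id as 'user_id, 'name)
val joined = userRenamed.join(order, on = Some('user_id === 'userid))
joined.collect().foreach(println)

Aliasing the relations themselves (e.g. user.as('u)) and referring to "u.id".attr may also work if your build exposes those DSL helpers, but renaming the column up front avoids relying on qualifier resolution entirely.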