Hi All,

Let's say we have:

    val df = bigTableA.join(bigTableB, bigTableA("A") === bigTableB("A"), "left")
    val rddFromDF = df.rdd
    println(rddFromDF.count)

My understanding is that Spark will convert all the DataFrame operations before "rddFromDF.count" into their RDD-equivalent operations, since we are not performing any action on the DataFrame directly. In that case, Spark would not be using the optimization engine at all. Is my assumption right? Please point me to the right resources.

[Note: I have posted the same question on SO: http://stackoverflow.com/questions/38889812/how-spark-dataframe-optimization-engine-works-with-dag]

Thanks
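One way to look into this yourself is to print the query plans before touching `.rdd`. The sketch below assumes a running SparkSession and that `bigTableA`/`bigTableB` are any two existing DataFrames with a column "A"; `explain(true)` prints the parsed, analyzed, optimized logical, and physical plans, which show exactly what Catalyst has done to the join up to that point, and `toDebugString` shows the lineage of the RDD that is derived from it:

```scala
// Sketch, assuming a Spark 2.x SparkSession named `spark` and two
// hypothetical DataFrames bigTableA and bigTableB with a column "A".
val df = bigTableA.join(bigTableB, bigTableA("A") === bigTableB("A"), "left")

// Prints all four plan stages (parsed / analyzed / optimized / physical);
// the optimized and physical plans reflect Catalyst's work on the join.
df.explain(true)

// The RDD is derived from the DataFrame's physical plan; its lineage
// can be inspected with toDebugString. Any transformations applied to
// rddFromDF *after* this point are plain RDD operations.
val rddFromDF = df.rdd
println(rddFromDF.toDebugString)
```

Comparing the optimized plan from `explain(true)` with the RDD lineage should make it clearer where the optimizer's involvement ends.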