Hi, I have two DataFrames that share a common column, Product_Id, on which I have to perform a join operation.

    // readCSVToDataFrame takes (sqlCtx: SQLContext, path: String, schema: StructType)
    val transactionDF = readCSVToDataFrame(sqlCtx, pathToReadTransactions, transactionSchema)
    val productDF = readCSVToDataFrame(sqlCtx, pathToReadProduct, productSchema)

Since the transaction data is very large but the product data is small, I would ideally do a broadcast join in which I broadcast productDF:

    import org.apache.spark.sql.functions.broadcast

    val productBroadcastDF = broadcast(productDF)
    val broadcastJoin = transactionDF.join(productBroadcastDF, "productId")

Or simply:

    val innerJoin = transactionDF.join(productDF, "productId")

which should give the same result as above.

But if I use the simple inner join, I get a DataFrame with the joined values, whereas if I use the broadcast join, I get an empty DataFrame with no values. I am not able to explain this behavior; ideally both should give the same result. What could have gone wrong? Has anyone faced a similar issue?

Thanks,
Prateek
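
For reference, here is a minimal sketch of how the two joins could be compared on small data. It assumes Spark 2.x or later with a local SparkSession (the code above uses SQLContext). The object name BroadcastJoinCheck, the helper joinCounts, and the in-memory rows are hypothetical stand-ins for the CSV-backed DataFrames, not the real data.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object BroadcastJoinCheck {

  // Returns (row count of the plain inner join, row count of the broadcast join).
  def joinCounts(spark: SparkSession): (Long, Long) = {
    import spark.implicits._

    // Hypothetical stand-ins for the transaction and product CSV files.
    val transactionDF = Seq((1, "p1", 10.0), (2, "p2", 5.5), (3, "p1", 7.25))
      .toDF("transactionId", "productId", "amount")
    val productDF = Seq(("p1", "Widget"), ("p2", "Gadget"))
      .toDF("productId", "productName")

    val innerJoin = transactionDF.join(productDF, "productId")
    val broadcastJoin = transactionDF.join(broadcast(productDF), "productId")

    // Print the physical plans to see which join strategy Spark chose for each.
    innerJoin.explain()
    broadcastJoin.explain()

    (innerJoin.count(), broadcastJoin.count())
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is enough for this check.
    val spark = SparkSession.builder()
      .appName("broadcast-join-check")
      .master("local[*]")
      .getOrCreate()

    val (innerRows, broadcastRows) = joinCounts(spark)
    println(s"inner join rows:     $innerRows")
    println(s"broadcast join rows: $broadcastRows")

    spark.stop()
  }
}
```

The broadcast hint is only meant to influence the physical join strategy, so the two counts are expected to match. If they differ on the real data, comparing the two explain() outputs and the schemas of both inputs would be a reasonable next step.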