Hi Yogesh,

You can try registering these two DataFrames as temporary tables and then running a SQL query against them:

df1.registerTempTable("df1")
df2.registerTempTable("df2")
val rs = sqlContext.sql("SELECT a.* FROM df1 a WHERE a.id NOT IN (SELECT b.id FROM df2 b)")
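Since your DataFrames are SparkDataFrames, here is a rough SparkR equivalent of the same temp-table-plus-SQL approach. It is only a sketch: it assumes the Spark 2.0 SparkR API, and the small data frames are made-up stand-ins for your real data.

library(SparkR)
sparkR.session()

# Hypothetical example data matching the schemas in your mail.
df1 <- createDataFrame(data.frame(id = c(1, 2, 3), c1 = "x", c2 = "y"))
df2 <- createDataFrame(data.frame(id = c(2, 3), c3 = "p", c4 = "q"))

registerTempTable(df1, "df1")
registerTempTable(df2, "df2")

# Keep only the df1 rows whose id has no match in df2.
rs <- sql("SELECT a.* FROM df1 a WHERE a.id NOT IN (SELECT b.id FROM df2 b)")
showDF(rs)

The NOT IN subquery keeps only the df1 rows whose id never appears in df2, which is what your filter(...) expression seems to be aiming for.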
val rs = sqlContext.sql("SELECT a.* FROM df1 a, df2 b where a.id != b.id) Thanks Deepak On Mon, Oct 3, 2016 at 12:38 PM, Yogesh Vyas <informy...@gmail.com> wrote: > Hi, > > I have two SparkDataFrames, df1 and df2. > There schemas are as follows: > df1=>SparkDataFrame[id:double, c1:string, c2:string] > df2=>SparkDataFrame[id:double, c3:string, c4:string] > > I want to filter out rows from df1 where df1$id does not match df2$id > > I tried some expression: filter(df1,!(df1$id %in% df2$id)), but it does > not works. > > Anybody could please provide me a solution for it? > > Regards, > Yogesh > -- Thanks Deepak www.bigdatabig.com www.keosha.net