Hello Rahul,

Please try df.filter(df("id").isin(1, 2)).
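For a dynamic list of ids like the one in your message, a minimal sketch (assuming the Spark 2.0.1 Scala API; the S3 path and app name are placeholders, not real values from your setup):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("isin-filter").getOrCreate()

// Hypothetical path; substitute your actual S3 location.
val df = spark.read.parquet("s3://your-bucket/your-path")

val requiredDataId = Array(1, 2) // may grow to hundreds of ids

// isin takes varargs, so expand the array with `: _*`. This builds an
// In predicate on the id column, which Spark can push down to the
// Parquet reader instead of scanning every row.
val filtered = df.filter(df("id").isin(requiredDataId: _*))

filtered.explain() // PushedFilters in the plan confirms the pushdown
```

Your original attempt, requiredDataId.contains("id"), checks whether the array contains the literal string "id" on the driver; it never references the column, which is why isin on the Column is the right call here.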
Thanks,

On Thu, Mar 30, 2017 at 10:45 PM, Rahul Nandi <rahulnandi...@gmail.com> wrote:
> Hi,
> I have around 2 million records as a parquet file in S3. The file structure
> is somewhat like:
>
> id data
> 1  abc
> 2  cdf
> 3  fas
>
> Now I want to filter and take the records where the id matches one of my
> required ids.
>
> val requiredDataId = Array(1, 2) // Might go up to 100s of records.
>
> df.filter(requiredDataId.contains("id"))
>
> This is my use case.
>
> What will be the best way to do this in Spark 2.0.1 so that the filter is
> also pushed down to parquet?
>
> Thanks and Regards,
> Rahul