Re: Usage of DropDuplicate in Spark

2021-06-22 Thread Chetan Khatri
Is there a built-in API for this, if one exists at all?

On Tue, Jun 22, 2021 at 1:16 PM Chetan Khatri wrote:
> This has been very slow.
>
> On Tue, Jun 22, 2021 at 1:15 PM Sachit Murarka wrote:
>> Hi Chetan,
>>
>> You can subtract the data frame or use the except operation.
>> First DF contains

Re: Usage of DropDuplicate in Spark

2021-06-22 Thread Chetan Khatri
This has been very slow.

On Tue, Jun 22, 2021 at 1:15 PM Sachit Murarka wrote:
> Hi Chetan,
>
> You can subtract the data frame or use the except operation.
> First DF contains full rows.
> Second DF contains unique rows (after removing duplicates).
> Subtract the second DF from the first.
>
> Hope this helps.

Re: Usage of DropDuplicate in Spark

2021-06-22 Thread Sachit Murarka
Hi Chetan,

You can subtract the data frame or use the except operation.
First DF contains full rows.
Second DF contains unique rows (after removing duplicates).
Subtract the second DF from the first.

Hope this helps.

Thanks
Sachit

On Tue, Jun 22, 2021, 22:23 Chetan Khatri wrote:
> Hi Spark Users,
>
> I want

Usage of DropDuplicate in Spark

2021-06-22 Thread Chetan Khatri
Hi Spark Users,

I want to use DropDuplicate, but I would also like to log the records it discards to an instrumentation table. What would be the best approach to do that?

Thanks