Re: Usage of DropDuplicate in Spark

2021-06-22 Thread Chetan Khatri
Is there a built-in API for this, if one exists at all?

On Tue, Jun 22, 2021 at 1:16 PM Chetan Khatri 
wrote:

> This approach has been very slow.
>
> On Tue, Jun 22, 2021 at 1:15 PM Sachit Murarka 
> wrote:
>
>> Hi Chetan,
>>
>> You can subtract one DataFrame from the other, or use the except operation.
>> The first DF contains all rows.
>> The second DF contains the unique rows (after removing duplicates).
>> Subtract the second DF from the first.
>>
>> Hope this helps.
>>
>> Thanks
>> Sachit
>>
>> On Tue, Jun 22, 2021, 22:23 Chetan Khatri 
>> wrote:
>>
>>> Hi Spark Users,
>>>
>>> I want to use dropDuplicates, but I would also like to log the
>>> records it discards to an instrumentation table.
>>>
>>> What would be the best approach to do that?
>>>
>>> Thanks
>>>
>>


Re: Usage of DropDuplicate in Spark

2021-06-22 Thread Chetan Khatri
This approach has been very slow.

On Tue, Jun 22, 2021 at 1:15 PM Sachit Murarka 
wrote:

> Hi Chetan,
>
> You can subtract one DataFrame from the other, or use the except operation.
> The first DF contains all rows.
> The second DF contains the unique rows (after removing duplicates).
> Subtract the second DF from the first.
>
> Hope this helps.
>
> Thanks
> Sachit
>
> On Tue, Jun 22, 2021, 22:23 Chetan Khatri 
> wrote:
>
>> Hi Spark Users,
>>
>> I want to use dropDuplicates, but I would also like to log the
>> records it discards to an instrumentation table.
>>
>> What would be the best approach to do that?
>>
>> Thanks
>>
>


Re: Usage of DropDuplicate in Spark

2021-06-22 Thread Sachit Murarka
Hi Chetan,

You can subtract one DataFrame from the other, or use the except operation.
The first DF contains all rows.
The second DF contains the unique rows (after removing duplicates).
Subtract the second DF from the first.

Hope this helps.

Thanks
Sachit

On Tue, Jun 22, 2021, 22:23 Chetan Khatri 
wrote:

> Hi Spark Users,
>
> I want to use dropDuplicates, but I would also like to log the
> records it discards to an instrumentation table.
>
> What would be the best approach to do that?
>
> Thanks
>


Usage of DropDuplicate in Spark

2021-06-22 Thread Chetan Khatri
Hi Spark Users,

I want to use dropDuplicates, but I would also like to log the
records it discards to an instrumentation table.

What would be the best approach to do that?

Thanks