Will take my workaround back. Your workaround is cleaner. Thanks again.

Joaquim S <[email protected]> wrote on Thursday, 26/03/2020 at
17:09:

> Looking at the definition of soft deletes, the key in the record is still
> available, so I can work with that for downstream consumption for now.
> For hard deletes, I will follow the thread.
>
> Joaquim S <[email protected]> wrote on Thursday, 26/03/2020 at
> 17:04:
>
>> Thank you, Vinoth. Yup, will consider this option. If I can capture the
>> soft deletes, I may change my side to use only soft deletes for now.
>> (I will see how much impact that has; I am still figuring out Hudi...)
>>
>> Vinoth Chandar <[email protected]> wrote on Thursday, 26/03/2020
>> at 17:00:
>>
>>> As a workaround, I am wondering if you can do this for now.
>>> Let's say you want the records deleted from commit c1 to now.
>>>
>>> - Do a snapshot query at commit c1
>>> - Do a snapshot query at latest commit
>>> - Diff the two
>>>
>>> It will be inefficient, since it has to scan all the data twice, but may
>>> work functionally?
>>>
>>> On Thu, Mar 26, 2020 at 1:47 PM Vinoth Chandar <[email protected]>
>>> wrote:
>>>
>>> > Currently, soft deletes will show up in the incremental stream, while
>>> > hard deletes will not..
>>> >
>>> > We are debating how to add this feature, since it has come up a few
>>> > times recently..
>>> >
>>> > Maybe this can be a good discuss thread for that? :)
>>> >
>>> > On Thu, Mar 26, 2020 at 1:42 PM Joaquim S <[email protected]> wrote:
>>> >
>>> >> Folks,
>>> >>
>>> >> I am looking at DMS integration and have a question.
>>> >>
>>> >> It is clear that the incremental queries only show incrementals ( :) ). I
>>> >> need to extract the deletes too from a specific commit (ideally sparksql
>>> >> or hive).
>>> >>
>>> >> What am I missing? Is there another way to query the timeline to get the
>>> >> records that were deleted after a specific commit?
>>> >>
>>> >> Thank you!
>>> >>
>>> >
>>>
>>
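
The snapshot-diff workaround described in the quoted reply above (snapshot at commit c1, snapshot at the latest commit, diff the two) can be sketched roughly as follows. This is only an illustration of the diff step in plain Python, not Hudi or Spark code: it assumes both snapshots have already been read into lists of rows, and that each row carries its record key in the `_hoodie_record_key` meta column. The function name `deleted_keys` and the sample rows are hypothetical. In Spark the same idea would be a left anti join of the c1 snapshot against the latest snapshot on the record key.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def deleted_keys(
    snapshot_c1: Iterable[Row],
    snapshot_latest: Iterable[Row],
    key_field: str = "_hoodie_record_key",  # assumed key column
) -> List[object]:
    """Return keys present in the c1 snapshot but absent from the latest one."""
    # Collect every key that still exists in the latest snapshot.
    latest_keys = {row[key_field] for row in snapshot_latest}

    seen = set()
    deleted = []
    for row in snapshot_c1:
        key = row[key_field]
        # A key that existed at c1 and is gone now was hard-deleted in between.
        if key not in latest_keys and key not in seen:
            seen.add(key)
            deleted.append(key)
    return deleted


# Hypothetical sample data: k2 was hard-deleted after c1, k3 was added later.
at_c1 = [
    {"_hoodie_record_key": "k1", "value": 10},
    {"_hoodie_record_key": "k2", "value": 20},
]
at_latest = [
    {"_hoodie_record_key": "k1", "value": 11},
    {"_hoodie_record_key": "k3", "value": 30},
]

print(deleted_keys(at_c1, at_latest))
```

As noted in the thread, this scans both snapshots in full, so it trades efficiency for working with what is available today. Soft-deleted records keep their key in the latest snapshot, so they would not appear in this result.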
