Will take my workaround back. Your workaround is cleaner. Thanks again.

Joaquim S <[email protected]> wrote on Thursday, 26/03/2020 at 17:09:
> Looking at the definition of soft deletes, as the key in the record is
> still available, I can work with that for downstream consumption for now.
> For hard deletes, I will follow the thread.
>
> Joaquim S <[email protected]> wrote on Thursday, 26/03/2020 at 17:04:
>
>> Thank you Vinoth. Yup, will consider this option. If I can capture the
>> soft deletes, maybe I will make a change on my side to only use soft
>> deletes for now. (Will see how much impact it creates, still figuring
>> out Hudi...)
>>
>> Vinoth Chandar <[email protected]> wrote on Thursday, 26/03/2020 at 17:00:
>>
>>> As a workaround, I am wondering if you can do this for now.
>>> Let's say you want the records deleted from commit c1 to now.
>>>
>>> - Do a snapshot query at commit c1
>>> - Do a snapshot query at latest commit
>>> - Diff the two
>>>
>>> It will be inefficient, since it has to scan all the data twice, but
>>> may work functionally?
>>>
>>> On Thu, Mar 26, 2020 at 1:47 PM Vinoth Chandar <[email protected]>
>>> wrote:
>>>
>>> > Currently, soft deletes will show up in the incremental stream, while
>>> > hard deletes will not.
>>> >
>>> > We are debating how to add this feature, since it has come up a few
>>> > times recently.
>>> >
>>> > Maybe this can be a good discuss thread for that? :)
>>> >
>>> > On Thu, Mar 26, 2020 at 1:42 PM Joaquim S <[email protected]> wrote:
>>> >
>>> >> Folks,
>>> >>
>>> >> I am looking at DMS integration and have a question.
>>> >>
>>> >> It is clear that the incremental queries only show incrementals ( :) ).
>>> >> I need to extract the deletes too from a specific commit (ideally
>>> >> sparksql or hive).
>>> >>
>>> >> What am I missing, is there another way to query the timeline to get
>>> >> the records that were deleted after a specific commit?
>>> >>
>>> >> Thank you!
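
For anyone following along, the diff workaround quoted above (snapshot at c1, snapshot at latest, diff the two) could be sketched roughly as below. This is only an illustration, not Hudi code: the rows are plain dicts standing in for the results of the two snapshot queries, and deleted_keys is a hypothetical helper name. It assumes each row carries the record key in Hudi's _hoodie_record_key metadata column. On real tables the same comparison would probably be a left anti join on that column in Spark.

```python
from typing import Dict, Iterable, List

# Hudi metadata column that holds the record key. The dict rows used here are
# an assumption standing in for rows returned by each snapshot query.
RECORD_KEY = "_hoodie_record_key"


def deleted_keys(snapshot_at_c1: Iterable[Dict],
                 snapshot_latest: Iterable[Dict],
                 key_field: str = RECORD_KEY) -> List[str]:
    """Return keys present in the c1 snapshot but missing from the latest one."""
    latest = {row[key_field] for row in snapshot_latest}
    seen = set()
    missing = []
    for row in snapshot_at_c1:
        key = row[key_field]
        # Keep first-seen order and skip duplicate keys in the older snapshot.
        if key not in latest and key not in seen:
            seen.add(key)
            missing.append(key)
    return missing


if __name__ == "__main__":
    at_c1 = [
        {RECORD_KEY: "id-1", "value": 10},
        {RECORD_KEY: "id-2", "value": 20},
        {RECORD_KEY: "id-3", "value": 30},
    ]
    latest = [
        {RECORD_KEY: "id-1", "value": 11},  # updated, not deleted
        {RECORD_KEY: "id-3", "value": 30},
    ]
    print(deleted_keys(at_c1, latest))
```

Note that a record inserted after c1 and deleted before the latest commit never appears in either snapshot, so a diff like this would not report it.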
