Looking at the definition of soft deletes, the record key is still available, so I can work with that for downstream consumption for now. For hard deletes, I will follow the thread.

Joaquim S <[email protected]> wrote on Thursday, 26/03/2020 at 17:04:

> Thank you Vinoth. Yup, will consider this option. If I can capture the
> soft deletes, maybe make a change on my side to only use soft deletes for
> now. (will see how much impact it creates, still figuring out hudi...)
>
> Vinoth Chandar <[email protected]> wrote on Thursday, 26/03/2020
> at 17:00:
>
>> As a workaround, I am wondering if you can do this for now.
>> Let's say you want the records deleted from commit c1 to now.
>>
>> - Do a snapshot query at commit c1
>> - Do a snapshot query at latest commit
>> - Diff the two
>>
>> It will be inefficient, since it has to scan all the data twice, but may
>> work functionally?
>>
>> On Thu, Mar 26, 2020 at 1:47 PM Vinoth Chandar <[email protected]> wrote:
>>
>> > Currently, soft deletes will show up in the incremental stream, while
>> > hard deletes will not.
>> >
>> > We are debating how to add this feature, since it has come up a few
>> > times recently.
>> >
>> > Maybe this can be a good discuss thread for that? :)
>> >
>> > On Thu, Mar 26, 2020 at 1:42 PM Joaquim S <[email protected]> wrote:
>> >
>> >> Folks,
>> >>
>> >> I am looking at DMS integration and have a question.
>> >>
>> >> It is clear that the incremental queries only show incrementals ( :) ).
>> >> I need to extract the deletes too from a specific commit (ideally
>> >> sparksql or hive).
>> >>
>> >> What am I missing, is there another way to query the timeline to get
>> >> the records that were deleted after a specific commit?
>> >>
>> >> Thank you!
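
For reference, the snapshot-diff workaround quoted above could look roughly like the sketch below. It is only an illustration of the "diff the two" step, not Hudi code: the two snapshots are stand-in in-memory lists, the helper name `deleted_keys` is made up, and the key column is assumed to be Hudi's `_hoodie_record_key` metadata field. How the snapshot at c1 and the latest snapshot are actually read is left out. In Spark the same step would presumably be a left anti join of the c1 snapshot against the latest one on the record key.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def deleted_keys(
    snapshot_at_c1: Iterable[Record],
    snapshot_latest: Iterable[Record],
    key_field: str = "_hoodie_record_key",  # assumption: column holding the record key
) -> List[object]:
    """Return keys present in the c1 snapshot but missing from the latest one."""
    # Collect every key that still exists in the latest snapshot.
    latest_keys = {rec[key_field] for rec in snapshot_latest}

    deleted: List[object] = []
    seen = set()
    for rec in snapshot_at_c1:
        key = rec[key_field]
        # A key that existed at c1 and is gone now was deleted in between.
        if key not in latest_keys and key not in seen:
            seen.add(key)
            deleted.append(key)
    return deleted


# Hypothetical data: "b" is removed between c1 and the latest commit,
# "d" is inserted after c1 and must not be reported.
snapshot_c1 = [
    {"_hoodie_record_key": "a", "value": 1},
    {"_hoodie_record_key": "b", "value": 2},
    {"_hoodie_record_key": "c", "value": 3},
]
snapshot_now = [
    {"_hoodie_record_key": "a", "value": 1},
    {"_hoodie_record_key": "c", "value": 30},
    {"_hoodie_record_key": "d", "value": 4},
]

print(deleted_keys(snapshot_c1, snapshot_now))
```

Since a soft-deleted record keeps its key, it still appears in the latest snapshot, so a diff like this would only report hard deletes. As noted in the thread, both snapshots are scanned in full, so this is a functional stopgap rather than an efficient solution.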
