[
https://issues.apache.org/jira/browse/HUDI-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975585#comment-16975585
]
sivabalan narayanan commented on HUDI-15:
-----------------------------------------
Cool, thanks.
I synced up with Balaji on the fix. Here is the plan: every
HoodieCommitMetadata is going to have the schema added to it. When delete()
is called on HoodieWriteClient, the user may not have set any schema in the
config. In that case, we fetch the last instant, read its HoodieCommitMetadata
to get the schema, and set that schema in the config. The rest of the flow
should be seamless, similar to updates.
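
To make the plan concrete, here is a rough sketch of the schema lookup in plain Java. It does not use the real Hudi classes: CommitMetadataSketch, TimelineSketch, WriteConfigSketch and DeleteSchemaResolver are hypothetical stand-ins made up for illustration, and the schema is just a string. The actual change may look quite different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for the metadata stored with each commit.
// Per the plan, it carries the schema the commit was written with.
final class CommitMetadataSketch {
    private final String instantTime;
    private final String schema;

    CommitMetadataSketch(String instantTime, String schema) {
        this.instantTime = instantTime;
        this.schema = schema;
    }

    String getInstantTime() { return instantTime; }
    String getSchema() { return schema; }
}

// Hypothetical in-memory timeline of completed commits, oldest first.
final class TimelineSketch {
    private final List<CommitMetadataSketch> commits = new ArrayList<>();

    void addCommit(CommitMetadataSketch commit) { commits.add(commit); }

    // Returns the most recent commit, or empty if nothing was committed yet.
    Optional<CommitMetadataSketch> lastInstant() {
        return commits.isEmpty()
            ? Optional.empty()
            : Optional.of(commits.get(commits.size() - 1));
    }
}

// Hypothetical write config; schema stays null when the user did not set one.
final class WriteConfigSketch {
    private String schema;

    String getSchema() { return schema; }
    void setSchema(String schema) { this.schema = schema; }
}

final class DeleteSchemaResolver {
    // Keeps a user-supplied schema if present; otherwise copies the schema
    // from the last commit's metadata into the config, as the plan describes.
    static String resolveSchemaForDelete(WriteConfigSketch config, TimelineSketch timeline) {
        String configured = config.getSchema();
        if (configured != null && !configured.isEmpty()) {
            return configured;
        }
        CommitMetadataSketch last = timeline.lastInstant()
            .orElseThrow(() -> new IllegalStateException(
                "No completed commit to read a schema from"));
        config.setSchema(last.getSchema());
        return config.getSchema();
    }
}
```

With two commits on the timeline and no schema in the config, resolveSchemaForDelete picks up the schema of the second commit and stores it in the config. What should happen on a table with no commits at all is not covered by the plan above; the sketch simply throws.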
> Add a delete() API to HoodieWriteClient as well as Spark datasource #531
> ------------------------------------------------------------------------
>
> Key: HUDI-15
> URL: https://issues.apache.org/jira/browse/HUDI-15
> Project: Apache Hudi (incubating)
> Issue Type: New Feature
> Components: Spark datasource, Write Client
> Reporter: Vinoth Chandar
> Assignee: sivabalan narayanan
> Priority: Major
> Fix For: 0.5.1
>
>
> The delete API needs to be supported as a first-class citizen via
> DeltaStreamer, WriteClient, and datasources. Currently there are two ways to
> delete: soft deletes and hard deletes
> (https://hudi.apache.org/writing_data.html#deletes).
> For hard deletes, we need to ensure that we can leverage
> EmptyHoodieRecordPayload with just the HoodieKey and an empty record value
> for deleting.
> [https://github.com/uber/hudi/issues/531]