[ https://issues.apache.org/jira/browse/HUDI-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249238#comment-17249238 ]
Ryan Murray commented on HUDI-1455:
-----------------------------------
The catalog interface in Iceberg [1] and the LogStore interface in Delta [3]
both abstract away the file operations needed to commit a transaction. For
filesystems with atomic rename (e.g. HDFS), the implementation typically just
delegates to the HDFS libraries. For S3, Iceberg delegates the locking to
Hive, and indications are that proprietary Delta delegates to an internal
Databricks API (guessing from the code in the OSS repo and the docs). Nessie
fits into Iceberg [2] and Delta [4] by implementing those interfaces and
performing the (optimistic) locking through Nessie. Because it hooks in at
this layer, it serves as the locking mechanism (which is what allows for many
simultaneous readers and writers) and can also capture the info required to
maintain the git-like history of branches and tags.
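
As a rough illustration of the optimistic locking described above, here is a
minimal Python sketch of a commit expressed as a compare-and-swap on a branch
head. All names in it (RefStore, CommitConflict, commit_with_retry) are
hypothetical and exist only for this example; it is not the Iceberg, Delta or
Nessie API, and the in-memory store merely stands in for whatever service
actually tracks the heads.

```python
import threading
from typing import Callable, Dict, List, Optional


class CommitConflict(Exception):
    """The branch head moved after the writer read it."""


class RefStore:
    """Hypothetical in-memory stand-in for a service that tracks branch heads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._heads: Dict[str, str] = {}
        # Every successful commit is appended here, giving a per-branch history.
        self.history: Dict[str, List[str]] = {}

    def head(self, branch: str) -> Optional[str]:
        with self._lock:
            return self._heads.get(branch)

    def compare_and_swap(self, branch: str, expected: Optional[str], new: str) -> None:
        """Move the branch head to `new` only if it still equals `expected`."""
        with self._lock:
            current = self._heads.get(branch)
            if current != expected:
                raise CommitConflict(
                    f"{branch}: expected {expected!r}, found {current!r}"
                )
            self._heads[branch] = new
            self.history.setdefault(branch, []).append(new)


def commit_with_retry(
    store: RefStore,
    branch: str,
    build_snapshot: Callable[[Optional[str]], str],
    max_attempts: int = 5,
) -> str:
    """Optimistic commit: read the head, build on top of it, swap, retry on conflict."""
    for _ in range(max_attempts):
        base = store.head(branch)
        candidate = build_snapshot(base)  # e.g. write new metadata based on `base`
        try:
            store.compare_and_swap(branch, base, candidate)
            return candidate
        except CommitConflict:
            continue  # another writer won the race; rebuild on the new head
    raise CommitConflict(f"{branch}: gave up after {max_attempts} attempts")
```

A writer would call commit_with_retry(store, "main", build) where build
receives the current head and returns the identifier of the new snapshot.
Nothing is locked while the snapshot is being built, so readers and other
writers are not blocked; a conflict is only detected at the swap, and the
losing writer rebuilds on the new head.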
My (admittedly not extensive) research into Hudi suggests that there is indeed
no layer for those types of operations and that everything is handled by the
IO itself. I am not sure how easy it would be to slide something like Nessie
in at that level, or whether it would require something like implementing a
Hadoop FileSystem interface. What do you think?
[1] http://iceberg.apache.org/custom-catalog/
[2] https://github.com/apache/iceberg/tree/master/nessie/src/main/java/org/apache/iceberg/nessie
[3] https://github.com/delta-io/delta/blob/master/src/main/scala/org/apache/spark/sql/delta/storage/LogStore.scala
[4] https://github.com/projectnessie/nessie/blob/main/clients/deltalake/core/src/main/scala/com/dremio/nessie/deltalake/NessieLogStore.scala
> Hudi integration with project nessie
> ------------------------------------
>
> Key: HUDI-1455
> URL: https://issues.apache.org/jira/browse/HUDI-1455
> Project: Apache Hudi
> Issue Type: New Feature
> Reporter: Vinoth Chandar
> Priority: Major
>
> [https://github.com/apache/hudi/issues/2330#issuecomment-743423398]
> Follow up from this.