Pramod Biligiri created HUDI-5024:
-------------------------------------
Summary: Support storing database also as a Dataset in Datahub,
not just a table
Key: HUDI-5024
URL: https://issues.apache.org/jira/browse/HUDI-5024
Project: Apache Hudi
Issue Type: Task
Components: meta-sync
Reporter: Pramod Biligiri
Note: Evaluate feasibility and desirability of this before implementing.
Hudi's DatahubSyncTool currently pushes only tables into Datahub as Datasets;
it does not push the database itself as a Dataset. Datahub also appears (on the
face of it) to model only tables as Datasets, not databases. Their demo page
shows this:
[https://demo.datahubproject.io/browse/dataset/prod/postgres/calm-pagoda-323403/jaffle_shop]
Some customers might nevertheless want the database stored as a top-level
entity as well. Consider enhancing DatahubSyncTool to support this, possibly by
using some of Datahub's more advanced features.
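
To make the distinction concrete, here is a minimal sketch of how the two kinds
of entity might be identified. It uses Datahub's dataset URN layout,
urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>). The platform name
"hudi", the "<database>.<table>" naming scheme, and the helper functions are
assumptions made for illustration, not the sync tool's actual code. Emitting a
database-level URN at all is the hypothetical behaviour this ticket asks to
evaluate.

```python
def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a Datahub dataset URN string from its three parts."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def table_urn(platform: str, database: str, table: str, env: str = "PROD") -> str:
    """Table-level Dataset, named '<database>.<table>' (assumed naming scheme)."""
    return dataset_urn(platform, f"{database}.{table}", env)


def database_urn(platform: str, database: str, env: str = "PROD") -> str:
    """Hypothetical: the database itself registered as a Dataset."""
    return dataset_urn(platform, database, env)


if __name__ == "__main__":
    # "sales" and "orders" are made-up example names.
    print(table_urn("hudi", "sales", "orders"))
    print(database_urn("hudi", "sales"))
```

Under this scheme the database URN differs from the table URN only in its name
component, so nothing in the URN itself marks it as a database. That is one
reason to weigh Datahub's separate Container entity, which groups Datasets,
before settling on a design.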
Ongoing Slack thread about this in Datahub Slack:
https://datahubspace.slack.com/archives/CUMUWQU66/p1665636994736379
--
This message was sent by Atlassian Jira
(v8.20.10#820010)