Pramod Biligiri created HUDI-5024:
-------------------------------------

             Summary: Support storing database also as a Dataset in Datahub, not just a table
                 Key: HUDI-5024
                 URL: https://issues.apache.org/jira/browse/HUDI-5024
             Project: Apache Hudi
          Issue Type: Task
          Components: meta-sync
            Reporter: Pramod Biligiri


Note: Evaluate feasibility and desirability of this before implementing.

Hudi's DatahubSyncTool pushes only tables into Datahub as Datasets; it does 
not push the database itself. Datahub also appears (at first glance) to store 
only tables as Datasets, not databases. Their demo page shows the same 
behaviour: 
[https://demo.datahubproject.io/browse/dataset/prod/postgres/calm-pagoda-323403/jaffle_shop]
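To make the point concrete, here is a minimal sketch of how a table-level 
Dataset identifier looks in Datahub. It assumes the standard dataset URN 
layout (platform, name, environment) and assumes the dataset name is formed 
as "database.table"; the helper name dataset_urn and the sample values 
sales_db and orders are hypothetical and not taken from DatahubSyncTool. 
Under these assumptions the database shows up only as a prefix of the 
dataset name, not as an entity of its own.

```python
def dataset_urn(platform: str, database: str, table: str, env: str = "PROD") -> str:
    """Build a Datahub-style dataset URN string for a single table.

    Assumption: the dataset name is "<database>.<table>". The database is
    therefore only a naming prefix here, not a separate entity.
    """
    name = f"{database}.{table}"
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(dataset_urn("hudi", "sales_db", "orders"))
```

Two tables in the same database would get two unrelated URNs that merely 
share the "sales_db." prefix, which is the gap this issue describes.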

However, some customers might want the database stored as a top-level entity 
as well. Consider enhancing DatahubSyncTool to support this, possibly by 
using some of Datahub's more advanced features.

Ongoing thread about this in the Datahub Slack: 
https://datahubspace.slack.com/archives/CUMUWQU66/p1665636994736379



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
