[ 
https://issues.apache.org/jira/browse/GOBBLIN-1484?focusedWorklogId=622778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622778
 ]

ASF GitHub Bot logged work on GOBBLIN-1484:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Jul/21 22:55
            Start Date: 14/Jul/21 22:55
    Worklog Time Spent: 10m 
      Work Description: ZihanLi58 commented on pull request #3329:
URL: https://github.com/apache/gobblin/pull/3329#issuecomment-880262243


   > No additional comments beyond what @sv2000 mentioned above. But a broader 
question about this kind of feature is: what is the right way to specify the 
"lineage" of a schema between different tables? Is setting source.db in 
GMCE the right approach (which means you need to set it in a specific 
application's GMCE if you expect that the application itself doesn't carry the 
schema at runtime, for example compaction), or is something broader 
missing from the overall picture?
   Yeah, I do think something broader is missing. Ideally, we should have a 
source-of-truth relationship graph between tables, so that when we see a 
schema update, we can modify all tables using that schema. Leveraging the 
config store is doable, but it will introduce more complexity in managing the 
relationships. A better way would be to use DataHub for this case, but 
that will need more design. For now, I would like to support source db in 
the GMCE itself to make it feasible for OSS users as well.
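
A minimal sketch of the relationship-graph idea mentioned above, for illustration only. The class `SchemaLineageGraph`, its methods, and the table names in the usage note are hypothetical and are not part of Gobblin or of this pull request; it only shows how a schema change on one table could be mapped to every table that derives its schema from it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not an existing Gobblin class.
class SchemaLineageGraph {
    // Maps a source table to the tables that take their schema directly from it.
    private final Map<String, List<String>> downstream = new HashMap<>();

    // Record that `derived` takes its schema from `source`.
    void addEdge(String source, String derived) {
        downstream.computeIfAbsent(source, k -> new ArrayList<>()).add(derived);
    }

    // Breadth-first walk returning every table, direct or transitive,
    // whose schema depends on `source`. The visited check prevents
    // infinite loops if the graph contains a cycle.
    Set<String> tablesAffectedBy(String source) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : downstream.getOrDefault(current, Collections.emptyList())) {
                if (affected.add(next)) {
                    queue.add(next);
                }
            }
        }
        return affected;
    }
}
```

With edges `ingest_db.events -> compacted_db.events` and `compacted_db.events -> orc_db.events` (made-up names), calling `tablesAffectedBy("ingest_db.events")` returns both downstream tables, which is the set a schema update would need to touch.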
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 622778)
    Time Spent: 2h 50m  (was: 2h 40m)

> Make Gobblin metadata writer be able to support schema source DB
> ----------------------------------------------------------------
>
>                 Key: GOBBLIN-1484
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1484
>             Project: Apache Gobblin
>          Issue Type: New Feature
>            Reporter: Zihan Li
>            Priority: Major
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Sometimes we need the Avro schema in place even for ORC tables. We have that 
> information in the ingestion job but not in the compaction job, and it's hard 
> to get the Avro schema from the ORC file itself. So we want to support a 
> schema source db, so that we can fetch the schema from the source db that the 
> ingestion job registers to.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
