[ 
https://issues.apache.org/jira/browse/GOBBLIN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shirshanka Das resolved GOBBLIN-957.
------------------------------------
    Fix Version/s: 0.15.0
       Resolution: Fixed

Issue resolved by pull request #2806
[https://github.com/apache/incubator-gobblin/pull/2806]

> Support automatic recursion removal from schemas
> ------------------------------------------------
>
>                 Key: GOBBLIN-957
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-957
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Shirshanka Das
>            Assignee: Shirshanka Das
>            Priority: Major
>              Labels: avro
>             Fix For: 0.15.0
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Analytics engines such as Hive cannot handle recursive schemas, i.e. schemas 
> where an inner field can refer to the wrapping type. 
> This Jira proposes support for automatic recursion removal from schemas 
> during data ingestion. 
> The simple proposal is to drop the fields in the schema that introduce 
> the recursion. 
> e.g. (pseudo-schema):
> User { 
>    string name;
>    User friend;
> }
> gets converted to :
> User { 
>    string name;
> }
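
The simple proposal above can be sketched as follows. This is a minimal illustration and not the code from the pull request: it assumes the schema is held as a plain Avro-style dict, and the helper names refers_to and drop_recursive_fields are hypothetical. It only follows direct references, unions, arrays and maps.

```python
def refers_to(type_, names):
    """True if an Avro-style type refers by name to one of the given records."""
    if isinstance(type_, str):
        return type_ in names
    if isinstance(type_, list):  # union, e.g. ["null", "User"]
        return any(refers_to(t, names) for t in type_)
    if isinstance(type_, dict):
        if type_.get("type") == "array":
            return refers_to(type_["items"], names)
        if type_.get("type") == "map":
            return refers_to(type_["values"], names)
    return False


def drop_recursive_fields(schema, enclosing=()):
    """Return a copy of a record schema without fields that point back
    to the record itself or to any record that encloses it."""
    if not (isinstance(schema, dict) and schema.get("type") == "record"):
        return schema
    names = enclosing + (schema["name"],)
    kept = []
    for field in schema["fields"]:
        if refers_to(field["type"], names):
            continue  # this field introduces the recursion, so drop it
        kept.append({**field, "type": drop_recursive_fields(field["type"], names)})
    return {**schema, "fields": kept}


# The pseudo-schema from the description, written as an Avro-style dict.
USER = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "friend", "type": "User"},
    ],
}

flat = drop_recursive_fields(USER)
```

Here flat keeps only the name field, and USER itself is left unmodified.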
>  
> A more sophisticated solution would be to do one or two levels of 
> "schema-unrolling" before dropping data. 
> e.g. the output schema with one-level unrolling would look like: 
> User { 
>    string name;
>    User1 friend;
> }
> User1 { 
>    string name;
> }
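
The unrolling variant can be sketched in the same spirit. Again this is only an illustration under assumptions: the schema is a plain Avro-style dict, the function name unroll and the User1, User2 naming scheme are hypothetical (the naming follows the example above), and only a field whose type is the bare name of the record counts as a self-reference.

```python
def unroll(schema, levels, depth=0):
    """Copy a self-referencing record `levels` times, then drop the
    self-referencing field in the innermost copy."""
    root = schema["name"]
    # Nested copies get a numeric suffix: User1, User2, ...
    name = root if depth == 0 else f"{root}{depth}"
    fields = []
    for field in schema["fields"]:
        if field["type"] == root:  # direct self-reference
            if depth < levels:
                nested = unroll(schema, levels, depth + 1)
                fields.append({**field, "type": nested})
            # otherwise the field is dropped at the last level
        else:
            fields.append(dict(field))
    return {"type": "record", "name": name, "fields": fields}


USER = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "friend", "type": "User"},
    ],
}

one_level = unroll(USER, levels=1)
no_unrolling = unroll(USER, levels=0)
```

With levels=1 the friend field has the type User1, which carries only name. With levels=0 the result is the same as simply dropping the recursive field.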

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
