[ 
https://issues.apache.org/jira/browse/ATLAS-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carol Drummond updated ATLAS-3006:
----------------------------------
    Labels: new-feature  (was: new-feature release-notes)

> Option to ignore/prune metadata for temporary/staging Hive tables
> -----------------------------------------------------------------
>
>                 Key: ATLAS-3006
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3006
>             Project: Atlas
>          Issue Type: Improvement
>          Components:  atlas-core
>            Reporter: Madhan Neethiraj
>            Assignee: Madhan Neethiraj
>            Priority: Major
>              Labels: new-feature
>             Fix For: 0.8.4, 1.2.0, 2.0.0
>
>         Attachments: ATLAS-3006-branch-0.8.patch, ATLAS-3006.patch
>
>
> It is not uncommon for a Hive deployment to use a large number of 
> staging/temporary tables, which are created periodically to load data into 
> target tables and deleted after completion of data load. A large number of 
> entities are created in Atlas for these staging/temporary tables 
> (tables/columns/column-lineage).
> For staging tables, it is probably not useful to track details like columns 
> and column-lineage in Atlas. Not tracking these details can significantly 
> reduce the time it takes to process notifications and improve overall 
> performance. Only minimal details of these staging tables need to be stored 
> in Atlas, enough to capture data lineage from source to target table via all 
> intermediate staging tables.
> Also, it would be helpful to ignore tables that are created & deleted 
> during data loading, i.e. temporary tables.
> Configurations should be provided to specify which tables are 
> staging/temporary. In addition to supporting this in the Hive hook (to avoid 
> generating large messages for staging/temporary tables), the Atlas server 
> should also be updated, to control this further on the server side while 
> processing notifications.
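
The behaviour requested above could be sketched roughly as follows. This is only an illustration of pattern-based classification, not the implementation in the attached patches: the `TableFilter` class, the `prune_table` helper, the `"columns"` attribute key, and the example patterns are all hypothetical names chosen for this sketch. Tables matching an ignore pattern are dropped entirely, tables matching a prune pattern keep only minimal details, and everything else is tracked in full.

```python
import re
from typing import Iterable

IGNORE, PRUNE, TRACK = "ignore", "prune", "track"


class TableFilter:
    """Classifies a table name against configured regex patterns (hypothetical)."""

    def __init__(self, ignore_patterns: Iterable[str], prune_patterns: Iterable[str]):
        self._ignore = [re.compile(p) for p in ignore_patterns]
        self._prune = [re.compile(p) for p in prune_patterns]

    def action_for(self, table_name: str) -> str:
        # Ignore takes precedence: temporary tables are not recorded at all.
        if any(p.fullmatch(table_name) for p in self._ignore):
            return IGNORE
        # Staging tables are recorded, but without column-level details.
        if any(p.fullmatch(table_name) for p in self._prune):
            return PRUNE
        return TRACK


def prune_table(entity: dict) -> dict:
    """Returns a copy of the entity without its (assumed) "columns" attribute."""
    return {key: value for key, value in entity.items() if key != "columns"}


# Example patterns; real deployments would supply their own via configuration.
table_filter = TableFilter(
    ignore_patterns=[r".*\.tmp_.*"],
    prune_patterns=[r"staging\..*"],
)
```

The same classification could in principle run both in the hook (so large messages are never generated) and on the server (as a second check while processing notifications), which is the two-level control the description asks for.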



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
