[ 
https://issues.apache.org/jira/browse/ATLAS-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Solbi Choi updated ATLAS-4460:
------------------------------
    Description: 
h2. Problem

When one of the partitionKeys of a Hive table is deleted, the Atlas search API 
still returns all partitionKeys, including the deleted one.

Adding
{code:java}
"excludeDeletedEntities": True{code}
to the request JSON has no effect in this situation.

 
h2. Reproduce
 * Create a Hive table with a partition key and sync it using hive-import.
 * Delete the Hive table and re-create it with the same name, but without the 
partition key this time. Re-sync using hive-import.
 * The partitionKey now appears as deleted in the Atlas web view.

!스크린샷 2021-10-22 오후 5.07.47.png!
 * However, when calling the search API to get the hive_table entity with this request:

{code:java}
   request = { "typeName": "hive_table", 
               "attributes": [ "db", "name", "partitionKeys" ],
               "excludeDeletedEntities": True,
               "limit": limit,
               "offset": offset
              }
{code}
 * The response still includes the deleted partitionKey:

{code:java}
'partitionKeys': [
{'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': 
{'qualifiedName': 'foo.test_partition_drop.ds@primary'}}, 
{'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': 
{'qualifiedName': 'foo.test_partition_drop.ts@primary'}}
]
{code}
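For reference, the request above could be built and serialized as in the following sketch. The helper name {{build_search_request}} and the sample limit/offset values are assumptions for illustration and are not part of the original report. The sketch only shows that Python's {{True}} is serialized as JSON {{true}}, so the flag reaches the server as a proper boolean.

```python
import json


def build_search_request(limit, offset):
    """Build the search request body from the report (hypothetical helper)."""
    return {
        "typeName": "hive_table",
        "attributes": ["db", "name", "partitionKeys"],
        "excludeDeletedEntities": True,
        "limit": limit,
        "offset": offset,
    }


# Sample paging values, chosen only for illustration.
payload = json.dumps(build_search_request(limit=25, offset=0))
print(payload)
```

If the printed payload contains {{"excludeDeletedEntities": true}}, the flag is being sent correctly and the behavior described here is not caused by the client-side serialization.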
 

 

Additionally, the same behavior is reproduced with *Hive columns*.

After renaming a column with an ALTER TABLE statement (e.g. foo -> bar), 
the search API returns two columns (foo and bar) for the Hive table, even 
with the "excludeDeletedEntities" {color:#172b4d}option set.{color}

 

{color:#172b4d}!https://media.oss.navercorp.com/user/16858/files/94f0c600-336a-11ec-8388-329a5c8a6323!{color}

 

 
{code:java}
'columns': [
{'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': 
{'qualifiedName': 'db_name.test_partition_drop.bar@primary'}}, 
{'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': 
{'qualifiedName': 'db_name.test_partition_drop.foo@primary'}}
]
{code}
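Until this is resolved, one possible client-side workaround is to drop referenced entities that are marked as deleted. The following is only a sketch under stated assumptions: it assumes the caller has already obtained a mapping from guid to entity status (for example by looking up each referenced entity by its guid), and {{drop_deleted}}, {{status_by_guid}}, and the guid values are hypothetical names and placeholders that do not appear in the report.

```python
def drop_deleted(refs, status_by_guid):
    """Return only the references whose entity is not marked DELETED.

    Guids missing from status_by_guid are kept, so an incomplete lookup
    never hides a live column.
    """
    return [ref for ref in refs if status_by_guid.get(ref["guid"]) != "DELETED"]


# Placeholder guids; the real values are masked in the report.
columns = [
    {"guid": "guid-bar", "typeName": "hive_column",
     "uniqueAttributes": {"qualifiedName": "db_name.test_partition_drop.bar@primary"}},
    {"guid": "guid-foo", "typeName": "hive_column",
     "uniqueAttributes": {"qualifiedName": "db_name.test_partition_drop.foo@primary"}},
]

# Assumed to be collected separately by the caller.
status_by_guid = {"guid-bar": "ACTIVE", "guid-foo": "DELETED"}

active_columns = drop_deleted(columns, status_by_guid)
```

This only hides the symptom on the client; the search API would still return the deleted references.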
 

  was:
Problems

When one of the partitionKeys in hive table deleted, atlas search API still 
gets all partitionKeys including deleted one.

Adding
{code:java}
"excludeDeletedEntities": True{code}
in request json doesn't work in this situation.

 

Reproduce
 * Create hive table with partition key and sync it using hive-import.
 * Delete the hive table and re-create hive table with same name but without 
partition key this time. Re-sync using hive-import.
 * Then you can see the partitionKey deleted in Atlas web view.

!스크린샷 2021-10-22 오후 5.07.47.png!
 * But when trying search API to get the hive table entity using

{code:java}
   request = { "typeName": "hive_table", 
               "attributes": [ "db", "name", "partitionKeys" ],
               "excludeDeletedEntities": True,
               "limit": limit,
               "offset": offset
              }
{code}
 * You get the deleted partitionKey also.

{code:java}
'partitionKeys': [
{'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': 
{'qualifiedName': 'foo.test_partition_drop.ds@primary'}}, 
{'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': 
{'qualifiedName': 'foo.test_partition_drop.ts@primary'}}
]
{code}
 

 

Additionally, this is reproduced within *hive columns, too.*

After changing column name by alter table statement, (eg. foo -> bar)

the Search API gives 2 columns(foo and bar) as result of the hive table even 
with "excludeDeletedEntities" {color:#172b4d}option.{color}

 

{color:#172b4d}!https://media.oss.navercorp.com/user/16858/files/94f0c600-336a-11ec-8388-329a5c8a6323!{color}

 

 
{code:java}
'columns': [
{'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': 
{'qualifiedName': 'db_name.test_partition_drop.bar@primary'}}, 
{'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': 
{'qualifiedName': 'db_name.test_partition_drop.foo@primary'}}
]
{code}
 


> Search API gets deleted partitionKeys(and columns) of Hive table
> ----------------------------------------------------------------
>
>                 Key: ATLAS-4460
>                 URL: https://issues.apache.org/jira/browse/ATLAS-4460
>             Project: Atlas
>          Issue Type: Bug
>          Components: hive-integration
>    Affects Versions: 2.2.0
>            Reporter: Solbi Choi
>            Priority: Major
>         Attachments: 스크린샷 2021-10-22 오후 5.07.47.png
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
