[
https://issues.apache.org/jira/browse/ATLAS-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867275#comment-16867275
]
ASF subversion and git services commented on ATLAS-3290:
--------------------------------------------------------
Commit 1c399cc7121fc2f68f50675f804e6cc21b7a5bf5 in atlas's branch
refs/heads/master from lina.li
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=1c399cc ]
ATLAS-3290: Impala Hook should get database name and table name from vertex
metadata
Signed-off-by: Sarath Subramanian <[email protected]>
> Impala Hook should get database name and table name from vertex metadata
> ------------------------------------------------------------------------
>
> Key: ATLAS-3290
> URL: https://issues.apache.org/jira/browse/ATLAS-3290
> Project: Atlas
> Issue Type: New Feature
> Components: atlas-core
> Affects Versions: 2.1.0
> Reporter: Na Li
> Assignee: Na Li
> Priority: Major
> Attachments: ATLAS-3290.001.patch
>
>
> The column name in an Impala lineage record may not contain its database
> name and table name.
> To get the database name and table name, we should use the metadata in the
> vertex rather than assume that the column name contains them.
> When we assume that the column name always contains its database name and
> table name, we run into the following exception:
> {code}
> I0618 19:16:02.415920 209817 QueryEventHookManager.java:212] Initiating onQueryComplete: org.apache.atlas.impala.hook.ImpalaLineageHook
> E0618 19:16:02.418964 210738 ImpalaLineageHook.java:126] ImpalaLineageHook.process(): failed to process query create table sales_sg as select * from sales_asia
> Java exception follows:
> java.lang.IllegalArgumentException: fullColumnName {} does not contain database name or table name
>     at org.apache.atlas.impala.hook.AtlasImpalaHookContext.getQualifiedNameForColumn(AtlasImpalaHookContext.java:115)
>     at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getQualifiedName(BaseImpalaEvent.java:164)
>     at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getQualifiedName(BaseImpalaEvent.java:134)
>     at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getColumnEntities(BaseImpalaEvent.java:495)
>     at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toTableEntity(BaseImpalaEvent.java:430)
>     at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toTableEntity(BaseImpalaEvent.java:393)
>     at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toAtlasEntity(BaseImpalaEvent.java:315)
>     at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getInputOutputEntity(BaseImpalaEvent.java:297)
>     at org.apache.atlas.impala.hook.events.CreateImpalaProcess.getEntities(CreateImpalaProcess.java:103)
>     at org.apache.atlas.impala.hook.events.CreateImpalaProcess.getNotificationMessages(CreateImpalaProcess.java:54)
>     at org.apache.atlas.impala.hook.ImpalaLineageHook.process(ImpalaLineageHook.java:122)
>     at org.apache.atlas.impala.hook.ImpalaLineageHook.process(ImpalaLineageHook.java:79)
>     at org.apache.atlas.impala.hook.ImpalaHook.onQueryComplete(ImpalaHook.java:36)
>     at org.apache.atlas.impala.hook.ImpalaLineageHook.onQueryComplete(ImpalaLineageHook.java:52)
>     at org.apache.impala.hooks.QueryEventHookManager.lambda$null$1(QueryEventHookManager.java:215)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
> The lineage record from Impala is:
> {code}
> {
>   "queryText":"create table sales_china as select * from sales_asia",
>   "queryId":"2940d0b242de53ea:e82ba8d300000000",
>   "hash":"a705a9ec851a5440afca0dfb8df86cd5",
>   "user":"root",
>   "timestamp":1560885032,
>   "endTime":1560885040,
>   "edges":[
>     {
>       "sources":[
>         1
>       ],
>       "targets":[
>         0
>       ],
>       "edgeType":"PROJECTION"
>     },
>     {
>       "sources":[
>         3
>       ],
>       "targets":[
>         2
>       ],
>       "edgeType":"PROJECTION"
>     }
>   ],
>   "vertices":[
>     {
>       "id":0,
>       "vertexType":"COLUMN",
>       "vertexId":"id",
>       "metadata":{
>         "tableName":"sales_db.sales_china",
>         "tableCreateTime":1560885039
>       }
>     },
>     {
>       "id":1,
>       "vertexType":"COLUMN",
>       "vertexId":"sales_db.sales_asia.id",
>       "metadata":{
>         "tableName":"sales_db.sales_asia",
>         "tableCreateTime":1560884919
>       }
>     },
>     {
>       "id":2,
>       "vertexType":"COLUMN",
>       "vertexId":"name",
>       "metadata":{
>         "tableName":"sales_db.sales_china",
>         "tableCreateTime":1560885039
>       }
>     },
>     {
>       "id":3,
>       "vertexType":"COLUMN",
>       "vertexId":"sales_db.sales_asia.name",
>       "metadata":{
>         "tableName":"sales_db.sales_asia",
>         "tableCreateTime":1560884919
>       }
>     }
>   ]
> }
> {code}
>
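For illustration only, the idea in the description can be sketched as below. This is not the code from ATLAS-3290.001.patch: the class name VertexColumnNames and the method qualifiedColumnName are hypothetical, and the sketch assumes that a vertex's metadata "tableName" has the form "db.table", as in the lineage record above. It builds only a "db.table.column" string and does not cover any further qualification the hook may apply.

```java
import java.util.Objects;

// Hypothetical helper, not part of Atlas: resolves a column's full name
// from a lineage vertex, preferring the vertex metadata over the vertexId.
final class VertexColumnNames {
    private VertexColumnNames() {}

    /**
     * @param vertexId          the vertex's "vertexId", e.g. "id" or "sales_db.sales_asia.id"
     * @param metadataTableName the vertex's metadata "tableName", e.g. "sales_db.sales_china";
     *                          may be null when the vertex carries no metadata
     * @return the column name in the form "db.table.column"
     */
    static String qualifiedColumnName(String vertexId, String metadataTableName) {
        Objects.requireNonNull(vertexId, "vertexId");

        // The bare column name is whatever follows the last dot (or the
        // whole vertexId when it contains no dot at all).
        String column = vertexId.substring(vertexId.lastIndexOf('.') + 1);

        // Preferred path: take database and table from the vertex metadata.
        if (metadataTableName != null && metadataTableName.indexOf('.') > 0) {
            return metadataTableName + "." + column;
        }

        // Fallback: the vertexId itself already has three dot-separated parts.
        if (vertexId.split("\\.").length == 3) {
            return vertexId;
        }

        throw new IllegalArgumentException(
            "column " + vertexId + " has neither table metadata nor a db.table prefix");
    }
}
```

With the record above, vertex 0 ("vertexId":"id", "tableName":"sales_db.sales_china") resolves to sales_db.sales_china.id in this sketch, whereas an approach that only splits the vertexId on dots has no database or table name to work with.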