[
https://issues.apache.org/jira/browse/IMPALA-12237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zoltán Borók-Nagy resolved IMPALA-12237.
----------------------------------------
Fix Version/s: Impala 4.3.0
Resolution: Fixed
> Add information about the table type in the lineage log
> -------------------------------------------------------
>
> Key: IMPALA-12237
> URL: https://issues.apache.org/jira/browse/IMPALA-12237
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Zoltán Borók-Nagy
> Assignee: Zoltán Borók-Nagy
> Priority: Major
> Labels: impala-iceberg
> Fix For: Impala 4.3.0
>
>
> Atlas needs table type information to correctly build the lineage graph.
> Currently, the lineage log for a CTAS statement contains the following:
> {noformat}
> {
>   "queryText": "create table lineage_ctas as select * from lineage_test",
>   "queryId": "774232610e386de9:8111ae3500000000",
>   "hash": "ed91deffcdc11c442c2420da3b33d3b3",
>   "user": "boroknagyz",
>   "timestamp": 1687351038,
>   "endTime": 1687351038,
>   "edges": [
>     {
>       "sources": [
>         1
>       ],
>       "targets": [
>         0
>       ],
>       "edgeType": "PROJECTION"
>     }
>   ],
>   "vertices": [
>     {
>       "id": 0,
>       "vertexType": "COLUMN",
>       "vertexId": "i",
>       "metadata": {
>         "tableName": "default.lineage_ctas",
>         "tableCreateTime": 1687351038
>       }
>     },
>     {
>       "id": 1,
>       "vertexType": "COLUMN",
>       "vertexId": "default.lineage_test.i",
>       "metadata": {
>         "tableName": "default.lineage_test",
>         "tableCreateTime": 1687351020
>       }
>     }
>   ]
> }
> {noformat}
> Under "vertices", this is what Atlas would like to see:
> {noformat}
> "vertices": [
>   {
>     "id": 0,
>     "vertexType": "COLUMN",
>     "vertexId": "i",
>     "metadata": {
>       "tableName": "default.lineage_ctas",
>       "tableType": "iceberg",
>       "tableCreateTime": 1687351038
>     }
>   },
>   {
>     "id": 1,
>     "vertexType": "COLUMN",
>     "vertexId": "default.lineage_test.i",
>     "metadata": {
>       "tableName": "default.lineage_test",
>       "tableType": "hive",
>       "tableCreateTime": 1687351020
>     }
>   }
> ]
> {noformat}
> So under the vertices' metadata, there should be a new field: 'tableType'.
> For FS-based tables it should be "hive", except for Iceberg, in which case it
> should be "iceberg". For Kudu it should be "kudu", and for HBase it should be
> "hbase".
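
The mapping described above can be sketched roughly as follows. This is only an illustration of the requested rule, not the actual Impala frontend change: the `TableKind` enum and the `lineage_table_type` and `vertex_metadata` helpers are hypothetical names, and only the four output strings ("hive", "iceberg", "kudu", "hbase") and the metadata field names come from the issue description.

```python
from enum import Enum


class TableKind(Enum):
    """Hypothetical storage classification; not Impala's real class names."""
    FS = "fs"            # file-system-based table (other than Iceberg)
    ICEBERG = "iceberg"
    KUDU = "kudu"
    HBASE = "hbase"


# Value of the new 'tableType' field for each kind, as listed in the issue.
_TABLE_TYPE = {
    TableKind.FS: "hive",
    TableKind.ICEBERG: "iceberg",
    TableKind.KUDU: "kudu",
    TableKind.HBASE: "hbase",
}


def lineage_table_type(kind):
    """Return the 'tableType' string for a table kind (hypothetical helper)."""
    return _TABLE_TYPE[kind]


def vertex_metadata(table_name, kind, create_time):
    """Build the 'metadata' object of a lineage vertex, including 'tableType'."""
    return {
        "tableName": table_name,
        "tableType": lineage_table_type(kind),
        "tableCreateTime": create_time,
    }
```

With these assumed helpers, `vertex_metadata("default.lineage_ctas", TableKind.ICEBERG, 1687351038)` yields the metadata shown for vertex 0 in the requested output, and a plain FS-based source table would be reported as "hive".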