[jira] [Updated] (ATLAS-4072) spark_column_lineage missing for insert into select * queries run via spark-shell
     [ https://issues.apache.org/jira/browse/ATLAS-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Umesh Padashetty updated ATLAS-4072:
    Attachment: Screenshot 2020-12-11 at 1.34.24 AM.png

> spark_column_lineage missing for insert into select * queries run via spark-shell
> ---------------------------------------------------------------------------------
>
>                 Key: ATLAS-4072
>                 URL: https://issues.apache.org/jira/browse/ATLAS-4072
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>            Reporter: Umesh Padashetty
>            Priority: Major
>         Attachments: Screenshot 2020-12-11 at 1.29.39 AM.png, Screenshot 2020-12-11 at 1.34.24 AM.png
>
>
> From the spark-shell, ran the below queries
>  * spark.sql("create table umesh(name string)");
>  * spark.sql("create table umesh_insert(name string)");
>  * spark.sql("insert into umesh_insert select * from umesh");
> There is a spark_process created between umesh and umesh_insert tables, but the spark_column_lineage is missing between the umesh.name and umesh_insert.name columns
> !Screenshot 2020-12-11 at 1.29.39 AM.png|width=438,height=435!
> To cross verify the behavior, I ran similar hive queries via beeline and found out that along with hive_process being created between umesh_hive and umesh_hive_insert tables, hive_column_lineage is created between umesh_hive.name and umesh_hive_insert.name columns.
> Queries run via beeline
>  * create table umesh_hive(name string);
>  * create table umesh_hive_insert(name string);
>  * insert into umesh_hive_insert select * from umesh_hive;
> !Screenshot 2020-12-11 at 1.34.24 AM.png|width=441,height=548!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (ATLAS-4072) spark_column_lineage missing for insert into select * queries run via spark-shell
     [ https://issues.apache.org/jira/browse/ATLAS-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Umesh Padashetty updated ATLAS-4072:
    Attachment: Screenshot 2020-12-11 at 1.29.39 AM.png
[jira] [Updated] (ATLAS-4072) spark_column_lineage missing for insert into select * queries run via spark-shell
     [ https://issues.apache.org/jira/browse/ATLAS-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Umesh Padashetty updated ATLAS-4072:
    Description: 
From the spark-shell, ran the below queries
 * spark.sql("create table umesh(name string)");
 * spark.sql("create table umesh_insert(name string)");
 * spark.sql("insert into umesh_insert select * from umesh");

There is a spark_process created between umesh and umesh_insert tables, but the spark_column_lineage is missing between the umesh.name and umesh_insert.name columns

!Screenshot 2020-12-11 at 1.29.39 AM.png!

To cross verify the behavior, I ran similar hive queries via beeline and found out that along with hive_process being created between umesh_hive and umesh_hive_insert tables, hive_column_lineage is created between umesh_hive.name and umesh_hive_insert.name columns.

Queries run via beeline
 * create table umesh_hive(name string);
 * create table umesh_hive_insert(name string);
 * insert into umesh_hive_insert select * from umesh_hive;

!Screenshot 2020-12-11 at 1.34.24 AM.png!

  was:
From the spark-shell, ran the below queries
 * spark.sql("create table umesh(name string)");
 * spark.sql("create table umesh_insert(name string)");
 * spark.sql("insert into umesh_insert select * from umesh");

There is a spark_process created between umesh and umesh_insert tables, but the spark_column_lineage is missing between the umesh.name and umesh_insert.name columns

!Screenshot 2020-12-11 at 1.29.39 AM.png|width=438,height=435!

To cross verify the behavior, I ran similar hive queries via beeline and found out that along with hive_process being created between umesh_hive and umesh_hive_insert tables, hive_column_lineage is created between umesh_hive.name and umesh_hive_insert.name columns.

Queries run via beeline
 * create table umesh_hive(name string);
 * create table umesh_hive_insert(name string);
 * insert into umesh_hive_insert select * from umesh_hive;

!Screenshot 2020-12-11 at 1.34.24 AM.png|width=441,height=548!
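
For reference, the column-level edges the report expects to see can be written down explicitly. The sketch below is an illustration only: ColumnEdge and expectedColumnEdges are hypothetical names, not part of Atlas or of any Spark connector, and the code does not talk to Atlas at all. It assumes only that "insert into ... select *" pairs source and target columns by position, and it can be pasted into spark-shell or a plain Scala REPL.

```scala
// Hypothetical helper for illustration; not an Atlas or Spark API.
// Models one column-level lineage edge as "table.column" -> "table.column".
final case class ColumnEdge(source: String, target: String)

// For "insert into dst select * from src", columns are paired by position,
// so the i-th source column feeds the i-th target column.
def expectedColumnEdges(
    srcTable: String, srcCols: Seq[String],
    dstTable: String, dstCols: Seq[String]): Seq[ColumnEdge] = {
  require(srcCols.length == dstCols.length,
    "select * must supply exactly one value per target column")
  srcCols.zip(dstCols).map { case (s, d) =>
    ColumnEdge(s"$srcTable.$s", s"$dstTable.$d")
  }
}

// Tables and columns taken from the reproduction steps in the report.
val expected = expectedColumnEdges("umesh", Seq("name"), "umesh_insert", Seq("name"))
expected.foreach(e => println(s"${e.source} -> ${e.target}"))
```

For the spark-shell case this yields the single edge umesh.name -> umesh_insert.name, which is the spark_column_lineage relationship the report describes as missing. The same call with umesh_hive and umesh_hive_insert gives the edge that hive_column_lineage does record.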