[ https://issues.apache.org/jira/browse/SPARK-23478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tomasz Bartczak updated SPARK-23478:
------------------------------------
    Priority: Minor  (was: Major)

> Inconsistent behaviour of union when columns have conflicting metadata
> ----------------------------------------------------------------------
>
>                 Key: SPARK-23478
>                 URL: https://issues.apache.org/jira/browse/SPARK-23478
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: Tomasz Bartczak
>            Priority: Minor
>
> When columns have different metadata and we union DataFrames containing
> them, the metadata of the result depends on the order of the union:
> {code:java}
> # Assumes an active SparkSession bound to `spark` (e.g. the pyspark shell).
> from pyspark.sql.functions import col
>
> df = spark.createDataFrame([{'a':1}])
> a = df  # column 'a' without metadata
> b = df.select(col('a').alias('a',metadata={'description':'xxx'}))
> print("a.union(b) gives {}".format(a.union(b).schema.fields[0].metadata))
> print("b.union(a) gives {}".format(b.union(a).schema.fields[0].metadata))
> {code}
> gives:
> {code:java}
> a.union(b) gives {}
> b.union(a) gives {'description': 'xxx'}
> {code}
>
> I also wonder whether this kind of union should be allowed at all: when
> fields with different metadata are inside a struct, the union fails, as
> shown in https://issues.apache.org/jira/projects/SPARK/issues/SPARK-23477
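
The order dependence described in the issue can be spotted before calling union by comparing column metadata position by position. The following is only a minimal sketch, not part of Spark: the helper name metadata_conflicts is hypothetical, and it assumes each field object exposes `name` and `metadata` attributes, as PySpark's StructField does. A namedtuple stands in for StructField so the sketch runs without a Spark installation.

```python
from collections import namedtuple


def metadata_conflicts(left_fields, right_fields):
    """Return (position, left_name, left_metadata, right_metadata) for every
    column position where the two field lists carry different metadata.

    Fields are paired by position, because union resolves columns by
    position rather than by name.
    """
    conflicts = []
    for position, (left, right) in enumerate(zip(left_fields, right_fields)):
        left_meta = dict(left.metadata or {})
        right_meta = dict(right.metadata or {})
        if left_meta != right_meta:
            conflicts.append((position, left.name, left_meta, right_meta))
    return conflicts


# Hypothetical stand-in for pyspark.sql.types.StructField (assumption: only
# the `name` and `metadata` attributes are needed for this check).
Field = namedtuple("Field", ["name", "metadata"])

plain = [Field("a", {})]
described = [Field("a", {"description": "xxx"})]

print(metadata_conflicts(plain, described))  # [(0, 'a', {}, {'description': 'xxx'})]
print(metadata_conflicts(plain, plain))      # []
```

With the DataFrames from the issue, the equivalent call would be metadata_conflicts(a.schema.fields, b.schema.fields). A non-empty result means the metadata of the unioned schema would differ between a.union(b) and b.union(a).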