[ https://issues.apache.org/jira/browse/SPARK-36768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Willi Raschkowski updated SPARK-36768:
--------------------------------------
    Description:
Spark seems in some cases unable to resolve attributes that contain multi-part names where the first parts reference a table. Here's a repro:
{code:python}
>>> spark.range(3).toDF("col").write.parquet("testdata")
# Single name part attribute is fine
>>> spark.sql("SELECT col FROM parquet.testdata").show()
+---+
|col|
+---+
|  1|
|  0|
|  2|
+---+
# Name part with the table reference fails
>>> spark.sql("SELECT parquet.testdata.col FROM parquet.testdata").show()
AnalysisException: cannot resolve '`parquet.testdata.col`' given input columns: [col]; line 1 pos 7;
'Project ['parquet.testdata.col]
+- Relation[col#50L] parquet
{code}
The expected behavior is that {{parquet.testdata.col}} is recognized as referring to attribute {{col}} in {{parquet.testdata}}.

This also reproduces on master at time of writing.

  was:
Spark seems in some cases unable to resolve attributes that contain multi-part names where the first parts reference a table. Here's a repro:
{code:python}
>>> spark.range(3).toDF("col").write.parquet("testdata")
# Single name part attribute is fine
>>> spark.sql("SELECT col FROM parquet.testdata").show()
+---+
|col|
+---+
|  1|
|  0|
|  2|
+---+
# Name part with the table reference fails
>>> spark.sql("SELECT parquet.testdata.col FROM parquet.testdata").show()
AnalysisException: cannot resolve '`parquet.testdata.col`' given input columns: [col]; line 1 pos 7;
'Project ['parquet.testdata.col]
+- Relation[col#50L] parquet
{code}
This also reproduces on master at time of writing.

> Cannot resolve attribute with table reference
> ---------------------------------------------
>
>                 Key: SPARK-36768
>                 URL: https://issues.apache.org/jira/browse/SPARK-36768
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 2.4.7, 3.0.3, 3.1.2
>            Reporter: Willi Raschkowski
>            Priority: Major
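
To make the expected behavior concrete, here is a rough sketch of the matching rule the report asks for: a qualified name should resolve when its leading parts match the tail of the relation's qualifier. This is plain Python for illustration only, not Spark's analyzer code. The name resolve_attribute and the idea that the relation carries the qualifier ["parquet", "testdata"] are assumptions made for this example, and the sketch ignores case sensitivity and nested struct fields.

```python
from typing import Optional, Sequence


def resolve_attribute(
    name: str,
    qualifier: Sequence[str],
    columns: Sequence[str],
) -> Optional[str]:
    """Return the matching column name, or None if the name does not resolve.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    parts = name.split(".")
    column, prefix = parts[-1], parts[:-1]

    # The last name part must be a column of the relation.
    if column not in columns:
        return None

    # An unqualified name such as "col" resolves directly.
    if not prefix:
        return column

    # A qualified name resolves when its prefix equals the tail of the
    # relation's qualifier, e.g. ["testdata"] or ["parquet", "testdata"].
    if len(prefix) <= len(qualifier) and list(qualifier[-len(prefix):]) == prefix:
        return column

    return None


# Assumed qualifier and columns for the relation in the repro.
qualifier = ["parquet", "testdata"]
columns = ["col"]

print(resolve_attribute("col", qualifier, columns))
print(resolve_attribute("parquet.testdata.col", qualifier, columns))
```

Under this rule both calls return the column name, whereas in the repro only the unqualified form resolves.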