[ https://issues.apache.org/jira/browse/SPARK-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-6851:
-----------------------------------
Assignee: Michael Armbrust (was: Apache Spark)
> Wrong answers for self joins of converted parquet relations
> -----------------------------------------------------------
>
> Key: SPARK-6851
> URL: https://issues.apache.org/jira/browse/SPARK-6851
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.3.1
> Reporter: Michael Armbrust
> Assignee: Michael Armbrust
> Priority: Blocker
>
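> From the user list (/cc [~chinnitv]): when the same relation exists twice in
> a query plan, our new caching logic replaces both instances with identical
> replacements, so both copies carry the same attribute IDs and the self-join
> condition degenerates into comparing an attribute with itself. The bug can
> be seen in the following transformation: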
> {code}
> === Applying Rule org.apache.spark.sql.hive.HiveMetastoreCatalog$ParquetConversions ===
> Before (left column of the rule trace):
> Project [state#59,month#60]
>  Join Inner, Some(((state#69 = state#59) && (month#70 = month#60)))
>   MetastoreRelation default, orders, None
>   Subquery ao
>    Distinct
>     Project [state#69,month#70]
>      Join Inner, Some((id#81 = id#71))
>       MetastoreRelation default, orders, None
>       MetastoreRelation default, orderupdates, None
>
> After (right column of the rule trace):
> 'Project [state#105,month#106]
>  'Join Inner, Some(((state#105 = state#105) && (month#106 = month#106)))
>   Subquery orders
>    Relation[id#97,category#98,make#99,type#100,price#101,pdate#102,customer#103,city#104,state#105,month#106] org.apache.spark.sql.parquet.ParquetRelation2
>   Subquery ao
>    Distinct
>     Project [state#105,month#106]
>      Join Inner, Some((id#115 = id#97))
>       Subquery orders
>        Relation[id#97,category#98,make#99,type#100,price#101,pdate#102,customer#103,city#104,state#105,month#106] org.apache.spark.sql.parquet.ParquetRelation2
>       Subquery orderupdates
>        Relation[id#115,category#116,make#117,type#118,price#119,pdate#120,customer#121,city#122,state#123,month#124] org.apache.spark.sql.parquet.ParquetRelation2
> {code}
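
The effect described above can be illustrated with a small, Spark-free model. This is only a sketch in Python: `ConversionCache`, `Relation`, `Attribute`, and `new_instance` are hypothetical names invented for illustration, not Spark classes. Handing out a fresh instance per lookup is shown as one plausible remedy, not necessarily the fix adopted for this issue.

```python
from dataclasses import dataclass
from itertools import count

_next_id = count(1)  # supply of expression IDs for this toy model


@dataclass(frozen=True)
class Attribute:
    name: str
    expr_id: int

    def __str__(self):
        return f"{self.name}#{self.expr_id}"


@dataclass(frozen=True)
class Relation:
    table: str
    output: tuple

    def new_instance(self):
        # Same table, but every output attribute gets a fresh expression ID.
        fresh = tuple(Attribute(a.name, next(_next_id)) for a in self.output)
        return Relation(self.table, fresh)


def make_relation(table, columns):
    return Relation(table, tuple(Attribute(c, next(_next_id)) for c in columns))


class ConversionCache:
    """Hypothetical stand-in for a cache of converted relations."""

    def __init__(self, fresh_instances):
        self.fresh_instances = fresh_instances
        self._cache = {}

    def convert(self, table, columns):
        if table not in self._cache:
            self._cache[table] = make_relation(table, columns)
        cached = self._cache[table]
        # Buggy mode returns the very same object for every occurrence.
        return cached.new_instance() if self.fresh_instances else cached


def self_join_condition(cache):
    # The same table appears on both sides of the join.
    left = cache.convert("orders", ("state", "month"))
    right = cache.convert("orders", ("state", "month"))
    return list(zip(left.output, right.output))


def is_trivial(condition):
    # True when every equality compares an attribute with itself.
    return all(l == r for l, r in condition)


def render(condition):
    return " && ".join(f"({l} = {r})" for l, r in condition)


buggy = self_join_condition(ConversionCache(fresh_instances=False))
fixed = self_join_condition(ConversionCache(fresh_instances=True))
print("shared instance:", render(buggy), "trivial:", is_trivial(buggy))
print("fresh instances:", render(fixed), "trivial:", is_trivial(fixed))
```

With the shared cached instance, both sides of the self join carry the same IDs, so every equality compares an attribute with itself, as in `state#105 = state#105` in the trace. With a fresh instance per occurrence the two sides stay distinguishable.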