[ https://issues.apache.org/jira/browse/SPARK-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625755#comment-14625755 ]
Apache Spark commented on SPARK-6851:
-------------------------------------

User 'adrian-wang' has created a pull request for this issue:
https://github.com/apache/spark/pull/7387

> Wrong answers for self joins of converted parquet relations
> -----------------------------------------------------------
>
>                 Key: SPARK-6851
>                 URL: https://issues.apache.org/jira/browse/SPARK-6851
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.1
>            Reporter: Michael Armbrust
>            Assignee: Michael Armbrust
>            Priority: Blocker
>             Fix For: 1.3.1, 1.4.0
>
>
> From the user list (/cc [~chinnitv]): When the same relation exists twice in a query plan, our new caching logic replaces both instances with identical replacements, so both sides of a self join end up with the same attribute IDs. The bug can be seen in the following transformation (original plan on the left, converted plan on the right):
> {code}
> === Applying Rule org.apache.spark.sql.hive.HiveMetastoreCatalog$ParquetConversions ===
> !Project [state#59,month#60]                                           'Project [state#105,month#106]
> ! Join Inner, Some(((state#69 = state#59) && (month#70 = month#60)))    'Join Inner, Some(((state#105 = state#105) && (month#106 = month#106)))
> !  MetastoreRelation default, orders, None                               Subquery orders
> !  Subquery ao                                                            Relation[id#97,category#98,make#99,type#100,price#101,pdate#102,customer#103,city#104,state#105,month#106] org.apache.spark.sql.parquet.ParquetRelation2
> !   Distinct                                                             Subquery ao
> !    Project [state#69,month#70]                                          Distinct
> !     Join Inner, Some((id#81 = id#71))                                    Project [state#105,month#106]
> !      MetastoreRelation default, orders, None                              Join Inner, Some((id#115 = id#97))
> !      MetastoreRelation default, orderupdates, None                         Subquery orders
> !                                                                             Relation[id#97,category#98,make#99,type#100,price#101,pdate#102,customer#103,city#104,state#105,month#106] org.apache.spark.sql.parquet.ParquetRelation2
> !                                                                            Subquery orderupdates
> !                                                                             Relation[id#115,category#116,make#117,type#118,price#119,pdate#120,customer#121,city#122,state#123,month#124] org.apache.spark.sql.parquet.ParquetRelation2
> {code}
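
To illustrate why identical replacements break a self join, here is a minimal toy model in Python. It is not Spark code: `Attr`, `Relation`, `ConversionCache`, `new_instance`, and the `fresh_ids` flag are all hypothetical names invented for this sketch. It only shows that returning the same cached object for both sides of a join yields a condition such as state#N = state#N, which is trivially true, while handing out a copy with fresh IDs keeps the two sides distinct. Whether the pull request above takes this approach is not stated in this message.

```python
from dataclasses import dataclass
from itertools import count

# Global counter standing in for an expression-ID generator (assumption).
_ids = count(1)


@dataclass(frozen=True)
class Attr:
    """A column reference identified by name and a unique ID (hypothetical)."""
    name: str
    expr_id: int

    def __str__(self):
        return f"{self.name}#{self.expr_id}"


@dataclass(frozen=True)
class Relation:
    """A leaf relation exposing a tuple of output attributes (hypothetical)."""
    table: str
    output: tuple

    def new_instance(self):
        # Same underlying table, but every attribute gets a fresh ID.
        return Relation(self.table, tuple(Attr(a.name, next(_ids)) for a in self.output))


def make_relation(table, columns):
    return Relation(table, tuple(Attr(c, next(_ids)) for c in columns))


class ConversionCache:
    """Caches one converted relation per table name (toy model)."""

    def __init__(self, fresh_ids):
        self.fresh_ids = fresh_ids
        self._cache = {}

    def convert(self, table, columns):
        if table not in self._cache:
            self._cache[table] = make_relation(table, columns)
        cached = self._cache[table]
        # With fresh_ids=False both call sites receive the very same attributes.
        return cached.new_instance() if self.fresh_ids else cached


def self_join_condition(cache):
    """Build the equality pairs for joining 'orders' with itself."""
    left = cache.convert("orders", ["state", "month"])
    right = cache.convert("orders", ["state", "month"])
    return list(zip(left.output, right.output))


def is_trivial(condition):
    """True when every equality compares an attribute with itself."""
    return all(l.expr_id == r.expr_id for l, r in condition)


if __name__ == "__main__":
    for fresh in (False, True):
        cond = self_join_condition(ConversionCache(fresh_ids=fresh))
        text = " && ".join(f"({l} = {r})" for l, r in cond)
        print(f"fresh_ids={fresh}: {text} trivial={is_trivial(cond)}")
```

With `fresh_ids=False` the model reproduces the shape of the broken condition in the trace above (each attribute compared with itself); with `fresh_ids=True` the two sides of the join carry different IDs.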