[ https://issues.apache.org/jira/browse/SPARK-17866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-17866.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.1.0

Issue resolved by pull request 15427
[https://github.com/apache/spark/pull/15427]

> Dataset.dropDuplicates (i.e., distinct) should not change the output of child plan
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-17866
>                 URL: https://issues.apache.org/jira/browse/SPARK-17866
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Liang-Chi Hsieh
>            Assignee: Liang-Chi Hsieh
>             Fix For: 2.1.0
>
>
> Dataset.dropDuplicates currently creates a new Alias with a new exprId.
> However, this causes a problem when we want to select columns as follows:
> {code}
> val ds = Seq(("a", 1), ("a", 2), ("b", 1), ("a", 1)).toDS()
> // ds("_2") will cause an analysis exception
> ds.dropDuplicates("_1").select(ds("_1").as[String], ds("_2").as[Int])
> {code}
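
On releases that do not include the fix, one way to sidestep the exception is to resolve the columns against the deduplicated Dataset rather than the original one. The following is a minimal sketch of that idea, not the change made in pull request 15427. The object name DropDuplicatesSketch, the method selectAfterDedup, and the local SparkSession settings are hypothetical and exist only for illustration.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical helper object; not part of Spark or of the pull request.
object DropDuplicatesSketch {

  // Deduplicate on "_1", then select columns resolved against the
  // deduplicated Dataset, so the references match its output attributes.
  def selectAfterDedup(spark: SparkSession): Dataset[(String, Int)] = {
    import spark.implicits._

    val ds = Seq(("a", 1), ("a", 2), ("b", 1), ("a", 1)).toDS()
    val deduped = ds.dropDuplicates("_1")

    // deduped("_2") is looked up by name in deduped's own plan, so it does
    // not depend on the exprIds of the original ds.
    deduped.select(deduped("_1").as[String], deduped("_2").as[Int])
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local, single-threaded session is enough for a demo.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-17866-sketch")
      .getOrCreate()
    try {
      // One row per distinct "_1" value; which "_2" value is kept for "a"
      // should not be relied upon.
      selectAfterDedup(spark).show()
    } finally {
      spark.stop()
    }
  }
}
```

With the fix in 2.1.0, the original form that selects through ds("_1") and ds("_2") is expected to work as well, since dropDuplicates no longer changes the output attributes of the child plan.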