Stavros Kontopoulos created SPARK-17959:
-------------------------------------------
Summary: spark.sql.join.preferSortMergeJoin has no effect for simple join
Key: SPARK-17959
URL: https://issues.apache.org/jira/browse/SPARK-17959
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.0.1
Reporter: Stavros Kontopoulos

Example code:

    val df = spark.sparkContext.parallelize(List(
      ("A", 10, "dss@s1"),
      ("A", 20, "dss@s2"),
      ("B", 1, "dss@qqa"),
      ("B", 2, "dss@qqb")
    )).toDF("Group", "Amount", "Email")

    df.as("a").join(df.as("b"))
      .where($"a.Group" === $"b.Group")
      .explain()

I always get the SortMerge strategy, even if I set spark.sql.join.preferSortMergeJoin to false, since sizeInBytes = 2^63-1 for a DataFrame built from an RDD:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala#L101

and thus the condition here:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L127

is always false, so the planner never picks a different join strategy.
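
To illustrate why the condition cannot hold, here is a minimal sketch in plain Scala that models the size checks, not Spark's actual code. The object and method names are hypothetical. The 10 MB broadcast threshold and 200 shuffle partitions are assumed to be the defaults. The shape of the two checks (a per-partition size bound and a "three times smaller" test) is my reading of the linked SparkStrategies.scala and may differ between versions. The sketch leaves out the join-type and key-orderability parts of the real condition.

```scala
// Simplified, hypothetical model of the planner's size checks (not Spark's code).
object JoinSelectionSketch {
  // Assumed defaults: 10 MB broadcast threshold, 200 shuffle partitions.
  val autoBroadcastJoinThreshold: BigInt = BigInt(10L * 1024 * 1024)
  val numShufflePartitions: Int = 200

  // Size reported for a plan built from an RDD, per the report: 2^63 - 1.
  val rddSizeInBytes: BigInt = BigInt(Long.MaxValue)

  // A side can be hashed per partition only if its estimated size is small enough.
  def canBuildLocalHashMap(sizeInBytes: BigInt): Boolean =
    sizeInBytes < autoBroadcastJoinThreshold * numShufflePartitions

  // The build side should also be much smaller than the other side.
  def muchSmaller(a: BigInt, b: BigInt): Boolean = a * 3 <= b

  // True when the modeled condition would pick a hash join over sort-merge.
  def prefersShuffledHashJoin(
      preferSortMergeJoin: Boolean,
      left: BigInt,
      right: BigInt): Boolean =
    !preferSortMergeJoin && canBuildLocalHashMap(right) && muchSmaller(right, left)

  def main(args: Array[String]): Unit = {
    // Both sides of the self-join report 2^63 - 1 bytes, so this prints false
    // even though preferSortMergeJoin is false.
    println(prefersShuffledHashJoin(preferSortMergeJoin = false, rddSizeInBytes, rddSizeInBytes))
  }
}
```

Under these assumptions the flag only matters when the planner has a realistic size estimate for the build side. For example, `prefersShuffledHashJoin(false, BigInt(10000), BigInt(1000))` evaluates to true in this model.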