Sameer Agarwal created SPARK-16488:
--------------------------------------

             Summary: Codegen variable namespace collision for pmod and partitionBy
                 Key: SPARK-16488
                 URL: https://issues.apache.org/jira/browse/SPARK-16488
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Sameer Agarwal

Reported by [~brkyvz]. Original description below:

The generated code used by `pmod` conflicts with DataFrameWriter.partitionBy.

Quick repro:
{code}
import org.apache.spark.sql.functions._

case class Test(a: Int, b: String)
val ds = Seq(Test(0, "a"), Test(1, "b"), Test(1, "a")).toDS.createOrReplaceTempView("test")
sql("""
  select * from test distribute by pmod(a, 2)
""")
  .write
  .partitionBy("b")
  .mode("overwrite")
  .parquet("/tmp/repro")
{code}

You may also use repartition with the function `pmod` instead of using `pmod` inside `distribute by` in SQL.

Example generated code (two variables defined as r):
{code}
public UnsafeRow apply(InternalRow i) {
  int value1 = 42;

  boolean isNull2 = i.isNullAt(0);
  UTF8String value2 = isNull2 ? null : (i.getUTF8String(0));
  if (!isNull2) {
    value1 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(value2.getBaseObject(), value2.getBaseOffset(), value2.numBytes(), value1);
  }


  int value4 = 42;

  boolean isNull5 = i.isNullAt(1);
  UTF8String value5 = isNull5 ? null : (i.getUTF8String(1));
  if (!isNull5) {
    value4 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(value5.getBaseObject(), value5.getBaseOffset(), value5.numBytes(), value4);
  }

  int value3 = -1;

  int r = value4 % 10;
  if (r < 0) {
    value3 = (r + 10) % 10;
  } else {
    value3 = r;
  }
  value1 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashInt(value3, value1);

  int value = -1;

  int r = value1 % 200;
  if (r < 0) {
    value = (r + 200) % 200;
  } else {
    value = r;
  }
  rowWriter.write(0, value);
  return result;
}
}
{code}
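
The collision above arises because two positive-modulus snippets are emitted into the same method and each hard-codes the local name `r`. As a rough illustration of one way such a clash can be avoided, the sketch below gives every emitted snippet a uniquely numbered remainder variable. The class `FreshNameSketch` and its methods `freshName` and `pmodSnippet` are hypothetical stand-ins written for this example; they are not Spark's code generator and do not claim to show the fix that was applied for this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not Spark's CodegenContext): hands out unique local-variable names.
class FreshNameSketch {
    // Per-prefix counters, so "r" yields r_0, r_1, ...
    private final Map<String, Integer> counters = new HashMap<>();

    // Returns a name that this instance has not returned before for the given prefix.
    String freshName(String prefix) {
        int id = counters.merge(prefix, 1, Integer::sum) - 1;
        return prefix + "_" + id;
    }

    // Emits Java source that stores the positive modulus of `input` by `n` in `result`.
    // The remainder variable gets a fresh name instead of a hard-coded `r`.
    String pmodSnippet(String input, int n, String result) {
        String r = freshName("r");
        return String.join("\n",
            "int " + r + " = " + input + " % " + n + ";",
            "if (" + r + " < 0) {",
            "  " + result + " = (" + r + " + " + n + ") % " + n + ";",
            "} else {",
            "  " + result + " = " + r + ";",
            "}");
    }

    public static void main(String[] args) {
        FreshNameSketch ctx = new FreshNameSketch();
        // Same operand names and moduli as the two snippets in the report.
        System.out.println(ctx.pmodSnippet("value4", 10, "value3"));
        System.out.println(ctx.pmodSnippet("value1", 200, "value"));
    }
}
```

With this approach the two snippets can share one method body, since the first declares `r_0` and the second declares `r_1`. Wrapping each snippet in its own `{ ... }` block would be another way to scope the variable, at the cost of slightly noisier generated code.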