markj-db commented on code in PR #43880:
URL: https://github.com/apache/spark/pull/43880#discussion_r1399558967
##########
sql/core/src/test/scala/org/apache/spark/sql/MathFunctionsSuite.scala:
##########
@@ -262,6 +262,17 @@ class MathFunctionsSuite extends QueryTest with
SharedSparkSession {
}
}
+ test("SPARK-44973: conv must allocate enough space for all digits plus
negative sign") {
+ withSQLConf(SQLConf.ANSI_ENABLED.key -> false.toString) {
+ val df = Seq(
+ ((BigInt(Long.MaxValue) + 1).toString(16)),
+ (BigInt(Long.MinValue).toString(16))
Review Comment:
Both `8000000000000000` and `-8000000000000000` trigger the
`java.lang.ArrayIndexOutOfBoundsException` before the fix. There's not really
a big change in path coverage with the addition of the second test case, but
there's also not really a big change in test runtime. Since both cases were
failing before, it seemed an appropriate tradeoff (to me) to test that the fix
addressed both cases.
I suppose I'm more concerned about future regressions due to unknown future
refactoring of `NumberConverter` and the benefit of documenting this edge case
(the documentation of `conv()` is sparse, in the sense that it has no
discussion of edge cases or the effect of the internal `long` representation).
As an example, it's counterintuitive to me that
`conv("8000000000000000",16,-2)==conv("-8000000000000000",16,-2)` but
`conv("7fffffffffffffff",16,-2)!=conv("-7fffffffffffffff",16,-2)`.
```
scala> spark.sql(s"""select col1, conv(col1,16,-2) from values
("${"8"+"0"*15}"), ("${"-8"+"0"*15}"), ("${"7"+"f"*15}"), ("${"-7"+"f"*15}"),
("${"8" + "0" * 14 + "1"}"), ("${"-8" + "0" * 14 +
"1"}")""").show(truncate=false)
+-----------------+-----------------------------------------------------------------+
|col1 |conv(col1, 16, -2)
|
+-----------------+-----------------------------------------------------------------+
|8000000000000000
|-1000000000000000000000000000000000000000000000000000000000000000|
|-8000000000000000|-1000000000000000000000000000000000000000000000000000000000000000|
|7fffffffffffffff
|111111111111111111111111111111111111111111111111111111111111111 |
|-7fffffffffffffff|-111111111111111111111111111111111111111111111111111111111111111
|
|8000000000000001
|-111111111111111111111111111111111111111111111111111111111111111 |
|-8000000000000001|-111111111111111111111111111111111111111111111111111111111111111
|
+-----------------+-----------------------------------------------------------------+
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]