markj-db commented on code in PR #43880:
URL: https://github.com/apache/spark/pull/43880#discussion_r1398321440
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala:
##########
@@ -23,6 +23,13 @@ import org.apache.spark.unsafe.types.UTF8String
object NumberConverter {
+ /**
+ * The output string has a max length of one char per bit in the intermediate representation plus
+ * one char for the '-' sign. This corresponds to the representation of `Long.MinValue` with
+ * `toBase` equal to -2.
+ */
+ private final val MAX_OUTPUT_LENGTH = 64 + 1
Review Comment:
Ah, thanks, I agree. I was searching for a suitable constant and failed to
find one.
##########
sql/core/src/test/scala/org/apache/spark/sql/MathFunctionsSuite.scala:
##########
@@ -262,6 +262,14 @@ class MathFunctionsSuite extends QueryTest with SharedSparkSession {
}
}
+ test("SPARK-44973 conv must allocate enough space for all digits plus negative sign") {
+ withSQLConf(SQLConf.ANSI_ENABLED.key -> false.toString) {
+ val df = Seq(("8" + "0"*15), ("-8" + "0" * 15)).toDF("num")
Review Comment:
I went with the `BigInt` proposal
##########
sql/core/src/test/scala/org/apache/spark/sql/MathFunctionsSuite.scala:
##########
@@ -262,6 +262,14 @@ class MathFunctionsSuite extends QueryTest with SharedSparkSession {
}
}
+ test("SPARK-44973 conv must allocate enough space for all digits plus negative sign") {
Review Comment:
I was following the convention in the file 🤷
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala:
##########
@@ -23,6 +23,13 @@ import org.apache.spark.unsafe.types.UTF8String
object NumberConverter {
+ /**
+ * The output string has a max length of one char per bit in the intermediate representation plus
+ * one char for the '-' sign. This corresponds to the representation of `Long.MinValue` with
+ * `toBase` equal to -2.
+ */
+ private final val MAX_OUTPUT_LENGTH = 64 + 1
Review Comment:
I don't think the problem is the intermediate representation exceeding 64
chars; it's when the length is exactly 64 and we need one additional byte for
the '-' sign. I've updated the comment; hopefully it's clearer?
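
To make the `64 + 1` bound concrete, here is a minimal sketch that uses only the Scala/JDK standard library, not Spark's `NumberConverter`. The object name `ConvLengthSketch` is hypothetical. It assumes the worst case described in the doc comment: `Long.MinValue` written in base 2 with a leading '-'.

```scala
// Illustrative sketch only; it does not call Spark's NumberConverter.
object ConvLengthSketch {
  // |Long.MinValue| = 2^63, which in base 2 is a 1 followed by 63 zeros.
  val magnitude: BigInt = BigInt(Long.MinValue).abs
  val digits: String = magnitude.toString(2)

  // A signed rendering (as with toBase = -2) needs one more char for '-'.
  val signed: String = "-" + digits

  def main(args: Array[String]): Unit = {
    println(digits.length) // 64: one char per bit
    println(signed.length) // 65: 64 digits plus the sign
  }
}
```

The digits alone fit in 64 chars; only the sign pushes the total to 65, which is the case the constant has to cover.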
##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/NumberConverterSuite.scala:
##########
@@ -55,6 +55,12 @@ class NumberConverterSuite extends SparkFunSuite {
checkConv("-10", 11, 7, "45012021522523134134555")
}
+ test("SPARK-44973: conv must allocate enough space for all digits plus negative sign") {
+ checkConv(s"${Long.MinValue}", 10, -2, "-1" + "0" * 63)
+ checkConv("8" + "0" * 15, 16, -2, "-1" + "0" * 63)
Review Comment:
Done (although IMO this is less obvious in some sense)
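
For readers comparing the two `checkConv` inputs above, the following sketch (JDK methods only; the object name `SameBitPatternSketch` is hypothetical) suggests why both are expected to produce the same output: the hex string `"8" + "0" * 15` read as an unsigned 64-bit value has the same bit pattern as `Long.MinValue`. It makes no claim about how `NumberConverter` parses its input internally.

```scala
// Illustrative sketch only; it does not call Spark's NumberConverter.
object SameBitPatternSketch {
  // "8" followed by fifteen "0"s is 0x8000000000000000: only the top bit is set.
  val hexInput: String = "8" + "0" * 15

  // Parsed as an unsigned 64-bit value, this is the bit pattern of Long.MinValue.
  val parsed: Long = java.lang.Long.parseUnsignedLong(hexInput, 16)
  val sameAsMinValue: Boolean = parsed == Long.MinValue

  def main(args: Array[String]): Unit = {
    println(sameAsMinValue)                               // true
    println(java.lang.Long.toBinaryString(parsed).length) // 64
  }
}
```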