markj-db commented on code in PR #43880:
URL: https://github.com/apache/spark/pull/43880#discussion_r1398321440


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala:
##########
@@ -23,6 +23,13 @@ import org.apache.spark.unsafe.types.UTF8String
 
 object NumberConverter {
 
+  /**
+   * The output string has a max length of one char per bit in the intermediate representation plus
+   * one char for the '-' sign.  This corresponds to the representation of `Long.MinValue` with
+   * `toBase` equal to -2.
+   */
+  private final val MAX_OUTPUT_LENGTH = 64 + 1

Review Comment:
   Ah, thanks, I agree, I was searching for a suitable constant and failed to find it.



##########
sql/core/src/test/scala/org/apache/spark/sql/MathFunctionsSuite.scala:
##########
@@ -262,6 +262,14 @@ class MathFunctionsSuite extends QueryTest with SharedSparkSession {
     }
   }
 
+  test("SPARK-44973 conv must allocate enough space for all digits plus negative sign") {
+    withSQLConf(SQLConf.ANSI_ENABLED.key -> false.toString) {
+      val df = Seq(("8" + "0"*15), ("-8" + "0" * 15)).toDF("num")

Review Comment:
   I went with the `BigInt` proposal



##########
sql/core/src/test/scala/org/apache/spark/sql/MathFunctionsSuite.scala:
##########
@@ -262,6 +262,14 @@ class MathFunctionsSuite extends QueryTest with SharedSparkSession {
     }
   }
 
+  test("SPARK-44973 conv must allocate enough space for all digits plus negative sign") {

Review Comment:
   I was following the convention in the file 🤷 



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala:
##########
@@ -23,6 +23,13 @@ import org.apache.spark.unsafe.types.UTF8String
 
 object NumberConverter {
 
+  /**
+   * The output string has a max length of one char per bit in the intermediate representation plus
+   * one char for the '-' sign.  This corresponds to the representation of `Long.MinValue` with
+   * `toBase` equal to -2.
+   */
+  private final val MAX_OUTPUT_LENGTH = 64 + 1

Review Comment:
   I don't think the problem is the length of the intermediate representation exceeding 64; it's when it is exactly 64 and we need an additional byte for the '-' sign. I've updated the comment; hopefully it's clearer now.
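   The edge case described above can be checked outside Spark. The following standalone sketch (not part of `NumberConverter`; `ConvLengthSketch` is a hypothetical name) shows that the signed base-2 rendering of `Long.MinValue` takes exactly 64 digit characters plus one for the sign, which is where `MAX_OUTPUT_LENGTH = 64 + 1` comes from:

```scala
// Standalone sketch, not Spark's NumberConverter: with a negative toBase,
// conv emits a signed result, so the worst case is Long.MinValue rendered in
// base 2 -- all 64 binary digit characters plus one more for the '-' sign.
object ConvLengthSketch {
  def main(args: Array[String]): Unit = {
    // BigInt sidesteps the overflow of math.abs(Long.MinValue).
    val signed = BigInt(Long.MinValue).toString(2)
    assert(signed == "-1" + "0" * 63)
    // 65 characters total: the source of MAX_OUTPUT_LENGTH = 64 + 1.
    println(signed.length)  // prints 65
  }
}
```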



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/NumberConverterSuite.scala:
##########
@@ -55,6 +55,12 @@ class NumberConverterSuite extends SparkFunSuite {
     checkConv("-10", 11, 7, "45012021522523134134555")
   }
 
+  test("SPARK-44973: conv must allocate enough space for all digits plus negative sign") {
+    checkConv(s"${Long.MinValue}", 10, -2, "-1" + "0" * 63)
+    checkConv("8" + "0" * 15, 16, -2, "-1" + "0" * 63)

Review Comment:
   Done (although IMO this is less obvious in some sense)
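   For context on why the two inputs in the test above produce the same output: `"8" + "0" * 15` parsed as base 16 is 2^63, which wraps to `Long.MinValue` when narrowed to a signed 64-bit value. A small standalone sketch (`HexWrapSketch` is a hypothetical name, not Spark code):

```scala
// Standalone sketch: "8" followed by 15 zeros in base 16 is 0x8000000000000000,
// i.e. 2^63, which becomes Long.MinValue when truncated to a signed 64-bit long.
object HexWrapSketch {
  def main(args: Array[String]): Unit = {
    val unsigned = BigInt("8" + "0" * 15, 16)
    assert(unsigned == BigInt(2).pow(63))
    // BigInt.toLong keeps only the low-order 64 bits, so 2^63 wraps around.
    println(unsigned.toLong == Long.MinValue)  // prints true
  }
}
```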



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

