bersprockets opened a new pull request, #37852:
URL: https://github.com/apache/spark/pull/37852

   ### What changes were proposed in this pull request?
   
   Change `UnsafeArrayWriter#initialize` to use longs rather than ints when 
calculating the initial size of the array.
   
   ### Why are the changes needed?
   
   When calculating the initial size in bytes needed for the array, 
`UnsafeArrayWriter#initialize` uses an int expression, which can overflow. When it 
does, the initialize method passes the resulting negative size to `BufferHolder#grow`, 
which rejects it.
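
   For illustration, here is a minimal sketch of the failure mode (class and 
variable names are illustrative only, not the actual Spark source): a size computed 
with int arithmetic silently wraps to a negative value instead of failing.
   ```
   // Illustrative only, not the actual Spark source: a size computed in
   // int arithmetic silently wraps to a negative value instead of failing.
   public class IntSizeOverflow {
       public static void main(String[] args) {
           int elementSize = 8;                         // e.g. bigint elements
           int numElements = 300_000_000;
           int sizeInBytes = elementSize * numElements; // 2,400,000,000 overflows an int
           System.out.println(sizeInBytes);             // prints -1894967296
       }
   }
   ```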
   
   Example (the following runs just fine on a 16GB laptop, despite the large 
driver memory setting):
   ```
   bin/spark-sql --driver-memory 22g --master "local[1]"
   
   create or replace temp view data1 as
   select 0 as key, id as val
   from range(0, 268271216);
   
   create or replace temp view data2 as
   select key as lkey, collect_list(val) as bigarray
   from data1
   group by key;
   
   -- the cache below forces Spark to create unsafe rows
   cache lazy table data2;
   
   select count(*) from data2;
   ```
   After a few minutes, `BufferHolder#grow` will throw the following exception:
   ```
   java.lang.IllegalArgumentException: Cannot grow BufferHolder by size -2115263656 because the size is negative
        at org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder.grow(BufferHolder.java:67)
        at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeArrayWriter.initialize(UnsafeArrayWriter.java:61)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.aggregate.Collect.serialize(collect.scala:73)
        at org.apache.spark.sql.catalyst.expressions.aggregate.Collect.serialize(collect.scala:37)
   ```
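
   Here is where that number comes from, as a back-of-the-envelope calculation 
(class and variable names are illustrative; the header layout, an 8-byte element 
count followed by word-aligned null-tracking bits, is an assumption based on 
`UnsafeArrayData`'s format):
   ```
   // Reproduces the size from the error message above. The header layout
   // (8-byte element count plus word-aligned null-tracking bits) is an
   // assumption based on UnsafeArrayData's format.
   public class OverflowArithmetic {
       public static void main(String[] args) {
           int numElements = 268271216;
           long headerInBytes = 8L + ((numElements + 63L) / 64L) * 8L; // 33,533,912
           long fixedPartInBytes = 8L * numElements;                   // 2,146,169,728
           long totalInBytes = headerInBytes + fixedPartInBytes;       // 2,179,703,640
           System.out.println((int) totalInBytes);                     // prints -2115263656
       }
   }
   ```
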
   This query was going to fail anyway, but the message makes it look like a 
bug in Spark rather than a user problem. `UnsafeArrayWriter#initialize` should 
instead calculate the size using a long expression and, if it exceeds 
`Integer.MAX_VALUE`, fail with an error message that shows the actual initial size.
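
   A sketch of what the guard could look like (the helper `initialSizeInBytes` 
and the error message are hypothetical; the actual change lives in 
`UnsafeArrayWriter#initialize`):
   ```
   // Hypothetical sketch, not the actual patch: compute the size with
   // long arithmetic and fail with an explicit message when it cannot
   // fit in an int, instead of letting it wrap to a negative value.
   public class SafeArraySizing {
       static int initialSizeInBytes(int numElements, int elementSize) {
           long headerInBytes = 8L + ((numElements + 63L) / 64L) * 8L;
           long totalInBytes = headerInBytes + (long) elementSize * numElements;
           if (totalInBytes > Integer.MAX_VALUE) {
               throw new UnsupportedOperationException(
                   "Cannot initialize array with " + numElements
                   + " elements of total size " + totalInBytes + " bytes");
           }
           return (int) totalInBytes;
       }

       public static void main(String[] args) {
           System.out.println(initialSizeInBytes(1000, 8));      // fine: 8136
           System.out.println(initialSizeInBytes(268271216, 8)); // throws
       }
   }
   ```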
   
   Note: This issue is not related to SPARK-39608, as far as I can tell, 
despite having the same symptom.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Other than a better error message, no.
   
   ### How was this patch tested?
   
   New unit test.
   

