Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19266#discussion_r143421894
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java ---
    @@ -35,6 +35,11 @@
      * if the fields of row are all fixed-length, as the size of result row is also fixed.
      */
     public class BufferHolder {
    +
    +  // Some JVMs can't allocate arrays of length Integer.MAX_VALUE; actual max is somewhat
    +  // smaller. Be conservative and lower the cap a little.
    +  private static final int ARRAY_MAX = Integer.MAX_VALUE - 8;
    --- End diff --
    
    @gatorsmile have a look at the JIRA for some detail; you can see a similar limit in the JDK, for example at
    http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/util/ArrayList.java#l229
    
    You are right; I think the code around line 54 needs to be something straightforward like:
    ```
    long totalSize = initialSize + bitsetWidthInBytes + 8L * row.numFields();
    if (totalSize > ARRAY_MAX) { ...error... }
    this.buffer = new byte[(int) totalSize];
    ```
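
    For illustration, the check above could be fleshed out roughly as follows. This is only a sketch, not the actual patch: `BufferSizeCheck` and `checkedTotalSize` are hypothetical names, the exception type is an assumption, and plain `int` parameters stand in for `initialSize`, `bitsetWidthInBytes` and `row.numFields()`. The first operand is widened to `long` so that the `int` addition cannot overflow before the comparison.

    ```java
    // Sketch only: BufferSizeCheck and checkedTotalSize are hypothetical names, not Spark APIs.
    final class BufferSizeCheck {
      // Conservative cap taken from the diff above.
      static final int ARRAY_MAX = Integer.MAX_VALUE - 8;

      // Computes the requested size in 64-bit arithmetic and rejects anything above the cap.
      static int checkedTotalSize(int initialSize, int bitsetWidthInBytes, int numFields) {
        // Widen first, so initialSize + bitsetWidthInBytes is not added as int.
        long totalSize = (long) initialSize + bitsetWidthInBytes + 8L * numFields;
        if (totalSize > ARRAY_MAX) {
          // Exception type is an assumption made for this sketch.
          throw new UnsupportedOperationException(
              "Cannot allocate " + totalSize + " bytes; the limit is " + ARRAY_MAX);
        }
        return (int) totalSize; // Safe: totalSize <= ARRAY_MAX fits in an int.
      }

      public static void main(String[] args) {
        byte[] buffer = new byte[checkedTotalSize(64, 8, 10)];
        System.out.println(buffer.length); // 64 + 8 + 8 * 10 = 152
      }
    }
    ```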
    
    Yes, I agree with your new JIRA @liufengdb, though I think we'll need to go the other way, to `Integer.MAX_VALUE - 15`, because the value must be divisible by 8.
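
    To make the arithmetic concrete: `Integer.MAX_VALUE - 8` leaves a remainder of 7 when divided by 8, and rounding it down to a multiple of 8 gives `Integer.MAX_VALUE - 15`. A small sketch that checks this (`AlignedArrayCap` and `roundDown` are hypothetical names used only here):

    ```java
    // Sketch only: AlignedArrayCap and roundDown are hypothetical names, not Spark APIs.
    final class AlignedArrayCap {
      // Largest multiple of `alignment` that does not exceed `limit`.
      // Assumes limit >= 0 and alignment is a positive power of two.
      static int roundDown(int limit, int alignment) {
        return limit & ~(alignment - 1);
      }

      public static void main(String[] args) {
        int current = Integer.MAX_VALUE - 8;   // 2147483639
        int aligned = roundDown(current, 8);   // 2147483632
        System.out.println(current % 8);                          // 7
        System.out.println(aligned == Integer.MAX_VALUE - 15);    // true
        System.out.println(aligned % 8);                          // 0
      }
    }
    ```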


