Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19266#discussion_r143421894
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java
---
@@ -35,6 +35,11 @@
* if the fields of row are all fixed-length, as the size of result row is also fixed.
*/
public class BufferHolder {
+
+ // Some JVMs can't allocate arrays of length Integer.MAX_VALUE; actual max is somewhat
+ // smaller. Be conservative and lower the cap a little.
+ private static final int ARRAY_MAX = Integer.MAX_VALUE - 8;
--- End diff --
@gatorsmile have a look at the JIRA for some detail; the JDK itself applies a
similar limit, for example at
http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/util/ArrayList.java#l229
You are right; I think the code around line 54 needs to be something
straightforward like:
```
long totalSize = initialSize + bitsetWidthInBytes + 8L * row.numFields();
if (totalSize > ARRAY_MAX) { ...error... }
this.buffer = new byte[(int) totalSize];
```
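To spell out the check above, here is a minimal standalone sketch. It is not the actual BufferHolder code: the class name `BufferSizeCheck`, the method `checkedTotalSize`, its parameter names, and the choice of `UnsupportedOperationException` are all assumptions made for illustration. One detail it makes explicit: the first operand is widened to `long` as well, because `int + int` is still evaluated in `int` before being added to the `8L * ...` term.

```java
// Hypothetical sketch only; names and exception type are illustrative assumptions.
final class BufferSizeCheck {
  // Conservative cap taken from the diff above.
  static final int ARRAY_MAX = Integer.MAX_VALUE - 8;

  // Returns the total buffer size as an int, or throws if it exceeds ARRAY_MAX.
  static int checkedTotalSize(int initialSize, int bitsetWidthInBytes, int numFields) {
    // Widen to long before adding so that the sum cannot wrap around.
    long totalSize = (long) initialSize + bitsetWidthInBytes + 8L * numFields;
    if (totalSize > ARRAY_MAX) {
      throw new UnsupportedOperationException(
          "Cannot allocate a buffer of " + totalSize + " bytes; the limit is " + ARRAY_MAX);
    }
    // Safe narrowing: totalSize is known to fit in an int here.
    return (int) totalSize;
  }

  public static void main(String[] args) {
    byte[] buffer = new byte[checkedTotalSize(64, 8, 4)];
    System.out.println(buffer.length);
  }
}
```

The point of the design is that the comparison happens in `long`, so an oversized request is rejected rather than silently wrapping to a small or negative `int`.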
Yes, I agree with your new JIRA, @liufengdb, though I think we'll need to go
the other way, to `Integer.MAX_VALUE - 15`, since the value must be divisible by 8.
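As a quick check of the arithmetic, the sketch below rounds the current cap down to a multiple of 8. The class `WordAlignedCap` and the helper `roundDownToWord` are hypothetical names used only for this illustration, and the 8-byte word size is taken from the divisibility requirement mentioned above.

```java
// Hypothetical sketch only; not part of the patch under review.
final class WordAlignedCap {
  // Clears the low three bits, i.e. rounds a non-negative int down to a multiple of 8.
  static int roundDownToWord(int n) {
    return n & ~7;
  }

  public static void main(String[] args) {
    int currentCap = Integer.MAX_VALUE - 8;
    int alignedCap = roundDownToWord(currentCap);
    // Compare the aligned cap with Integer.MAX_VALUE - 15 and check divisibility by 8.
    System.out.println(alignedCap == Integer.MAX_VALUE - 15);
    System.out.println(alignedCap % 8 == 0);
  }
}
```

`Integer.MAX_VALUE - 8` leaves a remainder of 7 when divided by 8, so the largest multiple of 8 that does not exceed it is `Integer.MAX_VALUE - 15`.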
---