Github user clockfly commented on a diff in the pull request:
https://github.com/apache/spark/pull/13829#discussion_r69049183
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java
---
@@ -55,6 +61,11 @@ public BufferHolder(UnsafeRow row, int initialSize) {
* Grows the buffer by at least neededSize and points the row to the buffer.
*/
public void grow(int neededSize) {
+ if (neededSize > Integer.MAX_VALUE / 2 - totalSize()) {
+ throw new UnsupportedOperationException(
+ "Cannot grow BufferHolder by size " + neededSize + " because the size after growing " +
+ "exceeds size limitation " + Integer.MAX_VALUE / 2);
+ }
final int length = totalSize() + neededSize;
if (buffer.length < length) {
// This will not happen frequently, because the buffer is re-used.
--- End diff --
@cloud-fan
Currently the limit for `neededSize + totalSize` is `Integer.MAX_VALUE / 2`.
I don't see that enlarging the limit to `Integer.MAX_VALUE` would make a big
difference: `Integer.MAX_VALUE / 2` is about 1 GB, and it is quite rare for a
single row to exceed this limit.
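
For reference, here is a minimal, standalone sketch of the arithmetic behind the check in the diff. It is not the actual `BufferHolder` code: the class `GrowLimitSketch`, the constant `LIMIT`, and both helper methods are hypothetical names introduced for illustration, and the sketch assumes `totalSize` is in `[0, LIMIT]` and `neededSize` is non-negative. It shows the value of `Integer.MAX_VALUE / 2` and why the comparison is written as `neededSize > LIMIT - totalSize` rather than `totalSize + neededSize > LIMIT`, since the latter can wrap around in `int` arithmetic.

```java
// Illustrative sketch only; names are hypothetical, not part of Spark.
class GrowLimitSketch {
    // Same value as the limit in the diff: 1073741823 bytes, just under 1 GiB.
    static final int LIMIT = Integer.MAX_VALUE / 2;

    // Overflow-safe form, mirroring the shape of the check in the diff.
    // Assumes 0 <= totalSize <= LIMIT, so LIMIT - totalSize cannot overflow.
    static boolean exceedsLimit(int totalSize, int neededSize) {
        return neededSize > LIMIT - totalSize;
    }

    // Naive form: totalSize + neededSize may wrap to a negative int,
    // in which case the comparison wrongly reports "within the limit".
    static boolean exceedsLimitNaive(int totalSize, int neededSize) {
        return totalSize + neededSize > LIMIT;
    }

    public static void main(String[] args) {
        System.out.println("LIMIT = " + LIMIT);

        // A request that fits exactly is accepted by both forms.
        System.out.println(exceedsLimit(LIMIT - 8, 8));
        System.out.println(exceedsLimitNaive(LIMIT - 8, 8));

        // A huge request: the safe form rejects it, the naive form wraps.
        System.out.println(exceedsLimit(LIMIT, Integer.MAX_VALUE));
        System.out.println(exceedsLimitNaive(LIMIT, Integer.MAX_VALUE));
    }
}
```

With the limit at `Integer.MAX_VALUE / 2`, the sum of two in-range sizes always fits in an `int`; the subtraction form additionally guards against a single oversized `neededSize`.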