GitHub user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21311#discussion_r189908375
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala
---
@@ -626,6 +618,29 @@ private[execution] final class LongToUnsafeRowMap(val mm: TaskMemoryManager, cap
     }
   }
+  private def grow(inputRowSize: Int): Unit = {
+    val neededNumWords = (cursor - Platform.LONG_ARRAY_OFFSET + 8 + inputRowSize + 7) / 8
+    if (neededNumWords > page.length) {
+      if (neededNumWords > (1 << 30)) {
+        throw new UnsupportedOperationException(
+          "Can not build a HashedRelation that is larger than 8G")
+      }
+      val newNumWords = math.max(neededNumWords, math.min(page.length * 2, 1 << 30))
+      if (newNumWords > ARRAY_MAX) {
--- End diff --
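We won't need this check now: `neededNumWords` has already been rejected if it
exceeds (1 << 30), and the doubled page length is clamped to (1 << 30), so
`newNumWords` is guaranteed to be at most (1 << 30), which is well below
`ARRAY_MAX`.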
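
To make the bound concrete, here is a minimal standalone sketch of the sizing
logic from the diff. `GrowBoundSketch`, `MaxWords`, and the `newNumWords`
helper are hypothetical names used only for illustration, and the values are
widened to `Long` for simplicity; this is not the code in `HashedRelation.scala`.

```scala
object GrowBoundSketch {
  // Hypothetical stand-in for the cap in the diff: (1 << 30) words of 8 bytes each.
  val MaxWords: Long = 1L << 30

  /** Mirrors the sizing arithmetic of grow(): returns the new page size in words. */
  def newNumWords(neededNumWords: Long, currentNumWords: Long): Long = {
    // Requests above the cap are rejected up front, so neededNumWords <= MaxWords below.
    if (neededNumWords > MaxWords) {
      throw new UnsupportedOperationException(
        "Can not build a HashedRelation that is larger than 8G")
    }
    // The doubling is clamped to MaxWords, so both arguments of max are <= MaxWords
    // and the result can never exceed MaxWords.
    math.max(neededNumWords, math.min(currentNumWords * 2, MaxWords))
  }
}
```

For example, `GrowBoundSketch.newNumWords(5L, 4L)` returns 8 (the page doubles),
while `GrowBoundSketch.newNumWords(10L, 4L)` returns 10 (the request exceeds the
doubled size). In neither branch can the result exceed `MaxWords`, which is why
a later comparison against a larger limit would never fire.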
---