Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/8174#discussion_r37019514
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala
---
@@ -247,40 +247,67 @@ private[joins] final class UnsafeHashedRelation(
}
override def writeExternal(out: ObjectOutput): Unit =
Utils.tryOrIOException {
- out.writeInt(hashTable.size())
-
- val iter = hashTable.entrySet().iterator()
- while (iter.hasNext) {
- val entry = iter.next()
- val key = entry.getKey
- val values = entry.getValue
-
- // write all the values as single byte array
- var totalSize = 0L
- var i = 0
- while (i < values.length) {
- totalSize += values(i).getSizeInBytes + 4 + 4
- i += 1
+ if (binaryMap != null) {
+ // This could happen when a cached broadcast object need to be dumped into disk to free memory
+ out.writeInt(binaryMap.numElements())
+
+ var buffer = new Array[Byte](64)
+ def write(addr: MemoryLocation, length: Int): Unit = {
+ if (buffer.length < length) {
+ buffer = new Array[Byte](length)
+ }
+ Platform.copyMemory(addr.getBaseObject, addr.getBaseOffset,
--- End diff --
It looks like this first buffers the entire object and then writes it out
in a single OutputStream.write call. To avoid needing a resizable buffer
here, you could instead use a loop and perform the memory copies and writes
in small chunks (say, 8 KB). I think there is code elsewhere that already
does this, so if we decide to go with this approach we could move that code
into a static helper method.
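
A rough sketch of what such a chunked helper might look like is below. The
names ChunkedWrite and writeInChunks are hypothetical and are not existing
Spark code. To stay self-contained, the source is a plain byte array and
System.arraycopy stands in for Platform.copyMemory.

```scala
import java.io.{ByteArrayOutputStream, OutputStream}

object ChunkedWrite {
  // Hypothetical helper (not part of Spark): writes `length` bytes of `src`,
  // starting at `offset`, to `out` through a fixed-size scratch buffer, so
  // the buffer never has to grow to the size of the largest record.
  def writeInChunks(
      src: Array[Byte],
      offset: Int,
      length: Int,
      out: OutputStream,
      chunkSize: Int = 8 * 1024): Unit = {
    require(chunkSize > 0, "chunkSize must be positive")
    val buffer = new Array[Byte](chunkSize)
    var written = 0
    while (written < length) {
      // Copy at most one chunk; the final chunk may be shorter.
      val n = math.min(chunkSize, length - written)
      // Assumption: in the real code this copy would be a
      // Platform.copyMemory call reading from the record's memory location.
      System.arraycopy(src, offset + written, buffer, 0, n)
      out.write(buffer, 0, n)
      written += n
    }
  }
}
```

The trade-off is more write calls per record in exchange for a small,
constant-size buffer. If the underlying stream is already buffered, the
extra calls should be cheap.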