[ https://issues.apache.org/jira/browse/SPARK-25827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Imran Rashid updated SPARK-25827:
---------------------------------
    Description: 

There are a couple of issues with replicating and remotely reading large encrypted blocks, which try to create buffers where they shouldn't. Part of this is the general problem of properly limiting array sizes, tracked under SPARK-25904; the rest is specific to encryption and to converting an EncryptedBlockData into a regular ByteBuffer.

*EDIT*: moved the general array-size work under SPARK-25904.

  was:
When replicating large blocks with encryption, we try to allocate an array of size {{Int.MaxValue}}, which is just a bit too big for the JVM (the maximum array size is slightly below {{Int.MaxValue}}). This is basically the same as SPARK-25704, just another case. In DiskStore:

{code}
val chunkSize = math.min(remaining, Int.MaxValue)
{code}

{noformat}
18/10/22 17:04:06 WARN storage.BlockManager: Failed to replicate rdd_1_1 to ..., failure #0
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
	at org.apache.spark.network.BlockTransferService.uploadBlockSync(BlockTransferService.scala:133)
	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$replicate(BlockManager.scala:1421)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1230)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
	...
Caused by: java.lang.RuntimeException: java.io.IOException: Destination failed while reading stream
	...
Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
	at org.apache.spark.storage.BlockManager$$anon$1$$anonfun$7.apply(BlockManager.scala:446)
	at org.apache.spark.storage.BlockManager$$anon$1$$anonfun$7.apply(BlockManager.scala:446)
	at org.apache.spark.storage.EncryptedBlockData.toChunkedByteBuffer(DiskStore.scala:221)
	at org.apache.spark.storage.BlockManager$$anon$1.onComplete(BlockManager.scala:449)
	...
{noformat}


> Replicating a block > 2gb with encryption fails
> -----------------------------------------------
>
>                 Key: SPARK-25827
>                 URL: https://issues.apache.org/jira/browse/SPARK-25827
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> There are a couple of issues with replicating and remotely reading large
> encrypted blocks, which try to create buffers where they shouldn't. Part of
> this is the general problem of properly limiting array sizes, tracked under
> SPARK-25904; the rest is specific to encryption and to converting an
> EncryptedBlockData into a regular ByteBuffer.
> *EDIT*: moved the general array-size work under SPARK-25904.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
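For reference, the failure mode above can be sketched in a few lines. This is a hedged illustration, not Spark's actual patch: the object and method names are hypothetical, and the {{Int.MaxValue - 8}} cap is an assumption mirroring the conservative limit several JDK collection classes use, since {{ByteBuffer.allocate(Int.MaxValue)}} exceeds the VM's real array-size limit as shown in the stack trace.

```scala
// Hypothetical sketch of safe chunk sizing for a large block.
// `maxArraySize` and `chunkSizes` are illustrative names, not Spark APIs.
object ChunkSizing {
  // Conservative cap below the JVM array-size limit (assumption: Int.MaxValue - 8,
  // the value commonly used by JDK collections).
  val maxArraySize: Int = Int.MaxValue - 8

  // Split `total` bytes into array-safe chunk sizes, instead of the buggy
  // `math.min(remaining, Int.MaxValue)` which can still overflow the VM limit.
  def chunkSizes(total: Long): Seq[Int] = {
    val sizes = scala.collection.mutable.ArrayBuffer.empty[Int]
    var remaining = total
    while (remaining > 0) {
      val chunk = math.min(remaining, maxArraySize.toLong).toInt
      sizes += chunk
      remaining -= chunk
    }
    sizes.toSeq
  }
}
```

With this cap, a block of exactly {{Int.MaxValue}} bytes becomes two chunks (one at the cap plus an 8-byte remainder) rather than a single allocation that triggers the {{OutOfMemoryError}}.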