[jira] [Resolved] (SPARK-25704) Replication of > 2GB block fails due to bad config default

Imran Rashid (JIRA) Fri, 19 Oct 2018 10:59:16 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-25704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Imran Rashid resolved SPARK-25704.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.4.0

Resolved by PR https://github.com/apache/spark/pull/22705

Commit to master 
https://github.com/apache/spark/commit/43717dee570dc41d71f0b27b8939f6297a029a02

to branch-2.4
https://github.com/apache/spark/commit/1001d2314275c902da519725da266a23b537e33a

> Replication of > 2GB block fails due to bad config default
> ----------------------------------------------------------
>
>                 Key: SPARK-25704
>                 URL: https://issues.apache.org/jira/browse/SPARK-25704
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Imran Rashid
>            Assignee: Imran Rashid
>            Priority: Major
>             Fix For: 2.4.0
>
>
> Replicating a block > 2GB currently fails because Spark tries to allocate a 
> ByteBuffer that is just a *bit* too large, due to a bad config default.  The 
> failure comes from this 
> [line|https://github.com/apache/spark/blob/cd40655965072051dfae65eabd979edff0e4d398/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L454]:
> {code}
> ChunkedByteBuffer.fromFile(tmpFile, conf.get(config.MEMORY_MAP_LIMIT_FOR_TESTS).toInt)
> {code}
> {{MEMORY_MAP_LIMIT_FOR_TESTS}} defaults to {{Integer.MAX_VALUE}}, which is 
> unfortunately slightly larger than the biggest array the JVM is willing to 
> allocate.  You'll see an exception like:
> {noformat}
> 18/10/09 21:21:54 WARN server.TransportChannelHandler: Exception in connection from /172.31.118.153:53534
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>         at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>         at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
>         at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$8.apply(ChunkedByteBuffer.scala:199)
>         at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$8.apply(ChunkedByteBuffer.scala:199)
>         at org.apache.spark.util.io.ChunkedByteBufferOutputStream.allocateNewChunkIfNeeded(ChunkedByteBufferOutputStream.scala:87)
>         at org.apache.spark.util.io.ChunkedByteBufferOutputStream.write(ChunkedByteBufferOutputStream.scala:75)
>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2315)
>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
>         at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$fromFile$1.apply$mcI$sp(ChunkedByteBuffer.scala:201)
>         at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$fromFile$1.apply(ChunkedByteBuffer.scala:201)
>         at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$fromFile$1.apply(ChunkedByteBuffer.scala:201)
>         at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>         at org.apache.spark.util.io.ChunkedByteBuffer$.fromFile(ChunkedByteBuffer.scala:202)
>         at org.apache.spark.util.io.ChunkedByteBuffer$.fromFile(ChunkedByteBuffer.scala:184)
>         at org.apache.spark.storage.BlockManager$$anon$1.onComplete(BlockManager.scala:454)
> {noformat}
> At least on my system, it's just 2 bytes too big :(
> {noformat}
> > scala -J-Xmx4G
> import java.nio.ByteBuffer
> scala> ByteBuffer.allocate(Integer.MAX_VALUE)
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
>   ... 30 elided
> scala> ByteBuffer.allocate(Integer.MAX_VALUE - 1)
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
>   ... 30 elided
> scala> ByteBuffer.allocate(Integer.MAX_VALUE - 2)
> res3: java.nio.ByteBuffer = java.nio.HeapByteBuffer[pos=0 lim=2147483645 cap=2147483645]
> {noformat}
> *Workaround*: Set "spark.storage.memoryMapLimitForTests" to something a bit 
> smaller, e.g. 2147483135 (that's Integer.MAX_VALUE - 512, in case the exact 
> limit is a bit different on other systems).
> This was introduced by SPARK-25422.  I'll file a PR shortly.
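
The arithmetic behind the workaround can be sketched in plain Scala without allocating anything. This is only an illustration, not the change made in the PR: the object `SafeChunking` and the helper `chunkSizes` are hypothetical names, and the 512-byte headroom is simply the value suggested in the workaround above. It computes the capped chunk size and shows how a block larger than 2GB would split into chunks that each stay under that cap.

```scala
object SafeChunking {
  // Assumption: 512 bytes of headroom, as in the workaround above.
  val Headroom: Int = 512

  // Integer.MAX_VALUE - 512 = 2147483135
  val SafeMaxChunkSize: Int = Int.MaxValue - Headroom

  /** Sizes of the chunks needed to hold totalBytes, none larger than maxChunkSize. */
  def chunkSizes(totalBytes: Long, maxChunkSize: Int = SafeMaxChunkSize): Seq[Int] = {
    require(totalBytes >= 0, "totalBytes must be non-negative")
    require(maxChunkSize > 0, "maxChunkSize must be positive")
    val fullChunks = (totalBytes / maxChunkSize).toInt
    val remainder = (totalBytes % maxChunkSize).toInt
    // All full chunks first, then one smaller trailing chunk if anything is left over.
    Seq.fill(fullChunks)(maxChunkSize) ++
      (if (remainder > 0) Seq(remainder) else Seq.empty[Int])
  }
}
```

For example, `SafeChunking.chunkSizes(3L * 1024 * 1024 * 1024)` describes a 3GB block as one chunk of 2147483135 bytes plus one of 1073742337 bytes, so no single allocation asks for `Integer.MAX_VALUE` bytes. To apply the workaround itself, the value can be passed like any other Spark property, e.g. `--conf spark.storage.memoryMapLimitForTests=2147483135` on `spark-submit`.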



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
