[
https://issues.apache.org/jira/browse/SPARK-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Patrick Wendell resolved SPARK-1145.
------------------------------------
Resolution: Fixed
Issue resolved by pull request 43
[https://github.com/apache/spark/pull/43]
> Memory mapping with many small blocks can cause JVM allocation failures
> -----------------------------------------------------------------------
>
> Key: SPARK-1145
> URL: https://issues.apache.org/jira/browse/SPARK-1145
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 0.9.0
> Reporter: Patrick Wendell
> Assignee: Patrick Wendell
> Fix For: 1.0.0
>
>
> During a shuffle each block or block segment is memory mapped to a file. When
> the segments are very small and there are a large number of them, the memory
> maps can start failing and eventually the JVM will terminate. It's not clear
> exactly what's happening, but it appears that when the JVM terminates about
> 265 MB of virtual address space is used by memory mapped files. This doesn't
> seem affected at all by `-XX:MaxDirectMemorySize` - AFAIK that option just
> gives the JVM a self-imposed limit on direct buffers rather than preventing
> it from running into OS limits.
> At the time of JVM failure it appears the overall OS memory becomes scarce,
> so it's possible there are overheads for each memory mapped file that are
> adding up here. One overhead is that the memory mapping occurs at the
> granularity of pages, so if blocks are really small there is natural overhead
> required to pad to the page boundary.
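That per-mapping padding adds up quickly. A back-of-envelope sketch (not from the original report; it assumes 4 KiB pages, which is typical on Linux x86-64, and uses the segment counts and sizes from the failure described below):

```java
// Rough check of the page-granularity overhead: with 4 KiB pages, every
// memory-mapped file consumes at least one full page of virtual address
// space, no matter how small the file is.
public class MapOverhead {
    public static void main(String[] args) {
        long pageSize = 4096;    // assumed page size (Linux x86-64)
        long mappings = 65_000;  // open mappings at the time of failure
        long fileBytes = 1_000;  // approximate shuffle segment size

        long mappedBytes = mappings * pageSize;   // each map rounds up to a page
        long payloadBytes = mappings * fileBytes; // actual data behind the maps

        System.out.printf("mapped: %d MiB, payload: %d MiB%n",
                mappedBytes >> 20, payloadBytes >> 20);
        // ~253 MiB of address space for ~61 MiB of data, in line with the
        // ~265 MB of mapped files observed when the JVM terminated.
    }
}
```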
> In the particular case where I saw this, the JVM was running 4 reducers, each
> of which was trying to access about 30,000 blocks, for a total of 120,000
> concurrent reads. At about 65,000 open files the JVM gave out. In this case
> each file was about 1,000 bytes.
> Users should really be coalescing or using fewer reducers if they have
> 1,000-byte shuffle files, but I expect this to happen nonetheless. My proposal
> is that if the file is smaller than a few pages, we should just read it into a
> Java buffer rather than memory map it. Memory mapping huge numbers of small
> files in the JVM is neither recommended nor good for performance, AFAIK.
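A minimal Java sketch of that idea (hypothetical; the class name and the `MIN_MAP_BYTES` threshold are made up for illustration, and the real change landed in `DiskStore.scala` via the pull request above):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of the proposed fix: segments smaller than a few pages are read
// into a plain ByteBuffer; only larger segments are memory mapped.
public class SmallSegmentReader {
    // Assumed cutoff of two 4 KiB pages; the actual threshold may differ.
    static final long MIN_MAP_BYTES = 2 * 4096;

    static ByteBuffer getBytes(Path file, long offset, long length)
            throws IOException {
        try (FileChannel channel =
                FileChannel.open(file, StandardOpenOption.READ)) {
            if (length < MIN_MAP_BYTES) {
                // Small segment: a direct read avoids the per-mapping
                // page-granularity and OS bookkeeping overhead.
                ByteBuffer buf = ByteBuffer.allocate((int) length);
                channel.position(offset);
                while (buf.hasRemaining()) {
                    if (channel.read(buf) < 0) break; // EOF
                }
                buf.flip();
                return buf;
            }
            // Large segment: memory mapping is still worthwhile.
            return channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
        }
    }
}
```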
> Below is the stack trace:
> {code}
> 14/02/27 08:32:35 ERROR storage.BlockManagerWorker: Exception handling buffer message
> java.io.IOException: Map failed
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:888)
> at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:89)
> at org.apache.spark.storage.BlockManager.getLocalBytes(BlockManager.scala:285)
> at org.apache.spark.storage.BlockManagerWorker.getBlock(BlockManagerWorker.scala:90)
> at org.apache.spark.storage.BlockManagerWorker.processBlockMessage(BlockManagerWorker.scala:69)
> at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
> at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at org.apache.spark.storage.BlockMessageArray.foreach(BlockMessageArray.scala:28)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at org.apache.spark.storage.BlockMessageArray.map(BlockMessageArray.scala:28)
> at org.apache.spark.storage.BlockManagerWorker.onBlockMessageReceive(BlockManagerWorker.scala:44)
> at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
> at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
> at org.apache.spark.network.ConnectionManager.org$apache$spark$network$ConnectionManager$$handleMessage(ConnectionManager.scala:512)
> at org.apache.spark.network.ConnectionManager$$anon$8.run(ConnectionManager.scala:478)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}
> And the JVM error log had a bunch of entries like this:
> {code}
> 7f4b48f89000-7f4b48f8a000 r--s 00000000 ca:30 1622077901 /mnt4/spark/spark-local-20140227020022-227c/26/shuffle_0_22312_38
> 7f4b48f8a000-7f4b48f8b000 r--s 00000000 ca:20 545892715 /mnt3/spark/spark-local-20140227020022-5ef5/3a/shuffle_0_26808_20
> 7f4b48f8b000-7f4b48f8c000 r--s 00000000 ca:50 1622480741 /mnt2/spark/spark-local-20140227020022-315b/1c/shuffle_0_29013_19
> 7f4b48f8c000-7f4b48f8d000 r--s 00000000 ca:30 10082610 /mnt4/spark/spark-local-20140227020022-227c/3b/shuffle_0_28002_9
> 7f4b48f8d000-7f4b48f8e000 r--s 00000000 ca:50 1622268539 /mnt2/spark/spark-local-20140227020022-315b/3e/shuffle_0_23983_17
> 7f4b48f8e000-7f4b48f8f000 r--s 00000000 ca:50 1083068239 /mnt2/spark/spark-local-20140227020022-315b/37/shuffle_0_25505_22
> 7f4b48f8f000-7f4b48f90000 r--s 00000000 ca:30 9921006 /mnt4/spark/spark-local-20140227020022-227c/31/shuffle_0_24072_95
> 7f4b48f90000-7f4b48f91000 r--s 00000000 ca:50 10441349 /mnt2/spark/spark-local-20140227020022-315b/20/shuffle_0_27409_47
> 7f4b48f91000-7f4b48f92000 r--s 00000000 ca:50 10406042 /mnt2/spark/spark-local-20140227020022-315b/0e/shuffle_0_26481_84
> 7f4b48f92000-7f4b48f93000 r--s 00000000 ca:50 1622268192 /mnt2/spark/spark-local-20140227020022-315b/14/shuffle_0_23818_92
> 7f4b48f93000-7f4b48f94000 r--s 00000000 ca:50 1082957628 /mnt2/spark/spark-local-20140227020022-315b/09/shuffle_0_22824_45
> 7f4b48f94000-7f4b48f95000 r--s 00000000 ca:20 1082199965 /mnt3/spark/spark-local-20140227020022-5ef5/00/shuffle_0_1429_13
> 7f4b48f95000-7f4b48f96000 r--s 00000000 ca:20 10940995 /mnt3/spark/spark-local-20140227020022-5ef5/38/shuffle_0_28705_44
> 7f4b48f96000-7f4b48f97000 r--s 00000000 ca:10 17456971 /mnt/spark/spark-local-20140227020022-b372/28/shuffle_0_23055_72
> 7f4b48f97000-7f4b48f98000 r--s 00000000 ca:30 9853895 /mnt4/spark/spark-local-20140227020022-227c/08/shuffle_0_22797_42
> 7f4b48f98000-7f4b48f99000 r--s 00000000 ca:20 1622089728 /mnt3/spark/spark-local-20140227020022-5ef5/27/shuffle_0_24017_97
> 7f4b48f99000-7f4b48f9a000 r--s 00000000 ca:50 1082937570 /mnt2/spark/spark-local-20140227020022-315b/24/shuffle_0_22291_38
> 7f4b48f9a000-7f4b48f9b000 r--s 00000000 ca:30 10056604 /mnt4/spark/spark-local-20140227020022-227c/2f/shuffle_0_27408_59
> {code}
--
This message was sent by Atlassian JIRA
(v6.2#6252)