[ https://issues.apache.org/jira/browse/SPARK-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell resolved SPARK-1145.
------------------------------------

    Resolution: Fixed

Issue resolved by pull request 43
[https://github.com/apache/spark/pull/43]

> Memory mapping with many small blocks can cause JVM allocation failures
> -----------------------------------------------------------------------
>
>                 Key: SPARK-1145
>                 URL: https://issues.apache.org/jira/browse/SPARK-1145
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.0
>            Reporter: Patrick Wendell
>            Assignee: Patrick Wendell
>             Fix For: 1.0.0
>
>
> During a shuffle, each block or block segment is memory mapped to a file.
> When the segments are very small and there are a large number of them, the
> memory maps can start failing and eventually the JVM will terminate. It's
> not clear exactly what's happening, but it appears that when the JVM
> terminates, about 265MB of virtual address space is used by memory mapped
> files. This doesn't seem affected at all by `-XX:MaxDirectMemorySize` -
> AFAIK that option just gives the JVM its own self-imposed limit rather
> than keeping it from running into OS limits.
>
> At the time of JVM failure, overall OS memory appears to become scarce, so
> it's possible there are per-mapping overheads that are adding up here. One
> such overhead is that memory mapping occurs at the granularity of pages,
> so if blocks are really small, each mapping carries unavoidable padding
> out to the page boundary.
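>
> As a rough illustration (assuming 4 KiB pages): a ~1000-byte block still
> occupies a full 4096-byte mapping, so 65,000 such mappings reserve about
> 65,000 x 4 KiB = ~254 MiB of virtual address space - close to the ~265MB
> observed at failure.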
>
> In the particular case where I saw this, the JVM was running 4 reducers,
> each of which was trying to access about 30,000 blocks, for a total of
> 120,000 concurrent reads. It failed at about 65,000 open files - notably
> close to the default Linux `vm.max_map_count` limit of 65530. In this
> case each file was about 1000 bytes.
>
> Users should really coalesce or use fewer reducers if they have 1000-byte
> shuffle files, but I expect this to happen nonetheless. My proposal was
> that if the file is smaller than a few pages, we should just read it into
> a Java buffer and not bother to memory map it. Memory mapping huge numbers
> of small files in the JVM is neither recommended nor good for performance,
> AFAIK.
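>
> A minimal sketch of that approach (the helper, its structure, and the
> threshold value here are illustrative assumptions, not the actual change
> from pull request 43):
> {code}
> import java.io.{File, IOException, RandomAccessFile}
> import java.nio.ByteBuffer
> import java.nio.channels.FileChannel.MapMode
>
> object SmallSegmentRead {
>   // Illustrative threshold: segments under a couple of 4 KiB pages are
>   // read into a plain buffer instead of being memory mapped.
>   val mapThreshold: Long = 2 * 4096
>
>   def getSegment(file: File, offset: Long, length: Long): ByteBuffer = {
>     val channel = new RandomAccessFile(file, "r").getChannel
>     try {
>       if (length < mapThreshold) {
>         // Small segment: copy into an ordinary heap buffer. This costs a
>         // read syscall but does not consume a page-granular mapping.
>         val buf = ByteBuffer.allocate(length.toInt)
>         channel.position(offset)
>         while (buf.hasRemaining) {
>           if (channel.read(buf) == -1) {
>             throw new IOException("Hit EOF before reading full segment")
>           }
>         }
>         buf.flip()
>         buf
>       } else {
>         // Large segment: mapping still wins - no copy, and the OS pages
>         // the data in lazily.
>         channel.map(MapMode.READ_ONLY, offset, length)
>       }
>     } finally {
>       channel.close()
>     }
>   }
> }
> {code}
>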
> Below is the stack trace:
> {code}
> 14/02/27 08:32:35 ERROR storage.BlockManagerWorker: Exception handling buffer message
> java.io.IOException: Map failed
>   at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:888)
>   at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:89)
>   at org.apache.spark.storage.BlockManager.getLocalBytes(BlockManager.scala:285)
>   at org.apache.spark.storage.BlockManagerWorker.getBlock(BlockManagerWorker.scala:90)
>   at org.apache.spark.storage.BlockManagerWorker.processBlockMessage(BlockManagerWorker.scala:69)
>   at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
>   at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at org.apache.spark.storage.BlockMessageArray.foreach(BlockMessageArray.scala:28)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at org.apache.spark.storage.BlockMessageArray.map(BlockMessageArray.scala:28)
>   at org.apache.spark.storage.BlockManagerWorker.onBlockMessageReceive(BlockManagerWorker.scala:44)
>   at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
>   at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
>   at org.apache.spark.network.ConnectionManager.org$apache$spark$network$ConnectionManager$$handleMessage(ConnectionManager.scala:512)
>   at org.apache.spark.network.ConnectionManager$$anon$8.run(ConnectionManager.scala:478)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}
> And the JVM error log had a bunch of entries like this:
> {code}
> 7f4b48f89000-7f4b48f8a000 r--s 00000000 ca:30 1622077901  /mnt4/spark/spark-local-20140227020022-227c/26/shuffle_0_22312_38
> 7f4b48f8a000-7f4b48f8b000 r--s 00000000 ca:20 545892715   /mnt3/spark/spark-local-20140227020022-5ef5/3a/shuffle_0_26808_20
> 7f4b48f8b000-7f4b48f8c000 r--s 00000000 ca:50 1622480741  /mnt2/spark/spark-local-20140227020022-315b/1c/shuffle_0_29013_19
> 7f4b48f8c000-7f4b48f8d000 r--s 00000000 ca:30 10082610    /mnt4/spark/spark-local-20140227020022-227c/3b/shuffle_0_28002_9
> 7f4b48f8d000-7f4b48f8e000 r--s 00000000 ca:50 1622268539  /mnt2/spark/spark-local-20140227020022-315b/3e/shuffle_0_23983_17
> 7f4b48f8e000-7f4b48f8f000 r--s 00000000 ca:50 1083068239  /mnt2/spark/spark-local-20140227020022-315b/37/shuffle_0_25505_22
> 7f4b48f8f000-7f4b48f90000 r--s 00000000 ca:30 9921006     /mnt4/spark/spark-local-20140227020022-227c/31/shuffle_0_24072_95
> 7f4b48f90000-7f4b48f91000 r--s 00000000 ca:50 10441349    /mnt2/spark/spark-local-20140227020022-315b/20/shuffle_0_27409_47
> 7f4b48f91000-7f4b48f92000 r--s 00000000 ca:50 10406042    /mnt2/spark/spark-local-20140227020022-315b/0e/shuffle_0_26481_84
> 7f4b48f92000-7f4b48f93000 r--s 00000000 ca:50 1622268192  /mnt2/spark/spark-local-20140227020022-315b/14/shuffle_0_23818_92
> 7f4b48f93000-7f4b48f94000 r--s 00000000 ca:50 1082957628  /mnt2/spark/spark-local-20140227020022-315b/09/shuffle_0_22824_45
> 7f4b48f94000-7f4b48f95000 r--s 00000000 ca:20 1082199965  /mnt3/spark/spark-local-20140227020022-5ef5/00/shuffle_0_1429_13
> 7f4b48f95000-7f4b48f96000 r--s 00000000 ca:20 10940995    /mnt3/spark/spark-local-20140227020022-5ef5/38/shuffle_0_28705_44
> 7f4b48f96000-7f4b48f97000 r--s 00000000 ca:10 17456971    /mnt/spark/spark-local-20140227020022-b372/28/shuffle_0_23055_72
> 7f4b48f97000-7f4b48f98000 r--s 00000000 ca:30 9853895     /mnt4/spark/spark-local-20140227020022-227c/08/shuffle_0_22797_42
> 7f4b48f98000-7f4b48f99000 r--s 00000000 ca:20 1622089728  /mnt3/spark/spark-local-20140227020022-5ef5/27/shuffle_0_24017_97
> 7f4b48f99000-7f4b48f9a000 r--s 00000000 ca:50 1082937570  /mnt2/spark/spark-local-20140227020022-315b/24/shuffle_0_22291_38
> 7f4b48f9a000-7f4b48f9b000 r--s 00000000 ca:30 10056604    /mnt4/spark/spark-local-20140227020022-227c/2f/shuffle_0_27408_59
> {code}
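>
> Each of these entries is a read-only, shared (r--s) file mapping exactly
> one page long (the address ranges differ by 0x1000), consistent with the
> page-granularity overhead described above: every tiny shuffle file
> occupies a full 4 KiB of virtual address space.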



--
This message was sent by Atlassian JIRA
(v6.2#6252)
