[jira] [Commented] (SPARK-10314) [CORE] RDD persisted to OFF_HEAP (Tachyon) gets "block rdd_x_x not found" exception when parallelism is bigger than the data split size
[ https://issues.apache.org/jira/browse/SPARK-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726551#comment-14726551 ]

Apache Spark commented on SPARK-10314:

User 'romansew' has created a pull request for this issue: https://github.com/apache/spark/pull/8562

Key: SPARK-10314
URL: https://issues.apache.org/jira/browse/SPARK-10314
Project: Spark
Issue Type: Bug
Components: Block Manager
Affects Versions: 1.4.1
Environment: Spark 1.4.1, Hadoop 2.6.0, Tachyon 0.6.4
Reporter: Xiaoyu Wang
Priority: Minor

Persisting an RDD to OFF_HEAP (Tachyon) storage raises a "block rdd_x_x not found" exception when the parallelism is bigger than the data split size.

With two slices for two elements:

{code}
val rdd = sc.parallelize(List(1, 2), 2)
rdd.persist(org.apache.spark.storage.StorageLevel.OFF_HEAP)
rdd.count()
{code}

this works. With three slices for two elements:

{code}
val rdd = sc.parallelize(List(1, 2), 3)
rdd.persist(org.apache.spark.storage.StorageLevel.OFF_HEAP)
rdd.count()
{code}

it raises the exception:

{noformat}
15/08/27 17:53:07 INFO SparkContext: Starting job: count at <console>:24
15/08/27 17:53:07 INFO DAGScheduler: Got job 0 (count at <console>:24) with 3 output partitions (allowLocal=false)
15/08/27 17:53:07 INFO DAGScheduler: Final stage: ResultStage 0 (count at <console>:24)
15/08/27 17:53:07 INFO DAGScheduler: Parents of final stage: List()
15/08/27 17:53:07 INFO DAGScheduler: Missing parents: List()
15/08/27 17:53:07 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at <console>:21), which has no missing parents
15/08/27 17:53:07 INFO MemoryStore: ensureFreeSpace(1096) called with curMem=0, maxMem=741196431
15/08/27 17:53:07 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1096.0 B, free 706.9 MB)
15/08/27 17:53:07 INFO MemoryStore: ensureFreeSpace(788) called with curMem=1096, maxMem=741196431
15/08/27 17:53:07 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 788.0 B, free 706.9 MB)
15/08/27 17:53:07 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:43776 (size: 788.0 B, free: 706.9 MB)
15/08/27 17:53:07 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:874
15/08/27 17:53:07 INFO DAGScheduler: Submitting 3 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at <console>:21)
15/08/27 17:53:07 INFO TaskSchedulerImpl: Adding task set 0.0 with 3 tasks
15/08/27 17:53:07 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1269 bytes)
15/08/27 17:53:07 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 1270 bytes)
15/08/27 17:53:07 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, PROCESS_LOCAL, 1270 bytes)
15/08/27 17:53:07 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
15/08/27 17:53:07 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/08/27 17:53:07 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
15/08/27 17:53:07 INFO CacheManager: Partition rdd_0_2 not found, computing it
15/08/27 17:53:07 INFO CacheManager: Partition rdd_0_1 not found, computing it
15/08/27 17:53:07 INFO CacheManager: Partition rdd_0_0 not found, computing it
15/08/27 17:53:07 INFO ExternalBlockStore: ExternalBlockStore started
15/08/27 17:53:08 WARN : tachyon.home is not set. Using /mnt/tachyon_default_home as the default value.
15/08/27 17:53:08 INFO : Tachyon client (version 0.6.4) is trying to connect master @ localhost/127.0.0.1:19998
15/08/27 17:53:08 INFO : User registered at the master localhost/127.0.0.1:19998 got UserId 109
15/08/27 17:53:08 INFO TachyonBlockManager: Created tachyon directory at /spark/spark-c6ec419f-7c7d-48a6-8448-c2431e761ea5/driver/spark-tachyon-20150827175308-6aa5
15/08/27 17:53:08 INFO : Trying to get local worker host : localhost
15/08/27 17:53:08 INFO : Connecting local worker @ localhost/127.0.0.1:29998
15/08/27 17:53:08 INFO : Folder /mnt/ramdisk/tachyonworker/users/109 was created!
15/08/27 17:53:08 INFO : /mnt/ramdisk/tachyonworker/users/109/4386235351040 was created!
15/08/27 17:53:08 INFO : /mnt/ramdisk/tachyonworker/users/109/4388382834688 was created!
15/08/27 17:53:08 INFO BlockManagerInfo: Added rdd_0_0 on ExternalBlockStore on localhost:43776 (size: 0.0 B)
15/08/27 17:53:08 INFO BlockManagerInfo: Added rdd_0_1 on ExternalBlockStore on localhost:43776 (size: 2.0 B)
15/08/27 17:53:08 INFO BlockManagerInfo: Added rdd_0_2 on ExternalBlockStore on localhost:43776 (size: 2.0 B)
15/08/27 17:53:08 INFO BlockManager: Found block rdd_0_1 locally
{noformat}
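A possible interim workaround, sketched here as an assumption rather than anything confirmed in this thread: the log shows rdd_0_0 stored with size 0.0 B, which suggests the failure involves the empty partitions created when the slice count exceeds the element count. Capping the slice count at the collection size avoids creating empty partitions in the first place (this snippet, like the reproduction above, assumes a Spark shell with `sc` in scope):

{code}
// Hypothetical workaround: never request more slices than elements,
// so every partition written to Tachyon is non-empty.
val data = List(1, 2)
val slices = math.min(3, data.size)  // 3 is the desired parallelism
val rdd = sc.parallelize(data, slices)
rdd.persist(org.apache.spark.storage.StorageLevel.OFF_HEAP)
rdd.count()
{code}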
[ https://issues.apache.org/jira/browse/SPARK-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726557#comment-14726557 ]

Xiaoyu Wang commented on SPARK-10314:

I resubmitted the pull request against the master branch.
[ https://issues.apache.org/jira/browse/SPARK-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725530#comment-14725530 ]

Xiaoyu Wang commented on SPARK-10314:

Yes. Is there any problem with the pull request? Do you need me to resubmit it against the master branch? The previous pull request was submitted against branch-1.4.
[ https://issues.apache.org/jira/browse/SPARK-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716456#comment-14716456 ]

Apache Spark commented on SPARK-10314:

User 'romansew' has created a pull request for this issue: https://github.com/apache/spark/pull/8482
[ https://issues.apache.org/jira/browse/SPARK-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716461#comment-14716461 ]

Xiaoyu Wang commented on SPARK-10314:

Here is the pull request: https://github.com/apache/spark/pull/8482