prakharjain09 commented on a change in pull request #27539: [SPARK-30786]
[CORE] Fix Block replication failure propogation issue in BlockManager
URL: https://github.com/apache/spark/pull/27539#discussion_r379329268
##########
File path: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala
##########
@@ -624,6 +624,49 @@ class BlockManagerSuite extends SparkFunSuite with Matchers with BeforeAndAfterE
assert(store.getRemoteBytes("list1").isEmpty)
}
+ Seq(false, true).foreach { stream =>
+ test(s"test for Block replication retry logic (stream = $stream)") {
+ // Retry replication logic for 2 failures
+ conf.set(STORAGE_MAX_REPLICATION_FAILURE, 2)
+ // Custom block replication policy which prioritizes BlockManagers as per hostnames
+ conf.set(STORAGE_REPLICATION_POLICY, classOf[SortOnHostNameBlockReplicationPolicy].getName)
+ // To use upload block stream flow, set maxRemoteBlockSizeFetchToMem to 0
+ val maxRemoteBlockSizeFetchToMem = if (stream) 0 else Int.MaxValue - 512
+ conf.set(MAX_REMOTE_BLOCK_SIZE_FETCH_TO_MEM, maxRemoteBlockSizeFetchToMem.toLong)
+ val blockManagers = (0 to 6).map(index => makeBlockManager(7800, s"host-$index", master))
+ val a1 = new Array[Byte](4000)
+ val a2 = new Array[Byte](4000)
+ val a3 = new Array[Byte](4000)
+ // Put 4000 byte sized RDD Blocks in block manager 1, 2 and 4. So that they
+ // won't have space for another 4000 bytes block.
+ blockManagers(1).putSingle(rdd(0, 1), a1, StorageLevel.MEMORY_ONLY)
+ blockManagers(2).putSingle(rdd(0, 2), a2, StorageLevel.MEMORY_ONLY)
+ blockManagers(4).putSingle(rdd(0, 3), a3, StorageLevel.MEMORY_ONLY)
Review comment:
@Ngone51 Different blocks of the same RDD shouldn't evict or replace each
other. Nevertheless, I have replaced the old test with a new one that uses a
mocked memory manager to simulate a putBlock failure. The build failed with
what looks like an intermittent issue. Could you please retrigger the build?
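
For illustration, the eviction rule referred to above can be modelled with a
small toy store. This is only a sketch under stated assumptions: the names
ToyMemoryStore and RddBlock are hypothetical and are not Spark classes, sizes
are plain byte counts, and eviction simply walks blocks in insertion order.
It shows why a second 4000-byte block of the same RDD cannot be stored when
only blocks of other RDDs are eviction candidates.

```scala
import scala.collection.mutable

// Hypothetical block id: (RDD id, partition index). Not Spark's RDDBlockId.
final case class RddBlock(rddId: Int, splitIndex: Int)

// Hypothetical toy store, not Spark's MemoryStore.
final class ToyMemoryStore(capacity: Long) {
  // LinkedHashMap keeps insertion order, so the oldest block is considered first.
  private val entries = mutable.LinkedHashMap.empty[RddBlock, Long]

  private def used: Long = entries.values.sum

  def contains(id: RddBlock): Boolean = entries.contains(id)

  /** Tries to store `size` bytes for a new block `id`.
    * Returns false, leaving the store untouched, if not enough space can be freed. */
  def put(id: RddBlock, size: Long): Boolean = {
    val needed = used + size - capacity
    // Assumed rule: only blocks that belong to a different RDD may be evicted.
    val candidates = entries.iterator.filter { case (other, _) => other.rddId != id.rddId }
    val victims = mutable.ArrayBuffer.empty[RddBlock]
    var freed = 0L
    while (freed < needed && candidates.hasNext) {
      val (victim, victimSize) = candidates.next()
      victims += victim
      freed += victimSize
    }
    if (freed >= needed) {
      victims.foreach(v => entries.remove(v))
      entries(id) = size
      true
    } else {
      false
    }
  }
}

object ToyMemoryStoreDemo {
  def main(args: Array[String]): Unit = {
    val store = new ToyMemoryStore(capacity = 5000L)
    println(store.put(RddBlock(0, 1), 4000L)) // fits in the empty store
    println(store.put(RddBlock(0, 2), 4000L)) // same RDD: nothing may be evicted
    println(store.put(RddBlock(1, 0), 4000L)) // different RDD: the first block is evicted
  }
}
```

In this model the second put fails because the only resident block belongs to
the same RDD, which mirrors the situation the original test tried to set up on
block managers 1, 2 and 4.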