attilapiros commented on issue #24554: [SPARK-27622][Core] Avoiding the network when block manager fetches disk persisted RDD blocks from the same host
URL: https://github.com/apache/spark/pull/24554#issuecomment-496262942

In the previous commit, "introduce getAndMapRemoteManagedBuf and open the block file early", the old `getRemoteManagedBuffer` was restructured a bit:
- the network fetching is extracted
- the transformation of `ManagedBuffer` to `BlockResult` and to `ChunkedByteBuffer` is passed down, so that it can be used to test reading the buffer from the local directory of the remote executor on the same host

Both of these usages (`getRemoteValues` and `getRemoteBytes`) involve opening the file, so this way we are guaranteed to read the block content. If the transformation fails because the file is deleted right before it would be opened, the process falls back to fetching from the network (covered by new tests).

The relevant tests:
```
[info] BlockManagerSuite:
[info] - SPARK-27622: avoid the network when block requested from same host, StorageLevel(disk, 1 replicas) (421 milliseconds)
[info] - SPARK-27622: avoid the network when block requested from same host, StorageLevel(disk, deserialized, 1 replicas) (85 milliseconds)
[info] - SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, 1 replicas), getRemoteValue() (83 milliseconds)
[info] - SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, 1 replicas), getRemoteBytes() (66 milliseconds)
[info] - SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, deserialized, 1 replicas), getRemoteValue() (81 milliseconds)
[info] - SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, deserialized, 1 replicas), getRemoteBytes() (58 milliseconds)
```

And with DEBUG mode the relevant output:
```
attilapiros@apiros-MBP ~/github/spark (SPARK-27622) $ grep -e "from the disk of a same host executor\|===" core/target/unit-tests.log
===== TEST OUTPUT FOR o.a.s.storage.BlockManagerSuite: 'SPARK-27622: avoid the network when block requested from same host, StorageLevel(disk, 1 replicas)' =====
19/05/27 18:36:40.434 pool-1-thread-1-ScalaTest-running-BlockManagerSuite DEBUG BlockManager: Read test_list from the disk of a same host executor is successful.
19/05/27 18:36:40.460 pool-1-thread-1-ScalaTest-running-BlockManagerSuite DEBUG BlockManager: Read test_list from the disk of a same host executor is successful.
===== FINISHED o.a.s.storage.BlockManagerSuite: 'SPARK-27622: avoid the network when block requested from same host, StorageLevel(disk, 1 replicas)' =====
===== TEST OUTPUT FOR o.a.s.storage.BlockManagerSuite: 'SPARK-27622: avoid the network when block requested from same host, StorageLevel(disk, deserialized, 1 replicas)' =====
19/05/27 18:36:40.597 pool-1-thread-1-ScalaTest-running-BlockManagerSuite DEBUG BlockManager: Read test_list from the disk of a same host executor is successful.
19/05/27 18:36:40.611 pool-1-thread-1-ScalaTest-running-BlockManagerSuite DEBUG BlockManager: Read test_list from the disk of a same host executor is successful.
===== FINISHED o.a.s.storage.BlockManagerSuite: 'SPARK-27622: avoid the network when block requested from same host, StorageLevel(disk, deserialized, 1 replicas)' =====
===== TEST OUTPUT FOR o.a.s.storage.BlockManagerSuite: 'SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, 1 replicas), getRemoteValue()' =====
19/05/27 18:36:40.705 pool-1-thread-1-ScalaTest-running-BlockManagerSuite DEBUG BlockManagerSuite$$anon$6: Read test_list from the disk of a same host executor is failed.
===== FINISHED o.a.s.storage.BlockManagerSuite: 'SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, 1 replicas), getRemoteValue()' =====
===== TEST OUTPUT FOR o.a.s.storage.BlockManagerSuite: 'SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, 1 replicas), getRemoteBytes()' =====
19/05/27 18:36:40.823 pool-1-thread-1-ScalaTest-running-BlockManagerSuite DEBUG BlockManagerSuite$$anon$6: Read test_list from the disk of a same host executor is failed.
===== FINISHED o.a.s.storage.BlockManagerSuite: 'SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, 1 replicas), getRemoteBytes()' =====
===== TEST OUTPUT FOR o.a.s.storage.BlockManagerSuite: 'SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, deserialized, 1 replicas), getRemoteValue()' =====
19/05/27 18:36:40.936 pool-1-thread-1-ScalaTest-running-BlockManagerSuite DEBUG BlockManagerSuite$$anon$6: Read test_list from the disk of a same host executor is failed.
===== FINISHED o.a.s.storage.BlockManagerSuite: 'SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, deserialized, 1 replicas), getRemoteValue()' =====
===== TEST OUTPUT FOR o.a.s.storage.BlockManagerSuite: 'SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, deserialized, 1 replicas), getRemoteBytes()' =====
19/05/27 18:36:41.039 pool-1-thread-1-ScalaTest-running-BlockManagerSuite DEBUG BlockManagerSuite$$anon$6: Read test_list from the disk of a same host executor is failed.
===== FINISHED o.a.s.storage.BlockManagerSuite: 'SPARK-27622: as file is removed fall back to network fetch, StorageLevel(disk, deserialized, 1 replicas), getRemoteBytes()' =====
```
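
For readers who want the idea without the Spark internals, here is a minimal standalone sketch of the "apply the transformation while the local file is open, otherwise fall back to the network" pattern described above. It is not the code from the PR: `SameHostReadSketch`, `readLocalOrFetch`, and the `fetchFromNetwork` parameter are hypothetical names, and plain `InputStream`s stand in for `ManagedBuffer`.

```scala
import java.io.{ByteArrayInputStream, File, FileInputStream, IOException, InputStream}
import java.nio.file.Files
import scala.io.Source
import scala.util.{Failure, Success, Try}

// Hypothetical illustration only; none of these names exist in Spark.
object SameHostReadSketch {

  // Try the same-host file first and run `transform` while the stream is
  // open, so a successful result means the content was really readable.
  // If the file is missing or unreadable, fall back to `fetchFromNetwork`.
  def readLocalOrFetch[T](localFile: File, fetchFromNetwork: () => Array[Byte])(
      transform: InputStream => T): T = {
    val localAttempt = Try {
      // FileInputStream throws FileNotFoundException if the file was deleted.
      val in = new FileInputStream(localFile)
      try transform(in) finally in.close()
    }
    localAttempt match {
      case Success(value) => value
      case Failure(_: IOException) =>
        // Local read failed: use the (assumed) network path instead.
        val in = new ByteArrayInputStream(fetchFromNetwork())
        try transform(in) finally in.close()
      case Failure(other) => throw other
    }
  }

  def main(args: Array[String]): Unit = {
    val file = File.createTempFile("block", ".bin")
    Files.write(file.toPath, "local".getBytes("UTF-8"))
    val readAll = (in: InputStream) => Source.fromInputStream(in, "UTF-8").mkString
    val network = () => "network".getBytes("UTF-8")

    println(readLocalOrFetch(file, network)(readAll)) // file exists: local read
    file.delete()
    println(readLocalOrFetch(file, network)(readAll)) // file removed: fallback
  }
}
```

Passing the transformation into the helper, rather than returning an unopened handle, is what closes the window in which the file could be deleted between the existence check and the actual read.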
