Hello, developers,

I am curious about the following two things, which seem to contradict
each other. Please help me find where my understanding is wrong:

1) Excerpted from the SOSP 2013 paper: "Then, when a node fails, the system
detects all missing RDD partitions and launches tasks to recompute them
from the last checkpoint. Many tasks can be launched at the same time to
compute different RDD partitions, allowing the whole cluster to partake in
recovery."

2) Excerpted from the code: this function is called when a BlockManager is
found dead. It does not launch any tasks to recover the lost partitions;
instead, it only updates metadata.

  private def removeBlockManager(blockManagerId: BlockManagerId) {
    val info = blockManagerInfo(blockManagerId)

    // Remove the block manager from blockManagerIdByExecutor.
    blockManagerIdByExecutor -= blockManagerId.executorId

    // Remove it from blockManagerInfo and remove all the blocks.
    blockManagerInfo.remove(blockManagerId)
    val iterator = info.blocks.keySet.iterator
    while (iterator.hasNext) {
      val blockId = iterator.next
      val locations = blockLocations.get(blockId)
      locations -= blockManagerId
      if (locations.size == 0) {
        blockLocations.remove(blockId)
      }
    }
  }
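
To make the question concrete, here is a small standalone sketch of how I
read the function. It is my own simplification and not Spark code: the
MasterModel class, the register and isLost helpers, and the use of plain
String ids in place of BlockManagerId and BlockId are all assumptions made
for illustration. The empty location entry is removed by block id here.

```scala
import scala.collection.mutable

// Simplified stand-in for the master's bookkeeping. All names are
// hypothetical; String ids replace BlockManagerId and BlockId.
class MasterModel {
  // Blocks held by each block manager.
  val blocksByManager = mutable.Map.empty[String, mutable.Set[String]]
  // Block managers that hold a copy of each block.
  val blockLocations = mutable.Map.empty[String, mutable.Set[String]]

  // Record that `manager` holds a copy of `block`.
  def register(manager: String, block: String): Unit = {
    blocksByManager.getOrElseUpdate(manager, mutable.Set.empty) += block
    blockLocations.getOrElseUpdate(block, mutable.Set.empty) += manager
  }

  // Same steps as the quoted function: only metadata changes,
  // and no task is launched to rebuild anything.
  def removeBlockManager(manager: String): Unit = {
    val blocks = blocksByManager.remove(manager).getOrElse(mutable.Set.empty[String])
    for (block <- blocks) {
      val locations = blockLocations(block)
      locations -= manager
      if (locations.isEmpty) blockLocations.remove(block)
    }
  }

  // A block with no location entry has no live copy left.
  def isLost(block: String): Boolean = !blockLocations.contains(block)
}
```

In this model, a block that lived only on the removed manager ends up with
no location at all, yet nothing on this path recomputes it. That is the
part I cannot reconcile with the paper's description.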

thanks,
dachuan.

-- 
Dachuan Huang
Cellphone: 614-390-7234
2015 Neil Avenue
Ohio State University
Columbus, Ohio
U.S.A.
43210
