[ https://issues.apache.org/jira/browse/HDFS-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871530#comment-17871530 ]
ASF GitHub Bot commented on HDFS-17542:
---------------------------------------

zhengchenyu commented on code in PR #6915:
URL: https://github.com/apache/hadoop/pull/6915#discussion_r1706335775


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/NumberReplicasStriped.java:
##########
@@ -0,0 +1,137 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdfs.server.blockmanagement;
+
+import org.apache.hadoop.thirdparty.com.google.common.base.Objects;
+
+import java.util.concurrent.ThreadLocalRandom;
+
+import static org.apache.hadoop.hdfs.server.blockmanagement.NumberReplicas.StoredReplicaState.LIVE;
+
+class NumberReplicasStriped extends NumberReplicas {
+
+  private DatanodeStorageInfo[] storages;
+  private StoredReplicaState[] states;
+  private boolean considerBusy;
+  private boolean[] busys;
+
+  NumberReplicasStriped(int totalBlock, boolean considerBusy) {
+    this.storages = new DatanodeStorageInfo[totalBlock];
+    this.states = new StoredReplicaState[totalBlock];
+    this.considerBusy = considerBusy;
+    if (this.considerBusy) {
+      this.busys = new boolean[totalBlock];
+    }
+  }
+
+  public void add(final StoredReplicaState state, final long value, int blockIndex,
+      DatanodeStorageInfo storage, boolean busy) {
+    if (this.storages[blockIndex] == null) {
+      this.storages[blockIndex] = storage;
+      this.states[blockIndex] = state;
+      if (this.considerBusy) {
+        this.busys[blockIndex] = busy;
+      }
+      this.add(state, 1);
+    } else if (state.getPriority() < this.states[blockIndex].getPriority()) {
+      this.subtract(this.states[blockIndex], 1);
+      this.storages[blockIndex] = storage;
+      this.states[blockIndex] = state;
+      if (this.considerBusy) {
+        this.busys[blockIndex] = busy;

Review Comment:
   busys should not be set in NumberReplicasStriped; we need a wrapper. I will change it if the proposal is accepted.


> EC: Optimize the EC block reconstruction.
> -----------------------------------------
>
>                 Key: HDFS-17542
>                 URL: https://issues.apache.org/jira/browse/HDFS-17542
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Chenyu Zheng
>            Assignee: Chenyu Zheng
>            Priority: Major
>              Labels: pull-request-available
>
> The current reconstruction process of EC blocks is based on the original
> contiguous blocks. It is mainly implemented through the work constructed by
> computeReconstructionWorkForBlocks.
> It can be roughly divided into three processes:
> * scheduleReconstruction
> * chooseTargets
> * validateReconstructionWork
> For ordinary contiguous blocks:
> * (1) scheduleReconstruction
> Select srcNodes as the source of the block copy according to the status of
> each replica of the block.
> * (2) chooseTargets
> Select the target of the copy.
> * (3) validateReconstructionWork
> Add the copy command to srcNode. srcNode receives the command through
> heartbeat and executes the block copy from src to target.
> For EC blocks:
> (1) and (2) are nearly the same. However, whether to perform a simple block
> copy or a block reconstruction for EC blocks is determined in (3). When some
> storage is busy, this may result in no work being generated, which leads to
> the problem described in HDFS-17516. Even if no block copy or block
> reconstruction is generated, pendingReconstruction and neededReconstruction
> are still updated until the block times out, which wastes the scheduling
> opportunity.
> Because the decision of whether to perform block copy or block reconstruction
> is made in (3), the otherwise unnecessary liveBusyBlockIndices and
> excludeReconstructedIndices are introduced. Many bugs are known to be related
> to them. These should be avoided.
> Improvements:
> * Move the work of deciding whether to copy or reconstruct blocks from (3)
> to (1).
> Such improvements are more conducive to implementing the explicit
> specification of the reconstruction block index mentioned in HDFS-16874.
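
To illustrate the proposed improvement, here is a minimal, self-contained sketch of making the copy-versus-reconstruct decision in step (1), before any target is chosen. It is not code from PR #6915. The names EcSchedulingSketch, ReplicaState, WorkType and decide are hypothetical and do not exist in Hadoop. The rule that a busy decommissioning internal block is rebuilt from the other internal blocks is an assumption made for illustration, and the sketch returns a single work type per scheduling round.

```java
import java.util.Arrays;

// Hypothetical sketch only: these types are NOT part of Hadoop or PR #6915.
class EcSchedulingSketch {
  enum ReplicaState { LIVE, DECOMMISSIONING, MISSING }
  enum WorkType { NONE, COPY, RECONSTRUCT }

  /**
   * Decide the work type for one striped block group during scheduling.
   * states[i] and busy[i] describe internal block index i;
   * dataBlocks is the number of data units (6 for RS-6-3).
   */
  static WorkType decide(ReplicaState[] states, boolean[] busy, int dataBlocks) {
    int readable = 0;            // non-busy internal blocks usable as sources
    boolean needsRebuild = false;
    boolean copyable = false;
    for (int i = 0; i < states.length; i++) {
      if (states[i] == ReplicaState.MISSING) {
        needsRebuild = true;
        continue;
      }
      if (busy[i]) {
        // Assumption: a busy decommissioning block cannot be copied,
        // so it has to be rebuilt from the other internal blocks.
        if (states[i] == ReplicaState.DECOMMISSIONING) {
          needsRebuild = true;
        }
        continue;
      }
      readable++;
      if (states[i] == ReplicaState.DECOMMISSIONING) {
        copyable = true;
      }
    }
    if (needsRebuild) {
      // Reconstruction needs at least dataBlocks readable sources.
      return readable >= dataBlocks ? WorkType.RECONSTRUCT : WorkType.NONE;
    }
    return copyable ? WorkType.COPY : WorkType.NONE;
  }

  public static void main(String[] args) {
    ReplicaState[] states = new ReplicaState[9];
    Arrays.fill(states, ReplicaState.LIVE);
    boolean[] busy = new boolean[9];
    states[2] = ReplicaState.DECOMMISSIONING;
    System.out.println(decide(states, busy, 6));
    states[7] = ReplicaState.MISSING;
    System.out.println(decide(states, busy, 6));
  }
}
```

Because the caller learns about NONE during scheduling, it could skip updating pendingReconstruction and neededReconstruction for that block instead of waiting for a timeout, which is the wasted scheduling opportunity described above.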