[ https://issues.apache.org/jira/browse/HDFS-17599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870759#comment-17870759 ]
ASF GitHub Bot commented on HDFS-17599:
---------------------------------------

hadoop-yetus commented on PR #6979:
URL: https://github.com/apache/hadoop/pull/6979#issuecomment-2267080793

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 17m 26s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 1s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 48m 58s | | trunk passed |
| +1 :green_heart: | compile | 1m 24s | | trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 :green_heart: | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 :green_heart: | checkstyle | 1m 13s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 22s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 9s | | trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 :green_heart: | javadoc | 1m 45s | | trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 :green_heart: | spotbugs | 3m 20s | | trunk passed |
| +1 :green_heart: | shadedclient | 40m 56s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 11s | | the patch passed |
| +1 :green_heart: | compile | 1m 16s | | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 :green_heart: | javac | 1m 16s | | the patch passed |
| +1 :green_heart: | compile | 1m 9s | | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 :green_heart: | javac | 1m 9s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 1m 1s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6979/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 80 unchanged - 0 fixed = 83 total (was 80) |
| +1 :green_heart: | mvnsite | 1m 14s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 55s | | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 :green_heart: | javadoc | 1m 32s | | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 :green_heart: | spotbugs | 3m 18s | | the patch passed |
| +1 :green_heart: | shadedclient | 41m 12s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 253m 15s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6979/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 48s | | The patch does not generate ASF License warnings. |
| | | | 423m 23s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.TestRollingUpgrade |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6979/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6979 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 7876e0c54b9e 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 7ade5bedb727e24d51a17a485c7113e4785d7b8c |
| Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6979/1/testReport/ |
| Max. process+thread count | 3046 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6979/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> Fix the mismatch between locations and indices for mover
> --------------------------------------------------------
>
>                 Key: HDFS-17599
>                 URL: https://issues.apache.org/jira/browse/HDFS-17599
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.3.0, 3.4.0
>            Reporter: Tao Li
>            Assignee: Tao Li
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2024-08-03-17-59-08-059.png, image-2024-08-03-18-00-01-950.png
>
>
> We set the EC policy to (6+3) and also had nodes in the ENTERING_MAINTENANCE state.
>
> When we moved the data of some directories from SSD to HDD, some blocks failed to move because the target disk was full, as shown in the figure below (blk_-9223372033441574269).
> When we tried to move them again, we got the error "{color:#ff0000}Replica does not exist{color}".
> From the fsck output we can see that the wrong block id (blk_-9223372033441574270) was used when moving the block.
>
> {*}Mover Logs{*}:
> !image-2024-08-03-17-59-08-059.png|width=741,height=85!
>
> {*}FSCK Info{*}:
> !image-2024-08-03-18-00-01-950.png|width=738,height=120!
>
> {*}Root Cause{*}:
> Similar to HDFS-16333, when the mover is initialized, only `LIVE` nodes are processed. As a result, the datanode in the `ENTERING_MAINTENANCE` state is filtered out of the locations when `DBlockStriped` is initialized, but the indices are not adjusted accordingly, so the lengths of locations and indices no longer match. The EC block then calculates the wrong block id when resolving an internal block (see `DBlockStriped#getInternalBlock`).
>
> We added debug logs, and a few key messages are shown below.
> {color:#ff0000}The result is an incorrect correspondence: xx.xx.7.31 -> -9223372033441574270{color}.
> {code:java}
> DBlock getInternalBlock(StorageGroup storage) {
>   // storage == xx.xx.7.31
>   // idxInLocs == 1: locations is [xx.xx.85.29:DISK, xx.xx.7.31:DISK, xx.xx.207.22:DISK,
>   // xx.xx.8.25:DISK, xx.xx.79.30:DISK, xx.xx.87.21:DISK, xx.xx.8.38:DISK];
>   // xx.xx.179.31, which is in the ENTERING_MAINTENANCE state, was filtered out
>   int idxInLocs = locations.indexOf(storage);
>   if (idxInLocs == -1) {
>     return null;
>   }
>   // idxInGroup == 2 (indices is [1,2,3,4,5,6,7,8], which was not adjusted after filtering)
>   byte idxInGroup = indices[idxInLocs];
>   // blkId: -9223372033441574272 + 2 = -9223372033441574270 (the wrong internal block)
>   long blkId = getBlock().getBlockId() + idxInGroup;
>   long numBytes = getInternalBlockLength(getNumBytes(), cellSize,
>       dataBlockNum, idxInGroup);
>   Block blk = new Block(getBlock());
>   blk.setBlockId(blkId);
>   blk.setNumBytes(numBytes);
>   DBlock dblk = new DBlock(blk);
>   dblk.addLocation(storage);
>   return dblk;
> }
> {code}
> {*}Solution{*}:
> When initializing DBlockStriped, if any location is filtered out, remove the corresponding element from indices so that the two arrays stay aligned.
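>
> Below is a minimal, self-contained sketch of that adaptation (illustrative only, not the actual patch: the class and method names are invented, plain strings stand in for the mover's StorageGroup objects, and the original position of the ENTERING_MAINTENANCE node in locations is assumed for the example). Whenever a location is rejected by the filter, the entry at the same position in indices is dropped with it, so indices[i] keeps describing locations.get(i).
> {code:java}
> import java.util.ArrayList;
> import java.util.Arrays;
> import java.util.List;
> import java.util.function.Predicate;
>
> final class StripedIndicesAdjusterSketch {
>
>   /** Parallel result: the surviving locations and their matching indices. */
>   static final class Filtered {
>     final List<String> locations = new ArrayList<>();
>     final List<Byte> indices = new ArrayList<>();
>   }
>
>   /**
>    * Drops every location rejected by the filter (e.g. a datanode that is
>    * ENTERING_MAINTENANCE rather than LIVE) and drops the entry at the same
>    * position in indices, keeping the two arrays aligned.
>    */
>   static Filtered filterTogether(List<String> locations, byte[] indices,
>                                  Predicate<String> keep) {
>     Filtered out = new Filtered();
>     for (int i = 0; i < locations.size(); i++) {
>       if (keep.test(locations.get(i))) {
>         out.locations.add(locations.get(i));
>         out.indices.add(indices[i]);
>       }
>     }
>     return out;
>   }
>
>   public static void main(String[] args) {
>     // Mirrors the report above, assuming the maintenance node originally sat
>     // right before xx.xx.7.31: a striped group whose internal block 0 is
>     // missing, so the raw indices are [1..8].
>     List<String> locations = Arrays.asList(
>         "xx.xx.85.29", "xx.xx.179.31", "xx.xx.7.31", "xx.xx.207.22",
>         "xx.xx.8.25", "xx.xx.79.30", "xx.xx.87.21", "xx.xx.8.38");
>     byte[] indices = {1, 2, 3, 4, 5, 6, 7, 8};
>
>     Filtered f = filterTogether(locations, indices,
>         loc -> !"xx.xx.179.31".equals(loc));
>
>     // Prints "xx.xx.7.31 -> internal index 3", i.e. block id
>     // -9223372033441574272 + 3 = -9223372033441574269, instead of the
>     // mismatched index 2 and the non-existent -9223372033441574270.
>     int i = f.locations.indexOf("xx.xx.7.31");
>     System.out.println(f.locations.get(i) + " -> internal index " + f.indices.get(i));
>   }
> }
> {code}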