[ 
https://issues.apache.org/jira/browse/HDFS-17920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18081441#comment-18081441
 ] 

ASF GitHub Bot commented on HDFS-17920:
---------------------------------------

hadoop-yetus commented on PR #8502:
URL: https://github.com/apache/hadoop/pull/8502#issuecomment-4469252200

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  18m 44s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  47m 25s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 47s |  |  trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04  |
   | +1 :green_heart: |  compile  |   1m 48s |  |  trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1  |
   | +1 :green_heart: |  checkstyle  |   1m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 56s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1  |
   | +1 :green_heart: |  spotbugs  |   4m 31s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 29s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1  |
   | +1 :green_heart: |  javac  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m 21s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8502/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2)  |
   | +1 :green_heart: |  mvnsite  |   1m 31s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04  |
   | +1 :green_heart: |  javadoc  |   1m  2s |  |  the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1  |
   | +1 :green_heart: |  spotbugs  |   4m 13s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  38m  5s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 282m 18s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 21s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 452m 56s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8502/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/8502 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux a72e6cce7bf3 5.15.0-174-generic #184-Ubuntu SMP Fri Mar 13 18:41:50 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b2eb47d7bb1741ccadf94294fa77454accc57752 |
   | Default Java | Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
   | Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8502/1/testReport/ |
   | Max. process+thread count | 2578 (vs. ulimit of 10000) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8502/1/console |
   | versions | git=2.43.0 maven=3.9.15 spotbugs=4.9.7 |
   | Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> TestDiskError.testShutdown can run into infinite loop
> -----------------------------------------------------
>
>                 Key: HDFS-17920
>                 URL: https://issues.apache.org/jira/browse/HDFS-17920
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: András Bokor
>            Priority: Critical
>              Labels: pull-request-available
>
> The test case checks whether the DN shuts down when there is a disk failure
> (which the test simulates).
> We found that when this feature does not work, for whatever reason,
> TestDiskError.testShutdown runs for a long time without finishing and
> consumes all the storage space. The log file is around 11 GB, but it grows
> further if the container is given more storage.
> Since the log file is huge and the test can run indefinitely, I suspect
> there is an infinite loop somewhere in the test.
> I checked which loops exist [in the test
> file|https://github.com/apache/hadoop/blob/734dd8a67cd6df56b59ff75aa43de57834a0d248/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java#L121];
> there aren't many, and with one exception, they all run only a few
> iterations:
> {code:java}
> DataNode dn = cluster.getDataNodes().get(dnIndex);
> for (int i=0; dn.isDatanodeUp(); i++) {
>   Path fileName = new Path("/test.txt"+i);
>   DFSTestUtil.createFile(fs, fileName, 1024, (short)2, 1L);
>   DFSTestUtil.waitReplication(fs, fileName, (short)2);
>   fs.delete(fileName, true);
> }
> {code}
> Here, we keep creating and deleting new files until the DataNode (DN) dies. I
> don't know how long the replication takes, but based on the file size and the
> replication factor of 2, it should happen quickly. This section is suspicious
> because if the test doesn't finish quickly (meaning the "bad" DN doesn't shut
> itself down), it’s conceivable that a vast number of file operations are
> generating a massive amount of logs. I ran a grep on the log file to see how
> many iterations were executed, and I found a line like this:
>  
> {code:java}
> BLOCK* allocate blk_1073970157_229333, replicas=127.0.0.1:34219, 127.0.0.1:39923 for /test.txt114166
> {code}
>  
> This indicates that this single unit test case generates over a hundred
> thousand file operations on its own. Based on the log I examined, which
> covers a half-hour window, the loop runs about 60 times per second; I'm
> not sure that rate is even plausible.
> Introducing some kind of interval plus a timeout would likely help, because
> as the test currently works, if the feature under test fails, you don't get
> an assertion error; you get an infinite loop.
> *Please note that* {*}we are not addressing the root cause of the possible 
> shutdown failure{*}; instead, we are targeting the resulting infinite loop 
> and the unnecessarily large log file.
> Also, I have set the priority to Critical (even though a unit test failure
> does not normally warrant that) because this issue can block the CI process.
>  
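
The "interval plus a timeout" idea from the description could look roughly like the sketch below. It is only an illustration and not the change made in PR #8502: the class BoundedLoopSketch, the helper runUntilStopped, and its parameters are hypothetical names, and a plain counter stands in for the DataNode so that the example runs without a cluster. In the real test, the condition would presumably be dn.isDatanodeUp() and the step would be the create / waitReplication / delete sequence quoted above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hadoop: a loop that pauses between
// iterations and fails with an exception instead of spinning forever.
class BoundedLoopSketch {

  /**
   * Repeats {@code step} while {@code keepGoing} returns true, sleeping
   * {@code intervalMs} after each iteration. Throws TimeoutException if
   * the condition is still true once {@code timeoutMs} has elapsed.
   * Returns the number of iterations that were executed.
   */
  public static int runUntilStopped(BooleanSupplier keepGoing, Runnable step,
      long intervalMs, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    int iterations = 0;
    while (keepGoing.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        // Surface the failure to the test runner instead of looping on.
        throw new TimeoutException("condition still true after " + timeoutMs
            + " ms and " + iterations + " iterations");
      }
      step.run();
      iterations++;
      // The pause caps the iteration rate and therefore the log volume.
      Thread.sleep(intervalMs);
    }
    return iterations;
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a DataNode that shuts down after three file operations.
    int[] remaining = {3};
    int n = runUntilStopped(() -> remaining[0] > 0, () -> remaining[0]--, 10L, 5_000L);
    System.out.println("stopped after " + n + " iterations");

    // Stand-in for a DataNode that never shuts down: the loop gives up.
    try {
      runUntilStopped(() -> true, () -> { }, 10L, 200L);
    } catch (TimeoutException e) {
      System.out.println("timed out: " + e.getMessage());
    }
  }
}
```

With a 10 ms interval, such a loop cannot exceed roughly 100 iterations per second, and a stuck DataNode produces a TimeoutException that the test can report as a failure rather than an ever-growing log file. The interval and timeout values shown are placeholders, not recommendations.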



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
