[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506058 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 29/Oct/20 04:49 Start Date: 29/Oct/20 04:49 Worklog Time Spent: 10m Work Description: aajisaka commented on a change in pull request #2408: URL: https://github.com/apache/hadoop/pull/2408#discussion_r513966454 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java ## @@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int range, dnIdxToDie = getDataNodeToKill(filePath); DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie); shutdownDataNode(dnToDie); + // wait enough time for the locations to be updated. + Thread.sleep(STALE_INTERVAL); Review comment: I could reproduce even without `-Pparallel-tests` ``` $ pwd /home/aajisaka/hadoop/hadoop-hdfs-project/hadoop-hdfs $ mvn test -Dtest=TestFileChecksum -Pnative ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 506058) Time Spent: 3h 40m (was: 3.5h) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, > org.apache.hadoop.hdfs.TestFileChecksum-output.txt, > org.apache.hadoop.hdfs.TestFileChecksum.txt > > Time Spent: 3h 40m > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack traces from two of them, Query7 and Query8.
> {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
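The patch under review replaces a missing wait with a fixed `Thread.sleep(STALE_INTERVAL)`, which still races if the NameNode takes longer than the interval to mark the dead DataNode stale, and wastes wall-clock time when it is faster. A minimal sketch of the alternative, assuming the hadoop-common test artifact (`org.apache.hadoop.test.GenericTestUtils`) is on the classpath and using a hypothetical `locationsUpdated` predicate supplied by the caller (for example, "the dead DataNode no longer appears in the file's block locations"); this is not the actual change in PR #2408:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

import org.apache.hadoop.test.GenericTestUtils;

/** Sketch: poll for a condition instead of sleeping for a fixed interval. */
final class PollingWaitSketch {

  static void awaitLocationsUpdated(Supplier<Boolean> locationsUpdated,
      long staleIntervalMs) throws TimeoutException, InterruptedException {
    // Re-check every 100 ms; return as soon as the condition holds, or throw
    // TimeoutException if it still fails after twice the stale interval.
    GenericTestUtils.waitFor(locationsUpdated::get, 100,
        (int) (2 * staleIntervalMs));
  }
}
```

A bounded poll shortens the common case, since it returns the moment the locations are updated, and turns the worst case into an explicit timeout failure rather than a later, harder-to-diagnose checksum error.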
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506054 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 29/Oct/20 04:40 Start Date: 29/Oct/20 04:40 Worklog Time Spent: 10m Work Description: aajisaka commented on a change in pull request #2408: URL: https://github.com/apache/hadoop/pull/2408#discussion_r513963136 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java ## @@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int range, dnIdxToDie = getDataNodeToKill(filePath); DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie); shutdownDataNode(dnToDie); + // wait enough time for the locations to be updated. + Thread.sleep(STALE_INTERVAL); Review comment: I could reproduce the failure locally: ``` $ ./start-build-env.sh $ mvn clean install -DskipTests -Pnative $ cd hadoop-hdfs-project/hadoop-hdfs $ mvn test -Pnative -Pparallel-tests ``` Attached the stdout in the JIRA: https://issues.apache.org/jira/secure/attachment/13014321/org.apache.hadoop.hdfs.TestFileChecksum-output.txt This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 506054) Time Spent: 3.5h (was: 3h 20m) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, > org.apache.hadoop.hdfs.TestFileChecksum-output.txt, > org.apache.hadoop.hdfs.TestFileChecksum.txt > > Time Spent: 3.5h > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack traces from two of them, Query7 and Query8.
> {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at >
[jira] [Updated] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15643: - Attachment: org.apache.hadoop.hdfs.TestFileChecksum.txt > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, > org.apache.hadoop.hdfs.TestFileChecksum-output.txt, > org.apache.hadoop.hdfs.TestFileChecksum.txt > > Time Spent: 3h 20m > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack trace in two of them Query7 and Query8. > {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} > > {code:bash} > Error Message > `/striped/stripedFileChecksum1': Fail to get block checksum for > LocatedStripedBlock{BP-1299291876-172.17.0.3-1602771356932:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:42217,DS-6c29e4b7-e4f1-4302-ad23-fb078f37d783,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41307,DS-3d824f14-3cd0-46b1-bef1-caa808bf278d,DISK], > >
[jira] [Updated] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15643: - Attachment: org.apache.hadoop.hdfs.TestFileChecksum-output.txt > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, > org.apache.hadoop.hdfs.TestFileChecksum-output.txt > > Time Spent: 3h 20m > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack trace in two of them Query7 and Query8. > {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} > > {code:bash} > Error Message > `/striped/stripedFileChecksum1': Fail to get block checksum for > LocatedStripedBlock{BP-1299291876-172.17.0.3-1602771356932:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:42217,DS-6c29e4b7-e4f1-4302-ad23-fb078f37d783,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41307,DS-3d824f14-3cd0-46b1-bef1-caa808bf278d,DISK], > >
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506034&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506034 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 29/Oct/20 03:38 Start Date: 29/Oct/20 03:38 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2421: URL: https://github.com/apache/hadoop/pull/2421#issuecomment-718338898 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 31s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 30m 14s | | trunk passed | | +1 :green_heart: | compile | 1m 18s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 1m 13s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 0m 49s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 24s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 11s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 55s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 28s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 3m 3s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 3m 0s | | trunk passed | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 13s | | the patch passed | | +1 :green_heart: | compile | 1m 12s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 1m 12s | | the patch passed | | +1 :green_heart: | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 1m 5s | | the patch passed | | +1 :green_heart: | checkstyle | 0m 39s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 11s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 14m 44s | | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 47s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 20s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 3m 7s | | the patch passed | _ Other Tests _ | | -1 :x: | unit | 59m 22s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2421/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | -1 :x: | asflicense | 0m 37s | [/patch-asflicense-problems.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2421/1/artifact/out/patch-asflicense-problems.txt) | The patch generated 4 ASF License warnings.
| | | | 145m 17s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.TestDFSStorageStateRecovery | | | hadoop.hdfs.TestSafeModeWithStripedFile | | | hadoop.hdfs.TestFileCreationClient | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.TestErasureCodingPolicies | | | hadoop.hdfs.TestDecommissionWithStriped | | | hadoop.hdfs.TestMiniDFSCluster | | | hadoop.hdfs.TestMultipleNNPortQOP | | | hadoop.hdfs.TestDFSStripedInputStream | | | hadoop.hdfs.TestFileAppend2 | | | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy | | | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.TestDatanodeDeath | | | hadoop.hdfs.TestErasureCodingMultipleRacks | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.hdfs.TestReadStripedFileWithDecoding | | | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.TestDFSClientSocketSize | | | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506020 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 29/Oct/20 02:49 Start Date: 29/Oct/20 02:49 Worklog Time Spent: 10m Work Description: amahussein commented on pull request #2408: URL: https://github.com/apache/hadoop/pull/2408#issuecomment-718325314 > I think the `keepLongStdio` option can be used https://www.jenkins.io/doc/pipeline/steps/junit/ > The option can be enabled by updating the `./Jenkinsfile` as follows: > > ```diff > -junit "${env.SOURCEDIR}/**/target/surefire-reports/*.xml" > +junit keepLongStdio: true, testResults: "${env.SOURCEDIR}/**/target/surefire-reports/*.xml" > ``` Thanks @aajisaka ! I am going to try it out. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 506020) Time Spent: 3h 10m (was: 3h) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log > > Time Spent: 3h 10m > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack traces from two of them, Query7 and Query8.
> {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at >
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506006&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506006 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 29/Oct/20 02:11 Start Date: 29/Oct/20 02:11 Worklog Time Spent: 10m Work Description: aajisaka commented on pull request #2408: URL: https://github.com/apache/hadoop/pull/2408#issuecomment-718313974 > do you guys know if it is possible to see the full logs of the unit test? I think the `keepLongStdio` option can be used https://www.jenkins.io/doc/pipeline/steps/junit/ The option can be enabled by updating the `./Jenkinsfile` as follows: ```diff -junit "${env.SOURCEDIR}/**/target/surefire-reports/*.xml" +junit keepLongStdio: true, testResults: "${env.SOURCEDIR}/**/target/surefire-reports/*.xml" ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 506006) Time Spent: 3h (was: 2h 50m) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log > > Time Spent: 3h > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack traces from two of them, Query7 and Query8.
> {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at >
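The quoted `junit` step normally sits in a `post` block of the Jenkins declarative pipeline so results are archived whether or not the build succeeds. A minimal sketch of that placement, assuming a simplified layout (the `agent`, stage names, and `SOURCEDIR` value are illustrative, not the actual Hadoop `./Jenkinsfile`):

```groovy
pipeline {
    agent any
    environment {
        SOURCEDIR = 'sourcedir'
    }
    stages {
        stage('test') {
            steps {
                // -fae keeps Maven going after failures so every module still
                // writes its surefire XML reports.
                sh 'mvn -fae test'
            }
        }
    }
    post {
        always {
            // keepLongStdio makes the JUnit plugin retain the full stdout and
            // stderr captured in the reports instead of truncating them.
            junit keepLongStdio: true,
                  testResults: "${env.SOURCEDIR}/**/target/surefire-reports/*.xml"
        }
    }
}
```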
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=506004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506004 ] ASF GitHub Bot logged work on HDFS-15654: - Author: ASF GitHub Bot Created on: 29/Oct/20 02:04 Start Date: 29/Oct/20 02:04 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2419: URL: https://github.com/apache/hadoop/pull/2419#issuecomment-718312065 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 31s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 33m 18s | | trunk passed | | +1 :green_heart: | compile | 1m 19s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 | | +1 :green_heart: | compile | 1m 11s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 | | +1 :green_heart: | checkstyle | 0m 49s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 22s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 10s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 54s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 | | +1 :green_heart: | javadoc | 1m 27s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 | | +0 :ok: | spotbugs | 3m 4s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 3m 2s | | trunk passed | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 25s | | the patch passed | | +1 :green_heart: | compile | 1m 12s | | the patch passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 | | +1 :green_heart: | javac | 1m 12s | | the patch passed | | +1 :green_heart: | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 | | +1 :green_heart: | javac | 1m 5s | | the patch passed | | +1 :green_heart: | checkstyle | 0m 43s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 12s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 14m 30s | | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 49s | | the patch passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 | | +1 :green_heart: | javadoc | 1m 14s | | the patch passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 | | +1 :green_heart: | findbugs | 3m 4s | | the patch passed | _ Other Tests _ | | -1 :x: | unit | 95m 19s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings.
| | | | 184m 30s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR | | | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.server.datanode.TestBPOfferService | | | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.TestGetFileChecksum | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2419 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 65b9eeb7a07d 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / bab5bf9743f | | Default Java | Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 | | Multi-JDK versions |
[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505997&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505997 ] ASF GitHub Bot logged work on HDFS-15624: - Author: ASF GitHub Bot Created on: 29/Oct/20 01:50 Start Date: 29/Oct/20 01:50 Worklog Time Spent: 10m Work Description: huangtianhua removed a comment on pull request #2377: URL: https://github.com/apache/hadoop/pull/2377#issuecomment-718304736 @brahmareddybattula Or maybe we can change the code so that the pre-existing storage types keep their ordinals? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505997) Time Spent: 4h 50m (was: 4h 40m) > Fix the SetQuotaByStorageTypeOp problem after updating hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 4h 50m > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values > of the StorageType enum. Setting a quota by storage type depends on > ordinal(), so quota settings may become invalid after an > upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505996 ] ASF GitHub Bot logged work on HDFS-15624: - Author: ASF GitHub Bot Created on: 29/Oct/20 01:48 Start Date: 29/Oct/20 01:48 Worklog Time Spent: 10m Work Description: huangtianhua removed a comment on pull request #2377: URL: https://github.com/apache/hadoop/pull/2377#issuecomment-718303977 @brahmareddybattula, OK, thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505996) Time Spent: 4h 40m (was: 4.5h) > Fix the SetQuotaByStorageTypeOp problem after updating hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 4h 40m > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values > of the StorageType enum. Setting a quota by storage type depends on > ordinal(), so quota settings may become invalid after an > upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222654#comment-17222654 ] huangtianhua commented on HDFS-15624: - [~ayushtkn], hi, I agree with you since the code is almost ready. Would you please help review it too? https://github.com/apache/hadoop/pull/2377 > Fix the SetQuotaByStorageTypeOp problem after updating hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 4.5h > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values > of the StorageType enum. Setting a quota by storage type depends on > ordinal(), so quota settings may become invalid after an > upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
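The upgrade hazard described in this issue is a general Java pitfall: `Enum.ordinal()` reflects declaration order, so any value persisted by ordinal (as the edit log's SetQuotaByStorageTypeOp does) decodes to the wrong constant once a new constant is inserted mid-enum. A self-contained sketch with hypothetical enums, not Hadoop's actual `StorageType`:

```java
// Hypothetical before/after enums illustrating the ordinal() pitfall.
enum OldType { RAM_DISK, SSD, DISK, ARCHIVE }
enum NewTypeInserted { RAM_DISK, NVDIMM, SSD, DISK, ARCHIVE } // inserted: ordinals shift
enum NewTypeAppended { RAM_DISK, SSD, DISK, ARCHIVE, NVDIMM } // appended: old ordinals stable

public class OrdinalPitfall {
  public static void main(String[] args) {
    // A quota op written before the upgrade records SSD by its ordinal, 1.
    int persisted = OldType.SSD.ordinal();
    // Replaying it against the reordered enum silently picks the wrong type:
    System.out.println(NewTypeInserted.values()[persisted]); // prints NVDIMM
    // Decoding by name, or appending new constants so old ordinals stay
    // stable (as suggested in the comments above), avoids the corruption:
    System.out.println(NewTypeInserted.valueOf("SSD"));      // prints SSD
  }
}
```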
[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505995&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505995 ] ASF GitHub Bot logged work on HDFS-15624: - Author: ASF GitHub Bot Created on: 29/Oct/20 01:39 Start Date: 29/Oct/20 01:39 Worklog Time Spent: 10m Work Description: huangtianhua commented on pull request #2377: URL: https://github.com/apache/hadoop/pull/2377#issuecomment-718304736 @brahmareddybattula Or maybe we can change the code so that the pre-existing storage types keep their ordinals? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505995) Time Spent: 4.5h (was: 4h 20m) > Fix the SetQuotaByStorageTypeOp problem after updating hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 4.5h > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values > of the StorageType enum. Setting a quota by storage type depends on > ordinal(), so quota settings may become invalid after an > upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505994&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505994 ] ASF GitHub Bot logged work on HDFS-15624: - Author: ASF GitHub Bot Created on: 29/Oct/20 01:36 Start Date: 29/Oct/20 01:36 Worklog Time Spent: 10m Work Description: huangtianhua commented on pull request #2377: URL: https://github.com/apache/hadoop/pull/2377#issuecomment-718303977 @brahmareddybattula, OK, thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505994) Time Spent: 4h 20m (was: 4h 10m) > Fix the SetQuotaByStorageTypeOp problem after updating hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 4h 20m > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values > of the StorageType enum. Setting a quota by storage type depends on > ordinal(), so quota settings may become invalid after an > upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505986&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505986 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 29/Oct/20 01:11 Start Date: 29/Oct/20 01:11 Worklog Time Spent: 10m Work Description: amahussein opened a new pull request #2421: URL: https://github.com/apache/hadoop/pull/2421 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505986) Time Spent: 2h 50m (was: 2h 40m) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log > > Time Spent: 2h 50m > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack traces from two of them, Query7 and Query8. > {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at >
org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at >
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505969 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 28/Oct/20 23:37 Start Date: 28/Oct/20 23:37 Worklog Time Spent: 10m Work Description: amahussein commented on a change in pull request #2408: URL: https://github.com/apache/hadoop/pull/2408#discussion_r513824033 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java ## @@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int range, dnIdxToDie = getDataNodeToKill(filePath); DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie); shutdownDataNode(dnToDie); + // wait enough time for the locations to be updated. + Thread.sleep(STALE_INTERVAL); Review comment: Thanks @goiri! Those are the logs I was looking at. All logs in `TestFileChecksum` and `TestFileChecksumCompositeCrc` truncate the last 9 seconds prior to the failure. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505969) Time Spent: 2h 40m (was: 2.5h) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log > > Time Spent: 2h 40m > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack traces from two of them, Query7 and Query8.
> {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at >
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505966&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505966 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 28/Oct/20 23:31 Start Date: 28/Oct/20 23:31 Worklog Time Spent: 10m Work Description: goiri commented on a change in pull request #2408: URL: https://github.com/apache/hadoop/pull/2408#discussion_r513822007 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java ## @@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int range, dnIdxToDie = getDataNodeToKill(filePath); DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie); shutdownDataNode(dnToDie); + // wait enough time for the locations to be updated. + Thread.sleep(STALE_INTERVAL); Review comment: I see there is truncation there too though. We may want to make the test a little less verbose. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505966) Time Spent: 2.5h (was: 2h 20m) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log > > Time Spent: 2.5h > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack traces from two of them, Query7 and Query8.
> {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at >
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505965 ]

ASF GitHub Bot logged work on HDFS-15643:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:29
Start Date: 28/Oct/20 23:29
Worklog Time Spent: 10m

Work Description: goiri commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513821613

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
## @@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int range,
     dnIdxToDie = getDataNodeToKill(filePath);
     DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
     shutdownDataNode(dnToDie);
+    // wait enough time for the locations to be updated.
+    Thread.sleep(STALE_INTERVAL);

Review comment: I am not sure exactly which test you are interested in, but here you can see the full logs of one of the failed tests: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2408/2/testReport/org.apache.hadoop.hdfs/TestFileChecksum/testStripedFileChecksumWithMissedDataBlocksRangeQuery11/ Here are all the tests: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2408/2/testReport/org.apache.hadoop.hdfs/

Issue Time Tracking
-------------------
Worklog Id: (was: 505965)
Time Spent: 2h 20m (was: 2h 10m)
[jira] [Commented] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222606#comment-17222606 ]

Íñigo Goiri commented on HDFS-15654:
------------------------------------

Thanks [~ahussein] for the fix. Merged PR 2419.

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> ---------------------------------------------------------------------
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}} is flaky. It fails
> randomly when the following expression is not true:
> {code:java}
> assertTrue(fullBlockReportCount == totalTestBlocks ||
>     incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to synchronize
> between concurrent threads. The code below is causing the non-deterministic
> execution: on a slow server, {{addNewBlockThread}} may not be done by the time
> the main thread reaches the assertion call.
> {code:java}
> // Verify FBR/IBR count is equal to generate number.
> assertTrue(fullBlockReportCount == totalTestBlocks ||
>     incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
> // the thread finished execution.
> addNewBlockThread.join();
> // Verify FBR/IBR count is equal to generate number.
> assertTrue(fullBlockReportCount == totalTestBlocks ||
>     incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs to have a longer wait time too. 1 second is
> not enough to satisfy the condition.
> {code:java}
> DataNodeFaultInjector.set(new DataNodeFaultInjector() {
>   public void blockUtilSendFullBlockReport() {
>     try {
>       GenericTestUtils.waitFor(() -> {
>         if (count.get() > 2000) {
>           return true;
>         }
>         return false;
>       }, 100, 1); // increase that waiting time to 10 seconds.
>     } catch (Exception e) {
>       e.printStackTrace();
>     }
>   }
> });
> {code}
> {code:bash}
> Stacktrace
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> {code}
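The last snippet in the description is truncated in this digest; read together with the surrounding text ("increase that waiting time to 10 seconds"), the intended injector looks roughly like the following sketch. The field names mirror the test's own ({{count}} is its AtomicInteger), and the 10-second bound comes from the description's wording, not from the committed code:

{code:java}
DataNodeFaultInjector.set(new DataNodeFaultInjector() {
  public void blockUtilSendFullBlockReport() {
    try {
      // Poll every 100 ms; give up after 10 s instead of the old 1 s.
      GenericTestUtils.waitFor(() -> count.get() > 2000, 100, 10000);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
});
{code}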
[jira] [Assigned] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Íñigo Goiri reassigned HDFS-15654:
----------------------------------
Assignee: Ahmed Hussein
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505964 ]

ASF GitHub Bot logged work on HDFS-15654:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:24
Start Date: 28/Oct/20 23:24
Worklog Time Spent: 10m

Work Description: goiri merged pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419

Issue Time Tracking
-------------------
Worklog Id: (was: 505964)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Resolved] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Íñigo Goiri resolved HDFS-15654.
--------------------------------
Fix Version/s: 3.4.0
Hadoop Flags: Reviewed
Resolution: Fixed
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505962 ]

ASF GitHub Bot logged work on HDFS-15643:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:13
Start Date: 28/Oct/20 23:13
Worklog Time Spent: 10m

Work Description: amahussein commented on pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#issuecomment-718261099

@goiri and @aajisaka, do you guys know if it is possible to see the full logs of the unit test? The Yetus console and test reports show truncated logs, so I cannot see the sequence of events that leads to the exception and the stack trace. I cannot reproduce the failure locally either :/

```
...[truncated 896674 chars]... sed]. Total timeout mills is 48, 479813 millis timeout left.
at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:351)
```

Issue Time Tracking
-------------------
Worklog Id: (was: 505962)
Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505961 ]

ASF GitHub Bot logged work on HDFS-15654:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:08
Start Date: 28/Oct/20 23:08
Worklog Time Spent: 10m

Work Description: amahussein commented on pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#issuecomment-718259197

> This LGTM. @amahussein good to merge?

Yes, this is good to go. Thanks @goiri

Issue Time Tracking
-------------------
Worklog Id: (was: 505961)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505959&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505959 ]

ASF GitHub Bot logged work on HDFS-15654:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:06
Start Date: 28/Oct/20 23:06
Worklog Time Spent: 10m

Work Description: goiri commented on pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#issuecomment-718258543

This LGTM. @amahussein good to merge?

Issue Time Tracking
-------------------
Worklog Id: (was: 505959)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505863&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505863 ]

ASF GitHub Bot logged work on HDFS-15654:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 28/Oct/20 18:48
Start Date: 28/Oct/20 18:48
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#issuecomment-718136892

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 0m 31s | | Docker mode activated. |
| _ Prechecks _ | | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. |
| _ trunk Compile Tests _ | | | | |
| +1 :green_heart: | mvninstall | 33m 42s | | trunk passed |
| +1 :green_heart: | compile | 1m 21s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +1 :green_heart: | checkstyle | 0m 51s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 20s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 22s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 55s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | javadoc | 1m 23s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +0 :ok: | spotbugs | 3m 7s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 4s | | trunk passed |
| _ Patch Compile Tests _ | | | | |
| +1 :green_heart: | mvninstall | 1m 12s | | the patch passed |
| +1 :green_heart: | compile | 1m 12s | | the patch passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | javac | 1m 12s | | the patch passed |
| +1 :green_heart: | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +1 :green_heart: | javac | 1m 5s | | the patch passed |
| +1 :green_heart: | checkstyle | 0m 38s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 11s | | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 14m 14s | | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 48s | | the patch passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | javadoc | 1m 19s | | the patch passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +1 :green_heart: | findbugs | 3m 4s | | the patch passed |
| _ Other Tests _ | | | | |
| -1 :x: | unit | 95m 33s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 43s | | The patch does not generate ASF License warnings. |
| | | | 184m 48s | | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2419 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 64d68b4245d3 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / b3ba74d72df |
| Default Java | Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| Test Results |
[jira] [Commented] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1773#comment-1773 ]

Ayush Saxena commented on HDFS-15624:
-------------------------------------

HDFS-15660 is handling a different issue; the things getting fixed here are quite specific to NVDIMM, like the broken FsImage compatibility caused by the change in the ordinal of the storage types, and the rolling-upgrade issue.

Regarding HDFS-15660: that tends to handle the exception due to a missing storage type at the client, where decoding the new storage type from the protobuf response raises an exception. That would happen with NVDIMM as well, in the case of a 3.3.0 client and a 3.4.0 server, but that is not what is being chased here.

No point holding this, IMO. I didn't check what stage the code is at now, considering Vinay was following it; I think it is almost at a conclusion, and we should get this in and sort out the mess from HDFS-15025. I am OK holding it as well, but in that case we should revert HDFS-15025, so that if this doesn't get concluded for any reason, getting rid of it later isn't a problem, and it even prevents someone from backporting the original into their internal versions.

[~vinayakumarb] / [~liuml07], let me know if you folks too want to hold this for HDFS-15660 (that one won't be quick and is quite different as well); in that case I will revert the original by tomorrow EOD and we can track it there.

> Fix the SetQuotaByStorageTypeOp problem after updating hadoop
> --------------------------------------------------------------
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: YaYun Wang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4h 10m
> Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() of the
> StorageType enum. Setting a quota by storage type depends on the ordinal(),
> so quota settings may become invalid after an upgrade.
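To see why persisting {{ordinal()}} is fragile, consider this self-contained illustration; the enum bodies are simplified stand-ins, not Hadoop's actual {{StorageType}} declarations:

{code:java}
// V1 is the layout an old fsimage was written against; V2 inserts NVDIMM.
enum StorageTypeV1 { RAM_DISK, SSD, DISK, ARCHIVE }
enum StorageTypeV2 { RAM_DISK, NVDIMM, SSD, DISK, ARCHIVE }

public class OrdinalPitfall {
  public static void main(String[] args) {
    // The old version persisted a quota keyed by SSD's ordinal: 1.
    int persisted = StorageTypeV1.SSD.ordinal();
    // The upgraded version decodes ordinal 1 against the new layout...
    StorageTypeV2 decoded = StorageTypeV2.values()[persisted];
    System.out.println(decoded); // prints NVDIMM: the quota now targets the wrong type
  }
}
{code}

This is the class of breakage the comment above attributes to HDFS-15025: any on-disk format keyed by ordinal breaks when a constant is inserted anywhere but the end of the enum.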
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505794&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505794 ]

ASF GitHub Bot logged work on HDFS-15643:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 28/Oct/20 16:25
Start Date: 28/Oct/20 16:25
Worklog Time Spent: 10m

Work Description: amahussein commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513584215

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
## @@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int range,
     dnIdxToDie = getDataNodeToKill(filePath);
     DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
     shutdownDataNode(dnToDie);
+    // wait enough time for the locations to be updated.
+    Thread.sleep(STALE_INTERVAL);

Review comment: I see. The problem is that I cannot reproduce it on my local machine; however, it seems to fail in a consistent way on Yetus. If it is not a real bug, I wonder if the volume scanner could be a factor in randomly slowing down the DNs. I see many log messages from the volume scanner when I run locally.

Issue Time Tracking
-------------------
Worklog Id: (was: 505794)
Time Spent: 2h (was: 1h 50m)
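If the volume-scanner suspicion pans out, a common pattern in HDFS tests is to switch the scanner off before starting the mini cluster. A hedged sketch follows; the nine-node count is illustrative, and this is a debugging aid rather than a proposed fix:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

Configuration conf = new HdfsConfiguration();
// A byte budget of 0 disables the block/volume scanner entirely.
conf.setLong(DFSConfigKeys.DFS_BLOCK_SCANNER_VOLUME_BYTES_PER_SECOND, 0L);
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
    .numDataNodes(9)
    .build();
{code}

With the scanner quiet, its log noise disappears too, which would also help with the truncated-log problem mentioned earlier in the thread.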
[jira] [Updated] (HDFS-15644) Failed volumes can cause DNs to stop block reporting
[ https://issues.apache.org/jira/browse/HDFS-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-15644:
-----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)

Thanks!

> Failed volumes can cause DNs to stop block reporting
> -----------------------------------------------------
>
> Key: HDFS-15644
> URL: https://issues.apache.org/jira/browse/HDFS-15644
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: block placement, datanode
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Major
> Labels: refactor
> Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3
>
> Attachments: HDFS-15644-branch-2.10.002.patch, HDFS-15644.001.patch, HDFS-15644.002.patch
>
> [~daryn] found a corner case where removing failed volumes can cause an NPE in
> [FsDataSetImpl.getBlockReports()|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1939].
> +Scenario:+
> * Inside {{DataNode#handleVolumeFailures()}}, removing a failed volume is a 2-step process:
> ** First, it is removed from the volumes list.
> ** Later, the replicas are scrubbed from the volume map.
> * A concurrent thread generating block reports may access the replicaMap with a VolumeID that no longer exists.
> He made a fix for that, and we have been using it on our clusters since Hadoop 2.7.
> Analyzing the code shows the bug is still applicable to trunk:
> * The path {{DataNode#removeVolumes()}} is safe because the two-step process in {{FsDatasetImpl.removeVolumes()}}
> [FsDatasetImpl.java#L577|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L577]
> is protected by {{datasetWriteLock}}.
> * The path {{DataNode#handleVolumeFailures()}} is not safe because the failed volume is removed from the list without acquiring {{datasetWriteLock}}.
> [FsVolumeList#239|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java#L239]
> The race condition can cause the caller of getBlockReports() to throw an NPE if an RUR replica refers to a volume that has already been removed
> [FsDatasetImpl.java#L1976|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1976]:
> {code:java}
> case RUR:
>   ReplicaInfo orig = b.getOriginalReplica();
>   builders.get(volStorageID).add(orig);
>   break;
> {code}
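A minimal sketch of the locking discipline the description calls for, assuming a plain {{ReentrantReadWriteLock}} in place of the dataset lock and simplified collection types; this is illustrative, not the FsDatasetImpl code:

{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class VolumeRegistry {
  private final ReentrantReadWriteLock datasetLock = new ReentrantReadWriteLock();
  private final List<String> volumes;                  // stand-in for the volume list
  private final Map<String, List<String>> replicaMap;  // storageID -> replicas

  VolumeRegistry(List<String> volumes, Map<String, List<String>> replicaMap) {
    this.volumes = volumes;
    this.replicaMap = replicaMap;
  }

  // Both steps of the removal happen under the write lock, so a reader can
  // never observe a volume that is gone from the list but still referenced
  // by the replica map (the NPE scenario above).
  void handleVolumeFailure(String storageId) {
    datasetLock.writeLock().lock();
    try {
      volumes.remove(storageId);     // step 1: drop from the volume list
      replicaMap.remove(storageId);  // step 2: scrub its replicas
    } finally {
      datasetLock.writeLock().unlock();
    }
  }

  // Readers such as a block-report generator take the read lock for a consistent view.
  List<String> replicasFor(String storageId) {
    datasetLock.readLock().lock();
    try {
      return replicaMap.get(storageId);
    } finally {
      datasetLock.readLock().unlock();
    }
  }
}
{code}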
[jira] [Updated] (HDFS-15644) Failed volumes can cause DNs to stop block reporting
[ https://issues.apache.org/jira/browse/HDFS-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-15644:
-----------------------------------
Fix Version/s: 2.10.2
[jira] [Commented] (HDFS-15644) Failed volumes can cause DNs to stop block reporting
[ https://issues.apache.org/jira/browse/HDFS-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1754#comment-1754 ]

Wei-Chiu Chuang commented on HDFS-15644:
----------------------------------------

I tried to verify that the tests are good, but TestFsDatasetImpl#testReportBadBlocks times out consistently; then I realized it times out consistently even without this patch. I'll keep digging into it.

+1 for the branch-2.10 patch.
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505788=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505788 ] ASF GitHub Bot logged work on HDFS-15654: - Author: ASF GitHub Bot Created on: 28/Oct/20 16:16 Start Date: 28/Oct/20 16:16 Worklog Time Spent: 10m Work Description: goiri commented on a change in pull request #2419: URL: https://github.com/apache/hadoop/pull/2419#discussion_r513577757 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java ## @@ -318,45 +317,41 @@ public void blockUtilSendFullBlockReport() { count.addAndGet(1); Thread.sleep(1); } catch (Exception e) { -e.printStackTrace(); +LOG.error("error addNewBlockThread", e); } } }); addNewBlockThread.start(); // Make sure that generate blocks for DataNode and IBR not empty now. - GenericTestUtils.waitFor(() -> { -if(count.get() > 0) { - return true; -} -return false; - }, 100, 1000); + GenericTestUtils.waitFor(() -> count.get() > 0, 100, 1000); // Trigger re-register using DataNode Command. datanodeCommands[0] = new DatanodeCommand[]{RegisterCommand.REGISTER}; - bpos.triggerHeartbeatForTests(); + bpos.triggerHeartbeatForTests(); + addNewBlockThread.join(); + addNewBlockThread = null; + // Verify FBR/IBR count is equal to generate number. try { -GenericTestUtils.waitFor(() -> { - if(fullBlockReportCount == totalTestBlocks || - incrBlockReportCount == totalTestBlocks) { -return true; - } - return false; -}, 1000, 15000); - } catch (Exception e) {} +GenericTestUtils.waitFor(() -> +(fullBlockReportCount == totalTestBlocks || +incrBlockReportCount == totalTestBlocks), 1000, 15000); + } catch (Exception e) { +LOG.error("Timed out wait for IBR counts FBRCount = {}," ++ " IBRCount = {}; expected = {}", +fullBlockReportCount, incrBlockReportCount, totalTestBlocks); +Assert.fail(); Review comment: And do the static import of fail() as the other assertEquals() and so on. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505788) Time Spent: 50m (was: 40m) > TestBPOfferService#testMissBlocksWhenReregister fails intermittently > > > Key: HDFS-15654 > URL: https://issues.apache.org/jira/browse/HDFS-15654 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > {{TestBPOfferService.testMissBlocksWhenReregister}} is flaky. It fails > randomly when the > following expression is not true: > {code:java} > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > {code} > There is a race condition here that relies once more on "time" to synchronize > between concurrent threads. The code below is is causing the > non-deterministic execution. > On a slow server, {{addNewBlockThread}} may not be done by the time the main > thread reach the assertion call. > {code:java} > // Verify FBR/IBR count is equal to generate number. > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > } finally { > addNewBlockThread.join(); > bpos.stop(); > bpos.join(); > {code} > Therefore, the correct implementation should wait for the thread to finish > {code:java} > // the thread finished execution. 
> addNewBlockThread.join(); > // Verify FBR/IBR count is equal to generate number. > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > } finally { > bpos.stop(); > bpos.join(); > {code} > {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is > not enough to satisfy the condition. > {code:java} > DataNodeFaultInjector.set(new DataNodeFaultInjector() { > public void blockUtilSendFullBlockReport() { > try { > GenericTestUtils.waitFor(() -> { >
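Reading the patch and review together, the fix is a sequencing one: join the producer thread before checking the counters, and poll rather than assert once. A minimal sketch of that ordering, assuming JUnit 4 and org.apache.hadoop.test.GenericTestUtils; the class and fields here are illustrative, not the actual TestBPOfferService code:
{code:java}
import static org.junit.Assert.fail;

import java.util.concurrent.TimeoutException;
import org.apache.hadoop.test.GenericTestUtils;

class JoinThenPollSketch {
  // Stand-ins for the test's block report counters.
  private volatile long fullBlockReportCount;
  private volatile long incrBlockReportCount;

  void verifyReports(Thread addNewBlockThread, long totalTestBlocks)
      throws Exception {
    // 1. Wait for the producer to finish: join() both orders the check
    //    after all of the thread's writes and removes the race itself.
    addNewBlockThread.join();
    // 2. Poll instead of asserting once, so a slow heartbeat still passes.
    try {
      GenericTestUtils.waitFor(
          () -> fullBlockReportCount == totalTestBlocks
              || incrBlockReportCount == totalTestBlocks,
          1000, 15000);
    } catch (TimeoutException e) {
      // 3. Fail with the counts so the test report is self-explaining.
      fail("FBRCount = " + fullBlockReportCount + ", IBRCount = "
          + incrBlockReportCount + "; expected = " + totalTestBlocks);
    }
  }
}
{code}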
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505787 ] ASF GitHub Bot logged work on HDFS-15654: - Author: ASF GitHub Bot Created on: 28/Oct/20 16:13 Start Date: 28/Oct/20 16:13 Worklog Time Spent: 10m Work Description: goiri commented on a change in pull request #2419: URL: https://github.com/apache/hadoop/pull/2419#discussion_r513575273 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java ## @@ -318,45 +317,41 @@ public void blockUtilSendFullBlockReport() { count.addAndGet(1); Thread.sleep(1); } catch (Exception e) { -e.printStackTrace(); +LOG.error("error addNewBlockThread", e); } } }); addNewBlockThread.start(); // Make sure that generate blocks for DataNode and IBR not empty now. - GenericTestUtils.waitFor(() -> { -if(count.get() > 0) { - return true; -} -return false; - }, 100, 1000); + GenericTestUtils.waitFor(() -> count.get() > 0, 100, 1000); // Trigger re-register using DataNode Command. datanodeCommands[0] = new DatanodeCommand[]{RegisterCommand.REGISTER}; - bpos.triggerHeartbeatForTests(); + bpos.triggerHeartbeatForTests(); + addNewBlockThread.join(); + addNewBlockThread = null; + // Verify FBR/IBR count is equal to generate number. try { -GenericTestUtils.waitFor(() -> { - if(fullBlockReportCount == totalTestBlocks || - incrBlockReportCount == totalTestBlocks) { -return true; - } - return false; -}, 1000, 15000); - } catch (Exception e) {} +GenericTestUtils.waitFor(() -> +(fullBlockReportCount == totalTestBlocks || +incrBlockReportCount == totalTestBlocks), 1000, 15000); + } catch (Exception e) { +LOG.error("Timed out wait for IBR counts FBRCount = {}," ++ " IBRCount = {}; expected = {}", +fullBlockReportCount, incrBlockReportCount, totalTestBlocks); +Assert.fail(); Review comment: You might want to put the previous log message in the fail(). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505787) Time Spent: 40m (was: 0.5h) > TestBPOfferService#testMissBlocksWhenReregister fails intermittently > > > Key: HDFS-15654 > URL: https://issues.apache.org/jira/browse/HDFS-15654 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > {{TestBPOfferService.testMissBlocksWhenReregister}} is flaky. It fails > randomly when the > following expression is not true: > {code:java} > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > {code} > There is a race condition here that relies once more on "time" to synchronize > between concurrent threads. The code below is causing the > non-deterministic execution. > On a slow server, {{addNewBlockThread}} may not be done by the time the main > thread reaches the assertion call. > {code:java} > // Verify FBR/IBR count is equal to generate number. > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > } finally { > addNewBlockThread.join(); > bpos.stop(); > bpos.join(); > {code} > Therefore, the correct implementation should wait for the thread to finish: > {code:java} > // the thread finished execution.
> addNewBlockThread.join(); > // Verify FBR/IBR count is equal to generate number. > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > } finally { > bpos.stop(); > bpos.join(); > {code} > {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is > not enough to satisfy the condition. > {code:java} > DataNodeFaultInjector.set(new DataNodeFaultInjector() { > public void blockUtilSendFullBlockReport() { > try { > GenericTestUtils.waitFor(() -> { >
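Combining the two review comments above (statically import fail() and carry the diagnostic in the failure itself), a small illustrative sketch; the helper and message wording are not from the patch:
{code:java}
import static org.junit.Assert.fail;

final class FailWithContextSketch {
  private FailWithContextSketch() {
  }

  // fail(String) attaches the diagnostic to the AssertionError itself,
  // so the counts show up in the surefire report, not only in the log.
  static void failWithCounts(long fbr, long ibr, long expected) {
    fail(String.format(
        "Timed out waiting for block reports: FBRCount = %d,"
            + " IBRCount = %d; expected = %d", fbr, ibr, expected));
  }
}
{code}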
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505776 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 28/Oct/20 15:44 Start Date: 28/Oct/20 15:44 Worklog Time Spent: 10m Work Description: amahussein commented on pull request #2408: URL: https://github.com/apache/hadoop/pull/2408#issuecomment-718021914 That's trickier than I thought. Need another iteration. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505776) Time Spent: 1h 50m (was: 1h 40m) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log > > Time Spent: 1h 50m > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack trace in two of them, Query7 and Query8. > {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at >
org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at >
[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505763&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505763 ] ASF GitHub Bot logged work on HDFS-15643: - Author: ASF GitHub Bot Created on: 28/Oct/20 15:32 Start Date: 28/Oct/20 15:32 Worklog Time Spent: 10m Work Description: goiri commented on a change in pull request #2408: URL: https://github.com/apache/hadoop/pull/2408#discussion_r513541376 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java ## @@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int range, dnIdxToDie = getDataNodeToKill(filePath); DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie); shutdownDataNode(dnToDie); + // wait enough time for the locations to be updated. + Thread.sleep(STALE_INTERVAL); Review comment: I am not very close to this part of the code but there must be ways to force the statistics to update. Not sure who can help with this part of the code. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505763) Time Spent: 1h 40m (was: 1.5h) > TestFileChecksumCompositeCrc fails intermittently > - > > Key: HDFS-15643 > URL: https://issues.apache.org/jira/browse/HDFS-15643 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Critical > Labels: pull-request-available > Attachments: > TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log > > Time Spent: 1h 40m > Remaining Estimate: 0h > > There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases > {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The > following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash} > org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail > to get block checksum for > LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001; > getBlockSize()=37748736; corrupt=false; offset=0; > locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK], > > DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK], > > DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK], > > DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK], > > DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK], > > DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]]; > indices=[1, 2, 3, 4, 5, 6, 7, 8]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851) > at > org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902) > at > org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916) > at > org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295) > at > org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at
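One shape the "force the statistics to update" suggestion could take, as a sketch only: poll the NameNode's reported locations until the dead DataNode disappears, instead of sleeping for STALE_INTERVAL. This is not the committed fix; the helper name and parameters are hypothetical, and only standard DFSClient/GenericTestUtils calls are used:
{code:java}
import java.io.IOException;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;
import org.apache.hadoop.test.GenericTestUtils;

final class WaitForLocationUpdateSketch {
  private WaitForLocationUpdateSketch() {
  }

  // Poll until no reported location of the file points at the dead
  // DataNode, instead of sleeping for a fixed STALE_INTERVAL.
  static void waitForDnRemoval(MiniDFSCluster cluster, String filePath,
      String deadDnXferAddr, long fileLen) throws Exception {
    DFSClient client = cluster.getFileSystem().getClient();
    GenericTestUtils.waitFor(() -> {
      try {
        for (LocatedBlock lb : client
            .getLocatedBlocks(filePath, 0, fileLen).getLocatedBlocks()) {
          for (DatanodeInfo dn : lb.getLocations()) {
            if (dn.getXferAddr().equals(deadDnXferAddr)) {
              return false; // the stale location is still reported
            }
          }
        }
        return true;
      } catch (IOException e) {
        return false; // transient NN error: keep polling
      }
    }, 100, 60000);
  }
}
{code}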
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505760&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505760 ] ASF GitHub Bot logged work on HDFS-15654: - Author: ASF GitHub Bot Created on: 28/Oct/20 15:29 Start Date: 28/Oct/20 15:29 Worklog Time Spent: 10m Work Description: goiri commented on a change in pull request #2419: URL: https://github.com/apache/hadoop/pull/2419#discussion_r513538747 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java ## @@ -280,26 +282,24 @@ public void testBasicFunctionality() throws Exception { */ @Test public void testMissBlocksWhenReregister() throws Exception { + Review comment: Remove this extra line that was added. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505760) Time Spent: 0.5h (was: 20m) > TestBPOfferService#testMissBlocksWhenReregister fails intermittently > > > Key: HDFS-15654 > URL: https://issues.apache.org/jira/browse/HDFS-15654 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {{TestBPOfferService.testMissBlocksWhenReregister}} is flaky. It fails > randomly when the > following expression is not true: > {code:java} > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > {code} > There is a race condition here that relies once more on "time" to synchronize > between concurrent threads. The code below is causing the > non-deterministic execution. > On a slow server, {{addNewBlockThread}} may not be done by the time the main > thread reaches the assertion call. > {code:java} > // Verify FBR/IBR count is equal to generate number. > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > } finally { > addNewBlockThread.join(); > bpos.stop(); > bpos.join(); > {code} > Therefore, the correct implementation should wait for the thread to finish: > {code:java} > // the thread finished execution. > addNewBlockThread.join(); > // Verify FBR/IBR count is equal to generate number. > assertTrue(fullBlockReportCount == totalTestBlocks || > incrBlockReportCount == totalTestBlocks); > } finally { > bpos.stop(); > bpos.join(); > {code} > {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is > not enough to satisfy the condition. > {code:java} > DataNodeFaultInjector.set(new DataNodeFaultInjector() { > public void blockUtilSendFullBlockReport() { > try { > GenericTestUtils.waitFor(() -> { > if(count.get() > 2000) { > return true; > } > return false; > }, 100, 1); // increase that waiting time to 10 seconds.
> } catch (Exception e) { > e.printStackTrace(); > } > } > }); > {code} > {code:bash} > Stacktrace > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at >
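For readability, here is the shape of the injector override that the truncated snippet above describes, as a sketch: the AtomicInteger is assumed to be the test's block counter, and the 10-second cap is the value the description recommends (the quoted "}, 100, 1);" is truncated in the archive):
{code:java}
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.hadoop.hdfs.server.datanode.DataNodeFaultInjector;
import org.apache.hadoop.test.GenericTestUtils;

final class InjectorTimeoutSketch {
  private InjectorTimeoutSketch() {
  }

  static void install(AtomicInteger count) {
    DataNodeFaultInjector.set(new DataNodeFaultInjector() {
      public void blockUtilSendFullBlockReport() {
        try {
          // Hold the full block report until enough blocks have been
          // generated; poll every 100 ms with a 10 s cap instead of 1 s.
          GenericTestUtils.waitFor(() -> count.get() > 2000, 100, 10000);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    });
  }
}
{code}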
[jira] [Comment Edited] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating Hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222133#comment-17222133 ] Brahma Reddy Battula edited comment on HDFS-15624 at 10/28/20, 12:33 PM: - There is an issue after the PROVIDED storage type was introduced (https://issues.apache.org/jira/browse/HDFS-15660); the issue can be addressed there, so we can hold on till HDFS-15660 is addressed. was (Author: brahmareddy): There is an issue with the provided storage type itself (https://issues.apache.org/jira/browse/HDFS-15660); the issue can be addressed there itself > Fix the SetQuotaByStorageTypeOp problem after updating Hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 4h 10m > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() of the > StorageType enum. Since setting the quota by storage type depends on the > ordinal(), the quota setting may become invalid after an upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating Hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505709&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505709 ] ASF GitHub Bot logged work on HDFS-15624: - Author: ASF GitHub Bot Created on: 28/Oct/20 12:32 Start Date: 28/Oct/20 12:32 Worklog Time Spent: 10m Work Description: brahmareddybattula commented on pull request #2377: URL: https://github.com/apache/hadoop/pull/2377#issuecomment-717902202 I think you can hold on till HDFS-15660 is addressed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505709) Time Spent: 4h 10m (was: 4h) > Fix the SetQuotaByStorageTypeOp problem after updating Hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 4h 10m > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() of the > StorageType enum. Since setting the quota by storage type depends on the > ordinal(), the quota setting may become invalid after an upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating Hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222133#comment-17222133 ] Brahma Reddy Battula commented on HDFS-15624: - There is an issue with the provided storage type itself (https://issues.apache.org/jira/browse/HDFS-15660); the issue can be addressed there itself > Fix the SetQuotaByStorageTypeOp problem after updating Hadoop > --- > > Key: HDFS-15624 > URL: https://issues.apache.org/jira/browse/HDFS-15624 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: YaYun Wang >Priority: Major > Labels: pull-request-available > Time Spent: 4h > Remaining Estimate: 0h > > HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() of the > StorageType enum. Since setting the quota by storage type depends on the > ordinal(), the quota setting may become invalid after an upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
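To make the hazard described above concrete: a self-contained sketch, with deliberately simplified enums rather than the real org.apache.hadoop.fs.StorageType, of why persisting ordinal() breaks once a constant is inserted mid-enum:
{code:java}
// Illustrative only -- not the real org.apache.hadoop.fs.StorageType.
enum V1 { DISK, SSD, ARCHIVE }
enum V2 { DISK, NVDIMM, SSD, ARCHIVE } // constant inserted in the middle

public class OrdinalPitfall {
  public static void main(String[] args) {
    // A v1 writer persists the quota's storage type as its ordinal:
    int persisted = V1.ARCHIVE.ordinal();     // 2
    // A v2 reader maps that integer back through values():
    V2 decoded = V2.values()[persisted];      // SSD, not ARCHIVE
    System.out.println(persisted + " -> " + decoded);
    // Persisting the stable name() instead survives insertions:
    V2 safe = V2.valueOf(V1.ARCHIVE.name());  // ARCHIVE
    System.out.println("by name -> " + safe);
  }
}
{code}
Protobuf sidesteps the same trap by giving every enum constant an explicit, stable tag number, which is why the StorageTypeProto values discussed in HDFS-15660 below carry "= 1" style assignments.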
[jira] [Comment Edited] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222107#comment-17222107 ] Akira Ajisaka edited comment on HDFS-15659 at 10/28/20, 11:29 AM: -- Thank you [~ayushtkn] for your comment. {quote}IIRC a couple of days back you had objections to having this config disabled in a test. I didn't follow up, and I lost track of that Jira too. Is that concern not there now? {quote} The comment is here: [https://github.com/apache/hadoop/pull/2404#pullrequestreview-515448731] Sorry, I've changed my mind. For fixing one test, I thought it was okay to increase the number of DNs. However, for fixing many tests, it's easier to disable this feature by default rather than increasing the number of DNs for each test case. {quote}So, if this concludes, we can revert that. Increasing datanodes has an adverse effect on test performance, though minor {quote} Yes, I think we can revert HDFS-15461 if the feature is disabled by default. was (Author: ajisakaa): Thank you [~ayushtkn] for your comment. {quote}IIRC a couple of days back you had objections to having this config disabled in a test. I didn't follow up, and I lost track of that Jira too. Is that concern not there now? {quote} The comment is here: [https://github.com/apache/hadoop/pull/2404#pullrequestreview-515448731] Sorry, I've changed my mind. For fixing one test, I thought it was okay to increase the number of DNs. However, for fixing many tests, it's easier to disable this feature by default rather than increasing the number of DNs for each test case. {quote}So, if this concludes, we can revert that. Increasing datanodes has an adverse effect on test performance, though minor {quote} Yes, I think we can revert HDFS-15461 if the feature is disabled by default. > Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster > --- > > Key: HDFS-15659 > URL: https://issues.apache.org/jira/browse/HDFS-15659 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Reporter: Akira Ajisaka >Priority: Major > > dfs.namenode.redundancy.considerLoad is true by default and it is causing > many test failures. Let's disable it in MiniDFSCluster. > Originally reported by [~weichiu]: > https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612 > {quote} > I've certainly seen this option causing test failures in the past. > Maybe we should turn it off by default in MiniDFSCluster, and only enable it > for specific tests. > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222107#comment-17222107 ] Akira Ajisaka commented on HDFS-15659: -- Thank you [~ayushtkn] for your comment. {quote}IIRC a couple of days back you had objections to having this config disabled in a test. I didn't follow up, and I lost track of that Jira too. Is that concern not there now? {quote} The comment is here: [https://github.com/apache/hadoop/pull/2404#pullrequestreview-515448731] Sorry, I've changed my mind. For fixing one test, I thought it was okay to increase the number of DNs. However, for fixing many tests, it's easier to disable this feature by default rather than increasing the number of DNs for each test case. {quote}So, if this concludes, we can revert that. Increasing datanodes has an adverse effect on test performance, though minor {quote} Yes, I think we can revert HDFS-15461 if the feature is disabled by default. > Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster > --- > > Key: HDFS-15659 > URL: https://issues.apache.org/jira/browse/HDFS-15659 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Reporter: Akira Ajisaka >Priority: Major > > dfs.namenode.redundancy.considerLoad is true by default and it is causing > many test failures. Let's disable it in MiniDFSCluster. > Originally reported by [~weichiu]: > https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612 > {quote} > I've certainly seen this option causing test failures in the past. > Maybe we should turn it off by default in MiniDFSCluster, and only enable it > for specific tests. > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222090#comment-17222090 ] Ayush Saxena commented on HDFS-15659: - [~aajisaka] IIRC a couple of days back you had objections to having this config disabled in a test. I didn't follow up, and I lost track of that Jira too. Is that concern not there now? I think the number of datanodes was increased there instead? If we are doing this, I don't think that fix would be required? So, if this concludes, we can revert that. Increasing datanodes has an adverse effect on test performance, though minor > Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster > --- > > Key: HDFS-15659 > URL: https://issues.apache.org/jira/browse/HDFS-15659 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Reporter: Akira Ajisaka >Priority: Major > > dfs.namenode.redundancy.considerLoad is true by default and it is causing > many test failures. Let's disable it in MiniDFSCluster. > Originally reported by [~weichiu]: > https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612 > {quote} > I've certainly seen this option causing test failures in the past. > Maybe we should turn it off by default in MiniDFSCluster, and only enable it > for specific tests. > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
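For reference, the per-test opt-out that individual tests use today, which this JIRA proposes to fold into MiniDFSCluster's defaults; a minimal sketch assuming the 3.x key constant DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY:
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

final class ConsiderLoadOptOutSketch {
  private ConsiderLoadOptOutSketch() {
  }

  static MiniDFSCluster startCluster() throws IOException {
    Configuration conf = new HdfsConfiguration();
    // Stop the NameNode from skipping "busy" DataNodes when choosing
    // replica targets; with a handful of co-located DNs, load skew can
    // otherwise make block placement fail and the test flake.
    conf.setBoolean(
        DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY, false);
    return new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
  }
}
{code}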
[jira] [Commented] (HDFS-15655) Add option to make balancer prefer to get cold blocks
[ https://issues.apache.org/jira/browse/HDFS-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222088#comment-17222088 ] Hadoop QA commented on HDFS-15655: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 24s{color} | | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 25s{color} | | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s{color} | | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s{color} | | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 18s{color} | | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 13s{color} | | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 11s{color} | | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s{color} | | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 13s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 6s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} blanks {color} | {color:green} 0m 0s{color} | | {color:green} The patch has no blanks issues. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 58s{color} | | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 17s{color} | | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || || | {color:red}-1{color} | {color:red} unit {color} |
[jira] [Resolved] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma resolved HDFS-15657. - Fix Version/s: 3.4.0 3.3.1 Resolution: Fixed Merged to trunk and branch-3.3. Thanks, [~aajisaka]. > RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException > - > > Key: HDFS-15657 > URL: https://issues.apache.org/jira/browse/HDFS-15657 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Fix For: 3.3.1, 3.4.0 > > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > Time Spent: 2h > Remaining Estimate: 0h > > https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > {noformat} > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 > s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter > [ERROR] > testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter) > Time elapsed: 1.04 s <<< ERROR! > org.apache.hadoop.service.ServiceStateException: java.net.BindException: > Problem binding to [0.0.0.0:] java.net.BindException: Address already in > use; For more details see: http://wiki.apache.org/hadoop/BindException > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:174) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > Caused by: java.net.BindException: Problem binding to [0.0.0.0:] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at >
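The general remedy for this class of BindException, sketched generically rather than as the committed patch: bind to port 0 so the kernel assigns a free ephemeral port, then read back the port that was actually chosen.
{code:java}
import java.net.InetSocketAddress;
import java.net.ServerSocket;

final class EphemeralPortSketch {
  private EphemeralPortSketch() {
  }

  static int freePort() throws Exception {
    // Binding to port 0 lets the kernel pick an unused ephemeral port;
    // hard-coded test ports collide when suites run concurrently.
    try (ServerSocket socket = new ServerSocket()) {
      socket.bind(new InetSocketAddress("0.0.0.0", 0));
      return socket.getLocalPort(); // the port actually assigned
    }
  }
}
{code}
In practice the service under test should itself bind to 0 and report back its bound port; probing for a free port and rebinding later can still race with other suites.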
[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505665=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505665 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 10:13 Start Date: 28/Oct/20 10:13 Worklog Time Spent: 10m Work Description: aajisaka commented on pull request #2418: URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717833448 Thank you for reviewing and merging. Would you backport this to branch-3.3? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505665) Time Spent: 2h (was: 1h 50m) > RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException > - > > Key: HDFS-15657 > URL: https://issues.apache.org/jira/browse/HDFS-15657 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > Time Spent: 2h > Remaining Estimate: 0h > > https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > {noformat} > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 > s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter > [ERROR] > testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter) > Time elapsed: 1.04 s <<< ERROR! > org.apache.hadoop.service.ServiceStateException: java.net.BindException: > Problem binding to [0.0.0.0:] java.net.BindException: Address already in > use; For more details see: http://wiki.apache.org/hadoop/BindException > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:174) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) 
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at >
[jira] [Commented] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222077#comment-17222077 ] Ryan Wu commented on HDFS-15660: {code:java} // code placeholder enum StorageTypeProto { DISK = 1; SSD = 2; ARCHIVE = 3; RAM_DISK = 4; PROVIDED = 5; } {code} This PROVIDED storage type was added in HDFS-10675. > StorageTypeProto is not compatible between 3.x and 2.6 > --- > > Key: HDFS-15660 > URL: https://issues.apache.org/jira/browse/HDFS-15660 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.2.0, 3.1.3 >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > Fix For: 3.4.0 > > > In our case, when the NN had been upgraded to 3.1.3 and the DN version was still 2.6, > we found that when Hive called the getContentSummary method, the client and server were > not compatible because Hadoop 3 added the new PROVIDED storage type. > {code:java} > // code placeholder > 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while > invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over > x/x:8020. Trying to fail over immediately. > java.io.IOException: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713) > at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109) > at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317) > at > org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289) > at > org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271) > at > org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > Caused by: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at >
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272) > at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816) > ... 23 more > Caused by: com.google.protobuf.UninitializedMessageException: Message missing > required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263) > ... 25 more
[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505664=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505664 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 10:12 Start Date: 28/Oct/20 10:12 Worklog Time Spent: 10m Work Description: tasanuma commented on pull request #2418: URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717832919 @aajisaka Thanks for your contribution! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505664) Time Spent: 1h 50m (was: 1h 40m) > RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException > - > > Key: HDFS-15657 > URL: https://issues.apache.org/jira/browse/HDFS-15657 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > Time Spent: 1h 50m > Remaining Estimate: 0h > > https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > {noformat} > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 > s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter > [ERROR] > testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter) > Time elapsed: 1.04 s <<< ERROR! > org.apache.hadoop.service.ServiceStateException: java.net.BindException: > Problem binding to [0.0.0.0:] java.net.BindException: Address already in > use; For more details see: http://wiki.apache.org/hadoop/BindException > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:174) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) >
[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505663=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505663 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 10:12 Start Date: 28/Oct/20 10:12 Worklog Time Spent: 10m Work Description: tasanuma merged pull request #2418: URL: https://github.com/apache/hadoop/pull/2418 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505663) Time Spent: 1h 40m (was: 1.5h) > RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException > - > > Key: HDFS-15657 > URL: https://issues.apache.org/jira/browse/HDFS-15657 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > Time Spent: 1h 40m > Remaining Estimate: 0h > > https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > {noformat} > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 > s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter > [ERROR] > testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter) > Time elapsed: 1.04 s <<< ERROR! > org.apache.hadoop.service.ServiceStateException: java.net.BindException: > Problem binding to [0.0.0.0:] java.net.BindException: Address already in > use; For more details see: http://wiki.apache.org/hadoop/BindException > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:174) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at >
[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505656=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505656 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 09:58 Start Date: 28/Oct/20 09:58 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2418: URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717825235 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 52s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 34m 50s | | trunk passed | | +1 :green_heart: | compile | 0m 38s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 0m 39s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 0m 25s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 42s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 44s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 0m 56s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 1m 24s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 1m 22s | | trunk passed | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 38s | | the patch passed | | +1 :green_heart: | compile | 0m 41s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 0m 41s | | the patch passed | | +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 0m 30s | | the patch passed | | +1 :green_heart: | checkstyle | 0m 18s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 36s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 16m 10s | | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 36s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 0m 54s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 1m 32s | | the patch passed | _ Other Tests _ | | +1 :green_heart: | unit | 12m 58s | | hadoop-hdfs-rbf in the patch passed. | | +1 :green_heart: | asflicense | 0m 33s | | The patch does not generate ASF License warnings. 
| | | | 98m 5s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2418 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1df7450ea6a4 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / d0c786db4de | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/3/testReport/ | | Max. process+thread count | 2730 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/3/console | | versions | git=2.17.1 maven=3.6.0 findbugs=4.1.3 | | Powered by | Apache Yetus
[jira] [Commented] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222075#comment-17222075 ] Ryan Wu commented on HDFS-15660: This happened after our cluster enabled heterogeneous storage. The problem appears when we run HiveQL or call "hdfs dfs -count". > StorageTypeProto is not compatible between 3.x and 2.6 > --- > > Key: HDFS-15660 > URL: https://issues.apache.org/jira/browse/HDFS-15660 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.2.0, 3.1.3 >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > Fix For: 3.4.0 > > > In our case, when the NN had been upgraded to 3.1.3 while the DN version was > still 2.6, we found that when Hive called the getContentSummary method, the > client and server were not compatible because Hadoop 3 added the new PROVIDED > storage type. > {code:java} > // code placeholder > 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while > invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over > x/x:8020. Trying to fail over immediately. > java.io.IOException: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713) > at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109) > at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317) > at > org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289) > at > org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271) > at > org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > Caused by: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272) > at 
com.sun.proxy.$Proxy10.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816) > ... 23 more > Caused by: com.google.protobuf.UninitializedMessageException: Message missing > required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263) > ... 25 more > {code} -- This message was sent by
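The proto2 mechanics behind this failure: an enum value the receiver does not recognize is treated as an unknown field, so the required "type" field in the response from the newer NameNode ends up unset, and Builder.build() then throws UninitializedMessageException on the 2.6 side. A plain-Java sketch of the lookup (not protobuf-generated code; the wire numbers follow StorageTypeProto in hdfs.proto as far as I know, everything else is illustrative):
{code:java}
public class UnknownEnumExample {
  /** Wire numbers a 2.6-era client knows; PROVIDED (5) was added later. */
  enum StorageTypeProto26 {
    DISK(1), SSD(2), ARCHIVE(3), RAM_DISK(4);

    private final int number;
    StorageTypeProto26(int number) { this.number = number; }

    /** proto2-style lookup: an unknown wire number maps to null. */
    static StorageTypeProto26 forNumber(int number) {
      for (StorageTypeProto26 t : values()) {
        if (t.number == number) {
          return t;
        }
      }
      return null;
    }
  }

  public static void main(String[] args) {
    int fromNewNameNode = 5; // PROVIDED, sent by a 3.x NameNode
    StorageTypeProto26 type = StorageTypeProto26.forNumber(fromNewNameNode);
    // In real proto2 parsing, the unknown value leaves the required "type"
    // field unset, and Builder.build() throws UninitializedMessageException.
    System.out.println(type == null
        ? "Message missing required fields: ...typeQuotaInfo[?].type"
        : type.toString());
  }
}
{code}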
[jira] [Assigned] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
[ https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Wu reassigned HDFS-15660: -- Assignee: Ryan Wu > StorageTypeProto is not compatible between 3.x and 2.6 > --- > > Key: HDFS-15660 > URL: https://issues.apache.org/jira/browse/HDFS-15660 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.2.0, 3.1.3 >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > Fix For: 3.4.0 > > > In our case, when the NN had been upgraded to 3.1.3 while the DN version was > still 2.6, we found that when Hive called the getContentSummary method, the > client and server were not compatible because Hadoop 3 added the new PROVIDED > storage type. > {code:java} > // code placeholder > 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while > invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over > x/x:8020. Trying to fail over immediately. > java.io.IOException: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source) > at > org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706) > at > org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713) > at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109) > at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317) > at > org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289) > at > org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271) > at > org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > Caused by: com.google.protobuf.ServiceException: > com.google.protobuf.UninitializedMessageException: Message missing required > fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272) > at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816) > ... 23 more > Caused by: com.google.protobuf.UninitializedMessageException: Message missing > required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type > at > com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263) > ... 25 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail:
[jira] [Created] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6
Ryan Wu created HDFS-15660: -- Summary: StorageTypeProto is not compatible between 3.x and 2.6 Key: HDFS-15660 URL: https://issues.apache.org/jira/browse/HDFS-15660 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.1.3, 3.2.0 Reporter: Ryan Wu Fix For: 3.4.0 In our case, when the NN had been upgraded to 3.1.3 while the DN version was still 2.6, we found that when Hive called the getContentSummary method, the client and server were not compatible because Hadoop 3 added the new PROVIDED storage type. {code:java} // code placeholder 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over x/x:8020. Trying to fail over immediately. java.io.IOException: com.google.protobuf.ServiceException: com.google.protobuf.UninitializedMessageException: Message missing required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type at org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144) at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706) at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713) at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289) at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271) at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255) at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118) at org.apache.hadoop.fs.shell.Command.run(Command.java:165) at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) Caused by: com.google.protobuf.ServiceException: com.google.protobuf.UninitializedMessageException: Message missing required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272) at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816) ... 
23 more Caused by: com.google.protobuf.UninitializedMessageException: Message missing required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type at com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263) ... 25 more {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505657=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505657 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 09:59 Start Date: 28/Oct/20 09:59 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2418: URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717825773 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 2m 2s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 34m 7s | | trunk passed | | +1 :green_heart: | compile | 0m 44s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 0m 25s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 40s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 57s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 0m 58s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 1m 28s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 1m 26s | | trunk passed | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 36s | | the patch passed | | +1 :green_heart: | compile | 0m 37s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 0m 37s | | the patch passed | | +1 :green_heart: | compile | 0m 34s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 0m 34s | | the patch passed | | +1 :green_heart: | checkstyle | 0m 17s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 34s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 16m 22s | | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 37s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 0m 56s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 1m 28s | | the patch passed | _ Other Tests _ | | +1 :green_heart: | unit | 12m 50s | | hadoop-hdfs-rbf in the patch passed. | | +1 :green_heart: | asflicense | 0m 29s | | The patch does not generate ASF License warnings. 
| | | | 98m 1s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2418 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4d3d3255b1e2 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / d0c786db4de | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/4/testReport/ | | Max. process+thread count | 2716 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/4/console | | versions | git=2.17.1 maven=3.6.0 findbugs=4.1.3 | | Powered by | Apache Yetus
[jira] [Commented] (HDFS-15658) Improve datanode capability balancing
[ https://issues.apache.org/jira/browse/HDFS-15658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222070#comment-17222070 ] Hadoop QA commented on HDFS-15658: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 3m 54s{color} | | {color:red} Docker failed to build yetus/hadoop:06eafeedf12. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-15658 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13014273/HDFS-15658-branch-2.7.patch | | Console output | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/271/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org | This message was automatically generated. > Improve datanode capability balancing > - > > Key: HDFS-15658 > URL: https://issues.apache.org/jira/browse/HDFS-15658 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: chuanjie.duan >Priority: Major > Attachments: HDFS-15658-branch-2.7.patch > > > How about adjusting the order in which the replica to delete is chosen? > Is there any other reason for choosing "oldestHeartbeatStorage" first? > > public DatanodeStorageInfo chooseReplicaToDelete( > Collection<DatanodeStorageInfo> moreThanOne, > Collection<DatanodeStorageInfo> exactlyOne, > final List<StorageType> excessTypes, > Map<String, List<DatanodeStorageInfo>> rackMap) { > .. > final DatanodeStorageInfo storage; > // proposed order: prefer the storage with the least space remaining > if (minSpaceStorage != null) { > storage = minSpaceStorage; > } else if (oldestHeartbeatStorage != null) { > storage = oldestHeartbeatStorage; > } else { > return null; > } > excessTypes.remove(storage.getStorageType()); > return storage; > } -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15658) Improve datanode capability balancing
[ https://issues.apache.org/jira/browse/HDFS-15658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chuanjie.duan updated HDFS-15658: - Attachment: HDFS-15658-branch-2.7.patch Status: Patch Available (was: Open) > Improve datanode capability balancing > - > > Key: HDFS-15658 > URL: https://issues.apache.org/jira/browse/HDFS-15658 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: chuanjie.duan >Priority: Major > Attachments: HDFS-15658-branch-2.7.patch > > > How about adjusting the order in which the replica to delete is chosen? > Is there any other reason for choosing "oldestHeartbeatStorage" first? > > public DatanodeStorageInfo chooseReplicaToDelete( > Collection<DatanodeStorageInfo> moreThanOne, > Collection<DatanodeStorageInfo> exactlyOne, > final List<StorageType> excessTypes, > Map<String, List<DatanodeStorageInfo>> rackMap) { > .. > final DatanodeStorageInfo storage; > // proposed order: prefer the storage with the least space remaining > if (minSpaceStorage != null) { > storage = minSpaceStorage; > } else if (oldestHeartbeatStorage != null) { > storage = oldestHeartbeatStorage; > } else { > return null; > } > excessTypes.remove(storage.getStorageType()); > return storage; > } -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505630 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 08:32 Start Date: 28/Oct/20 08:32 Worklog Time Spent: 10m Work Description: tasanuma commented on pull request #2418: URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717780318 +1, pending Jenkins. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505630) Time Spent: 1h 10m (was: 1h) > RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException > - > > Key: HDFS-15657 > URL: https://issues.apache.org/jira/browse/HDFS-15657 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > Time Spent: 1h 10m > Remaining Estimate: 0h > > https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > {noformat} > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 > s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter > [ERROR] > testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter) > Time elapsed: 1.04 s <<< ERROR! > org.apache.hadoop.service.ServiceStateException: java.net.BindException: > Problem binding to [0.0.0.0:] java.net.BindException: Address already in > use; For more details see: http://wiki.apache.org/hadoop/BindException > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:174) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at >
[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505626 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 08:21 Start Date: 28/Oct/20 08:21 Worklog Time Spent: 10m Work Description: aajisaka commented on pull request #2418: URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717775420 @tasanuma Thank you for checking this. Reverted the change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505626) Time Spent: 1h (was: 50m) > RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException > - > > Key: HDFS-15657 > URL: https://issues.apache.org/jira/browse/HDFS-15657 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > Time Spent: 1h > Remaining Estimate: 0h > > https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > {noformat} > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 > s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter > [ERROR] > testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter) > Time elapsed: 1.04 s <<< ERROR! > org.apache.hadoop.service.ServiceStateException: java.net.BindException: > Problem binding to [0.0.0.0:] java.net.BindException: Address already in > use; For more details see: http://wiki.apache.org/hadoop/BindException > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:174) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at >
[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505610 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 07:59 Start Date: 28/Oct/20 07:59 Worklog Time Spent: 10m Work Description: tasanuma commented on pull request #2418: URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717765124 @aajisaka Thanks for updating the PR. > According to the javadoc, this test case is to verify the default behavior. Don't set the default value (true) explicitly to verify the default behavior. I checked [HDFS-14653](https://issues.apache.org/jira/browse/HDFS-14653) again. `testNamenodeHeartBeatEnableDefault` checks that `DFS_ROUTER_NAMENODE_HEARTBEAT_ENABLE` is supposed to take the value of `DFS_ROUTER_HEARTBEAT_ENABLE` when `DFS_ROUTER_NAMENODE_HEARTBEAT_ENABLE` isn't explicitly specified. Therefore, the previous code here may not be wrong. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505610) Time Spent: 50m (was: 40m) > RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException > - > > Key: HDFS-15657 > URL: https://issues.apache.org/jira/browse/HDFS-15657 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > Time Spent: 50m > Remaining Estimate: 0h > > https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > {noformat} > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 > s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter > [ERROR] > testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter) > Time elapsed: 1.04 s <<< ERROR! 
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: > Problem binding to [0.0.0.0:] java.net.BindException: Address already in > use; For more details see: http://wiki.apache.org/hadoop/BindException > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:174) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) >
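The BindException above is the classic fixed-port collision between concurrent test JVMs. A minimal sketch of the usual hardening pattern (general background only, not the actual HDFS-15657 patch): bind to port 0 so the OS assigns a free ephemeral port.
{code:java}
import java.net.ServerSocket;

public class EphemeralPortExample {
  public static void main(String[] args) throws Exception {
    // Port 0 lets the OS pick any free port, so parallel test runs
    // cannot collide on a hard-coded port number.
    try (ServerSocket socket = new ServerSocket(0)) {
      System.out.println("Bound to free port " + socket.getLocalPort());
    }
  }
}
{code}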
[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505601=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505601 ] ASF GitHub Bot logged work on HDFS-15654: - Author: ASF GitHub Bot Created on: 28/Oct/20 07:35 Start Date: 28/Oct/20 07:35 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2419: URL: https://github.com/apache/hadoop/pull/2419#issuecomment-717755720 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 4s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 51s | | trunk passed | | +1 :green_heart: | compile | 1m 19s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 1m 10s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 0m 46s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 21s | | trunk passed | | +1 :green_heart: | shadedclient | 18m 44s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 52s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 23s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 3m 12s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 3m 10s | | trunk passed | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 12s | | the patch passed | | +1 :green_heart: | compile | 1m 13s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 1m 13s | | the patch passed | | +1 :green_heart: | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 1m 5s | | the patch passed | | +1 :green_heart: | checkstyle | 0m 41s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 13s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 16m 9s | | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 49s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 3m 19s | | the patch passed | _ Other Tests _ | | -1 :x: | unit | 110m 29s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. 
| | | | 202m 33s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2419 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7a2ecb2a7552 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / d0c786db4de | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Test Results |
[jira] [Updated] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15657: - Summary: RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException (was: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException) > RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException > - > > Key: HDFS-15657 > URL: https://issues.apache.org/jira/browse/HDFS-15657 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > Time Spent: 40m > Remaining Estimate: 0h > > https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > {noformat} > [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 > s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter > [ERROR] > testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter) > Time elapsed: 1.04 s <<< ERROR! > org.apache.hadoop.service.ServiceStateException: java.net.BindException: > Problem binding to [0.0.0.0:] java.net.BindException: Address already in > use; For more details see: http://wiki.apache.org/hadoop/BindException > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:174) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281) > at > org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > Caused by: java.net.BindException: Problem binding to [0.0.0.0:] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at >
[jira] [Updated] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-15659: - Parent: HDFS-15646 Issue Type: Sub-task (was: Improvement) > Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster > --- > > Key: HDFS-15659 > URL: https://issues.apache.org/jira/browse/HDFS-15659 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Reporter: Akira Ajisaka >Priority: Major > > dfs.namenode.redundancy.considerLoad is true by default and it is causing > many test failures. Let's disable it in MiniDFSCluster. > Originally reported by [~weichiu]: > https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612 > {quote} > I've certainly seen this option causing test failures in the past. > Maybe we should turn it off by default in MiniDFSCluster, and only enable it > for specific tests. > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-9776) TestHAAppend#testMultipleAppendsDuringCatchupTailing is flaky
[ https://issues.apache.org/jira/browse/HDFS-9776?focusedWorklogId=505600=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505600 ] ASF GitHub Bot logged work on HDFS-9776: Author: ASF GitHub Bot Created on: 28/Oct/20 07:31 Start Date: 28/Oct/20 07:31 Worklog Time Spent: 10m Work Description: aajisaka commented on pull request #2410: URL: https://github.com/apache/hadoop/pull/2410#issuecomment-717754632 > That sounds like a good idea. Perhaps we can file a follow-up jira to set the flag in the MiniDFSCluster Filed https://issues.apache.org/jira/browse/HDFS-15659 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505600) Time Spent: 1h 20m (was: 1h 10m) > TestHAAppend#testMultipleAppendsDuringCatchupTailing is flaky > - > > Key: HDFS-9776 > URL: https://issues.apache.org/jira/browse/HDFS-9776 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Vinayakumar B >Assignee: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3 > > Attachments: TestHAAppend.testMultipleAppendsDuringCatchupTailing.log > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Initial analysis of the recent test failure in > {{TestHAAppend#testMultipleAppendsDuringCatchupTailing}} > [here|https://builds.apache.org/job/PreCommit-HDFS-Build/14420/testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestHAAppend/testMultipleAppendsDuringCatchupTailing/] > > has found that if the Active NameNode goes down immediately after the truncate > operation, but before the BlockRecovery command is sent to the DataNode, > then this block will never be truncated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
Akira Ajisaka created HDFS-15659: Summary: Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster Key: HDFS-15659 URL: https://issues.apache.org/jira/browse/HDFS-15659 Project: Hadoop HDFS Issue Type: Improvement Components: test Reporter: Akira Ajisaka dfs.namenode.redundancy.considerLoad is true by default and it is causing many test failures. Let's disable it in MiniDFSCluster. Originally reported by [~weichiu]: https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612 {quote} I've certainly seen this option causing test failures in the past. Maybe we should turn it off by default in MiniDFSCluster, and only enable it for specific tests. {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
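Until such a MiniDFSCluster-wide default lands, an individual test can already opt out of load-based placement. A minimal sketch, using the DFSConfigKeys constant for dfs.namenode.redundancy.considerLoad; the surrounding test scaffolding is illustrative:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class ConsiderLoadOffExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // When considerLoad is true the NameNode skips busy DataNodes during
    // block placement, which makes tiny test clusters nondeterministic.
    conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY,
        false);
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
    try {
      cluster.waitActive();
      // ... run assertions that rely on deterministic block placement ...
    } finally {
      cluster.shutdown();
    }
  }
}
{code}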
[jira] [Updated] (HDFS-15655) Add option to make balancer prefer to get cold blocks
[ https://issues.apache.org/jira/browse/HDFS-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-15655: Attachment: (was: HDFS-15655.002.patch) > Add option to make balancer prefer to get cold blocks > - > > Key: HDFS-15655 > URL: https://issues.apache.org/jira/browse/HDFS-15655 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Yang Yun >Assignee: Yang Yun >Priority: Minor > Attachments: HDFS-15655.001.patch, HDFS-15655.002.patch > > > We met two issues when using the balancer. > # Moving hot files may cause DFSClient reads to fail. > # Some blocks of temporary files are moved even though the files are deleted soon after. > Add an option, dfs.namenode.hot.block.interval; the balancer then prefers to get > blocks that belong to cold files created before this time period. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15655) Add option to make balancer prefer to get cold blocks
[ https://issues.apache.org/jira/browse/HDFS-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-15655: Attachment: HDFS-15655.002.patch Status: Patch Available (was: Open) > Add option to make balancer prefer to get cold blocks > - > > Key: HDFS-15655 > URL: https://issues.apache.org/jira/browse/HDFS-15655 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Yang Yun >Assignee: Yang Yun >Priority: Minor > Attachments: HDFS-15655.001.patch, HDFS-15655.002.patch > > > We met two issues when using the balancer. > # Moving hot files may cause DFSClient reads to fail. > # Some blocks of temporary files are moved even though the files are deleted soon after. > Add an option, dfs.namenode.hot.block.interval; the balancer then prefers to get > blocks that belong to cold files created before this time period. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15655) Add option to make balancer prefer to get cold blocks
[ https://issues.apache.org/jira/browse/HDFS-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-15655: Status: Open (was: Patch Available) > Add option to make balancer prefer to get cold blocks > - > > Key: HDFS-15655 > URL: https://issues.apache.org/jira/browse/HDFS-15655 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover >Reporter: Yang Yun >Assignee: Yang Yun >Priority: Minor > Attachments: HDFS-15655.001.patch, HDFS-15655.002.patch > > > We met two issues when using the balancer. > # Moving hot files may cause DFSClient reads to fail. > # Some blocks of temporary files are moved even though the files are deleted soon after. > Add an option, dfs.namenode.hot.block.interval; the balancer then prefers to get > blocks that belong to cold files created before this time period. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
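To make the proposal concrete, a hypothetical sketch of such a cold-block filter follows. The class and method names and the use of file creation time are assumptions for illustration (the real change is in the attached patches); it only demonstrates the intended rule that blocks of recently created files are skipped.
{code:java}
public class ColdBlockFilter {
  private final long hotBlockIntervalMs; // dfs.namenode.hot.block.interval

  public ColdBlockFilter(long hotBlockIntervalMs) {
    this.hotBlockIntervalMs = hotBlockIntervalMs;
  }

  /** A block is "cold" if its owning file is older than the interval. */
  public boolean isCold(long fileCreationTimeMs) {
    return System.currentTimeMillis() - fileCreationTimeMs
        >= hotBlockIntervalMs;
  }

  public static void main(String[] args) {
    ColdBlockFilter filter = new ColdBlockFilter(60 * 60 * 1000L); // 1 hour
    long now = System.currentTimeMillis();
    System.out.println(filter.isCold(now - 5 * 60 * 1000L));      // false: hot
    System.out.println(filter.isCold(now - 2 * 60 * 60 * 1000L)); // true: cold
  }
}
{code}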
[jira] [Work logged] (HDFS-15657) TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
[ https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505592=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505592 ] ASF GitHub Bot logged work on HDFS-15657: - Author: ASF GitHub Bot Created on: 28/Oct/20 07:09 Start Date: 28/Oct/20 07:09 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2418: URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717745936 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 20s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 32s | | trunk passed | | +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 0m 25s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 40s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 49s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 0m 50s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 1m 12s | | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 1m 11s | | trunk passed | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 32s | | the patch passed | | +1 :green_heart: | compile | 0m 32s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 0m 32s | | the patch passed | | +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 0m 27s | | the patch passed | | +1 :green_heart: | checkstyle | 0m 16s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 31s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 16m 12s | | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 33s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 0m 48s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 1m 14s | | the patch passed | _ Other Tests _ | | +1 :green_heart: | unit | 12m 31s | | hadoop-hdfs-rbf in the patch passed. | | +1 :green_heart: | asflicense | 0m 28s | | The patch does not generate ASF License warnings. 
| | | | 93m 16s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2418 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7c610a70681f 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / d0c786db4de | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/2/testReport/ | | Max. process+thread count | 2764 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/2/console | | versions | git=2.17.1 maven=3.6.0 findbugs=4.1.3 | | Powered by | Apache Yetus
[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505580 ]

ASF GitHub Bot logged work on HDFS-15624:
-
Author: ASF GitHub Bot
Created on: 28/Oct/20 06:40
Start Date: 28/Oct/20 06:40
Worklog Time Spent: 10m
Work Description: huangtianhua commented on a change in pull request #2377:
URL: https://github.com/apache/hadoop/pull/2377#discussion_r513210557

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java
##
@@ -33,13 +33,12 @@
 @InterfaceAudience.Public
 @InterfaceStability.Unstable
 public enum StorageType {
-  // sorted by the speed of the storage types, from fast to slow
   RAM_DISK(true, true),
-  NVDIMM(false, true),
   SSD(false, false),
   DISK(false, false),
   ARCHIVE(false, false),
-  PROVIDED(false, false);
+  PROVIDED(false, false),
+  NVDIMM(false, true);

Review comment: We added a check for setQuota() in FSNamesystem.java; I think that's OK, right?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 505580)
Time Spent: 4h (was: 3h 50m)

> Fix the SetQuotaByStorageTypeOp problem after updating hadoop
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: YaYun Wang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4h
> Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal()
> values of the StorageType enum. Setting a quota by storage type depends on
> ordinal(), so quota settings may become invalid after an upgrade.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
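To see concretely why the reordering matters for persisted state: Java's ordinal() is just the declaration index, so inserting a constant anywhere but the end renumbers every constant that follows it, and any on-disk record keyed by ordinal then decodes to the wrong constant. A minimal, self-contained sketch of the failure mode follows; the enums are hypothetical stand-ins for illustration, not the real Hadoop StorageType.

```java
// Hypothetical illustration of the ordinal-shift hazard; not Hadoop code.
public class OrdinalShiftDemo {

  // Layout before NVDIMM existed.
  enum Old { RAM_DISK, SSD, DISK, ARCHIVE, PROVIDED }

  // Inserting NVDIMM in the middle shifts every later ordinal by one.
  enum Inserted { RAM_DISK, NVDIMM, SSD, DISK, ARCHIVE, PROVIDED }

  // Appending NVDIMM (as the patch does) leaves the old ordinals untouched.
  enum Appended { RAM_DISK, SSD, DISK, ARCHIVE, PROVIDED, NVDIMM }

  public static void main(String[] args) {
    // Suppose an old edit log recorded a quota on SSD by its ordinal.
    int persisted = Old.SSD.ordinal();                 // 1

    System.out.println(Inserted.values()[persisted]);  // NVDIMM -- wrong type
    System.out.println(Appended.values()[persisted]);  // SSD    -- still correct
  }
}
```

This is the trade-off the review settles on: append NVDIMM at the end (giving up the strict fast-to-slow ordering the removed comment described) and guard setQuota() separately, rather than keep the speed ordering and break ordinal-keyed records.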
[jira] [Created] (HDFS-15658) Improve datanode capability balancing
chuanjie.duan created HDFS-15658:

Summary: Improve datanode capability balancing
Key: HDFS-15658
URL: https://issues.apache.org/jira/browse/HDFS-15658
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Reporter: chuanjie.duan

How about adjusting the order used when choosing a replica to delete? Is there another reason for preferring "oldestHeartbeatStorage" first?

public DatanodeStorageInfo chooseReplicaToDelete(
    Collection<DatanodeStorageInfo> moreThanOne,
    Collection<DatanodeStorageInfo> exactlyOne,
    final List<StorageType> excessTypes,
    Map<String, List<DatanodeStorageInfo>> rackMap) {
  ..
  final DatanodeStorageInfo storage;
  if (minSpaceStorage != null) {
    storage = minSpaceStorage;
  } else if (oldestHeartbeatStorage != null) {
    storage = oldestHeartbeatStorage;
  } else {
    return null;
  }
  excessTypes.remove(storage.getStorageType());
  return storage;
}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
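For context: the default block placement policy prefers deleting the replica on the storage with the oldest heartbeat (a node that may be dead or overloaded), falling back to the storage with the least remaining space, while the snippet above swaps that preference so capacity balancing wins. A rough standalone sketch of the two heuristics, using a hypothetical Candidate type rather than the real DatanodeStorageInfo API (requires Java 16+ for records):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustration only; Candidate is a hypothetical stand-in for DatanodeStorageInfo.
public class ReplicaDeletionChoice {

  record Candidate(String node, long lastHeartbeatMs, long remainingBytes) {}

  // Default-style preference: delete from the node heard from least recently.
  static Optional<Candidate> byOldestHeartbeat(List<Candidate> replicas) {
    return replicas.stream()
        .min(Comparator.comparingLong(Candidate::lastHeartbeatMs));
  }

  // Proposed-style preference: delete from the node with the least free space,
  // which tends to even out capacity across datanodes.
  static Optional<Candidate> byMinRemainingSpace(List<Candidate> replicas) {
    return replicas.stream()
        .min(Comparator.comparingLong(Candidate::remainingBytes));
  }

  public static void main(String[] args) {
    List<Candidate> replicas = List.of(
        new Candidate("dn1", 1_000L, 50L << 30),  // stale heartbeat, plenty of space
        new Candidate("dn2", 9_000L, 2L << 30));  // fresh heartbeat, nearly full

    System.out.println(byOldestHeartbeat(replicas).get().node());    // dn1
    System.out.println(byMinRemainingSpace(replicas).get().node());  // dn2
  }
}
```

The question above is essentially which risk the policy should prioritize: deleting from the oldest-heartbeat node first avoids counting a replica on a possibly dying node, while deleting from the fullest node first relieves capacity pressure sooner.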
[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop
[ https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505578=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505578 ]

ASF GitHub Bot logged work on HDFS-15624:
-
Author: ASF GitHub Bot
Created on: 28/Oct/20 06:38
Start Date: 28/Oct/20 06:38
Worklog Time Spent: 10m
Work Description: huangtianhua commented on a change in pull request #2377:
URL: https://github.com/apache/hadoop/pull/2377#discussion_r513209898

## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeLayoutVersion.java
##
@@ -89,7 +89,8 @@ public static boolean supports(final LayoutFeature f, final int lv) {
 APPEND_NEW_BLOCK(-62, -61, "Support appending to new block"),
 QUOTA_BY_STORAGE_TYPE(-63, -61, "Support quota for specific storage types"),
 ERASURE_CODING(-64, -61, "Support erasure coding"),
-EXPANDED_STRING_TABLE(-65, -61, "Support expanded string table in fsimage");
+EXPANDED_STRING_TABLE(-65, -61, "Support expanded string table in fsimage"),
+NVDIMM_SUPPORT(-66, -66, "Support NVDIMM storage type");

Review comment: As the comment above says: if the feature cannot satisfy compatibility with any prior version, then set its minimum compatible layout version to itself to indicate that downgrade is impossible. Or maybe we missed something?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 505578)
Time Spent: 3h 50m (was: 3h 40m)

> Fix the SetQuotaByStorageTypeOp problem after updating hadoop
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: YaYun Wang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal()
> values of the StorageType enum. Setting a quota by storage type depends on
> ordinal(), so quota settings may become invalid after an upgrade.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
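To make the downgrade rule concrete: NameNode layout versions are negative integers that grow more negative as features are added, and each feature records both its own layout version and the minimum compatible layout version it tolerates being downgraded to. Below is a rough sketch of those semantics as the reviewer describes them; the helper method is hypothetical, not Hadoop's actual supports() implementation.

```java
// Hypothetical sketch of layout-version downgrade semantics; not Hadoop code.
public class LayoutDowngradeDemo {

  // (layoutVersion, minCompatLayoutVersion): both negative; more negative = newer.
  enum Feature {
    EXPANDED_STRING_TABLE(-65, -61),  // downgrade as far back as -61 is allowed
    NVDIMM_SUPPORT(-66, -66);         // minCompat == own version: no downgrade

    final int layoutVersion;
    final int minCompatLayoutVersion;

    Feature(int lv, int minCompatLv) {
      this.layoutVersion = lv;
      this.minCompatLayoutVersion = minCompatLv;
    }
  }

  // A downgrade to targetLv is safe only if no on-disk feature requires a
  // newer (more negative) layout version than targetLv.
  static boolean canDowngradeTo(int targetLv, Feature... onDisk) {
    for (Feature f : onDisk) {
      if (targetLv > f.minCompatLayoutVersion) {
        return false;  // target is older than this feature tolerates
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(canDowngradeTo(-61, Feature.EXPANDED_STRING_TABLE)); // true
    System.out.println(canDowngradeTo(-61, Feature.NVDIMM_SUPPORT));        // false
  }
}
```

Under this reading, giving NVDIMM_SUPPORT the pair (-66, -66) is the conventional way to declare that no downgrade is possible while the feature is on disk, which is what the reply above argues.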