[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506058
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 04:49
Start Date: 29/Oct/20 04:49
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513966454



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
##
@@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int 
range,
   dnIdxToDie = getDataNodeToKill(filePath);
   DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
   shutdownDataNode(dnToDie);
+  // wait enough time for the locations to be updated.
+  Thread.sleep(STALE_INTERVAL);

Review comment:
   I could reproduce the failure even without `-Pparallel-tests`:
   ```
   $ pwd
   /home/aajisaka/hadoop/hadoop-hdfs-project/hadoop-hdfs
   $ mvn test -Dtest=TestFileChecksum -Pnative
   ```
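   
   As an aside on the `Thread.sleep(STALE_INTERVAL)` wait shown in the diff above: a condition-based wait is usually less flaky than a fixed sleep. A minimal sketch, assuming Hadoop's `GenericTestUtils.waitFor` test helper; `locationsExcludeDeadNode()` is a hypothetical predicate standing in for whatever "locations updated" check the test actually needs:
   ```java
   import org.apache.hadoop.test.GenericTestUtils;
   
   // After shutdownDataNode(dnToDie): poll every 100 ms, for at most
   // STALE_INTERVAL ms, instead of always sleeping the full interval.
   // locationsExcludeDeadNode() is hypothetical, not part of the test.
   GenericTestUtils.waitFor(() -> locationsExcludeDeadNode(filePath),
       100, (int) STALE_INTERVAL);
   ```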





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 506058)
Time Spent: 3h 40m  (was: 3.5h)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, 
> org.apache.hadoop.hdfs.TestFileChecksum-output.txt, 
> org.apache.hadoop.hdfs.TestFileChecksum.txt
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> 

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506054
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 04:40
Start Date: 29/Oct/20 04:40
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513963136



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
##
@@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int 
range,
   dnIdxToDie = getDataNodeToKill(filePath);
   DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
   shutdownDataNode(dnToDie);
+  // wait enough time for the locations to be updated.
+  Thread.sleep(STALE_INTERVAL);

Review comment:
   I could reproduce the failure locally:
   ```
   $ ./start-build-env.sh
   $ mvn clean install -DskipTests -Pnative
   $ cd hadoop-hdfs-project/hadoop-hdfs
   $ mvn test -Pnative -Pparallel-tests
   ```
   Attached the stdout to the JIRA: 
https://issues.apache.org/jira/secure/attachment/13014321/org.apache.hadoop.hdfs.TestFileChecksum-output.txt





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 506054)
Time Spent: 3.5h  (was: 3h 20m)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, 
> org.apache.hadoop.hdfs.TestFileChecksum-output.txt, 
> org.apache.hadoop.hdfs.TestFileChecksum.txt
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> 

[jira] [Updated] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15643:
-
Attachment: org.apache.hadoop.hdfs.TestFileChecksum.txt

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, 
> org.apache.hadoop.hdfs.TestFileChecksum-output.txt, 
> org.apache.hadoop.hdfs.TestFileChecksum.txt
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> {code:bash}
> Error Message
> `/striped/stripedFileChecksum1': Fail to get block checksum for 
> LocatedStripedBlock{BP-1299291876-172.17.0.3-1602771356932:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:42217,DS-6c29e4b7-e4f1-4302-ad23-fb078f37d783,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41307,DS-3d824f14-3cd0-46b1-bef1-caa808bf278d,DISK],
>  
> 

[jira] [Updated] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15643:
-
Attachment: org.apache.hadoop.hdfs.TestFileChecksum-output.txt

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, 
> org.apache.hadoop.hdfs.TestFileChecksum-output.txt
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> {code:bash}
> Error Message
> `/striped/stripedFileChecksum1': Fail to get block checksum for 
> LocatedStripedBlock{BP-1299291876-172.17.0.3-1602771356932:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:42217,DS-6c29e4b7-e4f1-4302-ad23-fb078f37d783,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41307,DS-3d824f14-3cd0-46b1-bef1-caa808bf278d,DISK],
>  
> 

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506034&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506034
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 03:38
Start Date: 29/Oct/20 03:38
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2421:
URL: https://github.com/apache/hadoop/pull/2421#issuecomment-718338898


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 14s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 11s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m  3s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m  0s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 44s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m  7s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  59m 22s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2421/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  asflicense  |   0m 37s | 
[/patch-asflicense-problems.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2421/1/artifact/out/patch-asflicense-problems.txt)
 |  The patch generated 4 ASF License warnings.  |
   |  |   | 145m 17s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
   |   | hadoop.hdfs.TestDFSStorageStateRecovery |
   |   | hadoop.hdfs.TestSafeModeWithStripedFile |
   |   | hadoop.hdfs.TestFileCreationClient |
   |   | hadoop.hdfs.tools.TestDFSZKFailoverController |
   |   | hadoop.hdfs.TestErasureCodingPolicies |
   |   | hadoop.hdfs.TestDecommissionWithStriped |
   |   | hadoop.hdfs.TestMiniDFSCluster |
   |   | hadoop.hdfs.TestMultipleNNPortQOP |
   |   | hadoop.hdfs.TestDFSStripedInputStream |
   |   | hadoop.hdfs.TestFileAppend2 |
   |   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
   |   | hadoop.hdfs.TestDistributedFileSystem |
   |   | hadoop.hdfs.TestDatanodeDeath |
   |   | hadoop.hdfs.TestErasureCodingMultipleRacks |
   |   | hadoop.hdfs.TestSnapshotCommands |
   |   | hadoop.hdfs.TestReadStripedFileWithDecoding |
   |   | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.TestDFSClientSocketSize |
   |   | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
  

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506020
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 02:49
Start Date: 29/Oct/20 02:49
Worklog Time Spent: 10m 
  Work Description: amahussein commented on pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#issuecomment-718325314


   > I think the `keepLongStdio` option can be used: 
   > https://www.jenkins.io/doc/pipeline/steps/junit/
   > The option can be enabled by updating the `./Jenkinsfile` as follows:
   > 
   > ```diff
   > -junit "${env.SOURCEDIR}/**/target/surefire-reports/*.xml"
   > +junit keepLongStdio: true, testResults: 
"${env.SOURCEDIR}/**/target/surefire-reports/*.xml"
   > ```
   
   Thanks @aajisaka !
   I am going to try it out.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 506020)
Time Spent: 3h 10m  (was: 3h)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> 

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=506006&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506006
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 02:11
Start Date: 29/Oct/20 02:11
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#issuecomment-718313974


   > do you guys know if it is possible to see the full logs of the unit test?
   
   I think the `keepLongStdio` option can be used: 
   https://www.jenkins.io/doc/pipeline/steps/junit/
   The option can be enabled by updating the `./Jenkinsfile` as follows:
   ```diff
   -junit "${env.SOURCEDIR}/**/target/surefire-reports/*.xml"
   +junit keepLongStdio: true, testResults: 
"${env.SOURCEDIR}/**/target/surefire-reports/*.xml"
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 506006)
Time Spent: 3h  (was: 2h 50m)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> 

[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=506004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506004
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 02:04
Start Date: 29/Oct/20 02:04
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#issuecomment-718312065


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 18s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  trunk passed with JDK 
Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 10s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK 
Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10  |
   | +0 :ok: |  spotbugs  |   3m  4s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m  2s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1  |
   | +1 :green_heart: |  javac  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 30s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  the patch passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  |  the patch passed with JDK 
Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10  |
   | +1 :green_heart: |  findbugs  |   3m  4s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  95m 19s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 45s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 184m 30s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR |
   |   | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.server.datanode.TestBPOfferService |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.TestGetFileChecksum |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2419 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 65b9eeb7a07d 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / bab5bf9743f |
   | Default Java | Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
   | Multi-JDK versions | 

[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505997&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505997
 ]

ASF GitHub Bot logged work on HDFS-15624:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 01:50
Start Date: 29/Oct/20 01:50
Worklog Time Spent: 10m 
  Work Description: huangtianhua removed a comment on pull request #2377:
URL: https://github.com/apache/hadoop/pull/2377#issuecomment-718304736


   @brahmareddybattula Or maybe we can change the code to keep the ordinals of 
the pre-existing storage types first?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505997)
Time Spent: 4h 50m  (was: 4h 40m)

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values 
> of the StorageType enum. Setting a quota by storage type depends on the 
> ordinal(), so existing quota settings may become invalid after an 
> upgrade.
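
A minimal illustration of the ordinal hazard described above, using hypothetical stand-in enums rather than the actual StorageType source: values persisted as raw ordinal() stay valid only if new constants are appended after the existing ones, which is what the suggestion above amounts to.
{code:java}
// Hypothetical stand-ins for StorageType before and after the upgrade;
// not the Hadoop source.
enum StorageTypeV1 { RAM_DISK, SSD, DISK, ARCHIVE }          // DISK.ordinal() == 2
enum StorageTypeV2 { RAM_DISK, SSD, DISK, ARCHIVE, NVDIMM }  // DISK.ordinal() still 2

public class OrdinalDemo {
  public static void main(String[] args) {
    // A quota entry persisted under V1 as the raw ordinal 2...
    int persisted = StorageTypeV1.DISK.ordinal();
    // ...still resolves to DISK under V2 because NVDIMM was appended last.
    System.out.println(StorageTypeV2.values()[persisted]); // DISK
    // Had NVDIMM been inserted before DISK, values()[2] would silently
    // resolve to the wrong storage type after the upgrade.
  }
}
{code}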



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505996
 ]

ASF GitHub Bot logged work on HDFS-15624:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 01:48
Start Date: 29/Oct/20 01:48
Worklog Time Spent: 10m 
  Work Description: huangtianhua removed a comment on pull request #2377:
URL: https://github.com/apache/hadoop/pull/2377#issuecomment-718303977


   @brahmareddybattula , OK, thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505996)
Time Spent: 4h 40m  (was: 4.5h)

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values 
> of the StorageType enum. Setting a quota by storage type depends on the 
> ordinal(), so existing quota settings may become invalid after an 
> upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread huangtianhua (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222654#comment-17222654
 ] 

huangtianhua commented on HDFS-15624:
-

[~ayushtkn], hi, I agree with you because the code is almost ready. Would you 
please help review it too? https://github.com/apache/hadoop/pull/2377

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values 
> of the StorageType enum. Setting a quota by storage type depends on the 
> ordinal(), so existing quota settings may become invalid after an 
> upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505995&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505995
 ]

ASF GitHub Bot logged work on HDFS-15624:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 01:39
Start Date: 29/Oct/20 01:39
Worklog Time Spent: 10m 
  Work Description: huangtianhua commented on pull request #2377:
URL: https://github.com/apache/hadoop/pull/2377#issuecomment-718304736


   @brahmareddybattula Or maybe we can change the code to keep the ordinals of 
the pre-existing storage types first?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505995)
Time Spent: 4.5h  (was: 4h 20m)

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values 
> of the StorageType enum. Setting a quota by storage type depends on the 
> ordinal(), so existing quota settings may become invalid after an 
> upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505994&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505994
 ]

ASF GitHub Bot logged work on HDFS-15624:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 01:36
Start Date: 29/Oct/20 01:36
Worklog Time Spent: 10m 
  Work Description: huangtianhua commented on pull request #2377:
URL: https://github.com/apache/hadoop/pull/2377#issuecomment-718303977


   @brahmareddybattula , OK, thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505994)
Time Spent: 4h 20m  (was: 4h 10m)

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values 
> of the StorageType enum. Setting a quota by storage type depends on the 
> ordinal(), so existing quota settings may become invalid after an 
> upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505986&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505986
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 01:11
Start Date: 29/Oct/20 01:11
Worklog Time Spent: 10m 
  Work Description: amahussein opened a new pull request #2421:
URL: https://github.com/apache/hadoop/pull/2421


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505986)
Time Spent: 2h 50m  (was: 2h 40m)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> 

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505969
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:37
Start Date: 28/Oct/20 23:37
Worklog Time Spent: 10m 
  Work Description: amahussein commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513824033



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
##
@@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int 
range,
   dnIdxToDie = getDataNodeToKill(filePath);
   DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
   shutdownDataNode(dnToDie);
+  // wait enough time for the locations to be updated.
+  Thread.sleep(STALE_INTERVAL);

Review comment:
   Thanks @goiri 
   Those are the logs I was looking at. All logs in `TestFileChecksum` and 
`TestFileChecksumCompositeCrc` truncate the last 9 seconds prior to the failure.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505969)
Time Spent: 2h 40m  (was: 2.5h)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> 

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505966&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505966
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:31
Start Date: 28/Oct/20 23:31
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513822007



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
##
@@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int 
range,
   dnIdxToDie = getDataNodeToKill(filePath);
   DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
   shutdownDataNode(dnToDie);
+  // wait enough time for the locations to be updated.
+  Thread.sleep(STALE_INTERVAL);

Review comment:
   I see there is truncation there too though.
   We may want to make the test a little less verbose.
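   
   One way to do that (a sketch, not what this PR changes) is to raise the log level of the noisiest components for the duration of the test, using Hadoop's `GenericTestUtils.setLogLevel` helper; the choice of the DataNode logger below is an assumption about where most of the volume comes from:
   ```java
   import org.apache.hadoop.hdfs.server.datanode.DataNode;
   import org.apache.hadoop.test.GenericTestUtils;
   import org.slf4j.event.Level;
   
   // e.g. in a @BeforeClass hook: quiet the DataNode logger so surefire
   // output is not truncated before the interesting failure window.
   GenericTestUtils.setLogLevel(DataNode.LOG, Level.WARN);
   ```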





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505966)
Time Spent: 2.5h  (was: 2h 20m)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> 

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505965=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505965
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:29
Start Date: 28/Oct/20 23:29
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513821613



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
##
@@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int 
range,
   dnIdxToDie = getDataNodeToKill(filePath);
   DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
   shutdownDataNode(dnToDie);
+  // wait enough time for the locations to be updated.
+  Thread.sleep(STALE_INTERVAL);

Review comment:
   I am not sure exactly which test you are interested in, but here you can see 
the full logs of one of the failed tests:
   
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2408/2/testReport/org.apache.hadoop.hdfs/TestFileChecksum/testStripedFileChecksumWithMissedDataBlocksRangeQuery11/
   
   Here are all the tests:
   
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2408/2/testReport/org.apache.hadoop.hdfs/





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505965)
Time Spent: 2h 20m  (was: 2h 10m)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at 

[jira] [Commented] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222606#comment-17222606
 ] 

Íñigo Goiri commented on HDFS-15654:


Thanks [~ahussein] for the fix.
Merged PR 2419.

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}}  is flaky. It fails 
> randomly when the 
> following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to synchronize 
> between concurrent threads. The code below is causing the 
> non-deterministic execution.
> On a slow server, {{addNewBlockThread}} may not be done by the time the main 
> thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>   if(count.get() > 2000) {
> return true;
>   }
>   return false;
> }, 100, 1); // increase that waiting time to 10 seconds.
>   } catch (Exception e) {
> e.printStackTrace();
>   }
> }
>   });
> {code}
> {code:bash}
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> 
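
Taken together, the fix described above is simply to join the producer thread 
before asserting. A minimal sketch of the corrected ordering, reusing the names 
from the description (a sketch of the pattern, not the exact test code):

{code:java}
addNewBlockThread.start();
try {
  // Join the block-generating thread BEFORE asserting, so the report
  // counters cannot be read while blocks are still being added.
  addNewBlockThread.join();
  // Verify FBR/IBR count is equal to the generated number.
  assertTrue(fullBlockReportCount == totalTestBlocks ||
      incrBlockReportCount == totalTestBlocks);
} finally {
  bpos.stop();
  bpos.join();
}
{code}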

[jira] [Assigned] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri reassigned HDFS-15654:
--

Assignee: Ahmed Hussein

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}}  is flaky. It fails 
> randomly when the 
> following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to synchronize 
> between concurrent threads. The code below is causing the 
> non-deterministic execution.
> On a slow server, {{addNewBlockThread}} may not be done by the time the main 
> thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>   if(count.get() > 2000) {
> return true;
>   }
>   return false;
> }, 100, 1); // increase that waiting time to 10 seconds.
>   } catch (Exception e) {
> e.printStackTrace();
>   }
> }
>   });
> {code}
> {code:bash}
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> 

[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505964=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505964
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:24
Start Date: 28/Oct/20 23:24
Worklog Time Spent: 10m 
  Work Description: goiri merged pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505964)
Time Spent: 1.5h  (was: 1h 20m)

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}}  is flaky. It fails 
> randomly when the 
> following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to synchronize 
> between concurrent threads. The code below is causing the 
> non-deterministic execution.
> On a slow server, {{addNewBlockThread}} may not be done by the time the main 
> thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>   if(count.get() > 2000) {
> return true;
>   }
>   return false;
> }, 100, 1); // increase that waiting time to 10 seconds.
>   } catch (Exception e) {
> e.printStackTrace();
>   }
> }
>   });
> {code}
> {code:bash}
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at 

[jira] [Resolved] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri resolved HDFS-15654.

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}}  is flaky. It fails 
> randomly when the 
> following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to synchronize 
> between concurrent threads. The code below is causing the 
> non-deterministic execution.
> On a slow server, {{addNewBlockThread}} may not be done by the time the main 
> thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>   if(count.get() > 2000) {
> return true;
>   }
>   return false;
> }, 100, 1); // increase that waiting time to 10 seconds.
>   } catch (Exception e) {
> e.printStackTrace();
>   }
> }
>   });
> {code}
> {code:bash}
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> 

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505962=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505962
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:13
Start Date: 28/Oct/20 23:13
Worklog Time Spent: 10m 
  Work Description: amahussein commented on pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#issuecomment-718261099


   @goiri and @aajisaka, do you guys know if it is possible to see the full 
logs of the unit test?
   The Yetus console and test reports show truncated logs. So, I cannot see the 
sequence of events that leads to the Exception and the stack trace. I cannot 
reproduce the failure locally either :/
   
   ```
   ...[truncated 896674 chars]...
   sed]. Total timeout mills is 48, 479813 millis timeout left.
at 
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:351)
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505962)
Time Spent: 2h 10m  (was: 2h)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> 

[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505961=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505961
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:08
Start Date: 28/Oct/20 23:08
Worklog Time Spent: 10m 
  Work Description: amahussein commented on pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#issuecomment-718259197


   > This LGTM. @amahussein good to merge?
   
   yes, this is good to go. Thanks @goiri 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505961)
Time Spent: 1h 20m  (was: 1h 10m)

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}}  is flaky. It fails 
> randomly when the 
> following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to synchronize 
> between concurrent threads. The code below is causing the 
> non-deterministic execution.
> On a slow server, {{addNewBlockThread}} may not be done by the time the main 
> thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>   if(count.get() > 2000) {
> return true;
>   }
>   return false;
> }, 100, 1); // increase that waiting time to 10 seconds.
>   } catch (Exception e) {
> e.printStackTrace();
>   }
> }
>   });
> {code}
> {code:bash}
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at 

[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505959=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505959
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 23:06
Start Date: 28/Oct/20 23:06
Worklog Time Spent: 10m 
  Work Description: goiri commented on pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#issuecomment-718258543


   This LGTM. @amahussein good to merge?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505959)
Time Spent: 1h 10m  (was: 1h)

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}}  is flaky. It fails 
> randomly when the 
> following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to synchronize 
> between concurrent threads. The code below is causing the 
> non-deterministic execution.
> On a slow server, {{addNewBlockThread}} may not be done by the time the main 
> thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs to have a longer wait_time too. 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>   if(count.get() > 2000) {
> return true;
>   }
>   return false;
> }, 100, 1); // increase that waiting time to 10 seconds.
>   } catch (Exception e) {
> e.printStackTrace();
>   }
> }
>   });
> {code}
> {code:bash}
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at 

[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505863=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505863
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 18:48
Start Date: 28/Oct/20 18:48
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#issuecomment-718136892


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  trunk passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 22s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  trunk passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10  |
   | +0 :ok: |  spotbugs  |   3m  7s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m  4s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1  |
   | +1 :green_heart: |  javac  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 14s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 19s |  |  the patch passed with JDK 
Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10  |
   | +1 :green_heart: |  findbugs  |   3m  4s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  95m 33s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 184m 48s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2419 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 64d68b4245d3 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b3ba74d72df |
   | Default Java | Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
   |  Test Results | 

[jira] [Commented] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1773#comment-1773
 ] 

Ayush Saxena commented on HDFS-15624:
-

HDFS-15660 is handling a different issue; the things getting fixed here are quite 
specific to NVDIMM, like the broken FsImage compatibility due to the change in the 
ordinal of StorageType, which is a rolling-upgrade issue.

Regarding HDFS-15660: that tends to handle the exception due to a missing 
storage type at the client, where the protobuf response, while decoding the new 
storage type, hits an exception. This would happen with NVDIMM also, in the case 
of a 3.3.0 client and a 3.4.0 server, but that is not something being chased 
here.

No point holding this IMO. I didn't check at what stage the code is now, 
considering Vinay was following it. I think that is almost at a conclusion, and we 
should get this in and sort out the mess from HDFS-15025.

But I am OK holding it as well; in that case we should revert HDFS-15025, so 
that if this doesn't get concluded for any reason, getting rid of it later 
shouldn't be a problem, and it would even prevent someone from backporting the 
original into their internal versions.

[~vinayakumarb] / [~liuml07], let me know if you folks also want to hold it for 
HDFS-15660 (this won't be too quick and is quite different as well); I will revert 
the original by tomorrow EOD and we can track it there in that case.

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() of the 
> StorageType enum. Setting the quota by storage type depends on the 
> ordinal(); therefore, it may cause the quota setting to become invalid after an 
> upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
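
The compatibility hazard discussed above is the classic one with persisting 
ordinal(): inserting a new enum constant shifts the numeric value of every 
constant after it, so data written by old code decodes to the wrong type. A 
minimal, self-contained illustration (the enums here are hypothetical, not the 
actual Hadoop StorageType definition):

{code:java}
enum OldStorageType { RAM_DISK, SSD, DISK, ARCHIVE }
// A new constant inserted in the middle shifts every ordinal after it.
enum NewStorageType { RAM_DISK, NVDIMM, SSD, DISK, ARCHIVE }

public class OrdinalHazard {
  public static void main(String[] args) {
    // Old code persisted the ordinal of SSD, which was 1.
    int persisted = OldStorageType.SSD.ordinal();
    // New code decodes ordinal 1 and gets NVDIMM instead of SSD.
    NewStorageType decoded = NewStorageType.values()[persisted];
    System.out.println(decoded);  // prints NVDIMM
  }
}
{code}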



[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505794=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505794
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 16:25
Start Date: 28/Oct/20 16:25
Worklog Time Spent: 10m 
  Work Description: amahussein commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513584215



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
##
@@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int 
range,
   dnIdxToDie = getDataNodeToKill(filePath);
   DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
   shutdownDataNode(dnToDie);
+  // wait enough time for the locations to be updated.
+  Thread.sleep(STALE_INTERVAL);

Review comment:
   I see. The problem is that I cannot reproduce it on my local machine. 
However, it seems that it fails in a consistent way on Yetus.
   If it is not a real bug, I wonder if the volume scanner could be a factor in 
randomly slowing down the DNs. I see many log messages from the volume scanner 
when I run locally.
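
   If the volume scanner is a suspect, it can be ruled out by disabling it in 
   the mini cluster configuration. A small sketch, assuming the standard 
   `DFSConfigKeys` constant for `dfs.datanode.scan.period.hours` (a negative 
   value disables the scanner):

   ```java
   // Disable the DataNode block scanner so it cannot slow down the DNs.
   conf.setInt(DFSConfigKeys.DFS_DATANODE_SCAN_PERIOD_HOURS_KEY, -1);
   ```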





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505794)
Time Spent: 2h  (was: 1h 50m)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace in two of them Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> 

[jira] [Updated] (HDFS-15644) Failed volumes can cause DNs to stop block reporting

2020-10-28 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15644:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks!

> Failed volumes can cause DNs to stop block reporting
> 
>
> Key: HDFS-15644
> URL: https://issues.apache.org/jira/browse/HDFS-15644
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: block placement, datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: refactor
> Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3
>
> Attachments: HDFS-15644-branch-2.10.002.patch, HDFS-15644.001.patch, 
> HDFS-15644.002.patch
>
>
> [~daryn] found a corner case where removing failed volumes can cause an NPE in 
> [FsDataSetImpl.getBlockReports()|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1939].
> +Scenario:+
>  * Inside {{Datanode#HandleVolumeFailures()}}, removing a failed volume is a 
> 2-step process.
>  ** First it's removed from the volumes list
>  ** The replicas are only scrubbed from the volume map later
>  * A concurrent thread generating blockReports may access the replicaMap and 
> hit a non-existing VolumeID.
> He made a fix for that and we have been using it on our clusters since 
> Hadoop-2.7.
> By analyzing the code, the bug is still applicable to Trunk.
>  * The path Datanode#removeVolumes() is safe because the two-step process in 
> {{FsDataImpl.removeVolumes()}} 
> [FsDatasetImpl.java#L577|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L577]
>  is protected by {{datasetWriteLock}} .
>  * The path Datanode#handleVolumeFailures() is not safe because the failed 
> volume is removed from the list without acquiring 
> {{datasetWriteLock}}.[FsVolumList#239|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java#L239]
> The race condition can cause the caller of getBlockReports() to throw an NPE if 
> the RUR is referring to a volume that has already been removed 
> [FsDatasetImpl.java#L1976|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1976].
> {code:java}
> case RUR:
>   ReplicaInfo orig = b.getOriginalReplica();
>   builders.get(volStorageID).add(orig);
>   break;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
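
The underlying fix is to make the two-step removal atomic with respect to 
readers such as getBlockReports(). A minimal sketch of the locking pattern 
described above; the field and method names are illustrative, not the exact 
Hadoop internals:

{code:java}
// Guard both steps with the dataset write lock so that a concurrent
// getBlockReports() can never observe a replica whose volume is gone.
try (AutoCloseableLock lock = datasetWriteLock.acquire()) {
  volumes.remove(failedVolume);                       // step 1: drop the volume
  replicaMap.removeAll(failedVolume.getStorageID());  // step 2: scrub its replicas
}
{code}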



[jira] [Updated] (HDFS-15644) Failed volumes can cause DNs to stop block reporting

2020-10-28 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15644:
---
Fix Version/s: 2.10.2

> Failed volumes can cause DNs to stop block reporting
> 
>
> Key: HDFS-15644
> URL: https://issues.apache.org/jira/browse/HDFS-15644
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: block placement, datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: refactor
> Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3
>
> Attachments: HDFS-15644-branch-2.10.002.patch, HDFS-15644.001.patch, 
> HDFS-15644.002.patch
>
>
> [~daryn] found a corner case where removing failed volumes can cause an NPE in 
> [FsDataSetImpl.getBlockReports()|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1939].
> +Scenario:+
>  * Inside {{Datanode#HandleVolumeFailures()}}, removing a failed volume is a 
> 2-step process.
>  ** First it's removed from the volumes list
>  ** The replicas are only scrubbed from the volume map later
>  * A concurrent thread generating blockReports may access the replicaMap and 
> hit a non-existing VolumeID.
> He made a fix for that and we have been using it on our clusters since 
> Hadoop-2.7.
> By analyzing the code, the bug is still applicable to Trunk.
>  * The path Datanode#removeVolumes() is safe because the two-step process in 
> {{FsDataImpl.removeVolumes()}} 
> [FsDatasetImpl.java#L577|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L577]
>  is protected by {{datasetWriteLock}} .
>  * The path Datanode#handleVolumeFailures() is not safe because the failed 
> volume is removed from the list without acquiring 
> {{datasetWriteLock}}.[FsVolumList#239|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java#L239]
> The race condition can cause the caller of getBlockReports() to throw an NPE if 
> the RUR is referring to a volume that has already been removed 
> [FsDatasetImpl.java#L1976|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1976].
> {code:java}
> case RUR:
>   ReplicaInfo orig = b.getOriginalReplica();
>   builders.get(volStorageID).add(orig);
>   break;
> {code}
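> For illustration, a minimal self-contained sketch (hypothetical classes, not 
> Hadoop's actual FsVolumeList/ReplicaMap): performing both removal steps under 
> one write lock, while block-report readers take the read lock, makes it 
> impossible to observe a replica whose volume is already gone.
> {code:java}
> import java.util.ArrayList;
> import java.util.HashMap;
> import java.util.List;
> import java.util.Map;
> import java.util.concurrent.locks.ReentrantReadWriteLock;
>
> // Illustration only: stand-ins for the volume list and replica map.
> class VolumeRegistry {
>   private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
>   private final List<String> volumes = new ArrayList<>();
>   private final Map<String, String> replicaToVolume = new HashMap<>();
>
>   // Both steps are atomic w.r.t. readers, unlike the 2-step removal in
>   // handleVolumeFailures() described above.
>   void removeFailedVolume(String volumeId) {
>     lock.writeLock().lock();
>     try {
>       volumes.remove(volumeId);                            // step 1: volume list
>       replicaToVolume.values().removeIf(volumeId::equals); // step 2: replica map
>     } finally {
>       lock.writeLock().unlock();
>     }
>   }
>
>   // Analogue of getBlockReports(): groups replicas by volume.
>   Map<String, List<String>> blockReports() {
>     lock.readLock().lock();
>     try {
>       Map<String, List<String>> report = new HashMap<>();
>       for (String v : volumes) {
>         report.put(v, new ArrayList<>());
>       }
>       for (Map.Entry<String, String> e : replicaToVolume.entrySet()) {
>         // Safe: a replica's volume is always present while the lock is held.
>         report.get(e.getValue()).add(e.getKey());
>       }
>       return report;
>     } finally {
>       lock.readLock().unlock();
>     }
>   }
> }
> {code}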



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15644) Failed volumes can cause DNs to stop block reporting

2020-10-28 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1754#comment-1754
 ] 

Wei-Chiu Chuang commented on HDFS-15644:


I tried to verify that the tests pass, but TestFsDatasetImpl#testReportBadBlocks 
times out consistently. Then I realized it times out consistently even 
without this patch; I'll keep digging into it.

+1 for the branch-2.10 patch.

> Failed volumes can cause DNs to stop block reporting
> 
>
> Key: HDFS-15644
> URL: https://issues.apache.org/jira/browse/HDFS-15644
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: block placement, datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: refactor
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HDFS-15644-branch-2.10.002.patch, HDFS-15644.001.patch, 
> HDFS-15644.002.patch
>
>
> [~daryn] found a corner case where removing failed volumes can cause an NPE in 
> [FsDatasetImpl.getBlockReports()|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1939].
> +Scenario:+
>  * Inside {{Datanode#handleVolumeFailures()}}, removing a failed volume is a 
> 2-step process.
>  ** First it's removed from the volumes list.
>  ** Only later are the replicas scrubbed from the volume map.
>  * A concurrent thread generating block reports may therefore look up a 
> volume ID in the replicaMap that no longer exists.
> He made a fix for that, and we have been using it on our clusters since 
> Hadoop 2.7. Analyzing the code shows the bug is still present on trunk.
>  * The path Datanode#removeVolumes() is safe because the two-step process in 
> {{FsDatasetImpl.removeVolumes()}} 
> [FsDatasetImpl.java#L577|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L577]
>  is protected by {{datasetWriteLock}}.
>  * The path Datanode#handleVolumeFailures() is not safe because the failed 
> volume is removed from the list without acquiring {{datasetWriteLock}}. 
> [FsVolumeList#L239|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java#L239]
> The race condition can cause the caller of getBlockReports() to throw an NPE 
> if the RUR (replica under recovery) refers to a volume that has already been 
> removed 
> [FsDatasetImpl.java#L1976|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1976].
> {code:java}
> case RUR:
>   ReplicaInfo orig = b.getOriginalReplica();
>   builders.get(volStorageID).add(orig);
>   break;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505788=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505788
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 16:16
Start Date: 28/Oct/20 16:16
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#discussion_r513577757



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
##
@@ -318,45 +317,41 @@ public void blockUtilSendFullBlockReport() {
 count.addAndGet(1);
 Thread.sleep(1);
   } catch (Exception e) {
-e.printStackTrace();
+LOG.error("error addNewBlockThread", e);
   }
 }
   });
   addNewBlockThread.start();
 
   // Make sure that generate blocks for DataNode and IBR not empty now.
-  GenericTestUtils.waitFor(() -> {
-if(count.get() > 0) {
-  return true;
-}
-return false;
-  }, 100, 1000);
+  GenericTestUtils.waitFor(() -> count.get() > 0, 100, 1000);
 
   // Trigger re-register using DataNode Command.
   datanodeCommands[0] = new DatanodeCommand[]{RegisterCommand.REGISTER};
-  bpos.triggerHeartbeatForTests();
 
+  bpos.triggerHeartbeatForTests();
+  addNewBlockThread.join();
+  addNewBlockThread = null;
+  // Verify FBR/IBR count is equal to generate number.
   try {
-GenericTestUtils.waitFor(() -> {
-  if(fullBlockReportCount == totalTestBlocks ||
-  incrBlockReportCount == totalTestBlocks) {
-return true;
-  }
-  return false;
-}, 1000, 15000);
-  } catch (Exception e) {}
+GenericTestUtils.waitFor(() ->
+(fullBlockReportCount == totalTestBlocks ||
+incrBlockReportCount == totalTestBlocks), 1000, 15000);
+  } catch (Exception e) {
+LOG.error("Timed out wait for IBR counts FBRCount = {},"
++ " IBRCount = {}; expected = {}",
+fullBlockReportCount, incrBlockReportCount, totalTestBlocks);
+Assert.fail();

Review comment:
   And do the static import of fail() as the other assertEquals() and so on.
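   A sketch of the suggested change (the fields are those from the test above):
   ```java
   // Static import, matching the existing assertEquals() imports.
   import static org.junit.Assert.fail;

   // ...inside the catch block, fold the log message into the failure:
   fail(String.format(
       "Timed out waiting for block reports: FBRCount = %d, IBRCount = %d;"
           + " expected = %d",
       fullBlockReportCount, incrBlockReportCount, totalTestBlocks));
   ```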





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505788)
Time Spent: 50m  (was: 40m)

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}} is flaky. It fails 
> randomly when the following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to 
> synchronize between concurrent threads. The code below is causing the 
> non-deterministic execution: on a slow server, {{addNewBlockThread}} may not 
> be done by the time the main thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs a longer wait time too; 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>

[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505787=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505787
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 16:13
Start Date: 28/Oct/20 16:13
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#discussion_r513575273



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
##
@@ -318,45 +317,41 @@ public void blockUtilSendFullBlockReport() {
 count.addAndGet(1);
 Thread.sleep(1);
   } catch (Exception e) {
-e.printStackTrace();
+LOG.error("error addNewBlockThread", e);
   }
 }
   });
   addNewBlockThread.start();
 
   // Make sure that generate blocks for DataNode and IBR not empty now.
-  GenericTestUtils.waitFor(() -> {
-if(count.get() > 0) {
-  return true;
-}
-return false;
-  }, 100, 1000);
+  GenericTestUtils.waitFor(() -> count.get() > 0, 100, 1000);
 
   // Trigger re-register using DataNode Command.
   datanodeCommands[0] = new DatanodeCommand[]{RegisterCommand.REGISTER};
-  bpos.triggerHeartbeatForTests();
 
+  bpos.triggerHeartbeatForTests();
+  addNewBlockThread.join();
+  addNewBlockThread = null;
+  // Verify FBR/IBR count is equal to generate number.
   try {
-GenericTestUtils.waitFor(() -> {
-  if(fullBlockReportCount == totalTestBlocks ||
-  incrBlockReportCount == totalTestBlocks) {
-return true;
-  }
-  return false;
-}, 1000, 15000);
-  } catch (Exception e) {}
+GenericTestUtils.waitFor(() ->
+(fullBlockReportCount == totalTestBlocks ||
+incrBlockReportCount == totalTestBlocks), 1000, 15000);
+  } catch (Exception e) {
+LOG.error("Timed out wait for IBR counts FBRCount = {},"
++ " IBRCount = {}; expected = {}",
+fullBlockReportCount, incrBlockReportCount, totalTestBlocks);
+Assert.fail();

Review comment:
   You might want to put the previous log message in the fail().





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505787)
Time Spent: 40m  (was: 0.5h)

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}} is flaky. It fails 
> randomly when the following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to 
> synchronize between concurrent threads. The code below is causing the 
> non-deterministic execution: on a slow server, {{addNewBlockThread}} may not 
> be done by the time the main thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs a longer wait time too; 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>   

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505776
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 15:44
Start Date: 28/Oct/20 15:44
Worklog Time Spent: 10m 
  Work Description: amahussein commented on pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#issuecomment-718021914


   That's trickier than I thought. It needs another iteration.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505776)
Time Spent: 1h 50m  (was: 1h 40m)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace from two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> 

[jira] [Work logged] (HDFS-15643) TestFileChecksumCompositeCrc fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?focusedWorklogId=505763=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505763
 ]

ASF GitHub Bot logged work on HDFS-15643:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 15:32
Start Date: 28/Oct/20 15:32
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2408:
URL: https://github.com/apache/hadoop/pull/2408#discussion_r513541376



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java
##
@@ -575,6 +596,8 @@ private FileChecksum getFileChecksum(String filePath, int 
range,
   dnIdxToDie = getDataNodeToKill(filePath);
   DataNode dnToDie = cluster.getDataNodes().get(dnIdxToDie);
   shutdownDataNode(dnToDie);
+  // wait enough time for the locations to be updated.
+  Thread.sleep(STALE_INTERVAL);

Review comment:
   I am not very close to this part of the code, but there must be ways to 
force the statistics to update.
   Not sure who can help with this part of the code.
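   One possibility, sketched below (hedged: these MiniDFSCluster test hooks 
exist on trunk, but whether they fit this test needs checking), is to mark 
the killed node dead and force heartbeats instead of sleeping:
   ```java
   shutdownDataNode(dnToDie);
   // Expire the node's heartbeat immediately instead of waiting it out.
   cluster.setDataNodeDead(dnToDie.getDatanodeId());
   // Push fresh liveness info so block locations are updated deterministically.
   cluster.triggerHeartbeats();
   ```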





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505763)
Time Spent: 1h 40m  (was: 1.5h)

> TestFileChecksumCompositeCrc fails intermittently
> -
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following is a sample of the stack trace from two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at 

[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505760=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505760
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 15:29
Start Date: 28/Oct/20 15:29
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#discussion_r513538747



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
##
@@ -280,26 +282,24 @@ public void testBasicFunctionality() throws Exception {
*/
   @Test
   public void testMissBlocksWhenReregister() throws Exception {
+

Review comment:
   Remove this extra line added.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505760)
Time Spent: 0.5h  (was: 20m)

> TestBPOfferService#testMissBlocksWhenReregister fails intermittently
> 
>
> Key: HDFS-15654
> URL: https://issues.apache.org/jira/browse/HDFS-15654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{TestBPOfferService.testMissBlocksWhenReregister}} is flaky. It fails 
> randomly when the following expression is not true:
> {code:java}
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> {code}
> There is a race condition here that relies once more on "time" to 
> synchronize between concurrent threads. The code below is causing the 
> non-deterministic execution: on a slow server, {{addNewBlockThread}} may not 
> be done by the time the main thread reaches the assertion call.
> {code:java}
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   addNewBlockThread.join();
>   bpos.stop();
>   bpos.join();
> {code}
> Therefore, the correct implementation should wait for the thread to finish:
> {code:java}
>  // the thread finished execution.
>  addNewBlockThread.join();
>   // Verify FBR/IBR count is equal to generate number.
>   assertTrue(fullBlockReportCount == totalTestBlocks ||
>   incrBlockReportCount == totalTestBlocks);
> } finally {
>   bpos.stop();
>   bpos.join();
> {code}
> {{DataNodeFaultInjector}} needs a longer wait time too; 1 second is 
> not enough to satisfy the condition.
> {code:java}
>   DataNodeFaultInjector.set(new DataNodeFaultInjector() {
> public void blockUtilSendFullBlockReport() {
>   try {
> GenericTestUtils.waitFor(() -> {
>   if(count.get() > 2000) {
> return true;
>   }
>   return false;
> }, 100, 1); // increase that waiting time to 10 seconds.
>   } catch (Exception e) {
> e.printStackTrace();
>   }
> }
>   });
> {code}
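> The underlying pattern, in isolation (a self-contained sketch, not the test 
> itself): joining the producer thread establishes a happens-before edge, so 
> the assertion can never race with the counter updates.
> {code:java}
> import java.util.concurrent.atomic.AtomicInteger;
>
> public class JoinBeforeAssert {
>   public static void main(String[] args) throws InterruptedException {
>     final AtomicInteger produced = new AtomicInteger();
>     Thread producer = new Thread(() -> {
>       for (int i = 0; i < 1000; i++) {
>         produced.incrementAndGet();
>       }
>     });
>     producer.start();
>     producer.join(); // wait for completion; all its writes are now visible
>     // Deterministic: no sleep or timeout needed once the thread has finished.
>     if (produced.get() != 1000) {
>       throw new AssertionError("lost updates: " + produced.get());
>     }
>   }
> }
> {code}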
> {code:bash}
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> 

[jira] [Comment Edited] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222133#comment-17222133
 ] 

Brahma Reddy Battula edited comment on HDFS-15624 at 10/28/20, 12:33 PM:
-

There is an issue dating from when the PROVIDED storage type was introduced 
(https://issues.apache.org/jira/browse/HDFS-15660); the issue can be addressed 
there, so we can hold off until HDFS-15660 is addressed.


was (Author: brahmareddy):
there is an issue provided storage type itself ( 
https://issues.apache.org/jira/browse/HDFS-15660), issue can be addressed there 
itself

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() 
> values of the StorageType enum. Setting a quota by storage type depends on 
> ordinal(), so quota settings may become invalid after an upgrade.
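> A self-contained sketch of the pitfall (hypothetical enums, not Hadoop's 
> actual declaration order): an ordinal persisted before the upgrade decodes 
> to a different constant afterwards.
> {code:java}
> // Version 1 of the enum, as it existed when the value was persisted.
> enum StorageTypeV1 { DISK, SSD, ARCHIVE }
>
> // Version 2 inserts a constant in the middle, shifting later ordinals.
> enum StorageTypeV2 { DISK, NVDIMM, SSD, ARCHIVE }
>
> public class OrdinalPitfall {
>   public static void main(String[] args) {
>     int persisted = StorageTypeV1.SSD.ordinal();     // 1, written pre-upgrade
>     StorageTypeV2 decoded = StorageTypeV2.values()[persisted];
>     System.out.println(decoded);                     // prints NVDIMM, not SSD
>   }
> }
> {code}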



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505709=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505709
 ]

ASF GitHub Bot logged work on HDFS-15624:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 12:32
Start Date: 28/Oct/20 12:32
Worklog Time Spent: 10m 
  Work Description: brahmareddybattula commented on pull request #2377:
URL: https://github.com/apache/hadoop/pull/2377#issuecomment-717902202


   I think you can hold off until HDFS-15660 is addressed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505709)
Time Spent: 4h 10m  (was: 4h)

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() 
> values of the StorageType enum. Setting a quota by storage type depends on 
> ordinal(), so quota settings may become invalid after an upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222133#comment-17222133
 ] 

Brahma Reddy Battula commented on HDFS-15624:
-

There is an issue with the PROVIDED storage type itself 
(https://issues.apache.org/jira/browse/HDFS-15660); the issue can be addressed 
there.

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() 
> values of the StorageType enum. Setting a quota by storage type depends on 
> ordinal(), so quota settings may become invalid after an upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2020-10-28 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222107#comment-17222107
 ] 

Akira Ajisaka edited comment on HDFS-15659 at 10/28/20, 11:29 AM:
--

Thank you [~ayushtkn] for your comment.
{quote}IIRC, a couple of days back you had objections to having this config 
disabled in a test. I didn't follow up, and I lost track of that Jira too. Is 
that concern not there now?
{quote}
The comment is here: 
[https://github.com/apache/hadoop/pull/2404#pullrequestreview-515448731]

Sorry, I've changed my mind. For fixing one test, I thought it was okay to 
increase the number of DNs. However, for fixing many tests, it's easier to 
disable this feature by default rather than to increase the number of DNs for 
each test case.
{quote}So, if this concludes, we can revert that. Increasing datanodes has an 
adverse effect on test performance, though a minor one.
{quote}
Yes, I think we can revert HDFS-15461 if the feature is disabled by default.


was (Author: ajisakaa):
Thank you [~ayushtkn] for your comment.
{quote}IIRC couple of days back you had objections having this config disabled 
in a test. I didn't follow up, and I lost the track of that Jira too. Is that 
concern not there now?
{quote}
The comment is here: 
[https://github.com/apache/hadoop/pull/2404#pullrequestreview-515448731]

Sorry, I've changed my mind. For fixing one test, I thought it's okay to 
increase the number of DNs. However, for fixing many tests, 
 it's easier to disable this feature by default rather than increasing the 
number of DNs for each test case.
{quote}So, If this concludes, We can revert that. Increasing datanodes has 
adverse affect on test performance, though minor
{quote}
Yes, I think we can revert HDFS-15461 if the feature is disabled by default.

> Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
> ---
>
> Key: HDFS-15659
> URL: https://issues.apache.org/jira/browse/HDFS-15659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Priority: Major
>
> dfs.namenode.redundancy.considerLoad is true by default and it is causing 
> many test failures. Let's disable it in MiniDFSCluster.
> Originally reported by [~weichiu]: 
> https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612
> {quote}
> I've certainly seen this option causing test failures in the past.
> Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
> for specific tests.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2020-10-28 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222107#comment-17222107
 ] 

Akira Ajisaka commented on HDFS-15659:
--

Thank you [~ayushtkn] for your comment.
{quote}IIRC, a couple of days back you had objections to having this config 
disabled in a test. I didn't follow up, and I lost track of that Jira too. Is 
that concern not there now?
{quote}
The comment is here: 
[https://github.com/apache/hadoop/pull/2404#pullrequestreview-515448731]

Sorry, I've changed my mind. For fixing one test, I thought it was okay to 
increase the number of DNs. However, for fixing many tests, it's easier to 
disable this feature by default rather than to increase the number of DNs for 
each test case.
{quote}So, if this concludes, we can revert that. Increasing datanodes has an 
adverse effect on test performance, though a minor one.
{quote}
Yes, I think we can revert HDFS-15461 if the feature is disabled by default.
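For reference, disabling it per test looks roughly like this (a sketch; the 
key name is from hdfs-default.xml, and recent trees also expose it as 
DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY):
{code:java}
Configuration conf = new HdfsConfiguration();
// Keep block placement independent of DN load inside the mini cluster.
conf.setBoolean("dfs.namenode.redundancy.considerLoad", false);
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
    .numDataNodes(3)
    .build();
{code}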

> Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
> ---
>
> Key: HDFS-15659
> URL: https://issues.apache.org/jira/browse/HDFS-15659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Priority: Major
>
> dfs.namenode.redundancy.considerLoad is true by default and it is causing 
> many test failures. Let's disable it in MiniDFSCluster.
> Originally reported by [~weichiu]: 
> https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612
> {quote}
> I've certainly seen this option causing test failures in the past.
> Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
> for specific tests.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2020-10-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222090#comment-17222090
 ] 

Ayush Saxena commented on HDFS-15659:
-

[~aajisaka] IIRC, a couple of days back you had objections to having this 
config disabled in a test. I didn't follow up, and I lost track of that Jira 
too. Is that concern not there now?

I think the number of datanodes was increased there instead? If we are doing 
this, I don't think that fix would be required. So, if this concludes, we can 
revert that. Increasing datanodes has an adverse effect on test performance, 
though a minor one.

> Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
> ---
>
> Key: HDFS-15659
> URL: https://issues.apache.org/jira/browse/HDFS-15659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Priority: Major
>
> dfs.namenode.redundancy.considerLoad is true by default and it is causing 
> many test failures. Let's disable it in MiniDFSCluster.
> Originally reported by [~weichiu]: 
> https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612
> {quote}
> I've certainly seen this option causing test failures in the past.
> Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
> for specific tests.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15655) Add option to make balancer prefer to get cold blocks

2020-10-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222088#comment-17222088
 ] 

Hadoop QA commented on HDFS-15655:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
24s{color} |  | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} |  | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch does not contain any @author tags. 
{color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} |  | {color:green} The patch appears to include 1 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
25s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
18s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 18s{color} |  | {color:green} branch has no errors when building and 
testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
13s{color} |  | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
11s{color} |  | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
12s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
13s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch has no blanks issues. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} |  | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 58s{color} |  | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
17s{color} |  | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} || ||
| {color:red}-1{color} | {color:red} unit {color} | 

[jira] [Resolved] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15657.
-
Fix Version/s: 3.4.0
   3.3.1
   Resolution: Fixed

Merged to trunk and branch-3.3. Thanks, [~aajisaka].

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at 
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>  Method)
>   at 
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> 

[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505665=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505665
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 10:13
Start Date: 28/Oct/20 10:13
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717833448


   Thank you for reviewing and merging. Would you backport this to branch-3.3?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505665)
Time Spent: 2h  (was: 1h 50m)

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> 

[jira] [Commented] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6

2020-10-28 Thread Ryan Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222077#comment-17222077
 ] 

Ryan Wu commented on HDFS-15660:


{code:java}
enum StorageTypeProto {
  DISK = 1;
  SSD = 2;
  ARCHIVE = 3;
  RAM_DISK = 4;
  PROVIDED = 5;
}
{code}
The PROVIDED storage type was added in HDFS-10675.
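A self-contained sketch of the failure mode (hypothetical types, not the 
generated protobuf classes): the 2.6-era decoder has no mapping for the new 
wire number, and because the proto2 field is required, the half-built message 
is rejected.
{code:java}
// The 2.6 client's view of the enum: PROVIDED (5) does not exist yet.
enum OldStorageType { DISK, SSD, ARCHIVE, RAM_DISK }

public class UnknownEnumValue {
  // Mimics proto2 decoding of a required enum field by wire number (1-based).
  static OldStorageType decode(int wireNumber) {
    OldStorageType[] known = OldStorageType.values();
    if (wireNumber < 1 || wireNumber > known.length) {
      // proto2 treats an unrecognized value of a required enum field as a
      // missing required field, which surfaces on the client as the
      // UninitializedMessageException in the description below.
      throw new IllegalStateException(
          "Message missing required fields: type (wire value " + wireNumber + ")");
    }
    return known[wireNumber - 1];
  }

  public static void main(String[] args) {
    System.out.println(decode(2)); // SSD: both sides agree
    decode(5);                     // PROVIDED from a 3.x server: fails
  }
}
{code}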

> StorageTypeProto is not compatible between 3.x and 2.6
> ---
>
> Key: HDFS-15660
> URL: https://issues.apache.org/jira/browse/HDFS-15660
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0, 3.1.3
>Reporter: Ryan Wu
>Assignee: Ryan Wu
>Priority: Major
> Fix For: 3.4.0
>
>
> In our case, when the NN had been upgraded to 3.1.3 while the DNs were still 
> on 2.6, we found that when Hive called the getContentSummary method, the 
> client and server were not compatible because Hadoop 3 added the new 
> PROVIDED storage type.
> {code:java}
> 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while 
> invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over 
> x/x:8020. Trying to fail over immediately.
> java.io.IOException: com.google.protobuf.ServiceException: 
> com.google.protobuf.UninitializedMessageException: Message missing required 
> fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>         at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713)
>         at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109)
>         at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
>         at 
> org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
>         at 
> org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
>         at 
> org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
>         at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
>         at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
>         at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>         at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
> Caused by: com.google.protobuf.ServiceException: 
> com.google.protobuf.UninitializedMessageException: Message missing required 
> fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272)
>         at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816)
>         ... 23 more
> Caused by: com.google.protobuf.UninitializedMessageException: Message missing 
> required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263)
>         ... 25 more

[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505664=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505664
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 10:12
Start Date: 28/Oct/20 10:12
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717832919


   @aajisaka Thanks for your contribution!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505664)
Time Spent: 1h 50m  (was: 1h 40m)

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>  
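
The BindException above is the classic symptom of parallel test JVMs racing for the same hard-coded port. A minimal, self-contained Java sketch of the usual remedy, purely illustrative rather than the actual HDFS-15657 patch: bind to port 0 and let the OS hand out a free ephemeral port.

{code:java}
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class EphemeralPortSketch {
  public static void main(String[] args) throws Exception {
    // Port 0 asks the kernel for any free ephemeral port, so concurrently
    // running tests can never collide on a fixed port number.
    try (ServerSocket socket = new ServerSocket()) {
      socket.bind(new InetSocketAddress("0.0.0.0", 0));
      System.out.println("Bound to free port " + socket.getLocalPort());
    }
  }
}
{code}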

[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505663
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 10:12
Start Date: 28/Oct/20 10:12
Worklog Time Spent: 10m 
  Work Description: tasanuma merged pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505663)
Time Spent: 1h 40m  (was: 1.5h)

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> 

[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505656&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505656
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 09:58
Start Date: 28/Oct/20 09:58
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717825235


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 52s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 25s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 44s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 24s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 22s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  16m 10s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 32s |  |  the patch passed  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  12m 58s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   |  98m  5s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2418 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 1df7450ea6a4 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d0c786db4de |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/3/testReport/ |
   | Max. process+thread count | 2730 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/3/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.1.3 |
   | Powered by | Apache Yetus 

[jira] [Commented] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6

2020-10-28 Thread Ryan Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222075#comment-17222075
 ] 

Ryan Wu commented on HDFS-15660:


This happened after our cluster enabled heterogeneous storage. The problem 
appears whenever we run Hive SQL or call "hdfs dfs -count ". 

> StorageTypeProto is not compatible between 3.x and 2.6
> ---
>
> Key: HDFS-15660
> URL: https://issues.apache.org/jira/browse/HDFS-15660
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0, 3.1.3
>Reporter: Ryan Wu
>Assignee: Ryan Wu
>Priority: Major
> Fix For: 3.4.0
>
>
> In our case, after the NameNode was upgraded to 3.1.3 while the DataNodes were 
> still on 2.6, we found that when Hive called the getContentSummary method the 
> client and server were not compatible, because Hadoop 3 added the new PROVIDED 
> storage type.
> {code:java}
> // code placeholder
> 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while 
> invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over 
> x/x:8020. Trying to fail over immediately.
> java.io.IOException: com.google.protobuf.ServiceException: 
> com.google.protobuf.UninitializedMessageException: Message missing required 
> fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>         at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713)
>         at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109)
>         at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
>         at 
> org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
>         at 
> org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
>         at 
> org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
>         at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
>         at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
>         at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>         at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
> Caused by: com.google.protobuf.ServiceException: 
> com.google.protobuf.UninitializedMessageException: Message missing required 
> fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272)
>         at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816)
>         ... 23 more
> Caused by: com.google.protobuf.UninitializedMessageException: Message missing 
> required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263)
>         ... 25 more
> {code}
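
The failure is a proto2 wire-compatibility rule at work: the 3.x NameNode serializes the new PROVIDED storage type, the 2.6 client has no such enum constant, and because the `type` field is required the decoded message ends up uninitialized. A toy Java sketch of the principle, using plain enums rather than Hadoop's actual protobuf classes:

{code:java}
public class EnumCompatSketch {
  // Storage types a 2.6-era client knows about; PROVIDED is missing.
  enum OldStorageType { DISK, SSD, ARCHIVE, RAM_DISK }

  public static void main(String[] args) {
    String wireValue = "PROVIDED"; // what a 3.x NameNode sends
    try {
      System.out.println("Decoded: " + OldStorageType.valueOf(wireValue));
    } catch (IllegalArgumentException e) {
      // In proto2 the analogous failure on a *required* enum field
      // surfaces as the UninitializedMessageException seen above.
      System.out.println("Unknown storage type on the wire: " + wireValue);
    }
  }
}
{code}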



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Assigned] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6

2020-10-28 Thread Ryan Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Wu reassigned HDFS-15660:
--

Assignee: Ryan Wu

> StorageTypeProto is not compatible between 3.x and 2.6
> ---
>
> Key: HDFS-15660
> URL: https://issues.apache.org/jira/browse/HDFS-15660
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0, 3.1.3
>Reporter: Ryan Wu
>Assignee: Ryan Wu
>Priority: Major
> Fix For: 3.4.0
>
>
> In our case, after the NameNode was upgraded to 3.1.3 while the DataNodes were 
> still on 2.6, we found that when Hive called the getContentSummary method the 
> client and server were not compatible, because Hadoop 3 added the new PROVIDED 
> storage type.
> {code:java}
> // code placeholder
> 20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while 
> invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over 
> x/x:8020. Trying to fail over immediately.
> java.io.IOException: com.google.protobuf.ServiceException: 
> com.google.protobuf.UninitializedMessageException: Message missing required 
> fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>         at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713)
>         at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109)
>         at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
>         at 
> org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
>         at 
> org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
>         at 
> org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
>         at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
>         at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
>         at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>         at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
> Caused by: com.google.protobuf.ServiceException: 
> com.google.protobuf.UninitializedMessageException: Message missing required 
> fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272)
>         at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816)
>         ... 23 more
> Caused by: com.google.protobuf.UninitializedMessageException: Message missing 
> required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
>         at 
> com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263)
>         ... 25 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Created] (HDFS-15660) StorageTypeProto is not compatible between 3.x and 2.6

2020-10-28 Thread Ryan Wu (Jira)
Ryan Wu created HDFS-15660:
--

 Summary: StorageTypeProto is not compatible between 3.x and 2.6
 Key: HDFS-15660
 URL: https://issues.apache.org/jira/browse/HDFS-15660
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.1.3, 3.2.0
Reporter: Ryan Wu
 Fix For: 3.4.0


In our case, after the NameNode was upgraded to 3.1.3 while the DataNodes were 
still on 2.6, we found that when Hive called the getContentSummary method the 
client and server were not compatible, because Hadoop 3 added the new PROVIDED 
storage type.
{code:java}
// code placeholder
20/04/15 14:28:35 INFO retry.RetryInvocationHandler---main: Exception while 
invoking getContentSummary of class ClientNamenodeProtocolTranslatorPB over 
x/x:8020. Trying to fail over immediately.
java.io.IOException: com.google.protobuf.ServiceException: 
com.google.protobuf.UninitializedMessageException: Message missing required 
fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
        at 
org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:819)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy11.getContentSummary(Unknown Source)
        at 
org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:3144)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:706)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:702)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:713)
        at org.apache.hadoop.fs.shell.Count.processPath(Count.java:109)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
        at 
org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
        at 
org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
Caused by: com.google.protobuf.ServiceException: 
com.google.protobuf.UninitializedMessageException: Message missing required 
fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:272)
        at com.sun.proxy.$Proxy10.getContentSummary(Unknown Source)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getContentSummary(ClientNamenodeProtocolTranslatorPB.java:816)
        ... 23 more
Caused by: com.google.protobuf.UninitializedMessageException: Message missing 
required fields: summary.typeQuotaInfos.typeQuotaInfo[3].type
        at 
com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770)
        at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65392)
        at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetContentSummaryResponseProto$Builder.build(ClientNamenodeProtocolProtos.java:65331)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:263)
        ... 25 more
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505657
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 09:59
Start Date: 28/Oct/20 09:59
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717825773


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   2m  2s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m  7s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 25s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 57s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 28s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 26s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   0m 37s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  16m 22s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 28s |  |  the patch passed  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  12m 50s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 29s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   |  98m  1s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2418 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 4d3d3255b1e2 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d0c786db4de |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/4/testReport/ |
   | Max. process+thread count | 2716 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/4/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.1.3 |
   | Powered by | Apache Yetus 

[jira] [Commented] (HDFS-15658) Improve datanode capability balancing

2020-10-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222070#comment-17222070
 ] 

Hadoop QA commented on HDFS-15658:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} |  | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  3m 
54s{color} |  | {color:red} Docker failed to build yetus/hadoop:06eafeedf12. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15658 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13014273/HDFS-15658-branch-2.7.patch
 |
| Console output | 
https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/271/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |


This message was automatically generated.



> Improve datanode capability balancing
> -
>
> Key: HDFS-15658
> URL: https://issues.apache.org/jira/browse/HDFS-15658
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: chuanjie.duan
>Priority: Major
> Attachments: HDFS-15658-branch-2.7.patch
>
>
> How about adjusting the order used to choose a replica for deletion? 
> Is there any other reason for choosing "oldestHeartbeatStorage" first?
>  
>   public DatanodeStorageInfo chooseReplicaToDelete(
>       Collection<DatanodeStorageInfo> moreThanOne,
>       Collection<DatanodeStorageInfo> exactlyOne,
>       final List<StorageType> excessTypes,
>       Map<String, List<DatanodeStorageInfo>> rackMap) {
>     ..
>     final DatanodeStorageInfo storage;
>     if (minSpaceStorage != null) {
>       storage = minSpaceStorage;
>     } else if (oldestHeartbeatStorage != null) {
>       storage = oldestHeartbeatStorage;
>     } else {
>       return null;
>     }
>     excessTypes.remove(storage.getStorageType());
>     return storage;
>   }
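
For contrast with the snippet above: trunk's BlockPlacementPolicyDefault, as far as I recall it, uses the reverse order, preferring the storage whose DataNode heartbeated longest ago and only then falling back to the one with the least remaining space, so the quoted code appears to show the reporter's proposed swap. A side-by-side sketch, illustrative only and not Hadoop source:

{code:java}
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo;

class ReplicaDeletionOrderSketch {
  // Stock order (trunk, as I recall): a stale-heartbeat node first, since it
  // may already be dead or decommissioning and its replica is the cheapest loss.
  static DatanodeStorageInfo stock(DatanodeStorageInfo oldestHeartbeatStorage,
                                   DatanodeStorageInfo minSpaceStorage) {
    return oldestHeartbeatStorage != null ? oldestHeartbeatStorage
                                          : minSpaceStorage;
  }

  // Order in the quoted patch: least remaining space first, so deletion frees
  // the fullest DataNode and helps balance per-node capacity.
  static DatanodeStorageInfo proposed(DatanodeStorageInfo oldestHeartbeatStorage,
                                      DatanodeStorageInfo minSpaceStorage) {
    return minSpaceStorage != null ? minSpaceStorage
                                   : oldestHeartbeatStorage;
  }
}
{code}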



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15658) Improve datanode capability balancing

2020-10-28 Thread chuanjie.duan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chuanjie.duan updated HDFS-15658:
-
Attachment: HDFS-15658-branch-2.7.patch
Status: Patch Available  (was: Open)

> Improve datanode capability balancing
> -
>
> Key: HDFS-15658
> URL: https://issues.apache.org/jira/browse/HDFS-15658
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: chuanjie.duan
>Priority: Major
> Attachments: HDFS-15658-branch-2.7.patch
>
>
> How about adjusting the order used to choose a replica for deletion? 
> Is there any other reason for choosing "oldestHeartbeatStorage" first?
>  
>   public DatanodeStorageInfo chooseReplicaToDelete(
>       Collection<DatanodeStorageInfo> moreThanOne,
>       Collection<DatanodeStorageInfo> exactlyOne,
>       final List<StorageType> excessTypes,
>       Map<String, List<DatanodeStorageInfo>> rackMap) {
>     ..
>     final DatanodeStorageInfo storage;
>     if (minSpaceStorage != null) {
>       storage = minSpaceStorage;
>     } else if (oldestHeartbeatStorage != null) {
>       storage = oldestHeartbeatStorage;
>     } else {
>       return null;
>     }
>     excessTypes.remove(storage.getStorageType());
>     return storage;
>   }



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505630&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505630
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 08:32
Start Date: 28/Oct/20 08:32
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717780318


   +1, pending Jenkins.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505630)
Time Spent: 1h 10m  (was: 1h)

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> 

[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505626
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 08:21
Start Date: 28/Oct/20 08:21
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717775420


   @tasanuma 
   Thank you for checking this. Reverted the change.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505626)
Time Spent: 1h  (was: 50m)

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> 

[jira] [Work logged] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505610
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 07:59
Start Date: 28/Oct/20 07:59
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717765124


   @aajisaka Thanks for updating PR.
   
   > According to the javadoc, this test case is to verify the default 
behavior. Don't set the default value (true) explicitly to verify the default 
behavior.
   
   I checked  [HDFS-14653](https://issues.apache.org/jira/browse/HDFS-14653) 
again.  `testNamenodeHeartBeatEnableDefault` checks that 
`DFS_ROUTER_NAMENODE_HEARTBEAT_ENABLE` is supposed to take the value of 
`DFS_ROUTER_HEARTBEAT_ENABLE` when `DFS_ROUTER_NAMENODE_HEARTBEAT_ENABLE` isn't 
explicitly specified. Therefore, the previous code here may not be wrong.
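   
   A sketch of the fallback the test is meant to verify, assuming the key names from HDFS-14653 (the real constants live in RBFConfigKeys; this is illustrative, not the Router's actual code):
   
   ```java
   import org.apache.hadoop.conf.Configuration;
   
   class HeartbeatDefaultSketch {
     static final String HEARTBEAT_ENABLE =
         "dfs.federation.router.heartbeat.enable";
     static final String NAMENODE_HEARTBEAT_ENABLE =
         "dfs.federation.router.namenode.heartbeat.enable";
   
     // When the per-feature key is unset, it inherits the umbrella key's value.
     static boolean namenodeHeartbeatEnabled(Configuration conf) {
       boolean umbrella = conf.getBoolean(HEARTBEAT_ENABLE, true);
       return conf.getBoolean(NAMENODE_HEARTBEAT_ENABLE, umbrella);
     }
   }
   ```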



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505610)
Time Spent: 50m  (was: 40m)

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>  

[jira] [Work logged] (HDFS-15654) TestBPOfferService#testMissBlocksWhenReregister fails intermittently

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15654?focusedWorklogId=505601&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505601
 ]

ASF GitHub Bot logged work on HDFS-15654:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 07:35
Start Date: 28/Oct/20 07:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2419:
URL: https://github.com/apache/hadoop/pull/2419#issuecomment-717755720


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  4s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 51s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 46s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  18m 44s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 12s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 10s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  16m  9s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 19s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 110m 29s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 202m 33s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2419/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2419 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 7a2ecb2a7552 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d0c786db4de |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   |  Test Results | 

[jira] [Updated] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15657:
-
Summary: RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by 
BindException  (was: TestRouter#testNamenodeHeartBeatEnableDefault fails by 
BindException)

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at 
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>  Method)
>   at 
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> 
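For context, a minimal sketch of the usual remedy for this class of flakiness (an illustration, not the HDFS-15657 patch itself): bind to port 0 so the OS hands out a free ephemeral port instead of several test JVMs racing for the same hard-coded one.

```java
import java.net.ServerSocket;

// Minimal, self-contained illustration: port 0 asks the kernel for any free
// port, so concurrent test runs cannot collide on a fixed port number.
public class EphemeralPortExample {
  public static void main(String[] args) throws Exception {
    try (ServerSocket socket = new ServerSocket(0)) {
      System.out.println("bound to free port " + socket.getLocalPort());
    }
  }
}
```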

[jira] [Updated] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2020-10-28 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15659:
-
Parent: HDFS-15646
Issue Type: Sub-task  (was: Improvement)

> Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster
> ---
>
> Key: HDFS-15659
> URL: https://issues.apache.org/jira/browse/HDFS-15659
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Priority: Major
>
> dfs.namenode.redundancy.considerLoad is true by default and it is causing 
> many test failures. Let's disable it in MiniDFSCluster.
> Originally reported by [~weichiu]: 
> https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612
> {quote}
> I've certainly seen this option causing test failures in the past.
> Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
> for specific tests.
> {quote}
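Until the default changes, a minimal sketch (using the standard DFSConfigKeys constant; the scaffolding around it is illustrative) of how a single test can opt out of load-aware placement:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class ConsiderLoadOffExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // On one host all simulated datanodes share the machine's load, so the
    // load check can reject every node and make block placement fail.
    conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY, false);
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
    try {
      cluster.waitActive();
    } finally {
      cluster.shutdown();
    }
  }
}
```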



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-9776) TestHAAppend#testMultipleAppendsDuringCatchupTailing is flaky

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9776?focusedWorklogId=505600&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505600
 ]

ASF GitHub Bot logged work on HDFS-9776:


Author: ASF GitHub Bot
Created on: 28/Oct/20 07:31
Start Date: 28/Oct/20 07:31
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #2410:
URL: https://github.com/apache/hadoop/pull/2410#issuecomment-717754632


   > That sounds like a good idea. Perhaps we can file a follow-up Jira to set 
the flag in the MiniDFSCluster
   
   Filed https://issues.apache.org/jira/browse/HDFS-15659



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505600)
Time Spent: 1h 20m  (was: 1h 10m)

> TestHAAppend#testMultipleAppendsDuringCatchupTailing is flaky
> -
>
> Key: HDFS-9776
> URL: https://issues.apache.org/jira/browse/HDFS-9776
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Vinayakumar B
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: TestHAAppend.testMultipleAppendsDuringCatchupTailing.log
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Initial analysis of the recent test failure in 
> {{TestHAAppend#testMultipleAppendsDuringCatchupTailing}}
> [here|https://builds.apache.org/job/PreCommit-HDFS-Build/14420/testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestHAAppend/testMultipleAppendsDuringCatchupTailing/]
>  
> has found that if the Active NameNode goes down immediately after a truncate 
> operation, but before the BlockRecovery command is sent to the datanode,
> then the block will never be truncated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15659) Set dfs.namenode.redundancy.considerLoad to false in MiniDFSCluster

2020-10-28 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-15659:


 Summary: Set dfs.namenode.redundancy.considerLoad to false in 
MiniDFSCluster
 Key: HDFS-15659
 URL: https://issues.apache.org/jira/browse/HDFS-15659
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Reporter: Akira Ajisaka


dfs.namenode.redundancy.considerLoad is true by default and it is causing many 
test failures. Let's disable it in MiniDFSCluster.

Originally reported by [~weichiu]: 
https://github.com/apache/hadoop/pull/2410#pullrequestreview-51612

{quote}
I've certainly seen this option causing test failures in the past.
Maybe we should turn it off by default in MiniDFSCluster, and only enable it 
for specific tests.
{quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15655) Add option to make balancer prefer to get cold blocks

2020-10-28 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15655:

Attachment: (was: HDFS-15655.002.patch)

> Add option to make balancer prefer to get cold blocks
> -
>
> Key: HDFS-15655
> URL: https://issues.apache.org/jira/browse/HDFS-15655
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15655.001.patch, HDFS-15655.002.patch
>
>
> We met two issues when using the balancer.
>  # Moving hot files may cause DFSClient reads to fail.
>  # Some blocks of temporary files are moved even though the files are deleted 
> soon after.
> Add an option, dfs.namenode.hot.block.interval, so the balancer prefers to get 
> blocks that belong to cold files created before this time period.
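A hedged sketch of the idea (the names and the placement in the code base are assumptions, not the attached patch): when handing candidates to the balancer, skip blocks whose owning file was modified within the configured hot interval.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only; the real change would live in the NameNode's block-listing path.
public class ColdBlockFilterSketch {
  static class Candidate {
    long fileModificationTime; // mtime of the file owning the block, in ms
  }

  // Keep only blocks of files untouched for longer than the hot interval.
  static List<Candidate> filterCold(List<Candidate> candidates,
                                    long hotIntervalMs, long nowMs) {
    return candidates.stream()
        .filter(c -> nowMs - c.fileModificationTime > hotIntervalMs)
        .collect(Collectors.toList());
  }
}
```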



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15655) Add option to make balancer prefer to get cold blocks

2020-10-28 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15655:

Attachment: HDFS-15655.002.patch
Status: Patch Available  (was: Open)

> Add option to make balancer prefer to get cold blocks
> -
>
> Key: HDFS-15655
> URL: https://issues.apache.org/jira/browse/HDFS-15655
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15655.001.patch, HDFS-15655.002.patch
>
>
> We met two issues when using the balancer.
>  # Moving hot files may cause DFSClient reads to fail.
>  # Some blocks of temporary files are moved even though the files are deleted 
> soon after.
> Add an option, dfs.namenode.hot.block.interval, so the balancer prefers to get 
> blocks that belong to cold files created before this time period.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15655) Add option to make balancer prefer to get cold blocks

2020-10-28 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15655:

Status: Open  (was: Patch Available)

> Add option to make balancer prefer to get cold blocks
> -
>
> Key: HDFS-15655
> URL: https://issues.apache.org/jira/browse/HDFS-15655
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15655.001.patch, HDFS-15655.002.patch
>
>
> We met two issues when using the balancer.
>  # Moving hot files may cause DFSClient reads to fail.
>  # Some blocks of temporary files are moved even though the files are deleted 
> soon after.
> Add an option, dfs.namenode.hot.block.interval, so the balancer prefers to get 
> blocks that belong to cold files created before this time period.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15657) TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?focusedWorklogId=505592&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505592
 ]

ASF GitHub Bot logged work on HDFS-15657:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 07:09
Start Date: 28/Oct/20 07:09
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2418:
URL: https://github.com/apache/hadoop/pull/2418#issuecomment-717745936


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 32s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 25s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 49s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 12s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 11s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 16s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  16m 12s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 14s |  |  the patch passed  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  12m 31s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 28s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   |  93m 16s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2418 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 7c610a70681f 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d0c786db4de |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/2/testReport/ |
   | Max. process+thread count | 2764 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2418/2/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.1.3 |
   | Powered by | Apache Yetus 

[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505580&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505580
 ]

ASF GitHub Bot logged work on HDFS-15624:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 06:40
Start Date: 28/Oct/20 06:40
Worklog Time Spent: 10m 
  Work Description: huangtianhua commented on a change in pull request 
#2377:
URL: https://github.com/apache/hadoop/pull/2377#discussion_r513210557



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/StorageType.java
##
@@ -33,13 +33,12 @@
 @InterfaceAudience.Public
 @InterfaceStability.Unstable
 public enum StorageType {
-  // sorted by the speed of the storage types, from fast to slow
   RAM_DISK(true, true),
-  NVDIMM(false, true),
   SSD(false, false),
   DISK(false, false),
   ARCHIVE(false, false),
-  PROVIDED(false, false);
+  PROVIDED(false, false),
+  NVDIMM(false, true);
 

Review comment:
   We added a check for setQuota() in FSNamesystem.java, so I think it's OK, right?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505580)
Time Spent: 4h  (was: 3h 50m)

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values 
> of the StorageType enum. Setting the quota by storage type depends on 
> ordinal(), so quota settings may become invalid after an upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15658) Improve datanode capability balancing

2020-10-28 Thread chuanjie.duan (Jira)
chuanjie.duan created HDFS-15658:


 Summary: Improve datanode capability balancing
 Key: HDFS-15658
 URL: https://issues.apache.org/jira/browse/HDFS-15658
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: chuanjie.duan


How about adjusting the order of choosing a replica for deletion? 

Is there any other reason for choosing "oldestHeartbeatStorage" first?

 

  public DatanodeStorageInfo chooseReplicaToDelete(
      Collection<DatanodeStorageInfo> moreThanOne,
      Collection<DatanodeStorageInfo> exactlyOne,
      final List<StorageType> excessTypes,
      Map<String, List<DatanodeStorageInfo>> rackMap) {
    ..
    final DatanodeStorageInfo storage;
    if (minSpaceStorage != null) {
      storage = minSpaceStorage;
    } else if (oldestHeartbeatStorage != null) {
      storage = oldestHeartbeatStorage;
    } else {
      return null;
    }
    excessTypes.remove(storage.getStorageType());
    return storage;
  }



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15624) Fix the SetQuotaByStorageTypeOp problem after updating hadoop

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15624?focusedWorklogId=505578&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505578
 ]

ASF GitHub Bot logged work on HDFS-15624:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 06:38
Start Date: 28/Oct/20 06:38
Worklog Time Spent: 10m 
  Work Description: huangtianhua commented on a change in pull request 
#2377:
URL: https://github.com/apache/hadoop/pull/2377#discussion_r513209898



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeLayoutVersion.java
##
@@ -89,7 +89,8 @@ public static boolean supports(final LayoutFeature f, final 
int lv) {
 APPEND_NEW_BLOCK(-62, -61, "Support appending to new block"),
 QUOTA_BY_STORAGE_TYPE(-63, -61, "Support quota for specific storage 
types"),
 ERASURE_CODING(-64, -61, "Support erasure coding"),
-EXPANDED_STRING_TABLE(-65, -61, "Support expanded string table in 
fsimage");
+EXPANDED_STRING_TABLE(-65, -61, "Support expanded string table in 
fsimage"),
+NVDIMM_SUPPORT(-66, -66, "Support NVDIMM storage type");

Review comment:
   As the comment above says: 
If the feature cannot satisfy compatibility with any prior version, 
then set its minimum compatible layout version to itself to indicate that 
downgrade is impossible.
   
   Or maybe we missed something?
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505578)
Time Spent: 3h 50m  (was: 3h 40m)

>  Fix the SetQuotaByStorageTypeOp problem after updating hadoop 
> ---
>
> Key: HDFS-15624
> URL: https://issues.apache.org/jira/browse/HDFS-15624
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: YaYun Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> HDFS-15025 adds a new storage type, NVDIMM, which changes the ordinal() values 
> of the StorageType enum. Setting the quota by storage type depends on 
> ordinal(), so quota settings may become invalid after an upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org