[
https://issues.apache.org/jira/browse/HDFS-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218581#comment-17218581
]
Hadoop QA commented on HDFS-15644:
----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m
45s{color} | | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m
0s{color} | | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m
0s{color} | | {color:green} The patch does not contain any @author tags.
{color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m
0s{color} | | {color:red} The patch doesn't appear to include any new or
modified tests. Please justify why no new tests are needed for this patch. Also
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m
57s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m
18s{color} | | {color:green} trunk passed with JDK
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m
11s{color} | | {color:green} trunk passed with JDK Private
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m
48s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m
20s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}
16m 37s{color} | | {color:green} branch has no errors when building and
testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m
58s{color} | | {color:green} trunk passed with JDK
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m
23s{color} | | {color:green} trunk passed with JDK Private
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m
16s{color} | | {color:blue} Used deprecated FindBugs config; considering
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m
13s{color} | | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m
13s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m
11s{color} | | {color:green} the patch passed with JDK
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m
11s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m
5s{color} | | {color:green} the patch passed with JDK Private
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m
5s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green} 0m
0s{color} | | {color:green} The patch has no blanks issues. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m
38s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m
13s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}
17m 5s{color} | | {color:green} patch has no errors when building and testing
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m
49s{color} | | {color:green} the patch passed with JDK
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m
19s{color} | | {color:green} the patch passed with JDK Private
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m
26s{color} | | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} || ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}145m 6s{color}
|
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt|https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/253/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m
43s{color} | | {color:green} The patch does not generate ASF License warnings.
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}220m 46s{color} |
| {color:black}{color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMultiThreadedHflush |
| |
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerWithStripedBlocks |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.TestDFSStorageStateRecovery |
| | hadoop.hdfs.server.balancer.TestBalancerRPCDelay |
| | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.TestGetFileChecksum |
| | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR |
| | hadoop.hdfs.TestSafeModeWithStripedFile |
| | hadoop.hdfs.TestParallelUnixDomainRead |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | hadoop.hdfs.TestErasureCodingPolicies |
| | hadoop.hdfs.TestDecommissionWithStriped |
| | hadoop.hdfs.TestDistributedFileSystemWithECFileWithRandomECPolicy |
| | hadoop.hdfs.TestDFSStripedInputStream |
| | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
| | hadoop.hdfs.tools.TestDFSAdminWithHA |
| | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
| | hadoop.hdfs.TestFileAppend2 |
| | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
| | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
| | hadoop.hdfs.TestDFSStripedOutputStream |
| | hadoop.hdfs.TestReadStripedFileWithDecodingDeletedData |
| | hadoop.hdfs.TestDistributedFileSystem |
| | hadoop.hdfs.TestReadStripedFileWithDNFailure |
| | hadoop.hdfs.tools.TestECAdmin |
| | hadoop.hdfs.TestDatanodeDeath |
| | hadoop.hdfs.server.balancer.TestBalancer |
| | hadoop.hdfs.TestReadStripedFileWithDecodingCorruptData |
| | hadoop.hdfs.TestErasureCodingMultipleRacks |
| | hadoop.hdfs.TestSnapshotCommands |
| | hadoop.hdfs.TestFileAppend4 |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
| | hadoop.hdfs.TestErasureCodingPoliciesWithRandomECPolicy |
| | hadoop.hdfs.TestAclsEndToEnd |
| | hadoop.hdfs.TestDFSUpgradeFromImage |
| | hadoop.hdfs.qjournal.client.TestQJMWithFaults |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base:
https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/253/artifact/out/Dockerfile
|
| JIRA Issue | HDFS-15644 |
| JIRA Patch URL |
https://issues.apache.org/jira/secure/attachment/13013942/HDFS-15644.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite
unit shadedclient findbugs checkstyle |
| uname | Linux a960070b0bc5 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 7f8ef76c4833262f60cac2956aaa7fb75c0a77bc |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions |
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
/usr/lib/jvm/java-8-openjdk-amd64:Private
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Test Results |
https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/253/testReport/ |
| Max. process+thread count | 3666 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U:
hadoop-hdfs-project/hadoop-hdfs |
| Console output |
https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/253/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=4.1.3 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
This message was automatically generated.
> Failed volumes can cause DNs to stop block reporting
> ----------------------------------------------------
>
> Key: HDFS-15644
> URL: https://issues.apache.org/jira/browse/HDFS-15644
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: block placement, datanode
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Major
> Labels: refactor
> Attachments: HDFS-15644.001.patch, HDFS-15644.002.patch
>
>
> [~daryn] found a corner case where removing failed volumes can cause an NPE in
> [FsDatasetImpl.getBlockReports()|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1939].
> +Scenario:+
> * Inside {{Datanode#handleVolumeFailures()}}, removing a failed volume is a
> 2-step process:
> ** First, the volume is removed from the volumes list.
> ** Only later are its replicas scrubbed from the volume map.
> * A concurrent thread generating block reports may therefore read the
> replicaMap and look up a volume ID that no longer exists.
> He made a fix for that, and we have been using it on our clusters since
> Hadoop 2.7.
> Analysis of the code shows the bug still applies to trunk.
> * The {{Datanode#removeVolumes()}} path is safe because the two-step process in
> {{FsDatasetImpl.removeVolumes()}}
> [FsDatasetImpl.java#L577|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L577]
> is protected by {{datasetWriteLock}}.
> * The {{Datanode#handleVolumeFailures()}} path is not safe because the failed
> volume is removed from the list without acquiring {{datasetWriteLock}}
> [FsVolumeList.java#L239|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java#L239].
> The race condition can cause the caller of {{getBlockReports()}} to throw an
> NPE when a replica under recovery (RUR) refers to a volume that has already
> been removed
> [FsDatasetImpl.java#L1976|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L1976].
> {code:java}
> case RUR:
> ReplicaInfo orig = b.getOriginalReplica();
> builders.get(volStorageID).add(orig);
> break;
> {code}
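> The window between the two removal steps can be reproduced in a stand-alone
> sketch. The class, field, and method names below are illustrative stand-ins,
> not the real {{FsDatasetImpl}}/{{FsVolumeList}} APIs; the guarded variant
> shows one possible defensive mitigation (null-checking the per-volume builder
> so replicas of an already-removed volume are skipped), not the actual patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-ins for FsVolumeList and ReplicaMap; not the real APIs.
public class VolumeRemovalRace {
    final List<String> volumes =
        new ArrayList<>(Arrays.asList("vol-1", "vol-2"));
    final Map<String, List<String>> replicaMap = new HashMap<>();

    VolumeRemovalRace() {
        replicaMap.put("vol-1", new ArrayList<>(Arrays.asList("blk_1")));
        replicaMap.put("vol-2", new ArrayList<>(Arrays.asList("blk_2")));
    }

    // Step 1 of the failure path: the volume leaves the list immediately.
    // Step 2 (scrubbing replicaMap) happens later, leaving a window where
    // replicaMap still holds replicas for a volume that is gone.
    void removeFailedVolumeStep1(String vol) {
        volumes.remove(vol);
    }

    // Block-report path: one builder per *currently listed* volume, then each
    // replica is added to its volume's builder -- NPE if the replica's volume
    // was removed between the two steps.
    Map<String, List<String>> getBlockReports() {
        Map<String, List<String>> builders = new HashMap<>();
        for (String v : volumes) {
            builders.put(v, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> e : replicaMap.entrySet()) {
            for (String blk : e.getValue()) {
                builders.get(e.getKey()).add(blk); // NPE for removed volume
            }
        }
        return builders;
    }

    // Defensive variant: skip replicas whose volume is no longer listed.
    Map<String, List<String>> getBlockReportsGuarded() {
        Map<String, List<String>> builders = new HashMap<>();
        for (String v : volumes) {
            builders.put(v, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> e : replicaMap.entrySet()) {
            List<String> builder = builders.get(e.getKey());
            if (builder != null) { // volume vanished mid-removal: skip it
                builder.addAll(e.getValue());
            }
        }
        return builders;
    }

    public static void main(String[] args) {
        VolumeRemovalRace dn = new VolumeRemovalRace();
        dn.removeFailedVolumeStep1("vol-2"); // replicaMap not yet scrubbed
        try {
            dn.getBlockReports();
        } catch (NullPointerException npe) {
            System.out.println("NPE during block report, as in HDFS-15644");
        }
        System.out.println("guarded report: " + dn.getBlockReportsGuarded());
    }
}
```

> A guard only masks the symptom; the description above points at the actual
> issue, i.e. the list removal not being performed under {{datasetWriteLock}}
> the way {{FsDatasetImpl.removeVolumes()}} does it.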
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]