[jira] [Resolved] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin resolved HDFS-13571.
------------------------------
    Resolution: Fixed

> Dead DataNode Detector
> ----------------------
>
>                 Key: HDFS-13571
>                 URL: https://issues.apache.org/jira/browse/HDFS-13571
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 2.4.0, 2.6.0, 3.0.2
>            Reporter: Gang Xie
>            Assignee: Lisheng Sun
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node status machine.png
>
>
> Currently, the information about dead datanodes in a DFSInputStream is stored locally, so it cannot be shared among the input streams of the same DFSClient. In our production environment, some datanodes die every day for various reasons. Today, after the first input stream blocks and detects a dead node, it cannot share this information with the others in the same DFSClient; thus, the other input streams are still blocked by the dead node for some time, which can cause bad service latency.
> To eliminate this impact of dead datanodes, we designed a dead datanode detector, which detects dead nodes in advance and shares this information among all the input streams in the same client. This improvement has been online for some months and works fine, so we decided to port it to 3.0 (the versions used in our production environment are 2.4 and 2.6).
> I will do the porting work and upload the code later.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
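The core idea of the design above is a single client-wide registry of suspected-dead datanodes that every DFSInputStream consults before picking a replica. A minimal sketch of that idea follows; the class and method names (DeadNodeRegistry, markDead, isDead) and the expiry behavior are illustrative assumptions, not the actual HDFS-13571 API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-DFSClient shared dead-node registry.
// The first stream that detects a failure records it; all sibling streams
// see the entry immediately and can skip the node instead of blocking.
public final class DeadNodeRegistry {
    // datanode id -> wall-clock time (ms) the node was declared dead
    private final Map<String, Long> deadNodes = new ConcurrentHashMap<>();
    private final long expiryMs;

    public DeadNodeRegistry(long expiryMs) {
        this.expiryMs = expiryMs;
    }

    /** Called by whichever stream (or probe thread) first detects the failure. */
    public void markDead(String datanodeId) {
        deadNodes.put(datanodeId, System.currentTimeMillis());
    }

    /** Every input stream of the client checks here before choosing a replica. */
    public boolean isDead(String datanodeId) {
        Long since = deadNodes.get(datanodeId);
        if (since == null) {
            return false;
        }
        if (System.currentTimeMillis() - since > expiryMs) {
            deadNodes.remove(datanodeId);   // entry expired: allow a re-probe
            return false;
        }
        return true;
    }
}
```

A ConcurrentHashMap keeps the check lock-free on the read path, which matters because isDead sits on the hot path of every read; the real design (per the attached DeadNodeDetectorDesign.pdf) also runs periodic probes to move nodes back out of the dead set.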
[jira] [Created] (HDFS-15020) Add a test case of storage type quota to TestHdfsAdmin.
Jinglun created HDFS-15020:
-------------------------------

             Summary: Add a test case of storage type quota to TestHdfsAdmin.
                 Key: HDFS-15020
                 URL: https://issues.apache.org/jira/browse/HDFS-15020
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Jinglun
            Assignee: Jinglun
Re: [DISCUSS] Making 2.10 the last minor 2.x release
Hey guys,

I think we diverged a bit from the initial topic of this discussion, which is removing branch-2.10 and changing the version of branch-2 from 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.

Sounds like the subject line for this thread, "Making 2.10 the last minor 2.x release", confused people. It is in fact a wider matter that can be discussed when somebody actually proposes to release 2.11, which I understand nobody does at the moment.

So if anybody objects to removing branch-2.10, please make an argument. Otherwise we should go ahead and just do it next week. I see people still struggling to keep branch-2 and branch-2.10 in sync.

Thanks,
--Konstantin

On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung wrote:

> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the Java 11 issue, we haven't
> even really removed support for Java 7 in branch-2 (much less Java 8), so I
> feel moving to Java 11 would go along with a move to branch-3. And as you
> mentioned, if people really want to use Java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang wrote:
>
>> +1 for 2.10.x as the last release of the 2.x line.
>>
>> Software becomes more compatible when more companies stress-test the
>> same software and make improvements in trunk. Some may be extra cautious
>> about moving up the version because of internal obligations to keep things
>> running.
>> Company obligation should not be the driving force behind maintaining
>> Hadoop branches. There is no proper collaboration in the community when
>> every name-brand company maintains its own Hadoop 2.x version. I think it
>> would be healthier for the community to reduce the branch forking and
>> spend energy on trunk to harden the software. This will give more
>> confidence to move up the version than trying to fix n permutations of
>> breakage, like the Flash fixing the timeline.
>>
>> The Apache license states that there is no warranty of any kind for code
>> contributions. Fewer community release processes should improve software
>> quality when eyes are on trunk, and help steer toward the same end goals.
>>
>> regards,
>> Eric
>>
>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger wrote:
>>
>>> Hello all,
>>>
>>> Is it written anywhere what the difference is between a minor release
>>> and a point/dot/maintenance (I'll use "point" from here on out) release?
>>> I have looked around and I can't find anything other than some
>>> compatibility documentation in 2.x that has since been removed in 3.x
>>> [1] [2]. I think this would help shape my opinion on whether or not to
>>> keep branch-2 alive. My current understanding is that we can't really
>>> break compatibility in either a minor or point release. But the only
>>> mention of the difference between minor and point releases is how to
>>> deal with Stable, Evolving, and Unstable tags, and how to deal with
>>> changing default configuration values. So it seems like there really
>>> isn't a big official difference between the two. In my mind, the
>>> functional difference is that minor releases may have added features
>>> and rewrites, while point releases only have bug fixes. This might be
>>> an incorrect understanding, but that's what I have gathered from
>>> watching the releases over the last few years.
>>> Whether or not this is a correct understanding, I think that this
>>> needs to be documented somewhere, even if it is just a convention.
>>>
>>> Given my assumed understanding of minor vs point releases, here are the
>>> pros/cons that I can think of for having a branch-2. Please add on or
>>> correct me on anything you feel is missing or inadequate.
>>>
>>> Pros:
>>> - Features/rewrites/higher-risk patches are less likely to be put into 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> Cons:
>>> - Bug fixes are less likely to be put into 2.10.x
>>> - An extra branch to maintain
>>> - Committers have an extra branch (5 vs 4 total branches) to commit
>>>   patches to if they should go all the way back to 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> So on the one hand you get added stability from fewer features being
>>> committed to 2.10.x, but on the other you get fewer bug fixes being
>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>> But we don't live in a perfect world, and committers will make mistakes
>>> either because of lack of knowledge or simply because they
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1333/

[Nov 26, 2019 12:41:41 PM] (snemeth) YARN-9937. addendum: Add missing queue configs in
[Nov 26, 2019 3:36:19 PM] (github) HADOOP-16709. S3Guard: Make authoritative mode exclusive for metadata -
[Nov 26, 2019 3:42:59 PM] (snemeth) YARN-9444. YARN API ResourceUtils's getRequestedResourcesFromConfig
[Nov 26, 2019 7:11:26 PM] (weichiu) HADOOP-16685: FileSystem#listStatusIterator does not check if given path
[Nov 26, 2019 8:22:35 PM] (snemeth) YARN-9899. Migration tool that help to generate CS config based on FS
[Nov 26, 2019 8:29:12 PM] (prabhujoseph) YARN-9991. Fix Application Tag prefix to userid. Contributed by Szilard
[Nov 26, 2019 8:45:12 PM] (snemeth) YARN-9362. Code cleanup in TestNMLeveldbStateStoreService. Contributed
[Nov 26, 2019 9:04:07 PM] (snemeth) YARN-9290. Invalid SchedulingRequest not rejected in Scheduler

-1 overall

The following subsystems voted -1:
    asflicense findbugs pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    XML : Parsing Error(s):
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

    FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
        Class org.apache.hadoop.applications.mawo.server.common.TaskStatus implements Cloneable but does not define or use clone method At TaskStatus.java:[lines 39-346]
        Equals method for org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument is of type WorkerId At WorkerId.java:[line 114]
        org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does not check for null argument At WorkerId.java:[lines 114-115]

    FindBugs : module:hadoop-cloud-storage-project/hadoop-cos
        Redundant nullcheck of dir, which is known to be non-null in org.apache.hadoop.fs.cosn.BufferPool.createDir(String) At BufferPool.java:[line 66]
        org.apache.hadoop.fs.cosn.CosNInputStream$ReadBuffer.getBuffer() may expose internal representation by returning CosNInputStream$ReadBuffer.buffer At CosNInputStream.java:[line 87]
        Found reliance on default encoding in org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, File, byte[]): new String(byte[]) At CosNativeFileSystemStore.java:[line 199]
        Found reliance on default encoding in org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, InputStream, byte[], long): new String(byte[]) At CosNativeFileSystemStore.java:[line 178]
        org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.uploadPart(File, String, String, int) may fail to clean up java.io.InputStream; obligation to clean up resource created at CosNativeFileSystemStore.java:[line 252] is not discharged

    Failed junit tests:
        hadoop.hdfs.server.balancer.TestBalancer
        hadoop.hdfs.server.namenode.TestNamenodeCapacityReport
        hadoop.hdfs.server.namenode.TestRedudantBlocks
        hadoop.hdfs.tools.TestDFSZKFailoverController
        hadoop.hdfs.server.federation.router.TestRouterFaultTolerant
        hadoop.yarn.server.webproxy.amfilter.TestAmFilter
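Two of the WorkerId FindBugs warnings above (equals assumes the argument type; equals does not check for null) describe a common Java equals pitfall. A hedged sketch of the pattern the report is asking for follows; this minimal WorkerId with a single hostname field is hypothetical, not the real mawo class.

```java
import java.util.Objects;

// Hypothetical minimal WorkerId showing an equals() that satisfies both
// FindBugs warnings: the instanceof test rejects wrong types AND null
// before any cast, so no explicit null check or blind cast is needed.
public final class WorkerId {
    private final String hostname;

    public WorkerId(String hostname) {
        this.hostname = hostname;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;                 // reflexive fast path
        }
        if (!(o instanceof WorkerId)) {
            return false;                // also handles null: instanceof is false for null
        }
        WorkerId other = (WorkerId) o;   // cast is now safe
        return Objects.equals(hostname, other.hostname);
    }

    @Override
    public int hashCode() {
        return Objects.hash(hostname);   // must stay consistent with equals
    }
}
```

The `instanceof` check is why a separate `if (o == null)` branch is unnecessary: `null instanceof T` is always false, which is exactly the null-argument behavior the equals contract requires.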
[jira] [Created] (HDFS-15019) Refactor the unit test of TestDeadNodeDetection
Yiqun Lin created HDFS-15019:
--------------------------------

             Summary: Refactor the unit test of TestDeadNodeDetection
                 Key: HDFS-15019
                 URL: https://issues.apache.org/jira/browse/HDFS-15019
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Yiqun Lin
            Assignee: Lisheng Sun

There are many duplicated lines in the unit test {{TestDeadNodeDetection}}. We can simplify that. In addition, in {{testDeadNodeDetectionInMultipleDFSInputStream}}, the DFSInputStream is passed incorrectly in the assert operation.

{code}
din2 = (DFSInputStream) in1.getWrappedStream();
{code}

Should be:

{code}
din2 = (DFSInputStream) in2.getWrappedStream();
{code}
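The in1/in2 mix-up above is a classic copy-paste bug, and the refactor the issue asks for also removes the opportunity for it: fold the repeated unwrap-and-check sequence into one helper that only ever touches the stream it was handed. The sketch below is illustrative, not the real TestDeadNodeDetection code; the Wrapper interface is a hypothetical stand-in for HdfsDataInputStream.

```java
// Hypothetical helper illustrating the refactor: each stream is unwrapped
// through one method, so a sibling stream can never be captured by accident.
public final class StreamCheck {
    /** Minimal stand-in for HdfsDataInputStream#getWrappedStream(). */
    interface Wrapper {
        Object getWrappedStream();
    }

    /** Unwraps exactly the stream it is given and verifies the wrapped type. */
    static Object unwrapAndVerify(Wrapper in, Class<?> expected) {
        Object wrapped = in.getWrappedStream();
        if (!expected.isInstance(wrapped)) {
            throw new AssertionError("wrapped stream is not a " + expected.getName());
        }
        return wrapped;
    }
}
```

With such a helper, the test body becomes `din2 = (DFSInputStream) unwrapAndVerify(in2, DFSInputStream.class);` for each stream, and the duplicated lines collapse into one call per stream.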
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/

[Nov 27, 2019 12:46:38 AM] (xkrogen) HDFS-14973. More strictly enforce Balancer/Mover/SPS throttling of

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    XML : Parsing Error(s):
        hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
        hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

    FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
        Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335]

    Failed junit tests:
        hadoop.util.TestReadWriteDiskValidator
        hadoop.fs.sftp.TestSFTPFileSystem
        hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
        hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
        hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
        hadoop.registry.secure.TestSecureLogins
        hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2
        hadoop.yarn.client.api.impl.TestAMRMClient

    cc:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt [4.0K]
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-compile-cc-root-jdk1.8.0_222.txt [4.0K]

    javac:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt [328K]
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-compile-javac-root-jdk1.8.0_222.txt [308K]

    checkstyle:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-checkstyle-root.txt [16M]

    hadolint:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-patch-hadolint.txt [4.0K]

    pathlen:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/pathlen.txt [12K]

    pylint:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-patch-pylint.txt [24K]

    shellcheck:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-patch-shellcheck.txt [72K]

    shelldocs:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-patch-shelldocs.txt [8.0K]

    whitespace:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/whitespace-eol.txt [12M]
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/whitespace-tabs.txt [1.3M]

    xml:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/xml.txt [12K]

    findbugs:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html [8.0K]

    javadoc:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt [16K]
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_222.txt [1.1M]

    unit:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [168K]
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [324K]
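The "Boxed value is unboxed and then immediately reboxed" FindBugs warning in ColumnRWHelper above describes a wasted allocation: a primitive is extracted from a wrapper only to be wrapped again. A hedged minimal reproduction of the pattern and its fix, with hypothetical method names (not the actual ColumnRWHelper code):

```java
// Hypothetical reduction of the FindBugs BX (boxing) warning: the value
// arrives already boxed, so round-tripping through the primitive just
// allocates (or interns) a second wrapper for nothing.
public final class Rebox {
    /** Flagged style: unbox via longValue(), then immediately rebox. */
    static Long flagged(Object cell) {
        return Long.valueOf(((Long) cell).longValue());
    }

    /** Clean style: the cast already yields the boxed Long; return it as-is. */
    static Long clean(Object cell) {
        return (Long) cell;
    }
}
```

Both methods return equal values; the difference is purely the redundant box in `flagged`, which FindBugs reports because it costs an allocation on hot paths with no change in behavior.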