[jira] [Resolved] (HDFS-13571) Dead DataNode Detector

2019-11-27 Thread Yiqun Lin (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin resolved HDFS-13571.
--
Resolution: Fixed

> Dead DataNode Detector
> --
>
> Key: HDFS-13571
> URL: https://issues.apache.org/jira/browse/HDFS-13571
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.4.0, 2.6.0, 3.0.2
>Reporter: Gang Xie
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node 
> status machine.png
>
>
> Currently, the information about dead datanodes in DFSInputStream is stored 
> locally, so it cannot be shared among the input streams of the same 
> DFSClient. In our production environment, some datanodes die every day for 
> various causes. After the first input stream blocks on a dead node and 
> detects it, it cannot share this information with the other input streams in 
> the same DFSClient; thus, the other input streams remain blocked by the dead 
> node for some time, which can cause bad service latency.
> To eliminate this impact of dead datanodes, we designed a dead datanode 
> detector, which detects the dead nodes in advance and shares this information 
> among all the input streams in the same client. This improvement has been 
> online for some months and works fine, so we decided to port it to 3.0 (the 
> versions used in our production environment are 2.4 and 2.6).
> I will do the porting work and upload the code later.
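
As a rough illustration of the idea described above (a sketch only, not the 
actual patch; the class and method names below are hypothetical):

{code}
// Hypothetical sketch: one shared registry per DFSClient, consulted by every
// DFSInputStream before it picks a replica. Names are illustrative only.
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SharedDeadNodeRegistry {
  // Dead datanodes detected by any stream of this client, keyed by datanode ID.
  private final Set<String> deadNodes = ConcurrentHashMap.newKeySet();

  // Called when the detector (or a stream that hit a read timeout) judges a node dead.
  void markDead(String datanodeId) {
    deadNodes.add(datanodeId);
  }

  // Consulted by every input stream of the client before choosing a replica.
  boolean isDead(String datanodeId) {
    return deadNodes.contains(datanodeId);
  }

  // Called when a background probe finds the node healthy again.
  void markAlive(String datanodeId) {
    deadNodes.remove(datanodeId);
  }
}
{code}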



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-15020) Add a test case of storage type quota to TestHdfsAdmin.

2019-11-27 Thread Jinglun (Jira)
Jinglun created HDFS-15020:
--

 Summary: Add a test case of storage type quota to TestHdfsAdmin.
 Key: HDFS-15020
 URL: https://issues.apache.org/jira/browse/HDFS-15020
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Jinglun
Assignee: Jinglun
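
A minimal sketch of what such a test could exercise, using the existing 
HdfsAdmin#setQuotaByStorageType / clearQuotaByStorageType APIs (the directory 
name, sizes, and cluster wiring below are assumptions, not the actual patch):

{code}
// Hypothetical sketch only. Assumes a MiniDFSCluster is running and supplies
// the URI and Configuration, as in the other TestHdfsAdmin cases.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.StorageType;
import org.apache.hadoop.hdfs.client.HdfsAdmin;

public class StorageTypeQuotaSketch {
  static void checkStorageTypeQuota(URI clusterUri, Configuration conf)
      throws Exception {
    HdfsAdmin admin = new HdfsAdmin(clusterUri, conf);
    Path dir = new Path("/test-storage-type-quota");
    // Set a 2 MB quota for DISK storage on the directory.
    admin.setQuotaByStorageType(dir, StorageType.DISK, 2L * 1024 * 1024);
    // Writing more than 2 MB of DISK-placed data should now fail with
    // QuotaByStorageTypeExceededException; the real test would assert that.
    admin.clearQuotaByStorageType(dir, StorageType.DISK);
  }
}
{code}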






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-11-27 Thread Konstantin Shvachko
Hey guys,

I think we diverged a bit from the initial topic of this discussion, which
is removing branch-2.10, and changing the version of branch-2 from
2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
Sounds like the subject line for this thread "Making 2.10 the last minor
2.x release" confused people.
It is in fact a wider matter that can be discussed when somebody actually
proposes to release 2.11, which I understand nobody does at the moment.

So if anybody objects to removing branch-2.10, please make an argument.
Otherwise we should go ahead and just do it next week.
I see people still struggling to keep branch-2 and branch-2.10 in sync.

Thanks,
--Konstantin

On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung  wrote:

> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang  wrote:
>
>> +1 for 2.10.x as the last release of the 2.x line.
>>
>> Software becomes more compatible when more companies stress-test the same
>> software and make improvements in trunk.  Some may be extra cautious about
>> moving up the version because of internal obligations to keep things
>> running.  Company obligations should not be the driving force to maintain
>> Hadoop branches.  There is no proper collaboration in the community when
>> every name-brand company maintains its own Hadoop 2.x version.  I think it
>> would be healthier for the community to reduce the branch forking and
>> spend energy on trunk to harden the software.  This will give more
>> confidence to move up the version than trying to fix n permutations of
>> breakage, like the Flash fixing the timeline.
>>
>> The Apache license states there is no warranty of any kind for code
>> contributions.  Fewer community release processes should improve software
>> quality when eyes are on trunk, and help steer toward the same end goals.
>>
>> regards,
>> Eric
>>
>>
>>
>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>  wrote:
>>
>>> Hello all,
>>>
>>> Is it written anywhere what the difference is between a minor release
>>> and a
>>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>>> looked around and I can't find anything other than some compatibility
>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>> this would help shape my opinion on whether or not to keep branch-2
>>> alive.
>>> My current understanding is that we can't really break compatibility in
>>> either a minor or point release. But the only mention of the difference
>>> between minor and point releases is how to deal with Stable, Evolving,
>>> and
>>> Unstable tags, and how to deal with changing default configuration
>>> values.
>>> So it seems like there really isn't a big official difference between the
>>> two. In my mind, the functional difference between the two is that the
>>> minor releases may have added features and rewrites, while the point
>>> releases only have bug fixes. This might be an incorrect understanding,
>>> but
>>> that's what I have gathered from watching the releases over the last few
>>> years. Whether or not this is a correct understanding, I think that this
>>> needs to be documented somewhere, even if it is just a convention.
>>>
>>> Given my assumed understanding of minor vs point releases, here are the
>>> pros/cons that I can think of for having a branch-2. Please add on or
>>> correct me for anything you feel is missing or inadequate.
>>> Pros:
>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>> 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> Cons:
>>> - Bug fixes are less likely to be put into 2.10.x
>>> - An extra branch to maintain
>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> patches to if they should go all the way back to 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> So on the one hand you get added stability in fewer features being
>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>> But
>>> we don't live in a perfect world and committers will make mistakes either
>>> because of lack of knowledge or simply because they 

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-11-27 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1333/

[Nov 26, 2019 12:41:41 PM] (snemeth) YARN-9937. addendum: Add missing queue 
configs in
[Nov 26, 2019 3:36:19 PM] (github) HADOOP-16709. S3Guard: Make authoritative 
mode exclusive for metadata -
[Nov 26, 2019 3:42:59 PM] (snemeth) YARN-9444. YARN API ResourceUtils's 
getRequestedResourcesFromConfig
[Nov 26, 2019 7:11:26 PM] (weichiu) HADOOP-16685: FileSystem#listStatusIterator 
does not check if given path
[Nov 26, 2019 8:22:35 PM] (snemeth) YARN-9899. Migration tool that help to 
generate CS config based on FS
[Nov 26, 2019 8:29:12 PM] (prabhujoseph) YARN-9991. Fix Application Tag prefix 
to userid. Contributed by Szilard
[Nov 26, 2019 8:45:12 PM] (snemeth) YARN-9362. Code cleanup in 
TestNMLeveldbStateStoreService. Contributed
[Nov 26, 2019 9:04:07 PM] (snemeth) YARN-9290. Invalid SchedulingRequest not 
rejected in Scheduler




-1 overall


The following subsystems voted -1:
asflicense findbugs pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
 
   Class org.apache.hadoop.applications.mawo.server.common.TaskStatus 
implements Cloneable but does not define or use clone method At 
TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 
39-346] 
   Equals method for 
org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument 
is of type WorkerId At WorkerId.java:the argument is of type WorkerId At 
WorkerId.java:[line 114] 
   
org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does 
not check for null argument At WorkerId.java:null argument At 
WorkerId.java:[lines 114-115] 
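
For context, a generic sketch of the equals contract FindBugs is checking for 
here (not the actual WorkerId code; the getWorkerId() accessor is assumed):

{code}
// Generic sketch only -- not the actual WorkerId implementation.
@Override
public boolean equals(Object obj) {
  if (this == obj) {
    return true;
  }
  // instanceof rejects both null and any non-WorkerId argument in one check.
  if (!(obj instanceof WorkerId)) {
    return false;
  }
  WorkerId other = (WorkerId) obj;
  return this.getWorkerId().equals(other.getWorkerId());
}
{code}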

FindBugs :

   module:hadoop-cloud-storage-project/hadoop-cos 
   Redundant nullcheck of dir, which is known to be non-null in 
org.apache.hadoop.fs.cosn.BufferPool.createDir(String) Redundant null check at 
BufferPool.java:is known to be non-null in 
org.apache.hadoop.fs.cosn.BufferPool.createDir(String) Redundant null check at 
BufferPool.java:[line 66] 
   org.apache.hadoop.fs.cosn.CosNInputStream$ReadBuffer.getBuffer() may 
expose internal representation by returning CosNInputStream$ReadBuffer.buffer 
At CosNInputStream.java:by returning CosNInputStream$ReadBuffer.buffer At 
CosNInputStream.java:[line 87] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, File, 
byte[]):in org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, 
File, byte[]): new String(byte[]) At CosNativeFileSystemStore.java:[line 199] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, 
InputStream, byte[], long):in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, 
InputStream, byte[], long): new String(byte[]) At 
CosNativeFileSystemStore.java:[line 178] 
   org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.uploadPart(File, 
String, String, int) may fail to clean up java.io.InputStream Obligation to 
clean up resource created at CosNativeFileSystemStore.java:fail to clean up 
java.io.InputStream Obligation to clean up resource created at 
CosNativeFileSystemStore.java:[line 252] is not discharged 
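
The encoding and resource warnings above map to two common fixes, sketched 
generically below (not the actual hadoop-cos code): pass an explicit charset 
instead of relying on the platform default, and close the stream with 
try-with-resources.

{code}
// Generic sketch of the two fixes FindBugs suggests; not the actual hadoop-cos code.
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class CosWarningFixSketch {
  static String decode(byte[] bytes) {
    // Explicit charset instead of new String(byte[]) with the platform default.
    return new String(bytes, StandardCharsets.UTF_8);
  }

  static void uploadPart(File file) throws IOException {
    // try-with-resources guarantees the stream is closed on every path,
    // discharging the cleanup obligation FindBugs reports.
    try (InputStream in = new FileInputStream(file)) {
      // ... upload the part from `in` ...
    }
  }
}
{code}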

Failed junit tests :

   hadoop.hdfs.server.balancer.TestBalancer 
   hadoop.hdfs.server.namenode.TestNamenodeCapacityReport 
   hadoop.hdfs.server.namenode.TestRedudantBlocks 
   hadoop.hdfs.tools.TestDFSZKFailoverController 
   hadoop.hdfs.server.federation.router.TestRouterFaultTolerant 
   hadoop.yarn.server.webproxy.amfilter.TestAmFilter 
   

[jira] [Created] (HDFS-15019) Refactor the unit test of TestDeadNodeDetection

2019-11-27 Thread Yiqun Lin (Jira)
Yiqun Lin created HDFS-15019:


 Summary: Refactor the unit test of TestDeadNodeDetection 
 Key: HDFS-15019
 URL: https://issues.apache.org/jira/browse/HDFS-15019
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yiqun Lin
Assignee: Lisheng Sun


There are many duplicated lines in the unit test {{TestDeadNodeDetection}}. We 
can simplify that.

In addition, in {{testDeadNodeDetectionInMultipleDFSInputStream}}, the 
DFSInputStream is passed incorrectly in the assert operation.

{code}
din2 = (DFSInputStream) in1.getWrappedStream();
{code}
Should be 
{code}
din2 = (DFSInputStream) in2.getWrappedStream();
{code}
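
In other words, each stream should be unwrapped from its own handle before 
asserting on it; a sketch of the corrected pairing (variable names taken from 
the snippet above, the rest assumed):

{code}
DFSInputStream din1 = (DFSInputStream) in1.getWrappedStream();
DFSInputStream din2 = (DFSInputStream) in2.getWrappedStream();
// Assertions on din2 now actually exercise the second stream.
{code}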



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2019-11-27 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/

[Nov 27, 2019 12:46:38 AM] (xkrogen) HDFS-14973. More strictly enforce 
Balancer/Mover/SPS throttling of




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 
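
For context, this warning typically flags code of the following shape; a 
generic sketch (not the actual ColumnRWHelper code):

{code}
// Generic sketch of the unbox-then-rebox pattern FindBugs flags; not the
// actual ColumnRWHelper code.
import java.util.Map;
import java.util.TreeMap;

class ReboxSketch {
  static Map<Long, String> collect(Long timestamp, String value) {
    Map<Long, String> results = new TreeMap<>();
    // Flagged shape: results.put(Long.valueOf(timestamp.longValue()), value)
    // unboxes and immediately reboxes. Reusing the boxed object avoids it:
    results.put(timestamp, value);
    return results;
  }
}
{code}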

Failed junit tests :

   hadoop.util.TestReadWriteDiskValidator 
   hadoop.fs.sftp.TestSFTPFileSystem 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.registry.secure.TestSecureLogins 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
   hadoop.yarn.client.api.impl.TestAMRMClient 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [328K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-compile-cc-root-jdk1.8.0_222.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-compile-javac-root-jdk1.8.0_222.txt
  [308K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-patch-shellcheck.txt
  [72K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/whitespace-tabs.txt
  [1.3M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/xml.txt
  [12K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_222.txt
  [1.1M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [168K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/518/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [324K]