[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285
[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306025#comment-16306025 ]

genericqa commented on HDFS-12911:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || HDFS-10285 Compile Tests ||
| -1 | mvninstall | 0m 18s | root in HDFS-10285 failed. |
| -1 | compile | 0m 10s | hadoop-hdfs in HDFS-10285 failed. |
| +1 | checkstyle | 2m 0s | HDFS-10285 passed |
| -1 | mvnsite | 0m 9s | hadoop-hdfs in HDFS-10285 failed. |
| -1 | shadedclient | 2m 19s | branch has errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 9s | hadoop-hdfs in HDFS-10285 failed. |
| -1 | javadoc | 0m 10s | hadoop-hdfs in HDFS-10285 failed. |
|| || || || Patch Compile Tests ||
| -1 | mvninstall | 0m 9s | hadoop-hdfs in the patch failed. |
| -1 | compile | 0m 9s | hadoop-hdfs in the patch failed. |
| -1 | javac | 0m 9s | hadoop-hdfs in the patch failed. |
| -0 | checkstyle | 0m 39s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 61 new + 393 unchanged - 0 fixed = 454 total (was 393) |
| -1 | mvnsite | 0m 9s | hadoop-hdfs in the patch failed. |
| -1 | whitespace | 0m 0s | The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| -1 | shadedclient | 0m 9s | patch has errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 8s | hadoop-hdfs in the patch failed. |
| -1 | javadoc | 0m 9s | hadoop-hdfs in the patch failed. |
|| || || || Other Tests ||
| -1 | unit | 0m 9s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 6m 21s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12911 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903981/HDFS-12911-HDFS-10285-02.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ab81a0066861 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-10285 / 7f4c4ab |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/22520/artifact/out/branch-mvninstall-root.txt |
| compile | https://builds.apache.org/job/PreCommit-HDFS-Build/22520/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs.txt |
| mvnsite |
[jira] [Updated] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285
[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-12911:
---
    Status: Patch Available  (was: Open)

> [SPS]: Fix review comments from discussions in HDFS-10285
> ---
>
>                 Key: HDFS-12911
>                 URL: https://issues.apache.org/jira/browse/HDFS-12911
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>            Reporter: Uma Maheswara Rao G
>            Assignee: Rakesh R
>         Attachments: HDFS-12911-HDFS-10285-01.patch, HDFS-12911-HDFS-10285-02.patch, HDFS-12911.00.patch
>
>
> This is the JIRA for tracking the possible improvements or issues discussed in the main JIRA.
> Comments to handle so far:
> Daryn:
>  # The lock should not be kept while executing the placement policy.
>  # While starting up the NN, SPS Xattr checks happen even if the feature is disabled. This could potentially impact the startup speed.
> UMA:
>  # I am adding one more possible improvement to reduce Xattr objects significantly.
>  The SPS Xattr is a constant object. So, we create one deduplicated Xattr object once, statically, and use the same object reference whenever the SPS Xattr has to be added to an Inode. The additional bytes required for storing the SPS Xattr would then be the same as a single object ref (i.e. 4 bytes in 32 bit). So Xattr overhead should come down significantly IMO. Let's explore the feasibility of this option.
>  The Xattr list Feature will not be specially created for SPS; that list would already have been created by SetStoragePolicy on the same directory. So, no extra Feature creation because of SPS alone.
>  # Currently SPS puts Long id objects in the Q for tracking the Inodes on which SPS was called. Each of these is an additionally created object, and its size would be (obj ref + value) = (8 + 8) bytes [ignoring alignment for the time being].
>  So, the possible improvement here is: instead of creating a new Long obj, we can keep the existing Inode object for tracking. The advantage is that the Inode object is already maintained in the NN, so no new object creation is needed. We just need to maintain one obj ref. The above two points should significantly reduce the memory requirements of SPS. So, for an SPS call: 8 bytes for called-inode tracking + 8 bytes for the Xattr ref.
>  # Use LightWeightLinkedSet instead of LinkedList for the Q. This will reduce unnecessary Node creations inside LinkedList.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
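
The two memory ideas in the description above (one shared Xattr object, and tracking the existing Inode instead of a new Long) can be sketched in plain Java. This is only an illustration under stated assumptions: `XAttr`, `Inode`, the attribute name, and the method names are hypothetical stand-ins, not the Hadoop classes, and `ArrayDeque` stands in for the `LightWeightLinkedSet` the description proposes.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class SpsMemorySketch {
    /** Hypothetical stand-in for an extended attribute; immutable, so one instance can be shared. */
    static final class XAttr {
        final String name;
        XAttr(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for an inode that the NameNode already keeps in memory. */
    static final class Inode {
        final long id;
        XAttr spsXAttr; // null until SPS is requested for this inode
        Inode(long id) { this.id = id; }
    }

    /** Created once; every inode that needs the SPS marker points at this same object. */
    static final XAttr SPS_XATTR = new XAttr("hypothetical.sps.xattr");

    /** Tracks the inodes themselves rather than freshly boxed Long ids. */
    private final Queue<Inode> pending = new ArrayDeque<>();

    void satisfyStoragePolicy(Inode inode) {
        inode.spsXAttr = SPS_XATTR; // one reference, no per-inode XAttr object
        pending.add(inode);         // one reference, no new Long object
    }

    /** Returns the next tracked inode, or null when nothing is pending. */
    Inode next() {
        return pending.poll();
    }
}
```

Per call, the only new state is two references (one to the shared attribute, one in the queue), which is the accounting the description arrives at.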
[jira] [Updated] (HDFS-12966) Ozone: owner name should be set properly when the container allocation happens
[ https://issues.apache.org/jira/browse/HDFS-12966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shashikant Banerjee updated HDFS-12966:
---
    Status: Patch Available  (was: Open)

> Ozone: owner name should be set properly when the container allocation happens
> ---
>
>                 Key: HDFS-12966
>                 URL: https://issues.apache.org/jira/browse/HDFS-12966
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: HDFS-7240
>    Affects Versions: HDFS-7240
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>             Fix For: HDFS-7240
>
>         Attachments: HDFS-12966-HDFS-7240.001.patch
>
>
> Currently, when a container is allocated, the owner name is hardcoded as "OZONE".
> It should instead be set to the KSM instance ID or CBlock Manager instance ID from which the container creation call originates.
[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306012#comment-16306012 ]

Uma Maheswara Rao G commented on HDFS-10285:
---

Hi [~virajith],
Thanks for the feedback.
{quote}
Quick question about this – what are clients you are referring to here?
{quote}
I meant the client-side SPS API (today it is exposed via HDFS clients).
{quote}
If the SPS is going to run outside the NN, I would think it is going to be decoupled from/not depend on the FSNamesystem lock and the NN-DN heartbeat protocol. The current implementation/design has a tight coupling between the SPS and both these components.
{quote}
I have just posted a patch with SPS modularization. With that, the SPS core implementation will not access NN internals directly; instead it will access them via an interface. The implementation of that interface is pluggable. Please take a look at HDFS-12911.

> Storage Policy Satisfier in Namenode
> ---
>
>                 Key: HDFS-10285
>                 URL: https://issues.apache.org/jira/browse/HDFS-10285
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>    Affects Versions: HDFS-10285
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>         Attachments: HDFS-10285-consolidated-merge-patch-00.patch, HDFS-10285-consolidated-merge-patch-01.patch, HDFS-10285-consolidated-merge-patch-02.patch, HDFS-10285-consolidated-merge-patch-03.patch, HDFS-SPS-TestReport-20170708.pdf, Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, Storage-Policy-Satisfier-in-HDFS-May10.pdf, Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policy. These policies can be set on a directory/file to specify the user's preference for where to store the physical blocks. When the user sets the storage policy before writing data, the blocks can take advantage of the storage policy preference and the physical blocks are stored accordingly.
> If the user sets the storage policy after writing and completing the file, the blocks will already have been written with the default storage policy (that is, DISK). The user then has to run the ‘Mover tool’ explicitly, specifying all such file names as a list. In some distributed system scenarios (e.g. HBase) it would be difficult to collect all the files and run the tool, as different nodes can write files separately and files can have different paths.
> Another scenario: when the user renames a file from a directory with one effective storage policy (inherited from the parent directory) to a directory with a different effective storage policy, the inherited storage policy is not copied from the source. The file takes its policy from the destination file/dir parent. The rename operation is just a metadata change in the Namenode; the physical blocks still remain placed according to the source storage policy.
> So, tracking all such file names based on business logic, from distributed nodes (e.g. region servers), and running the Mover tool could be difficult for admins.
> The proposal here is to provide an API in the Namenode itself to trigger storage policy satisfaction. A daemon thread inside the Namenode should track such calls and send them to the DNs as movement commands.
> Will post the detailed design thoughts document soon.
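
The proposal at the end of the description (an API call that only records the request, plus a daemon thread that later turns tracked requests into movement commands) has roughly the following shape. This is a minimal sketch, not the HDFS-10285 code: the class name, `MoveDispatcher`, and the use of a `LinkedBlockingQueue` of inode ids are all assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SatisfierDaemonSketch {
    /** Hypothetical callback that would turn a tracked file into DN movement commands. */
    interface MoveDispatcher {
        void dispatch(long inodeId);
    }

    private final BlockingQueue<Long> pending = new LinkedBlockingQueue<>();
    private final Thread worker;

    SatisfierDaemonSketch(MoveDispatcher dispatcher) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    // Blocks until a request is available, then hands it off.
                    dispatcher.dispatch(pending.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit the loop on shutdown
            }
        }, "sps-sketch");
        worker.setDaemon(true); // does not keep the JVM alive
        worker.start();
    }

    /** The API call only records the request; movement happens asynchronously. */
    void satisfyStoragePolicy(long inodeId) {
        pending.add(inodeId);
    }

    void stop() {
        worker.interrupt();
    }
}
```

The caller returns as soon as the id is queued, so the cost of scheduling block moves never sits on the request path.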
[jira] [Updated] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285
[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-12911:
---
    Attachment: HDFS-12911-HDFS-10285-02.patch

Attached a patch with SPS modularization.
# For scanning: added an interface, FileIDCollector. Specific scanning implementations can be plugged in.
# For SPS-NN communication: added an interface, Context, which is the same as what [~chris.douglas] proposed in the first patch. All NN-related interaction will happen via this interface. Specific implementations can be plugged in: one could access NN internals directly, and another could use RPC to get info from the NN.
# For block move tasks: added an interface, BlockMoveTaskHandler. Specific implementations for block movement can be plugged in: one can be heartbeat-based assignment, and another can send block move ops directly to the DNs.

In this patch, I provided the internal SPS implementation (IntraSPSNameNodeContext, IntraSPSNameNodeFileIDCollector, IntraSPSNameNodeBlockMoveTaskHandler), and for external SPS I just created dummy plugin classes (ExternalSPSContext, ExternalSPSFileIDCollector, ExternalSPSBlockMoveTaskHandler).

Feedback is most welcome.
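
The three extension points named in the comment (FileIDCollector, Context, BlockMoveTaskHandler) suggest a core loop that depends only on interfaces. The sketch below illustrates that idea; only the three interface names come from the comment, while every method name and signature here is a hypothetical placeholder, not the API in the attached patch.

```java
import java.util.List;

public class SpsPluginSketch {
    /** Scanning: finds the files under a directory that SPS should examine. */
    interface FileIDCollector {
        List<Long> collectFileIds(long dirInodeId); // hypothetical method
    }

    /** SPS-NN communication: the only way the SPS core talks to the NN. */
    interface Context {
        boolean isRunning();                        // hypothetical method
        List<String> getBlocksToMove(long fileId);  // hypothetical method
    }

    /** Block movement: how move tasks reach the DNs. */
    interface BlockMoveTaskHandler {
        void submitMoveTask(String blockId);        // hypothetical method
    }

    /** SPS core: uses only the three interfaces, never NN internals. Returns tasks submitted. */
    static int satisfy(long dirInodeId, FileIDCollector collector,
                       Context ctx, BlockMoveTaskHandler handler) {
        int submitted = 0;
        for (long fileId : collector.collectFileIds(dirInodeId)) {
            if (!ctx.isRunning()) {
                break; // stop early if the service was shut down
            }
            for (String blockId : ctx.getBlocksToMove(fileId)) {
                handler.submitMoveTask(blockId);
                submitted++;
            }
        }
        return submitted;
    }
}
```

With this split, an in-NameNode deployment and an external one would differ only in which three implementations are passed to `satisfy`.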
[jira] [Commented] (HDFS-7060) Avoid taking locks when sending heartbeats from the DataNode
[ https://issues.apache.org/jira/browse/HDFS-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305967#comment-16305967 ]

He Xiaoqiao commented on HDFS-7060:
---

The last patch looks good to me. Some notes on testing:
1. heartbeatsAvgTime latency will not be reduced noticeably, especially for an HDFS cluster with federation and HA with QJM, since {{HeartbeatsAvgTime}} is a statistical indicator that averages the sendHeartbeat times to all {{namenodes}}. If the Standby NameNode takes a long time to process the RPC, which actually happens frequently while it is tailing the edit log or checkpointing, the heartbeatsAvgTime indicator at the DataNode will not be reduced as expected.
2. When the DataNode is set to interact only with the Active NameNode for testing, the result is as expected.

In short, this patch is helpful for me. Thanks [~yangjiandan] and [~cheersyang].

> Avoid taking locks when sending heartbeats from the DataNode
> ---
>
>                 Key: HDFS-7060
>                 URL: https://issues.apache.org/jira/browse/HDFS-7060
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haohui Mai
>            Assignee: Jiandan Yang
>              Labels: BB2015-05-TBR, locks, performance
>             Fix For: 3.0.0, 3.1.0
>
>         Attachments: HDFS Status Post Patch.png, HDFS-7060-002.patch, HDFS-7060.000.patch, HDFS-7060.001.patch, HDFS-7060.003.patch, HDFS-7060.004.patch, HDFS-7060.005.patch, complete_failed_qps.png, sendHeartbeat.png
>
>
> We're seeing the heartbeat is blocked by the monitor of {{FsDatasetImpl}} when the DN is under heavy load of writes:
> {noformat}
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:115)
>         - waiting to lock <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:91)
>         - locked <0x000780612fd8> (a java.lang.Object)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:563)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:668)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:827)
>         at java.lang.Thread.run(Thread.java:744)
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:743)
>         - waiting to lock <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
>         at java.lang.Thread.run(Thread.java:744)
>    java.lang.Thread.State: RUNNABLE
>         at java.io.UnixFileSystem.createFileExclusively(Native Method)
>         at java.io.File.createNewFile(File.java:1006)
>         at org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:59)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:244)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:195)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:753)
>         - locked <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
>         at java.lang.Thread.run(Thread.java:744)
> {noformat}
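
In the traces above, the heartbeat thread blocks in `getDfsUsed` waiting for the `FsDatasetImpl` monitor, which a writer thread holds while creating a file on disk. One common way to take a read like that off the lock is to keep the figure in an atomic counter. The sketch below shows that general technique only; it is not taken from the HDFS-7060 patch, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

public class HeartbeatUsageSketch {
    /** Stand-in for the dataset-wide monitor that writers hold during disk work. */
    private final Object datasetLock = new Object();

    /** Usage figure kept in an atomic so readers never need the monitor. */
    private final AtomicLong dfsUsed = new AtomicLong();

    /** Write path: still serialized on the dataset lock, as in the traces above. */
    void finalizeBlock(long numBytes) {
        synchronized (datasetLock) {
            // ... slow disk work would happen here ...
            dfsUsed.addAndGet(numBytes);
        }
    }

    /** Heartbeat path: reads the counter without touching the dataset lock. */
    long getDfsUsed() {
        return dfsUsed.get();
    }
}
```

The value a heartbeat reports may lag a write that is still in progress, which is usually acceptable for a periodic usage report.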
[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
[ https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305908#comment-16305908 ]

genericqa commented on HDFS-12915:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 18s | trunk passed |
| +1 | compile | 0m 52s | trunk passed |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 0m 56s | trunk passed |
| +1 | shadedclient | 10m 50s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 42s | hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 51s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 47s | the patch passed |
| +1 | javac | 0m 47s | the patch passed |
| +1 | checkstyle | 0m 32s | the patch passed |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 21s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 49s | hadoop-hdfs-project/hadoop-hdfs generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) |
| +1 | javadoc | 0m 50s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 118m 42s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 166m 34s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12915 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903955/HDFS-12915.02.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ad93ee01f0cc 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5bf7e59 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/22518/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| unit |
[jira] [Commented] (HDFS-12967) NNBench should support multi-cluster access
[ https://issues.apache.org/jira/browse/HDFS-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305877#comment-16305877 ]

Chen Zhang commented on HDFS-12967:
---

Thanks for your response and suggestion, [~jojochuang].

bq. what about adding a new command parameter so that this support is visible
I don't think it needs a new command parameter; just using a path with a prefix like hdfs://some-cluster/user/foo will work. It is reasonable to add some explanation to the help text, though.

bq. You should also consider adding tests in TestNNBench
Thanks for the reminder, I'll add a test soon.

> NNBench should support multi-cluster access
> ---
>
>                 Key: HDFS-12967
>                 URL: https://issues.apache.org/jira/browse/HDFS-12967
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: benchmarks
>            Reporter: Chen Zhang
>            Assignee: Chen Zhang
>         Attachments: HDFS-12967-001.patch
>
>
> Sometimes we need to run NNBench for scaling tests after making improvements to the NameNode, so we have to deploy a new HDFS cluster and a new YARN cluster.
> If NNBench supports multi-cluster access, we only need to deploy a new HDFS test cluster and add it to the existing YARN cluster, which makes the scaling test easier.
> Moreover, if we want to do A-B tests, we have to run NNBench on different HDFS clusters, so this patch will be helpful.
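
The point in the reply above is that a fully qualified path such as hdfs://some-cluster/user/foo already names its cluster, so no extra parameter is needed. The plain-Java sketch below illustrates that resolution rule with `java.net.URI`; the class and method names are hypothetical. In Hadoop, `Path#getFileSystem(Configuration)` performs this kind of lookup, though whether the attached patch relies on it is an assumption here.

```java
import java.net.URI;

public class BenchPathSketch {
    /**
     * Returns the filesystem URI a benchmark directory should be opened against:
     * the scheme and authority from the path if it is fully qualified,
     * otherwise the default filesystem passed in.
     */
    static URI fileSystemFor(String baseDir, URI defaultFs) {
        URI dir = URI.create(baseDir);
        if (dir.getScheme() != null && dir.getAuthority() != null) {
            return URI.create(dir.getScheme() + "://" + dir.getAuthority());
        }
        return defaultFs;
    }
}
```

For example, `fileSystemFor("hdfs://some-cluster/user/foo", defaultFs)` selects `hdfs://some-cluster`, while an unqualified path such as `/benchmarks/NNBench` falls back to `defaultFs`.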
[jira] [Updated] (HDFS-12967) NNBench should support multi-cluster access
[ https://issues.apache.org/jira/browse/HDFS-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chen Zhang updated HDFS-12967:
---
    Description: 
Sometimes we need to run NNBench for some scaling tests after made some improvements on NameNode, so we have to deploy a new HDFS cluster and a new Yarn cluster.
If NNBench support multi-cluster access, we only need to deploy a new HDFS test cluster and add it to existing YARN cluster, it'll make the scaling test easier.
Even more, if we want to do some A-B test, we have to run NNBench on different HDFS clusters, this patch will be helpful.

  was:
Sometimes we need to run NNBench for some scaling tests after made some improvements on NameNode, so we have to deploy a new HDFS cluster and a new Yarn cluster.
If NNBench support multi-cluster access, we only need to deploy a new HDFS test cluster and add it to existing YARN cluster, it'll make the scaling test easier.
Even more, if we want to make some A-B test, we have to run NNBench on different HDFS clusters, this patch will be helpful.
[jira] [Commented] (HDFS-12860) StripedBlockUtil#getRangesInternalBlocks throws exception for the block group size larger than 2GB
[ https://issues.apache.org/jira/browse/HDFS-12860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305864#comment-16305864 ]

genericqa commented on HDFS-12860:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 15m 7s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 56s | trunk passed |
| +1 | compile | 1m 37s | trunk passed |
| +1 | checkstyle | 0m 43s | trunk passed |
| +1 | mvnsite | 1m 44s | trunk passed |
| +1 | shadedclient | 12m 29s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 53s | hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 13s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 37s | the patch passed |
| +1 | compile | 1m 37s | the patch passed |
| +1 | javac | 1m 37s | the patch passed |
| +1 | checkstyle | 0m 41s | hadoop-hdfs-project: The patch generated 0 new + 22 unchanged - 5 fixed = 22 total (was 27) |
| +1 | mvnsite | 1m 38s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 12s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 29s | the patch passed |
| +1 | javadoc | 1m 14s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 22s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 117m 55s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 192m 48s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12860 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903942/HDFS-12860.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux babb7c90c411 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5bf7e59 |
| maven | version: Apache Maven 3.3.9 |
[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
[ https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305833#comment-16305833 ] Chris Douglas commented on HDFS-12915: -- Please don't hesitate to effect a different refactoring (e.g., removing {{blockType}}) if you prefer it to this patch. There's no reason to rewrite the same utility function multiple times. > Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy > --- > > Key: HDFS-12915 > URL: https://issues.apache.org/jira/browse/HDFS-12915 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang > Attachments: HDFS-12915.00.patch, HDFS-12915.01.patch, > HDFS-12915.02.patch > > > It seems HDFS-12840 creates a new findbugs warning. > Possible null pointer dereference of replication in > org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType, > Short, Byte) > Bug type NP_NULL_ON_SOME_PATH (click for details) > In class org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat > In method > org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType, > Short, Byte) > Value loaded from replication > Dereferenced at INodeFile.java:[line 210] > Known null at INodeFile.java:[line 207] > From a quick look at the patch, it seems bogus though. [~eddyxu][~Sammi] > would you please double check? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
[ https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12915: - Attachment: HDFS-12915.02.patch Added a check for replication policy set for {{STRIPED}} > Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy > --- > > Key: HDFS-12915 > URL: https://issues.apache.org/jira/browse/HDFS-12915 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang > Attachments: HDFS-12915.00.patch, HDFS-12915.01.patch, > HDFS-12915.02.patch > > > It seems HDFS-12840 creates a new findbugs warning. > Possible null pointer dereference of replication in > org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType, > Short, Byte) > Bug type NP_NULL_ON_SOME_PATH (click for details) > In class org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat > In method > org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType, > Short, Byte) > Value loaded from replication > Dereferenced at INodeFile.java:[line 210] > Known null at INodeFile.java:[line 207] > From a quick look at the patch, it seems bogus though. [~eddyxu][~Sammi] > would you please double check? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12860) StripedBlockUtil#getRangesInternalBlocks throws exception for the block group size larger than 2GB
[ https://issues.apache.org/jira/browse/HDFS-12860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-12860: - Attachment: HDFS-12860.01.patch Hi, [~Sammi]. I updated the patch to add more comments to the test and to address your comments 1) and 2). It also fixes the checkstyle warnings. Regarding the end-to-end tests, I would like to know your thoughts on this matter. > StripedBlockUtil#getRangesInternalBlocks throws exception for the block group > size larger than 2GB > -- > > Key: HDFS-12860 > URL: https://issues.apache.org/jira/browse/HDFS-12860 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-12860.00.patch, HDFS-12860.01.patch > > > Running terasort on a cluster with 8 datanodes, 256GB data, using > RS-3-2-1024k. > The test data was generated by {{teragen}} with 32 mappers. > The terasort benchmark fails with the following stack trace: > {code} > 17/11/27 14:44:31 INFO mapreduce.Job: map 45% reduce 0% > 17/11/27 14:44:33 INFO mapreduce.Job: Task Id : > attempt_1510080297865_0160_m_08_0, Status : FAILED > Error: java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:72) > at > org.apache.hadoop.hdfs.util.StripedBlockUtil$VerticalRange.(StripedBlockUtil.java:701) > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getRangesForInternalBlocks(StripedBlockUtil.java:442) > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(StripedBlockUtil.java:311) > at > org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:308) > at > org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:391) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:813) > at java.io.DataInputStream.read(DataInputStream.java:149) > at > 
org.apache.hadoop.examples.terasort.TeraInputFormat$TeraRecordReader.nextKeyValue(TeraInputFormat.java:257) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:562) > at > org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) > at > org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
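The trace in HDFS-12860 shows a `Preconditions.checkArgument` failure once the block group grows past 2GB. The message does not state the root cause; one plausible way such a check can trip, assumed here purely for illustration, is a 64-bit offset being narrowed to `int`. The class and method names below are hypothetical and are not taken from `StripedBlockUtil`.

```java
// Illustrative sketch only: names are hypothetical, not Hadoop code.
final class OffsetNarrowingSketch {
  // A plain cast keeps only the low 32 bits, so a value above
  // Integer.MAX_VALUE can silently wrap to a negative number.
  static int narrowUnsafe(long offsetInBlockGroup) {
    return (int) offsetInBlockGroup;
  }

  // Math.toIntExact throws ArithmeticException instead of wrapping.
  static int narrowChecked(long offsetInBlockGroup) {
    return Math.toIntExact(offsetInBlockGroup);
  }

  public static void main(String[] args) {
    long threeGiB = 3L * 1024 * 1024 * 1024;
    // The wrapped value is negative, which a non-negativity check would reject.
    System.out.println(narrowUnsafe(threeGiB));
    try {
      narrowChecked(threeGiB);
    } catch (ArithmeticException e) {
      System.out.println("overflow detected");
    }
  }
}
```

Keeping offsets as `long` end to end avoids the problem altogether; the checked conversion is only useful where an `int` is genuinely required.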
[jira] [Commented] (HDFS-12963) error log level in ShortCircuitRegistry#removeShm
[ https://issues.apache.org/jira/browse/HDFS-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305725#comment-16305725 ] Rushabh S Shah commented on HDFS-12963: --- +1 lgtm non-binding. > error log level in ShortCircuitRegistry#removeShm > - > > Key: HDFS-12963 > URL: https://issues.apache.org/jira/browse/HDFS-12963 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hu xiaodong >Assignee: hu xiaodong >Priority: Minor > Attachments: HDFS-12963.001.patch > > > {code:title=org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.java|borderStyle=solid} > public synchronized void removeShm(ShortCircuitShm shm) { > if (LOG.isTraceEnabled()) { > LOG.debug("removing shm " + shm); -- I think here > should be trace > } > // Stop tracking the shmId. > RegisteredShm removedShm = segments.remove(shm.getShmId()); > Preconditions.checkState(removedShm == shm, > "failed to remove " + shm.getShmId()); > // Stop tracking the slots. > for (Iterator iter = shm.slotIterator(); iter.hasNext(); ) { > Slot slot = iter.next(); > boolean removed = slots.remove(slot.getBlockId(), slot); > Preconditions.checkState(removed); > slot.makeInvalid(); > } > // De-allocate the memory map and close the shared file. > shm.free(); > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
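The mismatch reported in HDFS-12963 (a trace-level guard around a debug-level call) can be sketched as follows. `SketchLog` is a hypothetical in-file stand-in written for this example; it is not the Hadoop or SLF4J logger, and the attached patch may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal logger, for illustration only.
final class SketchLog {
  enum Level { TRACE, DEBUG, INFO }

  private final Level threshold;
  final List<String> lines = new ArrayList<>();

  SketchLog(Level threshold) { this.threshold = threshold; }

  boolean isTraceEnabled() { return threshold.compareTo(Level.TRACE) <= 0; }
  boolean isDebugEnabled() { return threshold.compareTo(Level.DEBUG) <= 0; }

  void trace(String msg) { if (isTraceEnabled()) lines.add("TRACE " + msg); }
  void debug(String msg) { if (isDebugEnabled()) lines.add("DEBUG " + msg); }
}

final class RemoveShmLoggingSketch {
  // The guard and the call use the same level, as the report suggests.
  static void logRemoval(SketchLog log, String shmId) {
    if (log.isTraceEnabled()) {
      log.trace("removing shm " + shmId);
    }
  }
}
```

With the levels aligned, the message is emitted at trace level when trace is enabled and is skipped entirely otherwise.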
[jira] [Updated] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged
[ https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-9023: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.1 2.9.1 2.10.0 3.1.0 Status: Resolved (was: Patch Available) Precommit failures look unrelated. Pushed this to trunk, branch-3.0, branch-2 and branch-2.9, to match HDFS-11494. Thanks for reporting the issue and the reviews, [~surendrasingh]! > When NN is not able to identify DN for replication, reason behind it can be > logged > -- > > Key: HDFS-9023 > URL: https://issues.apache.org/jira/browse/HDFS-9023 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Xiao Chen >Priority: Critical > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1 > > Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, > HDFS-9023.03.patch, HDFS-9023.branch-2.patch > > > When NN is not able to identify DN for replication, reason behind it can be > logged (at least critical information why DNs not chosen like disk is full). > At present it is expected to enable debug log. > For example the reason for below error looks like all 7 DNs are busy for data > writes. But at client or NN side no hint is given in the log message. > {noformat} > File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp > could only be replicated to 0 nodes instead of minReplication (=1). There > are 7 datanode(s) running and no node(s) are excluded in this operation. > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553) > > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged
[ https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305700#comment-16305700 ] Hudson commented on HDFS-9023: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #13422 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13422/]) HDFS-9023. When NN is not able to identify DN for replication, reason (xiao: rev 5bf7e594d7d54e5295fe4240c3d60c08d4755ab7) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java > When NN is not able to identify DN for replication, reason behind it can be > logged > -- > > Key: HDFS-9023 > URL: https://issues.apache.org/jira/browse/HDFS-9023 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Xiao Chen >Priority: Critical > Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, > HDFS-9023.03.patch, HDFS-9023.branch-2.patch > > > When NN is not able to identify DN for replication, reason behind it can be > logged (at least critical information why DNs not chosen like disk is full). > At present it is expected to enable debug log. > For example the reason for below error looks like all 7 DNs are busy for data > writes. But at client or NN side no hint is given in the log message. > {noformat} > File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp > could only be replicated to 0 nodes instead of minReplication (=1). There > are 7 datanode(s) running and no node(s) are excluded in this operation. 
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553) > > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged
[ https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305693#comment-16305693 ] genericqa commented on HDFS-9023: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 18s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | 
{color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 41s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 1m 25s{color} | {color:red} The patch generated 350 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}105m 0s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Unreaped Processes | hadoop-hdfs:20 | | Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 | | | org.apache.hadoop.hdfs.web.TestWebHdfsTokens | | | org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream | | | org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade | | | org.apache.hadoop.hdfs.TestFileAppendRestart | | | org.apache.hadoop.hdfs.security.TestDelegationToken | | | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter | | | org.apache.hadoop.hdfs.TestDFSMkdirs | | | org.apache.hadoop.hdfs.TestDatanodeReport | | | org.apache.hadoop.hdfs.web.TestWebHDFS | | | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr | | | org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes | | | org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs | | | org.apache.hadoop.hdfs.TestSnapshotCommands | | | org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs | | | org.apache.hadoop.hdfs.TestDistributedFileSystem | | | org.apache.hadoop.hdfs.web.TestWebHDFSForHA | | | org.apache.hadoop.hdfs.TestReplaceDatanodeFailureReplication | | | org.apache.hadoop.hdfs.TestDFSShell | | | org.apache.hadoop.hdfs.web.TestWebHDFSAcl | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 | | JIRA Issue | HDFS-9023 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903933/HDFS-9023.branch-2.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | |
[jira] [Commented] (HDFS-12897) Path not found when we get the ec policy for a .snapshot dir
[ https://issues.apache.org/jira/browse/HDFS-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305670#comment-16305670 ] Xiao Chen commented on HDFS-12897: -- Thanks [~GeLiXin] for looking into the issue and providing the patch. Patch looks good overall. Some comments below: - The {{FSDirectory.isExactReservedName}} check can happen before the path resolution, so we don't do unnecessary {{fsd.resolvePath}}. - There is also {{/.reserved/raw}} under {{/.reserved}}. I think path resolution handles that but would love a test covering it. - Trivial, but prefer {{ecDotSnapshotDir}} declaration in the test to happen lazily, right before it's used (instead of at the beginning). > Path not found when we get the ec policy for a .snapshot dir > > > Key: HDFS-12897 > URL: https://issues.apache.org/jira/browse/HDFS-12897 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding, hdfs, snapshots >Affects Versions: 3.0.0-alpha1, 3.1.0 >Reporter: Harshakiran Reddy >Assignee: LiXin Ge > Attachments: HDFS-12897.001.patch, HDFS-12897.002.patch > > > Scenario:- > --- > Operation on snapshot dir. 
> *EC policy* > bin> ./hdfs ec -getPolicy -path /dir/ > RS-3-2-1024k > bin> ./hdfs ec -getPolicy -path /dir/.snapshot/ > {{FileNotFoundException: Path not found: /dir/.snapshot}} > bin> ./hdfs dfs -ls /dir/.snapshot/ > Found 2 items > drwxr-xr-x - user group 0 2017-12-05 12:27 /dir/.snapshot/s1 > drwxr-xr-x - user group 0 2017-12-05 12:28 /dir/.snapshot/s2 > *Storagepolicies* > bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/.snapshot/ > {{The storage policy of /dir/.snapshot/ is unspecified}} > bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/ > The storage policy of /dir/: > BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], > replicationFallbacks=[]} > *Which is the correct behavior ?* -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12967) NNBench should support multi-cluster access
[ https://issues.apache.org/jira/browse/HDFS-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305635#comment-16305635 ] Wei-Chiu Chuang commented on HDFS-12967: Sounds like a good improvement. For completeness, what about adding a new command parameter so that this support is visible? You should also consider adding tests in TestNNBench. Thanks > NNBench should support multi-cluster access > --- > > Key: HDFS-12967 > URL: https://issues.apache.org/jira/browse/HDFS-12967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: benchmarks >Reporter: Chen Zhang >Assignee: Chen Zhang > Attachments: HDFS-12967-001.patch > > > Sometimes we need to run NNBench for some scaling tests after made some > improvements on NameNode, so we have to deploy a new HDFS cluster and a new > Yarn cluster. > If NNBench support multi-cluster access, we only need to deploy a new HDFS > test cluster and add it to existing YARN cluster, it'll make the scaling test > easier. > Even more, if we want to make some A-B test, we have to run NNBench on > different HDFS clusters, this patch will be helpful. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-12967) NNBench should support multi-cluster access
[ https://issues.apache.org/jira/browse/HDFS-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDFS-12967: -- Assignee: Chen Zhang > NNBench should support multi-cluster access > --- > > Key: HDFS-12967 > URL: https://issues.apache.org/jira/browse/HDFS-12967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: benchmarks >Reporter: Chen Zhang >Assignee: Chen Zhang > Attachments: HDFS-12967-001.patch > > > Sometimes we need to run NNBench for some scaling tests after made some > improvements on NameNode, so we have to deploy a new HDFS cluster and a new > Yarn cluster. > If NNBench support multi-cluster access, we only need to deploy a new HDFS > test cluster and add it to existing YARN cluster, it'll make the scaling test > easier. > Even more, if we want to make some A-B test, we have to run NNBench on > different HDFS clusters, this patch will be helpful. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged
[ https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-9023: Target Version/s: 2.9.1, 3.0.1 > When NN is not able to identify DN for replication, reason behind it can be > logged > -- > > Key: HDFS-9023 > URL: https://issues.apache.org/jira/browse/HDFS-9023 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Xiao Chen >Priority: Critical > Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, > HDFS-9023.03.patch, HDFS-9023.branch-2.patch > > > When NN is not able to identify DN for replication, reason behind it can be > logged (at least critical information why DNs not chosen like disk is full). > At present it is expected to enable debug log. > For example the reason for below error looks like all 7 DNs are busy for data > writes. But at client or NN side no hint is given in the log message. > {noformat} > File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp > could only be replicated to 0 nodes instead of minReplication (=1). There > are 7 datanode(s) running and no node(s) are excluded in this operation. > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553) > > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged
[ https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305630#comment-16305630 ] Xiao Chen commented on HDFS-9023: - Thanks for the review, Surendra. The Findbugs and unit test failures are unrelated. This is a logging change, so no new unit test was added. I am attaching a branch-2 patch because lambdas do not work with JDK 7. I will commit after pre-commit comes back. > When NN is not able to identify DN for replication, reason behind it can be > logged > -- > > Key: HDFS-9023 > URL: https://issues.apache.org/jira/browse/HDFS-9023 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Xiao Chen >Priority: Critical > Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, > HDFS-9023.03.patch, HDFS-9023.branch-2.patch > > > When NN is not able to identify DN for replication, reason behind it can be > logged (at least critical information why DNs not chosen like disk is full). > At present it is expected to enable debug log. > For example the reason for below error looks like all 7 DNs are busy for data > writes. But at client or NN side no hint is given in the log message. > {noformat} > File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp > could only be replicated to 0 nodes instead of minReplication (=1). There > are 7 datanode(s) running and no node(s) are excluded in this operation. > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553) > > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
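The request in HDFS-9023 is to surface why datanodes were not chosen. A minimal sketch of one way to do that is to tally a reason per rejected node and emit a single summary line. The reason names, class name, and message text below are assumptions made for this example and are not taken from the committed patch; the code deliberately avoids lambdas so that it also compiles on JDK 7.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: reason names and message text are illustrative only.
final class NotChosenReasonSketch {
  enum Reason { NOT_ENOUGH_SPACE, TOO_MANY_WRITERS, NODE_STALE }

  // EnumMap iterates in enum declaration order, so the summary is stable.
  private final Map<Reason, Integer> counts = new EnumMap<>(Reason.class);

  // Record one rejected datanode under the given reason.
  void record(Reason reason) {
    Integer old = counts.get(reason);
    counts.put(reason, old == null ? 1 : old + 1);
  }

  // One line that can be appended to the existing failure message.
  String summary() {
    return "datanodes not chosen: " + counts;
  }
}
```

A caller would invoke `record` each time a candidate node is rejected and log `summary()` once when no target could be selected.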
[jira] [Updated] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged
[ https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-9023: Attachment: HDFS-9023.branch-2.patch > When NN is not able to identify DN for replication, reason behind it can be > logged > -- > > Key: HDFS-9023 > URL: https://issues.apache.org/jira/browse/HDFS-9023 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, namenode >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Xiao Chen >Priority: Critical > Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, > HDFS-9023.03.patch, HDFS-9023.branch-2.patch > > > When NN is not able to identify DN for replication, reason behind it can be > logged (at least critical information why DNs not chosen like disk is full). > At present it is expected to enable debug log. > For example the reason for below error looks like all 7 DNs are busy for data > writes. But at client or NN side no hint is given in the log message. > {noformat} > File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp > could only be replicated to 0 nodes instead of minReplication (=1). There > are 7 datanode(s) running and no node(s) are excluded in this operation. > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553) > > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12966) Ozone: owner name should be set properly when the container allocation happens
[ https://issues.apache.org/jira/browse/HDFS-12966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDFS-12966: --- Attachment: HDFS-12966-HDFS-7240.001.patch Patch v1 sets the KSM ID / CBlock Manager ID as the container owner when the container allocation happens. Test cases are updated accordingly as well. > Ozone: owner name should be set properly when the container allocation happens > -- > > Key: HDFS-12966 > URL: https://issues.apache.org/jira/browse/HDFS-12966 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: HDFS-7240 >Affects Versions: HDFS-7240 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee > Fix For: HDFS-7240 > > Attachments: HDFS-12966-HDFS-7240.001.patch > > > Currently , while the container allocation happens, the owner name is > hardcoded as "OZONE". > It should be set to KSM instance id/ CBlock Manager instance Id from where > the container creation call happens. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12968) Ozone : TestSCMCli and TestContainerStateManager tests are failing consistently while updating the container state info.
[ https://issues.apache.org/jira/browse/HDFS-12968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDFS-12968: --- Description: TestContainerStateManager#testUpdateContainerState is failing with the following exception: {code} org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update container state container28655, reason: invalid state transition from state: OPEN upon event: CLOSE. at org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355) at org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336) at org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244) org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168) at org.junit.rules.RunRules.evaluate(RunRules.java:20) {code} Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer are failing with the same exception. was: TestContainerStateManager#testUpdateContainerState is failing with the following exception: org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update container state container28655, reason: invalid state transition from state: OPEN upon event: CLOSE. at org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355) at org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336) at org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244) org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168) at org.junit.rules.RunRules.evaluate(RunRules.java:20) Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer are failing with the same exception. 
> Ozone : TestSCMCli and TestContainerStateManager tests are failing > consistently while updating the container state info. > > > Key: HDFS-12968 > URL: https://issues.apache.org/jira/browse/HDFS-12968 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: HDFS-7240 >Affects Versions: HDFS-7240 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee > Fix For: HDFS-7240 > > > TestContainerStateManager#testUpdateContainerState is failing with the > following exception: > {code} > org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update > container state container28655, reason: invalid state transition from state: > OPEN upon event: CLOSE. > at > org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355) > at > org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336) > at > org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244) > org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > {code} > Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer > are failing with the same exception. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12968) Ozone : TestSCMCli and TestContainerStateManager tests are failing consistently while updating the container state info.
Shashikant Banerjee created HDFS-12968:
--

Summary: Ozone : TestSCMCli and TestContainerStateManager tests are failing consistently while updating the container state info.
Key: HDFS-12968
URL: https://issues.apache.org/jira/browse/HDFS-12968
Project: Hadoop HDFS
Issue Type: Sub-task
Components: HDFS-7240
Affects Versions: HDFS-7240
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
Fix For: HDFS-7240

TestContainerStateManager#testUpdateContainerState is failing with the following exception:

org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update container state container28655, reason: invalid state transition from state: OPEN upon event: CLOSE.
    at org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355)
    at org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336)
    at org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244)
    org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)

Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer are failing with the same exception.
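To illustrate the kind of failure reported above, here is a minimal sketch of a table-driven lifecycle state machine that rejects any (state, event) pair that was never registered. It is not the Ozone SCM implementation: the class name, the reduced set of states and events, and the two registered transitions (OPEN to CLOSING on FINALIZE, CLOSING to CLOSED on CLOSE) are assumptions made for illustration only.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch of a container lifecycle state machine; not the Ozone SCM code. */
final class ContainerLifecycleSketch {

  // Assumed, reduced set of states and events for illustration.
  enum State { OPEN, CLOSING, CLOSED }
  enum Event { FINALIZE, CLOSE }

  // Transition table: only the (state, event) pairs registered here are legal.
  private static final Map<State, Map<Event, State>> TRANSITIONS =
      new EnumMap<State, Map<Event, State>>(State.class);

  static {
    // Assumed transitions; the real table lives in the SCM code base.
    addTransition(State.OPEN, Event.FINALIZE, State.CLOSING);
    addTransition(State.CLOSING, Event.CLOSE, State.CLOSED);
  }

  private static void addTransition(State from, Event on, State to) {
    Map<Event, State> byEvent = TRANSITIONS.get(from);
    if (byEvent == null) {
      byEvent = new EnumMap<Event, State>(Event.class);
      TRANSITIONS.put(from, byEvent);
    }
    byEvent.put(on, to);
  }

  /** Returns the next state, or throws if the pair was never registered. */
  static State next(State current, Event event) {
    Map<Event, State> byEvent = TRANSITIONS.get(current);
    State target = (byEvent == null) ? null : byEvent.get(event);
    if (target == null) {
      throw new IllegalStateException(
          "invalid state transition from state: " + current + " upon event: " + event);
    }
    return target;
  }

  public static void main(String[] args) {
    State s = next(State.OPEN, Event.FINALIZE);
    s = next(s, Event.CLOSE);
    System.out.println("two-step close ends in " + s);
    try {
      next(State.OPEN, Event.CLOSE); // skips the intermediate step
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Under these assumptions, a test that sends CLOSE directly to an OPEN container fails in the same way as the trace above, while a test that first sends the intermediate event passes. Whether that is the actual cause here would need to be checked against the real transition table.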
[jira] [Updated] (HDFS-11225) NameNode crashed because deleteSnapshot held FSNamesystem lock too long
[ https://issues.apache.org/jira/browse/HDFS-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shashikant Banerjee updated HDFS-11225:
---
Attachment: Snaphot_Deletion_Design_Proposal.pdf

When an older snapshot is deleted, most of the time is spent constructing the children list for every directory under the snapshottable root directory. This requires adding up all the diffs of the subsequent snapshots for each directory and then reverse-applying them to the current filesystem tree. The attached document outlines an idea in which the sum of subsequent diffs for a certain number of snapshots is maintained in a skip list for each directory once the number of snapshots of a snapshottable directory exceeds a certain threshold. This will speed up the construction of the children list for a snapshot. Please have a look at the proposal for more details.

> NameNode crashed because deleteSnapshot held FSNamesystem lock too long
> ---
>
> Key: HDFS-11225
> URL: https://issues.apache.org/jira/browse/HDFS-11225
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.4.0
> Environment: CDH5.8.2, HA
> Reporter: Wei-Chiu Chuang
> Assignee: Manoj Govindassamy
> Priority: Critical
> Labels: high-availability
> Attachments: Snaphot_Deletion_Design_Proposal.pdf
>
> The deleteSnapshot operation is synchronous. In certain situations this operation may hold the FSNamesystem lock for too long, bringing almost every NameNode operation to a halt.
> We have observed one incident where it took so long that ZKFC believed the NameNode was down. All other IPC threads were waiting to acquire the FSNamesystem lock. This specific deleteSnapshot took ~70 seconds. ZKFC has a connection timeout of 45 seconds by default, and if all IPC threads wait for the FSNamesystem lock and can't accept new incoming connections, ZKFC times out and advances the epoch, and the NameNode therefore loses its active NN role and then fails.
> Relevant log:
> {noformat}
> Thread 154 (IPC Server handler 86 on 8020):
>   State: RUNNABLE
>   Blocked count: 2753455
>   Waited count: 89201773
>   Stack:
>     org.apache.hadoop.hdfs.server.namenode.INode$BlocksMapUpdateInfo.addDeleteBlock(INode.java:879)
>     org.apache.hadoop.hdfs.server.namenode.INodeFile.destroyAndCollectBlocks(INodeFile.java:508)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
>     org.apache.hadoop.hdfs.server.namenode.INodeReference.destroyAndCollectBlocks(INodeReference.java:339)
>     org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.destroyAndCollectBlocks(INodeReference.java:606)
>     org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.destroyDeletedList(DirectoryWithSnapshotFeature.java:119)
>     org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.access$400(DirectoryWithSnapshotFeature.java:61)
>     org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:319)
>     org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:167)
>     org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:83)
>     org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:745)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
>     org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:747)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
>     org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:789)
> {noformat}
> After the ZKFC determined NameNode was down and advanced epoch, the NN finished deleting snapshot, and
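
The idea in the comment on this issue (keeping pre-summed diffs so that fewer diffs have to be reverse-applied when building a snapshot's children list) can be sketched roughly as follows. This is a heavily simplified model and not the design in the attached proposal: a diff is reduced to sets of created and deleted child names, only a single level of pre-summed runs is kept instead of a full skip list, and every class and method name here is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, simplified model of per-directory snapshot diffs; not HDFS code. */
final class SnapshotDiffSkipSketch {

  /** Net change over one interval: child names created and deleted (disjoint sets). */
  static final class Diff {
    final Set<String> created = new HashSet<>();
    final Set<String> deleted = new HashSet<>();

    static Diff of(List<String> created, List<String> deleted) {
      Diff d = new Diff();
      d.created.addAll(created);
      d.deleted.addAll(deleted);
      return d;
    }

    /** Sum of this (earlier) diff and a later one; a create followed by a delete cancels. */
    Diff plus(Diff later) {
      Diff sum = new Diff();
      for (String n : created) { if (!later.deleted.contains(n)) sum.created.add(n); }
      for (String n : later.created) { if (!deleted.contains(n)) sum.created.add(n); }
      for (String n : deleted) { if (!later.created.contains(n)) sum.deleted.add(n); }
      for (String n : later.deleted) { if (!created.contains(n)) sum.deleted.add(n); }
      return sum;
    }

    /** Undo this diff: drop what it created, restore what it deleted. */
    void reverseApply(Set<String> children) {
      children.removeAll(created);
      children.addAll(deleted);
    }
  }

  /** Baseline: reverse-apply every diff from the newest back to index 'snapshot'. */
  static Set<String> childrenLinear(Set<String> current, List<Diff> diffs, int snapshot) {
    Set<String> children = new TreeSet<>(current);
    for (int i = diffs.size() - 1; i >= snapshot; i--) {
      diffs.get(i).reverseApply(children);
    }
    return children;
  }

  /** Pre-sum aligned runs: sums.get(k) covers diffs [k*interval, (k+1)*interval). */
  static List<Diff> buildSums(List<Diff> diffs, int interval) {
    List<Diff> sums = new ArrayList<>();
    for (int start = 0; start + interval <= diffs.size(); start += interval) {
      Diff sum = new Diff();
      for (int i = start; i < start + interval; i++) {
        sum = sum.plus(diffs.get(i));
      }
      sums.add(sum);
    }
    return sums;
  }

  /** Same result as childrenLinear, but jumps over whole runs where a sum exists. */
  static Set<String> childrenSkipping(Set<String> current, List<Diff> diffs,
      List<Diff> sums, int interval, int snapshot) {
    Set<String> children = new TreeSet<>(current);
    int i = diffs.size(); // exclusive upper bound of the diffs not yet undone
    while (i > snapshot) {
      if (i % interval == 0 && i - interval >= snapshot) {
        sums.get(i / interval - 1).reverseApply(children);
        i -= interval;
      } else {
        diffs.get(i - 1).reverseApply(children);
        i--;
      }
    }
    return children;
  }

  public static void main(String[] args) {
    List<String> none = Collections.<String>emptyList();
    List<Diff> diffs = Arrays.asList(
        Diff.of(Arrays.asList("c"), none),
        Diff.of(none, Arrays.asList("b")),
        Diff.of(Arrays.asList("d"), Arrays.asList("a")),
        Diff.of(Arrays.asList("e"), none));
    Set<String> current = new TreeSet<>(Arrays.asList("c", "d", "e"));
    List<Diff> sums = buildSums(diffs, 2);
    System.out.println(childrenSkipping(current, diffs, sums, 2, 0));
    System.out.println(childrenLinear(current, diffs, 0));
  }
}
```

In this model the number of reverse-apply steps for the oldest snapshot drops from the number of diffs to roughly that number divided by the run length. A real skip list would add further levels, and the real diffs track inodes rather than bare names, so this only shows the shape of the optimization.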
[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged
[ https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305279#comment-16305279 ]

genericqa commented on HDFS-9023:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}112m 22s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 4s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-9023 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903826/HDFS-9023.03.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux cc1f2cc72332 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d31c9d8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/22515/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/22515/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/22515/testReport/ |
| Max. process+thread count | 3375 (vs. ulimit of 5000) |
| modules | C:
[jira] [Commented] (HDFS-12934) RBF: Federation supports global quota
[ https://issues.apache.org/jira/browse/HDFS-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305194#comment-16305194 ]

Íñigo Goiri commented on HDFS-12934:

I cannot do a very thorough review until next week, but here are a few minor comments:
* I would extract the variables in {{QuotaCacheUpdateService#periodicInvoke()}}, in particular {{entry.getSourcePath()}}, which is reused several times in a couple of loops.
* In the same {{periodicInvoke()}}, you could have a function that gets the mount table entries in addition to {{getMountTableStore()}}; then the loop would just return {{List}}.
* {{isQuotaSet()}} could return directly as soon as it finds that the quota is set and skip the remaining loops; you could just return false at the end otherwise.
* {{quotaUsageCache}} could be declared as {{Map}}; the concurrent implementation is fine. Actually, most of the usage seems to be {{getQuotaUsageCache().getChildrenPaths(path)}}; we could expose that directly.
* I found out a few weeks ago that {{Configuration}} has a proper interface for getting time periods (e.g., 10s). It may make sense to set the intervals for updating the cache using that API.
* {{TestRouterQuota}} probably covers most of the cases, but {{RouterQuotaLocalCache}} could be tested by itself pretty exhaustively.
* In general, I would set some of the main variables in the loops as {{final}}.

Next week, I'll test it in my cluster and try to get familiar with the way the NN does it to give a deeper review.

> RBF: Federation supports global quota
> -
>
> Key: HDFS-12934
> URL: https://issues.apache.org/jira/browse/HDFS-12934
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: 3.0.0
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Labels: RBF
> Attachments: HDFS-12934.001.patch, HDFS-12934.002.patch, RBF support global quota.pdf
>
> Federation currently doesn't support setting a global quota for each folder. At present the quota is applied to each subcluster under the specified folder via an RPC call.
> It would be very useful for users if federation supported setting a global quota and exposed a command for it.
> In a federated environment, a folder can be spread across multiple subclusters. For this reason, we plan to solve this in the following way:
> # Set the global quota across each subcluster. No subcluster is allowed to exceed the maximum quota value.
> # We need to construct one cache map for storing the summed quota usage of these subclusters under the federation folder. Every time we want to do a WRITE operation under the specified folder, we will get its quota usage from the cache and verify its quota. If the quota is exceeded, throw an exception; otherwise update its quota usage in the cache when the operation finishes.
> The quota will be stored in the mount table as a new field. The set/unset command will be like:
> {noformat}
> hdfs dfsrouteradmin -setQuota -ns -ss
> hdfs dfsrouteradmin -clrQuota
> {noformat}
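
The cache-based check described in the quoted issue (sum the usage reported by the subclusters, verify it against the global quota on each write, and update the cached usage afterwards) could look roughly like the sketch below. This is not the code from the attached patches: the class, its methods, and the exception type are hypothetical, and the verify and update steps are collapsed into a single call for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a router-side global quota cache; not the HDFS-12934 patch. */
final class RouterQuotaCacheSketch {

  /** Hypothetical exception type raised when a write would exceed the global quota. */
  static final class QuotaExceededException extends RuntimeException {
    QuotaExceededException(String message) {
      super(message);
    }
  }

  // Mount path -> global quota limit (assumed to come from the mount table).
  private final Map<String, Long> quota = new ConcurrentHashMap<>();
  // Mount path -> usage summed over all subclusters.
  private final Map<String, Long> usage = new ConcurrentHashMap<>();

  void setQuota(String path, long limit) {
    quota.put(path, limit);
  }

  void clearQuota(String path) {
    quota.remove(path);
  }

  /** Periodic refresh: replace the cached value with the sum over the subclusters. */
  void refresh(String path, long... subclusterUsage) {
    long sum = 0;
    for (long u : subclusterUsage) {
      sum += u;
    }
    usage.put(path, sum);
  }

  /** Verify the quota for a write of 'delta' units and record it if it fits. */
  void recordWrite(String path, long delta) {
    final Long limit = quota.get(path);
    // compute() runs atomically per key; if the function throws, the entry is unchanged.
    usage.compute(path, (p, used) -> {
      long current = (used == null) ? 0L : used;
      if (limit != null && current + delta > limit) {
        throw new QuotaExceededException(
            "quota exceeded for " + p + ": " + (current + delta) + " > " + limit);
      }
      return current + delta;
    });
  }

  long getUsage(String path) {
    Long used = usage.get(path);
    return (used == null) ? 0L : used;
  }

  public static void main(String[] args) {
    RouterQuotaCacheSketch cache = new RouterQuotaCacheSketch();
    cache.setQuota("/data", 100);
    cache.refresh("/data", 40, 30); // two subclusters report their usage
    cache.recordWrite("/data", 20);
    System.out.println(cache.getUsage("/data"));
    try {
      cache.recordWrite("/data", 20);
    } catch (QuotaExceededException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Keeping the check and the update inside one atomic map operation avoids two concurrent writers both passing the check against the same stale usage. The periodic refresh then corrects any drift between the cached sum and what the subclusters actually report.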