[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306025#comment-16306025
 ] 

genericqa commented on HDFS-12911:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-10285 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
18s{color} | {color:red} root in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 0s{color} | {color:green} HDFS-10285 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  2m 
19s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m  9s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 61 new + 393 unchanged - 0 fixed = 454 total (was 393) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git 
apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  0m  
9s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  
8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m  9s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  6m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12911 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903981/HDFS-12911-HDFS-10285-02.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ab81a0066861 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-10285 / 7f4c4ab |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22520/artifact/out/branch-mvninstall-root.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22520/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| mvnsite | 

[jira] [Updated] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-28 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-12911:
---
Status: Patch Available  (was: Open)

> [SPS]: Fix review comments from discussions in HDFS-10285
> -
>
> Key: HDFS-12911
> URL: https://issues.apache.org/jira/browse/HDFS-12911
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Rakesh R
> Attachments: HDFS-12911-HDFS-10285-01.patch, 
> HDFS-12911-HDFS-10285-02.patch, HDFS-12911.00.patch
>
>
> This is the JIRA for tracking the possible improvements or issues discussed 
> in the main JIRA.
> Comments to handle so far:
> Daryn:
>  # The lock should not be kept while executing the placement policy.
>  # While starting up the NN, the SPS Xattr checks happen even if the feature 
> is disabled. This could potentially impact startup speed. 
> UMA:
> # I am adding one more possible improvement to reduce Xattr objects 
> significantly: the SPS Xattr is a constant object, so we can create one 
> deduplicated Xattr object statically and reuse the same object reference 
> whenever an SPS Xattr needs to be added to an Inode. The additional bytes 
> required for storing the SPS Xattr then shrink to a single object reference 
> (i.e. 4 bytes on 32-bit), so the Xattr overhead should come down 
> significantly IMO. Let's explore the feasibility of this option (see the 
> sketch after this description).
> The Xattr list Feature will not be specially created for SPS; that list 
> would already have been created by SetStoragePolicy on the same directory. 
> So, no extra Feature creation because of SPS alone.
> # Currently SPS puts Long id objects in a queue for tracking the Inodes on 
> which SPS was called. Each one is additionally created, and its size would 
> be (obj ref + value) = (8 + 8) bytes [ignoring alignment for the time 
> being].
> The possible improvement here is to keep the existing Inode object for 
> tracking instead of creating a new Long object. The advantage is that the 
> Inode object is already maintained in the NN, so no new object creation is 
> needed; we just need to maintain one object reference. The above two points 
> should significantly reduce the memory requirements of SPS. So, per SPS 
> call: 8 bytes for tracking the called inode + 8 bytes for the Xattr ref.
> # Use LightWeightLinkedSet instead of LinkedList for the queue. This will 
> reduce unnecessary Node creations inside the LinkedList. 
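
A minimal sketch of the deduplication idea in point 1 above; the class, field, 
and xattr name here are illustrative assumptions, not the branch code:

{code:java}
import org.apache.hadoop.fs.XAttr;

public final class SpsXAttrHolder {
  // One immutable instance, built once; every inode that needs the SPS
  // xattr then shares this reference instead of allocating a new XAttr.
  // "hdfs.sps" is an assumed name used only for this illustration.
  public static final XAttr SPS_XATTR = new XAttr.Builder()
      .setNameSpace(XAttr.NameSpace.SYSTEM)
      .setName("hdfs.sps")
      .build();

  private SpsXAttrHolder() {
  }
}
{code}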



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12966) Ozone: owner name should be set properly when the container allocation happens

2017-12-28 Thread Shashikant Banerjee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-12966:
---
Status: Patch Available  (was: Open)

> Ozone: owner name should be set properly when the container allocation happens
> --
>
> Key: HDFS-12966
> URL: https://issues.apache.org/jira/browse/HDFS-12966
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: HDFS-7240
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
> Attachments: HDFS-12966-HDFS-7240.001.patch
>
>
> Currently, when container allocation happens, the owner name is hardcoded 
> as "OZONE".
> It should be set to the KSM instance id / CBlock Manager instance id from 
> which the container creation call originates. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2017-12-28 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306012#comment-16306012
 ] 

Uma Maheswara Rao G commented on HDFS-10285:


Hi [~virajith], Thanks for the feedback.

{quote}
Quick question about this – what are clients you are referring to here?
{quote}
I meant the client-side SPS API (today it is exposed via HDFS clients).

{quote}
If the SPS is going to run outside the NN, I would think it is going to be 
decoupled from/not depend on the FSNamesystem lock and the NN-DN heartbeat 
protocol. The current implementation/design has a tight coupling between the 
SPS and both these components.
{quote}
I have just posted a patch with the SPS modularization. With it, the SPS core 
implementation will not access NN internals directly; instead it will go 
through an interface whose implementation is pluggable.
Please take a look at HDFS-12911.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-SPS-TestReport-20170708.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. 
> These policies can be set on a directory/file to specify the user's 
> preference for where the physical blocks should be stored. When the user 
> sets the storage policy before writing data, the blocks can take advantage 
> of the policy preferences and the physical blocks are stored accordingly. 
> If the user sets the storage policy after the file is written and completed, 
> the blocks would already have been written with the default storage policy 
> (nothing but DISK). The user then has to run the 'Mover tool' explicitly, 
> specifying all such file names as a list. In some distributed system 
> scenarios (ex: HBase) it would be difficult to collect all the files and run 
> the tool, as different nodes can write files separately and the files can 
> have different paths.
> Another scenario is when the user renames a file from a directory with one 
> effective storage policy (inherited from the parent directory) into a 
> directory with a different effective storage policy: the rename does not 
> copy the inherited storage policy from the source, so the file takes the 
> policy of the destination file/dir's parent. This rename operation is just a 
> metadata change in the Namenode; the physical blocks still remain under the 
> source storage policy.
> So, tracking all such files produced by business logic across distributed 
> nodes (ex: region servers) and running the Mover tool could be difficult for 
> admins. The proposal here is to provide an API in the Namenode itself to 
> trigger storage policy satisfaction (see the sketch below). A daemon thread 
> inside the Namenode would track such calls and send movement commands to the 
> DNs. 
> Will post the detailed design thoughts document soon. 
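
A hedged sketch of what the proposed client-side trigger could look like; the 
method name {{satisfyStoragePolicy}} follows the feature branch, and the call 
shown here is illustrative rather than the final API:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SatisfyPolicyExample {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS points at an HDFS cluster.
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    Path dir = new Path("/dir");
    dfs.setStoragePolicy(dir, "COLD"); // metadata-only change in the NN
    dfs.satisfyStoragePolicy(dir);     // ask the NN-side SPS to move blocks
  }
}
{code}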



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-28 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-12911:
---
Attachment: HDFS-12911-HDFS-10285-02.patch

Attached a patch with the SPS modularization.

# For scanning: added an interface, FileIDCollector. We can plug in specific 
scanning implementations.
# For SPS-NN communication: added an interface, Context, which is the same as 
[~chris.douglas] proposed in the first patch. All NN-related interaction 
happens via this interface, and specific implementations can be plugged in: 
one could directly access NN internals, another could use RPC to get the info 
from the NN.
# For block move tasks: added an interface, BlockMoveTaskHandler. We can plug 
in specific implementations for block movements: one can be heartbeat-based 
assignment, another can send block move ops directly to DNs.

In this patch, I provided the internal SPS implementation 
(IntraSPSNameNodeContext, IntraSPSNameNodeFileIDCollector, 
IntraSPSNameNodeBlockMoveTaskHandler), and for the external case I just 
created dummy plugin classes (ExternalSPSContext, ExternalSPSFileIDCollector, 
ExternalSPSBlockMoveTaskHandler). A rough sketch of the three seams is below.
Feedback is most welcome. 
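
The sketch, with hypothetical method signatures (the patch defines the real 
ones):

{code:java}
import java.io.IOException;

// Scanning seam: walks a subtree and queues the ids of files whose
// storage placement still needs to be satisfied.
interface FileIDCollector {
  void scanAndCollectFileIds(Long startINodeId) throws IOException;
}

// SPS-NN communication seam: every fact SPS needs about the namespace
// flows through here, so one implementation can read NN internals
// directly (intra-NN) while another fetches the same facts over RPC.
interface Context {
  boolean isFileExist(long inodeId) throws IOException;
  void removeSPSHint(long inodeId) throws IOException;
}

// Block-movement seam: one implementation can piggyback on NN->DN
// heartbeat responses, another can send move commands straight to DNs.
interface BlockMoveTaskHandler {
  void submitMoveTask(long blockId) throws IOException;
}
{code}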

> [SPS]: Fix review comments from discussions in HDFS-10285
> -
>
> Key: HDFS-12911
> URL: https://issues.apache.org/jira/browse/HDFS-12911
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Rakesh R
> Attachments: HDFS-12911-HDFS-10285-01.patch, 
> HDFS-12911-HDFS-10285-02.patch, HDFS-12911.00.patch
>
>
> This is the JIRA for tracking the possible improvements or issues discussed 
> in the main JIRA.
> Comments to handle so far:
> Daryn:
>  # The lock should not be kept while executing the placement policy.
>  # While starting up the NN, the SPS Xattr checks happen even if the feature 
> is disabled. This could potentially impact startup speed. 
> UMA:
> # I am adding one more possible improvement to reduce Xattr objects 
> significantly: the SPS Xattr is a constant object, so we can create one 
> deduplicated Xattr object statically and reuse the same object reference 
> whenever an SPS Xattr needs to be added to an Inode. The additional bytes 
> required for storing the SPS Xattr then shrink to a single object reference 
> (i.e. 4 bytes on 32-bit), so the Xattr overhead should come down 
> significantly IMO. Let's explore the feasibility of this option.
> The Xattr list Feature will not be specially created for SPS; that list 
> would already have been created by SetStoragePolicy on the same directory. 
> So, no extra Feature creation because of SPS alone.
> # Currently SPS puts Long id objects in a queue for tracking the Inodes on 
> which SPS was called. Each one is additionally created, and its size would 
> be (obj ref + value) = (8 + 8) bytes [ignoring alignment for the time 
> being].
> The possible improvement here is to keep the existing Inode object for 
> tracking instead of creating a new Long object. The advantage is that the 
> Inode object is already maintained in the NN, so no new object creation is 
> needed; we just need to maintain one object reference. The above two points 
> should significantly reduce the memory requirements of SPS. So, per SPS 
> call: 8 bytes for tracking the called inode + 8 bytes for the Xattr ref.
> # Use LightWeightLinkedSet instead of LinkedList for the queue. This will 
> reduce unnecessary Node creations inside the LinkedList. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7060) Avoid taking locks when sending heartbeats from the DataNode

2017-12-28 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305967#comment-16305967
 ] 

He Xiaoqiao commented on HDFS-7060:
---

The last patch looks good to me. A few observations from testing:
1. The heartbeatsAvgTime latency may not drop noticeably, especially on an 
HDFS cluster with federation and QJM-based HA, since {{HeartbeatsAvgTime}} is 
a statistical indicator that averages the sendHeartbeat times across all 
{{namenodes}}. If the Standby NameNode takes a long time to process RPCs, 
which actually happens frequently while tailing the editlog or checkpointing, 
the heartbeatsAvgTime indicator at the DataNode will not drop as expected 
(illustrated below).
2. When the DataNode was configured to interact only with the Active NameNode 
for testing, the result was as expected.
In short, this patch is helpful for me. Thanks [~yangjiandan] and 
[~cheersyang].
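
A toy illustration of point 1, with invented numbers:

{code:java}
public class HeartbeatAvgToy {
  public static void main(String[] args) {
    // Invented numbers: the patch speeds up the Active-NN heartbeat,
    // but the DataNode-side metric averages over every NameNode the
    // BPServiceActors report to, so a slow Standby masks the gain.
    double activeMs = 2;    // fast after the patch
    double standbyMs = 200; // Standby busy tailing editlog / checkpointing
    double heartbeatsAvgTime = (activeMs + standbyMs) / 2;
    System.out.println(heartbeatsAvgTime); // 101.0 ms: improvement masked
  }
}
{code}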

> Avoid taking locks when sending heartbeats from the DataNode
> 
>
> Key: HDFS-7060
> URL: https://issues.apache.org/jira/browse/HDFS-7060
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Jiandan Yang 
>  Labels: BB2015-05-TBR, locks, performance
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HDFS Status Post Patch.png, HDFS-7060-002.patch, 
> HDFS-7060.000.patch, HDFS-7060.001.patch, HDFS-7060.003.patch, 
> HDFS-7060.004.patch, HDFS-7060.005.patch, complete_failed_qps.png, 
> sendHeartbeat.png
>
>
> We're seeing that the heartbeat is blocked by the monitor of {{FsDatasetImpl}} 
> when the DN is under a heavy write load:
> {noformat}
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:115)
> - waiting to lock <0x000780304fb8> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:91)
> - locked <0x000780612fd8> (a java.lang.Object)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:563)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:668)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:827)
> at java.lang.Thread.run(Thread.java:744)
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:743)
> - waiting to lock <0x000780304fb8> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> at java.lang.Thread.run(Thread.java:744)
>java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createFileExclusively(Native Method)
> at java.io.File.createNewFile(File.java:1006)
> at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:59)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:244)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:195)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:753)
> - locked <0x000780304fb8> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}



--
This message was 

[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy

2017-12-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305908#comment-16305908
 ] 

genericqa commented on HDFS-12915:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
42s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs generated 0 new + 0 
unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 34s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12915 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903955/HDFS-12915.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ad93ee01f0cc 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5bf7e59 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22518/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| unit | 

[jira] [Commented] (HDFS-12967) NNBench should support multi-cluster access

2017-12-28 Thread Chen Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305877#comment-16305877
 ] 

Chen Zhang commented on HDFS-12967:
---

Thanks for your response and suggestion, [~jojochuang]. 

bq. what about adding a new command parameter so that this support is 
visible
I don't think it needs a new command parameter; just using a path with a 
prefix like hdfs://some-cluster/user/foo will work (see the sketch below). It 
is also reasonable to add some explanation to the help text.

bq.  You should also consider adding tests in TestNNBench
Thanks for the reminder; I'll add a test soon.
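
A minimal sketch of why a fully qualified path is enough ({{some-cluster}} is 
a placeholder authority):

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class QualifiedPathExample {
  public static void main(String[] args) throws Exception {
    // FileSystem.get resolves the scheme and authority from the URI, so
    // a benchmark operating on this path talks to "some-cluster" no
    // matter what fs.defaultFS says in the client configuration.
    URI uri = URI.create("hdfs://some-cluster/user/foo");
    FileSystem fs = FileSystem.get(uri, new Configuration());
    Path base = fs.makeQualified(new Path(uri));
    System.out.println(base);
  }
}
{code}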

> NNBench should support multi-cluster access
> ---
>
> Key: HDFS-12967
> URL: https://issues.apache.org/jira/browse/HDFS-12967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: benchmarks
>Reporter: Chen Zhang
>Assignee: Chen Zhang
> Attachments: HDFS-12967-001.patch
>
>
> Sometimes we need to run NNBench for scaling tests after making some 
> improvements to the NameNode, so we have to deploy a new HDFS cluster and a 
> new YARN cluster.
> If NNBench supported multi-cluster access, we would only need to deploy a 
> new HDFS test cluster and add it to the existing YARN cluster, which would 
> make scaling tests easier.
> Even more, when we want to do some A-B tests, we have to run NNBench on 
> different HDFS clusters, so this patch will be helpful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12967) NNBench should support multi-cluster access

2017-12-28 Thread Chen Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhang updated HDFS-12967:
--
Description: 
Sometimes we need to run NNBench for scaling tests after making some 
improvements to the NameNode, so we have to deploy a new HDFS cluster and a 
new YARN cluster.
If NNBench supported multi-cluster access, we would only need to deploy a new 
HDFS test cluster and add it to the existing YARN cluster, which would make 
scaling tests easier.

Even more, when we want to do some A-B tests, we have to run NNBench on 
different HDFS clusters, so this patch will be helpful.

  was:
Sometimes we need to run NNBench for scaling tests after making some 
improvements to the NameNode, so we have to deploy a new HDFS cluster and a 
new YARN cluster.
If NNBench supported multi-cluster access, we would only need to deploy a new 
HDFS test cluster and add it to the existing YARN cluster, which would make 
scaling tests easier.

Even more, when we want to make some A-B tests, we have to run NNBench on 
different HDFS clusters, so this patch will be helpful.


> NNBench should support multi-cluster access
> ---
>
> Key: HDFS-12967
> URL: https://issues.apache.org/jira/browse/HDFS-12967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: benchmarks
>Reporter: Chen Zhang
>Assignee: Chen Zhang
> Attachments: HDFS-12967-001.patch
>
>
> Sometimes we need to run NNBench for scaling tests after making some 
> improvements to the NameNode, so we have to deploy a new HDFS cluster and a 
> new YARN cluster.
> If NNBench supported multi-cluster access, we would only need to deploy a 
> new HDFS test cluster and add it to the existing YARN cluster, which would 
> make scaling tests easier.
> Even more, when we want to do some A-B tests, we have to run NNBench on 
> different HDFS clusters, so this patch will be helpful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12860) StripedBlockUtil#getRangesInternalBlocks throws exception for the block group size larger than 2GB

2017-12-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305864#comment-16305864
 ] 

genericqa commented on HDFS-12860:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m  
7s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
53s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 22 
unchanged - 5 fixed = 22 total (was 27) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
22s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}117m 55s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}192m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12860 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903942/HDFS-12860.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux babb7c90c411 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5bf7e59 |
| maven | version: Apache Maven 3.3.9 |

[jira] [Commented] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy

2017-12-28 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305833#comment-16305833
 ] 

Chris Douglas commented on HDFS-12915:
--

Please don't hesitate to effect a different refactoring (e.g., removing 
{{blockType}}) if you prefer it to this patch. There's no reason to rewrite the 
same utility function multiple times.

> Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
> ---
>
> Key: HDFS-12915
> URL: https://issues.apache.org/jira/browse/HDFS-12915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
> Attachments: HDFS-12915.00.patch, HDFS-12915.01.patch, 
> HDFS-12915.02.patch
>
>
> It seems HDFS-12840 creates a new findbugs warning.
> Possible null pointer dereference of replication in 
> org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType,
>  Short, Byte)
> Bug type NP_NULL_ON_SOME_PATH (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat
> In method 
> org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType,
>  Short, Byte)
> Value loaded from replication
> Dereferenced at INodeFile.java:[line 210]
> Known null at INodeFile.java:[line 207]
> From a quick look at the patch, it seems bogus though. [~eddyxu][~Sammi] 
> would you please double check?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12915) Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy

2017-12-28 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-12915:
-
Attachment: HDFS-12915.02.patch

Added a check for a replication policy being set for {{STRIPED}} blocks (a 
rough sketch of the idea is below).
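
A hedged sketch of the null-safety issue and one way to address it; the 
method shape follows the findbugs signature quoted below, while the returned 
encodings are placeholders rather than the actual patch:

{code:java}
import com.google.common.base.Preconditions;
import org.apache.hadoop.hdfs.protocol.BlockType;

public final class HeaderFormatSketch {
  static long getBlockLayoutRedundancy(BlockType blockType,
      Short replication, Byte ecPolicyID) {
    if (blockType == BlockType.STRIPED) {
      // 'replication' may be null on this path, so it must not be
      // auto-unboxed here; the EC policy id carries the redundancy.
      Preconditions.checkArgument(ecPolicyID != null,
          "EC policy id is required for striped blocks");
      return ecPolicyID; // placeholder for the real striped encoding
    }
    // Contiguous path: validate before auto-unboxing dereferences it.
    Preconditions.checkArgument(replication != null,
        "replication is required for contiguous blocks");
    return replication; // placeholder for the real contiguous encoding
  }

  private HeaderFormatSketch() {
  }
}
{code}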

> Fix findbugs warning in INodeFile$HeaderFormat.getBlockLayoutRedundancy
> ---
>
> Key: HDFS-12915
> URL: https://issues.apache.org/jira/browse/HDFS-12915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
> Attachments: HDFS-12915.00.patch, HDFS-12915.01.patch, 
> HDFS-12915.02.patch
>
>
> It seems HDFS-12840 creates a new findbugs warning.
> Possible null pointer dereference of replication in 
> org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType,
>  Short, Byte)
> Bug type NP_NULL_ON_SOME_PATH (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat
> In method 
> org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType,
>  Short, Byte)
> Value loaded from replication
> Dereferenced at INodeFile.java:[line 210]
> Known null at INodeFile.java:[line 207]
> From a quick look at the patch, it seems bogus though. [~eddyxu][~Sammi] 
> would you please double check?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12860) StripedBlockUtil#getRangesInternalBlocks throws exception for the block group size larger than 2GB

2017-12-28 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12860:
-
Attachment: HDFS-12860.01.patch

Hi [~Sammi],

I updated the patch to add more comments to the test and to address your 
comments 1) and 2). It also fixes the checkstyle warnings.

Regarding the end-to-end tests, I would like to know your thoughts on this 
matter. 

> StripedBlockUtil#getRangesInternalBlocks throws exception for the block group 
> size larger than 2GB
> --
>
> Key: HDFS-12860
> URL: https://issues.apache.org/jira/browse/HDFS-12860
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-12860.00.patch, HDFS-12860.01.patch
>
>
> Running terasort on a cluster with 8 datanodes, 256GB data, using 
> RS-3-2-1024k.
> The test data was generated by {{teragen}} with 32 mappers.
> The terasort benchmark fails with the following stack trace:
> {code}
> 17/11/27 14:44:31 INFO mapreduce.Job:  map 45% reduce 0%
> 17/11/27 14:44:33 INFO mapreduce.Job: Task Id : 
> attempt_1510080297865_0160_m_08_0, Status : FAILED
> Error: java.lang.IllegalArgumentException
>   at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
>   at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil$VerticalRange.(StripedBlockUtil.java:701)
>   at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getRangesForInternalBlocks(StripedBlockUtil.java:442)
>   at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(StripedBlockUtil.java:311)
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:308)
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:391)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:813)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at 
> org.apache.hadoop.examples.terasort.TeraInputFormat$TeraRecordReader.nextKeyValue(TeraInputFormat.java:257)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:562)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}
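
The 2 GB boundary in the summary points at 32-bit arithmetic; a toy 
illustration of the failure mode (hypothetical, not the actual patch):

{code:java}
public class OverflowToy {
  public static void main(String[] args) {
    // A block group larger than Integer.MAX_VALUE bytes cannot be held
    // in an int: the narrowing cast truncates and can go negative, which
    // is exactly what a Preconditions.checkArgument on a range rejects.
    long blockGroupSize = 3L * 1024 * 1024 * 1024; // 3 GB
    int truncated = (int) blockGroupSize;
    System.out.println(truncated);      // -1073741824
    System.out.println(truncated >= 0); // false -> IllegalArgumentException
  }
}
{code}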



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12963) error log level in ShortCircuitRegistry#removeShm

2017-12-28 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305725#comment-16305725
 ] 

Rushabh S Shah commented on HDFS-12963:
---

+1 lgtm non-binding.

> error log level in ShortCircuitRegistry#removeShm
> -
>
> Key: HDFS-12963
> URL: https://issues.apache.org/jira/browse/HDFS-12963
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hu xiaodong
>Assignee: hu xiaodong
>Priority: Minor
> Attachments: HDFS-12963.001.patch
>
>
> {code:title=org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.java|borderStyle=solid}
>   public synchronized void removeShm(ShortCircuitShm shm) {
> if (LOG.isTraceEnabled()) {
> LOG.debug("removing shm " + shm);   -- I think this 
> should be LOG.trace
> }
> // Stop tracking the shmId.
> RegisteredShm removedShm = segments.remove(shm.getShmId());
> Preconditions.checkState(removedShm == shm,
> "failed to remove " + shm.getShmId());
> // Stop tracking the slots.
> for (Iterator<Slot> iter = shm.slotIterator(); iter.hasNext(); ) {
>   Slot slot = iter.next();
>   boolean removed = slots.remove(slot.getBlockId(), slot);
>   Preconditions.checkState(removed);
>   slot.makeInvalid();
> }
> // De-allocate the memory map and close the shared file. 
> shm.free();
>   }
> {code}
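
Presumably the fix is the one-line change implied above, matching the log 
call to its guard:

{code:java}
if (LOG.isTraceEnabled()) {
  LOG.trace("removing shm " + shm);
}
{code}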



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9023:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.1
   2.9.1
   2.10.0
   3.1.0
   Status: Resolved  (was: Patch Available)

Precommit failures look unrelated.

Pushed this to trunk, branch-3.0, branch-2 and branch-2.9, to match HDFS-11494.

Thanks for reporting the issue and the reviews, [~surendrasingh]!

> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1
>
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, 
> HDFS-9023.03.patch, HDFS-9023.branch-2.patch
>
>
> When the NN is not able to identify a DN for replication, the reason can be 
> logged (at least critical information about why DNs were not chosen, such 
> as a disk being full). At present one is expected to enable debug logging.
> For example, the reason for the error below looks like all 7 DNs are busy 
> with data writes, but no hint is given in the log message on the client or 
> NN side.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305700#comment-16305700
 ] 

Hudson commented on HDFS-9023:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #13422 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13422/])
HDFS-9023. When NN is not able to identify DN for replication, reason (xiao: 
rev 5bf7e594d7d54e5295fe4240c3d60c08d4755ab7)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java


> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, 
> HDFS-9023.03.patch, HDFS-9023.branch-2.patch
>
>
> When the NN is not able to identify a DN for replication, the reason can be 
> logged (at least critical information about why DNs were not chosen, such 
> as a disk being full). At present one is expected to enable debug logging.
> For example, the reason for the error below looks like all 7 DNs are busy 
> with data writes, but no hint is given in the log message on the client or 
> NN side.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305693#comment-16305693
 ] 

genericqa commented on HDFS-9023:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
18s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 41s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m 
25s{color} | {color:red} The patch generated 350 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}105m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:20 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsTokens |
|   | org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | org.apache.hadoop.hdfs.TestFileAppendRestart |
|   | org.apache.hadoop.hdfs.security.TestDelegationToken |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter |
|   | org.apache.hadoop.hdfs.TestDFSMkdirs |
|   | org.apache.hadoop.hdfs.TestDatanodeReport |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
|   | org.apache.hadoop.hdfs.TestSnapshotCommands |
|   | org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs |
|   | org.apache.hadoop.hdfs.TestDistributedFileSystem |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSForHA |
|   | org.apache.hadoop.hdfs.TestReplaceDatanodeFailureReplication |
|   | org.apache.hadoop.hdfs.TestDFSShell |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSAcl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-9023 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903933/HDFS-9023.branch-2.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| 

[jira] [Commented] (HDFS-12897) Path not found when we get the ec policy for a .snapshot dir

2017-12-28 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305670#comment-16305670
 ] 

Xiao Chen commented on HDFS-12897:
--

Thanks [~GeLiXin] for looking into the issue and providing the patch.

Patch looks good overall. Some comments below:
- The {{FSDirectory.isExactReservedName}} check can happen before the path 
resolution, so we avoid an unnecessary {{fsd.resolvePath}} (see the sketch 
after this list).
- There is also {{/.reserved/raw}} under {{/.reserved}}. I think path 
resolution handles that, but would love a test covering it.
- Trivial, but I'd prefer the {{ecDotSnapshotDir}} declaration in the test to 
happen lazily, right before it's used (instead of at the beginning).
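
A minimal sketch of the first point (illustrative only; not the exact patch):

{code:java}
// Illustrative only; uses FSDirectory, FSPermissionChecker and
// INodesInPath from org.apache.hadoop.hdfs.server.namenode, and
// InvalidPathException from org.apache.hadoop.fs.
static INodesInPath resolveForGetPolicy(FSDirectory fsd,
    FSPermissionChecker pc, String src) throws IOException {
  // Short-circuit exact reserved names such as "/.reserved" before
  // paying for the full path resolution.
  if (FSDirectory.isExactReservedName(src)) {
    throw new InvalidPathException(src);
  }
  return fsd.resolvePath(pc, src, FSDirectory.DirOp.READ);
}
{code}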

> Path not found when we get the ec policy for a .snapshot dir
> 
>
> Key: HDFS-12897
> URL: https://issues.apache.org/jira/browse/HDFS-12897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, hdfs, snapshots
>Affects Versions: 3.0.0-alpha1, 3.1.0
>Reporter: Harshakiran Reddy
>Assignee: LiXin Ge
> Attachments: HDFS-12897.001.patch, HDFS-12897.002.patch
>
>
> Scenario:
> Operations on a snapshot dir.
> *EC policy*
> bin> ./hdfs ec -getPolicy -path /dir/
> RS-3-2-1024k
> bin> ./hdfs ec -getPolicy -path /dir/.snapshot/
> {{FileNotFoundException: Path not found: /dir/.snapshot}}
> bin> ./hdfs dfs -ls /dir/.snapshot/
> Found 2 items
> drwxr-xr-x   - user group  0 2017-12-05 12:27 /dir/.snapshot/s1
> drwxr-xr-x   - user group  0 2017-12-05 12:28 /dir/.snapshot/s2
> *Storagepolicies*
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/.snapshot/
> {{The storage policy of /dir/.snapshot/ is unspecified}}
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/
> The storage policy of /dir/:
> BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], 
> replicationFallbacks=[]}
> *Which is the correct behavior?*



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12967) NNBench should support multi-cluster access

2017-12-28 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305635#comment-16305635
 ] 

Wei-Chiu Chuang commented on HDFS-12967:


Sounds like a good improvement. For completeness, what about adding a new 
command parameter so that this support is visible? You should also consider 
adding tests in TestNNBench.

Thanks

> NNBench should support multi-cluster access
> ---
>
> Key: HDFS-12967
> URL: https://issues.apache.org/jira/browse/HDFS-12967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: benchmarks
>Reporter: Chen Zhang
>Assignee: Chen Zhang
> Attachments: HDFS-12967-001.patch
>
>
> Sometimes we need to run NNBench for scaling tests after making some 
> improvements to the NameNode, so we have to deploy a new HDFS cluster and a 
> new YARN cluster.
> If NNBench supported multi-cluster access, we would only need to deploy a 
> new HDFS test cluster and add it to the existing YARN cluster, which would 
> make scaling tests easier.
> Even more, when we want to make some A-B tests, we have to run NNBench on 
> different HDFS clusters, so this patch will be helpful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12967) NNBench should support multi-cluster access

2017-12-28 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned HDFS-12967:
--

Assignee: Chen Zhang

> NNBench should support multi-cluster access
> ---
>
> Key: HDFS-12967
> URL: https://issues.apache.org/jira/browse/HDFS-12967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: benchmarks
>Reporter: Chen Zhang
>Assignee: Chen Zhang
> Attachments: HDFS-12967-001.patch
>
>
> Sometimes we need to run NNBench for scaling tests after making improvements 
> to the NameNode, which currently means deploying both a new HDFS cluster and 
> a new YARN cluster.
> If NNBench supported multi-cluster access, we would only need to deploy a new 
> HDFS test cluster and add it to the existing YARN cluster, which would make 
> scaling tests easier.
> Moreover, if we want to run A/B tests, we have to run NNBench against 
> different HDFS clusters, so this patch will be helpful there as well.






[jira] [Updated] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9023:

Target Version/s: 2.9.1, 3.0.1

> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, 
> HDFS-9023.03.patch, HDFS-9023.branch-2.patch
>
>
> When the NN is not able to identify a DN for replication, the reason should 
> be logged (at least the critical information about why DNs were not chosen, 
> e.g. a disk being full). At present the user is expected to enable debug 
> logging.
> For example, the reason for the error below appears to be that all 7 DNs are 
> busy with data writes, but neither the client nor the NN gives any hint in 
> the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}






[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305630#comment-16305630
 ] 

Xiao Chen commented on HDFS-9023:
-

Thanks for the review, Surendra.

The findbugs / unit test failures are unrelated. This is a logging change, so 
no new unit test was added. Attaching a branch-2 patch because lambdas don't 
work with JDK 7. Will commit after pre-commit comes back.
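
For readers wondering what the JDK 7 limitation looks like in practice, a 
generic illustration unrelated to the patch's exact code: 
{{buildVerboseReason()}} is a made-up helper, and on JDK 7 {{Supplier}} would 
itself have to come from Guava, since {{java.util.function}} is also JDK 8+.

{code}
// JDK 8 (trunk): a lambda defers building an expensive debug message.
Supplier<String> reasonJdk8 = () -> buildVerboseReason();

// JDK 7 (branch-2): no lambdas, so the equivalent is an anonymous class.
Supplier<String> reasonJdk7 = new Supplier<String>() {
  @Override
  public String get() {
    return buildVerboseReason();
  }
};
{code}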


> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, 
> HDFS-9023.03.patch, HDFS-9023.branch-2.patch
>
>
> When the NN is not able to identify a DN for replication, the reason should 
> be logged (at least the critical information about why DNs were not chosen, 
> e.g. a disk being full). At present the user is expected to enable debug 
> logging.
> For example, the reason for the error below appears to be that all 7 DNs are 
> busy with data writes, but neither the client nor the NN gives any hint in 
> the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}






[jira] [Updated] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9023:

Attachment: HDFS-9023.branch-2.patch

> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, 
> HDFS-9023.03.patch, HDFS-9023.branch-2.patch
>
>
> When the NN is not able to identify a DN for replication, the reason should 
> be logged (at least the critical information about why DNs were not chosen, 
> e.g. a disk being full). At present the user is expected to enable debug 
> logging.
> For example, the reason for the error below appears to be that all 7 DNs are 
> busy with data writes, but neither the client nor the NN gives any hint in 
> the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}






[jira] [Updated] (HDFS-12966) Ozone: owner name should be set properly when the container allocation happens

2017-12-28 Thread Shashikant Banerjee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-12966:
---
Attachment: HDFS-12966-HDFS-7240.001.patch

Patch v1 sets the KSM ID / CBlock Manager ID as the container owner when 
container allocation happens. Test cases are updated accordingly as well.
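
Roughly, the shape of the change; the signature below is illustrative, not 
the patch's actual API:

{code}
// The caller passes its own identity as the owner instead of the
// hardcoded "OZONE" constant.
ContainerInfo container = containerManager.allocateContainer(
    replicationType, replicationFactor, containerName,
    ksmId /* was: "OZONE" */);
{code}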

> Ozone: owner name should be set properly when the container allocation happens
> --
>
> Key: HDFS-12966
> URL: https://issues.apache.org/jira/browse/HDFS-12966
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: HDFS-7240
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
> Attachments: HDFS-12966-HDFS-7240.001.patch
>
>
> Currently, when container allocation happens, the owner name is hardcoded 
> as "OZONE".
> It should be set to the KSM instance ID / CBlock Manager instance ID from 
> which the container creation call originates.






[jira] [Updated] (HDFS-12968) Ozone : TestSCMCli and TestContainerStateManager tests are failing consistently while updating the container state info.

2017-12-28 Thread Shashikant Banerjee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-12968:
---
Description: 
TestContainerStateManager#testUpdateContainerState is failing with the 
following exception:

{code}
org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update container 
state container28655, reason: invalid state transition from state: OPEN upon 
event: CLOSE.

at 
org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355)
at 
org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336)
at 
org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244)
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
{code}

Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer are 
failing with the same exception.

  was:
TestContainerStateManager#testUpdateContainerState is failing with the 
following exception:

org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update container 
state container28655, reason: invalid state transition from state: OPEN upon 
event: CLOSE.

at 
org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355)
at 
org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336)
at 
org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244)
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)

Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer are 
failing with the same exception.


> Ozone : TestSCMCli and TestContainerStateManager tests are failing 
> consistently while updating the container state info.
> 
>
> Key: HDFS-12968
> URL: https://issues.apache.org/jira/browse/HDFS-12968
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: HDFS-7240
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
>
> TestContainerStateManager#testUpdateContainerState is failing with the 
> following exception:
> {code}
> org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update 
> container state container28655, reason: invalid state transition from state: 
> OPEN upon event: CLOSE.
>   at 
> org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355)
>   at 
> org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336)
>   at 
> org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244)
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> {code}
> Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer 
> are failing with the same exception.






[jira] [Created] (HDFS-12968) Ozone : TestSCMCli and TestContainerStateManager tests are failing consistently while updating the container state info.

2017-12-28 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDFS-12968:
--

 Summary: Ozone : TestSCMCli and TestContainerStateManager tests 
are failing consistently while updating the container state info.
 Key: HDFS-12968
 URL: https://issues.apache.org/jira/browse/HDFS-12968
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: HDFS-7240
Affects Versions: HDFS-7240
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
 Fix For: HDFS-7240


TestContainerStateManager#testUpdateContainerState is failing with the 
following exception:

org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update container 
state container28655, reason: invalid state transition from state: OPEN upon 
event: CLOSE.

at 
org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355)
at 
org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336)
at 
org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244)
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)

Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer are 
failing with the same exception.






[jira] [Updated] (HDFS-11225) NameNode crashed because deleteSnapshot held FSNamesystem lock too long

2017-12-28 Thread Shashikant Banerjee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-11225:
---
Attachment: Snaphot_Deletion_Design_Proposal.pdf

When deleting an older snapshot, most of the time is spent constructing the 
children list for every directory under the snapshottable root directory, 
which requires adding up all the diffs of subsequent snapshots for each and 
every directory and then reverse-applying them to the current filesystem tree. 
The attached document outlines an idea: once the number of snapshots for a 
snapshottable directory exceeds a certain threshold, the summation of 
subsequent diffs over runs of snapshots is maintained in a skip list for each 
directory. This speeds up the construction of the children list for a 
snapshot. Please have a look at the proposal for more details.
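
A rough sketch of the data structure the proposal describes, with illustrative 
names ({{ChildrenDiff}} being the existing per-snapshot diff type); see the 
attached PDF for the real design:

{code}
// Each skip-list level stores a pre-combined diff covering a run of
// snapshots, so rebuilding a snapshot's children list consults a few
// aggregated diffs instead of every individual per-snapshot diff.
class AggregatedDiffNode {
  int fromSnapshotId;           // first snapshot covered by this node
  int toSnapshotId;             // last snapshot covered by this node
  ChildrenDiff combinedDiff;    // sum of the per-snapshot ChildrenDiffs
  AggregatedDiffNode[] forward; // skip-list forward pointers, one per level
}
{code}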

> NameNode crashed because deleteSnapshot held FSNamesystem lock too long
> ---
>
> Key: HDFS-11225
> URL: https://issues.apache.org/jira/browse/HDFS-11225
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
> Environment: CDH5.8.2, HA
>Reporter: Wei-Chiu Chuang
>Assignee: Manoj Govindassamy
>Priority: Critical
>  Labels: high-availability
> Attachments: Snaphot_Deletion_Design_Proposal.pdf
>
>
> The deleteSnapshot operation is synchronous. In certain situations this 
> operation may hold FSNamesystem lock for too long, bringing almost every 
> NameNode operation to a halt.
> We have observed one incident where it took so long that ZKFC believed the 
> NameNode was down. All other IPC threads were waiting to acquire the 
> FSNamesystem lock. This specific deleteSnapshot took ~70 seconds. ZKFC has a 
> connection timeout of 45 seconds by default, so if all IPC threads wait for 
> the FSNamesystem lock and new incoming connections can't be accepted, ZKFC 
> times out and advances the epoch, and the NameNode therefore loses its 
> active role and fails.
> Relevant log:
> {noformat}
> Thread 154 (IPC Server handler 86 on 8020):
>   State: RUNNABLE
>   Blocked count: 2753455
>   Waited count: 89201773
>   Stack:
> 
> org.apache.hadoop.hdfs.server.namenode.INode$BlocksMapUpdateInfo.addDeleteBlock(INode.java:879)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.destroyAndCollectBlocks(INodeFile.java:508)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.destroyAndCollectBlocks(INodeReference.java:339)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.destroyAndCollectBlocks(INodeReference.java:606)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.destroyDeletedList(DirectoryWithSnapshotFeature.java:119)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.access$400(DirectoryWithSnapshotFeature.java:61)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:319)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:167)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:83)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:745)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:789)
> {noformat}
> After the ZKFC determined NameNode was down and advanced epoch, the NN 
> finished deleting snapshot, and 

[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305279#comment-16305279
 ] 

genericqa commented on HDFS-9023:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}112m 22s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-9023 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903826/HDFS-9023.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cc1f2cc72332 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d31c9d8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22515/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22515/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22515/testReport/ |
| Max. process+thread count | 3375 (vs. ulimit of 5000) |
| modules | C: 

[jira] [Commented] (HDFS-12934) RBF: Federation supports global quota

2017-12-28 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305194#comment-16305194
 ] 

Íñigo Goiri commented on HDFS-12934:


I cannot do a very thorough review until next week, but here are a few minor 
comments:
* I would extract the variables for 
{{QuotaCacheUpdateService#periodicInvoke()}}, in particular 
{{entry.getSourcePath()}} in a couple of loops where it is reused a lot.
* In the same {{periodicInvoke()}}, you could have a function to get the mount 
table entries in addition to {{getMountTableStore()}}; then the loop would just 
return {{List}}.
* {{isQuotaSet()}} could return directly as soon as the quota is found to be 
set, saving the loops; you could just return false at the end otherwise.
* {{quotaUsageCache}} could be {{Map}} in the 
definition; the concurrent implementation is fine. Actually, the usage seems to 
be mostly {{getQuotaUsageCache().getChildrenPaths(path)}}; we could 
expose that directly.
* I found out a few weeks ago that {{Configuration}} has a proper interface to 
get time periods (e.g., 10s). It may make sense to set the intervals for 
updating the cache using that API (see the sketch after this comment).
* {{TestRouterQuota}} probably covers most of the cases, but 
{{RouterQuotaLocalCache}} could be tested by itself pretty exhaustively.
* In general, I would make some of the main variables in the loops {{final}}.

Next week, I'll test it in my cluster and try to get familiar with the way the 
NN does it so I can give a deeper review.
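
The time-period interface referred to above is 
{{Configuration#getTimeDuration}}; a minimal sketch, with the property name 
invented for illustration:

{code}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;

public class QuotaCacheIntervalExample {
  // Illustrative property name only, not one the patch defines.
  static final String UPDATE_INTERVAL_KEY =
      "dfs.federation.router.quota-cache.update.interval";

  // Accepts human-readable periods such as "10s", "2m" or "1h";
  // falls back to 60 seconds when the key is unset.
  public static long updateIntervalMs(Configuration conf) {
    return conf.getTimeDuration(
        UPDATE_INTERVAL_KEY, 60_000, TimeUnit.MILLISECONDS);
  }
}
{code}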

> RBF: Federation supports global quota
> -
>
> Key: HDFS-12934
> URL: https://issues.apache.org/jira/browse/HDFS-12934
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>  Labels: RBF
> Attachments: HDFS-12934.001.patch, HDFS-12934.002.patch, RBF support  
> global quota.pdf
>
>
> Federation currently doesn't support setting a global quota for each folder; 
> the quota is applied to each subcluster under the specified folder via RPC 
> calls.
> It would be very useful for users if federation supported setting a global 
> quota and exposed a command for it.
> In a federated environment, a folder can be spread across multiple 
> subclusters. For this reason, we plan to solve this in the following way:
> # Set the global quota across each subcluster. We don't allow any subcluster 
> to exceed the maximum quota value.
> # Construct one cache map for storing the summed quota usage of these 
> subclusters under a federation folder. Every time we want to do a WRITE 
> operation under a specified folder, we get its quota usage from the cache 
> and verify the quota. If the quota is exceeded, an exception is thrown; 
> otherwise the quota usage in the cache is updated when the operation 
> finishes.
> The quota will be stored as a new field in the mount table. The set/unset 
> commands will look like:
> {noformat}
>  hdfs dfsrouteradmin -setQuota -ns  -ss  
>  hdfs dfsrouteradmin -clrQuota  
> {noformat}


