[jira] [Commented] (HDFS-11293) FsDatasetImpl throws ReplicaAlreadyExistsException in a wrong situation

2017-01-04 Thread Yuanbo Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800624#comment-15800624
 ] 

Yuanbo Liu commented on HDFS-11293:
---

[~umamaheswararao] / [~rakeshr] I'm tagging you here because this situation 
consistently makes SPS unstable, even without my persistence code. I don't think 
this issue is caused by SPS; it is a general issue. If you have any thoughts about 
this JIRA, please let me know. Thanks in advance!

> FsDatasetImpl throws ReplicaAlreadyExistsException in a wrong situation
> ---
>
> Key: HDFS-11293
> URL: https://issues.apache.org/jira/browse/HDFS-11293
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
>Priority: Critical
>
> In {{FsDatasetImpl#createTemporary}}, we use {{volumeMap}} to look up replica 
> info by block pool id. Consider this situation:
> {code}
> datanode A => {DISK, SSD}, datanode B => {DISK, ARCHIVE}.
> 1. the same block replica exists in A[DISK] and B[DISK].
> 2. the block pool id of datanode A and datanode B is the same.
> {code}
> We then change the file's storage policy and move the block replica within 
> the cluster. Very likely the block has to be moved from B[DISK] to A[SSD]. 
> At that point, datanode A throws ReplicaAlreadyExistsException, which is not 
> the correct behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11293) FsDatasetImpl throws ReplicaAlreadyExistsException in a wrong situation

2017-01-04 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11293:
-

 Summary: FsDatasetImpl throws ReplicaAlreadyExistsException in a 
wrong situation
 Key: HDFS-11293
 URL: https://issues.apache.org/jira/browse/HDFS-11293
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu
Priority: Critical


In {{FsDatasetImpl#createTemporary}}, we use {{volumeMap}} to look up replica 
info by block pool id. Consider this situation:
{code}
datanode A => {DISK, SSD}, datanode B => {DISK, ARCHIVE}.
1. the same block replica exists in A[DISK] and B[DISK].
2. the block pool id of datanode A and datanode B is the same.
{code}
We then change the file's storage policy and move the block replica within the 
cluster. Very likely the block has to be moved from B[DISK] to A[SSD]. At that 
point, datanode A throws ReplicaAlreadyExistsException, which is not the correct 
behavior.
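
The scenario above can be illustrated with a small toy model. This is only a 
sketch under stated assumptions, not the real FsDatasetImpl code: the class 
ReplicaLookupSketch, its Replica record, and the use of IllegalStateException as 
a stand-in for ReplicaAlreadyExistsException are all hypothetical. It assumes 
the replica map on one datanode is keyed by block pool id and block ID only, so 
the target storage type plays no part in the lookup.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a per-datanode replica map; not the real FsDatasetImpl. */
final class ReplicaLookupSketch {

  /** Hypothetical replica record: a block ID plus the storage type holding it. */
  static final class Replica {
    final long blockId;
    final String storageType;

    Replica(long blockId, String storageType) {
      this.blockId = blockId;
      this.storageType = storageType;
    }
  }

  // blockPoolId -> (blockId -> replica); the storage type is not part of the key.
  private final Map<String, Map<Long, Replica>> volumeMap = new HashMap<>();

  void add(String blockPoolId, Replica replica) {
    volumeMap.computeIfAbsent(blockPoolId, k -> new HashMap<>())
        .put(replica.blockId, replica);
  }

  /** Rejects the block if any storage on this datanode already holds it. */
  Replica createTemporary(String blockPoolId, long blockId, String targetStorage) {
    Map<Long, Replica> pool = volumeMap.get(blockPoolId);
    Replica existing = (pool == null) ? null : pool.get(blockId);
    if (existing != null) {
      // Stand-in for ReplicaAlreadyExistsException.
      throw new IllegalStateException("Block " + blockId
          + " already exists on " + existing.storageType);
    }
    Replica created = new Replica(blockId, targetStorage);
    add(blockPoolId, created);
    return created;
  }
}
```

In this model, datanode A already holds the block on DISK, so a request to 
receive the same block onto SSD is rejected even though the target storage is 
different, which matches the behavior reported here.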






[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-04 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800575#comment-15800575
 ] 

Ming Ma commented on HDFS-9391:
---

Sure, let us keep what you have in patch 02. Just to make sure, can you confirm 
the following?

* For the case of "one replica is decommissioning and two replicas of the same 
block are entering maintenance", the code will still increment 
maintenanceOnlyReplicas when processing the decommissioning node, because 
NumberReplicas includes stats for all replicas. Thus decommissionOnlyReplicas == 
maintenanceOnlyReplicas == outOfServiceReplicas.
* For the case of "all replicas are decommissioning", the EnteringMaintenance 
page will have nothing to show to begin with, given that no nodes are entering 
maintenance.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, Maintenance webUI.png
>
>







[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN

2017-01-04 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800550#comment-15800550
 ] 

Haohui Mai commented on HDFS-11280:
---

Committed all the way down to branch-2.7. Many thanks to everyone for reporting 
and validating the issues.


> Allow WebHDFS to reuse HTTP connections to NN
> -
>
> Key: HDFS-11280
> URL: https://issues.apache.org/jira/browse/HDFS-11280
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-11280.for.2.7.and.below.patch, 
> HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, 
> HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, 
> HDFS-11280.for.2.8.and.beyond.patch
>
>
> WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. 
>  When we use webhdfs as the source in distcp, this used up all ephemeral 
> ports on the client side since all closed connections continue to occupy the 
> port with TIME_WAIT status for some time.
> According to http://tinyurl.com/java7-http-keepalive, we should call 
> conn.getInputStream().close() instead to make sure the connection is kept 
> alive.  This will get rid of the ephemeral port problem.
> Manual steps used to verify the bug fix:
> 1. Build original hadoop jar.
> 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | 
> grep -c 50070" on the local machine shows a big number (100s).
> 3. Build hadoop jar with this diff.
> 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | 
> grep -c 50070" on the local machine shows 0.
> 5. The explanation:  distcp's client side does a lot of directory scanning, 
> which would create and close a lot of connections to the namenode HTTP port.
> Reference:
> 2.7 and below: 
> https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743
> 2.8 and above: 
> https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898
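
The difference between the two calls can be sketched as follows. This is a 
minimal illustration, not the WebHdfsFileSystem patch itself: the class 
KeepAliveSketch and its method names are hypothetical. It relies only on the 
standard java.net.HttpURLConnection behavior that a fully read and closed 
response stream lets the JDK return the socket to its keep-alive cache, while 
disconnect() may close the socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch of releasing an HTTP connection for reuse instead of disconnecting. */
final class KeepAliveSketch {

  /** Reads the stream to the end and closes it; returns the number of bytes read. */
  static long drainAndClose(InputStream in) throws IOException {
    long total = 0;
    byte[] buf = new byte[8192];
    try (InputStream stream = in) {
      int n;
      while ((n = stream.read(buf)) != -1) {
        total += n;
      }
    }
    return total;
  }

  /** Fetches a URL and releases the connection without calling disconnect(). */
  static long fetch(URL url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    // Deliberately no conn.disconnect(): closing the drained stream is enough
    // to let the underlying socket be reused for the next request.
    return drainAndClose(conn.getInputStream());
  }
}
```

With many small requests to the same NameNode HTTP port, as in the distcp 
directory scan described above, reusing sockets this way avoids piling up 
client-side connections in TIME_WAIT.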






[jira] [Updated] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN

2017-01-04 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-11280:
--
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha2
   2.7.4
   2.8.0
   Status: Resolved  (was: Patch Available)

> Allow WebHDFS to reuse HTTP connections to NN
> -
>
> Key: HDFS-11280
> URL: https://issues.apache.org/jira/browse/HDFS-11280
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-11280.for.2.7.and.below.patch, 
> HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, 
> HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, 
> HDFS-11280.for.2.8.and.beyond.patch
>
>






[jira] [Commented] (HDFS-9935) Remove LEASE_{SOFTLIMIT,HARDLIMIT}_PERIOD constants from HdfsServerConstants

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800463#comment-15800463
 ] 

Hadoop QA commented on HDFS-9935:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 15m 55s | trunk passed |
| +1 | compile | 0m 59s | trunk passed |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 1m 21s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| +1 | findbugs | 2m 1s | trunk passed |
| +1 | javadoc | 0m 45s | trunk passed |
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 54s | the patch passed |
| +1 | javac | 0m 54s | the patch passed |
| +1 | checkstyle | 0m 30s | the patch passed |
| +1 | mvnsite | 1m 4s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 11s | the patch passed |
| +1 | javadoc | 0m 43s | the patch passed |
| +1 | unit | 74m 41s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 23s | The patch does not generate ASF License warnings. |
| | | 105m 8s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-9935 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12792428/HDFS-9935.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux b987f56496ae 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5ed63e3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18032/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18032/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.



> Remove LEASE_{SOFTLIMIT,HARDLIMIT}_PERIOD constants from HdfsServerConstants
> 
>
> Key: HDFS-9935
> URL: https://issues.apache.org/jira/browse/HDFS-9935
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-9935.001.patch
>
>
> In HDFS-9134, it has moved the 
> 

[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN

2017-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800394#comment-15800394
 ] 

Hudson commented on HDFS-11280:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11074 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11074/])
HDFS-11280. Allow WebHDFS to reuse HTTP connections to NN. Contributed (wheat9: 
rev a605ff36a53a3d1283c3f6d81eb073e4a2942143)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


> Allow WebHDFS to reuse HTTP connections to NN
> -
>
> Key: HDFS-11280
> URL: https://issues.apache.org/jira/browse/HDFS-11280
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HDFS-11280.for.2.7.and.below.patch, 
> HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, 
> HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, 
> HDFS-11280.for.2.8.and.beyond.patch
>
>






[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN

2017-01-04 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800337#comment-15800337
 ] 

Haohui Mai commented on HDFS-11280:
---

Validated that the latest patch passes the unit tests mentioned in the jira. 
Committing.

> Allow WebHDFS to reuse HTTP connections to NN
> -
>
> Key: HDFS-11280
> URL: https://issues.apache.org/jira/browse/HDFS-11280
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HDFS-11280.for.2.7.and.below.patch, 
> HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, 
> HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, 
> HDFS-11280.for.2.8.and.beyond.patch
>
>






[jira] [Commented] (HDFS-9935) Remove LEASE_{SOFTLIMIT,HARDLIMIT}_PERIOD constants from HdfsServerConstants

2017-01-04 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800276#comment-15800276
 ] 

Brahma Reddy Battula commented on HDFS-9935:


I feel these can be removed, as they are not used anymore, and HDFS-9936 was 
also updated as part of this jira.

But we need to confirm with [~wheat9] and [~liuml07], as these were kept as per 
[comment|https://issues.apache.org/jira/browse/HDFS-9134?focusedCommentId=14905935=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14905935]
 from [~wheat9].

> Remove LEASE_{SOFTLIMIT,HARDLIMIT}_PERIOD constants from HdfsServerConstants
> 
>
> Key: HDFS-9935
> URL: https://issues.apache.org/jira/browse/HDFS-9935
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-9935.001.patch
>
>
> HDFS-9134 moved the 
> {{LEASE_SOFTLIMIT_PERIOD}} and {{LEASE_HARDLIMIT_PERIOD}} constants from 
> {{HdfsServerConstants}} to {{HdfsConstants}} because these two constants are 
> used by {{DFSClient}}, which was moved to {{hadoop-hdfs-client}}. Constants 
> in {{HdfsConstants}} can be used by both the client and the server side. In 
> addition, I have checked that these two constants in {{HdfsServerConstants}} 
> are no longer used in the project and were all replaced by 
> {{HdfsConstants.LEASE_SOFTLIMIT_PERIOD}} and {{HdfsConstants.LEASE_HARDLIMIT_PERIOD}}.
> So I think we can remove these unused constants from 
> {{HdfsServerConstants}} completely. If we want to use them in the future, we 
> can use the ones in {{HdfsConstants}} instead.






[jira] [Commented] (HDFS-11285) Dead DataNodes keep a long time in (Dead, DECOMMISSION_INPROGRESS), and never transition to (Dead, DECOMMISSIONED)

2017-01-04 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800224#comment-15800224
 ] 

Lantao Jin commented on HDFS-11285:
---

Thanks [~andrew.wang]. I'm not sure whether our case is a common one. We 
have an upper-layer application which triggers and monitors the decommissioning 
progress. When it finds that "Blocks with no live replicas" has dropped to 0 in 
the NN UI, it shuts down the DN. We do not wait for the node to transition to 
decommissioned because we sometimes found that decommissioning took a very long 
time while only one or two "Under replicated blocks" were left.

So, once there are no "Blocks with no live replicas", the DN is shut down, and 
its status becomes [Dead, Decommissioning] forever. Therefore, I need to run the 
four steps mentioned above to retire these nodes.

In the code of HeartbeatManager and DecommissionManager:
{code}
if (!node.isDecommissionInProgress() && !node.isDecommissioned()) {
  // Update DN stats maintained by HeartbeatManager
  hbManager.startDecommission(node);
{code}
Only a node in [Dead, Normal] status can be set to [Dead, Decommissioned] directly.

> Dead DataNodes keep a long time in (Dead, DECOMMISSION_INPROGRESS), and never 
> transition to (Dead, DECOMMISSIONED)
> --
>
> Key: HDFS-11285
> URL: https://issues.apache.org/jira/browse/HDFS-11285
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Lantao Jin
>
> We have seen the use case of decommissioning DataNodes that are already dead 
> or unresponsive, and not expected to rejoin the cluster. In a large cluster, 
> we found more than 100 nodes that were dead and decommissioning, and their 
> {panel} Under replicated blocks {panel} and {panel} Blocks with no live 
> replicas {panel} were all ZERO. This was actually fixed in 
> [HDFS-7374|https://issues.apache.org/jira/browse/HDFS-7374]. After that fix, 
> we can run refreshNodes twice to eliminate this case. However, it seems the 
> fix was lost after the refactor in 
> [HDFS-7411|https://issues.apache.org/jira/browse/HDFS-7411]. We are using a 
> Hadoop version based on 2.7.1, and only the operations below can transition 
> the status from {panel} Dead, DECOMMISSION_INPROGRESS {panel} to 
> {panel} Dead, DECOMMISSIONED {panel}:
> # Retire it from hdfs-exclude
> # refreshNodes
> # Re-add it to hdfs-exclude
> # refreshNodes
> So, why was this code removed after the refactor into the new DecommissionManager?
> {code:java}
> if (!node.isAlive) {
>   LOG.info("Dead node " + node + " is decommissioned immediately.");
>   node.setDecommissioned();
> {code}
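
The behavior discussed in this issue can be modelled with a toy state machine. 
This is a sketch of what the thread describes, not the actual 
DecommissionManager logic: the class DecommissionSketch, its Node type and the 
two methods are hypothetical, and the rule that a dead node in the normal state 
is retired immediately is taken from the comment on this issue rather than 
verified against the source.

```java
/** Toy model of the admin-state transitions described in this issue. */
final class DecommissionSketch {

  enum AdminState { NORMAL, DECOMMISSION_INPROGRESS, DECOMMISSIONED }

  /** Hypothetical stand-in for a datanode descriptor. */
  static final class Node {
    boolean alive;
    AdminState state = AdminState.NORMAL;

    Node(boolean alive) {
      this.alive = alive;
    }
  }

  /** Models "node is listed in hdfs-exclude, then refreshNodes". */
  static void startDecommission(Node node) {
    if (node.state != AdminState.NORMAL) {
      return; // already in progress or decommissioned: the guard skips it
    }
    node.state = node.alive
        ? AdminState.DECOMMISSION_INPROGRESS
        : AdminState.DECOMMISSIONED; // a dead node is retired immediately
  }

  /** Models "node is removed from hdfs-exclude, then refreshNodes". */
  static void stopDecommission(Node node) {
    node.state = AdminState.NORMAL;
  }
}
```

In this model a node that dies while in DECOMMISSION_INPROGRESS stays there on 
every later refresh, and only the remove and re-add sequence from the four 
steps above moves it to DECOMMISSIONED.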






[jira] [Commented] (HDFS-11193) [SPS]: Erasure coded files should be considered for satisfying storage policy

2017-01-04 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800153#comment-15800153
 ] 

Uma Maheswara Rao G commented on HDFS-11193:


[~rakeshr] The reported checkstyle error is related, and the ASF license 
header comment is invalid.
Also, could you please check whether the test failures are related?

> [SPS]: Erasure coded files should be considered for satisfying storage policy
> -
>
> Key: HDFS-11193
> URL: https://issues.apache.org/jira/browse/HDFS-11193
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-11193-HDFS-10285-00.patch, 
> HDFS-11193-HDFS-10285-01.patch, HDFS-11193-HDFS-10285-02.patch, 
> HDFS-11193-HDFS-10285-03.patch
>
>
> Erasure coded striped files support the storage policies {{HOT, COLD, ALLSSD}}. 
> The {{HdfsAdmin#satisfyStoragePolicy}} API call on a directory should consider 
> all immediate files under that directory and check whether the files 
> really match the namespace storage policy. All the mismatched striped 
> blocks should be chosen for block movement.






[jira] [Commented] (HDFS-9483) Documentation does not cover use of "swebhdfs" as URL scheme for SSL-secured WebHDFS.

2017-01-04 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800155#comment-15800155
 ] 

Brahma Reddy Battula commented on HDFS-9483:


[~cnauroth] thanks for your reply. Agreed with you. I'm also +1 on the latest patch.

> Documentation does not cover use of "swebhdfs" as URL scheme for SSL-secured 
> WebHDFS.
> -
>
> Key: HDFS-9483
> URL: https://issues.apache.org/jira/browse/HDFS-9483
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9483.001.patch, HDFS-9483.002.patch, HDFS-9483.patch
>
>
> If WebHDFS is secured with SSL, then you can use "swebhdfs" as the scheme in 
> a URL to access it.  The current documentation does not state this anywhere.
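
As an illustration of the scheme change, the sketch below rewrites a webhdfs 
URI into its swebhdfs form using only java.net.URI. The class SwebhdfsUriSketch, 
the host nn.example.com and the port numbers in the usage note are assumptions 
for the example; the actual HTTPS port depends on the cluster configuration.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Sketch: derive the SSL-secured WebHDFS URI from a plain WebHDFS URI. */
final class SwebhdfsUriSketch {

  /** Swaps the scheme to "swebhdfs" and the port to the given HTTPS port. */
  static URI toSecure(URI webhdfs, int httpsPort) throws URISyntaxException {
    if (!"webhdfs".equals(webhdfs.getScheme())) {
      throw new IllegalArgumentException("not a webhdfs URI: " + webhdfs);
    }
    return new URI("swebhdfs", webhdfs.getUserInfo(), webhdfs.getHost(),
        httpsPort, webhdfs.getPath(), webhdfs.getQuery(), webhdfs.getFragment());
  }
}
```

For example, toSecure(URI.create("webhdfs://nn.example.com:50070/user/alice"), 
50470) yields a URI that keeps the same host and path but uses the swebhdfs 
scheme and the supplied port.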






[jira] [Updated] (HDFS-11292) log lastWrittenTxId etc info in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
Summary: log lastWrittenTxId etc info in logSyncAll  (was: log 
lastWrittenTxId in logSyncAll)

> log lastWrittenTxId etc info in logSyncAll
> --
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch
>
>
> For the issue reported in HDFS-10943, the problem still exists even after 
> HDFS-7964's fix is included, which means there might be some synchronization 
> issue.
> To diagnose that, this jira is created to report the lastWrittenTxId info in 
> the {{logSyncAll()}} call, so that we can compare it against the error message 
> reported in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason.
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked 
> in between A and B. It's observed that the lastWrittenTxId in B and C are the 
> same.
> The proposed reporting would help confirm whether 2 is true.
> {code}
>   public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
>     LOG.info("Ending log segment " + curSegmentTxId);
>     Preconditions.checkState(isSegmentOpen(),
>         "Bad state: %s", state);
>     if (writeEndTxn) {
>       logEdit(LogSegmentOp.getInstance(cache.get(),
>           FSEditLogOpCodes.OP_END_LOG_SEGMENT));
>     }
>     // always sync to ensure all edits are flushed.
>     logSyncAll();                                   // statement A
>     printStatistics(true);                          // statement B
>     final long lastTxId = getLastWrittenTxId();
>     try {
>       journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);  // statement C
>       editLogStream = null;
>     } catch (IOException e) {
>       //All journals have failed, it will be handled in logSync.
>     }
>     state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}
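
The second possibility can be illustrated with a toy transaction counter. This 
is only a sketch of the diagnostic idea, not FSEditLog code: the class 
TxIdRaceSketch and the sneakedIn helper are hypothetical, and the model assumes 
that logSyncAll() can report the transaction ID it flushed up to.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model: compare the txid synced at A with the txid seen afterwards. */
final class TxIdRaceSketch {

  private final AtomicLong lastWrittenTxId = new AtomicLong();

  /** Writes one edit and returns its transaction ID. */
  long logEdit() {
    return lastWrittenTxId.incrementAndGet();
  }

  /** Flushes everything written so far and reports the txid it synced up to. */
  synchronized long logSyncAll() {
    return lastWrittenTxId.get();
  }

  long getLastWrittenTxId() {
    return lastWrittenTxId.get();
  }

  /** True if new transactions were written after the sync completed. */
  static boolean sneakedIn(long syncedUpTo, long lastWrittenAfterSync) {
    return lastWrittenAfterSync > syncedUpTo;
  }
}
```

Logging both values, the one observed inside the sync and the one read after 
it, is what would let the two possibilities above be told apart.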






[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-04 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800144#comment-15800144
 ] 

Manoj Govindassamy commented on HDFS-9391:
--

Yes, in the example you gave both maintenanceOnlyReplicas and 
outOfServiceOnlyReplicas are incremented. I see outOfServiceOnlyReplicas as 
more of a cumulative number of both decommission and maintenance. The final 
counts would be 2 for maintenanceOnlyReplicas and 3 for 
outOfServiceOnlyReplicas. 

>> In other words, maintenanceOnlyReplicas isn't strictly "all 3 replicas are 
>> maintenance". Maybe this new definition is more desirable.

Yes, in the above example, all 3 replicas are in some sort of maintenance and 
it is ok to have EnteringMaintenance page display "OutOfServiceOnlyReplicas".

But in another example, where only one node is in decommission and no other 
nodes are in maintenance:
-- the Decommissioning page will rightly show 1 node in decommission. There is 
no problem with this page.
-- the EnteringMaintenance page, if we start to use the new cumulative 
"OutOfServiceOnlyReplicas", will also show 1 node in maintenance, 
the same one which is decommissioning.

Hopefully you have thought about this case as well. Does the EnteringMaintenance 
page behavior for the second example sound OK to you? Please let me know.

{noformat}
  .put("maintenanceOnlyReplicas",
  node.getLeavingServiceStatus().getOutOfServiceOnlyReplicas())
{noformat}

Once this open question is resolved, I will attach the new patch incorporating 
all pending changes. Thanks [~mingma].
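
The counts in the two examples above can be reproduced with a small tally. This 
is a sketch of the arithmetic only, assuming the counters are simple 
per-replica tallies; the class ReplicaTallySketch and its State enum are 
hypothetical, and the real NumberReplicas and LeavingServiceStatus logic in 
HDFS is more involved.

```java
import java.util.List;

/** Toy tally of replica states for one block. */
final class ReplicaTallySketch {

  enum State { LIVE, DECOMMISSIONING, ENTERING_MAINTENANCE }

  /** Counts the replicas that are in the given state. */
  static int count(List<State> replicas, State wanted) {
    int n = 0;
    for (State s : replicas) {
      if (s == wanted) {
        n++;
      }
    }
    return n;
  }

  /** Cumulative count: decommissioning plus entering-maintenance replicas. */
  static int outOfService(List<State> replicas) {
    return count(replicas, State.DECOMMISSIONING)
        + count(replicas, State.ENTERING_MAINTENANCE);
  }
}
```

With one decommissioning replica and two entering maintenance, the maintenance 
tally is 2 and the cumulative tally is 3. With one decommissioning replica and 
two live ones, the maintenance tally is 0 but the cumulative tally is still 1, 
which is the case the question above is about.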

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, Maintenance webUI.png
>
>







[jira] [Commented] (HDFS-11291) Avoid unnecessary edit log for setStoragePolicy() and setReplication()

2017-01-04 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800087#comment-15800087
 ] 

Yiqun Lin commented on HDFS-11291:
--

Thanks [~surendrasingh] for the work on this. Yes, it will avoid extra RPC 
calls. I looked into your patch, and it looks almost good to me. I noticed only 
one thing:
In the patch, you use the method {{INodeFile#getFileReplication}} to get the 
old replication,
{code}
+  if (inode.asFile().getFileReplication() == replication) {
+    return true;
+  }
{code}
However, the original code uses the method 
{{INodeFile#getPreferredBlockReplication}} (which returns the max replication 
value between the file's blocks and the file)
{code}
short oldBR = file.getPreferredBlockReplication();
{code}
Should we keep this consistent?
In addition, can you run the failing test 
{{hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer}} locally? 
It seems related. Everything else looks good to me. Thanks.
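
To make the difference between the two getters concrete, here is a minimal sketch. {{ReplicationSketch}} is a hypothetical stand-in, not the real {{INodeFile}}, and it models {{getPreferredBlockReplication}} only as described in this comment (the max across the file and its blocks).

```java
// Hypothetical stand-in for INodeFile; field and class names are assumptions.
final class ReplicationSketch {
    private final short fileReplication;      // replication recorded on the file
    private final short[] blockReplications;  // replication recorded per block

    ReplicationSketch(short fileReplication, short... blockReplications) {
        this.fileReplication = fileReplication;
        this.blockReplications = blockReplications.clone();
    }

    // Mirrors the getter used in the patch: the file-level value only.
    short getFileReplication() {
        return fileReplication;
    }

    // Mirrors the getter used by the original code, as described above:
    // the max of the file-level value and every block-level value.
    short getPreferredBlockReplication() {
        short max = fileReplication;
        for (short r : blockReplications) {
            if (r > max) {
                max = r;
            }
        }
        return max;
    }
}
```

With a file-level value of 2 and one block at 3, the two getters disagree, so an early return based on one of them can skip work that a check based on the other would not.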

> Avoid unnecessary edit log for setStoragePolicy() and setReplication()
> --
>
> Key: HDFS-11291
> URL: https://issues.apache.org/jira/browse/HDFS-11291
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-11291.001.patch
>
>
> We are setting the storage policy for a file without checking its current 
> policy, to avoid an extra getStoragePolicy() RPC call. Currently the 
> namenode does not check the current storage policy before setting the new one 
> and adding edit logs. I think if the old and new storage policies are the same, 
> we can avoid the set operation.
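
The proposal above can be sketched roughly as follows. {{StoragePolicySketch}} and its in-memory "edit log" are hypothetical and only illustrate the idea of skipping the edit when nothing changes; they are not the namenode implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a file's storage policy plus an edit log.
final class StoragePolicySketch {
    private byte currentPolicyId;
    private final List<String> editLog = new ArrayList<>();

    StoragePolicySketch(byte initialPolicyId) {
        this.currentPolicyId = initialPolicyId;
    }

    // Returns true only when the policy changed and an edit was logged.
    boolean setStoragePolicy(byte newPolicyId) {
        if (newPolicyId == currentPolicyId) {
            return false; // same policy: no state change, no edit log entry
        }
        currentPolicyId = newPolicyId;
        editLog.add("SET_STORAGE_POLICY " + newPolicyId);
        return true;
    }

    int editCount() {
        return editLog.size();
    }
}
```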






[jira] [Updated] (HDFS-5180) Output the processing time of slow RPC request to node's log

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-5180:
-
Fix Version/s: (was: 2.8.0)

> Output the processing time of slow RPC request to node's log
> 
>
> Key: HDFS-5180
> URL: https://issues.apache.org/jira/browse/HDFS-5180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
>  Labels: BB2015-05-TBR
> Attachments: HDFS-5180.patch, HDFS-5180.patch
>
>
> In current trunk, the processing time of every RPC request is written to the 
> log at DEBUG level.
> When troubleshooting a large-scale cluster, that volume of output is hard to 
> work with.
> Therefore we should set a threshold and write only slow RPCs to the node's 
> log, so that abnormal behavior is easy to spot.
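
A minimal sketch of the threshold idea, assuming a hypothetical {{SlowRpcLogSketch}} class that collects log lines in memory. The class name, the message format and the "at or above threshold" rule are illustrative choices, not taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records a log line only for RPCs at or above a threshold.
final class SlowRpcLogSketch {
    private final long thresholdMillis;
    private final List<String> logLines = new ArrayList<>();

    SlowRpcLogSketch(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    void record(String method, long processingMillis) {
        if (processingMillis >= thresholdMillis) {
            logLines.add("Slow RPC: " + method + " took " + processingMillis + " ms");
        }
        // Faster calls are dropped here instead of being logged at DEBUG.
    }

    List<String> logLines() {
        return logLines;
    }
}
```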






[jira] [Updated] (HDFS-8061) Create an Offline FSImage Viewer tool

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8061:
-
Fix Version/s: (was: 2.8.0)

> Create an Offline FSImage Viewer tool
> -
>
> Key: HDFS-8061
> URL: https://issues.apache.org/jira/browse/HDFS-8061
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode
>Reporter: Mike Drob
>Assignee: Lei (Eddy) Xu
>
> We already have a tool for converting edit logs to and from binary and xml. 
> The next logical step is to create an `oiv` (offline image viewer) that will 
> allow users to manipulate the FS Image.
> When outputting to text, it might make sense to have two output formats - 1) 
> an XML that is easier to convert back to binary and 2) something that looks 
> like the output from `tree` command.






[jira] [Updated] (HDFS-8306) Outputs Xattr in OIV XML format

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8306:
-
Fix Version/s: (was: 2.8.0)

> Outputs Xattr in OIV XML format
> ---
>
> Key: HDFS-8306
> URL: https://issues.apache.org/jira/browse/HDFS-8306
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-8306.000.patch, HDFS-8306.001.patch, 
> HDFS-8306.002.patch, HDFS-8306.003.patch, HDFS-8306.004.patch, 
> HDFS-8306.005.patch, HDFS-8306.006.patch, HDFS-8306.007.patch, 
> HDFS-8306.008.patch, HDFS-8306.009.patch, HDFS-8306.debug0.patch, 
> HDFS-8306.debug1.patch
>
>
> Currently, in the {{hdfs oiv}} XML output, not all fields of the fsimage are 
> included. This makes inspecting the {{fsimage}} from the XML output less 
> practical. It also prevents recovering an fsimage from an XML file.
> This JIRA adds ACLs and XAttrs to the XML output as the first step to 
> achieve the goal described in HDFS-8061.






[jira] [Updated] (HDFS-8872) Reporting of missing blocks is different in fsck and namenode ui/metasave

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8872:
-
Fix Version/s: (was: 2.8.0)

> Reporting of missing blocks is different in fsck and namenode ui/metasave
> -
>
> Key: HDFS-8872
> URL: https://issues.apache.org/jira/browse/HDFS-8872
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>
> The namenode UI and metasave will not report a block as missing if the only 
> replica is on a decommissioning/decommissioned node, while fsck will show it as 
> MISSING.
> Since a decommissioned node can be formatted/removed at any time, we can 
> actually lose the block.
> It is better to alert on the namenode UI if the only copy is on a 
> decommissioned/decommissioning node.
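
The condition the description asks the UI to alert on could be expressed roughly like this. {{MissingBlockSketch}} and its {{NodeState}} enum are hypothetical and are not the namenode's actual types.

```java
import java.util.List;

// Hypothetical classification of a block by where its replicas live.
final class MissingBlockSketch {
    enum NodeState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

    // True when the block has replicas, but none on an in-service node.
    static boolean onlyOnLeavingNodes(List<NodeState> replicaLocations) {
        if (replicaLocations.isEmpty()) {
            return false; // no replica at all is a separate, already-reported case
        }
        for (NodeState state : replicaLocations) {
            if (state == NodeState.IN_SERVICE) {
                return false;
            }
        }
        return true;
    }
}
```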






[jira] [Updated] (HDFS-9310) TestDataNodeHotSwapVolumes fails occasionally

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9310:
-
Fix Version/s: (was: 2.8.0)

> TestDataNodeHotSwapVolumes fails occasionally
> -
>
> Key: HDFS-9310
> URL: https://issues.apache.org/jira/browse/HDFS-9310
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Arpit Agarwal
>Assignee: Lei (Eddy) Xu
>
> TestDataNodeHotSwapVolumes fails occasionally in Jenkins and locally. e.g. 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13197/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeHotSwapVolumes/testRemoveVolumeBeingWritten/
> *Error Message*
> Timed out waiting for /test to reach 3 replicas
> *Stacktrace*
> java.util.concurrent.TimeoutException: Timed out waiting for /test to reach 3 replicas
>   at org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:768)
>   at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWrittenForDatanode(TestDataNodeHotSwapVolumes.java:644)
>   at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten(TestDataNodeHotSwapVolumes.java:569)






[jira] [Commented] (HDFS-9940) Balancer should not use property dfs.datanode.balance.max.concurrent.moves

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799935#comment-15799935
 ] 

Junping Du commented on HDFS-9940:
--

The patch has not been committed, so we should set the target version instead 
of the fix version. 2.8.0 is being released, so I am resetting it to 2.9.

> Balancer should not use property dfs.datanode.balance.max.concurrent.moves
> --
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.
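
One way the Balancer side could read the new property is sketched below with plain {{java.util.Properties}}. The fallback to the old datanode key and the default value are assumptions made for illustration; the issue itself only asks for the new property name.

```java
import java.util.Properties;

// Hypothetical lookup helper; not the Balancer's real configuration code.
final class BalancerConfSketch {
    static final String BALANCER_KEY = "dfs.balancer.max.concurrent.moves";
    static final String DATANODE_KEY = "dfs.datanode.balance.max.concurrent.moves";
    // Placeholder default for this sketch only, not taken from Hadoop.
    static final int ASSUMED_DEFAULT = 5;

    static int balancerMaxConcurrentMoves(Properties conf) {
        String value = conf.getProperty(BALANCER_KEY);
        if (value == null) {
            // Assumed compatibility fallback: honor the old key if only it is set.
            value = conf.getProperty(DATANODE_KEY);
        }
        return value == null ? ASSUMED_DEFAULT : Integer.parseInt(value.trim());
    }
}
```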






[jira] [Updated] (HDFS-9287) Block placement completely fails if too many nodes are decommissioning

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9287:
-
Fix Version/s: (was: 2.8.0)

> Block placement completely fails if too many nodes are decommissioning
> --
>
> Key: HDFS-9287
> URL: https://issues.apache.org/jira/browse/HDFS-9287
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
>Priority: Critical
>
> The DatanodeManager coordinates with the HeartbeatManager to update 
> HeartbeatManager.Stats to track capacity and load.   This is crucial for 
> block placement to consider space and load.  It's completely broken for 
> decomm nodes.
> The heartbeat manager subtracts the prior values before it adds new values.  
> During registration of a decomm node, it subtracts before seeding the 
> initial values.  This decrements nodesInService and flips the state to decomm; 
> the add then will not increment nodesInService (correct).  There are other math 
> bugs (double adding) that accidentally work due to 0 values.
> The result is every decomm node decrements the node count used for block 
> placement.  When enough nodes are decomm, the replication monitor will 
> silently stop working.  No logging.  It searches all nodes and just gives up. 
>  Eventually, all block allocation will also completely fail.  No files can be 
> created.  No jobs can be submitted.
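
The subtract-then-add sequence described above can be reproduced in a few lines. {{HeartbeatStatsSketch}} is a hypothetical reduction of the stats logic to a single counter; it follows the order given in the description and is not the real {{HeartbeatManager.Stats}} code.

```java
// Hypothetical reduction of the stats bookkeeping to one counter.
final class HeartbeatStatsSketch {
    static final class Node {
        boolean decommissioned; // false until the state is flipped
    }

    int nodesInService;

    void subtract(Node node) {
        if (!node.decommissioned) {
            nodesInService--;
        }
    }

    void add(Node node) {
        if (!node.decommissioned) {
            nodesInService++;
        }
    }

    // Registration order from the description: subtract, flip state, add.
    void registerDecommissionedNode(Node node) {
        subtract(node);             // node was never added, yet the count drops
        node.decommissioned = true; // state flips to decomm
        add(node);                  // correctly skips the increment
    }
}
```

Starting from 3 in-service nodes, registering two decommissioned nodes leaves the counter at 1 even though no in-service node left the cluster.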






[jira] [Updated] (HDFS-9940) Balancer should not use property dfs.datanode.balance.max.concurrent.moves

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9940:
-
Fix Version/s: (was: 2.8.0)

> Balancer should not use property dfs.datanode.balance.max.concurrent.moves
> --
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.






[jira] [Updated] (HDFS-9940) Balancer should not use property dfs.datanode.balance.max.concurrent.moves

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9940:
-
Target Version/s: 2.9.0  (was: 2.8.0)

> Balancer should not use property dfs.datanode.balance.max.concurrent.moves
> --
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.






[jira] [Updated] (HDFS-10171) Balancer should log config values

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-10171:
--
Fix Version/s: (was: 2.8.0)

> Balancer should log config values
> -
>
> Key: HDFS-10171
> URL: https://issues.apache.org/jira/browse/HDFS-10171
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.7.2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
>
> To improve supportability, Balancer should log config values.






[jira] [Updated] (HDFS-10198) File browser web UI should split to pages when files/dirs are too many

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-10198:
--
Fix Version/s: (was: 2.8.0)

> File browser web UI should split to pages when files/dirs are too many
> --
>
> Key: HDFS-10198
> URL: https://issues.apache.org/jira/browse/HDFS-10198
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.7.2
>Reporter: Weiwei Yang
>Assignee: Afzal Saan
>  Labels: ui
>
> When there are a large number of files/dirs, the HDFS file browser UI takes too 
> long to load, and it loads all items in a single page, which makes the listing 
> hard to read. We should split it into pages.
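
The paging itself is simple to sketch. The helper below is hypothetical and written in Java for consistency with the rest of this thread; the real file browser UI is a web page and would do this differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical paging helper over an already-fetched directory listing.
final class ListingPagerSketch {
    // Returns the entries of the zero-based page, or an empty list past the end.
    static <T> List<T> page(List<T> entries, int pageIndex, int pageSize) {
        if (pageSize <= 0 || pageIndex < 0) {
            throw new IllegalArgumentException("pageSize must be > 0 and pageIndex >= 0");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow
        if (from >= entries.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min((long) entries.size(), from + pageSize);
        return new ArrayList<>(entries.subList((int) from, to));
    }
}
```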






[jira] [Commented] (HDFS-10608) Include event for AddBlock in Inotify Event Stream

2017-01-04 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799932#comment-15799932
 ] 

Arpit Agarwal commented on HDFS-10608:
--

Hi [~churromorales], I am sorry you didn't get a timely committer review on 
this. I'll try to review it next week. I have never looked at INotify before so 
I'd have to spend some time to understand the feature first.

Alternatively you could try requesting a review on hdfs-dev at 
hadoop.apache.org and someone familiar with the feature may be able to review 
it.

> Include event for AddBlock in Inotify Event Stream
> --
>
> Key: HDFS-10608
> URL: https://issues.apache.org/jira/browse/HDFS-10608
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: churro morales
>Priority: Minor
> Attachments: HDFS-10608.patch, HDFS-10608.v1.patch, 
> HDFS-10608.v2.patch, HDFS-10608.v3.patch, HDFS-10608.v4.patch
>
>
> It would be nice to have an AddBlockEvent in the INotify pipeline.  Based on 
> discussions from mailing list:
> http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201607.mbox/%3C1467743792.4040080.657624289.7BE240AD%40webmail.messagingengine.com%3E






[jira] [Updated] (HDFS-10315) Fix TestRetryCacheWithHA and TestNamenodeRetryCache failures

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-10315:
--
Fix Version/s: (was: 2.8.0)

> Fix TestRetryCacheWithHA and TestNamenodeRetryCache failures
> 
>
> Key: HDFS-10315
> URL: https://issues.apache.org/jira/browse/HDFS-10315
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>
> {noformat}
> FAILED:  org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache.testRetryCacheRebuild
> Error Message:
> expected:<25> but was:<26>
> Stack Trace:
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache.testRetryCacheRebuild(TestNamenodeRetryCache.java:419)
> FAILED:  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN
> Error Message:
> expected:<25> but was:<26>
> Stack Trace:
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
> {noformat}






[jira] [Commented] (HDFS-10716) In Balancer, the target task should be removed when its size < 0.

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799925#comment-15799925
 ] 

Junping Du commented on HDFS-10716:
---

Sounds like we missed the JIRA number in the commit log.
{noformat}
commit cefa21e98a12b06602ee8000f8cef6c3b17af999
Author: Tsz-Wo Nicholas Sze 
Date:   Thu Aug 4 09:45:40 2016 -0700

In Balancer, the target task should be removed when its size < 0.  
Contributed by Yiqun Lin
{noformat}

> In Balancer, the target task should be removed when its size < 0.
> -
>
> Key: HDFS-10716
> URL: https://issues.apache.org/jira/browse/HDFS-10716
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1
>
> Attachments: HDFS-10716.001.patch, failing.log
>
>
> In HDFS-10602, we found a failing case in which the balancer keeps moving data 
> between 2 DNs, so the balancer never finishes. I debugged the code and found 
> what seems to be a bug in choosing pending blocks in 
> {{Dispatcher.Source.chooseNextMove}}.
> The code:
> {code}
> private PendingMove chooseNextMove() {
>   for (Iterator<Task> i = tasks.iterator(); i.hasNext();) {
>     final Task task = i.next();
>     final DDatanode target = task.target.getDDatanode();
>     final PendingMove pendingBlock = new PendingMove(this, task.target);
>     if (target.addPendingBlock(pendingBlock)) {
>       // target is not busy, so do a tentative block allocation
>       if (pendingBlock.chooseBlockAndProxy()) {
>         long blockSize = pendingBlock.reportedBlock.getNumBytes(this);
>         incScheduledSize(-blockSize);
>         task.size -= blockSize;
>         // If the size of bytes that need to be moved was first reduced
>         // to less than 0, it should also be removed.
>         if (task.size == 0) {
>           i.remove();
>         }
>         return pendingBlock;
> //...
> {code}
> The value of task.size was assigned in 
> {{Balancer#matchSourceWithTargetToMove}}
> {code}
> long size = Math.min(source.availableSizeToMove(),
>     target.availableSizeToMove());
> final Task task = new Task(target, size);
> {code}
> This value depends on the source and target nodes, and it cannot always be 
> reduced to exactly 0 while choosing pending blocks. As a result, the balancer 
> keeps moving data to the target node even after the number of bytes that need 
> to move has dropped below 0. This eventually makes the data in the cluster 
> imbalanced again, which leads to the next balancer run.
> We can optimize this as the title says; I think it can speed up the 
> balancer.
> See the logs for the failing case, or see HDFS-10602. (Concentrate on 
> the change record for the scheduled size of the target node. That is info I 
> added for debugging, like this.)
> {code}
> 2016-08-01 16:51:57,492 [pool-51-thread-1] INFO  balancer.Dispatcher 
> (Dispatcher.java:chooseNextMove(799)) - TargetNode: 58794, bytes scheduled to 
> move, after: -67, before: 33
> {code}
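
The effect of the {{== 0}} check can be shown in isolation. {{TaskRemovalSketch}} below is a hypothetical reduction of the loop to a single step; the numbers 33 and 100 are borrowed from the log line above (before: 33, after: -67) purely for illustration, and the {{<= 0}} variant reflects the change this issue's title proposes.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical reduction of chooseNextMove to the size bookkeeping only.
final class TaskRemovalSketch {
    static final class Task {
        long size; // bytes still to be moved for this task
        Task(long size) {
            this.size = size;
        }
    }

    // Schedules one block against the first task and removes the task once
    // its budget is used up.
    // strictZeroCheck = true mimics `task.size == 0`;
    // strictZeroCheck = false mimics `task.size <= 0`.
    static void scheduleBlock(List<Task> tasks, long blockSize, boolean strictZeroCheck) {
        Iterator<Task> i = tasks.iterator();
        if (!i.hasNext()) {
            return;
        }
        Task task = i.next();
        task.size -= blockSize;
        boolean done = strictZeroCheck ? task.size == 0 : task.size <= 0;
        if (done) {
            i.remove();
        }
    }
}
```

With a task of 33 bytes and a 100-byte block, the strict check leaves a task of size -67 in the list, so more blocks keep being scheduled to the same target; the relaxed check removes it.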






[jira] [Updated] (HDFS-10888) dfshealth.html#tab-datanode

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-10888:
--
Fix Version/s: (was: 2.8.0)

> dfshealth.html#tab-datanode
> ---
>
> Key: HDFS-10888
> URL: https://issues.apache.org/jira/browse/HDFS-10888
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 2.7.3
>Reporter: Alexey Ivanchin
>Assignee: Weiwei Yang
>
> When I click on the tab NN:50070/dfshealth.html#tab-overview I see the live 
> datanodes and other info. 
> When I click on the tab NN:50070/dfshealth.html#tab-datanode I see a blank 
> page. How can this be fixed?






[jira] [Commented] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799919#comment-15799919
 ] 

Junping Du commented on HDFS-11160:
---

From the commit log, it appears the commit only landed in branch-2 and not in 
branch-2.8. Replacing the fix version with 2.9 instead.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: CDH5.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, 
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch, 
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch, 
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
>
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may 
> erroneously detect good replicas as corrupt. This is serious because in some 
> cases it results in data loss if all replicas are declared corrupt. This bug 
> is especially prominent when there are a lot of append requests via 
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption rate 
> in a relatively small cluster. Initially, we thought HDFS-11056 was to blame. 
> However, after applying HDFS-11056, we are still seeing VolumeScanner 
> reporting corrupt replicas.
> It turns out that if a replica is being appended while VolumeScanner is 
> scanning it, VolumeScanner may use the new checksum to compare against old 
> data, causing checksum mismatch.
> I have a unit test to reproduce the error and will attach it later. A quick and 
> simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.
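
The general idea behind that fix, reading data and checksum as one consistent pair under a lock, can be sketched as follows. {{ReplicaChecksumSketch}} is an in-memory toy using {{java.util.zip.CRC32}}; it is an assumption-laden illustration and does not model FsDatasetImpl, on-disk meta files, or the actual patch.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical in-memory replica whose data and checksum change together.
final class ReplicaChecksumSketch {
    private final Object lock = new Object();
    private byte[] data = new byte[0];
    private long checksum = crc(data);

    static long crc(byte[] bytes) {
        CRC32 c = new CRC32();
        c.update(bytes, 0, bytes.length);
        return c.getValue();
    }

    // Appends under the lock so data and checksum are never out of step.
    void append(byte[] more) {
        synchronized (lock) {
            byte[] grown = Arrays.copyOf(data, data.length + more.length);
            System.arraycopy(more, 0, grown, data.length, more.length);
            data = grown;            // a new array is published, never mutated
            checksum = crc(grown);
        }
    }

    // The scanner takes both values under the same lock, then verifies
    // outside it, so it never pairs a new checksum with old data.
    boolean scan() {
        byte[] snapshot;
        long expected;
        synchronized (lock) {
            snapshot = data;
            expected = checksum;
        }
        return crc(snapshot) == expected;
    }
}
```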






[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-11160:
--
Fix Version/s: (was: 2.8.0)
   2.9.0

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: CDH5.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, 
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch, 
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch, 
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
>
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may 
> erroneously detect good replicas as corrupt. This is serious because in some 
> cases it results in data loss if all replicas are declared corrupt. This bug 
> is especially prominent when there are a lot of append requests via 
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption rate 
> in a relatively small cluster. Initially, we thought HDFS-11056 was to blame. 
> However, after applying HDFS-11056, we are still seeing VolumeScanner 
> reporting corrupt replicas.
> It turns out that if a replica is being appended while VolumeScanner is 
> scanning it, VolumeScanner may use the new checksum to compare against old 
> data, causing checksum mismatch.
> I have a unit test to reproduce the error and will attach it later. A quick and 
> simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.






[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-04 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799905#comment-15799905
 ] 

Ming Ma commented on HDFS-9391:
---

Good point. Actually it seems maintenanceOnlyReplicas is the same as 
outOfServiceOnlyReplicas in such a case. For example, say one replica is 
decommissioning and two are entering maintenance, both maintenanceOnlyReplicas 
and outOfServiceOnlyReplicas are incremented. In other words, 
maintenanceOnlyReplicas isn't strictly "all 3 replicas are maintenance". Maybe 
this new definition is more desirable. What do you think?

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, Maintenance webUI.png
>
>







[jira] [Commented] (HDFS-11170) Add create API in filesystem public class to support assign parameter through builder

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799870#comment-15799870
 ] 

Hadoop QA commented on HDFS-11170:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 18s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 14s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 42s{color} | {color:orange} root: The patch generated 12 new + 133 unchanged - 1 fixed = 145 total (was 134) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 58s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 56s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 8s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 26s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 38s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}148m 16s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
|  |  
org.apache.hadoop.hdfs.DistributedFileSystem$DistributedFileSystemCreateBuilder.getFavoredNodes()
 may expose internal representation by returning 
DistributedFileSystem$DistributedFileSystemCreateBuilder.favoredNodes  At 
DistributedFileSystem.java:by returning 
DistributedFileSystem$DistributedFileSystemCreateBuilder.favoredNodes  At 
DistributedFileSystem.java:[line 2567] |
|  |  
org.apache.hadoop.hdfs.DistributedFileSystem$DistributedFileSystemCreateBuilder.setFavoredNodes(InetSocketAddress[])
 may expose internal representation by storing an externally mutable object 
into DistributedFileSystem$DistributedFileSystemCreateBuilder.favoredNodes  At 
DistributedFileSystem.java:by storing an externally mutable object into 
DistributedFileSystem$DistributedFileSystemCreateBuilder.favoredNodes  At 
DistributedFileSystem.java:[line 2572] |
| Failed junit 

[jira] [Commented] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799806#comment-15799806
 ] 

Hadoop QA commented on HDFS-11292:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 
41s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845641/HDFS-11292.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8e542c7c6dff 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a0a2761 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18029/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18029/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch
>
>
> For the issue reported in HDFS-10943, even after HDFS-7964's fix is included, 
> the problem still 

[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-01-04 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799802#comment-15799802
 ] 

Sean Mackrory commented on HDFS-11096:
--

I've been doing a lot of testing. I've posted some automation here, we may want 
to hook into a Jenkins job or something: 
https://github.com/mackrorysd/hadoop-compatibility. I've tested running a bunch 
of MapReduce jobs while doing a rolling upgrade of HDFS, and haven't had any 
failures that indicate an incompatibility. I've also tested pulling data from 
an old cluster onto a new cluster. I'll keep adding other aspects to the tests 
to improve coverage.

I haven't seen a way to whitelist stuff. Filed an issue with jacc: 
https://github.com/lvc/japi-compliance-checker/issues/36.

As for the incompatibilities, I think there's relatively little action to be 
taken, so I'll file JIRAs for those. In detail: metrics and s3a are technically 
violating the contract, but in all cases keeping the old behavior would mean 
carrying some serious baggage, and given their nature I think the change is 
acceptable. I think SortedMapWritable should be put back but deprecated (I'm 
sure someone's depending on it somewhere and it should be trivial), and 
FileStatus should still implement Comparable. Not so sure about NameNodeMXBean, 
the missing configuration keys, or the cases of reduced visibility. I'm 
inclined to leave these as-is unless we know they break something that someone 
cares about. They are technically incompatibilities, so maybe 
someone else feels differently (or is aware of applications they are likely to 
break), but it would be nice to shed baggage and poor practices where we can. 
All other issues I feel more confident that they're either not actually 
breaking the contract or are extremely unlikely to break anything enough to 
warrant sticking with the old way. I'll sleep on some of these one more night 
and file JIRAs to start addressing the issues I think are important enough 
tomorrow.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799783#comment-15799783
 ] 

Hadoop QA commented on HDFS-11292:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 
34s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845641/HDFS-11292.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c097acb3e29c 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a0a2761 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18028/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18028/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch
>
>
> For the issue reported in HDFS-10943, even after HDFS-7964's fix is included, 
> the problem still 

[jira] [Updated] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-04 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDFS-11273:
--
Attachment: HDFS-11273.002.patch

Thank you [~jingzhao]. I have uploaded patch v02 with the new changes.

> Move TransferFsImage#doGetUrl function to a Util class
> --
>
> Key: HDFS-11273
> URL: https://issues.apache.org/jira/browse/HDFS-11273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11273.000.patch, HDFS-11273.001.patch, 
> HDFS-11273.002.patch
>
>
> TransferFsImage#doGetUrl downloads files from the specified url and stores 
> them in the specified storage location. HDFS-4025 plans to synchronize the 
> log segments in JournalNodes. If a log segment is missing from a JN, the JN 
> downloads it from another JN which has the required log segment. We need 
> TransferFsImage#doGetUrl and TransferFsImage#receiveFile to accomplish this. 
> So we propose to move the said functions to a Utility class so as to be able 
> to use it for JournalNode syncing as well, without duplication of code.






[jira] [Commented] (HDFS-11072) Add ability to unset and change directory EC policy

2017-01-04 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799712#comment-15799712
 ] 

Andrew Wang commented on HDFS-11072:


Thanks for the rev Sammi, looks like we're getting close. Agree that we can 
focus on getting this in, and worry about the replication policy later.

Some code review comments:

* nit: can combine these two lines into one in FSDirErasureCodingOp:

{code}
FSPermissionChecker pc = null;
pc = fsn.getPermissionChecker();
{code}

* removeErasureCodingPolicyXAttr can be private
* IOException text says "Attempt to unset an erasure coding policy from a 
file". I'd prefer we be more explicit about the error and say "Cannot unset the 
erasure coding policy on a file".
* Why do we need the new {{getLastCompleteINode}} method? IIUC an IIP has nulls 
if that path component doesn't exist, but that only happens when we're creating 
a new inode. Maybe we should test calling these APIs on a path that does not 
exist.
* In TestErasureCodingPolicies, should say "policies are supported" rather than 
"policies is supported" in both places
* Additional unit test ideas: setting and unsetting on a file, unsetting when 
not set, setting twice on the same directory with different policies
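
For reference, the first nit above (declaring and assigning in one statement) might look like the sketch below. The two small classes are hypothetical stand-ins that only borrow the names from the review comment so the snippet compiles by itself; they are not the real HDFS classes.

```java
// Hypothetical stand-ins so the snippet compiles on its own; they only mimic
// the names used in the review comment and are not the real HDFS classes.
final class FSPermissionChecker {}

final class FSNamesystem {
  FSPermissionChecker getPermissionChecker() {
    return new FSPermissionChecker();
  }
}

final class CombinedDeclarationSketch {
  static FSPermissionChecker checkerFor(FSNamesystem fsn) {
    // Declaration and assignment in one statement, instead of
    // initializing to null and assigning on the next line.
    FSPermissionChecker pc = fsn.getPermissionChecker();
    return pc;
  }
}
```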

> Add ability to unset and change directory EC policy
> ---
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch, HDFS-11072-v5.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.






[jira] [Updated] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
Attachment: HDFS-11292.002.patch

> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch
>
>
> For the issue reported in HDFS-10943, even after HDFS-7964's fix is included, 
> the problem still exists, which means there might be some synchronization 
> issue.
> To diagnose that, create this jira to report the lastWrittenTxId info in 
> the {{logSyncAll()}} call, such that we can compare against the error message 
> reported in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked 
> in between A and B. It's observed that the lastWrittenTxId in B and C are the 
> same.
> This proposed reporting would help confirm whether 2 is true.
> {code}
>  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
> LOG.info("Ending log segment " + curSegmentTxId);
> Preconditions.checkState(isSegmentOpen(),
> "Bad state: %s", state);
> if (writeEndTxn) {
>   logEdit(LogSegmentOp.getInstance(cache.get(),
>   FSEditLogOpCodes.OP_END_LOG_SEGMENT));
> }
> // always sync to ensure all edits are flushed.
> A.logSyncAll();
> B.printStatistics(true);
> final long lastTxId = getLastWrittenTxId();
> try {
> C.  journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
>   editLogStream = null;
> } catch (IOException e) {
>   //All journals have failed, it will be handled in logSync.
> }
> state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}
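
The kind of reporting proposed above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyEditLog` and its methods are hypothetical and merely borrow names from the quoted code; it is not the real FSEditLog and not the attached patch. It records the txid covered by the sync (statement A) and the last written txid read afterwards (the value passed at C), so that a non-zero gap would point at possibility 2.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy stand-in for an edit log (hypothetical; not the real FSEditLog). */
final class ToyEditLog {
  // Id of the last transaction handed to the log.
  private final AtomicLong lastWrittenTxId = new AtomicLong(0);
  // Id of the last transaction covered by a sync.
  private long lastSyncedTxId = 0;

  /** Writes one transaction and returns its id. */
  long logEdit() {
    return lastWrittenTxId.incrementAndGet();
  }

  long getLastWrittenTxId() {
    return lastWrittenTxId.get();
  }

  /** Marks everything written so far as flushed; returns the txid covered. */
  synchronized long logSyncAll() {
    lastSyncedTxId = lastWrittenTxId.get();
    return lastSyncedTxId;
  }

  /**
   * Builds a diagnostic line for ending a segment. The Runnable models the
   * window between the sync and the finalize step; a gap greater than zero
   * means transactions were written after the sync.
   */
  String endSegmentReport(Runnable betweenSyncAndFinalize) {
    long syncedUpTo = logSyncAll();        // corresponds to statement A
    betweenSyncAndFinalize.run();          // window where new txs could appear
    long lastTxId = getLastWrittenTxId();  // value that would be used at C
    return "syncedUpTo=" + syncedUpTo
        + " lastWrittenTxId=" + lastTxId
        + " gap=" + (lastTxId - syncedUpTo);
  }
}
```

With nothing written in the window the gap is 0; if a transaction is logged between the sync and the finalize step the gap becomes positive, which is the condition the extra log output is meant to expose.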






[jira] [Updated] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11156:
---
   Resolution: Fixed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

I ran TestFailureToReadEdits locally and it passed; it looks like a flake, 
particularly since it worked on JDK8.

I've committed this to branch-2, thanks again for the contribution 
[~cheersyang]!

> Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> 
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: BlockLocationProperties_JSON_Schema.jpg, 
> BlockLocations_JSON_Schema.jpg, FileStatuses_JSON_Schema.jpg, 
> HDFS-11156-branch-2.01.patch, HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch, 
> HDFS-11156.06.patch, HDFS-11156.07.patch, HDFS-11156.08.patch, 
> HDFS-11156.09.patch, HDFS-11156.10.patch, HDFS-11156.11.patch, 
> HDFS-11156.12.patch, HDFS-11156.13.patch, HDFS-11156.14.patch, 
> HDFS-11156.15.patch, HDFS-11156.16.patch, Output_JSON_format_v10.jpg, 
> SampleResponse_JSON.jpg
>
>
> The following WebHDFS REST API call
> {code}
> http://:/webhdfs/v1/?op=GET_BLOCK_LOCATIONS=0=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as Incompatible change as this will change the output of the 
> GET_BLOCK_LOCATIONS API.
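
The shape mismatch described above can be sketched with toy types. Everything here is hypothetical: `ToyLocatedBlock`, `ToyBlockLocation`, and `ToyLocationConverter` are illustration-only names and fields, not the real HDFS classes or the committed change. The point is only that the wrapper's block list has to be flattened into the per-block array that a `getFileBlockLocations` caller expects.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one entry in the "locatedBlocks" list (hypothetical). */
final class ToyLocatedBlock {
  final long startOffset;
  final long size;
  final String[] hosts;

  ToyLocatedBlock(long startOffset, long size, String... hosts) {
    this.startOffset = startOffset;
    this.size = size;
    this.hosts = hosts.clone();
  }
}

/** Toy model of the per-block view a FileSystem client expects (hypothetical). */
final class ToyBlockLocation {
  final long offset;
  final long length;
  final String[] hosts;

  ToyBlockLocation(long offset, long length, String[] hosts) {
    this.offset = offset;
    this.length = length;
    this.hosts = hosts.clone();
  }
}

final class ToyLocationConverter {
  /** Flattens the wrapper's block list into an array, one element per block. */
  static ToyBlockLocation[] toBlockLocations(List<ToyLocatedBlock> locatedBlocks) {
    List<ToyBlockLocation> out = new ArrayList<>();
    for (ToyLocatedBlock b : locatedBlocks) {
      out.add(new ToyBlockLocation(b.startOffset, b.size, b.hosts));
    }
    return out.toArray(new ToyBlockLocation[0]);
  }
}
```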






[jira] [Commented] (HDFS-9483) Documentation does not cover use of "swebhdfs" as URL scheme for SSL-secured WebHDFS.

2017-01-04 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799645#comment-15799645
 ] 

Chris Nauroth commented on HDFS-9483:
-

I'm +1 for patch 002.  [~brahmareddy], you had suggested perhaps a link to SSL 
configuration information.  Unfortunately, the only place I'm aware of with 
that information is the Encrypted Shuffle page, and it would be kind of odd to 
link to that from DistCp.  I'll hold off committing in case you have further 
comments.

> Documentation does not cover use of "swebhdfs" as URL scheme for SSL-secured 
> WebHDFS.
> -
>
> Key: HDFS-9483
> URL: https://issues.apache.org/jira/browse/HDFS-9483
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9483.001.patch, HDFS-9483.002.patch, HDFS-9483.patch
>
>
> If WebHDFS is secured with SSL, then you can use "swebhdfs" as the scheme in 
> a URL to access it.  The current documentation does not state this anywhere.
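
As a small illustration of the point above, the sketch below builds a WebHDFS URL whose scheme depends on whether the endpoint is SSL-secured. It uses only `java.net.URI`; the class name, host, port, and path are made-up placeholders, not values taken from the documentation patch.

```java
import java.net.URI;

final class WebHdfsSchemeExample {
  /** Picks "swebhdfs" for an SSL-secured endpoint and "webhdfs" otherwise. */
  static URI webHdfsUri(boolean sslSecured, String host, int port, String path) {
    String scheme = sslSecured ? "swebhdfs" : "webhdfs";
    return URI.create(scheme + "://" + host + ":" + port + path);
  }

  public static void main(String[] args) {
    // Placeholder host, port, and path; substitute the values for your cluster.
    System.out.println(webHdfsUri(true, "nn.example.com", 50470, "/user/alice"));
  }
}
```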






[jira] [Updated] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
Attachment: (was: HDFS-11292.002.patch)

> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch
>
>
> For the issue reported in HDFS-10943, even after HDFS-7964's fix is included, 
> the problem still exists, which means there might be some synchronization 
> issue.
> To diagnose that, create this jira to report the lastWrittenTxId info in 
> the {{logSyncAll()}} call, such that we can compare against the error message 
> reported in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked 
> in between A and B. It's observed that the lastWrittenTxId in B and C are the 
> same.
> This proposed reporting would help confirm whether 2 is true.
> {code}
>  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
> LOG.info("Ending log segment " + curSegmentTxId);
> Preconditions.checkState(isSegmentOpen(),
> "Bad state: %s", state);
> if (writeEndTxn) {
>   logEdit(LogSegmentOp.getInstance(cache.get(),
>   FSEditLogOpCodes.OP_END_LOG_SEGMENT));
> }
> // always sync to ensure all edits are flushed.
> A.logSyncAll();
> B.printStatistics(true);
> final long lastTxId = getLastWrittenTxId();
> try {
> C.  journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
>   editLogStream = null;
> } catch (IOException e) {
>   //All journals have failed, it will be handled in logSync.
> }
> state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}






[jira] [Commented] (HDFS-11193) [SPS]: Erasure coded files should be considered for satisfying storage policy

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799571#comment-15799571
 ] 

Hadoop QA commented on HDFS-11193:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
38s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 126 unchanged - 2 fixed = 127 total (was 128) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 14s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
39s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
| Timed out junit tests | org.apache.hadoop.hdfs.TestFileChecksum |
|   | org.apache.hadoop.hdfs.TestDFSXORStripedOutputStream |
|   | org.apache.hadoop.hdfs.TestDFSStripedInputStream |
|   | org.apache.hadoop.hdfs.server.balancer.TestBalancer |
|   | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11193 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845279/HDFS-11193-HDFS-10285-03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 5a0feaf02fce 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-10285 / 57193f7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/18025/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/18025/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18025/testReport/ |
| asflicense | 

[jira] [Commented] (HDFS-10759) Change fsimage bool isStriped from boolean to an enum

2017-01-04 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799545#comment-15799545
 ] 

Andrew Wang commented on HDFS-10759:


Hi [~ehiggs], I think you accidentally attached a patch for HDFS-11026 instead. 
FWIW, TestWebHDFS also passes for me on trunk.

> Change fsimage bool isStriped from boolean to an enum
> -
>
> Key: HDFS-10759
> URL: https://issues.apache.org/jira/browse/HDFS-10759
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1, 3.0.0-beta1, 3.0.0-alpha2
>Reporter: Ewan Higgs
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-10759.0001.patch, HDFS-10759.0002.patch, 
> HDFS-11026.003.patch
>
>
> The new erasure coding project has updated the protocol for fsimage such that 
> the {{INodeFile}} has a boolean '{{isStriped}}'. I think this is better as an 
> enum or integer since a boolean precludes any future block types. 
> For example:
> {code}
> enum BlockType {
>   CONTIGUOUS = 0;
>   STRIPED = 1;
> }
> {code}
> We can also make this more robust to future changes where there are different 
> block types supported in a staged rollout.  Here, we would use 
> {{UNKNOWN_BLOCK_TYPE}} as the first value since this is the default value. 
> See 
> [here|http://androiddevblog.com/protocol-buffers-pitfall-adding-enum-values/] 
> for more discussion.
> {code}
> enum BlockType {
>   UNKNOWN_BLOCK_TYPE = 0;
>   CONTIGUOUS = 1;
>   STRIPED = 2;
> }
> {code}
> But I'm not convinced this is necessary since there are other enums that 
> don't use this approach.
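
The second layout quoted above can be sketched in plain Java. This is only an illustration of the idea, not the fsimage code: the three constant names come from the description, while `fromWireValue` and `getWireValue` are hypothetical helpers. A number that this reader does not recognize maps to UNKNOWN_BLOCK_TYPE instead of being mistaken for a valid type.

```java
// Hypothetical illustration only; everything except the three constant names
// is an assumption and does not come from the HDFS code base.
enum BlockType {
  UNKNOWN_BLOCK_TYPE(0),
  CONTIGUOUS(1),
  STRIPED(2);

  private final int wireValue;

  BlockType(int wireValue) {
    this.wireValue = wireValue;
  }

  int getWireValue() {
    return wireValue;
  }

  // Map a number read from an image to a block type. Values this reader does
  // not know about map to UNKNOWN_BLOCK_TYPE rather than to a valid type.
  static BlockType fromWireValue(int value) {
    for (BlockType t : values()) {
      if (t.wireValue == value) {
        return t;
      }
    }
    return UNKNOWN_BLOCK_TYPE;
  }
}

class BlockTypeDemo {
  public static void main(String[] args) {
    System.out.println(BlockType.fromWireValue(2));
    System.out.println(BlockType.fromWireValue(7));
  }
}
```

With the first layout, a reader that falls back to the zero value would treat an unrecognized type as CONTIGUOUS, which is the pitfall the linked post discusses.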



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
Attachment: HDFS-11292.002.patch

> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch
>
>
> For the issue reported in HDFS-10943, even after HDFS-7964's fix is included, 
> the problem still exists, which suggests there might be a synchronization 
> issue.
> To diagnose that, this jira proposes reporting the lastWrittenTxId info in the 
> {{logSyncAll()}} call, so that we can compare it against the error message 
> reported in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason.
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked 
> in between A and B. It's observed that the lastWrittenTxId in B and C are the 
> same.
> The proposed reporting would help confirm whether 2 is true.
> {code}
>   public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
>     LOG.info("Ending log segment " + curSegmentTxId);
>     Preconditions.checkState(isSegmentOpen(),
>         "Bad state: %s", state);
>     if (writeEndTxn) {
>       logEdit(LogSegmentOp.getInstance(cache.get(),
>           FSEditLogOpCodes.OP_END_LOG_SEGMENT));
>     }
>     // always sync to ensure all edits are flushed.
>     logSyncAll();                                   // statement A
>     printStatistics(true);                          // statement B
>     final long lastTxId = getLastWrittenTxId();
>     try {
>       journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);  // statement C
>       editLogStream = null;
>     } catch (IOException e) {
>       //All journals have failed, it will be handled in logSync.
>     }
>     state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}
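
The comparison this jira wants to enable can be sketched with a toy model. Nothing below is the real FSEditLog: `ToyEditLog`, `editsSneakedIn`, and the two counters are assumptions made for illustration. The point is only that once the sync reports the txid it flushed up to, a later read of lastWrittenTxId can be compared against it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model, not the real FSEditLog: only the two counters needed to show
// the comparison are kept. All names here are assumptions for illustration.
class ToyEditLog {
  private final AtomicLong lastWrittenTxId = new AtomicLong(0);
  private long lastSyncedTxId = 0;

  // Append one edit and return its transaction id.
  long logEdit() {
    return lastWrittenTxId.incrementAndGet();
  }

  // Flush everything written so far and report the id that was flushed,
  // which is the kind of value the jira proposes to log.
  synchronized long logSyncAll() {
    lastSyncedTxId = lastWrittenTxId.get();
    System.out.println("logSyncAll flushed up to txid " + lastSyncedTxId);
    return lastSyncedTxId;
  }

  long getLastWrittenTxId() {
    return lastWrittenTxId.get();
  }
}

class ToyEditLogDemo {
  // True when edits were written after the sync, i.e. possibility 2.
  static boolean editsSneakedIn(long syncedTxId, long txIdAtFinalize) {
    return txIdAtFinalize > syncedTxId;
  }

  public static void main(String[] args) {
    ToyEditLog log = new ToyEditLog();
    log.logEdit();
    log.logEdit();
    long synced = log.logSyncAll();              // plays the role of statement A
    log.logEdit();                               // a late edit after the sync
    long atFinalize = log.getLastWrittenTxId();  // value handed to statement C
    System.out.println("sneaked in: " + editsSneakedIn(synced, atFinalize));
  }
}
```

If the two values always match in the real logs, possibility 2 becomes unlikely and attention would shift to possibility 1.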






[jira] [Updated] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-04 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7967:
--
Attachment: HDFS-7967-branch-2.8.patch

It wasn't as stale as I thought; I was mostly thrown off by author vs. commit 
dates and thought my repo was broken.

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks by selecting blocks 
> round-robin from the storages, based on remaining capacity.
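
One way to read the round-robin proposal is sketched below. It is a hedged toy, not the balancer or the attached patch: `ToyStorage` and `RoundRobinBlockPicker` are hypothetical, and ordering storages by least remaining capacity is an assumption about what "based on remaining capacity" could mean.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a DataNode storage; not an HDFS class.
class ToyStorage {
  final String id;
  final long remainingBytes;
  final List<Long> blockIds;

  ToyStorage(String id, long remainingBytes, List<Long> blockIds) {
    this.id = id;
    this.remainingBytes = remainingBytes;
    this.blockIds = blockIds;
  }
}

class RoundRobinBlockPicker {
  // Pick up to maxBlocks block ids, visiting the fullest storage first and
  // taking one block per storage per round. No block is scanned and skipped.
  static List<Long> pick(List<ToyStorage> storages, int maxBlocks) {
    List<ToyStorage> ordered = new ArrayList<>(storages);
    ordered.sort(Comparator.comparingLong((ToyStorage s) -> s.remainingBytes));

    Deque<Iterator<Long>> rotation = new ArrayDeque<>();
    for (ToyStorage s : ordered) {
      rotation.add(s.blockIds.iterator());
    }

    List<Long> picked = new ArrayList<>();
    while (picked.size() < maxBlocks && !rotation.isEmpty()) {
      Iterator<Long> it = rotation.poll();
      if (it.hasNext()) {
        picked.add(it.next());
        rotation.add(it); // storage rejoins the rotation
      }
      // An exhausted storage is simply dropped from the rotation.
    }
    return picked;
  }
}

class RoundRobinBlockPickerDemo {
  public static void main(String[] args) {
    List<ToyStorage> storages = Arrays.asList(
        new ToyStorage("disk-a", 100L, Arrays.asList(1L, 2L, 3L)),
        new ToyStorage("disk-b", 10L, Arrays.asList(10L)));
    System.out.println(RoundRobinBlockPicker.pick(storages, 3));
  }
}
```

Each pick costs one iterator step, in contrast to the random-skip approach described above where every skipped block has to be scanned.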






[jira] [Commented] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799519#comment-15799519
 ] 

Hadoop QA commented on HDFS-11156:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 23s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 8s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 31s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 12s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 8s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 13s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 57s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 33s{color} | {color:orange} hadoop-hdfs-project: The patch generated 2 new + 242 unchanged - 1 fixed = 244 total (was 243) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 4s{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 46s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}169m 13s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_111 Failed junit tests | hadoop.hdfs.server.namenode.TestStartup |
| JDK 

[jira] [Commented] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799513#comment-15799513
 ] 

Jing Zhao commented on HDFS-11273:
--

Thanks for updating the patch, [~hkoneru]! The updated patch is almost there. I 
have several extra minor comments (sorry I did not mention them in my last 
review...):
# The new Util#setTimeout method may be called for purposes other than image 
transfer, so it should no longer load the timeout value only from 
"DFS_IMAGE_TRANSFER_TIMEOUT_KEY". The code that loads the timeout from 
configuration can therefore stay in TransferFsImage#doGetUrl.
{code}
+  /**
+   * Sets a timeout value in millisecods for the Http connection.
+   * @param connection the Http connection for which timeout needs to be set
+   * @param timeout value to be set as timeout in milliseconds
+   */
+  public static void setTimeout(HttpURLConnection connection, int timeout) {
+    if (timeout <= 0) {
+      Configuration conf = new HdfsConfiguration();
+      timeout = conf.getInt(
+          DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_KEY,
+          DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_DEFAULT);
+      LOG.info("Image Transfer timeout configured to " + timeout
+          + " milliseconds");
+    }
+
+    if (timeout > 0) {
+      connection.setConnectTimeout(timeout);
+      connection.setReadTimeout(timeout);
+    }
+  }
{code}
# HttpGetFailedException can be defined as a top-level class and moved to 
the o.a.h.hdfs.server.common package.
# The following code can be reformatted.
{code}
+  public static MD5Hash doGetUrl(URL url, List localPaths,
+  Storage dstStorage, boolean getChecksum, URLConnectionFactory
+  connectionFactory, int ioFileBufferSize, boolean isSpnegoEnabled, int
+  timeout) throws IOException {
{code}
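
To make the first comment concrete, here is a rough sketch of the split being suggested. It is not the patch: `TimeoutUtil`, `resolveTimeout`, and `TimeoutUtilDemo` are hypothetical names, and only the standard `java.net` calls are real. The utility only applies a value, and the caller decides which configured default to fall back to.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch only: class names and method placement are assumptions, not the patch.
class TimeoutUtil {
  // Apply the timeout to both connect and read. Non-positive values leave
  // the connection's current settings untouched; no configuration is read.
  static void setTimeout(HttpURLConnection connection, int timeout) {
    if (timeout > 0) {
      connection.setConnectTimeout(timeout);
      connection.setReadTimeout(timeout);
    }
  }

  // Stand-in for the caller's side (doGetUrl in the review): the caller, not
  // the utility, picks the configured value to fall back to.
  static int resolveTimeout(int requested, int configuredDefault) {
    return requested > 0 ? requested : configuredDefault;
  }
}

class TimeoutUtilDemo {
  // openConnection() only creates the object; no network traffic happens here.
  static HttpURLConnection open(String url) {
    try {
      return (HttpURLConnection) new URL(url).openConnection();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    HttpURLConnection conn = open("http://localhost/");
    TimeoutUtil.setTimeout(conn, TimeoutUtil.resolveTimeout(0, 60000));
    System.out.println(conn.getConnectTimeout());
  }
}
```

A JournalNode caller could then pass its own default to `resolveTimeout` without the utility knowing about any image-transfer key.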

> Move TransferFsImage#doGetUrl function to a Util class
> --
>
> Key: HDFS-11273
> URL: https://issues.apache.org/jira/browse/HDFS-11273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11273.000.patch, HDFS-11273.001.patch
>
>
> TransferFsImage#doGetUrl downloads files from the specified url and stores 
> them in the specified storage location. HDFS-4025 plans to synchronize the 
> log segments in JournalNodes. If a log segment is missing from a JN, the JN 
> downloads it from another JN which has the required log segment. We need 
> TransferFsImage#doGetUrl and TransferFsImage#receiveFile to accomplish this. 
> So we propose to move the said functions to a Utility class so as to be able 
> to use it for JournalNode syncing as well, without duplication of code.






[jira] [Updated] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
Attachment: HDFS-11292.001.patch

> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch
>
>






[jira] [Updated] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
Attachment: (was: HDFS-11292.001.patch)

> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>






[jira] [Updated] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
Status: Patch Available  (was: Open)

> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch
>
>






[jira] [Updated] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
Attachment: HDFS-11292.001.patch

> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch
>
>






[jira] [Commented] (HDFS-11170) Add create API in filesystem public class to support assign parameter through builder

2017-01-04 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799505#comment-15799505
 ] 

Andrew Wang commented on HDFS-11170:


I also retriggered the precommit build manually; somehow it didn't run before.

> Add create API in filesystem public class to support assign parameter through 
> builder
> -
>
> Key: HDFS-11170
> URL: https://issues.apache.org/jira/browse/HDFS-11170
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: Wei Zhou
> Attachments: HDFS-11170-00.patch
>
>
> FileSystem class supports multiple create functions to help users create files. 
> Some create functions have many parameters, and it's hard for users to remember 
> these parameters and their order exactly. This task is to add builder-based 
> create functions to make creating files easier.






[jira] [Commented] (HDFS-11170) Add create API in filesystem public class to support assign parameter through builder

2017-01-04 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799493#comment-15799493
 ] 

Andrew Wang commented on HDFS-11170:


Thanks for working on this [~zhouwei], looks great overall :) A few review 
comments:

* Since {{path}} is a required parameter, it should be passed as an argument to 
the CreateBuilder constructor. This way we don't need to check it later.
* It looks like {{overwrite}} is sort of deprecated in favor of {{flags}}, so I 
think we should only support {{flags}}.
* What do you think about having a factory method in FileSystem to get the 
FS-specific builder? It's nice since then users can have generic code for the 
common CreateBuilder args, then test if it's a DFSCreateBuilder to 
conditionally set additional parameters. This way we don't need to pass in the 
FileSystem as a CreateBuilder argument too, it can set the correct defaults in 
the constructor.
* Would prefer we don't change the flags semantics to take null, the builder 
can set an empty EnumSet as appropriate.
* In FileSystem#doCreate and DFS#doCreate, I'd prefer we always call the same 
{{create}} method with the defaults filled in appropriately. I think this works 
if we stop supporting {{overwrite}} and add a new private {{DFS#create}} that 
takes {{checksumOpt}} and {{flags}} along with {{favoredNodes}}.
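
The shape these comments point toward might look roughly like the sketch below. Every type here is a hypothetical stand-in (`ToyCreateFlag`, `CreateBuilder`, `DfsCreateBuilder`, `ToyFileSystem`, `ToyDistributedFileSystem`), not a Hadoop class and not the attached patch. It shows a required path in the constructor, flags defaulting to an empty EnumSet rather than null, and a factory method that returns the FS-specific builder.

```java
import java.util.EnumSet;
import java.util.Objects;

// All types below are hypothetical stand-ins, not the Hadoop classes.
enum ToyCreateFlag { CREATE, OVERWRITE, APPEND }

class CreateBuilder {
  private final String path; // required, so it is a constructor argument
  private EnumSet<ToyCreateFlag> flags = EnumSet.noneOf(ToyCreateFlag.class);

  CreateBuilder(String path) {
    this.path = Objects.requireNonNull(path, "path");
  }

  // Flags are never null: callers pass an EnumSet, possibly an empty one.
  CreateBuilder flags(EnumSet<ToyCreateFlag> newFlags) {
    this.flags = EnumSet.copyOf(newFlags);
    return this;
  }

  String getPath() {
    return path;
  }

  EnumSet<ToyCreateFlag> getFlags() {
    return EnumSet.copyOf(flags);
  }
}

class DfsCreateBuilder extends CreateBuilder {
  private String[] favoredNodes = new String[0];

  DfsCreateBuilder(String path) {
    super(path);
  }

  DfsCreateBuilder favoredNodes(String... nodes) {
    this.favoredNodes = nodes.clone();
    return this;
  }

  String[] getFavoredNodes() {
    return favoredNodes.clone();
  }
}

class ToyFileSystem {
  // Factory method: each file system hands out its own builder type.
  CreateBuilder newCreateBuilder(String path) {
    return new CreateBuilder(path);
  }
}

class ToyDistributedFileSystem extends ToyFileSystem {
  @Override
  CreateBuilder newCreateBuilder(String path) {
    return new DfsCreateBuilder(path);
  }
}

class CreateBuilderDemo {
  public static void main(String[] args) {
    ToyFileSystem fs = new ToyDistributedFileSystem();
    CreateBuilder b = fs.newCreateBuilder("/tmp/f")
        .flags(EnumSet.of(ToyCreateFlag.CREATE, ToyCreateFlag.OVERWRITE));
    // Generic code can stop here; DFS-aware code sets the extra parameters.
    if (b instanceof DfsCreateBuilder) {
      ((DfsCreateBuilder) b).favoredNodes("dn1", "dn2");
    }
    System.out.println(b.getPath() + " " + b.getFlags());
  }
}
```

Because the builder comes from the file system itself, the caller never has to pass the FileSystem into the builder, which matches the third comment.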

> Add create API in filesystem public class to support assign parameter through 
> builder
> -
>
> Key: HDFS-11170
> URL: https://issues.apache.org/jira/browse/HDFS-11170
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: Wei Zhou
> Attachments: HDFS-11170-00.patch
>
>






[jira] [Updated] (HDFS-11290) TestFSNameSystemMBean should wait until the cache is cleared

2017-01-04 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-11290:
---
Attachment: HDFS-11290.000.patch

Attaching a patch which uses a {{Thread.sleep(1000)}} to force the JMX cache to 
expire. The way the metrics system is currently set up, you can't set a JMX 
cache TTL below 1 second; setting the period to 0 causes an 
{{IllegalArgumentException}} in {{MetricsSystemImpl#startTimer()}}. Open to 
suggestions on a better way to do this, but it's a little tricky: this is an 
integration-style test, so I can't easily hook into or modify the metrics 
system.

To verify, I changed {{FSNamesystem#getLastWrittenTransactionId}} to the 
following, which should cause both tests to fail. Without this patch, 
{{testWithFSEditLogLock}} can still pass, and so can 
{{testWithFSNamesystemWriteLock}} if I increase the JMX cache TTL to 20 
seconds, so the outcome is basically performance-dependent:

{code}
  public long getLastWrittenTransactionId() {
    readLock();
    readUnlock();
    return getEditLog().getLastWrittenTxId();
  }
{code}

[~ajisakaa], any thoughts? 
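
Why the sleep matters can be shown with a toy cache. This is a hedged sketch, not the Hadoop metrics system: `TtlCachedValue`, its fields, and `sleepQuietly` are assumptions made for illustration. While the entry is fresh, a read never reaches the source, so a lock held around the source would go unnoticed; after waiting past the TTL, the source is called again.

```java
import java.util.function.LongSupplier;

// Toy cache with a time-to-live. It is a stand-in for the JMX attribute
// cache, not the Hadoop metrics implementation.
class TtlCachedValue {
  private final LongSupplier source;
  private final long ttlNanos;
  private long cachedValue;
  private long loadedAtNanos;
  private boolean loaded = false;
  private int sourceCalls = 0;

  TtlCachedValue(LongSupplier source, long ttlMillis) {
    this.source = source;
    this.ttlNanos = ttlMillis * 1_000_000L;
  }

  synchronized long get() {
    long now = System.nanoTime();
    if (!loaded || now - loadedAtNanos >= ttlNanos) {
      // Only on this path would the real getter try to take a lock.
      cachedValue = source.getAsLong();
      sourceCalls++;
      loadedAtNanos = now;
      loaded = true;
    }
    return cachedValue;
  }

  synchronized int getSourceCalls() {
    return sourceCalls;
  }
}

class TtlCacheDemo {
  // Sleep helper that keeps the interrupt flag instead of throwing.
  static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    TtlCachedValue value = new TtlCachedValue(() -> 42L, 50);
    value.get();        // populates the cache
    sleepQuietly(100);  // wait past the TTL, as the patch does with 1000 ms
    value.get();        // has to call the source again
    System.out.println("source calls: " + value.getSourceCalls());
  }
}
```

A test that reads the value only once after the cache was populated would pass or fail depending on timing, which is the performance dependence described above.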

> TestFSNameSystemMBean should wait until the cache is cleared
> 
>
> Key: HDFS-11290
> URL: https://issues.apache.org/jira/browse/HDFS-11290
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Akira Ajisaka
>Assignee: Erik Krogen
> Attachments: HDFS-11290.000.patch
>
>
> TestFSNamesystemMBean#testWithFSNamesystemWriteLock and 
> #testWithFSEditLogLock get metrics after locking FSNameSystem/FSEditLog, but 
> when the metrics are cached, the tests succeed even if the metrics acquire 
> the locks. The tests should wait until the cache is cleared.
> This issue was reported by [~xkrogen] in HDFS-11180.






[jira] [Updated] (HDFS-11290) TestFSNameSystemMBean should wait until the cache is cleared

2017-01-04 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-11290:
---
Status: Patch Available  (was: Open)

> TestFSNameSystemMBean should wait until the cache is cleared
> 
>
> Key: HDFS-11290
> URL: https://issues.apache.org/jira/browse/HDFS-11290
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Akira Ajisaka
>Assignee: Erik Krogen
> Attachments: HDFS-11290.000.patch
>
>






[jira] [Commented] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799420#comment-15799420
 ] 

Hadoop QA commented on HDFS-11273:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 27s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 20 unchanged - 2 fixed = 22 total (was 22) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}107m 6s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11273 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845604/HDFS-11273.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 6740543e8724 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a0a2761 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/18023/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/18023/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18023/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18023/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Move 

[jira] [Commented] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-04 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799394#comment-15799394
 ] 

Daryn Sharp commented on HDFS-7967:
---

Removing stale 2.8 patch (based on earlier version), will repost shortly.

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.






[jira] [Updated] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-04 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7967:
--
Attachment: (was: HDFS-7967-branch-2.8.patch)

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.






[jira] [Updated] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-04 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7967:
--
Priority: Critical  (was: Major)

I'm inclined to make this a blocker since the balancer is virtually unusable 
w/o this patch, but I'll leave it up to others to decide.

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.






[jira] [Updated] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-04 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7967:
--
Attachment: HDFS-7967-branch-2.patch
HDFS-7967-branch-2.8.patch

We’ve been using a similar patch since 2.6 to prevent the balancer destroying 
the performance of large and/or dense clusters.  When getBlocks queries 
exceeded tens or hundreds of ms, we had to reduce the balancer dispatcher 
thread count from 200 down to 1-5 to avoid call queue overflow.  This change 
allows us to use 200 dispatchers with little to no performance impact.

Basic design changes:
# getBlocks queries are O(1) instead of O(n)
# avoids moving recently completed blocks to prevent disruption to active 
clients.
# evenly returns blocks from storages with the least remaining space.

The main issue is that the triplets currently represent a terminated LIFO 
list.  New blocks are inserted before the head, and become the new head.  
Hence the current implementation’s O(N) seek to a random location in the 
block list.  Worse, the random seek needlessly iterates through all the 
blocks of previous storages.

This design converts the triplets into a cyclic FIFO list.  New blocks are 
inserted as the tail - before the head, but the current head is not changed.  
The getBlocks query becomes O(1) by starting from the current head, returning 
the “oldest” completed blocks, then updating the head so the next query resumes 
where it left off. 
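
To make the list mechanics above concrete, here is a minimal Java sketch of a cyclic list with a resumable head.  It is an illustration only, not the patch's code: the class {{CyclicBlockList}}, its {{Node}} type, and the use of plain block ids are hypothetical, and the real triplets structure and the "completed block" check are omitted.  Starting a query needs no seek; the work is proportional only to the number of blocks returned.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a cyclic FIFO block list; not the HDFS triplets code. */
public class CyclicBlockList {
  private static final class Node {
    final long blockId;
    Node prev;
    Node next;
    Node(long blockId) { this.blockId = blockId; }
  }

  private Node head;  // where the next getBlocks() call resumes
  private int size;

  /** Inserts a block as the tail (just before head) without moving head. O(1). */
  public void add(long blockId) {
    Node n = new Node(blockId);
    if (head == null) {
      n.prev = n;
      n.next = n;
      head = n;
    } else {
      Node tail = head.prev;
      tail.next = n;
      n.prev = tail;
      n.next = head;
      head.prev = n;
    }
    size++;
  }

  /** Returns up to max blocks starting at head, then advances head past them. */
  public List<Long> getBlocks(int max) {
    List<Long> out = new ArrayList<>();
    int count = Math.min(max, size);
    for (int i = 0; i < count; i++) {
      out.add(head.blockId);
      head = head.next;  // the next query resumes here; no random seek
    }
    return out;
  }
}
```

In this sketch a block added between two queries lands just before the current head, so it is the last one reached in the current cycle.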

The block iterators track the size of the returned blocks so an “expected” 
remaining storage capacity determines the sorting order of a node’s storage 
iterators to maintain roughly the same free space across all storages.
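
The capacity-based ordering can be sketched as follows.  This is a hypothetical illustration, not code from the patch: {{StorageSelector}}, {{Storage}}, and {{pick}} are made-up names, and it assumes each returned block will leave its storage, so that storage's "expected" remaining capacity grows by the block's size.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch: pick blocks from the storage with the least expected free space. */
public class StorageSelector {
  public static final class Storage {
    final String id;
    long expectedRemaining;  // free bytes, adjusted as blocks are handed out
    final Deque<Long> blockSizes = new ArrayDeque<>();

    public Storage(String id, long remaining, long... sizes) {
      this.id = id;
      this.expectedRemaining = remaining;
      for (long s : sizes) {
        blockSizes.add(s);
      }
    }
  }

  /** Returns the ids of the storages chosen for each of up to n blocks, in order. */
  public static List<String> pick(List<Storage> storages, int n) {
    PriorityQueue<Storage> queue = new PriorityQueue<>(
        Comparator.comparingLong((Storage s) -> s.expectedRemaining));
    for (Storage s : storages) {
      if (!s.blockSizes.isEmpty()) {
        queue.add(s);
      }
    }
    List<String> picked = new ArrayList<>();
    while (picked.size() < n && !queue.isEmpty()) {
      Storage fullest = queue.poll();          // least expected remaining space
      long size = fullest.blockSizes.poll();   // next block from that storage
      fullest.expectedRemaining += size;       // block is expected to move away
      picked.add(fullest.id);
      if (!fullest.blockSizes.isEmpty()) {
        queue.add(fullest);                    // re-sort with the updated capacity
      }
    }
    return picked;
  }
}
```

Re-inserting a storage after each pick is what keeps the selection roughly even: once a storage's expected free space overtakes a sibling's, the sibling is served next.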

FBR processing currently reconciles inconsistencies (phantom blocks to remove 
from blocks map) by adding a delimiter block as head, moving all reported 
blocks to the head, removing blocks from the delimiter to list termination.  
This change adds the delimiter as the tail, moves all reported blocks to the 
tail, removes blocks from the head to the delimiter.

Note that HDFS-9260 replaced the triplets with a RB tree on trunk, so it 
requires a completely new design which I unfortunately do not have the cycles 
to implement.  [~sfriberg], can you create a trunk patch?

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.






[jira] [Commented] (HDFS-11193) [SPS]: Erasure coded files should be considered for satisfying storage policy

2017-01-04 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799327#comment-15799327
 ] 

Uma Maheswara Rao G commented on HDFS-11193:


Latest patch looks good to me. +1
pending jenkins

> [SPS]: Erasure coded files should be considered for satisfying storage policy
> -
>
> Key: HDFS-11193
> URL: https://issues.apache.org/jira/browse/HDFS-11193
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-11193-HDFS-10285-00.patch, 
> HDFS-11193-HDFS-10285-01.patch, HDFS-11193-HDFS-10285-02.patch, 
> HDFS-11193-HDFS-10285-03.patch
>
>
> Erasure coded striped files support the storage policies {{HOT, COLD, ALLSSD}}. 
> A {{HdfsAdmin#satisfyStoragePolicy}} API call on a directory should consider 
> all immediate files under that directory and check whether the files 
> really match the namespace storage policy. All the mismatched striped 
> blocks should be chosen for block movement.






[jira] [Updated] (HDFS-11259) Update fsck to display maintenance state info

2017-01-04 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11259:
--
Affects Version/s: 3.0.0-alpha1
 Target Version/s: 3.0.0-alpha2
   Status: Patch Available  (was: Open)

> Update fsck to display maintenance state info
> -
>
> Key: HDFS-11259
> URL: https://issues.apache.org/jira/browse/HDFS-11259
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11259.01.patch
>
>







[jira] [Updated] (HDFS-11259) Update fsck to display maintenance state info

2017-01-04 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11259:
--
Attachment: HDFS-11259.01.patch

Attached v01 patch to address the following:
1. Made {{NamenodeFsck}} gather details on maintenance replicas for the 
following checks and modules: blockIdCk(), getReplicaInfo(), 
collectBlockSummary() and Result

2. Updated {{TestFsck}} to verify maintenance replicas for the following 
commands and tests
* hdfs fsck /  (testFsckWithMaintenanceReplicas)
* hdfs fsck / -files -blocks -replicaDetails  (testFsckReplicaDetails())
* hdfs fsck -blockId blockid (testBlockIdCKMaintenance())
PS: TestFsck changes might show checkstyle issues for not using private 
variables. I chose to follow the usage pattern of all variables in the Result 
class. Maybe I can work on cleaning up the checkstyle issues for the whole of 
TestFsck later.
[~eddyxu], [~mingma], can you please review the patch and let me know your 
comments?

> Update fsck to display maintenance state info
> -
>
> Key: HDFS-11259
> URL: https://issues.apache.org/jira/browse/HDFS-11259
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11259.01.patch
>
>







[jira] [Commented] (HDFS-11288) Manually allow block replication/deletion in Safe Mode

2017-01-04 Thread Lukas Majercak (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799253#comment-15799253
 ] 

Lukas Majercak commented on HDFS-11288:
---

Attached a patch. [~esteban], I don't understand what you meant by the single 
user or admin group. Could you elaborate?

> Manually allow block replication/deletion in Safe Mode
> --
>
> Key: HDFS-11288
> URL: https://issues.apache.org/jira/browse/HDFS-11288
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Lukas Majercak
> Attachments: HDFS-11288.001.patch
>
>
> Currently, Safe Mode does not allow block replication/deletion, which 
> makes sense, especially on startup, as we do not want to replicate blocks 
> unnecessarily. 
> An issue we have seen in our clusters, though, is the NameNode getting 
> overwhelmed by the number of needed replications. In that case, we would 
> like to be able to manually put the NN in a state in which reads and writes 
> to the FS are disallowed but the NN continues replicating/deleting blocks.






[jira] [Updated] (HDFS-11288) Manually allow block replication/deletion in Safe Mode

2017-01-04 Thread Lukas Majercak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-11288:
--
Attachment: HDFS-11288.001.patch

> Manually allow block replication/deletion in Safe Mode
> --
>
> Key: HDFS-11288
> URL: https://issues.apache.org/jira/browse/HDFS-11288
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Lukas Majercak
> Attachments: HDFS-11288.001.patch
>
>
> Currently, Safe Mode does not allow block replication/deletion, which 
> makes sense, especially on startup, as we do not want to replicate blocks 
> unnecessarily. 
> An issue we have seen in our clusters, though, is the NameNode getting 
> overwhelmed by the number of needed replications. In that case, we would 
> like to be able to manually put the NN in a state in which reads and writes 
> to the FS are disallowed but the NN continues replicating/deleting blocks.






[jira] [Assigned] (HDFS-11290) TestFSNameSystemMBean should wait until the cache is cleared

2017-01-04 Thread Erik Krogen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen reassigned HDFS-11290:
--

Assignee: Erik Krogen

> TestFSNameSystemMBean should wait until the cache is cleared
> 
>
> Key: HDFS-11290
> URL: https://issues.apache.org/jira/browse/HDFS-11290
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Akira Ajisaka
>Assignee: Erik Krogen
>
> TestFSNamesystemMBean#testWithFSNamesystemWriteLock and 
> #testWithFSEditLogLock get metrics after locking FSNameSystem/FSEditLog, but 
> when the metrics are cached, the tests succeed even if the metrics acquire 
> the locks. The tests should wait until the cache is cleared.
> This issue was reported by [~xkrogen] in HDFS-11180.






[jira] [Updated] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-04 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDFS-11273:
--
Attachment: HDFS-11273.001.patch

Thank you [~jingzhao] for reviewing the patch. I have addressed your comments 
in patch v01.

> Move TransferFsImage#doGetUrl function to a Util class
> --
>
> Key: HDFS-11273
> URL: https://issues.apache.org/jira/browse/HDFS-11273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11273.000.patch, HDFS-11273.001.patch
>
>
> TransferFsImage#doGetUrl downloads files from the specified url and stores 
> them in the specified storage location. HDFS-4025 plans to synchronize the 
> log segments in JournalNodes. If a log segment is missing from a JN, the JN 
> downloads it from another JN which has the required log segment. We need 
> TransferFsImage#doGetUrl and TransferFsImage#receiveFile to accomplish this. 
> So we propose to move the said functions to a Utility class so as to be able 
> to use it for JournalNode syncing as well, without duplication of code.






[jira] [Updated] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-04 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDFS-11273:
--
Description: 
TransferFsImage#doGetUrl downloads files from the specified url and stores them 
in the specified storage location. HDFS-4025 plans to synchronize the log 
segments in JournalNodes. If a log segment is missing from a JN, the JN 
downloads it from another JN which has the required log segment. We need 
TransferFsImage#doGetUrl and TransferFsImage#receiveFile to accomplish this. 
So we propose to move the said functions to a Utility class so as to be able to 
use it for JournalNode syncing as well, without duplication of code.

  was:TransferFsImage#doGetUrl function is required for JournalNode syncing as 
well. We can move the code to a Utility class to avoid duplication of code.


> Move TransferFsImage#doGetUrl function to a Util class
> --
>
> Key: HDFS-11273
> URL: https://issues.apache.org/jira/browse/HDFS-11273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11273.000.patch
>
>
> TransferFsImage#doGetUrl downloads files from the specified url and stores 
> them in the specified storage location. HDFS-4025 plans to synchronize the 
> log segments in JournalNodes. If a log segment is missing from a JN, the JN 
> downloads it from another JN which has the required log segment. We need 
> TransferFsImage#doGetUrl and TransferFsImage#receiveFile to accomplish this. 
> So we propose to move the said functions to a Utility class so as to be able 
> to use it for JournalNode syncing as well, without duplication of code.






[jira] [Updated] (HDFS-11258) File mtime change could not save to editlog

2017-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11258:
---
Fix Version/s: (was: 3.0.0-alpha1)
   3.0.0-alpha2

> File mtime change could not save to editlog
> ---
>
> Key: HDFS-11258
> URL: https://issues.apache.org/jira/browse/HDFS-11258
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Critical
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: hdfs-11258-addendum-branch2.patch, hdfs-11258.1.patch, 
> hdfs-11258.2.patch, hdfs-11258.3.patch, hdfs-11258.4.patch
>
>
> When both mtime and atime are changed, and atime is not beyond the precision 
> limit, the mtime change is not saved to edit logs.






[jira] [Updated] (HDFS-10608) Include event for AddBlock in Inotify Event Stream

2017-01-04 Thread churro morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

churro morales updated HDFS-10608:
--
Status: Open  (was: Patch Available)

> Include event for AddBlock in Inotify Event Stream
> --
>
> Key: HDFS-10608
> URL: https://issues.apache.org/jira/browse/HDFS-10608
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: churro morales
>Priority: Minor
> Attachments: HDFS-10608.patch, HDFS-10608.v1.patch, 
> HDFS-10608.v2.patch, HDFS-10608.v3.patch, HDFS-10608.v4.patch
>
>
> It would be nice to have an AddBlockEvent in the INotify pipeline.  Based on 
> discussions from mailing list:
> http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201607.mbox/%3C1467743792.4040080.657624289.7BE240AD%40webmail.messagingengine.com%3E






[jira] [Resolved] (HDFS-10608) Include event for AddBlock in Inotify Event Stream

2017-01-04 Thread churro morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

churro morales resolved HDFS-10608.
---
Resolution: Won't Fix

Looks like nobody is interested in this patch, so I'll just close this off. 

> Include event for AddBlock in Inotify Event Stream
> --
>
> Key: HDFS-10608
> URL: https://issues.apache.org/jira/browse/HDFS-10608
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: churro morales
>Priority: Minor
> Attachments: HDFS-10608.patch, HDFS-10608.v1.patch, 
> HDFS-10608.v2.patch, HDFS-10608.v3.patch, HDFS-10608.v4.patch
>
>
> It would be nice to have an AddBlockEvent in the INotify pipeline.  Based on 
> discussions from mailing list:
> http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201607.mbox/%3C1467743792.4040080.657624289.7BE240AD%40webmail.messagingengine.com%3E






[jira] [Updated] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11156:
---
Status: Patch Available  (was: Reopened)

> Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> 
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 3.0.0-alpha2
>
> Attachments: BlockLocationProperties_JSON_Schema.jpg, 
> BlockLocations_JSON_Schema.jpg, FileStatuses_JSON_Schema.jpg, 
> HDFS-11156-branch-2.01.patch, HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch, 
> HDFS-11156.06.patch, HDFS-11156.07.patch, HDFS-11156.08.patch, 
> HDFS-11156.09.patch, HDFS-11156.10.patch, HDFS-11156.11.patch, 
> HDFS-11156.12.patch, HDFS-11156.13.patch, HDFS-11156.14.patch, 
> HDFS-11156.15.patch, HDFS-11156.16.patch, Output_JSON_format_v10.jpg, 
> SampleResponse_JSON.jpg
>
>
> The following WebHDFS REST API call
> {code}
> http://:/webhdfs/v1/?op=GET_BLOCK_LOCATIONS=0=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
>     "fileLength" : 1073741824,
>     "isLastBlockComplete" : true,
>     "isUnderConstruction" : false,
>     "lastLocatedBlock" : { ... },
>     "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as Incompatible change as this will change the output of the 
> GET_BLOCK_LOCATIONS API.






[jira] [Commented] (HDFS-11291) Avoid unnecessary edit log for setStoragePolicy() and setReplication()

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799120#comment-15799120
 ] 

Hadoop QA commented on HDFS-11291:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 27s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 277 unchanged - 0 fixed = 280 total (was 277) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 31s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 51s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}129m 29s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.TestSafeModeWithStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11291 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845579/HDFS-11291.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux f93d9067d781 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a0a2761 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/18019/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/18019/artifact/patchprocess/whitespace-eol.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/18019/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18019/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Updated] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11156:
---
Fix Version/s: (was: 2.9.0)

> Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> 
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 3.0.0-alpha2
>
> Attachments: BlockLocationProperties_JSON_Schema.jpg, 
> BlockLocations_JSON_Schema.jpg, FileStatuses_JSON_Schema.jpg, 
> HDFS-11156-branch-2.01.patch, HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch, 
> HDFS-11156.06.patch, HDFS-11156.07.patch, HDFS-11156.08.patch, 
> HDFS-11156.09.patch, HDFS-11156.10.patch, HDFS-11156.11.patch, 
> HDFS-11156.12.patch, HDFS-11156.13.patch, HDFS-11156.14.patch, 
> HDFS-11156.15.patch, HDFS-11156.16.patch, Output_JSON_format_v10.jpg, 
> SampleResponse_JSON.jpg
>
>
> Following webhdfs REST API
> {code}
> http://:/webhdfs/v1/?op=GET_BLOCK_LOCATIONS=0=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as Incompatible change as this will change the output of the 
> GET_BLOCK_LOCATIONS API.
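
To illustrate the mismatch described above, here is a minimal sketch of flattening a LocatedBlocks-style list into the array shape that a {{getFileBlockLocations}} caller would expect. The classes {{Located}} and {{Location}} and the method {{toBlockLocations}} are hypothetical stand-ins invented for this example; they are not the real Hadoop types, and the actual patch may do this differently.

```java
import java.util.Arrays;
import java.util.List;

final class BlockLocationSketch {

    // Hypothetical stand-in for one entry of the "locatedBlocks" JSON array.
    static final class Located {
        final long startOffset;
        final long numBytes;
        final String[] hosts;

        Located(long startOffset, long numBytes, String... hosts) {
            this.startOffset = startOffset;
            this.numBytes = numBytes;
            this.hosts = hosts;
        }
    }

    // Hypothetical stand-in for the client-facing block location.
    static final class Location {
        final String[] hosts;
        final long offset;
        final long length;

        Location(String[] hosts, long offset, long length) {
            this.hosts = hosts;
            this.offset = offset;
            this.length = length;
        }
    }

    // Drops the wrapper fields (fileLength, isUnderConstruction, ...) and
    // returns one array element per block, in the original order.
    static Location[] toBlockLocations(List<Located> blocks) {
        Location[] result = new Location[blocks.size()];
        for (int i = 0; i < blocks.size(); i++) {
            Located b = blocks.get(i);
            result[i] = new Location(b.hosts.clone(), b.startOffset, b.numBytes);
        }
        return result;
    }

    public static void main(String[] args) {
        Location[] locations = toBlockLocations(Arrays.asList(
                new Located(0L, 128L, "dn1", "dn2"),
                new Located(128L, 64L, "dn3")));
        for (Location loc : locations) {
            System.out.println(loc.offset + "+" + loc.length
                    + " on " + Arrays.toString(loc.hosts));
        }
    }
}
```

The point of the sketch is only the shape change: the wrapper object goes away and the client receives a plain array.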



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-11156:


Reopening for precommit run.

> Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> 
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 3.0.0-alpha2
>
> Attachments: BlockLocationProperties_JSON_Schema.jpg, 
> BlockLocations_JSON_Schema.jpg, FileStatuses_JSON_Schema.jpg, 
> HDFS-11156-branch-2.01.patch, HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch, 
> HDFS-11156.06.patch, HDFS-11156.07.patch, HDFS-11156.08.patch, 
> HDFS-11156.09.patch, HDFS-11156.10.patch, HDFS-11156.11.patch, 
> HDFS-11156.12.patch, HDFS-11156.13.patch, HDFS-11156.14.patch, 
> HDFS-11156.15.patch, HDFS-11156.16.patch, Output_JSON_format_v10.jpg, 
> SampleResponse_JSON.jpg
>
>
> Following webhdfs REST API
> {code}
> http://:/webhdfs/v1/?op=GET_BLOCK_LOCATIONS=0=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as Incompatible change as this will change the output of the 
> GET_BLOCK_LOCATIONS API.






[jira] [Updated] (HDFS-11028) libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel pending connections

2017-01-04 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-11028:
---
Attachment: HDFS-11028.HDFS-8707.002.patch

New patch, should be ready to review.

-Added C bindings that let the C style hdfsFS FileSystem be created and 
initialized without connecting so that the new hdfsCancelPendingConnection can 
be used to cancel it from another thread.

-Added a C example, runs under valgrind without leaks.

Testing has been done with the C++ and C examples as described in my previous 
comment: load a bad HA config so things hang, hit control-C, should exit 
immediately with operation canceled.  Should be able to run it under valgrind 
as well without errors.
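
For readers unfamiliar with the callback-swap idea in the issue description quoted below, here is a rough sketch of it. It is written in Java purely for illustration (libhdfs++ itself is C++), and the class {{CancelableOperation}}, its methods, and the status strings are all hypothetical names, not the library's API. Unlike the described design, the sketch invokes the original callback inline instead of posting it to an IoService.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

final class CancelableOperation {
    // Shared no-op callback; identity comparison tells us a swap already happened.
    private static final Consumer<String> NO_OP = status -> { };

    private final AtomicReference<Consumer<String>> callback;

    CancelableOperation(Consumer<String> userCallback) {
        this.callback = new AtomicReference<>(userCallback);
    }

    // User-visible cancel: atomically swap in the no-op callback, then hand
    // the original callback a "canceled" status so the caller is unblocked.
    boolean cancel() {
        Consumer<String> original = callback.getAndSet(NO_OP);
        if (original == NO_OP) {
            return false; // already canceled or already completed
        }
        original.accept("OperationCanceled");
        return true;
    }

    // Called when the request eventually finishes or times out. After a
    // cancel this only reaches the no-op callback and is effectively ignored.
    void complete(String status) {
        callback.getAndSet(NO_OP).accept(status);
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        CancelableOperation op = new CancelableOperation(seen::add);
        op.cancel();
        op.complete("OK"); // late completion, ignored
        System.out.println(seen);
    }
}
```

Because the swap is a single atomic exchange, exactly one of cancel or complete ever reaches the user's callback, which is the property the design relies on.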

> libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel 
> pending connections
> --
>
> Key: HDFS-11028
> URL: https://issues.apache.org/jira/browse/HDFS-11028
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-11028.HDFS-8707.000.patch, 
> HDFS-11028.HDFS-8707.001.patch, HDFS-11028.HDFS-8707.002.patch
>
>
> Cancel support is now reasonably robust except the case where a FileHandle 
> operation ends up causing the RpcEngine to try to create a new RpcConnection. 
>  In HA configs it's common to have something like 10-20 failovers and a 20 
> second failover delay (no exponential backoff just yet). This means that all 
> of the functions with synchronous interfaces can still block for many minutes 
> after an operation has been canceled, and often the cause of this is 
> something trivial like a bad config file.
> The current design makes this sort of thing tricky to do because the 
> FileHandles need to be individually cancelable via CancelOperations, but they 
> share the RpcEngine that does the async magic.
> Updated design:
> Original design would end up forcing lots of reconnects.  Not a huge issue on 
> an unauthenticated cluster but on a kerberized cluster this is a recipe for 
> Kerberos thinking we're attempting a replay attack.
> User visible cancellation and internal resources cleanup are separable 
> issues.  The former can be implemented by atomically swapping the callback of 
> the operation to be canceled with a no-op callback.  The original callback is 
> then posted to the IoService with an OperationCanceled status and the user is 
> no longer blocked.  For RPC cancels this is sufficient, it's not expensive to 
> keep a request around a little bit longer and when it's eventually invoked or 
> timed out it invokes the no-op callback and is ignored (other than a trace 
> level log notification).  Connect cancels push a flag down into the RPC 
> engine to kill the connection and make sure it doesn't attempt to reconnect.






[jira] [Commented] (HDFS-10943) rollEditLog expects empty EditsDoubleBuffer.bufCurrent which is not guaranteed

2017-01-04 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799075#comment-15799075
 ] 

Yongjun Zhang commented on HDFS-10943:
--

The issue remains even after HDFS-7964 is included. 

HDFS-7964 added the call {{logSyncAll()}} before the edit log rolling. So 
either this method did not finish its job correctly, or some new edits get in 
after the flush and before the edit log rolling. Created HDFS-11292 to help with 
the diagnosis.
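
The second possibility (an edit arriving after the flush but before the close) can be reproduced in a toy model. The sketch below is a simplified, single-threaded stand-in and not the real EditsDoubleBuffer; the class {{DoubleBufferSketch}} and its methods are hypothetical, and "flushing" is simulated by discarding the ready buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

final class DoubleBufferSketch {
    private ByteArrayOutputStream bufCurrent = new ByteArrayOutputStream();
    private ByteArrayOutputStream bufReady = new ByteArrayOutputStream();

    // Record an edit in the in-memory buffer.
    void write(byte[] edit) {
        bufCurrent.write(edit, 0, edit.length);
    }

    // Swap the buffers, then "flush" the ready one (here: just discard it).
    void flushAll() {
        ByteArrayOutputStream tmp = bufReady;
        bufReady = bufCurrent;
        bufCurrent = tmp;
        bufReady.reset();
    }

    // Same precondition as the close() quoted in the issue: bufCurrent must be empty.
    void close() throws IOException {
        int pending = bufCurrent.size();
        if (pending != 0) {
            throw new IOException("buffer has " + pending
                    + " bytes still to be flushed and cannot be closed.");
        }
        bufCurrent = null;
        bufReady = null;
    }

    // Returns true if close() fails; lateEdit simulates a write that sneaks
    // in between the flush and the close.
    static boolean closeFails(boolean lateEdit) {
        DoubleBufferSketch buf = new DoubleBufferSketch();
        buf.write(new byte[] {1, 2, 3});
        buf.flushAll();
        if (lateEdit) {
            buf.write(new byte[] {4});
        }
        try {
            buf.close();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("clean close fails: " + closeFails(false));
        System.out.println("close after late edit fails: " + closeFails(true));
    }
}
```

In the real NameNode the late write would come from another thread, which is why logging the transaction ids around the sync, as proposed in HDFS-11292, would help tell the two cases apart.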


> rollEditLog expects empty EditsDoubleBuffer.bufCurrent which is not guaranteed
> --
>
> Key: HDFS-10943
> URL: https://issues.apache.org/jira/browse/HDFS-10943
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>
> Per the following stack trace:
> {code}
> FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: finalize log 
> segment 10562075963, 10562174157 failed for required journal 
> (JournalAndStream(mgr=QJM to [0.0.0.1:8485, 0.0.0.2:8485, 0.0.0.3:8485, 
> 0.0.0.4:8485, 0.0.0.5:8485], stream=QuorumOutputStream starting at txid 
> 10562075963))
> java.io.IOException: FSEditStream has 49708 bytes still to be flushed and 
> cannot be closed.
> at 
> org.apache.hadoop.hdfs.server.namenode.EditsDoubleBuffer.close(EditsDoubleBuffer.java:66)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.close(QuorumOutputStream.java:65)
> at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalAndStream.closeStream(JournalSet.java:115)
> at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet$4.apply(JournalSet.java:235)
> at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
> at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.finalizeLogSegment(JournalSet.java:231)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1243)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1172)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1243)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:6437)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1002)
> at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:142)
> at 
> org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12025)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> 2016-09-23 21:40:59,618 WARN 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting 
> QuorumOutputStream starting at txid 10562075963
> {code}
> The exception is from  EditsDoubleBuffer
> {code}
>  public void close() throws IOException {
> Preconditions.checkNotNull(bufCurrent);
> Preconditions.checkNotNull(bufReady);
> int bufSize = bufCurrent.size();
> if (bufSize != 0) {
>   throw new IOException("FSEditStream has " + bufSize
>   + " bytes still to be flushed and cannot be closed.");
> }
> IOUtils.cleanup(null, bufCurrent, bufReady);
> bufCurrent = bufReady = null;
>   }
> {code}
> We can see that FSNamesystem.rollEditLog expects  
> EditsDoubleBuffer.bufCurrent to be empty.
> Edits are recorded via FSEditLog$logSync, which does:
> {code}
>* The data is double-buffered within each edit log implementation so that
>* in-memory writing can occur in parallel with the on-disk writing.
>*
>* Each sync occurs in three steps:
>*   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>*  flag.
>*   2. unsynchronized, it flushes the data to storage
>*   3. synchronized, it resets the flag and notifies anyone waiting on the
>*  sync.
>*
>* The lack of synchronization on step 2 allows other threads to continue
>* to write into the memory buffer while the sync is in progress.
>* Because this step is unsynchronized, actions that need to avoid
>* concurrency with sync() should 

[jira] [Assigned] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang reassigned HDFS-11292:


Assignee: Yongjun Zhang

> log lastWrittenTxId in logSyncAll
> -
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> For the issue reported in HDFS-10943, the problem still exists even after 
> HDFS-7964's fix is included, which means there might be some synchronization 
> issue.
> To diagnose that, this jira is created to report the lastWrittenTxId info in the 
> {{logSyncAll()}} call, so that we can compare it against the error message 
> reported in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason.
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked 
> in between A and B. It's observed that the lastWrittenTxId in B and C are the 
> same.
> This proposed reporting would help confirm whether 2 is true.
> {code}
>  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
> LOG.info("Ending log segment " + curSegmentTxId);
> Preconditions.checkState(isSegmentOpen(),
> "Bad state: %s", state);
> if (writeEndTxn) {
>   logEdit(LogSegmentOp.getInstance(cache.get(),
>   FSEditLogOpCodes.OP_END_LOG_SEGMENT));
> }
> // always sync to ensure all edits are flushed.
> A.logSyncAll();
> B.printStatistics(true);
> final long lastTxId = getLastWrittenTxId();
> try {
> C.  journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
>   editLogStream = null;
> } catch (IOException e) {
>   //All journals have failed, it will be handled in logSync.
> }
> state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}






[jira] [Created] (HDFS-11292) log lastWrittenTxId in logSyncAll

2017-01-04 Thread Yongjun Zhang (JIRA)
Yongjun Zhang created HDFS-11292:


 Summary: log lastWrittenTxId in logSyncAll
 Key: HDFS-11292
 URL: https://issues.apache.org/jira/browse/HDFS-11292
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: Yongjun Zhang


For the issue reported in HDFS-10943, the problem still exists even after 
HDFS-7964's fix is included, which means there might be some synchronization issue.

To diagnose that, this jira is created to report the lastWrittenTxId info in the 
{{logSyncAll()}} call, so that we can compare it against the error message 
reported in HDFS-7964.

Specifically, there are two possibilities for the HDFS-10943 issue:

1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
requested txs for some reason.

2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked in 
between A and B. It's observed that the lastWrittenTxId in B and C are the same.

This proposed reporting would help confirm whether 2 is true.

{code}
 public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
LOG.info("Ending log segment " + curSegmentTxId);
Preconditions.checkState(isSegmentOpen(),
"Bad state: %s", state);

if (writeEndTxn) {
  logEdit(LogSegmentOp.getInstance(cache.get(),
  FSEditLogOpCodes.OP_END_LOG_SEGMENT));
}
// always sync to ensure all edits are flushed.
A.logSyncAll();

B.printStatistics(true);

final long lastTxId = getLastWrittenTxId();

try {
C.  journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
  editLogStream = null;
} catch (IOException e) {
  //All journals have failed, it will be handled in logSync.
}

state = State.BETWEEN_LOG_SEGMENTS;
  }
{code}







[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-01-04 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799057#comment-15799057
 ] 

Andrew Wang commented on HDFS-11096:


Thanks for the very thorough analysis here Sean! Do you think we should act on 
any of these changes? If so we can chase them in some other JIRAs.

Also, do you have any ideas on how we can whitelist items in the JACC report? 
Otherwise it's hard to see what's changing from report to report.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.






[jira] [Commented] (HDFS-11279) Cleanup unused DataNode#checkDiskErrorAsync()

2017-01-04 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799036#comment-15799036
 ] 

Hanisha Koneru commented on HDFS-11279:
---

Thank you [~xyao] for committing the patch.

> Cleanup unused DataNode#checkDiskErrorAsync()
> -
>
> Key: HDFS-11279
> URL: https://issues.apache.org/jira/browse/HDFS-11279
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiaoyu Yao
>Assignee: Hanisha Koneru
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-11279.000.patch, HDFS-11279.001.patch, 
> HDFS-11279.002.patch
>
>
> After HDFS-11274, we will not trigger checking all datanode volumes upon IO 
> failure on a single volume. This leaves the original implementations of 
> DataNode#checkDiskErrorAsync and DatasetVolumeChecker#checkAllVolumesAsync() 
> unused in any of the production code. 
> This ticket is opened to remove this unused code and related tests, if any. 






[jira] [Commented] (HDFS-11180) Intermittent deadlock in NameNode when failover happens.

2017-01-04 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799032#comment-15799032
 ] 

Erik Krogen commented on HDFS-11180:


Thanks [~ajisakaa]! I will do a little investigation and see if I can take on 
that ticket.

> Intermittent deadlock in NameNode when failover happens.
> 
>
> Key: HDFS-11180
> URL: https://issues.apache.org/jira/browse/HDFS-11180
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Abhishek Modi
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: high-availability
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2, 2.6.6
>
> Attachments: HDFS-11180-branch-2.01.patch, 
> HDFS-11180-branch-2.6.01.patch, HDFS-11180-branch-2.7.01.patch, 
> HDFS-11180-branch-2.8.01.patch, HDFS-11180.00.patch, HDFS-11180.01.patch, 
> HDFS-11180.02.patch, HDFS-11180.03.patch, HDFS-11180.04.patch, jstack.log
>
>
> It happens because metrics are updated at the same time as the failover. 
> Please find attached the jstack taken at that point in time.






[jira] [Commented] (HDFS-11028) libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel pending connections

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798799#comment-15798799
 ] 

Hadoop QA commented on HDFS-11028:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m  6s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 20s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 22s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 17s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 13s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 41s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 22s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 32s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 15s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:78fc6b6 |
| JIRA Issue | HDFS-11028 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845561/HDFS-11028.HDFS-8707.001.patch |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 401f5f1fa8f1 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-8707 / 2ceec2b |
| Default Java | 1.7.0_121 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_111 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121 |
| JDK v1.7.0_121  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18018/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18018/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel 
> pending connections
> --
>
> Key: HDFS-11028
> URL: https://issues.apache.org/jira/browse/HDFS-11028
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James 

[jira] [Updated] (HDFS-11291) Avoid unnecessary edit log for setStoragePolicy() and setReplication()

2017-01-04 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-11291:
--
Status: Patch Available  (was: Open)

> Avoid unnecessary edit log for setStoragePolicy() and setReplication()
> --
>
> Key: HDFS-11291
> URL: https://issues.apache.org/jira/browse/HDFS-11291
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-11291.001.patch
>
>
> We set the storage policy for a file without checking the file's current 
> policy, in order to avoid an extra getStoragePolicy() RPC call. Currently the 
> namenode does not check the current storage policy before setting the new one 
> and adding edit logs. I think if the old and new storage policies are the same, 
> we can avoid the set operation.






[jira] [Updated] (HDFS-11291) Avoid unnecessary edit log for setStoragePolicy() and setReplication()

2017-01-04 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-11291:
--
Attachment: HDFS-11291.001.patch

Attached patch.
Please review.

> Avoid unnecessary edit log for setStoragePolicy() and setReplication()
> --
>
> Key: HDFS-11291
> URL: https://issues.apache.org/jira/browse/HDFS-11291
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-11291.001.patch
>
>
> We set the storage policy for a file without checking the file's current 
> policy, in order to avoid an extra getStoragePolicy() RPC call. Currently the 
> namenode does not check the current storage policy before setting the new one 
> and adding edit logs. I think if the old and new storage policies are the same, 
> we can avoid the set operation.






[jira] [Updated] (HDFS-11028) libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel pending connections

2017-01-04 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-11028:
---
Attachment: HDFS-11028.HDFS-8707.001.patch

Updated patch:
-Isolated the connection cancel logic from the general RPC cancel logic, this 
patch just does connection.
-Cleaned up an example that can also be used as a simple test for cancel

To test:
1) Build libhdfs++, set $HADOOP_CONF_DIR to some valid configs for a running 
cluster (best to have an HA cluster).  It should go connect to the cluster.
2) Now copy the good config and do something like replace all of the NN port 
numbers with something invalid so libhdfs keeps getting connection refused or 
timeout errors.  You should be able to quit early with Control-C.

Everything should be fairly clean under valgrind.  There are a few statically 
initialized objects that make noise but it shouldn't be anything from inside 
libhdfs++.

Todo:
-Simple C binding to set up an hdfsFS without connection so it can be passed to 
an hdfsCancelPendingConnect function.

> libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel 
> pending connections
> --
>
> Key: HDFS-11028
> URL: https://issues.apache.org/jira/browse/HDFS-11028
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-11028.HDFS-8707.000.patch, 
> HDFS-11028.HDFS-8707.001.patch
>
>
> Cancel support is now reasonably robust except the case where a FileHandle 
> operation ends up causing the RpcEngine to try to create a new RpcConnection. 
>  In HA configs it's common to have something like 10-20 failovers and a 20 
> second failover delay (no exponential backoff just yet). This means that all 
> of the functions with synchronous interfaces can still block for many minutes 
> after an operation has been canceled, and often the cause of this is 
> something trivial like a bad config file.
> The current design makes this sort of thing tricky to do because the 
> FileHandles need to be individually cancelable via CancelOperations, but they 
> share the RpcEngine that does the async magic.
> Updated design:
> Original design would end up forcing lots of reconnects.  Not a huge issue on 
> an unauthenticated cluster but on a kerberized cluster this is a recipe for 
> Kerberos thinking we're attempting a replay attack.
> User visible cancellation and internal resources cleanup are separable 
> issues.  The former can be implemented by atomically swapping the callback of 
> the operation to be canceled with a no-op callback.  The original callback is 
> then posted to the IoService with an OperationCanceled status and the user is 
> no longer blocked.  For RPC cancels this is sufficient, it's not expensive to 
> keep a request around a little bit longer and when it's eventually invoked or 
> timed out it invokes the no-op callback and is ignored (other than a trace 
> level log notification).  Connect cancels push a flag down into the RPC 
> engine to kill the connection and make sure it doesn't attempt to reconnect.






[jira] [Created] (HDFS-11291) Avoid unnecessary edit log for setStoragePolicy() and setReplication()

2017-01-04 Thread Surendra Singh Lilhore (JIRA)
Surendra Singh Lilhore created HDFS-11291:
-

 Summary: Avoid unnecessary edit log for setStoragePolicy() and 
setReplication()
 Key: HDFS-11291
 URL: https://issues.apache.org/jira/browse/HDFS-11291
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore


We currently set the storage policy for a file without checking the file's 
current policy, in order to avoid an extra getStoragePolicy() RPC call. The 
namenode likewise does not check the current storage policy before setting the 
new one and adding edit logs. I think that if the old and new storage policies 
are the same, we can skip the set operation.
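
A minimal sketch of the proposed check, assuming a toy in-memory namespace: PolicyNamespaceSketch, its edit-log strings, and the policy id values are hypothetical and are not NameNode code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy namespace for illustration: path -> policy id, plus a fake edit log.
final class PolicyNamespaceSketch {
    private final Map<String, Byte> policyByPath = new HashMap<>();
    final List<String> editLog = new ArrayList<>();

    // Returns true only when the policy changed and an edit was logged.
    boolean setStoragePolicy(String path, byte newPolicyId) {
        Byte current = policyByPath.get(path);
        if (current != null && current == newPolicyId) {
            return false; // same policy: nothing to change, nothing to log
        }
        policyByPath.put(path, newPolicyId);
        editLog.add("SET_STORAGE_POLICY " + path + " " + newPolicyId);
        return true;
    }
}

final class PolicyNamespaceDemo {
    public static void main(String[] args) {
        PolicyNamespaceSketch ns = new PolicyNamespaceSketch();
        ns.setStoragePolicy("/data/f", (byte) 7); // illustrative policy id
        ns.setStoragePolicy("/data/f", (byte) 7); // repeat: skipped
        System.out.println(ns.editLog.size());    // prints 1
    }
}
```

The comparison happens on the server side, so the client still avoids the extra getStoragePolicy() call mentioned above.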






[jira] [Commented] (HDFS-11282) Document the missing metrics of DataNode Volume IO operations

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798224#comment-15798224
 ] 

Hadoop QA commented on HDFS-11282:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 29s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 18m 19s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11282 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845528/HDFS-11282.003.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux a97339cd1261 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e49e0a6 |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18017/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Document the missing metrics of DataNode Volume IO operations
> -
>
> Key: HDFS-11282
> URL: https://issues.apache.org/jira/browse/HDFS-11282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11282.001.patch, HDFS-11282.002.patch, 
> HDFS-11282.003.patch, metrics-rendered.png
>
>
> HDFS-10959 added many metrics for DataNode volume I/O operations, but they 
> haven't been documented. This JIRA addresses that.






[jira] [Updated] (HDFS-11282) Document the missing metrics of DataNode Volume IO operations

2017-01-04 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11282:
-
Attachment: HDFS-11282.003.patch

> Document the missing metrics of DataNode Volume IO operations
> -
>
> Key: HDFS-11282
> URL: https://issues.apache.org/jira/browse/HDFS-11282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11282.001.patch, HDFS-11282.002.patch, 
> HDFS-11282.003.patch, metrics-rendered.png
>
>
> HDFS-10959 added many metrics for DataNode volume I/O operations, but they 
> haven't been documented. This JIRA addresses that.






[jira] [Updated] (HDFS-11282) Document the missing metrics of DataNode Volume IO operations

2017-01-04 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11282:
-
Attachment: (was: HDFS-11282.003.patch)

> Document the missing metrics of DataNode Volume IO operations
> -
>
> Key: HDFS-11282
> URL: https://issues.apache.org/jira/browse/HDFS-11282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11282.001.patch, HDFS-11282.002.patch, 
> HDFS-11282.003.patch, metrics-rendered.png
>
>
> HDFS-10959 added many metrics for DataNode volume I/O operations, but they 
> haven't been documented. This JIRA addresses that.






[jira] [Commented] (HDFS-11191) Datanode Capacity is misleading if the dfs.datanode.data.dir is configured with two directories from the same file system.

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798133#comment-15798133
 ] 

Hadoop QA commented on HDFS-11191:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 33s{color} | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 387 unchanged - 7 fixed = 388 total (was 394) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m  0s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}121m 29s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.net.TestNetworkTopology |
|   | hadoop.hdfs.server.blockmanagement.TestBlockReportRateLimiting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11191 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845505/HDFS-11191.05.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  cc  |
| uname | Linux 297b0273de23 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e49e0a6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/18015/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt |
| unit | 

[jira] [Updated] (HDFS-11282) Document the missing metrics of DataNode Volume IO operations

2017-01-04 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11282:
-
Attachment: HDFS-11282.003.patch

Thanks for the patient review, [~arpitagarwal].
{quote}
The description for the count metrics and the rate metric is the same e.g. 
TotalMetadataOperations and MetadataOperationRateNumOps. I am not sure that is 
correct.
{quote}
I looked into this. The main difference between {{MutableCounter}} and 
{{MutableRate}} is that the value of the former is monotonically increasing, 
while the latter is reset after some interval. The other comments have been 
addressed in the latest patch. A new patch is attached; please review. Thanks.
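
The distinction drawn in this comment can be illustrated with a rough sketch. CounterSketch and RateSketch are hypothetical simplifications written for this note, not the Hadoop metrics2 classes, and the real {{MutableCounter}} and {{MutableRate}} may differ in detail.

```java
// Hypothetical simplifications for illustration; not the Hadoop metrics2 classes.
final class CounterSketch {
    private long value;

    void incr() {
        value++;
    }

    // Reading the counter never clears it, so values only grow.
    long snapshot() {
        return value;
    }
}

final class RateSketch {
    private long intervalOps;
    private long intervalMillis;

    void add(long elapsedMillis) {
        intervalOps++;
        intervalMillis += elapsedMillis;
    }

    // Returns the average time per operation for the interval, then clears it.
    double snapshotAvgMillis() {
        double avg = intervalOps == 0 ? 0.0 : (double) intervalMillis / intervalOps;
        intervalOps = 0;
        intervalMillis = 0;
        return avg;
    }
}

final class MetricsSketchDemo {
    public static void main(String[] args) {
        CounterSketch total = new CounterSketch();
        RateSketch rate = new RateSketch();
        for (long ms : new long[] {10, 20}) {
            total.incr();
            rate.add(ms);
        }
        System.out.println(total.snapshot() + " " + rate.snapshotAvgMillis()); // 2 15.0
        System.out.println(total.snapshot() + " " + rate.snapshotAvgMillis()); // 2 0.0
    }
}
```

In this sketch the second snapshot shows the difference: the counter still reports 2, while the rate reports 0.0 because its interval state was cleared by the first snapshot.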


> Document the missing metrics of DataNode Volume IO operations
> -
>
> Key: HDFS-11282
> URL: https://issues.apache.org/jira/browse/HDFS-11282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0-alpha2
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11282.001.patch, HDFS-11282.002.patch, 
> HDFS-11282.003.patch, metrics-rendered.png
>
>
> HDFS-10959 added many metrics for DataNode volume I/O operations, but they 
> haven't been documented. This JIRA addresses that.






[jira] [Updated] (HDFS-11191) Datanode Capacity is misleading if the dfs.datanode.data.dir is configured with two directories from the same file system.

2017-01-04 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-11191:
---
Attachment: HDFS-11191.05.patch

> Datanode Capacity is misleading if the dfs.datanode.data.dir is configured 
> with two directories from the same file system.
> --
>
> Key: HDFS-11191
> URL: https://issues.apache.org/jira/browse/HDFS-11191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.5.0
> Environment: SLES 11SP3
> HDP 2.5.0
>Reporter: Deepak Chander
>Assignee: Weiwei Yang
>  Labels: capacity, datanode, storage, user-experience
> Attachments: HDFS-11191.01.patch, HDFS-11191.02.patch, 
> HDFS-11191.03.patch, HDFS-11191.04.patch, HDFS-11191.05.patch
>
>
> In the output of “hdfs dfsadmin -report”, the Configured Capacity is misleading 
> if dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> hdfs@kimtest1:~> hdfs dfsadmin -report
> Configured Capacity: 239942369274 (223.46 GB)
> Present Capacity: 207894724602 (193.62 GB)
> DFS Remaining: 207894552570 (193.62 GB)
> DFS Used: 172032 (168 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9528000512 (8.87 GB)
> DFS Remaining: 70452731902 (65.61 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 13010952192 (12.12 GB)
> DFS Remaining: 66969780222 (62.37 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 83.73%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.79.86:50010 (kimtest2)
> Hostname: kimtest2
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9508691968 (8.86 GB)
> DFS Remaining: 70472040446 (65.63 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> If you look at my datanode root file system, its size is only 38 GB:
> kimtest3:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> kimtest4:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  4.2G   32G  12% /
> kimtest2:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> The following is from the hdfs-site.xml file:
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn, file:///grid1/hadoop/hdfs/dn</value>
> </property>
> I removed the other directory, grid1, and restarted the datanode process.
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn</value>
> </property>
> Now the size is reported correctly:
> hdfs@kimtest1:/grid> hdfs dfsadmin -report
> Configured Capacity: 119971184637 (111.73 GB)
> Present Capacity: 103947243517 (96.81 GB)
> DFS Remaining: 103947157501 (96.81 GB)
> DFS Used: 86016 (84 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 KB)
> Non DFS Used: 4764057600 (4.44 GB)
> DFS Remaining: 35226308607 (32.81 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 07:34:02 PST 2016
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 KB)
> Non DFS Used: 6505525248 (6.06 GB)
> DFS Remaining: 33484840959 
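
The double counting reported in HDFS-11191 can be reproduced with a small sketch. VolumeSketch, CapacitySketch, and the "system-root" identifier are hypothetical and do not reflect how the DataNode or the attached patches compute capacity; the byte value is the per-datanode Configured Capacity from the single-directory report above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical description of one configured data directory.
final class VolumeSketch {
    final String dir;
    final String fileSystemId; // identifies the underlying file system
    final long capacityBytes;  // total size of that file system

    VolumeSketch(String dir, String fileSystemId, long capacityBytes) {
        this.dir = dir;
        this.fileSystemId = fileSystemId;
        this.capacityBytes = capacityBytes;
    }
}

final class CapacitySketch {
    // Naive sum: counts a file system once per directory that lives on it.
    static long naiveCapacity(List<VolumeSketch> volumes) {
        long total = 0;
        for (VolumeSketch v : volumes) {
            total += v.capacityBytes;
        }
        return total;
    }

    // Deduplicated sum: counts each underlying file system only once.
    static long dedupedCapacity(List<VolumeSketch> volumes) {
        Set<String> seen = new HashSet<>();
        long total = 0;
        for (VolumeSketch v : volumes) {
            if (seen.add(v.fileSystemId)) {
                total += v.capacityBytes;
            }
        }
        return total;
    }
}

final class CapacitySketchDemo {
    public static void main(String[] args) {
        List<VolumeSketch> volumes = Arrays.asList(
            new VolumeSketch("/grid/hadoop/hdfs/dn", "system-root", 39990394879L),
            new VolumeSketch("/grid1/hadoop/hdfs/dn", "system-root", 39990394879L));
        System.out.println(CapacitySketch.naiveCapacity(volumes));   // 79980789758
        System.out.println(CapacitySketch.dedupedCapacity(volumes)); // 39990394879
    }
}
```

The naive total matches the 79980789758 bytes each datanode reported with two directories configured, and the deduplicated total matches the 39990394879 bytes reported after grid1 was removed.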

[jira] [Updated] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2017-01-04 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-6874:
--
Status: Patch Available  (was: In Progress)

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.3, 2.4.1
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, HDFS-6874.patch
>
>
> GETFILEBLOCKLOCATIONS operation is missing in HttpFS, which is already 
> supported in WebHDFS.  For the request of GETFILEBLOCKLOCATIONS in 
> org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far:
> ...
>   case GETFILEBLOCKLOCATIONS: {
>     response = Response.status(Response.Status.BAD_REQUEST).build();
>     break;
>   }






[jira] [Updated] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2017-01-04 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-6874:
--
Attachment: HDFS-6874.04.patch

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, HDFS-6874.patch
>
>
> GETFILEBLOCKLOCATIONS operation is missing in HttpFS, which is already 
> supported in WebHDFS.  For the request of GETFILEBLOCKLOCATIONS in 
> org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far:
> ...
>   case GETFILEBLOCKLOCATIONS: {
>     response = Response.status(Response.Status.BAD_REQUEST).build();
>     break;
>   }


