[jira] [Commented] (HDFS-12363) Possible NPE in BlockManager$StorageInfoDefragmenter#scanAndCompactStorages

2017-08-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150079#comment-16150079
 ] 

Hudson commented on HDFS-12363:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12292 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12292/])
HDFS-12363. Possible NPE in (liuml07: rev 
1fbb662c7092d08a540acff7e92715693412e486)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java


> Possible NPE in BlockManager$StorageInfoDefragmenter#scanAndCompactStorages
> ---
>
> Key: HDFS-12363
> URL: https://issues.apache.org/jira/browse/HDFS-12363
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Fix For: 3.0.0-beta1
>
> Attachments: HDFS-12363.01.patch, HDFS-12363.02.patch
>
>
> Saw NN going down with NPE below:
> {noformat}
> ERROR org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Thread 
> received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$StorageInfoDefragmenter.scanAndCompactStorages(BlockManager.java:3897)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$StorageInfoDefragmenter.run(BlockManager.java:3852)
> at java.lang.Thread.run(Thread.java:745)
> 2017-08-21 22:14:05,303 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1
> 2017-08-21 22:14:05,313 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> {noformat}
> In that version, {{BlockManager}} code is:
> {code}
> 3896  try {
> 3897    DatanodeStorageInfo storage = datanodeManager.
> 3898        getDatanode(datanodesAndStorages.get(i)).
> 3899        getStorageInfo(datanodesAndStorages.get(i + 1));
> 3900    if (storage != null) {
> {code}
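
For illustration, here is a minimal sketch of a null-safe version of the lookup quoted above. It assumes, which the report does not confirm, that the NPE comes from `getDatanode` returning null, for example for a datanode that is no longer registered. `Datanode`, `scan`, and the map standing in for `datanodeManager` are hypothetical stand-ins, not the Hadoop API and not the committed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the HDFS classes; this is not the Hadoop API.
class DefragNullCheckSketch {

  /** Stand-in for a datanode descriptor: maps storage ID to storage info. */
  static final class Datanode {
    final Map<String, String> storages = new HashMap<>();

    String getStorageInfo(String storageId) {
      return storages.get(storageId); // null when the storage is unknown
    }
  }

  /**
   * Walks a flat list of (datanode ID, storage ID) pairs, as in the quoted
   * loop, but checks the datanode lookup before dereferencing it.
   */
  static List<String> scan(Map<String, Datanode> datanodeManager,
      List<String> datanodesAndStorages) {
    List<String> found = new ArrayList<>();
    for (int i = 0; i + 1 < datanodesAndStorages.size(); i += 2) {
      Datanode dn = datanodeManager.get(datanodesAndStorages.get(i));
      if (dn == null) {
        continue; // assumed cause: the datanode is gone, so skip the pair
      }
      String storage = dn.getStorageInfo(datanodesAndStorages.get(i + 1));
      if (storage != null) {
        found.add(storage);
      }
    }
    return found;
  }

  public static void main(String[] args) {
    Map<String, Datanode> manager = new HashMap<>();
    Datanode dn1 = new Datanode();
    dn1.storages.put("s1", "disk-1");
    manager.put("dn1", dn1);
    // "dn2" is not registered, so its pair is skipped instead of throwing.
    System.out.println(scan(manager, Arrays.asList("dn1", "s1", "dn2", "s2")));
  }
}
```

Splitting the chained call into two steps is the point of the sketch: the intermediate result gets a name and can be checked before `getStorageInfo` is called on it.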



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12363) Possible NPE in BlockManager$StorageInfoDefragmenter#scanAndCompactStorages

2017-08-31 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-12363:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-beta1
   Status: Resolved  (was: Patch Available)

+1

Committed to {{trunk}} branch. Thanks for your contribution [~xiaochen]. Thanks 
for your review [~jojochuang].







[jira] [Commented] (HDFS-12300) Audit-log delegation token related operations

2017-08-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150021#comment-16150021
 ] 

Xiao Chen commented on HDFS-12300:
--

Thanks again Ravi!

Last pre-commit mvn install seems to have failed with a race condition, retriggered.

> Audit-log delegation token related operations
> -
>
> Key: HDFS-12300
> URL: https://issues.apache.org/jira/browse/HDFS-12300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12300.01.patch, HDFS-12300.02.patch
>
>
> When inspecting the code, I found that the following methods in FSNamesystem 
> are not audit logged:
> - getDelegationToken
> - renewDelegationToken
> - cancelDelegationToken
> The audit log itself does have a logTokenTrackingId field to additionally log 
> some details when a token is used for authentication.
> After checking with the community by email, the conclusion is that we should 
> add audit logging for these methods.






[jira] [Comment Edited] (HDFS-12300) Audit-log delegation token related operations

2017-08-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150021#comment-16150021
 ] 

Xiao Chen edited comment on HDFS-12300 at 9/1/17 4:12 AM:
--

Thanks again Ravi!

Last pre-commit mvn install seems to have failed with a race condition, 
retriggered.


was (Author: xiaochen):
Thanks again Ravi!

Last pre-commit mvn install seems to failed with a race condition, retriggerred.







[jira] [Commented] (HDFS-12335) Federation Metrics

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149994#comment-16149994
 ] 

Hadoop QA commented on HDFS-12335:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-10467 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 15m 
51s{color} | {color:red} root in HDFS-10467 failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} HDFS-10467 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 30s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12335 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12884854/HDFS-12335-HDFS-10467.009.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 3f297c027240 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 
11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-10467 / fc2c254 |
| Default Java | 1.8.0_144 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20958/artifact/patchprocess/branch-mvninstall-root.txt
 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20958/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20958/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20958/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Federation Metrics
> --
>
> Key: 

[jira] [Comment Edited] (HDFS-12383) Re-encryption updater should handle canceled tasks better

2017-08-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149993#comment-16149993
 ] 

Xiao Chen edited comment on HDFS-12383 at 9/1/17 3:44 AM:
--

Test failure unrelated, just triggered another run in case.
Also found TestCryptoAdminCLI is flaky, will fix separately.


was (Author: xiaochen):
Test failure unrelated, just triggered another run in case.

> Re-encryption updater should handle canceled tasks better
> -
>
> Key: HDFS-12383
> URL: https://issues.apache.org/jira/browse/HDFS-12383
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0-beta1
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12383.01.patch, HDFS-12383.02.patch
>
>
> Saw an instance where the re-encryption updater exited due to an exception, 
> and later tasks were no longer executed. Logs below:
> {noformat}
> 2017-08-31 09:54:08,104 INFO 
> org.apache.hadoop.hdfs.server.namenode.EncryptionZoneManager: Zone 
> /tmp/encryption-zone-3(16819) is submitted for re-encryption.
> 2017-08-31 09:54:08,104 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Executing 
> re-encrypt commands on zone 16819. Current zones:[zone:16787 state:Completed 
> lastProcessed:null filesReencrypted:1 fileReencryptionFailures:0][zone:16813 
> state:Completed lastProcessed:null filesReencrypted:1 
> fileReencryptionFailures:0][zone:16819 state:Submitted lastProcessed:null 
> filesReencrypted:0 fileReencryptionFailures:0]
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.protocol.ReencryptionStatus: Zone 16819 starts 
> re-encryption processing
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Re-encrypting 
> zone /tmp/encryption-zone-3(id=16819)
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Submitted batch 
> (start:/tmp/encryption-zone-3/data1, size:1) of zone 16819 to re-encrypt.
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Submission 
> completed of zone 16819 for re-encryption.
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Processing 
> batched re-encryption for zone 16819, batch size 1, 
> start:/tmp/encryption-zone-3/data1
> 2017-08-31 09:54:08,979 INFO BlockStateChange: BLOCK* BlockManager: ask 
> 172.26.1.71:20002 to delete [blk_1073742291_1467]
> 2017-08-31 09:54:18,295 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionUpdater: Cancelling 1 
> re-encryption tasks
> 2017-08-31 09:54:18,295 INFO 
> org.apache.hadoop.hdfs.server.namenode.EncryptionZoneManager: Cancelled zone 
> /tmp/encryption-zone-3(16819) for re-encryption.
> 2017-08-31 09:54:18,295 INFO 
> org.apache.hadoop.hdfs.protocol.ReencryptionStatus: Zone 16819 completed 
> re-encryption.
> 2017-08-31 09:54:18,296 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Completed 
> re-encrypting one batch of 1 edeks from KMS, time consumed: 10.19 s, start: 
> /tmp/encryption-zone-3/data1.
> 2017-08-31 09:54:18,296 ERROR 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionUpdater: Re-encryption 
> updater thread exiting.
> java.util.concurrent.CancellationException
> at java.util.concurrent.FutureTask.report(FutureTask.java:121)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionUpdater.takeAndProcessTasks(ReencryptionUpdater.java:404)
> at 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionUpdater.run(ReencryptionUpdater.java:250)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Updater should be fixed to handle canceled tasks better.
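
As a rough illustration of what handling canceled tasks could look like, the sketch below treats `CancellationException` from `Future.get()` as an expected outcome and moves on to the next task instead of letting the exception end the loop. `processAll` and the string results are hypothetical; this is not the `ReencryptionUpdater` code or the attached patch, only the general `java.util.concurrent` pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Hypothetical sketch; names do not come from ReencryptionUpdater.
class CancelTolerantUpdaterSketch {

  /** Collects results of finished tasks and skips the canceled ones. */
  static List<String> processAll(List<Future<String>> tasks) {
    List<String> processed = new ArrayList<>();
    for (Future<String> task : tasks) {
      try {
        processed.add(task.get());
      } catch (CancellationException e) {
        // A canceled task is expected here: skip it and keep the loop alive.
        continue;
      } catch (ExecutionException e) {
        // The task itself failed; record that and continue with the rest.
        processed.add("failed: " + e.getCause());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop
        break;
      }
    }
    return processed;
  }

  public static void main(String[] args) {
    FutureTask<String> first = new FutureTask<>(() -> "batch-1");
    FutureTask<String> second = new FutureTask<>(() -> "batch-2");
    FutureTask<String> third = new FutureTask<>(() -> "batch-3");
    first.run();
    second.cancel(false); // canceled before it ever ran
    third.run();

    List<Future<String>> tasks = new ArrayList<>();
    tasks.add(first);
    tasks.add(second);
    tasks.add(third);
    System.out.println(processAll(tasks));
  }
}
```

The tasks are run or canceled on the calling thread so the outcome does not depend on scheduling; a real updater would take completed futures from an executor instead.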






[jira] [Commented] (HDFS-12383) Re-encryption updater should handle canceled tasks better

2017-08-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149993#comment-16149993
 ] 

Xiao Chen commented on HDFS-12383:
--

Test failure unrelated, just triggered another run in case.







[jira] [Commented] (HDFS-12377) Refactor TestReadStripedFileWithDecoding to avoid test timeouts

2017-08-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149988#comment-16149988
 ] 

Xiao Chen commented on HDFS-12377:
--

Thanks Andrew for taking care of pre-commits here! Looks pretty good to me.

Looking at the failure, 
{{TestReadStripedFileWithDecoding#testReadWithDNFailure}} also needs 
parameterizing.
Also, do you have insights about what's changed recently? IIRC pre-commit was 
fairly green at least a week ago. 

> Refactor TestReadStripedFileWithDecoding to avoid test timeouts
> ---
>
> Key: HDFS-12377
> URL: https://issues.apache.org/jira/browse/HDFS-12377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-12377.001.patch, HDFS-12377.002.patch
>
>
> This test times out because the nested for loops mean it runs 12 
> configurations inside each test method.
> Let's refactor this to use JUnit parameters instead.
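
The refactoring idea can be sketched without JUnit on the classpath: the nested loops move out of the test body into a method that returns one `Object[]` per configuration, which is the shape a JUnit 4 `@Parameterized.Parameters` method returns. The two dimensions below (`fileLengths` and `failedDatanodes`) are made up to reach 12 combinations; the issue does not say what the real loop variables are.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the loop dimensions are invented for illustration.
class ParameterListSketch {

  /**
   * Flattens two nested loops into a list with one entry per combination.
   * With JUnit 4 this method would carry @Parameterized.Parameters and each
   * entry would be passed to the test class constructor.
   */
  static List<Object[]> parameters(int[] fileLengths, int[] failedDatanodes) {
    List<Object[]> params = new ArrayList<>();
    for (int length : fileLengths) {
      for (int failed : failedDatanodes) {
        params.add(new Object[] {length, failed});
      }
    }
    return params;
  }

  public static void main(String[] args) {
    List<Object[]> params =
        parameters(new int[] {1, 2, 3}, new int[] {0, 1, 2, 3});
    // Each entry becomes its own test run with its own timeout, instead of
    // all combinations sharing the timeout of a single test method.
    System.out.println("configurations: " + params.size());
  }
}
```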






[jira] [Updated] (HDFS-12235) Ozone: DeleteKey-3: KSM SCM block deletion message and ACK interactions

2017-08-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-12235:
---
Attachment: HDFS-12235-HDFS-7240.005.patch

> Ozone: DeleteKey-3: KSM SCM block deletion message and ACK interactions
> ---
>
> Key: HDFS-12235
> URL: https://issues.apache.org/jira/browse/HDFS-12235
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-12235-HDFS-7240.001.patch, 
> HDFS-12235-HDFS-7240.002.patch, HDFS-12235-HDFS-7240.003.patch, 
> HDFS-12235-HDFS-7240.004.patch, HDFS-12235-HDFS-7240.005.patch
>
>
> This covers the KSM and SCM interaction for the delete-key operation. Both KSM 
> and SCM store key state info in a backlog. KSM needs to scan this log and send 
> block-deletion commands to SCM; once SCM is fully aware of the message, KSM 
> removes the key completely from the namespace. See the design doc under 
> HDFS-11922 for more; this is task breakdown 2.






[jira] [Commented] (HDFS-12367) Ozone: Too many open files error while running corona

2017-08-31 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149974#comment-16149974
 ] 

Weiwei Yang commented on HDFS-12367:


Thanks [~nandakumar131], I will do some more tests with HDFS-12382 to 
verify whether this issue is completely resolved. Thanks for the hint.

> Ozone: Too many open files error while running corona
> -
>
> Key: HDFS-12367
> URL: https://issues.apache.org/jira/browse/HDFS-12367
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone, tools
>Reporter: Weiwei Yang
>Assignee: Mukul Kumar Singh
>
> The "Too many open files" error keeps happening while I use corona. I simply 
> set up a single-node cluster and ran corona to generate 1000 keys, but 
> I keep getting the following error:
> {noformat}
> ./bin/hdfs corona -numOfThreads 1 -numOfVolumes 1 -numOfBuckets 1 -numOfKeys 
> 1000
> 17/08/28 00:47:42 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 17/08/28 00:47:42 INFO tools.Corona: Number of Threads: 1
> 17/08/28 00:47:42 INFO tools.Corona: Mode: offline
> 17/08/28 00:47:42 INFO tools.Corona: Number of Volumes: 1.
> 17/08/28 00:47:42 INFO tools.Corona: Number of Buckets per Volume: 1.
> 17/08/28 00:47:42 INFO tools.Corona: Number of Keys per Bucket: 1000.
> 17/08/28 00:47:42 INFO rpc.OzoneRpcClient: Creating Volume: vol-0-05000, with 
> wwei as owner and quota set to 1152921504606846976 bytes.
> 17/08/28 00:47:42 INFO tools.Corona: Starting progress bar Thread.
> ...
> ERROR tools.Corona: Exception while adding key: key-251-19293 in bucket: 
> bucket-0-34960 of volume: vol-0-05000.
> java.io.IOException: Exception getting XceiverClient.
>   at 
> org.apache.hadoop.scm.XceiverClientManager.getClient(XceiverClientManager.java:156)
>   at 
> org.apache.hadoop.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:122)
>   at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.getFromKsmKeyInfo(ChunkGroupOutputStream.java:289)
>   at 
> org.apache.hadoop.ozone.client.rpc.OzoneRpcClient.createKey(OzoneRpcClient.java:487)
>   at 
> org.apache.hadoop.ozone.tools.Corona$OfflineProcessor.run(Corona.java:352)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.IllegalStateException: failed to create a child event loop
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2234)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>   at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>   at 
> org.apache.hadoop.scm.XceiverClientManager.getClient(XceiverClientManager.java:144)
>   ... 9 more
> Caused by: java.lang.IllegalStateException: failed to create a child event 
> loop
>   at 
> io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68)
>   at 
> io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49)
>   at 
> io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:61)
>   at 
> io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:52)
>   at 
> io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:44)
>   at 
> io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:36)
>   at org.apache.hadoop.scm.XceiverClient.connect(XceiverClient.java:76)
>   at 
> org.apache.hadoop.scm.XceiverClientManager$2.call(XceiverClientManager.java:151)
>   at 
> org.apache.hadoop.scm.XceiverClientManager$2.call(XceiverClientManager.java:145)
>   at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
>   at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
>   at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
>   at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>   ... 12 more
> Caused by: io.netty.channel.ChannelException: failed to open a new selector
>   at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:128)
>   at io.netty.channel.nio.NioEventLoop.<init>(NioEventLoop.java:120)
>   at 
> 

[jira] [Commented] (HDFS-12335) Federation Metrics

2017-08-31 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149949#comment-16149949
 ] 

Chris Douglas commented on HDFS-12335:
--

+1 lgtm

> Federation Metrics
> --
>
> Key: HDFS-12335
> URL: https://issues.apache.org/jira/browse/HDFS-12335
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12335-HDFS-10467-000.patch, 
> HDFS-12335-HDFS-10467-001.patch, HDFS-12335-HDFS-10467-002.patch, 
> HDFS-12335-HDFS-10467-003.patch, HDFS-12335-HDFS-10467-004.patch, 
> HDFS-12335-HDFS-10467-005.patch, HDFS-12335-HDFS-10467.006.patch, 
> HDFS-12335-HDFS-10467.007.patch, HDFS-12335-HDFS-10467.008.patch, 
> HDFS-12335-HDFS-10467.009.patch
>
>
> Add metrics for the Router and the State Store.






[jira] [Updated] (HDFS-10467) Router-based HDFS federation

2017-08-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-10467:
---
Attachment: HDFS-10467.002.patch

Attaching a patch with the status of the HDFS-10467 branch as of August 31st 
for the merge discussion.

> Router-based HDFS federation
> 
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.1
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-10467.002.patch, HDFS-10467.PoC.001.patch, 
> HDFS-10467.PoC.patch, HDFS Router Federation.pdf, 
> HDFS-Router-Federation-Prototype.patch
>
>
> Add a Router to provide a federated view of multiple HDFS clusters.






[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-31 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149947#comment-16149947
 ] 

Chris Douglas commented on HDFS-12357:
--

Sorry to be dense, but why can't this live in the external attribute provider? 
{{NameNode::getRemoteUser}} is not only public, it's a 2-line method calling 
stable APIs. Given:
{code:java}
  INodeAttributes getAttributes(INodesInPath iip)
  throws FileNotFoundException {
INode node = FSDirectory.resolveLastINode(iip);
int snapshot = iip.getPathSnapshotId();
INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
if (attributeProvider != null) {
  // permission checking sends the full components array including the
  // first empty component for the root.  however file status
  // related calls are expected to strip out the root component according
  // to TestINodeAttributeProvider.
  byte[][] components = iip.getPathComponents();
  components = Arrays.copyOfRange(components, 1, components.length);
  nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
}
return nodeAttrs;
  }
{code}
can't {{attributeProvider}} return the formal {{nodeAttrs}} unmodified after 
performing the same logic as {{NameNode::getRemoteUser}}?
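
The suggestion above can be sketched as follows. This is only an illustration under assumptions: `Attrs`, `Provider`, the `bypassUsers` set, and the `Supplier` standing in for the remote-user lookup are hypothetical simplifications, not the real `INodeAttributeProvider` or `NameNode` APIs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch; these types are simplified stand-ins, not HDFS classes.
class BypassingProviderSketch {

  /** Simplified stand-in for INodeAttributes. */
  static final class Attrs {
    final String owner;

    Attrs(String owner) {
      this.owner = owner;
    }
  }

  /** Provider that decides by itself whether to override the attributes. */
  static final class Provider {
    private final Set<String> bypassUsers;
    private final Supplier<String> currentUser; // stands in for the remote-user lookup

    Provider(Set<String> bypassUsers, Supplier<String> currentUser) {
      this.bypassUsers = bypassUsers;
      this.currentUser = currentUser;
    }

    Attrs getAttributes(String[] pathComponents, Attrs nodeAttrs) {
      if (bypassUsers.contains(currentUser.get())) {
        return nodeAttrs; // special user: hand back the HDFS attributes untouched
      }
      // Everyone else sees attributes supplied by the (made-up) external source.
      return new Attrs("external:" + nodeAttrs.owner);
    }
  }

  public static void main(String[] args) {
    Set<String> bypass = new HashSet<>(Arrays.asList("distcp"));
    Attrs fromHdfs = new Attrs("hdfs");
    String[] path = {"user", "data"};

    Provider asDistcp = new Provider(bypass, () -> "distcp");
    Provider asAlice = new Provider(bypass, () -> "alice");
    System.out.println(asDistcp.getAttributes(path, fromHdfs).owner);
    System.out.println(asAlice.getAttributes(path, fromHdfs).owner);
  }
}
```

With this shape the bypass decision lives entirely inside the provider, so the caller shown in the snippet above would not need to change.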

> Let NameNode to bypass external attribute provider for special user
> ---
>
> Key: HDFS-12357
> URL: https://issues.apache.org/jira/browse/HDFS-12357
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-12357.001.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is, when we do distcp from one cluster to another (or within the 
> same cluster), in addition to copying file data, we copy the metadata from 
> source to target. If external attribute provider is enabled, the metadata may 
> be read from the provider, thus provider data read from source may be saved 
> to target HDFS. 
> We want to avoid saving metadata from external provider to HDFS, so we want 
> to bypass external provider when doing the distcp (or hadoop fs -cp) 
> operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202, the 
> other in HDFS-12294. The proposal here is the third one.
> The idea is, we introduce a new config, that specifies a special user (or a 
> list of users), and let NN bypass external provider when the current user is 
> a special user.
> If we run applications as the special user that need data from external 
> attribute provider, then it won't work. So the constraint on this approach 
> is, the special users here should not run applications that need data from 
> external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn], 
> [~manojg] for the discussions in the other jiras. 
> I'm creating this one to discuss further.
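
As a rough illustration of the idea in the description (a config value naming one or more special users, and a check against the current user), the following self-contained sketch parses a comma-separated list and decides whether to bypass. The key name, class name, and method names are hypothetical and are not taken from the attached patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper: parse a bypass-user list and test the current user. */
final class BypassUserConfigSketch {
  // Assumption: an illustrative key name, not a real Hadoop configuration key.
  static final String BYPASS_USERS_KEY = "example.attribute.provider.bypass.users";

  /** Splits a comma-separated value into a set of trimmed, non-empty user names. */
  static Set<String> parseUsers(String configValue) {
    if (configValue == null) {
      return Collections.emptySet();
    }
    return Arrays.stream(configValue.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toSet());
  }

  /** True when the current user is one of the configured special users. */
  static boolean shouldBypass(Set<String> bypassUsers, String currentUser) {
    return currentUser != null && bypassUsers.contains(currentUser);
  }

  public static void main(String[] args) {
    Set<String> users = parseUsers("distcp, backup");
    System.out.println(BYPASS_USERS_KEY + " -> " + users.size() + " users");
    System.out.println("distcp bypasses: " + shouldBypass(users, "distcp"));
    System.out.println("alice bypasses: " + shouldBypass(users, "alice"));
  }
}
```

The constraint stated in the description applies directly: any user placed in this list never sees provider data, so such users should not run applications that depend on it.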



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12381) [Documentation] Adding configuration keys for the Router

2017-08-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149942#comment-16149942
 ] 

Íñigo Goiri commented on HDFS-12381:


Given the comment from [~manojg] on recommendations and best practices for 
setting up the mount table, I think I can extend this JIRA to cover that in 
more detail.

> [Documentation] Adding configuration keys for the Router
> 
>
> Key: HDFS-12381
> URL: https://issues.apache.org/jira/browse/HDFS-12381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: HDFS-10467
>
> Attachments: HDFS-12381-HDFS-10467.000.patch
>
>
> Adding configuration options in tabular format.






[jira] [Commented] (HDFS-12380) Simplify dataQueue.wait condition logical operation

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149939#comment-16149939
 ] 

Hadoop QA commented on HDFS-12380:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs-client: The patch 
generated 0 new + 74 unchanged - 1 fixed = 74 total (was 75) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
12s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12380 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12884849/HDFS-12380.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 11c45382926f 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 
18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1904100 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20957/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: 
hadoop-hdfs-project/hadoop-hdfs-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20957/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Simplify dataQueue.wait condition logical operation
> ---
>
> Key: HDFS-12380
> URL: https://issues.apache.org/jira/browse/HDFS-12380
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0-beta1
> Environment: cluster: 3 nodes
> os:(Red Hat 2.6.33.20, Red Hat 3.10.0-514.6.1.el7.x86_64, 

[jira] [Commented] (HDFS-12383) Re-encryption updater should handle canceled tasks better

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149936#comment-16149936
 ] 

Hadoop QA commented on HDFS-12383:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}120m 11s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}152m  6s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.cli.TestCryptoAdminCLI |
|   | hadoop.hdfs.TestEncryptionZonesWithHA |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.TestFileConcurrentReader |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.TestMaintenanceState |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12383 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12884827/HDFS-12383.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5c8d0924a636 3.13.0-117-generic #164-Ubuntu 

[jira] [Commented] (HDFS-12381) [Documentation] Adding configuration keys for the Router

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149933#comment-16149933
 ] 

Hadoop QA commented on HDFS-12381:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} HDFS-10467 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 15m 53s{color} | {color:red} root in HDFS-10467 failed. {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  8s{color} | {color:green} HDFS-10467 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m  2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12381 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12884855/HDFS-12381-HDFS-10467.000.patch
 |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux d770de1e4e4a 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-10467 / fc2c254 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20959/artifact/patchprocess/branch-mvninstall-root.txt
 |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20959/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> [Documentation] Adding configuration keys for the Router
> 
>
> Key: HDFS-12381
> URL: https://issues.apache.org/jira/browse/HDFS-12381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: HDFS-10467
>
> Attachments: HDFS-12381-HDFS-10467.000.patch
>
>
> Adding configuration options in tabular format.






[jira] [Updated] (HDFS-12317) HDFS metrics render error in the page of Github

2017-08-31 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-12317:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

Committed this to trunk and branch-2. I verified the page, and it renders 
correctly now. Thanks [~ajisakaa] for the review!

> HDFS metrics render error in the page of Github
> ---
>
> Key: HDFS-12317
> URL: https://issues.apache.org/jira/browse/HDFS-12317
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Affects Versions: 3.0.0-alpha4
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12317.001.patch, HDFS-12317-branch-2.001.patch, 
> metrics-render-error.jpg
>
>
> Some HDFS metrics render incorrectly on the page in the git repository. 
> The page link: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md






[jira] [Updated] (HDFS-12381) [Documentation] Adding configuration keys for the Router

2017-08-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-12381:
---
Attachment: HDFS-12381-HDFS-10467.000.patch

> [Documentation] Adding configuration keys for the Router
> 
>
> Key: HDFS-12381
> URL: https://issues.apache.org/jira/browse/HDFS-12381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: HDFS-10467
>
> Attachments: HDFS-12381-HDFS-10467.000.patch
>
>
> Adding configuration options in tabular format.






[jira] [Updated] (HDFS-12381) [Documentation] Adding configuration keys for the Router

2017-08-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-12381:
---
Status: Patch Available  (was: Open)

> [Documentation] Adding configuration keys for the Router
> 
>
> Key: HDFS-12381
> URL: https://issues.apache.org/jira/browse/HDFS-12381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: HDFS-10467
>
> Attachments: HDFS-12381-HDFS-10467.000.patch
>
>
> Adding configuration options in tabular format.






[jira] [Updated] (HDFS-12335) Federation Metrics

2017-08-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-12335:
---
Attachment: HDFS-12335-HDFS-10467.009.patch

Fixing checkstyle.

> Federation Metrics
> --
>
> Key: HDFS-12335
> URL: https://issues.apache.org/jira/browse/HDFS-12335
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12335-HDFS-10467-000.patch, 
> HDFS-12335-HDFS-10467-001.patch, HDFS-12335-HDFS-10467-002.patch, 
> HDFS-12335-HDFS-10467-003.patch, HDFS-12335-HDFS-10467-004.patch, 
> HDFS-12335-HDFS-10467-005.patch, HDFS-12335-HDFS-10467.006.patch, 
> HDFS-12335-HDFS-10467.007.patch, HDFS-12335-HDFS-10467.008.patch, 
> HDFS-12335-HDFS-10467.009.patch
>
>
> Add metrics for the Router and the State Store.






[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-31 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149912#comment-16149912
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~atm], [~daryn] [~manojg], any comments/thoughts on my previous reply?

Hi [~asuresh] and [~chris.douglas], I would appreciate it if you could take a 
look at the patch too.

Thanks a lot.


> Let NameNode to bypass external attribute provider for special user
> ---
>
> Key: HDFS-12357
> URL: https://issues.apache.org/jira/browse/HDFS-12357
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-12357.001.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is, when we do distcp from one cluster to another (or within the 
> same cluster), in addition to copying file data, we copy the metadata from 
> source to target. If external attribute provider is enabled, the metadata may 
> be read from the provider, thus provider data read from source may be saved 
> to target HDFS. 
> We want to avoid saving metadata from external provider to HDFS, so we want 
> to bypass external provider when doing the distcp (or hadoop fs -cp) 
> operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202, the 
> other in HDFS-12294. The proposal here is the third one.
> The idea is, we introduce a new config, that specifies a special user (or a 
> list of users), and let NN bypass external provider when the current user is 
> a special user.
> If we run applications as the special user that need data from external 
> attribute provider, then it won't work. So the constraint on this approach 
> is, the special users here should not run applications that need data from 
> external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn], 
> [~manojg] for the discussions in the other jiras. 
> I'm creating this one to discuss further.






[jira] [Commented] (HDFS-12300) Audit-log delegation token related operations

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149906#comment-16149906
 ] 

Hadoop QA commented on HDFS-12300:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
57s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 250 unchanged - 1 fixed = 250 total (was 251) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m  9s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}125m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
|   | hadoop.hdfs.TestFileAppendRestart |
|   | hadoop.hdfs.tools.TestDFSAdminWithHA |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12300 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12884824/HDFS-12300.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 3eae518577a8 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 
18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| 

[jira] [Updated] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses

2017-08-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-12384:
---
Description: 
{{hadoop-client-modules}} is failing because of dependencies added by 
{{CuratorManager}}:
{code}
[INFO]   Adding ignore: *
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses failed 
with message:
Duplicate classes found:
  Found in:
org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
  Duplicate classes:

org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
{code}
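
The failure above reports class files that appear in two artifacts at once. As a rough illustration of that condition only (this is not how the enforcer rule is implemented), the sketch below intersects the {{.class}} entries of two jar listings. The class and method names are hypothetical, and apart from the two Curator class paths taken from the log, the entry lists are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: find .class entries present in both of two jar listings. */
final class DuplicateClassSketch {

  /** Returns the sorted set of .class entry names that occur in both lists. */
  static Set<String> duplicateClasses(List<String> entriesA, List<String> entriesB) {
    Set<String> seen = new HashSet<>();
    for (String entry : entriesA) {
      if (entry.endsWith(".class")) {
        seen.add(entry);
      }
    }
    Set<String> duplicates = new TreeSet<>();
    for (String entry : entriesB) {
      if (seen.contains(entry)) {
        duplicates.add(entry);
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    // Two class paths from the log above; the remaining entries are invented.
    String deleteBuilder =
        "org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class";
    String curatorFramework =
        "org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class";

    List<String> minicluster =
        Arrays.asList(deleteBuilder, curatorFramework, "example/OnlyInMinicluster.class");
    List<String> runtime =
        Arrays.asList(curatorFramework, deleteBuilder, "META-INF/MANIFEST.MF");

    for (String dup : duplicateClasses(minicluster, runtime)) {
      System.out.println("duplicate: " + dup);
    }
  }
}
```

With real jars, the entry lists could be read via {{java.util.jar.JarFile#entries()}}; the lists are passed in directly here to keep the sketch free of file I/O.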

> Fixing compilation issue with BanDuplicateClasses
> -
>
> Key: HDFS-12384
> URL: https://issues.apache.org/jira/browse/HDFS-12384
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12384-HDFS-10467-000.patch
>
>
> {{hadoop-client-modules}} is failing because of dependencies added by 
> {{CuratorManager}}:
> {code}
> [INFO]   Adding ignore: *
> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses 
> failed with message:
> Duplicate classes found:
>   Found in:
> 
> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
>   Duplicate classes:
> 
> org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
> 
> org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
> {code}






[jira] [Commented] (HDFS-12335) Federation Metrics

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149900#comment-16149900
 ] 

Hadoop QA commented on HDFS-12335:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
|| || || || {color:brown} HDFS-10467 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 14m 37s{color} | {color:red} root in HDFS-10467 failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 50s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 43s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 54s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 49s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 47s{color} | {color:green} HDFS-10467 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 405 unchanged - 0 fixed = 406 total (was 405) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 34s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 53s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12335 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12884830/HDFS-12335-HDFS-10467.008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux 4c9bf4cd12a4 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-10467 / fc2c254 |
| Default Java | 1.8.0_144 |
| mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/20950/artifact/patchprocess/branch-mvninstall-root.txt |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/20950/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/20950/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20950/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output 

[jira] [Commented] (HDFS-12380) Simplify dataQueue.wait condition logical operation

2017-08-31 Thread liaoyuxiangqin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149898#comment-16149898
 ] 

liaoyuxiangqin commented on HDFS-12380:
---

Thanks [~shahrs87] for the review. I have fixed the checkstyle warning and 
attached a new patch. Thanks!

> Simplify dataQueue.wait condition logical operation
> ---
>
> Key: HDFS-12380
> URL: https://issues.apache.org/jira/browse/HDFS-12380
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0-beta1
> Environment: cluster: 3 nodes
> os:(Red Hat 2.6.33.20, Red Hat 3.10.0-514.6.1.el7.x86_64, 
> Ubuntu4.4.0-31-generic)
> hadoop version: hadoop-3.0.0-beta1
> operation: Code review
>Reporter: liaoyuxiangqin
>Assignee: liaoyuxiangqin
> Attachments: HDFS-12380.001.patch, HDFS-12380.002.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> While reading the run() method of the DataStreamer class in hdfs-client, I 
> found that the following condition could be simplified and made easier to 
> understand.
> {code:title=DataStreamer.java|borderStyle=solid}
> // wait for a packet to be sent.
> long now = Time.monotonicNow();
> while ((!shouldStop() && dataQueue.size() == 0 &&
>  (stage != BlockConstructionStage.DATA_STREAMING ||
>   stage == BlockConstructionStage.DATA_STREAMING &&
> now - lastPacket < halfSocketTimeout)) || doSleep ) {
> {code}
> As the code segment above shows, stage != DATA_STREAMING and 
> stage == DATA_STREAMING both appear in the same condition. This makes the 
> condition hard to follow, so I think it should be simplified.
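
The simplification requested here rests on a Boolean absorption identity: A || (!A && B) is equivalent to A || B. The following is a minimal sketch of that check, not the HDFS-12380 patch. The class name, the local Stage enum, and the parameter names are hypothetical stand-ins for the fields used in DataStreamer.java; it only compares the quoted condition with a shorter form over a small grid of inputs.

```java
// Hypothetical stand-in for illustration; this is not the real DataStreamer.
final class WaitConditionSketch {
  // Local stand-in for BlockConstructionStage; only the comparison matters.
  enum Stage { PIPELINE_SETUP_CREATE, DATA_STREAMING, PIPELINE_CLOSE }

  // Same shape as the condition quoted from DataStreamer.java.
  static boolean original(boolean shouldStop, int queueSize, Stage stage,
      long sinceLastPacket, long halfSocketTimeout, boolean doSleep) {
    return (!shouldStop && queueSize == 0 &&
        (stage != Stage.DATA_STREAMING ||
         stage == Stage.DATA_STREAMING &&
         sinceLastPacket < halfSocketTimeout)) || doSleep;
  }

  // Shorter form: the redundant "stage == DATA_STREAMING" test is dropped.
  static boolean simplified(boolean shouldStop, int queueSize, Stage stage,
      long sinceLastPacket, long halfSocketTimeout, boolean doSleep) {
    return (!shouldStop && queueSize == 0 &&
        (stage != Stage.DATA_STREAMING ||
         sinceLastPacket < halfSocketTimeout)) || doSleep;
  }

  // Compares both forms over every combination in a small input grid.
  static boolean agreeOnAllSamples() {
    boolean[] flags = {false, true};
    long halfSocketTimeout = 30_000L;
    long[] elapsed = {0L, 29_999L, 30_000L, 60_000L};
    for (boolean shouldStop : flags) {
      for (boolean doSleep : flags) {
        for (int queueSize = 0; queueSize <= 1; queueSize++) {
          for (Stage stage : Stage.values()) {
            for (long e : elapsed) {
              boolean a = original(shouldStop, queueSize, stage, e,
                  halfSocketTimeout, doSleep);
              boolean b = simplified(shouldStop, queueSize, stage, e,
                  halfSocketTimeout, doSleep);
              if (a != b) {
                return false;
              }
            }
          }
        }
      }
    }
    return true;
  }
}
```

Because && binds tighter than || in Java, the quoted condition already parses as stage != X || (stage == X && timeCheck), so dropping the second stage test should not change behavior.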



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12380) Simplify dataQueue.wait condition logical operation

2017-08-31 Thread liaoyuxiangqin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liaoyuxiangqin updated HDFS-12380:
--
Status: Patch Available  (was: Open)







[jira] [Updated] (HDFS-12380) Simplify dataQueue.wait condition logical operation

2017-08-31 Thread liaoyuxiangqin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liaoyuxiangqin updated HDFS-12380:
--
Attachment: HDFS-12380.002.patch







[jira] [Updated] (HDFS-12380) Simplify dataQueue.wait condition logical operation

2017-08-31 Thread liaoyuxiangqin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liaoyuxiangqin updated HDFS-12380:
--
Status: Open  (was: Patch Available)







[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent

2017-08-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149881#comment-16149881
 ] 

Andrew Wang commented on HDFS-11882:


Thanks for taking a look Kai, I can update the patch tomorrow once you've had 
time to fully digest the patch.

bq. About "Parity cells are the length of the longest data cells", didn't quite 
follow and could you clarify some bit?

When there's a partially written stripe, we might have data lengths [10, 5, 0] 
and parity lengths [10, 10]. The parity cells are the length of the longest 
data cell (10). There could be multiple full data cells, so I made it plural.
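
The rule in the example above can be sketched in a few lines. This is only an illustration of the stated rule, assuming cell lengths are given as plain integers; the class and method names are hypothetical and are not taken from the patch.

```java
import java.util.Arrays;

// Hypothetical illustration of the rule described above; not code from the patch.
final class PartialStripeSketch {
  // Length of the longest data cell in the stripe (0 if all are empty).
  static int longestDataCell(int[] dataCellLengths) {
    int longest = 0;
    for (int len : dataCellLengths) {
      longest = Math.max(longest, len);
    }
    return longest;
  }

  // Every parity cell in the stripe has the length of the longest data cell.
  static int[] parityCellLengths(int[] dataCellLengths, int numParityBlocks) {
    int[] parity = new int[numParityBlocks];
    Arrays.fill(parity, longestDataCell(dataCellLengths));
    return parity;
  }
}
```

With data lengths [10, 5, 0] and two parity blocks, parityCellLengths returns [10, 10], matching the example.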

> Client fails if acknowledged size is greater than bytes sent
> 
>
> Key: HDFS-11882
> URL: https://issues.apache.org/jira/browse/HDFS-11882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, test
>Reporter: Akira Ajisaka
>Assignee: Andrew Wang
>Priority: Critical
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, 
> HDFS-11882.03.patch, HDFS-11882.04.patch, HDFS-11882.05.patch, 
> HDFS-11882.regressiontest.patch
>
>
> Some erasure coding tests fail with the following exception. The following 
> test was removed by HDFS-11823; however, this type of error can happen in a 
> real cluster.
> {noformat}
> Running org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> Tests run: 14, Failures: 0, Errors: 1, Skipped: 10, Time elapsed: 89.086 sec <<< FAILURE! - in org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> testMultipleDatanodeFailure56(org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure)  Time elapsed: 38.831 sec  <<< ERROR!
> java.lang.IllegalStateException: null
>   at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>   at org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780)
>   at org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664)
>   at org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034)
>   at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>   at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>   at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472)
>   at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:381)
>   at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:245)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Commented] (HDFS-11394) Add method for getting erasure coding policy through WebHDFS

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149876#comment-16149876
 ] 

Hadoop QA commented on HDFS-11394:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} | {color:red} HDFS-11394 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-11394 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12868318/HDFS-11394.04.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20956/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add method for getting erasure coding policy through WebHDFS 
> -
>
> Key: HDFS-11394
> URL: https://issues.apache.org/jira/browse/HDFS-11394
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, namenode
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-11394.01.patch, HDFS-11394.02.patch, 
> HDFS-11394.03.patch, HDFS-11394.04.patch
>
>
> We can expose the erasure coding policy of an erasure coded directory through 
> a WebHDFS method, as is done for storage policy. The NameNode Web UI can use 
> this information to show the details of erasure coded directories.
> See: HDFS-8196






[jira] [Commented] (HDFS-12379) NameNode getListing should use FileStatus instead of HdfsFileStatus

2017-08-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149873#comment-16149873
 ] 

Andrew Wang commented on HDFS-12379:


I'm -1 on any change that breaks wire compatibility. Is that what is being 
proposed here?

> NameNode getListing should use FileStatus instead of HdfsFileStatus
> ---
>
> Key: HDFS-12379
> URL: https://issues.apache.org/jira/browse/HDFS-12379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>
> The public {{listStatus}} APIs in {{FileSystem}} and 
> {{DistributedFileSystem}} expose {{FileStatus}} instead of 
> {{HdfsFileStatus}}. Therefore it is a waste to create the more expensive 
> {{HdfsFileStatus}} objects on NameNode.
> It should be a simple change similar to HDFS-11641. Marking incompatible 
> because wire protocol is incompatible. Not sure what downstream apps are 
> affected by this incompatibility. Maybe those directly using curl, or writing 
> their own HDFS client.






[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent

2017-08-31 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149869#comment-16149869
 ] 

Kai Zheng commented on HDFS-11882:
--

Thanks [~andrew.wang] for adding so many comments to the code; they are very 
helpful for understanding the complex logic. Some minor comments follow; please 
check whether they make sense.

1. How about {{waitCreatingNewStreams}} => {{waitCreatingStreamers}}, like we 
have checkStreamerUpdates.
2. "Get the acked file length" => "Get the length of the acked bytes in the 
block group"; "A full stripe is acked when at least numDataBlocks streamers 
have that cell" => "... streamers have corresponding cells of the stripe"; 
About "Parity cells are the length of the longest data cells", didn't quite 
follow and could you clarify some bit? 
{code}
   /**
-   * Get the number of acked stripes. An acked stripe means at least data block
-   * number size cells of the stripe were acked.
+   * Get the acked file length.
+   *
+   * 
+   *   A full stripe is acked when at least numDataBlocks streamers have
+   *   that cell, and all previous full stripes are also acked. This enforces
+   *   the constraint that there is at most one partial stripe.
+   * 
+   * 
+   *   Partial stripes write all parity cells. Empty data cells are not written.
+   *   Parity cells are the length of the longest data cells.
+   *   To be considered acked, a partial stripe needs at least numDataBlocks
+   *   empty or written cells.
+   * 
+   * 
+   *   Currently, partial stripes can only happen when closing the file at a
+   *   non-stripe boundary, but this could also happen during (currently
+   *   unimplemented) hflush/hsync support.
+   * 
*/
-  private long getNumAckedStripes() {
-int minStripeNum = Integer.MAX_VALUE;
+  private long getAckedLength() {
{code}

May post more later today.
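
The full-stripe rule in the javadoc above can be sketched as follows. This is a simplified illustration, not the getAckedLength() implementation from the patch: it assumes each streamer reports a single acked byte count, it ignores partial stripes, and the class, method, and parameter names are hypothetical.

```java
// Hypothetical sketch of the full-stripe rule only; partial stripes are ignored.
final class AckedStripeSketch {
  /**
   * Counts the leading full stripes that are acked. Stripe s counts as acked
   * when at least numDataBlocks streamers have acked the whole cell at index s
   * and every earlier stripe is acked as well.
   */
  static long ackedFullStripes(long[] ackedBytesPerStreamer,
      int numDataBlocks, long cellSize) {
    if (numDataBlocks < 1 || cellSize < 1) {
      throw new IllegalArgumentException("numDataBlocks and cellSize must be positive");
    }
    long stripes = 0;
    while (true) {
      // Bytes a streamer must have acked to cover its cell in the next stripe.
      long needed = (stripes + 1) * cellSize;
      int streamersWithCell = 0;
      for (long acked : ackedBytesPerStreamer) {
        if (acked >= needed) {
          streamersWithCell++;
        }
      }
      if (streamersWithCell < numDataBlocks) {
        return stripes; // the first stripe that is not acked ends the count
      }
      stripes++;
    }
  }
}
```

For example, with acked byte counts {30, 30, 20, 10, 0}, numDataBlocks = 3, and cellSize = 10, two full stripes count as acked: four streamers hold the first cell, three hold the second, and only two hold the third.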


[jira] [Commented] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses

2017-08-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149868#comment-16149868
 ] 

Andrew Wang commented on HDFS-12384:


[~busbey] yea, that sounded wrong to me, thus asking the expert :)

> Fixing compilation issue with BanDuplicateClasses
> -
>
> Key: HDFS-12384
> URL: https://issues.apache.org/jira/browse/HDFS-12384
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12384-HDFS-10467-000.patch
>
>







[jira] [Updated] (HDFS-9381) When same block came for replication for Striped mode, we can move that block to PendingReplications

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9381:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> When same block came for replication for Striped mode, we can move that block 
> to PendingReplications
> 
>
> Key: HDFS-9381
> URL: https://issues.apache.org/jira/browse/HDFS-9381
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-9381.00.patch, HDFS-9381.01.patch, 
> HDFS-9381-02.patch, HDFS-9381-03.patch, HDFS-9381-04.patch
>
>
> Currently, in the replication flow for striped blocks, we just return null 
> if the block already exists in pendingReplications.
> {code}
> if (block.isStriped()) {
>   if (pendingNum > 0) {
>     // Wait the previous recovery to finish.
>     return null;
>   }
> {code}
> If we just return null here and neededReplications contains only a few 
> blocks (by default, fewer than numliveNodes*2), the same blocks can be picked 
> again from neededReplications in the next loop, because we do not remove the 
> element from neededReplications. Since this replication process has to take 
> the FSNamesystem lock, we may spend time unnecessarily in every loop. 
> So my suggestion/improvement is:
> Instead of just returning null, how about incrementing pendingReplications 
> for this block and removing it from neededReplications? Another point to 
> consider: to add a block to pendingReplications we generally need a target, 
> which is the node to which we issued the replication command. Later, after 
> replication succeeds and the DN reports it, the block is removed from 
> pendingReplications in the NN's addBlock. 
> Since this block is newly picked from neededReplications, we would not have 
> selected a target yet. So which target should be passed to 
> pendingReplications if we add this block? One option I am thinking of is to 
> pass srcNode itself as the target for this special case. If the block is 
> really missing, srcNode will not report it, so the block will not be removed 
> from pendingReplications. When it times out, it will be considered for 
> replication again, and at that point the regular replication flow will find 
> an actual target to replicate to.
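
The proposal above can be sketched with two plain collections. This is a toy model under stated assumptions, not BlockManager code: blocks and nodes are represented as strings, the needed and pending structures are a simple queue and map, and every class and method name here is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of the proposal; not the real BlockManager structures.
final class StripedPendingSketch {
  private final Deque<String> needed = new ArrayDeque<>();     // stands in for neededReplications
  private final Map<String, String> pending = new HashMap<>(); // block -> target, stands in for pendingReplications

  void addNeeded(String blockId) {
    needed.add(blockId);
  }

  /** Returns true if new recovery work should be scheduled for the block. */
  boolean pickForRecovery(String blockId, int pendingNum, String srcNode) {
    if (pendingNum > 0) {
      // Proposed change: instead of only returning, move the block out of the
      // needed queue and track it as pending with srcNode as placeholder target.
      needed.remove(blockId);
      pending.put(blockId, srcNode);
      return false;
    }
    return true;
  }

  /** On timeout the block goes back to the needed queue for the regular flow. */
  void onPendingTimeout(String blockId) {
    if (pending.remove(blockId) != null) {
      needed.add(blockId);
    }
  }

  boolean isNeeded(String blockId) {
    return needed.contains(blockId);
  }

  boolean isPending(String blockId) {
    return pending.containsKey(blockId);
  }
}
```

In this model a block that already has pending recovery is no longer picked on every loop; it returns to the needed queue only when its pending entry times out.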






[jira] [Updated] (HDFS-11394) Add method for getting erasure coding policy through WebHDFS

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11394:
---
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)







[jira] [Updated] (HDFS-9381) When same block came for replication for Striped mode, we can move that block to PendingReplications

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9381:
--
Labels: hdfs-ec-3.0-nice-to-have  (was: )

> When same block came for replication for Striped mode, we can move that block 
> to PendingReplications
> 
>
> Key: HDFS-9381
> URL: https://issues.apache.org/jira/browse/HDFS-9381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-9381.00.patch, HDFS-9381.01.patch, 
> HDFS-9381-02.patch, HDFS-9381-03.patch, HDFS-9381-04.patch
>
>






[jira] [Commented] (HDFS-8196) Erasure Coding related information on NameNode UI

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149867#comment-16149867
 ] 

Hadoop QA commented on HDFS-8196:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} HDFS-8196 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-8196 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12854772/HDFS-8196.04.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20955/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Erasure Coding related information on NameNode UI
> -
>
> Key: HDFS-8196
> URL: https://issues.apache.org/jira/browse/HDFS-8196
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>  Labels: NameNode, WebUI, hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-8196.01.patch, HDFS-8196.02.patch, 
> HDFS-8196.03.patch, HDFS-8196.04.patch, Screen Shot 2017-02-06 at 
> 22.30.40.png, Screen Shot 2017-02-12 at 20.21.42.png, Screen Shot 2017-02-14 
> at 22.43.57.png
>
>
> The NameNode WebUI shows EC-related information and metrics. 
> This depends on [HDFS-7674|https://issues.apache.org/jira/browse/HDFS-7674].






[jira] [Updated] (HDFS-11553) Erasure Coding: Missing parity blocks in the block group are warned as corrupt blocks

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11553:
---
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> Erasure Coding: Missing parity blocks in the block group are warned as 
> corrupt blocks
> -
>
> Key: HDFS-11553
> URL: https://issues.apache.org/jira/browse/HDFS-11553
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Currently, {{DFSStripedOutputStream}} verifies only that the allocated block 
> locations number at least numDataBlocks. That is, for the EC policy 
> RS-6-3-64K, though a full EC block group needs 9 DNs in total, 
> clients can successfully create a DFSStripedOutputStream with 
> just 6 DNs. Moreover, an output stream created with fewer DNs 
> skips writing parity blocks entirely. HDFS-11552 tracks the improvement 
> needed to accommodate parity blocks along with data blocks from the same 
> block group.
> {code}
> [Thread-5] WARN  hdfs.DFSOutputStream 
> (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block 
> location for parity block, index=6
> [Thread-5] WARN  hdfs.DFSOutputStream 
> (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block 
> location for parity block, index=7
> [Thread-5] WARN  hdfs.DFSOutputStream 
> (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block 
> location for parity block, index=8
> {code}
> In the above case, upon file stream close we get the following warning 
> message when the parity blocks are not yet written out. The warning message 
> claims that there are 3 corrupt blocks, which is incorrect. The EC 
> redundancy is merely insufficient; nothing is corrupt or lost yet. The warning 
> message needs to be fixed for this use case.
> {code}
> INFO  namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2726)) - 
> BLOCK* blk_-9223372036854775792_1002 is COMMITTED but not COMPLETE(numNodes= 
> 0 <  minimum = 6) in file /ec/test1
> INFO  hdfs.StateChange (FSNamesystem.java:completeFile(2679)) - DIR* 
> completeFile: /ec/test1 is closed by DFSClient_NONMAPREDUCE_-1900076771_17
> WARN  hdfs.DFSOutputStream 
> (DFSStripedOutputStream.java:logCorruptBlocks(1117)) - Block group <1> has 3 
> corrupt blocks. It's at high risk of losing data.
> {code}
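The distinction the description asks for, between parity blocks that were never written and blocks that are corrupt, can be sketched as below. This is only an illustration under stated assumptions: BlockGroupReport and describe are hypothetical names and are not part of DFSStripedOutputStream, and internal block indexes below numDataBlocks are assumed to be data blocks, matching the index=6..8 parity warnings in the log above.

```java
/** Hypothetical helper for illustration; not part of the HDFS client. */
final class BlockGroupReport {
  /**
   * Describes a striped block group from a per-index "was written" flag.
   * Indexes below numDataBlocks are treated as data blocks, the rest as parity.
   */
  static String describe(int numDataBlocks, boolean[] written) {
    int missingData = 0;
    int missingParity = 0;
    for (int i = 0; i < written.length; i++) {
      if (!written[i]) {
        if (i < numDataBlocks) {
          missingData++;
        } else {
          missingParity++;
        }
      }
    }
    if (missingData == 0 && missingParity == 0) {
      return "block group is fully written";
    }
    if (missingData == 0) {
      // All data is present; only redundancy is reduced.
      return missingParity + " parity blocks not written; "
          + "redundancy is reduced but no block is corrupt";
    }
    return missingData + " data blocks and " + missingParity
        + " parity blocks not written";
  }
}
```

For RS-6-3 with only the six data blocks written, describe(6, ...) reports three unwritten parity blocks instead of three corrupt blocks.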






[jira] [Updated] (HDFS-11553) Erasure Coding: Missing parity blocks in the block group are warned as corrupt blocks

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11553:
---
Component/s: erasure-coding

> Erasure Coding: Missing parity blocks in the block group are warned as 
> corrupt blocks
> -
>
> Key: HDFS-11553
> URL: https://issues.apache.org/jira/browse/HDFS-11553
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Currently, {{DFSStripedOutputStream}} verifies only that the allocated block 
> locations number at least numDataBlocks. That is, for the EC policy 
> RS-6-3-64K, though a full EC block group needs 9 DNs in total, 
> clients can successfully create a DFSStripedOutputStream with 
> just 6 DNs. Moreover, an output stream created with fewer DNs 
> skips writing parity blocks entirely. HDFS-11552 tracks the improvement 
> needed to accommodate parity blocks along with data blocks from the same 
> block group.
> {code}
> [Thread-5] WARN  hdfs.DFSOutputStream 
> (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block 
> location for parity block, index=6
> [Thread-5] WARN  hdfs.DFSOutputStream 
> (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block 
> location for parity block, index=7
> [Thread-5] WARN  hdfs.DFSOutputStream 
> (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block 
> location for parity block, index=8
> {code}
> In the above case, upon file stream close we get the following warning 
> message when the parity blocks are not yet written out. The warning message 
> claims that there are 3 corrupt blocks, which is incorrect. The EC 
> redundancy is merely insufficient; nothing is corrupt or lost yet. The warning 
> message needs to be fixed for this use case.
> {code}
> INFO  namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2726)) - 
> BLOCK* blk_-9223372036854775792_1002 is COMMITTED but not COMPLETE(numNodes= 
> 0 <  minimum = 6) in file /ec/test1
> INFO  hdfs.StateChange (FSNamesystem.java:completeFile(2679)) - DIR* 
> completeFile: /ec/test1 is closed by DFSClient_NONMAPREDUCE_-1900076771_17
> WARN  hdfs.DFSOutputStream 
> (DFSStripedOutputStream.java:logCorruptBlocks(1117)) - Block group <1> has 3 
> corrupt blocks. It's at high risk of losing data.
> {code}






[jira] [Updated] (HDFS-8196) Erasure Coding related information on NameNode UI

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8196:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> Erasure Coding related information on NameNode UI
> -
>
> Key: HDFS-8196
> URL: https://issues.apache.org/jira/browse/HDFS-8196
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>  Labels: NameNode, WebUI, hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-8196.01.patch, HDFS-8196.02.patch, 
> HDFS-8196.03.patch, HDFS-8196.04.patch, Screen Shot 2017-02-06 at 
> 22.30.40.png, Screen Shot 2017-02-12 at 20.21.42.png, Screen Shot 2017-02-14 
> at 22.43.57.png
>
>
> NameNode WebUI shows EC related information and metrics. 
> This depends on [HDFS-7674|https://issues.apache.org/jira/browse/HDFS-7674].






[jira] [Commented] (HDFS-9603) Erasure Coding: Use ErasureCoder to encode/decode a block group

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149866#comment-16149866
 ] 

Hadoop QA commented on HDFS-9603:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} HDFS-9603 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-9603 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12780522/HDFS-9603.3.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20954/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Erasure Coding: Use ErasureCoder to encode/decode a block group
> ---
>
> Key: HDFS-9603
> URL: https://issues.apache.org/jira/browse/HDFS-9603
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Rui Li
>Assignee: Kai Zheng
> Attachments: HDFS-9603.1.patch, HDFS-9603.2.patch, HDFS-9603.3.patch
>
>
> According to design, {{ErasureCoder}} is responsible for encoding/decoding a block 
> group. Currently, however, we directly use {{RawErasureCoder}} to do the work, 
> e.g. in {{DFSStripedOutputStream}}. This task attempts to encapsulate 
> {{RawErasureCoder}} to comply with the design.
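The encapsulation described above might look roughly like the following sketch. The types here (RawEncoder, XorRawEncoder, BlockGroupEncoder) are simplified, hypothetical stand-ins written for this note; they are not Hadoop's actual ErasureCoder or RawErasureCoder interfaces, and single-parity XOR is used only because it is short.

```java
import java.util.Arrays;

/** Hypothetical low-level coder: works on raw byte arrays only. */
interface RawEncoder {
  void encode(byte[][] inputs, byte[][] outputs);
}

/** Toy single-parity XOR coder, standing in for a real raw coder. */
final class XorRawEncoder implements RawEncoder {
  @Override
  public void encode(byte[][] inputs, byte[][] outputs) {
    byte[] parity = outputs[0];
    Arrays.fill(parity, (byte) 0);
    for (byte[] input : inputs) {
      for (int i = 0; i < parity.length; i++) {
        parity[i] ^= input[i];
      }
    }
  }
}

/** Hypothetical block-group-level coder that hides the raw coder from callers. */
final class BlockGroupEncoder {
  private final RawEncoder raw;
  private final int cellSize;

  BlockGroupEncoder(RawEncoder raw, int cellSize) {
    this.raw = raw;
    this.cellSize = cellSize;
  }

  /** Encodes one stripe of equally sized data cells and returns the parity cell. */
  byte[] encodeStripe(byte[][] dataCells) {
    byte[][] outputs = {new byte[cellSize]};
    raw.encode(dataCells, outputs);
    return outputs[0];
  }
}
```

With this layering a caller such as an output stream would depend only on the block-group-level class, so the raw coder could be swapped without touching the caller.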






[jira] [Commented] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses

2017-08-31 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149863#comment-16149863
 ] 

Sean Busbey commented on HDFS-12384:


Sorry, I'm on mobile so reading patch files is difficult. Is this proposing 
that we ignore all classes in the shaded minicluster module?

> Fixing compilation issue with BanDuplicateClasses
> -
>
> Key: HDFS-12384
> URL: https://issues.apache.org/jira/browse/HDFS-12384
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12384-HDFS-10467-000.patch
>
>







[jira] [Commented] (HDFS-8196) Erasure Coding related information on NameNode UI

2017-08-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149864#comment-16149864
 ] 

Andrew Wang commented on HDFS-8196:
---

Ping, think we can get this patch revved and in?

> Erasure Coding related information on NameNode UI
> -
>
> Key: HDFS-8196
> URL: https://issues.apache.org/jira/browse/HDFS-8196
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>  Labels: NameNode, WebUI, hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-8196.01.patch, HDFS-8196.02.patch, 
> HDFS-8196.03.patch, HDFS-8196.04.patch, Screen Shot 2017-02-06 at 
> 22.30.40.png, Screen Shot 2017-02-12 at 20.21.42.png, Screen Shot 2017-02-14 
> at 22.43.57.png
>
>
> NameNode WebUI shows EC related information and metrics. 
> This depends on [HDFS-7674|https://issues.apache.org/jira/browse/HDFS-7674].






[jira] [Resolved] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8295.
---
Resolution: Invalid

I'm resolving this since it's likely stale; we've substantially revisited the 
pluggable EC policy API.

> Add MODIFY and REMOVE ECSchema editlog operations
> -
>
> Key: HDFS-8295
> URL: https://issues.apache.org/jira/browse/HDFS-8295
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xinwei Qin 
>Assignee: Xinwei Qin 
> Attachments: HDFS-8295.001.patch
>
>
> If MODIFY and REMOVE ECSchema operations are supported, then add these 
> editlog operations to persist them. 






[jira] [Updated] (HDFS-11611) Handle unsuccessful encode and decode calls

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11611:
---
Issue Type: Bug  (was: Sub-task)
Parent: (was: HDFS-8031)

> Handle unsuccessful encode and decode calls
> ---
>
> Key: HDFS-11611
> URL: https://issues.apache.org/jira/browse/HDFS-11611
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha2
>Reporter: László Bence Nagy
>Priority: Minor
>
> Normally encode and decode operations complete successfully, but in 
> rare cases they can fail for some reason, specifically with the 
> native implementations. The framework should be ready to detect these cases 
> (for example by catching exceptions or by checking the operation's return 
> value) and then reschedule the failed operation in some way.
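One possible shape for "catch the failure, then reschedule" is a bounded retry, sketched below. This is an assumption about how the request could be met, not the approach chosen in HDFS: CodingRetry and runWithRetry are hypothetical names, and an immediate in-place retry is the simplest form of rescheduling.

```java
import java.util.function.Supplier;

/** Hypothetical retry wrapper for a coding operation that may fail. */
final class CodingRetry {
  /**
   * Runs the operation up to maxAttempts times and returns the first
   * successful result. If every attempt throws, the last exception is rethrown.
   */
  static <T> T runWithRetry(Supplier<T> operation, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return operation.get();
      } catch (RuntimeException e) {
        // Remember the failure and try again on the next iteration.
        last = e;
      }
    }
    throw last;
  }
}
```

A coder that reports failure through a return value instead of an exception could be adapted by throwing inside the supplied operation when the value indicates an error.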






[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9657:
--
Issue Type: Improvement  (was: Bug)

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -
>
> Key: HDFS-9657
> URL: https://issues.apache.org/jira/browse/HDFS-9657
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Li Bo
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
>
>
> The EC recovery tasks consume a lot of network bandwidth and disk I/O. 
> Recovering a corrupt block requires transferring 6 blocks, hence creating a 
> 6X overhead in network bandwidth and disk I/O.  When a datanode fails, the 
> recovery of all the blocks on this datanode may use up the network 
> bandwidth.  We need to start a recovery task at a proper time in order to 
> reduce the impact on the system.
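The 6X figure above can be turned into a rough traffic estimate. The sketch below is illustrative only: RecoveryTraffic is a hypothetical name, and the estimate assumes every lost block is rebuilt by reading one full block from each of the data-block-count survivors.

```java
/** Hypothetical estimator for EC recovery read traffic. */
final class RecoveryTraffic {
  /**
   * Bytes read to rebuild lostBlocks blocks when each rebuild reads
   * blocksReadPerRecovery full blocks (6 for the case described above).
   * Uses exact arithmetic so an overflow fails loudly instead of wrapping.
   */
  static long bytesToTransfer(long lostBlocks, int blocksReadPerRecovery, long blockSizeBytes) {
    long blocksRead = Math.multiplyExact(lostBlocks, (long) blocksReadPerRecovery);
    return Math.multiplyExact(blocksRead, blockSizeBytes);
  }
}
```

As a worked example with assumed numbers, a failed datanode holding 1,000 blocks of 128 MiB each gives bytesToTransfer(1000, 6, 128L * 1024 * 1024) = 805,306,368,000 bytes, which is 750 GiB of reads for 125 GiB of lost data.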






[jira] [Updated] (HDFS-11543) Test multiple erasure coding implementations

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11543:
---
Issue Type: Improvement  (was: Bug)

> Test multiple erasure coding implementations
> 
>
> Key: HDFS-11543
> URL: https://issues.apache.org/jira/browse/HDFS-11543
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha2
>Reporter: László Bence Nagy
>Priority: Minor
>  Labels: test
>
> Potentially, multiple native erasure coding plugins will be available to be 
> used from HDFS later on. These plugins should be tested as well. For example, 
> the *NativeRSRawErasureCoderFactory* class - which is used for instantiating 
> the native ISA-L plugin's encoder and decoder objects - is used in 5 test 
> files under the 
> *hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/*
>  directory. The files are:
> - *TestDFSStripedInputStream.java*
> - *TestDFSStripedOutputStream.java*
> - *TestDFSStripedOutputStreamWithFailure.java*
> - *TestReconstructStripedFile.java*
> - *TestUnsetAndChangeDirectoryEcPolicy.java*
> Other erasure coding plugins should be tested in these cases as well in a 
> nice way (not by for example making a new file for every new erasure coding 
> plugin). For this purpose [parameterized 
> tests|https://github.com/junit-team/junit4/wiki/parameterized-tests] might be 
> used.
> This is also true for the 
> *hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/*
>  directory where this approach could be used for example for the 
> interoperability tests (when it is checked that certain erasure coding 
> implementations are compatible with each other by doing the encoding and 
> decoding operations with different plugins and verifying their results). The 
> plugin pairs which should be tested could be the parameters for the 
> parameterized tests.
> The parameterized test is just an idea, there can be other solutions as well.
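The pairwise interoperability check described above can be sketched without any test framework. Everything below is a toy written for this note: ToyCoder, RowWiseXor, ColumnWiseXor and InteropCheck are hypothetical, and two XOR implementations stand in for real plugins. With JUnit 4's Parameterized runner, each (encoder, decoder) pair visited by the nested loop would instead be one parameter set.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical minimal coder: produces one XOR parity cell. */
interface ToyCoder {
  byte[] parity(byte[][] cells);
}

/** XOR computed cell by cell. */
final class RowWiseXor implements ToyCoder {
  @Override
  public byte[] parity(byte[][] cells) {
    byte[] out = new byte[cells[0].length];
    for (byte[] cell : cells) {
      for (int i = 0; i < out.length; i++) {
        out[i] ^= cell[i];
      }
    }
    return out;
  }
}

/** Same result, computed byte position by byte position. */
final class ColumnWiseXor implements ToyCoder {
  @Override
  public byte[] parity(byte[][] cells) {
    byte[] out = new byte[cells[0].length];
    for (int i = 0; i < out.length; i++) {
      for (byte[] cell : cells) {
        out[i] ^= cell[i];
      }
    }
    return out;
  }
}

final class InteropCheck {
  /** Encodes with one coder, drops cell 0, and rebuilds it with the other. */
  static boolean interoperable(ToyCoder encoder, ToyCoder decoder, byte[][] data) {
    byte[] parity = encoder.parity(data);
    byte[][] survivors = data.clone(); // shallow copy; data itself is not modified
    survivors[0] = parity;             // the lost cell is replaced by the parity cell
    return Arrays.equals(decoder.parity(survivors), data[0]);
  }

  /** Counts the (encoder, decoder) pairs that round-trip the data. */
  static int countInteroperablePairs(List<ToyCoder> coders, byte[][] data) {
    int ok = 0;
    for (ToyCoder encoder : coders) {
      for (ToyCoder decoder : coders) {
        if (interoperable(encoder, decoder, data)) {
          ok++;
        }
      }
    }
    return ok;
  }
}
```

Adding a new plugin then means adding one element to the coder list rather than a new test file, which is the property the description is after.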






[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9657:
--
Component/s: erasure-coding

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -
>
> Key: HDFS-9657
> URL: https://issues.apache.org/jira/browse/HDFS-9657
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Li Bo
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
>
>
> The EC recovery tasks consume a lot of network bandwidth and disk I/O. 
> Recovering a corrupt block requires transferring 6 blocks, hence creating a 
> 6X overhead in network bandwidth and disk I/O.  When a datanode fails, the 
> recovery of all the blocks on this datanode may use up the network 
> bandwidth.  We need to start a recovery task at a proper time in order to 
> reduce the impact on the system.






[jira] [Updated] (HDFS-9603) Erasure Coding: Use ErasureCoder to encode/decode a block group

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9603:
--
Component/s: erasure-coding

> Erasure Coding: Use ErasureCoder to encode/decode a block group
> ---
>
> Key: HDFS-9603
> URL: https://issues.apache.org/jira/browse/HDFS-9603
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Rui Li
>Assignee: Kai Zheng
> Attachments: HDFS-9603.1.patch, HDFS-9603.2.patch, HDFS-9603.3.patch
>
>
> According to design, {{ErasureCoder}} is responsible for encoding/decoding a block 
> group. Currently, however, we directly use {{RawErasureCoder}} to do the work, 
> e.g. in {{DFSStripedOutputStream}}. This task attempts to encapsulate 
> {{RawErasureCoder}} to comply with the design.






[jira] [Updated] (HDFS-11543) Test multiple erasure coding implementations

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11543:
---
Labels: test  (was: hdfs-ec-3.0-nice-to-have test)

> Test multiple erasure coding implementations
> 
>
> Key: HDFS-11543
> URL: https://issues.apache.org/jira/browse/HDFS-11543
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha2
>Reporter: László Bence Nagy
>Priority: Minor
>  Labels: test
>
> Potentially, multiple native erasure coding plugins will be available to be 
> used from HDFS later on. These plugins should be tested as well. For example, 
> the *NativeRSRawErasureCoderFactory* class - which is used for instantiating 
> the native ISA-L plugin's encoder and decoder objects - is used in 5 test 
> files under the 
> *hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/*
>  directory. The files are:
> - *TestDFSStripedInputStream.java*
> - *TestDFSStripedOutputStream.java*
> - *TestDFSStripedOutputStreamWithFailure.java*
> - *TestReconstructStripedFile.java*
> - *TestUnsetAndChangeDirectoryEcPolicy.java*
> Other erasure coding plugins should be tested in these cases as well in a 
> nice way (not by for example making a new file for every new erasure coding 
> plugin). For this purpose [parameterized 
> tests|https://github.com/junit-team/junit4/wiki/parameterized-tests] might be 
> used.
> This is also true for the 
> *hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/*
>  directory where this approach could be used for example for the 
> interoperability tests (when it is checked that certain erasure coding 
> implementations are compatible with each other by doing the encoding and 
> decoding operations with different plugins and verifying their results). The 
> plugin pairs which should be tested could be the parameters for the 
> parameterized tests.
> The parameterized test is just an idea, there can be other solutions as well.






[jira] [Resolved] (HDFS-9604) Move ErasureCodingPolicyManager to FSDirectory

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-9604.
---
Resolution: Invalid

This code has been refactored a few times now, so this is no longer valid.

> Move ErasureCodingPolicyManager to FSDirectory
> --
>
> Key: HDFS-9604
> URL: https://issues.apache.org/jira/browse/HDFS-9604
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9604.01.patch
>
>
> ErasureCodingPolicy is part of the directory metadata, so it is better to put it in 
> FSDirectory.






[jira] [Updated] (HDFS-9603) Erasure Coding: Use ErasureCoder to encode/decode a block group

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9603:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> Erasure Coding: Use ErasureCoder to encode/decode a block group
> ---
>
> Key: HDFS-9603
> URL: https://issues.apache.org/jira/browse/HDFS-9603
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Kai Zheng
> Attachments: HDFS-9603.1.patch, HDFS-9603.2.patch, HDFS-9603.3.patch
>
>
> According to design, {{ErasureCoder}} is responsible for encoding/decoding a block 
> group. Currently, however, we directly use {{RawErasureCoder}} to do the work, 
> e.g. in {{DFSStripedOutputStream}}. This task attempts to encapsulate 
> {{RawErasureCoder}} to comply with the design.






[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9657:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: HDFS-8031)

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -
>
> Key: HDFS-9657
> URL: https://issues.apache.org/jira/browse/HDFS-9657
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Li Bo
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
>
>
> The EC recovery tasks consume a lot of network bandwidth and disk I/O. 
> Recovering a corrupt block requires transferring 6 blocks, hence creating a 
> 6X overhead in network bandwidth and disk I/O.  When a datanode fails, the 
> recovery of all the blocks on this datanode may use up the network 
> bandwidth.  We need to start a recovery task at a proper time in order to 
> reduce the impact on the system.






[jira] [Updated] (HDFS-11543) Test multiple erasure coding implementations

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11543:
---
Issue Type: Bug  (was: Sub-task)
Parent: (was: HDFS-8031)

> Test multiple erasure coding implementations
> 
>
> Key: HDFS-11543
> URL: https://issues.apache.org/jira/browse/HDFS-11543
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha2
>Reporter: László Bence Nagy
>Priority: Minor
>  Labels: hdfs-ec-3.0-nice-to-have, test
>
> Potentially, multiple native erasure coding plugins will be available to be 
> used from HDFS later on. These plugins should be tested as well. For example, 
> the *NativeRSRawErasureCoderFactory* class - which is used for instantiating 
> the native ISA-L plugin's encoder and decoder objects - is used in 5 test 
> files under the 
> *hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/*
>  directory. The files are:
> - *TestDFSStripedInputStream.java*
> - *TestDFSStripedOutputStream.java*
> - *TestDFSStripedOutputStreamWithFailure.java*
> - *TestReconstructStripedFile.java*
> - *TestUnsetAndChangeDirectoryEcPolicy.java*
> Other erasure coding plugins should be tested in these cases as well in a 
> nice way (not by for example making a new file for every new erasure coding 
> plugin). For this purpose [parameterized 
> tests|https://github.com/junit-team/junit4/wiki/parameterized-tests] might be 
> used.
> This is also true for the 
> *hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/*
>  directory where this approach could be used for example for the 
> interoperability tests (when it is checked that certain erasure coding 
> implementations are compatible with each other by doing the encoding and 
> decoding operations with different plugins and verifying their results). The 
> plugin pairs which should be tested could be the parameters for the 
> parameterized tests.
> The parameterized test is just an idea, there can be other solutions as well.






[jira] [Commented] (HDFS-9696) Garbage snapshot records lingering forever

2017-08-31 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149858#comment-16149858
 ] 

Junping Du commented on HDFS-9696:
--

Added 2.8.0 and 2.9.0 to the fix versions, given that the patch landed there.

> Garbage snapshot records lingering forever
> --
>
> Key: HDFS-9696
> URL: https://issues.apache.org/jira/browse/HDFS-9696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.8.0, 2.9.0, 2.6.5, 2.7.4, 3.0.0-alpha1
>
> Attachments: HDFS-9696.branch-2.6.patch, HDFS-9696-branch-2.7.patch, 
> HDFS-9696.patch, HDFS-9696.v2.patch
>
>
> We have a cluster where the snapshot feature might have been tested years 
> ago. The HDFS does not have any snapshot now, but I see filediff records 
> persisted in its fsimage.  Since it has been restarted many times and 
> checkpointed over 100 times since then, they must have been persisted and  
> carried over since then.






[jira] [Updated] (HDFS-9696) Garbage snapshot records lingering forever

2017-08-31 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9696:
-
Fix Version/s: 2.8.0
   2.9.0

> Garbage snapshot records lingering forever
> --
>
> Key: HDFS-9696
> URL: https://issues.apache.org/jira/browse/HDFS-9696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.8.0, 2.9.0, 2.6.5, 2.7.4, 3.0.0-alpha1
>
> Attachments: HDFS-9696.branch-2.6.patch, HDFS-9696-branch-2.7.patch, 
> HDFS-9696.patch, HDFS-9696.v2.patch
>
>
> We have a cluster where the snapshot feature might have been tested years 
> ago. The HDFS does not have any snapshot now, but I see filediff records 
> persisted in its fsimage.  Since it has been restarted many times and 
> checkpointed over 100 times since then, they must have been persisted and  
> carried over since then.






[jira] [Commented] (HDFS-10763) Open files can leak permanently due to inconsistent lease update

2017-08-31 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149856#comment-16149856
 ] 

Junping Du commented on HDFS-10763:
---

Added 2.8.0 and 2.9.0 to the fix versions, given that the patch landed there.

> Open files can leak permanently due to inconsistent lease update
> 
>
> Key: HDFS-10763
> URL: https://issues.apache.org/jira/browse/HDFS-10763
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.3, 2.6.4
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.8.0, 2.9.0, 2.6.5, 2.7.4, 3.0.0-alpha1
>
> Attachments: HDFS-10763.br27.patch, 
> HDFS-10763.branch-2.7.supplement.patch, HDFS-10763.branch-2.7.v2.patch, 
> HDFS-10763.patch
>
>
> This can happen during {{commitBlockSynchronization()}} or when a client gives up 
> on closing a file after retries.
> From {{finalizeINodeFileUnderConstruction()}}, the lease is removed first and 
> then the inode is turned into the closed state. But if any block is not in 
> COMPLETE state, 
> {{INodeFile#assertAllBlocksComplete()}} will throw an exception. This will 
> cause the lease to be removed from the lease manager, but not from the inode. 
> Since the lease manager does not have a lease for the file, no lease recovery 
> will happen for this file. Moreover, this broken state is persisted and 
> reconstructed through saving and loading of fsimage. Since no replication is 
> scheduled for the blocks of the file, this can cause data loss and also 
> block decommissioning of datanodes.
> The lease cannot be manually recovered either. It fails with
> {noformat}
> ...AlreadyBeingCreatedException): Failed to RECOVER_LEASE /xyz/xyz for user1 
> on
>  0.0.0.1 because the file is under construction but no leases found.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2950)
> ...
> {noformat}
> When a client retries {{close()}}, the same inconsistent state is created, 
> but the retry can succeed the next time since {{checkLease()}} only looks at 
> the inode, not the lease manager, in this case. The close behavior is different if 
> HDFS-8999 is activated by setting 
> {{dfs.namenode.file.close.num-committed-allowed}} to 1 (unlikely) or 2 
> (never). 
> In principle, the under-construction feature of an inode and the lease in the 
> lease manager should never go out of sync. The fix involves two parts.
> 1) Prevent inconsistent lease updates. We can achieve this by calling 
> {{removeLease()}} after checking the block state. 
> 2) Avoid reconstructing inconsistent lease states from an fsimage. Part 1) 
> alone does not correct existing inconsistencies that survive in fsimages.  
> Part 2) can be done at fsimage loading time by making sure a corresponding 
> lease exists for each inode that has the under-construction feature. 
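
Part 1) of the fix described above can be illustrated with a minimal Java sketch. It is not the actual NameNode code: `LeaseOrderingSketch`, `OpenFile`, `finalizeFile` and `isConsistent` are hypothetical stand-ins, and the lease manager is modelled as a plain set of paths. The only point it shows is the ordering: verify the block states first, and touch the lease and the under-construction flag only once that check has passed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for the NameNode structures; not the real HDFS classes.
class LeaseOrderingSketch {
  enum BlockState { COMPLETE, COMMITTED, UNDER_CONSTRUCTION }

  static class OpenFile {
    final String path;
    final List<BlockState> blocks;
    boolean underConstruction = true;

    OpenFile(String path, List<BlockState> blocks) {
      this.path = path;
      this.blocks = blocks;
    }
  }

  // Paths that currently hold a lease (models the lease manager).
  final Set<String> leases = new HashSet<>();

  // Check every block before changing any state, so that a failure leaves
  // both the lease and the under-construction flag untouched.
  void finalizeFile(OpenFile file) {
    for (BlockState state : file.blocks) {
      if (state != BlockState.COMPLETE) {
        throw new IllegalStateException("block not COMPLETE in " + file.path);
      }
    }
    leases.remove(file.path);
    file.underConstruction = false;
  }

  // Invariant from the description: under construction if and only if leased.
  boolean isConsistent(OpenFile file) {
    return file.underConstruction == leases.contains(file.path);
  }

  public static void main(String[] args) {
    LeaseOrderingSketch ns = new LeaseOrderingSketch();
    OpenFile f = new OpenFile("/xyz/xyz",
        Arrays.asList(BlockState.COMPLETE, BlockState.COMMITTED));
    ns.leases.add(f.path);
    try {
      ns.finalizeFile(f);
    } catch (IllegalStateException e) {
      System.out.println("close rejected, consistent=" + ns.isConsistent(f));
    }
  }
}
```

With the check placed first, a failed close leaves the file leased and under construction, so lease recovery can still find it later.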



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10763) Open files can leak permanently due to inconsistent lease update

2017-08-31 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-10763:
--
Fix Version/s: 2.8.0
   2.9.0

> Open files can leak permanently due to inconsistent lease update
> 
>
> Key: HDFS-10763
> URL: https://issues.apache.org/jira/browse/HDFS-10763
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.3, 2.6.4
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.8.0, 2.9.0, 2.6.5, 2.7.4, 3.0.0-alpha1
>
> Attachments: HDFS-10763.br27.patch, 
> HDFS-10763.branch-2.7.supplement.patch, HDFS-10763.branch-2.7.v2.patch, 
> HDFS-10763.patch
>
>
> This can happen during {{commitBlockSynchronization()}} or when a client 
> gives up on closing a file after retries.
> From {{finalizeINodeFileUnderConstruction()}}, the lease is removed first and 
> then the inode is turned into the closed state. But if any block is not in 
> COMPLETE state, 
> {{INodeFile#assertAllBlocksComplete()}} will throw an exception. This causes 
> the lease to be removed from the lease manager, but not from the inode. 
> Since the lease manager does not have a lease for the file, no lease recovery 
> will happen for this file. Moreover, this broken state is persisted and 
> reconstructed through saving and loading of fsimage. Since no replication is 
> scheduled for the blocks of the file, this can cause data loss and also 
> block the decommissioning of datanodes.
> The lease cannot be manually recovered either. It fails with
> {noformat}
> ...AlreadyBeingCreatedException): Failed to RECOVER_LEASE /xyz/xyz for user1 
> on
>  0.0.0.1 because the file is under construction but no leases found.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2950)
> ...
> {noformat}
> When a client retries {{close()}}, the same inconsistent state is created, 
> but the retry can succeed the next time since {{checkLease()}} only looks at 
> the inode, not the lease manager, in this case. The close behavior is different if 
> HDFS-8999 is activated by setting 
> {{dfs.namenode.file.close.num-committed-allowed}} to 1 (unlikely) or 2 
> (never). 
> In principle, the under-construction feature of an inode and the lease in the 
> lease manager should never go out of sync. The fix involves two parts.
> 1) Prevent inconsistent lease updates. We can achieve this by calling 
> {{removeLease()}} after checking the block state. 
> 2) Avoid reconstructing inconsistent lease states from an fsimage. Part 1) 
> alone does not correct existing inconsistencies that survive in fsimages.  
> Part 2) can be done at fsimage loading time by making sure a corresponding 
> lease exists for each inode that has the under-construction feature. 






[jira] [Commented] (HDFS-9381) When same block came for replication for Striped mode, we can move that block to PendingReplications

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149851#comment-16149851
 ] 

Hadoop QA commented on HDFS-9381:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 5s | HDFS-9381 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-9381 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775199/HDFS-9381-04.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20952/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> When same block came for replication for Striped mode, we can move that block 
> to PendingReplications
> 
>
> Key: HDFS-9381
> URL: https://issues.apache.org/jira/browse/HDFS-9381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9381.00.patch, HDFS-9381.01.patch, 
> HDFS-9381-02.patch, HDFS-9381-03.patch, HDFS-9381-04.patch
>
>
> Currently, we just return null if the block already exists 
> in pendingReplications in the replication flow for striped blocks.
> {code}
> if (block.isStriped()) {
>   if (pendingNum > 0) {
>     // Wait the previous recovery to finish.
>     return null;
>   }
> {code}
> If we just return null here and neededReplications contains only a few 
> blocks (by default, fewer than numliveNodes*2), then the same blocks 
> can be picked again from neededReplications in the next loop, because we do 
> not remove the element from neededReplications. Since this replication process 
> has to take the FSNamesystem lock, we may spend time unnecessarily in 
> every loop. 
> So my suggestion/improvement is:
> Instead of just returning null, how about incrementing pendingReplications 
> for this block and removing it from neededReplications? Another point to 
> consider: to add a block to pendingReplications, we generally need a target, 
> which is simply the node to which we issued the replication command. Later, 
> once replication succeeds and the DN reports it, the block is removed from 
> pendingReplications by the NN in addBlock. 
> Since this is a newly picked block from neededReplications, we would not 
> have selected a target yet. So which target should be passed to 
> pendingReplications if we add this block? One option I am considering is to 
> pass srcNode itself as the target for this special condition. If the block 
> is really missing, srcNode will not report it, so the block will not be 
> removed from pendingReplications. When it times out, it will be 
> considered for replication again, and at that time an actual target will be 
> found as part of the regular replication flow.
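
The suggestion above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `PendingMoveSketch`, `pickStripedBlock` and `onPendingTimeout` are hypothetical names, blocks and nodes are plain strings, and the two collections merely model neededReplications and pendingReplications. It is not the BlockManager implementation.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the two NameNode queues; not the real BlockManager code.
class PendingMoveSketch {
  final Set<String> needed = new LinkedHashSet<>();    // models neededReplications
  final Map<String, String> pending = new HashMap<>(); // models pendingReplications (block -> target)

  // Returns true when the caller should go on to schedule real reconstruction work.
  boolean pickStripedBlock(String block, int pendingNum, String srcNode) {
    if (pendingNum > 0) {
      // A previous recovery is still running: park the block in pending,
      // with srcNode as a placeholder target, instead of leaving it in needed.
      needed.remove(block);
      pending.put(block, srcNode);
      return false;
    }
    return true;
  }

  // On timeout the block returns to needed so that a real target can be chosen.
  void onPendingTimeout(String block) {
    if (pending.remove(block) != null) {
      needed.add(block);
    }
  }

  public static void main(String[] args) {
    PendingMoveSketch q = new PendingMoveSketch();
    q.needed.add("blk_1");
    q.pickStripedBlock("blk_1", 1, "dn-src");
    System.out.println("needed=" + q.needed + " pending=" + q.pending);
  }
}
```

Because the parked block is no longer in the needed set, later scans do not pick it up again until the timeout moves it back.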






[jira] [Commented] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses

2017-08-31 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149853#comment-16149853
 ] 

Sean Busbey commented on HDFS-12384:


Gimme time to gain context here.

> Fixing compilation issue with BanDuplicateClasses
> -
>
> Key: HDFS-12384
> URL: https://issues.apache.org/jira/browse/HDFS-12384
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12384-HDFS-10467-000.patch
>
>







[jira] [Commented] (HDFS-9374) Inform user when a file with corrupted data blocks are read

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149852#comment-16149852
 ] 

Hadoop QA commented on HDFS-9374:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 6s | HDFS-9374 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-9374 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782223/HDFS-9374-001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20953/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Inform user when a file with corrupted data blocks are read
> ---
>
> Key: HDFS-9374
> URL: https://issues.apache.org/jira/browse/HDFS-9374
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Li Bo
>Assignee: Li Bo
>  Labels: hdfs-ec-3.0-nice-to-have, supportability
> Attachments: HDFS-9374-001.patch
>
>
> When reading a block group with corrupt data blocks, it would be better to 
> tell the user which blocks are corrupt, so that the user knows the status of 
> the file being read. 






[jira] [Updated] (HDFS-8881) Erasure Coding: internal blocks got missed and got over-replicated at the same time

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8881:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: HDFS-8031)

> Erasure Coding: internal blocks got missed and got over-replicated at the 
> same time
> ---
>
> Key: HDFS-8881
> URL: https://issues.apache.org/jira/browse/HDFS-8881
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-8881.00.patch
>
>
> We know the Repl checking depends on {{BlockManager#countNodes()}}, but 
> countNodes() has a limitation for striped block groups.
> *One* missing internal block will be caught by Repl checking, and handled by 
> ReplicationMonitor.
> *One* over-replicated internal block will be caught by Repl checking, and 
> handled by processOverReplicatedBlocks.
> *One* missing internal block and *two* over-replicated internal blocks *at 
> the same time* will be caught by Repl checking, and handled by 
> processOverReplicatedBlocks, later by ReplicationMonitor.
> *One* missing internal block and *one* over-replicated internal block *at the 
> same time* will *NOT* be caught by Repl checking.
> "At the same time" means one missing internal block can't be recovered, and 
> one internal block got over-replicated anyway. For example:
> Scenario A:
> 1. Blocks #0 and #1 are reported missing.
> 2. A new #1 gets recovered.
> 3. The old #1 comes back, and the recovery work for #0 fails.
> Scenario B:
> 1. A DN that has #1 is decommissioned/dead.
> 2. Block #0 is reported missing.
> 3. The DN that has #1 is recommissioned, and the recovery work for #0 fails.
> In the end, the blockGroup has \[1, 1, 2, 3, 4, 5, 6, 7, 8\], assuming a 6+3 
> schema. The client always needs to decode #0 if the blockGroup doesn't get 
> handled.
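
The blind spot described above can be shown with a simplified Java sketch. It assumes that the problem reduces to comparing a count of reported internal blocks against the group size. The real {{BlockManager#countNodes()}} is more involved, and `StripedCheckSketch` with its two methods is a hypothetical illustration only. For the group \[1, 1, 2, 3, 4, 5, 6, 7, 8\] the count matches 6+3 = 9, yet index 0 is absent.

```java
import java.util.BitSet;

// Simplified illustration; not the logic of BlockManager#countNodes().
class StripedCheckSketch {
  // Count-only check: looks at how many internal blocks were reported.
  static boolean countLooksHealthy(int[] reportedIndices, int groupSize) {
    return reportedIndices.length == groupSize;
  }

  // Index-aware check: every index 0..groupSize-1 must appear exactly once.
  static boolean indicesAreHealthy(int[] reportedIndices, int groupSize) {
    BitSet seen = new BitSet(groupSize);
    for (int idx : reportedIndices) {
      if (idx < 0 || idx >= groupSize || seen.get(idx)) {
        return false; // out of range, or the same internal block twice
      }
      seen.set(idx);
    }
    return seen.cardinality() == groupSize;
  }

  public static void main(String[] args) {
    int[] group = {1, 1, 2, 3, 4, 5, 6, 7, 8}; // #0 missing, #1 duplicated
    System.out.println("count check:  " + countLooksHealthy(group, 9));
    System.out.println("index check:  " + indicesAreHealthy(group, 9));
  }
}
```

A check keyed on block indices, rather than on totals, is one way such a group could be flagged; whether the eventual fix takes that shape is not stated in this thread.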






[jira] [Updated] (HDFS-9256) Erasure Coding: Improve failure handling of ECWorker striped block reconstruction

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9256:
--
Component/s: erasure-coding

> Erasure Coding: Improve failure handling of ECWorker striped block 
> reconstruction
> -
>
> Key: HDFS-9256
> URL: https://issues.apache.org/jira/browse/HDFS-9256
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-nice-to-have
>
> As we know, reconstruction of a missed striped block is a costly operation. It 
> involves the following steps:
> step-1) read the data from the minimum number of sources (remotely reading the 
> data)
> step-2) decode data for the targets (CPU cycles)
> step-3) transfer the data to the targets (remotely writing the data)
> Assume there is a failure in step-3 because a target DN is disconnected, dead, 
> etc. Presently, {{ECWorker}} skips the failed DN and continues 
> transferring data to the other targets. In the next round, it has to 
> start the reconstruction operation again from the first step. Considering the 
> cost of reconstruction, it would be good to give the failed operation another 
> chance. The idea of this jira is to discuss the possible approaches and 
> implement one.
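
One possible shape of the retry discussed above, sketched in Java. The issue leaves the approach open, so this is an assumption rather than the agreed design: `TransferRetrySketch` and `sendWithRetry` are hypothetical, and the transfer to a target DN is modelled as a `Predicate` that returns true on success. The point is that only step-3 is repeated, while the already decoded data from step-1 and step-2 is reused.

```java
import java.util.function.Predicate;

// Hypothetical sketch; ECWorker's real transfer path is not modelled here.
class TransferRetrySketch {
  // Sends already decoded data up to maxAttempts times.
  // Returns the attempt number that succeeded, or -1 if all attempts failed.
  static int sendWithRetry(byte[] decoded, Predicate<byte[]> sender, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (sender.test(decoded)) {
        return attempt; // no re-read and no re-decode was needed
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Simulated target that rejects the first two attempts.
    int attempts = sendWithRetry(new byte[] {1, 2, 3}, data -> ++calls[0] >= 3, 5);
    System.out.println("delivered after attempts: " + attempts);
  }
}
```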






[jira] [Commented] (HDFS-12377) Refactor TestReadStripedFileWithDecoding to avoid test timeouts

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149847#comment-16149847
 ] 

Hadoop QA commented on HDFS-12377:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 8s | trunk passed |
| +1 | compile | 1m 3s | trunk passed |
| +1 | checkstyle | 0m 44s | trunk passed |
| +1 | mvnsite | 1m 4s | trunk passed |
| +1 | findbugs | 2m 6s | trunk passed |
| +1 | javadoc | 0m 46s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 56s | the patch passed |
| +1 | javac | 0m 56s | hadoop-hdfs-project_hadoop-hdfs generated 0 new + 408 unchanged - 3 fixed = 408 total (was 411) |
| -0 | checkstyle | 0m 36s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 7 new + 9 unchanged - 6 fixed = 16 total (was 15) |
| +1 | mvnsite | 1m 6s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 0m 44s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 99m 22s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 129m 45s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
|   | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.TestReadStripedFileWithDecodingDeletedData |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12377 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12884817/HDFS-12377.002.patch |
| Optional Tests |  

[jira] [Updated] (HDFS-8931) Erasure Coding: Notify exception to client side from ParityGenerator.

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8931:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: HDFS-8031)

> Erasure Coding: Notify exception to client side from ParityGenerator.
> -
>
> Key: HDFS-8931
> URL: https://issues.apache.org/jira/browse/HDFS-8931
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>  Labels: EC
>
> Following HDFS-8287. 
> Currently, the client thread catches the exception from {{ParityGenerator}}. 
> In order to handle it properly: 
> 1. Put the handling logic together into an UncaughtExceptionHandler.
> 2. Notify the client side of the exception from the UncaughtExceptionHandler.
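
The two steps above map onto the standard `Thread.UncaughtExceptionHandler` hook in the JDK. The Java sketch below is a minimal illustration, not the HDFS client code: `ParityFailureSketch` and `runAndCapture` are hypothetical names, and the parity work is an arbitrary `Runnable`. The handler records the failure in one place, and the calling thread reads it after the worker finishes.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; ParityGenerator itself is not modelled here.
class ParityFailureSketch {
  // Runs the work on a separate thread and returns the exception that
  // escaped it, or null if the work completed normally.
  static Throwable runAndCapture(Runnable work) {
    AtomicReference<Throwable> failure = new AtomicReference<>();
    Thread worker = new Thread(work, "parity-generator-sketch");
    // Step 1: all failure handling lives in the handler.
    worker.setUncaughtExceptionHandler((thread, error) -> failure.set(error));
    worker.start();
    try {
      worker.join();
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      return ie;
    }
    // Step 2: the caller (the client side) sees the recorded failure.
    return failure.get();
  }

  public static void main(String[] args) {
    Throwable t = runAndCapture(() -> {
      throw new IllegalStateException("encode failed");
    });
    System.out.println("client saw: " + t);
  }
}
```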






[jira] [Updated] (HDFS-8225) EC client code should not print info log message

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8225:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> EC client code should not print info log message
> 
>
> Key: HDFS-8225
> URL: https://issues.apache.org/jira/browse/HDFS-8225
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Tsz Wo Nicholas Sze
>  Labels: hdfs-ec-3.0-nice-to-have
>
> There are many LOG.info(..) calls in the code.  We should either remove them 
> or change the log level.  Users don't want to see any log message on the 
> screen when running the client.






[jira] [Updated] (HDFS-8881) Erasure Coding: internal blocks got missed and got over-replicated at the same time

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8881:
--
Component/s: erasure-coding

> Erasure Coding: internal blocks got missed and got over-replicated at the 
> same time
> ---
>
> Key: HDFS-8881
> URL: https://issues.apache.org/jira/browse/HDFS-8881
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-8881.00.patch
>
>
> We know the Repl checking depends on {{BlockManager#countNodes()}}, but 
> countNodes() has a limitation for striped block groups.
> *One* missing internal block will be caught by Repl checking, and handled by 
> ReplicationMonitor.
> *One* over-replicated internal block will be caught by Repl checking, and 
> handled by processOverReplicatedBlocks.
> *One* missing internal block and *two* over-replicated internal blocks *at 
> the same time* will be caught by Repl checking, and handled by 
> processOverReplicatedBlocks, later by ReplicationMonitor.
> *One* missing internal block and *one* over-replicated internal block *at the 
> same time* will *NOT* be caught by Repl checking.
> "At the same time" means one missing internal block can't be recovered, and 
> one internal block got over-replicated anyway. For example:
> Scenario A:
> 1. Blocks #0 and #1 are reported missing.
> 2. A new #1 gets recovered.
> 3. The old #1 comes back, and the recovery work for #0 fails.
> Scenario B:
> 1. A DN that has #1 is decommissioned/dead.
> 2. Block #0 is reported missing.
> 3. The DN that has #1 is recommissioned, and the recovery work for #0 fails.
> In the end, the blockGroup has \[1, 1, 2, 3, 4, 5, 6, 7, 8\], assuming a 6+3 
> schema. The client always needs to decode #0 if the blockGroup doesn't get 
> handled.






[jira] [Updated] (HDFS-9256) Erasure Coding: Improve failure handling of ECWorker striped block reconstruction

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9256:
--
Labels: hdfs-ec-3.0-nice-to-have  (was: )

> Erasure Coding: Improve failure handling of ECWorker striped block 
> reconstruction
> -
>
> Key: HDFS-9256
> URL: https://issues.apache.org/jira/browse/HDFS-9256
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-nice-to-have
>
> As we know, reconstruction of a missed striped block is a costly operation. It 
> involves the following steps:
> step-1) read the data from the minimum number of sources (remotely reading the 
> data)
> step-2) decode data for the targets (CPU cycles)
> step-3) transfer the data to the targets (remotely writing the data)
> Assume there is a failure in step-3 because a target DN is disconnected, dead, 
> etc. Presently, {{ECWorker}} skips the failed DN and continues 
> transferring data to the other targets. In the next round, it has to 
> start the reconstruction operation again from the first step. Considering the 
> cost of reconstruction, it would be good to give the failed operation another 
> chance. The idea of this jira is to discuss the possible approaches and 
> implement one.






[jira] [Updated] (HDFS-9345) Erasure Coding: create dummy coder and schema

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9345:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> Erasure Coding: create dummy coder and schema
> -
>
> Key: HDFS-9345
> URL: https://issues.apache.org/jira/browse/HDFS-9345
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-nice-to-have
>
> We can create a dummy coder which does no computation and simply returns zero 
> bytes. Similarly, we can create a test-only schema with no parity blocks.
> Such a coder and schema can be used to isolate a performance issue to 
> HDFS-side logic rather than the codec, which would be useful when tuning 
> the performance of EC.
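
A dummy coder of the kind proposed above could look roughly like the Java sketch below. The `Encoder` interface here is a hypothetical, simplified stand-in and not Hadoop's actual coder API. The dummy does no computation and returns zero-filled parity units, and passing 0 parity units models the test-only schema with no parity blocks.

```java
// Hypothetical sketch; the Encoder interface is not Hadoop's real coder API.
class DummyCoderSketch {
  interface Encoder {
    // Returns numParityUnits parity cells for the given data cells.
    byte[][] encode(byte[][] dataUnits, int numParityUnits);
  }

  // Does no computation: Java zero-fills newly allocated arrays.
  static final Encoder DUMMY = (dataUnits, numParityUnits) -> {
    int cellSize = dataUnits.length == 0 ? 0 : dataUnits[0].length;
    return new byte[numParityUnits][cellSize];
  };

  public static void main(String[] args) {
    byte[][] data = {{1, 2, 3}, {4, 5, 6}};
    byte[][] parity = DUMMY.encode(data, 2);
    System.out.println("parity units: " + parity.length
        + ", cell size: " + parity[0].length);
  }
}
```

Swapping such a coder in for the real one removes codec CPU time from a benchmark, so whatever cost remains can be attributed to the HDFS-side logic.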






[jira] [Updated] (HDFS-9256) Erasure Coding: Improve failure handling of ECWorker striped block reconstruction

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9256:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> Erasure Coding: Improve failure handling of ECWorker striped block 
> reconstruction
> -
>
> Key: HDFS-9256
> URL: https://issues.apache.org/jira/browse/HDFS-9256
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-nice-to-have
>
> As we know, reconstruction of a missed striped block is a costly operation. It 
> involves the following steps:
> step-1) read the data from the minimum number of sources (remotely reading the 
> data)
> step-2) decode data for the targets (CPU cycles)
> step-3) transfer the data to the targets (remotely writing the data)
> Assume there is a failure in step-3 because a target DN is disconnected, dead, 
> etc. Presently, {{ECWorker}} skips the failed DN and continues 
> transferring data to the other targets. In the next round, it has to 
> start the reconstruction operation again from the first step. Considering the 
> cost of reconstruction, it would be good to give the failed operation another 
> chance. The idea of this jira is to discuss the possible approaches and 
> implement one.






[jira] [Updated] (HDFS-9374) Inform user when a file with corrupted data blocks are read

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9374:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> Inform user when a file with corrupted data blocks are read
> ---
>
> Key: HDFS-9374
> URL: https://issues.apache.org/jira/browse/HDFS-9374
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Li Bo
>Assignee: Li Bo
>  Labels: hdfs-ec-3.0-nice-to-have, supportability
> Attachments: HDFS-9374-001.patch
>
>
> When reading a block group with corrupt data blocks, it would be better to 
> tell the user which blocks are corrupt, so that the user knows the status of 
> the file being read. 






[jira] [Commented] (HDFS-9381) When same block came for replication for Striped mode, we can move that block to PendingReplications

2017-08-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149845#comment-16149845
 ] 

Andrew Wang commented on HDFS-9381:
---

[~eddyxu] could you triage this one, since you've been looking at EC recovery 
recently? Worth putting on nice-to-have, or just bump it out?

> When same block came for replication for Striped mode, we can move that block 
> to PendingReplications
> 
>
> Key: HDFS-9381
> URL: https://issues.apache.org/jira/browse/HDFS-9381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9381.00.patch, HDFS-9381.01.patch, 
> HDFS-9381-02.patch, HDFS-9381-03.patch, HDFS-9381-04.patch
>
>
> Currently, we just return null if the block already exists 
> in pendingReplications in the replication flow for striped blocks.
> {code}
> if (block.isStriped()) {
>   if (pendingNum > 0) {
>     // Wait the previous recovery to finish.
>     return null;
>   }
> {code}
> If we just return null here and neededReplications contains only a few 
> blocks (by default, fewer than numliveNodes*2), then the same blocks 
> can be picked again from neededReplications in the next loop, because we do 
> not remove the element from neededReplications. Since this replication process 
> has to take the FSNamesystem lock, we may spend time unnecessarily in 
> every loop. 
> So my suggestion/improvement is:
> Instead of just returning null, how about incrementing pendingReplications 
> for this block and removing it from neededReplications? Another point to 
> consider: to add a block to pendingReplications, we generally need a target, 
> which is simply the node to which we issued the replication command. Later, 
> once replication succeeds and the DN reports it, the block is removed from 
> pendingReplications by the NN in addBlock. 
> Since this is a newly picked block from neededReplications, we would not 
> have selected a target yet. So which target should be passed to 
> pendingReplications if we add this block? One option I am considering is to 
> pass srcNode itself as the target for this special condition. If the block 
> is really missing, srcNode will not report it, so the block will not be 
> removed from pendingReplications. When it times out, it will be 
> considered for replication again, and at that time an actual target will be 
> found as part of the regular replication flow.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9374) Inform user when a file with corrupted data blocks are read

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9374:
--
Labels: hdfs-ec-3.0-nice-to-have supportability  (was: supportability)

> Inform user when a file with corrupted data blocks are read
> ---
>
> Key: HDFS-9374
> URL: https://issues.apache.org/jira/browse/HDFS-9374
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
>  Labels: hdfs-ec-3.0-nice-to-have, supportability
> Attachments: HDFS-9374-001.patch
>
>
> When reading a block group with corrupt data blocks, it would be better to 
> tell the user which blocks are corrupt, so that the user knows the status of 
> the file being read. 






[jira] [Updated] (HDFS-9382) Track the acks for the packets which are sent from ErasureCodingWorker as part of reconstruction work

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9382:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: HDFS-8031)

> Track the acks for the packets which are sent from ErasureCodingWorker as 
> part of reconstruction work
> -
>
> Key: HDFS-9382
> URL: https://issues.apache.org/jira/browse/HDFS-9382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.0-alpha1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> Currently we are not tracking the acks for the packets which are sent from 
> the DN ECWorker as part of reconstruction work. This JIRA proposes to track 
> the acks: reconstruction work is really expensive, so we should know if any 
> packets failed to write at the target DN. 






[jira] [Resolved] (HDFS-9386) Erasure coding: updateBlockForPipeline sometimes returns non-striped block for striped file

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-9386.
---
  Resolution: Cannot Reproduce
Target Version/s:   (was: )

Haven't seen this recently and bug is two years old, resolving.

> Erasure coding: updateBlockForPipeline sometimes returns non-striped block 
> for striped file
> ---
>
> Key: HDFS-9386
> URL: https://issues.apache.org/jira/browse/HDFS-9386
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Zhe Zhang
>
> I've seen this bug a few times. The returned {{LocatedBlock}} from 
> {{updateBlockForPipeline}} is sometimes not {{LocatedStripedBlock}}. However, 
> {{FSNamesystem#bumpBlockGenerationStamp}} did return a 
> {{LocatedStripedBlock}}. Maybe a bug in PB. I'm still debugging.






[jira] [Resolved] (HDFS-8140) ECSchema supports for offline EditsVisitor over an OEV XML file

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8140.
---
Resolution: Duplicate

This is a dupe of HDFS-11467.

> ECSchema supports for offline EditsVisitor over an OEV XML file
> ---
>
> Key: HDFS-8140
> URL: https://issues.apache.org/jira/browse/HDFS-8140
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Xinwei Qin 
>Assignee: Xinwei Qin 
>
> Make the ECSchema info in the edit log support the offline EditsVisitor over 
> an OEV XML file, which is not implemented in HDFS-7859.






[jira] [Commented] (HDFS-11467) Support ErasureCodingPolicyManager section in OIV XML/ReverseXML and OEV tools

2017-08-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149838#comment-16149838
 ] 

Andrew Wang commented on HDFS-11467:


Hi [~Sammi] FYI that with HDFS-7859, we also need to add OIV and OEV support 
for the new fields. Can do it in HDFS-7859 or here as a follow-on.

> Support ErasureCodingPolicyManager section in OIV XML/ReverseXML and OEV tools
> --
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-7859, after the ErasureCodingPolicyManager section is 
> added into the fsimage, we would also like to support converting this section 
> to XML and back using the OIV tool.
> Likewise, HDFS-7859 adds new edit log ops, so the OEV tool should also support them.






[jira] [Updated] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-7859:
--
Labels: hdfs-ec-3.0-must-do  (was: )

> Erasure Coding: Persist erasure coding policies in NameNode
> ---
>
> Key: HDFS-7859
> URL: https://issues.apache.org/jira/browse/HDFS-7859
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-7859.001.patch, HDFS-7859.002.patch, 
> HDFS-7859.004.patch, HDFS-7859.005.patch, HDFS-7859.006.patch, 
> HDFS-7859.007.patch, HDFS-7859.008.patch, HDFS-7859.009.patch, 
> HDFS-7859.010.patch, HDFS-7859.011.patch, HDFS-7859-HDFS-7285.002.patch, 
> HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
> persist EC schemas in NameNode centrally and reliably, so that EC zones can 
> reference them by name efficiently.






[jira] [Issue Comment Deleted] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent

2017-08-31 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-11882:
-
Comment: was deleted

(was: Travel today. Please expect slow response.

)

> Client fails if acknowledged size is greater than bytes sent
> 
>
> Key: HDFS-11882
> URL: https://issues.apache.org/jira/browse/HDFS-11882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, test
>Reporter: Akira Ajisaka
>Assignee: Andrew Wang
>Priority: Critical
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, 
> HDFS-11882.03.patch, HDFS-11882.04.patch, HDFS-11882.05.patch, 
> HDFS-11882.regressiontest.patch
>
>
> Some erasure coding tests fail with the following exception. The following 
> test was removed by HDFS-11823; however, this type of error can happen in a 
> real cluster.
> {noformat}
> Running 
> org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> Tests run: 14, Failures: 0, Errors: 1, Skipped: 10, Time elapsed: 89.086 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> testMultipleDatanodeFailure56(org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure)
>   Time elapsed: 38.831 sec  <<< ERROR!
> java.lang.IllegalStateException: null
>   at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:381)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:245)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-08-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149832#comment-16149832
 ] 

Andrew Wang commented on HDFS-12222:


Hi Huafeng, thanks for the rev, looks good. Please proceed with a full patch.

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12222.001.patch, HDFS-12222.002.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.
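
To make the offset problem concrete, here is a small sketch under stated assumptions. `LocSketch` is a hypothetical, simplified stand-in for a block location, and its `isParity` flag is an assumption for illustration, not an existing BlockLocation field. The arithmetic in `bytesInThisBlock` mirrors the FileInputFormat snippet quoted above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified block location; the isParity flag is an assumption. */
class LocSketch {
  final long offset;
  final long length;
  final boolean isParity;

  LocSketch(long offset, long length, boolean isParity) {
    this.offset = offset;
    this.length = length;
    this.isParity = isParity;
  }
}

class SplitOffsetSketch {
  /** Keeps only the locations that hold logical file data. */
  static List<LocSketch> dataOnly(List<LocSketch> locs) {
    List<LocSketch> result = new ArrayList<>();
    for (LocSketch loc : locs) {
      if (!loc.isParity) {
        result.add(loc);
      }
    }
    return result;
  }

  /** Same arithmetic as the quoted snippet: bytes left in this block from the given file offset. */
  static long bytesInThisBlock(LocSketch loc, long offset) {
    return loc.offset + loc.length - offset;
  }

  /** Sum of the location lengths, for comparison with the logical file length. */
  static long totalLength(List<LocSketch> locs) {
    long total = 0;
    for (LocSketch loc : locs) {
      total += loc.length;
    }
    return total;
  }
}
```

If parity locations are left in the list, `totalLength` exceeds the logical file length. Filtering with `dataOnly` first restores the invariant that the split computation relies on.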






[jira] [Commented] (HDFS-12335) Federation Metrics

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149824#comment-16149824
 ] 

Hadoop QA commented on HDFS-12335:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-10467 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 18m 
56s{color} | {color:red} root in HDFS-10467 failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} HDFS-10467 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 50s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 5 new + 404 unchanged - 0 fixed = 409 total (was 404) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 36s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12335 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12884815/HDFS-12335-HDFS-10467.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux cfb058b0f6f1 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-10467 / fc2c254 |
| Default Java | 1.8.0_144 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20947/artifact/patchprocess/branch-mvninstall-root.txt
 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20947/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20947/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20947/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output 

[jira] [Updated] (HDFS-12382) Ozone: SCM: BlockManager creates a new container for each allocateBlock call

2017-08-31 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-12382:

  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: HDFS-7240
Target Version/s: HDFS-7240
  Status: Resolved  (was: Patch Available)

[~nandakumar131] Thank you for the fix. I have committed this to the feature 
branch.
[~xyao] Thanks for the review.

> Ozone: SCM: BlockManager creates a new container for each allocateBlock call
> 
>
> Key: HDFS-12382
> URL: https://issues.apache.org/jira/browse/HDFS-12382
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>Assignee: Nandakumar
> Fix For: HDFS-7240
>
> Attachments: HDFS-12382-HDFS-7240.000.patch
>
>
> {{StorageContainerManager}}'s block protocol creates a new container for each 
> allocate block call instead of using existing open containers. This behavior 
> is not seen once the cluster is restarted.
> When {{createContainer}} flag is set, the container state is changed from 
> {{ALLOCATED}} to {{CREATING}}. But in {{refreshContainers}} call only 
> {{ALLOCATED}} state is handled.
> During restart {{loadAllocatedContainers}} loads the {{containers}} map 
> properly, which fixes the issue after restart. But we will face the same 
> issue later when we allocate new containers.
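
The state mismatch described above can be illustrated with a small model. This is only a sketch and not the committed fix: `ContainerStateSketch`, its `State` enum, and the `refresh` method are hypothetical names. It only shows why a container that has moved to `CREATING` is missed when `ALLOCATED` is the sole state handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical model of container lifecycle states; not the real SCM classes. */
class ContainerStateSketch {
  enum State { ALLOCATED, CREATING, OPEN }

  static final class Container {
    final String name;
    State state;

    Container(String name, State state) {
      this.name = name;
      this.state = state;
    }
  }

  /** Returns the names of containers whose state is in the handled set. */
  static List<String> refresh(List<Container> all, Set<State> handled) {
    List<String> usable = new ArrayList<>();
    for (Container c : all) {
      if (handled.contains(c.state)) {
        usable.add(c.name);
      }
    }
    return usable;
  }
}
```

With only `ALLOCATED` in the handled set, a container in `CREATING` is never returned, so the caller concludes that a new container is needed on every call.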






[jira] [Updated] (HDFS-12335) Federation Metrics

2017-08-31 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-12335:
-
Attachment: HDFS-12335-HDFS-10467.008.patch

Fixup checkstyle

> Federation Metrics
> --
>
> Key: HDFS-12335
> URL: https://issues.apache.org/jira/browse/HDFS-12335
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12335-HDFS-10467-000.patch, 
> HDFS-12335-HDFS-10467-001.patch, HDFS-12335-HDFS-10467-002.patch, 
> HDFS-12335-HDFS-10467-003.patch, HDFS-12335-HDFS-10467-004.patch, 
> HDFS-12335-HDFS-10467-005.patch, HDFS-12335-HDFS-10467.006.patch, 
> HDFS-12335-HDFS-10467.007.patch, HDFS-12335-HDFS-10467.008.patch
>
>
> Add metrics for the Router and the State Store.






[jira] [Commented] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses

2017-08-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149808#comment-16149808
 ] 

Íñigo Goiri commented on HDFS-12384:


[~busbey], [~andrew.wang], do you guys have a better solution?
It might be better to do it in a HADOOP JIRA.
Feel free to repurpose this JIRA.

> Fixing compilation issue with BanDuplicateClasses
> -
>
> Key: HDFS-12384
> URL: https://issues.apache.org/jira/browse/HDFS-12384
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12384-HDFS-10467-000.patch
>
>







[jira] [Updated] (HDFS-12383) Re-encryption updater should handle canceled tasks better

2017-08-31 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12383:
-
Attachment: HDFS-12383.02.patch

Thanks so much for the review Wei-Chiu! Attached patch 2 to address all 
comments.

> Re-encryption updater should handle canceled tasks better
> -
>
> Key: HDFS-12383
> URL: https://issues.apache.org/jira/browse/HDFS-12383
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0-beta1
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12383.01.patch, HDFS-12383.02.patch
>
>
> Saw an instance where the re-encryption updater exited due to an exception, 
> and later tasks no longer execute. Logs below:
> {noformat}
> 2017-08-31 09:54:08,104 INFO 
> org.apache.hadoop.hdfs.server.namenode.EncryptionZoneManager: Zone 
> /tmp/encryption-zone-3(16819) is submitted for re-encryption.
> 2017-08-31 09:54:08,104 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Executing 
> re-encrypt commands on zone 16819. Current zones:[zone:16787 state:Completed 
> lastProcessed:null filesReencrypted:1 fileReencryptionFailures:0][zone:16813 
> state:Completed lastProcessed:null filesReencrypted:1 
> fileReencryptionFailures:0][zone:16819 state:Submitted lastProcessed:null 
> filesReencrypted:0 fileReencryptionFailures:0]
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.protocol.ReencryptionStatus: Zone 16819 starts 
> re-encryption processing
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Re-encrypting 
> zone /tmp/encryption-zone-3(id=16819)
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Submitted batch 
> (start:/tmp/encryption-zone-3/data1, size:1) of zone 16819 to re-encrypt.
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Submission 
> completed of zone 16819 for re-encryption.
> 2017-08-31 09:54:08,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Processing 
> batched re-encryption for zone 16819, batch size 1, 
> start:/tmp/encryption-zone-3/data1
> 2017-08-31 09:54:08,979 INFO BlockStateChange: BLOCK* BlockManager: ask 
> 172.26.1.71:20002 to delete [blk_1073742291_1467]
> 2017-08-31 09:54:18,295 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionUpdater: Cancelling 1 
> re-encryption tasks
> 2017-08-31 09:54:18,295 INFO 
> org.apache.hadoop.hdfs.server.namenode.EncryptionZoneManager: Cancelled zone 
> /tmp/encryption-zone-3(16819) for re-encryption.
> 2017-08-31 09:54:18,295 INFO 
> org.apache.hadoop.hdfs.protocol.ReencryptionStatus: Zone 16819 completed 
> re-encryption.
> 2017-08-31 09:54:18,296 INFO 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionHandler: Completed 
> re-encrypting one batch of 1 edeks from KMS, time consumed: 10.19 s, start: 
> /tmp/encryption-zone-3/data1.
> 2017-08-31 09:54:18,296 ERROR 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionUpdater: Re-encryption 
> updater thread exiting.
> java.util.concurrent.CancellationException
> at java.util.concurrent.FutureTask.report(FutureTask.java:121)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionUpdater.takeAndProcessTasks(ReencryptionUpdater.java:404)
> at 
> org.apache.hadoop.hdfs.server.namenode.ReencryptionUpdater.run(ReencryptionUpdater.java:250)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Updater should be fixed to handle canceled tasks better.
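
The stack trace above comes from calling `get()` on a cancelled `Future`, which throws the unchecked `CancellationException`. A minimal sketch of one way to tolerate that is shown here. `CancelledTaskSketch` and `resultOrSkip` are hypothetical names, and this is not the patch itself.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class CancelledTaskSketch {
  /**
   * Returns the task result, or null if the task was cancelled.
   * A cancelled task is skipped instead of letting the exception
   * propagate and end the calling thread.
   */
  static String resultOrSkip(Future<String> completed) {
    if (completed.isCancelled()) {
      return null; // skip cancelled work
    }
    try {
      return completed.get();
    } catch (CancellationException e) {
      return null; // cancelled between the check and get()
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    }
  }
}
```

The explicit `isCancelled()` check keeps the common case free of exception handling. The extra catch covers a future that is cancelled after the check.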






[jira] [Commented] (HDFS-12300) Audit-log delegation token related operations

2017-08-31 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149796#comment-16149796
 ] 

Ravi Prakash commented on HDFS-12300:
-

Sounds good to me. Patch looks good to me. +1. 

> Audit-log delegation token related operations
> -
>
> Key: HDFS-12300
> URL: https://issues.apache.org/jira/browse/HDFS-12300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12300.01.patch, HDFS-12300.02.patch
>
>
> When inspecting the code, I found that the following methods in FSNamesystem 
> are not audit logged:
> - getDelegationToken
> - renewDelegationToken
> - cancelDelegationToken
> The audit log itself does have a logTokenTrackingId field to additionally log 
> some details when a token is used for authentication.
> After checking with the community by email, we should add audit logging for 
> these methods.
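
As a rough illustration of what audit-logging such operations could look like, here is a self-contained sketch. Everything in it is hypothetical: `AuditSketch`, the `audited` wrapper, and the `allowed=... cmd=...` line format are assumptions for illustration and do not reproduce the FSNamesystem audit logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical audit wrapper; the log line format is an assumption. */
class AuditSketch {
  final List<String> auditLog = new ArrayList<>();

  /** Runs the operation and records one audit entry whether it succeeds or throws. */
  <T> T audited(String cmd, Supplier<T> op) {
    boolean success = false;
    try {
      T result = op.get();
      success = true;
      return result;
    } finally {
      auditLog.add("allowed=" + success + " cmd=" + cmd);
    }
  }
}
```

Recording the entry in a `finally` block means a failed `renewDelegationToken` or `cancelDelegationToken` call leaves a trace as well.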






[jira] [Commented] (HDFS-12383) Re-encryption updater should handle canceled tasks better

2017-08-31 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149790#comment-16149790
 ] 

Wei-Chiu Chuang commented on HDFS-12383:


ReencryptionUpdater does not catch CancellationException, so the thread 
terminates upon a CANCEL reencryption command, and reencryption can't be 
resumed as a result.

The 002 patch looks mostly good.
{code}
if (completed.isCancelled()) {
  LOG.debug("Skipped canceled re-encryption task for zone {}, last: {}",
  task.zoneId, task.lastFile);
  return;
}
{code}
How come I missed it? Sorry about that.

{code}
private boolean isRunning = false;
{code}
should it be a volatile variable?
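
For context on the question, a minimal sketch of the visibility concern follows, assuming the flag is written by one thread and read by another. `VolatileFlagSketch` and `stopsWhenFlagCleared` are hypothetical names. Whether the real field needs `volatile` depends on how it is accessed in the patch.

```java
class VolatileFlagSketch {
  // volatile makes a write by one thread visible to later reads in other threads.
  private volatile boolean isRunning = false;

  void setRunning(boolean running) {
    isRunning = running;
  }

  boolean isRunning() {
    return isRunning;
  }

  /** Starts a worker that spins on the flag; returns true if it stops after the flag is cleared. */
  static boolean stopsWhenFlagCleared() {
    VolatileFlagSketch s = new VolatileFlagSketch();
    s.setRunning(true);
    Thread worker = new Thread(() -> {
      while (s.isRunning()) {
        Thread.yield();
      }
    });
    worker.start();
    s.setRunning(false);
    try {
      worker.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !worker.isAlive();
  }
}
```

Without `volatile` (or another form of synchronization), the Java memory model does not guarantee that the worker ever observes the cleared flag.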

{code}
try {
} finally {
  LOG.info("Re-encrypted callable running = {} ", callableRunning.get());
}
{code}
I feel this try block is not needed. But either way is good for me.

> Re-encryption updater should handle canceled tasks better
> -
>
> Key: HDFS-12383
> URL: https://issues.apache.org/jira/browse/HDFS-12383
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.0.0-beta1
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12383.01.patch
>
>






[jira] [Commented] (HDFS-12383) Re-encryption updater should handle canceled tasks better

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149788#comment-16149788
 ] 

Hadoop QA commented on HDFS-12383:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 27s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}146m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.hdfs.TestFileAppendRestart |
|   | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
|   | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestFileCreationEmpty |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
|
| Timed 

[jira] [Updated] (HDFS-12300) Audit-log delegation token related operations

2017-08-31 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12300:
-
Attachment: HDFS-12300.02.patch

> Audit-log delegation token related operations
> -
>
> Key: HDFS-12300
> URL: https://issues.apache.org/jira/browse/HDFS-12300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12300.01.patch, HDFS-12300.02.patch
>
>
> When inspecting the code, I found that the following methods in FSNamesystem 
> are not audit logged:
> - getDelegationToken
> - renewDelegationToken
> - cancelDelegationToken
> The audit log itself does have a logTokenTrackingId field to additionally log 
> some details when a token is used for authentication.
> After emailing the community, we should add that.






[jira] [Commented] (HDFS-12300) Audit-log delegation token related operations

2017-08-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149787#comment-16149787
 ] 

Xiao Chen commented on HDFS-12300:
--

Hey Ravi,

Thanks a lot for reviewing!

I was only deducing why FSN writes its own code instead of 
calling the general helper method, so I don't have numbers. Considering that 
reflection is usually resource-heavy, and we tend to optimize NN code within the 
namespace lock, it seems plausible.
The overhead I was referring to is specific to [these 
lines|https://github.com/apache/hadoop/blob/branch-3.0.0-alpha4/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java#L170-L174].
 IMHO it's up to each module to decide whether this is costly enough for 
optimization, likely from the result of stress tests.

bq. why we don't audit log when some exceptions are thrown and not others
I think HDFS-10776 (and its first comment) is the best answer available - we 
only log AccessControlExceptions, and don't care about others.
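
The convention described above (audit the success, audit an access-control failure, let every other exception pass unaudited) could be sketched roughly as follows. All names here are hypothetical: `AuditSketch`, `DeniedException` (an unchecked stand-in for Hadoop's `AccessControlException`), and the in-memory `AUDIT_LOG` list are for illustration only and are not the FSNamesystem code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the audit convention; not FSNamesystem code.
class AuditSketch {
  /** Unchecked stand-in for an access-control failure. */
  static class DeniedException extends RuntimeException {
    DeniedException(String msg) { super(msg); }
  }

  /** In-memory stand-in for the audit log. */
  static final List<String> AUDIT_LOG = new ArrayList<>();

  /** Runs op and audits it; only access-control failures are audited. */
  static void audited(String cmd, Runnable op) {
    try {
      op.run();
    } catch (DeniedException e) {
      AUDIT_LOG.add("allowed=false cmd=" + cmd);
      throw e;
    }
    // Any other exception propagates above without an audit entry.
    AUDIT_LOG.add("allowed=true cmd=" + cmd);
  }
}
```

A call such as `AuditSketch.audited("getDelegationToken", () -> {})` would record one `allowed=true` entry under these assumptions.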

Patch 2 to fix the checkstyle.

> Audit-log delegation token related operations
> -
>
> Key: HDFS-12300
> URL: https://issues.apache.org/jira/browse/HDFS-12300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12300.01.patch, HDFS-12300.02.patch
>
>
> When inspecting the code, I found that the following methods in FSNamesystem 
> are not audit logged:
> - getDelegationToken
> - renewDelegationToken
> - cancelDelegationToken
> The audit log itself does have a logTokenTrackingId field to additionally log 
> some details when a token is used for authentication.
> After emailing the community, we should add that.






[jira] [Commented] (HDFS-12335) Federation Metrics

2017-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149780#comment-16149780
 ] 

Hadoop QA commented on HDFS-12335:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-10467 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 15m  
2s{color} | {color:red} root in HDFS-10467 failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} HDFS-10467 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 4 new + 405 unchanged - 0 fixed = 409 total (was 405) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
59s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 57s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 99m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Dead store to namespacesInfo in 
org.apache.hadoop.hdfs.server.federation.metrics.NamenodeBeanMetrics.getNamespaceInfo(Function)
  At 
NamenodeBeanMetrics.java:org.apache.hadoop.hdfs.server.federation.metrics.NamenodeBeanMetrics.getNamespaceInfo(Function)
  At NamenodeBeanMetrics.java:[line 380] |
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.namenode.TestAuditLogs |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12335 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12884800/HDFS-12335-HDFS-10467.006.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux e68cc6de1503 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 
11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-10467 / fc2c254 |
| Default Java | 1.8.0_144 |
| mvninstall | 

[jira] [Commented] (HDFS-12043) Add counters for block re-replication

2017-08-31 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149762#comment-16149762
 ] 

Arpit Agarwal commented on HDFS-12043:
--

Thank you [~andrew.wang].

> Add counters for block re-replication
> -
>
> Key: HDFS-12043
> URL: https://issues.apache.org/jira/browse/HDFS-12043
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chen Liang
>Assignee: Chen Liang
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12043.001.patch, HDFS-12043.002.patch, 
> HDFS-12043.003.patch, HDFS-12043.004.patch, HDFS-12043-branch-2.005.patch
>
>
> We occasionally see that the under-replicated block count is not going down 
> quickly enough. We've made at least one fix to speed up block replications 
> (HDFS-9205) but we need better insight into the current state and activity of 
> the block re-replication logic. For example, we need to understand whether 
> it is because re-replication is not making forward progress at all, or 
> because new under-replicated blocks are being added faster.
> We should include additional metrics:
> # Cumulative number of blocks that were successfully replicated. 
> # Cumulative number of re-replications that timed out.
> # Cumulative number of blocks that were dequeued for re-replication but not 
> scheduled e.g. because they were invalid, or under-construction or 
> replication was postponed.
>  
> The growth rate of the above metrics will make it clear whether block 
> replication is making forward progress and if not then provide potential 
> clues about why it is stalled.
> Thanks [~arpitagarwal] for the offline discussions.
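
The three cumulative counters proposed above, and the growth rate derived from them, might look roughly like the sketch below. This is an assumption-laden illustration, not the patch itself: the class `ReplicationCounters`, its method names, and the `ratePerSecond` helper are all hypothetical.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of the proposed counters; not the HDFS-12043 patch.
class ReplicationCounters {
  private final LongAdder replicated = new LongAdder(); // successfully replicated
  private final LongAdder timedOut = new LongAdder();   // re-replications that timed out
  private final LongAdder skipped = new LongAdder();    // dequeued but not scheduled

  void onReplicated() { replicated.increment(); }
  void onTimedOut() { timedOut.increment(); }
  void onSkipped() { skipped.increment(); }

  long replicatedTotal() { return replicated.sum(); }
  long timedOutTotal() { return timedOut.sum(); }
  long skippedTotal() { return skipped.sum(); }

  /** Growth per second between two snapshots of one cumulative counter. */
  static double ratePerSecond(long earlier, long later, long elapsedMillis) {
    if (elapsedMillis <= 0) {
      throw new IllegalArgumentException("elapsedMillis must be positive");
    }
    return (later - earlier) * 1000.0 / elapsedMillis;
  }
}
```

Because the counters only ever grow, comparing two snapshots shows whether replication is advancing, timing out, or being skipped over a given interval.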






[jira] [Updated] (HDFS-12043) Add counters for block re-replication

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12043:
---
Fix Version/s: (was: 3.0.0-alpha4)
   3.0.0-beta1

> Add counters for block re-replication
> -
>
> Key: HDFS-12043
> URL: https://issues.apache.org/jira/browse/HDFS-12043
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chen Liang
>Assignee: Chen Liang
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12043.001.patch, HDFS-12043.002.patch, 
> HDFS-12043.003.patch, HDFS-12043.004.patch, HDFS-12043-branch-2.005.patch
>
>
> We occasionally see that the under-replicated block count is not going down 
> quickly enough. We've made at least one fix to speed up block replications 
> (HDFS-9205) but we need better insight into the current state and activity of 
> the block re-replication logic. For example, we need to understand whether 
> it is because re-replication is not making forward progress at all, or 
> because new under-replicated blocks are being added faster.
> We should include additional metrics:
> # Cumulative number of blocks that were successfully replicated. 
> # Cumulative number of re-replications that timed out.
> # Cumulative number of blocks that were dequeued for re-replication but not 
> scheduled e.g. because they were invalid, or under-construction or 
> replication was postponed.
>  
> The growth rate of the above metrics will make it clear whether block 
> replication is making forward progress and if not then provide potential 
> clues about why it is stalled.
> Thanks [~arpitagarwal] for the offline discussions.






[jira] [Commented] (HDFS-12043) Add counters for block re-replication

2017-08-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149759#comment-16149759
 ] 

Andrew Wang commented on HDFS-12043:


Bit of unfortunate history on this JIRA, looks like alpha4 was released between 
the initial bad commit and the subsequent revert and re-commit.

I'm going to change the fix version to 3.0.0-beta1, since that's where the 
correct version will be released.

> Add counters for block re-replication
> -
>
> Key: HDFS-12043
> URL: https://issues.apache.org/jira/browse/HDFS-12043
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chen Liang
>Assignee: Chen Liang
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12043.001.patch, HDFS-12043.002.patch, 
> HDFS-12043.003.patch, HDFS-12043.004.patch, HDFS-12043-branch-2.005.patch
>
>
> We occasionally see that the under-replicated block count is not going down 
> quickly enough. We've made at least one fix to speed up block replications 
> (HDFS-9205) but we need better insight into the current state and activity of 
> the block re-replication logic. For example, we need to understand whether 
> it is because re-replication is not making forward progress at all, or 
> because new under-replicated blocks are being added faster.
> We should include additional metrics:
> # Cumulative number of blocks that were successfully replicated. 
> # Cumulative number of re-replications that timed out.
> # Cumulative number of blocks that were dequeued for re-replication but not 
> scheduled e.g. because they were invalid, or under-construction or 
> replication was postponed.
>  
> The growth rate of the above metrics will make it clear whether block 
> replication is making forward progress and if not then provide potential 
> clues about why it is stalled.
> Thanks [~arpitagarwal] for the offline discussions.






[jira] [Updated] (HDFS-10480) Add an admin command to list currently open files

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-10480:
---
Fix Version/s: (was: 3.0.0-beta1)
   3.0.0-alpha4

> Add an admin command to list currently open files
> -
>
> Key: HDFS-10480
> URL: https://issues.apache.org/jira/browse/HDFS-10480
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Kihwal Lee
>Assignee: Manoj Govindassamy
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.3
>
> Attachments: HDFS-10480.02.patch, HDFS-10480.03.patch, 
> HDFS-10480.04.patch, HDFS-10480.05.patch, HDFS-10480.06.patch, 
> HDFS-10480.07.patch, HDFS-10480-branch-2.01.patch, 
> HDFS-10480-branch-2.8.01.patch, HDFS-10480-trunk-1.patch, 
> HDFS-10480-trunk.patch
>
>
> Currently there is no easy way to obtain the list of active leases or files 
> being written. It will be nice if we have an admin command to list open files 
> and their lease holders.






[jira] [Updated] (HDFS-11947) When constructing a thread name, BPOfferService may print a bogus warning message

2017-08-31 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-11947:
--
Fix Version/s: (was: 2.8.2)

> When constructing a thread name, BPOfferService may print a bogus warning 
> message 
> --
>
> Key: HDFS-11947
> URL: https://issues.apache.org/jira/browse/HDFS-11947
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Weiwei Yang
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11947.001.patch, HDFS-11947.002.patch, 
> HDFS-11947.003.patch
>
>
> HDFS-11558 tries to get Block pool ID for constructing thread names.  When 
> the service is not yet registered with NN, it prints the bogus warning "Block 
> pool ID needed, but service not yet registered with NN" with stack trace.






[jira] [Updated] (HDFS-12107) FsDatasetImpl#removeVolumes floods the logs when removing the volume

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12107:
---
Fix Version/s: (was: 3.0.0-alpha4)
   3.0.0-beta1

> FsDatasetImpl#removeVolumes floods the logs when removing the volume
> 
>
> Key: HDFS-12107
> URL: https://issues.apache.org/jira/browse/HDFS-12107
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Haohui Mai
>Assignee: Kelvin Chu
> Fix For: 3.0.0-beta1
>
> Attachments: HDFS-12107.001.patch
>
>
> FsDatasetImpl#removeVolumes() prints all block ids on a volume when removing 
> it, which floods the log of DN.
> {noformat}
> for (String bpid : volumeMap.getBlockPoolList()) {
>   List<ReplicaInfo> blocks = new ArrayList<>();
>   for (Iterator<ReplicaInfo> it =
>       volumeMap.replicas(bpid).iterator(); it.hasNext();) {
>     ReplicaInfo block = it.next();
>     final StorageLocation blockStorageLocation =
>         block.getVolume().getStorageLocation();
>     LOG.info("checking for block " + block.getBlockId() +
>         " with storageLocation " + blockStorageLocation);
>     if (blockStorageLocation.equals(sdLocation)) {
>       blocks.add(block);
>       it.remove();
>     }
>   }
> {noformat}
> The logging level should be {{DEBUG}} or {{TRACE}} instead of {{INFO}}.
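
The suggested change amounts to moving the per-block message below INFO and guarding it. The sketch below illustrates the idea with `java.util.logging` so that it is self-contained; that is an assumption for illustration (Hadoop uses its own logging facade, where the level would be DEBUG or TRACE rather than `FINE`), and `VolumeScanLogging` and `logBlocks` are hypothetical names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch using java.util.logging; FINE stands in for DEBUG.
class VolumeScanLogging {
  static final Logger LOG = Logger.getLogger("VolumeScanLogging");

  /** Logs each block at FINE only; returns how many per-block lines were emitted. */
  static int logBlocks(long[] blockIds, String storageLocation) {
    int emitted = 0;
    for (long id : blockIds) {
      // The guard avoids building the message string when FINE is disabled.
      if (LOG.isLoggable(Level.FINE)) {
        LOG.fine("checking for block " + id
            + " with storageLocation " + storageLocation);
        emitted++;
      }
    }
    // A single summary line at INFO replaces one line per block.
    LOG.info("scanned " + blockIds.length + " blocks on " + storageLocation);
    return emitted;
  }
}
```

With the logger at INFO, a volume with many blocks then produces one line instead of one per block.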






[jira] [Commented] (HDFS-11947) When constructing a thread name, BPOfferService may print a bogus warning message

2017-08-31 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149749#comment-16149749
 ] 

Junping Du commented on HDFS-11947:
---

It looks like we only committed this to trunk and branch-2. Dropping 2.8.2 from the fix versions.

> When constructing a thread name, BPOfferService may print a bogus warning 
> message 
> --
>
> Key: HDFS-11947
> URL: https://issues.apache.org/jira/browse/HDFS-11947
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Weiwei Yang
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11947.001.patch, HDFS-11947.002.patch, 
> HDFS-11947.003.patch
>
>
> HDFS-11558 tries to get Block pool ID for constructing thread names.  When 
> the service is not yet registered with NN, it prints the bogus warning "Block 
> pool ID needed, but service not yet registered with NN" with stack trace.






[jira] [Commented] (HDFS-10326) Disable setting tcp socket send/receive buffers for write pipelines

2017-08-31 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149744#comment-16149744
 ] 

Junping Du commented on HDFS-10326:
---

It looks like we only committed the patch to branch-2.8 and forgot to commit it to 
branch-2.8.2. Just committed it.
Also added 2.9 and 3.0 to the fix versions.

> Disable setting tcp socket send/receive buffers for write pipelines
> ---
>
> Key: HDFS-10326
> URL: https://issues.apache.org/jira/browse/HDFS-10326
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.2
>
> Attachments: HDFS-10326.000.patch, HDFS-10326.001.patch, 
> HDFS-10326.001.patch
>
>
> The DataStreamer and the Datanode use a hardcoded 
> DEFAULT_DATA_SOCKET_SIZE=128K for the send and receive buffers of a write 
> pipeline.  Explicitly setting tcp buffer sizes disables tcp stack 
> auto-tuning.  
> The hardcoded value will saturate a 1Gb link at 1ms RTT, but caps throughput 
> at 105Mb/s at 10ms and a paltry 11Mb/s over a 100ms long haul.  10Gb networks 
> are underutilized.
> There should either be a configuration to completely disable setting the 
> buffers, or the setReceiveBuffer and setSendBuffer calls should be removed 
> entirely.
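
The figures quoted above follow from the usual window limit: with a fixed buffer, at most one buffer's worth of data can be in flight per round trip, so throughput is bounded by buffer size divided by RTT. A small sketch of that arithmetic, assuming 128K means 128 * 1024 bytes (`WindowLimit` and `maxThroughputMbps` are hypothetical names):

```java
// Hypothetical helper illustrating the buffer/RTT throughput bound.
class WindowLimit {
  /** Upper bound on throughput in Mb/s for a fixed socket buffer and RTT. */
  static double maxThroughputMbps(long bufferBytes, double rttMillis) {
    if (rttMillis <= 0) {
      throw new IllegalArgumentException("rttMillis must be positive");
    }
    double bitsPerRoundTrip = bufferBytes * 8.0;
    // bits per round trip * round trips per second, scaled to megabits
    return bitsPerRoundTrip * 1000.0 / rttMillis / 1_000_000.0;
  }

  public static void main(String[] args) {
    long buffer = 128L * 1024; // assumed value of DEFAULT_DATA_SOCKET_SIZE
    for (double rtt : new double[] {1, 10, 100}) {
      System.out.printf("%.0f ms RTT: %.1f Mb/s%n", rtt, maxThroughputMbps(buffer, rtt));
    }
  }
}
```

Under that assumption the bound is about 1049 Mb/s at 1 ms, 105 Mb/s at 10 ms, and 10.5 Mb/s at 100 ms, which is in line with the numbers in the description.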





