[jira] [Commented] (HDFS-13051) Fix dead lock during async editlog rolling if edit queue is full

2018-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610132#comment-16610132
 ] 

Hudson commented on HDFS-13051:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14918 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14918/])
HDFS-13051. Fix dead lock during async editlog rolling if edit queue is (xiao: 
rev 8e54da1511e78477c1d4655d5ff0a69d0330869f)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogRace.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogAsync.java


> Fix dead lock during async editlog rolling if edit queue is full
> 
>
> Key: HDFS-13051
> URL: https://issues.apache.org/jira/browse/HDFS-13051
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.5
>Reporter: zhangwei
>Assignee: Daryn Sharp
>Priority: Major
>  Labels: AsyncEditlog, deadlock
> Attachments: HDFS-13112.patch, deadlock.patch
>
>
> When doing rollEditLog, the handler acquires the FS write lock, then the 
> FSEditLogAsync object lock, and writes 3 edits (the second one overrides the 
> logEdit method and returns true).
> In an extreme case, when FSEditLogAsync's logSync is very slow and 
> editPendingQ (default size 4096) is full, the IPC thread cannot offer the 
> edit object into editPendingQ while doing rollEditLog; it blocks on the 
> editPendingQ.put method. However, it does not release the FSEditLogAsync 
> object lock, so the edit.logEdit method in the FSEditLogAsync.run thread can 
> never acquire the FSEditLogAsync object lock, which causes a dead lock.
> Stack trace below:
> "Thread[Thread-44528,5,main]" #130093 daemon prio=5 os_prio=0 
> tid=0x02377000 nid=0x13fda waiting on condition [0x7fb3297de000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0x7fbd3cb96f58> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.enqueueEdit(FSEditLogAsync.java:156)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.logEdit(FSEditLogAsync.java:118)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.logCancelDelegationToken(FSEditLog.java:1008)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logExpireDelegationToken(FSNamesystem.java:7635)
>  at 
> org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logExpireToken(DelegationTokenSecretManager.java:395)
>  - locked <0x7fbd3cbae500> (a java.lang.Object)
>  at 
> org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logExpireToken(DelegationTokenSecretManager.java:62)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.removeExpiredToken(AbstractDelegationTokenSecretManager.java:604)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.access$400(AbstractDelegationTokenSecretManager.java:54)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:656)
>  at java.lang.Thread.run(Thread.java:745)
> "FSEditLogAsync" #130072 daemon prio=5 os_prio=0 tid=0x0715b800 
> nid=0x13fbf waiting for monitor entry [0x7fb32c51a000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.doEditTransaction(FSEditLog.java:443)
>  - waiting to lock <*0x7fbcbc131000*> (a 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$Edit.logEdit(FSEditLogAsync.java:233)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:177)
>  at java.lang.Thread.run(Thread.java:745)
> "IPC Server handler 47 on 53310" #337 daemon prio=5 os_prio=0 
> tid=0x7fe659d46000 nid=0x4c62 waiting on condition [0x7fb32fe52000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0x7fbd3cb96f58> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at 
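
The deadlock described above is a producer/consumer lock inversion: a producer 
blocks on a bounded queue's put() while holding the monitor that the consumer 
must re-acquire before it can drain the queue. A minimal, self-contained Java 
sketch of that pattern (class and method names are illustrative, not the actual 
FSEditLogAsync code):

{noformat}
import java.util.concurrent.ArrayBlockingQueue;

public class EditQueueDeadlockSketch {
  private final ArrayBlockingQueue<String> editPendingQ =
      new ArrayBlockingQueue<>(4096);

  // Producer (analogous to rollEditLog on an IPC handler): holds the object
  // monitor while writing several edits. Once editPendingQ is full, put()
  // parks here with the monitor still held.
  public synchronized void rollEditLog() throws InterruptedException {
    for (int i = 0; i < 3; i++) {
      editPendingQ.put("segment-edit-" + i);
    }
  }

  // Consumer (analogous to FSEditLogAsync.run): takes one edit, then must
  // enter the same monitor to complete the transaction -- impossible while
  // the blocked producer still holds it.
  public void consume() throws InterruptedException {
    while (true) {
      String edit = editPendingQ.take();
      synchronized (this) {
        Thread.sleep(100);  // stand-in for a slow doEditTransaction/logSync
      }
    }
  }

  public static void main(String[] args) throws Exception {
    EditQueueDeadlockSketch log = new EditQueueDeadlockSketch();
    for (int i = 0; i < 4096; i++) {
      log.editPendingQ.put("edit-" + i);  // queue already full: slow logSync
    }
    new Thread(() -> {
      try {
        log.consume();
      } catch (InterruptedException ignored) {
      }
    }).start();
    log.rollEditLog();  // typically hangs: put() blocks holding the monitor
  }
}
{noformat}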

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610125#comment-16610125
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 12 new + 16 unchanged - 0 fixed = 28 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}109m 44s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}171m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs |
|   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.datanode.TestDeleteBlockPool |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12937996/HDFS-13768.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 69ec7d898a3c 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| 

[jira] [Commented] (HDFS-13051) Fix dead lock during async editlog rolling if edit queue is full

2018-09-10 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610109#comment-16610109
 ] 

Xiao Chen commented on HDFS-13051:
--

+1, committing this

> Fix dead lock during async editlog rolling if edit queue is full
> 
>
> Key: HDFS-13051
> URL: https://issues.apache.org/jira/browse/HDFS-13051
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.5
>Reporter: zhangwei
>Assignee: Daryn Sharp
>Priority: Major
>  Labels: AsyncEditlog, deadlock
> Attachments: HDFS-13112.patch, deadlock.patch
>
>
> When doing rollEditLog, the handler acquires the FS write lock, then the 
> FSEditLogAsync object lock, and writes 3 edits (the second one overrides the 
> logEdit method and returns true).
> In an extreme case, when FSEditLogAsync's logSync is very slow and 
> editPendingQ (default size 4096) is full, the IPC thread cannot offer the 
> edit object into editPendingQ while doing rollEditLog; it blocks on the 
> editPendingQ.put method. However, it does not release the FSEditLogAsync 
> object lock, so the edit.logEdit method in the FSEditLogAsync.run thread can 
> never acquire the FSEditLogAsync object lock, which causes a dead lock.
> Stack trace below:
> "Thread[Thread-44528,5,main]" #130093 daemon prio=5 os_prio=0 
> tid=0x02377000 nid=0x13fda waiting on condition [0x7fb3297de000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0x7fbd3cb96f58> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.enqueueEdit(FSEditLogAsync.java:156)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.logEdit(FSEditLogAsync.java:118)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.logCancelDelegationToken(FSEditLog.java:1008)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logExpireDelegationToken(FSNamesystem.java:7635)
>  at 
> org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logExpireToken(DelegationTokenSecretManager.java:395)
>  - locked <0x7fbd3cbae500> (a java.lang.Object)
>  at 
> org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logExpireToken(DelegationTokenSecretManager.java:62)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.removeExpiredToken(AbstractDelegationTokenSecretManager.java:604)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.access$400(AbstractDelegationTokenSecretManager.java:54)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:656)
>  at java.lang.Thread.run(Thread.java:745)
> "FSEditLogAsync" #130072 daemon prio=5 os_prio=0 tid=0x0715b800 
> nid=0x13fbf waiting for monitor entry [0x7fb32c51a000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.doEditTransaction(FSEditLog.java:443)
>  - waiting to lock <*0x7fbcbc131000*> (a 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$Edit.logEdit(FSEditLogAsync.java:233)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:177)
>  at java.lang.Thread.run(Thread.java:745)
> "IPC Server handler 47 on 53310" #337 daemon prio=5 os_prio=0 
> tid=0x7fe659d46000 nid=0x4c62 waiting on condition [0x7fb32fe52000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0x7fbd3cb96f58> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.enqueueEdit(FSEditLogAsync.java:156)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.logEdit(FSEditLogAsync.java:118)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1251)
>  - locked <*0x7fbcbc131000*> (a 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync)
>  at 
> 

[jira] [Updated] (HDFS-13051) Fix dead lock during async editlog rolling if edit queue is full

2018-09-10 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-13051:
-
Summary: Fix dead lock during async editlog rolling if edit queue is full  
(was: dead lock occurs when rolleditlog rpc call happen and editPendingQ is 
full)

> Fix dead lock during async editlog rolling if edit queue is full
> 
>
> Key: HDFS-13051
> URL: https://issues.apache.org/jira/browse/HDFS-13051
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.5
>Reporter: zhangwei
>Assignee: Daryn Sharp
>Priority: Major
>  Labels: AsyncEditlog, deadlock
> Attachments: HDFS-13112.patch, deadlock.patch
>
>
> When doing rollEditLog, the handler acquires the FS write lock, then the 
> FSEditLogAsync object lock, and writes 3 edits (the second one overrides the 
> logEdit method and returns true).
> In an extreme case, when FSEditLogAsync's logSync is very slow and 
> editPendingQ (default size 4096) is full, the IPC thread cannot offer the 
> edit object into editPendingQ while doing rollEditLog; it blocks on the 
> editPendingQ.put method. However, it does not release the FSEditLogAsync 
> object lock, so the edit.logEdit method in the FSEditLogAsync.run thread can 
> never acquire the FSEditLogAsync object lock, which causes a dead lock.
> Stack trace below:
> "Thread[Thread-44528,5,main]" #130093 daemon prio=5 os_prio=0 
> tid=0x02377000 nid=0x13fda waiting on condition [0x7fb3297de000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0x7fbd3cb96f58> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.enqueueEdit(FSEditLogAsync.java:156)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.logEdit(FSEditLogAsync.java:118)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.logCancelDelegationToken(FSEditLog.java:1008)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logExpireDelegationToken(FSNamesystem.java:7635)
>  at 
> org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logExpireToken(DelegationTokenSecretManager.java:395)
>  - locked <0x7fbd3cbae500> (a java.lang.Object)
>  at 
> org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logExpireToken(DelegationTokenSecretManager.java:62)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.removeExpiredToken(AbstractDelegationTokenSecretManager.java:604)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.access$400(AbstractDelegationTokenSecretManager.java:54)
>  at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:656)
>  at java.lang.Thread.run(Thread.java:745)
> "FSEditLogAsync" #130072 daemon prio=5 os_prio=0 tid=0x0715b800 
> nid=0x13fbf waiting for monitor entry [0x7fb32c51a000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.doEditTransaction(FSEditLog.java:443)
>  - waiting to lock <*0x7fbcbc131000*> (a 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$Edit.logEdit(FSEditLogAsync.java:233)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:177)
>  at java.lang.Thread.run(Thread.java:745)
> "IPC Server handler 47 on 53310" #337 daemon prio=5 os_prio=0 
> tid=0x7fe659d46000 nid=0x4c62 waiting on condition [0x7fb32fe52000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0x7fbd3cb96f58> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.enqueueEdit(FSEditLogAsync.java:156)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.logEdit(FSEditLogAsync.java:118)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1251)
>  - locked <*0x7fbcbc131000*> (a 

[jira] [Commented] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610107#comment-16610107
 ] 

Hudson commented on HDFS-13237:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14917 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14917/])
HDFS-13237. [Documentation] RBF: Mount points across multiple (brahma: rev 
96892c469b16c5aaff1b7c42f66f820344256bc2)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md


> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch, 
> HDFS-13237.005.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-12782) After Unset the EC policy for a directory, Still inside the directory files having the EC Policy

2018-09-10 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HDFS-12782.
-
Resolution: Not A Problem

> After Unset the EC policy for a directory, Still inside the directory files 
> having the EC Policy
> 
>
> Key: HDFS-12782
> URL: https://issues.apache.org/jira/browse/HDFS-12782
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>Priority: Major
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Scenario:
> Set the EC policy for a Dir.
> Write a file and check the EC policy for that file.
> Unset the EC policy for the above Dir.
> Check the policy for the file.
> Actual Output:
> ==
> The file still has the EC policy.
> Expected Output:
> 
> All files inside the Dir should release the EC policy when we unset it on the 
> top-level Dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12782) After Unset the EC policy for a directory, Still inside the directory files having the EC Policy

2018-09-10 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610098#comment-16610098
 ] 

Ayush Saxena commented on HDFS-12782:
-

Thanx [~Harsha1206] for reporting.
I feel the present doc has addressed the issue. There is even a warning on the 
console when you unset the EC policy on a non-empty directory: it won't affect 
the existing files and applies only to new files added to the directory 
afterwards.
Resolving the issue.
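
For anyone wanting to reproduce this, a hedged sketch of the scenario against 
the DistributedFileSystem API (the paths and the RS-6-3-1024k policy name are 
illustrative; this assumes a running 3.x cluster with that policy enabled):

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class EcUnsetDemo {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());

    Path dir = new Path("/ecdir");
    Path file = new Path("/ecdir/file1");

    dfs.mkdirs(dir);
    dfs.setErasureCodingPolicy(dir, "RS-6-3-1024k");
    dfs.create(file).close();  // the file inherits the directory's policy

    dfs.unsetErasureCodingPolicy(dir);

    // The existing file keeps the policy it was created with; only files
    // created after the unset are written without it.
    System.out.println(dfs.getErasureCodingPolicy(file));  // still RS-6-3-1024k
  }
}
{noformat}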

> After Unset the EC policy for a directory, Still inside the directory files 
> having the EC Policy
> 
>
> Key: HDFS-12782
> URL: https://issues.apache.org/jira/browse/HDFS-12782
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>Priority: Major
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Scenario:
> Set the EC policy for a Dir.
> Write a file and check the EC policy for that file.
> Unset the EC policy for the above Dir.
> Check the policy for the file.
> Actual Output:
> ==
> The file still has the EC policy.
> Expected Output:
> 
> All files inside the Dir should release the EC policy when we unset it on the 
> top-level Dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Brahma Reddy Battula (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610099#comment-16610099
 ] 

Brahma Reddy Battula commented on HDFS-13237:
-

Committed to trunk and branch-3.1. [~elgoiri], thanks for the contribution, and 
thanks to the others for the additional reviews.

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch, 
> HDFS-13237.005.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Brahma Reddy Battula (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13237:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.2
   3.2.0
   Status: Resolved  (was: Patch Available)

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch, 
> HDFS-13237.005.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Brahma Reddy Battula (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610097#comment-16610097
 ] 

Brahma Reddy Battula commented on HDFS-13237:
-

+1, Committing shortly.

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch, 
> HDFS-13237.005.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Brahma Reddy Battula (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13237:

Parent Issue: HDFS-12615  (was: HDFS-13891)

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch, 
> HDFS-13237.005.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610057#comment-16610057
 ] 

Yiqun Lin commented on HDFS-13768:
--

Thanks for working on this, [~RANith] and [~surendrasingh].
I will find a chance to review it soon, :).

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling-upgrading our cluster. When we 
> restart DNs, they start slowly and do not register to the NN immediately, 
> which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the DN startup logic, it performs the initial block pool 
> operation before registration, and during block pool initialization we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for each 
> volume, but we still have to wait for the slowest thread to finish its work, 
> so the real problem here is how to make these threads run faster.
> The jstack we got while the DN was blocked adding replicas:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement: maybe we can use a ForkJoinPool to do this recursive task 
> rather than doing it synchronously. This would be a great improvement because 
> it can greatly speed up the recovery process.
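
For reference, a hedged sketch of that ForkJoinPool idea: walk a volume's 
directory tree as a RecursiveAction so sibling subtrees are scanned in 
parallel, instead of one thread per volume. The names (ReplicaScan, 
addReplicaToMap) are illustrative, not the actual BlockPoolSlice code:

{noformat}
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ReplicaScan extends RecursiveAction {
  private final File dir;

  ReplicaScan(File dir) {
    this.dir = dir;
  }

  @Override
  protected void compute() {
    File[] children = dir.listFiles();
    if (children == null) {
      return;
    }
    List<ReplicaScan> subTasks = new ArrayList<>();
    for (File f : children) {
      if (f.isDirectory()) {
        subTasks.add(new ReplicaScan(f));  // scan subtrees in parallel
      } else {
        addReplicaToMap(f);                // record the block/meta file
      }
    }
    invokeAll(subTasks);                   // fork all subtasks and wait
  }

  private void addReplicaToMap(File blockFile) {
    // placeholder for adding the replica to the volume map
  }

  public static void main(String[] args) {
    // Blocks until the whole tree under the volume's current/ dir is scanned.
    new ForkJoinPool().invoke(
        new ReplicaScan(new File("/home/hard_disk/1/dfs/dn/current")));
  }
}
{noformat}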



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Comment Edited] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610055#comment-16610055
 ] 

Surendra Singh Lilhore edited comment on HDFS-13768 at 9/11/18 3:09 AM:


Added an initial patch; I will add a test case in the next patch.

*+_Initial test report_+*
 # Before fix: restarted the datanode with 101260 blocks and it took *16203ms*
 # After fix: restarted the datanode with 101260 blocks and it took *9693ms*


was (Author: surendrasingh):
Added an initial patch; I will add a test case in the next patch.

*+_Initial test report_+*
 # Before fix: restarted the datanode with 101260 block and it took *16203ms*
 # After fix: restarted the datanode with 101260 block and it took *9693ms*

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling-upgrading our cluster. When we 
> restart DNs, they start slowly and do not register to the NN immediately, 
> which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the DN startup logic, it performs the initial block pool 
> operation before registration, and during block pool initialization we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for each 
> volume, but we still have to wait for the slowest thread to finish its work, 
> so the real problem here is how to make these threads run faster.
> The jstack we got while the DN was blocked adding replicas:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610055#comment-16610055
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Added an initial patch; I will add a test case in the next patch.

*+_Initial test report_+*
 # Before fix: restarted the datanode with 101260 block and it took *16203ms*
 # After fix: restarted the datanode with 101260 block and it took *9693ms*

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling-upgrading our cluster. When we 
> restart DNs, they start slowly and do not register to the NN immediately, 
> which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the DN startup logic, it performs the initial block pool 
> operation before registration, and during block pool initialization we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for each 
> volume, but we still have to wait for the slowest thread to finish its work, 
> so the real problem here is how to make these threads run faster.
> The jstack we got while the DN was blocked adding replicas:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement: maybe we can use a ForkJoinPool to do this recursive task 
> rather than doing it synchronously. This would be a great improvement because 
> it can greatly speed up the recovery process.



--
This message 

[jira] [Updated] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-13768:
--
Attachment: HDFS-13768.01.patch

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling-upgrading our cluster. When we 
> restart DNs, they start slowly and do not register to the NN immediately, 
> which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the DN startup logic, it performs the initial block pool 
> operation before registration, and during block pool initialization we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for each 
> volume, but we still have to wait for the slowest thread to finish its work, 
> so the real problem here is how to make these threads run faster.
> The jstack we got while the DN was blocked adding replicas:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement: maybe we can use a ForkJoinPool to do this recursive task 
> rather than doing it synchronously. This would be a great improvement because 
> it can greatly speed up the recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Updated] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-13768:
--
Attachment: (was: HDFS-13768.patch)

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling-upgrading our cluster. When we 
> restart DNs, they start slowly and do not register to the NN immediately, 
> which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the DN startup logic, it performs the initial block pool 
> operation before registration, and during block pool initialization we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for each 
> volume, but we still have to wait for the slowest thread to finish its work, 
> so the real problem here is how to make these threads run faster.
> The jstack we got while the DN was blocked adding replicas:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement: maybe we can use a ForkJoinPool to do this recursive task 
> rather than doing it synchronously. This would be a great improvement because 
> it can greatly speed up the recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-13768:
--
Attachment: HDFS-13768.patch

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling-upgrading our cluster. When we 
> restart DNs, they start slowly and do not register to the NN immediately, 
> which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the DN startup logic, it performs the initial block pool 
> operation before registration, and during block pool initialization we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for each 
> volume, but we still have to wait for the slowest thread to finish its work, 
> so the real problem here is how to make these threads run faster.
> The jstack we got while the DN was blocked adding replicas:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement: maybe we can use a ForkJoinPool to do this recursive task 
> rather than doing it synchronously. This would be a great improvement because 
> it can greatly speed up the recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610040#comment-16610040
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Discussed with [~RANith] offline; assigning the jira to myself. I will 
upload an updated patch.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling-upgrading our cluster. When 
> we restart DNs, they start very slowly and do not register to the NN 
> immediately, which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the DN startup logic, it performs the initial block pool 
> operation before registration. During block pool initialization, we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for 
> each volume, but startup still has to wait for the slowest thread to finish 
> its work. So the main problem here is how to make each thread run faster.
> The jstack we captured while the DN was blocked adding replicas:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement may be to use a ForkJoinPool for this recursive task, 
> rather than running it synchronously. This would be a great improvement 
> because it can greatly speed up the recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org

[jira] [Assigned] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore reassigned HDFS-13768:
-

Assignee: Surendra Singh Lilhore  (was: Ranith Sardar)

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling-upgrading our cluster. When 
> we restart DNs, they start very slowly and do not register to the NN 
> immediately, which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the DN startup logic, it performs the initial block pool 
> operation before registration. During block pool initialization, we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for 
> each volume, but startup still has to wait for the slowest thread to finish 
> its work. So the main problem here is how to make each thread run faster.
> The jstack we captured while the DN was blocked adding replicas:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement may be to use a ForkJoinPool for this recursive task, 
> rather than running it synchronously. This would be a great improvement 
> because it can greatly speed up the recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Updated] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-427:

Status: Open  (was: Patch Available)

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDDS-427.00.patch
>
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed 
> to decide whether it is a violation of some condition (when there are no 
> replicas, warranting an exception) instead of that being implicit for every 
> caller. Instead, ContainerStateMap#getContainerReplicas should return an 
> empty map when there are no replicas. Callers can decide whether no replicas 
> for a given container is a violation of a business condition. Throwing an 
> exception in this API also results in a subtle bug in ContainerReportHandler, 
> where an exception for one container may result in the whole container list 
> being skipped.
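>
> A minimal sketch of the proposed contract (the classes below are 
> illustrative stand-ins, not the real ContainerStateMap code, and the 
> string-keyed replica map is a deliberate simplification):
> {code:java}
> import java.util.Collections;
> import java.util.HashMap;
> import java.util.Map;
> 
> class ContainerStateMapSketch {
>   // containerId -> (datanodeId -> replica state); both keys simplified.
>   private final Map<Long, Map<String, String>> replicas = new HashMap<>();
> 
>   /** Returns an empty map, instead of throwing, when nothing is known. */
>   Map<String, String> getContainerReplicas(long containerId) {
>     return replicas.getOrDefault(containerId, Collections.emptyMap());
>   }
> }
> 
> class ContainerReportHandlerSketch {
>   void onReport(ContainerStateMapSketch map, long[] containerIds) {
>     for (long id : containerIds) {
>       Map<String, String> found = map.getContainerReplicas(id);
>       if (found.isEmpty()) {
>         // The caller decides: during chill mode this is normal, so log
>         // and move on; the remaining containers are still processed.
>         System.out.println("no replicas reported yet for container " + id);
>         continue;
>       }
>       // ... replication handling for containers with known replicas ...
>     }
>   }
> }
> {code}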



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-320) Failed to start container with apache/hadoop-runner image.

2018-09-10 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610033#comment-16610033
 ] 

Junjie Chen commented on HDDS-320:
--

I tried on another CentOS 7 machine. The issue still exists.

Can someone who can successfully run docker-compose up -d share their docker 
and docker-compose versions? Mine are:

docker-compose version 1.22.0, build f46880fe
Docker version 18.09.0-ce-beta1, build 78a6bdb

> Failed to start container with apache/hadoop-runner image.
> --
>
> Key: HDDS-320
> URL: https://issues.apache.org/jira/browse/HDDS-320
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: document
> Environment: centos 7.4
>Reporter: Junjie Chen
>Priority: Minor
>
> Following the doc in hadoop-ozone/doc/content/GettingStarted.md, the 
> docker-compose up -d step fails; the errors are listed below:
> [root@VM_16_5_centos ozone]# docker-compose logs
> Attaching to ozone_scm_1, ozone_datanode_1, ozone_ozoneManager_1
> datanode_1  | Traceback (most recent call last):
> datanode_1  |   File "/opt/envtoconf.py", line 104, in <module>
> datanode_1  | Simple(sys.argv[1:]).main()
> datanode_1  |   File "/opt/envtoconf.py", line 93, in main
> datanode_1  | self.process_envs()
> datanode_1  |   File "/opt/envtoconf.py", line 67, in process_envs
> datanode_1  | with open(self.destination_file_path(name, extension) + 
> ".raw", "w") as myfile:
> datanode_1  | IOError: [Errno 13] Permission denied: 
> '/opt/hadoop/etc/hadoop/log4j.properties.raw'
> datanode_1  | Traceback (most recent call last):
> datanode_1  |   File "/opt/envtoconf.py", line 104, in <module>
> datanode_1  | Simple(sys.argv[1:]).main()
> datanode_1  |   File "/opt/envtoconf.py", line 93, in main
> datanode_1  | self.process_envs()
> datanode_1  |   File "/opt/envtoconf.py", line 67, in process_envs
> datanode_1  | with open(self.destination_file_path(name, extension) + 
> ".raw", "w") as myfile:
> ozoneManager_1  | with open(self.destination_file_path(name, extension) + 
> ".raw", "w") as myfile:
> ozoneManager_1  | IOError: [Errno 13] Permission denied: 
> '/opt/hadoop/etc/hadoop/log4j.properties.raw'
> ozoneManager_1  | Traceback (most recent call last):
> ozoneManager_1  |   File "/opt/envtoconf.py", line 104, in <module>
> ozoneManager_1  | Simple(sys.argv[1:]).main()
> ozoneManager_1  |   File "/opt/envtoconf.py", line 93, in main
> ozoneManager_1  | self.process_envs()
> ozoneManager_1  |   File "/opt/envtoconf.py", line 67, in process_envs
> ozoneManager_1  | with open(self.destination_file_path(name, extension) + 
> ".raw", "w") as myfile:
> ozoneManager_1  | IOError: [Errno 13] Permission denied: 
> '/opt/hadoop/etc/hadoop/log4j.properties.raw'
> scm_1   | Traceback (most recent call last):
> scm_1   |   File "/opt/envtoconf.py", line 104, in <module>
> scm_1   | Simple(sys.argv[1:]).main()
> scm_1   |   File "/opt/envtoconf.py", line 93, in main
> scm_1   | self.process_envs()
> scm_1   |   File "/opt/envtoconf.py", line 67, in process_envs
> scm_1   | with open(self.destination_file_path(name, extension) + 
> ".raw", "w") as myfile:
> scm_1   | IOError: [Errno 13] Permission denied: 
> '/opt/hadoop/etc/hadoop/log4j.properties.raw'
> scm_1   | Traceback (most recent call last):
> scm_1   |   File "/opt/envtoconf.py", line 104, in <module>
> scm_1   | Simple(sys.argv[1:]).main()
> scm_1   |   File "/opt/envtoconf.py", line 93, in main
> scm_1   | self.process_envs()
> scm_1   |   File "/opt/envtoconf.py", line 67, in process_envs
> scm_1   | with open(self.destination_file_path(name, extension) + 
> ".raw", "w") as myfile:
> scm_1   | IOError: [Errno 13] Permission denied: 
> '/opt/hadoop/etc/hadoop/log4j.properties.raw'
> scm_1   | Traceback (most recent call last):

[jira] [Commented] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610032#comment-16610032
 ] 

Hadoop QA commented on HDDS-427:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
13s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 47s{color} | {color:orange} root: The patch generated 2 new + 0 unchanged - 
0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
35s{color} | {color:green} server-scm in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 27s{color} 
| {color:red} integration-test in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m 
58s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}108m  3s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.ozShell.TestOzoneShell |
|   | hadoop.ozone.container.common.TestBlockDeletingService |
|   | 
hadoop.ozone.container.common.statemachine.commandhandler.TestCloseContainerByPipeline
 |
|   | hadoop.ozone.freon.TestDataValidate |
|   | hadoop.ozone.container.ozoneimpl.TestOzoneContainer |
|   | hadoop.ozone.TestMiniOzoneCluster |
|   | hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient |
|   | 

[jira] [Commented] (HDFS-13902) Add JMX, conf and stacks menus to the datanode page

2018-09-10 Thread fengchuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609994#comment-16609994
 ] 

fengchuang commented on HDFS-13902:
---

[~elgoiri] [~brahmareddy] Thanks for the review. I will start another jira 
ticket for the journalnode page soon.

>  Add JMX, conf and stacks menus to the datanode page
> 
>
> Key: HDFS-13902
> URL: https://issues.apache.org/jira/browse/HDFS-13902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.3
>Reporter: fengchuang
>Assignee: fengchuang
>Priority: Minor
> Attachments: HDFS-13902.001.patch
>
>
> Add JMX, conf and stacks menus to the datanode page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13868) WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but "oldsnapshotname" is not.

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609993#comment-16609993
 ] 

Hadoop QA commented on HDFS-13868:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
31s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 28s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}170m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13868 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939178/HDFS-13868.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux be3cad39d10c 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 987d819 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | 

[jira] [Commented] (HDFS-13778) In TestStateAlignmentContextWithHA replace artificial AlignmentContextProxyProvider with real ObserverReadProxyProvider.

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609971#comment-16609971
 ] 

Hadoop QA commented on HDFS-13778:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-12943 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
50s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} HDFS-12943 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 46s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 0 unchanged - 5 fixed = 1 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 21s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}137m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:9b55946 |
| JIRA Issue | HDFS-13778 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939177/HDFS-13778-HDFS-12943.003.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4eedecc6fd2e 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-12943 / 039c158 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25028/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25028/artifact/out/whitespace-eol.txt
 |
| unit | 

[jira] [Commented] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609953#comment-16609953
 ] 

Anu Engineer commented on HDDS-427:
---

Just so that we are on the same page: I made the very same comments on 
HDDS-351, so I am a little surprised that this patch is coming back.

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDDS-427.00.patch
>
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed 
> to decide whether it is a violation of some condition (when there are no 
> replicas, warranting an exception) instead of that being implicit for every 
> caller. Instead, ContainerStateMap#getContainerReplicas should return an 
> empty map when there are no replicas. Callers can decide whether no replicas 
> for a given container is a violation of a business condition. Throwing an 
> exception in this API also results in a subtle bug in ContainerReportHandler, 
> where an exception for one container may result in the whole container list 
> being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609951#comment-16609951
 ] 

Anu Engineer commented on HDDS-427:
---

{quote}Not all of them may view a container with no replicas as something 
which warrants an exception from the API itself.
{quote}
Ozone/HDDS assumes that a container has the requisite number of replicas. 
Not having the replicas is an error state, so the normal path should be to 
throw an error.
{quote}In some of those cases it is a perfectly normal situation.
{quote}
Yes, in most cases this is the normal path.
{quote}For example, during SCM chill mode a container might not have any 
replicas reported initially.
{quote}
Chill mode is a special case; it is the time when the cluster is booting up 
and we may not yet have container reports. So shouldn't the Chill Mode 
manager handle this special case, as appropriate?
{quote}Similarly, during container report handling, if no replicas are found 
for a given container, ContainerReportHandler should log the message and 
move on, as ReplicationManager can't do anything in that specific case.
{quote}
Absolutely not. If the replicas are missing, it is up to the replica manager 
to deal with that issue. The appropriate course of action, whether that is 
creating alerting events, logging, or something else, should be taken by the 
replica manager. This is a case where the container report handler should 
catch the error and pass it across to the Replica Manager, probably with 
logging. But deciding that the information will not go to the ReplicaManager 
at all is a no-go. Returning null to flag an error state can lead to users 
of the API ignoring this error. If we throw an exception, we are flagging an 
error which the caller cannot ignore; if the user then decides that it is an 
acceptable error state (for example, Chill Mode), the user can catch the 
error and ignore it.

For the above reasons, I am -1 on this patch.
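
A sketch of the exception-based contract argued for above (the names are 
illustrative only, not the actual SCM code):
{code:java}
import java.util.Map;

class MissingReplicaException extends Exception {
  MissingReplicaException(long containerId) {
    super("No replicas known for container " + containerId);
  }
}

class ReplicaLookupSketch {
  private final Map<Long, Map<String, String>> replicas;

  ReplicaLookupSketch(Map<Long, Map<String, String>> replicas) {
    this.replicas = replicas;
  }

  /** The error state is explicit: a caller cannot silently ignore it. */
  Map<String, String> getContainerReplicas(long id)
      throws MissingReplicaException {
    Map<String, String> found = replicas.get(id);
    if (found == null) {
      throw new MissingReplicaException(id);
    }
    return found;
  }

  void duringChillMode(long id) {
    try {
      getContainerReplicas(id);
    } catch (MissingReplicaException e) {
      // Chill mode is a known-benign case: the caller consciously
      // catches and ignores the error instead of never seeing it.
    }
  }
}
{code}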

 

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDDS-427.00.patch
>
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed 
> to decide whether it is a violation of some condition (when there are no 
> replicas, warranting an exception) instead of that being implicit for every 
> caller. Instead, ContainerStateMap#getContainerReplicas should return an 
> empty map when there are no replicas. Callers can decide whether no replicas 
> for a given container is a violation of a business condition. Throwing an 
> exception in this API also results in a subtle bug in ContainerReportHandler, 
> where an exception for one container may result in the whole container list 
> being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609684#comment-16609684
 ] 

Ajay Kumar edited comment on HDDS-427 at 9/11/18 12:12 AM:
---

getContainerReplicas can be used by multiple callers to check the replicas 
for a given container. Not all of them may view a container with no replicas 
as something which warrants an exception from the API itself. In some of 
those cases it is a perfectly normal situation. For example, during SCM 
chill mode a container might not have any replicas reported initially. 
Similarly, during container report handling, if no replicas are found for a 
given container, ContainerReportHandler should log the message and move on, 
as ReplicationManager can't do anything in that specific case.


was (Author: ajayydv):
getContainerReplicas can be used by multiple callers to check replicas for 
given container. Not all of them may view container with no replica as 
something which warrants an exception from api itself. In some of those cases 
it is perfectly normal case. For example during SCM chill mode a container 
might not have any replica reported initially. Similarly during container 
report handling if no replicas are found for given container 
ContainerReportHandler should take log the message and move on as 
ReplicationManager can't do anything in that specific case.

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDDS-427.00.patch
>
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed 
> to decide whether it is a violation of some condition (when there are no 
> replicas, warranting an exception) instead of that being implicit for every 
> caller. Instead, ContainerStateMap#getContainerReplicas should return an 
> empty map when there are no replicas. Callers can decide whether no replicas 
> for a given container is a violation of a business condition. Throwing an 
> exception in this API also results in a subtle bug in ContainerReportHandler, 
> where an exception for one container may result in the whole container list 
> being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-427:

Attachment: (was: HDDS-427.00.patch)

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDDS-427.00.patch
>
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed 
> to decide whether it is a violation of some condition (when there are no 
> replicas, warranting an exception) instead of that being implicit for every 
> caller. Instead, ContainerStateMap#getContainerReplicas should return an 
> empty map when there are no replicas. Callers can decide whether no replicas 
> for a given container is a violation of a business condition. Throwing an 
> exception in this API also results in a subtle bug in ContainerReportHandler, 
> where an exception for one container may result in the whole container list 
> being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-427:

Status: Patch Available  (was: Open)

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDDS-427.00.patch
>
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed 
> to decide whether it is a violation of some condition (when there are no 
> replicas, warranting an exception) instead of that being implicit for every 
> caller. Instead, ContainerStateMap#getContainerReplicas should return an 
> empty map when there are no replicas. Callers can decide whether no replicas 
> for a given container is a violation of a business condition. Throwing an 
> exception in this API also results in a subtle bug in ContainerReportHandler, 
> where an exception for one container may result in the whole container list 
> being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-427:

Attachment: HDDS-427.00.patch

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDDS-427.00.patch
>
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed 
> to decide whether it is a violation of some condition (when there are no 
> replicas, warranting an exception) instead of that being implicit for every 
> caller. Instead, ContainerStateMap#getContainerReplicas should return an 
> empty map when there are no replicas. Callers can decide whether no replicas 
> for a given container is a violation of a business condition. Throwing an 
> exception in this API also results in a subtle bug in ContainerReportHandler, 
> where an exception for one container may result in the whole container list 
> being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-427:

Attachment: HDDS-427.00.patch

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDDS-427.00.patch
>
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed 
> to decide whether it is a violation of some condition (when there are no 
> replicas, warranting an exception) instead of that being implicit for every 
> caller. Instead, ContainerStateMap#getContainerReplicas should return an 
> empty map when there are no replicas. Callers can decide whether no replicas 
> for a given container is a violation of a business condition. Throwing an 
> exception in this API also results in a subtle bug in ContainerReportHandler, 
> where an exception for one container may result in the whole container list 
> being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-222) Remove hdfs command line from ozone distribution.

2018-09-10 Thread Hanisha Koneru (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609908#comment-16609908
 ] 

Hanisha Koneru commented on HDDS-222:
-

Thanks [~elek].
After this patch, the sbin directory contains only the following files:
{code:java}
$ ll sbin
total 48
-rwxr-xr-x 1 hkoneru staff 1.9K Sep 10 15:13 hadoop-daemon.sh
-rwxr-xr-x 1 hkoneru staff 2.5K Sep 10 15:13 hadoop-daemons.sh
-rwxr-xr-x 1 hkoneru staff 1.8K Sep 10 15:13 ozone-config.sh
-rwxr-xr-x 1 hkoneru staff 4.0K Sep 10 15:13 start-ozone.sh
-rwxr-xr-x 1 hkoneru staff 3.4K Sep 10 15:13 stop-ozone.sh
-rwxr-xr-x 1 hkoneru staff 1.9K Sep 10 15:13 workers.sh
{code}
And {{ozone-config.sh}} is added to the libexec folder.

Deployed using the tarball and verified that ozone-env.sh is loaded if 
available.

Patch v05 LGTM. Can you please post a patch for trunk as well?

> Remove hdfs command line from ozone distribution.
> -
>
> Key: HDDS-222
> URL: https://issues.apache.org/jira/browse/HDDS-222
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: newbie
> Fix For: 0.3.0
>
> Attachments: HDDS-222-ozone-0.2.005.patch, HDDS-222.001.patch, 
> HDDS-222.002.patch, HDDS-222.003.patch, HDDS-222.004.patch
>
>
> As the ozone release artifact doesn't contain stable namenode/datanode code, 
> the hdfs command should be removed from the ozone artifact.
> ozone-dist-layout-stitching could also be simplified to copy only the 
> required jar files (we don't need to copy the namenode/datanode server-side 
> jars, just the common artifacts).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider using UGI at creation time for consistent UGI handling

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609907#comment-16609907
 ] 

Hadoop QA commented on HDFS-13697:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 18 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
21m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
36s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
35s{color} | {color:green} root: The patch generated 0 new + 417 unchanged - 7 
fixed = 417 total (was 424) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  9m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
54s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
24s{color} | {color:green} hadoop-kms in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
41s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 57s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
24s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
46s{color} | {color:green} hadoop-hdfs-nfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}260m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|  

[jira] [Updated] (HDDS-413) Ozone freon help needs the Scm and OM running

2018-09-10 Thread Jitendra Nath Pandey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDDS-413:
--
Fix Version/s: (was: 0.2.1)
   0.3.0

> Ozone freon help needs the Scm and OM running
> -
>
> Key: HDDS-413
> URL: https://issues.apache.org/jira/browse/HDDS-413
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Dinesh Chitlangia
>Priority: Major
> Fix For: 0.3.0
>
>
> Ozone freon help needs the SCM and OM running:
> {code:java}
> ./ozone freon --help
> 2018-09-07 12:23:28,983 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 2018-09-07 12:23:30,203 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:9862. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2018-09-07 12:23:31,204 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:9862. Already tried 1 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> ^C⏎ 
> HW11767 ~/t/o/bin> jps
> 52445
> 86095 Jps{code}
> If SCM and OM are running, freon help works fine:
> {code:java}
> HW11767 ~/t/o/bin> ./ozone freon --help
> 2018-09-07 12:30:18,535 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> Options supported are:
> -numOfThreads    number of threads to be launched for the run.
> -validateWrites do random validation of data written into 
> ozone, only subset of data is validated.
> -jsonDirdirectory where json is created.
> -mode [online | offline]specifies the mode in which Freon should run.
> -source    specifies the URL of s3 commoncrawl warc file 
> to be used when the mode is online.
> -numOfVolumes    specifies number of Volumes to be created in 
> offline mode
> -numOfBuckets    specifies number of Buckets to be created per 
> Volume in offline mode
> -numOfKeys   specifies number of Keys to be created per 
> Bucket in offline mode
> -keySize specifies the size of Key in bytes to be 
> created in offline mode
> -help   prints usage.{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13868) WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but "oldsnapshotname" is not.

2018-09-10 Thread Pranay Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pranay Singh updated HDFS-13868:

Status: Patch Available  (was: In Progress)

> WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but 
> "oldsnapshotname" is not.
> -
>
> Key: HDFS-13868
> URL: https://issues.apache.org/jira/browse/HDFS-13868
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, webhdfs
>Affects Versions: 3.0.3, 3.1.0
>Reporter: Siyao Meng
>Assignee: Pranay Singh
>Priority: Major
> Attachments: HDFS-13868.001.patch, HDFS-13868.002.patch, 
> HDFS-13868.003.patch, HDFS-13868.004.patch
>
>
> HDFS-13052 implements GETSNAPSHOTDIFF for WebHDFS.
>  
> Proof:
> {code:java}
> # Bash
> # Prerequisite: You will need to create the directory "/snapshot", 
> allowSnapshot() on it, and create a snapshot named "snap3" for it to reach 
> NPE.
> $ curl 
> "http://<namenode>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&oldsnapshotname1=snap2&snapshotname=snap3"
> # Note that I intentionally typed the wrong parameter name for 
> "oldsnapshotname" above to cause NPE.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://<namenode>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&oldsnapshotname=&snapshotname=snap3"
> # Empty string for oldsnapshotname
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://<namenode>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&snapshotname=snap3"
> # Missing param oldsnapshotname, essentially the same as the first case.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}{code}
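>
> A minimal sketch of the kind of parameter check that would avoid the NPE 
> (the class and method below are hypothetical illustrations, not the actual 
> WebHDFS handler code):
> {code:java}
> class SnapshotDiffParamCheck {
>   // Validate GETSNAPSHOTDIFF parameters before dispatching the request.
>   static void validate(String oldSnapshotName, String snapshotName) {
>     if (oldSnapshotName == null || oldSnapshotName.isEmpty()) {
>       throw new IllegalArgumentException(
>           "Missing or empty required parameter: oldsnapshotname");
>     }
>     if (snapshotName == null || snapshotName.isEmpty()) {
>       throw new IllegalArgumentException(
>           "Missing or empty required parameter: snapshotname");
>     }
>   }
> }
> {code}
> With a check like this, the server can return a clear message naming the 
> bad parameter instead of a bare NullPointerException.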



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13868) WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but "oldsnapshotname" is not.

2018-09-10 Thread Pranay Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pranay Singh updated HDFS-13868:

Status: In Progress  (was: Patch Available)

> WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but 
> "oldsnapshotname" is not.
> -
>
> Key: HDFS-13868
> URL: https://issues.apache.org/jira/browse/HDFS-13868
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, webhdfs
>Affects Versions: 3.0.3, 3.1.0
>Reporter: Siyao Meng
>Assignee: Pranay Singh
>Priority: Major
> Attachments: HDFS-13868.001.patch, HDFS-13868.002.patch, 
> HDFS-13868.003.patch, HDFS-13868.004.patch
>
>
> HDFS-13052 implements GETSNAPSHOTDIFF for WebHDFS.
>  
> Proof:
> {code:java}
> # Bash
> # Prerequisite: You will need to create the directory "/snapshot", 
> # allowSnapshot() on it, and create a snapshot named "snap3" for it to 
> # reach NPE.
> $ curl 
> "http://<host>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&oldsnapshotname2=snap2&snapshotname=snap3"
> # Note that I intentionally typed the wrong parameter name for 
> # "oldsnapshotname" above to cause NPE.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://<host>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&oldsnapshotname=&snapshotname=snap3"
> # Empty string for oldsnapshotname
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://<host>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&snapshotname=snap3"
> # Missing param oldsnapshotname, essentially the same as the first case.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13868) WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but "oldsnapshotname" is not.

2018-09-10 Thread Pranay Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pranay Singh updated HDFS-13868:

Attachment: HDFS-13868.004.patch

> WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but 
> "oldsnapshotname" is not.
> -
>
> Key: HDFS-13868
> URL: https://issues.apache.org/jira/browse/HDFS-13868
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, webhdfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Siyao Meng
>Assignee: Pranay Singh
>Priority: Major
> Attachments: HDFS-13868.001.patch, HDFS-13868.002.patch, 
> HDFS-13868.003.patch, HDFS-13868.004.patch
>
>
> HDFS-13052 implements GETSNAPSHOTDIFF for WebHDFS.
>  
> Proof:
> {code:java}
> # Bash
> # Prerequisite: You will need to create the directory "/snapshot", 
> # allowSnapshot() on it, and create a snapshot named "snap3" for it to 
> # reach NPE.
> $ curl 
> "http://<host>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&oldsnapshotname2=snap2&snapshotname=snap3"
> # Note that I intentionally typed the wrong parameter name for 
> # "oldsnapshotname" above to cause NPE.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://<host>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&oldsnapshotname=&snapshotname=snap3"
> # Empty string for oldsnapshotname
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://<host>:<port>/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF&user.name=hdfs&snapshotname=snap3"
> # Missing param oldsnapshotname, essentially the same as the first case.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13778) In TestStateAlignmentContextWithHA replace artificial AlignmentContextProxyProvider with real ObserverReadProxyProvider.

2018-09-10 Thread Plamen Jeliazkov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-13778:

Attachment: HDFS-13778-HDFS-12943.003.patch

> In TestStateAlignmentContextWithHA replace artificial 
> AlignmentContextProxyProvider with real ObserverReadProxyProvider.
> 
>
> Key: HDFS-13778
> URL: https://issues.apache.org/jira/browse/HDFS-13778
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HDFS-13778-HDFS-12943.001.patch, 
> HDFS-13778-HDFS-12943.002.patch, HDFS-13778-HDFS-12943.003.patch
>
>
> TestStateAlignmentContextWithHA uses an artificial 
> AlignmentContextProxyProvider, which was temporarily needed for testing. Now 
> that we have the real ObserverReadProxyProvider, it can take over from ACPP. 
> This is also useful for testing the ORPP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13778) In TestStateAlignmentContextWithHA replace artificial AlignmentContextProxyProvider with real ObserverReadProxyProvider.

2018-09-10 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609865#comment-16609865
 ] 

Plamen Jeliazkov commented on HDFS-13778:
-

Hey [~shv],

Couple things from me:
(1) Should we move this to the 'org.apache.hadoop.hdfs.server.namenode.ha' 
package? It seems a more appropriate place.
(2) Regarding 'testMultiClientStatesWithRandomFailovers', what if we split the 
load across two runs and only did 1 failover per run? The issue I am seeing in 
the Jenkins logs is clients getting stuck waiting for each other. I think we 
need to decrease the number of clients as well; the optimal case is two 
clients competing against each other. I will attach a patch (003) 
demonstrating my idea.
(3) I attempted to resurrect 'testClientSendsState', but unfortunately I have 
no way of accessing the ClientHAFactory needed to actually change the client's 
AlignmentContext to a mocked one, nor does there seem to be anything else I 
can mock to check whether the client is sending state, as the AlignmentContext 
is the only source for that.

> In TestStateAlignmentContextWithHA replace artificial 
> AlignmentContextProxyProvider with real ObserverReadProxyProvider.
> 
>
> Key: HDFS-13778
> URL: https://issues.apache.org/jira/browse/HDFS-13778
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HDFS-13778-HDFS-12943.001.patch, 
> HDFS-13778-HDFS-12943.002.patch
>
>
> TestStateAlignmentContextWithHA uses an artificial 
> AlignmentContextProxyProvider, which was temporarily needed for testing. Now 
> that we have the real ObserverReadProxyProvider, it can take over from ACPP. 
> This is also useful for testing the ORPP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-414) sbin/stop-all.sh does not stop Ozone daemons

2018-09-10 Thread Hanisha Koneru (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609859#comment-16609859
 ] 

Hanisha Koneru commented on HDDS-414:
-

[~elek], the changes to the {{stop-ozone.sh}} script look good.
Can we make the same changes to the {{start-ozone.sh}} script as well?
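For context, a minimal sketch of the intended workflow with the Ozone-specific 
wrappers (script name as discussed in this issue; output and paths will vary):

{code}
$ sbin/stop-ozone.sh   # should stop OM, SCM and the HDDS datanodes
$ jps                  # OzoneManager, StorageContainerManager and
                       # HddsDatanodeService should no longer be listed
{code}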

> sbin/stop-all.sh does not stop Ozone daemons
> 
>
> Key: HDDS-414
> URL: https://issues.apache.org/jira/browse/HDDS-414
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-414-ozone-0.2.1.001.patch
>
>
> sbin/stop-all.sh does not stop Ozone daemons.
> Please see below:
> {code:java}
> ➜ ozone-0.2.1-SNAPSHOT jps
> 8896 Jps
> 8224 HddsDatanodeService
> 8162 OzoneManager
> 7701 StorageContainerManager
> ➜ ozone-0.2.1-SNAPSHOT pwd
> /tmp/ozone-0.2.1-SNAPSHOT
> ➜ ozone-0.2.1-SNAPSHOT sbin/stop-all.sh
> WARNING: Stopping all Apache Hadoop daemons as nmaheshwari in 10 seconds.
> WARNING: Use CTRL-C to abort.
> Stopping namenodes on [localhost]
> localhost: ssh: connect to host localhost port 22: Connection refused
> Stopping datanodes
> localhost: ssh: connect to host localhost port 22: Connection refused
> Stopping secondary namenodes [HW11469.local]
> HW11469.local: ssh: connect to host hw11469.local port 22: Connection refused
> 2018-09-07 12:38:49,044 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> ➜ ozone-0.2.1-SNAPSHOT jps
> 8224 HddsDatanodeService
> 8162 OzoneManager
> 7701 StorageContainerManager
> 9150 Jps
> ➜ ozone-0.2.1-SNAPSHOT
> {code}
> The Ozone daemon processes are not stopped even after sbin/stop-all.sh 
> finishes executing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-222) Remove hdfs command line from ozone distribution.

2018-09-10 Thread Jitendra Nath Pandey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDDS-222:
--
Fix Version/s: (was: 0.2.1)
   0.3.0

> Remove hdfs command line from ozone distribution.
> -
>
> Key: HDDS-222
> URL: https://issues.apache.org/jira/browse/HDDS-222
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: newbie
> Fix For: 0.3.0
>
> Attachments: HDDS-222-ozone-0.2.005.patch, HDDS-222.001.patch, 
> HDDS-222.002.patch, HDDS-222.003.patch, HDDS-222.004.patch
>
>
> As the ozone release artifact doesn't contain stable namenode/datanode code, 
> the hdfs command should be removed from the ozone artifact.
> ozone-dist-layout-stitching could also be simplified to copy only the 
> required jar files (we don't need to copy the namenode/datanode server-side 
> jars, just the common artifacts).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-318) ratis INFO logs should not shown during ozoneFs command-line execution

2018-09-10 Thread Jitendra Nath Pandey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDDS-318:
--
Fix Version/s: (was: 0.2.1)
   0.3.0

> ratis INFO logs should not shown during ozoneFs command-line execution
> --
>
> Key: HDDS-318
> URL: https://issues.apache.org/jira/browse/HDDS-318
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Nilotpal Nandi
>Assignee: Tsz Wo Nicholas Sze
>Priority: Blocker
>  Labels: newbie
> Fix For: 0.3.0
>
> Attachments: HDDS-318.20180907.patch
>
>
> ratis INFOs should not be shown during ozoneFS CLI execution.
> Please find a snippet from one of the executions:
>  
> {noformat}
> hadoop@08315aa4b367:~/bin$ ./ozone fs -put /etc/passwd /p2
> 2018-08-02 12:17:18 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.rpc.type = GRPC (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.grpc.message.size.max = 33554432 
> (custom)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.client.rpc.retryInterval = 300 
> ms (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - 
> raft.client.async.outstanding-requests.max = 100 (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.client.async.scheduler-threads = 
> 3 (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.grpc.flow.control.window = 1MB 
> (=1048576) (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.grpc.message.size.max = 33554432 
> (custom)
> 2018-08-02 12:17:20 INFO ConfUtils:41 - raft.client.rpc.request.timeout = 
> 3000 ms (default)
> Aug 02, 2018 12:17:20 PM 
> org.apache.ratis.shaded.io.grpc.internal.ProxyDetectorImpl detectProxy
> WARNING: Failed to construct URI for proxy lookup, proceeding without proxy
> ..
> ..
> ..
>  
> {noformat}
>  
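Until a fix lands, one possible client-side workaround is to raise the Ratis 
log level in the log4j configuration used by the CLI; a sketch (the file path 
and logger name are assumptions, not part of the attached patch):

{code}
# Append to the log4j configuration shipped with the CLI (path assumed):
cat >> etc/hadoop/log4j.properties <<'EOF'
log4j.logger.org.apache.ratis=WARN
EOF
{code}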



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-425) Move unit test of the genconf tool to hadoop-ozone/tools module

2018-09-10 Thread Dinesh Chitlangia (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609856#comment-16609856
 ] 

Dinesh Chitlangia commented on HDDS-425:


The test failure is not related to the patch.

The license failure is due to the heap dump file generated by the test failure.

> Move unit test of the genconf tool to hadoop-ozone/tools module
> ---
>
> Key: HDDS-425
> URL: https://issues.apache.org/jira/browse/HDDS-425
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Minor
> Fix For: 0.3.0
>
> Attachments: HDDS-425.001.patch
>
>
> Based on a review comment from [~elek] in HDDS-417, this Jira proposes to 
> move the unit test of the genconf tool to the hadoop-ozone/tools module. It 
> doesn't require a MiniOzone cluster, so it shouldn't be in the integration 
> test module.
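Once moved, the test should be runnable from the tools module alone, without a 
cluster; for example (the test class name is assumed for illustration):

{code}
mvn -pl hadoop-ozone/tools test -Dtest=TestGenerateOzoneRequiredConfigurations
{code}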



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-283) Need an option to list all volumes created in the cluster

2018-09-10 Thread Jitendra Nath Pandey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDDS-283:
--
Fix Version/s: (was: 0.2.1)
   0.3.0

> Need an option to list all volumes created in the cluster
> -
>
> Key: HDDS-283
> URL: https://issues.apache.org/jira/browse/HDDS-283
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Nilotpal Nandi
>Assignee: Nilotpal Nandi
>Priority: Blocker
> Fix For: 0.3.0
>
> Attachments: HDDS-283.001.patch
>
>
> Currently, the listVolume command either gives:
> 1) all the volumes created by a particular user, using the -user argument,
> 2) or all the volumes created by the logged-in user, if no -user argument 
> is provided.
>  
> We need an option to list all the volumes created in the cluster.
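To make the two existing behaviours concrete, a sketch with the current CLI 
(the volume root, user name and exact arguments are illustrative and may vary 
by release; the cluster-wide option does not exist yet):

{code}
# 1) volumes created by a particular user
ozone oz -listVolume / -user hadoop
# 2) volumes created by the logged-in user
ozone oz -listVolume /
{code}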



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13778) In TestStateAlignmentContextWithHA replace artificial AlignmentContextProxyProvider with real ObserverReadProxyProvider.

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609852#comment-16609852
 ] 

Hadoop QA commented on HDFS-13778:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-12943 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
43s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} HDFS-12943 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 53s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 0 unchanged - 5 fixed = 1 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 51s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 36s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.TestStateAlignmentContextWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:9b55946 |
| JIRA Issue | HDFS-13778 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939141/HDFS-13778-HDFS-12943.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux fb8ee51d1ce8 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-12943 / 039c158 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25025/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25025/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Commented] (HDDS-414) sbin/stop-all.sh does not stop Ozone daemons

2018-09-10 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609842#comment-16609842
 ] 

Tsz Wo Nicholas Sze commented on HDDS-414:
--

[~nmaheshwari], could you test the patch?

> sbin/stop-all.sh does not stop Ozone daemons
> 
>
> Key: HDDS-414
> URL: https://issues.apache.org/jira/browse/HDDS-414
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-414-ozone-0.2.1.001.patch
>
>
> sbin/stop-all.sh does not stop Ozone daemons.
> Please see below:
> {code:java}
> ➜ ozone-0.2.1-SNAPSHOT jps
> 8896 Jps
> 8224 HddsDatanodeService
> 8162 OzoneManager
> 7701 StorageContainerManager
> ➜ ozone-0.2.1-SNAPSHOT pwd
> /tmp/ozone-0.2.1-SNAPSHOT
> ➜ ozone-0.2.1-SNAPSHOT sbin/stop-all.sh
> WARNING: Stopping all Apache Hadoop daemons as nmaheshwari in 10 seconds.
> WARNING: Use CTRL-C to abort.
> Stopping namenodes on [localhost]
> localhost: ssh: connect to host localhost port 22: Connection refused
> Stopping datanodes
> localhost: ssh: connect to host localhost port 22: Connection refused
> Stopping secondary namenodes [HW11469.local]
> HW11469.local: ssh: connect to host hw11469.local port 22: Connection refused
> 2018-09-07 12:38:49,044 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> ➜ ozone-0.2.1-SNAPSHOT jps
> 8224 HddsDatanodeService
> 8162 OzoneManager
> 7701 StorageContainerManager
> 9150 Jps
> ➜ ozone-0.2.1-SNAPSHOT
> {code}
> The Ozone daemon processes are not stopped even after sbin/stop-all.sh 
> finishes executing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-414) sbin/stop-all.sh does not stop Ozone daemons

2018-09-10 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609840#comment-16609840
 ] 

Tsz Wo Nicholas Sze commented on HDDS-414:
--

+1 patch looks good.

> sbin/stop-all.sh does not stop Ozone daemons
> 
>
> Key: HDDS-414
> URL: https://issues.apache.org/jira/browse/HDDS-414
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-414-ozone-0.2.1.001.patch
>
>
> sbin/stop-all.sh does not stop Ozone daemons.
> Please see below:
> {code:java}
> ➜ ozone-0.2.1-SNAPSHOT jps
> 8896 Jps
> 8224 HddsDatanodeService
> 8162 OzoneManager
> 7701 StorageContainerManager
> ➜ ozone-0.2.1-SNAPSHOT pwd
> /tmp/ozone-0.2.1-SNAPSHOT
> ➜ ozone-0.2.1-SNAPSHOT sbin/stop-all.sh
> WARNING: Stopping all Apache Hadoop daemons as nmaheshwari in 10 seconds.
> WARNING: Use CTRL-C to abort.
> Stopping namenodes on [localhost]
> localhost: ssh: connect to host localhost port 22: Connection refused
> Stopping datanodes
> localhost: ssh: connect to host localhost port 22: Connection refused
> Stopping secondary namenodes [HW11469.local]
> HW11469.local: ssh: connect to host hw11469.local port 22: Connection refused
> 2018-09-07 12:38:49,044 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> ➜ ozone-0.2.1-SNAPSHOT jps
> 8224 HddsDatanodeService
> 8162 OzoneManager
> 7701 StorageContainerManager
> 9150 Jps
> ➜ ozone-0.2.1-SNAPSHOT
> {code}
> The Ozone daemon processes are not stopped even after sbin/stop-all.sh 
> finishes executing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13880) Add mechanism to allow certain RPC calls to bypass sync

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609826#comment-16609826
 ] 

Hadoop QA commented on HDFS-13880:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-12943 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  6m 
50s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
52s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  
2s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
26s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
22s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
21s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
31s{color} | {color:green} HDFS-12943 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 29s{color} | {color:orange} root: The patch generated 1 new + 221 unchanged 
- 0 fixed = 222 total (was 221) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
34s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
46s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
49s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}233m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestFileAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:9b55946 |
| JIRA Issue | HDFS-13880 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939130/HDFS-13880-HDFS-12943.005.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux bcb4cdee9cb1 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 

[jira] [Commented] (HDDS-416) Fix bug in ChunkInputStreamEntry

2018-09-10 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609792#comment-16609792
 ] 

Xiaoyu Yao commented on HDDS-416:
-

Agree with [~ljain], this seems to be a bug if we consider seek/skip operations.

The fix looks good to me, +1.

> Fix bug in ChunkInputStreamEntry
> 
>
> Key: HDDS-416
> URL: https://issues.apache.org/jira/browse/HDDS-416
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-416.001.patch
>
>
> ChunkInputStreamEntry maintains a currentPosition field. This field is 
> redundant and can be replaced by getPos().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider using UGI at creation time for consistent UGI handling

2018-09-10 Thread Zsolt Venczel (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609685#comment-16609685
 ] 

Zsolt Venczel edited comment on HDFS-13697 at 9/10/18 8:48 PM:
---

In my latest patch I fixed the TestEncryptionZonesWithKMS failure.

With the latest patch (12), all of the above failed tests have passed:
{code}
[INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-hdfs ---
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
[INFO] Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.35 s 
- in org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
[INFO] Running org.apache.hadoop.hdfs.TestRollingUpgrade
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 143.89 
s - in org.apache.hadoop.hdfs.TestRollingUpgrade
[INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 70.541 
s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Running org.apache.hadoop.hdfs.server.balancer.TestBalancer
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 366.403 
s - in org.apache.hadoop.hdfs.server.balancer.TestBalancer
[INFO] Running org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
[INFO] Tests run: 45, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 129.862 
s - in org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 117, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
{code}
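For reference, these suites can be re-run locally in one pass with surefire's 
test filter (module path as in the QA reports above):

{code}
mvn -pl hadoop-hdfs-project/hadoop-hdfs test \
  -Dtest=TestWebHdfsTimeouts,TestRollingUpgrade,TestDataNodeVolumeFailure,TestBalancer,TestEncryptionZonesWithKMS
{code}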


was (Author: zvenczel):
In my latest patch I fixed the TestEncryptionZonesWithKMS failure.

With the latest patch (11) all above, failed tests have passed:
{code}
[INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-hdfs ---
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
[INFO] Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.35 s 
- in org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
[INFO] Running org.apache.hadoop.hdfs.TestRollingUpgrade
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 143.89 
s - in org.apache.hadoop.hdfs.TestRollingUpgrade
[INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 70.541 
s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Running org.apache.hadoop.hdfs.server.balancer.TestBalancer
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 366.403 
s - in org.apache.hadoop.hdfs.server.balancer.TestBalancer
[INFO] Running org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
[INFO] Tests run: 45, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 129.862 
s - in org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 117, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
{code}

> DFSClient should instantiate and cache KMSClientProvider using UGI at 
> creation time for consistent UGI handling
> ---
>
> Key: HDFS-13697
> URL: https://issues.apache.org/jira/browse/HDFS-13697
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, 
> HDFS-13697.03.patch, HDFS-13697.04.patch, HDFS-13697.05.patch, 
> HDFS-13697.06.patch, HDFS-13697.07.patch, HDFS-13697.08.patch, 
> HDFS-13697.09.patch, HDFS-13697.10.patch, HDFS-13697.11.patch, 
> HDFS-13697.12.patch, HDFS-13697.prelim.patch
>
>
> While calling KeyProviderCryptoExtension decryptEncryptedKey, the call stack 
> might not have a doAs privileged execution call (in the DFSClient, for 
> example). This results in losing the proxy user from UGI, as 
> UGI.getCurrentUser finds no AccessControllerContext and does a re-login for 
> the login user only.
> This can cause the following, for example: if we have set up the oozie user 
> to be entitled to perform actions on behalf of example_user, but oozie is 
> forbidden to decrypt any EDEK (for security reasons), then due to the above 
> issue example_user's entitlements are lost from UGI and the following error 
> is reported:
> {code}
> [0] 
> SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] 
> JOB[0020905-180313191552532-oozie-oozi-W] 
> ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting 
> action [polling_dir_path]. 

[jira] [Commented] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609782#comment-16609782
 ] 

Hadoop QA commented on HDFS-13237:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
28m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 42m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13237 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939152/HDFS-13237.005.patch |
| Optional Tests |  dupname  asflicense  mvnsite  |
| uname | Linux 6df095a9213e 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 987d819 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 395 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25026/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch, 
> HDFS-13237.005.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13778) In TestStateAlignmentContextWithHA replace artificial AlignmentContextProxyProvider with real ObserverReadProxyProvider.

2018-09-10 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609769#comment-16609769
 ] 

Konstantin Shvachko commented on HDFS-13778:


The {{TestUnderReplicatedBlocks}} failure is covered under HDFS-9243.

> In TestStateAlignmentContextWithHA replace artificial 
> AlignmentContextProxyProvider with real ObserverReadProxyProvider.
> 
>
> Key: HDFS-13778
> URL: https://issues.apache.org/jira/browse/HDFS-13778
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HDFS-13778-HDFS-12943.001.patch, 
> HDFS-13778-HDFS-12943.002.patch
>
>
> TestStateAlignmentContextWithHA uses an artificial 
> AlignmentContextProxyProvider, which was temporarily needed for testing. Now 
> that we have the real ObserverReadProxyProvider, it can take over from ACPP. 
> This is also useful for testing the ORPP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-425) Move unit test of the genconf tool to hadoop-ozone/tools module

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609765#comment-16609765
 ] 

Hadoop QA commented on HDDS-425:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
27s{color} | {color:green} tools in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  5m 30s{color} 
| {color:red} integration-test in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
33s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.freon.TestDataValidate |
|   | hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-425 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939140/HDDS-425.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d8cb6361ec23 

[jira] [Commented] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609715#comment-16609715
 ] 

Íñigo Goiri commented on HDFS-13237:


[^HDFS-13237.005.patch] makes the RANDOM statement general and uses the 
reading case as an example.
Let me know if this is good enough.

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch, 
> HDFS-13237.005.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13237:
---
Attachment: HDFS-13237.005.patch

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch, 
> HDFS-13237.005.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-420) putKey failing with KEY_ALLOCATION_ERROR

2018-09-10 Thread Nilotpal Nandi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609702#comment-16609702
 ] 

Nilotpal Nandi commented on HDDS-420:
-

[~shashikant]: logs from all nodes are present in this tar file:

[^all-node-ozone-logs-1536607597.tar.gz]

> putKey failing with KEY_ALLOCATION_ERROR
> 
>
> Key: HDDS-420
> URL: https://issues.apache.org/jira/browse/HDDS-420
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: all-node-ozone-logs-1536607597.tar.gz
>
>
> Here are the commands run:
> {noformat}
> [root@ctr-e138-1518143905142-468367-01-02 bin]# ./ozone oz -putKey 
> /fs-volume/fs-bucket/nn1 -file /etc/passwd
> 2018-09-09 15:39:31,131 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> Create key failed, error:KEY_ALLOCATION_ERROR
> [root@ctr-e138-1518143905142-468367-01-02 bin]#
> [root@ctr-e138-1518143905142-468367-01-02 bin]# ./ozone fs -copyFromLocal 
> /etc/passwd /
> 2018-09-09 15:40:16,879 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 2018-09-09 15:40:23,632 [main] ERROR - Try to allocate more blocks for write 
> failed, already allocated 0 blocks for this write.
> copyFromLocal: Message missing required fields: keyLocation
> [root@ctr-e138-1518143905142-468367-01-02 bin]# ./ozone oz -putKey 
> /fs-volume/fs-bucket/nn2 -file /etc/passwd
> 2018-09-09 15:44:55,912 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> Create key failed, error:KEY_ALLOCATION_ERROR{noformat}
>  
> hadoop version :
> ---
> {noformat}
> [root@ctr-e138-1518143905142-468367-01-02 bin]# ./hadoop version
> Hadoop 3.2.0-SNAPSHOT
> Source code repository git://git.apache.org/hadoop.git -r 
> bf8a1750e99cfbfa76021ce51b6514c74c06f498
> Compiled by root on 2018-09-08T10:22Z
> Compiled with protoc 2.5.0
> From source with checksum c5bbb375aed8edabd89c377af83189d
> This command was run using 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT.jar{noformat}
>  
> scm log :
> ---
> {noformat}
> 2018-09-09 15:45:00,907 INFO 
> org.apache.hadoop.hdds.scm.pipelines.ratis.RatisManagerImpl: Allocating a new 
> ratis pipeline of size: 3 id: pipelineId=f210716d-ba7b-4adf-91d6-da286e5fd010
> 2018-09-09 15:45:00,973 INFO org.apache.ratis.conf.ConfUtils: raft.rpc.type = 
> GRPC (default)
> 2018-09-09 15:45:01,007 INFO org.apache.ratis.conf.ConfUtils: 
> raft.grpc.message.size.max = 33554432 (custom)
> 2018-09-09 15:45:01,011 INFO org.apache.ratis.conf.ConfUtils: 
> raft.client.rpc.retryInterval = 300 ms (default)
> 2018-09-09 15:45:01,012 INFO org.apache.ratis.conf.ConfUtils: 
> raft.client.async.outstanding-requests.max = 100 (default)
> 2018-09-09 15:45:01,012 INFO org.apache.ratis.conf.ConfUtils: 
> raft.client.async.scheduler-threads = 3 (default)
> 2018-09-09 15:45:01,020 INFO org.apache.ratis.conf.ConfUtils: 
> raft.grpc.flow.control.window = 1MB (=1048576) (default)
> 2018-09-09 15:45:01,020 INFO org.apache.ratis.conf.ConfUtils: 
> raft.grpc.message.size.max = 33554432 (custom)
> 2018-09-09 15:45:01,102 INFO org.apache.ratis.conf.ConfUtils: 
> raft.client.rpc.request.timeout = 3000 ms (default)
> 2018-09-09 15:45:01,667 ERROR org.apache.hadoop.hdds.scm.XceiverClientRatis: 
> Failed to reinitialize 
> RaftPeer:bfe9c5f2-da9b-4a8f-9013-7540cbbed1c9:172.27.12.96:9858 datanode: 
> bfe9c5f2-da9b-4a8f-9013-7540cbbed1c9{ip: 172.27.12.96, host: 
> ctr-e138-1518143905142-468367-01-07.hwx.site}
> org.apache.ratis.protocol.GroupMismatchException: 
> bfe9c5f2-da9b-4a8f-9013-7540cbbed1c9: The group (group-7347726F7570) of 
> client-409D68EB500F does not match the group (group-2041ABBEE452) of the 
> server bfe9c5f2-da9b-4a8f-9013-7540cbbed1c9
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.ratis.util.ReflectionUtils.instantiateException(ReflectionUtils.java:222)
>  at 
> org.apache.ratis.grpc.RaftGrpcUtil.tryUnwrapException(RaftGrpcUtil.java:79)
>  at org.apache.ratis.grpc.RaftGrpcUtil.unwrapException(RaftGrpcUtil.java:67)
>  at 
> 

[jira] [Updated] (HDDS-420) putKey failing with KEY_ALLOCATION_ERROR

2018-09-10 Thread Nilotpal Nandi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nilotpal Nandi updated HDDS-420:

Attachment: all-node-ozone-logs-1536607597.tar.gz

> putKey failing with KEY_ALLOCATION_ERROR
> 
>
> Key: HDDS-420
> URL: https://issues.apache.org/jira/browse/HDDS-420
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Nilotpal Nandi
>Assignee: Shashikant Banerjee
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: all-node-ozone-logs-1536607597.tar.gz
>
>
> Here are the commands run:
> {noformat}
> [root@ctr-e138-1518143905142-468367-01-02 bin]# ./ozone oz -putKey 
> /fs-volume/fs-bucket/nn1 -file /etc/passwd
> 2018-09-09 15:39:31,131 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> Create key failed, error:KEY_ALLOCATION_ERROR
> [root@ctr-e138-1518143905142-468367-01-02 bin]#
> [root@ctr-e138-1518143905142-468367-01-02 bin]# ./ozone fs -copyFromLocal 
> /etc/passwd /
> 2018-09-09 15:40:16,879 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 2018-09-09 15:40:23,632 [main] ERROR - Try to allocate more blocks for write 
> failed, already allocated 0 blocks for this write.
> copyFromLocal: Message missing required fields: keyLocation
> [root@ctr-e138-1518143905142-468367-01-02 bin]# ./ozone oz -putKey 
> /fs-volume/fs-bucket/nn2 -file /etc/passwd
> 2018-09-09 15:44:55,912 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> Create key failed, error:KEY_ALLOCATION_ERROR{noformat}
>  
> hadoop version :
> ---
> {noformat}
> [root@ctr-e138-1518143905142-468367-01-02 bin]# ./hadoop version
> Hadoop 3.2.0-SNAPSHOT
> Source code repository git://git.apache.org/hadoop.git -r 
> bf8a1750e99cfbfa76021ce51b6514c74c06f498
> Compiled by root on 2018-09-08T10:22Z
> Compiled with protoc 2.5.0
> From source with checksum c5bbb375aed8edabd89c377af83189d
> This command was run using 
> /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT.jar{noformat}
>  
> scm log :
> ---
> {noformat}
> 2018-09-09 15:45:00,907 INFO 
> org.apache.hadoop.hdds.scm.pipelines.ratis.RatisManagerImpl: Allocating a new 
> ratis pipeline of size: 3 id: pipelineId=f210716d-ba7b-4adf-91d6-da286e5fd010
> 2018-09-09 15:45:00,973 INFO org.apache.ratis.conf.ConfUtils: raft.rpc.type = 
> GRPC (default)
> 2018-09-09 15:45:01,007 INFO org.apache.ratis.conf.ConfUtils: 
> raft.grpc.message.size.max = 33554432 (custom)
> 2018-09-09 15:45:01,011 INFO org.apache.ratis.conf.ConfUtils: 
> raft.client.rpc.retryInterval = 300 ms (default)
> 2018-09-09 15:45:01,012 INFO org.apache.ratis.conf.ConfUtils: 
> raft.client.async.outstanding-requests.max = 100 (default)
> 2018-09-09 15:45:01,012 INFO org.apache.ratis.conf.ConfUtils: 
> raft.client.async.scheduler-threads = 3 (default)
> 2018-09-09 15:45:01,020 INFO org.apache.ratis.conf.ConfUtils: 
> raft.grpc.flow.control.window = 1MB (=1048576) (default)
> 2018-09-09 15:45:01,020 INFO org.apache.ratis.conf.ConfUtils: 
> raft.grpc.message.size.max = 33554432 (custom)
> 2018-09-09 15:45:01,102 INFO org.apache.ratis.conf.ConfUtils: 
> raft.client.rpc.request.timeout = 3000 ms (default)
> 2018-09-09 15:45:01,667 ERROR org.apache.hadoop.hdds.scm.XceiverClientRatis: 
> Failed to reinitialize 
> RaftPeer:bfe9c5f2-da9b-4a8f-9013-7540cbbed1c9:172.27.12.96:9858 datanode: 
> bfe9c5f2-da9b-4a8f-9013-7540cbbed1c9{ip: 172.27.12.96, host: 
> ctr-e138-1518143905142-468367-01-07.hwx.site}
> org.apache.ratis.protocol.GroupMismatchException: 
> bfe9c5f2-da9b-4a8f-9013-7540cbbed1c9: The group (group-7347726F7570) of 
> client-409D68EB500F does not match the group (group-2041ABBEE452) of the 
> server bfe9c5f2-da9b-4a8f-9013-7540cbbed1c9
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.ratis.util.ReflectionUtils.instantiateException(ReflectionUtils.java:222)
>  at 
> org.apache.ratis.grpc.RaftGrpcUtil.tryUnwrapException(RaftGrpcUtil.java:79)
>  at org.apache.ratis.grpc.RaftGrpcUtil.unwrapException(RaftGrpcUtil.java:67)
>  at 
> org.apache.ratis.grpc.client.RaftClientProtocolClient.blockingCall(RaftClientProtocolClient.java:127)
>  at 
> 

[jira] [Commented] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609700#comment-16609700
 ] 

Hudson commented on HDDS-421:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14915 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14915/])
HDDS-421. Resilient DNS resolution in datanode-service. Contributed by (elek: 
rev 317f317d4b9f8db4b55039227c7e13baac337544)
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/datanode/InitDatanodeState.java


> Resilient DNS resolution in datanode-service
> 
>
> Key: HDDS-421
> URL: https://issues.apache.org/jira/browse/HDDS-421
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-421-ozone-0.2.1.001.patch
>
>
> When I start big clusters on Kubernetes I get a very typical error:
> if the DNS entry for the SCM is not yet available during the bootup of the 
> datanode, the datanode won't connect to the SCM. It tries to reconnect, but 
> the DNS resolution is not repeated.
> The problem is in InitDatanodeState.call(). It calls getSCMAddresses, 
> which creates the InetSocketAddress-es using the Hadoop utilities. 
> During the creation of the InetSocketAddress, the Hadoop utilities try to 
> resolve the address and save the result in the InetSocketAddress.
> The address could be unresolved, but InitDatanodeState.call will start to 
> use it (connectionManager.addSCMServer) and there won't be any attempt to 
> resolve it later.
> My small proposal is to return immediately if any of the SCM addresses is 
> unresolved, so the main loop of the DatanodeStateMachine will try again 
> (together with the DNS resolution part).
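
A minimal sketch of the proposed early return, assuming the surrounding 
InitDatanodeState.call() structure (the logger, conf, and the returned state 
here are illustrative, not the actual patch):

{code:java}
// If any SCM address is still unresolved, bail out for this cycle so the
// DatanodeStateMachine main loop retries, repeating the DNS resolution too.
Collection<InetSocketAddress> addresses = getSCMAddresses(conf);
for (InetSocketAddress addr : addresses) {
  if (addr.isUnresolved()) {
    LOG.warn("SCM address {} is unresolved; will retry in the next cycle", addr);
    return this.context.getState(); // assumed: remain in the current state
  }
}
{code}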



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13778) In TestStateAlignmentContextWithHA replace artificial AlignmentContextProxyProvider with real ObserverReadProxyProvider.

2018-09-10 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609698#comment-16609698
 ] 

Konstantin Shvachko commented on HDFS-13778:


As I said, the runtime of {{testMultiClientStatesWithRandomFailovers()}} is not 
very deterministic. According to the logs, Jenkins could not create all 300 
files for each of 10 threads in 5 minutes and timed out. I reduced {{NUMFILES}} 
from 300 to 120. This should let the test finish on Jenkins, but it is not 
guaranteed that the failover happens at least twice, which is desired.

> In TestStateAlignmentContextWithHA replace artificial 
> AlignmentContextProxyProvider with real ObserverReadProxyProvider.
> 
>
> Key: HDFS-13778
> URL: https://issues.apache.org/jira/browse/HDFS-13778
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HDFS-13778-HDFS-12943.001.patch, 
> HDFS-13778-HDFS-12943.002.patch
>
>
> TestStateAlignmentContextWithHA uses an artificial 
> AlignmentContextProxyProvider, which was temporarily needed for testing. Now 
> that we have the real ObserverReadProxyProvider, it can take over from ACPP. 
> This is also useful for testing the ORPP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13778) In TestStateAlignmentContextWithHA replace artificial AlignmentContextProxyProvider with real ObserverReadProxyProvider.

2018-09-10 Thread Konstantin Shvachko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-13778:
---
Attachment: HDFS-13778-HDFS-12943.002.patch

> In TestStateAlignmentContextWithHA replace artificial 
> AlignmentContextProxyProvider with real ObserverReadProxyProvider.
> 
>
> Key: HDFS-13778
> URL: https://issues.apache.org/jira/browse/HDFS-13778
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HDFS-13778-HDFS-12943.001.patch, 
> HDFS-13778-HDFS-12943.002.patch
>
>
> TestStateAlignmentContextWithHA uses an artificial 
> AlignmentContextProxyProvider, which was temporarily needed for testing. Now 
> that we have the real ObserverReadProxyProvider, it can take over from ACPP. 
> This is also useful for testing the ORPP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-10 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-421:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Resilient DNS resolution in datanode-service
> 
>
> Key: HDDS-421
> URL: https://issues.apache.org/jira/browse/HDDS-421
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-421-ozone-0.2.1.001.patch
>
>
> When I start big clusters on Kubernetes I get a very typical error:
> if the DNS entry for the SCM is not yet available during the bootup of the 
> datanode, the datanode won't connect to the SCM. It tries to reconnect, but 
> the DNS resolution is not repeated.
> The problem is in InitDatanodeState.call(). It calls getSCMAddresses, 
> which creates the InetSocketAddress-es using the Hadoop utilities. 
> During the creation of the InetSocketAddress, the Hadoop utilities try to 
> resolve the address and save the result in the InetSocketAddress.
> The address could be unresolved, but InitDatanodeState.call will start to 
> use it (connectionManager.addSCMServer) and there won't be any attempt to 
> resolve it later.
> My small proposal is to return immediately if any of the SCM addresses is 
> unresolved, so the main loop of the DatanodeStateMachine will try again 
> (together with the DNS resolution part).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609690#comment-16609690
 ] 

Hadoop QA commented on HDFS-12478:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-12090 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  6m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
39s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  
2s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
36s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
19s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
31s{color} | {color:green} HDFS-12090 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
41s{color} | {color:green} HDFS-12090 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 37s{color} | {color:orange} root: The patch generated 113 new + 1072 
unchanged - 9 fixed = 1185 total (was 1081) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
55s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 2 new 
+ 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
36s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client generated 1 new 
+ 0 unchanged - 0 fixed = 1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
43s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 15s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
44s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
38s{color} | {color:green} hadoop-fs2img in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
47s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}236m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
|  |  

[jira] [Updated] (HDDS-425) Move unit test of the genconf tool to hadoop-ozone/tools module

2018-09-10 Thread Dinesh Chitlangia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Chitlangia updated HDDS-425:
---
Attachment: HDDS-425.001.patch
Status: Patch Available  (was: Open)

> Move unit test of the genconf tool to hadoop-ozone/tools module
> ---
>
> Key: HDDS-425
> URL: https://issues.apache.org/jira/browse/HDDS-425
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Minor
> Fix For: 0.3.0
>
> Attachments: HDDS-425.001.patch
>
>
> Based on a review comment from [~elek] in HDDS-417, this Jira proposes to move 
> the unit test of the genconf tool to the hadoop-ozone/tools module. It doesn't 
> require a MiniOzone cluster, so it shouldn't be in the integration test module.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider using UGI at creation time for consistent UGI handling

2018-09-10 Thread Zsolt Venczel (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609685#comment-16609685
 ] 

Zsolt Venczel commented on HDFS-13697:
--

In my latest patch I fixed the TestEncryptionZonesWithKMS failure.

With the latest patch (11), all of the above failed tests have passed:
{code}
[INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-hdfs ---
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
[INFO] Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.35 s 
- in org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
[INFO] Running org.apache.hadoop.hdfs.TestRollingUpgrade
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 143.89 
s - in org.apache.hadoop.hdfs.TestRollingUpgrade
[INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 70.541 
s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Running org.apache.hadoop.hdfs.server.balancer.TestBalancer
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 366.403 
s - in org.apache.hadoop.hdfs.server.balancer.TestBalancer
[INFO] Running org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
[INFO] Tests run: 45, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 129.862 
s - in org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 117, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
{code}
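
For context on the underlying fix, here is a minimal sketch of the doAs 
wrapping that the issue description below calls out as missing; the 
surrounding names are illustrative, not the actual patch:

{code:java}
// Capture the UGI (including any proxy user) when the client is created,
// and run KMS calls inside doAs so that UGI.getCurrentUser() inside
// KMSClientProvider resolves to the same (proxy) user later on.
final UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
KeyProvider.KeyVersion decrypted = ugi.doAs(
    (PrivilegedExceptionAction<KeyProvider.KeyVersion>) () ->
        keyProviderCryptoExtension.decryptEncryptedKey(encryptedKeyVersion));
{code}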

> DFSClient should instantiate and cache KMSClientProvider using UGI at 
> creation time for consistent UGI handling
> ---
>
> Key: HDFS-13697
> URL: https://issues.apache.org/jira/browse/HDFS-13697
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, 
> HDFS-13697.03.patch, HDFS-13697.04.patch, HDFS-13697.05.patch, 
> HDFS-13697.06.patch, HDFS-13697.07.patch, HDFS-13697.08.patch, 
> HDFS-13697.09.patch, HDFS-13697.10.patch, HDFS-13697.11.patch, 
> HDFS-13697.12.patch, HDFS-13697.prelim.patch
>
>
> While calling KeyProviderCryptoExtension decryptEncryptedKey, the call stack 
> might not have a doAs privileged execution call (in the DFSClient, for example). 
> This results in losing the proxy user from the UGI, as UGI.getCurrentUser finds 
> no AccessControllerContext and does a re-login for the login user only.
> This can cause the following, for example: if we have set up the oozie user to 
> be entitled to perform actions on behalf of example_user, but oozie is 
> forbidden to decrypt any EDEK (for security reasons), then due to the above 
> issue the example_user entitlements are lost from the UGI and the following 
> error is reported:
> {code}
> [0] 
> SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] 
> JOB[0020905-180313191552532-oozie-oozi-W] 
> ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting 
> action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message 
> [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with 
> ACL name [encrypted_key]!!]
> org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not 
> authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!!
>  at 
> org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
>  at 
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
>  at org.apache.oozie.command.XCommand.call(XCommand.java:286)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>  at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> 

[jira] [Commented] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609684#comment-16609684
 ] 

Ajay Kumar commented on HDDS-427:
-

getContainerReplicas can be used by multiple callers to check the replicas for 
a given container. Not all of them may view a container with no replicas as 
something that warrants an exception from the API itself; in some of those 
cases it is a perfectly normal situation. For example, during SCM chill mode a 
container might not have any replica reported initially. Similarly, during 
container report handling, if no replicas are found for a given container, 
ContainerReportHandler should log the message and move on, as 
ReplicationManager can't do anything in that specific case.
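
A sketch of the proposed contract (the types and the backing map are assumed 
for illustration, not the actual HDDS code):

{code:java}
// Return an empty map instead of throwing when a container has no replicas;
// each caller then decides whether "no replicas" violates its own invariants.
public Map<UUID, DatanodeDetails> getContainerReplicas(ContainerID containerID) {
  Map<UUID, DatanodeDetails> replicas = containerReplicaMap.get(containerID);
  return replicas == null ? Collections.emptyMap() : replicas;
}
{code}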

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed to 
> decide if it is a violation of some condition (when there are no replicas, 
> warranting an exception) instead of it being implicit for every caller. 
> Instead, ContainerStateMap#getContainerReplicas should return an empty map 
> when there are no replicas. Callers can decide whether no replicas for a given 
> container is a violation of a business condition. Throwing an exception in 
> this API also results in a subtle bug in ContainerReportHandler, where an 
> exception for one container may result in the whole container list being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider using UGI at creation time for consistent UGI handling

2018-09-10 Thread Zsolt Venczel (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zsolt Venczel updated HDFS-13697:
-
Attachment: HDFS-13697.12.patch

> DFSClient should instantiate and cache KMSClientProvider using UGI at 
> creation time for consistent UGI handling
> ---
>
> Key: HDFS-13697
> URL: https://issues.apache.org/jira/browse/HDFS-13697
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, 
> HDFS-13697.03.patch, HDFS-13697.04.patch, HDFS-13697.05.patch, 
> HDFS-13697.06.patch, HDFS-13697.07.patch, HDFS-13697.08.patch, 
> HDFS-13697.09.patch, HDFS-13697.10.patch, HDFS-13697.11.patch, 
> HDFS-13697.12.patch, HDFS-13697.prelim.patch
>
>
> While calling KeyProviderCryptoExtension decryptEncryptedKey, the call stack 
> might not have a doAs privileged execution call (in the DFSClient, for example). 
> This results in losing the proxy user from the UGI, as UGI.getCurrentUser finds 
> no AccessControllerContext and does a re-login for the login user only.
> This can cause the following, for example: if we have set up the oozie user to 
> be entitled to perform actions on behalf of example_user, but oozie is 
> forbidden to decrypt any EDEK (for security reasons), then due to the above 
> issue the example_user entitlements are lost from the UGI and the following 
> error is reported:
> {code}
> [0] 
> SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] 
> JOB[0020905-180313191552532-oozie-oozi-W] 
> ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting 
> action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message 
> [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with 
> ACL name [encrypted_key]!!]
> org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not 
> authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!!
>  at 
> org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
>  at 
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
>  at org.apache.oozie.command.XCommand.call(XCommand.java:286)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>  at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User 
> [oozie] is not authorized to perform [DECRYPT_EEK] on key with ACL name 
> [encrypted_key]!!
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>  at 
> org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157)
>  at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:607)
>  at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:565)
>  at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:832)
>  at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:209)
>  at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:205)
>  at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
>  at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:205)
>  at 
> 

[jira] [Commented] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609678#comment-16609678
 ] 

Anu Engineer commented on HDDS-427:
---

I am sorry, but I don't understand this problem correctly. Can you give us some 
use cases?

 

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed to 
> decide if it is a violation of some condition (when there are no replicas, 
> warranting an exception) instead of it being implicit for every caller. 
> Instead, ContainerStateMap#getContainerReplicas should return an empty map 
> when there are no replicas. Callers can decide whether no replicas for a given 
> container is a violation of a business condition. Throwing an exception in 
> this API also results in a subtle bug in ContainerReportHandler, where an 
> exception for one container may result in the whole container list being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-10 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-421:
--
Fix Version/s: 0.3.0

> Resilient DNS resolution in datanode-service
> 
>
> Key: HDDS-421
> URL: https://issues.apache.org/jira/browse/HDDS-421
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-421-ozone-0.2.1.001.patch
>
>
> When I start big clusters on Kubernetes I get a very typical error:
> if the DNS entry for the SCM is not yet available during the bootup of the 
> datanode, the datanode won't connect to the SCM. It tries to reconnect, but 
> the DNS resolution is not repeated.
> The problem is in InitDatanodeState.call(). It calls getSCMAddresses, 
> which creates the InetSocketAddress-es using the Hadoop utilities. 
> During the creation of the InetSocketAddress, the Hadoop utilities try to 
> resolve the address and save the result in the InetSocketAddress.
> The address could be unresolved, but InitDatanodeState.call will start to 
> use it (connectionManager.addSCMServer) and there won't be any attempt to 
> resolve it later.
> My small proposal is to return immediately if any of the SCM addresses is 
> unresolved, so the main loop of the DatanodeStateMachine will try again 
> (together with the DNS resolution part).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-10 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-421:
--
Summary: Resilient DNS resolution in datanode-service  (was: Resilient DNS 
resolution in datanode-service )

> Resilient DNS resolution in datanode-service
> 
>
> Key: HDDS-421
> URL: https://issues.apache.org/jira/browse/HDDS-421
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-421-ozone-0.2.1.001.patch
>
>
> When I start big clusters on Kubernetes I get a very typical error:
> if the DNS entry for the SCM is not yet available during the bootup of the 
> datanode, the datanode won't connect to the SCM. It tries to reconnect, but 
> the DNS resolution is not repeated.
> The problem is in InitDatanodeState.call(). It calls getSCMAddresses, 
> which creates the InetSocketAddress-es using the Hadoop utilities. 
> During the creation of the InetSocketAddress, the Hadoop utilities try to 
> resolve the address and save the result in the InetSocketAddress.
> The address could be unresolved, but InitDatanodeState.call will start to 
> use it (connectionManager.addSCMServer) and there won't be any attempt to 
> resolve it later.
> My small proposal is to return immediately if any of the SCM addresses is 
> unresolved, so the main loop of the DatanodeStateMachine will try again 
> (together with the DNS resolution part).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609670#comment-16609670
 ] 

Hadoop QA commented on HDFS-13237:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
32m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13237 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939131/HDFS-13237.004.patch |
| Optional Tests |  dupname  asflicense  mvnsite  |
| uname | Linux 83f2bcbe27e1 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8fe4062 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 336 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25023/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-427:

Description: Callers of the API ContainerStateMap#getContainerReplicas should 
be allowed to decide if it is a violation of some condition (when there are no 
replicas, warranting an exception) instead of it being implicit for every 
caller. Instead, ContainerStateMap#getContainerReplicas should return an empty 
map when there are no replicas. Callers can decide whether no replicas for a 
given container is a violation of a business condition. Throwing an exception 
in this API also results in a subtle bug in ContainerReportHandler, where an 
exception for one container may result in the whole container list being 
skipped.  (was: Callers of an API ContainerStateMap#getContainerReplicas 
should be allowed to decide if it is a violation of some condition (warranting 
an exception) instead of it being implicit for every caller. 
ContainerStateMap#getContainerReplicas should return an empty map. Callers can 
decide whether no replicas for a given container is a violation of a business 
condition. Throwing an exception in this API also results in a subtle bug in 
ContainerReportHandler, where an exception for one container may result in the 
whole container list being skipped.)

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>
> Callers of the API ContainerStateMap#getContainerReplicas should be allowed to 
> decide if it is a violation of some condition (when there are no replicas, 
> warranting an exception) instead of it being implicit for every caller. 
> Instead, ContainerStateMap#getContainerReplicas should return an empty map 
> when there are no replicas. Callers can decide whether no replicas for a given 
> container is a violation of a business condition. Throwing an exception in 
> this API also results in a subtle bug in ContainerReportHandler, where an 
> exception for one container may result in the whole container list being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)
Ajay Kumar created HDDS-427:
---

 Summary: For containers with no replica 
ContainerStateMap#getContainerReplicas should return empty map
 Key: HDDS-427
 URL: https://issues.apache.org/jira/browse/HDDS-427
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Ajay Kumar


Callers of an API like ContainerStateMap#getContainerReplicas should be allowed 
to decide if it is a violation of some condition (warranting an exception) 
instead of it being implicit for every caller. 
ContainerStateMap#getContainerReplicas should return an empty map. Callers can 
decide whether no replicas for a given container is a violation of a business 
condition. Throwing an exception in this API also results in a subtle bug in 
ContainerReportHandler, where an exception for one container may result in the 
whole container list being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-427) For containers with no replica ContainerStateMap#getContainerReplicas should return empty map

2018-09-10 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar reassigned HDDS-427:
---

Assignee: Ajay Kumar

> For containers with no replica ContainerStateMap#getContainerReplicas should 
> return empty map
> -
>
> Key: HDDS-427
> URL: https://issues.apache.org/jira/browse/HDDS-427
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>
> Callers of an API like ContainerStateMap#getContainerReplicas should be allowed 
> to decide if it is a violation of some condition (warranting an exception) 
> instead of it being implicit for every caller. 
> ContainerStateMap#getContainerReplicas should return an empty map. Callers can 
> decide whether no replicas for a given container is a violation of a business 
> condition. Throwing an exception in this API also results in a subtle bug in 
> ContainerReportHandler, where an exception for one container may result in the 
> whole container list being skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-10 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609652#comment-16609652
 ] 

Elek, Marton commented on HDDS-421:
---

Thanks [~anu] for the review. I will take care of the committing part and 
commit it soon...

> Resilient DNS resolution in datanode-service 
> -
>
> Key: HDDS-421
> URL: https://issues.apache.org/jira/browse/HDDS-421
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-421-ozone-0.2.1.001.patch
>
>
> When I start big clusters on Kubernetes I get a very typical error:
> if the DNS entry for the SCM is not yet available during the bootup of the 
> datanode, the datanode won't connect to the SCM. It tries to reconnect, but 
> the DNS resolution is not repeated.
> The problem is in InitDatanodeState.call(). It calls getSCMAddresses, 
> which creates the InetSocketAddress-es using the Hadoop utilities. 
> During the creation of the InetSocketAddress, the Hadoop utilities try to 
> resolve the address and save the result in the InetSocketAddress.
> The address could be unresolved, but InitDatanodeState.call will start to 
> use it (connectionManager.addSCMServer) and there won't be any attempt to 
> resolve it later.
> My small proposal is to return immediately if any of the SCM addresses is 
> unresolved, so the main loop of the DatanodeStateMachine will try again 
> (together with the DNS resolution part).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Brahma Reddy Battula (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609635#comment-16609635
 ] 

Brahma Reddy Battula commented on HDFS-13237:
-

Latest patch LGTM. It looks like one of Wei Yan's comments is not addressed 
("RANDOM can balance both READ and WRITE workload across subcusters, not just 
reading workload, right?")

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-362) Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol

2018-09-10 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609566#comment-16609566
 ] 

Ajay Kumar edited comment on HDDS-362 at 9/10/18 6:17 PM:
--

I think the map in {{ChillModeRestrictedOps}} should be renamed to 
restrictedOps. I will change it along with the review comments.


was (Author: ajayydv):
I think the map in {{ChillModeRestrictedOps}} should be renamed to restrictedOps.

> Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol
> ---
>
> Key: HDDS-362
> URL: https://issues.apache.org/jira/browse/HDDS-362
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-362.00.patch, HDDS-362.01.patch
>
>
> Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-426) Add field modificationTime for Volume and Bucket

2018-09-10 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609629#comment-16609629
 ] 

Xiaoyu Yao commented on HDDS-426:
-

Thanks [~dineshchitlangia] for reporting the issue. The reason we don't want to 
track the modification time at the volume/bucket level is that the cost might 
be too high (lock contention, metadata updates, etc.). If we really want to 
enable this, we should make it optional and off by default.
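
If this were ever enabled, a gating sketch might look like the following; the 
configuration key and the setter are hypothetical, invented purely for 
illustration:

{code:java}
// Hypothetical opt-in flag, off by default to avoid the lock/metadata cost.
boolean trackModificationTime = conf.getBoolean(
    "ozone.om.volume.bucket.modification.time.enabled", false);
if (trackModificationTime) {
  volumeArgs.setModificationTime(Time.now()); // assumed setter on the args object
}
{code}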

> Add field modificationTime for Volume and Bucket
> 
>
> Key: HDDS-426
> URL: https://issues.apache.org/jira/browse/HDDS-426
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
> Fix For: 0.3.0
>
>
> There are update operations that can be performed on Volumes, Buckets and Keys.
> While Key records the modification time, Volume and Bucket do not capture 
> this.
>  
> This Jira proposes to add the required field to Volume and Bucket in order to 
> capture the modificationTime.
> Current Status:
> {noformat}
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoVolume /dummyvol
> 2018-09-10 17:16:12 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
>   "owner" : {
>     "name" : "bilbo"
>   },
>   "quota" : {
>     "unit" : "TB",
>     "size" : 1048576
>   },
>   "volumeName" : "dummyvol",
>   "createdOn" : "Mon, 10 Sep 2018 17:11:32 GMT",
>   "createdBy" : "bilbo"
> }
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoBucket /dummyvol/mybuck
> 2018-09-10 17:15:25 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
>   "volumeName" : "dummyvol",
>   "bucketName" : "mybuck",
>   "createdOn" : "Mon, 10 Sep 2018 17:12:09 GMT",
>   "acls" : [ {
>     "type" : "USER",
>     "name" : "hadoop",
>     "rights" : "READ_WRITE"
>   }, {
>     "type" : "GROUP",
>     "name" : "users",
>     "rights" : "READ_WRITE"
>   }, {
>     "type" : "USER",
>     "name" : "spark",
>     "rights" : "READ_WRITE"
>   } ],
>   "versioning" : "DISABLED",
>   "storageType" : "DISK"
> }
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoKey /dummyvol/mybuck/myk1
> 2018-09-10 17:19:43 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
>   "version" : 0,
>   "md5hash" : null,
>   "createdOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
>   "modifiedOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
>   "size" : 0,
>   "keyName" : "myk1",
>   "keyLocations" : [ ]
> }{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-419) ChunkInputStream bulk read api does not read from all the chunks

2018-09-10 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609619#comment-16609619
 ] 

Xiaoyu Yao edited comment on HDDS-419 at 9/10/18 6:08 PM:
--

Thanks [~msingh] for reporting the issue and posting the fix. The fix looks 
good with the new test added, but it hides a latent bug in 
ChunkGroupInputStream#read(). I propose to fix it there as below, without 
changing OzoneInputStream; let me know your thoughts.
{code:java}
--- a/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ChunkGroupInputStream.java
+++ b/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ChunkGroupInputStream.java
@@ -122,12 +122,8 @@ public synchronized int read(byte[] b, int off, int len) throws IOException {
         return totalReadLen > 0 ? totalReadLen : EOF;
       }
       totalReadLen += actualLen;
-      // this means there is no more data to read beyond this point, return
-      if (actualLen != readLen) {
-        return totalReadLen;
-      }
-      off += readLen;
-      len -= readLen;
+      off += actualLen;
+      len -= actualLen;
       if (current.getRemaining() <= 0) {
         currentStreamIndex += 1;
       }
{code}


was (Author: xyao):
Thanks [~msingh] for reporting the issue and posting the fix. The fix looks 
good with the new test added, but it hides a latent bug in 
ChunkGroupInputStream#read(). I propose to fix it there as below; let me know 
your thoughts.

{code}
--- a/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ChunkGroupInputStream.java
+++ b/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ChunkGroupInputStream.java
@@ -122,12 +122,8 @@ public synchronized int read(byte[] b, int off, int len) throws IOException {
         return totalReadLen > 0 ? totalReadLen : EOF;
       }
       totalReadLen += actualLen;
-      // this means there is no more data to read beyond this point, return
-      if (actualLen != readLen) {
-        return totalReadLen;
-      }
-      off += readLen;
-      len -= readLen;
+      off += actualLen;
+      len -= actualLen;
       if (current.getRemaining() <= 0) {
         currentStreamIndex += 1;
       }
{code}

> ChunkInputStream bulk read api does not read from all the chunks
> 
>
> Key: HDDS-419
> URL: https://issues.apache.org/jira/browse/HDDS-419
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-419.001.patch
>
>
> After enabling of bulk reads with HDDS-408, testDataValidate started failing 
> because the bulk read api does not read all the chunks from the block.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-419) ChunkInputStream bulk read api does not read from all the chunks

2018-09-10 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609619#comment-16609619
 ] 

Xiaoyu Yao commented on HDDS-419:
-

Thanks [~msingh] for reporting the issue and posting the fix. The fix looks 
good with the new test added, but it hides a latent bug in 
ChunkGroupInputStream#read(). I propose to fix it there as below; let me know 
your thoughts.

{code}
--- a/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ChunkGroupInputStream.java
+++ b/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ChunkGroupInputStream.java
@@ -122,12 +122,8 @@ public synchronized int read(byte[] b, int off, int len) throws IOException {
         return totalReadLen > 0 ? totalReadLen : EOF;
       }
       totalReadLen += actualLen;
-      // this means there is no more data to read beyond this point, return
-      if (actualLen != readLen) {
-        return totalReadLen;
-      }
-      off += readLen;
-      len -= readLen;
+      off += actualLen;
+      len -= actualLen;
       if (current.getRemaining() <= 0) {
         currentStreamIndex += 1;
       }
{code}

> ChunkInputStream bulk read api does not read from all the chunks
> 
>
> Key: HDDS-419
> URL: https://issues.apache.org/jira/browse/HDDS-419
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-419.001.patch
>
>
> After enabling of bulk reads with HDDS-408, testDataValidate started failing 
> because the bulk read api does not read all the chunks from the block.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread Brahma Reddy Battula (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13237:

Parent Issue: HDFS-13891  (was: HDFS-12615)

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609609#comment-16609609
 ] 

Íñigo Goiri commented on HDFS-13237:


Thanks [~anu], we will track the new features in branches starting with 
HDFS-13891.
I updated the patch to resolve the conflicts.

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch
>
>
> Document the feature to spread mount points across multiple subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13577) RBF: Failed mount point operations, returns wrong exit code.

2018-09-10 Thread Brahma Reddy Battula (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13577:

Parent Issue: HDFS-13891  (was: HDFS-12615)

> RBF: Failed mount point operations, returns wrong exit code.
> 
>
> Key: HDFS-13577
> URL: https://issues.apache.org/jira/browse/HDFS-13577
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Y. SREENIVASULU REDDY
>Assignee: Dibyendu Karmakar
>Priority: Major
>  Labels: RBF
>
> If a client tries to add a mount point containing some special characters, the 
> mount point add fails
> and prints a message like
> {noformat}
> 18/05/17 09:58:34 DEBUG ipc.ProtobufRpcEngine: Call: addMountTableEntry took 
> 19ms Cannot add mount point /testSpecialCharMountPointCreation/test/
> {noformat}
> In the above case it should return a non-zero exit code.
> {code:java|title=RouterAdmin.java|borderStyle=solid}
> Exception debugException = null;
> exitCode = 0;
> try {
> if ("-add".equals(cmd)) {
> if (addMount(argv, i)) {
> System.out.println("Successfully added mount point " + argv[i]);
> }
> {code}
> We should handle these kinds of cases as well.
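
A minimal sketch of the suggested handling, extending the snippet above (the 
else branch and message are illustrative):

{code:java}
if ("-add".equals(cmd)) {
  if (addMount(argv, i)) {
    System.out.println("Successfully added mount point " + argv[i]);
  } else {
    // Make the failure visible to scripts instead of silently keeping 0.
    System.err.println("Cannot add mount point " + argv[i]);
    exitCode = -1;
  }
}
{code}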



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-318) ratis INFO logs should not shown during ozoneFs command-line execution

2018-09-10 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609604#comment-16609604
 ] 

Tsz Wo Nicholas Sze commented on HDDS-318:
--

Thanks [~ljain] for checking it.  It seems better to use the log4j API directly 
instead of LogUtils.

> 1. What do you think about decrease the log level on the ratis side?  ...
[~elek], for the server side, showing the logs is useful.  I think Ratis could 
use different loggers for different confs.  Then we could enable the server log 
and disable the client log by default.

I have filed RATIS-314.  If we can get it committed soon, we won't need to 
change anything in Ozone.
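
Until that lands, a client-side workaround sketch using the log4j 1.x API 
directly (the logger name is assumed to cover the Ratis client classes):

{code:java}
// Raise the Ratis log threshold for CLI runs; WARN and above still show.
org.apache.log4j.Logger.getLogger("org.apache.ratis")
    .setLevel(org.apache.log4j.Level.WARN);
{code}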

> ratis INFO logs should not shown during ozoneFs command-line execution
> --
>
> Key: HDDS-318
> URL: https://issues.apache.org/jira/browse/HDDS-318
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Nilotpal Nandi
>Assignee: Tsz Wo Nicholas Sze
>Priority: Blocker
>  Labels: newbie
> Fix For: 0.2.1
>
> Attachments: HDDS-318.20180907.patch
>
>
> ratis INFO logs should not be shown during ozoneFS CLI execution.
> Please find a snippet from one of the executions:
>  
> {noformat}
> hadoop@08315aa4b367:~/bin$ ./ozone fs -put /etc/passwd /p2
> 2018-08-02 12:17:18 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.rpc.type = GRPC (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.grpc.message.size.max = 33554432 
> (custom)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.client.rpc.retryInterval = 300 
> ms (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - 
> raft.client.async.outstanding-requests.max = 100 (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.client.async.scheduler-threads = 
> 3 (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.grpc.flow.control.window = 1MB 
> (=1048576) (default)
> 2018-08-02 12:17:19 INFO ConfUtils:41 - raft.grpc.message.size.max = 33554432 
> (custom)
> 2018-08-02 12:17:20 INFO ConfUtils:41 - raft.client.rpc.request.timeout = 
> 3000 ms (default)
> Aug 02, 2018 12:17:20 PM 
> org.apache.ratis.shaded.io.grpc.internal.ProxyDetectorImpl detectProxy
> WARNING: Failed to construct URI for proxy lookup, proceeding without proxy
> ..
> ..
> ..
>  
> {noformat}
>  






[jira] [Updated] (HDFS-13237) [Documentation] RBF: Mount points across multiple subclusters

2018-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13237:
---
Attachment: HDFS-13237.004.patch

> [Documentation] RBF: Mount points across multiple subclusters
> -
>
> Key: HDFS-13237
> URL: https://issues.apache.org/jira/browse/HDFS-13237
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Minor
> Attachments: HDFS-13237.000.patch, HDFS-13237.001.patch, 
> HDFS-13237.002.patch, HDFS-13237.003.patch, HDFS-13237.004.patch
>
>
> Document the feature to spread mount points across multiple subclusters.






[jira] [Comment Edited] (HDDS-413) Ozone freon help needs the Scm and OM running

2018-09-10 Thread Dinesh Chitlangia (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609506#comment-16609506
 ] 

Dinesh Chitlangia edited comment on HDDS-413 at 9/10/18 5:55 PM:
-

[~elek] - I just verified this and was unable to reproduce the issue. I 
believe it was fixed by HDDS-190, the picocli patch. I need your +1 to mark it 
resolved.
{noformat}
hadoop@1987b5de4203:~$ ./bin/ozone freon -help
Unknown option: -elp
Usage: ozone freon [-hV] [--verbose] [-D=]... [COMMAND]
Load generator and tester tool for ozone
--verbose More verbose output. Show the stack trace of the errors.
-D, --set=

-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
randomkeys, rk Generate volumes/buckets and put generated keys.

hadoop@1987b5de4203:~$ ./bin/ozone freon -h
Usage: ozone freon [-hV] [--verbose] [-D=]... [COMMAND]
Load generator and tester tool for ozone
--verbose More verbose output. Show the stack trace of the errors.
-D, --set=

-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
randomkeys, rk Generate volumes/buckets and put generated keys.

hadoop@1987b5de4203:~$ ./bin/ozone freon --help
Usage: ozone freon [-hV] [--verbose] [-D=]... [COMMAND]
Load generator and tester tool for ozone
--verbose More verbose output. Show the stack trace of the errors.
-D, --set=

-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
randomkeys, rk Generate volumes/buckets and put generated keys.
hadoop@1987b5de4203:~$

{noformat}


was (Author: dineshchitlangia):
[~elek] - I just verified this and was unable to reproduce the issue. I 
believe it was fixed by HDDS-190, the picocli patch. I need your +1 to mark it 
resolved.
{noformat}
hadoop@1987b5de4203:~$ ./bin/ozone freon -help
Unknown option: -elp
Usage: ozone freon [-hV] [--verbose] [-D=]... [COMMAND]
Load generator and tester tool for ozone
--verbose More verbose output. Show the stack trace of the errors.
-D, --set=

-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
randomkeys, rk Generate volumes/buckets and put generated keys.
hadoop@1987b5de4203:~$ ./bin/ozone freon -h
Usage: ozone freon [-hV] [--verbose] [-D=]... [COMMAND]
Load generator and tester tool for ozone
--verbose More verbose output. Show the stack trace of the errors.
-D, --set=

-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
randomkeys, rk Generate volumes/buckets and put generated keys.
hadoop@1987b5de4203:~$

{noformat}
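
For reference, a minimal picocli sketch of why {{--help}} no longer needs 
SCM/OM after the picocli migration (class and option wiring are assumptions, 
not the actual Freon code):
{code:java}
import picocli.CommandLine;
import picocli.CommandLine.Command;

// Hypothetical sketch: picocli handles -h/--help/-V while parsing, so the
// command body (where SCM/OM connections would be opened) never runs for a
// help request.
@Command(name = "freon", mixinStandardHelpOptions = true,
    description = "Load generator and tester tool for ozone")
public class FreonHelpSketch implements Runnable {
  @Override
  public void run() {
    // Only reached for a real run; RPC clients would be created here.
  }

  public static void main(String[] args) {
    CommandLine.run(new FreonHelpSketch(), args);
  }
}
{code}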

> Ozone freon help needs the Scm and OM running
> -
>
> Key: HDDS-413
> URL: https://issues.apache.org/jira/browse/HDDS-413
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Dinesh Chitlangia
>Priority: Major
> Fix For: 0.2.1
>
>
> Ozone freon help needs the Scm and OM running
> {code:java}
> ./ozone freon --help
> 2018-09-07 12:23:28,983 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 2018-09-07 12:23:30,203 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:9862. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2018-09-07 12:23:31,204 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:9862. Already tried 1 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> ^C⏎ 
> HW11767 ~/t/o/bin> jps
> 52445
> 86095 Jps{code}
> If Scm and Om are running, freon help works fine:
> {code:java}
> HW11767 ~/t/o/bin> /ozone freon --help
> 2018-09-07 12:30:18,535 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> Options supported are:
> -numOfThreads    number of threads to be launched for the run.
> -validateWrites do random validation of data written into 
> ozone, only subset of data is validated.
> -jsonDirdirectory where json is created.
> -mode [online | offline]specifies the mode in which Freon should run.
> -source    specifies the URL of s3 commoncrawl warc file 
> to be used when the mode is online.
> -numOfVolumes    specifies number of Volumes to be created in 
> offline mode
> -numOfBuckets    specifies number of Buckets to be created per 
> Volume in offline mode
> -numOfKeys   specifies number of Keys to be created per 
> Bucket in offline mode
> -keySize specifies the 

[jira] [Updated] (HDDS-426) Add field modificationTime for Volume and Bucket

2018-09-10 Thread Dinesh Chitlangia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Chitlangia updated HDDS-426:
---
Fix Version/s: 0.3.0

> Add field modificationTime for Volume and Bucket
> 
>
> Key: HDDS-426
> URL: https://issues.apache.org/jira/browse/HDDS-426
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
> Fix For: 0.3.0
>
>
> There are update operations that can be performed on Volume, Bucket and Key.
> While Key records the modification time, Volume and Bucket do not capture 
> this.
>  
> This Jira proposes to add the required field to Volume and Bucket in order to 
> capture the modificationTime.
>  
> Current Status:
> {noformat}
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoVolume /dummyvol
> 2018-09-10 17:16:12 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "owner" : {
> "name" : "bilbo"
> },
> "quota" : {
> "unit" : "TB",
> "size" : 1048576
> },
> "volumeName" : "dummyvol",
> "createdOn" : "Mon, 10 Sep 2018 17:11:32 GMT",
> "createdBy" : "bilbo"
> }
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoBucket /dummyvol/mybuck
> 2018-09-10 17:15:25 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "volumeName" : "dummyvol",
> "bucketName" : "mybuck",
> "createdOn" : "Mon, 10 Sep 2018 17:12:09 GMT",
> "acls" : [ {
> "type" : "USER",
> "name" : "hadoop",
> "rights" : "READ_WRITE"
> }, {
> "type" : "GROUP",
> "name" : "users",
> "rights" : "READ_WRITE"
> }, {
> "type" : "USER",
> "name" : "spark",
> "rights" : "READ_WRITE"
> } ],
> "versioning" : "DISABLED",
> "storageType" : "DISK"
> }
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoKey /dummyvol/mybuck/myk1
> 2018-09-10 17:19:43 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "version" : 0,
> "md5hash" : null,
> "createdOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
> "modifiedOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
> "size" : 0,
> "keyName" : "myk1",
> "keyLocations" : [ ]
> }{noformat}






[jira] [Work started] (HDFS-12615) Router-based HDFS federation phase 2

2018-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-12615 started by Íñigo Goiri.
--
> Router-based HDFS federation phase 2
> 
>
> Key: HDFS-12615
> URL: https://issues.apache.org/jira/browse/HDFS-12615
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>  Labels: RBF
>
> This umbrella JIRA tracks a set of improvements over Router-based HDFS 
> federation (HDFS-10467).






[jira] [Created] (HDDS-426) Add field modificationTime for Volume and Bucket

2018-09-10 Thread Dinesh Chitlangia (JIRA)
Dinesh Chitlangia created HDDS-426:
--

 Summary: Add field modificationTime for Volume and Bucket
 Key: HDDS-426
 URL: https://issues.apache.org/jira/browse/HDDS-426
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Manager
Reporter: Dinesh Chitlangia
Assignee: Dinesh Chitlangia


There are update operations that can be performed on Volume, Bucket and Key.

While Key records the modification time, Volume and Bucket do not capture 
this.

This Jira proposes to add the required field to Volume and Bucket in order to 
capture the modificationTime.


Current Status:
{noformat}
hadoop@1987b5de4203:~$ ./bin/ozone oz -infoVolume /dummyvol
2018-09-10 17:16:12 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
{
"owner" : {
"name" : "bilbo"
},
"quota" : {
"unit" : "TB",
"size" : 1048576
},
"volumeName" : "dummyvol",
"createdOn" : "Mon, 10 Sep 2018 17:11:32 GMT",
"createdBy" : "bilbo"
}

hadoop@1987b5de4203:~$ ./bin/ozone oz -infoBucket /dummyvol/mybuck
2018-09-10 17:15:25 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
{
"volumeName" : "dummyvol",
"bucketName" : "mybuck",
"createdOn" : "Mon, 10 Sep 2018 17:12:09 GMT",
"acls" : [ {
"type" : "USER",
"name" : "hadoop",
"rights" : "READ_WRITE"
}, {
"type" : "GROUP",
"name" : "users",
"rights" : "READ_WRITE"
}, {
"type" : "USER",
"name" : "spark",
"rights" : "READ_WRITE"
} ],
"versioning" : "DISABLED",
"storageType" : "DISK"
}

hadoop@1987b5de4203:~$ ./bin/ozone oz -infoKey /dummyvol/mybuck/myk1
2018-09-10 17:19:43 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
{
"version" : 0,
"md5hash" : null,
"createdOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
"modifiedOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
"size" : 0,
"keyName" : "myk1",
"keyLocations" : [ ]
}{noformat}
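
A minimal sketch of the proposed addition (class and method names are 
assumptions for illustration, not the actual Ozone Manager types):
{code:java}
// Hypothetical sketch: track modificationTime alongside creationTime and
// refresh it on every update operation.
public class VolumeArgsSketch {
  private final long creationTime;
  private long modificationTime;

  public VolumeArgsSketch() {
    this.creationTime = System.currentTimeMillis();
    this.modificationTime = this.creationTime;
  }

  // Called from update operations such as setOwner or setQuota.
  public void markModified() {
    this.modificationTime = System.currentTimeMillis();
  }

  public long getCreationTime() {
    return creationTime;
  }

  public long getModificationTime() {
    return modificationTime;
  }
}
{code}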






[jira] [Commented] (HDFS-13908) TestDataNodeMultipleRegistrations#testClusterIdMismatchAtStartupWithHA is flaky

2018-09-10 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609594#comment-16609594
 ] 

Íñigo Goiri commented on HDFS-13908:


There is a similar error here:
https://builds.apache.org/job/PreCommit-HADOOP-Build/15116/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testDNWithInvalidStorageWithHA/
This one is for testDNWithInvalidStorageWithHA.

> TestDataNodeMultipleRegistrations#testClusterIdMismatchAtStartupWithHA is 
> flaky
> ---
>
> Key: HDFS-13908
> URL: https://issues.apache.org/jira/browse/HDFS-13908
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Íñigo Goiri
>Priority: Major
>
> We have seen this issue in multiple runs:
> https://builds.apache.org/job/PreCommit-HADOOP-Build/15146/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testClusterIdMismatchAtStartupWithHA/






[jira] [Updated] (HDFS-13908) TestDataNodeMultipleRegistrations is flaky

2018-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13908:
---
Summary: TestDataNodeMultipleRegistrations is flaky  (was: 
TestDataNodeMultipleRegistrations#testClusterIdMismatchAtStartupWithHA is flaky)

> TestDataNodeMultipleRegistrations is flaky
> --
>
> Key: HDFS-13908
> URL: https://issues.apache.org/jira/browse/HDFS-13908
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Íñigo Goiri
>Priority: Major
>
> We have seen this issue in multiple runs:
> https://builds.apache.org/job/PreCommit-HADOOP-Build/15146/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testClusterIdMismatchAtStartupWithHA/






[jira] [Updated] (HDFS-13908) TestDataNodeMultipleRegistrations is flaky

2018-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13908:
---
Description: 
We have seen this issue in multiple runs:
https://builds.apache.org/job/PreCommit-HADOOP-Build/15146/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testClusterIdMismatchAtStartupWithHA/
https://builds.apache.org/job/PreCommit-HADOOP-Build/15116/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testDNWithInvalidStorageWithHA/

  was:
We have seen this issue in multiple runs:
https://builds.apache.org/job/PreCommit-HADOOP-Build/15146/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testClusterIdMismatchAtStartupWithHA/


> TestDataNodeMultipleRegistrations is flaky
> --
>
> Key: HDFS-13908
> URL: https://issues.apache.org/jira/browse/HDFS-13908
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Íñigo Goiri
>Priority: Major
>
> We have seen this issue in multiple runs:
> https://builds.apache.org/job/PreCommit-HADOOP-Build/15146/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testClusterIdMismatchAtStartupWithHA/
> https://builds.apache.org/job/PreCommit-HADOOP-Build/15116/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testDNWithInvalidStorageWithHA/






[jira] [Updated] (HDFS-13908) TestDataNodeMultipleRegistrations#testClusterIdMismatchAtStartupWithHA is flaky

2018-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13908:
---
Description: 
We have seen this issue in multiple runs:
https://builds.apache.org/job/PreCommit-HADOOP-Build/15146/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testClusterIdMismatchAtStartupWithHA/

  was:
We have seen this issue in multiple runs:



> TestDataNodeMultipleRegistrations#testClusterIdMismatchAtStartupWithHA is 
> flaky
> ---
>
> Key: HDFS-13908
> URL: https://issues.apache.org/jira/browse/HDFS-13908
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Íñigo Goiri
>Priority: Major
>
> We have seen this issue in multiple runs:
> https://builds.apache.org/job/PreCommit-HADOOP-Build/15146/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeMultipleRegistrations/testClusterIdMismatchAtStartupWithHA/






[jira] [Created] (HDFS-13908) TestDataNodeMultipleRegistrations#testClusterIdMismatchAtStartupWithHA is flaky

2018-09-10 Thread JIRA
Íñigo Goiri created HDFS-13908:
--

 Summary: 
TestDataNodeMultipleRegistrations#testClusterIdMismatchAtStartupWithHA is flaky
 Key: HDFS-13908
 URL: https://issues.apache.org/jira/browse/HDFS-13908
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Íñigo Goiri


We have seen this issue in multiple runs:







[jira] [Comment Edited] (HDDS-362) Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol

2018-09-10 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609566#comment-16609566
 ] 

Ajay Kumar edited comment on HDDS-362 at 9/10/18 5:39 PM:
--

I think the map in {{ChillModeRestrictedOps}} should be renamed to restrictedOps.


was (Author: ajayydv):
I think the map in {{ChillModeRestrictedOps}} should be renamed to 
restrictedOps. Also, an EnumSet seems a better choice since we are not using 
the value part of the map.
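
For illustration, a sketch of the EnumSet suggestion (the enum values are 
assumptions, not the actual restricted operations):
{code:java}
import java.util.EnumSet;

// Hypothetical sketch: an EnumSet carries the same membership information
// as a Map whose values are never read.
class ChillModeRestrictedOpsSketch {
  enum ScmOp { ALLOCATE_BLOCK, DELETE_BLOCK, GET_SCM_INFO }

  private static final EnumSet<ScmOp> RESTRICTED_OPS =
      EnumSet.of(ScmOp.ALLOCATE_BLOCK, ScmOp.DELETE_BLOCK);

  static boolean isRestricted(ScmOp op) {
    return RESTRICTED_OPS.contains(op);
  }
}
{code}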

> Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol
> ---
>
> Key: HDDS-362
> URL: https://issues.apache.org/jira/browse/HDDS-362
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.3.0
>
> Attachments: HDDS-362.00.patch, HDDS-362.01.patch
>
>
> Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol






[jira] [Updated] (HDFS-13633) RBF: Synchronous way to create RPC client connections to NN

2018-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13633:
---
Parent Issue: HDFS-13891  (was: HDFS-12615)

> RBF: Synchronous way to create RPC client connections to NN
> ---
>
> Key: HDFS-13633
> URL: https://issues.apache.org/jira/browse/HDFS-13633
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
>
> Currently the router code does the following.
>  # The IPC handler thread gets a connection from the pool, even if the 
> connection is NOT usable.
>  # At the same time, the IPC thread submits a request to the connection 
> creator thread to add a new connection to the pool asynchronously.
>  # The new connection is NOT utilized by the IPC threads that got back an 
> unusable connection.
> With this approach, bursts of client requests fill up the pool without 
> necessarily using the connections. The approach is also nondeterministic.
> We propose a flag that lets router admins control how the IPC handler 
> threads obtain connections, toggling between asynchronous and synchronous 
> connection creation.
> In the new model, if a connection is unusable, the IPC handler thread would 
> create a connection, add it to the pool, and use it for the call. It would 
> still use the unusable connection if the pool is full.
>  
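
A minimal sketch of the proposed toggle, assuming hypothetical names for the 
pool, flag, and creator queue (not the actual Router classes):
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the sync/async toggle; every name here is an
// assumption, not the actual Router implementation.
class ConnectionManagerSketch {
  interface Connection { boolean isUsable(); }

  private final BlockingQueue<Runnable> creatorQueue = new LinkedBlockingQueue<>();
  private final boolean syncCreation;  // the proposed admin flag

  ConnectionManagerSketch(boolean syncCreation) {
    this.syncCreation = syncCreation;
  }

  Connection getConnection(Connection fromPool, boolean poolFull) {
    if (fromPool.isUsable()) {
      return fromPool;
    }
    if (syncCreation && !poolFull) {
      // New model: the handler thread creates a usable connection itself
      // and uses it for this call.
      return createConnection();
    }
    // Old model: hand off creation to the creator thread and fall back to
    // the (possibly unusable) connection for this call.
    creatorQueue.offer(this::createConnection);
    return fromPool;
  }

  private Connection createConnection() {
    return () -> true;  // stand-in for a freshly opened, usable connection
  }
}
{code}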






[jira] [Updated] (HDFS-13880) Add mechanism to allow certain RPC calls to bypass sync

2018-09-10 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-13880:
--
Attachment: HDFS-13880-HDFS-12943.005.patch

> Add mechanism to allow certain RPC calls to bypass sync
> ---
>
> Key: HDFS-13880
> URL: https://issues.apache.org/jira/browse/HDFS-13880
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13880-HDFS-12943.001.patch, 
> HDFS-13880-HDFS-12943.002.patch, HDFS-13880-HDFS-12943.003.patch, 
> HDFS-13880-HDFS-12943.004.patch, HDFS-13880-HDFS-12943.005.patch
>
>
> Currently, every single call to the NameNode is synced, in the sense that the 
> NameNode will not process it until its state id catches up. But in certain 
> cases, we would like to bypass this check and allow the call to return 
> immediately, even when the server state id is not up to date. One case is the 
> new API to be added in HDFS-13749 that requests the current state id. Others 
> may include calls that do not promise real-time responses, such as 
> {{getContentSummary}}. This Jira adds the mechanism to allow certain calls to 
> bypass sync.
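
A minimal sketch of one possible shape for such a mechanism (the annotation 
and method names are assumptions, not the patch itself):
{code:java}
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Hypothetical sketch: tag RPC methods that may skip the state-id check.
@Retention(RetentionPolicy.RUNTIME)
@interface BypassSync {}

class SyncCheckSketch {
  private volatile long serverStateId;

  boolean mustWait(Method rpcMethod, long clientStateId) {
    if (rpcMethod.isAnnotationPresent(BypassSync.class)) {
      return false;  // bypassing calls are processed immediately
    }
    // Normal path: defer the call until the server state id catches up.
    return clientStateId > serverStateId;
  }
}
{code}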






[jira] [Commented] (HDFS-13902) Add JMX, conf and stacks menus to the datanode page

2018-09-10 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609582#comment-16609582
 ] 

Íñigo Goiri commented on HDFS-13902:


+1
Yes, the JN changes should be done in a separate JIRA.

>  Add JMX, conf and stacks menus to the datanode page
> 
>
> Key: HDFS-13902
> URL: https://issues.apache.org/jira/browse/HDFS-13902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.3
>Reporter: fengchuang
>Assignee: fengchuang
>Priority: Minor
> Attachments: HDFS-13902.001.patch
>
>
> Add JMX, conf and stacks menus to the datanode page.






[jira] [Updated] (HDFS-13866) RBF: Reset exitcode to -1 when invalid params are input for dfsrouteradmin commands

2018-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13866:
---
Description: Reset exitcode to -1 when invalid params are input for 
dfsrouteradmin commands.

> RBF: Reset exitcode to -1 when invalid params are input for dfsrouteradmin 
> commands
> -
>
> Key: HDFS-13866
> URL: https://issues.apache.org/jira/browse/HDFS-13866
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Affects Versions: 3.0.0
>Reporter: Ranith Sardar
>Assignee: Ranith Sardar
>Priority: Major
>
> Reset exitcode to -1 when invalid params are input for dfsrouteradmin 
> commands.






[jira] [Commented] (HDFS-13880) Add mechanism to allow certain RPC calls to bypass sync

2018-09-10 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609586#comment-16609586
 ] 

Chen Liang commented on HDFS-13880:
---

Post v005 patch to address Erik's comments, along with a couple other minor 
improvements.

> Add mechanism to allow certain RPC calls to bypass sync
> ---
>
> Key: HDFS-13880
> URL: https://issues.apache.org/jira/browse/HDFS-13880
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13880-HDFS-12943.001.patch, 
> HDFS-13880-HDFS-12943.002.patch, HDFS-13880-HDFS-12943.003.patch, 
> HDFS-13880-HDFS-12943.004.patch, HDFS-13880-HDFS-12943.005.patch
>
>
> Currently, every single call to the NameNode is synced, in the sense that the 
> NameNode will not process it until its state id catches up. But in certain 
> cases, we would like to bypass this check and allow the call to return 
> immediately, even when the server state id is not up to date. One case is the 
> new API to be added in HDFS-13749 that requests the current state id. Others 
> may include calls that do not promise real-time responses, such as 
> {{getContentSummary}}. This Jira adds the mechanism to allow certain calls to 
> bypass sync.





