[jira] [Commented] (HDFS-13817) RBF: create mount point with RANDOM policy and with 2 Nameservices doesn't work properly

2018-09-11 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611615#comment-16611615
 ] 

Ayush Saxena commented on HDFS-13817:
-

Thanks [~Harsha1206] for reporting this; I tried to check it.
This scenario doesn't seem to be present at the current stage.
Please give it a check once more.

> RBF: create mount point with RANDOM policy and with 2 Nameservices doesn't 
> work properly 
> -
>
> Key: HDFS-13817
> URL: https://issues.apache.org/jira/browse/HDFS-13817
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Reporter: Harshakiran Reddy
>Priority: Major
>  Labels: RBF
>
> {{Scenario:-}} 
> # Create a mount point with RANDOM policy and with 2 Nameservices.
> # List the target mount path of the Global path.
> Actual Output: 
> === 
> {{ls: `/apps5': No such file or directory}}
> Expected Output: 
> =
> {{if the files are available, list those files; or if it's empty, it will display 
> nothing}}
> {noformat} 
> bin> ./hdfs dfsrouteradmin -add /apps5 hacluster,ns2 /tmp10 -order RANDOM 
> -owner securedn -group hadoop
> Successfully added mount point /apps5
> bin> ./hdfs dfs -ls /apps5
> ls: `/apps5': No such file or directory
> bin> ./hdfs dfs -ls /apps3
> Found 2 items
> drwxrwxrwx   - user group 0 2018-08-09 19:55 /apps3/apps1
> -rw-r--r--   3   - user group  4 2018-08-10 11:55 /apps3/ttt
>  {noformat}
> {{please refer to the mount information below}}
> {{/apps3 tagged with HASH policy}}
> {{/apps5 tagged with RANDOM policy}}
> {noformat}
> /bin> ./hdfs dfsrouteradmin -ls
> Mount Table Entries:
> SourceDestinations  Owner 
> Group Mode  Quota/Usage
> /apps3hacluster->/tmp3,ns2->/tmp4 securedn
>   users rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> /apps5hacluster->/tmp5,ns2->/tmp5 securedn
>   users rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Lokesh Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611605#comment-16611605
 ] 

Lokesh Jain commented on HDDS-433:
--

[~hanishakoneru] This case would never arise. The readStateMachineData API is 
called only when stateMachineDataAttached is true.
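The fix described in the issue below can be illustrated with a small sketch. This is plain Python, not Ratis code: `LogEntry`, its fields, and `read_state_machine_data` are hypothetical stand-ins for the protobuf `LogEntryProto` types, used only to show why the returned entry must be built over the input entry.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class LogEntry:
    """Stand-in for Ratis's LogEntryProto (term/index/data fields assumed)."""
    term: int
    index: int
    data: bytes = b""

def read_state_machine_data(entry: LogEntry, data: bytes) -> LogEntry:
    # Correct: build the returned entry over the input entry, so its
    # term and index are preserved instead of defaulting to 0.
    return replace(entry, data=data)

# Building a fresh entry instead (the bug) would carry index=0 and trip the
# follower's "Unexpected Index ... entries[0].getIndex()=0" precondition
# shown in the log excerpt below.
entry = read_state_machine_data(LogEntry(term=14, index=20374), b"chunk")
print(entry.index)  # 20374, not 0
```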

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns LogEntryProto with index 
> set to 0. This leads to exception in Ratis. The LogEntryProto to return 
> should be built over the input LogEntryProto.
> The following exception was seen using Ozone, where the leader sent incorrect 
> append entries to the follower.
> {code}
> 2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: 
> Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index 
> to:20312
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:1182, electionTimeout:990ms
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 
> for changeToFollower
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:2167, electionTimeout:976ms
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for ini
> tElection
> 2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
> 2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 
> for changeToFollower
> 2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for app
> endEntries
>  
> 2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8:
>  Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with 
> term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
> lastRpcElapsed=0ms
>  
> 2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 
> response(s) [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2
> bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] and 0 exception(s); 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, 
> leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, 
> voted=2e240240-0fac-4f93-8aa8-fa8f
> 74bf1810_9858, raftlog=[(t:14, i:20374)], 
> conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
> 2e240240-0fac-4f93-8aa8-fa8f74bf
> 1810_9858:172.26.32.228:9858], old=null
> 2018-08-20 07:54:31,227 WARN 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
> java.lang.IllegalStateException: Unexpected Index: previous is (t:14, 
> i:20374) but entries[0].getIndex()=0
> at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
> at 
> org.apache.ratis.server.impl.RaftServerImpl.validateEntries(RaftServerImpl.java:786)
> at 
> 

[jira] [Commented] (HDFS-13566) Add configurable additional RPC listener to NameNode

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611587#comment-16611587
 ] 

Hadoop QA commented on HDFS-13566:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 57s{color} | {color:orange} root: The patch generated 11 new + 719 unchanged 
- 2 fixed = 730 total (was 721) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 45s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
53s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
36s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 16s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
50s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}224m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | 

[jira] [Commented] (HDFS-8196) Erasure Coding related information on NameNode UI

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611582#comment-16611582
 ] 

Hadoop QA commented on HDFS-8196:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 52s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  0s{color} | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 
223 unchanged - 0 fixed = 224 total (was 223) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 54s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 57s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
33s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}178m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-8196 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939341/HDFS-8196.05.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b46d11b84a2e 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b2432d2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | 

[jira] [Commented] (HDFS-13882) Change dfs.client.block.write.locateFollowingBlock.retries default from 5 to 10

2018-09-11 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611579#comment-16611579
 ] 

Xiao Chen commented on HDFS-13882:
--

Thanks for the reply [~arpitagarwal], and for sharing your internal defaults. 
If the community is not comfortable, we shouldn't change the default, for compat 
reasons. :) 

For this jira, it feels like we can still do the improvement of adding a 
maximum sleep between retries - otherwise, if someone configures this larger, 
the retry would be ridiculously long (e.g. with 10 retries, ~409 secs seems 
pretty long to me). The added maximum config can default to not taking effect, 
for compat.
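For concreteness, the figures quoted in this thread can be reproduced with a short sketch, assuming the client's backoff starts at 400 ms and doubles on each retry; the `cap` parameter is the hypothetical maximum sleep being proposed, not an existing config.

```python
def total_backoff_seconds(retries, base=0.4, cap=None):
    """Sum the sleeps of an exponential backoff: base, 2*base, 4*base, ..."""
    total = 0.0
    for i in range(retries):
        sleep = base * 2 ** i
        if cap is not None:
            sleep = min(sleep, cap)  # proposed maximum sleep between retries
        total += sleep
    return total

print(round(total_backoff_seconds(5), 1))           # 12.4 (current default of 5)
print(round(total_backoff_seconds(7), 1))           # 50.8 (the "~50 seconds" for 7)
print(round(total_backoff_seconds(10), 1))          # 409.2 (the "409 secs" for 10)
print(round(total_backoff_seconds(10, cap=30), 1))  # 140.8 (bounded once capped)
```

With a cap, the total wait grows linearly instead of exponentially in the retry count, which is the point of the proposed config.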

> Change dfs.client.block.write.locateFollowingBlock.retries default from 5 to 
> 10
> ---
>
> Key: HDFS-13882
> URL: https://issues.apache.org/jira/browse/HDFS-13882
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13882.001.patch
>
>
> More and more we are seeing cases where customers are running into the java 
> io exception "Unable to close file because the last block does not have 
> enough number of replicas" on client file closure. The common workaround is 
> to increase dfs.client.block.write.locateFollowingBlock.retries from 5 to 10. 






[jira] [Commented] (HDFS-13882) Change dfs.client.block.write.locateFollowingBlock.retries default from 5 to 10

2018-09-11 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611572#comment-16611572
 ] 

Arpit Agarwal commented on HDFS-13882:
--

Also 10 certainly feels too high - that could prevent timely recovery from 
legitimate failures.

> Change dfs.client.block.write.locateFollowingBlock.retries default from 5 to 
> 10
> ---
>
> Key: HDFS-13882
> URL: https://issues.apache.org/jira/browse/HDFS-13882
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13882.001.patch
>
>
> More and more we are seeing cases where customers are running into the java 
> io exception "Unable to close file because the last block does not have 
> enough number of replicas" on client file closure. The common workaround is 
> to increase dfs.client.block.write.locateFollowingBlock.retries from 5 to 10. 






[jira] [Commented] (HDFS-13882) Change dfs.client.block.write.locateFollowingBlock.retries default from 5 to 10

2018-09-11 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611566#comment-16611566
 ] 

Arpit Agarwal commented on HDFS-13882:
--

Sorry I missed responding to this earlier. We set this to 7 for a couple of our 
busier customers to fix the same issue and that worked. 7 retries works out to 
~50 seconds. However I am unsure about increasing this across the board for 
everyone wrt cascading side effects.


> Change dfs.client.block.write.locateFollowingBlock.retries default from 5 to 
> 10
> ---
>
> Key: HDFS-13882
> URL: https://issues.apache.org/jira/browse/HDFS-13882
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13882.001.patch
>
>
> More and more we are seeing cases where customers are running into the java 
> io exception "Unable to close file because the last block does not have 
> enough number of replicas" on client file closure. The common workaround is 
> to increase dfs.client.block.write.locateFollowingBlock.retries from 5 to 10. 






[jira] [Commented] (HDFS-13882) Change dfs.client.block.write.locateFollowingBlock.retries default from 5 to 10

2018-09-11 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611547#comment-16611547
 ] 

Xiao Chen commented on HDFS-13882:
--

Thanks for investigating this, [~knanasi].

I think what we can do here to prevent the exponential backoff from going too 
far is to add another config along the lines of a 'maximum wait time between 
retries'. If the backoff interval grows larger than that, we simply switch to a 
fixed-sleep retry. Unlimited exponential backoff just doesn't make sense 
beyond a certain point.

This is unless [~arpitagarwal] or [~kihwal] has concerns about changing the 
default of 5 retries here.

> Change dfs.client.block.write.locateFollowingBlock.retries default from 5 to 
> 10
> ---
>
> Key: HDFS-13882
> URL: https://issues.apache.org/jira/browse/HDFS-13882
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13882.001.patch
>
>
> More and more we are seeing cases where customers are running into the java 
> io exception "Unable to close file because the last block does not have 
> enough number of replicas" on client file closure. The common workaround is 
> to increase dfs.client.block.write.locateFollowingBlock.retries from 5 to 10. 






[jira] [Commented] (HDFS-13833) Failed to choose from local rack (location = /default); the second replica is not found, retry choosing ramdomly

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611506#comment-16611506
 ] 

Hadoop QA commented on HDFS-13833:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 46s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 45 unchanged - 0 fixed = 48 total (was 45) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 52s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13833 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939335/HDFS-13833.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f69358da82fb 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9c238ff |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25042/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25042/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25042/testReport/ |
| Max. 

[jira] [Updated] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands

2018-09-11 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13906:

Attachment: HDFS-13906-03.patch

> RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands
> ---
>
> Key: HDFS-13906
> URL: https://issues.apache.org/jira/browse/HDFS-13906
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13906-01.patch, HDFS-13906-02.patch, 
> HDFS-13906-03.patch
>
>
> Currently we have the option to delete only one mount entry at a time.
> If we have multiple mount entries, it is tedious for the user to execute the 
> command N times.
> If the "rm" and "clrQuota" commands supported multiple entries, it would be 
> easy for the user to provide all the required entries in one single 
> command.
> The Namenode already supports "rm" and "clrQuota" with multiple destinations.
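As a sketch, the requested multi-path forms would look like the following. These are hypothetical invocations: the paths are examples, and the exact syntax depends on the patch.

```
# Hypothetical usage once "rm" and "clrQuota" accept multiple entries:
hdfs dfsrouteradmin -rm /apps1 /apps2 /apps3
hdfs dfsrouteradmin -clrQuota /apps1 /apps2

# Today each entry needs its own invocation:
hdfs dfsrouteradmin -rm /apps1
hdfs dfsrouteradmin -rm /apps2
hdfs dfsrouteradmin -rm /apps3
```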






[jira] [Commented] (HDDS-423) Introduce an ozone specific log4j.properties

2018-09-11 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611482#comment-16611482
 ] 

Xiaoyu Yao commented on HDDS-423:
-

Agree with [~anu]. With the dedicated log4j.properties for ozone, how do we plan 
to handle multiple log4j.properties files in the classpath? If I remember 
correctly, in log4j 1 the first one appearing in the classpath will be loaded 
and the others will be ignored. I can't find a change to the classpath in this 
patch; do you see any potential issue in deployment? 

> Introduce an ozone specific log4j.properties
> 
>
> Key: HDDS-423
> URL: https://issues.apache.org/jira/browse/HDDS-423
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Minor
> Fix For: 0.2.1
>
> Attachments: HDDS-423-ozone-0.2.1.001.patch
>
>
> Currently the ozone distribution uses the common log4j.properties from the 
> hadoop-common project.
>  
> For this reason it's very hard to define default settings just for ozone. 
> For example, for ozone we could turn off the NativeCodeLoader warning.
>  
> I propose to maintain an ozone-specific version of the log4j.properties and 
> adjust the log levels of the NativeCodeLoader and the Ratis ConfUtils there.
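A minimal sketch of what such an override file could contain, in log4j 1.x property syntax. The logger names and levels here are assumptions for illustration, not the contents of the attached patch.

```
# Ozone-specific log4j.properties overrides (sketch)
log4j.rootLogger=INFO,stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n

# Silence the native-library warning and quiet Ratis config logging
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
log4j.logger.org.apache.ratis.conf.ConfUtils=WARN
```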






[jira] [Commented] (HDFS-13868) WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but "oldsnapshotname" is not.

2018-09-11 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611453#comment-16611453
 ] 

Wei-Chiu Chuang commented on HDFS-13868:


Thanks. Interesting. I didn't realize "" represents the current snapshot. The 
doc suggests "." represents the current snapshot:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html#Get_Snapshots_Difference_Report
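For comparison, the documented request supplies both {{oldsnapshotname}} and 
{{snapshotname}} (with "." allowed for the current state). A minimal sketch of 
building such a URL — host, port, path, and snapshot names are placeholders, 
not values from this report:

```java
// Sketch: a well-formed GETSNAPSHOTDIFF request with both snapshot
// parameters present. All concrete values here are placeholders.
public class SnapshotDiffUrl {
  public static String build(String host, int port, String path,
                             String user, String oldSnap, String newSnap) {
    return String.format(
        "http://%s:%d/webhdfs/v1%s?op=GETSNAPSHOTDIFF"
            + "&user.name=%s&oldsnapshotname=%s&snapshotname=%s",
        host, port, path, user, oldSnap, newSnap);
  }
}
```

Omitting or emptying {{oldsnapshotname}} is what triggers the NPE described in 
the issue below.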


> WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but 
> "oldsnapshotname" is not.
> -
>
> Key: HDFS-13868
> URL: https://issues.apache.org/jira/browse/HDFS-13868
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, webhdfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Siyao Meng
>Assignee: Pranay Singh
>Priority: Major
> Attachments: HDFS-13868.001.patch, HDFS-13868.002.patch, 
> HDFS-13868.003.patch, HDFS-13868.004.patch
>
>
> HDFS-13052 implements GETSNAPSHOTDIFF for WebHDFS.
>  
> Proof:
> {code:java}
> # Bash
> # Prerequisite: You will need to create the directory "/snapshot", 
> allowSnapshot() on it, and create a snapshot named "snap3" for it to reach 
> NPE.
> $ curl 
> "http://:/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF=hdfs=snap2=snap3"
> # Note that I intentionally typed the wrong parameter name for 
> "oldsnapshotname" above to cause NPE.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://:/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF=hdfs==snap3"
> # Empty string for oldsnapshotname
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://:/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF=hdfs=snap3"
> # Missing param oldsnapshotname, essentially the same as the first case.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13902) Add JMX, conf and stacks menus to the datanode page

2018-09-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611452#comment-16611452
 ] 

Hudson commented on HDFS-13902:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14928 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14928/])
HDFS-13902. Add JMX, conf and stacks menus to the datanode page. (brahma: rev 
b2432d254c486d0e360b76b7b094a71a011678ab)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode/datanode.html


>  Add JMX, conf and stacks menus to the datanode page
> 
>
> Key: HDFS-13902
> URL: https://issues.apache.org/jira/browse/HDFS-13902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.3
>Reporter: fengchuang
>Assignee: fengchuang
>Priority: Minor
> Fix For: 2.9.2
>
> Attachments: HDFS-13902.001.patch
>
>
> Add JMX, conf and stacks menus to the datanode page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8196) Erasure Coding related information on NameNode UI

2018-09-11 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-8196:
---
Attachment: HDFS-8196.05.patch

> Erasure Coding related information on NameNode UI
> -
>
> Key: HDFS-8196
> URL: https://issues.apache.org/jira/browse/HDFS-8196
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Major
>  Labels: NameNode, WebUI, hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-8196.01.patch, HDFS-8196.02.patch, 
> HDFS-8196.03.patch, HDFS-8196.04.patch, HDFS-8196.05.patch, Screen Shot 
> 2017-02-06 at 22.30.40.png, Screen Shot 2017-02-12 at 20.21.42.png, Screen 
> Shot 2017-02-14 at 22.43.57.png
>
>
> NameNode WebUI shows EC-related information and metrics. 
> This depends on [HDFS-7674|https://issues.apache.org/jira/browse/HDFS-7674].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13846) Safe blocks counter is not decremented correctly if the block is striped

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611445#comment-16611445
 ] 

Hadoop QA commented on HDFS-13846:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 55s{color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 416 unchanged - 
0 fixed = 417 total (was 416) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}109m 32s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13846 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12937751/HDFS-13846.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ab8fff3e89d8 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1d567c2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25041/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25041/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 

[jira] [Commented] (HDFS-8196) Erasure Coding related information on NameNode UI

2018-09-11 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611439#comment-16611439
 ] 

Kitti Nanasi commented on HDFS-8196:


[~lewuathe], I will upload a patch for this soon, let me know if you would like 
to make a change.

> Erasure Coding related information on NameNode UI
> -
>
> Key: HDFS-8196
> URL: https://issues.apache.org/jira/browse/HDFS-8196
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Major
>  Labels: NameNode, WebUI, hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-8196.01.patch, HDFS-8196.02.patch, 
> HDFS-8196.03.patch, HDFS-8196.04.patch, Screen Shot 2017-02-06 at 
> 22.30.40.png, Screen Shot 2017-02-12 at 20.21.42.png, Screen Shot 2017-02-14 
> at 22.43.57.png
>
>
> NameNode WebUI shows EC-related information and metrics. 
> This depends on [HDFS-7674|https://issues.apache.org/jira/browse/HDFS-7674].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13846) Safe blocks counter is not decremented correctly if the block is striped

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611435#comment-16611435
 ] 

Hadoop QA commented on HDFS-13846:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 56s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 53s{color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 416 unchanged - 
0 fixed = 417 total (was 416) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 40s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.namenode.TestListCorruptFileBlocks |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13846 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12937751/HDFS-13846.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 075df655167b 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1d567c2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25040/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| checkstyle | 

[jira] [Updated] (HDFS-13902) Add JMX, conf and stacks menus to the datanode page

2018-09-11 Thread Brahma Reddy Battula (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-13902:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.2
   Status: Resolved  (was: Patch Available)

Committed from trunk through branch-2.9.

[~fengchuang] thanks for the contribution, and thanks [~elgoiri] for the 
additional review.

 

>  Add JMX, conf and stacks menus to the datanode page
> 
>
> Key: HDFS-13902
> URL: https://issues.apache.org/jira/browse/HDFS-13902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.3
>Reporter: fengchuang
>Assignee: fengchuang
>Priority: Minor
> Fix For: 2.9.2
>
> Attachments: HDFS-13902.001.patch
>
>
> Add JMX, conf and stacks menus to the datanode page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-399) Handle pipeline discovery on SCM restart.

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611416#comment-16611416
 ] 

Anu Engineer edited comment on HDDS-399 at 9/12/18 1:01 AM:


Thanks for updating the patch. It is much easier to understand. I really 
appreciate it.

Something is not very clear to me from the patch; I will dig deeper into 
the code to see if I can puzzle it out.

Let us say that I tried to create a pipeline and it timed out. We will set the 
status to closed. However, what is the state of the pipeline in the cluster? 
What happens if we get a pipeline report a little later? How is that handled? 

I think I found it: in {{processNodeReport(Pipeline pipeline, DatanodeDetails 
dn)}} in {{PipelineManager.java}} we will set that to open, that is, we will 
set the pipeline state to {{Open}}. If we had done that via the newly 
introduced {{statemanager}} in {{PipelineStateManager}}, then this would have 
thrown, because there is no path in the state machine to go from {{closed}} to 
{{open}}.

Now if we reboot, reloadExistingPipelines will think this datanode/pipeline 
is closed (since we never updated the database), but in a little while, when 
the new NodeReport/Pipeline report arrives, we will set it to {{open}} again. 

Please let me know if I am mistaken; this is complex code, so I may well have 
misread it.
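A toy model of that restriction — the state names and the transition table are 
my assumptions, not the real {{PipelineStateManager}}:

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Toy model of the pipeline lifecycle discussed above (state names and
// transitions are assumptions, not the actual PipelineStateManager code).
public class PipelineStateSketch {
  public enum State { ALLOCATED, OPEN, CLOSED }

  private static final Map<State, EnumSet<State>> VALID =
      new EnumMap<>(State.class);
  static {
    VALID.put(State.ALLOCATED, EnumSet.of(State.OPEN, State.CLOSED));
    VALID.put(State.OPEN, EnumSet.of(State.CLOSED));
    VALID.put(State.CLOSED, EnumSet.noneOf(State.class)); // closed is terminal
  }

  // Throws on any transition not in the table -- e.g. CLOSED -> OPEN,
  // which is exactly the path a late pipeline report would try to take.
  public static State transition(State from, State to) {
    if (!VALID.get(from).contains(to)) {
      throw new IllegalStateException("No transition " + from + " -> " + to);
    }
    return to;
  }
}
```

Routing the report-handling path through such a guard would surface the 
closed-to-open reopen as an error instead of silently flipping the state.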


was (Author: anu):
Thanks for updating the patch. It is much more easier to understand. I really 
appreciate it.

Something that is not very clear to me from the patch. I will dig deeper into 
the code to see if I can puzzle this out.

Let us say that I tried to create a pipeline and I timed-out. We will set the 
status to closed. However, what is the state of pipeline in the cluster ? what 
happens if we get a pipeline report little later ? how is that handled ? 

I think I found it, in {{processNodeReport(Pipeline pipeline, DatanodeDetails 
dn)}} in {{PipelineManager.java}} we will set that to open, and we will set the 
pipeline state to {{Open}}. If we had done that via the newly introduced 
{{statemanager}} in {{PipelineStateManager}}, then this would have thrown, 
because there is no path in the state machine to go from {{closed}} to {{open}}.

Now if we reboot, the reload will think this datanode/pipeline is closed, but 
in a little while when the NodeReport/Pipeline report arrives we will set it to 
{{open}}. 

Please let me know if I am mistaken about this, it is  pretty complex code,  so 
I might be mistaken about this.

> Handle pipeline discovery on SCM restart.
> -
>
> Key: HDDS-399
> URL: https://issues.apache.org/jira/browse/HDDS-399
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-399.001.patch, HDDS-399.002.patch, 
> HDDS-399.003.patch
>
>
> On SCM restart, as part of node registration, SCM should find out the list of 
> open pipelines on the node. Once all the nodes of the pipeline have reported 
> back, they should be added as active pipelines for further allocations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-399) Handle pipeline discovery on SCM restart.

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611416#comment-16611416
 ] 

Anu Engineer edited comment on HDDS-399 at 9/12/18 12:59 AM:
-

Thanks for updating the patch. It is much easier to understand. I really 
appreciate it.

Something is not very clear to me from the patch; I will dig deeper into 
the code to see if I can puzzle it out.

Let us say that I tried to create a pipeline and it timed out. We will set the 
status to closed. However, what is the state of the pipeline in the cluster? 
What happens if we get a pipeline report a little later? How is that handled? 

I think I found it: in {{processNodeReport(Pipeline pipeline, DatanodeDetails 
dn)}} in {{PipelineManager.java}} we will set that to open, and we will set the 
pipeline state to {{Open}}. If we had done that via the newly introduced 
{{statemanager}} in {{PipelineStateManager}}, then this would have thrown, 
because there is no path in the state machine to go from {{closed}} to {{open}}.

Now if we reboot, the reload will think this datanode/pipeline is closed, but 
in a little while, when the NodeReport/Pipeline report arrives, we will set it 
to {{open}}. 

Please let me know if I am mistaken; this is complex code, so I may well have 
misread it.


was (Author: anu):
Thanks for updating the patch. It is much more easier to understand. I really 
appreciate it.

Something that is not very clear to me from the patch. I will dig deeper into 
the code to see if I can puzzle this out.

Let us say that I tried to create a pipeline and it timed out. We will set the 
status to closed. However, what is the state of the pipeline in the cluster? 
What happens if we get a pipeline report a little later? How is that handled? 

> Handle pipeline discovery on SCM restart.
> -
>
> Key: HDDS-399
> URL: https://issues.apache.org/jira/browse/HDDS-399
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-399.001.patch, HDDS-399.002.patch, 
> HDDS-399.003.patch
>
>
> On SCM restart, as part of node registration, SCM should find out the list of 
> open pipelines on the node. Once all the nodes of the pipeline have reported 
> back, they should be added as active pipelines for further allocations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-399) Handle pipeline discovery on SCM restart.

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611416#comment-16611416
 ] 

Anu Engineer commented on HDDS-399:
---

Thanks for updating the patch. It is much easier to understand. I really 
appreciate it.

Something is not very clear to me from the patch; I will dig deeper into 
the code to see if I can puzzle it out.

Let us say that I tried to create a pipeline and it timed out. We will set the 
status to closed. However, what is the state of the pipeline in the cluster? 
What happens if we get a pipeline report a little later? How is that handled? 

> Handle pipeline discovery on SCM restart.
> -
>
> Key: HDDS-399
> URL: https://issues.apache.org/jira/browse/HDDS-399
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-399.001.patch, HDDS-399.002.patch, 
> HDDS-399.003.patch
>
>
> On SCM restart, as part of node registration, SCM should find out the list of 
> open pipelines on the node. Once all the nodes of the pipeline have reported 
> back, they should be added as active pipelines for further allocations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-395) TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"

2018-09-11 Thread Dinesh Chitlangia (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611415#comment-16611415
 ] 

Dinesh Chitlangia commented on HDDS-395:


[~anu] - Indeed! I will investigate this. Thanks for bringing this to my 
attention.

> TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"
> --
>
> Key: HDDS-395
> URL: https://issues.apache.org/jira/browse/HDDS-395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Mukul Kumar Singh
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.2.1
>
> Attachments: HDDS-395.001.patch
>
>
> Ozone datanode initialization is failing with the following exception.
> This was noted in the following precommit build result.
> https://builds.apache.org/job/PreCommit-HDDS-Build/935/testReport/org.apache.hadoop.ozone.web/TestOzoneRestWithMiniCluster/org_apache_hadoop_ozone_web_TestOzoneRestWithMiniCluster_2/
> {code}
> 2018-09-02 20:56:33,501 INFO  db.DBStoreBuilder 
> (DBStoreBuilder.java:getDbProfile(176)) - Unable to read ROCKDB config
> java.io.IOException: Unable to find the configuration directory. Please make 
> sure that HADOOP_CONF_DIR is setup correctly 
>   at 
> org.apache.hadoop.utils.db.DBConfigFromFile.getConfigLocation(DBConfigFromFile.java:62)
>   at 
> org.apache.hadoop.utils.db.DBConfigFromFile.readFromFile(DBConfigFromFile.java:118)
>   at 
> org.apache.hadoop.utils.db.DBStoreBuilder.getDbProfile(DBStoreBuilder.java:170)
>   at 
> org.apache.hadoop.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:122)
>   at 
> org.apache.hadoop.ozone.om.OmMetadataManagerImpl.(OmMetadataManagerImpl.java:133)
>   at org.apache.hadoop.ozone.om.OzoneManager.(OzoneManager.java:146)
>   at 
> org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:295)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createOM(MiniOzoneClusterImpl.java:357)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:304)
>   at 
> org.apache.hadoop.ozone.web.TestOzoneRestWithMiniCluster.init(TestOzoneRestWithMiniCluster.java:73)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-432) Replication of closed containers is not working

2018-09-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611414#comment-16611414
 ] 

Hudson commented on HDDS-432:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14926 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14926/])
HDDS-432. Replication of closed containers is not working. Contributed 
(aengineer: rev 9c238ffc301c9aa1ae0f811c065e7426b1e23540)
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/ReplicateContainerCommandHandler.java
* (edit) 
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestReplicateContainerCommandHandler.java
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java


> Replication of closed containers is not working
> ---
>
> Key: HDDS-432
> URL: https://issues.apache.org/jira/browse/HDDS-432
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Critical
> Fix For: 0.2.1
>
> Attachments: HDDS-432-ozone-0.2.001.patch
>
>
> Steps to reproduce:
> 1. Start a cluster with three datanodes:
> docker-compose up -d
> docker-compose scale datanode=3
> 2. Create keys:
> ozone oz -createVolume /vol1 -user hadoop --quota 1TB --root
> ozone oz -createBucket /vol1/bucket
> dd if=/dev/zero of=/tmp/test bs=1024000 count=512
> ozone oz -putKey /vol1/bucket/file1 -replicationFactor THREE -file /tmp/test  
> 3. Close the containers with scmcli
> 4. kill a datanode with a replica
> {code}
> for i in `seq 1 4`; do docker diff ozone_datanode_$i && echo 
> ""; done
> #Choose a datanode with replica
> docker kill ozone_datanode_3
> {code}
>  
> 5. Wait
> 6. After a while, the last datanode should contain the chunks (checked with 
> docker diff)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-395) TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611411#comment-16611411
 ] 

Anu Engineer commented on HDDS-395:
---

[~dineshchitlangia] This is one of those sad times when Jenkins lets us pass 
but in reality it should have failed.

Once I applied this patch and ran *mvn test*, I got this result.
{noformat}

[ERROR] 
testSetBucketPropertyRemoveACL(org.apache.hadoop.ozone.om.TestBucketManagerImpl)
 Time elapsed: 0.01 s <<< ERROR!
java.lang.NullPointerException
at java.io.File.(File.java:277)
at 
org.apache.hadoop.utils.db.DBConfigFromFile.getConfigLocation(DBConfigFromFile.java:70)
at 
org.apache.hadoop.utils.db.DBConfigFromFile.readFromFile(DBConfigFromFile.java:123)
at 
org.apache.hadoop.utils.db.DBStoreBuilder.getDbProfile(DBStoreBuilder.java:170){noformat}
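The NullPointerException comes from passing a null HADOOP_CONF_DIR into File's 
constructor, which rejects a null pathname. A minimal hedged sketch of the kind 
of guard that avoids it (illustrative names only, not the actual 
DBConfigFromFile code):

```java
import java.io.File;
import java.util.Optional;

// Illustrative sketch: check the configuration directory before constructing
// a File, since new File((String) null) throws a NullPointerException -- the
// failure shown in the stack trace above.
public class ConfigLocationSketch {
    static Optional<File> configLocation(String confDir) {
        if (confDir == null || confDir.isEmpty()) {
            // Let the caller fall back to built-in defaults instead of crashing.
            return Optional.empty();
        }
        return Optional.of(new File(confDir));
    }

    public static void main(String[] args) {
        System.out.println(configLocation(null).isPresent());          // false
        System.out.println(configLocation("/etc/hadoop").isPresent()); // true
    }
}
```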

> TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"
> --
>
> Key: HDDS-395
> URL: https://issues.apache.org/jira/browse/HDDS-395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Mukul Kumar Singh
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.2.1
>
> Attachments: HDDS-395.001.patch
>
>
> Ozone datanode initialization is failing with the following exception.
> This was noted in the following precommit build result.
> https://builds.apache.org/job/PreCommit-HDDS-Build/935/testReport/org.apache.hadoop.ozone.web/TestOzoneRestWithMiniCluster/org_apache_hadoop_ozone_web_TestOzoneRestWithMiniCluster_2/
> {code}
> 2018-09-02 20:56:33,501 INFO  db.DBStoreBuilder 
> (DBStoreBuilder.java:getDbProfile(176)) - Unable to read ROCKDB config
> java.io.IOException: Unable to find the configuration directory. Please make 
> sure that HADOOP_CONF_DIR is setup correctly 
>   at 
> org.apache.hadoop.utils.db.DBConfigFromFile.getConfigLocation(DBConfigFromFile.java:62)
>   at 
> org.apache.hadoop.utils.db.DBConfigFromFile.readFromFile(DBConfigFromFile.java:118)
>   at 
> org.apache.hadoop.utils.db.DBStoreBuilder.getDbProfile(DBStoreBuilder.java:170)
>   at 
> org.apache.hadoop.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:122)
>   at 
> org.apache.hadoop.ozone.om.OmMetadataManagerImpl.<init>(OmMetadataManagerImpl.java:133)
>   at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:146)
>   at 
> org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:295)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createOM(MiniOzoneClusterImpl.java:357)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:304)
>   at 
> org.apache.hadoop.ozone.web.TestOzoneRestWithMiniCluster.init(TestOzoneRestWithMiniCluster.java:73)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
> {code}






[jira] [Commented] (HDDS-233) Update ozone to latest ratis snapshot build

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611405#comment-16611405
 ] 

Hadoop QA commented on HDDS-233:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  9m  
2s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 57s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
10s{color} | {color:red} client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
11s{color} | {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
11s{color} | {color:red} integration-test in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 14m  
5s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m  5s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
59s{color} | {color:green} root: The patch generated 0 new + 0 unchanged - 1 
fixed = 0 total (was 1) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
27s{color} | {color:red} client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
24s{color} | {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
26s{color} | {color:red} integration-test in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-ozone/integration-test {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
23s{color} | {color:red} client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
23s{color} | {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
23s{color} | {color:red} client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
23s{color} | {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | 

[jira] [Commented] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611396#comment-16611396
 ] 

Hudson commented on HDDS-424:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14925 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14925/])
HDDS-424. Consolidate ozone oz parameters to use GNU convention. (aengineer: 
rev a406f6f60ee0caf8229d13bda595d621a9779aa8)
* (add) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/bucket/BucketCommands.java
* (edit) hadoop-ozone/acceptance-test/src/test/acceptance/ozonefs/ozonefs.robot
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/ListVolumeHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/bucket/ListBucketHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/CreateVolumeHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/Shell.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/UpdateVolumeHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/Handler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/keys/DeleteKeyHandler.java
* (add) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/cli/GenericParentCommand.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/bucket/InfoBucketHandler.java
* (add) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/keys/KeyCommands.java
* (add) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/VolumeCommands.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/keys/PutKeyHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/bucket/DeleteBucketHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/keys/ListKeyHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/keys/InfoKeyHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/bucket/UpdateBucketHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/keys/GetKeyHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/InfoVolumeHandler.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/ozShell/TestOzoneShell.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/DeleteVolumeHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/bucket/CreateBucketHandler.java
* (edit) 
hadoop-ozone/acceptance-test/src/test/acceptance/basic/ozone-shell.robot
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/cli/GenericCli.java
* (add) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/cli/MissingSubcommandException.java


> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is:
> 1. avoid camelCase arguments/flags
> 2. use a double dash with words (--user) and a single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required






[jira] [Updated] (HDFS-13566) Add configurable additional RPC listener to NameNode

2018-09-11 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-13566:
--
Attachment: HDFS-13566.004.patch

> Add configurable additional RPC listener to NameNode
> 
>
> Key: HDFS-13566
> URL: https://issues.apache.org/jira/browse/HDFS-13566
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ipc
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13566.001.patch, HDFS-13566.002.patch, 
> HDFS-13566.003.patch, HDFS-13566.004.patch
>
>
> This Jira aims to add the capability for NameNode to run additional 
> listener(s), so that NameNode can be accessed from multiple ports. 
> Fundamentally, this Jira tries to extend ipc.Server to allow it to be 
> configured with more listeners, binding to different ports but sharing the 
> same call queue and handlers. This is useful when different clients are only 
> allowed to access certain ports. Combined with HDFS-13547, this also allows 
> different ports to have different SASL security levels. 






[jira] [Commented] (HDFS-13566) Add configurable additional RPC listener to NameNode

2018-09-11 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611397#comment-16611397
 ] 

Chen Liang commented on HDFS-13566:
---

Posted v004 patch with the configuration key changes and more unit tests added.

> Add configurable additional RPC listener to NameNode
> 
>
> Key: HDFS-13566
> URL: https://issues.apache.org/jira/browse/HDFS-13566
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ipc
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13566.001.patch, HDFS-13566.002.patch, 
> HDFS-13566.003.patch, HDFS-13566.004.patch
>
>
> This Jira aims to add the capability for NameNode to run additional 
> listener(s), so that NameNode can be accessed from multiple ports. 
> Fundamentally, this Jira tries to extend ipc.Server to allow it to be 
> configured with more listeners, binding to different ports but sharing the 
> same call queue and handlers. This is useful when different clients are only 
> allowed to access certain ports. Combined with HDFS-13547, this also allows 
> different ports to have different SASL security levels. 
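As a rough illustration (not the actual ipc.Server code), the design described 
above can be modeled as listeners on different ports all feeding one shared 
call queue that a common set of handlers drains; every name below is 
hypothetical:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of the multi-listener design: calls remember which port
// accepted them, but handlers serve them from a single shared queue.
public class MultiListenerSketch {
    static class Call {
        final int port;
        final String payload;
        Call(int port, String payload) { this.port = port; this.payload = payload; }
    }

    private final BlockingQueue<Call> sharedCallQueue = new LinkedBlockingQueue<>();

    // Each listener only accepts and enqueues; it never handles the call itself.
    public void accept(int port, String payload) {
        sharedCallQueue.add(new Call(port, payload));
    }

    // Handlers drain the queue without caring which port a call arrived on;
    // returns null when no call is pending.
    public String handleNext() {
        Call c = sharedCallQueue.poll();
        return c == null ? null : "handled " + c.payload + " from port " + c.port;
    }

    public static void main(String[] args) {
        MultiListenerSketch server = new MultiListenerSketch();
        server.accept(8020, "getFileInfo"); // default RPC port
        server.accept(8021, "getFileInfo"); // hypothetical additional listener
        System.out.println(server.handleNext()); // handled getFileInfo from port 8020
        System.out.println(server.handleNext()); // handled getFileInfo from port 8021
    }
}
```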






[jira] [Updated] (HDDS-432) Replication of closed containers is not working

2018-09-11 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-432:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

[~elek] Thanks for the contribution. I have committed this to trunk and 
ozone-2.0; while committing, I fixed the comparison issues that I commented 
on earlier.

> Replication of closed containers is not working
> ---
>
> Key: HDDS-432
> URL: https://issues.apache.org/jira/browse/HDDS-432
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Critical
> Fix For: 0.2.1
>
> Attachments: HDDS-432-ozone-0.2.001.patch
>
>
> Steps to reproduce:
> 1. Start a cluster with three datanodes:
> {code}
> docker-compose up -d
> docker-compose scale datanode=3
> {code}
> 2. Create keys:
> {code}
> ozone oz -createVolume /vol1 -user hadoop --quota 1TB --root
> ozone oz -createBucket /vol1/bucket
> dd if=/dev/zero of=/tmp/test bs=1024000 count=512
> ozone oz -putKey /vol1/bucket/file1 -replicationFactor THREE -file /tmp/test
> {code}
> 3. Close the containers with scmcli
> 4. Kill a datanode with a replica
> {code}
> for i in `seq 1 4`; do docker diff ozone_datanode_$i && echo 
> ""; done
> #Choose a datanode with replica
> docker kill ozone_datanode_3
> {code}
>  
> 5. Wait
> 6. After a while the last datanode should contain the chunks (checked with 
> docker diff).






[jira] [Commented] (HDFS-13833) Failed to choose from local rack (location = /default); the second replica is not found, retry choosing ramdomly

2018-09-11 Thread Shweta (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611391#comment-16611391
 ] 

Shweta commented on HDFS-13833:
---

Thank you [~jojochuang], [~xiaochen], [~hexiaoqiao], [~rikeppb100] for your 
insightful discussions and comments.

While working on a similar issue, we saw that considerLoad can misbehave on 
small (1-node) clusters.
When there is no heartbeat, maxLoad will be 0 and the existing logic will 
filter out every node that has any workload, which should be avoided (see the 
"load: 8 > 0.0" message in this issue's log).
To make sure this doesn't happen, I have added an additional check of 
maxLoad > 0 along with nodeLoad > maxLoad, so the method does not reject 
nodes when maxLoad == 0.
I have posted a patch with this change and a unit test to validate the 
behavior. Please review and suggest if any further conditions need to be 
considered or changes are required. Thanks. 
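A minimal sketch of the guard described in the comment above; the method and 
parameter names are illustrative, not the actual BlockPlacementPolicyDefault 
code:

```java
// Hypothetical sketch of the extra maxLoad > 0 condition: a node is only
// excluded as "too busy" when maxLoad is a meaningful positive bound.
public class LoadCheckSketch {
    // True when the node should be excluded because it is too busy.
    static boolean excludedForLoad(double nodeLoad, double maxLoad) {
        // With no heartbeats yet, maxLoad is 0; skipping the comparison then
        // avoids rejecting every node with any workload ("load: 8 > 0.0").
        return maxLoad > 0 && nodeLoad > maxLoad;
    }

    public static void main(String[] args) {
        System.out.println(excludedForLoad(8, 0)); // false: no heartbeat yet
        System.out.println(excludedForLoad(8, 2)); // true: genuinely overloaded
    }
}
```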

> Failed to choose from local rack (location = /default); the second replica is 
> not found, retry choosing ramdomly
> 
>
> Key: HDFS-13833
> URL: https://issues.apache.org/jira/browse/HDFS-13833
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Henrique Barros
>Assignee: Shweta
>Priority: Critical
> Attachments: HDFS-13833.001.patch
>
>
> I'm having a random problem with blocks replication with Hadoop 
> 2.6.0-cdh5.15.0
> With Cloudera CDH-5.15.0-1.cdh5.15.0.p0.21
>  
> In my case we are getting this error very randomly (after some hours) and 
> with only one Datanode (for now, we are trying this cloudera cluster for a 
> POC)
> Here is the Log.
> {code:java}
> Choosing random from 1 available nodes on node /default, scope=/default, 
> excludedScope=null, excludeNodes=[]
> 2:38:20.527 PMDEBUG   NetworkTopology 
> Choosing random from 0 available nodes on node /default, scope=/default, 
> excludedScope=null, excludeNodes=[192.168.220.53:50010]
> 2:38:20.527 PMDEBUG   NetworkTopology 
> chooseRandom returning null
> 2:38:20.527 PMDEBUG   BlockPlacementPolicy
> [
> Node /default/192.168.220.53:50010 [
>   Datanode 192.168.220.53:50010 is not chosen since the node is too busy 
> (load: 8 > 0.0).
> 2:38:20.527 PMDEBUG   NetworkTopology 
> chooseRandom returning 192.168.220.53:50010
> 2:38:20.527 PMINFOBlockPlacementPolicy
> Not enough replicas was chosen. Reason:{NODE_TOO_BUSY=1}
> 2:38:20.527 PMDEBUG   StateChange 
> closeFile: 
> /mobi.me/development/apps/flink/checkpoints/a5a6806866c1640660924ea1453cbe34/chk-2118/eef8bff6-75a9-43c1-ae93-4b1a9ca31ad9
>  with 1 blocks is persisted to the file system
> 2:38:20.527 PMDEBUG   StateChange 
> *BLOCK* NameNode.addBlock: file 
> /mobi.me/development/apps/flink/checkpoints/a5a6806866c1640660924ea1453cbe34/chk-2118/1cfe900d-6f45-4b55-baaa-73c02ace2660
>  fileId=129628869 for DFSClient_NONMAPREDUCE_467616914_65
> 2:38:20.527 PMDEBUG   BlockPlacementPolicy
> Failed to choose from local rack (location = /default); the second replica is 
> not found, retry choosing ramdomly
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
>  
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:784)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:694)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:601)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:561)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:464)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:395)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:270)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:142)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:158)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1715)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3505)
>   at 
> 

[jira] [Updated] (HDFS-13833) Failed to choose from local rack (location = /default); the second replica is not found, retry choosing ramdomly

2018-09-11 Thread Shweta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shweta updated HDFS-13833:
--
Attachment: HDFS-13833.001.patch
Status: Patch Available  (was: Open)

> Failed to choose from local rack (location = /default); the second replica is 
> not found, retry choosing ramdomly
> 
>
> Key: HDFS-13833
> URL: https://issues.apache.org/jira/browse/HDFS-13833
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Henrique Barros
>Assignee: Shweta
>Priority: Critical
> Attachments: HDFS-13833.001.patch
>
>
> I'm having a random problem with blocks replication with Hadoop 
> 2.6.0-cdh5.15.0
> With Cloudera CDH-5.15.0-1.cdh5.15.0.p0.21
>  
> In my case we are getting this error very randomly (after some hours) and 
> with only one Datanode (for now, we are trying this cloudera cluster for a 
> POC)
> Here is the Log.
> {code:java}
> Choosing random from 1 available nodes on node /default, scope=/default, 
> excludedScope=null, excludeNodes=[]
> 2:38:20.527 PMDEBUG   NetworkTopology 
> Choosing random from 0 available nodes on node /default, scope=/default, 
> excludedScope=null, excludeNodes=[192.168.220.53:50010]
> 2:38:20.527 PMDEBUG   NetworkTopology 
> chooseRandom returning null
> 2:38:20.527 PMDEBUG   BlockPlacementPolicy
> [
> Node /default/192.168.220.53:50010 [
>   Datanode 192.168.220.53:50010 is not chosen since the node is too busy 
> (load: 8 > 0.0).
> 2:38:20.527 PMDEBUG   NetworkTopology 
> chooseRandom returning 192.168.220.53:50010
> 2:38:20.527 PMINFOBlockPlacementPolicy
> Not enough replicas was chosen. Reason:{NODE_TOO_BUSY=1}
> 2:38:20.527 PMDEBUG   StateChange 
> closeFile: 
> /mobi.me/development/apps/flink/checkpoints/a5a6806866c1640660924ea1453cbe34/chk-2118/eef8bff6-75a9-43c1-ae93-4b1a9ca31ad9
>  with 1 blocks is persisted to the file system
> 2:38:20.527 PMDEBUG   StateChange 
> *BLOCK* NameNode.addBlock: file 
> /mobi.me/development/apps/flink/checkpoints/a5a6806866c1640660924ea1453cbe34/chk-2118/1cfe900d-6f45-4b55-baaa-73c02ace2660
>  fileId=129628869 for DFSClient_NONMAPREDUCE_467616914_65
> 2:38:20.527 PMDEBUG   BlockPlacementPolicy
> Failed to choose from local rack (location = /default); the second replica is 
> not found, retry choosing ramdomly
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
>  
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:784)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:694)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:601)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:561)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:464)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:395)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:270)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:142)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:158)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1715)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3505)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:694)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:219)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:507)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>   at 

[jira] [Comment Edited] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611382#comment-16611382
 ] 

Anu Engineer edited comment on HDDS-424 at 9/11/18 11:58 PM:
-

[~dineshchitlangia],[~arpitagarwal] Thanks for comments and reviews. [~elek] 
Thanks for the contribution. May the force be with you. I have committed this 
patch to trunk and ozone-2.0


was (Author: anu):
[~dineshchitlangia],[~arpitagarwal] Thanks for comments and reviews. [~elek] 
Thanks for the contribution. May the force be with you.

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is:
> 1. avoid camelCase arguments/flags
> 2. use a double dash with words (--user) and a single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required






[jira] [Updated] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-424:
--
Fix Version/s: 0.3.0

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is:
> 1. avoid camelCase arguments/flags
> 2. use a double dash with words (--user) and a single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required






[jira] [Updated] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-424:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

[~dineshchitlangia],[~arpitagarwal] Thanks for comments and reviews. [~elek] 
Thanks for the contribution. May the force be with you.

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is:
> 1. avoid camelCase arguments/flags
> 2. use a double dash with words (--user) and a single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required






[jira] [Commented] (HDFS-13910) Improve BlockPlacementPolicy for small clusters

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611381#comment-16611381
 ] 

Hadoop QA commented on HDFS-13910:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 46s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 45 unchanged - 0 fixed = 48 total (was 45) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 25s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 18s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13910 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939313/HDFS-13910.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8acd8688f2c9 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 598380a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25039/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25039/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25039/testReport/ |
| Max. process+thread count | 

[jira] [Commented] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611378#comment-16611378
 ] 

Anu Engineer commented on HDDS-424:
---

cc: [~nilotpalnandi], [~namit081] This changes the Command line syntax. Might 
have to update your scripts.

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is
> 1. avoid camelCase arguments/flags
> 2. use double dash with words (--user) and single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required






[jira] [Commented] (HDDS-428) OzoneManager lock optimization

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611376#comment-16611376
 ] 

Anu Engineer commented on HDDS-428:
---

Excellent patch. I am almost a +1; I have some very minor comments.
 # ActiveLock#resetCounter - instead of allocating a new object we can set the 
value, as {{count.set(0);}}
 # We seem to hold the bucket lock while deleting a bucket. Should we also hold 
the volume lock?
 # List Buckets – looks like we are holding the volume lock; do we need it? We 
use an iterator internally which relies on the snapshot.
 # KeyManagerImpl#allocateBlock - I understand that we support volume and 
bucket locks, yet for allocating a block we take the bucket lock. I am going to 
argue we don't need any locks here. Here is what I am thinking; please correct 
me if I am wrong. Only a client that holds the handle of an OpenKey can do 
allocateBlock for that key. Hence allocateBlock can be done without holding a 
lock.

> OzoneManager lock optimization
> --
>
> Key: HDDS-428
> URL: https://issues.apache.org/jira/browse/HDDS-428
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: OM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-428.000.patch
>
>
> Currently, {{OzoneManager}} uses a single lock for everything which impacts 
> the performance. We can introduce a separate lock for each resource like 
> User/Volume/Bucket which will give us a performance boost.
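A minimal sketch of the per-resource locking the description proposes, assuming one lock keyed by resource name (class and method names are stand-ins, not the real OzoneManager code):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of per-resource locking: one lock per user/volume/bucket name
// instead of a single global OzoneManager lock, so operations on
// different resources no longer serialize on the same monitor.
public class ResourceLockSketch {
    private final ConcurrentHashMap<String, ReentrantLock> locks =
        new ConcurrentHashMap<>();

    public void acquire(String resource) {
        // computeIfAbsent creates the lock lazily on first use
        locks.computeIfAbsent(resource, r -> new ReentrantLock()).lock();
    }

    public void release(String resource) {
        ReentrantLock l = locks.get(resource);
        if (l != null) {
            l.unlock();
        }
    }

    public int lockCount() {
        return locks.size();
    }
}
```

With this layout, writes to bucket A and bucket B of the same volume proceed concurrently; only operations on the same resource contend.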






[jira] [Commented] (HDDS-416) Remove currentPosition from ChunkInputStreamEntry

2018-09-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611349#comment-16611349
 ] 

Hudson commented on HDDS-416:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14924 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14924/])
HDDS-416. Remove currentPosition from ChunkInputStreamEntry. Contributed (xyao: 
rev 1d567c25d0c97603f35ab5c789217df8ec6893d7)
* (edit) 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ChunkGroupInputStream.java


> Remove currentPosition from ChunkInputStreamEntry
> -
>
> Key: HDDS-416
> URL: https://issues.apache.org/jira/browse/HDDS-416
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-416.001.patch
>
>
> ChunkInputStreamEntry maintains currentPosition field. This field is 
> redundant and can be replaced by getPos().






[jira] [Commented] (HDDS-233) Update ozone to latest ratis snapshot build

2018-09-11 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611340#comment-16611340
 ] 

Tsz Wo Nicholas Sze commented on HDDS-233:
--

I have just deployed a new Ratis snapshot, 0.3.0-50588bd-SNAPSHOT.

[~shashikant], please feel free to add more code to the patch.  Thanks.

> Update ozone to latest ratis snapshot build
> ---
>
> Key: HDDS-233
> URL: https://issues.apache.org/jira/browse/HDDS-233
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-233_20180911.patch
>
>
> This jira proposes to update ozone to latest ratis snapshot build. This jira 
> also will add config to set append entry timeout as well as controlling the 
> number of entries in retry cache.






[jira] [Commented] (HDFS-13846) Safe blocks counter is not decremented correctly if the block is striped

2018-09-11 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611338#comment-16611338
 ] 

Daniel Templeton commented on HDFS-13846:
-

That sounds good to me.  Hmmm...  I'm wondering why there hasn't been a Jenkins 
run.  Lemme go kick it.

> Safe blocks counter is not decremented correctly if the block is striped
> 
>
> Key: HDFS-13846
> URL: https://issues.apache.org/jira/browse/HDFS-13846
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13846.001.patch, HDFS-13846.002.patch, 
> HDFS-13846.003.patch, HDFS-13846.004.patch
>
>
> In BlockManagerSafeMode class, the "safe blocks" counter is incremented if 
> the number of nodes containing the block equals the number of data units 
> specified by the erasure coding policy, which looks like this in the code:
> {code:java}
> final int safe = storedBlock.isStriped() ?
> ((BlockInfoStriped)storedBlock).getRealDataBlockNum() : 
> safeReplication;
> if (storageNum == safe) {
>   this.blockSafe++;
> {code}
> But when it is decremented, the code does not check whether the block is 
> striped; it just compares the number of nodes containing the block with 0 
> (safeReplication - 1) if the block is complete, which is not correct.
> {code:java}
> if (storedBlock.isComplete() &&
> blockManager.countNodes(b).liveReplicas() == safeReplication - 1) {
>   this.blockSafe--;
>   assert blockSafe >= 0;
>   checkSafeMode();
> }
> {code}
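The symmetric fix the report implies — computing the same striped-aware threshold on decrement as on increment — can be sketched as follows (the static helper and its parameters are stand-ins for the real BlockManagerSafeMode fields):

```java
public class SafeBlockCountSketch {
    // Mirrors the increment-side computation from the issue description:
    // striped blocks are "safe" at realDataBlockNum live replicas,
    // contiguous blocks at safeReplication. A block should leave the safe
    // set when its live replica count drops one below that threshold.
    public static boolean shouldDecrement(boolean isStriped,
            int realDataBlockNum, int safeReplication, int liveReplicas) {
        int safe = isStriped ? realDataBlockNum : safeReplication;
        return liveReplicas == safe - 1;
    }

    public static void main(String[] args) {
        // RS-6-3 striped block: threshold is 6 data units,
        // not safeReplication.
        System.out.println(shouldDecrement(true, 6, 1, 5));   // true
        System.out.println(shouldDecrement(true, 6, 1, 0));   // false
    }
}
```

The buggy code effectively hard-codes `safe = safeReplication` on the decrement path, so striped blocks are never removed from the safe count when they lose replicas.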






[jira] [Updated] (HDDS-416) Remove currentPosition from ChunkInputStreamEntry

2018-09-11 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-416:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks [~ljain] for the contribution and [~nandakumar131] for the reviews. I've 
committed the patch to trunk and ozone-0.2.

> Remove currentPosition from ChunkInputStreamEntry
> -
>
> Key: HDDS-416
> URL: https://issues.apache.org/jira/browse/HDDS-416
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-416.001.patch
>
>
> ChunkInputStreamEntry maintains currentPosition field. This field is 
> redundant and can be replaced by getPos().






[jira] [Commented] (HDDS-362) Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol

2018-09-11 Thread Jitendra Nath Pandey (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611334#comment-16611334
 ] 

Jitendra Nath Pandey commented on HDDS-362:
---

We need a configuration to disable chill mode for test deployments. We could 
just set the threshold to zero, and that would cause SCM to exit chill mode 
immediately.

The test frameworks should be updated to deploy with chill-mode disabled for 
the test cases that don't like chill mode checks.
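The zero-threshold behavior described above can be sketched like this (the class and the way the threshold is wired in are hypothetical, not the actual SCM code):

```java
// Sketch: a report threshold of zero makes SCM leave chill mode
// immediately, which effectively disables chill mode for test
// deployments without a separate on/off switch.
public class ChillModeSketch {
    private final double threshold;   // fraction of expected reports required
    private boolean inChillMode = true;

    public ChillModeSketch(double threshold) {
        this.threshold = threshold;
        // With threshold 0, even zero reports satisfy the exit condition.
        checkExit(0, 1);
    }

    private void checkExit(int reported, int expected) {
        if ((double) reported / expected >= threshold) {
            inChillMode = false;
        }
    }

    public boolean isInChillMode() {
        return inChillMode;
    }
}
```

Test frameworks would then deploy with the threshold set to 0 for cases that should not be subject to chill mode checks.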

> Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol
> ---
>
> Key: HDDS-362
> URL: https://issues.apache.org/jira/browse/HDDS-362
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-362.00.patch, HDDS-362.01.patch
>
>
> [HDDS-351] adds chill mode state to SCM. When SCM is in chill mode certain 
> operations will be restricted for end users. This jira intends to modify 
> functions impacted by SCM chill mode in ScmBlockLocationProtocol.






[jira] [Commented] (HDDS-233) Update ozone to latest ratis snapshot build

2018-09-11 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611330#comment-16611330
 ] 

Tsz Wo Nicholas Sze commented on HDDS-233:
--

HDDS-233_20180911.patch: use the new Raft group APIs.

> Update ozone to latest ratis snapshot build
> ---
>
> Key: HDDS-233
> URL: https://issues.apache.org/jira/browse/HDDS-233
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-233_20180911.patch
>
>
> This jira proposes to update ozone to latest ratis snapshot build. This jira 
> also will add config to set append entry timeout as well as controlling the 
> number of entries in retry cache.






[jira] [Updated] (HDDS-233) Update ozone to latest ratis snapshot build

2018-09-11 Thread Tsz Wo Nicholas Sze (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDDS-233:
-
Status: Patch Available  (was: Open)

> Update ozone to latest ratis snapshot build
> ---
>
> Key: HDDS-233
> URL: https://issues.apache.org/jira/browse/HDDS-233
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-233_20180911.patch
>
>
> This jira proposes to update ozone to latest ratis snapshot build. This jira 
> also will add config to set append entry timeout as well as controlling the 
> number of entries in retry cache.






[jira] [Updated] (HDDS-233) Update ozone to latest ratis snapshot build

2018-09-11 Thread Tsz Wo Nicholas Sze (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDDS-233:
-
Attachment: HDDS-233_20180911.patch

> Update ozone to latest ratis snapshot build
> ---
>
> Key: HDDS-233
> URL: https://issues.apache.org/jira/browse/HDDS-233
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-233_20180911.patch
>
>
> This jira proposes to update ozone to latest ratis snapshot build. This jira 
> also will add config to set append entry timeout as well as controlling the 
> number of entries in retry cache.






[jira] [Updated] (HDDS-362) Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol

2018-09-11 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-362:

Description: [HDDS-351] adds chill mode state to SCM. When SCM is in chill 
mode certain operations will be restricted for end users. This jira intends to 
modify functions impacted by SCM chill mode in ScmBlockLocationProtocol.  (was: 
Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol)

> Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol
> ---
>
> Key: HDDS-362
> URL: https://issues.apache.org/jira/browse/HDDS-362
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-362.00.patch, HDDS-362.01.patch
>
>
> [HDDS-351] adds chill mode state to SCM. When SCM is in chill mode certain 
> operations will be restricted for end users. This jira intends to modify 
> functions impacted by SCM chill mode in ScmBlockLocationProtocol.






[jira] [Updated] (HDDS-416) Remove currentPosition from ChunkInputStreamEntry

2018-09-11 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-416:

Summary: Remove currentPosition from ChunkInputStreamEntry  (was: Fix bug 
in ChunkInputStreamEntry)

> Remove currentPosition from ChunkInputStreamEntry
> -
>
> Key: HDDS-416
> URL: https://issues.apache.org/jira/browse/HDDS-416
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-416.001.patch
>
>
> ChunkInputStreamEntry maintains currentPosition field. This field is 
> redundant and can be replaced by getPos().






[jira] [Updated] (HDDS-362) Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol

2018-09-11 Thread Jitendra Nath Pandey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDDS-362:
--
Fix Version/s: (was: 0.3.0)
   0.2.1

> Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol
> ---
>
> Key: HDDS-362
> URL: https://issues.apache.org/jira/browse/HDDS-362
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-362.00.patch, HDDS-362.01.patch
>
>
> Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol






[jira] [Updated] (HDDS-411) Add Ozone submodule to the hadoop.apache.org

2018-09-11 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-411:
--
Fix Version/s: (was: 0.2.1)
   0.3.0

> Add Ozone submodule to the hadoop.apache.org
> 
>
> Key: HDDS-411
> URL: https://issues.apache.org/jira/browse/HDDS-411
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Dinesh Chitlangia
>Priority: Trivial
>  Labels: site
> Fix For: 0.3.0
>
> Attachments: HDDS-411.001.patch, HDDS-411.002.patch
>
>
> The current hadoop.apache.org doesn't mention Ozone in the "Modules" section.
> We can add something like this (or better):
> {quote}Hadoop Ozone is an object store for Hadoop on top of the Hadoop HDDS 
> which provides low-level binary storage layer.
> {quote}
> We can also link to the 
> [http://ozone.hadoop.apache.org|http://ozone.hadoop.apache.org/]






[jira] [Commented] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611317#comment-16611317
 ] 

Arpit Agarwal commented on HDDS-424:


[~anu] please feel free to commit if it looks good to you.

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is
> 1. avoid camelCase arguments/flags
> 2. use double dash with words (--user) and single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required






[jira] [Updated] (HDFS-13910) Improve BlockPlacementPolicy for small clusters

2018-09-11 Thread Shweta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shweta updated HDFS-13910:
--
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

> Improve BlockPlacementPolicy for small clusters
> ---
>
> Key: HDFS-13910
> URL: https://issues.apache.org/jira/browse/HDFS-13910
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Shweta
>Assignee: Shweta
>Priority: Critical
> Attachments: HDFS-13910.001.patch
>
>
> From investigations and a few test occurrences, the NameNode 
> BlockPlacementPolicy’s considerLoad can be bad for small test clusters.
> A small (1-node) cluster may trigger a corner case of maxLoad = 0. In this 
> case, filtering should not take place. 
> When there is no heartbeat, maxLoad will be 0 and the existing logic will 
> filter out many nodes that have workload, which should be avoided.
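The corner case can be sketched as a guard that skips the load filter when maxLoad is zero (names are illustrative, not the actual BlockPlacementPolicyDefault code):

```java
// Sketch of the considerLoad corner case: with no heartbeats yet, the
// average cluster load (and hence maxLoad derived from it) is 0, so the
// check "node load > maxLoad" rejects every node doing any work — e.g.
// the "load: 8 > 0.0" rejection seen in the HDFS-13833 log. The guard
// below disables the load filter when maxLoad is 0.
public class ConsiderLoadSketch {
    public static boolean isGoodTarget(int nodeLoad, double maxLoad) {
        if (maxLoad > 0 && nodeLoad > maxLoad) {
            return false;   // node too busy relative to cluster average
        }
        return true;        // maxLoad == 0: do not filter on load
    }

    public static void main(String[] args) {
        // 1-node cluster with maxLoad 0: the node must still be usable.
        System.out.println(isGoodTarget(8, 0.0));  // true
        System.out.println(isGoodTarget(8, 4.0));  // false
    }
}
```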






[jira] [Commented] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611307#comment-16611307
 ] 

Hadoop QA commented on HDFS-13906:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 57s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch 
generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 59s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.fs.contract.router.web.TestRouterWebHDFSContractAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13906 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939309/HDFS-13906-02.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 2a7815202eff 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 598380a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25038/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25038/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25038/testReport/ |
| Max. process+thread count | 951 (vs. 

[jira] [Assigned] (HDFS-13833) Failed to choose from local rack (location = /default); the second replica is not found, retry choosing ramdomly

2018-09-11 Thread Shweta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shweta reassigned HDFS-13833:
-

Assignee: Shweta

> Failed to choose from local rack (location = /default); the second replica is 
> not found, retry choosing ramdomly
> 
>
> Key: HDFS-13833
> URL: https://issues.apache.org/jira/browse/HDFS-13833
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Henrique Barros
>Assignee: Shweta
>Priority: Critical
>
> I'm having a random problem with blocks replication with Hadoop 
> 2.6.0-cdh5.15.0
> With Cloudera CDH-5.15.0-1.cdh5.15.0.p0.21
>  
> In my case we are getting this error very randomly (after some hours) and 
> with only one Datanode (for now, we are trying this cloudera cluster for a 
> POC)
> Here is the Log.
> {code:java}
> Choosing random from 1 available nodes on node /default, scope=/default, 
> excludedScope=null, excludeNodes=[]
> 2:38:20.527 PMDEBUG   NetworkTopology 
> Choosing random from 0 available nodes on node /default, scope=/default, 
> excludedScope=null, excludeNodes=[192.168.220.53:50010]
> 2:38:20.527 PMDEBUG   NetworkTopology 
> chooseRandom returning null
> 2:38:20.527 PMDEBUG   BlockPlacementPolicy
> [
> Node /default/192.168.220.53:50010 [
>   Datanode 192.168.220.53:50010 is not chosen since the node is too busy 
> (load: 8 > 0.0).
> 2:38:20.527 PMDEBUG   NetworkTopology 
> chooseRandom returning 192.168.220.53:50010
> 2:38:20.527 PMINFOBlockPlacementPolicy
> Not enough replicas was chosen. Reason:{NODE_TOO_BUSY=1}
> 2:38:20.527 PMDEBUG   StateChange 
> closeFile: 
> /mobi.me/development/apps/flink/checkpoints/a5a6806866c1640660924ea1453cbe34/chk-2118/eef8bff6-75a9-43c1-ae93-4b1a9ca31ad9
>  with 1 blocks is persisted to the file system
> 2:38:20.527 PMDEBUG   StateChange 
> *BLOCK* NameNode.addBlock: file 
> /mobi.me/development/apps/flink/checkpoints/a5a6806866c1640660924ea1453cbe34/chk-2118/1cfe900d-6f45-4b55-baaa-73c02ace2660
>  fileId=129628869 for DFSClient_NONMAPREDUCE_467616914_65
> 2:38:20.527 PMDEBUG   BlockPlacementPolicy
> Failed to choose from local rack (location = /default); the second replica is 
> not found, retry choosing ramdomly
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
>  
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:784)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:694)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:601)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:561)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:464)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:395)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:270)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:142)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:158)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1715)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3505)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:694)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:219)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:507)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
>   at 

[jira] [Comment Edited] (HDFS-13910) Improve BlockPlacementPolicy for small clusters

2018-09-11 Thread Shweta (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611296#comment-16611296
 ] 

Shweta edited comment on HDFS-13910 at 9/11/18 9:50 PM:


Hi [~xiaochen],

Thank you for helping me on this.
I have added a patch for the corner case where there is no heartbeat and 
maxLoad is 0. To make sure that nodes which have some workload are not 
filtered out, I added an additional check of maxLoad > 0 along with 
nodeLoad > maxLoad, so that nodes are not filtered when maxLoad == 0.
Please review.
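A minimal sketch of the guard described above. The method name excludeNodeByLoad and its parameters are illustrative stand-ins, not the actual Hadoop API; the point is only the maxLoad > 0 check added alongside nodeLoad > maxLoad:

```java
public class ConsiderLoadSketch {
    // True when the node should be filtered out as too busy.
    static boolean excludeNodeByLoad(double nodeLoad, double maxLoad) {
        // With no heartbeats yet, maxLoad == 0; skip load filtering entirely
        // so loaded nodes are not rejected in that corner case.
        return maxLoad > 0 && nodeLoad > maxLoad;
    }

    public static void main(String[] args) {
        // No-heartbeat corner case: a loaded node is NOT excluded.
        System.out.println(excludeNodeByLoad(8.0, 0.0)); // false
        // Normal case: a node busier than the threshold is excluded.
        System.out.println(excludeNodeByLoad(8.0, 4.0)); // true
    }
}
```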


was (Author: shwetayakkali):
Hi [~xiaochen],

I have added a patch for the corner case where there is no heartbeat and 
maxLoad is 0. To make sure that nodes which have some workload are not 
filtered out, I added an additional check of maxLoad > 0 along with 
nodeLoad > maxLoad, so that nodes are not filtered when maxLoad == 0.

> Improve BlockPlacementPolicy for small clusters
> ---
>
> Key: HDFS-13910
> URL: https://issues.apache.org/jira/browse/HDFS-13910
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Shweta
>Assignee: Shweta
>Priority: Critical
> Attachments: HDFS-13910.001.patch
>
>
> From investigations and a few test occurrences, the NameNode 
> BlockPlacementPolicy’s considerLoad can be bad for small test clusters.
> A small (1-node) cluster may trigger a corner case of maxLoad = 0. In this 
> case, filtering should not take place. 
> When there is no heartbeat, maxLoad will be 0 and the existing logic will 
> filter out many nodes that have workload, which should be avoided.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13910) Improve BlockPlacementPolicy for small clusters

2018-09-11 Thread Shweta (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611296#comment-16611296
 ] 

Shweta commented on HDFS-13910:
---

Hi [~xiaochen],

I have added a patch for the corner case where there is no heartbeat and 
maxLoad is 0. To make sure that nodes which have some workload are not 
filtered out, I added an additional check of maxLoad > 0 along with 
nodeLoad > maxLoad, so that nodes are not filtered when maxLoad == 0.

> Improve BlockPlacementPolicy for small clusters
> ---
>
> Key: HDFS-13910
> URL: https://issues.apache.org/jira/browse/HDFS-13910
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Shweta
>Assignee: Shweta
>Priority: Critical
> Attachments: HDFS-13910.001.patch
>
>
> From investigations and a few test occurrences, the NameNode 
> BlockPlacementPolicy’s considerLoad can be bad for small test clusters.
> A small (1-node) cluster may trigger a corner case of maxLoad = 0. In this 
> case, filtering should not take place. 
> When there is no heartbeat, maxLoad will be 0 and the existing logic will 
> filter out many nodes that have workload, which should be avoided.






[jira] [Commented] (HDDS-428) OzoneManager lock optimization

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611271#comment-16611271
 ] 

Hadoop QA commented on HDDS-428:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  8m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
7s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
39s{color} | {color:green} ozone-manager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m 15s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDDS-428 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939261/HDDS-428.000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  findbugs  checkstyle  |
| uname | Linux f38eb5a013a6 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 598380a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 

[jira] [Updated] (HDFS-13910) Improve BlockPlacementPolicy for small clusters

2018-09-11 Thread Shweta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shweta updated HDFS-13910:
--
Attachment: HDFS-13910.001.patch
Status: Patch Available  (was: Open)

> Improve BlockPlacementPolicy for small clusters
> ---
>
> Key: HDFS-13910
> URL: https://issues.apache.org/jira/browse/HDFS-13910
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Shweta
>Assignee: Shweta
>Priority: Critical
> Attachments: HDFS-13910.001.patch
>
>
> From investigations and a few test occurrences, the NameNode 
> BlockPlacementPolicy’s considerLoad can be bad for small test clusters.
> A small (1-node) cluster may trigger a corner case of maxLoad = 0. In this 
> case, filtering should not take place. 
> When there is no heartbeat, maxLoad will be 0 and the existing logic will 
> filter out many nodes that have workload, which should be avoided.






[jira] [Created] (HDFS-13910) Improve BlockPlacementPolicy for small clusters

2018-09-11 Thread Shweta (JIRA)
Shweta created HDFS-13910:
-

 Summary: Improve BlockPlacementPolicy for small clusters
 Key: HDFS-13910
 URL: https://issues.apache.org/jira/browse/HDFS-13910
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Reporter: Shweta
Assignee: Shweta


From investigations and a few test occurrences, the NameNode 
BlockPlacementPolicy’s considerLoad can be bad for small test clusters.
A small (1-node) cluster may trigger a corner case of maxLoad = 0. In this 
case, filtering should not take place. 

When there is no heartbeat, maxLoad will be 0 and the existing logic will 
filter out many nodes that have workload, which should be avoided.






[jira] [Commented] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands

2018-09-11 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611222#comment-16611222
 ] 

Ayush Saxena commented on HDFS-13906:
-

Thanks [~elgoiri] for the comment.
I have uploaded patch v2 with the changes from your comment and the 
checkstyle warnings addressed.
Please review. :)

> RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands
> ---
>
> Key: HDFS-13906
> URL: https://issues.apache.org/jira/browse/HDFS-13906
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13906-01.patch, HDFS-13906-02.patch
>
>
> Currently we have the option to delete only one mount entry at a time.
> If there are multiple mount entries, it is difficult for the user to 
> execute the command N times.
> It would be better if the "rm" and "clrQuota" commands supported multiple 
> entries, so the user could provide all the required entries in one single 
> command.
> The Namenode already supports "rm" and "clrQuota" with multiple destinations.
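The proposed behavior can be sketched as a simple loop over the paths given to one "rm" invocation. removeMountEntry is a hypothetical stub, not the real RouterAdmin API:

```java
import java.util.Arrays;
import java.util.List;

public class MultiPathRmSketch {
    // Walks every path given on the command line instead of a single entry.
    static int removeAll(List<String> paths) {
        int removed = 0;
        for (String path : paths) {
            if (removeMountEntry(path)) {
                removed++;
            }
        }
        return removed;
    }

    // Stub for illustration; a real command would update the mount table store.
    static boolean removeMountEntry(String path) {
        System.out.println("Successfully removed mount point " + path);
        return true;
    }

    public static void main(String[] args) {
        // e.g. one invocation covering two mount entries at once.
        removeAll(Arrays.asList("/apps3", "/apps5"));
    }
}
```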






[jira] [Updated] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands

2018-09-11 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13906:

Attachment: HDFS-13906-02.patch

> RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands
> ---
>
> Key: HDFS-13906
> URL: https://issues.apache.org/jira/browse/HDFS-13906
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13906-01.patch, HDFS-13906-02.patch
>
>
> Currently we have the option to delete only one mount entry at a time.
> If there are multiple mount entries, it is difficult for the user to 
> execute the command N times.
> It would be better if the "rm" and "clrQuota" commands supported multiple 
> entries, so the user could provide all the required entries in one single 
> command.
> The Namenode already supports "rm" and "clrQuota" with multiple destinations.






[jira] [Resolved] (HDFS-13900) NameNode: Unable to trigger a roll of the active NN

2018-09-11 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri resolved HDFS-13900.

Resolution: Duplicate

> NameNode: Unable to trigger a roll of the active NN
> ---
>
> Key: HDFS-13900
> URL: https://issues.apache.org/jira/browse/HDFS-13900
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuhongtong
>Priority: Critical
>
> I have backported multi-standby NNs to our own HDFS version and found an 
> issue with EditLog rolling.
> h2. Reproducible Steps:
> h3. 1.original state
> nn1 active
> nn2 standby
> nn3 standby
> h3. 2. stop nn1
> h3. 3. new state
> nn1 stopped
> nn2 active
> nn3 standby
> h3. 4. nn3 unable to trigger a roll of the active NN
> [2018-08-22T10:33:38.025+08:00] [WARN] 
> namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java 307) [Edit 
> log tailer] : Unable to trigger a roll of the active NN
> java.net.ConnectException: Call From  to  failed 
> on connection exception: java.net.ConnectException: Connection refused; For 
> more details see:[http://wiki.apache.org/hadoop/ConnectionRefused]
> at sun.reflect.GeneratedConstructorAccessor17.newInstance(Unknown Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:782)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:722)
> at org.apache.hadoop.ipc.Client.call(Client.java:1536)
> at org.apache.hadoop.ipc.Client.call(Client.java:1463)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:237)
> at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:148)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:301)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:298)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$MultipleNameNodeProxy.call(EditLogTailer.java:414)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:304)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$800(EditLogTailer.java:69)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:346)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:315)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:332)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:328)
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:521)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:485)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:658)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:756)
> at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:419)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1585)
> at org.apache.hadoop.ipc.Client.call(Client.java:1502)
> ... 14 more
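The failure mode above (the tailer keeps contacting the stopped nn1 and gives up on a connection refusal) suggests trying each candidate NN in turn. This is a minimal sketch of that idea under stated assumptions; NameNodeProxy and rollEditLog are stand-ins for the real NamenodeProtocol proxy, not Hadoop's actual classes:

```java
import java.util.Arrays;
import java.util.List;

public class TriggerRollSketch {
    interface NameNodeProxy {
        void rollEditLog() throws Exception;
    }

    // Returns the index of the first proxy that accepted the roll, or -1.
    static int triggerActiveLogRoll(List<NameNodeProxy> proxies) {
        for (int i = 0; i < proxies.size(); i++) {
            try {
                proxies.get(i).rollEditLog();
                return i;
            } catch (Exception e) {
                // e.g. "Connection refused" from the stopped nn1: try the next NN.
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        NameNodeProxy down = () -> { throw new Exception("Connection refused"); };
        NameNodeProxy active = () -> { /* roll accepted */ };
        // nn1 is down, nn2 is active: the roll succeeds on the second proxy.
        System.out.println(triggerActiveLogRoll(Arrays.asList(down, active))); // 1
    }
}
```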






[jira] [Commented] (HDFS-13900) NameNode: Unable to trigger a roll of the active NN

2018-09-11 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611195#comment-16611195
 ] 

Íñigo Goiri commented on HDFS-13900:


I'm closing this as a duplicate of HADOOP-15684.
We would appreciate a review there.

> NameNode: Unable to trigger a roll of the active NN
> ---
>
> Key: HDFS-13900
> URL: https://issues.apache.org/jira/browse/HDFS-13900
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuhongtong
>Priority: Critical
>
> I have backported multi-standby NNs to our own HDFS version and found an 
> issue with EditLog rolling.
> h2. Reproducible Steps:
> h3. 1.original state
> nn1 active
> nn2 standby
> nn3 standby
> h3. 2. stop nn1
> h3. 3. new state
> nn1 stopped
> nn2 active
> nn3 standby
> h3. 4. nn3 unable to trigger a roll of the active NN
> [2018-08-22T10:33:38.025+08:00] [WARN] 
> namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java 307) [Edit 
> log tailer] : Unable to trigger a roll of the active NN
> java.net.ConnectException: Call From  to  failed 
> on connection exception: java.net.ConnectException: Connection refused; For 
> more details see:[http://wiki.apache.org/hadoop/ConnectionRefused]
> at sun.reflect.GeneratedConstructorAccessor17.newInstance(Unknown Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:782)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:722)
> at org.apache.hadoop.ipc.Client.call(Client.java:1536)
> at org.apache.hadoop.ipc.Client.call(Client.java:1463)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:237)
> at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:148)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:301)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:298)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$MultipleNameNodeProxy.call(EditLogTailer.java:414)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:304)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$800(EditLogTailer.java:69)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:346)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:315)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:332)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:328)
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:521)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:485)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:658)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:756)
> at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:419)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1585)
> at org.apache.hadoop.ipc.Client.call(Client.java:1502)
> ... 14 more






[jira] [Commented] (HDDS-395) TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611154#comment-16611154
 ] 

Hadoop QA commented on HDDS-395:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
54s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDDS-395 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939299/HDDS-395.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8597a8a2ccb5 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 598380a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDDS-Build/1032/testReport/ |
| Max. process+thread count | 406 (vs. ulimit of 1) |
| modules | C: hadoop-hdds/common U: hadoop-hdds/common |
| Console output | 
https://builds.apache.org/job/PreCommit-HDDS-Build/1032/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"
> 

[jira] [Commented] (HDFS-13156) HDFS Block Placement Policy - Client Local Rack

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611152#comment-16611152
 ] 

Hadoop QA commented on HDFS-13156:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
31m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13156 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939300/HDFS-13156-01.patch |
| Optional Tests |  dupname  asflicense  mvnsite  |
| uname | Linux fd1381894413 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 598380a |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 288 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25035/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> HDFS Block Placement Policy - Client Local Rack
> ---
>
> Key: HDFS-13156
> URL: https://issues.apache.org/jira/browse/HDFS-13156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.9.0, 3.2.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: Ayush Saxena
>Priority: Minor
> Attachments: HDFS-13156-01.patch
>
>
> {quote}For the common case, when the replication factor is three, HDFS’s 
> placement policy is to put one replica on the local machine if the writer is 
> on a datanode, otherwise on a random datanode, another replica on a node in a 
> different (remote) rack, and the last on a different node in the same remote 
> rack.
> {quote}
> [https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Replica_Placement:_The_First_Baby_Steps]
> Having just looked over the Default Block Placement code, the way I 
> understand it, there are three basic scenarios:
>  # HDFS client is running on a datanode inside the cluster
>  # HDFS client is running on a node outside the cluster
>  # HDFS client is running on a non-datanode inside the cluster
> The documentation is ambiguous concerning the third scenario. Please correct 
> me if I'm wrong, but the way I understand the code, if there is an HDFS 
> client inside the cluster, but it is not on a datanode, the first block will 
> be placed on a datanode within the set of datanodes available on the local 
> rack and not simply on any _random datanode_ from the set of all datanodes in 
> the cluster.
> That is to say, if one rack has an HDFS Sink Flume Agent on a dedicated node, 
> I should expect that every first block will be written to a _random datanode_ 
> on the same rack as the HDFS Flume agent, 
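
The three scenarios above can be sketched as a toy placement rule. This is illustrative only; the helper names and structure are hypothetical and are not Hadoop's actual BlockPlacementPolicyDefault code:

```java
import java.util.*;

// Toy model of the three writer scenarios; hypothetical names,
// NOT the actual BlockPlacementPolicyDefault implementation.
public class PlacementSketch {

    // rackOf maps every known in-cluster host (datanode or not) to its
    // rack; hosts outside the cluster are absent from the map.
    static List<String> firstReplicaCandidates(String writerHost,
            Set<String> datanodes, Map<String, String> rackOf) {
        if (datanodes.contains(writerHost)) {
            // Scenario 1: writer is a datanode -> the local machine.
            return Collections.singletonList(writerHost);
        }
        String writerRack = rackOf.get(writerHost);
        if (writerRack != null) {
            // Scenario 3: in-cluster non-datanode writer -> datanodes on
            // the writer's local rack, not any random datanode.
            List<String> sameRack = new ArrayList<>();
            for (String dn : datanodes) {
                if (writerRack.equals(rackOf.get(dn))) {
                    sameRack.add(dn);
                }
            }
            if (!sameRack.isEmpty()) {
                return sameRack;
            }
        }
        // Scenario 2: writer is outside the cluster -> any datanode.
        return new ArrayList<>(datanodes);
    }

    public static void main(String[] args) {
        Map<String, String> rackOf = new HashMap<>();
        rackOf.put("dn1", "r1"); rackOf.put("dn2", "r1");
        rackOf.put("dn3", "r2"); rackOf.put("flume", "r1");
        Set<String> dns = new HashSet<>(Arrays.asList("dn1", "dn2", "dn3"));

        if (!firstReplicaCandidates("dn3", dns, rackOf)
                .equals(Collections.singletonList("dn3"))) {
            throw new AssertionError("datanode writer should be local");
        }
        List<String> c = firstReplicaCandidates("flume", dns, rackOf);
        if (c.size() != 2 || !c.containsAll(Arrays.asList("dn1", "dn2"))) {
            throw new AssertionError("in-cluster writer gets rack-local DNs");
        }
        if (firstReplicaCandidates("laptop", dns, rackOf).size() != 3) {
            throw new AssertionError("external writer should get all DNs");
        }
        System.out.println("ok");
    }
}
```

The "flume" case corresponds to the dedicated Flume-agent node in the comment: only rack-local datanodes are candidates for the first replica.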

[jira] [Comment Edited] (HDFS-13898) Throw retriable exception for getBlockLocations when ObserverNameNode is in safemode

2018-09-11 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611150#comment-16611150
 ] 

Erik Krogen edited comment on HDFS-13898 at 9/11/18 7:47 PM:
-

Nice find [~csun]! The change to production code LGTM. For the tests:
* You don't need to make the changes to {{MiniQJMHACluster}}. You can set the 
number of DataNodes like:
{code}
builder.getDfsBuilder().numDataNodes(3);
{code}
I also don't really understand why we need to tweak the number of DNs if the 
BlockManager is mocked anyway?
* I don't really find the BlockManager mocking to be very clean. I think we 
should be able to achieve something similar by using a real BlockManager, but 
injecting some fake blocks:
{code}
NameNodeAdapter.getNamesystem(namenodes[2]).getBlockManager().addBlockCollection(...)
{code}
or creating real blocks, but then corrupting them:
{code}
dfsCluster.corruptBlockOnDataNodes(...)
{code}


was (Author: xkrogen):
Nice find [~csun]! The change to production code LGTM. For the tests:
* You don't need to make the changes to {{MiniQJMHACluster}}. You can set the 
number of DataNodes like:
{code}
builder.getDfsBuilder().numDataNodes(3);
{code}
I also don't really understand why we need to tweak the number of DNs if the 
BlockManager is mocked anyway?
* I don't really find the BlockManager mocking to be very clean. I think we 
should be able to achieve something similar by using a real BlockManager, but 
injecting some fake blocks:
{code}
NameNodeAdapter.getNamesystem(namenodes[2]).getBlockManager().addBlockCollection(...)
{code}
or creating real blocks, but then corrupting them:
{code}
dfsCluster.corruptBlockOnDataNodes()
{code}

> Throw retriable exception for getBlockLocations when ObserverNameNode is in 
> safemode
> 
>
> Key: HDFS-13898
> URL: https://issues.apache.org/jira/browse/HDFS-13898
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13898-HDFS-12943.000.patch
>
>
> When ObserverNameNode is in safe mode, {{getBlockLocations}} may throw safe 
> mode exception if the given file doesn't have any block yet. 
> {code}
> try {
>   checkOperation(OperationCategory.READ);
>   res = FSDirStatAndListingOp.getBlockLocations(
>   dir, pc, srcArg, offset, length, true);
>   if (isInSafeMode()) {
> for (LocatedBlock b : res.blocks.getLocatedBlocks()) {
>   // if safemode & no block locations yet then throw safemodeException
>   if ((b.getLocations() == null) || (b.getLocations().length == 0)) {
> SafeModeException se = newSafemodeException(
> "Zero blocklocations for " + srcArg);
> if (haEnabled && haContext != null &&
> haContext.getState().getServiceState() == 
> HAServiceState.ACTIVE) {
>   throw new RetriableException(se);
> } else {
>   throw se;
> }
>   }
> }
>   }
> {code}
> It only throws {{RetriableException}} for active NN so requests on observer 
> may just fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13898) Throw retriable exception for getBlockLocations when ObserverNameNode is in safemode

2018-09-11 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611150#comment-16611150
 ] 

Erik Krogen commented on HDFS-13898:


Nice find [~csun]! The change to production code LGTM. For the tests:
* You don't need to make the changes to {{MiniQJMHACluster}}. You can set the 
number of DataNodes like:
{code}
builder.getDfsBuilder().numDataNodes(3);
{code}
I also don't really understand why we need to tweak the number of DNs if the 
BlockManager is mocked anyway?
* I don't really find the BlockManager mocking to be very clean. I think we 
should be able to achieve something similar by using a real BlockManager, but 
injecting some fake blocks:
{code}
NameNodeAdapter.getNamesystem(namenodes[2]).getBlockManager().addBlockCollection(...)
{code}
or creating real blocks, but then corrupting them:
{code}
dfsCluster.corruptBlockOnDataNodes()
{code}

> Throw retriable exception for getBlockLocations when ObserverNameNode is in 
> safemode
> 
>
> Key: HDFS-13898
> URL: https://issues.apache.org/jira/browse/HDFS-13898
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13898-HDFS-12943.000.patch
>
>
> When ObserverNameNode is in safe mode, {{getBlockLocations}} may throw safe 
> mode exception if the given file doesn't have any block yet. 
> {code}
> try {
>   checkOperation(OperationCategory.READ);
>   res = FSDirStatAndListingOp.getBlockLocations(
>   dir, pc, srcArg, offset, length, true);
>   if (isInSafeMode()) {
> for (LocatedBlock b : res.blocks.getLocatedBlocks()) {
>   // if safemode & no block locations yet then throw safemodeException
>   if ((b.getLocations() == null) || (b.getLocations().length == 0)) {
> SafeModeException se = newSafemodeException(
> "Zero blocklocations for " + srcArg);
> if (haEnabled && haContext != null &&
> haContext.getState().getServiceState() == 
> HAServiceState.ACTIVE) {
>   throw new RetriableException(se);
> } else {
>   throw se;
> }
>   }
> }
>   }
> {code}
> It only throws {{RetriableException}} for active NN so requests on observer 
> may just fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611148#comment-16611148
 ] 

Hadoop QA commented on HDFS-13906:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 16s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch 
generated 8 new + 1 unchanged - 0 fixed = 9 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
30s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13906 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939292/HDFS-13906-01.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux caa3dc611dff 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 598380a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25034/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25034/artifact/out/whitespace-eol.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25034/testReport/ |
| Max. process+thread count | 963 (vs. ulimit of 1) |
| modules | C: 

[jira] [Updated] (HDDS-411) Add Ozone submodule to the hadoop.apache.org

2018-09-11 Thread Dinesh Chitlangia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Chitlangia updated HDDS-411:
---
Fix Version/s: 0.2.1

> Add Ozone submodule to the hadoop.apache.org
> 
>
> Key: HDDS-411
> URL: https://issues.apache.org/jira/browse/HDDS-411
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Dinesh Chitlangia
>Priority: Trivial
>  Labels: site
> Fix For: 0.2.1
>
> Attachments: HDDS-411.001.patch, HDDS-411.002.patch
>
>
> The current hadoop.apache.org doesn't mention Ozone in the "Modules" section.
> We can add something like this (or better):
> {quote}Hadoop Ozone is an object store for Hadoop on top of Hadoop HDDS, 
> which provides a low-level binary storage layer.
> {quote}
> We can also link to the 
> [http://ozone.hadoop.apache.org|http://ozone.hadoop.apache.org/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-428) OzoneManager lock optimization

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611136#comment-16611136
 ] 

Anu Engineer commented on HDDS-428:
---

Restarted the build:

https://builds.apache.org/blue/organizations/jenkins/PreCommit-HDDS-Build/detail/PreCommit-HDDS-Build/1033/pipeline

> OzoneManager lock optimization
> --
>
> Key: HDDS-428
> URL: https://issues.apache.org/jira/browse/HDDS-428
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: OM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-428.000.patch
>
>
> Currently, {{OzoneManager}} uses a single lock for everything which impacts 
> the performance. We can introduce a separate lock for each resource like 
> User/Volume/Bucket which will give us a performance boost.
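
The per-resource locking idea in the description could be sketched roughly as follows. All names here are hypothetical; this is not the actual OzoneManager patch:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hedged sketch of per-resource locking: one lock per
// user/volume/bucket key instead of a single global lock.
public class ResourceLockManager {
    private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
            new ConcurrentHashMap<>();

    // Lazily create one lock per resource key.
    ReentrantReadWriteLock lockFor(String resource) {
        return locks.computeIfAbsent(resource,
                k -> new ReentrantReadWriteLock());
    }

    public void acquireWriteLock(String resource) {
        lockFor(resource).writeLock().lock();
    }

    public void releaseWriteLock(String resource) {
        lockFor(resource).writeLock().unlock();
    }

    public static void main(String[] args) throws InterruptedException {
        ResourceLockManager mgr = new ResourceLockManager();
        mgr.acquireWriteLock("volume:v1");

        // A writer on a different volume must not be blocked by v1's lock.
        final boolean[] acquired = new boolean[1];
        Thread t = new Thread(() -> {
            acquired[0] = mgr.lockFor("volume:v2").writeLock().tryLock();
            if (acquired[0]) {
                mgr.lockFor("volume:v2").writeLock().unlock();
            }
        });
        t.start();
        t.join();
        mgr.releaseWriteLock("volume:v1");

        if (!acquired[0]) {
            throw new AssertionError("v2 should be lockable while v1 is held");
        }
        System.out.println("ok");
    }
}
```

Under a single global lock, the v2 writer in the test would have had to wait for v1's writer; with per-resource locks the two operations proceed independently.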



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-428) OzoneManager lock optimization

2018-09-11 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-428:
--
Status: Patch Available  (was: Open)

> OzoneManager lock optimization
> --
>
> Key: HDDS-428
> URL: https://issues.apache.org/jira/browse/HDDS-428
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: OM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-428.000.patch
>
>
> Currently, {{OzoneManager}} uses a single lock for everything which impacts 
> the performance. We can introduce a separate lock for each resource like 
> User/Volume/Bucket which will give us a performance boost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-428) OzoneManager lock optimization

2018-09-11 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-428:
--
Status: Open  (was: Patch Available)

> OzoneManager lock optimization
> --
>
> Key: HDDS-428
> URL: https://issues.apache.org/jira/browse/HDDS-428
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: OM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-428.000.patch
>
>
> Currently, {{OzoneManager}} uses a single lock for everything which impacts 
> the performance. We can introduce a separate lock for each resource like 
> User/Volume/Bucket which will give us a performance boost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-428) OzoneManager lock optimization

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611130#comment-16611130
 ] 

Anu Engineer commented on HDDS-428:
---

Looks like the compile failure is due to a download failure of {{phantomjs}} in 
YARN-UI. It is not related to this patch.

> OzoneManager lock optimization
> --
>
> Key: HDDS-428
> URL: https://issues.apache.org/jira/browse/HDDS-428
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: OM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-428.000.patch
>
>
> Currently, {{OzoneManager}} uses a single lock for everything which impacts 
> the performance. We can introduce a separate lock for each resource like 
> User/Volume/Bucket which will give us a performance boost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13868) WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but "oldsnapshotname" is not.

2018-09-11 Thread Pranay Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611129#comment-16611129
 ] 

Pranay Singh commented on HDFS-13868:
-

Hi Wei-Chiu,

If you are referring to the changes in DFSClient.java for the illegal-argument 
case, that code path will not be exercised here for WebHDFS.

The code below is equivalent to the following CLI, which compares snapshot s1 
with the current branch:
diffReport = webHdfs.getSnapshotDiffReport(foo, "s1", null);
Assert.assertEquals(diffReport.getDiffList().size(), 5);

CLI for snapshot diff
---
hdfs snapshotDiff /  s1 ""

Since 5 files have been changed after taking snapshot s2 in the current branch, 
all five files should be part of the result. Hence the diffReport count should 
be 5.



> WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname" is given but 
> "oldsnapshotname" is not.
> -
>
> Key: HDFS-13868
> URL: https://issues.apache.org/jira/browse/HDFS-13868
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, webhdfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Siyao Meng
>Assignee: Pranay Singh
>Priority: Major
> Attachments: HDFS-13868.001.patch, HDFS-13868.002.patch, 
> HDFS-13868.003.patch, HDFS-13868.004.patch
>
>
> HDFS-13052 implements GETSNAPSHOTDIFF for WebHDFS.
>  
> Proof:
> {code:java}
> # Bash
> # Prerequisite: You will need to create the directory "/snapshot", 
> allowSnapshot() on it, and create a snapshot named "snap3" for it to reach 
> NPE.
> $ curl 
> "http://:/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF=hdfs=snap2=snap3"
> # Note that I intentionally typed the wrong parameter name for 
> "oldsnapshotname" above to cause NPE.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://:/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF=hdfs==snap3"
> # Empty string for oldsnapshotname
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> # OR
> $ curl 
> "http://:/webhdfs/v1/snaptest/?op=GETSNAPSHOTDIFF=hdfs=snap3"
> # Missing param oldsnapshotname, essentially the same as the first case.
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}{code}
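
One way such a handler could fail fast on the missing parameter, instead of hitting the NPE shown in the curl calls above, is a simple up-front check. This is a hedged sketch with hypothetical names, not the actual WebHDFS code:

```java
// Hypothetical validation helper: reject null/empty snapshot params
// with a clear IllegalArgumentException rather than a downstream NPE.
public class SnapshotDiffParamCheck {

    static void validateSnapshotParams(String oldSnapshotName,
            String snapshotName) {
        if (oldSnapshotName == null || oldSnapshotName.isEmpty()) {
            throw new IllegalArgumentException(
                    "Missing required parameter: oldsnapshotname");
        }
        if (snapshotName == null || snapshotName.isEmpty()) {
            throw new IllegalArgumentException(
                    "Missing required parameter: snapshotname");
        }
    }

    public static void main(String[] args) {
        boolean threw = false;
        try {
            // The failing case from the report: oldsnapshotname absent.
            validateSnapshotParams(null, "snap3");
        } catch (IllegalArgumentException e) {
            threw = true;
        }
        if (!threw) {
            throw new AssertionError("expected IllegalArgumentException");
        }
        // A valid pair passes validation.
        validateSnapshotParams("snap2", "snap3");
        System.out.println("ok");
    }
}
```

The RemoteException would then carry a descriptive message instead of `"message":null`.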



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-395) TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611126#comment-16611126
 ] 

Anu Engineer commented on HDDS-395:
---

Thanks for the update, +1 pending Jenkins.

> TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"
> --
>
> Key: HDDS-395
> URL: https://issues.apache.org/jira/browse/HDDS-395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Mukul Kumar Singh
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.2.1
>
> Attachments: HDDS-395.001.patch
>
>
> Ozone datanode initialization is failing with the following exception.
> This was noted in the following precommit build result.
> https://builds.apache.org/job/PreCommit-HDDS-Build/935/testReport/org.apache.hadoop.ozone.web/TestOzoneRestWithMiniCluster/org_apache_hadoop_ozone_web_TestOzoneRestWithMiniCluster_2/
> {code}
> 2018-09-02 20:56:33,501 INFO  db.DBStoreBuilder 
> (DBStoreBuilder.java:getDbProfile(176)) - Unable to read ROCKDB config
> java.io.IOException: Unable to find the configuration directory. Please make 
> sure that HADOOP_CONF_DIR is setup correctly 
>   at 
> org.apache.hadoop.utils.db.DBConfigFromFile.getConfigLocation(DBConfigFromFile.java:62)
>   at 
> org.apache.hadoop.utils.db.DBConfigFromFile.readFromFile(DBConfigFromFile.java:118)
>   at 
> org.apache.hadoop.utils.db.DBStoreBuilder.getDbProfile(DBStoreBuilder.java:170)
>   at 
> org.apache.hadoop.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:122)
>   at 
> org.apache.hadoop.ozone.om.OmMetadataManagerImpl.(OmMetadataManagerImpl.java:133)
>   at org.apache.hadoop.ozone.om.OzoneManager.(OzoneManager.java:146)
>   at 
> org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:295)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createOM(MiniOzoneClusterImpl.java:357)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:304)
>   at 
> org.apache.hadoop.ozone.web.TestOzoneRestWithMiniCluster.init(TestOzoneRestWithMiniCluster.java:73)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-432) Replication of closed containers is not working

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611125#comment-16611125
 ] 

Anu Engineer commented on HDDS-432:
---

+1, LGTM. I have two minor comments.

1. Nit:
{code:java}
datanodesWithReplicas.size() < 1){code}
This is the size of a list, which can never be negative; rewrite it as
{code:java}
datanodesWithReplicas.size() == 0){code}
2. It would be nice if the Replication Manager maintained a list of containers 
with no replicas, so that we can show it in the UI etc. later; we can just file 
a forward-looking Jira for it.

+1. Feel free to commit and fix the less-than-1 issue while committing; no 
need for a new patch. Thanks

> Replication of closed containers is not working
> ---
>
> Key: HDDS-432
> URL: https://issues.apache.org/jira/browse/HDDS-432
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Critical
> Fix For: 0.2.1
>
> Attachments: HDDS-432-ozone-0.2.001.patch
>
>
> Steps to reproduce:
> 1. Start a cluster with three datanodes:
> docker-compose up -d
> docker-compose scale datanode=3
> 2. Create keys:
> ozone oz -createVolume /vol1 -user hadoop --quota 1TB --root
> ozone oz -createBucket /vol1/bucket
> dd if=/dev/zero of=/tmp/test bs=1024000 count=512
> ozone oz -putKey /vol1/bucket/file1 -replicationFactor THREE -file /tmp/test  
> 3. Close the containers with scmcli
> 4. kill a datanode with a replica
> {code}
> for i in `seq 1 4`; do docker diff ozone_datanode_$i && echo 
> ""; done
> #Choose a datanode with replica
> docker kill ozone_datanode_3
> {code}
>  
> 5. Wait
> 6. After a while, the last datanode should contain the chunks (checked with 
> docker diff)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-419) ChunkInputStream bulk read api does not read from all the chunks

2018-09-11 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611101#comment-16611101
 ] 

Ajay Kumar commented on HDDS-419:
-

[~msingh] thanks for posting the fix.
Correct me if I am wrong, but it seems there is a subtle bug in the while loop:
{code}
while (len > 0) {
  int available = prepareRead(len);
  if (available == EOF) {
    return EOF;
  }
  buffers.get(bufferIndex).get(b, off + total, available);
  len -= available;
  total += available;
}
{code}
If prepareRead returns EOF on a later iteration, we will return EOF instead of 
the total bytes read up to that point.
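
A minimal, self-contained sketch of that fix could look like the following. The class below uses hypothetical stand-ins for the ChunkInputStream internals, not the actual Ozone client code:

```java
// Toy model of the bulk-read loop: if prepareRead() hits EOF after
// some bytes were already copied, return the byte count so far
// rather than EOF. Internals are simplified stand-ins.
public class BulkReadSketch {
    static final int EOF = -1;
    private final byte[] data;
    private int pos = 0;

    BulkReadSketch(byte[] data) {
        this.data = data;
    }

    // Pretend chunk boundary: at most 4 bytes become available per call.
    private int prepareRead(int len) {
        if (pos >= data.length) {
            return EOF;
        }
        return Math.min(4, Math.min(len, data.length - pos));
    }

    int read(byte[] b, int off, int len) {
        int total = 0;
        while (len > 0) {
            int available = prepareRead(len);
            if (available == EOF) {
                // Fix: report the bytes already copied, not EOF.
                return total > 0 ? total : EOF;
            }
            System.arraycopy(data, pos, b, off + total, available);
            pos += available;
            len -= available;
            total += available;
        }
        return total;
    }

    public static void main(String[] args) {
        BulkReadSketch in = new BulkReadSketch(new byte[]{1, 2, 3, 4, 5, 6});
        byte[] buf = new byte[10];
        // Ask for more bytes than remain: 6 are copied across two chunk
        // boundaries, then EOF is hit; the fixed loop reports 6.
        int n = in.read(buf, 0, 10);
        if (n != 6) {
            throw new AssertionError("expected 6, got " + n);
        }
        // A fully drained stream still reports EOF.
        if (in.read(buf, 0, 1) != EOF) {
            throw new AssertionError("expected EOF");
        }
        System.out.println("ok");
    }
}
```

With the unmodified loop, the first read would have discarded the 6 copied bytes and reported EOF.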

> ChunkInputStream bulk read api does not read from all the chunks
> 
>
> Key: HDDS-419
> URL: https://issues.apache.org/jira/browse/HDDS-419
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-419.001.patch
>
>
> After enabling of bulk reads with HDDS-408, testDataValidate started failing 
> because the bulk read api does not read all the chunks from the block.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2018-09-11 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-13596:
-
Priority: Critical  (was: Blocker)

> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Zsolt Venczel
>Priority: Critical
>
> After rollingUpgrade NN from 2.x and 3.x, if the NN is restarted, it fails 
> while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-05-17 19:10:06,522 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception 
> loading fsimage
> java.io.IOException: java.lang.IllegalStateException: Cannot skip to less 
> than the current value (=16389), where newValue=16388
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at 

[jira] [Updated] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2018-09-11 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-13596:
-
Target Version/s:   (was: 3.2.0, 3.0.4, 3.1.2)

> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Zsolt Venczel
>Priority: Blocker
>
> After a rolling upgrade of the NN from 2.x to 3.x, if the NN is restarted, it fails 
> while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-05-17 19:10:06,522 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception 
> loading fsimage
> java.io.IOException: java.lang.IllegalStateException: Cannot skip to less 
> than the current value (=16389), where newValue=16388
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at 

[jira] [Commented] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611093#comment-16611093
 ] 

Anu Engineer commented on HDDS-424:
---

only one comment: This is not related to this patch. 

_GenericCli.java: Line 42: @Option(names = \{"-D", "--set"})_
 -D maps to --define? Not part of your changes, just something that I thought I would flag.

I understand the parsing difficulties with the   approach. +1, I 
will wait for comments from [~arpitagarwal]
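The convention under discussion (double-dash long names like --user, single-dash short letters like -u) can be sketched without picocli. The names and the short-to-long mapping below are illustrative, not the actual GenericCli code:

```java
import java.util.*;

public class GnuArgs {
    // Illustrative mapping of short letters to long names, e.g. -u -> --user.
    static final Map<String, String> SHORT_TO_LONG =
        Map.of("u", "user", "D", "define");

    // Parse "--name value" and "-x value" pairs into a map keyed by long name.
    static Map<String, String> parse(String[] args) {
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i < args.length - 1; i += 2) {
            String a = args[i];
            String key;
            if (a.startsWith("--")) {
                key = a.substring(2);                      // long option: --user
            } else if (a.startsWith("-") && a.length() == 2) {
                key = SHORT_TO_LONG.get(a.substring(1));   // short option: -u
            } else {
                throw new IllegalArgumentException("not an option: " + a);
            }
            out.put(key, args[i + 1]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse(new String[]{"-u", "bob", "--define", "k=v"}));
    }
}
```

Because both spellings land on the same long key, help text and downstream code only ever see one canonical name, which is the main benefit of the convention.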

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is
> 1. avoid using camelCase arguments/flags
> 2. use a double dash with words (--user) and a single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Comment Edited] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611093#comment-16611093
 ] 

Anu Engineer edited comment on HDDS-424 at 9/11/18 7:01 PM:


only one comment: This is not related to this patch. 

_GenericCli.java: Line 42: @Option(names = \{"\-D", "--set"})_
 -D maps to --define? Not part of your changes, just something that I thought I would flag.

I understand the parsing difficulties with the   approach. +1, I 
will wait for comments from [~arpitagarwal]


was (Author: anu):
only one comment: This is not related to this patch. 

_GenericCli.java: Line 42: @Option(names = \{"-D", "--set"})_
 -D maps to --define ? not part of your changes, just something that I thought 
will flag.

I understand the parsing difficulties with the   approach. +1, I 
will wait for comments from [~arpitagarwal]

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is
> 1. avoid using camelCase arguments/flags
> 2. use a double dash with words (--user) and a single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required






[jira] [Commented] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611092#comment-16611092
 ] 

Hadoop QA commented on HDDS-433:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
50s{color} | {color:green} container-service in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDDS-433 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939290/HDDS-433.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux aaba1a3d9b0a 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8ffbbf5 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDDS-Build/1031/testReport/ |
| Max. process+thread count | 467 (vs. ulimit of 1) |
| modules | C: hadoop-hdds/container-service U: hadoop-hdds/container-service |
| Console output | 
https://builds.apache.org/job/PreCommit-HDDS-Build/1031/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> ContainerStateMachine#readStateMachineData should properly build 

[jira] [Commented] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Hanisha Koneru (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611091#comment-16611091
 ] 

Hanisha Koneru commented on HDDS-433:
-

{{SMLogEntryProto.newBuilder(smLogEntryProto)}} would just make sure that 
whatever field is set in {{smLogEntryProto}} is copied over to the new object. 

What if the {{stateMachineDataAttached}} field is not set in 
{{smLogEntryProto}} as that does not have any stateMachineData? If this case 
can never arise or if this field is never used in Ratis, then I think we are 
good.
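The point about {{newBuilder(smLogEntryProto)}} copying over whatever is set can be illustrated with a plain builder. The class and field names below are hypothetical stand-ins, not the actual Ratis protobufs:

```java
public class LogEntrySketch {
    final long term, index;
    final byte[] stateMachineData;

    LogEntrySketch(long term, long index, byte[] data) {
        this.term = term; this.index = index; this.stateMachineData = data;
    }

    // Builder seeded from an existing entry, mirroring newBuilder(existing):
    // every field already set on the input carries over to the new object.
    static class Builder {
        long term, index;
        byte[] data;
        Builder(LogEntrySketch from) {
            this.term = from.term;
            this.index = from.index;
            this.data = from.stateMachineData;
        }
        Builder setStateMachineData(byte[] d) { this.data = d; return this; }
        LogEntrySketch build() { return new LogEntrySketch(term, index, data); }
    }

    // The fix in the patch, in miniature: rebuild over the INPUT entry so
    // term and index are preserved; a fresh builder would leave index at 0,
    // which is exactly the "entries[0].getIndex()=0" failure below.
    static LogEntrySketch attachData(LogEntrySketch input, byte[] data) {
        return new Builder(input).setStateMachineData(data).build();
    }
}
```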

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns LogEntryProto with index 
> set to 0. This leads to exception in Ratis. The LogEntryProto to return 
> should be built over the input LogEntryProto.
> The following exception was seen using Ozone, where the leader send incorrect 
> append entries to follower.
> {code}
> 2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: 
> Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index 
> to:20312
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:1182, electionTimeout:990ms
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 
> for changeToFollower
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:2167, electionTimeout:976ms
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for ini
> tElection
> 2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
> 2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 
> for changeToFollower
> 2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for app
> endEntries
>  
> 2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8:
>  Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with 
> term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
> lastRpcElapsed=0ms
>  
> 2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 
> response(s) [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2
> bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] and 0 exception(s); 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, 
> leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, 
> voted=2e240240-0fac-4f93-8aa8-fa8f
> 74bf1810_9858, raftlog=[(t:14, i:20374)], 
> conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
> 2e240240-0fac-4f93-8aa8-fa8f74bf
> 1810_9858:172.26.32.228:9858], old=null
> 2018-08-20 07:54:31,227 WARN 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
> java.lang.IllegalStateException: Unexpected Index: previous is (t:14, 
> i:20374) but entries[0].getIndex()=0
>

[jira] [Commented] (HDFS-13156) HDFS Block Placement Policy - Client Local Rack

2018-09-11 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611084#comment-16611084
 ] 

Ayush Saxena commented on HDFS-13156:
-

Thanx [~belugabehr] for reporting.
It was really a miss on the documentation part. For the first replica, the default block placement policy looks for a datanode in the same rack as the client if it is not able to write on the client node itself, not a random node.
Have uploaded the patch to update the doc.
Pls review.
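The three scenarios from the issue can be sketched as a fallback chain. This is an illustrative sketch, not the actual BlockPlacementPolicyDefault code, and the topology representation is assumed:

```java
import java.util.*;

public class FirstReplicaSketch {
    // Choose the first replica target given the writer's host and rack and a
    // rack -> datanodes topology (illustrative representation).
    static String chooseFirst(String writerHost, String writerRack,
                              Map<String, List<String>> rackToNodes, Random rnd) {
        List<String> localRack = rackToNodes.getOrDefault(writerRack, List.of());
        // 1. Writer is itself a datanode: write locally.
        if (localRack.contains(writerHost)) {
            return writerHost;
        }
        // 2. Writer is in the topology but not a datanode (the third,
        //    undocumented scenario): pick a random datanode on the SAME rack,
        //    not from the whole cluster.
        if (!localRack.isEmpty()) {
            return localRack.get(rnd.nextInt(localRack.size()));
        }
        // 3. Writer is outside the cluster: any random datanode.
        List<String> all = new ArrayList<>();
        rackToNodes.values().forEach(all::addAll);
        return all.get(rnd.nextInt(all.size()));
    }

    public static void main(String[] args) {
        Map<String, List<String>> topo =
            Map.of("r1", List.of("dn1", "dn2"), "r2", List.of("dn3"));
        System.out.println(chooseFirst("dn1", "r1", topo, new Random()));
    }
}
```

Under this reading, the Flume agent in the issue always lands its first block on its own rack because case 2, not case 3, applies.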

> HDFS Block Placement Policy - Client Local Rack
> ---
>
> Key: HDFS-13156
> URL: https://issues.apache.org/jira/browse/HDFS-13156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.9.0, 3.2.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: Ayush Saxena
>Priority: Minor
> Attachments: HDFS-13156-01.patch
>
>
> {quote}For the common case, when the replication factor is three, HDFS’s 
> placement policy is to put one replica on the local machine if the writer is 
> on a datanode, otherwise on a random datanode, another replica on a node in a 
> different (remote) rack, and the last on a different node in the same remote 
> rack.
> {quote}
> [https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Replica_Placement:_The_First_Baby_Steps]
> Having just looked over the Default Block Placement code, the way I 
> understand this, is that, there are three basic scenarios:
>  # HDFS client is running on a datanode inside the cluster
>  # HDFS client is running on a node outside the cluster
>  # HDFS client is running on a non-datanode inside the cluster
> The documentation is ambiguous concerning the third scenario. Please correct 
> me if I'm wrong, but the way I understand the code, if there is an HDFS 
> client inside the cluster, but it is not on a datanode, the first block will 
> be placed on a datanode within the set of datanodes available on the local 
> rack and not simply on any _random datanode_ from the set of all datanodes in 
> the cluster.
> That is to say, if one rack has an HDFS Sink Flume Agent on a dedicated node, 
> I should expect that every first block will be written to a _random datanode_ 
> on the same rack as the HDFS Flume agent, assuming the network topology 
> script is written to include this Flume node.
> If that is correct, can the documentation be updated to include this third 
> common scenario?






[jira] [Updated] (HDFS-13156) HDFS Block Placement Policy - Client Local Rack

2018-09-11 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13156:

Affects Version/s: 3.2.0
   3.1.1
   Status: Patch Available  (was: Open)

> HDFS Block Placement Policy - Client Local Rack
> ---
>
> Key: HDFS-13156
> URL: https://issues.apache.org/jira/browse/HDFS-13156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.1.1, 2.9.0, 3.2.0
>Reporter: BELUGA BEHR
>Assignee: Ayush Saxena
>Priority: Minor
> Attachments: HDFS-13156-01.patch
>
>
> {quote}For the common case, when the replication factor is three, HDFS’s 
> placement policy is to put one replica on the local machine if the writer is 
> on a datanode, otherwise on a random datanode, another replica on a node in a 
> different (remote) rack, and the last on a different node in the same remote 
> rack.
> {quote}
> [https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Replica_Placement:_The_First_Baby_Steps]
> Having just looked over the Default Block Placement code, the way I 
> understand this, is that, there are three basic scenarios:
>  # HDFS client is running on a datanode inside the cluster
>  # HDFS client is running on a node outside the cluster
>  # HDFS client is running on a non-datanode inside the cluster
> The documentation is ambiguous concerning the third scenario. Please correct 
> me if I'm wrong, but the way I understand the code, if there is an HDFS 
> client inside the cluster, but it is not on a datanode, the first block will 
> be placed on a datanode within the set of datanodes available on the local 
> rack and not simply on any _random datanode_ from the set of all datanodes in 
> the cluster.
> That is to say, if one rack has an HDFS Sink Flume Agent on a dedicated node, 
> I should expect that every first block will be written to a _random datanode_ 
> on the same rack as the HDFS Flume agent, assuming the network topology 
> script is written to include this Flume node.
> If that is correct, can the documentation be updated to include this third 
> common scenario?






[jira] [Commented] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611083#comment-16611083
 ] 

Anu Engineer commented on HDDS-424:
---

Ok, I understand why this is being done. {{VolumeCommands.java}} can now 
organize all volume commands in one place. Very smart and very developer 
friendly :)

However, I am worried a group of people are going to give us this feedback 
about   ordering in the future. 

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is
> 1. avoid using camelCase arguments/flags
> 2. use a double dash with words (--user) and a single dash with letters (-u) 
> I propose to modify ozone oz with:
> * Adding a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file as it's always required






[jira] [Updated] (HDFS-13156) HDFS Block Placement Policy - Client Local Rack

2018-09-11 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13156:

Attachment: HDFS-13156-01.patch

> HDFS Block Placement Policy - Client Local Rack
> ---
>
> Key: HDFS-13156
> URL: https://issues.apache.org/jira/browse/HDFS-13156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: BELUGA BEHR
>Assignee: Ayush Saxena
>Priority: Minor
> Attachments: HDFS-13156-01.patch
>
>
> {quote}For the common case, when the replication factor is three, HDFS’s 
> placement policy is to put one replica on the local machine if the writer is 
> on a datanode, otherwise on a random datanode, another replica on a node in a 
> different (remote) rack, and the last on a different node in the same remote 
> rack.
> {quote}
> [https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Replica_Placement:_The_First_Baby_Steps]
> Having just looked over the Default Block Placement code, the way I 
> understand this, is that, there are three basic scenarios:
>  # HDFS client is running on a datanode inside the cluster
>  # HDFS client is running on a node outside the cluster
>  # HDFS client is running on a non-datanode inside the cluster
> The documentation is ambiguous concerning the third scenario. Please correct 
> me if I'm wrong, but the way I understand the code, if there is an HDFS 
> client inside the cluster, but it is not on a datanode, the first block will 
> be placed on a datanode within the set of datanodes available on the local 
> rack and not simply on any _random datanode_ from the set of all datanodes in 
> the cluster.
> That is to say, if one rack has an HDFS Sink Flume Agent on a dedicated node, 
> I should expect that every first block will be written to a _random datanode_ 
> on the same rack as the HDFS Flume agent, assuming the network topology 
> script is written to include this Flume node.
> If that is correct, can the documentation be updated to include this third 
> common scenario?






[jira] [Updated] (HDDS-395) TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"

2018-09-11 Thread Dinesh Chitlangia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Chitlangia updated HDDS-395:
---
Attachment: HDDS-395.001.patch
Status: Patch Available  (was: Open)

[~anu] - Posted patch 001 for your review.
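The failure quoted below comes from resolving the RocksDB tuning file relative to the Hadoop config directory. A minimal sketch of that lookup, with a hypothetical class name and the env value passed in as a parameter rather than read from the actual DBConfigFromFile code:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ConfDirSketch {
    // Resolve the config directory from the given env value (normally the
    // result of System.getenv("HADOOP_CONF_DIR")), failing loudly when unset.
    static File getConfigLocation(String hadoopConfDir) {
        if (hadoopConfDir == null || hadoopConfDir.isEmpty()) {
            throw new UncheckedIOException(new IOException(
                "Unable to find the configuration directory. "
                + "Please make sure that HADOOP_CONF_DIR is setup correctly"));
        }
        return new File(hadoopConfDir);
    }

    public static void main(String[] args) {
        // Example path only; in a real cluster this comes from the environment.
        System.out.println(getConfigLocation("/etc/hadoop/conf"));
    }
}
```

A MiniCluster test JVM typically has no HADOOP_CONF_DIR set, which is why the lookup fails in the precommit run.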

> TestOzoneRestWithMiniCluster fails with "Unable to read ROCKDB config"
> --
>
> Key: HDDS-395
> URL: https://issues.apache.org/jira/browse/HDDS-395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Mukul Kumar Singh
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.2.1
>
> Attachments: HDDS-395.001.patch
>
>
> Ozone datanode initialization is failing with the following exception.
> This was noted in the following precommit build result.
> https://builds.apache.org/job/PreCommit-HDDS-Build/935/testReport/org.apache.hadoop.ozone.web/TestOzoneRestWithMiniCluster/org_apache_hadoop_ozone_web_TestOzoneRestWithMiniCluster_2/
> {code}
> 2018-09-02 20:56:33,501 INFO  db.DBStoreBuilder 
> (DBStoreBuilder.java:getDbProfile(176)) - Unable to read ROCKDB config
> java.io.IOException: Unable to find the configuration directory. Please make 
> sure that HADOOP_CONF_DIR is setup correctly 
>   at 
> org.apache.hadoop.utils.db.DBConfigFromFile.getConfigLocation(DBConfigFromFile.java:62)
>   at 
> org.apache.hadoop.utils.db.DBConfigFromFile.readFromFile(DBConfigFromFile.java:118)
>   at 
> org.apache.hadoop.utils.db.DBStoreBuilder.getDbProfile(DBStoreBuilder.java:170)
>   at 
> org.apache.hadoop.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:122)
>   at 
> org.apache.hadoop.ozone.om.OmMetadataManagerImpl.&lt;init&gt;(OmMetadataManagerImpl.java:133)
>   at org.apache.hadoop.ozone.om.OzoneManager.&lt;init&gt;(OzoneManager.java:146)
>   at 
> org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:295)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createOM(MiniOzoneClusterImpl.java:357)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:304)
>   at 
> org.apache.hadoop.ozone.web.TestOzoneRestWithMiniCluster.init(TestOzoneRestWithMiniCluster.java:73)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
> {code}
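The root cause above is an environment lookup: the code refuses to proceed when HADOOP_CONF_DIR is unset. A minimal sketch of the safer pattern (a hypothetical {{ConfigDirResolver}} helper, not the actual DBConfigFromFile code) treats a missing variable as "fall back to defaults" rather than a hard failure, which is what a test environment without a config directory needs:

```java
import java.io.File;
import java.util.Optional;

// Hypothetical ConfigDirResolver helper (not the actual DBConfigFromFile
// code): resolve a config directory from an environment value, returning
// empty instead of throwing when the variable is unset or invalid.
public class ConfigDirResolver {

    // Returns the directory named by envValue if it exists, else empty so
    // the caller can fall back to built-in defaults (e.g. in unit tests).
    static Optional<File> resolveConfigDir(String envValue) {
        if (envValue == null || envValue.isEmpty()) {
            return Optional.empty();
        }
        File dir = new File(envValue);
        return dir.isDirectory() ? Optional.of(dir) : Optional.empty();
    }

    public static void main(String[] args) {
        // Unset variable: no exception, just an empty result.
        if (resolveConfigDir(null).isPresent()) {
            throw new AssertionError("expected empty for unset variable");
        }
        // An existing directory (java.io.tmpdir) resolves successfully.
        String tmp = System.getProperty("java.io.tmpdir");
        if (!resolveConfigDir(tmp).isPresent()) {
            throw new AssertionError("expected tmpdir to resolve");
        }
        System.out.println("ok");
    }
}
```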



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-424) Consolidate ozone oz parameters to use GNU convention

2018-09-11 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611078#comment-16611078
 ] 

Anu Engineer commented on HDDS-424:
---

First and foremost, bikeshedding feedback:
 # Should we use &lt;noun&gt; &lt;verb&gt; or &lt;verb&gt; &lt;noun&gt;?
 # 
||Noun||Verb|| ||Verb||Noun||
|volume |create| |create|volume|
|volume|list| |list|volume|
|volume|update| |update|volume|
|volume|info| |info|volume|
|bucket|create| |create|bucket|
|bucket|update| |update|bucket|
|key|put| |put|key|
|key|get| |get|key|

To me it feels more natural to do &lt;verb&gt; &lt;noun&gt;. I am sure it is influenced by 
the languages that I speak, so it is just a linguistic preference. Just wanted 
to flag it.

End of bikeshedding. 

Feel free to ignore this feedback; as I said, this is not really based on 
quantitative measurements. To me it feels more natural to say &lt;verb&gt; &lt;noun&gt;. 

> Consolidate ozone oz parameters to use GNU convention
> -
>
> Key: HDDS-424
> URL: https://issues.apache.org/jira/browse/HDDS-424
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-424-ozone-0.2.001.patch
>
>
> In common Linux commands the convention is:
> 1. avoid camelCase arguments/flags
> 2. use a double dash with words (--user) and a single dash with letters (-u) 
> I propose to modify ozone oz to:
> * Add a second dash for all the word flags
> * Use 'key get', 'key info' instead of -infoKey
> * Define the input/output file name as a second argument instead of 
> --file/-file, as it's always required
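As an illustration of the convention being proposed, a hand-rolled toy parser (not the actual ozone oz implementation) that distinguishes long word flags from single-letter flags:

```java
import java.util.HashMap;
import java.util.Map;

// Toy parser illustrating the GNU convention discussed above: long options
// take a double dash (--user value), single-letter options a single dash
// (-u value). Hand-rolled for illustration; not the ozone oz parser.
public class GnuStyleArgs {

    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            String key = args[i];
            if (key.startsWith("--")) {
                opts.put(key.substring(2), args[i + 1]); // --user hadoop
            } else if (key.startsWith("-") && key.length() == 2) {
                opts.put(key.substring(1), args[i + 1]); // -u hadoop
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        Map<String, String> opts =
            parse(new String[]{"--user", "hadoop", "-q", "10"});
        if (!"hadoop".equals(opts.get("user"))) throw new AssertionError();
        if (!"10".equals(opts.get("q"))) throw new AssertionError();
        System.out.println("ok");
    }
}
```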



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Lokesh Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611077#comment-16611077
 ] 

Lokesh Jain commented on HDDS-433:
--

[~hanishakoneru] Thanks for reviewing the patch!
{code:java}
SMLogEntryProto.newBuilder(smLogEntryProto)
{code}
copies all the fields of smLogEntryProto into the new builder, so we do not 
need to set the field explicitly.
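The builder-copy semantics referred to here can be sketched with a plain-Java imitation (protobuf's generated {{newBuilder(prototype)}} initializes the builder from the given message; the {{Entry}} class below is purely illustrative, not {{SMLogEntryProto}}):

```java
// Plain-Java imitation of the builder-copy semantics: a builder created
// from an existing message starts with all of that message's fields, so
// only the fields being changed need to be set again. The Entry class is
// purely illustrative, not SMLogEntryProto.
public class BuilderCopyDemo {

    static final class Entry {
        final long index;
        final boolean dataAttached;

        private Entry(long index, boolean dataAttached) {
            this.index = index;
            this.dataAttached = dataAttached;
        }

        static Builder newBuilder() {
            return new Builder();
        }

        // Mirrors protobuf's newBuilder(prototype): copy every field.
        static Builder newBuilder(Entry prototype) {
            return new Builder()
                .setIndex(prototype.index)
                .setDataAttached(prototype.dataAttached);
        }

        static final class Builder {
            private long index;
            private boolean dataAttached;

            Builder setIndex(long v) { this.index = v; return this; }
            Builder setDataAttached(boolean v) { this.dataAttached = v; return this; }
            Entry build() { return new Entry(index, dataAttached); }
        }
    }

    public static void main(String[] args) {
        Entry original =
            Entry.newBuilder().setIndex(20374L).setDataAttached(true).build();
        // Rebuild from the original, changing only the index: the
        // dataAttached field carries over without being set explicitly.
        Entry rebuilt = Entry.newBuilder(original).setIndex(20375L).build();
        if (rebuilt.index != 20375L) throw new AssertionError();
        if (!rebuilt.dataAttached) throw new AssertionError();
        System.out.println("ok");
    }
}
```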

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with index 
> set to 0, which leads to an exception in Ratis. The LogEntryProto to return 
> should be built over the input LogEntryProto.
> The following exception was seen using Ozone, where the leader sends incorrect 
> append entries to the follower.
> {code}
> 2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: 
> Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index 
> to:20312
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:1182, electionTimeout:990ms
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 
> for changeToFollower
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:2167, electionTimeout:976ms
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for ini
> tElection
> 2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
> 2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 
> for changeToFollower
> 2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for app
> endEntries
>  
> 2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8:
>  Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with 
> term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
> lastRpcElapsed=0ms
>  
> 2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 
> response(s) [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2
> bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] and 0 exception(s); 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, 
> leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, 
> voted=2e240240-0fac-4f93-8aa8-fa8f
> 74bf1810_9858, raftlog=[(t:14, i:20374)], 
> conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
> 2e240240-0fac-4f93-8aa8-fa8f74bf
> 1810_9858:172.26.32.228:9858], old=null
> 2018-08-20 07:54:31,227 WARN 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
> java.lang.IllegalStateException: Unexpected Index: previous is (t:14, 
> i:20374) but entries[0].getIndex()=0
> at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
> at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-11 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611051#comment-16611051
 ] 

Arpit Agarwal commented on HDFS-13768:
--

The patch did not apply cleanly for me. Can you please rebase it?

Looks like this is going to use (number of processors x number of disks) 
threads by default. Any idea what kind of speedup you get with a lower number 
of threads, e.g. 2?

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We found DNs starting very slowly when rolling upgrading our cluster. When we 
> restart DNs, they start slowly and do not register to the NN immediately, 
> which causes a lot of the following errors:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it does the initial block pool 
> operations before registration. During block pool initialization, we found 
> that adding replicas to the volume map is the most expensive operation. 
> Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently the DN uses an independent thread to scan and add replicas for each 
> volume, but we still need to wait for the slowest thread to finish its work. 
> So the main problem here is making these threads run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement may be to use a ForkJoinPool for this recursive task rather 
> than scanning synchronously. This would be a great improvement because it can 
> greatly speed up the recovery process.
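A minimal sketch of the ForkJoinPool idea (illustrative only; {{ParallelScan}} is a hypothetical class, not the BlockPoolSlice code) that forks one subtask per subdirectory:

```java
import java.io.File;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative ForkJoinPool sketch: count files under a directory tree by
// forking one subtask per subdirectory instead of walking the tree on a
// single thread. Not the BlockPoolSlice implementation.
public class ParallelScan extends RecursiveTask<Long> {
    private final File dir;

    ParallelScan(File dir) { this.dir = dir; }

    @Override
    protected Long compute() {
        File[] children = dir.listFiles();
        if (children == null) return 0L;
        long count = 0;
        List<ParallelScan> subtasks = new ArrayList<>();
        for (File child : children) {
            if (child.isDirectory()) {
                ParallelScan task = new ParallelScan(child);
                task.fork();          // scan the subdirectory in parallel
                subtasks.add(task);
            } else {
                count++;              // a plain file: count it here
            }
        }
        for (ParallelScan task : subtasks) {
            count += task.join();     // gather subdirectory counts
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        File root = Files.createTempDirectory("scan").toFile();
        File sub = new File(root, "sub");
        if (!sub.mkdir()) throw new AssertionError("mkdir failed");
        new File(root, "a").createNewFile();
        new File(sub, "b").createNewFile();
        long total = new ForkJoinPool().invoke(new ParallelScan(root));
        if (total != 2L) throw new AssertionError("expected 2, got " + total);
        System.out.println("ok");
    }
}
```

The work-stealing pool keeps all workers busy even when one volume (or one deep subtree) is much larger than the others, which is exactly the "slowest thread" problem described above.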



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Hanisha Koneru (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611048#comment-16611048
 ] 

Hanisha Koneru commented on HDDS-433:
-

Hi [~ljain], 

I see that in {{SMLogEntryProto}} we have a {{stateMachineDataAttached}} 
field to be set when state machine data is attached. Shouldn't we be setting 
this field to true when setting state machine data in line 318? Unless this is 
a redundant/deprecated field in Ratis?

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with index 
> set to 0, which leads to an exception in Ratis. The LogEntryProto to return 
> should be built over the input LogEntryProto.
> The following exception was seen using Ozone, where the leader sends incorrect 
> append entries to the follower.
> {code}
> 2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: 
> Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index 
> to:20312
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:1182, electionTimeout:990ms
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 
> for changeToFollower
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:2167, electionTimeout:976ms
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for ini
> tElection
> 2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
> 2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 
> for changeToFollower
> 2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for app
> endEntries
>  
> 2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8:
>  Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with 
> term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
> lastRpcElapsed=0ms
>  
> 2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 
> response(s) [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2
> bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] and 0 exception(s); 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, 
> leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, 
> voted=2e240240-0fac-4f93-8aa8-fa8f
> 74bf1810_9858, raftlog=[(t:14, i:20374)], 
> conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
> 2e240240-0fac-4f93-8aa8-fa8f74bf
> 1810_9858:172.26.32.228:9858], old=null
> 2018-08-20 07:54:31,227 WARN 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
> java.lang.IllegalStateException: Unexpected Index: previous is (t:14, 
> i:20374) but entries[0].getIndex()=0
> at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
> at 

[jira] [Commented] (HDDS-431) LeaseManager of CommandWatcher is not started

2018-09-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611047#comment-16611047
 ] 

Hudson commented on HDDS-431:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14922 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14922/])
HDDS-431. LeaseManager of CommandWatcher is not started. Contributed by 
(aengineer: rev 8ffbbf51c9a0f9bbcd9ee7452d95e588a2cb87ac)
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java


> LeaseManager of CommandWatcher is not started
> -
>
> Key: HDDS-431
> URL: https://issues.apache.org/jira/browse/HDDS-431
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-431-ozone-0.2.001.patch
>
>
> You can see the following error in case of a datanode failure:
> {code}
> scm_1   | 2018-09-11 11:43:46 ERROR SingleThreadExecutor:88 - Error on execution message org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler$CloseContainerRetryableReq@2aa17d1c
> scm_1   | org.apache.hadoop.ozone.lease.LeaseManagerNotRunningException: LeaseManager not running.
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.checkStatus(LeaseManager.java:189)
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.acquire(LeaseManager.java:112)
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.acquire(LeaseManager.java:97)
> scm_1   |   at org.apache.hadoop.hdds.server.events.EventWatcher.handleStartEvent(EventWatcher.java:128)
> scm_1   |   at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> scm_1   |   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> scm_1   |   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> scm_1   |   at java.lang.Thread.run(Thread.java:748)
> scm_1   | 2018-09-11 11:43:46 ERROR SingleThreadExecutor:88 - Error on execution message org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler$CloseContainerRetryableReq@2772f338
> scm_1   | org.apache.hadoop.ozone.lease.LeaseManagerNotRunningException: LeaseManager not running.
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.checkStatus(LeaseManager.java:189)
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.acquire(LeaseManager.java:112)
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.acquire(LeaseManager.java:97)
> scm_1   |   at org.apache.hadoop.hdds.server.events.EventWatcher.handleStartEvent(EventWatcher.java:128)
> scm_1   |   at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> scm_1   |   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> scm_1   |   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> scm_1   |   at java.lang.Thread.run(Thread.java:748)
> {code}
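The guard that produces this error, and the shape of the fix (starting the manager during service startup), can be sketched with a toy class (illustrative only, not the real {{LeaseManager}}):

```java
// Toy lifecycle sketch (not the real LeaseManager): every operation is
// guarded by a "running" check, so forgetting to call start() during
// service startup makes every acquire() fail -- which is the shape of the
// bug fixed here.
public class LeaseManagerDemo {

    static class NotRunningException extends RuntimeException {
        NotRunningException() { super("LeaseManager not running."); }
    }

    static class ToyLeaseManager {
        private boolean running = false;

        void start() { running = true; }

        void acquire(String resource) {
            if (!running) throw new NotRunningException(); // the guard
            // ... grant a lease on 'resource' ...
        }
    }

    public static void main(String[] args) {
        ToyLeaseManager lm = new ToyLeaseManager();
        boolean threw = false;
        try {
            lm.acquire("container-1");  // start() was never called
        } catch (NotRunningException e) {
            threw = true;
        }
        if (!threw) throw new AssertionError("expected guard to fire");
        lm.start();                     // the essence of the fix
        lm.acquire("container-1");      // now succeeds
        System.out.println("ok");
    }
}
```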



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13898) Throw retriable exception for getBlockLocations when ObserverNameNode is in safemode

2018-09-11 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611042#comment-16611042
 ] 

Chao Sun commented on HDFS-13898:
-

cc [~xkrogen], [~shv], [~vagarychen].

> Throw retriable exception for getBlockLocations when ObserverNameNode is in 
> safemode
> 
>
> Key: HDFS-13898
> URL: https://issues.apache.org/jira/browse/HDFS-13898
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13898-HDFS-12943.000.patch
>
>
> When the ObserverNameNode is in safe mode, {{getBlockLocations}} may throw a 
> safe mode exception if the given file doesn't have any blocks yet. 
> {code}
> try {
>   checkOperation(OperationCategory.READ);
>   res = FSDirStatAndListingOp.getBlockLocations(
>   dir, pc, srcArg, offset, length, true);
>   if (isInSafeMode()) {
> for (LocatedBlock b : res.blocks.getLocatedBlocks()) {
>   // if safemode & no block locations yet then throw safemodeException
>   if ((b.getLocations() == null) || (b.getLocations().length == 0)) {
> SafeModeException se = newSafemodeException(
> "Zero blocklocations for " + srcArg);
> if (haEnabled && haContext != null &&
> haContext.getState().getServiceState() == 
> HAServiceState.ACTIVE) {
>   throw new RetriableException(se);
> } else {
>   throw se;
> }
>   }
> }
>   }
> {code}
> It only throws {{RetriableException}} for the active NN, so requests on the 
> observer may just fail.
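The retry-vs-fail distinction the snippet makes can be sketched as follows (class names mimic, but do not reuse, the Hadoop ones; the change under discussion would make the retriable path apply to observer state as well):

```java
import java.io.IOException;

// Sketch of the wrapping decision above. Class names mimic, but do not
// reuse, the Hadoop ones: a safe-mode failure is wrapped in a retriable
// exception when the caller is expected to retry, and surfaced as-is
// otherwise.
public class SafeModeWrapDemo {

    static class SafeModeException extends IOException {
        SafeModeException(String msg) { super(msg); }
    }

    static class RetriableException extends IOException {
        RetriableException(Throwable cause) { super(cause); }
    }

    static IOException wrapForState(boolean shouldRetry, SafeModeException se) {
        // The retriable wrapper lets the client retry / fail over; the
        // bare exception makes the request fail outright.
        return shouldRetry ? new RetriableException(se) : se;
    }

    public static void main(String[] args) {
        SafeModeException se =
            new SafeModeException("Zero blocklocations for /f");
        if (!(wrapForState(true, se) instanceof RetriableException)) {
            throw new AssertionError();
        }
        if (wrapForState(false, se) != se) {
            throw new AssertionError();
        }
        System.out.println("ok");
    }
}
```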



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13749) Use getServiceStatus to discover observer namenodes

2018-09-11 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611039#comment-16611039
 ] 

Chao Sun commented on HDFS-13749:
-

The test failure is because in {{testMultiObserver}} we shut down an observer 
and then restart it, and we expect the RPC to go to the observer once it is 
restarted.

However, it's interesting that after the observer is restarted, the 
{{getServiceStatus}} call fails with an EOF exception. I tried wrapping the 
proxy with a RetryPolicy like the following:
{code}
  public static HAServiceProtocol createNonHAProxyWithHAServiceProtocol(
  InetSocketAddress address, Configuration conf) throws IOException {
RetryPolicy timeoutPolicy = RetryPolicies.exponentialBackoffRetry(5, 200,
TimeUnit.MILLISECONDS);

HAServiceProtocol proxy =
new HAServiceProtocolClientSideTranslatorPB(
address, conf, NetUtils.getDefaultSocketFactory(conf),
3);
Map&lt;String, RetryPolicy&gt; methodNameToPolicyMap = new HashMap&lt;&gt;();
return (HAServiceProtocol) RetryProxy.create(
HAServiceProtocol.class,
new DefaultFailoverProxyProvider<>(HAServiceProtocol.class, proxy),
methodNameToPolicyMap,
timeoutPolicy
);
{code}

but it still failed after multiple retries, with a connection refused exception.

However, if I add a simple loop in {{refreshCachedState}}, then it always 
succeeds on the second try:
{code}
public void refreshCachedState() {
  for (int i = 0; i < 3; i++) {
try {
  cachedState = serviceProxy.getServiceStatus().getState();
  LOG.info("Successfully set cache state to " + cachedState.name());
  return;
} catch (IOException e) {
  LOG.warn("Failed to connect to {}. Setting cached state to Standby",
  address, e);
  cachedState = HAServiceState.STANDBY;
}
  }
}
{code}
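A plausible explanation for the difference between the two attempts is the pause between retries: the loop retries immediately, whereas a restarting server may need a moment before accepting connections again. A hedged sketch of a retry helper with a fixed sleep (hypothetical {{RetryWithBackoff}}, not HDFS code):

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical RetryWithBackoff helper (not HDFS code): unlike an
// immediate-retry loop, it sleeps between attempts, giving a restarting
// server a moment to start accepting connections again.
public class RetryWithBackoff {

    interface Probe {
        String call() throws IOException;
    }

    static String callWithRetries(Probe probe, int attempts, long sleepMs)
            throws IOException, InterruptedException {
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return probe.call();
            } catch (IOException e) {
                last = e;               // remember and back off briefly
                Thread.sleep(sleepMs);
            }
        }
        throw last;                     // all attempts failed
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Fails once (simulating "connection refused"), then succeeds.
        String state = callWithRetries(() -> {
            if (calls.incrementAndGet() < 2) {
                throw new IOException("Connection refused");
            }
            return "OBSERVER";
        }, 3, 10L);
        if (!"OBSERVER".equals(state)) throw new AssertionError();
        System.out.println("ok");
    }
}
```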

> Use getServiceStatus to discover observer namenodes
> ---
>
> Key: HDFS-13749
> URL: https://issues.apache.org/jira/browse/HDFS-13749
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-13749-HDFS-12943.000.patch, 
> HDFS-13749-HDFS-12943.001.patch, HDFS-13749-HDFS-12943.002.patch
>
>
> In HDFS-12976 currently we discover NameNode state by calling 
> {{reportBadBlocks}} as a temporary solution. Here, we'll properly implement 
> this by using {{HAServiceProtocol#getServiceStatus}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands

2018-09-11 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611037#comment-16611037
 ] 

Íñigo Goiri commented on HDFS-13906:


Thanks [~ayushtkn] for  [^HDFS-13906-01.patch].
The fix looks good, just a couple of aesthetic comments from my side:
* Remove the extra line at TestRouterAdminCLI#634.
* When creating the new entries, I would use src1, src2, dest1, and dest2.
* Add a couple of comments in between to split the unit test 
{{testMultiArgsRemoveMountTable}}. For example, one block to add the mount 
points, another to check them, and another to remove them and check again.
* Similarly for the quota one.

Let's see what Yetus says too; there might be some checkstyle issues as well.

> RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands
> ---
>
> Key: HDFS-13906
> URL: https://issues.apache.org/jira/browse/HDFS-13906
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13906-01.patch
>
>
> Currently we have the option to delete only one mount entry at a time. 
> If we have multiple mount entries, it would be difficult for the user to 
> execute the command N times.
> It would be better if the "rm" and "clrQuota" commands supported multiple 
> entries; then the user could provide all the required entries in one single 
> command.
> The Namenode already supports "rm" and "clrQuota" with multiple destinations.
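The requested behaviour amounts to iterating the removal over several arguments in one invocation; a toy sketch (hypothetical {{MultiArgRemoveDemo}}, not the dfsrouteradmin code):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of the requested behaviour (hypothetical class, not
// the dfsrouteradmin code): an "rm"-style command that accepts several
// mount entries in one invocation and removes each in turn.
public class MultiArgRemoveDemo {

    static List<String> removeAll(Set<String> mountTable, String... paths) {
        List<String> removed = new ArrayList<>();
        for (String path : paths) {
            if (mountTable.remove(path)) {  // remove each listed entry
                removed.add(path);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Set<String> table = new HashSet<>();
        table.add("/apps1");
        table.add("/apps2");
        List<String> removed =
            removeAll(table, "/apps1", "/apps2", "/missing");
        if (removed.size() != 2) throw new AssertionError();
        if (!table.isEmpty()) throw new AssertionError();
        System.out.println("ok");
    }
}
```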



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-431) LeaseManager of CommandWatcher is not started

2018-09-11 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-431:
--
   Resolution: Fixed
Fix Version/s: 0.3.0
   Status: Resolved  (was: Patch Available)

[~elek] Thanks for finding and fixing the issue. I have committed this patch to 
trunk and ozone-2.0 branch.

> LeaseManager of CommandWatcher is not started
> -
>
> Key: HDDS-431
> URL: https://issues.apache.org/jira/browse/HDDS-431
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-431-ozone-0.2.001.patch
>
>
> You can see the following error in case of a datanode failure:
> {code}
> scm_1   | 2018-09-11 11:43:46 ERROR SingleThreadExecutor:88 - Error on execution message org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler$CloseContainerRetryableReq@2aa17d1c
> scm_1   | org.apache.hadoop.ozone.lease.LeaseManagerNotRunningException: LeaseManager not running.
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.checkStatus(LeaseManager.java:189)
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.acquire(LeaseManager.java:112)
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.acquire(LeaseManager.java:97)
> scm_1   |   at org.apache.hadoop.hdds.server.events.EventWatcher.handleStartEvent(EventWatcher.java:128)
> scm_1   |   at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> scm_1   |   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> scm_1   |   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> scm_1   |   at java.lang.Thread.run(Thread.java:748)
> scm_1   | 2018-09-11 11:43:46 ERROR SingleThreadExecutor:88 - Error on execution message org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler$CloseContainerRetryableReq@2772f338
> scm_1   | org.apache.hadoop.ozone.lease.LeaseManagerNotRunningException: LeaseManager not running.
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.checkStatus(LeaseManager.java:189)
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.acquire(LeaseManager.java:112)
> scm_1   |   at org.apache.hadoop.ozone.lease.LeaseManager.acquire(LeaseManager.java:97)
> scm_1   |   at org.apache.hadoop.hdds.server.events.EventWatcher.handleStartEvent(EventWatcher.java:128)
> scm_1   |   at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> scm_1   |   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> scm_1   |   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> scm_1   |   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands

2018-09-11 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611016#comment-16611016
 ] 

Ayush Saxena commented on HDFS-13906:
-

Thanks [~SoumyaPN] for reporting the issue.
I have uploaded a patch with the changes.
Please review.

> RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands
> ---
>
> Key: HDFS-13906
> URL: https://issues.apache.org/jira/browse/HDFS-13906
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13906-01.patch
>
>
> Currently we have the option to delete only one mount entry at a time.
> If there are multiple mount entries, the user has to execute the command N
> times, which is cumbersome.
> It would be better if the "rm" and "clrQuota" commands supported multiple
> entries, so the user could provide all the required entries in a single
> command.
> The Namenode already supports "rm" and "clrQuota" with multiple destinations.
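
A sketch of the argument handling such multi-path support implies (illustrative only; the actual RouterAdmin parsing in the patch may differ): consume consecutive path tokens after the subcommand until the next option flag.

```python
def collect_paths(args):
    """Collect consecutive path arguments following `-rm`/`-clrQuota`,
    stopping at the next option flag. Hypothetical helper, not the real
    RouterAdmin code."""
    paths = []
    for tok in args:
        if tok.startswith("-"):
            break  # next option flag ends the path list
        paths.append(tok)
    return paths


print(collect_paths(["/apps1", "/apps2", "/apps3"]))
```

With this shape, `hdfs dfsrouteradmin -rm /apps1 /apps2 /apps3` would remove all three entries in one invocation instead of three.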






[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Lokesh Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-433:
-
Description: 
ContainerStateMachine#readStateMachineData returns a LogEntryProto with index 
set to 0. This leads to an exception in Ratis. The returned LogEntryProto 
should be built over the input LogEntryProto.

The following exception was seen while using Ozone, where the leader sent 
incorrect append entries to a follower.
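
A Python stand-in for the fix described above (illustrative only; the real code builds a Java LogEntryProto): the returned entry must copy term and index from the input entry, rather than constructing a fresh entry whose index defaults to 0.

```python
def read_state_machine_data(input_entry, data):
    """Sketch of the corrected behaviour: preserve the input entry's term
    and index, replacing only the state-machine data payload. Hypothetical
    names; not the actual ContainerStateMachine code."""
    entry = dict(input_entry)  # carries over term and index from the input
    entry["data"] = data       # attach the state-machine data
    return entry


src = {"term": 14, "index": 20375, "data": None}
print(read_state_machine_data(src, b"chunk-bytes")["index"])
```

An entry returned with index 0 instead of the original index is exactly what trips the `Unexpected Index: previous is (t:14, i:20374) but entries[0].getIndex()=0` precondition in the trace below.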

{code}
2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: 
Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index 
to:20312
2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
lastRpcTime:1182, electionTimeout:990ms
2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
for changeToCandidate
2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 
for changeToFollower
2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
lastRpcTime:2167, electionTimeout:976ms
2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
for changeToCandidate
2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 
2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for initElection
2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 
for changeToFollower
2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to 
b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for appendEntries
 
2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8:
 Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with 
term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
lastRpcElapsed=0ms
 
2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 
response(s) 
[2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] 
and 0 exception(s); 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, 
leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, 
voted=2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858, raftlog=[(t:14, i:20374)], 
conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858], old=null
2018-08-20 07:54:31,227 WARN 
org.apache.ratis.grpc.server.RaftServerProtocolService: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
java.lang.IllegalStateException: Unexpected Index: previous is (t:14, i:20374) 
but entries[0].getIndex()=0
at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
at 
org.apache.ratis.server.impl.RaftServerImpl.validateEntries(RaftServerImpl.java:786)
at 
org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:859)
at 
org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:824)
at 
org.apache.ratis.server.impl.RaftServerProxy.appendEntriesAsync(RaftServerProxy.java:247)
at 
org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:76)
at 
org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:66)
at 
org.apache.ratis.shaded.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
at 
org.apache.ratis.shaded.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:252)
at 

[jira] [Commented] (HDDS-222) Remove hdfs command line from ozone distribution.

2018-09-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611014#comment-16611014
 ] 

Hudson commented on HDDS-222:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14921 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14921/])
HDDS-222. Remove hdfs command line from ozone distribution. Contributed 
(aengineer: rev 7b5886bf784579cc97656266901e6f934522b0e8)
* (edit) hadoop-ozone/pom.xml
* (edit) dev-support/bin/ozone-dist-layout-stitching
* (edit) hadoop-ozone/common/src/main/bin/stop-ozone.sh
* (edit) hadoop-hdds/server-scm/pom.xml
* (edit) hadoop-ozone/objectstore-service/pom.xml
* (edit) hadoop-hdds/framework/pom.xml
* (edit) hadoop-ozone/common/src/main/bin/start-ozone.sh
* (edit) hadoop-hdds/container-service/pom.xml
* (edit) hadoop-ozone/ozone-manager/pom.xml
* (edit) hadoop-hdds/pom.xml
* (edit) hadoop-ozone/common/src/main/bin/ozone
* (add) hadoop-ozone/common/src/main/bin/ozone-config.sh
* (edit) hadoop-hdds/client/pom.xml


> Remove hdfs command line from ozone distribution.
> -
>
> Key: HDDS-222
> URL: https://issues.apache.org/jira/browse/HDDS-222
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: newbie
> Fix For: 0.3.0
>
> Attachments: HDDS-222-ozone-0.2.005.patch, HDDS-222.001.patch, 
> HDDS-222.002.patch, HDDS-222.003.patch, HDDS-222.004.patch
>
>
> As the ozone release artifact doesn't contain stable namenode/datanode code, 
> the hdfs command should be removed from the ozone artifact.
> ozone-dist-layout-stitching could also be simplified to copy only the 
> required jar files (we don't need to copy the namenode/datanode server-side 
> jars, just the common artifacts).






[jira] [Updated] (HDFS-13906) RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands

2018-09-11 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13906:

Attachment: HDFS-13906-01.patch

> RBF: Add multiple paths for dfsrouteradmin "rm" and "clrquota" commands
> ---
>
> Key: HDFS-13906
> URL: https://issues.apache.org/jira/browse/HDFS-13906
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13906-01.patch
>
>
> Currently we have the option to delete only one mount entry at a time.
> If there are multiple mount entries, the user has to execute the command N
> times, which is cumbersome.
> It would be better if the "rm" and "clrQuota" commands supported multiple
> entries, so the user could provide all the required entries in a single
> command.
> The Namenode already supports "rm" and "clrQuota" with multiple destinations.





