[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-20 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366367#comment-17366367
 ] 

Hadoop QA commented on HDFS-16067:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
32s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:blue}0{color} | {color:blue} markdownlint {color} | {color:blue}  0m  
0s{color} | {color:blue}{color} | {color:blue} markdownlint was not available. 
{color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
16s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
44s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
31s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m  
8s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
55s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
59s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 43s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
3s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
4s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 33m 
35s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  5m 
44s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
11s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
46s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 
46s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m  
7s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m  
7s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
51s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
2s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient 

[jira] [Work logged] (HDFS-16076) Avoid using slow DataNodes for reading by sorting locations

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16076?focusedWorklogId=612411=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612411
 ]

ASF GitHub Bot logged work on HDFS-16076:
-

Author: ASF GitHub Bot
Created on: 21/Jun/21 02:05
Start Date: 21/Jun/21 02:05
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3117:
URL: https://github.com/apache/hadoop/pull/3117#issuecomment-864670631


   Hi @tasanuma @jojochuang @goiri @Hexiaoqiao @ayushtkn , could you please to 
review the code? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612411)
Time Spent: 50m  (was: 40m)

> Avoid using slow DataNodes for reading by sorting locations
> ---
>
> Key: HDFS-16076
> URL: https://issues.apache.org/jira/browse/HDFS-16076
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> After sorting the expected location list will be: live -> slow -> stale -> 
> staleAndSlow -> entering_maintenance -> decommissioned. This reduces the 
> probability that slow nodes will be used for reading.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16076) Avoid using slow DataNodes for reading by sorting locations

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16076?focusedWorklogId=612410=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612410
 ]

ASF GitHub Bot logged work on HDFS-16076:
-

Author: ASF GitHub Bot
Created on: 21/Jun/21 02:02
Start Date: 21/Jun/21 02:02
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3117:
URL: https://github.com/apache/hadoop/pull/3117#issuecomment-864669265


   Those failed unit tests work fine locally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612410)
Time Spent: 40m  (was: 0.5h)

> Avoid using slow DataNodes for reading by sorting locations
> ---
>
> Key: HDFS-16076
> URL: https://issues.apache.org/jira/browse/HDFS-16076
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> After sorting the expected location list will be: live -> slow -> stale -> 
> staleAndSlow -> entering_maintenance -> decommissioned. This reduces the 
> probability that slow nodes will be used for reading.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16069) Remove locally stored files (edit log) when NameNode becomes Standby

2021-06-20 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu reassigned HDFS-16069:
---

Assignee: (was: JiangHua Zhu)

> Remove locally stored files (edit log) when NameNode becomes Standby
> 
>
> Key: HDFS-16069
> URL: https://issues.apache.org/jira/browse/HDFS-16069
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Priority: Minor
>
> When zkfc is working, one of the NameNode (Active) will become the Standby 
> state. Before the state change, this NameNode has saved some files (edit 
> log), these files are stored in the directory 
> (dfs.namenode.edits.dir/dfs.namenode.name.dir) , And will not disappear in 
> the short term until the status of this NameNode becomes Active again.
> These files (edit log) are of little significance to the cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16075) Use empty array constants present in StorageType and DatanodeInfo to avoid creating redundant objects

2021-06-20 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-16075:

Fix Version/s: 3.3.2
   3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Use empty array constants present in StorageType and DatanodeInfo to avoid 
> creating redundant objects
> -
>
> Key: HDFS-16075
> URL: https://issues.apache.org/jira/browse/HDFS-16075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> StorageType and DatanodeInfo already provides empty array constants. We 
> should use them where possible in order to avoid creating unnecessary new 
> empty array objects.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16078) Remove unused parameters for DatanodeManager.handleLifeline()

2021-06-20 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366325#comment-17366325
 ] 

Hui Fei commented on HDFS-16078:


Change fix versions from 3.3.1 3.4.0 3.3.2 to 3.2.3 3.4.0 3.3.2

> Remove unused parameters for DatanodeManager.handleLifeline()
> -
>
> Key: HDFS-16078
> URL: https://issues.apache.org/jira/browse/HDFS-16078
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.3.2
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Remove unused parameters (blockPoolId, maxTransfers) for 
> DatanodeManager.handleLifeline().



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16078) Remove unused parameters for DatanodeManager.handleLifeline()

2021-06-20 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-16078:
---
Fix Version/s: (was: 3.3.1)
   3.2.3

> Remove unused parameters for DatanodeManager.handleLifeline()
> -
>
> Key: HDFS-16078
> URL: https://issues.apache.org/jira/browse/HDFS-16078
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Remove unused parameters (blockPoolId, maxTransfers) for 
> DatanodeManager.handleLifeline().



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16075) Use empty array constants present in StorageType and DatanodeInfo to avoid creating redundant objects

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16075?focusedWorklogId=612408=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612408
 ]

ASF GitHub Bot logged work on HDFS-16075:
-

Author: ASF GitHub Bot
Created on: 21/Jun/21 01:26
Start Date: 21/Jun/21 01:26
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #3115:
URL: https://github.com/apache/hadoop/pull/3115#issuecomment-864659363


   Merged it. Thanks for your contribution, @virajjasani. Thanks for your 
review, @ferhui.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612408)
Time Spent: 1.5h  (was: 1h 20m)

> Use empty array constants present in StorageType and DatanodeInfo to avoid 
> creating redundant objects
> -
>
> Key: HDFS-16075
> URL: https://issues.apache.org/jira/browse/HDFS-16075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> StorageType and DatanodeInfo already provides empty array constants. We 
> should use them where possible in order to avoid creating unnecessary new 
> empty array objects.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16075) Use empty array constants present in StorageType and DatanodeInfo to avoid creating redundant objects

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16075?focusedWorklogId=612407=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612407
 ]

ASF GitHub Bot logged work on HDFS-16075:
-

Author: ASF GitHub Bot
Created on: 21/Jun/21 01:25
Start Date: 21/Jun/21 01:25
Worklog Time Spent: 10m 
  Work Description: tasanuma merged pull request #3115:
URL: https://github.com/apache/hadoop/pull/3115


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612407)
Time Spent: 1h 20m  (was: 1h 10m)

> Use empty array constants present in StorageType and DatanodeInfo to avoid 
> creating redundant objects
> -
>
> Key: HDFS-16075
> URL: https://issues.apache.org/jira/browse/HDFS-16075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> StorageType and DatanodeInfo already provides empty array constants. We 
> should use them where possible in order to avoid creating unnecessary new 
> empty array objects.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16079) Improve the block state change log

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16079?focusedWorklogId=612406=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612406
 ]

ASF GitHub Bot logged work on HDFS-16079:
-

Author: ASF GitHub Bot
Created on: 21/Jun/21 01:23
Start Date: 21/Jun/21 01:23
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3120:
URL: https://github.com/apache/hadoop/pull/3120#issuecomment-864658614


   Thanks @ayushtkn and @Hexiaoqiao for your review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612406)
Time Spent: 1h  (was: 50m)

> Improve the block state change log
> --
>
> Key: HDFS-16079
> URL: https://issues.apache.org/jira/browse/HDFS-16079
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Improve the block state change log. Add readOnlyReplicas and 
> replicasOnStaleNodes. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-06-20 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366323#comment-17366323
 ] 

ludun edited comment on HDFS-16044 at 6/21/21, 1:08 AM:


[~hexiaoqiao] when request getLsiting, we can pass a parameter to  the method.
{code:java}
DirectoryListing files = namesystem.getListing(
src, startAfter, needLocation);
{code}

so  we should not return location with needLocation is false.

maybe  we should change to

{code:java}
if (null == locations && isdir && null == symlink || !locatedStatus) {
{code}




was (Author: pilchard):
[~hexiaoqiao] when request getLsiting, we can pass a parameter to  the method.
{code:java}
DirectoryListing files = namesystem.getListing(
src, startAfter, needLocation);
{code}

so  we should not return location with needLocation is false.

maybe  we should changge to

{code:java}
if (null == locations && isdir && null == symlink || !locatedStatus) {
{code}



> getListing call getLocatedBlocks even source is a directory
> ---
>
> Key: HDFS-16044
> URL: https://issues.apache.org/jira/browse/HDFS-16044
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.1
>Reporter: ludun
>Assignee: ludun
>Priority: Major
> Attachments: HDFS-16044.00.patch, HDFS-16044.01.patch
>
>
> In production cluster when call getListing very frequent.  The processing 
> time of rpc request is very high. we try  to  optimize the performance of 
> getListing request.
> After some check, we found that, even the source and child is dir,   the 
> getListing request also call   getLocatedBlocks. 
> the request is and  needLocation is false
> {code:java}
> 2021-05-27 15:16:07,093 TRACE ipc.ProtobufRpcEngine: 1: Call -> 
> 8-5-231-4/8.5.231.4:25000: getListing {src: 
> "/data/connector/test/topics/102test" startAfter: "" needLocation: false}
> {code}
> but getListing request 1000 times getLocatedBlocks which not needed.
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
> 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
> `---[35.068532ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
> +---[0.003542ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
> +---[0.003053ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
> +---[0.002938ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
> +---[0.00252ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
> +---[0.002788ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
> +---[0.002905ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
> +---[0.002785ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
> +---[0.002236ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
> +---[0.002919ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
> +---[0.003408ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
> +---[0.005942ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
> +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
> +---[0.005481ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
> +---[0.002176ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
> +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
> org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
> +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
> +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
> +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
>  #95
> +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus()
>  #257
> +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
> org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
> +---[0.003234ms] 
> org.apache.hadoop.hdfs.protocol.DirectoryListing:() #274
> `---[0.002457ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
> {code}



--
This message was sent by 

[jira] [Comment Edited] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-06-20 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366323#comment-17366323
 ] 

ludun edited comment on HDFS-16044 at 6/21/21, 1:08 AM:


[~hexiaoqiao] when request getLsiting, we can pass a parameter to  the method.
{code:java}
DirectoryListing files = namesystem.getListing(
src, startAfter, needLocation);
{code}

so  we should not return location with needLocation is false.

maybe  we should change to

{code:java}
if (null == locations && isdir && null == symlink || !locatedStatus) {
{code}

i will fix checkstyles and build failure


was (Author: pilchard):
[~hexiaoqiao] when request getLsiting, we can pass a parameter to  the method.
{code:java}
DirectoryListing files = namesystem.getListing(
src, startAfter, needLocation);
{code}

so  we should not return location with needLocation is false.

maybe  we should change to

{code:java}
if (null == locations && isdir && null == symlink || !locatedStatus) {
{code}



> getListing call getLocatedBlocks even source is a directory
> ---
>
> Key: HDFS-16044
> URL: https://issues.apache.org/jira/browse/HDFS-16044
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.1
>Reporter: ludun
>Assignee: ludun
>Priority: Major
> Attachments: HDFS-16044.00.patch, HDFS-16044.01.patch
>
>
> In production cluster when call getListing very frequent.  The processing 
> time of rpc request is very high. we try  to  optimize the performance of 
> getListing request.
> After some check, we found that, even the source and child is dir,   the 
> getListing request also call   getLocatedBlocks. 
> the request is and  needLocation is false
> {code:java}
> 2021-05-27 15:16:07,093 TRACE ipc.ProtobufRpcEngine: 1: Call -> 
> 8-5-231-4/8.5.231.4:25000: getListing {src: 
> "/data/connector/test/topics/102test" startAfter: "" needLocation: false}
> {code}
> but getListing request 1000 times getLocatedBlocks which not needed.
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
> 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
> `---[35.068532ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
> +---[0.003542ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
> +---[0.003053ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
> +---[0.002938ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
> +---[0.00252ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
> +---[0.002788ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
> +---[0.002905ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
> +---[0.002785ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
> +---[0.002236ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
> +---[0.002919ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
> +---[0.003408ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
> +---[0.005942ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
> +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
> +---[0.005481ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
> +---[0.002176ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
> +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
> org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
> +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
> +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
> +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
>  #95
> +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus()
>  #257
> +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
> org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
> +---[0.003234ms] 
> org.apache.hadoop.hdfs.protocol.DirectoryListing:() #274
> `---[0.002457ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
> {code}

[jira] [Commented] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-06-20 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366323#comment-17366323
 ] 

ludun commented on HDFS-16044:
--

[~hexiaoqiao] when request getLsiting, we can pass a parameter to  the method.
{code:java}
DirectoryListing files = namesystem.getListing(
src, startAfter, needLocation);
{code}

so  we should not return location with needLocation is false.

maybe  we should changge to

{code:java}
if (null == locations && isdir && null == symlink || !locatedStatus) {
{code}



> getListing call getLocatedBlocks even source is a directory
> ---
>
> Key: HDFS-16044
> URL: https://issues.apache.org/jira/browse/HDFS-16044
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.1
>Reporter: ludun
>Assignee: ludun
>Priority: Major
> Attachments: HDFS-16044.00.patch, HDFS-16044.01.patch
>
>
> In production cluster when call getListing very frequent.  The processing 
> time of rpc request is very high. we try  to  optimize the performance of 
> getListing request.
> After some check, we found that, even the source and child is dir,   the 
> getListing request also call   getLocatedBlocks. 
> the request is and  needLocation is false
> {code:java}
> 2021-05-27 15:16:07,093 TRACE ipc.ProtobufRpcEngine: 1: Call -> 
> 8-5-231-4/8.5.231.4:25000: getListing {src: 
> "/data/connector/test/topics/102test" startAfter: "" needLocation: false}
> {code}
> but getListing request 1000 times getLocatedBlocks which not needed.
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
> 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
> `---[35.068532ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
> +---[0.003542ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
> +---[0.003053ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
> +---[0.002938ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
> +---[0.00252ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
> +---[0.002788ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
> +---[0.002905ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
> +---[0.002785ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
> +---[0.002236ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
> +---[0.002919ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
> +---[0.003408ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
> +---[0.005942ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
> +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
> +---[0.005481ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
> +---[0.002176ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
> +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
> org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
> +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
> +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
> +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
>  #95
> +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus()
>  #257
> +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
> org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
> +---[0.003234ms] 
> org.apache.hadoop.hdfs.protocol.DirectoryListing:() #274
> `---[0.002457ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-20 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366236#comment-17366236
 ] 

Renukaprasad C commented on HDFS-16067:
---

Thanks [~hexiaoqiao] for quick review & feedback.

A. Regarding Prepare Fileset: AppendFileStats extends OpenFileStats, so 
generateInputs() is being called from the base 
classOperationStatsBase#benchmark(). Please correct me if i missed something 
from your point.

B. I had updated and uploaded HDFS-16067.002.patch.

Please review & feedback if anything else i missed. Thank you.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-20 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16067:
--
Attachment: HDFS-16067.002.patch

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16080) RBF: Invoking method in all locations should break the loop after successful result

2021-06-20 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-16080:

Fix Version/s: 3.3.2

> RBF: Invoking method in all locations should break the loop after successful 
> result
> ---
>
> Key: HDFS-16080
> URL: https://issues.apache.org/jira/browse/HDFS-16080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> rename, delete and mkdir used by Router client usually calls multiple 
> locations if the path is present in multiple sub-clusters. After invoking 
> multiple concurrent proxy calls to multiple clients, we iterate through all 
> results and mark anyResult true if at least one of them was successful. We 
> should break the loop if one of the proxy call result was successful rather 
> than iterating over remaining calls.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16079) Improve the block state change log

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16079?focusedWorklogId=612368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612368
 ]

ASF GitHub Bot logged work on HDFS-16079:
-

Author: ASF GitHub Bot
Created on: 20/Jun/21 11:59
Start Date: 20/Jun/21 11:59
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on pull request #3120:
URL: https://github.com/apache/hadoop/pull/3120#issuecomment-864543045


   Thanx @tomscut for the contribution and @Hexiaoqiao for the review!!!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612368)
Time Spent: 50m  (was: 40m)

> Improve the block state change log
> --
>
> Key: HDFS-16079
> URL: https://issues.apache.org/jira/browse/HDFS-16079
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Improve the block state change log. Add readOnlyReplicas and 
> replicasOnStaleNodes. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16079) Improve the block state change log

2021-06-20 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HDFS-16079.
-
Fix Version/s: 3.3.2
   3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Improve the block state change log
> --
>
> Key: HDFS-16079
> URL: https://issues.apache.org/jira/browse/HDFS-16079
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Improve the block state change log. Add readOnlyReplicas and 
> replicasOnStaleNodes. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16079) Improve the block state change log

2021-06-20 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366172#comment-17366172
 ] 

Ayush Saxena commented on HDFS-16079:
-

Committed to trunk and branch-3.3


> Improve the block state change log
> --
>
> Key: HDFS-16079
> URL: https://issues.apache.org/jira/browse/HDFS-16079
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Improve the block state change log. Add readOnlyReplicas and 
> replicasOnStaleNodes. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16079) Improve the block state change log

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16079?focusedWorklogId=612367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612367
 ]

ASF GitHub Bot logged work on HDFS-16079:
-

Author: ASF GitHub Bot
Created on: 20/Jun/21 11:53
Start Date: 20/Jun/21 11:53
Worklog Time Spent: 10m 
  Work Description: ayushtkn merged pull request #3120:
URL: https://github.com/apache/hadoop/pull/3120


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612367)
Time Spent: 40m  (was: 0.5h)

> Improve the block state change log
> --
>
> Key: HDFS-16079
> URL: https://issues.apache.org/jira/browse/HDFS-16079
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Improve the block state change log. Add readOnlyReplicas and 
> replicasOnStaleNodes. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16080) RBF: Invoking method in all locations should break the loop after successful result

2021-06-20 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366171#comment-17366171
 ] 

Ayush Saxena commented on HDFS-16080:
-

Committed to trunk. Thanx [~vjasani] for the contribution!!!

> RBF: Invoking method in all locations should break the loop after successful 
> result
> ---
>
> Key: HDFS-16080
> URL: https://issues.apache.org/jira/browse/HDFS-16080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> rename, delete and mkdir used by Router client usually calls multiple 
> locations if the path is present in multiple sub-clusters. After invoking 
> multiple concurrent proxy calls to multiple clients, we iterate through all 
> results and mark anyResult true if at least one of them was successful. We 
> should break the loop if one of the proxy call result was successful rather 
> than iterating over remaining calls.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16080) RBF: Invoking method in all locations should break the loop after successful result

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16080?focusedWorklogId=612366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612366
 ]

ASF GitHub Bot logged work on HDFS-16080:
-

Author: ASF GitHub Bot
Created on: 20/Jun/21 11:49
Start Date: 20/Jun/21 11:49
Worklog Time Spent: 10m 
  Work Description: ayushtkn merged pull request #3121:
URL: https://github.com/apache/hadoop/pull/3121


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612366)
Time Spent: 1.5h  (was: 1h 20m)

> RBF: Invoking method in all locations should break the loop after successful 
> result
> ---
>
> Key: HDFS-16080
> URL: https://issues.apache.org/jira/browse/HDFS-16080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> rename, delete and mkdir used by Router client usually calls multiple 
> locations if the path is present in multiple sub-clusters. After invoking 
> multiple concurrent proxy calls to multiple clients, we iterate through all 
> results and mark anyResult true if at least one of them was successful. We 
> should break the loop if one of the proxy call result was successful rather 
> than iterating over remaining calls.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16080) RBF: Invoking method in all locations should break the loop after successful result

2021-06-20 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-16080:

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> RBF: Invoking method in all locations should break the loop after successful 
> result
> ---
>
> Key: HDFS-16080
> URL: https://issues.apache.org/jira/browse/HDFS-16080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> rename, delete and mkdir used by Router client usually calls multiple 
> locations if the path is present in multiple sub-clusters. After invoking 
> multiple concurrent proxy calls to multiple clients, we iterate through all 
> results and mark anyResult true if at least one of them was successful. We 
> should break the loop if one of the proxy call result was successful rather 
> than iterating over remaining calls.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-20 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366167#comment-17366167
 ] 

Xiaoqiao He commented on HDFS-16067:


Thanks [~prasad-acit] involve me here. I agree that it is useful to add 
operation `append` to NNThroughputBenchmark.
Just have a quick glance [^HDFS-16067.001.patch]. It seems not implement 
completely now.
A. For Append operation, we need prepare files set for it, reference to 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.OpenFileStats#generateInputs
 for Open operation.
B. We need to update benchmarking manual at 
hadoop-common-project/hadoop-common/src/site/markdown/Benchmarking.md.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16080) RBF: Invoking method in all locations should break the loop after successful result

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16080?focusedWorklogId=612360=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612360
 ]

ASF GitHub Bot logged work on HDFS-16080:
-

Author: ASF GitHub Bot
Created on: 20/Jun/21 08:13
Start Date: 20/Jun/21 08:13
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3121:
URL: https://github.com/apache/hadoop/pull/3121#issuecomment-864516862


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 52s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 24s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 14s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 11s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 16s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 52s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  24m 49s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 29s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 104m 31s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3121/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3121 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux ded718d56d97 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 
05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 552f185b6de1ec9d343b7c6963b2f94fbac0f6fe |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3121/3/testReport/ |
   | Max. process+thread count | 2302 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3121/3/console |
   | 

[jira] [Work logged] (HDFS-16080) RBF: Invoking method in all locations should break the loop after successful result

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16080?focusedWorklogId=612355=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612355
 ]

ASF GitHub Bot logged work on HDFS-16080:
-

Author: ASF GitHub Bot
Created on: 20/Jun/21 06:28
Start Date: 20/Jun/21 06:28
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on a change in pull request #3121:
URL: https://github.com/apache/hadoop/pull/3121#discussion_r654884615



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java
##
@@ -1129,25 +1129,17 @@ private static boolean isExpectedValue(Object 
expectedValue, Object value) {
* Invoke method in all locations and return success if any succeeds.
*
* @param  The type of the remote location.
-   * @param  The type of the remote method return.
* @param locations List of remote locations to call concurrently.
* @param method The remote method and parameters to invoke.
* @return If the call succeeds in any location.
* @throws IOException If any of the calls return an exception.
*/
-  public  boolean invokeAll(
+  public  boolean invokeAll(
   final Collection locations, final RemoteMethod method)
-  throws IOException {
-boolean anyResult = false;
+  throws IOException {
 Map results =
 invokeConcurrent(locations, method, false, false, Boolean.class);
-for (Boolean value : results.values()) {
-  boolean result = value.booleanValue();
-  if (result) {
-anyResult = true;
-  }
-}
-return anyResult;
+return results.values().stream().anyMatch(value -> value);

Review comment:
   Done. Thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612355)
Time Spent: 1h 10m  (was: 1h)

> RBF: Invoking method in all locations should break the loop after successful 
> result
> ---
>
> Key: HDFS-16080
> URL: https://issues.apache.org/jira/browse/HDFS-16080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> rename, delete and mkdir used by Router client usually calls multiple 
> locations if the path is present in multiple sub-clusters. After invoking 
> multiple concurrent proxy calls to multiple clients, we iterate through all 
> results and mark anyResult true if at least one of them was successful. We 
> should break the loop if one of the proxy call result was successful rather 
> than iterating over remaining calls.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16080) RBF: Invoking method in all locations should break the loop after successful result

2021-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16080?focusedWorklogId=612354=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612354
 ]

ASF GitHub Bot logged work on HDFS-16080:
-

Author: ASF GitHub Bot
Created on: 20/Jun/21 06:24
Start Date: 20/Jun/21 06:24
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on a change in pull request #3121:
URL: https://github.com/apache/hadoop/pull/3121#discussion_r654884131



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java
##
@@ -1129,25 +1129,17 @@ private static boolean isExpectedValue(Object 
expectedValue, Object value) {
* Invoke method in all locations and return success if any succeeds.
*
* @param  The type of the remote location.
-   * @param  The type of the remote method return.
* @param locations List of remote locations to call concurrently.
* @param method The remote method and parameters to invoke.
* @return If the call succeeds in any location.
* @throws IOException If any of the calls return an exception.
*/
-  public  boolean invokeAll(
+  public  boolean invokeAll(
   final Collection locations, final RemoteMethod method)
-  throws IOException {
-boolean anyResult = false;
+  throws IOException {
 Map results =
 invokeConcurrent(locations, method, false, false, Boolean.class);
-for (Boolean value : results.values()) {
-  boolean result = value.booleanValue();
-  if (result) {
-anyResult = true;
-  }
-}
-return anyResult;
+return results.values().stream().anyMatch(value -> value);

Review comment:
   I think your suggestion is good enough to change this. Let me update the 
PR.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 612354)
Time Spent: 1h  (was: 50m)

> RBF: Invoking method in all locations should break the loop after successful 
> result
> ---
>
> Key: HDFS-16080
> URL: https://issues.apache.org/jira/browse/HDFS-16080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> rename, delete and mkdir used by Router client usually calls multiple 
> locations if the path is present in multiple sub-clusters. After invoking 
> multiple concurrent proxy calls to multiple clients, we iterate through all 
> results and mark anyResult true if at least one of them was successful. We 
> should break the loop if one of the proxy call result was successful rather 
> than iterating over remaining calls.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org