[jira] [Created] (HDFS-16770) [Documentation] RBF: Duplicate statement to be removed for better readability

2022-09-12 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16770:
-

 Summary: [Documentation] RBF: Duplicate statement to be removed 
for better readability
 Key: HDFS-16770
 URL: https://issues.apache.org/jira/browse/HDFS-16770
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Renukaprasad C
Assignee: Renukaprasad C


Both of the statements below convey the same meaning; the latter one can be removed.

The Router monitors the local NameNode and its state and heartbeats to the 
State Store.
The Router monitors the local NameNode and heartbeats the state to the State 
Store.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15067) Optimize heartbeat for large cluster

2022-07-17 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567819#comment-17567819
 ] 

Renukaprasad C commented on HDFS-15067:
---

Thanks [~surendralilhore]  for reporting the issue and the patch. Thanks 
[~ayushtkn] [~umamaheswararao] for review & feedback.

This optimization has been running in our large clusters for a long time, with no 
related issues reported. Shall the patch be considered for merge? We can take the 
other improvements up separately.

> Optimize heartbeat for large cluster
> 
>
> Key: HDFS-15067
> URL: https://issues.apache.org/jira/browse/HDFS-15067
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-15067.01.patch, HDFS-15067.02.patch, 
> HDFS-15067.03.patch, image-2020-01-09-18-00-49-556.png
>
>
> In a large cluster the Namenode spends some time processing heartbeats. For 
> example, in a 10K-node cluster the Namenode processes 10K heartbeat RPCs every 
> 3 seconds. This impacts client response time. The heartbeat can be optimized: 
> a DN can start skipping heartbeats if no work (write/replication/delete) has 
> been allocated to it for a long time, sending a heartbeat every 6 seconds 
> instead. Once the DN starts getting work from the NN, it resumes sending 
> heartbeats normally.
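
The idea above can be sketched roughly as follows; this is a minimal illustration of the skipping logic, with made-up names, thresholds and intervals, not the attached patch:

{code:java}
/**
 * Minimal sketch of the heartbeat-skipping idea described above.
 * Class name, threshold and interval values are illustrative only.
 */
class HeartbeatIntervalSketch {
  static final long NORMAL_INTERVAL_MS = 3_000L;        // default heartbeat interval
  static final long IDLE_THRESHOLD_MS = 10 * 60_000L;   // "no work for a long time"

  private long lastWorkFromNameNodeMs = System.currentTimeMillis();

  /** Called whenever the NameNode returns real work (write/replication/delete). */
  void onWorkReceived() {
    lastWorkFromNameNodeMs = System.currentTimeMillis();
  }

  /** Idle DataNodes heartbeat at twice the interval; busy ones keep the normal rate. */
  long nextHeartbeatIntervalMs() {
    boolean idle =
        System.currentTimeMillis() - lastWorkFromNameNodeMs > IDLE_THRESHOLD_MS;
    return idle ? 2 * NORMAL_INTERVAL_MS : NORMAL_INTERVAL_MS;
  }
}
{code}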



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16580) Datanode to print the blockID while releasing the SCFds

2022-05-16 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C reassigned HDFS-16580:
-

Assignee: (was: Renukaprasad C)

> Datanode to print the blockID while releasing the SCFds
> ---
>
> Key: HDFS-16580
> URL: https://issues.apache.org/jira/browse/HDFS-16580
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.4, 3.4.0, 3.3.2
>Reporter: Renukaprasad C
>Priority: Major
>
> The method 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver#requestShortCircuitFds 
> prints the block ID requested for short-circuit read, but the corresponding 
> entry is missing in 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver#releaseShortCircuitFds. 
> It would be good to have the corresponding blockID in the release method as well.
> We are facing some random file read issues when SCR is enabled. From the 
> current logs, we cannot map the request & release flows. 
> It will be helpful for debugging issues if we log the blockID in the release 
> method as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16580) Datanode to print the blockID while releasing the SCFds

2022-05-16 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16580:
-

 Summary: Datanode to print the blockID while releasing the SCFds
 Key: HDFS-16580
 URL: https://issues.apache.org/jira/browse/HDFS-16580
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.3.2, 3.1.4, 3.4.0
Reporter: Renukaprasad C
Assignee: Renukaprasad C


The method 
org.apache.hadoop.hdfs.server.datanode.DataXceiver#requestShortCircuitFds 
prints the block ID requested for short-circuit read, but the corresponding entry 
is missing in 
org.apache.hadoop.hdfs.server.datanode.DataXceiver#releaseShortCircuitFds. 

It would be good to have the corresponding blockID in the release method as well.

We are facing some random file read issues when SCR is enabled. From the 
current logs, we cannot map the request & release flows. 

It will be helpful for debugging issues if we log the blockID in the release 
method as well.
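
A minimal sketch of the requested pairing, assuming an SLF4J logger; the method shapes below are simplified stand-ins, not the real DataXceiver signatures:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Sketch of logging the block ID on both the request and the release path. */
class ShortCircuitFdLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(ShortCircuitFdLoggingSketch.class);

  void requestShortCircuitFds(long blockId) {
    // Already logged today on the request path.
    LOG.info("Requesting short-circuit FDs for block {}", blockId);
  }

  void releaseShortCircuitFds(long blockId) {
    // The proposed addition: log the same block ID on release so the two flows can be matched.
    LOG.info("Releasing short-circuit FDs for block {}", blockId);
  }
}
{code}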



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16563) Namenode WebUI prints sensitive information on Token Expiry

2022-04-27 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16563:
--
Attachment: image-2022-04-27-23-28-40-568.png

> Namenode WebUI prints sensitive information on Token Expiry
> --
>
> Key: HDFS-16563
> URL: https://issues.apache.org/jira/browse/HDFS-16563
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-04-27-23-01-16-033.png, 
> image-2022-04-27-23-28-40-568.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Login to Namenode WebUI.
> Wait for token to expire. (Or modify the Token refresh time 
> dfs.namenode.delegation.token.renew/update-interval to lower value)
> Refresh the WebUI after the Token expiry.
> Full token information gets printed in WebUI.
>  
> !image-2022-04-27-23-01-16-033.png!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16563) Namenode WebUI prints sensitive information on Token Expiry

2022-04-27 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528943#comment-17528943
 ] 

Renukaprasad C commented on HDFS-16563:
---

Changes verified in cluster:

!image-2022-04-27-23-28-40-568.png!

> Namenode WebUI prints sensitive information on Token Expiry
> --
>
> Key: HDFS-16563
> URL: https://issues.apache.org/jira/browse/HDFS-16563
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-04-27-23-01-16-033.png, 
> image-2022-04-27-23-28-40-568.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Login to Namenode WebUI.
> Wait for token to expire. (Or modify the Token refresh time 
> dfs.namenode.delegation.token.renew/update-interval to lower value)
> Refresh the WebUI after the Token expiry.
> Full token information gets printed in WebUI.
>  
> !image-2022-04-27-23-01-16-033.png!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16563) Namenode WebUI prints sensitive information on Token Expiry

2022-04-27 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C reassigned HDFS-16563:
-

Assignee: Renukaprasad C

> Namenode WebUI prints sensitive information on Token Expiry
> --
>
> Key: HDFS-16563
> URL: https://issues.apache.org/jira/browse/HDFS-16563
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: image-2022-04-27-23-01-16-033.png
>
>
> Login to Namenode WebUI.
> Wait for token to expire. (Or modify the Token refresh time 
> dfs.namenode.delegation.token.renew/update-interval to lower value)
> Refresh the WebUI after the Token expiry.
> Full token information gets printed in WebUI.
>  
> !image-2022-04-27-23-01-16-033.png!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16563) Namenode WebUI prints sensitive information on Token Expiry

2022-04-27 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16563:
-

 Summary: Namenode WebUI prints sensitive information on Token Expiry
 Key: HDFS-16563
 URL: https://issues.apache.org/jira/browse/HDFS-16563
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Renukaprasad C
 Attachments: image-2022-04-27-23-01-16-033.png

Log in to the Namenode WebUI.

Wait for the token to expire (or modify the token refresh time 
dfs.namenode.delegation.token.renew/update-interval to a lower value).

Refresh the WebUI after the token expiry.

The full token information gets printed in the WebUI.

 

!image-2022-04-27-23-01-16-033.png!
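
For reproducing this quickly in a test setup, the delegation-token intervals can be lowered programmatically; a rough sketch, assuming the standard hdfs-default.xml key names and purely illustrative values:

{code:java}
import org.apache.hadoop.conf.Configuration;

/** Sketch: shrink delegation-token lifetimes so the expiry path is hit quickly. */
class ShortTokenLifetimeConf {
  static Configuration build() {
    Configuration conf = new Configuration();
    conf.setLong("dfs.namenode.delegation.key.update-interval", 60_000L);   // 1 minute
    conf.setLong("dfs.namenode.delegation.token.renew-interval", 60_000L);  // 1 minute
    conf.setLong("dfs.namenode.delegation.token.max-lifetime", 120_000L);   // 2 minutes
    return conf;
  }
}
{code}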



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16093) DataNodes under decommission will still be returned to the client via getLocatedBlocks, so the client may request decommissioning datanodes to read which will cause bad

2022-04-27 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528889#comment-17528889
 ] 

Renukaprasad C commented on HDFS-16093:
---

[~Daniel Ma] Thanks for reporting the issue. Thanks [~hexiaoqiao] [~sodonnell] 
[~tomscut] for review & feedback.

I agree with [~hexiaoqiao] / [~tomscut] / [~sodonnell]: instead of excluding the 
decommissioning (or decommissioned) nodes, they can be placed last in the returned 
list, and the read will still succeed from the other normal DNs.

Are you still working on this solution?
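
A minimal sketch of the "place decommissioning replicas last" ordering mentioned above, with simplified stand-in types rather than the real DatanodeInfo/BlockManager code:

{code:java}
import java.util.Comparator;
import java.util.List;

/** Sketch: keep decommissioning/decommissioned replicas in the list, but try them last. */
class ReplicaOrderingSketch {
  enum AdminState { NORMAL, DECOMMISSIONING, DECOMMISSIONED }

  static void deprioritizeDecommissioning(List<AdminState> replicaStates) {
    // Healthy replicas sort first; decommissioning/decommissioned ones move to the tail,
    // so the client falls back to them only if no normal DN can serve the read.
    replicaStates.sort(
        Comparator.comparingInt((AdminState s) -> s == AdminState.NORMAL ? 0 : 1));
  }
}
{code}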

> DataNodes under decommission will still be returned to the client via 
> getLocatedBlocks, so the client may request decommissioning datanodes to read 
> which will cause bad competition on disk IO.
> --
>
> Key: HDFS-16093
> URL: https://issues.apache.org/jira/browse/HDFS-16093
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.1
>Reporter: Daniel Ma
>Assignee: Daniel Ma
>Priority: Critical
>
> DataNodes under decommission will still be returned to the client via 
> getLocatedBlocks, so the client may request decommissioning datanodes to read 
> which will cause bad competition on disk IO.
> Therefore, datanodes under decommission should be removed from the return 
> list of the getLocatedBlocks API.
> !image-2021-06-29-10-50-44-739.png!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15134) Any write calls with REST API on Standby NN print error message with wrong online help URL

2022-04-27 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528881#comment-17528881
 ] 

Renukaprasad C commented on HDFS-15134:
---

Thanks [~Sushma_28] for the patch. LGTM.

[~Hemanth Boyina] [~hexiaoqiao] Can you please take a look at it?

> Any write calls with REST API on Standby NN print error message with wrong 
> online help URL
> --
>
> Key: HDFS-15134
> URL: https://issues.apache.org/jira/browse/HDFS-15134
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15134.001.patch
>
>
> vm2:/opt# curl -k -i --negotiate -u : 
> "http://IP:PORT/webhdfs/v1/test?op=MKDIRS;
> HTTP/1.1 403 Forbidden
> Date: Mon, 20 Jan 2020 07:28:19 GMT
> Cache-Control: no-cache
> Expires: Mon, 20 Jan 2020 07:28:20 GMT
> Date: Mon, 20 Jan 2020 07:28:20 GMT
> Pragma: no-cache
> X-FRAME-OPTIONS: SAMEORIGIN
> Content-Type: application/json
> Transfer-Encoding: chunked
> {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation
>  category WRITE is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error"}}
> The link https://s.apache.org/sbnn-error is invalid and doesn't exist. This needs 
> to be updated.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16094) HDFS balancer process start failed owing to daemon pid file is not cleared in some exception scenario

2022-04-26 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528382#comment-17528382
 ] 

Renukaprasad C commented on HDFS-16094:
---

A similar issue has been addressed in HDFS-15932. [~Daniel Ma], please check if any 
other information needs to be added.

> HDFS balancer process start failed owing to daemon pid file is not cleared in 
> some exception scenario
> 
>
> Key: HDFS-16094
> URL: https://issues.apache.org/jira/browse/HDFS-16094
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.3.1
>Reporter: Daniel Ma
>Priority: Major
>
> The HDFS balancer process fails to start because the daemon pid file is not 
> cleared in some exception scenarios, but there is no useful information in the 
> log to troubleshoot it, as below.
> {code:java}
> // code placeholder
> hadoop_error "${daemonname} is running as process $(cat "${daemon_pidfile}")
> {code}
> But actually the process is not running, contrary to what the error message 
> above says. Therefore, more explicit information should be printed in the error 
> log to guide users to clear the pid file and to show where the pid file is 
> located.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16551) Backport HADOOP-17588 to 3.3 and other active old branches.

2022-04-26 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527906#comment-17527906
 ] 

Renukaprasad C commented on HDFS-16551:
---

Thanks [~ste...@apache.org] & [~weichiu] for review & merge.

> Backport HADOOP-17588 to 3.3 and other active old branches.
> ---
>
> Key: HDFS-16551
> URL: https://issues.apache.org/jira/browse/HDFS-16551
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.10.2, 3.2.4, 3.3.4
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> This intermittent issue has been handled in trunk; the same needs to be 
> backported to the active branches.
> org.apache.hadoop.crypto.CryptoInputStream.close() - when two threads try to 
> close the stream, the second thread fails with an error.
> This operation should be synchronized so that multiple threads cannot perform 
> the close operation concurrently.
> [~Hemanth Boyina] 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15741) Vulnerability fixes needed for Jackson Hadoop dependency library

2022-04-24 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C resolved HDFS-15741.
---
Resolution: Duplicate

The dependency was upgraded as part of HADOOP-17534, so this is no longer valid.

[~weichiu] [~SouryakantaDwivedy] 

> Vulnerability fixes needed for Jackson Hadoop dependency library 
> -
>
> Key: HDFS-15741
> URL: https://issues.apache.org/jira/browse/HDFS-15741
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.1.1
>Reporter: Souryakanta Dwivedy
>Priority: Minor
> Attachments: CVEs_found.png
>
>
> Vulnerability fixes are needed for the Jackson Hadoop dependency library 
> Below are the Jackson library jars used for hadoop where CVEs are found
> Jackson [version 2.10.3 ]
>  - jackson-core-2.10.3.jar
> CVE details :- [  CVE-2020-25649  ]
>  ==
> Jackson-core [version 2.4.0 ]
>  - htrace-core-3.1.0-incubating.jar
> CVE details :- [ CVE-2020-24616 ]
>   =
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16551) Backport HADOOP-17588 to 3.3 and other active old branches.

2022-04-23 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526876#comment-17526876
 ] 

Renukaprasad C commented on HDFS-16551:
---

Thanks [~ste...@apache.org] for the quick review.

I have raised a PR for branch-3.2. 

Do I need to raise a separate PR for branch-2.10 as well?

> Backport HADOOP-17588 to 3.3 and other active old branches.
> ---
>
> Key: HDFS-16551
> URL: https://issues.apache.org/jira/browse/HDFS-16551
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.4
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This intermittent issue has been handled in trunk; the same needs to be 
> backported to the active branches.
> org.apache.hadoop.crypto.CryptoInputStream.close() - when two threads try to 
> close the stream, the second thread fails with an error.
> This operation should be synchronized so that multiple threads cannot perform 
> the close operation concurrently.
> [~Hemanth Boyina] 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16551) Backport HADOOP-17588 to 3.3 and other active old branches.

2022-04-21 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525762#comment-17525762
 ] 

Renukaprasad C commented on HDFS-16551:
---

Thanks [~ste...@apache.org], I have raised a PR for branch-3.3. 

> Backport HADOOP-17588 to 3.3 and other active old branches.
> ---
>
> Key: HDFS-16551
> URL: https://issues.apache.org/jira/browse/HDFS-16551
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This intermittent issue has been handled in trunk; the same needs to be 
> backported to the active branches.
> org.apache.hadoop.crypto.CryptoInputStream.close() - when two threads try to 
> close the stream, the second thread fails with an error.
> This operation should be synchronized so that multiple threads cannot perform 
> the close operation concurrently.
> [~Hemanth Boyina] 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16551) Backport HADOOP-17588 to 3.3 and other active old branches.

2022-04-20 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16551:
-

 Summary: Backport HADOOP-17588 to 3.3 and other active old 
branches.
 Key: HDFS-16551
 URL: https://issues.apache.org/jira/browse/HDFS-16551
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Renukaprasad C
Assignee: Renukaprasad C


This intermittent issue has been handled in trunk; the same needs to be backported 
to the active branches.

org.apache.hadoop.crypto.CryptoInputStream.close() - when two threads try to 
close the stream, the second thread fails with an error.

This operation should be synchronized so that multiple threads cannot perform the 
close operation concurrently.

[~Hemanth Boyina] 
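
A minimal, self-contained sketch of the idea (guarding close() so a second concurrent caller becomes a no-op); this is an illustration, not the actual HADOOP-17588 patch to CryptoInputStream:

{code:java}
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicBoolean;

/** Sketch: make close() safe when two threads call it at the same time. */
class SafeCloseStreamSketch extends FilterInputStream {
  private final AtomicBoolean closed = new AtomicBoolean(false);

  SafeCloseStreamSketch(InputStream in) {
    super(in);
  }

  @Override
  public void close() throws IOException {
    // Only the first caller actually releases resources; later callers return quietly
    // instead of failing on already-freed buffers.
    if (closed.compareAndSet(false, true)) {
      super.close();
    }
  }
}
{code}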



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16545) Provide option to balance rack level in Balancer

2022-04-18 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523806#comment-17523806
 ] 

Renukaprasad C commented on HDFS-16545:
---

Thanks [~liuml07]  for the quick review & feedback.

Once the rack-based balancer is executed, it will balance the blocks within the 
rack. These DNs (balanced with the rack-based balancer) won't participate in the 
global balancer (run without the rack option, i.e. cluster-wide).

"Also do we plan to allow multiple rack-wide balancers (different racks)?"

So far we have considered balancing a single rack. We need to analyze further to 
support multiple racks.

> Provide option to balance rack level in Balancer
> 
>
> Key: HDFS-16545
> URL: https://issues.apache.org/jira/browse/HDFS-16545
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
>
> Currently the Balancer tool runs on the entire cluster and balances across the 
> racks. If we need to balance within a rack, then we need to provide an option 
> to support rack-level balancing.
> [~surendralilhore] [~hemanthboyina] 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16545) Provide option to balance rack level in Balancer

2022-04-17 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16545:
-

 Summary: Provide option to balance rack level in Balancer
 Key: HDFS-16545
 URL: https://issues.apache.org/jira/browse/HDFS-16545
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Renukaprasad C
Assignee: Renukaprasad C


Currently the Balancer tool runs on the entire cluster and balances across the 
racks. If we need to balance within a rack, then we need to provide an option to 
support rack-level balancing.

[~surendralilhore] [~hemanthboyina] 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16526) Add metrics for slow DataNode

2022-03-30 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16526:
-

 Summary: Add metrics for slow DataNode
 Key: HDFS-16526
 URL: https://issues.apache.org/jira/browse/HDFS-16526
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Renukaprasad C
Assignee: Renukaprasad C


Add some more metrics for slow DataNode operations - FlushOrSync and the 
PacketResponder ACK send.
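
A rough sketch of how such metrics might be declared with the Hadoop metrics2 annotations; the class and metric names below are placeholders, not the ones added by the eventual patch:

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableRate;

/** Sketch of per-operation rate metrics for slow DataNode paths (names are placeholders). */
@Metrics(name = "SlowDataNodeOpsSketch", about = "Slow DataNode operation metrics", context = "dfs")
class SlowDataNodeOpsSketch {
  @Metric("Duration of flushOrSync calls")
  MutableRate flushOrSyncOp;

  @Metric("Duration of PacketResponder ACK sends")
  MutableRate packetAckSendOp;
}
{code}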



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16428) Source path with storagePolicy cause wrong typeConsumed while rename

2022-03-01 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17499930#comment-17499930
 ] 

Renukaprasad C commented on HDFS-16428:
---

[~lei w] Thanks for your contribution.

[~hexiaoqiao] Thanks for the review. The same can be merged to the other branches 
as well, right?

> Source path with storagePolicy cause wrong typeConsumed while rename
> 
>
> Key: HDFS-16428
> URL: https://issues.apache.org/jira/browse/HDFS-16428
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: lei w
>Assignee: lei w
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: example.txt
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When computing quota in a rename operation, we use the storage policy of the 
> target directory to compute the src quota usage. This causes a wrong value of 
> typeConsumed when the source path has a storage policy set. I provided a unit 
> test to demonstrate this situation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16239) XAttr#toString doesn't print the attribute value in readable format

2021-10-06 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C resolved HDFS-16239.
---
Resolution: Invalid

For printing, have we considered using the XAttrCodec APIs?

It's not necessary to print the XAttr.

> XAttr#toString doesn't print the attribute value in readable format
> --
>
> Key: HDFS-16239
> URL: https://issues.apache.org/jira/browse/HDFS-16239
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> org.apache.hadoop.fs.XAttr#toString prints the value of attribute in bytes. 
> return "XAttr [ns=" + ns + ", name=" + name + ", value="
>  + Arrays.toString(value) + "]";
> XAttr [ns=SYSTEM, name=az.expression, value=[82, 69, 80, 91, 50, 93..]
> This should be converted to String rather than printing to Array of bytes.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14575) LeaseRenewer#daemon threads leak in DFSClient

2021-09-29 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422037#comment-17422037
 ] 

Renukaprasad C commented on HDFS-14575:
---

Thanks [~weichiu] for bringing it up.

[~weichiu] / [~hexiaoqiao]

Yes, it should be fine to merge into both branches - 3.3/3.2. How will it be 
handled? Merged as part of the same Jira, or does a separate MR need to be raised?

> LeaseRenewer#daemon threads leak in DFSClient
> -
>
> Key: HDFS-14575
> URL: https://issues.apache.org/jira/browse/HDFS-14575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-14575.001.patch, HDFS-14575.002.patch, 
> HDFS-14575.003.patch, HDFS-14575.004.patch
>
>
> Currently LeaseRenewer (and its daemon thread) without clients should be 
> terminated after a grace period which defaults to 60 seconds. A race 
> condition may happen when a new request is coming just after LeaseRenewer 
> expired.
>  Reproduce this race condition:
>  # Client#1 creates File#1: creates LeaseRenewer#1 and starts Daemon#1 
> thread, after a few seconds, File#1 is closed , there is no clients in 
> LeaseRenewer#1 now.
>  # 60 seconds (grace period) later, LeaseRenewer#1 just expires but daemon#1 
> thread is still in sleep, Client#1 creates File#2, lead to the creation of 
> Daemon#2.
>  # Daemon#1 is awake then exit, after that, LeaseRenewer#1 is removed from 
> factory.
>  # File#2 is closed after a few seconds, LeaseRenewer#2 is created since it 
> can’t get renewer from factory.
> Daemon#2 thread leaks from now on, since Client#1 in it can never be removed 
> and it won't have a chance to stop.
> To solve this problem, IIUIC, a simple way I think is to make sure that all 
> clients are cleared when LeaseRenewer is removed from factory. Please feel 
> free to give your suggestions. Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16243) The available disk space is less than the reserved space, and no log message is displayed

2021-09-28 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421924#comment-17421924
 ] 

Renukaprasad C commented on HDFS-16243:
---

Thanks [~zhttylz] for the issue & the patch.

 

LOG.warn("Configured reserved space is higher than Disk capacity"); - Here can 
you print values as well.
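
Something along these lines, for instance (a sketch only; the method and parameter names are illustrative, not the ones in the attached patch):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Sketch of the review suggestion: include the offending numbers in the warning. */
class ReservedSpaceWarningSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(ReservedSpaceWarningSketch.class);

  static void warnIfMisconfigured(long reservedBytes, long capacityBytes, String volume) {
    if (reservedBytes > capacityBytes) {
      // Printing both values makes the misconfiguration obvious at a glance.
      LOG.warn("Configured reserved space {} is higher than disk capacity {} on volume {}",
          reservedBytes, capacityBytes, volume);
    }
  }
}
{code}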

I think you created the patch against a specific version. Is it applicable to trunk 
as well?

Also, you could raise a PR, which will be easier to review & track.

> The available disk space is less than the reserved space, and no log message 
> is displayed
> -
>
> Key: HDFS-16243
> URL: https://issues.apache.org/jira/browse/HDFS-16243
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Hualong Zhang
>Priority: Major
> Attachments: HDFS-16243.patch
>
>
> When I submitted a task to the Hadoop test cluster, the error "could only 
> be replicated to 0 nodes instead of minReplication (=1)" appeared.
> I checked the namenode and datanode logs and did not find any error logs. It 
> was not until I ran dfsadmin -report and saw that the available capacity was 0 
> that I realized it might be a configuration problem.
> Checking the configuration, I found that the value of the 
> "dfs.datanode.du.reserved" configuration is greater than the available disk 
> space of HDFS, which caused this problem.
> It seems that there should be some warnings or errors in the log.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15893) Logs are flooded when dfs.ha.tail-edits.in-progress set to true or dfs.ha.tail-edits.period to 0ms

2021-09-27 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420971#comment-17420971
 ] 

Renukaprasad C commented on HDFS-15893:
---

Thanks [~Sushma_28] for reporting the issue and for the detailed clarification. Are 
you still working on this patch? 

Thanks [~jianghuazhu] for the quick review & update.

 

> Logs are flooded when dfs.ha.tail-edits.in-progress set to true or 
> dfs.ha.tail-edits.period to 0ms
> --
>
> Key: HDFS-15893
> URL: https://issues.apache.org/jira/browse/HDFS-15893
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15893.001.patch
>
>
> When we set dfs.ha.tail-edits.in-progress to true and dfs.ha.tail-edits.period 
> to 0ms, the standby and observer NN logs fill up; such log lines flood out the 
> useful logs.
> We can adjust the log level of a few of these logs to debug while the observer 
> node is in operation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16239) XAttr#toString doesn't print the attribute value in readable format

2021-09-27 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16239 started by Renukaprasad C.
-
> XAttr#toString doesn't print the attribute value in readable format
> --
>
> Key: HDFS-16239
> URL: https://issues.apache.org/jira/browse/HDFS-16239
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> org.apache.hadoop.fs.XAttr#toString prints the value of attribute in bytes. 
> return "XAttr [ns=" + ns + ", name=" + name + ", value="
>  + Arrays.toString(value) + "]";
> XAttr [ns=SYSTEM, name=az.expression, value=[82, 69, 80, 91, 50, 93..]
> This should be converted to String rather than printing to Array of bytes.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16239) XAttr#toString doesn't print the attribute value in readable format

2021-09-27 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16239:
--
Summary: XAttr#toString doesn't print the attribute value in readable format 
 (was: XAttr#toString doesn't print the value)

> XAttr#toString doesn't print the attribute value in readable format
> --
>
> Key: HDFS-16239
> URL: https://issues.apache.org/jira/browse/HDFS-16239
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>
> org.apache.hadoop.fs.XAttr#toString prints the value of attribute in bytes. 
> return "XAttr [ns=" + ns + ", name=" + name + ", value="
>  + Arrays.toString(value) + "]";
> XAttr [ns=SYSTEM, name=az.expression, value=[82, 69, 80, 91, 50, 93..]
> This should be converted to String rather than printing to Array of bytes.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16239) XAttr#toString doesn't print the value

2021-09-27 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16239:
-

 Summary: XAttr#toString doesn't print the value
 Key: HDFS-16239
 URL: https://issues.apache.org/jira/browse/HDFS-16239
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Renukaprasad C
Assignee: Renukaprasad C


org.apache.hadoop.fs.XAttr#toString prints the value of the attribute as raw bytes: 

return "XAttr [ns=" + ns + ", name=" + name + ", value="
 + Arrays.toString(value) + "]";

XAttr [ns=SYSTEM, name=az.expression, value=[82, 69, 80, 91, 50, 93..]

The value should be converted to a String rather than printed as an array of bytes.
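
A minimal sketch of the kind of change being asked for; it is illustrative only (another option would be the existing XAttrCodec encoders rather than a plain UTF-8 decode):

{code:java}
import java.nio.charset.StandardCharsets;

/** Sketch: render the XAttr value as readable text instead of Arrays.toString(bytes). */
class XAttrToStringSketch {
  static String toString(String ns, String name, byte[] value) {
    // Decode the raw bytes so the value reads as text,
    // e.g. value=REP[2] instead of value=[82, 69, 80, 91, 50, 93].
    String readable = value == null ? "null" : new String(value, StandardCharsets.UTF_8);
    return "XAttr [ns=" + ns + ", name=" + name + ", value=" + readable + "]";
  }
}
{code}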

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419920#comment-17419920
 ] 

Renukaprasad C commented on HDFS-16235:
---

Thanks [~angerszhuuu] for the clarification. It's clear, and the PR changes are fine.

PR LGTM. [~ferhui] Thanks for the review; we shall merge the PR.

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-23 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419445#comment-17419445
 ] 

Renukaprasad C commented on HDFS-16235:
---

Good catch [~angerszhu], and thanks for reporting the issue & the fix.

I would like to see the problem. Do you have a test case / scenario to reproduce 
the issue? 

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16236) Example command for daemonlog is not correct

2021-09-23 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16236 started by Renukaprasad C.
-
> Example command for daemonlog is not correct
> 
>
> Key: HDFS-16236
> URL: https://issues.apache.org/jira/browse/HDFS-16236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.1
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The getlevel example command includes the log level, which leads to a command 
> failure. The log level is required only for the setlevel API.
> bin/hadoop daemonlog -getlevel 127.0.0.1:9871 
> org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG -protocol https



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16236) Example command for daemonlog is not correct

2021-09-23 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16236:
-

 Summary: Example command for daemonlog is not correct
 Key: HDFS-16236
 URL: https://issues.apache.org/jira/browse/HDFS-16236
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 3.3.1
Reporter: Renukaprasad C
Assignee: Renukaprasad C


The getlevel example command includes the log level, which leads to a command 
failure. The log level is required only for the setlevel API.

bin/hadoop daemonlog -getlevel 127.0.0.1:9871 
org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG -protocol https



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16220) [FGL]Configurable INodeMap#NAMESPACE_KEY_DEPTH_RANGES_STATIC

2021-09-12 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413768#comment-17413768
 ] 

Renukaprasad C commented on HDFS-16220:
---

Thanks [~jianghuazhu] for reporting the issue and the patch.

The configuration file has an issue, which results in many test failures. If you 
correct it, you should be able to get rid of these unwanted failures.

Also, there are some static-analysis issues reported; please take a look.

[~shv] [~xinglin] when you are free, can you please take a look at the PR? 

> [FGL]Configurable INodeMap#NAMESPACE_KEY_DEPTH_RANGES_STATIC
> 
>
> Key: HDFS-16220
> URL: https://issues.apache.org/jira/browse/HDFS-16220
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In INodeMap, NAMESPACE_KEY_DEPTH and NUM_RANGES_STATIC are fixed values; we 
> should make them configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-09-10 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413087#comment-17413087
 ] 

Renukaprasad C commented on HDFS-14703:
---

Thanks [~jianghuazhu] for your interest in and attention to this task.

Yes, we need to make it configurable. We didn't pay much attention to it in the 
POC. It will be great if you can take up this issue.

Also, I suggest making the partition count - INodeMap#NUM_RANGES_STATIC - 
configurable along with the DEPTH.
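
A rough sketch of what that could look like; both configuration key names and the defaults below are hypothetical, used only to illustrate the suggestion:

{code:java}
import org.apache.hadoop.conf.Configuration;

/** Sketch: read key depth and static partition count from configuration. */
class PartitionConfigSketch {
  static int namespaceKeyDepth(Configuration conf) {
    return conf.getInt("dfs.namenode.inodemap.key.depth", 2);            // hypothetical key
  }

  static int numStaticRanges(Configuration conf) {
    return conf.getInt("dfs.namenode.inodemap.static.partitions", 256);  // hypothetical key
  }
}
{code}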

"By the way, in our cluster, there are more than 100 million INodes."

  – We have tried up to 10M files/dirs; the larger the data set, the better the 
results we could see. You can share your reports with us in case you have done 
benchmarking with the FGL branch.

 

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-09-09 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412801#comment-17412801
 ] 

Renukaprasad C commented on HDFS-14703:
---

Thanks [~jianghuazhu] for sharing your thoughts. I hope this will clarify your 
doubts. 

INodeMap#NAMESPACE_KEY_DEPTH is designed for flexibility. Yes, by default it 
is 2, which is the combination (ParentINodeId, INodeId). When you set it to 3, 
the GrandParentId is included as well. We have tried up to level 3 with basic 
functionality, but performance was not measured. We continued to use the 
default value - 2. I am not aware of any use case for increasing the value to a 
higher number (at least I haven't done any testing on that part).

By default each partition's capacity is 117965 (65536 * 1.8), and we continued to 
use the default values in our tests. We also checked the scenarios when dynamic 
partitions were added. There was no performance degradation with dynamic 
partitions; in fact this is expected to give higher throughput. We haven't noticed 
very high CPU usage up to 1M file write ops (resource usage statistics still need 
to be captured for the base & FGL patches), so this shouldn't have any impact on 
the other operations (RPC or any other server-side processing tasks). 

In case you have missed the design, please go through the latest design doc - 
NameNode Fine-Grained Locking.pdf

[~shv] [~xinglin] Would you like to share your inputs?

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-09-08 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412170#comment-17412170
 ] 

Renukaprasad C commented on HDFS-16191:
---

Thanks [~xinglin] for review & feedback.

 

org.apache.hadoop.util.PartitionedGSet#addNewPartitionIfNeeded – here we check 
the size of the current partition and create/return a new partition if the size is 
exceeded; otherwise we return the same partition.

private PartitionEntry addNewPartitionIfNeeded(
    PartitionEntry curPart, K key) {
  // Stay in the current partition while it is below the overflow threshold,
  // or if it already contains the key; otherwise start a new partition.
  if (curPart.size() < DEFAULT_PARTITION_CAPACITY * DEFAULT_PARTITION_OVERFLOW
      || curPart.contains(key)) {
    return curPart;
  }
  return addNewPartition(key);
}

Here we add a new partition whenever the size exceeds the configured threshold. 

 

Once a new partition is added and some inodes are added into it, iteration fails 
(as we iterated over only the static partitions).

With the above patch, I have verified the functionality & the related UTs, which 
are working fine.

 

One issue I found here is that static partitions were added as => range key[0, 
16385], range key[1, 16385], range key[25, 16385], whereas dynamic 
partitions were added like inodefile[0, ], inodefile[0, Y 
InodeId]. When these nodes are compared to get the partition, we get the 
newly added partition iNodeFile[0, X inodeId] after range key[0, 16385] is full.

 

Let me check this scenario once again; we can discuss any other issues. Meanwhile 
you can also check the scenario when one partition gets full.

> [FGL] Fix FSImage loading issues on dynamic partitions
> --
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When new partitions get added into PartitionedGSet, the iterator does not 
> consider the new partitions; it always iterates over the static partition count. 
> This leads to a flood of warn messages as below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-09-06 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410702#comment-17410702
 ] 

Renukaprasad C commented on HDFS-14703:
---

[~jianghuazhu] Initially, two commits were done as part of the POC:

INodeMap with PartitionedGSet and per-partition locking (this maps to Jira 
- HDFS-14734 & HDFS-14732).

[FGL] Introduce INode key (this maps to Jira - HDFS-14733).

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16208) [FGL] Implement Delete API with FGL

2021-09-06 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410693#comment-17410693
 ] 

Renukaprasad C commented on HDFS-16208:
---

Sure [~jianghuazhu].

I missed attaching the report for the DELETE operation.

The command used: ./hadoop 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs file:/// -op 
delete -threads 200 -files 100 -filesPerDir 100

 
||Itr||Base||Patch||
|1|36886|55126|
|2|40783|52029|
|3|39698|40950|
|4|42247|55157|
|5|38197|49285|
|Avg|39562|50509|
|Imp %| |27%|

 

 

> [FGL] Implement Delete API with FGL
> ---
>
> Key: HDFS-16208
> URL: https://issues.apache.org/jira/browse/HDFS-16208
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Replace all global locks for file / directory deletion with FGL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16128) [FGL] Add support for saving/loading an FS Image for PartitionedGSet

2021-09-05 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410229#comment-17410229
 ] 

Renukaprasad C commented on HDFS-16128:
---

Understood [~xinglin], thanks for the detailed clarification. I have addressed a 
couple of related issues in https://issues.apache.org/jira/browse/HDFS-16191

Please take a look whenever you get time. 

> [FGL] Add support for saving/loading an FS Image for PartitionedGSet
> 
>
> Key: HDFS-16128
> URL: https://issues.apache.org/jira/browse/HDFS-16128
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: Xing Lin
>Assignee: Xing Lin
>Priority: Major
>  Labels: pull-request-available
> Fix For: Fine-Grained Locking
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Add support to save Inodes stored in PartitionedGSet when saving an FS image 
> and load Inodes into PartitionedGSet from a saved FS image.
> h1. Saving FSImage
> *Original HDFS design*: iterate every inode in inodeMap and save them into 
> the FSImage file. 
> *FGL*: no change is needed here, since PartitionedGSet also provides an 
> iterator interface, to iterate over inodes stored in partitions. 
> h1. Loading an HDFS 
> *Original HDFS design*: it first loads the FSImage files and then loads edit 
> logs for recent changes. FSImage files contain different sections, including 
> INodeSections and INodeDirectorySections. An InodeSection contains serialized 
> Inodes objects and the INodeDirectorySection contains the parent inode for an 
> Inode. When loading an FSImage, the system first loads INodeSections and then 
> load the INodeDirectorySections, to set the parent inode for each inode. 
> After FSImage files are loaded, edit logs are then loaded. Edit log contains 
> recent changes to the filesystem, including Inodes creation/deletion. For a 
> newly created INode, the parent inode is set before it is added to the 
> inodeMap.
> *FGL*: when adding an Inode into the partitionedGSet, we need the parent 
> inode of an inode, in order to determine which partition to store that inode, 
> when NAMESPACE_KEY_DEPTH = 2. Thus, in FGL, when loading FSImage files, we 
> used a temporary LightweightGSet (inodeMapTemp), to store inodes. When 
> LoadFSImage is done, the parent inode for all existing inodes in FSImage 
> files is set. We can now move the inodes into a partitionedGSet. Load edit 
> logs can work as usual, as the parent inode for an inode is set before it is 
> added to the inodeMap. 
> In theory, PartitionedGSet can support to store inodes without setting its 
> parent inodes. All these inodes will be stored in the 0th partition. However, 
> we decide to use a temporary LightweightGSet (inodeMapTemp) to store these 
> inodes, to make this case more transparent.          
>  
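
A toy illustration of the two-phase load described above, using simplified stand-in types instead of the real INode/PartitionedGSet classes:

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: park inodes in a flat temporary map, then partition them once parents are known. */
class TwoPhaseInodeLoadSketch {
  static class Inode {
    long id;
    Long parentId;   // unknown until the INodeDirectorySection is loaded
  }

  final Map<Long, Inode> inodeMapTemp = new HashMap<>();        // phase 1: flat temporary map
  final Map<Long, List<Inode>> partitioned = new HashMap<>();   // phase 2: keyed by parent id

  /** After every parent link is set, move inodes into their (parentId-based) partitions. */
  void finishLoad() {
    for (Inode inode : inodeMapTemp.values()) {
      partitioned.computeIfAbsent(inode.parentId, k -> new ArrayList<>()).add(inode);
    }
    inodeMapTemp.clear();
  }
}
{code}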



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack

2021-09-05 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410226#comment-17410226
 ] 

Renukaprasad C commented on HDFS-16138:
---

Thank you [~hexiaoqiao] and [~hemanthboyina].

> BlockReportProcessingThread exit doesn't print the actual stack
> ---
>
> Key: HDFS-16138
> URL: https://issues.apache.org/jira/browse/HDFS-16138
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The BlockReportProcessingThread may exit for multiple reasons, but the current 
> logging prints only the exception message with a different stack, which makes 
> the issue difficult to debug.
>  
> Existing logging:
> 2021-07-20 10:20:23,104 [Block report processor] INFO  util.ExitUtil 
> (ExitUtil.java:terminate(210)) - Exiting with status 1: Block report 
> processor encountered fatal exception: java.lang.AssertionError
> 2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil 
> (ExitUtil.java:terminate(213)) - Terminate called
> 1: Block report processor encountered fatal exception: 
> java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
> Exception in thread "Block report processor" 1: Block report processor 
> encountered fatal exception: java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
>  
> Actual issue found at:
> 2021-07-20 10:20:23,101 [Block report processor] ERROR 
> blockmanagement.BlockManager (BlockManager.java:run(5314)) - 
> java.lang.AssertionError
> java.lang.AssertionError
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312)
>  
> This issue was found while working on the FGL branch, but the same issue can 
> happen on trunk in any error scenario.
>  
> [~hemanthboyina] [~hexiaoqiao]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-16195) Fix log message when choosing storage groups for block movement in balancer

2021-09-04 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410106#comment-17410106
 ] 

Renukaprasad C edited comment on HDFS-16195 at 9/5/21, 5:21 AM:


Thanks [~vjasani] for the detailed info.

If you think this is a good idea, I can create a Jira for the same. 

– yes, we shall go with this.

[~hemanthboyina] can you take a look at the latest patch?

 


was (Author: prasad-acit):
Thanks [~vjasani] for the detailed info.

If you think this is a good idea, I can create a Jira for the same. 

– yes, we shall go with this.

[~Hemanth Boyina] can you take a look at the latest patch?

 

> Fix log message when choosing storage groups for block movement in balancer
> ---
>
> Key: HDFS-16195
> URL: https://issues.apache.org/jira/browse/HDFS-16195
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover
>Reporter: Preeti
>Priority: Major
> Attachments: HADOOP-16195.001.patch, HADOOP-16195.002.patch, 
> HADOOP-16195.003.patch, hadoop-format.xml
>
>
> Correct the log message in line with the logic associated with
> moving blocks in chooseStorageGroups() in the balancer. All log lines should 
> indicate from which storage source the blocks are being moved correctly to 
> avoid ambiguity. Right now one of the log lines is incorrect: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java#L555]
>  which indicates that storage blocks are moved from underUtilized to 
> aboveAvgUtilized nodes, while it is actually the other way around in the code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16195) Fix log message when choosing storage groups for block movement in balancer

2021-09-04 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410106#comment-17410106
 ] 

Renukaprasad C commented on HDFS-16195:
---

Thanks [~vjasani] for the detailed info.

If you think this is a good idea, I can create a Jira for the same. 

– yes, we shall go with this.

[~Hemanth Boyina] can you take a look at the latest patch?

 

> Fix log message when choosing storage groups for block movement in balancer
> ---
>
> Key: HDFS-16195
> URL: https://issues.apache.org/jira/browse/HDFS-16195
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover
>Reporter: Preeti
>Priority: Major
> Attachments: HADOOP-16195.001.patch, HADOOP-16195.002.patch, 
> HADOOP-16195.003.patch, hadoop-format.xml
>
>
> Correct the log message in line with the logic associated with
> moving blocks in chooseStorageGroups() in the balancer. All log lines should 
> indicate from which storage source the blocks are being moved correctly to 
> avoid ambiguity. Right now one of the log lines is incorrect: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java#L555]
>  which indicates that storage blocks are moved from underUtilized to 
> aboveAvgUtilized nodes, while it is actually the other way around in the code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16077) OIV parsing tool throws NPE for a FSImage with multiple InodeSections

2021-09-04 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C resolved HDFS-16077.
---
Resolution: Not A Bug

The bug was caused by private changes and is not applicable to the open-source code.

> OIV parsing tool throws NPE for a FSImage with multiple InodeSections
> -
>
> Key: HDFS-16077
> URL: https://issues.apache.org/jira/browse/HDFS-16077
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Priority: Major
>
> An FSImage with Multiple InodeSections is resulting in NPE when accessed 
> through OIV Tool with default Parser (WEB)
> This issue is reproducible only with multiple InodeSections (Writing more 
> than 1 Million Files) 
> On analyzing the code further we found that NPE is caused in 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.FSImageLoader.fromINodeId(long).
>  fromINodeId(long) is searching for Inode in an Inodesection which doesn't 
> have the Inode(but exists in another InodeSection)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16195) Fix log message when choosing storage groups for block movement in balancer

2021-09-04 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16195:
--
Attachment: hadoop-format.xml

> Fix log message when choosing storage groups for block movement in balancer
> ---
>
> Key: HDFS-16195
> URL: https://issues.apache.org/jira/browse/HDFS-16195
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover
>Reporter: Preeti
>Priority: Major
> Attachments: HADOOP-16195.001.patch, HADOOP-16195.002.patch, 
> HADOOP-16195.003.patch, hadoop-format.xml
>
>
> Correct the log message in line with the logic associated with
> moving blocks in chooseStorageGroups() in the balancer. All log lines should 
> indicate from which storage source the blocks are being moved correctly to 
> avoid ambiguity. Right now one of the log lines is incorrect: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java#L555]
>  which indicates that storage blocks are moved from underUtilized to 
> aboveAvgUtilized nodes, while it is actually the other way around in the code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16195) Fix log message when choosing storage groups for block movement in balancer

2021-09-04 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17409918#comment-17409918
 ] 

Renukaprasad C commented on HDFS-16195:
---

Thanks for the patch, the changes are fine.

LGTM for HADOOP-16195.003.patch.

For the formatter, you can refer to hadoop-format.xml attached above.
[~vjasani] could you share the link if the common formatter is available 
globally? Thank you.

> Fix log message when choosing storage groups for block movement in balancer
> ---
>
> Key: HDFS-16195
> URL: https://issues.apache.org/jira/browse/HDFS-16195
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover
>Reporter: Preeti
>Priority: Major
> Attachments: HADOOP-16195.001.patch, HADOOP-16195.002.patch, 
> HADOOP-16195.003.patch, hadoop-format.xml
>
>
> Correct the log message in line with the logic associated with
> moving blocks in chooseStorageGroups() in the balancer. All log lines should 
> indicate from which storage source the blocks are being moved correctly to 
> avoid ambiguity. Right now one of the log lines is incorrect: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java#L555]
>  which indicates that storage blocks are moved from underUtilized to 
> aboveAvgUtilized nodes, while it is actually the other way around in the code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16077) OIV parsing tool throws NPE for a FSImage with multiple InodeSections

2021-09-04 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17409916#comment-17409916
 ] 

Renukaprasad C commented on HDFS-16077:
---

Verified the scenario with the trunk version; the issue doesn't exist there.

The scenario is specific to some private changes, which has been confirmed. Thanks 
[~sodonnell] for the clarification, and thanks [~Sushma_28] for reporting the issue.

> OIV parsing tool throws NPE for a FSImage with multiple InodeSections
> -
>
> Key: HDFS-16077
> URL: https://issues.apache.org/jira/browse/HDFS-16077
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Priority: Major
>
> An FSImage with Multiple InodeSections is resulting in NPE when accessed 
> through OIV Tool with default Parser (WEB)
> This issue is reproducible only with multiple InodeSections (Writing more 
> than 1 Million Files) 
> On analyzing the code further we found that NPE is caused in 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.FSImageLoader.fromINodeId(long).
>  fromINodeId(long) is searching for Inode in an Inodesection which doesn't 
> have the Inode(but exists in another InodeSection)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16208) [FGL] Implement Delete API with FGL

2021-09-03 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16208 started by Renukaprasad C.
-
> [FGL] Implement Delete API with FGL
> ---
>
> Key: HDFS-16208
> URL: https://issues.apache.org/jira/browse/HDFS-16208
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Replace all global locks for file / directory deletion with FGL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16195) Fix log message when choosing storage groups for block movement in balancer

2021-09-03 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17409310#comment-17409310
 ] 

Renukaprasad C commented on HDFS-16195:
---

Thanks [~preetium] for the patch; the line length still exceeds the threshold. 
You can correct it.

Also, the formatter is different; please follow the Hadoop formatting.

> Fix log message when choosing storage groups for block movement in balancer
> ---
>
> Key: HDFS-16195
> URL: https://issues.apache.org/jira/browse/HDFS-16195
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover
>Reporter: Preeti
>Priority: Major
> Attachments: HADOOP-16195.001.patch, HADOOP-16195.002.patch
>
>
> Correct the log message in line with the logic associated with
> moving blocks in chooseStorageGroups() in the balancer. All log lines should 
> indicate from which storage source the blocks are being moved correctly to 
> avoid ambiguity. Right now one of the log lines is incorrect: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java#L555]
>  which indicates that storage blocks are moved from underUtilized to 
> aboveAvgUtilized nodes, while it is actually the other way around in the code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack

2021-09-02 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17409130#comment-17409130
 ] 

Renukaprasad C commented on HDFS-16138:
---

Thanks [~hemanthboyina]. The exception is thrown in 
org.apache.hadoop.util.ExitUtil#terminate(int, java.lang.String) via the BP 
processing thread - 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.BlockReportProcessingThread#run

The code below creates a new exception but consumes the actual exception:
{code:java}
  public static void terminate(int status, String msg) throws ExitException {
    terminate(new ExitException(status, msg));
  }
{code}
I couldn't extend the UT as the error comes from a private thread; simulating it 
would require a lot of mocking. If you still insist, we shall look into it further.

Also, the other comments are addressed and the changes are pushed. Please review 
the changes. Thank you.

> BlockReportProcessingThread exit doesn't print the actual stack
> --
>
> Key: HDFS-16138
> URL: https://issues.apache.org/jira/browse/HDFS-16138
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The BlockReportProcessingThread may exit for multiple reasons, but the current 
> logging prints only the exception message with a different stack, which makes 
> the issue difficult to debug.
>  
> Existing logging:
> 2021-07-20 10:20:23,104 [Block report processor] INFO  util.ExitUtil 
> (ExitUtil.java:terminate(210)) - Exiting with status 1: Block report 
> processor encountered fatal exception: java.lang.AssertionError
> 2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil 
> (ExitUtil.java:terminate(213)) - Terminate called
> 1: Block report processor encountered fatal exception: 
> java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
> Exception in thread "Block report processor" 1: Block report processor 
> encountered fatal exception: java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
>  
> Actual issue found at:
> 2021-07-20 10:20:23,101 [Block report processor] ERROR 
> blockmanagement.BlockManager (BlockManager.java:run(5314)) - 
> java.lang.AssertionError
> java.lang.AssertionError
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312)
>  
> This issue was found while working on the FGL branch, but the same issue can 
> happen on trunk in any error scenario.
>  
> [~hemanthboyina] [~hexiaoqiao]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16208) [FGL] Implement Delete API with FGL

2021-09-02 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16208:
-

 Summary: [FGL] Implement Delete API with FGL
 Key: HDFS-16208
 URL: https://issues.apache.org/jira/browse/HDFS-16208
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Renukaprasad C
Assignee: Renukaprasad C


Replace all global locks for file / directory deletion with FGL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16193) [FGL] Implement Append & Rename APIs with FGL

2021-09-02 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16193 started by Renukaprasad C.
-
> [FGL] Implement Append & Rename APIs with FGL
> -
>
> Key: HDFS-16193
> URL: https://issues.apache.org/jira/browse/HDFS-16193
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Replace global lock with FGL in Append & Rename APIs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16195) Fix log message when choosing storage groups for block movement in balancer

2021-09-02 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17408980#comment-17408980
 ] 

Renukaprasad C commented on HDFS-16195:
---

Thanks [~preetium] for the patch. The messages are more meaningful than before.

Would you fix the checkstyle issues & update the patch?

> Fix log message when choosing storage groups for block movement in balancer
> ---
>
> Key: HDFS-16195
> URL: https://issues.apache.org/jira/browse/HDFS-16195
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover
>Reporter: Preeti
>Priority: Major
> Attachments: HADOOP-16195.001.patch
>
>
> Correct the log message in line with the logic associated with
> moving blocks in chooseStorageGroups() in the balancer. All log lines should 
> indicate from which storage source the blocks are being moved correctly to 
> avoid ambiguity. Right now one of the log lines is incorrect: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java#L555]
>  which indicates that storage blocks are moved from underUtilized to 
> aboveAvgUtilized nodes, while it is actually the other way around in the code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16193) [FGL] Implement Append & Rename APIs with FGL

2021-09-02 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17408811#comment-17408811
 ] 

Renukaprasad C commented on HDFS-16193:
---

[~shv] [~xinglin] Modified the 2 APIs to support FGL. Can you please review the 
changes? Thanks.

Attached the performance report for both APIs for reference.

 

./hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
file:/// -op rename -threads 200 -files 100 -filesPerDir 100 -keepResults

 
Performance report for Rename API:
||Itr||Base||Patch||
|1|41001|51519|
|2|41310|49431|
|3|39062|49652|
|Avg|40457|50200|
|Impr| |24%|
 
./hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
file:/// -op append -threads 100 -files 10 -filesPerDir 100
Performance report for Append API: 
||Itr||Base||Patch||
|1|35523|39478|
|2|41390|55096|
|3|41425|47014|
|4|32829|43649|
|5| 36443|55157|
|Avg|37522|48078|
|Impr| |28%|

 

 

> [FGL] Implement Append & Rename APIs with FGL
> -
>
> Key: HDFS-16193
> URL: https://issues.apache.org/jira/browse/HDFS-16193
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Replace global lock with FGL in Append & Rename APIs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16141) [FGL] Address permission related issues with File / Directory

2021-09-02 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17408729#comment-17408729
 ] 

Renukaprasad C commented on HDFS-16141:
---

Thank you [~shv] for reviewing and committing the patch.

> [FGL] Address permission related issues with File / Directory
> -
>
> Key: HDFS-16141
> URL: https://issues.apache.org/jira/browse/HDFS-16141
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: Fine-Grained Locking
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Post FGL implementation (MKDIR & Create File), there are existing UTs that got 
> impacted, which need to be addressed.
> Failed Tests:
> TestDFSPermission
> TestPermission
> TestFileCreation
> TestDFSMkdirs (Added tests)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16195) Fix log message when choosing storage groups for block movement in balancer

2021-08-29 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17406531#comment-17406531
 ] 

Renukaprasad C commented on HDFS-16195:
---

Thanks [~preetium] for reporting the issue.

True. Though the code is written accordingly, the log message is a little 
confusing. We can correct the message here. Are you working on the patch?

> Fix log message when choosing storage groups for block movement in balancer
> ---
>
> Key: HDFS-16195
> URL: https://issues.apache.org/jira/browse/HDFS-16195
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer  mover
>Reporter: Preeti
>Priority: Major
>
> Correct the log message in line with the logic associated with
> moving blocks in chooseStorageGroups() in the balancer. All log lines should 
> indicate from which storage source the blocks are being moved correctly to 
> avoid ambiguity. Right now one of the log lines is incorrect: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java#L555]
>  which indicates that storage blocks are moved from underUtilized to 
> aboveAvgUtilized nodes, while it is actually the other way around in the code.
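Purely as an illustration of the wording fix being discussed (this is not the actual Balancer.java code; per the description above, the code moves blocks from the aboveAvgUtilized side to the underUtilized side, so the log line should say so):

{code:java}
// Illustrative only -- not the actual Balancer implementation.
public class BalancerLogSketch {
  public static void main(String[] args) {
    // Misleading wording: states the opposite direction of the actual move.
    System.out.println(
        "chooseStorageGroups: moving blocks from underUtilized to aboveAvgUtilized");
    // Clearer wording: name the actual source and target of the move.
    System.out.println(
        "chooseStorageGroups: moving blocks from aboveAvgUtilized to underUtilized");
  }
}
{code}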



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-08-28 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16191 started by Renukaprasad C.
-
> [FGL] Fix FSImage loading issues on dynamic partitions
> --
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When new partitions get added into PartitionedGSet, the iterator does not 
> consider the new partitions; it always iterates over the static partition 
> count. This leads to a flood of warn messages as below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16193) [FGL] Implement Append & Rename APIs with FGL

2021-08-28 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16193:
-

 Summary: [FGL] Implement Append & Rename APIs with FGL
 Key: HDFS-16193
 URL: https://issues.apache.org/jira/browse/HDFS-16193
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Renukaprasad C
Assignee: Renukaprasad C


Replace global lock with FGL in Append & Rename APIs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-08-28 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17406160#comment-17406160
 ] 

Renukaprasad C commented on HDFS-16191:
---

[~shv] [~xinglin] Some failed scenarios were handled in FSImage loading; can 
you please help review the changes?

> [FGL] Fix FSImage loading issues on dynamic partitions
> --
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When new partitions get added into PartitionedGSet, the iterator does not 
> consider the new partitions; it always iterates over the static partition 
> count. This leads to a flood of warn messages as below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.
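To make the failure mode concrete, here is a simplified, hypothetical sketch. PartitionIterationSketch and its data layout are invented for this example and do not reflect the actual PartitionedGSet internals; it only shows why iterating over a fixed number of ranges misses inodes that live in partitions added after startup.

{code:java}
// Hypothetical simplification -- not the actual PartitionedGSet code.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionIterationSketch {
  static final int NUM_RANGES_STATIC = 4;

  public static void main(String[] args) {
    // Each partition is just a list of inode ids here.
    TreeMap<Integer, List<Long>> partitions = new TreeMap<>();
    for (int p = 0; p < NUM_RANGES_STATIC + 2; p++) { // two partitions added dynamically
      List<Long> part = new ArrayList<>();
      part.add(139780L + p);
      partitions.put(p, part);
    }
    // Buggy pattern: only the static ranges are visited, so inodes in the
    // dynamically added partitions are never found (hence the WARNs above).
    for (int p = 0; p < NUM_RANGES_STATIC; p++) {
      System.out.println("static scan sees partition " + p + ": " + partitions.get(p));
    }
    // Fixed pattern: iterate over whatever partitions actually exist right now.
    for (Map.Entry<Integer, List<Long>> e : partitions.entrySet()) {
      System.out.println("full scan sees partition " + e.getKey() + ": " + e.getValue());
    }
  }
}
{code}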



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-08-28 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16191:
--
Summary: [FGL] Fix FSImage loading issues on dynamic partitions  (was: 
[FGL] Loading FSImage loading with errors)

> [FGL] Fix FSImage loading issues on dynamic partitions
> --
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>
> When new partitions get added into PartitionedGSet, the iterator does not 
> consider the new partitions; it always iterates over the static partition 
> count. This leads to a flood of warn messages as below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16191) [FGL] Loading FSImage loading with errors

2021-08-27 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16191:
--
Summary: [FGL] Loading FSImage loading with errors  (was: Loading FSImage 
loading with errors)

> [FGL] Loading FSImage loading with errors
> -
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>
> When new partitions get added into PartitionedGSet, the iterator does not 
> consider the new partitions; it always iterates over the static partition 
> count. This leads to a flood of warn messages as below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16191) Loading FSImage loading with errors

2021-08-27 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16191:
-

 Summary: Loading FSImage loading with errors
 Key: HDFS-16191
 URL: https://issues.apache.org/jira/browse/HDFS-16191
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Renukaprasad C
Assignee: Renukaprasad C


When new partitions get added into PartitionedGSet, the iterator does not 
consider the new partitions; it always iterates over the static partition count. 
This leads to a flood of warn messages as below.

2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139780 when saving the leases.
2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139781 when saving the leases.
2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139784 when saving the leases.
2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139785 when saving the leases.
2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139786 when saving the leases.
2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139788 when saving the leases.
2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139789 when saving the leases.
2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139790 when saving the leases.
2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139791 when saving the leases.
2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139793 when saving the leases.
2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139795 when saving the leases.
2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139796 when saving the leases.
2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139797 when saving the leases.
2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139800 when saving the leases.
2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find inode 
139801 when saving the leases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16128) [FGL] Add support for saving/loading an FS Image for PartitionedGSet

2021-08-24 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404107#comment-17404107
 ] 

Renukaprasad C commented on HDFS-16128:
---

In org.apache.hadoop.hdfs.server.namenode.INodeMap#get(long), pgs.get(inode); 
should be able to get the inode from the partitions. But we changed this code to:

{code:java}
for (int p = 0; p < NUM_RANGES_STATIC; p++) {
  INodeDirectory key = new INodeDirectory(INodeId.ROOT_INODE_ID,
      "range key".getBytes(StandardCharsets.UTF_8), perm, 0);
  key.setParent(new INodeDirectory((long) p, null, perm, 0));
  PartitionedGSet.PartitionEntry e = pgs.getPartition(key);
  if (e.contains(inode)) {
    return (INode) e.get(inode);
  }
}
{code}
The new code fails to get the INode when new partitions are added dynamically.

Can this part of the code be changed back to "pgs.get(inode);"? Any issue found 
with this code?
 

> [FGL] Add support for saving/loading an FS Image for PartitionedGSet
> 
>
> Key: HDFS-16128
> URL: https://issues.apache.org/jira/browse/HDFS-16128
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: Xing Lin
>Assignee: Xing Lin
>Priority: Major
>  Labels: pull-request-available
> Fix For: Fine-Grained Locking
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Add support to save Inodes stored in PartitionedGSet when saving an FS image 
> and load Inodes into PartitionedGSet from a saved FS image.
> h1. Saving FSImage
> *Original HDFS design*: iterate every inode in inodeMap and save them into 
> the FSImage file. 
> *FGL*: no change is needed here, since PartitionedGSet also provides an 
> iterator interface, to iterate over inodes stored in partitions. 
> h1. Loading an HDFS 
> *Original HDFS design*: it first loads the FSImage files and then loads edit 
> logs for recent changes. FSImage files contain different sections, including 
> INodeSections and INodeDirectorySections. An INodeSection contains serialized 
> INode objects and an INodeDirectorySection contains the parent inode for an 
> INode. When loading an FSImage, the system first loads the INodeSections and 
> then loads the INodeDirectorySections, to set the parent inode for each inode. 
> After the FSImage files are loaded, the edit logs are loaded. The edit log 
> contains recent changes to the filesystem, including INode creation/deletion. 
> For a newly created INode, the parent inode is set before it is added to the 
> inodeMap.
> *FGL*: when adding an INode into the PartitionedGSet, we need the parent 
> inode of an inode in order to determine which partition stores that inode 
> when NAMESPACE_KEY_DEPTH = 2. Thus, in FGL, when loading FSImage files, we 
> use a temporary LightweightGSet (inodeMapTemp) to store inodes. When 
> LoadFSImage is done, the parent inode of every inode in the FSImage files is 
> set, and we can then move the inodes into the PartitionedGSet. Loading edit 
> logs works as usual, as the parent inode for an inode is set before it is 
> added to the inodeMap.
> In theory, PartitionedGSet can store inodes without their parent inodes being 
> set; all such inodes would be stored in the 0th partition. However, we decided 
> to use a temporary LightweightGSet (inodeMapTemp) to store these inodes, to 
> make this case more transparent.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack

2021-07-27 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16138 started by Renukaprasad C.
-
> BlockReportProcessingThread exit doesn't print the actual stack
> --
>
> Key: HDFS-16138
> URL: https://issues.apache.org/jira/browse/HDFS-16138
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The BlockReportProcessingThread may exit for multiple reasons, but the current 
> logging prints only the exception message with a different stack, which makes 
> the issue difficult to debug.
>  
> Existing logging:
> 2021-07-20 10:20:23,104 [Block report processor] INFO  util.ExitUtil 
> (ExitUtil.java:terminate(210)) - Exiting with status 1: Block report 
> processor encountered fatal exception: java.lang.AssertionError
> 2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil 
> (ExitUtil.java:terminate(213)) - Terminate called
> 1: Block report processor encountered fatal exception: 
> java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
> Exception in thread "Block report processor" 1: Block report processor 
> encountered fatal exception: java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
>  
> Actual issue found at:
> 2021-07-20 10:20:23,101 [Block report processor] ERROR 
> blockmanagement.BlockManager (BlockManager.java:run(5314)) - 
> java.lang.AssertionError
> java.lang.AssertionError
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312)
>  
> This issue was found while working on the FGL branch, but the same issue can 
> happen on trunk in any error scenario.
>  
> [~hemanthboyina] [~hexiaoqiao]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack

2021-07-27 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388439#comment-17388439
 ] 

Renukaprasad C commented on HDFS-16138:
---

org.apache.hadoop.util.ExitUtil#terminate(int, java.lang.String) creates a new 
exception, which includes the exception message but misses the actual stack 
trace. 

Now, the full stack is added. Logging continues as before based on the other 
parameters.

[~hexiaoqiao] [~hemanthboyina] can you please take a look whenever you get time?

> BlockReportProcessingThread exit doesn't print the actual stack
> --
>
> Key: HDFS-16138
> URL: https://issues.apache.org/jira/browse/HDFS-16138
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The BlockReportProcessingThread may exit for multiple reasons, but the current 
> logging prints only the exception message with a different stack, which makes 
> the issue difficult to debug.
>  
> Existing logging:
> 2021-07-20 10:20:23,104 [Block report processor] INFO  util.ExitUtil 
> (ExitUtil.java:terminate(210)) - Exiting with status 1: Block report 
> processor encountered fatal exception: java.lang.AssertionError
> 2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil 
> (ExitUtil.java:terminate(213)) - Terminate called
> 1: Block report processor encountered fatal exception: 
> java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
> Exception in thread "Block report processor" 1: Block report processor 
> encountered fatal exception: java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
>  
> Actual issue found at:
> 2021-07-20 10:20:23,101 [Block report processor] ERROR 
> blockmanagement.BlockManager (BlockManager.java:run(5314)) - 
> java.lang.AssertionError
> java.lang.AssertionError
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312)
>  
> This issue was found while working on the FGL branch, but the same issue can 
> happen on trunk in any error scenario.
>  
> [~hemanthboyina] [~hexiaoqiao]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-07-27 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388217#comment-17388217
 ] 

Renukaprasad C commented on HDFS-14703:
---

Thanks [Daryn 
Sharp|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=daryn] for 
the review & comments. Thanks [Xing 
Lin|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=xinglin] for 
the quick update.
 # Was the entry point for the calls via the rpc server, fsn, fsdir, etc? 
Relevant since end-to-end benchmarking rarely matches microbenchmarks.

We have run the benchmarking tool in standalone mode with the file:// schema. 
With this we were able to achieve 50k-60k throughput. 
 # What is “30-40%” improvement? How many ops/sec before and after?

When we tested in standalone mode, we found an average 30% improvement with the 
mkdir op (an example standalone invocation is sketched at the end of this comment).

https://issues.apache.org/jira/browse/HDFS-14703?focusedCommentId=17346002=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17346002
 # What impact did it have on gc/min and gc time? These are often hidden 
killers of performance when not taken into consideration.

We have noticed no CPU bottleneck with the patch. These metrics are yet to be 
captured; we shall check further and publish whether the patch has any impact 
on GC.
 

We would like [~shv] to clarify further.
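
For reference, a standalone mkdir-op run would look roughly like the NNThroughputBenchmark invocations quoted elsewhere in these archives. The per-op flags below (-dirs, -dirsPerDir) are my assumption about the mkdirs op and may differ from the exact option names in the tree being tested:

./hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
file:/// -op mkdirs -threads 200 -dirs 100 -dirsPerDir 100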

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15894) Trace Time-consuming RPC response of certain threshold.

2021-07-25 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-15894:
--
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

> Trace Time-consuming RPC response of certain threshold.
> ---
>
> Key: HDFS-15894
> URL: https://issues.apache.org/jira/browse/HDFS-15894
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-15894.001.patch, HDFS-15894.002.patch, 
> HDFS-15894.003.patch
>
>
> Monitor & Trace Time-consuming RPC requests.
> Sometimes RPC requests get delayed, which impacts system performance. 
> Currently, there is no tracking for delayed RPC requests. 
> We can log such delayed RPC calls which exceed a certain threshold.
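
As a rough illustration of the idea only (a minimal sketch; the class, threshold value, and output here are invented for this example and are not the Hadoop RPC server implementation):

{code:java}
// Hypothetical sketch -- time an RPC handler and log it when it crosses a threshold.
public class SlowRpcLogSketch {
  private static final long THRESHOLD_MS = 500; // assumed configurable threshold

  static void handleCall(String callName) throws InterruptedException {
    long startNs = System.nanoTime();
    Thread.sleep(600); // stand-in for the real RPC work
    long elapsedMs = (System.nanoTime() - startNs) / 1_000_000;
    if (elapsedMs > THRESHOLD_MS) {
      System.out.println("Slow RPC: " + callName + " took " + elapsedMs
          + " ms (threshold " + THRESHOLD_MS + " ms)");
    }
  }

  public static void main(String[] args) throws InterruptedException {
    handleCall("mkdirs");
  }
}
{code}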



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16141) [FGL] Address permission related issues with File / Directory

2021-07-25 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16141 started by Renukaprasad C.
-
> [FGL] Address permission related issues with File / Directory
> -
>
> Key: HDFS-16141
> URL: https://issues.apache.org/jira/browse/HDFS-16141
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Post FGL implementation (MKDIR & Create File), there are existing UTs that got 
> impacted, which need to be addressed.
> Failed Tests:
> TestDFSPermission
> TestPermission
> TestFileCreation
> TestDFSMkdirs (Added tests)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16141) [FGL] Address permission related issues with File / Directory

2021-07-25 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386943#comment-17386943
 ] 

Renukaprasad C commented on HDFS-16141:
---

[~shv], [~xinglin] can you please have a look? Thank you. 
I ran the UT pipeline locally and could see many more tests passing with the 
patch.

> [FGL] Address permission related issues with File / Directory
> -
>
> Key: HDFS-16141
> URL: https://issues.apache.org/jira/browse/HDFS-16141
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Post FGL implementation (MKDIR & Create File), there are existing UTs that got 
> impacted, which need to be addressed.
> Failed Tests:
> TestDFSPermission
> TestPermission
> TestFileCreation
> TestDFSMkdirs (Added tests)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack

2021-07-25 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386932#comment-17386932
 ] 

Renukaprasad C commented on HDFS-16138:
---

Thanks [~hexiaoqiao] for the review. 
This is the FGL branch (Trunk & 3.1.1) and the issue (AssertionError) is specific to 
FGL code only. We found the cause and addressed the issue. But any kind of 
exception in trunk leads to the same generic stack and hides the actual issue, which 
is difficult to debug, especially in production environments. So it is better to log the 
complete trace that caused the issue. 
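
A minimal sketch of the logging pattern being proposed (the surrounding loop is paraphrased; the point of the fix is simply passing the Throwable to the logger before terminating):
{code:java}
// Sketch only: log the complete stack trace of the fatal error before exiting,
// instead of terminating with just the exception message.
try {
  processQueue();   // stand-in for the block report processing loop
} catch (Throwable t) {
  // Passing the Throwable prints the original stack trace, which points at the
  // real failure site (e.g. where the AssertionError was thrown).
  LOG.error("Block report processor encountered fatal exception", t);
  ExitUtil.terminate(1,
      "Block report processor encountered fatal exception: " + t);
}
{code}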

> BlockReportProcessingThread exit doesn't print the actual stack
> --
>
> Key: HDFS-16138
> URL: https://issues.apache.org/jira/browse/HDFS-16138
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>
> The BlockReportProcessingThread thread may exit for multiple reasons, but 
> the current logging prints only the exception message with a different stack, 
> which makes the issue difficult to debug.
>  
> Existing logging:
> 2021-07-20 10:20:23,104 [Block report processor] INFO  util.ExitUtil 
> (ExitUtil.java:terminate(210)) - Exiting with status 1: Block report 
> processor encountered fatal exception: java.lang.AssertionError
> 2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil 
> (ExitUtil.java:terminate(213)) - Terminate called
> 1: Block report processor encountered fatal exception: 
> java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
> Exception in thread "Block report processor" 1: Block report processor 
> encountered fatal exception: java.lang.AssertionError
>     at 
> org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)
>  
> Actual issue found at:
> 2021-07-20 10:20:23,101 [Block report processor] ERROR 
> blockmanagement.BlockManager (BlockManager.java:run(5314)) - 
> java.lang.AssertionError
> java.lang.AssertionError
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312)
>  
> This issue was found while working on the FGL branch, but the same issue can 
> happen in Trunk as well in any error scenario.
>  
> [~hemanthboyina] [~hexiaoqiao]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16141) [FGL] Address permission related issues with File / Directory

2021-07-25 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16141:
-

 Summary: [FGL] Address permission related issues with File / 
Directory
 Key: HDFS-16141
 URL: https://issues.apache.org/jira/browse/HDFS-16141
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Renukaprasad C
Assignee: Renukaprasad C


Post FGL implementation (MKDIR & Create File), some existing UTs got 
impacted and need to be addressed.

Failed Tests:

TestDFSPermission

TestPermission

TestFileCreation

TestDFSMkdirs (Added tests)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16130) [FGL] Implement Create File with FGL

2021-07-23 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386603#comment-17386603
 ] 

Renukaprasad C commented on HDFS-16130:
---

Thank you [~shv] for review, feedback and corrections. 

> [FGL] Implement Create File with FGL
> 
>
> Key: HDFS-16130
> URL: https://issues.apache.org/jira/browse/HDFS-16130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: Fine-Grained Locking
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: Fine-Grained Locking
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Implement FGL for Create File.
> The create API acquires the global lock at multiple stages. Acquire the respective 
> partitioned lock and continue the create operation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack

2021-07-23 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16138:
-

 Summary: BlockReportProcessingThread exit doesn't print the actual 
stack
 Key: HDFS-16138
 URL: https://issues.apache.org/jira/browse/HDFS-16138
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Renukaprasad C
Assignee: Renukaprasad C


The BlockReportProcessingThread thread may exit for multiple reasons, but 
the current logging prints only the exception message with a different stack, 
which makes the issue difficult to debug.

 

Existing logging:

2021-07-20 10:20:23,104 [Block report processor] INFO  util.ExitUtil 
(ExitUtil.java:terminate(210)) - Exiting with status 1: Block report processor 
encountered fatal exception: java.lang.AssertionError

2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil 
(ExitUtil.java:terminate(213)) - Terminate called

1: Block report processor encountered fatal exception: java.lang.AssertionError

    at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)

Exception in thread "Block report processor" 1: Block report processor 
encountered fatal exception: java.lang.AssertionError

    at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304)

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315)

 

Actual issue found at:

2021-07-20 10:20:23,101 [Block report processor] ERROR 
blockmanagement.BlockManager (BlockManager.java:run(5314)) - 
java.lang.AssertionError

java.lang.AssertionError

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480)

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280)

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202)

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338)

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305)

    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853)

    at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657)

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334)

    at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312)

 

This issue was found while working on the FGL branch, but the same issue can 
happen in Trunk as well in any error scenario.

 

[~hemanthboyina] [~hexiaoqiao]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16130) [FGL] Implement Create File with FGL

2021-07-21 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385019#comment-17385019
 ] 

Renukaprasad C commented on HDFS-16130:
---

Thanks [~xinglin] for the quick review & feedback. Addressed the findings, please 
take a look. Thank you.

> [FGL] Implement Create File with FGL
> 
>
> Key: HDFS-16130
> URL: https://issues.apache.org/jira/browse/HDFS-16130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: Fine-Grained Locking
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Implement FGL for Create File.
> The create API acquires the global lock at multiple stages. Acquire the respective 
> partitioned lock and continue the create operation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16128) [FGL] Add support for saving/loading an FS Image for PartitionedGSet

2021-07-18 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383018#comment-17383018
 ] 

Renukaprasad C commented on HDFS-16128:
---

We need to measure the loading performance; if there is any degradation we can 
deal with it later. Functionality is fine.

Everything else looks ok, +1 from my side.

> [FGL] Add support for saving/loading an FS Image for PartitionedGSet
> 
>
> Key: HDFS-16128
> URL: https://issues.apache.org/jira/browse/HDFS-16128
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: Xing Lin
>Assignee: Xing Lin
>Priority: Major
>  Labels: pull-request-available
>
> Add support to save Inodes stored in PartitionedGSet when saving an FS image 
> and load Inodes into PartitionedGSet from a saved FS image.
> h1. Saving FSImage
> *Original HDFS design*: iterate every inode in inodeMap and save them into 
> the FSImage file. 
> *FGL*: no change is needed here, since PartitionedGSet also provides an 
> iterator interface, to iterate over inodes stored in partitions. 
> h1. Loading an HDFS 
> *Original HDFS design*: it first loads the FSImage files and then loads the edit 
> logs for recent changes. FSImage files contain different sections, including 
> INodeSections and INodeDirectorySections. An INodeSection contains serialized 
> INode objects and the INodeDirectorySection contains the parent inode for each 
> INode. When loading an FSImage, the system first loads the INodeSections and then 
> loads the INodeDirectorySections, to set the parent inode for each inode. 
> After the FSImage files are loaded, the edit logs are loaded. Edit logs contain 
> recent changes to the filesystem, including inode creation/deletion. For a 
> newly created INode, the parent inode is set before it is added to the 
> inodeMap.
> *FGL*: when adding an INode into the PartitionedGSet, we need the parent 
> inode of an inode in order to determine which partition stores that inode 
> when NAMESPACE_KEY_DEPTH = 2. Thus, in FGL, when loading FSImage files, we 
> use a temporary LightweightGSet (inodeMapTemp) to store inodes. When 
> LoadFSImage is done, the parent inode for every inode already present in the FSImage 
> files is set, and we can then move the inodes into the PartitionedGSet. Loading the edit 
> logs can work as usual, as the parent inode for an inode is set before it is 
> added to the inodeMap. 
> In theory, PartitionedGSet could store inodes whose parent inodes are not set; 
> all such inodes would be stored in the 0th partition. However, 
> we decided to use a temporary LightweightGSet (inodeMapTemp) to store these 
> inodes, to make this case more transparent.
>  
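
For illustration, a compressed sketch of the two-phase load described above (the helper names loadInodeSection/loadDirectorySection, ParentEntry, and the simplified types are invented for this sketch; this is not the FGL branch code):
{code:java}
// Sketch: inodes go into a plain temporary map first; only after every parent
// reference is resolved are they moved into the partitioned structure.
Map<Long, INode> inodeMapTemp = new HashMap<>();

// Phase 1: INodeSection - parents are unknown at this point, a flat map suffices.
for (INode inode : loadInodeSection()) {
  inodeMapTemp.put(inode.getId(), inode);
}

// Phase 2: INodeDirectorySection - wire up the parent of each inode.
for (ParentEntry e : loadDirectorySection()) {
  inodeMapTemp.get(e.getChildId()).setParent(inodeMapTemp.get(e.getParentId()));
}

// Phase 3: every inode now has a parent, so its partition key (derived from the
// parent) can be computed and the inode placed into the right partition.
for (INode inode : inodeMapTemp.values()) {
  partitionedGSet.put(inode);
}
{code}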



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16128) [FGL] Add support for saving/loading an FS Image for PartitionedGSet

2021-07-18 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17382897#comment-17382897
 ] 

Renukaprasad C commented on HDFS-16128:
---

[~xinglin] Thanks for reporting the issue and the patch.

The patch looks good; I have tested it with MKDIR & Create File, with some corrections 
mentioned in the PR. We can discuss further if there is any confusion.

[~shv] [~hexiaoqiao] Create File along with the MKDIR (POC) would be a great 
combination for testing the framework. If you agree, can you please take a 
look at HDFS-16130?

> [FGL] Add support for saving/loading an FS Image for PartitionedGSet
> 
>
> Key: HDFS-16128
> URL: https://issues.apache.org/jira/browse/HDFS-16128
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: Xing Lin
>Assignee: Xing Lin
>Priority: Major
>  Labels: pull-request-available
>
> Add support to save Inodes stored in PartitionedGSet when saving an FS image 
> and load Inodes into PartitionedGSet from a saved FS image.
> h1. Saving FSImage
> *Original HDFS design*: iterate every inode in inodeMap and save them into 
> the FSImage file. 
> *FGL*: no change is needed here, since PartitionedGSet also provides an 
> iterator interface, to iterate over inodes stored in partitions. 
> h1. Loading an HDFS 
> *Original HDFS design*: it first loads the FSImage files and then loads the edit 
> logs for recent changes. FSImage files contain different sections, including 
> INodeSections and INodeDirectorySections. An INodeSection contains serialized 
> INode objects and the INodeDirectorySection contains the parent inode for each 
> INode. When loading an FSImage, the system first loads the INodeSections and then 
> loads the INodeDirectorySections, to set the parent inode for each inode. 
> After the FSImage files are loaded, the edit logs are loaded. Edit logs contain 
> recent changes to the filesystem, including inode creation/deletion. For a 
> newly created INode, the parent inode is set before it is added to the 
> inodeMap.
> *FGL*: when adding an INode into the PartitionedGSet, we need the parent 
> inode of an inode in order to determine which partition stores that inode 
> when NAMESPACE_KEY_DEPTH = 2. Thus, in FGL, when loading FSImage files, we 
> use a temporary LightweightGSet (inodeMapTemp) to store inodes. When 
> LoadFSImage is done, the parent inode for every inode already present in the FSImage 
> files is set, and we can then move the inodes into the PartitionedGSet. Loading the edit 
> logs can work as usual, as the parent inode for an inode is set before it is 
> added to the inodeMap. 
> In theory, PartitionedGSet could store inodes whose parent inodes are not set; 
> all such inodes would be stored in the 0th partition. However, 
> we decided to use a temporary LightweightGSet (inodeMapTemp) to store these 
> inodes, to make this case more transparent.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-07-17 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17382654#comment-17382654
 ] 

Renukaprasad C commented on HDFS-16067:
---

Thanks [~hexiaoqiao] & [~ayushtkn] for review & feedback.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch, HDFS-16067.005.patch, 
> HDFS-16067.006.patch, HDFS-16067.007.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-07-15 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381261#comment-17381261
 ] 

Renukaprasad C commented on HDFS-16067:
---

Thanks [~hexiaoqiao] for the quick review & feedback. 

Line break - updated the patch; other checkstyle & compiler warnings were also 
handled in the latest patch. Please have a look. Sure, let's wait for the build 
results. Thank you.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch, HDFS-16067.005.patch, 
> HDFS-16067.006.patch, HDFS-16067.007.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-07-15 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16067:
--
Attachment: HDFS-16067.007.patch

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch, HDFS-16067.005.patch, 
> HDFS-16067.006.patch, HDFS-16067.007.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-07-14 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16067:
--
Attachment: HDFS-16067.006.patch

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch, HDFS-16067.005.patch, 
> HDFS-16067.006.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16125) [FGL] Fix the iterator for PartitionedGSet

2021-07-14 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380814#comment-17380814
 ] 

Renukaprasad C commented on HDFS-16125:
---

[~weichiu] The merge build failed, could you help us locate the issue? 

> [FGL] Fix the iterator for PartitionedGSet 
> ---
>
> Key: HDFS-16125
> URL: https://issues.apache.org/jira/browse/HDFS-16125
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, namenode
>Reporter: Xing Lin
>Assignee: Xing Lin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Iterator in PartitionedGSet would visit the first partition twice, since we 
> did not set the keyIterator to move to the first key during initialization.  
>  
> This is related to fgl: https://issues.apache.org/jira/browse/HDFS-14703
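
As a toy illustration of the pattern (a simplified stand-in, not the actual PartitionedGSet code): if the constructor does not advance the partition cursor past the first partition, the first call to hasNext()/next() starts over and the first partition is visited twice. A corrected sketch:
{code:java}
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Simplified stand-in for a partitioned iterator. The fix is in the constructor:
// the partition cursor is advanced past the first partition while handing out
// its element iterator, so hasNext()/next() never re-enter partition 0.
class PartitionIterator<E> implements Iterator<E> {
  private final Iterator<List<E>> partitions;
  private Iterator<E> current;

  PartitionIterator(List<List<E>> parts) {
    this.partitions = parts.iterator();
    this.current = partitions.hasNext()
        ? partitions.next().iterator()          // advance to the first partition here
        : Collections.<E>emptyIterator();
  }

  @Override
  public boolean hasNext() {
    while (!current.hasNext() && partitions.hasNext()) {
      current = partitions.next().iterator();   // move on to the next partition
    }
    return current.hasNext();
  }

  @Override
  public E next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return current.next();
  }
}
{code}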



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16130) [FGL] Implement Create File with FGL

2021-07-14 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380811#comment-17380811
 ] 

Renukaprasad C commented on HDFS-16130:
---

[~shv] [~xinglin] Can you please take a look at the PR for the Create File 
operation? Thank you.

With the above changes I could see around a 25% performance improvement.

> [FGL] Implement Create File with FGL
> 
>
> Key: HDFS-16130
> URL: https://issues.apache.org/jira/browse/HDFS-16130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: Fine-Grained Locking
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Implement FGL for Create File.
> The create API acquires the global lock at multiple stages. Acquire the respective 
> partitioned lock and continue the create operation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16130) [FGL] Implement Create File with FGL

2021-07-14 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16130 started by Renukaprasad C.
-
> [FGL] Implement Create File with FGL
> 
>
> Key: HDFS-16130
> URL: https://issues.apache.org/jira/browse/HDFS-16130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: Fine-Grained Locking
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Implement FGL for Create File.
> The create API acquires the global lock at multiple stages. Acquire the respective 
> partitioned lock and continue the create operation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16130) [FGL] Implement Create File with FGL

2021-07-14 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16130:
--
Summary: [FGL] Implement Create File with FGL  (was: FGL for Create File)

> [FGL] Implement Create File with FGL
> 
>
> Key: HDFS-16130
> URL: https://issues.apache.org/jira/browse/HDFS-16130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: Fine-Grained Locking
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>
> Implement FGL for Create File.
> The create API acquires the global lock at multiple stages. Acquire the respective 
> partitioned lock and continue the create operation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16130) FGL for Create File

2021-07-14 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16130:
-

 Summary: FGL for Create File
 Key: HDFS-16130
 URL: https://issues.apache.org/jira/browse/HDFS-16130
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: Fine-Grained Locking
Reporter: Renukaprasad C
Assignee: Renukaprasad C


Implement FGL for Create File.

The create API acquires the global lock at multiple stages. Acquire the respective 
partitioned lock and continue the create operation.
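
A rough sketch of the locking pattern this describes (the lock-manager API and helpers below are invented for illustration; they are not the FGL branch code):
{code:java}
// Sketch: instead of holding the global namesystem write lock for the whole
// create() call, resolve the parent first and then do the inode insertion
// under the write lock of the parent's partition only.
long parentId = resolveParentInodeId(src);                    // hypothetical helper
PartitionLock lock = lockManager.getPartitionLock(parentId);  // hypothetical lock manager
lock.writeLock().lock();
try {
  // Creates under different partitions proceed in parallel; only creates that
  // map to the same partition contend on this lock.
  addFileInode(parentId, src, permissions);                   // hypothetical helper
} finally {
  lock.writeLock().unlock();
}
{code}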



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-07-14 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380424#comment-17380424
 ] 

Renukaprasad C commented on HDFS-14703:
---

Thanks [~shv] for the review & feedback.

Shall I raise a separate Jira for Create and track the PR there? Or is it ok to go with 
the current PR?
{noformat}
Noticed that you implemented getInode(id) by iterating through all inodes. This 
is probably the key part of this effort. We should eventually replace 
getInode(id) with getInode(key) to make the inode lookup efficient.{noformat}
I totally agree with you; this is an overhead when finding the inodes on a large 
dataset. It was just provided as a work-around to continue; we shall work on it and 
eventually optimize it.
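
For context, a toy contrast of the two lookups being discussed (simplified types; INodeKey is a hypothetical key type, and this is not the branch code):
{code:java}
// Current work-around: find an inode by id by scanning every partition - O(n).
INode getInodeById(long id) {
  for (INode inode : partitionedGSet) {   // iterates over all partitions
    if (inode.getId() == id) {
      return inode;
    }
  }
  return null;
}

// Eventual goal: look up by the partition key (e.g. parent id + inode id),
// so only one partition is touched - effectively a hash lookup.
INode getInodeByKey(INodeKey key) {       // INodeKey is a hypothetical key type
  return partitionedGSet.get(key);
}
{code}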

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16126) VolumePair should override hashcode() method

2021-07-13 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380106#comment-17380106
 ] 

Renukaprasad C commented on HDFS-16126:
---

[~lei w] Thanks for reporting the issue. 

org.apache.hadoop.hdfs.server.datanode.DiskBalancer.VolumePair#hashCode

org.apache.hadoop.hdfs.server.datanode.DiskBalancer.VolumePair#equals

These 2 methods are already implemented in VolumePair. Is anything missing beyond this?

>  VolumePair  should  override hashcode() method
> ---
>
> Key: HDFS-16126
> URL: https://issues.apache.org/jira/browse/HDFS-16126
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: diskbalancer
>Reporter: lei w
>Priority: Minor
>
> Now we use a map to check for a plan with more than one line of the same 
> VolumePair in createWorkPlan(final VolumePair volumePair, Step step); the code 
> is as follows:
> {code:java}
> private void createWorkPlan(final VolumePair volumePair, Step step)
>   throws DiskBalancerException {
>  // ... 
> // In case we have a plan with more than
> // one line of same VolumePair
> // we compress that into one work order.
> if (workMap.containsKey(volumePair)) {//  To check use map
>   bytesToMove += workMap.get(volumePair).getBytesToCopy();
> }
>// ...
>   }
> {code}
> I found that the volumePair object is always a new object and has no hashcode() 
> method, so using a map to check is invalid. Should we add hashcode() in 
> VolumePair?
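
For reference, the usual shape of the two methods the comment refers to (a generic sketch of a map-key value class, not the DiskBalancer source):
{code:java}
import java.util.Objects;

// Sketch: a value class used as a HashMap key must override both equals() and
// hashCode() so that two logically equal pairs hit the same bucket and compare equal.
final class VolumePairKey {
  private final String sourceVolume;
  private final String destVolume;

  VolumePairKey(String sourceVolume, String destVolume) {
    this.sourceVolume = sourceVolume;
    this.destVolume = destVolume;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof VolumePairKey)) {
      return false;
    }
    VolumePairKey that = (VolumePairKey) o;
    return sourceVolume.equals(that.sourceVolume)
        && destVolume.equals(that.destVolume);
  }

  @Override
  public int hashCode() {
    return Objects.hash(sourceVolume, destVolume);
  }
}
{code}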



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-07-12 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379187#comment-17379187
 ] 

Renukaprasad C commented on HDFS-16067:
---

Thanks [~ayushtkn] for reviewing the patch.

Addressed the comments; can you have a look when you get time? Thanks.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch, HDFS-16067.005.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-07-12 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16067:
--
Attachment: HDFS-16067.005.patch

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch, HDFS-16067.005.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-29 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17371175#comment-17371175
 ] 

Renukaprasad C commented on HDFS-16067:
---

Thanks [~ayushtkn] for the review & feedback.
{code:java}
  HdfsFileStatus status = blkWithStatus.getFileStatus();
{code}
This I added as a read API call after the APPEND operation; apart from that it is 
not related to it.

I will address the other comments & update the patch soon. Thank you.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-24 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369056#comment-17369056
 ] 

Renukaprasad C commented on HDFS-16067:
---

Thanks [~hexiaoqiao] for the UT clarification & patch review. Yes, printUsage 
was missed; I corrected it in HDFS-16067.004.patch. Please review.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-24 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16067:
--
Attachment: HDFS-16067.004.patch

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch, HDFS-16067.004.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-24 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368748#comment-17368748
 ] 

Renukaprasad C commented on HDFS-16067:
---

[~hexiaoqiao] There are random UT failures, but the changes are not related to 
the failed tests, which I verified locally. Can you have a look at the failed 
tests? Thank you.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14575) LeaseRenewer#daemon threads leak in DFSClient

2021-06-21 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366784#comment-17366784
 ] 

Renukaprasad C commented on HDFS-14575:
---

Thank you [~hexiaoqiao] & [~weichiu] for review and feedback. [~Tao Yang] 
Thanks for the proposal.

> LeaseRenewer#daemon threads leak in DFSClient
> -
>
> Key: HDFS-14575
> URL: https://issues.apache.org/jira/browse/HDFS-14575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-14575.001.patch, HDFS-14575.002.patch, 
> HDFS-14575.003.patch, HDFS-14575.004.patch
>
>
> Currently a LeaseRenewer (and its daemon thread) without clients should be 
> terminated after a grace period, which defaults to 60 seconds. A race 
> condition may happen when a new request comes in just after the LeaseRenewer 
> has expired.
>  Reproduce this race condition:
>  # Client#1 creates File#1: this creates LeaseRenewer#1 and starts the Daemon#1 
> thread; after a few seconds, File#1 is closed and there are no clients in 
> LeaseRenewer#1 now.
>  # 60 seconds (grace period) later, LeaseRenewer#1 just expires but the Daemon#1 
> thread is still asleep; Client#1 creates File#2, leading to the creation of 
> Daemon#2.
>  # Daemon#1 wakes up and then exits; after that, LeaseRenewer#1 is removed from 
> the factory.
>  # File#2 is closed after a few seconds; LeaseRenewer#2 is created since it 
> can't get a renewer from the factory.
> The Daemon#2 thread leaks from now on, since Client#1 in it can never be removed 
> and it won't have a chance to stop.
> To solve this problem, IIUIC, a simple way I think is to make sure that all 
> clients are cleared when a LeaseRenewer is removed from the factory. Please feel 
> free to give your suggestions. Thanks!
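
A compressed sketch of the proposed direction (the factory map, getFactoryKey() and clearClients() are simplified stand-ins for illustration, not the committed patch):
{code:java}
// Sketch: when the factory removes an expired renewer, also detach its clients,
// so a client added during the race window is forced to obtain a fresh renewer
// (with a new daemon thread) instead of staying attached to the dying one.
synchronized void remove(LeaseRenewer renewer) {
  LeaseRenewer current = renewers.get(renewer.getFactoryKey()); // hypothetical key/map
  if (current == renewer) {
    renewers.remove(renewer.getFactoryKey());
    // Clearing here is the key step: any DFSClient still referencing this
    // renewer must re-register and will get a new renewer + daemon thread.
    renewer.clearClients();                                     // hypothetical helper
  }
}
{code}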



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-21 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366780#comment-17366780
 ] 

Renukaprasad C commented on HDFS-16067:
---

Thanks [~hexiaoqiao] for the quick update.

I have updated the patch. The failed tests are unrelated; let's wait for the results 
of the build you triggered.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-21 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16067:
--
Attachment: HDFS-16067.003.patch

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch, 
> HDFS-16067.003.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-20 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366236#comment-17366236
 ] 

Renukaprasad C commented on HDFS-16067:
---

Thanks [~hexiaoqiao] for the quick review & feedback.

A. Regarding preparing the fileset: AppendFileStats extends OpenFileStats, so 
generateInputs() is called from the base 
class OperationStatsBase#benchmark(). Please correct me if I missed something 
in your point.

B. I have updated and uploaded HDFS-16067.002.patch.

Please review & give feedback if I missed anything else. Thank you.
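
To make the inheritance point concrete, a skeleton of how such an op can hang off the existing benchmark classes (field and method names from NNThroughputBenchmark are abbreviated here; treat this as a sketch rather than the committed patch):
{code:java}
// Sketch: AppendFileStats reuses the file set prepared by OpenFileStats
// (generateInputs() is inherited from the base flow) and only overrides the
// operation itself.
class AppendFileStats extends OpenFileStats {
  static final String OP_APPEND_NAME = "append";
  static final String OP_APPEND_USAGE = "-op append [-threads T] [-files N]";

  @Override
  String getOpName() {
    return OP_APPEND_NAME;
  }

  @Override
  long executeOp(int daemonId, int inputIdx, String ignore) throws IOException {
    long start = Time.now();
    // Append to one of the files prepared by the inherited generateInputs().
    clientProto.append(fileNames[daemonId][inputIdx], "NNThroughputBenchmark",
        new EnumSetWritable<>(EnumSet.of(CreateFlag.APPEND)));
    return Time.now() - start;
  }
}
{code}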

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-20 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-16067:
--
Attachment: HDFS-16067.002.patch

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch, HDFS-16067.002.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16067) Support Append API in NNThroughputBenchmark

2021-06-19 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365938#comment-17365938
 ] 

Renukaprasad C commented on HDFS-16067:
---

[~hexiaoqiao] [~surendralilhore] can you please have a look at the patch when 
you find time? The failed test is not related to the code changes.

> Support Append API in NNThroughputBenchmark
> ---
>
> Key: HDFS-16067
> URL: https://issues.apache.org/jira/browse/HDFS-16067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Minor
> Attachments: HDFS-16067.001.patch
>
>
> Append API needs to be added into NNThroughputBenchmark tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-14575) LeaseRenewer#daemon threads leak in DFSClient

2021-06-19 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C reassigned HDFS-14575:
-

Assignee: Renukaprasad C  (was: Tao Yang)

> LeaseRenewer#daemon threads leak in DFSClient
> -
>
> Key: HDFS-14575
> URL: https://issues.apache.org/jira/browse/HDFS-14575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-14575.001.patch, HDFS-14575.002.patch, 
> HDFS-14575.003.patch, HDFS-14575.004.patch
>
>
> Currently a LeaseRenewer (and its daemon thread) without clients should be 
> terminated after a grace period, which defaults to 60 seconds. A race 
> condition may happen when a new request comes in just after the LeaseRenewer 
> has expired.
>  Reproduce this race condition:
>  # Client#1 creates File#1: this creates LeaseRenewer#1 and starts the Daemon#1 
> thread; after a few seconds, File#1 is closed and there are no clients in 
> LeaseRenewer#1 now.
>  # 60 seconds (grace period) later, LeaseRenewer#1 just expires but the Daemon#1 
> thread is still asleep; Client#1 creates File#2, leading to the creation of 
> Daemon#2.
>  # Daemon#1 wakes up and then exits; after that, LeaseRenewer#1 is removed from 
> the factory.
>  # File#2 is closed after a few seconds; LeaseRenewer#2 is created since it 
> can't get a renewer from the factory.
> The Daemon#2 thread leaks from now on, since Client#1 in it can never be removed 
> and it won't have a chance to stop.
> To solve this problem, IIUIC, a simple way I think is to make sure that all 
> clients are cleared when a LeaseRenewer is removed from the factory. Please feel 
> free to give your suggestions. Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14575) LeaseRenewer#daemon threads leak in DFSClient

2021-06-19 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365932#comment-17365932
 ] 

Renukaprasad C commented on HDFS-14575:
---

Thank you [~hexiaoqiao] for the quick review and feedback. 

I have incorporated the changes and updated the patch - HDFS-14575.004.patch. 
Please have a look when you find time.

Regarding the wildcard import - sure, I will consider your suggestion for future 
patches. Thank you.

> LeaseRenewer#daemon threads leak in DFSClient
> -
>
> Key: HDFS-14575
> URL: https://issues.apache.org/jira/browse/HDFS-14575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: HDFS-14575.001.patch, HDFS-14575.002.patch, 
> HDFS-14575.003.patch, HDFS-14575.004.patch
>
>
> Currently a LeaseRenewer (and its daemon thread) without clients should be 
> terminated after a grace period, which defaults to 60 seconds. A race 
> condition may happen when a new request comes in just after the LeaseRenewer 
> has expired.
>  Reproduce this race condition:
>  # Client#1 creates File#1: this creates LeaseRenewer#1 and starts the Daemon#1 
> thread; after a few seconds, File#1 is closed and there are no clients in 
> LeaseRenewer#1 now.
>  # 60 seconds (grace period) later, LeaseRenewer#1 just expires but the Daemon#1 
> thread is still asleep; Client#1 creates File#2, leading to the creation of 
> Daemon#2.
>  # Daemon#1 wakes up and then exits; after that, LeaseRenewer#1 is removed from 
> the factory.
>  # File#2 is closed after a few seconds; LeaseRenewer#2 is created since it 
> can't get a renewer from the factory.
> The Daemon#2 thread leaks from now on, since Client#1 in it can never be removed 
> and it won't have a chance to stop.
> To solve this problem, IIUIC, a simple way I think is to make sure that all 
> clients are cleared when a LeaseRenewer is removed from the factory. Please feel 
> free to give your suggestions. Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14575) LeaseRenewer#daemon threads leak in DFSClient

2021-06-19 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-14575:
--
Attachment: HDFS-14575.004.patch

> LeaseRenewer#daemon threads leak in DFSClient
> -
>
> Key: HDFS-14575
> URL: https://issues.apache.org/jira/browse/HDFS-14575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: HDFS-14575.001.patch, HDFS-14575.002.patch, 
> HDFS-14575.003.patch, HDFS-14575.004.patch
>
>
> Currently a LeaseRenewer (and its daemon thread) without clients should be 
> terminated after a grace period, which defaults to 60 seconds. A race 
> condition may happen when a new request comes in just after the LeaseRenewer 
> has expired.
>  Reproduce this race condition:
>  # Client#1 creates File#1: this creates LeaseRenewer#1 and starts the Daemon#1 
> thread; after a few seconds, File#1 is closed and there are no clients in 
> LeaseRenewer#1 now.
>  # 60 seconds (grace period) later, LeaseRenewer#1 just expires but the Daemon#1 
> thread is still asleep; Client#1 creates File#2, leading to the creation of 
> Daemon#2.
>  # Daemon#1 wakes up and then exits; after that, LeaseRenewer#1 is removed from 
> the factory.
>  # File#2 is closed after a few seconds; LeaseRenewer#2 is created since it 
> can't get a renewer from the factory.
> The Daemon#2 thread leaks from now on, since Client#1 in it can never be removed 
> and it won't have a chance to stop.
> To solve this problem, IIUIC, a simple way I think is to make sure that all 
> clients are cleared when a LeaseRenewer is removed from the factory. Please feel 
> free to give your suggestions. Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


