[jira] [Commented] (HDFS-15584) Improve HDFS large deletion cause namenode lockqueue boom and pending deletion boom.

2020-09-18 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198220#comment-17198220
 ] 

zhuqi commented on HDFS-15584:
--

Hi [~sodonnell] 

Yes, I agree with you that we should sleep.

As for a good default for "dfs.namenode.block.deletion.lock.time.threshold", we 
should test to find a suitable value. If the sleep time is 1ms, the threshold 
should probably be tens of milliseconds; I set it to 100ms as an initial value.

Also, the reason I added the threshold is to avoid sleeping when deletion is not 
heavy, which keeps the lock hold time down. You are right about that.
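
As a rough back-of-the-envelope illustration (my own arithmetic, not a 
measurement): with a 100ms threshold and a 1ms sleep, a deletion that would 
otherwise hold the write lock for a full second yields roughly 1000ms / 100ms = 
10 times, adding only about 10ms of sleep while giving queued RPC handlers ten 
chances to acquire the lock.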

When too many deletions are pending, the DataNodes are heavily loaded working 
through them, and the rate at which the NameNode enqueues new ones also slows 
down, which hurts performance.

Thanks for your quick reply.

> Improve HDFS large deletion cause namenode lockqueue boom and pending 
> deletion boom.
> 
>
> Key: HDFS-15584
> URL: https://issues.apache.org/jira/browse/HDFS-15584
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15584.001.patch
>
>
> In our production cluster, a large deletion floods the NameNode lock queue 
> and also leads to a flood of pending deletions in invalidateBlocks.
>  






[jira] [Commented] (HDFS-15584) Improve HDFS large deletion cause namenode lockqueue boom and pending deletion boom.

2020-09-18 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198198#comment-17198198
 ] 

zhuqi commented on HDFS-15584:
--

Hi [~LiJinglun], in our very busy cluster with thousands of nodes, very heavy 
deletions every day cause the lock queue to stay full for a couple of minutes. 
Also, when millions of blocks are put into the pending deletion queue, the 
NameNode suffers a big performance drop. When that happens, the original 
block-increment solution also cannot solve the problem in our cluster, so I 
added this patch to try to address it.

Thanks.

> Improve HDFS large deletion cause namenode lockqueue boom and pending 
> deletion boom.
> 
>
> Key: HDFS-15584
> URL: https://issues.apache.org/jira/browse/HDFS-15584
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15584.001.patch
>
>
> In our production cluster, a large deletion floods the NameNode lock queue 
> and also leads to a flood of pending deletions in invalidateBlocks.
>  






[jira] [Commented] (HDFS-15584) Improve HDFS large deletion cause namenode lockqueue boom and pending deletion boom.

2020-09-17 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198093#comment-17198093
 ] 

zhuqi commented on HDFS-15584:
--

cc [~sodonnell], [~hexiaoqiao]

I have attached a draft patch, without a unit test yet.

I added a lock wait (sleep) time, plus a threshold to control when we yield the 
lock: if the accumulated lock time of the block deletion exceeds the threshold, 
we release the lock and wait, as sketched below.

Both values can be configured according to the cluster situation.

Any advice is welcome. Thanks.
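
Something like the following, just to illustrate the shape (the lock here stands 
in for the FSNamesystem write lock; the method and variable names are my own 
illustration, not the actual patch):

{code:java}
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the yield-while-deleting loop described above.
// thresholdMs and sleepMs stand for the two new configurable values.
class DeletionYieldSketch {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

  void removeBlocks(List<Long> blockIds, long thresholdMs, long sleepMs)
      throws InterruptedException {
    long lockStartMs = System.currentTimeMillis();
    fsLock.writeLock().lock();
    try {
      for (Long id : blockIds) {
        // ... remove one block from the blocks map here ...
        // Once the write lock has been held past the threshold, yield it
        // briefly so queued RPC handlers can make progress.
        if (System.currentTimeMillis() - lockStartMs > thresholdMs) {
          fsLock.writeLock().unlock();
          try {
            Thread.sleep(sleepMs); // e.g. 1ms
          } finally {
            fsLock.writeLock().lock(); // always re-acquire before continuing
          }
          lockStartMs = System.currentTimeMillis();
        }
      }
    } finally {
      fsLock.writeLock().unlock();
    }
  }
}
{code}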

> Improve HDFS large deletion cause namenode lockqueue boom and pending 
> deletion boom.
> 
>
> Key: HDFS-15584
> URL: https://issues.apache.org/jira/browse/HDFS-15584
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15584.001.patch
>
>
> In our production cluster, a large deletion floods the NameNode lock queue 
> and also leads to a flood of pending deletions in invalidateBlocks.
>  






[jira] [Updated] (HDFS-15584) Improve HDFS large deletion cause namenode lockqueue boom and pending deletion boom.

2020-09-17 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15584:
-
Attachment: HDFS-15584.001.patch
Status: Patch Available  (was: Open)

> Improve HDFS large deletion cause namenode lockqueue boom and pending 
> deletion boom.
> 
>
> Key: HDFS-15584
> URL: https://issues.apache.org/jira/browse/HDFS-15584
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15584.001.patch
>
>
> In our production cluster, a large deletion floods the NameNode lock queue 
> and also leads to a flood of pending deletions in invalidateBlocks.
>  






[jira] [Created] (HDFS-15584) Improve HDFS large deletion cause namenode lockqueue boom and pending deletion boom.

2020-09-17 Thread zhuqi (Jira)
zhuqi created HDFS-15584:


 Summary: Improve HDFS large deletion cause namenode lockqueue boom 
and pending deletion boom.
 Key: HDFS-15584
 URL: https://issues.apache.org/jira/browse/HDFS-15584
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.4.0
Reporter: zhuqi
Assignee: zhuqi


In our production cluster, a large deletion floods the NameNode lock queue and 
also leads to a flood of pending deletions in invalidateBlocks.

 






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-05-08 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102302#comment-17102302
 ] 

zhuqi commented on HDFS-15160:
--

Hi [~weichiu] 

The failing case seems unrelated to this patch; I applied patch 005 in our 
cluster, and the cluster works well now.

Thanks.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> image-2020-04-10-17-18-08-128.png, image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.
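
To make the read/write split concrete, here is a minimal standalone sketch 
(field and method names are illustrative only; the actual FsDatasetImpl code 
differs):

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal sketch of the read/write split described above; not FsDatasetImpl itself.
class DatasetLockSketch {
  private final ReentrantReadWriteLock datasetLock = new ReentrantReadWriteLock(true);

  // Read-only paths (e.g. deepCopyReplica, getFinalizedBlocks) can share the lock.
  void readOnlyOperation() {
    datasetLock.readLock().lock();
    try {
      // iterate the volume map without mutating it
    } finally {
      datasetLock.readLock().unlock();
    }
  }

  // Mutating paths still need exclusive access.
  void mutatingOperation() {
    datasetLock.writeLock().lock();
    try {
      // add or remove a replica
    } finally {
      datasetLock.writeLock().unlock();
    }
  }
}
{code}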






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-04-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081128#comment-17081128
 ] 

zhuqi commented on HDFS-15160:
--

cc [~weichiu]

Thanks for your reply.

I had applied patch 003; I will switch to 005 to see whether the problem is solved.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> image-2020-04-10-17-18-08-128.png, image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-04-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080371#comment-17080371
 ] 

zhuqi commented on HDFS-15160:
--

cc [~sodonnell]

The race condition can cause an RBW replica's genstamp to become inconsistent.

There are some cases of this in our production cluster:

!image-2020-04-10-17-18-08-128.png|width=860,height=226!

!image-2020-04-10-17-18-55-938.png|width=1144,height=157!

 

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> image-2020-04-10-17-18-08-128.png, image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.






[jira] [Updated] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-04-10 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15160:
-
Attachment: image-2020-04-10-17-18-55-938.png

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> image-2020-04-10-17-18-08-128.png, image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.






[jira] [Updated] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-04-10 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15160:
-
Attachment: image-2020-04-10-17-18-08-128.png

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> image-2020-04-10-17-18-08-128.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.






[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-04-08 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17078878#comment-17078878
 ] 

zhuqi commented on HDFS-15180:
--

cc [~Aiphag0]

The GenerationStamp and byte-length accessors of Block in 
org.apache.hadoop.hdfs.protocol should be changed to synchronized.

There can be cases where one thread holds the read lock while another thread 
updates the Block's GenerationStamp at the same time; see the sketch after the 
screenshot below.

!image-2020-04-09-11-20-36-459.png!
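
A simplified sketch of the suggested change (the real 
org.apache.hadoop.hdfs.protocol.Block has more state; this only illustrates the 
synchronization):

{code:java}
// Make the generation-stamp and length accessors synchronized so a
// concurrent update cannot be observed half-applied while another
// thread holds only the dataset read lock.
class BlockSketch {
  private long numBytes;
  private long generationStamp;

  public synchronized long getGenerationStamp() { return generationStamp; }
  public synchronized void setGenerationStamp(long gs) { this.generationStamp = gs; }
  public synchronized long getNumBytes() { return numBytes; }
  public synchronized void setNumBytes(long len) { this.numBytes = len; }
}
{code}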

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: Aiphago
>Priority: Major
> Attachments: HDFS-15180.001.patch, HDFS-15180.002.patch, 
> HDFS-15180.003.patch, HDFS-15180.004.patch, 
> image-2020-03-10-17-22-57-391.png, image-2020-03-10-17-31-58-830.png, 
> image-2020-03-10-17-34-26-368.png, image-2020-04-09-11-20-36-459.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Updated] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-04-08 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15180:
-
Attachment: image-2020-04-09-11-20-36-459.png

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: Aiphago
>Priority: Major
> Attachments: HDFS-15180.001.patch, HDFS-15180.002.patch, 
> HDFS-15180.003.patch, HDFS-15180.004.patch, 
> image-2020-03-10-17-22-57-391.png, image-2020-03-10-17-31-58-830.png, 
> image-2020-03-10-17-34-26-368.png, image-2020-04-09-11-20-36-459.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Assigned] (HDFS-14524) NNTop total counts does not add up as expected

2020-03-19 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi reassigned HDFS-14524:


Assignee: (was: zhuqi)

> NNTop total counts does not add up as expected
> --
>
> Key: HDFS-14524
> URL: https://issues.apache.org/jira/browse/HDFS-14524
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ahmed Hussein
>Priority: Minor
> Attachments: HDFS-14524.001.patch
>
>
> {{opType='*'}} is sometimes smaller than the sum of the individual operation 
> types.
> {code:java}
> {
>   "windows": [
> {
>   "windowLenMs": 30,
>   "ops": [
> {
>   "totalCount": 24158,
>   "opType": "rpc.complete",
>   "topUsers": [{ "count": 2944, "user": "user1" }]
> },
> {
>   "totalCount": 15921,
>   "opType": "rpc.rename",
>   "topUsers": [{ "count": 2891, "user": "user1" }]
> },
> {
>   "totalCount": 3015834,
>   "opType": "*",
>   "topUsers": [{ "count": 66652, "user": "user1" }]
> },
> {
>   "totalCount": 2086,
>   "opType": "rpc.abandonBlock",
>   "topUsers": [{ "count": 603, "user": "user1" }]
> },
> {
>   "totalCount": 30258,
>   "opType": "rpc.addBlock",
>   "topUsers": [{ "count": 3182, "user": "user1" }]
> },
> {
>   "totalCount": 101440,
>   "opType": "rpc.getServerDefaults",
>   "topUsers": [{ "count": 3521, "user": "user1" }]
> },
> {
>   "totalCount": 25258,
>   "opType": "rpc.create",
>   "topUsers": [{ "count": 1864, "user": "user1" }]
> },
> {
>   "totalCount": 1377563,
>   "opType": "rpc.getFileInfo",
>   "topUsers": [{ "count": 56541, "user": "user1" }]
> },
> {
>   "totalCount": 60836,
>   "opType": "rpc.renewLease",
>   "topUsers": [{ "count": 3783, "user": "user1" }]
> },
> {
>   "totalCount": 182212,
>   "opType": "rpc.getListing",
>   "topUsers": [{ "count": 1848, "user": "user1" }]
> },
> {
>   "totalCount": 380,
>   "opType": "rpc.updateBlockForPipeline",
>   "topUsers": [{ "count": 58, "user": "user1" }]
> },
> {
>   "totalCount": 215,
>   "opType": "rpc.updatePipeline",
>   "topUsers": [{ "count": 18, "user": "user1" }]
> }
>   ]
> }
>   ],
>   "timestamp": "2019-01-12"
> }
> {code}
>  
>  {{opType='*'}} from user {{user1}} is {{66652}}, but the sum of counts for 
> the other {{opType}} values by {{user1}} is actually larger: {{77253}}.
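
For reference, summing user1's per-operation counts from the JSON above confirms 
the discrepancy: 2944 + 2891 + 603 + 3182 + 3521 + 1864 + 56541 + 3783 + 1848 + 
58 + 18 = 77253, which is indeed larger than the 66652 reported under 
{{opType='*'}}.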






[jira] [Assigned] (HDFS-14524) NNTop total counts does not add up as expected

2020-03-19 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi reassigned HDFS-14524:


Assignee: zhuqi  (was: Ahmed Hussein)

> NNTop total counts does not add up as expected
> --
>
> Key: HDFS-14524
> URL: https://issues.apache.org/jira/browse/HDFS-14524
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ahmed Hussein
>Assignee: zhuqi
>Priority: Minor
> Attachments: HDFS-14524.001.patch
>
>
> {{opType='*'}} is sometimes smaller than the sum of the individual operation 
> types.
> {code:java}
> {
>   "windows": [
> {
>   "windowLenMs": 30,
>   "ops": [
> {
>   "totalCount": 24158,
>   "opType": "rpc.complete",
>   "topUsers": [{ "count": 2944, "user": "user1" }]
> },
> {
>   "totalCount": 15921,
>   "opType": "rpc.rename",
>   "topUsers": [{ "count": 2891, "user": "user1" }]
> },
> {
>   "totalCount": 3015834,
>   "opType": "*",
>   "topUsers": [{ "count": 66652, "user": "user1" }]
> },
> {
>   "totalCount": 2086,
>   "opType": "rpc.abandonBlock",
>   "topUsers": [{ "count": 603, "user": "user1" }]
> },
> {
>   "totalCount": 30258,
>   "opType": "rpc.addBlock",
>   "topUsers": [{ "count": 3182, "user": "user1" }]
> },
> {
>   "totalCount": 101440,
>   "opType": "rpc.getServerDefaults",
>   "topUsers": [{ "count": 3521, "user": "user1" }]
> },
> {
>   "totalCount": 25258,
>   "opType": "rpc.create",
>   "topUsers": [{ "count": 1864, "user": "user1" }]
> },
> {
>   "totalCount": 1377563,
>   "opType": "rpc.getFileInfo",
>   "topUsers": [{ "count": 56541, "user": "user1" }]
> },
> {
>   "totalCount": 60836,
>   "opType": "rpc.renewLease",
>   "topUsers": [{ "count": 3783, "user": "user1" }]
> },
> {
>   "totalCount": 182212,
>   "opType": "rpc.getListing",
>   "topUsers": [{ "count": 1848, "user": "user1" }]
> },
> {
>   "totalCount": 380,
>   "opType": "rpc.updateBlockForPipeline",
>   "topUsers": [{ "count": 58, "user": "user1" }]
> },
> {
>   "totalCount": 215,
>   "opType": "rpc.updatePipeline",
>   "topUsers": [{ "count": 18, "user": "user1" }]
> }
>   ]
> }
>   ],
>   "timestamp": "2019-01-12"
> }
> {code}
>  
>  {{opType='*'}} from user {{user1}} is {{66652}}, but the sum of counts for 
> the other {{opType}} values by {{user1}} is actually larger: {{77253}}.






[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-17 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060700#comment-17060700
 ] 

zhuqi commented on HDFS-15180:
--

Thanks [~Aiphag0] for your work and the POC patch.

CC [~hexiaoqiao] [~Aiphag0], the POC patch LGTM.

Two suggestions:

First, we had better implement AutoCloseableLock on top of Lock, so that it is 
consistent with the new read/write lock in the DataNode and callers can use 
try-with-resources concisely, without an explicit finally block; see the sketch 
below.

Second, fetching replica information in 
DataNode#transferReplicaForPipelineRecovery should be changed to use the read 
lock.

I am also looking forward to the volume-level lock, and to removing the 
remaining IO lock.
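
A minimal sketch of the first suggestion (Hadoop ships its own 
AutoCloseableLock; this simplified stand-in only illustrates the 
try-with-resources shape):

{code:java}
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified stand-in for AutoCloseableLock built on a plain Lock.
class CloseableLock implements AutoCloseable {
  private final Lock lock;
  CloseableLock(Lock lock) { this.lock = lock; }
  CloseableLock acquire() { lock.lock(); return this; }
  @Override public void close() { lock.unlock(); }
}

class ReplicaInfoReader {
  private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock(true);
  private final CloseableLock readLock = new CloseableLock(rw.readLock());

  void transferReplicaSketch() {
    try (CloseableLock ignored = readLock.acquire()) {
      // read-only work, e.g. fetching replica information
    } // unlock happens automatically; no explicit finally needed
  }
}
{code}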

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: Aiphago
>Priority: Major
> Attachments: HDFS-15180.001.patch, image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Comment Edited] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056054#comment-17056054
 ] 

zhuqi edited comment on HDFS-15160 at 3/10/20, 3:23 PM:


Thanks [~sodonnell] for your great work.

LGTM. I agree with [~hexiaoqiao] that 
DataNode#transferReplicaForPipelineRecovery should change 
data.acquireDatasetLock() to data.acquireDatasetReadLock() to get the replica 
information.


was (Author: zhuqi):
Thanks [~sodonnell] for your great works.

LGTM, i agree with  [~hexiaoqiao]  that the 
DataNode#transferReplicaForPipelineRecovery should change 

data.acquireDatasetLock() to data.acquireDatasetReadLock() to get replica 
information.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056054#comment-17056054
 ] 

zhuqi commented on HDFS-15160:
--

Thanks [~sodonnell] for your great work.

LGTM. I agree with [~hexiaoqiao] that 
DataNode#transferReplicaForPipelineRecovery should change 
data.acquireDatasetLock() to data.acquireDatasetReadLock() to get the replica 
information.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) there are various "low hanging fruit" items in 
> BlockSender and FsDatasetImpl where it is fairly obvious they only need a 
> read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.






[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056011#comment-17056011
 ] 

zhuqi commented on HDFS-15180:
--

Hi [~sodonnell]

Yes, your comment captures exactly what I meant.

Adding the top lock hold times to the DataNode metrics may help guide future 
performance-improvement decisions.

 

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Comment Edited] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055739#comment-17055739
 ] 

zhuqi edited comment on HDFS-15180 at 3/10/20, 9:44 AM:


!image-2020-03-10-17-22-57-391.png|width=604,height=137!

 

Hi [~sodonnell]
It is the monitor of average blocked threads per day.

I canaried 3 nodes in a busy cluster at different times; the green one has been 
running longest, and it now has almost no blocked threads. The other two show a 
good improvement since being switched to the RW lock.
I analyzed the logs; the long lock hold times happen during:
!image-2020-03-10-17-31-58-830.png|width=611,height=148!

the DirectoryScanner scan operation.
Other operations do not take much time even when the node is very busy:
!image-2020-03-10-17-34-26-368.png|width=812,height=136!

such as the deep copy for the calculation of dfsUsed.


was (Author: zhuqi):
!image-2020-03-10-17-22-57-391.png|width=604,height=137!

 

Hi [~sodonnell]

I just gray 3 nodes in busy cluster in different time , the green one is the 
longest, and there are almost no blocked thread now. The two one has a good 
improvement from it has beed changed to RW Lock.
And i analyze the log to find the long lock time happen when :
!image-2020-03-10-17-31-58-830.png|width=611,height=148!

DirectoryScanner scan operation.
And other does not cause too much time when very busy:
!image-2020-03-10-17-34-26-368.png|width=812,height=136!

Such as the deep copy for the caculation of dfsUsed.

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055747#comment-17055747
 ] 

zhuqi commented on HDFS-15180:
--

CC [~sodonnell]
My version is based on CDH 5.16.1, and my lock is a fair lock. Which key 
performance factors of the RW lock in the DataNode would you like to know 
about? I can try to confirm the RW improvement the next time I canary more 
nodes in our busy cluster.
Maybe we need more metrics to help confirm the performance improvement; what do 
you think?

Thanks a lot.

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Comment Edited] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055739#comment-17055739
 ] 

zhuqi edited comment on HDFS-15180 at 3/10/20, 9:35 AM:


!image-2020-03-10-17-22-57-391.png|width=604,height=137!

 

Hi [~sodonnell]

I canaried 3 nodes in a busy cluster at different times; the green one has been 
running longest, and it now has almost no blocked threads. The other two show a 
good improvement since being switched to the RW lock.
I analyzed the logs; the long lock hold times happen during:
!image-2020-03-10-17-31-58-830.png|width=611,height=148!

the DirectoryScanner scan operation.
Other operations do not take much time even when the node is very busy:
!image-2020-03-10-17-34-26-368.png|width=812,height=136!

such as the deep copy for the calculation of dfsUsed.


was (Author: zhuqi):
!image-2020-03-10-17-22-57-391.png!

 

 

 

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055739#comment-17055739
 ] 

zhuqi commented on HDFS-15180:
--

!image-2020-03-10-17-22-57-391.png!

 

 

 

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Updated] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15180:
-
Attachment: image-2020-03-10-17-22-57-391.png

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055671#comment-17055671
 ] 

zhuqi commented on HDFS-15180:
--

[~sodonnell], [~hexiaoqiao]

Thanks for your patient replies.

[~sodonnell] has done some work here: HDFS-15150 introduced the read/write 
lock, and HDFS-15160 is currently in progress. I have canaried HDFS-15160 in 
our production cluster, and the number of blocked threads in the DataNode has 
been reduced a lot.

[~hexiaoqiao], I am looking forward to the {{BlockPoolLockManager}} splitting 
{{dataLock}} more fine-grained; I can assign it to [~Aiphag0] anytime if he 
wants to take it.

 

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock 
> per block pool.






[jira] [Commented] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2020-02-22 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17042798#comment-17042798
 ] 

zhuqi commented on HDFS-15041:
--

cc [~ayushtkn]

Thanks for your review.

I have fixed it.

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch, 
> HDFS-15041.003.patch, HDFS-15041.004.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are fixed, but different 
> clusters have different latency needs and queue health standards. We had 
> better make these two parameters configurable.






[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2020-02-22 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
Attachment: HDFS-15041.004.patch
Status: Patch Available  (was: In Progress)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch, 
> HDFS-15041.003.patch, HDFS-15041.004.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are fixed, but different 
> clusters have different latency needs and queue health standards. We had 
> better make these two parameters configurable.






[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2020-02-22 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
Status: In Progress  (was: Patch Available)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch, 
> HDFS-15041.003.patch, HDFS-15041.004.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are fixed, but different 
> clusters have different latency needs and queue health standards. We had 
> better make these two parameters configurable.






[jira] [Commented] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2020-02-22 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17042783#comment-17042783
 ] 

zhuqi commented on HDFS-15041:
--

cc [~ayushtkn] [~weichiu]

I have changed the configuration to support time units, as sketched below. 
Please let me know if any other changes are needed before merging.

Thanks.
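
For illustration, reading a time-unit-aware setting could look like the sketch 
below (the property key and default here are placeholders, not the names used 
in the patch):

{code:java}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;

public class LockHoldConfigSketch {
  // Placeholder key for illustration; not the actual property name.
  static final String MAX_LOCK_HOLD_KEY = "example.max.lock.hold.interval";

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // getTimeDuration accepts suffixed values such as "50ms" or "1s"
    // and converts them to the requested unit.
    long maxLockHoldMs =
        conf.getTimeDuration(MAX_LOCK_HOLD_KEY, 50, TimeUnit.MILLISECONDS);
    System.out.println("max lock hold = " + maxLockHoldMs + " ms");
  }
}
{code}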

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch, 
> HDFS-15041.003.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are fixed, but different 
> clusters have different latency needs and queue health standards. We had 
> better make these two parameters configurable.






[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2020-02-22 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
Attachment: HDFS-15041.003.patch
Status: Patch Available  (was: In Progress)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch, 
> HDFS-15041.003.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are fixed, but different 
> clusters have different latency needs and queue health standards. We had 
> better make these two parameters configurable.






[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2020-02-22 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
Status: In Progress  (was: Patch Available)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch
>
>
> Currently MAX_LOCK_HOLD_MS and the full queue size are fixed, but different 
> clusters have different latency needs and queue health standards. We had 
> better make these two parameters configurable.






[jira] [Commented] (HDFS-15171) Add a thread to call saveDfsUsed periodically, to prevent datanode too long restart time.

2020-02-20 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041574#comment-17041574
 ] 

zhuqi commented on HDFS-15171:
--

Hi [~weichiu] 
There is no cache file if the DataNode shuts down ungracefully, so changing 
dfs.datanode.cached-dfsused.check.interval.ms will not help my case.

HDFS-14313 should reduce the refresh time; I will try it.

Thanks.

> Add a thread to call saveDfsUsed periodically, to prevent datanode too long 
> restart time.  
> ---
>
> Key: HDFS-15171
> URL: https://issues.apache.org/jira/browse/HDFS-15171
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> There are 30 storage dirs per DataNode in our production cluster, so a 
> restart takes too long, because sometimes the DataNode did not shut down 
> gracefully. Currently only the DataNode's graceful shutdown hook and the 
> BlockPoolSlice shutdown invoke the saveDfsUsed function, so a restarted 
> DataNode sometimes cannot reuse the dfsUsed cache. I think we could add a 
> thread to call the saveDfsUsed function periodically.
>  






[jira] [Comment Edited] (HDFS-15171) Add a thread to call saveDfsUsed periodically, to prevent datanode too long restart time.

2020-02-20 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041574#comment-17041574
 ] 

zhuqi edited comment on HDFS-15171 at 2/21/20 6:08 AM:
---

Hi [~weichiu] 
There is no cache file if the DataNode shuts down ungracefully, so changing 
dfs.datanode.cached-dfsused.check.interval.ms will not help my case.

HDFS-14313 should reduce the refresh time; I will try it.

Thanks.


was (Author: zhuqi):
Hi [~weichiu] 
There are no cache file if the datanode shutdow ungracefully , change the 
dfs.datanode.cached-dfsused.check.interval.ms will not help my case.

The HDFS-14313  should can reduce the refresh time, i will try it.

Thanks.

> Add a thread to call saveDfsUsed periodically, to prevent datanode too long 
> restart time.  
> ---
>
> Key: HDFS-15171
> URL: https://issues.apache.org/jira/browse/HDFS-15171
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> There are 30 storage dirs per DataNode in our production cluster, so a 
> restart takes too long, because sometimes the DataNode did not shut down 
> gracefully. Currently only the DataNode's graceful shutdown hook and the 
> BlockPoolSlice shutdown invoke the saveDfsUsed function, so a restarted 
> DataNode sometimes cannot reuse the dfsUsed cache. I think we could add a 
> thread to call the saveDfsUsed function periodically.
>  






[jira] [Commented] (HDFS-15171) Add a thread to call saveDfsUsed periodically, to prevent datanode too long restart time.

2020-02-20 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041564#comment-17041564
 ] 

zhuqi commented on HDFS-15171:
--

Hi [~sodonnell]

Thanks for your patient reply.

First, the refresh thread in CachingGetSpaceUsed already runs periodically 
(roughly every 10 minutes) with random jitter applied to the refresh operation; 
if we also persist the value to the cache file whenever it refreshes, the cache 
stays as close to real time as possible.

Second, when the value refreshes we can compare it with the last persisted one; 
if they are the same, we can skip the persist and save a disk write.

To reduce disk writes further, we can add a configurable fixed interval: we 
only persist the value to disk when the time since the last persist exceeds 
that interval.

Then we can remove the shutdown-hook persist and no longer need to work out a 
suitable value for dfs.datanode.cached-dfsused.check.interval.ms.

This would also resolve my problem, which is caused by the DataNode shutting 
down ungracefully. A sketch of the idea follows.

What do you think about my advice?
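
Something like the following, just to illustrate the shape (all names here are 
my own assumptions, not actual CachingGetSpaceUsed code, and I assume the 
refresh thread calls a hook like onRefresh() with the newly computed value):

{code:java}
// Persist dfsUsed on refresh, skipping the write when the value is
// unchanged or the configurable minimum interval has not yet passed.
class DfsUsedSaverSketch {
  private long lastSaved = -1;
  private long lastSaveTimeMs = 0;
  private final long minSaveIntervalMs;

  DfsUsedSaverSketch(long minSaveIntervalMs) {
    this.minSaveIntervalMs = minSaveIntervalMs;
  }

  void onRefresh(long dfsUsed) {
    long now = System.currentTimeMillis();
    // Skip the disk write if nothing changed or we persisted too recently.
    if (dfsUsed == lastSaved || now - lastSaveTimeMs < minSaveIntervalMs) {
      return;
    }
    saveToCacheFile(dfsUsed); // value reused on the next DataNode restart
    lastSaved = dfsUsed;
    lastSaveTimeMs = now;
  }

  private void saveToCacheFile(long dfsUsed) {
    // write "dfsUsed timestamp" to the per-volume cache file
  }
}
{code}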

> Add a thread to call saveDfsUsed periodically, to prevent datanode too long 
> restart time.  
> ---
>
> Key: HDFS-15171
> URL: https://issues.apache.org/jira/browse/HDFS-15171
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> There are 30 storage dirs per DataNode in our production cluster, so a 
> restart takes too long, because sometimes the DataNode did not shut down 
> gracefully. Currently only the DataNode's graceful shutdown hook and the 
> BlockPoolSlice shutdown invoke the saveDfsUsed function, so a restarted 
> DataNode sometimes cannot reuse the dfsUsed cache. I think we could add a 
> thread to call the saveDfsUsed function periodically.
>  






[jira] [Commented] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-20 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041510#comment-17041510
 ] 

zhuqi commented on HDFS-15177:
--

cc [~sodonnell] 

Thanks for your patient reply. I will change it to a fair lock.

 

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png, 
> image-2020-02-18-22-55-38-661.png
>
>
> In our cluster, the DataNode receives delete commands covering too many 
> blocks when many block pools share the same DataNode and each DataNode has 
> about 30 storage dirs; this causes the FsDatasetImpl lock to be held for 
> too long.
>  






[jira] [Commented] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-20 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040816#comment-17040816
 ] 

zhuqi commented on HDFS-15177:
--

Hi [~sodonnell]

Thanks for your reply.

I will monitor for the FoldedTreeSet problem, such as HDFS-15131.
You said that on the 3.x branch the locking in the DN has been changed to a 
fair lock for some time now, but I find that AutoCloseableLock uses 
ReentrantLock, which defaults to NonfairSync. When will the DN use the fair 
lock? (A small illustration follows.)
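
For reference, the fairness choice is made in the ReentrantLock constructor; a 
tiny demo:

{code:java}
import java.util.concurrent.locks.ReentrantLock;

// ReentrantLock is non-fair by default; passing true selects the fair
// (FIFO) sync so long-waiting threads are not starved by barging ones.
public class FairLockDemo {
  public static void main(String[] args) {
    ReentrantLock unfair = new ReentrantLock();   // default NonfairSync
    ReentrantLock fair = new ReentrantLock(true); // FairSync
    System.out.println(unfair.isFair() + " " + fair.isFair()); // false true
  }
}
{code}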



 

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png, 
> image-2020-02-18-22-55-38-661.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-02-18 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15180:
-
Component/s: datanode

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> Now the FsDatasetImpl datasetLock is heavy, when their are many namespaces in 
> big cluster. If we can split the FsDatasetImpl datasetLock via blockpool. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-18 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039723#comment-17039723
 ] 

zhuqi commented on HDFS-15177:
--

Hi [~sodonnell]

Thanks for your reply.

Next, I will switch our cluster to the fair lock that has been adopted on the 
3.x branch, and see whether the blocked-thread problem improves.

I agree with you that FoldedTreeSet should be improved to get better 
performance, and I will capture the namenode stack the next time the datanode 
becomes slow, to see whether the FoldedTreeSet problem occurs.

I am glad to see 
[HDFS-15150|https://issues.apache.org/jira/browse/HDFS-15150] and 
[HDFS-15160|https://issues.apache.org/jira/browse/HDFS-15160]; they are good 
news for the concurrency and throughput of the lock, and a good starting 
point for a lock-per-block-pool proposal.

 

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png, 
> image-2020-02-18-22-55-38-661.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-02-18 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039167#comment-17039167
 ] 

zhuqi commented on HDFS-15180:
--

cc [~sodonnell] ,  [~linyiqun], [~weichiu] , [~hexiaoqiao] 

 What do you think about it? Can you give some advice? See the sketch below.
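
A minimal sketch of the per-block-pool locking idea; BlockPoolLockManager is 
purely illustrative, not the real FsDatasetImpl structure:

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class BlockPoolLockManager {
  private final ConcurrentHashMap<String, ReentrantLock> locks =
      new ConcurrentHashMap<>();

  /** One lock per block pool id, so namespaces stop contending. */
  private ReentrantLock lockFor(String bpid) {
    return locks.computeIfAbsent(bpid, k -> new ReentrantLock(true));
  }

  public void withLock(String bpid, Runnable work) {
    ReentrantLock lock = lockFor(bpid);
    lock.lock();
    try {
      work.run();
    } finally {
      lock.unlock();
    }
  }
}
{code}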

Thanks.

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> Now the FsDatasetImpl datasetLock is heavy, when their are many namespaces in 
> big cluster. If we can split the FsDatasetImpl datasetLock via blockpool. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-02-18 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15180:
-
Description: Now the FsDatasetImpl datasetLock is heavy, when their are 
many namespaces in big cluster. If we can split the FsDatasetImpl datasetLock 
via blockpool.   (was: Now the FsDatasetImpl datasetLock is heavy, when their 
are many namespaces in big cluster, we can split the FsDatasetImpl datasetLock 
via blockpool. )

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> Now the FsDatasetImpl datasetLock is heavy, when their are many namespaces in 
> big cluster. If we can split the FsDatasetImpl datasetLock via blockpool. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-18 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039165#comment-17039165
 ] 

zhuqi commented on HDFS-15177:
--

cc [~sodonnell] ,  

Not the trunk version; my version is hadoop2.6.0-cdh5.16.1. When the pending 
deletions we see in monitoring boom, the synchronized FsDatasetImpl 
contention problem becomes more obvious.

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png, 
> image-2020-02-18-22-55-38-661.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-02-18 Thread zhuqi (Jira)
zhuqi created HDFS-15180:


 Summary:  DataNode FsDatasetImpl Fine-Grained Locking via 
BlockPool.
 Key: HDFS-15180
 URL: https://issues.apache.org/jira/browse/HDFS-15180
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: zhuqi
Assignee: zhuqi


Now the FsDatasetImpl datasetLock is heavy, when their are many namespaces in 
big cluster, we can split the FsDatasetImpl datasetLock via blockpool. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-18 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039146#comment-17039146
 ] 

zhuqi edited comment on HDFS-15177 at 2/18/20 2:56 PM:
---

cc [~sodonnell] , 

Thanks for your reply. Our cluster's deletion load is very heavy, with many 
namespaces and about 30 storage dirs on one datanode. All namespaces call the 
same FsDatasetImpl functions.

I am on the 2.x branch, cdh5.16.1. I mean that BPOfferService receives too 
many deletion blocks waiting to be deleted, which puts too many items in the 
Block invalidBlks[] array, so the synchronized FsDatasetImpl work, including 
the volumeMap removal, iterates many times and even blocks the heartbeat, 
which also takes the synchronized FsDatasetImpl lock. When we removed the 
heartbeat's use of the synchronized FsDatasetImpl, the heartbeat recovered to 
normal, consistent with HDFS-7060, but heavy deletion still blocks the 
synchronized FsDatasetImpl at times, which affects every other action that 
synchronizes on FsDatasetImpl.

The blocked stack example is:

!image-2020-02-18-22-39-00-642.png|width=843,height=115!

It also affects reads and writes:
!image-2020-02-18-22-55-38-661.png|width=960,height=122!

!image-2020-02-18-22-51-28-624.png|width=891,height=120!

!image-2020-02-18-22-52-59-202.png|width=996,height=120!
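
For what it is worth, a minimal sketch of the split-deletion idea under 
discussion, assuming only the in-memory volumeMap removal needs the dataset 
lock; FsDatasetView and removeFromVolumeMap are illustrative stand-ins, not 
the real FsDatasetImpl API:

{code:java}
import java.util.List;

public class BatchedInvalidator {
  /** Illustrative stand-in for the dataset operation the sketch needs. */
  public interface FsDatasetView {
    void removeFromVolumeMap(String bpid, long blockId);
  }

  private static final int BATCH_SIZE = 1000; // tune per cluster

  public void invalidate(String bpid, List<Long> blockIds,
      FsDatasetView dataset) {
    for (int from = 0; from < blockIds.size(); from += BATCH_SIZE) {
      int to = Math.min(from + BATCH_SIZE, blockIds.size());
      synchronized (dataset) {
        for (long id : blockIds.subList(from, to)) {
          dataset.removeFromVolumeMap(bpid, id);
        }
      }
      // lock released here, so heartbeats, reads and writes can
      // interleave before the next batch takes it again
    }
  }
}
{code}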


was (Author: zhuqi):
cc [~sodonnell] , 

Thanks for your reply. Our cluster' deletion is so heavy with many namespaces 
and with about 30 storage dirs on one datanode. All namespace will call the 
same FsDatasetImpl function.

I am on 2.x branch, cdh5.16.1 version, I mean BPOfferService receive too many 
deletion blocks waiting to be delete. And cause too many items in Block 
invalidBlks[], so that the synchronized FsDatasetImpl including the volumeMap 
removed will iterator so many times, even blocked the heartbeat which uses the  
synchronized FsDatasetImpl and when we change to remove the heartbeat 
synchronized FsDatasetImpl the heartbeat recover normal according to 
[HDFS-7060|https://issues.apache.org/jira/browse/HDFS-7060] , but heavy 
deletion will still block the synchronized FsDatasetImpl some times which will 
affect other action about synchronized FsDatasetImpl.

The blocked stack example is :

!image-2020-02-18-22-39-00-642.png|width=843,height=115!

Also affect the read and write.

!image-2020-02-18-22-51-28-624.png|width=891,height=120!

!image-2020-02-18-22-52-59-202.png|width=996,height=120!

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png, 
> image-2020-02-18-22-55-38-661.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-18 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039146#comment-17039146
 ] 

zhuqi commented on HDFS-15177:
--

cc [~sodonnell] , 

Thanks for your reply. Our cluster' deletion is so heavy with many namespaces 
and with about 30 storage dirs on one datanode. All namespace will call the 
same FsDatasetImpl function.

I am on 2.x branch, cdh5.16.1 version, I mean BPOfferService receive too many 
deletion blocks waiting to be delete. And cause too many items in Block 
invalidBlks[], so that the synchronized FsDatasetImpl including the volumeMap 
removed will iterator so many times, even blocked the heartbeat which uses the  
synchronized FsDatasetImpl and when we change to remove the heartbeat 
synchronized FsDatasetImpl the heartbeat recover normal according to 
[HDFS-7060|https://issues.apache.org/jira/browse/HDFS-7060] , but heavy 
deletion will still block the synchronized FsDatasetImpl some times which will 
affect other action about synchronized FsDatasetImpl.

The blocked stack example is :

!image-2020-02-18-22-39-00-642.png|width=843,height=115!

Also affect the read and write.

!image-2020-02-18-22-51-28-624.png|width=891,height=120!

!image-2020-02-18-22-52-59-202.png|width=996,height=120!

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-18 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15177:
-
Attachment: image-2020-02-18-22-52-59-202.png

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-18 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15177:
-
Attachment: image-2020-02-18-22-51-28-624.png

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-18 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15177:
-
Attachment: image-2020-02-18-22-39-00-642.png

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-02-18-22-39-00-642.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-18 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15177:
-
Component/s: datanode

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.

2020-02-17 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15177:
-
Summary: Split datanode invalide block deletion, to avoid the FsDatasetImpl 
lock too much time.  (was: Split datanode invalide block deletion, to avoid the 
FsDatasetImpl lock too many time.)

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too many time.

2020-02-17 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15177:
-
Description: 
In our cluster, the datanode receive the delete command with too many blocks 
deletion when we have many blockpools sharing the same datanode and the 
datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
much time.

 

  was:
In our cluster, the datanode receive the delete command with too many blocks 
deletion when we have many blockpools sharing the same datanode, it will cause 
the FsDatasetImpl lock too much time.

 


> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> many time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode and the 
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too 
> much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too many time.

2020-02-17 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15177:
-
Description: 
In our cluster, the datanode receive the delete command with too many blocks 
deletion when we have many blockpools sharing the same datanode, it will cause 
the FsDatasetImpl lock too much time.

 

  was:
In our cluster , the datanode receive the delete command with too many blocks 
deletion when we have many blockpools sharing the same datanode, it will cause 
the FsDatasetImpl lock too much time.

 


> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> many time.
> --
>
> Key: HDFS-15177
> URL: https://issues.apache.org/jira/browse/HDFS-15177
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> In our cluster, the datanode receive the delete command with too many blocks 
> deletion when we have many blockpools sharing the same datanode, it will 
> cause the FsDatasetImpl lock too much time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too many time.

2020-02-17 Thread zhuqi (Jira)
zhuqi created HDFS-15177:


 Summary: Split datanode invalide block deletion, to avoid the 
FsDatasetImpl lock too many time.
 Key: HDFS-15177
 URL: https://issues.apache.org/jira/browse/HDFS-15177
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: zhuqi
Assignee: zhuqi


In our cluster , the datanode receive the delete command with too many blocks 
deletion when we have many blockpools sharing the same datanode, it will cause 
the FsDatasetImpl lock too much time.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15171) Add a thread to call saveDfsUsed periodically, to prevent datanode too long restart time.

2020-02-16 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038090#comment-17038090
 ] 

zhuqi commented on HDFS-15171:
--

cc [~linyiqun], [~weichiu] , [~hexiaoqiao] 
What do you think about this problem? Any advice would be appreciated.

Thanks.

> Add a thread to call saveDfsUsed periodically, to prevent datanode too long 
> restart time.  
> ---
>
> Key: HDFS-15171
> URL: https://issues.apache.org/jira/browse/HDFS-15171
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> There are 30 storage dirs per datanode in our production cluster , it will 
> take too many time to restart, because sometimes the datanode didn't shutdown 
> gracefully. Now only the datanode graceful shut down hook and the 
> blockpoolslice shutdown will cause the saveDfsUsed function, that cause the 
> restart of datanode can't reuse the dfsuse cache sometimes. I think if we can 
> add a thread to periodically call the saveDfsUsed function.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15171) Add a thread to call saveDfsUsed periodically, to prevent datanode too long restart time.

2020-02-15 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15171:
-
Description: 
There are 30 storage dirs per datanode in our production cluster , it will take 
too many time to restart, because sometimes the datanode didn't shutdown 
gracefully. Now only the datanode graceful shut down hook and the 
blockpoolslice shutdown will cause the saveDfsUsed function, that cause the 
restart of datanode can't reuse the dfsuse cache sometimes. I think if we can 
add a thread to periodically call the saveDfsUsed function.

 

  was:
There are 30 storage dirs in our production cluster , it will take too many 
time to restart, because sometimes the datanode didn't shutdown gracefully. Now 
only the datanode graceful shut down hook and the blockpoolslice shutdown will 
cause the saveDfsUsed function, that cause the restart of datanode can't reuse 
the dfsuse cache sometimes. I think if we can add a thread to periodically call 
the saveDfsUsed function.

 


> Add a thread to call saveDfsUsed periodically, to prevent datanode too long 
> restart time.  
> ---
>
> Key: HDFS-15171
> URL: https://issues.apache.org/jira/browse/HDFS-15171
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> There are 30 storage dirs per datanode in our production cluster , it will 
> take too many time to restart, because sometimes the datanode didn't shutdown 
> gracefully. Now only the datanode graceful shut down hook and the 
> blockpoolslice shutdown will cause the saveDfsUsed function, that cause the 
> restart of datanode can't reuse the dfsuse cache sometimes. I think if we can 
> add a thread to periodically call the saveDfsUsed function.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15171) Add a thread to call saveDfsUsed periodically, to prevent datanode too long restart time.

2020-02-15 Thread zhuqi (Jira)
zhuqi created HDFS-15171:


 Summary: Add a thread to call saveDfsUsed periodically, to prevent 
datanode too long restart time.  
 Key: HDFS-15171
 URL: https://issues.apache.org/jira/browse/HDFS-15171
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.2.0
Reporter: zhuqi
Assignee: zhuqi


There are 30 storage dirs in our production cluster , it will take too many 
time to restart, because sometimes the datanode didn't shutdown gracefully. Now 
only the datanode graceful shut down hook and the blockpoolslice shutdown will 
cause the saveDfsUsed function, that cause the restart of datanode can't reuse 
the dfsuse cache sometimes. I think if we can add a thread to periodically call 
the saveDfsUsed function.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15083) Add new trash rpc which move the trash (mkdir and the rename) operation to the server side.

2020-01-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012433#comment-17012433
 ] 

zhuqi commented on HDFS-15083:
--

cc [~weichiu]

Thanks for your comment, and sorry for the rough draft patch; I think cloud 
storage should be supported as well.

I only changed TrashPolicyDefault in order to quickly support server-side 
trash for DistributedFileSystem, which our cluster needs, and for our Router 
trash need in HDFS-14117. I think server-side trash is more graceful than the 
HDFS-14117 approach, and it can also cut the trash RPC load to about 50%, 
which matters because our HDFS lifetime system's trash actions put a heavy 
load on the NameNode.

Do you have any advice on pushing the graceful trash forward and reducing the 
trash RPC load?
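
For discussion, a minimal sketch of the single-RPC shape; moveToTrash is a 
hypothetical server-side operation, not an existing ClientProtocol method:

{code:java}
import java.io.IOException;

/**
 * Hypothetical server-side trash operation: the NameNode creates the
 * trash checkpoint directory if needed and then renames src under it,
 * folding the client's mkdir + rename pair into a single RPC.
 */
public interface TrashProtocolSketch {
  boolean moveToTrash(String src) throws IOException;
}
{code}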

 

> Add new trash rpc which move the trash (mkdir and the rename) operation to 
> the server side.
> ---
>
> Key: HDFS-15083
> URL: https://issues.apache.org/jira/browse/HDFS-15083
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, rbf
>Affects Versions: 2.10.0, 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15083.001.patch
>
>
> Now the rbf trash with multi cluster mounted  in 
> [HDFS-14117|https://issues.apache.org/jira/browse/HDFS-14117] , the solution 
> is not graceful。
> If we can move the client side trash (mkdir and rename) to the  server side, 
> we can not only solve the problem gracefully, but also reduce the trash rpc 
> load in server side to about %50 compare to the origin trash which call two 
> times rpc(mkdir and rename).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15109) RBF: Plugin interface to enable delegation of Router

2020-01-09 Thread zhuqi (Jira)
zhuqi created HDFS-15109:


 Summary: RBF: Plugin interface to enable delegation of Router 
 Key: HDFS-15109
 URL: https://issues.apache.org/jira/browse/HDFS-15109
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: zhuqi


If we can support a plugin interface on the router side, we may be able to 
implement permission control and other important needs there, with that 
control independent of the NameNode side's default control.
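
A minimal sketch of what such a hook could look like; the interface and its 
method are purely hypothetical:

{code:java}
/**
 * Hypothetical router-side plugin hook: implementations decide whether a
 * user may perform an operation on a path, independently of the NameNode
 * side's default permission checks.
 */
public interface RouterPermissionPlugin {
  boolean checkPermission(String user, String path, String operation);
}
{code}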



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15083) Add new trash rpc which move the trash (mkdir and the rename) operation to the server side.

2020-01-05 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008334#comment-17008334
 ] 

zhuqi commented on HDFS-15083:
--

Thanks [~hexiaoqiao] for your reply; cc [~ayushtkn], [~ramkumar]:

I only wired up the DistributedFileSystem and the RawFileSystem for the draft 
function our cluster needs, and there are some backward-compatibility issues 
we should solve.

We can discuss it and push it forward, and then I can separate out the RBF 
trash when I am free.

 

> Add new trash rpc which move the trash (mkdir and the rename) operation to 
> the server side.
> ---
>
> Key: HDFS-15083
> URL: https://issues.apache.org/jira/browse/HDFS-15083
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, rbf
>Affects Versions: 2.10.0, 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15083.001.patch
>
>
> Now the rbf trash with multi cluster mounted  in 
> [HDFS-14117|https://issues.apache.org/jira/browse/HDFS-14117] , the solution 
> is not graceful。
> If we can move the client side trash (mkdir and rename) to the  server side, 
> we can not only solve the problem gracefully, but also reduce the trash rpc 
> load in server side to about %50 compare to the origin trash which call two 
> times rpc(mkdir and rename).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15083) Add new trash rpc which move the trash (mkdir and the rename) operation to the server side.

2020-01-04 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008072#comment-17008072
 ] 

zhuqi commented on HDFS-15083:
--

cc [~elgoiri] , [~weichiu] , [~hexiaoqiao] :


I have now pushed a draft patch, without checkstyle fixes, which I have used 
in our cluster. It can be used for the graceful trash function and reduces 
the trash NameNode RPC load to about 50%. I also include the router-side 
draft code.

Sorry for the rough code; I can improve it when I am free.

> Add new trash rpc which move the trash (mkdir and the rename) operation to 
> the server side.
> ---
>
> Key: HDFS-15083
> URL: https://issues.apache.org/jira/browse/HDFS-15083
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, rbf
>Affects Versions: 2.10.0, 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15083.001.patch
>
>
> Now the rbf trash with multi cluster mounted  in 
> [HDFS-14117|https://issues.apache.org/jira/browse/HDFS-14117] , the solution 
> is not graceful。
> If we can move the client side trash (mkdir and rename) to the  server side, 
> we can not only solve the problem gracefully, but also reduce the trash rpc 
> load in server side to about %50 compare to the origin trash which call two 
> times rpc(mkdir and rename).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15083) Add new trash rpc which move the trash (mkdir and the rename) operation to the server side.

2020-01-04 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15083:
-
Attachment: HDFS-15083.001.patch
Status: Patch Available  (was: Open)

> Add new trash rpc which move the trash (mkdir and the rename) operation to 
> the server side.
> ---
>
> Key: HDFS-15083
> URL: https://issues.apache.org/jira/browse/HDFS-15083
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, rbf
>Affects Versions: 3.2.0, 2.10.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15083.001.patch
>
>
> Now the rbf trash with multi cluster mounted  in 
> [HDFS-14117|https://issues.apache.org/jira/browse/HDFS-14117] , the solution 
> is not graceful。
> If we can move the client side trash (mkdir and rename) to the  server side, 
> we can not only solve the problem gracefully, but also reduce the trash rpc 
> load in server side to about %50 compare to the origin trash which call two 
> times rpc(mkdir and rename).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15083) Add new trash rpc which move the trash (mkdir and the rename) operation to the server side.

2019-12-29 Thread zhuqi (Jira)
zhuqi created HDFS-15083:


 Summary: Add new trash rpc which move the trash (mkdir and the 
rename) operation to the server side.
 Key: HDFS-15083
 URL: https://issues.apache.org/jira/browse/HDFS-15083
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: dfsclient, namenode, rbf
Affects Versions: 3.2.0, 2.10.0
Reporter: zhuqi
Assignee: zhuqi


Now the rbf trash with multi cluster mounted  in 
[HDFS-14117|https://issues.apache.org/jira/browse/HDFS-14117] , the solution is 
not graceful。
If we can move the client side trash (mkdir and rename) to the  server side, we 
can not only solve the problem gracefully, but also reduce the trash rpc load 
in server side to about %50 compare to the origin trash which call two times 
rpc(mkdir and rename).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-12 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
Comment: was deleted

(was: Thanks for [~hexiaoqiao]  to help to cc [~weichiu].

Now i am the Hadoop YARN Contributor, could you help me to add to Hadoop HDFS 
Contributor.

It's my honor to contribute to Hadoop HDFS.)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch
>
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-10 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
  Attachment: HDFS-15041.002.patch
Release Note: fix checkstyle
  Status: Patch Available  (was: Open)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch, HDFS-15041.002.patch
>
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-10 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
Status: Open  (was: Patch Available)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch
>
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992261#comment-16992261
 ] 

zhuqi commented on HDFS-15041:
--

Thanks for [~hexiaoqiao]  to help to cc [~weichiu].

Now i am the Hadoop YARN Contributor, could you help me to add to Hadoop HDFS 
Contributor.

It's my honor to contribute to Hadoop HDFS.

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch
>
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992240#comment-16992240
 ] 

zhuqi commented on HDFS-15041:
--

Hi [~hexiaoqiao] 

One of our clusters has very many delete and write operations because of our 
new Hive lifetime system with too many partitions. In some cases the 4ms 
limit is too short to cope, so I want to make it configurable. Also, some 
Presto-based real-time workloads that do not use read-from-standby may want a 
shorter max lock hold time, in order to improve read performance.
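
For illustration, a minimal sketch of reading the threshold from the 
configuration; the key name here is made up for the example, not the one the 
patch adds:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class LockHoldSettings {
  // illustrative key name only
  static final String MAX_LOCK_HOLD_KEY =
      "dfs.namenode.write-lock.max-hold.ms";
  static final long MAX_LOCK_HOLD_DEFAULT = 4; // today's fixed value

  private final long maxLockHoldMs;

  public LockHoldSettings(Configuration conf) {
    // falls back to the current hardcoded 4ms when the key is unset
    this.maxLockHoldMs =
        conf.getLong(MAX_LOCK_HOLD_KEY, MAX_LOCK_HOLD_DEFAULT);
  }

  public long getMaxLockHoldMs() {
    return maxLockHoldMs;
  }
}
{code}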

Thanks.

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch
>
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992124#comment-16992124
 ] 

zhuqi edited comment on HDFS-15041 at 12/10/19 5:48 AM:


Hi [~weichiu]

Yeah, I mean making MAX_LOCK_HOLD_MS configurable after my change in 
HDFS-14553, because clusters have different needs for client latency and for 
balancing pressure on the RPC queue. Sorry for the mistake: I did not mean 
the HDFS balancer; the balancer pressure I have already moved to the standby.


was (Author: zhuqi):
Hi [~weichiu]

Yeah, i mean to make MAX_LOCK_HOLD_MS configurable after my change in 
HDFS-14553, because of the different need for client latency and balance the 
pressure for rpc queue. Sorry for the mistake , not mean the hdfs balancer, the 
 balancer pressure i have changed to standby. Also we can add this queue size 
to metrics if needed?

 

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch
>
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15041:
-
Attachment: HDFS-15041.001.patch
Status: Patch Available  (was: Open)

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
> Attachments: HDFS-15041.001.patch
>
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992124#comment-16992124
 ] 

zhuqi edited comment on HDFS-15041 at 12/10/19 2:45 AM:


Hi [~weichiu]

Yeah, i mean to make MAX_LOCK_HOLD_MS configurable after my change in 
HDFS-14553, because of the different need for client latency and balance the 
pressure for rpc queue. Sorry for the mistake , not mean the hdfs balancer, the 
 balancer pressure i have changed to standby. Also we can add this queue size 
to metrics if needed?

 


was (Author: zhuqi):
Hi [~weichiu]

Yeah, i mean to make MAX_LOCK_HOLD_MS configurable after my change in 
HDFS-14553, because of the different need for client latency and balance the 
pressure for rpc queue. The balancer pressure i have changed to standby. Also 
we can add this queue size to metrics if needed.

 

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992124#comment-16992124
 ] 

zhuqi edited comment on HDFS-15041 at 12/10/19 2:26 AM:


Hi [~weichiu]

Yeah, i mean to make MAX_LOCK_HOLD_MS configurable after my change in 
HDFS-14553, because of the different need for client latency and balance the 
pressure for rpc queue. The balancer pressure i have changed to standby. Also 
we can add this queue size to metrics if needed.

 


was (Author: zhuqi):
Hi [~weichiu]

Yeah, i mean to make MAX_LOCK_HOLD_MS configurable after your change in 
HDFS-14553, because of the different need for client latency and balance the 
pressure for rpc queue. The balancer pressure i have changed to standby. 

 

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992124#comment-16992124
 ] 

zhuqi commented on HDFS-15041:
--

Hi [~weichiu]

Yeah, i mean to make MAX_LOCK_HOLD_MS configurable after your change in 
HDFS-14553, because of the different need for client latency and balance the 
pressure for rpc queue. The balancer pressure i have changed to standby. 

 

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991729#comment-16991729
 ] 

zhuqi commented on HDFS-15041:
--

cc [~daryn]  ,  [~weichiu]
Our cluster wants to change these in order to strike a better balance 
between latency and RPC queue growth. What do you think? May I have the 
access needed to assign this to myself?

Thanks.

> Make MAX_LOCK_HOLD_MS and full queue size configurable
> --
>
> Key: HDFS-15041
> URL: https://issues.apache.org/jira/browse/HDFS-15041
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Priority: Major
>
> Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
> cluster have different need for the latency and the queue health standard. 
> We'd better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15041) Make MAX_LOCK_HOLD_MS and full queue size configurable

2019-12-09 Thread zhuqi (Jira)
zhuqi created HDFS-15041:


 Summary: Make MAX_LOCK_HOLD_MS and full queue size configurable
 Key: HDFS-15041
 URL: https://issues.apache.org/jira/browse/HDFS-15041
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 3.2.0
Reporter: zhuqi


Now the MAX_LOCK_HOLD_MS and the full queue size are fixed. But different 
cluster have different need for the latency and the queue health standard. We'd 
better to make the two parameter configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14944) ec admin such as : -enablePolicy should support multi federation namespace not only the default namespace in core-site.xml

2019-10-31 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963747#comment-16963747
 ] 

zhuqi commented on HDFS-14944:
--

Hi [~ayushtkn] 
We can add a -fs option to support specifying the target namespace 
dynamically, as in the sketch below.
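
A minimal sketch of resolving the target namespace explicitly instead of 
relying on fs.defaultFS; the namespace URI and policy name are placeholders:

{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class EnablePolicyOnNamespace {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // the Java equivalent of `hdfs ec -fs hdfs://ns2 -enablePolicy ...`
    DistributedFileSystem dfs = (DistributedFileSystem)
        FileSystem.get(URI.create("hdfs://ns2"), conf);
    dfs.enableErasureCodingPolicy("RS-6-3-1024k");
  }
}
{code}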

> ec admin such as : -enablePolicy should support multi federation namespace 
> not only the default namespace in core-site.xml
> --
>
> Key: HDFS-14944
> URL: https://issues.apache.org/jira/browse/HDFS-14944
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 3.1.0, 3.2.0
>Reporter: zhuqi
>Priority: Major
>
> when we use the ec -enablePolicy, we only can enable the defaultFs namespace, 
> we should improve to support more namespace in our federation environment. We 
> can move the ecadmin to support multi namespace.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14944) ec admin such as : -enablePolicy should support multi federation namespace not only the default namespace in core-site.xml

2019-10-31 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-14944:
-
Affects Version/s: 3.1.0

> ec admin such as : -enablePolicy should support multi federation namespace 
> not only the default namespace in core-site.xml
> --
>
> Key: HDFS-14944
> URL: https://issues.apache.org/jira/browse/HDFS-14944
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 3.1.0, 3.2.0
>Reporter: zhuqi
>Priority: Major
>
> when we use the ec -enablePolicy, we only can enable the defaultFs namespace, 
> we should improve to support more namespace in our federation environment. We 
> can move the ecadmin to support multi namespace.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14944) ec admin such as : -enablePolicy should support multi federation namespace not only the default namespace in core-site.xml

2019-10-31 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-14944:
-
Description: when we use the ec -enablePolicy, we only can enable the 
defaultFs namespace, we should improve to support more namespace in our 
federation environment. We can move the ecadmin to support multi namespace.  
(was: when we use the ec -enablePolicy, we only can enable the defaultFs 
namespace, we should improve to support more namespace in our federation 
environment.)

> ec admin such as : -enablePolicy should support multi federation namespace 
> not only the default namespace in core-site.xml
> --
>
> Key: HDFS-14944
> URL: https://issues.apache.org/jira/browse/HDFS-14944
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 3.2.0
>Reporter: zhuqi
>Priority: Major
>
> when we use the ec -enablePolicy, we only can enable the defaultFs namespace, 
> we should improve to support more namespace in our federation environment. We 
> can move the ecadmin to support multi namespace.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14944) ec admin such as : -enablePolicy should support multi federation namespace not only the default namespace in core-site.xml

2019-10-31 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-14944:
-
Description: when we use the ec -enablePolicy, we only can enable the 
defaultFs namespace, we should improve to support more namespace in our 
federation environment.

> ec admin such as : -enablePolicy should support multi federation namespace 
> not only the default namespace in core-site.xml
> --
>
> Key: HDFS-14944
> URL: https://issues.apache.org/jira/browse/HDFS-14944
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 3.2.0
>Reporter: zhuqi
>Priority: Major
>
> when we use the ec -enablePolicy, we only can enable the defaultFs namespace, 
> we should improve to support more namespace in our federation environment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14944) ec admin such as : -enablePolicy should support multi federation namespace not only the default namespace in core-site.xml

2019-10-31 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-14944:
-
Summary: ec admin such as : -enablePolicy should support multi federation 
namespace not only the default namespace in core-site.xml  (was: ec 
-enablePolicy should support multi federation namespace not only the default 
namespace in core-site.xml)

> ec admin such as : -enablePolicy should support multi federation namespace 
> not only the default namespace in core-site.xml
> --
>
> Key: HDFS-14944
> URL: https://issues.apache.org/jira/browse/HDFS-14944
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 3.2.0
>Reporter: zhuqi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14944) ec -enablePolicy should support multi federation namespace not only the default namespace in core-site.xml

2019-10-31 Thread zhuqi (Jira)
zhuqi created HDFS-14944:


 Summary: ec -enablePolicy should support multi federation 
namespace not only the default namespace in core-site.xml
 Key: HDFS-14944
 URL: https://issues.apache.org/jira/browse/HDFS-14944
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.2.0, 3.0.0
Reporter: zhuqi






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org