[jira] [Resolved] (HDFS-15975) Use LongAdder instead of AtomicLong

2021-04-17 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15975.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Use LongAdder instead of AtomicLong
> ---
>
> Key: HDFS-15975
> URL: https://issues.apache.org/jira/browse/HDFS-15975
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> When maintaining some counters, we can use LongAdder instead of AtomicLong to 
> improve performance under contention. The value returned by LongAdder is not an 
> atomic snapshot, but I think we can tolerate that.
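For context, a minimal self-contained sketch (not part of the patch) of the trade-off between the two classes: LongAdder stripes increments across internal cells, so it scales better under write contention, while sum() is only a best-effort, non-atomic snapshot.

{code:java}
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class CounterExample {
    // AtomicLong: every increment CASes one shared value; get() is an atomic snapshot.
    private final AtomicLong atomicCount = new AtomicLong();
    // LongAdder: increments are striped across cells, so it is faster under contention,
    // but sum() may miss concurrent updates (not an atomic snapshot).
    private final LongAdder adderCount = new LongAdder();

    public void record() {
        atomicCount.incrementAndGet();
        adderCount.increment();
    }

    public long atomicValue() { return atomicCount.get(); }
    public long adderValue()  { return adderCount.sum(); }
}
{code}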



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15898) Test case TestOfflineImageViewer fails

2021-04-15 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17322615#comment-17322615
 ] 

Takanobu Asanuma commented on HDFS-15898:
-

Cherry-picked to branch-3.3.

> Test case TestOfflineImageViewer fails
> --
>
> Key: HDFS-15898
> URL: https://issues.apache.org/jira/browse/HDFS-15898
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The following 3 cases failed locally
> TestOfflineImageViewer#testWriterOutputEntryBuilderForFile
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/file,5,2000-01-01 00:00,2000-01-01 00:00,1024,3,3072,0,0,-rwx-wx-w-+,user_1,group_1
> Actual   :/path/file,5,2000-01-01 08:00,2000-01-01 08:00,1024,3,3072,0,0,-rwx-wx-w-+,user_1,group_1
> at org.junit.Assert.assertEquals(Assert.java:115)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForFile(TestOfflineImageViewer.java:760){code}
> TestOfflineImageViewer#testWriterOutputEntryBuilderForDirectory
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/dir,0,2000-01-01 00:00,1970-01-01 00:00,0,0,0,700,1000,drwx-wx-w-+,user_1,group_1
> Actual   :/path/dir,0,2000-01-01 08:00,1970-01-01 08:00,0,0,0,700,1000,drwx-wx-w-+,user_1,group_1
> at org.junit.Assert.assertEquals(Assert.java:115)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForDirectory(TestOfflineImageViewer.java:768){code}
> TestOfflineImageViewer#testWriterOutputEntryBuilderForSymlink
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/sym,0,2000-01-01 00:00,2000-01-01 00:00,0,0,0,0,0,-rwx-wx-w-,user_1,group_1
> Actual   :/path/sym,0,2000-01-01 08:00,2000-01-01 08:00,0,0,0,0,0,-rwx-wx-w-,user_1,group_1
> at org.junit.Assert.assertEquals(Assert.java:115)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForSymlink(TestOfflineImageViewer.java:776){code}
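The expected and actual timestamps above differ by exactly eight hours, which suggests the test formats times in the JVM's default timezone (for example UTC+8) instead of a fixed zone. Below is a minimal, self-contained illustration of that dependency and of pinning the formatter to UTC; it is only a sketch of the failure mode, not the actual fix in TestOfflineImageViewer.

{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimestampFormatting {
    public static void main(String[] args) {
        long epochMillis = 946684800000L; // 2000-01-01 00:00 UTC

        // Formats in the JVM's default timezone: on a UTC+8 machine this prints 2000-01-01 08:00.
        SimpleDateFormat local = new SimpleDateFormat("yyyy-MM-dd HH:mm");
        System.out.println("default zone: " + local.format(new Date(epochMillis)));

        // Pinning the timezone makes the output deterministic wherever the test runs.
        SimpleDateFormat utc = new SimpleDateFormat("yyyy-MM-dd HH:mm");
        utc.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println("UTC:          " + utc.format(new Date(epochMillis)));
    }
}
{code}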



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15898) Test case TestOfflineImageViewer fails

2021-04-15 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15898:

Fix Version/s: 3.3.1

> Test case TestOfflineImageViewer fails
> --
>
> Key: HDFS-15898
> URL: https://issues.apache.org/jira/browse/HDFS-15898
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The following 3 cases failed locally
> TestOfflineImageViewer#testWriterOutputEntryBuilderForFile
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/file,5,2000-01-01 00:00,2000-01-01 00:00,1024,3,3072,0,0,-rwx-wx-w-+,user_1,group_1
> Actual   :/path/file,5,2000-01-01 08:00,2000-01-01 08:00,1024,3,3072,0,0,-rwx-wx-w-+,user_1,group_1
> at org.junit.Assert.assertEquals(Assert.java:115)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForFile(TestOfflineImageViewer.java:760){code}
> TestOfflineImageViewer#testWriterOutputEntryBuilderForDirectory
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/dir,0,2000-01-01 00:00,1970-01-01 00:00,0,0,0,700,1000,drwx-wx-w-+,user_1,group_1
> Actual   :/path/dir,0,2000-01-01 08:00,1970-01-01 08:00,0,0,0,700,1000,drwx-wx-w-+,user_1,group_1
> at org.junit.Assert.assertEquals(Assert.java:115)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForDirectory(TestOfflineImageViewer.java:768){code}
> TestOfflineImageViewer#testWriterOutputEntryBuilderForSymlink
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/sym,0,2000-01-01 00:00,2000-01-01 00:00,0,0,0,0,0,-rwx-wx-w-,user_1,group_1
> Actual   :/path/sym,0,2000-01-01 08:00,2000-01-01 08:00,0,0,0,0,0,-rwx-wx-w-,user_1,group_1
> at org.junit.Assert.assertEquals(Assert.java:115)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForSymlink(TestOfflineImageViewer.java:776){code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15900) RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode

2021-04-06 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15900:

Fix Version/s: 3.1.5

> RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode
> ---
>
> Key: HDFS-15900
> URL: https://issues.apache.org/jira/browse/HDFS-15900
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Harunobu Daikoku
>Assignee: Harunobu Daikoku
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: image.png
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> We observed that when a NameNode becomes UNAVAILABLE, the corresponding 
> blockpool id in MembershipStoreImpl#activeNamespaces on dfsrouter 
> is unintentionally set to empty, its initial value.
>  !image.png|height=250!
> As a result of this, concat operations through dfsrouter fail with the 
> following error as it cannot resolve the block id in the recognized active 
> namespaces.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> Cannot locate a nameservice for block pool BP-...
> {noformat}
> A possible fix is to ignore UNAVAILABLE NameNode registrations, and set 
> proper namespace information obtained from available NameNode registrations 
> when constructing the cache of active namespaces.
>  
> [https://github.com/apache/hadoop/blob/rel/release-3.3.0/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/impl/MembershipStoreImpl.java#L207-L221]
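A rough sketch of the proposed direction, using simplified, illustrative types rather than the actual MembershipStoreImpl code: skip UNAVAILABLE registrations (and empty block pool ids) when rebuilding the active-namespace cache, so an empty value never replaces a valid one.

{code:java}
import java.util.HashMap;
import java.util.Map;

class NamespaceCacheSketch {
    enum State { ACTIVE, STANDBY, UNAVAILABLE }

    static class Registration {
        final String nameserviceId;
        final String blockPoolId; // may be empty for UNAVAILABLE registrations
        final State state;
        Registration(String ns, String bp, State s) { nameserviceId = ns; blockPoolId = bp; state = s; }
    }

    /** Rebuild the nameservice -> block pool id cache, ignoring UNAVAILABLE entries. */
    static Map<String, String> buildActiveNamespaces(Iterable<Registration> registrations) {
        Map<String, String> active = new HashMap<>();
        for (Registration r : registrations) {
            if (r.state == State.UNAVAILABLE || r.blockPoolId.isEmpty()) {
                continue; // do not let an empty block pool id shadow a valid one
            }
            active.put(r.nameserviceId, r.blockPoolId);
        }
        return active;
    }
}
{code}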



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15900) RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode

2021-04-05 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15900:

Fix Version/s: 3.2.3

> RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode
> ---
>
> Key: HDFS-15900
> URL: https://issues.apache.org/jira/browse/HDFS-15900
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Harunobu Daikoku
>Assignee: Harunobu Daikoku
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
> Attachments: image.png
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> We observed that when a NameNode becomes UNAVAILABLE, the corresponding 
> blockpool id in MembershipStoreImpl#activeNamespaces on dfsrouter 
> is unintentionally set to empty, its initial value.
>  !image.png|height=250!
> As a result of this, concat operations through dfsrouter fail with the 
> following error as it cannot resolve the block id in the recognized active 
> namespaces.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> Cannot locate a nameservice for block pool BP-...
> {noformat}
> A possible fix is to ignore UNAVAILABLE NameNode registrations, and set 
> proper namespace information obtained from available NameNode registrations 
> when constructing the cache of active namespaces.
>  
> [https://github.com/apache/hadoop/blob/rel/release-3.3.0/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/impl/MembershipStoreImpl.java#L207-L221]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15883) Add a metric BlockReportQueueFullCount

2021-04-05 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314746#comment-17314746
 ] 

Takanobu Asanuma edited comment on HDFS-15883 at 4/5/21, 10:06 AM:
---

Closing this as the PR was closed. Feel free to reopen it if necessary.


was (Author: tasanuma0829):
Closing this as the PR was closed.

> Add a metric BlockReportQueueFullCount
> --
>
> Key: HDFS-15883
> URL: https://issues.apache.org/jira/browse/HDFS-15883
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Add a metric that reflects the number of times the block report queue is full
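For illustration only (the associated PR was closed), a minimal sketch of such a counter using plain Java types rather than the Hadoop metrics2 classes: increment it whenever a block report cannot be queued.

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.LongAdder;

class BlockReportQueueSketch {
    private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1024);
    // Incremented every time a report is rejected because the queue is full.
    private final LongAdder queueFullCount = new LongAdder();

    boolean submit(Runnable blockReport) {
        if (!queue.offer(blockReport)) {
            queueFullCount.increment();
            return false;
        }
        return true;
    }

    long getBlockReportQueueFullCount() {
        return queueFullCount.sum();
    }
}
{code}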



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15883) Add a metric BlockReportQueueFullCount

2021-04-05 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15883.
-
Resolution: Won't Fix

Closing this as the PR was closed.

> Add a metric BlockReportQueueFullCount
> --
>
> Key: HDFS-15883
> URL: https://issues.apache.org/jira/browse/HDFS-15883
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Add a metric that reflects the number of times the block report queue is full



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15951) Remove unused parameters in NameNodeProxiesClient

2021-04-05 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15951.
-
Fix Version/s: 3.4.0
   3.3.1
   Resolution: Fixed

> Remove unused parameters in NameNodeProxiesClient
> -
>
> Key: HDFS-15951
> URL: https://issues.apache.org/jira/browse/HDFS-15951
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Remove unused parameters in org.apache.hadoop.hdfs.NameNodeProxiesClient.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15945) DataNodes with zero capacity and zero blocks should be decommissioned immediately

2021-04-05 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314711#comment-17314711
 ] 

Takanobu Asanuma commented on HDFS-15945:
-

Thanks for your comment, [~weichiu].

> DataNodes with zero capacity and zero blocks should be decommissioned 
> immediately
> -
>
> Key: HDFS-15945
> URL: https://issues.apache.org/jira/browse/HDFS-15945
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In some cases, such as when there is a storage problem, DataNode capacity and 
> block count become zero.
>  When we tried to decommission those DataNodes, we ran into an issue that the 
> decommission did not complete because the NameNode had not received their 
> first block report.
> {noformat}
> INFO  blockmanagement.DatanodeAdminManager 
> (DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 
> 127.0.0.1:58343 
> [DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 
> blocks
> INFO  blockmanagement.BlockManager 
> (BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 
> 127.0.0.1:58343 hasn't sent its first block report.
> INFO  blockmanagement.DatanodeAdminDefaultMonitor 
> (DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't 
> healthy. It needs to replicate 0 more blocks. Decommission In Progress is 
> still in progress.
> {noformat}
> To make matters worse, even if we stopped these DataNodes afterward, they 
> remained in a dead state until NameNode restarted.
> I think those DataNodes should be decommissioned immediately even if the 
> NameNode hasn't received the first block report.
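A minimal sketch of the proposed behavior, with simplified, hypothetical types rather than the actual DatanodeAdminManager code: a node that reports zero capacity and holds zero blocks cannot have under-replicated data, so there is nothing to wait for before completing the decommission.

{code:java}
class DecommissionCheckSketch {
    static class DatanodeStats {
        long capacity;                    // total capacity reported by the node
        long numBlocks;                   // blocks the NameNode knows on this node
        boolean firstBlockReportReceived;
    }

    /** Sketch of a health check that no longer blocks on the first block report for empty nodes. */
    static boolean isHealthyForDecommission(DatanodeStats node) {
        if (node.firstBlockReportReceived) {
            return true; // the regular replication checks handle this node
        }
        // No capacity and no blocks: nothing can be under-replicated, so do not stall.
        return node.capacity == 0 && node.numBlocks == 0;
    }
}
{code}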



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15945) DataNodes with zero capacity and zero blocks should be decommissioned immediately

2021-04-02 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15945:

Description: 
In some cases, such as when there is a storage problem, DataNode capacity and 
block count become zero.
 When we tried to decommission those DataNodes, we ran into an issue that the 
decommission did not complete because the NameNode had not received their first 
block report.
{noformat}
INFO  blockmanagement.DatanodeAdminManager 
(DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 
127.0.0.1:58343 
[DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 
blocks
INFO  blockmanagement.BlockManager 
(BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 
127.0.0.1:58343 hasn't sent its first block report.
INFO  blockmanagement.DatanodeAdminDefaultMonitor 
(DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't 
healthy. It needs to replicate 0 more blocks. Decommission In Progress is still 
in progress.
{noformat}
To make matters worse, even if we stopped these DataNodes afterward, they 
remained in a dead state until NameNode restarted.

I think those DataNodes should be decommissioned immediately even if the 
NameNode hasn't received the first block report.

  was:
In some cases, such as when there is a storage problem, DataNode capacity and 
block count become zero.
When I tried to decommission those DataNodes, I ran into an issue that the 
decommission did not complete because the NameNode had not received their first 
block report.

{noformat}
INFO  blockmanagement.DatanodeAdminManager 
(DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 
127.0.0.1:58343 
[DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 
blocks
INFO  blockmanagement.BlockManager 
(BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 
127.0.0.1:58343 hasn't sent its first block report.
INFO  blockmanagement.DatanodeAdminDefaultMonitor 
(DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't 
healthy. It needs to replicate 0 more blocks. Decommission In Progress is still 
in progress.
{noformat}

I think those DataNodes should be decommissioned immediately even if they 
haven't reported the first block report.


> DataNodes with zero capacity and zero blocks should be decommissioned 
> immediately
> -
>
> Key: HDFS-15945
> URL: https://issues.apache.org/jira/browse/HDFS-15945
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some cases, such as when there is a storage problem, DataNode capacity and 
> block count become zero.
>  When we tried to decommission those DataNodes, we ran into an issue that the 
> decommission did not complete because the NameNode had not received their 
> first block report.
> {noformat}
> INFO  blockmanagement.DatanodeAdminManager 
> (DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 
> 127.0.0.1:58343 
> [DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 
> blocks
> INFO  blockmanagement.BlockManager 
> (BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 
> 127.0.0.1:58343 hasn't sent its first block report.
> INFO  blockmanagement.DatanodeAdminDefaultMonitor 
> (DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't 
> healthy. It needs to replicate 0 more blocks. Decommission In Progress is 
> still in progress.
> {noformat}
> To make matters worse, even if we stopped these DataNodes afterward, they 
> remained in a dead state until NameNode restarted.
> I think those DataNodes should be decommissioned immediately even if the 
> NameNode hasn't received the first block report.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-15945) DataNodes with zero capacity and zero blocks should be decommissioned immediately

2021-04-02 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-15945 started by Takanobu Asanuma.
---
> DataNodes with zero capacity and zero blocks should be decommissioned 
> immediately
> -
>
> Key: HDFS-15945
> URL: https://issues.apache.org/jira/browse/HDFS-15945
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some cases, such as when there is a storage problem, DataNode capacity and 
> block count become zero.
> When I tried to decommission those DataNodes, I ran into an issue that the 
> decommission did not complete because the NameNode had not received their 
> first block report.
> {noformat}
> INFO  blockmanagement.DatanodeAdminManager 
> (DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 
> 127.0.0.1:58343 
> [DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 
> blocks
> INFO  blockmanagement.BlockManager 
> (BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 
> 127.0.0.1:58343 hasn't sent its first block report.
> INFO  blockmanagement.DatanodeAdminDefaultMonitor 
> (DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't 
> healthy. It needs to replicate 0 more blocks. Decommission In Progress is 
> still in progress.
> {noformat}
> I think those DataNodes should be decommissioned immediately even if they 
> haven't sent their first block report.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15945) DataNodes with zero capacity and zero blocks should be decommissioned immediately

2021-04-02 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15945:

Description: 
In some cases, such as when there is a storage problem, DataNode capacity and 
block count become zero.
When I tried to decommission those DataNodes, I ran into an issue that the 
decommission did not complete because the NameNode had not received their first 
block report.

{noformat}
INFO  blockmanagement.DatanodeAdminManager 
(DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 
127.0.0.1:58343 
[DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 
blocks
INFO  blockmanagement.BlockManager 
(BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 
127.0.0.1:58343 hasn't sent its first block report.
INFO  blockmanagement.DatanodeAdminDefaultMonitor 
(DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't 
healthy. It needs to replicate 0 more blocks. Decommission In Progress is still 
in progress.
{noformat}

I think those DataNodes should be decommissioned immediately even if they 
haven't sent their first block report.

  was:
In some cases, such as when there is a storage problem, DataNode capacity and 
block count become zero.
When I tried to decommission those DataNodes, I ran into an issue that the 
decommission did not complete because the NameNode had not received their first 
block report.

I think those DataNodes should be decommissioned immediately even if they 
haven't sent their first block report.


> DataNodes with zero capacity and zero blocks should be decommissioned 
> immediately
> -
>
> Key: HDFS-15945
> URL: https://issues.apache.org/jira/browse/HDFS-15945
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some cases, such as when there is a storage problem, DataNode capacity and 
> block count become zero.
> When I tried to decommission those DataNodes, I ran into an issue that the 
> decommission did not complete because the NameNode had not received their 
> first block report.
> {noformat}
> INFO  blockmanagement.DatanodeAdminManager 
> (DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 
> 127.0.0.1:58343 
> [DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 
> blocks
> INFO  blockmanagement.BlockManager 
> (BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 
> 127.0.0.1:58343 hasn't sent its first block report.
> INFO  blockmanagement.DatanodeAdminDefaultMonitor 
> (DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't 
> healthy. It needs to replicate 0 more blocks. Decommission In Progress is 
> still in progress.
> {noformat}
> I think those DataNodes should be decommissioned immediately even if they 
> haven't sent their first block report.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15945) DataNodes with zero capacity and zero blocks should be decommissioned immediately

2021-04-02 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15945:

Status: Patch Available  (was: In Progress)

> DataNodes with zero capacity and zero blocks should be decommissioned 
> immediately
> -
>
> Key: HDFS-15945
> URL: https://issues.apache.org/jira/browse/HDFS-15945
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some cases, such as when there is a storage problem, DataNode capacity and 
> block count become zero.
> When I tried to decommission those DataNodes, I ran into an issue that the 
> decommission did not complete because the NameNode had not received their 
> first block report.
> {noformat}
> INFO  blockmanagement.DatanodeAdminManager 
> (DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 
> 127.0.0.1:58343 
> [DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 
> blocks
> INFO  blockmanagement.BlockManager 
> (BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 
> 127.0.0.1:58343 hasn't sent its first block report.
> INFO  blockmanagement.DatanodeAdminDefaultMonitor 
> (DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't 
> healthy. It needs to replicate 0 more blocks. Decommission In Progress is 
> still in progress.
> {noformat}
> I think those DataNodes should be decommissioned immediately even if they 
> haven't sent their first block report.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15945) DataNodes with zero capacity and zero blocks should be decommissioned immediately

2021-04-02 Thread Takanobu Asanuma (Jira)
Takanobu Asanuma created HDFS-15945:
---

 Summary: DataNodes with zero capacity and zero blocks should be 
decommissioned immediately
 Key: HDFS-15945
 URL: https://issues.apache.org/jira/browse/HDFS-15945
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


In some cases, such as when there is a storage problem, DataNode capacity and 
block count become zero.
When I tried to decommission those DataNodes, I ran into an issue that the 
decommission did not complete because the NameNode had not received their first 
block report.

I think those DataNodes should be decommissioned immediately even if they 
haven't sent their first block report.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15892) Add metric for editPendingQ in FSEditLogAsync

2021-04-01 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15892.
-
Fix Version/s: 3.2.3
   3.4.0
   3.3.1
   Resolution: Fixed

> Add metric for editPendingQ in FSEditLogAsync
> -
>
> Key: HDFS-15892
> URL: https://issues.apache.org/jira/browse/HDFS-15892
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> To monitor editPendingQ in FSEditLogAsync, we add a metric and print a log 
> message when the queue is full.
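A minimal sketch of the idea using plain Java types (illustrative names, not the committed patch): expose the pending-edit queue length so a metrics system can sample it, and log when an offer fails because the queue is full before falling back to a blocking put.

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class EditPendingQueueSketch {
    private final BlockingQueue<Object> editPendingQ = new ArrayBlockingQueue<>(4096);

    /** Gauge-style accessor that a metrics system could sample periodically. */
    int getPendingEditsCount() {
        return editPendingQ.size();
    }

    void enqueueEdit(Object edit) throws InterruptedException {
        if (!editPendingQ.offer(edit)) {
            // The queue is full: record the event, then wait for space.
            System.err.println("editPendingQ is full (" + editPendingQ.size() + " entries), waiting");
            editPendingQ.put(edit);
        }
    }
}
{code}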



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15900) RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode

2021-03-31 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15900:

Fix Version/s: (was: 3.2.3)
   (was: 3.1.5)

> RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode
> ---
>
> Key: HDFS-15900
> URL: https://issues.apache.org/jira/browse/HDFS-15900
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Harunobu Daikoku
>Assignee: Harunobu Daikoku
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
> Attachments: image.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> We observed that when a NameNode becomes UNAVAILABLE, the corresponding 
> blockpool id in MembershipStoreImpl#activeNamespaces on dfsrouter 
> is unintentionally set to empty, its initial value.
>  !image.png|height=250!
> As a result of this, concat operations through dfsrouter fail with the 
> following error as it cannot resolve the block id in the recognized active 
> namespaces.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> Cannot locate a nameservice for block pool BP-...
> {noformat}
> A possible fix is to ignore UNAVAILABLE NameNode registrations, and set 
> proper namespace information obtained from available NameNode registrations 
> when constructing the cache of active namespaces.
>  
> [https://github.com/apache/hadoop/blob/rel/release-3.3.0/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/impl/MembershipStoreImpl.java#L207-L221]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15900) RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode

2021-03-31 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312864#comment-17312864
 ] 

Takanobu Asanuma commented on HDFS-15900:
-

Reverted from branch-3.2 and branch-3.1.

[~hdaikoku] I think we should fix the bug in branch-3.2 and branch-3.1. Would 
you mind creating another PR for branch-3.2 and branch-3.1?

> RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode
> ---
>
> Key: HDFS-15900
> URL: https://issues.apache.org/jira/browse/HDFS-15900
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Harunobu Daikoku
>Assignee: Harunobu Daikoku
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: image.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> We observed that when a NameNode becomes UNAVAILABLE, the corresponding 
> blockpool id in MembershipStoreImpl#activeNamespaces on dfsrouter 
> is unintentionally set to empty, its initial value.
>  !image.png|height=250!
> As a result of this, concat operations through dfsrouter fail with the 
> following error as it cannot resolve the block id in the recognized active 
> namespaces.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> Cannot locate a nameservice for block pool BP-...
> {noformat}
> A possible fix is to ignore UNAVAILABLE NameNode registrations, and set 
> proper namespace information obtained from available NameNode registrations 
> when constructing the cache of active namespaces.
>  
> [https://github.com/apache/hadoop/blob/rel/release-3.3.0/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/impl/MembershipStoreImpl.java#L207-L221]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15900) RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode

2021-03-31 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312855#comment-17312855
 ] 

Takanobu Asanuma commented on HDFS-15900:
-

[~ayushtkn] Sorry, I will check it. Thanks for reporting it.

> RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode
> ---
>
> Key: HDFS-15900
> URL: https://issues.apache.org/jira/browse/HDFS-15900
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Harunobu Daikoku
>Assignee: Harunobu Daikoku
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: image.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> We observed that when a NameNode becomes UNAVAILABLE, the corresponding 
> blockpool id in MembershipStoreImpl#activeNamespaces on dfsrouter 
> is unintentionally set to empty, its initial value.
>  !image.png|height=250!
> As a result of this, concat operations through dfsrouter fail with the 
> following error as it cannot resolve the block id in the recognized active 
> namespaces.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> Cannot locate a nameservice for block pool BP-...
> {noformat}
> A possible fix is to ignore UNAVAILABLE NameNode registrations, and set 
> proper namespace information obtained from available NameNode registrations 
> when constructing the cache of active namespaces.
>  
> [https://github.com/apache/hadoop/blob/rel/release-3.3.0/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/impl/MembershipStoreImpl.java#L207-L221]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15900) RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode

2021-03-28 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15900:

Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> RBF: empty blockpool id on dfsrouter caused by UNAVAILABLE NameNode
> ---
>
> Key: HDFS-15900
> URL: https://issues.apache.org/jira/browse/HDFS-15900
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0
>Reporter: Harunobu Daikoku
>Assignee: Harunobu Daikoku
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: image.png
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> We observed that when a NameNode becomes UNAVAILABLE, the corresponding 
> blockpool id in MembershipStoreImpl#activeNamespaces on dfsrouter 
> is unintentionally set to empty, its initial value.
>  !image.png|height=250!
> As a result of this, concat operations through dfsrouter fail with the 
> following error as it cannot resolve the block id in the recognized active 
> namespaces.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> Cannot locate a nameservice for block pool BP-...
> {noformat}
> A possible fix is to ignore UNAVAILABLE NameNode registrations, and set 
> proper namespace information obtained from available NameNode registrations 
> when constructing the cache of active namespaces.
>  
> [https://github.com/apache/hadoop/blob/rel/release-3.3.0/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/impl/MembershipStoreImpl.java#L207-L221]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15879) Exclude slow nodes when choose targets for blocks

2021-03-27 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17309904#comment-17309904
 ] 

Takanobu Asanuma commented on HDFS-15879:
-

Merged to trunk.

> Exclude slow nodes when choose targets for blocks
> -
>
> Key: HDFS-15879
> URL: https://issues.apache.org/jira/browse/HDFS-15879
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> We have already been monitoring slow nodes; see 
> [HDFS-11194|https://issues.apache.org/jira/browse/HDFS-11194].
> We can use a thread to periodically collect these slow nodes into a set, then 
> use the set to filter out slow nodes when choosing targets for blocks.
> This feature can be configured to be turned on when needed.
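A rough sketch of the mechanism with simplified, hypothetical types (the real change belongs in the block placement policy): a monitor thread periodically refreshes a slow-node set, and target selection filters against it only when the feature is enabled.

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

class SlowNodeFilterSketch {
    private final boolean excludeSlowNodesEnabled;               // config switch, off by default
    private volatile Set<String> slowNodes = Collections.emptySet();

    SlowNodeFilterSketch(boolean enabled) {
        this.excludeSlowNodesEnabled = enabled;
    }

    /** Called periodically (e.g., by a monitor thread) with the latest slow-node report. */
    void refreshSlowNodes(Set<String> latest) {
        Set<String> updated = ConcurrentHashMap.newKeySet();
        updated.addAll(latest);
        slowNodes = updated; // publish the new snapshot atomically
    }

    /** Drop slow nodes from the candidate targets for a new block. */
    List<String> filterTargets(List<String> candidates) {
        if (!excludeSlowNodesEnabled) {
            return candidates;
        }
        Set<String> snapshot = slowNodes;
        return candidates.stream()
            .filter(node -> !snapshot.contains(node))
            .collect(Collectors.toList());
    }
}
{code}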



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15879) Exclude slow nodes when choose targets for blocks

2021-03-27 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15879.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Exclude slow nodes when choose targets for blocks
> -
>
> Key: HDFS-15879
> URL: https://issues.apache.org/jira/browse/HDFS-15879
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> We have already been monitoring slow nodes; see 
> [HDFS-11194|https://issues.apache.org/jira/browse/HDFS-11194].
> We can use a thread to periodically collect these slow nodes into a set, then 
> use the set to filter out slow nodes when choosing targets for blocks.
> This feature can be configured to be turned on when needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15879) Exclude slow nodes when choose targets for blocks

2021-03-27 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15879:

Issue Type: Improvement  (was: Wish)

> Exclude slow nodes when choose targets for blocks
> -
>
> Key: HDFS-15879
> URL: https://issues.apache.org/jira/browse/HDFS-15879
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> We have already been monitoring slow nodes; see 
> [HDFS-11194|https://issues.apache.org/jira/browse/HDFS-11194].
> We can use a thread to periodically collect these slow nodes into a set, then 
> use the set to filter out slow nodes when choosing targets for blocks.
> This feature can be configured to be turned on when needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15902) Improve the log for HTTPFS server operation

2021-03-24 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15902:

Fix Version/s: 3.2.3
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-3.3, branch-3.2. Thanks for your contribution, 
[~bpatel].

> Improve the log for HTTPFS server operation
> ---
>
> Key: HDFS-15902
> URL: https://issues.apache.org/jira/browse/HDFS-15902
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: httpfs
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
> Attachments: HDFS-15902.001.patch
>
>
> Improve the log for HTTPFS server operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15902) Improve the log for HTTPFS server operation

2021-03-24 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307656#comment-17307656
 ] 

Takanobu Asanuma commented on HDFS-15902:
-

+1 on [^HDFS-15902.001.patch].

> Improve the log for HTTPFS server operation
> ---
>
> Key: HDFS-15902
> URL: https://issues.apache.org/jira/browse/HDFS-15902
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: httpfs
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15902.001.patch
>
>
> Improve the log for HTTPFS server operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15906) Close FSImage and FSNamesystem after formatting is complete

2021-03-23 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15906:

Fix Version/s: 3.2.3
   3.1.5

> Close FSImage and FSNamesystem after formatting is complete
> ---
>
> Key: HDFS-15906
> URL: https://issues.apache.org/jira/browse/HDFS-15906
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Close FSImage and FSNamesystem after formatting is complete. 
> org.apache.hadoop.hdfs.server.namenode#format.
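A minimal sketch of the cleanup pattern, with stand-in types rather than the actual NameNode format code: release both resources once formatting finishes, whether it succeeds or fails.

{code:java}
class FormatCleanupSketch {
    /** Stand-ins for FSImage and FSNamesystem in this sketch. */
    interface Resource { void close(); }

    static void format(Resource fsImage, Resource fsNamesystem) {
        try {
            // ... format the storage directories ...
        } finally {
            // Close both resources instead of leaving them open after formatting.
            fsNamesystem.close();
            fsImage.close();
        }
    }
}
{code}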



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15906) Close FSImage and FSNamesystem after formatting is complete

2021-03-22 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15906.
-
Fix Version/s: 3.4.0
   3.3.1
   Resolution: Fixed

> Close FSImage and FSNamesystem after formatting is complete
> ---
>
> Key: HDFS-15906
> URL: https://issues.apache.org/jira/browse/HDFS-15906
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Close FSImage and FSNamesystem after formatting is complete. 
> org.apache.hadoop.hdfs.server.namenode#format.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15895) DFSAdmin#printOpenFiles has redundant String#format usage

2021-03-17 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15895.
-
Resolution: Fixed

> DFSAdmin#printOpenFiles has redundant String#format usage
> -
>
> Key: HDFS-15895
> URL: https://issues.apache.org/jira/browse/HDFS-15895
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15890) Improve the Logs for File Concat Operation

2021-03-16 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15890:

Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-3.3, branch-3.2, branch-3.1.

Thanks for your contribution, [~bpatel].

> Improve the Logs for File Concat Operation 
> ---
>
> Key: HDFS-15890
> URL: https://issues.apache.org/jira/browse/HDFS-15890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HDFS-15890.001.patch
>
>
> Since we are using the slf4j logger, we can remove the isDebugEnabled() guard 
> conditions. 
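For context, a minimal before/after of the pattern in question; the method and message are illustrative, not the actual concat code:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class ConcatLoggingSketch {
    private static final Logger LOG = LoggerFactory.getLogger(ConcatLoggingSketch.class);

    void concat(String target, String src) {
        // Before: an explicit guard avoids building the message when debug is off.
        if (LOG.isDebugEnabled()) {
            LOG.debug("concat " + src + " to " + target);
        }
        // After: parameterized logging formats the message only if debug is enabled.
        LOG.debug("concat {} to {}", src, target);
    }
}
{code}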



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15890) Improve the Logs for File Concat Operation

2021-03-16 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303102#comment-17303102
 ] 

Takanobu Asanuma commented on HDFS-15890:
-

The failed tests are not related.

> Improve the Logs for File Concat Operation 
> ---
>
> Key: HDFS-15890
> URL: https://issues.apache.org/jira/browse/HDFS-15890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15890.001.patch
>
>
> Since we are using the slf4j logger, we can remove the isDebugEnabled() guard 
> conditions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15895) DFSAdmin#printOpenFiles has redundant String#format usage

2021-03-16 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303096#comment-17303096
 ] 

Takanobu Asanuma commented on HDFS-15895:
-

I added you to the contributor role.

> DFSAdmin#printOpenFiles has redundant String#format usage
> -
>
> Key: HDFS-15895
> URL: https://issues.apache.org/jira/browse/HDFS-15895
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15895) DFSAdmin#printOpenFiles has redundant String#format usage

2021-03-16 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma reassigned HDFS-15895:
---

Assignee: Viraj Jasani

> DFSAdmin#printOpenFiles has redundant String#format usage
> -
>
> Key: HDFS-15895
> URL: https://issues.apache.org/jira/browse/HDFS-15895
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15890) Improve the Logs for File Concat Operation

2021-03-16 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303090#comment-17303090
 ] 

Takanobu Asanuma commented on HDFS-15890:
-

+1 on [^HDFS-15890.001.patch].

> Improve the Logs for File Concat Operation 
> ---
>
> Key: HDFS-15890
> URL: https://issues.apache.org/jira/browse/HDFS-15890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15890.001.patch
>
>
> Since we are using the slf4j logger, we can remove the isDebugEnabled() guard 
> conditions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2021-03-14 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301381#comment-17301381
 ] 

Takanobu Asanuma commented on HDFS-15848:
-

[~bpatel] Yes, we can, but there are some conflicts. Could you create a patch 
for branch-3.3?

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HDFS-15848.001.patch, HDFS-15848.002.patch, 
> HDFS-15848.003.patch, HDFS-15848.004.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15480) Ordered snapshot deletion: record snapshot deletion in XAttr

2021-03-11 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17300085#comment-17300085
 ] 

Takanobu Asanuma commented on HDFS-15480:
-

This commit is in trunk, and not in branch-3.3. I changed the fix version to 
3.4.0.

> Ordered snapshot deletion: record snapshot deletion in XAttr
> 
>
> Key: HDFS-15480
> URL: https://issues.apache.org/jira/browse/HDFS-15480
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Reporter: Tsz-wo Sze
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15480.000.patch, HDFS-15480.001.patch, 
> HDFS-15480.002.patch
>
>
> In this JIRA, the behavior of deleting the non-earliest snapshots will be 
> changed to marking them as deleted in XAttr but not actually deleting them.  
> Note that
> # The marked-for-deletion snapshots will be garbage collected later on; see 
> HDFS-15481.
> # The marked-for-deletion snapshots will be hidden from users; see HDFS-15482.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15480) Ordered snapshot deletion: record snapshot deletion in XAttr

2021-03-11 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15480:

Fix Version/s: (was: 1.3.0)
   3.4.0

> Ordered snapshot deletion: record snapshot deletion in XAttr
> 
>
> Key: HDFS-15480
> URL: https://issues.apache.org/jira/browse/HDFS-15480
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Reporter: Tsz-wo Sze
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15480.000.patch, HDFS-15480.001.patch, 
> HDFS-15480.002.patch
>
>
> In this JIRA, the behavior of deleting the non-earliest snapshots will be 
> changed to marking them as deleted in XAttr but not actually deleting them.  
> Note that
> # The marked-for-deletion snapshots will be garbage collected later on; see 
> HDFS-15481.
> # The marked-for-deletion snapshots will be hidden from users; see HDFS-15482.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2021-03-11 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15848:

Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for your contribution, [~bpatel]!

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HDFS-15848.001.patch, HDFS-15848.002.patch, 
> HDFS-15848.003.patch, HDFS-15848.004.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2021-03-11 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299970#comment-17299970
 ] 

Takanobu Asanuma commented on HDFS-15848:
-

+1 on [^HDFS-15848.004.patch].

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15848.001.patch, HDFS-15848.002.patch, 
> HDFS-15848.003.patch, HDFS-15848.004.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2021-03-11 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299480#comment-17299480
 ] 

Takanobu Asanuma commented on HDFS-15848:
-

[~bpatel] Could you fix the checkstyle issue?

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15848.001.patch, HDFS-15848.002.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2021-03-10 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299332#comment-17299332
 ] 

Takanobu Asanuma commented on HDFS-15848:
-

Thanks for updating the patch, [~bpatel], and thanks for your help, [~aajisaka].

+1 on [^HDFS-15848.002.patch], pending Jenkins.

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15848.001.patch, HDFS-15848.002.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2021-03-10 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299256#comment-17299256
 ] 

Takanobu Asanuma commented on HDFS-15848:
-

[~bpatel] Thanks for your explanation. OK, let's use a new logger in 
FSDirSnapshotOp.
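A minimal sketch of what that dedicated logger and an entry-point debug log could 
look like; the method signature and messages below are illustrative and may differ 
from the committed patch.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class FSDirSnapshotOpSketch {
  // Dedicated logger for snapshot operations, as discussed above.
  private static final Logger LOG =
      LoggerFactory.getLogger(FSDirSnapshotOpSketch.class);

  // Illustrative entry point; the real FSDirSnapshotOp methods take
  // FSDirectory and permission-checker arguments.
  static void createSnapshot(String snapshotRoot, String snapshotName) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("createSnapshot: snapshotRoot={}, snapshotName={}",
          snapshotRoot, snapshotName);
    }
    // ... actual snapshot creation logic follows here ...
  }
}
{code}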

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15848.001.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2021-03-09 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298191#comment-17298191
 ] 

Takanobu Asanuma edited comment on HDFS-15848 at 3/9/21, 5:03 PM:
--

[~bpatel] Thanks for your contribution.

One minor comment, why do you use both the new logger and NameNode.LOG in 
FSDirSnapshotOp? I feel that we can just use NameNode.LOG here rather than 
creating a new logger.


was (Author: tasanuma0829):
[~bpatel] Thanks for your contribution.

One minor comment, why do you use both the new logger and NameNode.LOG in 
FSDirSnapshotOp?

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15848.001.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2021-03-09 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298191#comment-17298191
 ] 

Takanobu Asanuma commented on HDFS-15848:
-

[~bpatel] Thanks for your contribution.

One minor comment, why do you use both the new logger and NameNode.LOG in 
FSDirSnapshotOp?

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15848.001.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15870) Remove unused configuration dfs.namenode.stripe.min

2021-03-03 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15870.
-
Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed

> Remove unused configuration dfs.namenode.stripe.min
> ---
>
> Key: HDFS-15870
> URL: https://issues.apache.org/jira/browse/HDFS-15870
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Remove unused configuration dfs.namenode.stripe.min.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15845) RBF: Router fails to start due to NoClassDefFoundError for hadoop-federation-balance

2021-02-22 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15845:

Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Merged the PR. Thanks, [~elgoiri] and [~LiJinglun].

> RBF: Router fails to start due to NoClassDefFoundError for 
> hadoop-federation-balance
> 
>
> Key: HDFS-15845
> URL: https://issues.apache.org/jira/browse/HDFS-15845
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {noformat}
> $ hdfs dfsrouter
> ...
> 2021-02-22 17:21:55,400 ERROR router.DFSRouter: Failed to start router
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/tools/fedbalance/procedure/BalanceProcedure
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.<init>(RouterClientProtocol.java:195)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.<init>(RouterRpcServer.java:394)
> at 
> org.apache.hadoop.hdfs.server.federation.router.Router.createRpcServer(Router.java:391)
> at 
> org.apache.hadoop.hdfs.server.federation.router.Router.serviceInit(Router.java:188)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
> at 
> org.apache.hadoop.hdfs.server.federation.router.DFSRouter.main(DFSRouter.java:69)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceProcedure
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 6 more
> 2021-02-22 17:21:55,402 INFO util.ExitUtil: Exiting with status 1: 
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/tools/fedbalance/procedure/BalanceProcedure
> 2021-02-22 17:21:55,404 INFO router.DFSRouter: SHUTDOWN_MSG:
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15845) RBF: Router fails to start due to NoClassDefFoundError for hadoop-federation-balance

2021-02-22 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15845:

Status: Patch Available  (was: Open)

> RBF: Router fails to start due to NoClassDefFoundError for 
> hadoop-federation-balance
> 
>
> Key: HDFS-15845
> URL: https://issues.apache.org/jira/browse/HDFS-15845
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {noformat}
> $ hdfs dfsrouter
> ...
> 2021-02-22 17:21:55,400 ERROR router.DFSRouter: Failed to start router
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/tools/fedbalance/procedure/BalanceProcedure
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.<init>(RouterClientProtocol.java:195)
> at 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.<init>(RouterRpcServer.java:394)
> at 
> org.apache.hadoop.hdfs.server.federation.router.Router.createRpcServer(Router.java:391)
> at 
> org.apache.hadoop.hdfs.server.federation.router.Router.serviceInit(Router.java:188)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
> at 
> org.apache.hadoop.hdfs.server.federation.router.DFSRouter.main(DFSRouter.java:69)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceProcedure
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 6 more
> 2021-02-22 17:21:55,402 INFO util.ExitUtil: Exiting with status 1: 
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/tools/fedbalance/procedure/BalanceProcedure
> 2021-02-22 17:21:55,404 INFO router.DFSRouter: SHUTDOWN_MSG:
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15845) RBF: Router fails to start due to NoClassDefFoundError for hadoop-federation-balance

2021-02-22 Thread Takanobu Asanuma (Jira)
Takanobu Asanuma created HDFS-15845:
---

 Summary: RBF: Router fails to start due to NoClassDefFoundError 
for hadoop-federation-balance
 Key: HDFS-15845
 URL: https://issues.apache.org/jira/browse/HDFS-15845
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


{noformat}
$ hdfs dfsrouter
...
2021-02-22 17:21:55,400 ERROR router.DFSRouter: Failed to start router
java.lang.NoClassDefFoundError: 
org/apache/hadoop/tools/fedbalance/procedure/BalanceProcedure
at 
org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.<init>(RouterClientProtocol.java:195)
at 
org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.<init>(RouterRpcServer.java:394)
at 
org.apache.hadoop.hdfs.server.federation.router.Router.createRpcServer(Router.java:391)
at 
org.apache.hadoop.hdfs.server.federation.router.Router.serviceInit(Router.java:188)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
at 
org.apache.hadoop.hdfs.server.federation.router.DFSRouter.main(DFSRouter.java:69)
Caused by: java.lang.ClassNotFoundException: 
org.apache.hadoop.tools.fedbalance.procedure.BalanceProcedure
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 6 more
2021-02-22 17:21:55,402 INFO util.ExitUtil: Exiting with status 1: 
java.lang.NoClassDefFoundError: 
org/apache/hadoop/tools/fedbalance/procedure/BalanceProcedure
2021-02-22 17:21:55,404 INFO router.DFSRouter: SHUTDOWN_MSG:
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15840) TestDecommissionWithStripedBackoffMonitor#testDecommissionWithMissingBlock fails on trunk intermittently

2021-02-18 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15840:

Summary: 
TestDecommissionWithStripedBackoffMonitor#testDecommissionWithMissingBlock 
fails on trunk intermittently  (was: 
TestDecommissionWithStripedBackoffMonitor#testDecommissionWithMissingBlock 
fails on trunk)

> TestDecommissionWithStripedBackoffMonitor#testDecommissionWithMissingBlock 
> fails on trunk intermittently
> 
>
> Key: HDFS-15840
> URL: https://issues.apache.org/jira/browse/HDFS-15840
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Priority: Major
>
> Found from HDFS-15835.
> {quote}java.lang.AssertionError: expected:<10> but was:<11>
>  at org.junit.Assert.fail(Assert.java:88)
>  at org.junit.Assert.failNotEquals(Assert.java:834)
>  at org.junit.Assert.assertEquals(Assert.java:645)
>  at org.junit.Assert.assertEquals(Assert.java:631)
>  at 
> org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithMissingBlock(TestDecommissionWithStriped.java:910)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15840) TestDecommissionWithStripedBackoffMonitor#testDecommissionWithMissingBlock fails on trunk

2021-02-18 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15840:

Description: 
Found from HDFS-15835.
{quote}java.lang.AssertionError: expected:<10> but was:<11>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:834)
 at org.junit.Assert.assertEquals(Assert.java:645)
 at org.junit.Assert.assertEquals(Assert.java:631)
 at 
org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithMissingBlock(TestDecommissionWithStriped.java:910)
{quote}

  was:
{quote}
java.lang.AssertionError: expected:<10> but was:<11>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithMissingBlock(TestDecommissionWithStriped.java:910)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{quote}


> TestDecommissionWithStripedBackoffMonitor#testDecommissionWithMissingBlock 
> fails on trunk
> -
>
> Key: HDFS-15840
> URL: https://issues.apache.org/jira/browse/HDFS-15840
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Priority: Major
>
> Found from HDFS-15835.
> {quote}java.lang.AssertionError: expected:<10> but was:<11>
>  at org.junit.Assert.fail(Assert.java:88)
>  at org.junit.Assert.failNotEquals(Assert.java:834)
>  at org.junit.Assert.assertEquals(Assert.java:645)
>  at org.junit.Assert.assertEquals(Assert.java:631)
>  at 
> org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithMissingBlock(TestDecommissionWithStriped.java:910)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15840) TestDecommissionWithStripedBackoffMonitor#testDecommissionWithMissingBlock fails on trunk

2021-02-18 Thread Takanobu Asanuma (Jira)
Takanobu Asanuma created HDFS-15840:
---

 Summary: 
TestDecommissionWithStripedBackoffMonitor#testDecommissionWithMissingBlock 
fails on trunk
 Key: HDFS-15840
 URL: https://issues.apache.org/jira/browse/HDFS-15840
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Takanobu Asanuma


{quote}
java.lang.AssertionError: expected:<10> but was:<11>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithMissingBlock(TestDecommissionWithStriped.java:910)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15835) Erasure coding: Add/remove logs for the better readability/debugging

2021-02-18 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286860#comment-17286860
 ] 

Takanobu Asanuma commented on HDFS-15835:
-

[~bpatel] I added you to the Hadoop contributor role. You can assign JIRAs to 
yourself next time. 

> Erasure coding: Add/remove logs for the better readability/debugging
> 
>
> Key: HDFS-15835
> URL: https://issues.apache.org/jira/browse/HDFS-15835
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15835.001.patch
>
>
> * Unnecessary NameNode logs are displayed when disabling EC policies that are 
> already disabled.
> * There are no info/debug logs present for addPolicy and unsetPolicy.
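A rough sketch of the two logging changes being requested: skip the noisy message 
when the policy is already disabled, and add an entry-point debug log for addPolicy. 
The class and method shapes below are simplified stand-ins, not the actual patch.

{code:java}
import java.util.HashSet;
import java.util.Set;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class EcPolicyLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(EcPolicyLoggingSketch.class);

  // Stand-in for the policy manager's set of enabled policy names.
  private final Set<String> enabledPolicies = new HashSet<>();

  // Only log at INFO when the state actually changes; otherwise a quiet DEBUG line.
  void disablePolicy(String policyName) {
    if (enabledPolicies.remove(policyName)) {
      LOG.info("Disabled erasure coding policy {}", policyName);
    } else {
      LOG.debug("Erasure coding policy {} is already disabled", policyName);
    }
  }

  // Entry-point debug log for addPolicy, which previously had none.
  void addPolicy(String policyName) {
    LOG.debug("addPolicy: {}", policyName);
    // ... register the policy with the policy manager ...
  }
}
{code}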



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15835) Erasure coding: Add/remove logs for the better readability/debugging

2021-02-18 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma reassigned HDFS-15835:
---

Assignee: Bhavik Patel

> Erasure coding: Add/remove logs for the better readability/debugging
> 
>
> Key: HDFS-15835
> URL: https://issues.apache.org/jira/browse/HDFS-15835
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15835.001.patch
>
>
> * Unnecessary NameNode logs are displayed when disabling EC policies that are 
> already disabled.
> * There are no info/debug logs present for addPolicy and unsetPolicy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15835) Erasure coding: Add/remove logs for the better readability/debugging

2021-02-18 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15835:

Fix Version/s: 3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

The failed test is not related.

Committed to trunk and branch-3.3.  Thanks for your contribution, [~bpatel].

> Erasure coding: Add/remove logs for the better readability/debugging
> 
>
> Key: HDFS-15835
> URL: https://issues.apache.org/jira/browse/HDFS-15835
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Reporter: Bhavik Patel
>Priority: Minor
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15835.001.patch
>
>
> * Unnecessary NameNode logs are displayed when disabling EC policies that are 
> already disabled.
> * There are no info/debug logs present for addPolicy and unsetPolicy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15835) Erasure coding: Add/remove logs for the better readability/debugging

2021-02-18 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286853#comment-17286853
 ] 

Takanobu Asanuma commented on HDFS-15835:
-

+1 on [^HDFS-15835.001.patch]. I will fix the checkstyle issue when committing 
it.

> Erasure coding: Add/remove logs for the better readability/debugging
> 
>
> Key: HDFS-15835
> URL: https://issues.apache.org/jira/browse/HDFS-15835
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs
>Reporter: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15835.001.patch
>
>
> * Unnecessary NameNode logs are displayed when disabling EC policies that are 
> already disabled.
> * There are no info/debug logs present for addPolicy and unsetPolicy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15805) Hadoop prints sensitive Cookie information.

2021-02-03 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15805:

Target Version/s: 3.3.0, 3.1.1  (was: 3.1.1, 3.3.0)
  Resolution: Duplicate
  Status: Resolved  (was: Patch Available)

Thanks for your report, [~prasad-acit]. I closed this issue.

BTW, you didn't need to create a new JIRA. You could have converted this JIRA to 
Hadoop Common.

> Hadoop prints sensitive Cookie information.
> ---
>
> Key: HDFS-15805
> URL: https://issues.apache.org/jira/browse/HDFS-15805
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-15805.001.patch
>
>
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.AuthCookieHandler#setAuthCookie
>  - prints cookie information in the log. Any sensitive information in cookies 
> will be logged, which needs to be avoided.
> LOG.trace("Setting token value to {} ({})", authCookie, oldCookie);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-07 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17245678#comment-17245678
 ] 

Takanobu Asanuma commented on HDFS-15240:
-

Thanks again, [~marvelrock] and [~ferhui]!

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Blocker
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-15240-branch-3.1-001.patch, 
> HDFS-15240-branch-3.1.001.patch, HDFS-15240-branch-3.2.001.patch, 
> HDFS-15240-branch-3.3-001.patch, HDFS-15240-branch-3.3.001.patch, 
> HDFS-15240.001.patch, HDFS-15240.002.patch, HDFS-15240.003.patch, 
> HDFS-15240.004.patch, HDFS-15240.005.patch, HDFS-15240.006.patch, 
> HDFS-15240.007.patch, HDFS-15240.008.patch, HDFS-15240.009.patch, 
> HDFS-15240.010.patch, HDFS-15240.011.patch, HDFS-15240.012.patch, 
> HDFS-15240.013.patch, image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common sequence (LCS) between b6' (decoded) and b6 (read 
> from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group, one combination at a time, and 
> iterating through all cases, I found one case where the length of the LCS is 
> the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was produced by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so I needed 5 other 
> blocks to decode another 3 blocks, and then found that the 1st block's LCS 
> length is the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and 
> the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader read from offset 0 of the 1st block 
> after the dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN logs, I found that the following DN log 
> is the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> 
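The analysis above points to a 64 KB ByteBuffer in StripedBlockReader being reused 
while it still held bytes from a timed-out read. Purely as an illustration of the 
general remedy, not the actual HDFS-15240 patch, a pooled buffer can be cleared and 
zeroed before it is handed out again:

{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Illustrative buffer pool: releasing a buffer clears and zeroes it so a
// later reader never sees stale bytes left over from a failed or aborted read.
class CleanBufferPool {
  private final Deque<ByteBuffer> pool = new ArrayDeque<>();
  private final int bufferSize;

  CleanBufferPool(int bufferSize) {
    this.bufferSize = bufferSize;
  }

  synchronized ByteBuffer acquire() {
    ByteBuffer buf = pool.pollFirst();
    return buf != null ? buf : ByteBuffer.allocate(bufferSize);
  }

  synchronized void release(ByteBuffer buf) {
    buf.clear();                          // reset position and limit
    Arrays.fill(buf.array(), (byte) 0);   // drop any stale contents
    pool.addFirst(buf);
  }
}
{code}

The actual patch lives in the striped reconstruction path; the sketch only captures 
the core idea that a buffer must never be reused while it still carries data from a 
previous, incomplete read.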

[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-06 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244910#comment-17244910
 ] 

Takanobu Asanuma commented on HDFS-15240:
-

+1 on [^HDFS-15240-branch-3.1-001.patch] and [^HDFS-15240-branch-3.3-001.patch].

[~ferhui] Could you double check and commit them?

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Blocker
> Fix For: 3.2.2, 3.4.0, 3.2.3
>
> Attachments: HDFS-15240-branch-3.1-001.patch, 
> HDFS-15240-branch-3.2.001.patch, HDFS-15240-branch-3.3-001.patch, 
> HDFS-15240.001.patch, HDFS-15240.002.patch, HDFS-15240.003.patch, 
> HDFS-15240.004.patch, HDFS-15240.005.patch, HDFS-15240.006.patch, 
> HDFS-15240.007.patch, HDFS-15240.008.patch, HDFS-15240.009.patch, 
> HDFS-15240.010.patch, HDFS-15240.011.patch, HDFS-15240.012.patch, 
> HDFS-15240.013.patch, image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common sequence (LCS) between b6' (decoded) and b6 (read 
> from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group, one combination at a time, and 
> iterating through all cases, I found one case where the length of the LCS is 
> the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was produced by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so I needed 5 other 
> blocks to decode another 3 blocks, and then found that the 1st block's LCS 
> length is the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and 
> the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader read from offset 0 of the 1st block 
> after the dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN logs, I found that the following DN log 
> is the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> 

[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-04 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15240:

Priority: Blocker  (was: Major)

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Blocker
> Fix For: 3.4.0, 3.2.3
>
> Attachments: HDFS-15240-branch-3.2.001.patch, HDFS-15240.001.patch, 
> HDFS-15240.002.patch, HDFS-15240.003.patch, HDFS-15240.004.patch, 
> HDFS-15240.005.patch, HDFS-15240.006.patch, HDFS-15240.007.patch, 
> HDFS-15240.008.patch, HDFS-15240.009.patch, HDFS-15240.010.patch, 
> HDFS-15240.011.patch, HDFS-15240.012.patch, HDFS-15240.013.patch, 
> image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common sequence (LCS) between b6' (decoded) and b6 (read 
> from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group, one combination at a time, and 
> iterating through all cases, I found one case where the length of the LCS is 
> the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was produced by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so I needed 5 other 
> blocks to decode another 3 blocks, and then found that the 1st block's LCS 
> length is the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and 
> the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader read from offset 0 of the 1st block 
> after the dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN logs, I found that the following DN log 
> is the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at 

[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-04 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244195#comment-17244195
 ] 

Takanobu Asanuma commented on HDFS-15240:
-

BTW, I'm going to change this issue to blocker. All new releases of branch-3.x 
should include it.

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Fix For: 3.4.0, 3.2.3
>
> Attachments: HDFS-15240-branch-3.2.001.patch, HDFS-15240.001.patch, 
> HDFS-15240.002.patch, HDFS-15240.003.patch, HDFS-15240.004.patch, 
> HDFS-15240.005.patch, HDFS-15240.006.patch, HDFS-15240.007.patch, 
> HDFS-15240.008.patch, HDFS-15240.009.patch, HDFS-15240.010.patch, 
> HDFS-15240.011.patch, HDFS-15240.012.patch, HDFS-15240.013.patch, 
> image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common sequence (LCS) between b6' (decoded) and b6 (read 
> from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group, one combination at a time, and 
> iterating through all cases, I found one case where the length of the LCS is 
> the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was produced by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so I needed 5 other 
> blocks to decode another 3 blocks, and then found that the 1st block's LCS 
> length is the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and 
> the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader read from offset 0 of the 1st block 
> after the dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN logs, I found that the following DN log 
> is the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> 

[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-04 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244187#comment-17244187
 ] 

Takanobu Asanuma commented on HDFS-15240:
-

[~hexiaoqiao] I confirmed TestReconstructStripedFile succeeded with or without 
ISA-L for branch-3.2.2 with [^HDFS-15240-branch-3.2.001.patch] in my 
environment. +1 on [^HDFS-15240-branch-3.2.001.patch]. Please go ahead. Thanks.

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Fix For: 3.4.0, 3.2.3
>
> Attachments: HDFS-15240-branch-3.2.001.patch, HDFS-15240.001.patch, 
> HDFS-15240.002.patch, HDFS-15240.003.patch, HDFS-15240.004.patch, 
> HDFS-15240.005.patch, HDFS-15240.006.patch, HDFS-15240.007.patch, 
> HDFS-15240.008.patch, HDFS-15240.009.patch, HDFS-15240.010.patch, 
> HDFS-15240.011.patch, HDFS-15240.012.patch, HDFS-15240.013.patch, 
> image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common sequence (LCS) between b6' (decoded) and b6 (read 
> from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group, one combination at a time, and 
> iterating through all cases, I found one case where the length of the LCS is 
> the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was produced by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so I needed 5 other 
> blocks to decode another 3 blocks, and then found that the 1st block's LCS 
> length is the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and 
> the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader read from offset 0 of the 1st block 
> after the dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN logs, I found that the following DN log 
> is the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> 

[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-03 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15240:

Target Version/s: 3.3.1, 3.4.0, 3.1.5, 3.2.3  (was: 3.3.1, 3.4.0)

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, 
> HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, 
> HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, 
> HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, 
> HDFS-15240.012.patch, HDFS-15240.013.patch, 
> image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common sequence (LCS) between b6' (decoded) and b6 (read 
> from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group, one combination at a time, and 
> iterating through all cases, I found one case where the length of the LCS is 
> the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was produced by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so I needed 5 other 
> blocks to decode another 3 blocks, and then found that the 1st block's LCS 
> length is the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and 
> the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader read from offset 0 of the 1st block 
> after the dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN logs, I found that the following DN log 
> is the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at 

[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-03 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243662#comment-17243662
 ] 

Takanobu Asanuma commented on HDFS-15240:
-

Thanks for committing it, [~ferhui]!

[~marvelrock] We are maintaining branch-3.1 and later, and I think they have the 
same bug. Would it also be possible to provide patches for branch-3.2 and 
branch-3.1?

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, 
> HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, 
> HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, 
> HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, 
> HDFS-15240.012.patch, HDFS-15240.013.patch, 
> image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common sequence (LCS) between b6' (decoded) and b6 (read 
> from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group, one combination at a time, and 
> iterating through all cases, I found one case where the length of the LCS is 
> the block length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block was produced by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so I needed 5 other 
> blocks to decode another 3 blocks, and then found that the 1st block's LCS 
> length is the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, and 
> the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader read from offset 0 of the 1st block 
> after the dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN log, I found that the following DN log 
> shows the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> 
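As an aside on the diagnosis quoted above: the check boils down to comparing each 
decoded block with the block read back from the DataNode and measuring their 
longest common substring. Below is a minimal, naive sketch of that comparison; it 
is not the reporter's actual check program, and the quadratic algorithm is only 
feasible for small prefixes such as the first 131072 bytes, not for whole blocks.

{code:java}
public final class LcsCheck {
  // Naive longest-common-substring length between a decoded block and the
  // block read back from the DataNode (illustrative only; O(n*m) time).
  static int longestCommonSubstring(byte[] decoded, byte[] readBack) {
    int best = 0;
    int[] prev = new int[readBack.length + 1];
    for (int i = 1; i <= decoded.length; i++) {
      int[] cur = new int[readBack.length + 1];
      for (int j = 1; j <= readBack.length; j++) {
        if (decoded[i - 1] == readBack[j - 1]) {
          // Extend the common substring ending at (i-1, j-1).
          cur[j] = prev[j - 1] + 1;
          best = Math.max(best, cur[j]);
        }
      }
      prev = cur;
    }
    return best;
  }
}
{code}

For a clean reconstruction the LCS equals the block length; an LCS that falls 
short by exactly 64KB points at a single dirty 64KB ByteBuffer, which is what 
the reporter observed.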

[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.

2020-12-02 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242857#comment-17242857
 ] 

Takanobu Asanuma commented on HDFS-14353:
-

It seems 3.3.0 doesn't include this JIRA, so I fixed the Fix Versions.

> Erasure Coding: metrics xmitsInProgress become to negative.
> ---
>
> Key: HDFS-14353
> URL: https://issues.apache.org/jira/browse/HDFS-14353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, erasure-coding
>Affects Versions: 3.3.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, 
> HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, 
> HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, 
> HDFS-14353.009.patch, HDFS-14353.010.patch, screenshot-1.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.

2020-12-02 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-14353:

Fix Version/s: (was: 3.3.0)
   3.3.1

> Erasure Coding: metrics xmitsInProgress become to negative.
> ---
>
> Key: HDFS-14353
> URL: https://issues.apache.org/jira/browse/HDFS-14353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, erasure-coding
>Affects Versions: 3.3.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, 
> HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, 
> HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, 
> HDFS-14353.009.patch, HDFS-14353.010.patch, screenshot-1.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-02 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242842#comment-17242842
 ] 

Takanobu Asanuma commented on HDFS-15240:
-

I confirmed that the patch also worked well with ISA-L. (CI doesn't use ISA-L 
since HADOOP-17224 was reverted.)

+1 on [^HDFS-15240.013.patch]. Thanks!

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, 
> HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, 
> HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, 
> HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, 
> HDFS-15240.012.patch, HDFS-15240.013.patch, 
> image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common substring (LCS) between b6' (decoded) and 
> b6 (read from the DN), and likewise for b7'/b7 and b8'/b8.
> After iterating through all combinations of 6 blocks of the block group, I 
> found one case where the length of the LCS is the block length - 64KB; 64KB is 
> exactly the length of the ByteBuffer used by StripedBlockReader. So the corrupt 
> reconstruction block was produced by a dirty buffer.
> The following log snippet (only 2 of the 28 cases are shown) is the output of 
> my check program. In my case, I knew the 3rd block was corrupt, so I needed 5 
> other blocks to decode another 3 blocks, and then found that the 1st block's 
> LCS is the block length - 64KB.
> It means blocks (0,1,2,4,5,6) were used to reconstruct the 3rd block, and the 
> dirty buffer was used before reading the 1st block.
> Note that StripedBlockReader read from offset 0 of the 1st block after the 
> dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN log, I found that the following DN log 
> shows the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> 

[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-12-02 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242355#comment-17242355
 ] 

Takanobu Asanuma commented on HDFS-15240:
-

This bug seems to have existed since Hadoop 3.0.0 was released. We saw a similar 
issue a long time ago, but we were unable to find its cause. Thank you very 
much for finding and fixing it, [~marvelrock]. Thanks for your reviews, 
[~ferhui] and [~umamaheswararao].

The main fix of [^HDFS-15240.012.patch] looks good to me. Some minor comments 
for the unit tests:
 * testTimeoutReadBlockInReconstruction: Please use the JIRA number in the comment.
{code:java}
- // before this fix, NPE will cause reconstruction fail(test timeout)
+ // before HDFS-15240, NPE will cause reconstruction fail(test timeout)
{code}

 * assertBufferPoolIsEmpty: Could this line be removed?
{code:java}
- byteBuffer = null;
{code}

 * emptyBufferPool: Just calling {{getBuffer}} may be enough?
{code:java}
-  ByteBuffer byteBuffer =  bufferPool.getBuffer(direct, 0);
-  byteBuffer = null;
+  bufferPool.getBuffer(direct, 0);
{code}

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, 
> HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, 
> HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, 
> HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, 
> HDFS-15240.012.patch, image-2020-07-16-15-56-38-608.png, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, 
> org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, 
> test-HDFS-15240.006.patch
>
>
> When reading some lzo files, we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common substring (LCS) between b6' (decoded) and 
> b6 (read from the DN), and likewise for b7'/b7 and b8'/b8.
> After iterating through all combinations of 6 blocks of the block group, I 
> found one case where the length of the LCS is the block length - 64KB; 64KB is 
> exactly the length of the ByteBuffer used by StripedBlockReader. So the corrupt 
> reconstruction block was produced by a dirty buffer.
> The following log snippet (only 2 of the 28 cases are shown) is the output of 
> my check program. In my case, I knew the 3rd block was corrupt, so I needed 5 
> other blocks to decode another 3 blocks, and then found that the 1st block's 
> LCS is the block length - 64KB.
> It means blocks (0,1,2,4,5,6) were used to reconstruct the 3rd block, and the 
> dirty buffer was used before reading the 1st block.
> Note that StripedBlockReader read from offset 0 of the 1st block after the 
> dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how does 
> the dirty buffer come about?
> After digging into the code and the DN log, I found that the following DN log 
> shows the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> 

[jira] [Assigned] (HDFS-15682) TestStripedFileAppend#testAppendToNewBlock fails on trunk

2020-11-12 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma reassigned HDFS-15682:
---

Assignee: Takanobu Asanuma

> TestStripedFileAppend#testAppendToNewBlock fails on trunk
> -
>
> Key: HDFS-15682
> URL: https://issues.apache.org/jira/browse/HDFS-15682
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>
> The Jenkins result of HDFS-15538 shows the failure.
> {noformat}
> File /TestFileAppendStriped/testAppendToNewBlock could only be written to 5 
> of the 6 required nodes for RS-6-3-1024k. There are 9 datanode(s) running and 
> 9 node(s) are excluded in this operation.
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2333)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3002)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:909)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:599)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:537)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1074)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1020)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:948)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2952)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15538) Fix the documentation for dfs.namenode.replication.max-streams in hdfs-default.xml

2020-11-12 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15538:

Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-3.3, branch-3.2, branch-3.1.

Thanks for your contribution, [~risyomei]!

> Fix the documentation for dfs.namenode.replication.max-streams in 
> hdfs-default.xml
> --
>
> Key: HDFS-15538
> URL: https://issues.apache.org/jira/browse/HDFS-15538
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Major
>  Labels: documentation
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HDFS-15538.001.patch, HDFS-15538.002.patch
>
>
> The description of dfs.namenode.replication.max-streams in hdfs-default.xml 
> is misleading.
> The maxReplicationStreams is not limiting the replication streams with the 
> highest priority.
> [https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15538) Fix the documentation for dfs.namenode.replication.max-streams in hdfs-default.xml

2020-11-12 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17231088#comment-17231088
 ] 

Takanobu Asanuma commented on HDFS-15538:
-

The failure of TestStripedFileAppend doesn't seem to be related. I filed it as 
HDFS-15682.

> Fix the documentation for dfs.namenode.replication.max-streams in 
> hdfs-default.xml
> --
>
> Key: HDFS-15538
> URL: https://issues.apache.org/jira/browse/HDFS-15538
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-15538.001.patch, HDFS-15538.002.patch
>
>
> The description of dfs.namenode.replication.max-streams in hdfs-default.xml 
> is misleading.
> The maxReplicationStreams is not limiting the replication streams with the 
> highest priority.
> [https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15682) TestStripedFileAppend#testAppendToNewBlock fails on trunk

2020-11-12 Thread Takanobu Asanuma (Jira)
Takanobu Asanuma created HDFS-15682:
---

 Summary: TestStripedFileAppend#testAppendToNewBlock fails on trunk
 Key: HDFS-15682
 URL: https://issues.apache.org/jira/browse/HDFS-15682
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Takanobu Asanuma


The Jenkins result of HDFS-15538 shows the failure.
{noformat}
File /TestFileAppendStriped/testAppendToNewBlock could only be written to 5 of 
the 6 required nodes for RS-6-3-1024k. There are 9 datanode(s) running and 9 
node(s) are excluded in this operation.
 at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2333)
 at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3002)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:909)
 at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:599)
 at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:537)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1074)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1020)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:948)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2952)
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15538) Fix the documentation for dfs.namenode.replication.max-streams-hard-limit in hdfs-default.xml

2020-11-11 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15538:

Description: 
The description of dfs.namenode.replication.max-streams in hdfs-default.xml is 
misleading.

The maxReplicationStreams is not limiting the replication streams with the 
highest priority.

[https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]

  was:
The description of dfs.namenode.replication.max-streams-hard-limit in 
hdfs-default.xml is misleading.

The maxReplicationStreams is not limiting the replication streams with the 
highest priority.

[https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]


> Fix the documentation for dfs.namenode.replication.max-streams-hard-limit in 
> hdfs-default.xml
> -
>
> Key: HDFS-15538
> URL: https://issues.apache.org/jira/browse/HDFS-15538
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-15538.001.patch, HDFS-15538.002.patch
>
>
> The description of dfs.namenode.replication.max-streams in hdfs-default.xml 
> is misleading.
> The maxReplicationStreams is not limiting the replication streams with the 
> highest priority.
> [https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15538) Fix the documentation for dfs.namenode.replication.max-streams in hdfs-default.xml

2020-11-11 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15538:

Summary: Fix the documentation for dfs.namenode.replication.max-streams in 
hdfs-default.xml  (was: Fix the documentation for 
dfs.namenode.replication.max-streams-hard-limit in hdfs-default.xml)

> Fix the documentation for dfs.namenode.replication.max-streams in 
> hdfs-default.xml
> --
>
> Key: HDFS-15538
> URL: https://issues.apache.org/jira/browse/HDFS-15538
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-15538.001.patch, HDFS-15538.002.patch
>
>
> The description of dfs.namenode.replication.max-streams in hdfs-default.xml 
> is misleading.
> The maxReplicationStreams is not limiting the replication streams with the 
> highest priority.
> [https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15538) Fix the documentation for dfs.namenode.replication.max-streams-hard-limit in hdfs-default.xml

2020-11-11 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17230375#comment-17230375
 ] 

Takanobu Asanuma commented on HDFS-15538:
-

[~risyomei] Thanks for updating the patch, and thanks for your message!

+1 on [^HDFS-15538.002.patch], pending jenkins.

> Fix the documentation for dfs.namenode.replication.max-streams-hard-limit in 
> hdfs-default.xml
> -
>
> Key: HDFS-15538
> URL: https://issues.apache.org/jira/browse/HDFS-15538
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-15538.001.patch, HDFS-15538.002.patch
>
>
> The description of dfs.namenode.replication.max-streams-hard-limit in 
> hdfs-default.xml is misleading.
> The maxReplicationStreams is not limiting the replication streams with the 
> highest priority.
> [https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15657) RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException

2020-10-28 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HDFS-15657.
-
Fix Version/s: 3.4.0
   3.3.1
   Resolution: Fixed

Merged to trunk and branch-3.3. Thanks, [~aajisaka].

> RBF: TestRouter#testNamenodeHeartBeatEnableDefault fails by BindException
> -
>
> Key: HDFS-15657
> URL: https://issues.apache.org/jira/browse/HDFS-15657
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/40/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
> {noformat}
> [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.431 
> s <<< FAILURE! - in org.apache.hadoop.hdfs.server.federation.router.TestRouter
> [ERROR] 
> testNamenodeHeartBeatEnableDefault(org.apache.hadoop.hdfs.server.federation.router.TestRouter)
>   Time elapsed: 1.04 s  <<< ERROR!
> org.apache.hadoop.service.ServiceStateException: java.net.BindException: 
> Problem binding to [0.0.0.0:] java.net.BindException: Address already in 
> use; For more details see:  http://wiki.apache.org/hadoop/BindException
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.checkNamenodeHeartBeatEnableDefault(TestRouter.java:281)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.TestRouter.testNamenodeHeartBeatEnableDefault(TestRouter.java:267)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at 
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>  Method)
>   at 
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> 

[jira] [Updated] (HDFS-15639) [JDK 11] Fix Javadoc errors in hadoop-hdfs-client

2020-10-20 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15639:

Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Merged to trunk, branch-3.3, branch-3.2, branch-3.1.

> [JDK 11] Fix Javadoc errors in hadoop-hdfs-client
> -
>
> Key: HDFS-15639
> URL: https://issues.apache.org/jira/browse/HDFS-15639
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This is caused by HDFS-15567.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-15639) [JDK 11] Fix Javadoc errors in hadoop-hdfs-client

2020-10-18 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-15639 started by Takanobu Asanuma.
---
> [JDK 11] Fix Javadoc errors in hadoop-hdfs-client
> -
>
> Key: HDFS-15639
> URL: https://issues.apache.org/jira/browse/HDFS-15639
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>
> This is caused by HDFS-15567.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15639) [JDK 11] Fix Javadoc errors in hadoop-hdfs-client

2020-10-18 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15639:

Status: Patch Available  (was: In Progress)

> [JDK 11] Fix Javadoc errors in hadoop-hdfs-client
> -
>
> Key: HDFS-15639
> URL: https://issues.apache.org/jira/browse/HDFS-15639
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is caused by HDFS-15567.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15639) [JDK 11] Fix Javadoc errors in hadoop-hdfs-client

2020-10-18 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15639:

Description: This is caused by HDFS-15567.

> [JDK 11] Fix Javadoc errors in hadoop-hdfs-client
> -
>
> Key: HDFS-15639
> URL: https://issues.apache.org/jira/browse/HDFS-15639
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>
> This is caused by HDFS-15567.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15639) [JDK 11] Fix Javadoc errors in hadoop-hdfs-client

2020-10-18 Thread Takanobu Asanuma (Jira)
Takanobu Asanuma created HDFS-15639:
---

 Summary: [JDK 11] Fix Javadoc errors in hadoop-hdfs-client
 Key: HDFS-15639
 URL: https://issues.apache.org/jira/browse/HDFS-15639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15572) RBF: Quota updating every 1 min once, if I setquota 50 on mount path, in a minute i am able to write the files morethan quota in mount path.

2020-10-14 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214037#comment-17214037
 ] 

Takanobu Asanuma commented on HDFS-15572:
-

If {{dfs.federation.router.cache.ttl}} or 
{{dfs.federation.router.quota-cache.update.interval}} is reduced, the router would 
update the cache in near real time with the current implementation. But it will 
cause frequent read operations against the State Store if the interval is too short.
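
For illustration, shortening both intervals could look like the sketch below. The 
values and the helper class name are made up for this example and are not 
recommendations; the trade-off is simply less staleness versus more frequent 
State Store reads.

{code:java}
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;

// Hypothetical helper: builds a Configuration with shorter Router cache
// refresh intervals (illustrative values only).
public class RouterCacheTuning {
  public static Configuration shortenedCacheIntervals() {
    Configuration conf = new Configuration();
    // Mount table / State Store cache TTL (default is 1 minute).
    conf.setTimeDuration("dfs.federation.router.cache.ttl",
        10, TimeUnit.SECONDS);
    // Quota cache update interval (default is 60 seconds).
    conf.setTimeDuration("dfs.federation.router.quota-cache.update.interval",
        10, TimeUnit.SECONDS);
    return conf;
  }
}
{code}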

> RBF: Quota updating every 1 min once, if I setquota 50 on mount path, in a 
> minute i am able to write the files morethan quota in mount path. 
> -
>
> Key: HDFS-15572
> URL: https://issues.apache.org/jira/browse/HDFS-15572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Harshakiran Reddy
>Assignee: Hemanth Boyina
>Priority: Major
>
> In the Router State Store, the quota is updated only once per minute. If I set a 
> quota of 50 on a mount path, I am able to write more files than the quota allows 
> within that minute, so the quota is not enforced.
> {noformat}
> 1. Create destination dirs in the namespaces
> 2. Create a mount path with multiple destinations
> 3. Setquota 50 on the mount path
> 4. Write more than 50 files to the mount path within a minute
> {noformat}
> *Expected Result:*
> After setquota, the mount path should not allow more than the quota at any cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15577) Refactor TestTracing

2020-09-28 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15577:

Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Refactor TestTracing
> 
>
> Key: HDFS-15577
> URL: https://issues.apache.org/jira/browse/HDFS-15577
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There are some unused imports, unused variables, and checkstyle warnings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15576) Erasure Coding: Add rs and rs-legacy codec test for addPolicies

2020-09-15 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15576:

Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for your contribution, [~ferhui]!

> Erasure Coding: Add rs and rs-legacy codec test for addPolicies
> ---
>
> Key: HDFS-15576
> URL: https://issues.apache.org/jira/browse/HDFS-15576
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HDFS-15576.001.patch, HDFS-15576.002.patch
>
>
> * Add rs and rs-legacy codec test for  TestErasureCodingCLI
> * Add comments for failed test RS
> * Modify UT, change "RS" to "rs", because "RS" is not supported 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15576) Erasure Coding: Add rs and rs-legacy codec test for addPolicies

2020-09-15 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196282#comment-17196282
 ] 

Takanobu Asanuma commented on HDFS-15576:
-

The failed tests are not related to the patch.

> Erasure Coding: Add rs and rs-legacy codec test for addPolicies
> ---
>
> Key: HDFS-15576
> URL: https://issues.apache.org/jira/browse/HDFS-15576
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-15576.001.patch, HDFS-15576.002.patch
>
>
> * Add rs and rs-legacy codec test for  TestErasureCodingCLI
> * Add comments for failed test RS
> * Modify UT, change "RS" to "rs", because "RS" is not supported 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15576) Erasure Coding: Add rs and rs-legacy codec test for addPolicies

2020-09-15 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196095#comment-17196095
 ] 

Takanobu Asanuma commented on HDFS-15576:
-

Thanks for updating the patch. +1 on [^HDFS-15576.002.patch], pending Jenkins.

> Erasure Coding: Add rs and rs-legacy codec test for addPolicies
> ---
>
> Key: HDFS-15576
> URL: https://issues.apache.org/jira/browse/HDFS-15576
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-15576.001.patch, HDFS-15576.002.patch
>
>
> * Add rs and rs-legacy codec test for  TestErasureCodingCLI
> * Add comments for failed test RS
> * Modify UT, change "RS" to "rs", because "RS" is not supported 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15576) Erasure Coding: Add test addPolicies to ECAdmin

2020-09-15 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195936#comment-17195936
 ] 

Takanobu Asanuma commented on HDFS-15576:
-

Thanks for your patch, [~ferhui]. Thanks for pinging me, [~aajisaka].

About {{TestECAdmin#testAddPolicies()}}, we already have similar tests in 
TestErasureCodingCLI, which is the end-to-end test for ec commands.
 How about adding the schemas of RS k12m4 / RS-legacy k12m4 to the 
test_ec_policies.xml that TestErasureCodingCLI is using?

> Erasure Coding: Add test addPolicies to ECAdmin
> ---
>
> Key: HDFS-15576
> URL: https://issues.apache.org/jira/browse/HDFS-15576
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-15576.001.patch
>
>
> * Add UT TestECAdmin#testAddPolicies
> * Modify UT, change "RS" to "rs", because "RS" is not supported 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15577) Refactor TestTracing

2020-09-14 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15577:

Status: Patch Available  (was: Open)

> Refactor TestTracing
> 
>
> Key: HDFS-15577
> URL: https://issues.apache.org/jira/browse/HDFS-15577
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are some unused imports, unused variables, and checkstyle warnings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15577) Refactor TestTracing

2020-09-14 Thread Takanobu Asanuma (Jira)
Takanobu Asanuma created HDFS-15577:
---

 Summary: Refactor TestTracing
 Key: HDFS-15577
 URL: https://issues.apache.org/jira/browse/HDFS-15577
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


There are some unused imports, unused variables, and checkstyle warnings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15572) RBF: Quota updating every 1 min once, if I setquota 50 on mount path, in a minute i am able to write the files morethan quota in mount path.

2020-09-11 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194407#comment-17194407
 ] 

Takanobu Asanuma commented on HDFS-15572:
-

If I understand correctly, RBF doesn't guarantee strong consistency for mount 
table operations.

> RBF: Quota updating every 1 min once, if I setquota 50 on mount path, in a 
> minute i am able to write the files morethan quota in mount path. 
> -
>
> Key: HDFS-15572
> URL: https://issues.apache.org/jira/browse/HDFS-15572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Harshakiran Reddy
>Assignee: Hemanth Boyina
>Priority: Major
>
> In the Router State Store, the quota is updated only once per minute. If I set a 
> quota of 50 on a mount path, I am able to write more files than the quota allows 
> within that minute, so the quota is not enforced.
> {noformat}
> 1. Create destination dirs in the namespaces
> 2. Create a mount path with multiple destinations
> 3. Setquota 50 on the mount path
> 4. Write more than 50 files to the mount path within a minute
> {noformat}
> *Expected Result:*
> After setquota, the mount path should not allow more than the quota at any cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15571) RBF: When we set the quota on mount Path which is having 4 Destinations, count command shows the 4 as dir count, it should be 1 as per mount entry.

2020-09-11 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194364#comment-17194364
 ] 

Takanobu Asanuma commented on HDFS-15571:
-

Agreed. I feel it should be 1 at a mount point.

> RBF: When we set the quota on mount Path which is having 4 Destinations, 
> count command shows the 4 as dir count, it should be 1 as per mount entry.
> ---
>
> Key: HDFS-15571
> URL: https://issues.apache.org/jira/browse/HDFS-15571
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Harshakiran Reddy
>Assignee: Hemanth Boyina
>Priority: Major
>
> When we set the quota on a mount path which has 4 destinations, the count 
> command shows 4 as the dir count; it should be 1 from the end user's perspective.
> {code:java}
> 1. Create dir's in 4 NS.
> 2. Create Mount path  with 4 destinations with order HASH_ALL
> 3. Setquota 10 on Mount path
> 4. Get quota on mount path  
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15510) RBF: Quota and Content Summary was not correct in Multiple Destinations

2020-08-26 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15510:

Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for your contribution, [~hemanthboyina].

Thanks for your review and discussion, [~elgoiri], [~brahmareddy], [~ayushtkn].

> RBF: Quota and Content Summary was not correct in Multiple Destinations
> ---
>
> Key: HDFS-15510
> URL: https://issues.apache.org/jira/browse/HDFS-15510
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Critical
> Fix For: 3.4.0
>
> Attachments: 15510.png, HDFS-15510.001.patch, HDFS-15510.002.patch, 
> HDFS-15510.003.patch, HDFS-15510.004.patch, HDFS-15510.005.patch, 
> HDFS-15510.006.patch
>
>
> Steps:
> *) Create a mount entry with multiple destinations (say, 2).
> *) Set the NS quota to 10 on the mount entry with the dfsrouteradmin command; the 
> Content Summary on the mount entry then shows the NS quota as 20.
> *) Create 10 files through the router; on creating the 11th file, an NS Quota 
> Exceeded exception is thrown.
> Although the Content Summary shows the NS quota as 20, we are not able to 
> create 20 files.
>  
> The problem here is that the router stores the mount entry's NS quota as 10 but 
> sets an NS quota of 10 on both name services, so the content summary on the 
> mount entry aggregates the content summaries of both name services, making the 
> NS quota appear as 20.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15510) RBF: Quota and Content Summary was not correct in Multiple Destinations

2020-08-26 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185502#comment-17185502
 ] 

Takanobu Asanuma commented on HDFS-15510:
-

+1.

> RBF: Quota and Content Summary was not correct in Multiple Destinations
> ---
>
> Key: HDFS-15510
> URL: https://issues.apache.org/jira/browse/HDFS-15510
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Critical
> Attachments: 15510.png, HDFS-15510.001.patch, HDFS-15510.002.patch, 
> HDFS-15510.003.patch, HDFS-15510.004.patch, HDFS-15510.005.patch, 
> HDFS-15510.006.patch
>
>
> Steps:
> *) Create a mount entry with multiple destinations (say, 2).
> *) Set the NS quota to 10 on the mount entry with the dfsrouteradmin command; the 
> Content Summary on the mount entry then shows the NS quota as 20.
> *) Create 10 files through the router; on creating the 11th file, an NS Quota 
> Exceeded exception is thrown.
> Although the Content Summary shows the NS quota as 20, we are not able to 
> create 20 files.
>  
> The problem here is that the router stores the mount entry's NS quota as 10 but 
> sets an NS quota of 10 on both name services, so the content summary on the 
> mount entry aggregates the content summaries of both name services, making the 
> NS quota appear as 20.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15510) RBF: Quota and Content Summary was not correct in Multiple Destinations

2020-08-26 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185210#comment-17185210
 ] 

Takanobu Asanuma commented on HDFS-15510:
-

[~hemanthboyina] Looks good to me except for the test. Could you fix it?

> RBF: Quota and Content Summary was not correct in Multiple Destinations
> ---
>
> Key: HDFS-15510
> URL: https://issues.apache.org/jira/browse/HDFS-15510
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Critical
> Attachments: 15510.png, HDFS-15510.001.patch, HDFS-15510.002.patch, 
> HDFS-15510.003.patch, HDFS-15510.004.patch, HDFS-15510.005.patch, 
> HDFS-15510.006.patch
>
>
> Steps:
> *) Create a mount entry with multiple destinations (say, 2).
> *) Set the NS quota to 10 on the mount entry with the dfsrouteradmin command; the 
> Content Summary on the mount entry then shows the NS quota as 20.
> *) Create 10 files through the router; on creating the 11th file, an NS Quota 
> Exceeded exception is thrown.
> Although the Content Summary shows the NS quota as 20, we are not able to 
> create 20 files.
>  
> The problem here is that the router stores the mount entry's NS quota as 10 but 
> sets an NS quota of 10 on both name services, so the content summary on the 
> mount entry aggregates the content summaries of both name services, making the 
> NS quota appear as 20.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15538) Fix the documentation for dfs.namenode.replication.max-streams-hard-limit in hdfs-default.xml

2020-08-23 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182900#comment-17182900
 ] 

Takanobu Asanuma commented on HDFS-15538:
-

Sorry, I'm cancelling my last vote.

[~risyomei] maxReplicationStreams corresponds to dfs.namenode.replication.max-streams. 
Could you fix the description of dfs.namenode.replication.max-streams?

> Fix the documentation for dfs.namenode.replication.max-streams-hard-limit in 
> hdfs-default.xml
> -
>
> Key: HDFS-15538
> URL: https://issues.apache.org/jira/browse/HDFS-15538
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-15538.001.patch
>
>
> The description of dfs.namenode.replication.max-streams-hard-limit in 
> hdfs-default.xml is misleading.
> The maxReplicationStreams is not limiting the replication streams with the 
> highest priority.
> [https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15538) Fix the documentation for dfs.namenode.replication.max-streams-hard-limit in hdfs-default.xml

2020-08-23 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182896#comment-17182896
 ] 

Takanobu Asanuma commented on HDFS-15538:
-

+1 on [^HDFS-15538.001.patch].

> Fix the documentation for dfs.namenode.replication.max-streams-hard-limit in 
> hdfs-default.xml
> -
>
> Key: HDFS-15538
> URL: https://issues.apache.org/jira/browse/HDFS-15538
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-15538.001.patch
>
>
> The description of dfs.namenode.replication.max-streams-hard-limit in 
> hdfs-default.xml is misleading.
> The maxReplicationStreams is not limiting the replication streams with the 
> highest priority.
> [https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2463-L2471]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15510) RBF: Quota and Content Summary was not correct in Multiple Destinations

2020-08-17 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179163#comment-17179163
 ] 

Takanobu Asanuma commented on HDFS-15510:
-

Thanks for finding the issue, [~hemanthboyina].

Agreed with [~brahmareddy]. I prefer 3). In this case, I think the quota in the 
content summary should be 10, and the quota for each subcluster should also 
be 10, since the distribution can be skewed and, at worst, one subcluster will 
hold all 10 files.

> RBF: Quota and Content Summary was not correct in Multiple Destinations
> ---
>
> Key: HDFS-15510
> URL: https://issues.apache.org/jira/browse/HDFS-15510
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Critical
> Attachments: 15510.png
>
>
> Steps:
> *) Create a mount entry with multiple destinations (say, 2).
> *) Set the NS quota to 10 on the mount entry with the dfsrouteradmin command; the 
> Content Summary on the mount entry then shows the NS quota as 20.
> *) Create 10 files through the router; on creating the 11th file, an NS Quota 
> Exceeded exception is thrown.
> Although the Content Summary shows the NS quota as 20, we are not able to 
> create 20 files.
>  
> The problem here is that the router stores the mount entry's NS quota as 10 but 
> sets an NS quota of 10 on both name services, so the content summary on the 
> mount entry aggregates the content summaries of both name services, making the 
> NS quota appear as 20.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15512) Remove smallBufferSize in DFSClient

2020-08-06 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172749#comment-17172749
 ] 

Takanobu Asanuma commented on HDFS-15512:
-

Thanks for committing it, [~hemanthboyina].

> Remove smallBufferSize in DFSClient
> ---
>
> Key: HDFS-15512
> URL: https://issues.apache.org/jira/browse/HDFS-15512
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Minor
> Fix For: 3.4.0
>
>
> It seems to be an unused variable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15512) Remove smallBufferSize in DFSClient

2020-08-04 Thread Takanobu Asanuma (Jira)
Takanobu Asanuma created HDFS-15512:
---

 Summary: Remove smallBufferSize in DFSClient
 Key: HDFS-15512
 URL: https://issues.apache.org/jira/browse/HDFS-15512
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


It seems to be an unused variable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15502) Implement service-user feature in DecayRPCScheduler

2020-07-30 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167761#comment-17167761
 ] 

Takanobu Asanuma commented on HDFS-15502:
-

[~csun]

Oh, I didn't know about that jira. Thanks for letting me know.
This jira may be a duplicate, but I'll just move it to HADOOP for now.

> Implement service-user feature in DecayRPCScheduler
> ---
>
> Key: HDFS-15502
> URL: https://issues.apache.org/jira/browse/HDFS-15502
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
>
> In our cluster, we want to use FairCallQueue to limit heavy users, but we do 
> not want to restrict certain users who submit important requests. This jira 
> proposes to implement a service-user feature so that such users are always 
> scheduled into the high-priority queue.
> According to HADOOP-9640, the initial concept of FCQ included this feature, 
> but it was never implemented.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15502) Implement service-user feature in DecayRPCScheduler

2020-07-30 Thread Takanobu Asanuma (Jira)
Takanobu Asanuma created HDFS-15502:
---

 Summary: Implement service-user feature in DecayRPCScheduler
 Key: HDFS-15502
 URL: https://issues.apache.org/jira/browse/HDFS-15502
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


In our cluster, we want to use FairCallQueue to limit heavy users, but we do 
not want to restrict certain users who submit important requests. This jira 
proposes to implement a service-user feature so that such users are always 
scheduled into the high-priority queue.
According to HADOOP-9640, the initial concept of FCQ included this feature, but 
it was never implemented.
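
A minimal sketch of the intended behavior, using hypothetical names rather than 
the actual DecayRpcScheduler API:

{code:java}
// Sketch only: callers configured as service users always get the
// highest-priority queue (level 0), regardless of their recent call volume.
import java.util.Set;

public class ServiceUserSchedulerSketch {
  private final Set<String> serviceUsers;

  ServiceUserSchedulerSketch(Set<String> serviceUsers) {
    this.serviceUsers = serviceUsers;
  }

  /** Returns the priority level for a caller; 0 is the highest priority. */
  int getPriorityLevel(String userName, int decayBasedPriority) {
    if (serviceUsers.contains(userName)) {
      return 0;  // service users are never demoted by the decay accounting
    }
    return decayBasedPriority;
  }

  public static void main(String[] args) {
    ServiceUserSchedulerSketch scheduler =
        new ServiceUserSchedulerSketch(Set.of("hbase", "oozie"));
    System.out.println(scheduler.getPriorityLevel("hbase", 3)); // 0
    System.out.println(scheduler.getPriorityLevel("alice", 3)); // 3
  }
}
{code}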



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-12969) DfsAdmin listOpenFiles should report files by type

2020-07-28 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166361#comment-17166361
 ] 

Takanobu Asanuma edited comment on HDFS-12969 at 7/28/20, 11:43 AM:


Thanks for updating the patch, [~hemanthboyina].

If I understand correctly, [^HDFS-12969.003.patch] assumes that the iteration 
order of the loop in Fsnamesystem#getFilesBlockingDecom matches the order of 
the corresponding subset of LeaseManager.leasesById. Does this assumption 
always hold true?


was (Author: tasanuma0829):
Thanks for updating the patch, [~hemanthboyina].

If I understand correctly, this patch assumes that the iteration order of the 
loop in Fsnamesystem#getFilesBlockingDecom matches the order of the 
corresponding subset of LeaseManager.leasesById. Does this assumption always 
hold true?

> DfsAdmin listOpenFiles should report files by type
> --
>
> Key: HDFS-12969
> URL: https://issues.apache.org/jira/browse/HDFS-12969
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: Manoj Govindassamy
>Assignee: Hemanth Boyina
>Priority: Major
> Attachments: HDFS-12969.001.patch, HDFS-12969.002.patch, 
> HDFS-12969.003.patch
>
>
> HDFS-11847 introduced a new option, {{-blockingDecommission}}, to the existing 
> command {{dfsadmin -listOpenFiles}}. But the reporting done by the command 
> doesn't differentiate the files by type (such as blocking decommission). In 
> order to change the reporting style, the proto format used for the base 
> command has to be updated to carry additional fields, which is better done in 
> a new jira outside of HDFS-11847. This jira is to track the end-to-end 
> enhancements needed for the dfsadmin -listOpenFiles console output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12969) DfsAdmin listOpenFiles should report files by type

2020-07-28 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166361#comment-17166361
 ] 

Takanobu Asanuma commented on HDFS-12969:
-

Thanks for updating the patch, [~hemanthboyina].

If I understand correctly, this patch assumes that the iteration order of the 
loop in Fsnamesystem#getFilesBlockingDecom matches the order of the 
corresponding subset of LeaseManager.leasesById. Does this assumption always 
hold true?
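
A small illustration of the property in question, with hypothetical data (this 
is not the LeaseManager or FSNamesystem code):

{code:java}
// Sketch only: if both loops walk the same sorted structure (e.g. a map keyed
// by inode id), the filtered result is an order-preserving subsequence of the
// full listing; a hash-based map would give no such guarantee.
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class LeaseOrderSketch {
  public static void main(String[] args) {
    // Hypothetical leases keyed by inode id.
    SortedMap<Long, String> leasesById = new TreeMap<>();
    leasesById.put(1003L, "client-a");
    leasesById.put(1001L, "client-b");
    leasesById.put(1002L, "client-c");

    // Full iteration: all open files.
    List<Long> allIds = new ArrayList<>(leasesById.keySet());

    // Filtered iteration: a stand-in predicate for "blocks decommission".
    List<Long> blockingIds = new ArrayList<>();
    for (Long id : leasesById.keySet()) {
      if (id % 2 == 1) {
        blockingIds.add(id);
      }
    }

    System.out.println(allIds);      // [1001, 1002, 1003]
    System.out.println(blockingIds); // [1001, 1003]
  }
}
{code}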

> DfsAdmin listOpenFiles should report files by type
> --
>
> Key: HDFS-12969
> URL: https://issues.apache.org/jira/browse/HDFS-12969
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: Manoj Govindassamy
>Assignee: Hemanth Boyina
>Priority: Major
> Attachments: HDFS-12969.001.patch, HDFS-12969.002.patch, 
> HDFS-12969.003.patch
>
>
> HDFS-11847 introduced a new option, {{-blockingDecommission}}, to the existing 
> command {{dfsadmin -listOpenFiles}}. But the reporting done by the command 
> doesn't differentiate the files by type (such as blocking decommission). In 
> order to change the reporting style, the proto format used for the base 
> command has to be updated to carry additional fields, which is better done in 
> a new jira outside of HDFS-11847. This jira is to track the end-to-end 
> enhancements needed for the dfsadmin -listOpenFiles console output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


