[jira] [Updated] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2024-01-27 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-17003:
--
Affects Version/s: 3.3.6, 3.4.0

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.4.0, 3.3.6
> Reporter: farmmamba
> Assignee: farmmamba
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause data 
> loss. Some logs from our production cluster are shown below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: datanode2:50010
> {code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not found.
> {code}
>  
> This phenomenon can be reproduced.
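
The two block IDs in the logs differ by exactly one, which suggests they are sibling internal blocks of the same erasure-coded block group. The sketch below is only an illustration, not the NameNode's code: it assumes that a striped block ID keeps the index of the internal block in its low four bits (mask 0xF), and the class and method names are hypothetical. Under that assumption, the block that datanode1 reported as bad decodes to index 0, while the replica datanode1 was then asked to delete decodes to index 1 of the same group.

```java
// Illustrative only: decodes the block IDs from the logs under the
// assumption that the low 4 bits of a striped block ID hold the index of
// the internal block within its block group. All names are hypothetical.
public class StripedBlockIdSketch {
    // Assumed mask for the internal block index (low 4 bits).
    static final long INDEX_MASK = 0xFL;

    // Block group ID: the block ID with the index bits cleared.
    static long blockGroupId(long blockId) {
        return blockId & ~INDEX_MASK;
    }

    // Index of the internal block within its block group.
    static int indexInGroup(long blockId) {
        return (int) (blockId & INDEX_MASK);
    }

    public static void main(String[] args) {
        long reported = -9223372036848404320L; // reported bad by datanode1
        long deleted = -9223372036848404319L;  // replica datanode1 was asked to delete

        System.out.println("same block group: "
            + (blockGroupId(reported) == blockGroupId(deleted)));
        System.out.println("reported index: " + indexInGroup(reported));
        System.out.println("deleted index: " + indexInGroup(deleted));
    }
}
```

Read this way, the delete request names a different internal block than the one datanode1 reported, which would be consistent with the "ReplicaInfo not found" message in its log.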



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-17003:
---
Fix Version/s: 3.3.6

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: farmmamba
> Assignee: farmmamba
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6






[jira] [Updated] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-17003:
---
Component/s: namenode

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: farmmamba
> Assignee: farmmamba
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.4.0






[jira] [Updated] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-17003:
---
Summary: Erasure Coding: invalidate wrong block after reporting bad blocks 
from datanode  (was: Erasure coding: invalidate wrong block after reporting bad 
blocks from datanode)

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: farmmamba
> Assignee: farmmamba
> Priority: Critical
> Labels: pull-request-available






[jira] [Updated] (HDFS-17003) Erasure coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-06 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-17003:
---
Target Version/s: 3.3.6

> Erasure coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: farmmamba
> Assignee: farmmamba
> Priority: Critical
> Labels: pull-request-available






[jira] [Updated] (HDFS-17003) Erasure coding: invalidate wrong block after reporting bad blocks from datanode

2023-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17003:
--
Labels: pull-request-available  (was: )

> Erasure coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: farmmamba
> Priority: Critical
> Labels: pull-request-available






[jira] [Updated] (HDFS-17003) Erasure coding: invalidate wrong block after reporting bad blocks from datanode

2023-05-09 Thread farmmamba (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

farmmamba updated HDFS-17003:
-
Description: 
After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
the wrong block to invalidate. This is dangerous behaviour and may cause data 
loss. Some logs from our production cluster are shown below:

NameNode log:
{code:java}
2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: datanode1:50010
2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: datanode2:50010
{code}
datanode1 log:
{code:java}
2023-05-08 21:23:49,088 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on /data7/hadoop/hdfs/datanode
2023-05-08 21:24:00,509 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not found.
{code}

This phenomenon can be reproduced.

  was:
After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
the wrong block to invalidate. This is dangerous behaviour and may cause data 
loss. Some logs from our production cluster are shown below:

NameNode log:
{code:java}
2023-05-08 14:39:42,241 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on datanode: datanode1:50010
{code}
datanode1 log:
{code:java}
2023-05-08 14:39:42,183 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on /data1/hadoop/hdfs/datanode
2023-05-08 14:39:47,338 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036846808879_1669008: ReplicaInfo not found.
{code}

This phenomenon can be reproduced.


> Erasure coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: farmmamba
> Priority: Critical






[jira] [Updated] (HDFS-17003) Erasure coding: invalidate wrong block after reporting bad blocks from datanode

2023-05-08 Thread farmmamba (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

farmmamba updated HDFS-17003:
-
Description: 
After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
the wrong block to invalidate. This is dangerous behaviour and may cause data 
loss. Some logs from our production cluster are shown below:

NameNode log:
{code:java}
2023-05-08 14:39:42,241 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on datanode: datanode1:50010
{code}
datanode1 log:
{code:java}
2023-05-08 14:39:42,183 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on /data1/hadoop/hdfs/datanode
2023-05-08 14:39:47,338 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036846808879_1669008: ReplicaInfo not found.
{code}

This phenomenon can be reproduced.

  was:
After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
the wrong block to invalidate. This is dangerous behaviour and may cause data 
loss. Some logs from our production cluster are shown below:

NameNode log:
{code:java}
2023-05-08 14:39:42,241 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on datanode: datanode1:50010
{code}
datanode1 log:
{code:java}
2023-05-08 14:39:42,183 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on /data1/hadoop/hdfs/datanode
2023-05-08 14:39:47,338 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036846808879_1669008: ReplicaInfo not found.
{code}


> Erasure coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: farmmamba
> Priority: Critical


