[ 
https://issues.apache.org/jira/browse/HDFS-16806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ruiliang resolved HDFS-16806.
-----------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

> ec data balancer block blk_id The index error ,Data cannot be moved
> -------------------------------------------------------------------
>
>                 Key: HDFS-16806
>                 URL: https://issues.apache.org/jira/browse/HDFS-16806
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.1.0
>            Reporter: ruiliang
>            Priority: Critical
>         Attachments: image-2022-10-20-11-32-35-833.png
>
>
> When the Balancer moves erasure-coded (EC) data off a DataNode, it requests the wrong block ID (a block index error), so the data cannot be moved.
> DataNode 10.12.15.149 is at 100% disk usage.
>  
> {code:java}
> echo 10.12.15.149 > sourcehost
> balancer -fs hdfs://xxcluster06 -threshold 10 -source -f sourcehost 2>>~/balancer.log &
> {code}
>  
> The DataNode log on 10.12.15.149 is flooded with errors like the following:
> {code:java}
> ...
> 2022-10-19 14:43:02,031 ERROR datanode.DataNode (DataXceiver.java:run(321)) - fs-hiido-dn-12-15-149.xx.com:1019:DataXceiver error processing COPY_BLOCK operation  src: /10.12.65.216:58214 dst: /10.12.15.149:1019
> org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:492)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:256)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.copyBlock(DataXceiver.java:1089)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opCopyBlock(Receiver.java:291)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:113)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
>         at java.lang.Thread.run(Thread.java:748)
> ...    
>     
> hdfs fsck -fs hdfs://xxcluster06 -blockId blk_-9223372036799576592 
> Connecting to namenode via http://fs-hiido-xxcluster06-yynn2.xx.com:50070/fsck?ugi=hdfs&blockId=blk_-9223372036799576592+&path=%2F
> FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 at Wed Oct 19 14:47:15 CST 2022
> Block Id: blk_-9223372036799576592
> Block belongs to: /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz
> No. of Expected Replica: 5
> No. of live Replica: 5
> No. of excess Replica: 0
> No. of stale Replica: 5
> No. of decommissioned Replica: 0
> No. of decommissioning Replica: 0
> No. of corrupted Replica: 0
> Block replica on datanode/rack: fs-hiido-dn-12-66-4.xx.com/4F08-01-09 is HEALTHY
> Block replica on datanode/rack: fs-hiido-dn-12-65-244.xx.com/4F08-01-08 is HEALTHY
> Block replica on datanode/rack: fs-hiido-dn-12-15-149.xx.com/4F08-05-13 is HEALTHY
> Block replica on datanode/rack: fs-hiido-dn-12-65-218.xx.com/4F08-12-04 is HEALTHY
> Block replica on datanode/rack: fs-hiido-dn-12-17-35.xx.com/4F08-03-03 is HEALTHY
>
> hdfs fsck -fs hdfs://xxcluster06 /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz -files -blocks -locations
> Connecting to namenode via http://xx.com:50070/fsck?ugi=hdfs&files=1&blocks=1&locations=1&path=%2Fhive_warehouse%2Fwarehouse_old_snapshots%2Fyy_mbsdkevent_original%2Fdt%3D20210505%2Fpost_202105052129_33.log.gz
> FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 for path /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz at Wed Oct 19 14:48:42 CST 2022
> /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz 500582412 bytes, erasure-coded: policy=RS-3-2-1024k, 1 block(s):  OK
> 0. BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617 len=500582412 Live_repl=5
>   [blk_-9223372036799576592:DatanodeInfoWithStorage[10.12.17.35:1019,DS-3ccebf8d-5f05-45b5-ac7f-96d1cfb48608,DISK],
>    blk_-9223372036799576591:DatanodeInfoWithStorage[10.12.65.218:1019,DS-4f8e3114-7566-4cf1-ad5a-e454c8ea8805,DISK],
>    blk_-9223372036799576590:DatanodeInfoWithStorage[10.12.15.149:1019,DS-1dd55c27-8f47-46a6-935b-1d9024ca9188,DISK],
>    blk_-9223372036799576589:DatanodeInfoWithStorage[10.12.65.244:1019,DS-a9ffd747-c427-4aaa-8559-04cded7d9d5f,DISK],
>    blk_-9223372036799576588:DatanodeInfoWithStorage[10.12.66.4:1019,DS-d88f94db-6db1-4753-a652-780d7cd7f081,DISK]]
> Status: HEALTHY
>  Number of data-nodes:  62
>  Number of racks:               19
>  Total dirs:                    0
>  Total symlinks:                0
>
> Replicated Blocks:
>  Total size:    0 B
>  Total files:   0
>  Total blocks (validated):      0
>  Minimally replicated blocks:   0
>  Over-replicated blocks:        0
>  Under-replicated blocks:       0
>  Mis-replicated blocks:         0
>  Default replication factor:    3
>  Average block replication:     0.0
>  Missing blocks:                0
>  Corrupt blocks:                0
>  Missing replicas:              0
>
> Erasure Coded Block Groups:
>  Total size:    500582412 B
>  Total files:   1
>  Total block groups (validated):        1 (avg. block group size 500582412 B)
>  Minimally erasure-coded block groups:  1 (100.0 %)
>  Over-erasure-coded block groups:       0 (0.0 %)
>  Under-erasure-coded block groups:      0 (0.0 %)
>  Unsatisfactory placement block groups: 0 (0.0 %)
>  Average block group size:      5.0
>  Missing block groups:          0
>  Corrupt block groups:          0
>  Missing internal blocks:       0 (0.0 %)
> FSCK ended at Wed Oct 19 14:48:42 CST 2022 in 1 milliseconds
> The filesystem under path '/hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz' is HEALTHY
>  {code}
> {code:java}
> ssh 10.12.15.149
> ls -alh /data*/hadoop/dfs/data/current/*/current/finalized/subdir*/subdir*/blk_-9223372036799576592
> ls: cannot access '/data*/hadoop/dfs/data/current/*/current/finalized/subdir*/subdir*/blk_-9223372036799576592': No such file or directory
> ls -alh /data*/hadoop/dfs/data/current/*/current/finalized/subdir*/subdir*/blk_-9223372036799576590
> -rw-r--r-- 1 hdfs hdfs 159M Oct 18 20:02 /data7/hadoop/dfs/data/current/BP-1822992414-10.12.65.48-1660893388633/current/finalized/subdir10/subdir5/blk_-9223372036799576590
>  {code}
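> The Balancer should move blk_-9223372036799576590, the internal block that 10.12.15.149 actually stores, rather than request the block group ID blk_-9223372036799576592 from it.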
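
The fsck listing in the report shows five consecutive IDs (…592 through …588) for the five internal blocks of the RS-3-2 block group, with 10.12.15.149 holding the third. The following is a minimal sketch of that ID arithmetic, not code from Hadoop or from the fix. It assumes that the low 4 bits of a striped block ID encode the internal block's index within its group, which is consistent with the IDs in the report; the class and method names (`EcBlockIds`, `blockGroupId`, `blockIndex`, `internalBlockId`) are hypothetical.

```java
// Sketch only: illustrates the ID layout suggested by the fsck output above.
// It does not call any Hadoop API; all names here are hypothetical.
final class EcBlockIds {
    // Assumption: the low 4 bits of a striped (EC) block ID hold the index of
    // the internal block within its block group.
    static final long INDEX_MASK = 0xFL;

    /** Block group ID: the internal block ID with its index bits cleared. */
    static long blockGroupId(long internalBlockId) {
        return internalBlockId & ~INDEX_MASK;
    }

    /** Index of an internal block within its block group. */
    static int blockIndex(long internalBlockId) {
        return (int) (internalBlockId & INDEX_MASK);
    }

    /** ID of the internal block stored at the given index of a block group. */
    static long internalBlockId(long blockGroupId, int index) {
        if (index < 0 || index > INDEX_MASK) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        return blockGroupId + index;
    }

    public static void main(String[] args) {
        long requested = -9223372036799576592L; // ID in the COPY_BLOCK error
        long onDisk = -9223372036799576590L;    // ID found on 10.12.15.149

        System.out.println("index of on-disk block: " + blockIndex(onDisk));
        System.out.println("group of on-disk block: blk_" + blockGroupId(onDisk));
        System.out.println("requested ID is the group ID: "
                + (requested == blockGroupId(onDisk)));
        System.out.println("internal block at index 2: blk_"
                + internalBlockId(requested, 2));
    }
}
```

Under that assumption, the ID in the `ReplicaNotFoundException` is the block group ID (index 0, stored on 10.12.17.35), while the replica on 10.12.15.149 is the internal block at index 2, which matches the `ls` results in the report.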


