[ 
https://issues.apache.org/jira/browse/HDDS-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDDS-8769.
-----------------------------------
    Resolution: Cannot Reproduce

OK, I think this one is no longer an issue.

Although HDDS-9130 is not resolved yet, I can see a Ratis log of 31 MB that 
contains 216023 transactions, and at the current rate it takes several hours 
to fill up this log file.

> [hsync] disk usage thread aborts if ratis log rolls very quickly
> ----------------------------------------------------------------
>
>                 Key: HDDS-8769
>                 URL: https://issues.apache.org/jira/browse/HDDS-8769
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> The Ratis log file corresponding to an HBase WAL block rolls very quickly.
> The disk usage thread aborts because the log file is renamed while du is 
> running, and the DN is then unable to get the correct disk usage.
> {noformat}
> 2023-06-05 08:44:55,462 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: created new log segment /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186383
> 2023-06-05 08:44:55,514 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315-server-thread16] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: Rolling segment log-186383_186396 to index:186396
> 2023-06-05 08:44:55,516 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: Rolled log segment from /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186383 to /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_186383-186396
> 2023-06-05 08:44:55,517 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: created new log segment /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186397
> 2023-06-05 08:44:55,570 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315-server-thread18] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: Rolling segment log-186397_186411 to index:186411
> 2023-06-05 08:44:55,572 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: Rolled log segment from /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186397 to /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_186397-186411
> 2023-06-05 08:44:55,573 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: created new log segment /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186412
> 2023-06-05 08:44:55,644 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315-server-thread18] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: Rolling segment log-186412_186434 to index:186434
> 2023-06-05 08:44:55,646 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: Rolled log segment from /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186412 to /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_186412-186434
> 2023-06-05 08:44:55,647 [37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker] INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: 37d8fb56-9f29-4cd6-b9e1-dcdbef05b315@group-133D49B637D1-SegmentedRaftLogWorker: created new log segment /var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186435
> 2023-06-05 08:44:55,673 [DiskUsage-/var/lib/hadoop-ozone/datanode/ratis/data- ] WARN org.apache.hadoop.hdds.fs.CachingSpaceUsageSource: Error refreshing space usage for /var/lib/hadoop-ozone/datanode/ratis/data
> java.io.UncheckedIOException: ExitCodeException exitCode=1: du: cannot access ‘/var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186383’: No such file or directory
>         at org.apache.hadoop.hdds.fs.DU$DUShell.getUsed(DU.java:94)
>         at org.apache.hadoop.hdds.fs.AbstractSpaceUsageSource.time(AbstractSpaceUsageSource.java:56)
>         at org.apache.hadoop.hdds.fs.DU.getUsedSpace(DU.java:63)
>         at org.apache.hadoop.hdds.fs.CachingSpaceUsageSource.refresh(CachingSpaceUsageSource.java:140)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: ExitCodeException exitCode=1: du: cannot access ‘/var/lib/hadoop-ozone/datanode/ratis/data/39885220-c182-47d3-ade0-133d49b637d1/current/log_inprogress_186383’: No such file or directory
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1008)
>         at org.apache.hadoop.util.Shell.run(Shell.java:901)
>         at org.apache.hadoop.hdds.fs.DU$DUShell.getUsed(DU.java:91)
>         ... 10 more
> {noformat}
> The workaround is to use DF instead of DU to calculate disk usage 
> (hdds.datanode.du.factory=org.apache.hadoop.hdds.fs.DedicatedDiskSpaceUsageFactory)
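
For illustration only, the failure mode in the trace above (a file listed by the scan is renamed before its size is read) can be tolerated by skipping files that vanish mid-scan. The following is a minimal sketch, not Ozone's implementation: the class TolerantDiskUsage and the method usedBytes are hypothetical names, and it sums apparent file sizes with java.nio rather than the allocated blocks that du reports.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper, not part of Ozone: sums file sizes under a directory. */
public final class TolerantDiskUsage {
  private TolerantDiskUsage() { }

  /** Returns the total apparent size in bytes of regular files under root. */
  public static long usedBytes(Path root) throws IOException {
    AtomicLong total = new AtomicLong();
    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        if (attrs.isRegularFile()) {
          total.addAndGet(attrs.size());
        }
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult visitFileFailed(Path file, IOException e)
          throws IOException {
        // A file that was listed but renamed or deleted before its attributes
        // were read (e.g. log_inprogress_N rolled to log_N-M) is skipped
        // instead of aborting the whole scan.
        if (e instanceof NoSuchFileException) {
          return FileVisitResult.CONTINUE;
        }
        throw e; // any other I/O error is still reported to the caller
      }
    });
    return total.get();
  }
}
```

A caller would invoke TolerantDiskUsage.usedBytes(Paths.get("/var/lib/hadoop-ozone/datanode/ratis/data")). The result is a point-in-time approximation, since a segment renamed during the walk may be missed or counted under either name.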



