Do you have a list of the files that were open? I'd like to know whether
those files were opened for writes or for reads.
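
If you don't have that list yet, a quick way to gather it on the affected host
is to list the descriptors held under the failed disk's data directory (the
path below is a placeholder; the FD column's r/w/u suffix shows whether each
file is open for read, write, or both):

lsof +D /data1/hdfs/data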

If you are on a more recent version of Hadoop (2.8.0 and above), there's an
HDFS command to interrupt ongoing writes on a DataNode (HDFS-9945
<https://issues.apache.org/jira/browse/HDFS-9945>):

https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#dfsadmin
hdfs dfsadmin -evictWriters
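
The command takes the DataNode's IPC address as its argument. A sketch,
assuming the default DataNode IPC port (50020) and a placeholder hostname:

# Force the DataNode to abort in-progress block writes so the disk can be pulled
hdfs dfsadmin -evictWriters dn01.example.com:50020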

Looking at the HDFS hot swap implementation, it looks like the DataNode
doesn't interrupt writers when a volume is removed. That sounds like a bug.
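
For reference, the hot swap path being discussed is the DataNode
reconfiguration flow: drop the failed directory from dfs.datanode.data.dir in
hdfs-site.xml and ask the DataNode to reload the setting. A sketch, with a
placeholder hostname and the default DataNode IPC port (50020):

# Apply the updated dfs.datanode.data.dir without restarting the DataNode
hdfs dfsadmin -reconfig datanode dn01.example.com:50020 start
# Check whether the reconfiguration task has finished
hdfs dfsadmin -reconfig datanode dn01.example.com:50020 status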

On Tue, May 28, 2019 at 9:39 PM Kang Minwoo <minwoo.k...@outlook.com> wrote:

> Hello, Users.
>
> I use JBOD for the data nodes. Sometimes a disk in a data node has a
> problem.
>
> The first time, I shut down all instances, including the data node and
> region server, on the machine that had the disk problem.
> But that is not a good solution, so I improved the process.
>
> Now, when I detect a disk problem on a server, I just perform a disk hot swap.
>
> But the system administrator complains that some FDs are still open, so they
> cannot remove the disk.
> The RegionServer holds these FDs because I use the short-circuit reads
> feature. (HBase version 1.2.9)
>
> When we first hit this issue, we force-unmounted the disk and remounted it.
> But after this process, the kernel reported an error[1].
>
> So now we avoid this issue by purging the stale FDs.
>
> I think this issue is common.
> I would like to know how HBase users deal with it.
>
> Thank you very much for sharing your experience.
>
> Best regards,
> Minwoo Kang
>
> [1]:
> https://www.thegeekdiary.com/xfs_log_force-error-5-returned-xfs-error-centos-rhel-7/
>
