[
https://issues.apache.org/jira/browse/HDFS-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145827#comment-14145827
]
Lei (Eddy) Xu commented on HDFS-6877:
-------------------------------------
[~cmccabe] Thank you very much for your quick reviews.
I was anticipating the case where the application writes the block very
slowly. For instance, a light-traffic service opens a log file and appends only
10KB of logs each time it wakes up, every 30 seconds. Or an application dumps
intermediate results (e.g., 1MB) every 10 minutes. In such cases, the
BlockReceiver thread can live for a long time, until the block is full and is
then closed. But from the user's point of view, once {{hdfs dfsadmin -reconfig
status}} indicates that the reconfiguration task has finished, the user might
expect to be able to physically remove the disks at any time. Since the
BlockReceiver thread stays alive for an indeterminate time and holds an open
file descriptor on the disk, it might be difficult to explain to the user why
the disk cannot be unmounted ({{umount}}) and how long they should wait.
{code}
for (ReplicaInPipeline pip : activeWriters) {
  try {
    LOG.info("Stopping active writer.");
    pip.stopWriter(datanode.getDnConf().getXceiverStopTimeout());
  } catch (IOException e) {
    LOG.warn("IOException when stopping active writer.", e);
  }
}
{code}
The reasons I moved the above out of the synchronized section are:
# {{stopWriter()}} calls {{interrupt()}} and {{join()}}, which are slow, so I
tried to make the synchronized section in {{removeVolumes()}} shorter.
#.
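To illustrate the first point, here is a minimal, self-contained sketch of the
pattern: snapshot the active writers while holding the lock, then do the slow
{{interrupt()}} / {{join()}} work after releasing it. The class
{{RemoveVolumeSketch}}, its {{Writer}} helper, and the timeout values are
hypothetical stand-ins for illustration only; this is not the code in the
patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names do not correspond to real HDFS classes.
class RemoveVolumeSketch {
    /** Stand-in for a replica that has an active writer thread. */
    static class Writer {
        private final Thread thread;

        Writer(Thread thread) {
            this.thread = thread;
        }

        /** Interrupt the writer thread and wait up to timeoutMs for it to exit. */
        void stopWriter(long timeoutMs) throws InterruptedException {
            thread.interrupt();
            thread.join(timeoutMs);
        }

        boolean isAlive() {
            return thread.isAlive();
        }
    }

    private final Object lock = new Object();
    private final List<Writer> writers = new ArrayList<>();

    void addWriter(Writer w) {
        synchronized (lock) {
            writers.add(w);
        }
    }

    /** Returns how many writers had exited by the time their join() returned. */
    int removeVolume(long timeoutMs) throws InterruptedException {
        List<Writer> active;
        synchronized (lock) {
            // Fast part: only snapshot and clear the bookkeeping under the lock.
            active = new ArrayList<>(writers);
            writers.clear();
        }
        // Slow part: interrupt() and join() run without holding the lock.
        int stopped = 0;
        for (Writer w : active) {
            w.stopWriter(timeoutMs);
            if (!w.isAlive()) {
                stopped++;
            }
        }
        return stopped;
    }

    public static void main(String[] args) throws InterruptedException {
        RemoveVolumeSketch dataset = new RemoveVolumeSketch();
        for (int i = 0; i < 3; i++) {
            // Simulates a slow writer that would otherwise stay alive for minutes.
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(600_000);
                } catch (InterruptedException e) {
                    // Interrupted by stopWriter(): fall through and exit.
                }
            });
            t.setDaemon(true);
            t.start();
            dataset.addWriter(new Writer(t));
        }
        System.out.println("stopped=" + dataset.removeVolume(1000));
    }
}
```

With this shape, other threads that need the lock are blocked only for the
duration of the list copy, not for the sum of the join timeouts.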
> Avoid calling checkDisk and stop active BlockReceiver thread when an HDFS
> volume is removed during a write.
> -----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-6877
> URL: https://issues.apache.org/jira/browse/HDFS-6877
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Affects Versions: 2.5.0
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Attachments: HDFS-6877.000.consolidate.txt,
> HDFS-6877.000.delta-HDFS-6727.txt, HDFS-6877.001.combo.txt,
> HDFS-6877.001.patch, HDFS-6877.002.patch, HDFS-6877.003.patch,
> HDFS-6877.004.patch, HDFS-6877.005.patch
>
>
> Avoid calling checkDisk and stop active BlockReceiver thread when an HDFS
> volume is removed during a write.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)