[
https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136600#comment-14136600
]
Colin Patrick McCabe commented on HDFS-6727:
--------------------------------------------
Thinking about this a little more, we might have timeouts if the reconfigurable
API does everything synchronously. So I think we need to modify the existing
API to just start a reconfiguration, and add another API that queries what keys
are currently being reconfigured.
From the command line, this would look like:
{code}
$ hdfs reconfigure -datanode -start
Datanode reconfiguration started at 12:34 PM, Dec 12 2014
{code}
followed by
{code}
hdfs reconfigure -datanode -list
Datanode is in the process of reconfiguring dfs.data.dirs and
foo.bar.baz
{code}
later....
{code}
hdfs reconfigure -datanode -list
Datanode reconfiguration completed at 1:46 PM, Dec 12 2014
{code}
In the logs, we'd see information about whether the storage directories were
all successfully reconfigured.
If the user tried to reconfigure while another reconfiguration was going on,
we'd get:
{code}
$ hdfs reconfigure -datanode -start
Can't start a new reconfiguration because the one started at 12:34 PM, Dec 12
2014 is still going on
{code}
I think we can do this in a follow-up change, though.
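To make the start/list split concrete, here is a minimal sketch of the state a DataNode could keep to back those two commands. This is purely illustrative for the follow-up idea; the class and method names (ReconfigurationTracker, tryStart, listInProgress, finish) are hypothetical and not part of the existing reconfiguration framework.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: tracks one in-flight reconfiguration so that
// "-start" can be rejected while a run is in progress and "-list" can
// report which keys are still being reconfigured.
class ReconfigurationTracker {
  // Keys currently being reconfigured; null means no run in progress.
  private final AtomicReference<List<String>> inProgress =
      new AtomicReference<>(null);
  private volatile long startTimeMs;

  /** Returns false if a reconfiguration is already running. */
  boolean tryStart(List<String> changedKeys) {
    if (!inProgress.compareAndSet(null, changedKeys)) {
      return false;  // previous run still going on
    }
    startTimeMs = System.currentTimeMillis();
    return true;
  }

  /** When the current run started, for the status messages. */
  long startTimeMs() {
    return startTimeMs;
  }

  /** Keys still being reconfigured, for the -list command. */
  List<String> listInProgress() {
    List<String> keys = inProgress.get();
    return keys == null ? Collections.<String>emptyList() : keys;
  }

  /** Called by the background worker thread when it finishes. */
  void finish() {
    inProgress.set(null);
  }
}
```

The compare-and-set on start is what produces the "Can't start a new reconfiguration" behavior without holding a lock across the whole (possibly slow) reconfiguration.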
Back to the patch itself....
{code}
+  public synchronized List<StorageLocation> addVolumes(
+      final List<StorageLocation> volumes,
+      final Collection<String> bpids, final Configuration conf)
+      throws IOException {
{code}
Let's change this to not throw IOException. It's confusing to report failures
both through the return value (the volumes that were successfully added) and
through an IOE. I would pick one channel... just the list of volumes.
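A sketch of what that single-channel version might look like, with a minimal stand-in for StorageLocation so it compiles outside the DataNode; activateVolume is a hypothetical placeholder for the real per-volume setup, not an existing method.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for the real StorageLocation, for illustration only.
class StorageLocation {
  final String path;
  StorageLocation(String path) { this.path = path; }
}

class FsDatasetSketch {
  /** Attempts to add each volume; returns only those that succeeded.
   *  Failures are logged and reflected in the return value instead of
   *  being rethrown as an IOException. */
  synchronized List<StorageLocation> addVolumes(List<StorageLocation> volumes) {
    List<StorageLocation> succeeded = new ArrayList<>();
    for (StorageLocation vol : volumes) {
      try {
        activateVolume(vol);       // may fail for an individual volume
        succeeded.add(vol);
      } catch (Exception e) {
        // Log and continue; the caller compares the returned list
        // against its input to see which volumes failed.
      }
    }
    return succeeded;
  }

  // Hypothetical placeholder for the real per-volume setup work.
  private void activateVolume(StorageLocation vol) throws Exception {
    if (vol.path.isEmpty()) {
      throw new Exception("bad volume: " + vol.path);
    }
  }
}
```

The caller can then diff the returned list against its input to find the failed volumes, with no ambiguity about whether an exception also means partial success.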
{code}
final List<IOException> exceptions = Collections.synchronizedList(
    new ArrayList<IOException>());
List<Thread> volumeLoadingThreads = Lists.newArrayList();
for (final String bpid : bpids) {
  // Multi threading
  Thread t = new Thread() {
    public void run() {
      try {
        fsVolume.addBlockPool(bpid, conf);
        fsVolume.getVolumeMap(bpid, tempVolumeMap);
      } catch (IOException e) {
        LOG.warn("Caught exception when adding " + fsVolume +
            ". Will throw later.", e);
        exceptions.add(e);
      }
    }
  };
  volumeLoadingThreads.add(t);
  t.start();
}
for (Thread t : volumeLoadingThreads) {
  try {
    t.join();
  } catch (InterruptedException ie) {
    exceptions.add(new IOException(ie));
  }
}
{code}
We're already creating a thread per volume... I think multiplying that by a
thread per bpid is too much. Keep in mind that when dealing with a volume,
having multiple threads will be less useful since we'll often be I/O bound (a
single volume is normally a single disk). So let's just have a for loop here,
not multiple threads.
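The suggested plain loop might look like the sketch below. The addBlockPool call follows the quoted patch, but the volume type here is a stand-in interface so the fragment is self-contained; the real method also takes a Configuration and a volume map, omitted for brevity.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the per-volume API in the quoted patch.
interface VolumeSketch {
  void addBlockPool(String bpid) throws IOException;
}

class BlockPoolLoader {
  /** Sequentially loads each block pool on one volume, collecting
   *  exceptions instead of spawning a thread per bpid. A single
   *  volume is normally a single disk, so the work is I/O bound and
   *  gains little from parallelism. */
  static List<IOException> loadBlockPools(VolumeSketch volume,
                                          List<String> bpids) {
    List<IOException> exceptions = new ArrayList<>();
    for (String bpid : bpids) {
      try {
        volume.addBlockPool(bpid);   // one disk, one thread
      } catch (IOException e) {
        exceptions.add(e);           // keep going; report at the end
      }
    }
    return exceptions;
  }
}
```

This keeps the thread-per-volume parallelism from the outer code while dropping the thread-per-bpid layer.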
Looks pretty good aside from that.
> Refresh data volumes on DataNode based on configuration changes
> ---------------------------------------------------------------
>
> Key: HDFS-6727
> URL: https://issues.apache.org/jira/browse/HDFS-6727
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Affects Versions: 2.5.0, 2.4.1
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Labels: datanode
> Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.001.patch,
> HDFS-6727.002.patch, HDFS-6727.003.patch, HDFS-6727.combo.patch
>
>
> HDFS-1362 requires DataNode to reload configuration file during the runtime,
> so that DN can change the data volumes dynamically. This JIRA reuses the
> reconfiguration framework introduced by HADOOP-7001 to enable DN to
> reconfigure at runtime.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)