[ 
https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136600#comment-14136600
 ] 

Colin Patrick McCabe commented on HDFS-6727:
--------------------------------------------

Thinking about this a little more, we might hit timeouts if the reconfiguration 
API does everything synchronously.  So I think we need to modify the existing 
API to just start a reconfiguration, and add another API that queries which 
keys are currently being reconfigured.

From the command line, this would be like
{code}
$ hdfs reconfigure -datanode -start
Datanode reconfiguration started at 12:34 PM, Dec 12 2014
{code}
followed by
{code}
$ hdfs reconfigure -datanode -list
Datanode is in the process of reconfiguring dfs.data.dirs and foo.bar.baz
{code}

later....
{code}
$ hdfs reconfigure -datanode -list
Datanode reconfiguration completed at 1:46 PM, Dec 12 2014
{code}

In the logs, we'd see information about whether the storage directories were 
all successfully reconfigured.

If the user tried to reconfigure while another reconfiguration was going on, 
we'd get:
{code}
$ hdfs reconfigure -datanode -start
Can't start a new reconfiguration because the one started at 12:34 PM, Dec 12 
2014 is still going on
{code}

I think we can do this in a follow-up change, though.
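In Java terms, the start/query split might look roughly like the sketch below.  
All class and method names here are hypothetical, purely to illustrate the 
shape of the API, not anything from the patch:
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of an asynchronous start/query reconfiguration API. */
public class ReconfigSketch {
  private Thread reconfigThread;
  private final Set<String> inProgressKeys =
      Collections.synchronizedSet(new HashSet<String>());

  /** Start a background reconfiguration; fail if one is already running. */
  public synchronized void startReconfiguration(Runnable task)
      throws IOException {
    if (reconfigThread != null && reconfigThread.isAlive()) {
      throw new IOException(
          "Can't start a new reconfiguration; one is still in progress");
    }
    reconfigThread = new Thread(task, "reconfiguration");
    reconfigThread.start();
  }

  /** Query which keys are currently being reconfigured. */
  public List<String> listReconfiguringKeys() {
    synchronized (inProgressKeys) {
      return new ArrayList<String>(inProgressKeys);
    }
  }
}
{code}
The CLI subcommands above would then just be thin RPC wrappers around these 
two calls.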

Back to the patch itself....
{code}
+  public synchronized List<StorageLocation> addVolumes(
+      final List<StorageLocation> volumes,
+      final Collection<String> bpids, final Configuration conf)
       throws IOException {
{code}

Let's change this to not throw IOException.  It's confusing to have both a 
return value of the volumes that we successfully added and an IOE.  I would 
pick one... just the list of volumes.
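That is, the signature would drop the throws clause (sketch only; body elided):
{code}
  /** Returns only the volumes that were successfully added; failures are
   *  logged and reflected by their absence from the returned list. */
  public synchronized List<StorageLocation> addVolumes(
      final List<StorageLocation> volumes,
      final Collection<String> bpids, final Configuration conf) {
    // ... add each volume, log and skip on failure ...
  }
{code}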

{code}
    final List<IOException> exceptions = Collections.synchronizedList(
        new ArrayList<IOException>());
    List<Thread> volumeLoadingThreads = Lists.newArrayList();
    for (final String bpid : bpids) {
      // Multi threading
      Thread t = new Thread() {
        public void run() {
          try {
            fsVolume.addBlockPool(bpid, conf);
            fsVolume.getVolumeMap(bpid, tempVolumeMap);
          } catch (IOException e) {
            LOG.warn("Caught exception when adding " + fsVolume +
              ". Will throw later.", e);
            exceptions.add(e);
          }
        }
      };
      volumeLoadingThreads.add(t);
      t.start();
    }
    for (Thread t : volumeLoadingThreads) {
      try {
        t.join();
      } catch (InterruptedException ie) {
        exceptions.add(new IOException(ie));
      }
{code}
We're already creating a thread per volume... I think multiplying that by a 
thread per bpid is too much.  Keep in mind that when dealing with a volume, 
having multiple threads will be less useful since we'll often be I/O bound (a 
single volume is normally a single disk).  So let's just have a for loop here, 
not multiple threads.
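Concretely: one thread per volume, and inside each thread a plain loop over 
the block pools.  A minimal self-contained sketch of that shape (Volume here 
is a stand-in for the real volume type, not FsDatasetImpl code):
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class AddVolumesSketch {
  /** Stand-in for the real volume type; addBlockPool may fail with an IOE. */
  static class Volume {
    final String name;
    Volume(String name) { this.name = name; }
    void addBlockPool(String bpid) throws IOException { /* load bpid */ }
  }

  /** One thread per volume; each thread loops over bpids sequentially,
   *  since a single volume is usually a single disk and thus I/O bound. */
  static List<IOException> addVolumes(List<Volume> volumes,
      final Collection<String> bpids) throws InterruptedException {
    final List<IOException> exceptions =
        Collections.synchronizedList(new ArrayList<IOException>());
    List<Thread> threads = new ArrayList<Thread>();
    for (final Volume v : volumes) {
      Thread t = new Thread() {
        public void run() {
          for (String bpid : bpids) {   // plain for loop, no extra threads
            try {
              v.addBlockPool(bpid);
            } catch (IOException e) {
              exceptions.add(e);        // collect now, report later
            }
          }
        }
      };
      threads.add(t);
      t.start();
    }
    for (Thread t : threads) {
      t.join();
    }
    return exceptions;
  }
}
{code}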

Looks pretty good aside from that.

> Refresh data volumes on DataNode based on configuration changes
> ---------------------------------------------------------------
>
>                 Key: HDFS-6727
>                 URL: https://issues.apache.org/jira/browse/HDFS-6727
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: 2.5.0, 2.4.1
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>              Labels: datanode
>         Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.001.patch, 
> HDFS-6727.002.patch, HDFS-6727.003.patch, HDFS-6727.combo.patch
>
>
> HDFS-1362 requires the DataNode to reload its configuration file at runtime, 
> so that the DN can change its data volumes dynamically. This JIRA reuses the 
> reconfiguration framework introduced by HADOOP-7001 to enable the DN to 
> reconfigure at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
