goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297801564
 
 

 ##########
 File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##########
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
       return ctx;
     }
 
+    /***
+     * a thread for parallel MD5 computing to increase performance when loading
+     */
+    private static class DigestThread extends Thread {
+      volatile private IOException ioe = null;
+      volatile private MD5Hash digest = null;
+      private File file;
+
+      public DigestThread(File inFile) {
+        file = inFile;
+      }
+
+      public MD5Hash getDigest() {
+        return digest;
+      }
+
+      public IOException getException() {
+        return ioe;
+      }
+
+      @Override
+      public void run() {
+        try {
+          digest = MD5FileUtils.computeMd5ForFile(file);
+        } catch (IOException e) {
+          ioe = e;
+        } catch (Throwable t) {
+          ioe = new IOException(t);
+        }
+      }
+    }
+
     void load(File file) throws IOException {
       long start = Time.monotonicNow();
-      imgDigest = MD5FileUtils.computeMd5ForFile(file);
+      DigestThread dt = new DigestThread(file);
+      dt.start();
       RandomAccessFile raFile = new RandomAccessFile(file, "r");
       FileInputStream fin = new FileInputStream(file);
       try {
         loadInternal(raFile, fin);
+        try {
+          dt.join();
+          if (dt.getException() != null) {
+            throw dt.getException();
+          }
+          imgDigest = dt.getDigest();
 
 Review comment:
   If getDigest() itself throws the stored exception, this explicit getException() check is already covered there.
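
   A minimal sketch of that suggestion, not the PR's actual code: the class name `DigestThreadSketch` is hypothetical, and the JDK's `MessageDigest` stands in for Hadoop's `MD5FileUtils.computeMd5ForFile` so the example runs on its own. The point is only that `getDigest()` rethrows whatever `run()` captured.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.security.MessageDigest;

// Hypothetical stand-in for the DigestThread in the diff above. It uses the
// JDK's MessageDigest (an assumption made for self-containment) rather than
// Hadoop's MD5FileUtils / MD5Hash.
class DigestThreadSketch extends Thread {
  private volatile IOException ioe;
  private volatile byte[] digest;
  private final File file;

  DigestThreadSketch(File inFile) {
    file = inFile;
  }

  // Rethrows any failure captured in run(), so the caller does not need a
  // separate getException() check after join().
  byte[] getDigest() throws IOException {
    if (ioe != null) {
      throw ioe;
    }
    return digest;
  }

  @Override
  public void run() {
    try (InputStream in = Files.newInputStream(file.toPath())) {
      MessageDigest md = MessageDigest.getInstance("MD5");
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        md.update(buf, 0, n);
      }
      digest = md.digest();
    } catch (IOException e) {
      ioe = e;
    } catch (Throwable t) {
      // Wrap anything unexpected so the caller only has to handle IOException.
      ioe = new IOException(t);
    }
  }
}
```

   With this shape, the caller in load() would reduce to `dt.join();` followed by `imgDigest = dt.getDigest();`, assuming getDigest() is adapted to return the MD5Hash type used in the PR.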
