[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-28 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r298665951
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,72 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private static class DigestThread extends Thread {
 
 Review comment:
   Javadoc explaining the overall idea. Something like:
   Thread to compute the MD5 of a file as this can be in parallel while loading 
the image without interfering much.
   
   Actually, this might be a good util to have in MD5FileUtils.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-27 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r298268348
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
 
 Review comment:
   Correct, but we could move all the exception handling there. 
   My proposal is that the getter here checks if ioe == null and throw it.
   Then it goes to my other comment where you don't need to check for the 
exception anymore and just do a getter of the digest.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-27 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r298268698
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (IOException e) {
+  ioe = e;
+} catch (Throwable t) {
+  ioe = new IOException(t);
+}
+  }
+}
+
 void load(File file) throws IOException {
   long start = Time.monotonicNow();
-  imgDigest = MD5FileUtils.computeMd5ForFile(file);
+  DigestThread dt = new DigestThread(file);
 
 Review comment:
   That's fair, can you extend the description of the JIRA saying it can reduce 
up to 10% of the time or similar?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-27 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r298269206
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,72 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private static class DigestThread extends Thread {
+
+  /**
+   * Exception thrown when computing the digest if it cannot be calculated.
+   */
+  private volatile IOException ioe = null;
+
+  /**
+   * Calculated digest if there are no error.
+   */
+  private volatile MD5Hash digest = null;
+
+  /**
+   * FsImage file computed MD5
+   */
+  private final File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (IOException e) {
+  ioe = e;
+} catch (Throwable t) {
+  ioe = new IOException(t);
+}
+  }
+
+  @Override
+  public String toString() {
+return "DigestThread{" + "ThreadName=" + getName() + ", digest="
 
 Review comment:
   No need for the first +


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-27 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r298269662
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,72 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private static class DigestThread extends Thread {
+
+  /**
+   * Exception thrown when computing the digest if it cannot be calculated.
+   */
+  private volatile IOException ioe = null;
+
+  /**
+   * Calculated digest if there are no error.
+   */
+  private volatile MD5Hash digest = null;
+
+  /**
+   * FsImage file computed MD5
+   */
+  private final File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (IOException e) {
+  ioe = e;
+} catch (Throwable t) {
+  ioe = new IOException(t);
+}
+  }
+
+  @Override
+  public String toString() {
+return "DigestThread{" + "ThreadName=" + getName() + ", digest="
++ digest + ", file=" + file + '}';
+  }
+}
+
 void load(File file) throws IOException {
   long start = Time.monotonicNow();
-  imgDigest = MD5FileUtils.computeMd5ForFile(file);
+  DigestThread dt = new DigestThread(file);
+  dt.setName(file.getName() + " computed MD5 Thread");
 
 Review comment:
   You can do this setter and the daemon one in the constructor of the class.
   The text should be something like "file.getName() + " MD5 compute"
   (No need for the word Thread specifically.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-27 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r298269004
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (IOException e) {
+  ioe = e;
+} catch (Throwable t) {
+  ioe = new IOException(t);
+}
+  }
+}
+
 void load(File file) throws IOException {
   long start = Time.monotonicNow();
-  imgDigest = MD5FileUtils.computeMd5ForFile(file);
+  DigestThread dt = new DigestThread(file);
+  dt.start();
   RandomAccessFile raFile = new RandomAccessFile(file, "r");
   FileInputStream fin = new FileInputStream(file);
   try {
 loadInternal(raFile, fin);
+try {
+  dt.join();
+  if (dt.getException() != null) {
+throw dt.getException();
+  }
+  imgDigest = dt.getDigest();
 
 Review comment:
   My point is following the other comment.
   If getDigest checks if there is an exception and throws it here, you don't 
need to check for the exception here.
   You can also remove dt.getException().


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-26 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297800514
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
 
 Review comment:
   What about checking if the exception is there and throw it if so?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-26 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297800195
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
 
 Review comment:
   Javadocs of the style:
   /** Exception thrown when computing the diggest if it cannot be calculated. 
*/
   private volatile IOException ioe = null;
   /** Calculated digest if there are no error. */
   private volatile MD5Hash digest = null;
   
   Note that I think the style says "private volatile"


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-26 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297800303
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
 
 Review comment:
   final and add javadoc.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-26 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297801948
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (IOException e) {
+  ioe = e;
+} catch (Throwable t) {
+  ioe = new IOException(t);
+}
+  }
+}
+
 void load(File file) throws IOException {
   long start = Time.monotonicNow();
-  imgDigest = MD5FileUtils.computeMd5ForFile(file);
+  DigestThread dt = new DigestThread(file);
 
 Review comment:
   Actually how expensive are these two steps? (digest calculation and load)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-26 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297800888
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
 
 Review comment:
   It would be nice to have a name for the thread:
   - setName
   - toString (and let the daemon library do it if so)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-26 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297801564
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (IOException e) {
+  ioe = e;
+} catch (Throwable t) {
+  ioe = new IOException(t);
+}
+  }
+}
+
 void load(File file) throws IOException {
   long start = Time.monotonicNow();
-  imgDigest = MD5FileUtils.computeMd5ForFile(file);
+  DigestThread dt = new DigestThread(file);
+  dt.start();
   RandomAccessFile raFile = new RandomAccessFile(file, "r");
   FileInputStream fin = new FileInputStream(file);
   try {
 loadInternal(raFile, fin);
+try {
+  dt.join();
+  if (dt.getException() != null) {
+throw dt.getException();
+  }
+  imgDigest = dt.getDigest();
 
 Review comment:
   If you do the approach of getDigest to throw the exception, you cover this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: HDFS-13694. Making md5 computing being in parallel with image loading.

2019-06-26 Thread GitBox
goiri commented on a change in pull request #1010: HDFS-13694. Making md5 
computing being in parallel with image loading.
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297801115
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+/***
+ * a thread for parallel MD5 computing to increase performance when loading
+ */
+private static class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (IOException e) {
+  ioe = e;
+} catch (Throwable t) {
+  ioe = new IOException(t);
+}
+  }
+}
+
 void load(File file) throws IOException {
   long start = Time.monotonicNow();
-  imgDigest = MD5FileUtils.computeMd5ForFile(file);
+  DigestThread dt = new DigestThread(file);
 
 Review comment:
   Should this be a daemon?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 computing being in parallel with image loading

2019-06-25 Thread GitBox
goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 
computing being in parallel with image loading
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297275249
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private class DigestThread extends Thread {
 
 Review comment:
   Javadoc mentioning that this a thread for parallel MD5 computing to increase 
performance when loading.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 computing being in parallel with image loading

2019-06-25 Thread GitBox
goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 
computing being in parallel with image loading
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297275624
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (Throwable t) {
+  if (t instanceof IOException) {
+ioe = (IOException) t;
+  } else {
+ioe = new IOException(t);
+  }
+}
+  }
+}
+
 void load(File file) throws IOException {
   long start = Time.monotonicNow();
-  imgDigest = MD5FileUtils.computeMd5ForFile(file);
+  FSImageFormatProtobuf.Loader.DigestThread dt =
+  new FSImageFormatProtobuf.Loader.DigestThread(file);
+  dt.start();
   RandomAccessFile raFile = new RandomAccessFile(file, "r");
   FileInputStream fin = new FileInputStream(file);
   try {
 loadInternal(raFile, fin);
+try {
+  dt.join();
+  imgDigest = dt.getDigest();
+  if (dt.getException() != null) {
 
 Review comment:
   I guess you can move this to the previous line?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 computing being in parallel with image loading

2019-06-25 Thread GitBox
goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 
computing being in parallel with image loading
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297274889
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private class DigestThread extends Thread {
 
 Review comment:
   I think the common approach is to define this as static.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 computing being in parallel with image loading

2019-06-25 Thread GitBox
goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 
computing being in parallel with image loading
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297275806
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (Throwable t) {
+  if (t instanceof IOException) {
+ioe = (IOException) t;
+  } else {
+ioe = new IOException(t);
 
 Review comment:
   Do we have unit tests throwing these exceptions?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 computing being in parallel with image loading

2019-06-25 Thread GitBox
goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 
computing being in parallel with image loading
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297274689
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (Throwable t) {
 
 Review comment:
   } catch (IOException ioe) {
 throw ioe;
   } catch (Throwable t) {
 throw new IOException(t);
   }


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 computing being in parallel with image loading

2019-06-25 Thread GitBox
goiri commented on a change in pull request #1010: [HDFS-13694]Making md5 
computing being in parallel with image loading
URL: https://github.com/apache/hadoop/pull/1010#discussion_r297274374
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
 ##
 @@ -172,13 +172,55 @@ public LoaderContext getLoaderContext() {
   return ctx;
 }
 
+private class DigestThread extends Thread {
+  volatile private IOException ioe = null;
+  volatile private MD5Hash digest = null;
+  private File file;
+
+  public DigestThread(File inFile) {
+file = inFile;
+  }
+
+  public MD5Hash getDigest() {
+return digest;
+  }
+
+  public IOException getException() {
+return ioe;
+  }
+
+  @Override
+  public void run() {
+try {
+  digest = MD5FileUtils.computeMd5ForFile(file);
+} catch (Throwable t) {
+  if (t instanceof IOException) {
+ioe = (IOException) t;
+  } else {
+ioe = new IOException(t);
+  }
+}
+  }
+}
+
 void load(File file) throws IOException {
   long start = Time.monotonicNow();
-  imgDigest = MD5FileUtils.computeMd5ForFile(file);
+  FSImageFormatProtobuf.Loader.DigestThread dt =
 
 Review comment:
   Can you just use DigestThread? Instead of the full declaration.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org