anoopsjohn commented on a change in pull request #633: HBASE-22890 Verify the 
file integrity in persistent IOEngine
URL: https://github.com/apache/hbase/pull/633#discussion_r326120074
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/FileIOEngine.java
 ##########
 @@ -68,15 +97,18 @@ public FileIOEngine(long capacity, String... filePaths) throws IOException {
           // The next setting length will throw exception,logging this message
           // is just used for the detail reason of exception,
           String msg = "Only " + StringUtils.byteDesc(totalSpace)
-              + " total space under " + filePath + ", not enough for requested "
-              + StringUtils.byteDesc(sizePerFile);
+            + " total space under " + filePath + ", not enough for requested "
+            + StringUtils.byteDesc(sizePerFile);
           LOG.warn(msg);
         }
-        rafs[i].setLength(sizePerFile);
+        File file = new File(filePath);
+        if (file.length() != sizePerFile) {
+          rafs[i].setLength(sizePerFile);
+        }
 
 Review comment:
   Yes, some detailed comments here would be nice. I see why you have this
check now. I can think of one case, though. Say we have a file-based cache with
one file, and its size was 10 GB. Now the RS is restarted, and the cache is
persisted as well. Before the restart, the size is increased to 20 GB. There is
no truncation, and ideally the cache gets rebuilt; the only difference is that
the cache capacity is larger after the restart. But with this code, the length
is changed here, and so is the last modified time, which will fail the verify
phase. Is this something to be considered? I don't want much complex handling
for this. It might not be a common case for a persisted cache. The worst that
happens is that we are not able to retrieve the persisted cache. But thoughts
and suggestions are welcome.
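
   To make the scenario concrete, here is a minimal standalone sketch of the
guard in the diff. The class `CacheFileLengthSketch` and the helpers
`ensureLength` and `newTempCacheFile` are hypothetical names used only for
illustration; they are not part of the PR or of FileIOEngine. The sketch only
shows when `setLength` would be called. It assumes, as the comment above does,
that resizing the file updates its last modified time on the underlying file
system.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical illustration of the "only resize when the length differs" guard.
class CacheFileLengthSketch {

  /**
   * Sets the file length only when it differs from sizePerFile.
   * Returns true if the file was resized, false if it was left untouched.
   */
  static boolean ensureLength(File file, long sizePerFile) {
    if (file.length() == sizePerFile) {
      // Same configured size as before the restart: the file is not touched.
      return false;
    }
    try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
      // Capacity changed (for example 10 GB -> 20 GB): the file is modified here.
      raf.setLength(sizePerFile);
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Creates an empty temporary file standing in for a cache file. */
  static File newTempCacheFile() {
    try {
      File f = File.createTempFile("bucket-cache-sketch", ".bin");
      f.deleteOnExit();
      return f;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    File f = newTempCacheFile();
    System.out.println("initial sizing resized: " + ensureLength(f, 1024L));
    System.out.println("restart, same size resized: " + ensureLength(f, 1024L));
    System.out.println("restart, larger size resized: " + ensureLength(f, 2048L));
  }
}
```

   With an unchanged size the file is left alone, so a check based on the last
modified time can still pass. With an increased size the file is modified, and
that is the case where the persisted cache could not be retrieved.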
