klcopp commented on a change in pull request #1716:
URL: https://github.com/apache/hive/pull/1716#discussion_r532794904



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##########
@@ -265,13 +267,15 @@ private static String idWatermark(CompactionInfo ci) {
   }
 
   /**
-   * @return true if any files were removed
+   * @return true if the cleaner has removed all files rendered obsolete by compaction
    */
   private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
       throws IOException, NoSuchObjectException, MetaException {
     Path locPath = new Path(location);
+    FileSystem fs = locPath.getFileSystem(conf);
+    Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots = AcidUtils.getHdfsDirSnapshots(fs, locPath);
     AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from(
Review comment:
       No, not with HIVE-24291.
   Without HIVE-24291 (which might not be usable if, for example, HMS schema changes are out of the question) and without this change, we could still have a pile-up of "ready for cleaning" entries for the same table/partition in the queue.
   Without this change (HIVE-24444), some of them might never be deleted.
   The goal of this change is that, when the table does get cleaned, all of those records will be deleted.
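   To make the intended semantics concrete, here is a minimal, hypothetical sketch (not Hive's actual Cleaner implementation; the class, queue map, and method names are invented for illustration). It models the invariant described above: the cleaner reports success, and therefore drops all queued "ready for cleaning" records for a table, only when every file rendered obsolete by compaction was actually removed; otherwise the records stay queued for a retry.

```java
import java.util.*;

// Hypothetical illustration of the Cleaner invariant discussed in HIVE-24444.
// All names here are invented; this is not Hive code.
public class CleanerSketch {
  // Simulated compaction queue: table name -> queued "ready for cleaning" record ids.
  private final Map<String, List<String>> readyForCleaning = new HashMap<>();
  // Simulated file system: obsolete files still present, per table.
  private final Map<String, Set<String>> obsoleteFiles = new HashMap<>();

  void enqueue(String table, String recordId) {
    readyForCleaning.computeIfAbsent(table, t -> new ArrayList<>()).add(recordId);
  }

  void addObsoleteFile(String table, String path) {
    obsoleteFiles.computeIfAbsent(table, t -> new HashSet<>()).add(path);
  }

  // Deletes obsolete files; returns true only if ALL of them were removed
  // (mirroring the updated @return contract of removeFiles).
  boolean removeFiles(String table, Set<String> undeletablePaths) {
    Set<String> files = obsoleteFiles.getOrDefault(table, new HashSet<>());
    files.removeIf(p -> !undeletablePaths.contains(p));
    return files.isEmpty();
  }

  // Drops every queued record for the table only when removeFiles fully
  // succeeded; returns how many records were cleared.
  int clean(String table, Set<String> undeletablePaths) {
    if (removeFiles(table, undeletablePaths)) {
      List<String> records = readyForCleaning.remove(table);
      return records == null ? 0 : records.size();
    }
    return 0; // leave the pile-up queued so a later clean can retry
  }
}
```

   The point of the sketch is the all-or-nothing check: a partial delete leaves every queued record in place, so a subsequent successful clean can clear the whole pile-up at once.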




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


