pvargacl commented on a change in pull request #1779:
URL: https://github.com/apache/hive/pull/1779#discussion_r546927637
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##########
@@ -1253,19 +1190,12 @@ public static ParsedDelta parsedDelta(Path deltaDir, String deltaPrefix, FileSys
     ParsedDelta p = parsedDelta(deltaDir, isRawFormat);
     List<HdfsFileStatusWithId> files = null;
     if (dirSnapshot != null) {
+      final PathFilter filter = isRawFormat ? AcidUtils.originalBucketFilter : AcidUtils.bucketFileFilter;
+      // If we already know the files, store it for future use
       files = dirSnapshot.getFiles().stream()
-          .filter(fileStatus -> bucketFileFilter.accept(fileStatus.getPath()))
+          .filter(fileStatus -> filter.accept(fileStatus.getPath()))
           .map(HdfsFileStatusWithoutId::new)
           .collect(Collectors.toList());
-    } else if (isDeleteDelta) {
-      // For delete deltas we need the files for AcidState
-      try {
-        files = SHIMS.listLocatedHdfsStatus(fs, deltaDir, bucketFileFilter);
-      } catch (UnsupportedOperationException uoe) {
-        files = Arrays.stream(fs.listStatus(deltaDir, bucketFileFilter))
-            .map(HdfsFileStatusWithoutId::new)
-            .collect(Collectors.toList());
-      }

Review comment:
For delete delta folders, the metadata for all files is sent to the execution side, so that the FileId can be used to retrieve the ORC tail (and, with Adam's new feature, the whole file) from the LLAP cache. This branch was here to ensure the file list is propagated even on HDFS, where we don't have anything in the directory snapshot. This functionality is now hidden inside ParsedDelta#getFiles: it either returns the cached list or makes the FileSystem call itself if that has not been done before.
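To illustrate the "return the list from cache or do the FileSystem call itself" behaviour described in the comment, here is a minimal sketch of a lazily populated file list. It is not the actual ParsedDelta code: the class name `LazyDelta`, the `Supplier` standing in for the FileSystem listing call, and the `Predicate` standing in for the bucket file PathFilter are all hypothetical, chosen so the example runs without Hadoop on the classpath.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Hypothetical stand-in for ParsedDelta; an illustration, not Hive's class. */
final class LazyDelta {
  // Assumption: stands in for the FileSystem listing call on the delta directory.
  private final Supplier<List<String>> lister;
  // Assumption: stands in for the bucket file PathFilter.
  private final Predicate<String> filter;
  // Null until the file list is known. Not thread-safe; a sketch only.
  private List<String> files;

  LazyDelta(List<String> snapshotFiles,
            Supplier<List<String>> lister,
            Predicate<String> filter) {
    this.lister = lister;
    this.filter = filter;
    // If a directory snapshot already knows the files, keep them for later use.
    this.files = snapshotFiles == null ? null : applyFilter(snapshotFiles);
  }

  /** Returns the cached list, listing the directory on first use if needed. */
  List<String> getFiles() {
    if (files == null) {
      files = applyFilter(lister.get());
    }
    return files;
  }

  private List<String> applyFilter(List<String> names) {
    return names.stream().filter(filter).collect(Collectors.toList());
  }
}
```

With this shape, a caller that has a directory snapshot never triggers the listing call, and a caller without one pays for it at most once, on the first `getFiles()` call, rather than at parse time.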