[
https://issues.apache.org/jira/browse/HUDI-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408885#comment-17408885
]
sivabalan narayanan commented on HUDI-2005:
-------------------------------------------
1. ListingBasedRollbackHelper.
{code:java}
// collect all log files that is supposed to be deleted with this rollback
Map<FileStatus, Long> writtenLogFileSizeMap =
    FSUtils.getAllLogFiles(metaClient.getFs(),
        FSUtils.getPartitionPath(config.getBasePath(), rollbackRequest.getPartitionPath()),
        fileId, HoodieFileFormat.HOODIE_LOG.getFileExtension(), latestBaseInstant)
    .collect(Collectors.toMap(HoodieLogFile::getFileStatus, value -> value.getFileStatus().getLen()));
{code}
{code:java}
// This step is intentionally done after writer is closed. Guarantees that
// getFileStatus would reflect correct stats and FileNotFoundException is not thrown in
// cloud-storage : HUDI-168
Map<FileStatus, Long> filesToNumBlocksRollback = Collections.singletonMap(
    metaClient.getFs().getFileStatus(Objects.requireNonNull(writer).getLogFile().getPath()),
    1L
);
{code}
2. SparkMarkerBasedRollbackStrategy
{code:java}
protected Map<FileStatus, Long> getWrittenLogFileSizeMap(String partitionPathStr, String baseCommitTime, String fileId) throws IOException {
  // collect all log files that is supposed to be deleted with this rollback
  return FSUtils.getAllLogFiles(table.getMetaClient().getFs(),
      FSUtils.getPartitionPath(config.getBasePath(), partitionPathStr), fileId,
      HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime)
      .collect(Collectors.toMap(HoodieLogFile::getFileStatus, value -> value.getFileStatus().getLen()));
}
{code}
3. HoodieLogFileReader fetches the log file position using:
{code:java}
if (this.reverseReader) {
  this.reverseLogFilePosition = this.lastReverseLogFilePosition =
      fs.getFileStatus(logFile.getPath()).getLen();
}
{code}
As of now, HoodieLogFileReader only has access to a FileSystem. I am not sure
whether we can leak Metadata into this layer.
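
One possible way around this, sketched below, is to hand the reader a narrow "file length" lookup rather than Metadata itself, so the reader does not need to know where the length comes from. This is only an illustration under stated assumptions: LogLengthSketch, FileLengthProvider, withKnownLengths and ReverseReaderSketch are hypothetical names that do not exist in Hudi, paths are plain strings instead of Hadoop Path objects, and the in-memory map merely stands in for whatever source already knows the file sizes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Hudi.
final class LogLengthSketch {

  /** Hypothetical stand-in for "something that can report a file's length". */
  interface FileLengthProvider {
    long lengthOf(String path);
  }

  /**
   * Returns a provider that answers from already-known lengths when possible
   * and only delegates to the fallback (e.g. a storage call) otherwise.
   */
  static FileLengthProvider withKnownLengths(Map<String, Long> knownLengths,
                                             FileLengthProvider fallback) {
    return path -> {
      Long known = knownLengths.get(path);
      return known != null ? known : fallback.lengthOf(path);
    };
  }

  /** Hypothetical reader that depends only on FileLengthProvider. */
  static final class ReverseReaderSketch {
    private final long reverseLogFilePosition;

    ReverseReaderSketch(String logFilePath, FileLengthProvider lengths, boolean reverseReader) {
      // Mirrors the quoted snippet: the length is only looked up for reverse reads.
      // Assumption: the position stays at 0 when not reading in reverse.
      this.reverseLogFilePosition = reverseReader ? lengths.lengthOf(logFilePath) : 0L;
    }

    long position() {
      return reverseLogFilePosition;
    }
  }

  public static void main(String[] args) {
    Map<String, Long> known = new HashMap<>();
    known.put("/tbl/p1/.f1_100.log.1", 2048L);

    // The fallback stands in for a direct storage lookup; here it returns a fixed value.
    FileLengthProvider lengths = withKnownLengths(known, path -> 512L);

    System.out.println(new ReverseReaderSketch("/tbl/p1/.f1_100.log.1", lengths, true).position());
    System.out.println(new ReverseReaderSketch("/tbl/p1/.f1_100.log.2", lengths, true).position());
  }
}
```

With something along these lines, callers that already hold file sizes could pass them in, while callers that do not would keep today's behavior through the fallback. Whether this fits the actual reader constructor paths would still need checking.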
> Audit and remove references of fs.listStatus() and fs.getFileStatus() or
> fs.exists()
> ------------------------------------------------------------------------------------
>
> Key: HUDI-2005
> URL: https://issues.apache.org/jira/browse/HUDI-2005
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Nishith Agarwal
> Assignee: sivabalan narayanan
> Priority: Major
>