xushiyan commented on a change in pull request #2845:
URL: https://github.com/apache/hudi/pull/2845#discussion_r636615564
##########
File path:
hudi-common/src/test/java/org/apache/hudi/common/testutils/FileCreateUtils.java
##########
@@ -219,13 +221,19 @@ public static void createBaseFile(String basePath, String partitionPath, String
   public static void createBaseFile(String basePath, String partitionPath, String instantTime, String fileId, long length)
       throws Exception {
+    createBaseFile(basePath, partitionPath, instantTime, fileId, length, Instant.now().toEpochMilli());
+  }
+
+  public static void createBaseFile(String basePath, String partitionPath, String instantTime, String fileId, long length, long lastModificationTimeMilli)
+      throws Exception {
     Path parentPath = Paths.get(basePath, partitionPath);
     Files.createDirectories(parentPath);
     Path baseFilePath = parentPath.resolve(baseFileName(instantTime, fileId));
     if (Files.notExists(baseFilePath)) {
       Files.createFile(baseFilePath);
     }
     new RandomAccessFile(baseFilePath.toFile(), "rw").setLength(length);
+    Files.setLastModifiedTime(baseFilePath, FileTime.fromMillis(lastModificationTimeMilli));
Review comment:
@nsivabalan the problem comes from the last modification time being the same
for multiple input files. I uploaded a screenshot to the JIRA ticket and am
also posting it here for easy illustration.

DFSPathSelector reads the last modification time and saves it as the
checkpoint, which is then compared against the next batch of input files. It's
not about files being mutated; the input files are append-only, so last mod
time _is_ create time. The test setup sets the last mod time explicitly so that
multiple files share the same value; otherwise delays from code execution while
creating them would give each file a different mod time.
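
To illustrate the scenario, here is a minimal, self-contained sketch. It is not the actual DFSPathSelector code: the class `ModTimeCheckpointSketch`, its helper methods, the strict "newer than checkpoint" comparison, and the per-batch file cap are all assumptions made for illustration. It pins two files to the same mod time (as the new `createBaseFile` overload allows) and shows how the second file can be missed once the checkpoint equals that shared mod time.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ModTimeCheckpointSketch {

  // Hypothetical helper: create an empty file and pin its last modification time.
  static Path createWithModTime(Path dir, String name, long modTimeMilli) throws IOException {
    Path file = Files.createFile(dir.resolve(name));
    Files.setLastModifiedTime(file, FileTime.fromMillis(modTimeMilli));
    return file;
  }

  // Assumed selection rule: take files strictly newer than the checkpoint,
  // up to maxFiles per batch (a stand-in for a size-based source limit).
  static List<Path> selectBatch(List<Path> candidates, long checkpointMilli, int maxFiles)
      throws IOException {
    List<Path> batch = new ArrayList<>();
    for (Path p : candidates) {
      long modTime = Files.getLastModifiedTime(p).toMillis();
      if (batch.size() < maxFiles && modTime > checkpointMilli) {
        batch.add(p);
      }
    }
    return batch;
  }

  // The next checkpoint is the largest mod time in the batch (or the previous one if empty).
  static long checkpointOf(List<Path> batch, long previous) throws IOException {
    long max = previous;
    for (Path p : batch) {
      max = Math.max(max, Files.getLastModifiedTime(p).toMillis());
    }
    return max;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("modtime-sketch");
    long sameModTime = 1_600_000_000_000L; // arbitrary fixed timestamp shared by both files
    List<Path> inputs = Arrays.asList(
        createWithModTime(dir, "a.json", sameModTime),
        createWithModTime(dir, "b.json", sameModTime));

    // First batch is capped at one file, so only a.json is read.
    List<Path> first = selectBatch(inputs, Long.MIN_VALUE, 1);
    long checkpoint = checkpointOf(first, Long.MIN_VALUE);

    // b.json has the same mod time as the checkpoint, so the strict comparison skips it.
    List<Path> second = selectBatch(inputs, checkpoint, 1);

    System.out.println("first batch size: " + first.size());
    System.out.println("second batch size: " + second.size());
  }
}
```

Under these assumptions the second batch is empty even though `b.json` was never read, which is why the test needs a way to give several files an identical mod time.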