prashantwason commented on a change in pull request #2197:
URL: https://github.com/apache/hudi/pull/2197#discussion_r512991547
##########
File path:
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/DeltaGenerator.java
##########
@@ -77,6 +82,16 @@ public DeltaGenerator(DeltaConfig deltaOutputConfig, JavaSparkContext jsc, Spark
   }
   public JavaRDD<DeltaWriteStats> writeRecords(JavaRDD<GenericRecord> records) {
+    if (deltaOutputConfig.shouldDeleteOldInputData() && batchId > 1) {
+      Path oldInputDir = new Path(deltaOutputConfig.getDeltaBasePath(), Integer.toString(batchId - 1));
Review comment:
RollbackNode rolls back the last commit. That should not interfere
with these input directories.
The shouldDeleteOldInputData() setting only affects the data generated in
the "input" directory, a separate directory that is not part of the HUDI
dataset under test. For each Node in the yaml, a sub-directory named after
the batchId is created in the input directory, and the data to be ingested
for that Node is written there as avro files.
When the setting is enabled, we delete the older input sub-directories.
The default is to not delete anything.
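To make the layout concrete, here is a minimal sketch of the cleanup described above. It uses java.nio.file on a local directory rather than the Hadoop Path/FileSystem API that the PR uses, and the class name, method name, and parameters are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not the code in DeltaGenerator.
final class InputDirCleanupSketch {

  // Deletes the input sub-directory written for the previous batch.
  // Assumes each batch writes to <inputBaseDir>/<batchId>.
  // Returns true only if a directory was actually removed.
  static boolean deletePreviousBatchInput(Path inputBaseDir, int batchId, boolean deleteOldInputData)
      throws IOException {
    if (!deleteOldInputData || batchId <= 1) {
      return false; // default behaviour: keep all input data
    }
    Path oldInputDir = inputBaseDir.resolve(Integer.toString(batchId - 1));
    if (!Files.isDirectory(oldInputDir)) {
      return false; // nothing to clean up
    }
    // Collect paths in reverse order so files are deleted before their parent directories.
    List<Path> toDelete;
    try (Stream<Path> paths = Files.walk(oldInputDir)) {
      toDelete = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : toDelete) {
      Files.delete(p);
    }
    return true;
  }
}
```

Only the sibling sub-directory of the previous batch is touched. Nothing under the dataset path is read or modified, which is why a rollback of the last commit is unaffected.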
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]