vinothchandar commented on a change in pull request #2322:
URL: https://github.com/apache/hudi/pull/2322#discussion_r543092948
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java
##########
@@ -403,6 +403,7 @@ public void finalizeWrite(HoodieEngineContext context, String instantTs, List<Ho
private void deleteInvalidFilesByPartitions(HoodieEngineContext context,
                                            Map<String, List<Pair<String, String>>> invalidFilesByPartition) {
  // Now delete partially written files
+ context.setJobStatus(this.getClass().getSimpleName(), "Delete invalid files by partitions");
Review comment:
Change message to: "Delete invalid files generated during the write operation"?
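The review comments above and below all target the same pattern: before triggering a distributed operation, the engine context records which class is doing the work and a human-readable description, so the cluster UI shows what each job is for. The sketch below illustrates that pattern with plain-Java stand-ins (`EngineContext` and `RollbackAction` here are hypothetical names, not Hudi's actual classes); a Spark-backed context would forward to `JavaSparkContext.setJobGroup` instead of just storing the strings.

```java
// Minimal sketch of the job-status pattern under review. All class names here
// are illustrative stand-ins; only the setJobStatus(group, description) shape
// mirrors the code in the diff hunks.
public class JobStatusSketch {

  /** Stand-in for an engine context: remembers the last job group/description. */
  static class EngineContext {
    String activeModule;
    String activityDescription;

    void setJobStatus(String activeModule, String activityDescription) {
      this.activeModule = activeModule;
      this.activityDescription = activityDescription;
      // A Spark-backed context would call
      // javaSparkContext.setJobGroup(activeModule, activityDescription) here,
      // making the description visible in the Spark UI.
    }
  }

  /** Stand-in for an action executor that tags its job before running. */
  static class RollbackAction {
    String run(EngineContext context) {
      // Descriptive, user-facing phrasing, as the reviewer suggests:
      context.setJobStatus(this.getClass().getSimpleName(), "Rolling back using marker files");
      return context.activeModule + ": " + context.activityDescription;
    }
  }

  public static void main(String[] args) {
    System.out.println(new RollbackAction().run(new EngineContext()));
  }
}
```

The reviewer's naming suggestions all follow from this: since the description surfaces in a monitoring UI, a present-progressive, user-facing phrase ("Rolling back using marker files") reads better than an internal one ("Marker files rollback").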
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/MarkerFiles.java
##########
@@ -135,6 +135,7 @@ public boolean doesMarkerDirExist() throws IOException {
if (subDirectories.size() > 0) {
parallelism = Math.min(subDirectories.size(), parallelism);
SerializableConfiguration serializedConf = new SerializableConfiguration(fs.getConf());
+ context.setJobStatus(this.getClass().getSimpleName(), "MarkerFiles created and merged data paths");
Review comment:
Change to: `Obtaining marker files for all created, merged paths`
##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/index/bloom/SparkHoodieBloomIndex.java
##########
@@ -137,12 +137,14 @@ public SparkHoodieBloomIndex(HoodieWriteConfig config) {
*/
private Map<String, Long> computeComparisonsPerFileGroup(final Map<String, Long> recordsPerPartition,
                                                         final Map<String, List<BloomIndexFileInfo>> partitionToFileInfo,
-                                                        JavaPairRDD<String, String> partitionRecordKeyPairRDD) {
+                                                        JavaPairRDD<String, String> partitionRecordKeyPairRDD,
+                                                        final HoodieEngineContext context) {
  Map<String, Long> fileToComparisons;
  if (config.getBloomIndexPruneByRanges()) {
    // we will just try exploding the input and then count to determine comparisons
    // FIX(vc): Only do sampling here and extrapolate?
+   context.setJobStatus(this.getClass().getSimpleName(), "Explode recordRDD with file comparisons");
Review comment:
Change to: `Compute all comparisons needed between records and files`
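For context on what this job computes: each record key in a partition must be checked against the bloom filter of every candidate file for that partition, so the comparison count per file is roughly (records in partition) × (candidate files). A simplified, Spark-free sketch of that counting (method and variable names here are hypothetical stand-ins, not Hudi's actual implementation):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of the comparison counting the renamed job performs: without
// range pruning, every record in a partition costs one bloom-filter probe per
// candidate file in that partition. Names are illustrative, not Hudi's.
public class ComparisonCount {

  static Map<String, Long> comparisonsPerFile(Map<String, List<String>> partitionToFiles,
                                              Map<String, Long> recordsPerPartition) {
    Map<String, Long> fileToComparisons = new HashMap<>();
    for (Map.Entry<String, List<String>> entry : partitionToFiles.entrySet()) {
      long records = recordsPerPartition.getOrDefault(entry.getKey(), 0L);
      // Each file accrues one comparison per record in its partition.
      for (String file : entry.getValue()) {
        fileToComparisons.merge(file, records, Long::sum);
      }
    }
    return fileToComparisons;
  }

  public static void main(String[] args) {
    Map<String, List<String>> partitionToFiles = new HashMap<>();
    partitionToFiles.put("2020/12/01", Arrays.asList("f1", "f2"));
    Map<String, Long> recordsPerPartition = new HashMap<>();
    recordsPerPartition.put("2020/12/01", 100L);
    System.out.println(comparisonsPerFile(partitionToFiles, recordsPerPartition));
  }
}
```

Range pruning (the `getBloomIndexPruneByRanges` branch in the diff) shrinks the candidate file list per record before this counting, which is why the exploded RDD job being renamed exists at all.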
##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java
##########
@@ -101,6 +101,7 @@ public BaseSparkCommitActionExecutor(HoodieEngineContext context,
WorkloadProfile profile = null;
if (isWorkloadProfileNeeded()) {
+ context.setJobStatus(this.getClass().getSimpleName(), "Build workload profile");
Review comment:
Change to: `Building workload profile`
##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/rollback/SparkMarkerBasedRollbackStrategy.java
##########
@@ -52,6 +52,7 @@ public SparkMarkerBasedRollbackStrategy(HoodieTable<T, JavaRDD<HoodieRecord<T>>,
MarkerFiles markerFiles = new MarkerFiles(table, instantToRollback.getTimestamp());
List<String> markerFilePaths = markerFiles.allMarkerFilePaths();
int parallelism = Math.max(Math.min(markerFilePaths.size(), config.getRollbackParallelism()), 1);
+ jsc.setJobGroup(this.getClass().getSimpleName(), "Marker files rollback");
Review comment:
Change to: `Rolling back using marker files`
##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/compact/SparkRunCompactionActionExecutor.java
##########
@@ -76,6 +76,7 @@ public SparkRunCompactionActionExecutor(HoodieSparkEngineContext context,
JavaRDD<WriteStatus> statuses = compactor.compact(context, compactionPlan, table, config, instantTime);
statuses.persist(SparkMemoryUtils.getWriteStatusStorageLevel(config.getProps()));
+ context.setJobStatus(this.getClass().getSimpleName(), "Collect compaction metadata status");
Review comment:
Change to: `Preparing compaction metadata`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]