cdmikechen commented on a change in pull request #1119: Fix: HoodieCommitMetadata only show first commit insert rows.
URL: https://github.com/apache/incubator-hudi/pull/1119#discussion_r361078012
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieCommitMetadata.java
 ##########
 @@ -175,7 +175,9 @@ public long fetchTotalInsertRecordsWritten() {
     long totalInsertRecordsWritten = 0;
     for (List<HoodieWriteStat> stats : partitionToWriteStats.values()) {
       for (HoodieWriteStat stat : stats) {
-        if (stat.getPrevCommit() != null && stat.getPrevCommit().equalsIgnoreCase("null")) {
 
 Review comment:
   @n3nash 
   The value is probably set in `HoodieCreateHandle`:
https://github.com/apache/incubator-hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java#L165
   ```java
         HoodieWriteStat stat = new HoodieWriteStat();
         stat.setPartitionPath(writeStatus.getPartitionPath());
         stat.setNumWrites(recordsWritten);
         stat.setNumDeletes(recordsDeleted);
         stat.setNumInserts(insertRecordsWritten);
         stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT);
         stat.setFileId(writeStatus.getFileId());
         stat.setPath(new Path(config.getBasePath()), path);
         long fileSizeInBytes = FSUtils.getFileSize(fs, path);
         stat.setTotalWriteBytes(fileSizeInBytes);
         stat.setFileSizeInBytes(fileSizeInBytes);
         stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords());
         RuntimeStats runtimeStats = new RuntimeStats();
         runtimeStats.setTotalCreateTime(timer.endTimer());
         stat.setRuntimeStats(runtimeStats);
         writeStatus.setStat(stat);
   ```
   In `org.apache.hudi.common.model.HoodieWriteStat`, `NULL_COMMIT` is the literal string "null"; in all other cases `prevCommit` is set to a real commit time.
   ```java
   public static final String NULL_COMMIT = "null";
   ```
   As a result, `fetchTotalFilesInsert` can recognize a file from the first commit and pass the condition, whereas `fetchTotalInsertRecordsWritten` cannot.
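
To make the effect of the `prevCommit` check concrete, here is a minimal, self-contained sketch. It does not use the real Hudi classes: `PrevCommitSketch`, `Stat`, `sampleStats`, and both `sumInserts*` methods are hypothetical stand-ins, and the unguarded variant is only a comparison point, not necessarily what this PR ends up doing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for HoodieCommitMetadata; not the real Hudi class.
final class PrevCommitSketch {
  // Mirrors HoodieWriteStat.NULL_COMMIT as quoted above.
  static final String NULL_COMMIT = "null";

  // Simplified stand-in for HoodieWriteStat: only the two fields the check uses.
  static final class Stat {
    final String prevCommit;
    final long numInserts;

    Stat(String prevCommit, long numInserts) {
      this.prevCommit = prevCommit;
      this.numInserts = numInserts;
    }
  }

  // Same condition as the removed diff line: only stats whose prevCommit
  // is the "null" marker contribute to the total.
  static long sumInsertsGuarded(Map<String, List<Stat>> partitionToStats) {
    long total = 0;
    for (List<Stat> stats : partitionToStats.values()) {
      for (Stat stat : stats) {
        if (stat.prevCommit != null && stat.prevCommit.equalsIgnoreCase(NULL_COMMIT)) {
          total += stat.numInserts;
        }
      }
    }
    return total;
  }

  // Comparison variant (an assumption for illustration): count inserts from every stat.
  static long sumInsertsUnguarded(Map<String, List<Stat>> partitionToStats) {
    long total = 0;
    for (List<Stat> stats : partitionToStats.values()) {
      for (Stat stat : stats) {
        total += stat.numInserts;
      }
    }
    return total;
  }

  // One partition with a newly created file and an existing file that received inserts.
  static Map<String, List<Stat>> sampleStats() {
    Map<String, List<Stat>> m = new LinkedHashMap<>();
    m.put("2019/12/24", Arrays.asList(
        new Stat(NULL_COMMIT, 100),       // new file: prevCommit is "null"
        new Stat("20191223000000", 25))); // existing file: real previous commit time
    return m;
  }

  public static void main(String[] args) {
    Map<String, List<Stat>> m = sampleStats();
    System.out.println(sumInsertsGuarded(m));   // prints 100
    System.out.println(sumInsertsUnguarded(m)); // prints 125
  }
}
```

With the guard in place, the 25 inserts written into the file that already has a real previous commit time are left out of the total.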
