vortual commented on issue #3919:
URL: https://github.com/apache/hudi/issues/3919#issuecomment-959399779


   I have found the code that causes the problem. It is in the
   HoodieRealtimeInputFormatUtils class:

   List<FileSplit> dataFileSplits =
       groupedInputSplits.get(fileSlice.getFileId());

   The groupedInputSplits map holds only the Parquet files for each file
   group; log files are not included. Here fileSlice is a newly added log.
   Because groupedInputSplits has no entry for that file group (it tracks
   only Parquet files, not logs), dataFileSplits is null. The subsequent
   call to dataFileSplits.forEach(split -> {}) then throws a
   NullPointerException.
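
   The failure mode described above can be reproduced outside Hudi with a
   plain map. The following is only a minimal sketch, not Hudi's actual code
   or its eventual fix: the class name SplitLookupSketch, the helper
   splitsFor, the file IDs "fg-1" and "fg-2", and the use of String in place
   of FileSplit are all assumptions made for illustration. It shows that
   Map.get returns null for a file group with no base-file entry, and that a
   null-safe lookup avoids the exception.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the lookup in HoodieRealtimeInputFormatUtils.
class SplitLookupSketch {

    // Null-safe lookup: a file group without an entry yields an empty list
    // instead of null, so a later forEach cannot throw.
    static List<String> splitsFor(Map<String, List<String>> groupedInputSplits,
                                  String fileId) {
        return groupedInputSplits.getOrDefault(fileId, Collections.emptyList());
    }

    public static void main(String[] args) {
        // Only file groups that have a Parquet base file appear in the map.
        Map<String, List<String>> groupedInputSplits = new HashMap<>();
        groupedInputSplits.put("fg-1", Collections.singletonList("fg-1_base.parquet"));

        // "fg-2" stands for a file group that so far has only log files.
        List<String> raw = groupedInputSplits.get("fg-2");
        System.out.println("raw lookup is null: " + (raw == null));
        // Calling raw.forEach(...) here would throw NullPointerException.

        // The guarded lookup iterates over zero elements instead.
        splitsFor(groupedInputSplits, "fg-2")
            .forEach(split -> System.out.println("split: " + split));
    }
}
```

   A guard like this only silences the exception. Whether a log-only file
   group should be skipped or turned into a split of its own is a separate
   design question that this sketch does not answer.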


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

