guanziyue commented on issue #2648:
URL: https://github.com/apache/hudi/issues/2648#issuecomment-794236129


   Hi,
   Happy to see your reply. I'd like to share more information about this. 
After I posted the issue, I found out more with help from a colleague. To 
reproduce the problem, we need an index whose canIndexLogFiles() returns true, 
such as HBaseIndex. In that case, DeltaCommitActionExecutor tries to append 
insert records to a log file rather than create a parquet base file, as the 
code linked here shows:
    
https://github.com/apache/hudi/blob/release-0.6.0/hudi-client/src/main/java/org/apache/hudi/table/action/deltacommit/DeltaCommitActionExecutor.java#L94
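
   To make the branching concrete, here is a minimal sketch of the decision as 
I understand it. It is not the actual Hudi code: the Index interface, the 
Target enum and routeInserts() are hypothetical stand-ins, and only the method 
name canIndexLogFiles() comes from Hudi.

```java
// Simplified illustration only; these types are hypothetical, not Hudi classes.
public class InsertRoutingSketch {

    /** Hypothetical stand-in for an index, reduced to the one capability discussed here. */
    interface Index {
        boolean canIndexLogFiles();
    }

    /** Where a batch of new (insert) records ends up. */
    enum Target { LOG_FILE, PARQUET_BASE_FILE }

    /**
     * Mirrors the behaviour described above: if the index can index log files,
     * inserts are appended to a log file; otherwise a parquet base file is written.
     */
    static Target routeInserts(Index index) {
        return index.canIndexLogFiles() ? Target.LOG_FILE : Target.PARQUET_BASE_FILE;
    }

    public static void main(String[] args) {
        Index logCapable = () -> true;     // e.g. an HBase-backed index
        Index baseFileOnly = () -> false;  // an index that cannot index log files
        System.out.println("log-capable index   -> " + routeInserts(logCapable));
        System.out.println("base-file-only index -> " + routeInserts(baseFileOnly));
    }
}
```

   With the first kind of index, a brand-new file group can end up holding only 
log files.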
   
   This is how a file group without a parquet base file is produced. However, 
as far as I understand the data source (my knowledge of it is limited), it 
seems to assume that every file group has a parquet base file and that all log 
files are appended to that base file. I guess this may be the root cause of the 
error.
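
   The failure mode I suspect can be sketched as follows. This is only an 
illustration of the assumption, not the real data source code: FileGroup, 
readAssumingBaseFile() and readTolerant() are hypothetical names, and the file 
names are made up.

```java
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Optional;

// Simplified illustration only; these types are hypothetical, not Hudi classes.
public class FileGroupReadSketch {

    /** Hypothetical view of a file group: an optional base file plus its log files. */
    static final class FileGroup {
        final Optional<String> baseFile;
        final List<String> logFiles;

        FileGroup(Optional<String> baseFile, List<String> logFiles) {
            this.baseFile = baseFile;
            this.logFiles = logFiles;
        }
    }

    /** A reader that takes the base file for granted; fails on a log-only file group. */
    static String readAssumingBaseFile(FileGroup group) {
        // Optional.get() throws NoSuchElementException when no base file exists.
        return group.baseFile.get() + " + " + group.logFiles.size() + " log file(s)";
    }

    /** A reader that also accepts file groups consisting of log files only. */
    static String readTolerant(FileGroup group) {
        return group.baseFile.orElse("<no base file>") + " + " + group.logFiles.size() + " log file(s)";
    }

    public static void main(String[] args) {
        FileGroup logOnly = new FileGroup(Optional.empty(), Collections.singletonList("example.log.1"));
        System.out.println(readTolerant(logOnly));
        try {
            readAssumingBaseFile(logOnly);
        } catch (NoSuchElementException e) {
            System.out.println("reader that assumes a base file failed");
        }
    }
}
```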
   I plan to test whether making canIndexLogFiles() return false avoids the 
problem as a temporary workaround. The other option I can come up with for now 
is to generate a parquet file when inserting records.
   Could you please correct me if I have made a mistake?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

