sumihehe commented on issue #2346: URL: https://github.com/apache/hudi/issues/2346#issuecomment-758591316
> @sumihehe Did you get a chance to look at above? It'll be helpful if you can provide more information.

- Around 10,000 (1w) rows when the case happened
- Around 60 deltacommits and 0 commits when the case happened
- It returns correctly on the *mor_ro table
- It cannot be reproduced now that I have made more deltacommits/commits on the table. So it returns correctly now with HoodieParquetRealtimeInputFormat or

I think it will occur under these conditions:

1. The query runs on an rt table.
2. The Hive query has predicate pushdown.
3. There are at least 3 splits (thus at least 3 recordReaders in HoodieCombineRealtimeRecordReader), and the records that satisfy the predicate are in a split positioned relatively late in the list.
4. Two recordReaders in succession return false from _this.currentRecordReader.next(key, value)_, because predicate pushdown has filtered out the base file.

In step 4, this causes _HoodieCombineRealtimeRecordReader::next(NullWritable key, ArrayWritable value)_ to return false, and the reader stops reading. So records that satisfy the predicate are in the remaining recordReaders but cannot be read.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
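The failure mode suspected in the comment above can be sketched with a small stand-alone model. This is not Hudi's code: `CombineReaderSketch`, `CombinedReader`, `readAll`, and `sampleSplits` are hypothetical names, and each split is modelled as a plain list of the rows that survive predicate pushdown. The sketch only contrasts a reader that advances to the next split once with one that keeps advancing until it finds a record or runs out of splits.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class CombineReaderSketch {

    /** Hypothetical stand-in for a combined reader over per-split readers. */
    static final class CombinedReader {
        private final Deque<Iterator<String>> remaining = new ArrayDeque<>();
        private final boolean skipAllEmpty;
        private Iterator<String> current;

        CombinedReader(List<List<String>> splits, boolean skipAllEmpty) {
            for (List<String> split : splits) {
                remaining.add(split.iterator());
            }
            this.current = remaining.poll(); // assumes at least one split
            this.skipAllEmpty = skipAllEmpty;
        }

        /** Returns the next record, or null when the reader reports end of input. */
        String next() {
            if (current.hasNext()) {
                return current.next();
            }
            if (!skipAllEmpty) {
                // Suspected behaviour: advance exactly once, then trust that reader's answer.
                if (remaining.isEmpty()) {
                    return null;
                }
                current = remaining.poll();
                return current.hasNext() ? current.next() : null;
            }
            // Alternative: keep advancing until a reader yields a record.
            while (!remaining.isEmpty()) {
                current = remaining.poll();
                if (current.hasNext()) {
                    return current.next();
                }
            }
            return null;
        }
    }

    /** Drains a combined reader built over the given splits. */
    public static List<String> readAll(List<List<String>> splits, boolean skipAllEmpty) {
        CombinedReader reader = new CombinedReader(splits, skipAllEmpty);
        List<String> out = new ArrayList<>();
        for (String row = reader.next(); row != null; row = reader.next()) {
            out.add(row);
        }
        return out;
    }

    /** Three splits: the first two are fully filtered, the match sits in the last one. */
    public static List<List<String>> sampleSplits() {
        return Arrays.asList(
            Collections.<String>emptyList(),
            Collections.<String>emptyList(),
            Arrays.asList("matching-row"));
    }

    public static void main(String[] args) {
        System.out.println("advance once:   " + readAll(sampleSplits(), false)); // []
        System.out.println("skip all empty: " + readAll(sampleSplits(), true));  // [matching-row]
    }
}
```

With three splits where the first two are fully filtered, the advance-once variant reports end of input after the second split and never reaches the matching row. This mirrors conditions 3 and 4 above.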
