[
https://issues.apache.org/jira/browse/HUDI-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377188#comment-17377188
]
ASF GitHub Bot commented on HUDI-1969:
--------------------------------------
danny0405 edited a comment on pull request #3033:
URL: https://github.com/apache/hudi/pull/3033#issuecomment-876222041
> Just one comment. The logic seems correct. Can we add tests that
> explicitly send inserts to logs and see if it's all returned correctly?
Thanks @vinothchandar. Before this patch, we assumed that each file group
has a base parquet file plus logs containing only UPDATE records. That holds
for the BloomFilter index, but with a global index like Flink's, the logs can
contain INSERTs as well as UPDATEs. So the new logic reads the parquet file
first, then the logs.
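
To make the difference concrete, here is a minimal sketch in Python. It is not the code from this patch: the function names, the dict-based record layout, and the "updates only" variant are all assumptions chosen for illustration. It shows how a merge that only looks up log records for keys already in the base file would miss INSERTs, while reading the base file first and then applying every log record keeps them.

```python
def merge_updates_only(base_records, log_records):
    """Hypothetical 'old' behaviour: logs are consulted only for base keys.

    base_records: dict of record key -> row read from the base parquet file.
    log_records:  list of (key, row) pairs in log order.
    """
    latest = dict(log_records)  # last log entry per key wins
    # Only keys that exist in the base file are emitted, so log-only
    # keys (pure INSERTs) are silently dropped.
    return {key: latest.get(key, row) for key, row in base_records.items()}


def merge_base_then_logs(base_records, log_records):
    """Sketch of the described logic: read the base file, then the logs."""
    merged = dict(base_records)
    for key, row in log_records:
        # Existing key -> UPDATE, unseen key -> INSERT.
        merged[key] = row
    return merged


base = {"k1": {"v": 1}, "k2": {"v": 2}}
logs = [("k1", {"v": 10}), ("k3", {"v": 3})]  # one UPDATE, one INSERT

old_view = merge_updates_only(base, logs)
new_view = merge_base_then_logs(base, logs)
```

In this sketch `old_view` has no entry for `k3`, while `new_view` contains all three keys.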
The modified test covers this case: the first 100 log records update
existing records, and another 20 log records are pure INSERTs.
However, this patch does not handle file groups that consist only of log
files. While debugging, I found that this input format discovers file groups
from the visible parquet files first; the log files are hidden files and are
ignored. I have added a one-line comment to the test case.
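
The limitation can be sketched as follows. This is a simplified Python illustration, not the actual input format code: the file names and the `discover_file_groups` helper are hypothetical, and the only assumptions carried over from the comment are that base files are visible parquet files and that log files are hidden (dot-prefixed).

```python
def discover_file_groups(file_names):
    """Build file groups starting from visible base files only.

    Hypothetical naming: '<fileId>_<instant>.parquet' for base files and
    '.<fileId>_<instant>.log.<version>' for (hidden) log files.
    """
    groups = {}
    # Pass 1: only visible parquet files create a file group.
    for name in file_names:
        if not name.startswith(".") and name.endswith(".parquet"):
            file_id = name.split("_")[0]
            groups[file_id] = {"base": name, "logs": []}
    # Pass 2: logs are attached only to groups that already exist.
    for name in file_names:
        if name.startswith(".") and ".log." in name:
            file_id = name[1:].split("_")[0]
            if file_id in groups:
                groups[file_id]["logs"].append(name)
    return groups


files = ["fg1_001.parquet", ".fg1_001.log.1", ".fg2_002.log.1"]
groups = discover_file_groups(files)
```

Here `fg2` has only a log file, so it never becomes a file group and its records are not read.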
--
> Support reading logs for MOR Hive rt table
> ------------------------------------------
>
> Key: HUDI-1969
> URL: https://issues.apache.org/jira/browse/HUDI-1969
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Hive Integration
> Reporter: Danny Chen
> Priority: Major
> Labels: pull-request-available
>