cshuo opened a new pull request, #14197:
URL: https://github.com/apache/hudi/pull/14197
### Describe the issue this Pull Request addresses
<!-- Either describe the issue inline here with motivation behind the
changes
(or) link to an issue by including `Closes #<issue-number>` for
context.
If this PR includes changes to the storage format, public APIs,
or has breaking changes, use `!` (e.g., feat!: ...) -->
Fix issue: https://github.com/apache/hudi/issues/14196
### Summary and Changelog
The predicate currently pushed down to the base file reader in the Flink FileGroup
reader can produce wrong results in some cases, for example:
// base file: (uuid: 'k1', age: 23, ts: 1003)
// log file: (uuid: 'k1', age: 25, ts: 1001)
// query filter: age = 25
// The expected result is empty, but if the predicate age = 25 is pushed down
// into the parquet reader, the result is wrong: (uuid: 'k1', age: 25, ts: 1001)
The pushed-down predicate drops the base file record before merging, so the log
record is no longer superseded by it and is returned as is.
When there are log files in the file slice to read, we should make sure the
pushed-down predicate contains only primary key fields.
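The case above can be reproduced with a small standalone simulation. This is only a sketch, not Hudi code: `merge_file_slice`, `query`, and the two lambda predicates are hypothetical names, and it assumes that records sharing a `uuid` are merged by keeping the one with the larger `ts`.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Predicate = Callable[[Record], bool]


def merge_file_slice(base: List[Record], log: List[Record],
                     base_predicate: Optional[Predicate] = None) -> List[Record]:
    """Merge base and log records by uuid; the record with the larger ts wins.

    base_predicate stands in for a predicate pushed down into the base file
    reader: base records that fail it never reach the merge step.
    """
    merged: Dict[object, Record] = {}
    for rec in base:
        if base_predicate is None or base_predicate(rec):
            merged[rec["uuid"]] = rec
    for rec in log:
        current = merged.get(rec["uuid"])
        # Assumption: ordering is decided by ts, ties go to the log record.
        if current is None or rec["ts"] >= current["ts"]:
            merged[rec["uuid"]] = rec
    return list(merged.values())


def query(base: List[Record], log: List[Record], query_filter: Predicate,
          pushed_down: Optional[Predicate] = None) -> List[Record]:
    """Apply the query filter on top of the merged file slice."""
    merged = merge_file_slice(base, log, pushed_down)
    return [rec for rec in merged if query_filter(rec)]


base = [{"uuid": "k1", "age": 23, "ts": 1003}]
log = [{"uuid": "k1", "age": 25, "ts": 1001}]
age_is_25 = lambda rec: rec["age"] == 25
key_is_k1 = lambda rec: rec["uuid"] == "k1"

# Pushing down the non-key predicate hides the newer base record from the merge.
wrong = query(base, log, age_is_25, pushed_down=age_is_25)
# No pushdown: the base record (ts 1003) wins the merge and the filter removes it.
correct = query(base, log, age_is_25, pushed_down=None)
# A predicate on the primary key only is safe to push down.
key_only = query(base, log, age_is_25, pushed_down=key_is_k1)

print(wrong)     # [{'uuid': 'k1', 'age': 25, 'ts': 1001}]
print(correct)   # []
print(key_only)  # []
```

A key-only predicate is safe in this model because it keeps or drops all versions of a key together, so it cannot change which version wins the merge.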
### Impact
<!-- Describe any public API or user-facing feature change or any
performance impact. -->
### Risk Level
<!-- Accepted values: none, low, medium or high. Other than `none`, explain
the risk.
If medium or high, explain what verification was done to mitigate the
risks. -->
### Documentation Update
<!-- Describe any necessary documentation update if there is any new
feature, config, or user-facing change. If not, put "none".
- The config description must be updated if new configs are added or the
default value of the configs are changed.
- Any new feature or user-facing change requires updating the Hudi website.
Please follow the
[instruction](https://hudi.apache.org/contribute/developer-setup#website)
to make changes to the website. -->
### Contributor's checklist
- [ ] Read through [contributor's
guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Enough context is provided in the sections above
- [ ] Adequate tests were added if applicable