[
https://issues.apache.org/jira/browse/HUDI-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-7033:
---------------------------------
Labels: pull-request-available (was: )
> Fix read error for schema evolution + partition value extraction
> ----------------------------------------------------------------
>
> Key: HUDI-7033
> URL: https://issues.apache.org/jira/browse/HUDI-7033
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: voon
> Priority: Major
> Labels: pull-request-available
>
> After HUDI-6960 is merged,
> *shouldExtractPartitionValuesFromPartitionPath* will correctly ignore
> partition columns in requiredSchema.
>
> When the two configs below are used together, reads fail.
>
> {code:java}
> hoodie.datasource.read.extract.partition.values.from.path = true {code}
>
> When the config above is combined with:
>
> {code:java}
> hoodie.schema.on.read.enable = true {code}
>
> The query schema will be pruned to *NOT* contain any partition
> columns.
>
> When parquet filters are rebuilt, the file schema's columns are looked up in
> querySchema. However, Hudi files (file schema) might still contain partition
> columns. When filters on those partition columns are rebuilt against the
> pruned query schema, the partition columns cannot be found, which raises the
> error below.
>
> {code:java}
> Caused by: java.lang.IllegalArgumentException: cannot found filter col
> name:region from querySchema: table {
> 5: id: optional int
> 6: name: optional string
> 7: ts: optional long
> }
> at org.apache.hudi.internal.schema.utils.InternalSchemaUtils.reBuildFilterName(InternalSchemaUtils.java:180)
> {code}
>
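The failure mode described above can be reproduced in isolation with a minimal sketch. It does not use Hudi's classes: `FilterRebuildSketch`, `resolveStrict`, and `resolveSkippingPartitionColumns` are hypothetical names, and the query schema is modelled as a plain name-to-field-id map taken from the stack trace. The lenient variant is only one possible mitigation, shown for illustration; it is not necessarily the fix adopted in the pull request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the filter rebuild step; not Hudi's actual code.
public class FilterRebuildSketch {

    // Strict lookup: every filter column must exist in the (pruned) query schema.
    // A filter on a partition column fails here because pruning removed it.
    static List<Integer> resolveStrict(List<String> filterCols,
                                       Map<String, Integer> querySchema) {
        List<Integer> fieldIds = new ArrayList<>();
        for (String col : filterCols) {
            Integer id = querySchema.get(col);
            if (id == null) {
                throw new IllegalArgumentException(
                    "filter column " + col + " not found in query schema " + querySchema.keySet());
            }
            fieldIds.add(id);
        }
        return fieldIds;
    }

    // Lenient lookup (assumption): drop filters on known partition columns first,
    // since their values come from the partition path, then resolve the rest strictly.
    static List<Integer> resolveSkippingPartitionColumns(List<String> filterCols,
                                                         Map<String, Integer> querySchema,
                                                         Set<String> partitionCols) {
        List<String> dataCols = new ArrayList<>();
        for (String col : filterCols) {
            if (!partitionCols.contains(col)) {
                dataCols.add(col);
            }
        }
        return resolveStrict(dataCols, querySchema);
    }

    public static void main(String[] args) {
        // Query schema after pruning, with the field ids shown in the stack trace.
        Map<String, Integer> querySchema = new LinkedHashMap<>();
        querySchema.put("id", 5);
        querySchema.put("name", 6);
        querySchema.put("ts", 7);

        // "region" is the partition column that still appears in the filters.
        List<String> filterCols = List.of("id", "region");

        try {
            resolveStrict(filterCols, querySchema);
        } catch (IllegalArgumentException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }

        System.out.println(
            resolveSkippingPartitionColumns(filterCols, querySchema, Set.of("region")));
    }
}
```

The strict lookup throws for `region`, matching the shape of the reported exception, while the lenient lookup resolves only `id`. A column that is missing and is not a partition column still fails in both variants, so genuine schema mismatches are not hidden.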
--
This message was sent by Atlassian Jira
(v8.20.10#820010)