[
https://issues.apache.org/jira/browse/HUDI-8621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sagar Sumit updated HUDI-8621:
------------------------------
Description:
In [https://github.com/apache/hudi/pull/12376], we attempted to revert the
single-file-slice optimization so that computations such as getRecordByKeys
run on executors even when only a single file slice is involved. As a result,
when files are listed through the metadata files index, the listing happens on
an executor even if the data partition has only one file slice, and the request
is sent to the timeline server (RemoteFileSystemView). However, we noticed that
the timeline server did not respond and the request timed out when
bootstrapping a MOR table with multiple partition fields.
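For illustration, the behavior change described above can be sketched roughly
as follows. This is a minimal sketch under stated assumptions, not Hudi's code:
the class and method names (SingleSliceDispatchSketch, lookup, lookupAll) are
hypothetical, and a local thread pool stands in for Spark executors. It only
shows the dispatch decision that the optimization controls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: hypothetical names, not Hudi's classes or methods.
final class SingleSliceDispatchSketch {

    // Stand-in for a per-file-slice lookup; reports whether it ran on the
    // calling thread ("local") or on a pool thread ("pooled").
    static String lookup(String fileSlice, Thread caller) {
        String where = Thread.currentThread() == caller ? "local" : "pooled";
        return fileSlice + ":" + where;
    }

    static List<String> lookupAll(List<String> fileSlices,
                                  boolean singleSliceOptimization,
                                  ExecutorService pool) {
        Thread caller = Thread.currentThread();
        if (singleSliceOptimization && fileSlices.size() == 1) {
            // Fast path: a single file slice is handled on the calling thread.
            return List.of(lookup(fileSlices.get(0), caller));
        }
        // Distributed path: every slice, even a lone one, goes to the pool
        // (the pool is an assumed stand-in for executors).
        List<Future<String>> futures = new ArrayList<>();
        for (String slice : fileSlices) {
            futures.add(pool.submit(() -> lookup(slice, caller)));
        }
        List<String> results = new ArrayList<>();
        try {
            for (Future<String> future : futures) {
                results.add(future.get());
            }
        } catch (Exception e) {
            throw new IllegalStateException("lookup failed", e);
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // With the optimization: prints [slice-0:local]
            System.out.println(lookupAll(List.of("slice-0"), true, pool));
            // With the optimization reverted: prints [slice-0:pooled]
            System.out.println(lookupAll(List.of("slice-0"), false, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

With the flag set to false, the single-slice case takes the same remote path as
the multi-slice case, which corresponds to the situation in which the request
to the timeline server was observed to time out.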
To reproduce locally, follow the steps below:
# First, revert the single-file-slice optimization in
HoodieBackedTableMetadata. See this commit for reference:
[https://github.com/codope/hudi/commit/e9f58e007b8428e52f7d3d60e655108376950679]
# Now, run `TestBootstrapRead.testBootstrapFunctional`. You will notice
that the COW case passes, but the MOR case with 2 partition fields hangs while
fetching from the file system view.
was:
In [https://github.com/apache/hudi/pull/12376] - we attempted to revert the
optimization for single file slice, and do the computation such as
getRecordByKeys, etc. over executors even if it is for a single file slice.
This means when listing files using metadata files index, even if the data
partition has only one file slice, it happens over the executor and the request
is sent to the timeline server (RemoteFileSystemView). However, we noticed that
the timeline server did not respond and the request timed out in the case of
bootstrap of a MOR table having multiple partition fields.
To reproduce locally, follow below steps:
1. First, revert the single file slice optimization in HoodieBackedTableMetadata
> Bootstrap MOR with multiple partition fields fails when metadata enabled on
> read path
> ------------------------------------------------------------------------------------
>
> Key: HUDI-8621
> URL: https://issues.apache.org/jira/browse/HUDI-8621
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Sagar Sumit
> Priority: Major
>