[
https://issues.apache.org/jira/browse/HUDI-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinoth Chandar updated HUDI-1442:
---------------------------------
Fix Version/s: 0.14.0
(was: 1.0.0)
> Simplify clustering executor SparkRunClusteringCommitActionExecutor
> -------------------------------------------------------------------
>
> Key: HUDI-1442
> URL: https://issues.apache.org/jira/browse/HUDI-1442
> Project: Apache Hudi
> Issue Type: Task
> Components: performance
> Reporter: satish
> Priority: Minor
> Fix For: 0.14.0
>
>
> readRecordsForGroup in SparkRunClusteringCommitActionExecutor has two
> implementations:
> 1) readRecordsForGroupWithLogs, which reads records from file slices that
> have log files
> 2) readRecordsForGroupBaseFiles, which reads records from file slices that
> don't have log files
> If there's no performance impact from using #1, we can use the same
> approach for file slices that don't have log files.
> Measure the performance of both and remove #2 if there is no significant
> difference.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)