[jira] [Updated] (HUDI-1442) Simplify clustering executor SparkRunClusteringCommitActionExecutor

Vinoth Chandar (Jira) Wed, 17 May 2023 08:53:06 -0700


     [ 
https://issues.apache.org/jira/browse/HUDI-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vinoth Chandar updated HUDI-1442:
---------------------------------
    Fix Version/s: 0.14.0
                       (was: 1.0.0)

> Simplify clustering executor SparkRunClusteringCommitActionExecutor
> -------------------------------------------------------------------
>
>                 Key: HUDI-1442
>                 URL: https://issues.apache.org/jira/browse/HUDI-1442
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: performance
>            Reporter: satish
>            Priority: Minor
>             Fix For: 0.14.0
>
>
>  readRecordsForGroup in SparkRunClusteringCommitActionExecutor has two 
> implementations:
> 1) readRecordsForGroupWithLogs, to read records from file slices that have 
> log files
> 2) readRecordsForGroupBaseFiles, to read records from file slices that don't 
> have log files
> If there's no performance impact from using #1, we can use the same 
> approach for file slices that don't have log files.
> Do a performance measurement and remove #2 if there is no big difference.
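
The comparison proposed in the description can be illustrated with a toy
model. This is only a sketch and not Hudi code: the class
ClusteringReadSketch, the FileSlice stand-in, and the methods
readWithLogs, readBaseFileOnly, and timeNanos are all hypothetical names.
It assumes records are plain key/value pairs and that a log file simply
overwrites keys. It shows that a log-aware read over a slice with an empty
log list returns the same records as a base-file-only read, and it times
both paths crudely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClusteringReadSketch {

    /** Hypothetical stand-in for a file slice: base-file records plus zero or more log files. */
    static final class FileSlice {
        final Map<String, String> baseRecords;
        final List<Map<String, String>> logFiles;

        FileSlice(Map<String, String> baseRecords, List<Map<String, String>> logFiles) {
            this.baseRecords = baseRecords;
            this.logFiles = logFiles;
        }
    }

    /** Models path #2: copy the base-file records and ignore logs entirely. */
    static Map<String, String> readBaseFileOnly(FileSlice slice) {
        return new LinkedHashMap<>(slice.baseRecords);
    }

    /** Models path #1: start from the base records, then apply each log in order (last write wins). */
    static Map<String, String> readWithLogs(FileSlice slice) {
        Map<String, String> merged = new LinkedHashMap<>(slice.baseRecords);
        for (Map<String, String> log : slice.logFiles) {
            merged.putAll(log);
        }
        return merged;
    }

    /** Crude wall-clock timing of repeated calls; a real measurement would use a benchmark harness. */
    static long timeNanos(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        for (int i = 0; i < 100_000; i++) {
            base.put("key-" + i, "value-" + i);
        }
        // A slice without log files: both paths must return the same records.
        FileSlice baseOnly = new FileSlice(base, new ArrayList<>());

        boolean same = readWithLogs(baseOnly).equals(readBaseFileOnly(baseOnly));
        System.out.println("results match: " + same);

        long withLogs = timeNanos(() -> readWithLogs(baseOnly), 50);
        long baseFiles = timeNanos(() -> readBaseFileOnly(baseOnly), 50);
        System.out.println("path #1 (log-aware) ns: " + withLogs);
        System.out.println("path #2 (base only) ns: " + baseFiles);
    }
}
```

In this toy model the two paths do nearly identical work, so the timings say
nothing about Hudi itself. The real cost difference would come from the
actual file readers and merge logic, which is what the issue asks to
measure.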



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
