satish created HUDI-1442:
----------------------------
Summary: Simplify clustering executor
SparkRunClusteringCommitActionExecutor
Key: HUDI-1442
URL: https://issues.apache.org/jira/browse/HUDI-1442
Project: Apache Hudi
Issue Type: Sub-task
Components: Performance
Reporter: satish
readRecordsForGroup in SparkRunClusteringCommitActionExecutor has two
implementations:
1) readRecordsForGroupWithLogs, to read records from file slices with log files
2) readRecordsForGroupBaseFiles, to read records from file slices that don't
have log files
If there's no performance impact from using #1, we can use the same approach
for file slices that don't have log files.
Do a performance measurement and remove #2 if there is no big difference.
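
A rough sketch of the comparison proposed above, in plain Java with no Hudi or
Spark dependency. Everything in it is a hypothetical stand-in: FileSliceStub,
readWithLogs, readBaseOnly and timeNanos are illustrative names, not the
executor's real methods, and the "merge" is plain concatenation rather than a
real log merge. It only shows the shape of the check: for a slice with no log
files both paths should return the same records, so the remaining question is
the time difference.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ClusteringReadSketch {

    // Hypothetical stand-in for a file slice: base-file records plus
    // records coming from zero or more log files.
    static final class FileSliceStub {
        final List<String> baseRecords;
        final List<String> logRecords;

        FileSliceStub(List<String> baseRecords, List<String> logRecords) {
            this.baseRecords = baseRecords;
            this.logRecords = logRecords;
        }

        boolean hasLogFiles() {
            return !logRecords.isEmpty();
        }
    }

    // Stand-in for path #1: read base records, then apply log records.
    // Assumption: concatenation replaces the real merge logic.
    static List<String> readWithLogs(FileSliceStub slice) {
        List<String> out = new ArrayList<>(slice.baseRecords);
        out.addAll(slice.logRecords);
        return out;
    }

    // Stand-in for path #2: read base records only.
    static List<String> readBaseOnly(FileSliceStub slice) {
        return new ArrayList<>(slice.baseRecords);
    }

    // Wall-clock time for running the given work `iterations` times.
    static long timeNanos(Runnable work, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            work.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<String> base = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            base.add("record-" + i);
        }
        // A slice without log files, the case path #2 exists for.
        FileSliceStub slice = new FileSliceStub(base, Collections.emptyList());

        // Both paths must agree before timing them is meaningful.
        if (!readWithLogs(slice).equals(readBaseOnly(slice))) {
            throw new IllegalStateException("paths disagree on a base-only slice");
        }

        long withLogs = timeNanos(() -> readWithLogs(slice), 50);
        long baseOnly = timeNanos(() -> readBaseOnly(slice), 50);
        System.out.println("path #1 (with logs): " + withLogs + " ns");
        System.out.println("path #2 (base only): " + baseOnly + " ns");
    }
}
```

Timings from a loop like this are only indicative (no JIT warm-up, no I/O); the
actual measurement would need to run both read paths on a real clustering
workload.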