sivabalan narayanan created HUDI-4835:
-----------------------------------------
Summary: Speed up reading of files from S3 in S3EventsIncrSource
Key: HUDI-4835
URL: https://issues.apache.org/jira/browse/HUDI-4835
Project: Apache Hudi
Issue Type: Improvement
Components: deltastreamer
Reporter: sivabalan narayanan
In S3EventsIncrSource, we load a DataFrame of N files using
dataframeReader.load(files[]). We can speed up reading the S3 files by
using spark.parallelize() to distribute the file list across executors.
Ref issue: https://github.com/apache/hudi/issues/5952
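
A rough sketch of the idea, written in PySpark for brevity rather than in
Hudi's own Java code. It is not the S3EventsIncrSource implementation. The
helper names (read_lines, read_partition, load_files_in_parallel) and the
num_slices parameter are hypothetical, and read_lines uses the local
filesystem as a stand-in for an S3 reader. The only Spark calls assumed are
sparkContext.parallelize() and RDD.mapPartitions().

```python
from typing import Callable, Iterable, Iterator, List


def read_lines(path: str) -> Iterator[str]:
    """Hypothetical per-file reader: yields the lines of one text file.

    A real S3 source would open the object with an S3 client or the
    Hadoop FileSystem API instead of the local open() used here.
    """
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")


def read_partition(
    paths: Iterable[str],
    reader: Callable[[str], Iterator[str]] = read_lines,
) -> Iterator[str]:
    """Read every file assigned to one partition, one file after another."""
    for path in paths:
        yield from reader(path)


def load_files_in_parallel(spark, paths: List[str], num_slices: int):
    """Split the path list into num_slices partitions and read them on executors.

    `spark` is an existing SparkSession. The result is an RDD of lines.
    """
    rdd = spark.sparkContext.parallelize(paths, num_slices)
    return rdd.mapPartitions(read_partition)
```

With this shape each executor opens only the files in its own slice of the
path list, so the reads run in parallel across slices. For JSON-lines input,
the resulting RDD of strings could then be turned back into a DataFrame with
spark.read.json(rdd).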