yihua commented on issue #5952:
URL: https://github.com/apache/hudi/issues/5952#issuecomment-1168986618
Thanks for the feature request.
The code you referenced in `S3EventsSource` converts JSON records that are
already held in a `Dataset` into a DataFrame for further processing. Are you
actually referring to optimizing how events are read from SQS (which should
not involve any file reading)?
```
Dataset<String> eventRecords =
    sparkSession.createDataset(selectPathsWithLatestSqsMessage.getLeft(), Encoders.STRING());
return Pair.of(
    Option.of(sparkSession.read().json(eventRecords)),
    selectPathsWithLatestSqsMessage.getRight());
```
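To illustrate the point, here is a minimal standalone sketch (not Hudi code) of the same conversion. It assumes a local `SparkSession` and Spark 2.2 or later; the class name, the `toDataFrame` helper, and the message payloads are made up for the example. `read().json(Dataset<String>)` parses strings that are already in memory, so no file path is involved at this step.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JsonStringsToDataFrame {

  // Parses JSON strings that are already in memory; no file paths are read here.
  static Dataset<Row> toDataFrame(SparkSession spark, List<String> jsonMessages) {
    Dataset<String> eventRecords = spark.createDataset(jsonMessages, Encoders.STRING());
    return spark.read().json(eventRecords);
  }

  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .master("local[1]")
        .appName("json-strings-to-dataframe")
        .getOrCreate();

    // Hypothetical payloads standing in for message bodies fetched from SQS.
    List<String> messages = Arrays.asList(
        "{\"bucket\":\"example-bucket\",\"key\":\"a.parquet\",\"size\":10}",
        "{\"bucket\":\"example-bucket\",\"key\":\"b.parquet\",\"size\":20}");

    Dataset<Row> events = toDataFrame(spark, messages);
    events.printSchema();
    events.show(false);

    spark.stop();
  }
}
```

The schema is inferred from the strings themselves, which is why this step only reshapes data that has already been fetched.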
Feel free to create a Jira ticket for the feature request, and I encourage
you to put up a PR.