yihua commented on issue #5952:
URL: https://github.com/apache/hudi/issues/5952#issuecomment-1168986618

   Thanks for the feature request.
   
   The code you referenced in `S3EventsSource` converts JSON records that are
   already held in a `Dataset<String>` into a DataFrame for further processing.
   Are you actually referring to optimizing how events are read from SQS (which
   should not involve any file reading)?
   ```
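   // Wrap the in-memory JSON event records in a Dataset<String>.
   Dataset<String> eventRecords = sparkSession.createDataset(
       selectPathsWithLatestSqsMessage.getLeft(), Encoders.STRING());
   // Parse the JSON strings into a DataFrame; this call reads from the Dataset, not from files.
   return Pair.of(
       Option.of(sparkSession.read().json(eventRecords)),
       selectPathsWithLatestSqsMessage.getRight());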
   ```
   
   Feel free to create a Jira ticket for the feature request, and I encourage
   you to put up a PR.


