HeartSaVioR commented on pull request #28841: URL: https://github.com/apache/spark/pull/28841#issuecomment-666732439
> There are quite a bit of commits here. Would it be preferred if I re-opened this PR for a cleaner merge to master?

Let's hear other voices on reopening the PR: there are already a huge number of review comments, and those comments are lost once you reopen the PR. We can do that at any time we feel it's needed, even just before merging, so personally I don't think it matters much.

> Once this is committed, I'd be curious to follow up on SPARK-32155. I noticed your repo, https://github.com/HeartSaVioR/structured_streaming_experiments, for stress-testing streaming from a file data source. Particularly your findings between streaming from a file data source to delta lake vs Iceberg.

I've raised a related discussion thread, but haven't gotten enough input so far. Please add your voice to the discussion if you are interested. IMHO latestFirst and start by timestamp conflict and can't go together.

https://lists.apache.org/thread.html/r08e3a8d7df74354b38d19ffdebe1afe7fa73c2f611f0a812a867dffb%40%3Cdev.spark.apache.org%3E
