anishshri-db opened a new pull request, #40600:
URL: https://github.com/apache/spark/pull/40600

   ### What changes were proposed in this pull request?
   Add an option to skip the commit coordinator in the StreamingWrite API for DSv2 sources/sinks. This option already exists in the BatchWrite API.
   
   ### Why are the changes needed?
   Sinks such as the following are at-least-once, so they do not need to go through the commit coordinator on the driver to ensure that only one task attempt commits for each partition. The coordinator is even less useful for streaming use cases, where batches can be replayed from the checkpoint directory.
   
   - memory sink
   - console sink
   - no-op sink
   - Kafka v2 sink
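
The opt-out can be pictured with a small, self-contained sketch. This is not Spark's code: `SketchStreamingWrite`, `SketchCoordinatedWrite`, `SketchAtLeastOnceWrite`, and `CommitCoordinatorSketch` are hypothetical stand-ins, and the only name taken from this PR is `useCommitCoordinator` (as it appears in the test names below). The sketch assumes the option is a default method that returns `true`, so existing sinks keep their current behavior unless they override it.

```java
// Simplified stand-in for a streaming write interface (hypothetical, not Spark's class).
interface SketchStreamingWrite {
    // Assumed default: keep using the commit coordinator unless a sink opts out.
    default boolean useCommitCoordinator() {
        return true;
    }

    void commit(long epochId);
}

// Hypothetical sink that keeps the default and relies on coordinated commits.
class SketchCoordinatedWrite implements SketchStreamingWrite {
    @Override
    public void commit(long epochId) {
        System.out.println("coordinated sink committed epoch " + epochId);
    }
}

// Hypothetical at-least-once sink (think console or no-op) that skips the coordinator.
class SketchAtLeastOnceWrite implements SketchStreamingWrite {
    @Override
    public boolean useCommitCoordinator() {
        return false;
    }

    @Override
    public void commit(long epochId) {
        System.out.println("at-least-once sink committed epoch " + epochId);
    }
}

class CommitCoordinatorSketch {
    // Describes which commit path a writer task would take for the given write.
    static String commitPath(SketchStreamingWrite write) {
        return write.useCommitCoordinator()
            ? "ask the driver-side coordinator before committing"
            : "commit directly without contacting the driver";
    }

    public static void main(String[] args) {
        System.out.println(commitPath(new SketchCoordinatedWrite()));
        System.out.println(commitPath(new SketchAtLeastOnceWrite()));
    }
}
```

Keeping the default at `true` means only sinks that explicitly override the method change behavior, which matches the "no user-facing change" statement below.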
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added unit test for the change
   ```
   [info] ReportSinkMetricsSuite:
   22:23:01.276 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   22:23:03.139 WARN org.apache.spark.sql.execution.streaming.ResolveWriteToStream: spark.sql.adaptive.enabled is not supported in streaming DataFrames/Datasets and will be disabled.
   [info] - test ReportSinkMetrics with useCommitCoordinator=true (2 seconds, 709 milliseconds)
   22:23:04.522 WARN org.apache.spark.sql.execution.streaming.ResolveWriteToStream: spark.sql.adaptive.enabled is not supported in streaming DataFrames/Datasets and will be disabled.
   [info] - test ReportSinkMetrics with useCommitCoordinator=false (373 milliseconds)
   22:23:04.941 WARN org.apache.spark.sql.streaming.ReportSinkMetricsSuite:
   
   ===== POSSIBLE THREAD LEAK IN SUITE o.a.s.sql.streaming.ReportSinkMetricsSuite, threads: ForkJoinPool.commonPool-worker-19 (daemon=true), rpc-boss-3-1 (daemon=true), shuffle-boss-6-1 (daemon=true) =====
   [info] Run completed in 4 seconds, 934 milliseconds.
   [info] Total number of tests run: 2
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 2, failed 0, canceled 0, ignored 0, pending 0
   [info] All tests passed.
   ```

