jiwen624 opened a new pull request, #46604:
URL: https://github.com/apache/spark/pull/46604

   ### What changes were proposed in this pull request?
   Working on it...
   
   ### Why are the changes needed?
   As described in the Jira ticket 
https://issues.apache.org/jira/browse/SPARK-48298, the StatsdSink in Spark 
currently supports only UDP, which is the default StatsD transport. In 
production, high-traffic systems often need a more reliable transport to 
avoid losing metrics.
   
   TCP mode is already supported by StatsD itself 
(https://github.com/statsd/statsd/blob/master/docs/server.md), by Prometheus' 
statsd_exporter (https://github.com/prometheus/statsd_exporter), and by many 
other StatsD-based metrics proxies and receivers.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. 
   The following new config options are added to 
`conf/metrics.properties.template`:
   `*.sink.statsd.protocol`
   `*.sink.statsd.connTimeoutMs`
   A new error condition for protocol configuration errors is defined in 
`error-conditions.json`.
   
   ### How was this patch tested?
   Unit tests.
   Manual tests with metrics configurations that send metrics to a Netcat 
TCP/UDP server.
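
For readers who want to reproduce a similar manual check without Netcat, the following is a minimal sketch, not the PR's test code. It starts a throwaway local listener for each transport and sends one gauge in the StatsD line format (`<name>:<value>|g`). The function names, the metric name `demo.gauge`, and the `conn_timeout_ms` parameter are hypothetical and only loosely mirror the `connTimeoutMs` option named above.

```python
import socket
import threading


def format_gauge(name, value):
    """Render one gauge in the StatsD line format, newline-terminated."""
    return f"{name}:{value}|g\n".encode("utf-8")


def send_udp(host, port, payload):
    """Fire-and-forget datagram: no connection, no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def send_tcp(host, port, payload, conn_timeout_ms=1000):
    """Connect with a timeout (hypothetical parameter), then send everything."""
    timeout_s = conn_timeout_ms / 1000.0
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        sock.sendall(payload)


def receive_tcp_once(server):
    """Accept a single connection and read until the sender closes it."""
    conn, _ = server.accept()
    chunks = []
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def run_demo():
    """Send the same gauge over TCP and UDP to local listeners."""
    payload = format_gauge("demo.gauge", 42)
    received = {}

    # TCP: listen on an OS-assigned port and receive in a background thread.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        server.settimeout(5)
        port = server.getsockname()[1]
        worker = threading.Thread(
            target=lambda: received.update(tcp=receive_tcp_once(server))
        )
        worker.start()
        send_tcp("127.0.0.1", port, payload)
        worker.join(timeout=5)

    # UDP: bind first, then send; the datagram waits in the socket buffer.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))
        server.settimeout(5)
        port = server.getsockname()[1]
        send_udp("127.0.0.1", port, payload)
        received["udp"], _ = server.recvfrom(4096)

    return received


if __name__ == "__main__":
    print(run_demo())
```

On loopback both transports deliver the line, so this only confirms the wire format; the reliability difference that motivates TCP shows up under load or across a real network, which this sketch does not attempt to simulate.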
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

