dybyte opened a new pull request, #9833:
URL: https://github.com/apache/seatunnel/pull/9833

   Fixes https://github.com/apache/seatunnel/issues/9817
   ### Purpose of this pull request
   
   This PR introduces a new configuration option `JOB_METRICS_PARTITION_COUNT`, 
which controls the number of partitions used to store running job metrics in 
the Hazelcast IMap. Spreading metrics across multiple keys reduces contention 
and improves performance when multiple tasks report metrics concurrently.
   
   The default value of `JOB_METRICS_PARTITION_COUNT` is `1`, which stores all 
metrics under a single key, so no configuration is required to keep the 
existing behavior.  
   For better parallelism under high load, you can increase this value.  
   However, setting it too high adds overhead when metrics are aggregated 
across partitions, which can reduce overall performance.
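
The trade-off described above can be illustrated with a minimal sketch. It is not the code in this PR: the class name, the modulo-based key selection, and the per-task `Long` values are assumptions made for illustration, and a `ConcurrentHashMap` stands in for the Hazelcast `IMap` (which also implements `ConcurrentMap`). Writes touch only one partition key, while a read has to visit every partition.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: metrics spread over `partitionCount` map keys.
final class PartitionedMetricsSketch {
    // Stand-in for the Hazelcast IMap; one entry per partition index.
    private final ConcurrentMap<Integer, Map<Long, Long>> store = new ConcurrentHashMap<>();
    private final int partitionCount;

    PartitionedMetricsSketch(int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("partitionCount must be >= 1");
        }
        this.partitionCount = partitionCount;
    }

    // Assumed key selection: task id modulo partition count.
    int partitionFor(long taskId) {
        return (int) Math.floorMod(taskId, (long) partitionCount);
    }

    // A write only contends with other writers of the same partition key.
    void report(long taskId, long value) {
        store.compute(partitionFor(taskId), (key, existing) -> {
            Map<Long, Long> updated =
                    existing == null ? new HashMap<>() : new HashMap<>(existing);
            updated.put(taskId, value);
            return updated;
        });
    }

    // A read visits every partition; this is the aggregation overhead
    // that grows with the partition count.
    Map<Long, Long> aggregate() {
        Map<Long, Long> all = new HashMap<>();
        for (Map<Long, Long> partition : store.values()) {
            all.putAll(partition);
        }
        return all;
    }

    int usedPartitions() {
        return store.size();
    }

    public static void main(String[] args) {
        PartitionedMetricsSketch metrics = new PartitionedMetricsSketch(4);
        for (long taskId = 0; taskId < 8; taskId++) {
            metrics.report(taskId, taskId * 10);
        }
        System.out.println("partitions used: " + metrics.usedPartitions());
        System.out.println("entries after aggregation: " + metrics.aggregate().size());
    }
}
```

With a partition count of `1`, every task maps to the same key, which corresponds to the single-key default described above.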
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Users can tune `JOB_METRICS_PARTITION_COUNT` to balance write 
contention and aggregation overhead. Default is 1 for backward compatibility.
   
   ### How was this patch tested?
   
   Current tests verify that `metricsImap` is updated and cleared with a custom 
partition configuration.
   
   Using `compute` makes each update atomic and merges it with the existing 
metrics instead of overwriting them, which reduces write contention compared to 
the previous synchronized get/put approach. It also minimizes network 
round-trips, improving performance under concurrent updates.
   
   In performance tests, we populated an initially empty `localMap` with a 
large number of entries to induce contention. Comparing the previous structure 
and the new compute-based structure with a single key, the new approach was 
observed to be faster under these conditions.
   
   ### Check list
   
   * [ ] If your PR adds any new Jar binary package, please add a License 
Notice according to the
     [New License 
Guide](https://github.com/apache/seatunnel/blob/dev/docs/en/contribution/new-license.md)
   * [ ] If necessary, please update the documentation to describe the new 
feature. https://github.com/apache/seatunnel/tree/dev/docs
   * [ ] If you are contributing the connector code, please check that the 
following files are updated:
     1. Update 
[plugin-mapping.properties](https://github.com/apache/seatunnel/blob/dev/plugin-mapping.properties)
 and add new connector information in it
     2. Update the pom file of 
[seatunnel-dist](https://github.com/apache/seatunnel/blob/dev/seatunnel-dist/pom.xml)
     3. Add ci label in 
[label-scope-conf](https://github.com/apache/seatunnel/blob/dev/.github/workflows/labeler/label-scope-conf.yml)
     4. Add e2e testcase in 
[seatunnel-e2e](https://github.com/apache/seatunnel/tree/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/)
     5. Update connector 
[plugin_config](https://github.com/apache/seatunnel/blob/dev/config/plugin_config)


