Aleksey Plekhanov created IGNITE-10642:
------------------------------------------
Summary: Cache metrics distribution mechanism should be changed from broadcast to request-response communication pattern
Key: IGNITE-10642
URL: https://issues.apache.org/jira/browse/IGNITE-10642
Project: Ignite
Issue Type: Improvement
Affects Versions: 2.7
Reporter: Aleksey Plekhanov

In the current implementation, all cache metrics are collected on each node for all caches and sent across the whole cluster in a discovery message ({{TcpDiscoveryMetricsUpdateMessage}}) at the configured frequency (MetricsUpdateFrequency, 2 seconds by default), even if no one has requested them.

This mechanism should be changed in the following ways:
* Local cache metrics should be available (if configured) on each node.
* If a node needs to collect data from the cluster, it sends an explicit request over the communication SPI (the request should contain a limited set of caches and/or metrics).
* For performance reasons, collected cluster-wide values must be cached. Previously collected metrics should be returned without being requested again as long as they are not older than a configurable maximum age.
* The mechanism should be easily adaptable to other types of statistics that will probably need to be shared between nodes in the future (IO statistics, SQL statistics, SQL execution history, etc.).
* The message format should be carefully designed to minimize message size (a cluster can contain thousands of caches and hundreds of nodes).
* It must be possible to configure metrics at runtime.
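
To make the request-response and caching requirements above more concrete, here is a minimal sketch in Java. It is an illustration under stated assumptions only, not a proposed implementation: {{ClusterMetricsCollector}}, {{MetricsTransport}}, {{clusterMetrics}} and {{setMaxAgeMs}} are hypothetical names that do not exist in Ignite, and the transport interface merely stands in for a request sent over the communication SPI. The sketch shows a per-request cache that reuses cluster-wide values until they exceed a maximum age that can be changed at runtime.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical sketch; none of these types are part of Apache Ignite. */
class ClusterMetricsCollector {
    /** Stand-in (assumption) for an explicit request over the communication SPI. */
    interface MetricsTransport {
        /** Returns metric name -> cluster-wide value for one cache and a limited metric set. */
        Map<String, Long> request(String cacheName, Set<String> metricNames);
    }

    /** Previously collected values together with their collection time. */
    private static final class Entry {
        final Map<String, Long> values;
        final long collectedAtMs;

        Entry(Map<String, Long> values, long collectedAtMs) {
            this.values = values;
            this.collectedAtMs = collectedAtMs;
        }
    }

    private final MetricsTransport transport;
    private final LongSupplier clockMs;
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    /** Maximum age of a cached result; volatile so it can be changed at runtime. */
    private volatile long maxAgeMs;

    ClusterMetricsCollector(MetricsTransport transport, LongSupplier clockMs, long maxAgeMs) {
        this.transport = transport;
        this.clockMs = clockMs;
        this.maxAgeMs = maxAgeMs;
    }

    void setMaxAgeMs(long maxAgeMs) {
        this.maxAgeMs = maxAgeMs;
    }

    /** Returns cached cluster-wide values, asking the cluster again only when they are too old. */
    Map<String, Long> clusterMetrics(String cacheName, Set<String> metricNames) {
        // Sort the metric names so that equal sets always map to the same cache key.
        String key = cacheName + '|' + new TreeSet<>(metricNames);
        long now = clockMs.getAsLong();
        Entry entry = cache.get(key);

        if (entry == null || now - entry.collectedAtMs > maxAgeMs) {
            entry = new Entry(transport.request(cacheName, metricNames), now);
            cache.put(key, entry);
        }

        return entry.values;
    }
}
```

In this sketch, two calls for the same cache and metric set within the maximum age trigger a single request. Concurrent callers may still issue duplicate requests for the same key; a real design would likely coalesce them and would also need the compact message format mentioned above, which is not modeled here.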