[jira] [Created] (IGNITE-13982) Add documentation for new checkpoint, cluster and cache metrics
Amelchev Nikita created IGNITE-13982:

Summary: Add documentation for new checkpoint, cluster and cache metrics
Key: IGNITE-13982
URL: https://issues.apache.org/jira/browse/IGNITE-13982
Project: Ignite
Issue Type: Task
Reporter: Amelchev Nikita
Assignee: Amelchev Nikita
Fix For: 2.10

Add documentation for the new metrics:
* LastCheckpointBeforeLockDuration
* LastCheckpointListenersExecuteDuration
* LastCheckpointLockHoldDuration
* LastCheckpointWalCpRecordFsyncDuration
* LastCheckpointWriteCheckpointEntryDuration
* LastCheckpointSplitAndSortPagesDuration
* CheckpointBeforeLockHistogram
* CheckpointLockWaitHistogram
* CheckpointListenersExecuteHistogram
* CheckpointMarkHistogram
* CheckpointLockHoldHistogram
* CheckpointPagesWriteHistogram
* CheckpointFsyncHistogram
* CheckpointWalRecordFsyncHistogram
* CheckpointWriteEntryHistogram
* CheckpointSplitAndSortPagesHistogram
* CheckpointHistogram
* TopologyVersion
* TotalNodes
* TotalBaselineNodes
* TotalServerNodes
* TotalClientNodes
* ActiveBaselineNodes
* OffHeapEntriesCount
* OffHeapBackupEntriesCount
* OffHeapPrimaryEntriesCount
* HeapEntriesCount
* CacheSize

--
This message was sent by Atlassian Jira (v8.3.4#803005)
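The histogram metrics in the list above (CheckpointHistogram and friends) bucket observed checkpoint durations rather than storing a single value. A toy sketch of that idea follows; this is not Ignite's implementation, and the bucket bounds here are made up:

```java
import java.util.Arrays;

// Toy model of a duration histogram metric such as CheckpointHistogram:
// each recorded value increments the counter of the first bucket whose
// upper bound covers it; the last bucket is unbounded.
public class DurationHistogram {
    final long[] bounds;   // upper bounds in milliseconds (assumed values)
    final long[] counts;   // one extra slot for the unbounded tail bucket

    DurationHistogram(long[] bounds) {
        this.bounds = bounds;
        this.counts = new long[bounds.length + 1];
    }

    void record(long durationMs) {
        int i = 0;
        while (i < bounds.length && durationMs > bounds[i])
            i++;
        counts[i]++;
    }

    public static void main(String[] args) {
        DurationHistogram h = new DurationHistogram(new long[] {100, 500, 1000});
        h.record(50);    // falls in the <=100 ms bucket
        h.record(700);   // falls in the <=1000 ms bucket
        h.record(5000);  // falls in the unbounded tail bucket
        System.out.println(Arrays.toString(h.counts)); // prints [1, 0, 1, 1]
    }
}
```

Documentation for each histogram metric would then describe its bucket bounds and what operation phase the recorded duration covers.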
[jira] [Created] (IGNITE-13952) Cache metrics without description
Alexand Polyakov created IGNITE-13952:

Summary: Cache metrics without description
Key: IGNITE-13952
URL: https://issues.apache.org/jira/browse/IGNITE-13952
Project: Ignite
Issue Type: Sub-task
Reporter: Alexand Polyakov
Assignee: Alexand Polyakov

List of metrics without description, registered under org.apache:group=CACHE_NAME,name="<MXBean implementation>":

org.apache.ignite.internal.processors.cache.CacheClusterMetricsMXBeanImpl:
* RebalancedKeys
* EstimatedRebalancingKeys
* RebalanceClearingPartitionsLeft
* EntryProcessorAverageInvocationTime
* EntryProcessorHitPercentage
* EntryProcessorHits
* EntryProcessorInvocations
* EntryProcessorMaxInvocationTime
* EntryProcessorMinInvocationTime
* EntryProcessorMisses
* EntryProcessorMissPercentage
* EntryProcessorPuts
* EntryProcessorReadOnlyInvocations
* EntryProcessorRemovals

org.apache.ignite.internal.processors.cache.CacheLocalMetricsMXBeanImpl:
* RebalancedKeys
* EstimatedRebalancingKeys
* RebalanceClearingPartitionsLeft
* EntryProcessorAverageInvocationTime
* EntryProcessorHitPercentage
* EntryProcessorHits
* EntryProcessorInvocations
* EntryProcessorMaxInvocationTime
* EntryProcessorMinInvocationTime
* EntryProcessorMisses
* EntryProcessorMissPercentage
* EntryProcessorPuts
* EntryProcessorReadOnlyInvocations
* EntryProcessorRemovals
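The bean names in the list above follow one fixed pattern, so the JMX ObjectName for a given cache can be built programmatically. A small pure-JDK sketch, assuming exactly the org.apache:group=...,name="..." pattern shown in the ticket (the helper class CacheMBeanName is hypothetical, not Ignite code):

```java
import javax.management.ObjectName;

public class CacheMBeanName {
    // Builds the JMX ObjectName under which the metrics listed above are
    // registered; the pattern is taken from the ticket, where the group
    // key holds the cache name and the name key holds the MXBean class.
    static ObjectName cacheMetricsBean(String cacheName, boolean clusterWide) throws Exception {
        String impl = clusterWide
            ? "org.apache.ignite.internal.processors.cache.CacheClusterMetricsMXBeanImpl"
            : "org.apache.ignite.internal.processors.cache.CacheLocalMetricsMXBeanImpl";
        return new ObjectName("org.apache:group=" + cacheName + ",name=\"" + impl + "\"");
    }

    public static void main(String[] args) throws Exception {
        ObjectName name = cacheMetricsBean("test", true);
        System.out.println(name);
        // An attribute such as EntryProcessorInvocations could then be read
        // with MBeanServerConnection.getAttribute(name, "EntryProcessorInvocations")
        // against a running node's MBean server.
    }
}
```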
Re: Cache metrics on server nodes does not update correctly
Hi,

Apache Ignite 2.7.6 doesn't contain a bug with aggregation of cache hits/misses. I'm not sure that the described problem is related to IGNITE-3495 [1], so it makes sense to file an issue.

[1] https://issues.apache.org/jira/browse/IGNITE-3495

On Thu, Mar 12, 2020 at 8:21 PM Dominik Przybysz wrote:
>
> Hi,
> I used Ignite version 2.7.6 (but I have also seen this behaviour on other
> 2.7.x versions) and there is no near or local cache.
> I expect that if I ask a distributed cache for a key which does not exist,
> the miss metric will be incremented.
>
> On Wed, Mar 11, 2020 at 11:35, Andrey Gura wrote:
>>
>> Denis,
>>
>> I'm not sure I understand what the expected behavior should be.
>> There are local and aggregated cluster-wide metrics. I don't know
>> which one is used by Visor because I have never used it :)
>>
>> Also it would be great to know which version of Apache Ignite is used
>> in the described case. I remember a bug with metrics aggregation during
>> the discovery metrics message round trip.
>>
>> On Wed, Mar 11, 2020 at 12:05 AM Denis Magda wrote:
>> >
>> > @Nikolay Izhikov , @Andrey Gura ,
>> > could you folks check out this thread?
>> >
>> > I have a feeling that what Dominik is describing was talked out before
>> > and is rather some sort of a limitation than an issue with the current
>> > implementation.
>> >
>> > -
>> > Denis
>> >
>> > On Tue, Mar 3, 2020 at 11:41 PM Dominik Przybysz wrote:
>> >
>> > > Hi,
>> > > I am trying to use a partitioned cache on server nodes to which I
>> > > connect with a client node. Cache statistics in the cluster are
>> > > updated, but only for the hits metric - the misses metric is always 0.
>> > >
>> > > To reproduce this problem I created a cluster of two nodes:
>> > >
>> > > Server node 1 adds 100 random test entries and prints cache statistics
>> > > continuously:
>> > >
>> > > public class IgniteClusterNode1 {
>> > >     public static void main(String[] args) throws InterruptedException {
>> > >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
>> > >
>> > >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
>> > >         cacheConfiguration.setName("test");
>> > >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
>> > >         cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC);
>> > >         cacheConfiguration.setStatisticsEnabled(true);
>> > >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
>> > >
>> > >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
>> > >         communicationSpi.setLocalPort(47500);
>> > >         igniteConfiguration.setCommunicationSpi(communicationSpi);
>> > >
>> > >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>> > >         discoverySpi.setLocalPort(47100);
>> > >         discoverySpi.setLocalPortRange(100);
>> > >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
>> > >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
>> > >         discoverySpi.setIpFinder(ipFinder);
>> > >         igniteConfiguration.setDiscoverySpi(discoverySpi);
>> > >
>> > >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
>> > >             try (IgniteCache cache = ignite.getOrCreateCache("test")) {
>> > >                 new Random().ints(1000).map(i -> Math.abs(i % 1000))
>> > >                     .distinct().limit(100).forEach(i -> {
>> > >                         String key = "data_" + i;
>> > >                         String value = UUID.randomUUID().toString();
>> > >                         cache.put(key, value);
>> > >                     });
>> > >             }
>> > >             while (true) {
>> > >                 System.out.println(ignite.cache("test").metrics());
>> > >                 Thread.sleep(5000);
>> > >             }
>> > >         }
>> > >     }
>> > > }
>> > >
>> > > Server node 2 only prints cache statistics continuously:
>> > >
>> > > public class IgniteClusterNode2 {
>> > >     public static void main(String[] args) throws InterruptedException {
>> > >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
>> > >
>> > >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
>> > >         cacheConfiguration.setName("test");
>> > >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
>> > >         cacheConfiguration.setStatisticsEnabled(true);
>> > >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
>> > >
>> > >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
>> > >         communicationSpi.setLocalPort(48500);
>> > >         igniteConfiguration.setCommunicationSpi(communicationSpi);
>> > >
>> > >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>> > >         discoverySpi.setLocalPort(48100);
>> > >         discoverySpi.setLocalPortRange(100);
>> > >         TcpDiscoveryVmIpFinder
Re: Cache metrics on server nodes does not update correctly
Hi,

I used Ignite version 2.7.6 (but I have also seen this behaviour on other 2.7.x versions) and there is no near or local cache.

I expect that if I ask a distributed cache for a key which does not exist, the miss metric will be incremented.

On Wed, Mar 11, 2020 at 11:35, Andrey Gura wrote:
> Denis,
>
> I'm not sure I understand what the expected behavior should be.
> There are local and aggregated cluster-wide metrics. I don't know
> which one is used by Visor because I have never used it :)
>
> Also it would be great to know which version of Apache Ignite is used
> in the described case. I remember a bug with metrics aggregation during
> the discovery metrics message round trip.
>
> On Wed, Mar 11, 2020 at 12:05 AM Denis Magda wrote:
> >
> > @Nikolay Izhikov , @Andrey Gura ,
> > could you folks check out this thread?
> >
> > I have a feeling that what Dominik is describing was talked out before
> > and is rather some sort of a limitation than an issue with the current
> > implementation.
> >
> > -
> > Denis
> >
> > On Tue, Mar 3, 2020 at 11:41 PM Dominik Przybysz wrote:
> >
> > > Hi,
> > > I am trying to use a partitioned cache on server nodes to which I
> > > connect with a client node. Cache statistics in the cluster are
> > > updated, but only for the hits metric - the misses metric is always 0.
> > >
> > > To reproduce this problem I created a cluster of two nodes:
> > >
> > > Server node 1 adds 100 random test entries and prints cache statistics
> > > continuously:
> > >
> > > public class IgniteClusterNode1 {
> > >     public static void main(String[] args) throws InterruptedException {
> > >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
> > >
> > >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
> > >         cacheConfiguration.setName("test");
> > >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
> > >         cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC);
> > >         cacheConfiguration.setStatisticsEnabled(true);
> > >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
> > >
> > >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
> > >         communicationSpi.setLocalPort(47500);
> > >         igniteConfiguration.setCommunicationSpi(communicationSpi);
> > >
> > >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
> > >         discoverySpi.setLocalPort(47100);
> > >         discoverySpi.setLocalPortRange(100);
> > >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
> > >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
> > >         discoverySpi.setIpFinder(ipFinder);
> > >         igniteConfiguration.setDiscoverySpi(discoverySpi);
> > >
> > >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
> > >             try (IgniteCache cache = ignite.getOrCreateCache("test")) {
> > >                 new Random().ints(1000).map(i -> Math.abs(i % 1000))
> > >                     .distinct().limit(100).forEach(i -> {
> > >                         String key = "data_" + i;
> > >                         String value = UUID.randomUUID().toString();
> > >                         cache.put(key, value);
> > >                     });
> > >             }
> > >             while (true) {
> > >                 System.out.println(ignite.cache("test").metrics());
> > >                 Thread.sleep(5000);
> > >             }
> > >         }
> > >     }
> > > }
> > >
> > > Server node 2 only prints cache statistics continuously:
> > >
> > > public class IgniteClusterNode2 {
> > >     public static void main(String[] args) throws InterruptedException {
> > >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
> > >
> > >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
> > >         cacheConfiguration.setName("test");
> > >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
> > >         cacheConfiguration.setStatisticsEnabled(true);
> > >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
> > >
> > >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
> > >         communicationSpi.setLocalPort(48500);
> > >         igniteConfiguration.setCommunicationSpi(communicationSpi);
> > >
> > >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
> > >         discoverySpi.setLocalPort(48100);
> > >         discoverySpi.setLocalPortRange(100);
> > >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
> > >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
> > >         discoverySpi.setIpFinder(ipFinder);
> > >         igniteConfiguration.setDiscoverySpi(discoverySpi);
> > >
> > >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
> > >             while (true) {
> > >                 System.out.println(ignite.cache("test").metrics());
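The accounting Dominik expects from the reproducer above can be stated as a toy model (this is plain Java for illustration, not Ignite code): a lookup of an absent key increments the miss counter, a lookup of a present key increments the hit counter.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of the expected hit/miss semantics: the reported bug is
// that the Ignite-side equivalent of the misses counter stays at 0.
public class HitMissCounter {
    final Map<String, String> data = new HashMap<>();
    long hits, misses;

    String get(String key) {
        String v = data.get(key);
        if (v == null)
            misses++;   // absent key: expected to count as a miss
        else
            hits++;     // present key: counts as a hit
        return v;
    }

    public static void main(String[] args) {
        HitMissCounter c = new HitMissCounter();
        c.data.put("data_1", "value");
        c.get("data_1");   // hit
        c.get("missing");  // miss
        System.out.println("hits=" + c.hits + " misses=" + c.misses); // prints hits=1 misses=1
    }
}
```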
Re: Cache metrics on server nodes does not update correctly
Denis,

I'm not sure I understand what the expected behavior should be. There are local and aggregated cluster-wide metrics. I don't know which one is used by Visor because I have never used it :)

Also it would be great to know which version of Apache Ignite is used in the described case. I remember a bug with metrics aggregation during the discovery metrics message round trip.

On Wed, Mar 11, 2020 at 12:05 AM Denis Magda wrote:
>
> @Nikolay Izhikov , @Andrey Gura ,
> could you folks check out this thread?
>
> I have a feeling that what Dominik is describing was talked out before
> and is rather some sort of a limitation than an issue with the current
> implementation.
>
> -
> Denis
>
> On Tue, Mar 3, 2020 at 11:41 PM Dominik Przybysz wrote:
>
> > Hi,
> > I am trying to use a partitioned cache on server nodes to which I connect
> > with a client node. Cache statistics in the cluster are updated, but only
> > for the hits metric - the misses metric is always 0.
> >
> > To reproduce this problem I created a cluster of two nodes:
> >
> > Server node 1 adds 100 random test entries and prints cache statistics
> > continuously:
> >
> > public class IgniteClusterNode1 {
> >     public static void main(String[] args) throws InterruptedException {
> >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
> >
> >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
> >         cacheConfiguration.setName("test");
> >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
> >         cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC);
> >         cacheConfiguration.setStatisticsEnabled(true);
> >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
> >
> >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
> >         communicationSpi.setLocalPort(47500);
> >         igniteConfiguration.setCommunicationSpi(communicationSpi);
> >
> >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
> >         discoverySpi.setLocalPort(47100);
> >         discoverySpi.setLocalPortRange(100);
> >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
> >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
> >         discoverySpi.setIpFinder(ipFinder);
> >         igniteConfiguration.setDiscoverySpi(discoverySpi);
> >
> >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
> >             try (IgniteCache cache = ignite.getOrCreateCache("test")) {
> >                 new Random().ints(1000).map(i -> Math.abs(i % 1000))
> >                     .distinct().limit(100).forEach(i -> {
> >                         String key = "data_" + i;
> >                         String value = UUID.randomUUID().toString();
> >                         cache.put(key, value);
> >                     });
> >             }
> >             while (true) {
> >                 System.out.println(ignite.cache("test").metrics());
> >                 Thread.sleep(5000);
> >             }
> >         }
> >     }
> > }
> >
> > Server node 2 only prints cache statistics continuously:
> >
> > public class IgniteClusterNode2 {
> >     public static void main(String[] args) throws InterruptedException {
> >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
> >
> >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
> >         cacheConfiguration.setName("test");
> >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
> >         cacheConfiguration.setStatisticsEnabled(true);
> >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
> >
> >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
> >         communicationSpi.setLocalPort(48500);
> >         igniteConfiguration.setCommunicationSpi(communicationSpi);
> >
> >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
> >         discoverySpi.setLocalPort(48100);
> >         discoverySpi.setLocalPortRange(100);
> >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
> >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
> >         discoverySpi.setIpFinder(ipFinder);
> >         igniteConfiguration.setDiscoverySpi(discoverySpi);
> >
> >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
> >             while (true) {
> >                 System.out.println(ignite.cache("test").metrics());
> >                 Thread.sleep(5000);
> >             }
> >         }
> >     }
> > }
> >
> > Next I start a client node which continuously reads data from the cluster:
> >
> > public class CacheClusterReader {
> >     public static void main(String[] args) throws InterruptedException {
> >         IgniteConfiguration cfg = new IgniteConfiguration();
> >         cfg.setClientMode(true);
> >
> >         TcpDiscoverySpi spi = new TcpDiscoverySpi();
> >         TcpDiscoveryVmIpFinder tcMp = new TcpDiscoveryVmIpFinder();
Re: Cache metrics on server nodes does not update correctly
@Nikolay Izhikov , @Andrey Gura , could you folks check out this thread?

I have a feeling that what Dominik is describing was talked out before and is rather some sort of a limitation than an issue with the current implementation.

-
Denis

On Tue, Mar 3, 2020 at 11:41 PM Dominik Przybysz wrote:
> Hi,
> I am trying to use a partitioned cache on server nodes to which I connect
> with a client node. Cache statistics in the cluster are updated, but only
> for the hits metric - the misses metric is always 0.
>
> To reproduce this problem I created a cluster of two nodes:
>
> Server node 1 adds 100 random test entries and prints cache statistics
> continuously:
>
> public class IgniteClusterNode1 {
>     public static void main(String[] args) throws InterruptedException {
>         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
>
>         CacheConfiguration cacheConfiguration = new CacheConfiguration();
>         cacheConfiguration.setName("test");
>         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
>         cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC);
>         cacheConfiguration.setStatisticsEnabled(true);
>         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
>
>         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
>         communicationSpi.setLocalPort(47500);
>         igniteConfiguration.setCommunicationSpi(communicationSpi);
>
>         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>         discoverySpi.setLocalPort(47100);
>         discoverySpi.setLocalPortRange(100);
>         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
>         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
>         discoverySpi.setIpFinder(ipFinder);
>         igniteConfiguration.setDiscoverySpi(discoverySpi);
>
>         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
>             try (IgniteCache cache = ignite.getOrCreateCache("test")) {
>                 new Random().ints(1000).map(i -> Math.abs(i % 1000))
>                     .distinct().limit(100).forEach(i -> {
>                         String key = "data_" + i;
>                         String value = UUID.randomUUID().toString();
>                         cache.put(key, value);
>                     });
>             }
>             while (true) {
>                 System.out.println(ignite.cache("test").metrics());
>                 Thread.sleep(5000);
>             }
>         }
>     }
> }
>
> Server node 2 only prints cache statistics continuously:
>
> public class IgniteClusterNode2 {
>     public static void main(String[] args) throws InterruptedException {
>         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
>
>         CacheConfiguration cacheConfiguration = new CacheConfiguration();
>         cacheConfiguration.setName("test");
>         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
>         cacheConfiguration.setStatisticsEnabled(true);
>         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
>
>         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
>         communicationSpi.setLocalPort(48500);
>         igniteConfiguration.setCommunicationSpi(communicationSpi);
>
>         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>         discoverySpi.setLocalPort(48100);
>         discoverySpi.setLocalPortRange(100);
>         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
>         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
>         discoverySpi.setIpFinder(ipFinder);
>         igniteConfiguration.setDiscoverySpi(discoverySpi);
>
>         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
>             while (true) {
>                 System.out.println(ignite.cache("test").metrics());
>                 Thread.sleep(5000);
>             }
>         }
>     }
> }
>
> Next I start a client node which continuously reads data from the cluster:
>
> public class CacheClusterReader {
>     public static void main(String[] args) throws InterruptedException {
>         IgniteConfiguration cfg = new IgniteConfiguration();
>         cfg.setClientMode(true);
>
>         TcpDiscoverySpi spi = new TcpDiscoverySpi();
>         TcpDiscoveryVmIpFinder tcMp = new TcpDiscoveryVmIpFinder();
>         tcMp.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
>         spi.setIpFinder(tcMp);
>         cfg.setDiscoverySpi(spi);
>
>         CacheConfiguration cacheConfig = new CacheConfiguration<>("test");
>         cacheConfig.setStatisticsEnabled(true);
>         cacheConfig.setCacheMode(CacheMode.PARTITIONED);
>         cfg.setCacheConfiguration(cacheConfig);
>
>         try (Ignite ignite = Ignition.start(cfg)) {
>             System.out.println(ignite.cacheNames());
>
>             while (true) {
>                 try (IgniteCache cache = ignite.getOrCreateCache(cacheConfig)) {
Re: MetaStorage key length limitations and Cache Metrics configuration
Ivan,

I also don't think this issue is a blocker for 2.8, as it affects only experimental functionality and only in special cases. Removing the key length limitations in MetaStorage seems like the more strategic approach to me, but depending on how we decide to approach it (as a local fix or as part of a broader improvement of the MetaStorage internal implementation) we may target it to 2.8.1 or 2.9. In the latter case it makes sense to implement key length validation [1] and include it in 2.8.1 to prevent users from performing destructive actions. Otherwise, if we decide to implement [2] earlier and remove this pesky limitation in 2.8.1, then I'm fine with closing [1] with a "Won't fix" resolution. Does that make sense to you?

[1] https://issues.apache.org/jira/browse/IGNITE-12721
[2] https://issues.apache.org/jira/browse/IGNITE-12726

On Fri, Feb 28, 2020 at 4:18 PM Maxim Muzafarov wrote:
> Ivan,
>
> This issue doesn't seem to be a blocker for the 2.8 release from my point
> of view. I think we will definitely have such bugs in the future, and
> 2.8.1 is our goal for them.
>
> Please let me know if we should wait for the fix and include it in 2.8.
>
> On Fri, 28 Feb 2020 at 15:40, Nikolay Izhikov wrote:
> >
> > Igniters,
> >
> > I think we can replace the cache name with the cache id.
> > This should solve the issue with the length limitation.
> >
> > What do you think?
> >
> > > On Feb 28, 2020, at 15:32, Ivan Bessonov wrote:
> > >
> > > Hello Igniters,
> > >
> > > we have an issue in the master branch and in the upcoming 2.8 release
> > > that is related to the new metrics functionality implemented in [1].
> > > You can't use the new "configureHistogramMetric" and
> > > "configureHitRateMetric" configuration methods on caches with long
> > > names. My estimation shows that a cache with 30 characters in its name
> > > will shut down your whole cluster via the failure handler if you try
> > > to change the metrics configuration for it using one of those methods.
> > >
> > > Initially we wanted to merge [2] to show a valid error message instead
> > > of failing the cluster, but it wasn't planned for 2.8 because we didn't
> > > know that it clashes with [1].
> > >
> > > I created issue [3] with plans to remove the MetaStorage key length
> > > limitations, but it requires some thoughtful MetaStorageTree rework.
> > > I mean that it can't be done in only a few days.
> > >
> > > What do you think? Does this issue affect the 2.8 release? AFAIK the
> > > new metrics are experimental and can have some known issues. Feel free
> > > to ask me for more details if needed.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-11987
> > > [2] https://issues.apache.org/jira/browse/IGNITE-12721
> > > [3] https://issues.apache.org/jira/browse/IGNITE-12726
> > >
> > > --
> > > Sincerely yours,
> > > Ivan Bessonov
Re: MetaStorage key length limitations and Cache Metrics configuration
Ivan,

This issue doesn't seem to be a blocker for the 2.8 release from my point of view. I think we will definitely have such bugs in the future, and 2.8.1 is our goal for them.

Please let me know if we should wait for the fix and include it in 2.8.

On Fri, 28 Feb 2020 at 15:40, Nikolay Izhikov wrote:
>
> Igniters,
>
> I think we can replace the cache name with the cache id.
> This should solve the issue with the length limitation.
>
> What do you think?
>
> > On Feb 28, 2020, at 15:32, Ivan Bessonov wrote:
> >
> > Hello Igniters,
> >
> > we have an issue in the master branch and in the upcoming 2.8 release
> > that is related to the new metrics functionality implemented in [1].
> > You can't use the new "configureHistogramMetric" and
> > "configureHitRateMetric" configuration methods on caches with long
> > names. My estimation shows that a cache with 30 characters in its name
> > will shut down your whole cluster via the failure handler if you try to
> > change the metrics configuration for it using one of those methods.
> >
> > Initially we wanted to merge [2] to show a valid error message instead
> > of failing the cluster, but it wasn't planned for 2.8 because we didn't
> > know that it clashes with [1].
> >
> > I created issue [3] with plans to remove the MetaStorage key length
> > limitations, but it requires some thoughtful MetaStorageTree rework.
> > I mean that it can't be done in only a few days.
> >
> > What do you think? Does this issue affect the 2.8 release? AFAIK the
> > new metrics are experimental and can have some known issues. Feel free
> > to ask me for more details if needed.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-11987
> > [2] https://issues.apache.org/jira/browse/IGNITE-12721
> > [3] https://issues.apache.org/jira/browse/IGNITE-12726
> >
> > --
> > Sincerely yours,
> > Ivan Bessonov
Re: MetaStorage key length limitations and Cache Metrics configuration
Igniters,

I think we can replace the cache name with the cache id. This should solve the issue with the length limitation.

What do you think?

> On Feb 28, 2020, at 15:32, Ivan Bessonov wrote:
>
> Hello Igniters,
>
> we have an issue in the master branch and in the upcoming 2.8 release
> that is related to the new metrics functionality implemented in [1].
> You can't use the new "configureHistogramMetric" and
> "configureHitRateMetric" configuration methods on caches with long names.
> My estimation shows that a cache with 30 characters in its name will shut
> down your whole cluster via the failure handler if you try to change the
> metrics configuration for it using one of those methods.
>
> Initially we wanted to merge [2] to show a valid error message instead of
> failing the cluster, but it wasn't planned for 2.8 because we didn't know
> that it clashes with [1].
>
> I created issue [3] with plans to remove the MetaStorage key length
> limitations, but it requires some thoughtful MetaStorageTree rework.
> I mean that it can't be done in only a few days.
>
> What do you think? Does this issue affect the 2.8 release? AFAIK the new
> metrics are experimental and can have some known issues. Feel free to ask
> me for more details if needed.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-11987
> [2] https://issues.apache.org/jira/browse/IGNITE-12721
> [3] https://issues.apache.org/jira/browse/IGNITE-12726
>
> --
> Sincerely yours,
> Ivan Bessonov
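Nikolay's suggestion above, sketched: derive a fixed-size id from the cache name so the MetaStorage key length no longer depends on the name length. To the best of my knowledge Ignite's internal cache id is the name's hashCode mapped away from zero; treat the exact rule here as an assumption, not a statement about the actual implementation.

```java
public class CacheId {
    // Sketch of a fixed-size cache id derived from the name. A 30+ character
    // cache name collapses to a 4-byte int, so a MetaStorage key built from
    // the id has a constant length. The zero->one remapping is assumed.
    static int cacheId(String cacheName) {
        int id = cacheName.hashCode();
        return id == 0 ? 1 : id;
    }

    public static void main(String[] args) {
        System.out.println(cacheId("veryLongCacheNameThatWouldOverflowTheMetaStorageKey"));
    }
}
```

The trade-off of such a scheme is that ids are not reversible to names and can in principle collide, which is why it is only a sketch of the idea under discussion.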
MetaStorage key length limitations and Cache Metrics configuration
Hello Igniters,

we have an issue in the master branch and in the upcoming 2.8 release that is related to the new metrics functionality implemented in [1]. You can't use the new "configureHistogramMetric" and "configureHitRateMetric" configuration methods on caches with long names. My estimation shows that a cache with 30 characters in its name will shut down your whole cluster via the failure handler if you try to change the metrics configuration for it using one of those methods.

Initially we wanted to merge [2] to show a valid error message instead of failing the cluster, but it wasn't planned for 2.8 because we didn't know that it clashes with [1].

I created issue [3] with plans to remove the MetaStorage key length limitations, but it requires some thoughtful MetaStorageTree rework. I mean that it can't be done in only a few days.

What do you think? Does this issue affect the 2.8 release? AFAIK the new metrics are experimental and can have some known issues. Feel free to ask me for more details if needed.

[1] https://issues.apache.org/jira/browse/IGNITE-11987
[2] https://issues.apache.org/jira/browse/IGNITE-12721
[3] https://issues.apache.org/jira/browse/IGNITE-12726

--
Sincerely yours,
Ivan Bessonov
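Until the limitation is removed, the interim fix discussed here amounts to validating the key length up front and failing with a clear message instead of tripping the failure handler. A hypothetical sketch of that guard; the limit value, class name, and method name are all assumptions for illustration, not the actual patch:

```java
public class MetaStorageKeyValidator {
    // Hypothetical limit: the real MetaStorage maximum key length depends on
    // its internal tree/page layout and is not stated in this thread.
    static final int MAX_KEY_LEN = 64;

    // Rejects over-long keys (e.g. metric-configuration keys built from long
    // cache names) with a descriptive error instead of a cluster shutdown.
    static void validateKey(String key) {
        if (key.length() > MAX_KEY_LEN)
            throw new IllegalArgumentException("MetaStorage key is too long ("
                + key.length() + " > " + MAX_KEY_LEN + "): " + key);
    }
}
```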
Re: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hi Roman, I suppose that we can resolve the ticket with 2.8 fix version if you have no objections. чт, 26 дек. 2019 г. в 10:52, : > > Hi Ivan, > Does it mean that the problem is gone and I should close the JIRA > IGNITE-12445 ? > > > -Original Message- > From: Ivan Pavlukhin > Sent: Monday, December 16, 2019 11:05 PM > To: dev > Subject: Re: When Cache Metrics are switched on (statisticsEnabled = true) > the empty cache events arrive to the client nodes > > I also checked the reproducer with current master. It seems that the problem > is fixed there. > > пн, 16 дек. 2019 г. в 19:36, Ilya Kasnacheev : > > > > Hello! > > > > Is there a chance you are using Zk? > > > > I believe it's https://issues.apache.org/jira/browse/IGNITE-6564 > > > > Regards, > > -- > > Ilya Kasnacheev > > > > > > пт, 13 дек. 2019 г. в 12:24, : > > > > > Hi Community, > > > > > > I’d like to ask you about the following behavior of Apache Ignite: > > > > > > > > > If we want to react on some PUT or READ cache operations first of > > > all we need to turn on the appropriate cache events on the server > > > node and catch those events on the client nodes using remote approach > > > with two listeners. > > > It works well until we switch on statisticsEnabled on the server > > > node, it will lead to the situation when we get empty CacheEvent objects. > > > > > > The example that demonstrates this issue is in the attachments. This > > > example is consists of three nodes: 1 server node with cache and 2 > > > clients. One client is filling the cache and the second one is > > > listening PUT operations. When we turn on Cache Metrics on the server > > > node: > > > cacheConfig.setStatisticsEnabled(true); in EventServerCache.java we > > > get empty events (Sometimes CacheEvent objects with null fields. > > > Sometimes there are no events at all) > > > > > > My suppose is there is some Exception in > > > GridCacheEventManager.addEvent() when Cache Metrics is turned on. 
RE: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hi Ivan,

Does it mean that the problem is gone and I should close JIRA IGNITE-12445?
Re: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
I also checked the reproducer with current master. It seems that the problem is fixed there.
Re: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hello!

Is there a chance you are using Zk (ZooKeeper)? I believe it's https://issues.apache.org/jira/browse/IGNITE-6564

Regards,
--
Ilya Kasnacheev
Re: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hi Roman,

Thank you for reporting this! I looked into it, and on my machine I was able to receive events on the client-handler node, but an exception occurred in the local listener, in the following line:

System.out.println("Received event [evt=" + evt.name() + ", cacheName=" + evt.cacheName() + ", key=" + evt.key().toString());

This indeed looks like a weird bug: the event appears broken after deserialization on the listener side, after it is received from a server.
--
Best regards,
Ivan Pavlukhin
[jira] [Created] (IGNITE-12445) When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Roman Koriakov created IGNITE-12445: --- Summary: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes Key: IGNITE-12445 URL: https://issues.apache.org/jira/browse/IGNITE-12445 Project: Ignite Issue Type: Bug Components: general Affects Versions: 2.7.6 Environment: OS Name Microsoft Windows 10 Pro, java version "1.8.0_231", OpenJDK 64-Bit Server VM 11+28 Reporter: Roman Koriakov

To react to PUT or READ cache operations, we first need to enable the appropriate cache events on the server node and catch those events on the client nodes using the remote approach with two listeners. This works well until we enable *statisticsEnabled* on the server node, after which we receive empty *CacheEvent* objects.

The example that demonstrates this issue is in the attachments. It consists of three nodes: one server node with a cache and two clients. One client fills the cache and the second one listens for PUT operations. When we enable cache metrics on the server node via *cacheConfig.setStatisticsEnabled(true);* in *EventServerCache.java*, we get empty events (sometimes CacheEvent objects with null fields, sometimes no events at all).

My guess is that some exception is thrown in GridCacheEventManager.addEvent() when cache metrics are enabled.
catch (Exception e) {
    if (!cctx.cacheObjectContext().kernalContext().cacheObjects().isBinaryEnabled(cctx.config()))
        throw e;

    if (log.isDebugEnabled())
        log.debug("Failed to unmarshall cache object value for the event notification: " + e);

    if (!forceKeepBinary)
        LT.warn(log, "Failed to unmarshall cache object value for the event notification " +
            "(all further notifications will keep binary object format).");

    forceKeepBinary = true;

    key0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(key, true, false);

    val0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(newVal, true, false);

    oldVal0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(oldVal, true, false);
}
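For reference, the server-side settings the report describes (enabling cache statistics plus the PUT event type so that remote listeners receive events) amount to roughly the following configuration fragment. This is a sketch only: the cache name and variable names are illustrative, not taken from the attachment.

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.EventType;

public class ServerConfigSketch {
    static IgniteConfiguration buildConfig() {
        IgniteConfiguration igniteCfg = new IgniteConfiguration();

        // Fire PUT events so remote listeners on client nodes receive them.
        igniteCfg.setIncludeEventTypes(EventType.EVT_CACHE_OBJECT_PUT);

        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("myCache");

        // The setting that, per the report, triggers the empty-event behavior.
        cacheCfg.setStatisticsEnabled(true);

        igniteCfg.setCacheConfiguration(cacheCfg);

        return igniteCfg;
    }
}
```

This is a configuration fragment, not a runnable reproducer; the full example with the two client nodes is in the ticket's attachments.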
When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hi Community,

I'd like to ask about the following behavior of Apache Ignite.

To react to PUT or READ cache operations, we first need to enable the appropriate cache events on the server node and catch those events on the client nodes using the remote approach with two listeners. This works well until we enable statisticsEnabled on the server node, after which we receive empty CacheEvent objects.

The example that demonstrates this issue is in the attachments. It consists of three nodes: one server node with a cache and two clients. One client fills the cache and the second one listens for PUT operations. When we enable cache metrics on the server node via cacheConfig.setStatisticsEnabled(true); in EventServerCache.java, we get empty events (sometimes CacheEvent objects with null fields, sometimes no events at all).

My guess is that some exception is thrown in GridCacheEventManager.addEvent() when cache metrics are enabled:

catch (Exception e) {
    if (!cctx.cacheObjectContext().kernalContext().cacheObjects().isBinaryEnabled(cctx.config()))
        throw e;

    if (log.isDebugEnabled())
        log.debug("Failed to unmarshall cache object value for the event notification: " + e);

    if (!forceKeepBinary)
        LT.warn(log, "Failed to unmarshall cache object value for the event notification " +
            "(all further notifications will keep binary object format).");

    forceKeepBinary = true;

    key0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(key, true, false);

    val0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(newVal, true, false);

    oldVal0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(oldVal, true, false);
}

Could I publish this point in JIRA?
Best regards,

T-Systems RUS
Point of Production
Roman Koriakov
Software Developer
Kirova 11, Voronezh, Russia
Tel: +7 473 200 15 30
E-mail: roman.koria...@t-systems.com
http://www.t-systems.com

-----Original Message-----
From: Ilya Kasnacheev
Sent: Thursday, December 12, 2019 6:35 PM
To: dev
Subject: Re: joining

Hello! You will need to register on https://issues.apache.org/jira/ first. Please tell me when you do.

Regards,
--
Ilya Kasnacheev

Thu, Dec 12, 2019 at 18:09, roman.koria...@t-systems.com:

> Hi Ilya,
>
> it'd be nice if it were rkoriakov

-----Original Message-----
From: Ilya Kasnacheev
Sent: Thursday, December 12, 2019 5:25 PM
To: dev
Subject: Re: joining

Hello! I will need an Apache JIRA username to add you to contributors. Can you provide it?

Regards,
--
Ilya Kasnacheev

Thu, Dec 12, 2019 at 17:20, roman.koria...@t-systems.com:

> Hi everyone,
> I'd like to participate in this project!
[jira] [Created] (IGNITE-12196) [Phase-4] Deprecate old rebalancing cache metrics
Maxim Muzafarov created IGNITE-12196: Summary: [Phase-4] Deprecate old rebalancing cache metrics Key: IGNITE-12196 URL: https://issues.apache.org/jira/browse/IGNITE-12196 Project: Ignite Issue Type: Sub-task Reporter: Maxim Muzafarov

We need to mark the rebalancing CacheMetrics deprecated and remove them from the newly introduced metrics framework (IGNITE-11961). Such cache metrics should be implemented in the old-fashioned way (as they were before the metrics framework was added) to keep backwards compatibility. Remove them in Apache Ignite 3.0.
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Denis,

I measured the impact of metrics collection on my laptop: it is about 5 seconds on each node to collect the metrics of 1000 caches (all caches in one cache group) with 32000 partitions. All this time tcp-disco-msg-worker is blocked.

Guys, thanks for your proposals, I've filed the ticket [1].

[1]: https://issues.apache.org/jira/browse/IGNITE-10642

On Tue, Dec 4, 2018 at 7:22 PM Vladimir Ozerov wrote:

> Hi Alex,
>
> Agree with you. Most of the time this distribution of metrics is not needed. In the future we will have more and more information which potentially needs to be shared between nodes, e.g. IO statistics, SQL statistics for the query optimizer, SQL execution history, etc. We need common mechanics for this, so I vote for your proposal:
> 1) Data is collected locally.
> 2) If a node needs to collect data from the cluster, it sends an explicit request over communication SPI.
> 3) For performance reasons we may consider caching: return previously collected metrics without re-requesting them again if they are not too old (configurable).

On Tue, Dec 4, 2018 at 12:46 PM Alex Plehanov wrote:

> Hi Igniters,
>
> In the current implementation, cache metrics are collected on each node and sent across the whole cluster with a discovery message (TcpDiscoveryMetricsUpdateMessage) at a configured frequency (MetricsUpdateFrequency, 2 seconds by default), even if no one requested them. If there are a lot of caches and a lot of nodes in the cluster, the metrics update message (which contains each metric for each cache on each node) can reach a critical size.
>
> Frequently collecting all cache metrics also has a negative performance impact (some of them just read values from an AtomicLong, but some of them need an iteration over all cache partitions). The only way now to disable cache metrics collection and sending with the discovery message is to disable statistics for each cache. But this also makes it impossible to request some cache metrics locally (for the current node only). Requesting a limited set of cache metrics on the current node doesn't have the performance impact of frequent collection of all cache metrics, but sometimes it's enough for diagnostic purposes.
>
> As a workaround I have filed and implemented ticket [1], which introduces a new system property to disable cache metrics sending with TcpDiscoveryMetricsUpdateMessage (if this property is set, the message will contain only node metrics). But a system property is not good as a permanent solution. Perhaps it's better to move such a property to the public API (to IgniteConfiguration, for example).
>
> Also, maybe we should change the cache metrics distribution strategy? For example, collect metrics by request via communication SPI, or subscribe to a limited set of caches/metrics, etc.
>
> Thoughts?
>
> [1]: https://issues.apache.org/jira/browse/IGNITE-10172
[jira] [Created] (IGNITE-10642) Cache metrics distribution mechanism should be changed from broadcast to request-response communication pattern
Aleksey Plekhanov created IGNITE-10642: -- Summary: Cache metrics distribution mechanism should be changed from broadcast to request-response communication pattern Key: IGNITE-10642 URL: https://issues.apache.org/jira/browse/IGNITE-10642 Project: Ignite Issue Type: Improvement Affects Versions: 2.7 Reporter: Aleksey Plekhanov

In the current implementation, all cache metrics are collected on each node for all caches and sent across the whole cluster with a discovery message ({{TcpDiscoveryMetricsUpdateMessage}}) at a configured frequency (MetricsUpdateFrequency, 2 seconds by default), even if no one requested them. This mechanism should be changed in the following ways:
* Local cache metrics should be available (if configured) on each node
* If a node needs to collect data from the cluster, it sends an explicit request over communication SPI (the request should contain a limited set of caches and/or metrics)
* For performance reasons, collected cluster-wide values must be cached: previously collected metrics should be returned without re-requesting them if they are not too old (configurable)
* The mechanism should be easily adaptable for other types of statistics which may need to be shared between nodes in the future (IO statistics, SQL statistics, SQL execution history, etc.)
* The message format should be carefully designed to minimize message size (a cluster can contain thousands of caches and hundreds of nodes)
* There must be an opportunity to configure metrics at runtime
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Hi,

One of the problems with metrics is their huge size when a lot of caches are started on a node (for example, I have seen 7000 caches). We have to think about how to compact them. Not all metrics change frequently, so we may store them locally and send over the wire only the difference from the previous collection. And we should think carefully about the storage format: for example, if current cache metrics were passed as a JSON object, then 70% of it would be strings with metric names.

-- Alexey Kuznetsov
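The "send only the difference from the previous collection" idea can be illustrated with a minimal sketch. `MetricsDelta` and its map-based metric representation are hypothetical stand-ins, not Ignite classes:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: compute the subset of metric values that changed
 * since the previous collection, so only the delta goes over the wire.
 */
class MetricsDelta {
    static Map<String, Long> delta(Map<String, Long> previous, Map<String, Long> current) {
        Map<String, Long> changed = new HashMap<>();

        for (Map.Entry<String, Long> e : current.entrySet()) {
            // Include a metric only if it is new or its value differs from last time.
            if (!e.getValue().equals(previous.get(e.getKey())))
                changed.put(e.getKey(), e.getValue());
        }

        return changed;
    }
}
```

A real wire format would additionally replace the metric-name strings with integer IDs agreed on once per topology, which addresses the "70% of it will be strings" concern.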
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Hi Alex,

Agree with you. Most of the time this distribution of metrics is not needed. In the future we will have more and more information that potentially needs to be shared between nodes, e.g. IO statistics, SQL statistics for the query optimizer, SQL execution history, etc. We need common mechanics for this, so I vote for your proposal:

1) Data is collected locally.
2) If a node needs to collect data from the cluster, it sends an explicit request over communication SPI.
3) For performance reasons we may consider caching: return previously collected metrics without re-requesting them if they are not too old (configurable).
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
hi, Alex. imo:

1. metrics through discovery require refactoring.
2. local cache metrics should be available (if configured) on each node.
3. there must be an opportunity to configure metrics at runtime.

thanks.

-- Zhenya Stanilovsky
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Alex,

Did you measure the impact of metrics collection? What is the overhead you are trying to avoid?

Just to make it clear, MetricsUpdateMessage-s are used as heartbeats, so they are sent anyway, even if no metrics are distributed between nodes.

Denis
[DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Hi Igniters,

In the current implementation, cache metrics are collected on each node and sent across the whole cluster with a discovery message (TcpDiscoveryMetricsUpdateMessage) at a configured frequency (MetricsUpdateFrequency, 2 seconds by default), even if no one requested them. If there are a lot of caches and a lot of nodes in the cluster, the metrics update message (which contains each metric for each cache on each node) can reach a critical size.

Also, frequently collecting all cache metrics has a negative performance impact (some of them just read values from an AtomicLong, but some need an iteration over all cache partitions).

The only way now to disable cache metrics collection and sending with the discovery message is to disable statistics for each cache. But this also makes it impossible to request some cache metrics locally (for the current node only). Requesting a limited set of cache metrics on the current node doesn't have the same performance impact as the frequent collection of all cache metrics, but sometimes it's enough for diagnostic purposes.

As a workaround I have filed and implemented ticket [1], which introduces a new system property to disable cache metrics sending with TcpDiscoveryMetricsUpdateMessage (if this property is set, the message will contain only node metrics). But a system property is not good for a permanent solution. Perhaps it's better to move such a property to the public API (to IgniteConfiguration, for example).

Also, maybe we should change the cache metrics distribution strategy? For example, collect metrics by request via communication SPI, or subscribe to a limited set of caches/metrics, etc.

Thoughts?

[1]: https://issues.apache.org/jira/browse/IGNITE-10172
Unable to get the ignite cache metrics
Hi,

I brought up an Ignite server on a k8s cluster and set the below property for a cache whose metrics I wanted to check. Then I started the client and pushed data into the Ignite cache. I can see the data in the cache, but the values for the following metrics are 0. Can someone tell me why this is?

ignite_org_apache_ignite_internal_processors_cache_cachelocalmetricsmxbeanimpl_cacheputs = 0
ignite_org_apache_ignite_internal_processors_cache_cachelocalmetricsmxbeanimpl_averageputtime = 0

-- Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
Unable to get the ignite cache metrics
Hi,

I am new to Ignite. I brought up the Ignite servers on k8s and enabled cache-level metrics by setting the below property in the Ignite config XML for a specific cache. From the Ignite client (brought up as another pod) I am putting data into the cache, but when I check the below cache metrics I get a value of 0. Can someone help me understand why?

ignite_org_apache_ignite_internal_processors_cache_cacheclustermetricsmxbeanimpl_cacheputs = 0
ignite_org_apache_ignite_internal_processors_cache_cachelocalmetricsmxbeanimpl_averageputtime = 0

But this metric does give the number of entries in the cache:

ignite_org_apache_ignite_internal_processors_cache_cacheclustermetricsmxbeanimpl_keysize = 5060
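The XML property referenced in both questions above did not survive the archive. Assuming the posters meant per-cache statistics (the {{statisticsEnabled}} flag that the JMX cache metrics depend on), a typical Spring XML fragment would look roughly like this (the cache name is a placeholder):

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <!-- Placeholder cache name. -->
    <property name="name" value="myCache"/>
    <!-- Enables collection of cache metrics such as CachePuts and AveragePutTime. -->
    <property name="statisticsEnabled" value="true"/>
</bean>
```

Note also that, as discussed elsewhere in this archive, time-based metrics such as AveragePutTime are updated on the node that initiates the operation, so a server-side exporter may still report 0 for them when all puts come from a client node.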
[jira] [Created] (IGNITE-9224) MVCC SQL: Cache metrics
Ivan Pavlukhin created IGNITE-9224: -- Summary: MVCC SQL: Cache metrics Key: IGNITE-9224 URL: https://issues.apache.org/jira/browse/IGNITE-9224 Project: Ignite Issue Type: Improvement Reporter: Ivan Pavlukhin Assignee: Ivan Pavlukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8554) Cache metrics: expose metrics with rebalance info about keys
Alexey Kuznetsov created IGNITE-8554: Summary: Cache metrics: expose metrics with rebalance info about keys Key: IGNITE-8554 URL: https://issues.apache.org/jira/browse/IGNITE-8554 Project: Ignite Issue Type: Improvement Reporter: Alexey Kuznetsov Assignee: Alexey Kuznetsov In order to show info about rebalance progress we need to expose estimatedRebalancingKeys and rebalancedKeys metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] ignite pull request #3369: IGNITE-6923 Cache metrics optimization
Github user alex-plekhanov closed the pull request at: https://github.com/apache/ignite/pull/3369 ---
[GitHub] ignite pull request #3369: IGNITE-6923 Cache metrics optimization
GitHub user alex-plekhanov opened a pull request: https://github.com/apache/ignite/pull/3369 IGNITE-6923 Cache metrics optimization

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alex-plekhanov/ignite IGNITE-6923

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3369.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3369

commit eef3fed408d7cfbeedcabcb354989c64be773724 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-19T14:44:06Z IGNITE-6923 Optimized nonHeapMemoryUsed
commit 122467d0ca4cfe859e2fc5af276b20c4f50dc89c Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-20T16:13:34Z IGNITE-6923 getTotalPartitionsCount, getRebalancingPartitionsCount optimization
commit e20d842f9ccf9c4d1e4703a52cd723d8e37ddbea Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-22T08:56:36Z IGNITE-6923 Cluster metrics optimization (proxy class implemented)
commit bff0a1845799ad90c6ecdd3812e84418ba45bd07 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-25T08:40:24Z IGNITE-6923 Partitions metrics optimization
commit 6dd59b9961f9fc073d4616367e36607430f113b8 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-26T12:57:01Z IGNITE-6923 Cache metrics optimization
commit 96b0396f05c7aacb24bd9ee88e2564d240fae0be Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-26T12:58:30Z IGNITE-6923 Cache metrics optimization
commit 79067428f7252700bb23d29f3e3b4b7dfa5586bf Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-27T07:48:56Z IGNITE-6923 Disable cache metrics update flag
commit 5e5d675f6fb6a25eda57fcbf53e49ec87fda3ba2 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-27T08:07:57Z IGNITE-6923 License header
commit e90d778d9fab0b1e467ef1314505060494fea1db Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2018-01-11T20:57:29Z IGNITE-6923 Bugfix
commit 77e50a74dadc6ae40d301f63cd0e4b73b6203303 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2018-01-12T14:43:29Z IGNITE-6923 Unit test
commit 39f7c653e8b91ec7b02244e2633e27ea9103793d Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2018-01-12T14:47:35Z Revert "IGNITE-6923 Disable cache metrics update flag" This reverts commit 9bb904f
commit 62ea9f5d6524eff6e9a69fbc8ca2ac0c95325796 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2018-01-12T19:24:57Z IGNITE-6923 Test comment added
---
[GitHub] ignite pull request #3356: IGNITE-7126: add new cache metrics parameters
Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/3356 ---
[GitHub] ignite pull request #3356: IGNITE-7126: add new cache metrics parameters
GitHub user AlexeyRokhin opened a pull request: https://github.com/apache/ignite/pull/3356 IGNITE-7126: add new cache metrics parameters Required parameters were added. You can merge this pull request into a Git repository by running: $ git pull https://github.com/AlexeyRokhin/ignite ignite-7126 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3356.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3356 commit 6cde9dad36af56926e54304edb7189ae58c69b78 Author: Alexey Rokhin <arokhin@...> Date: 2018-01-10T22:10:26Z IGNITE-7126: add new cache metrics parameters ---
[jira] [Created] (IGNITE-7106) Add option to VisorNodeDataCollectorTask to not collect cache metrics
Alexey Kuznetsov created IGNITE-7106: Summary: Add option to VisorNodeDataCollectorTask to not collect cache metrics Key: IGNITE-7106 URL: https://issues.apache.org/jira/browse/IGNITE-7106 Project: Ignite Issue Type: Bug Reporter: Alexey Kuznetsov On large clusters with > 100 nodes and > 1000 caches this task can collect a huge amount of data. We can add an option to collect this info on demand. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6925) Simplify cache metrics activation
Denis Magda created IGNITE-6925: --- Summary: Simplify cache metrics activation Key: IGNITE-6925 URL: https://issues.apache.org/jira/browse/IGNITE-6925 Project: Ignite Issue Type: Bug Security Level: Public (Viewable by anyone) Reporter: Denis Magda

The user needs to do 3 things to enable cache metrics:
- set {{statisticsEnabled}} to {{true}};
- set a non-dummy {{EventsStorageSpi}};
- list the metrics of interest.

This process has to be reduced to 2 steps or, preferably, to 1. More details are here: http://apache-ignite-developers.2346864.n4.nabble.com/Annoying-extra-steps-for-enabling-metrics-td21865.html

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6923) Cache metrics are updated in timeout-worker potentially delaying critical code execution due to current implementation issues.
Alexei Scherbakov created IGNITE-6923: - Summary: Cache metrics are updated in timeout-worker potentially delaying critical code execution due to current implementation issues. Key: IGNITE-6923 URL: https://issues.apache.org/jira/browse/IGNITE-6923 Project: Ignite Issue Type: Bug Security Level: Public (Viewable by anyone) Affects Versions: 2.3 Reporter: Alexei Scherbakov Fix For: 2.4

Some metrics use a full cache iteration for their calculation. See the stack trace for an example:

{noformat}
"grid-timeout-worker-#39%DPL_GRID%DplGridNodeName%" #152 prio=5 os_prio=0 tid=0x7f1009a03000 nid=0x5caa runnable [0x7f0f059d9000]
   java.lang.Thread.State: RUNNABLE
	at java.util.HashMap.containsKey(HashMap.java:595)
	at java.util.HashSet.contains(HashSet.java:203)
	at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032)
	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$3.apply(IgniteCacheOffheapManagerImpl.java:339)
	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$3.apply(IgniteCacheOffheapManagerImpl.java:337)
	at org.apache.ignite.internal.util.lang.gridfunc.TransformFilteringIterator.hasNext(TransformFilteringIterator.java:90)
	at org.apache.ignite.internal.util.lang.GridIteratorAdapter.hasNext(GridIteratorAdapter.java:45)
	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.cacheEntriesCount(IgniteCacheOffheapManagerImpl.java:293)
	at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.getOffHeapPrimaryEntriesCount(CacheMetricsImpl.java:240)
	at org.apache.ignite.internal.processors.cache.CacheMetricsSnapshot.<init>(CacheMetricsSnapshot.java:271)
	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.localMetrics(GridCacheAdapter.java:3217)
	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.cacheMetrics(GridDiscoveryManager.java:1151)
	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.nonHeapMemoryUsed(GridDiscoveryManager.java:1121)
	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.metrics(GridDiscoveryManager.java:1087)
	at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode.metrics(TcpDiscoveryNode.java:269)
	at org.apache.ignite.internal.IgniteKernal$3.run(IgniteKernal.java:1175)
	at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$CancelableTask.onTimeout(GridTimeoutProcessor.java:256)
	- locked <0x7f115f5bf890> (a org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$CancelableTask)
	at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:158)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
	at java.lang.Thread.run(Thread.java:748)
{noformat}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
IGNITE-6679 Clean up some deprecated cache metrics
Hello, Igniters. I have removed deprecated metrics [1]. Please, review [2]. Tests look good [3]. 1. https://issues.apache.org/jira/browse/IGNITE-6679 2. https://reviews.ignite.apache.org/ignite/review/IGNT-CR-390 3. https://ci.ignite.apache.org/project.html?projectId=Ignite20Tests=projectOverview_Ignite20Tests=pull%2F2962%2Fhead -- Best wishes, Amelchev Nikita
[jira] [Created] (IGNITE-6679) Clean up some deprecated cache metrics
Sergey Puchnin created IGNITE-6679: -- Summary: Clean up some deprecated cache metrics Key: IGNITE-6679 URL: https://issues.apache.org/jira/browse/IGNITE-6679 Project: Ignite Issue Type: Improvement Security Level: Public (Viewable by anyone) Components: cache Reporter: Sergey Puchnin Priority: Trivial

The old optimistic serializable mode implementation was removed in 2.0, but some of its cache metrics are still present in the CacheMetrics interface. We need to clean up and remove these metrics:

*TxCommitQueueSize*
*TxPrepareQueueSize*
*TxStartVersionCountsSize*
*TxDhtCommitQueueSize*
*TxDhtPrepareQueueSize*
*TxDhtStartVersionCountsSize*

The algorithm for page eviction was also changed, and the metric *DhtEvictQueueCurrentSize* should be removed as well.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6630) Incorrect time units of average transaction commit/rollback duration cache metrics.
Pavel Pereslegin created IGNITE-6630: Summary: Incorrect time units of average transaction commit/rollback duration cache metrics. Key: IGNITE-6630 URL: https://issues.apache.org/jira/browse/IGNITE-6630 Project: Ignite Issue Type: Bug Reporter: Pavel Pereslegin Assignee: Pavel Pereslegin Priority: Minor

The AverageTxCommitTime and AverageTxRollbackTime metrics in CacheMetrics are calculated in milliseconds instead of microseconds as stated in the javadoc. Simple JUnit repro:

{code:java}
public class CacheMetricsTxAvgTimeTest extends GridCommonAbstractTest {
    /** */
    private <K, V> CacheConfiguration<K, V> cacheConfiguration(String name) {
        CacheConfiguration<K, V> cacheConfiguration = new CacheConfiguration<>(name);

        cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
        cacheConfiguration.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
        cacheConfiguration.setStatisticsEnabled(true);

        return cacheConfiguration;
    }

    /** */
    public void testTxCommitDuration() throws Exception {
        try (Ignite node = startGrid(0)) {
            IgniteCache<Object, Object> cache = node.createCache(cacheConfiguration(DEFAULT_CACHE_NAME));

            try (Transaction tx = node.transactions().txStart()) {
                cache.put(1, 1);

                // Await 1 second.
                U.sleep(1_000);

                tx.commit();
            }

            // Documentation says that this metric is in microseconds.
            float commitTime = cache.metrics().getAverageTxCommitTime();

            // But this assertion will fail because it is in milliseconds and returns only ~1000.
            assert commitTime >= 1_000_000;
        }
    }
}
{code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6584) .NET: Propagate new cache metrics
Pavel Tupitsyn created IGNITE-6584: -- Summary: .NET: Propagate new cache metrics Key: IGNITE-6584 URL: https://issues.apache.org/jira/browse/IGNITE-6584 Project: Ignite Issue Type: Improvement Components: platforms Reporter: Pavel Tupitsyn Priority: Trivial Some properties are missing in {{ICacheMetrics}} that exist in {{CacheMetrics}} on Java side, such as rebalancing-related stuff (see IGNITE-6583). Add them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6565) Use long type for size and keySize in cache metrics
Ilya Kasnacheev created IGNITE-6565: --- Summary: Use long type for size and keySize in cache metrics Key: IGNITE-6565 URL: https://issues.apache.org/jira/browse/IGNITE-6565 Project: Ignite Issue Type: Bug Affects Versions: 2.2 Reporter: Ilya Kasnacheev

Currently they are int, so for large caches there is no way to convey the correct value. We should introduce getSizeLong() and getKeySizeLong(). We should also introduce the same in .NET and make sure that compatibility is not broken when passing OP_LOCAL_METRICS and OP_GLOBAL_METRICS. BTW, do we need keySize at all? What is it for?

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
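The motivation for a long-typed getter can be shown with a tiny sketch of what an int-typed size metric does once a cache exceeds Integer.MAX_VALUE entries (SizeOverflow is a hypothetical helper, not Ignite code):

```java
/**
 * Hypothetical helper showing why an int-typed size metric breaks
 * once a cache holds more than Integer.MAX_VALUE entries.
 */
class SizeOverflow {
    /** What a legacy int-typed metric effectively does: the value wraps around. */
    static int intSize(long entries) {
        return (int) entries;
    }

    /** A getSizeLong()-style metric keeps the real value. */
    static long longSize(long entries) {
        return entries;
    }
}
```

For 3 billion entries the int-typed value wraps into a negative number, while the long-typed getter reports the true size.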
[jira] [Created] (IGNITE-6564) Incorrect calculation size and keySize for cluster cache metrics
Ilya Kasnacheev created IGNITE-6564: --- Summary: Incorrect calculation size and keySize for cluster cache metrics Key: IGNITE-6564 URL: https://issues.apache.org/jira/browse/IGNITE-6564 Project: Ignite Issue Type: Bug Affects Versions: 2.2 Reporter: Ilya Kasnacheev Priority: Minor They are currently not passed by ring and therefore only taken from current node, which returns incorrect (local) value. See CacheMetricsSnapshot class. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: Cache Metrics
Den,

I see at least two problems here:

1. Metrics meaning for the end user: how should the user interpret the metrics in this case? Moreover, an average is a bad gauge for monitoring because it hides actual latencies. The user should be able to get accurate metrics in order to build monitoring that can create percentile-based charts, for example, and accuracy is a very important property for such cases.

2. It just makes the code more complex, and we will have metrics-related logic in two places instead of one.
>>>> >>>> On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин >>>> <slava.kopti...@gmail.com> wrote: >>>>> Hi Experts, >>>>> >>>>> I am working on https://issues.apache.org/jira/browse/IGNITE-3495 >>>>> >>>>> A few words about this issue: >>>>> It is about that the process of gathering/updating of cache metrics is >>>>> inconsistent in some cases. >>>>> Let's consider the following simple topology which contains only two >>>>> nodes: >>>>> first node is a client node and the second is a server. >>>>> And client node starts requests to the server node, for instance >>>>> cache.put(), cache.putAll(), cache.get() etc. >>>>> In that case, metrics which are related to counters (cache hits, cache >>>>> misses, removals and puts) are calculated on the server side, >>>>> while time metrics are updated on the client node. >>>>> >>>>> I think that both metrics (counters and time) should be calculated on the >>>>> same node. So, there are two obvious solution: >>>>> >>>>> #1 Node that starts some operation is responsible for updating the cache >>>>> metrics. >>>>> Pro: >>>>> - it will allow to get more accurate results of metrics. >>>>> Contra: >>>>> - this approach does not work in particular cases. for example, >>>>> partitioned >>>>> cache with FULL_ASYNC write synchronization mode. >>>>> - needs to extend response messages (GridNearAtomicUpdateResponse, >>>>> GridNearGetResponse etc) >>>>> in order to provide additional information from remote node: cache hits, >>>>> number of removal etc. >>>>> So, it will lead to additional pressure on communication channel. >>>>> Perhaps, this impact will be small - 4 bytes per message or something like >>>>> that. 
>>>>> - backward incompatibility (this is a consequence of the previous point) >>>>> >>>>> #2 Primary node (node that actually executes a request) >>>>> Pro: >>>>> - easy to implement >>>>> - backward compatible >>>>> Contra: >>>>> - time metrics will not include the time of communication between nodes, >>>>> so >>>>> the results will be less accurate. >>>>> - perhaps we need to provide additional metric which will allow to get avg >>>>> time of communication between nodes. >>>>> >>>>> Please let me know about your thoughts. >>>>> Perhaps, both alternatives are not so good... >>>>> >>>>> Regards, >>>>> Slava. >>> >
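Solution #1 above (the initiating node updates the time metrics) amounts to wrapping each operation in a local clock. The sketch below is a hypothetical stand-in, not Ignite code: a ConcurrentHashMap plays the role of the cache, and in a real client the measured time would include the network round-trip.

```java
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of solution #1: the node that initiates an operation
 * measures its duration, so network hops would be included in a real client.
 * A ConcurrentHashMap stands in for the (possibly remote) cache.
 */
class ClientSideTimer {
    private final ConcurrentHashMap<Integer, Integer> cache = new ConcurrentHashMap<>();
    private long totalNanos;
    private long count;

    /** Performs a put and records how long it took from the caller's point of view. */
    void timedPut(int key, int val) {
        long start = System.nanoTime();
        cache.put(key, val); // in a real client this would be a network round-trip
        totalNanos += System.nanoTime() - start;
        count++;
    }

    /** Average put time in nanoseconds, as seen by the initiating node. */
    double averagePutNanos() {
        return count == 0 ? 0 : (double) totalNanos / count;
    }

    int size() {
        return cache.size();
    }
}
```

This also shows the FULL_ASYNC caveat from the thread: if the call returns before the work is done, the locally measured duration no longer reflects the operation cost.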
Re: Cache Metrics
Andrey, I would simply take an average if a mixed clients-servers cluster group is used. In general, the goal of the ticket was to fix the time-based metrics on the server side. As far as I understand they are already calculated properly on the client’s considering network contribution, right? So, all that’s left to do is to count the same on the servers so that those metrics no longer return 0. — Denis > On Jul 25, 2017, at 6:53 AM, Andrey Gura <ag...@apache.org> wrote: > > Den, > > doesn't make sense from my point if view. And we create new problem: > how should we aggregate this metrics when user requests metrics for > cluster group. > > On Mon, Jul 24, 2017 at 8:48 PM, Denis Magda <dma...@apache.org> wrote: >> Guys, >> >> What if we calculate it on both sides? The client will keep the total time >> needed to complete an operation including network hoops while a server >> (primary or backup) will count only local time. >> >> — >> Denis >> >>> On Jul 17, 2017, at 7:07 AM, Andrey Gura <ag...@apache.org> wrote: >>> >>> Hi, >>> >>> I believe that the first solution is better than second because it >>> takes into account network communication time. Average time of >>> communication between nodes doesn't make sense from my point of view. >>> >>> So I vote for #1. >>> >>> On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин >>> <slava.kopti...@gmail.com> wrote: >>>> Hi Experts, >>>> >>>> I am working on https://issues.apache.org/jira/browse/IGNITE-3495 >>>> >>>> A few words about this issue: >>>> It is about that the process of gathering/updating of cache metrics is >>>> inconsistent in some cases. >>>> Let's consider the following simple topology which contains only two nodes: >>>> first node is a client node and the second is a server. >>>> And client node starts requests to the server node, for instance >>>> cache.put(), cache.putAll(), cache.get() etc. 
>>>> In that case, metrics which are related to counters (cache hits, cache >>>> misses, removals and puts) are calculated on the server side, >>>> while time metrics are updated on the client node. >>>> >>>> I think that both metrics (counters and time) should be calculated on the >>>> same node. So, there are two obvious solution: >>>> >>>> #1 Node that starts some operation is responsible for updating the cache >>>> metrics. >>>> Pro: >>>> - it will allow to get more accurate results of metrics. >>>> Contra: >>>> - this approach does not work in particular cases. for example, partitioned >>>> cache with FULL_ASYNC write synchronization mode. >>>> - needs to extend response messages (GridNearAtomicUpdateResponse, >>>> GridNearGetResponse etc) >>>> in order to provide additional information from remote node: cache hits, >>>> number of removal etc. >>>> So, it will lead to additional pressure on communication channel. >>>> Perhaps, this impact will be small - 4 bytes per message or something like >>>> that. >>>> - backward incompatibility (this is a consequence of the previous point) >>>> >>>> #2 Primary node (node that actually executes a request) >>>> Pro: >>>> - easy to implement >>>> - backward compatible >>>> Contra: >>>> - time metrics will not include the time of communication between nodes, so >>>> the results will be less accurate. >>>> - perhaps we need to provide additional metric which will allow to get avg >>>> time of communication between nodes. >>>> >>>> Please let me know about your thoughts. >>>> Perhaps, both alternatives are not so good... >>>> >>>> Regards, >>>> Slava. >>
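For context, the aggregation Andrey refers to corresponds to reading cache metrics for a cluster group. A minimal sketch of what that looks like; the cache name and group choice are illustrative assumptions, not taken from the thread, and this is a sketch against the public Ignite API, not the fix itself:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMetrics;
import org.apache.ignite.cluster.ClusterGroup;
import org.apache.ignite.configuration.CacheConfiguration;

public class ClusterGroupMetricsSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Object, Object> cfg = new CacheConfiguration<>("myCache");
            cfg.setStatisticsEnabled(true); // metrics stay at 0 without this

            IgniteCache<Object, Object> cache = ignite.getOrCreateCache(cfg);

            cache.put("k", "v");
            cache.get("k");

            // Metrics aggregated over server nodes only. Mixing client and
            // server nodes in one group is exactly what makes averaging the
            // time-based metrics tricky, as discussed above.
            ClusterGroup servers = ignite.cluster().forServers();
            CacheMetrics metrics = cache.metrics(servers);

            System.out.println(metrics.getAveragePutTime());
        }
    }
}
```

The thread's open question is which node's clock contributes to `getAveragePutTime()` in such an aggregate: the initiating client (option #1) or the primary server (option #2).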
Re: Cache Metrics
Den, it doesn't make sense from my point of view. And we create a new problem: how should we aggregate these metrics when a user requests metrics for a cluster group. On Mon, Jul 24, 2017 at 8:48 PM, Denis Magda <dma...@apache.org> wrote: > Guys, > > What if we calculate it on both sides? The client will keep the total time > needed to complete an operation including network hoops while a server > (primary or backup) will count only local time. > > — > Denis > >> On Jul 17, 2017, at 7:07 AM, Andrey Gura <ag...@apache.org> wrote: >> >> Hi, >> >> I believe that the first solution is better than second because it >> takes into account network communication time. Average time of >> communication between nodes doesn't make sense from my point of view. >> >> So I vote for #1. >> >> On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин >> <slava.kopti...@gmail.com> wrote: >>> Hi Experts, >>> >>> I am working on https://issues.apache.org/jira/browse/IGNITE-3495 >>> >>> A few words about this issue: >>> It is about that the process of gathering/updating of cache metrics is >>> inconsistent in some cases. >>> Let's consider the following simple topology which contains only two nodes: >>> first node is a client node and the second is a server. >>> And client node starts requests to the server node, for instance >>> cache.put(), cache.putAll(), cache.get() etc. >>> In that case, metrics which are related to counters (cache hits, cache >>> misses, removals and puts) are calculated on the server side, >>> while time metrics are updated on the client node. >>> >>> I think that both metrics (counters and time) should be calculated on the >>> same node. So, there are two obvious solution: >>> >>> #1 Node that starts some operation is responsible for updating the cache >>> metrics. >>> Pro: >>> - it will allow to get more accurate results of metrics. >>> Contra: >>> - this approach does not work in particular cases. for example, partitioned >>> cache with FULL_ASYNC write synchronization mode.
>>> - needs to extend response messages (GridNearAtomicUpdateResponse, >>> GridNearGetResponse etc) >>> in order to provide additional information from remote node: cache hits, >>> number of removal etc. >>> So, it will lead to additional pressure on communication channel. >>> Perhaps, this impact will be small - 4 bytes per message or something like >>> that. >>> - backward incompatibility (this is a consequence of the previous point) >>> >>> #2 Primary node (node that actually executes a request) >>> Pro: >>> - easy to implement >>> - backward compatible >>> Contra: >>> - time metrics will not include the time of communication between nodes, so >>> the results will be less accurate. >>> - perhaps we need to provide additional metric which will allow to get avg >>> time of communication between nodes. >>> >>> Please let me know about your thoughts. >>> Perhaps, both alternatives are not so good... >>> >>> Regards, >>> Slava. >
Re: Cache Metrics
Guys, What if we calculate it on both sides? The client will keep the total time needed to complete an operation including network hops while a server (primary or backup) will count only local time. — Denis > On Jul 17, 2017, at 7:07 AM, Andrey Gura <ag...@apache.org> wrote: > > Hi, > > I believe that the first solution is better than second because it > takes into account network communication time. Average time of > communication between nodes doesn't make sense from my point of view. > > So I vote for #1. > > On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин > <slava.kopti...@gmail.com> wrote: >> Hi Experts, >> >> I am working on https://issues.apache.org/jira/browse/IGNITE-3495 >> >> A few words about this issue: >> It is about that the process of gathering/updating of cache metrics is >> inconsistent in some cases. >> Let's consider the following simple topology which contains only two nodes: >> first node is a client node and the second is a server. >> And client node starts requests to the server node, for instance >> cache.put(), cache.putAll(), cache.get() etc. >> In that case, metrics which are related to counters (cache hits, cache >> misses, removals and puts) are calculated on the server side, >> while time metrics are updated on the client node. >> >> I think that both metrics (counters and time) should be calculated on the >> same node. So, there are two obvious solution: >> >> #1 Node that starts some operation is responsible for updating the cache >> metrics. >> Pro: >> - it will allow to get more accurate results of metrics. >> Contra: >> - this approach does not work in particular cases. for example, partitioned >> cache with FULL_ASYNC write synchronization mode. >> - needs to extend response messages (GridNearAtomicUpdateResponse, >> GridNearGetResponse etc) >> in order to provide additional information from remote node: cache hits, >> number of removal etc. >> So, it will lead to additional pressure on communication channel. 
>> Perhaps, this impact will be small - 4 bytes per message or something like >> that. >> - backward incompatibility (this is a consequence of the previous point) >> >> #2 Primary node (node that actually executes a request) >> Pro: >> - easy to implement >> - backward compatible >> Contra: >> - time metrics will not include the time of communication between nodes, so >> the results will be less accurate. >> - perhaps we need to provide additional metric which will allow to get avg >> time of communication between nodes. >> >> Please let me know about your thoughts. >> Perhaps, both alternatives are not so good... >> >> Regards, >> Slava.
Re: Cache Metrics
Hi, I believe that the first solution is better than the second because it takes into account network communication time. Average time of communication between nodes doesn't make sense from my point of view. So I vote for #1. On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин <slava.kopti...@gmail.com> wrote: > Hi Experts, > > I am working on https://issues.apache.org/jira/browse/IGNITE-3495 > > A few words about this issue: > It is about that the process of gathering/updating of cache metrics is > inconsistent in some cases. > Let's consider the following simple topology which contains only two nodes: > first node is a client node and the second is a server. > And client node starts requests to the server node, for instance > cache.put(), cache.putAll(), cache.get() etc. > In that case, metrics which are related to counters (cache hits, cache > misses, removals and puts) are calculated on the server side, > while time metrics are updated on the client node. > > I think that both metrics (counters and time) should be calculated on the > same node. So, there are two obvious solution: > > #1 Node that starts some operation is responsible for updating the cache > metrics. > Pro: > - it will allow to get more accurate results of metrics. > Contra: > - this approach does not work in particular cases. for example, partitioned > cache with FULL_ASYNC write synchronization mode. > - needs to extend response messages (GridNearAtomicUpdateResponse, > GridNearGetResponse etc) > in order to provide additional information from remote node: cache hits, > number of removal etc. > So, it will lead to additional pressure on communication channel. > Perhaps, this impact will be small - 4 bytes per message or something like > that. 
> - backward incompatibility (this is a consequence of the previous point) > > #2 Primary node (node that actually executes a request) > Pro: > - easy to implement > - backward compatible > Contra: > - time metrics will not include the time of communication between nodes, so > the results will be less accurate. > - perhaps we need to provide additional metric which will allow to get avg > time of communication between nodes. > > Please let me know about your thoughts. > Perhaps, both alternatives are not so good... > > Regards, > Slava.
Cache Metrics
Hi Experts, I am working on https://issues.apache.org/jira/browse/IGNITE-3495 A few words about this issue: the process of gathering/updating cache metrics is inconsistent in some cases. Let's consider the following simple topology, which contains only two nodes: the first node is a client node and the second is a server. The client node sends requests to the server node, for instance cache.put(), cache.putAll(), cache.get(), etc. In that case, metrics that are related to counters (cache hits, cache misses, removals and puts) are calculated on the server side, while time metrics are updated on the client node. I think that both kinds of metrics (counters and time) should be calculated on the same node. So, there are two obvious solutions: #1 The node that starts an operation is responsible for updating the cache metrics. Pro: - it allows getting more accurate metric results. Contra: - this approach does not work in particular cases, for example, a partitioned cache with the FULL_ASYNC write synchronization mode. - it needs to extend response messages (GridNearAtomicUpdateResponse, GridNearGetResponse, etc.) in order to provide additional information from the remote node: cache hits, number of removals, etc. So, it will lead to additional pressure on the communication channel. Perhaps this impact will be small - 4 bytes per message or something like that. - backward incompatibility (this is a consequence of the previous point) #2 The primary node (the node that actually executes a request) updates the metrics. Pro: - easy to implement - backward compatible Contra: - time metrics will not include the time of communication between nodes, so the results will be less accurate. - perhaps we need to provide an additional metric that allows getting the avg time of communication between nodes. Please let me know your thoughts. Perhaps both alternatives are not so good... Regards, Slava.
[GitHub] ignite pull request #2133: IGNITE-5492: Local cache metrics are broken.
Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/2133 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] ignite pull request #2133: IGNITE-5492: Local cache metrics are broken.
GitHub user shroman opened a pull request: https://github.com/apache/ignite/pull/2133 IGNITE-5492: Local cache metrics are broken. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shroman/ignite IGNITE-5492 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/2133.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2133 commit 01a6b8cb3f57a390cc74692c7099f4bed5131b6d Author: shroman <rsht...@yahoo.com> Date: 2017-06-15T09:41:01Z IGNITE-5492: Local cache metrics are broken.
[GitHub] ignite pull request #1800: Obsolete cache metrics removed, test fixed
Github user sergey-chugunov-1985 closed the pull request at: https://github.com/apache/ignite/pull/1800
[GitHub] ignite pull request #1800: Obsolete cache metrics removed, test fixed
GitHub user sergey-chugunov-1985 opened a pull request: https://github.com/apache/ignite/pull/1800 Obsolete cache metrics removed, test fixed You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-4536-later-fixes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/1800.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1800 commit 0caa761fdf38e476364bc6714e98929742098bfc Author: Sergey Chugunov <sergey.chugu...@gmail.com> Date: 2017-04-14T15:03:03Z IGNITE-4536 two more obsolete cache metrics were removed, test for memory allocation was fixed
Fwd: Cache Metrics
Cross sending this to dev. Igniters, why does the metrics stuff have to be so confusing? Looks like if "statisticsEnabled" is false, then metrics return all 0s. Can we at least have a warning in the log stating that the metrics are disabled, and explaining how to enable them? D. -- Forwarded message -- From: Alper Tekinalp <al...@evam.com> Date: Tue, Dec 20, 2016 at 3:10 AM Subject: Re: Cache Metrics To: u...@ignite.apache.org Hi all. Thanks for your replies. Enabling statistics fixed it. On Tue, Dec 20, 2016 at 12:39 PM, Andrey Mashenkov <amashen...@gridgain.com> wrote: > Hi Alper, > > May be it is not obvious, but to enable offheap you need to > setOffheapMaxMemory to zero (unlimited) or above zero. > Also metrics is disabled by default, you need call > setStatisticsEnabled(true); > > On Tue, Dec 20, 2016 at 11:41 AM, Alper Tekinalp <al...@evam.com> wrote: > >> Hi all. >> >> I have the following code: >> IgniteConfiguration igniteConfiguration = new >> IgniteConfiguration(); >> igniteConfiguration.setGridName("alper"); >> Ignite start = Ignition.start(igniteConfiguration); >> >> CacheConfiguration configuration = new CacheConfiguration(); >> configuration.setAtomicityMode(CacheAtomicityMode.ATOMIC) >> .setCacheMode(CacheMode.PARTITIONED) >> .setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED) >> .setRebalanceMode(CacheRebalanceMode.SYNC) >> .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC) >> .setRebalanceThrottle(100) >> .setRebalanceBatchSize(2*1024*1024) >> .setBackups(1) >> .setName("cemil") >> .setEagerTtl(false); >> start.getOrCreateCache(configuration); >> >> IgniteCache<Object, Object> cemil = start.getOrCreateCache("cemil"); >> >> cemil.put("1", "10"); >> cemil.put("2", "10"); >> cemil.put("3", "10"); >> cemil.put("4", "10"); >> >> System.out.println(cemil.metrics().getOffHeapAllocatedSize()); >> System.out.println(cemil.metrics().getOffHeapBackupEntriesCount()); >> System.out.println(cemil.metrics().getOffHeapGets()); >>
System.out.println(cemil.metrics().getOffHeapHits()); >> System.out.println(cemil.metrics().getOffHeapMisses()); >> System.out.println(cemil.metrics().getOffHeapPuts()); >> System.out.println(cemil.metrics().getOffHeapEvictions()); >> System.out.println(cemil.metrics().getOffHeapHitPercentage()); >> >> All prints 0. Is that normal? Am i doing something wrong? >> >> -- >> Alper Tekinalp >> >> Software Developer >> Evam Streaming Analytics >> >> Atatürk Mah. Turgut Özal Bulv. >> Gardenya 5 Plaza K:6 Ataşehir >> 34758 İSTANBUL >> >> Tel: +90 216 455 01 53 Fax: +90 216 455 01 54 >> www.evam.com.tr >> <http://www.evam.com> >> > > -- Alper Tekinalp Software Developer Evam Streaming Analytics Atatürk Mah. Turgut Özal Bulv. Gardenya 5 Plaza K:6 Ataşehir 34758 İSTANBUL Tel: +90 216 455 01 53 Fax: +90 216 455 01 54 www.evam.com.tr <http://www.evam.com>
Re: Cache Metrics
Hi, Alper! Could you try to set configuration.setStatisticsEnabled(true) and try once again? On Tue, Dec 20, 2016 at 3:41 PM, Alper Tekinalp wrote: > Hi all. > > I have the following code: > IgniteConfiguration igniteConfiguration = new > IgniteConfiguration(); > igniteConfiguration.setGridName("alper"); > Ignite start = Ignition.start(igniteConfiguration); > > CacheConfiguration configuration = new CacheConfiguration(); > configuration.setAtomicityMode(CacheAtomicityMode.ATOMIC) > .setCacheMode(CacheMode.PARTITIONED) > .setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED) > .setRebalanceMode(CacheRebalanceMode.SYNC) > > .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC) > .setRebalanceThrottle(100) > .setRebalanceBatchSize(2*1024*1024) > .setBackups(1) > .setName("cemil") > .setEagerTtl(false); > start.getOrCreateCache(configuration); > > IgniteCache
Cache Metrics
Hi all. I have the following code: IgniteConfiguration igniteConfiguration = new IgniteConfiguration(); igniteConfiguration.setGridName("alper"); Ignite start = Ignition.start(igniteConfiguration); CacheConfiguration configuration = new CacheConfiguration(); configuration.setAtomicityMode(CacheAtomicityMode.ATOMIC) .setCacheMode(CacheMode.PARTITIONED) .setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED) .setRebalanceMode(CacheRebalanceMode.SYNC) .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC) .setRebalanceThrottle(100) .setRebalanceBatchSize(2*1024*1024) .setBackups(1) .setName("cemil") .setEagerTtl(false); start.getOrCreateCache(configuration); IgniteCache
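As the replies above note, every cache metric in this thread printed 0 because cache statistics are disabled by default. A minimal sketch of the fix, reusing the cache name from the thread; the surrounding configuration is trimmed to the essentials, so treat it as an illustration rather than the exact original program:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;

public class StatisticsEnabledSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Object, Object> cfg = new CacheConfiguration<>("cemil");

            // Without this flag, all CacheMetrics values stay at 0.
            cfg.setStatisticsEnabled(true);

            IgniteCache<Object, Object> cache = ignite.getOrCreateCache(cfg);

            cache.put("1", "10");
            cache.get("1");

            // Non-zero now that statistics are enabled.
            System.out.println(cache.metrics().getCachePuts());
            System.out.println(cache.metrics().getCacheGets());
        }
    }
}
```

The same switch is available in XML configuration via the `statisticsEnabled` cache property.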
[GitHub] ignite pull request #1263: IGNITE-4264: fix cache metrics error between serv...
GitHub user wmz7year opened a pull request: https://github.com/apache/ignite/pull/1263 IGNITE-4264: fix cache metrics error between server and client. You can merge this pull request into a Git repository by running: $ git pull https://github.com/wmz7year/ignite ignite-4264 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/1263.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1263 commit af739867d50218dd47990ce17e81cce06615db85 Author: jiangwei <jiang...@caifuzhinan.com> Date: 2016-11-23T03:01:37Z fix cache metrics error between server and client.
Re: Cache metrics return incorrect values
Hmmm... Ok, every time we perform a put we go to offheap, because it can already contain this key. So my statement about one offheap get per one cache.get is wrong. Anyway, a get operation should update the offheap gets metric. See usages of CacheMetricsImpl#onOffheapRead. On Wed, May 25, 2016 at 4:06 PM, Vladislav Pyatkov wrote: > Andrey, > > I can see offheap gets metric increments every time when get > > > Unfortunately not. When cache configured as OFFHEAP_TIERED it does not > work. > About increment Get when Put takes place: > > org.apache.ignite.internal.processors.cache.local.CacheLocalOffHeapAndSwapMetricsSelfTest#testOffHeapMetrics > The logic existed is a long time and were covered tests. > > for (int i = 0; i < KEYS_CNT; i++) > cache.put(i, i); > > assertEquals(KEYS_CNT, cache.localMetrics().getOffHeapGets()); > > We execute only put, but get counter also incremented. > > Is anyone has another opinion? > > > > On Wed, May 25, 2016 at 2:51 PM, Andrey Gura wrote: > > > Denis, > > > > I disagree. readOffheapPointer doesn't touch offheap get/put metrics > > deliberately. User should have exactly one offheap get operation per > > cache.get call. > > > > Vlad, > > > > as I can see offheap gets metric increments every time when get, > contains, > > etc operations perform, so it should work. If you have more then one node > > then cluster metrics should be updated eventually with discovery message > > and immediately for local node. So if local node isn't primary for your > key > > you can get metrics with some delay. > > > > If particular metric doesn't change then we need find method that should > be > > responsible for update of this metric. > > > > > > On Tue, May 24, 2016 at 4:27 PM, Denis Magda > wrote: > > > > > Hi Vlad, > > > > > > In my understanding this should work or implemented this way for > > > OFFHEAP_TIRED cache. 
> > > > > > CacheMetrics.getCacheEvictions - incremented on every put & get > operation > > > because an entry “goes through” heap memory and evicted from there when > > > it’s no longer needed (usually at the end of get or put operation). > > > > > > CacheMetrics.getOffHeapGets - should be incremented every time the > > > off-heap layer is accessed for a particular key. This can be an > ordinary > > > cache.get() call or during a cache.put() that unswaps an entry before > the > > > new value is put. In my understanding you can increase this statistics > > > exactly in this method - GridCacheSwapManager#readOffheapPointer. > > > > > > CacheMetrics.getOffHeapPuts - should be incremented every time a put > > > operations happens and an entry is moved to off heap. > > > > > > — > > > Denis > > > > > > > On May 24, 2016, at 2:47 PM, Vladislav Pyatkov < > vpyat...@gridgain.com> > > > wrote: > > > > > > > > I try to understand how statistics work and fixe some problem. > > > > I first case: > > > > cache.put(46744, "val 46744"); > > > > cache.get(46744); > > > > In statistic I see: > > > > 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) > > entries > > > > count 0 > > > > 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) > > > > entries count 1 > > > > 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) > > > > > > > > In brackets Hit and Miss values. > > > > > > > > But I asume OffHeap get must to be one, because cache configured as > > > > OFFHEAP_TIERED and swapEnabled - false. > > > > > > > > My investigation has lead to method > > > > > > > > > > org.apache.ignite.internal.processors.cache.GridCacheSwapManager#readOffheapPointer. > > > > The method read only pointer from heap, but not get bytes of value > and > > > not > > > > increase any statistic. 
> > > > If each receive pointer increase statistic (OffHeap get I mean), then > > > each > > > > OffHeap put will increased OffHeap get, because readOffheapPointer > take > > > > place on OffHeap put. > > > > > > > > The thing confuses my: > > > > Has any rules metrics works? > > > > Where works with metrics value must take place? > > > > > > > > > > > > -- > > Andrey Gura > > GridGain Systems, Inc. > > www.gridgain.com > > > -- Andrey Gura GridGain Systems, Inc. www.gridgain.com
Re: Cache metrics return incorrect values
Denis, I disagree. readOffheapPointer doesn't touch offheap get/put metrics deliberately. A user should have exactly one offheap get operation per cache.get call. Vlad, as I can see, the offheap gets metric increments every time get, contains, etc. operations are performed, so it should work. If you have more than one node, then cluster metrics should be updated eventually with a discovery message and immediately for the local node. So if the local node isn't primary for your key, you can get metrics with some delay. If a particular metric doesn't change, then we need to find the method that should be responsible for updating this metric. On Tue, May 24, 2016 at 4:27 PM, Denis Magda wrote: > Hi Vlad, > > In my understanding this should work or implemented this way for > OFFHEAP_TIRED cache. > > CacheMetrics.getCacheEvictions - incremented on every put & get operation > because an entry “goes through” heap memory and evicted from there when > it’s no longer needed (usually at the end of get or put operation). > > CacheMetrics.getOffHeapGets - should be incremented every time the > off-heap layer is accessed for a particular key. This can be an ordinary > cache.get() call or during a cache.put() that unswaps an entry before the > new value is put. In my understanding you can increase this statistics > exactly in this method - GridCacheSwapManager#readOffheapPointer. > > CacheMetrics.getOffHeapPuts - should be incremented every time a put > operations happens and an entry is moved to off heap. > > — > Denis > > > On May 24, 2016, at 2:47 PM, Vladislav Pyatkov > wrote: > > > > I try to understand how statistics work and fixe some problem. 
> > I first case: > > cache.put(46744, "val 46744"); > > cache.get(46744); > > In statistic I see: > > 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries > > count 0 > > 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) > > entries count 1 > > 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) > > > > In brackets Hit and Miss values. > > > > But I asume OffHeap get must to be one, because cache configured as > > OFFHEAP_TIERED and swapEnabled - false. > > > > My investigation has lead to method > > > org.apache.ignite.internal.processors.cache.GridCacheSwapManager#readOffheapPointer. > > The method read only pointer from heap, but not get bytes of value and > not > > increase any statistic. > > If each receive pointer increase statistic (OffHeap get I mean), then > each > > OffHeap put will increased OffHeap get, because readOffheapPointer take > > place on OffHeap put. > > > > The thing confuses my: > > Has any rules metrics works? > > Where works with metrics value must take place? > > -- Andrey Gura GridGain Systems, Inc. www.gridgain.com
[jira] [Created] (IGNITE-3190) OffHeap cache metrics do not detected get from OffHeap
Vladislav Pyatkov created IGNITE-3190: - Summary: OffHeap cache metrics do not detected get from OffHeap Key: IGNITE-3190 URL: https://issues.apache.org/jira/browse/IGNITE-3190 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Assignee: Vladislav Pyatkov A simple cache configured with OffHeap tiered mode (statistics must be enabled) never increases the OffHeap get counter (CacheMetrics#getOffHeapGets is always 0) {code} cache.put(46744, "val 46744"); cache.get(46744); {code} {noformat} 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries count 0 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) entries count 1 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Cache metrics return incorrect values
Hi Vlad, In my understanding this should work or be implemented this way for an OFFHEAP_TIERED cache. CacheMetrics.getCacheEvictions - incremented on every put & get operation because an entry “goes through” heap memory and is evicted from there when it’s no longer needed (usually at the end of a get or put operation). CacheMetrics.getOffHeapGets - should be incremented every time the off-heap layer is accessed for a particular key. This can be an ordinary cache.get() call or a cache.put() that unswaps an entry before the new value is put. In my understanding you can increment this statistic exactly in this method - GridCacheSwapManager#readOffheapPointer. CacheMetrics.getOffHeapPuts - should be incremented every time a put operation happens and an entry is moved to off heap. — Denis > On May 24, 2016, at 2:47 PM, Vladislav Pyatkov wrote: > > I try to understand how statistics work and fixe some problem. > I first case: > cache.put(46744, "val 46744"); > cache.get(46744); > In statistic I see: > 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries > count 0 > 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) > entries count 1 > 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) > > In brackets Hit and Miss values. > > But I asume OffHeap get must to be one, because cache configured as > OFFHEAP_TIERED and swapEnabled - false. > > My investigation has lead to method > org.apache.ignite.internal.processors.cache.GridCacheSwapManager#readOffheapPointer. > The method read only pointer from heap, but not get bytes of value and not > increase any statistic. > If each receive pointer increase statistic (OffHeap get I mean), then each > OffHeap put will increased OffHeap get, because readOffheapPointer take > place on OffHeap put. > > The thing confuses my: > Has any rules metrics works? > Where works with metrics value must take place?
Cache metrics return incorrect values
I am trying to understand how statistics work and fix some problems. In the first case: cache.put(46744, "val 46744"); cache.get(46744); In the statistics I see: 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries count 0 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) entries count 1 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) In brackets are the Hit and Miss values. But I assume the OffHeap get count should be one, because the cache is configured as OFFHEAP_TIERED and swapEnabled is false. My investigation has led to the method org.apache.ignite.internal.processors.cache.GridCacheSwapManager#readOffheapPointer. The method reads only the pointer from the heap; it does not get the bytes of the value and does not increase any statistic. If each pointer read increased the statistic (the OffHeap get, I mean), then each OffHeap put would also increase OffHeap gets, because readOffheapPointer takes place on an OffHeap put. The things that confuse me: Are there any rules for how the metrics should work? Where should updates to the metric values take place?
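The scenario above can be reproduced with a snippet along these lines. This is a sketch against the Ignite 1.x API used throughout this thread (CacheMemoryMode was removed in later versions); the cache name is illustrative and statistics are enabled explicitly, since the metrics are otherwise all zero:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMemoryMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class OffHeapMetricsRepro {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("testCache");
            cfg.setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED); // entries live off-heap
            cfg.setStatisticsEnabled(true);                    // metrics are all 0 otherwise

            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(cfg);

            cache.put(46744, "val 46744");
            cache.get(46744);

            // Expected to be >= 1 for an OFFHEAP_TIERED cache; the bug
            // reported in IGNITE-3190 is that this stayed at 0.
            System.out.println(cache.metrics().getOffHeapGets());
        }
    }
}
```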
Re: Stream API doesn't update cache metrics.
What is your use case? On Friday, April 8, 2016, Dmitry Karachentsev wrote: > Yes, switching it to 'true' does the magic. > > Thanks! > > On 08.04.2016 13:52, Yakov Zhdanov wrote: > >> Is allowOverwrite set to 'false'? >> >> Thanks! >> -- >> Yakov Zhdanov, Director R >> *GridGain Systems* >> www.gridgain.com >> >> 2016-04-08 10:30 GMT+03:00 Dmitry Karachentsev < >> dkarachent...@gridgain.com>: >> >> Hi all. >>> >>> Adding data to cache via streamer doesn't update cache metrics like >>> AveragePutTime, CachePuts. Was it made intentionally? >>> >>> Thanks! >>> Dmitry. >>> >>> > -- --Yakov
Re: Stream API doesn't update cache metrics.
Yes, switching it to 'true' does the magic. Thanks! On 08.04.2016 13:52, Yakov Zhdanov wrote: Is allowOverwrite set to 'false'? Thanks! -- Yakov Zhdanov, Director R *GridGain Systems* www.gridgain.com 2016-04-08 10:30 GMT+03:00 Dmitry Karachentsev <dkarachent...@gridgain.com>: Hi all. Adding data to cache via streamer doesn't update cache metrics like AveragePutTime, CachePuts. Was it made intentionally? Thanks! Dmitry.
Re: Stream API doesn't update cache metrics.
Is allowOverwrite set to 'false'? Thanks! -- Yakov Zhdanov, Director R *GridGain Systems* www.gridgain.com
Stream API doesn't update cache metrics.
Hi all. Adding data to the cache via the streamer doesn't update cache metrics such as AveragePutTime and CachePuts. Was this done intentionally? Thanks! Dmitry.
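As the replies above conclude, the streamer bypasses the regular cache update path (and hence its metrics) when allowOverwrite is left at its default of false. A minimal sketch of the fix, assuming a hypothetical cache name "myCache":

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class StreamerMetricsExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("myCache"); // hypothetical cache name

            try (IgniteDataStreamer<Integer, String> stmr = ignite.dataStreamer("myCache")) {
                // With the default allowOverwrite(false) the streamer uses an
                // optimized batch path that skips metrics such as CachePuts and
                // AveragePutTime; 'true' routes entries through the regular
                // cache update path, at the cost of some throughput.
                stmr.allowOverwrite(true);

                for (int i = 0; i < 1000; i++)
                    stmr.addData(i, "val " + i);
            } // close() flushes remaining buffered entries
        }
    }
}
```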
[jira] [Created] (IGNITE-2731) Cache metrics documentation on readme.io
Denis Magda created IGNITE-2731: --- Summary: Cache metrics documentation on readme.io Key: IGNITE-2731 URL: https://issues.apache.org/jira/browse/IGNITE-2731 Project: Ignite Issue Type: Bug Components: documentation Affects Versions: 1.5.0.final Reporter: Denis Magda Fix For: 1.6 The cache metrics topic is becoming hot on the user list: 1) http://apache-ignite-users.70518.x6.nabble.com/Monitoring-Cache-Data-counters-Cache-Data-Size-td3203.html#a3211 2) http://apache-ignite-users.70518.x6.nabble.com/Is-there-a-way-to-get-cache-metrics-for-all-the-nodes-in-cluster-combined-td2674.html 3) http://apache-ignite-users.70518.x6.nabble.com/Metrics-for-backup-caches-td2689.html#a2692 The time has come to add a dedicated article on this topic.
[jira] [Created] (IGNITE-2636) Server cache metrics for put-get-remove avg time are incorrect for case when request sent from client
Vladimir Ershov created IGNITE-2636: --- Summary: Server cache metrics for put-get-remove avg time are incorrect for case when request sent from client Key: IGNITE-2636 URL: https://issues.apache.org/jira/browse/IGNITE-2636 Project: Ignite Issue Type: Bug Components: cache Affects Versions: 1.5.0.final Reporter: Vladimir Ershov Server cache metrics for put/get/remove average time are incorrect when the request is sent from a client. We should add methods like CacheMetrics#addPutAndGetTimeNanos to all flows where cache modification requests are processed, for all cache types.
[GitHub] ignite pull request: IGNTIE-2483 Cache metrics functionality for c...
GitHub user VladimirErshov opened a pull request: https://github.com/apache/ignite/pull/479 IGNTIE-2483 Cache metrics functionality for client nodes should be developed. Added a new version of CacheMetricsSnapshot. Fixed merging logic. Added proper put/get/remove time counting on the client side. You can merge this pull request into a Git repository by running: $ git pull https://github.com/VladimirErshov/ignite ignite-2483 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/479.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #479 commit 0bb74d8afcd3523aa51659a791b4078114f73bd3 Author: vershov <vers...@gridgain.com> Date: 2016-02-11T17:43:33Z IGNTIE-2483 added metrics on client. Fixed upTime. Redesigned base method and gathering logic.
Client Cache metrics API design discussion.
Igniters! There is a spot in our current cache metrics API that begs for improvement and, due to its significance, should be discussed community-wide. The current CacheMetrics interface is confusing when accessed from a client node. One of the typical questions is: *what should CacheMetrics#getSize return on a client node for a non-Near, non-Local cache?* Here are some options:

1. Zero. As it works now, it is just 0, since there are no entries on the client node.
2. The number of all entries for this cache across the cluster.
3. Or, and here comes the interesting part, the number of values that were, for example, created through this client node, since that is useful for #getAveragePutTime.
4. Your variant?

The same goes for the rest of the API: getCacheHits (0, cluster, client), getTxDhtCommitQueueSize (0, cluster, for client keys, UnsupportedOperationException?).

This point should give a good start to our discussion: there are use cases that demand metrics be gathered for a client node separately. For example, a user can measure latency between nodes by comparing #getAveragePutTime on the client and server side. Thus I consider it reasonable to implement a specific ClientCacheMetricsImpl with client-side logic, but the actual questions are: what should methods like getSize and getHits return? Is it necessary to maintain backward compatibility for the metrics API? Does the community think it is worth putting our effort into this task, and that we want to support cache metrics on a client node?

Thoughts?
Re: Client Cache metrics API design discussion.
Vladimir, As I already suggested in the ticket [1], I think that by default we should return metrics for the whole cluster. Now we collect them only from the local node, which is confusing, especially on the client. If one needs metrics from one node or from a subset of nodes, the metrics(ClusterGroup) method can be used. So as for the size, I'm definitely for option 2. Option 3 is more about 'getCachePuts()', not 'getSize()', no? Where do we increment this counter - on the client or on the primary node? If on the client, this metric will work just as you described when you get metrics for a particular client using metrics(ClusterGroup). Probably it would also be useful to add a localMetrics() shortcut method. [1] https://issues.apache.org/jira/browse/IGNITE-2483 -Val
Re: Client Cache metrics API design discussion.
Agree. All metrics should return the data for the whole cache, unless the user specifically requests otherwise. D.
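The direction agreed in the thread above (cluster-wide by default, with explicit local and group-scoped variants) can be sketched as follows. This is an assumption-laden illustration: the cache name "myCache" is hypothetical, and the localMetrics() shortcut proposed here only appeared in later Ignite releases:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMetrics;

public class ClientMetricsExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache("myCache");

            // Cluster-wide view: aggregates metrics from all nodes caching data
            // (option 2 from the discussion).
            CacheMetrics clusterWide = cache.metrics();

            // Scoped view: only the nodes in a given cluster group.
            CacheMetrics servers = cache.metrics(ignite.cluster().forServers());

            // Local view: this node only (sizes are 0 on a pure client node,
            // option 1 from the discussion).
            CacheMetrics local = cache.localMetrics();

            System.out.println("cluster size=" + clusterWide.getSize()
                + ", server gets=" + servers.getCacheGets()
                + ", local size=" + local.getSize());
        }
    }
}
```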
[jira] [Created] (IGNITE-2483) Cache metrics bugs
Valentin Kulichenko created IGNITE-2483: --- Summary: Cache metrics bugs Key: IGNITE-2483 URL: https://issues.apache.org/jira/browse/IGNITE-2483 Project: Ignite Issue Type: Bug Components: cache Reporter: Valentin Kulichenko Fix For: 1.6 User list discussion: http://apache-ignite-users.70518.x6.nabble.com/Is-there-a-way-to-get-cache-metrics-for-all-the-nodes-in-cluster-combined-td2674.html Currently there are at least three issues with cache metrics:

# When metrics are acquired on a client, average put times are always zero. This happens because timings are calculated on the client, but puts are counted on the servers.
# Size and keySize are always zero even if the cache is not empty.
# The default metrics() method, which doesn't take a cluster group, provides metrics for the local node only, so if it's called on a client they are always empty. It should calculate metrics for the whole cluster instead.

Also, this code looks very undertested; coverage should be significantly improved.