[jira] [Created] (IGNITE-13982) Add documentation for new checkpoint, cluster and cache metrics
Amelchev Nikita created IGNITE-13982:

Summary: Add documentation for new checkpoint, cluster and cache metrics
Key: IGNITE-13982
URL: https://issues.apache.org/jira/browse/IGNITE-13982
Project: Ignite
Issue Type: Task
Reporter: Amelchev Nikita
Assignee: Amelchev Nikita
Fix For: 2.10

Add documentation for the new metrics:
* LastCheckpointBeforeLockDuration
* LastCheckpointListenersExecuteDuration
* LastCheckpointLockHoldDuration
* LastCheckpointWalCpRecordFsyncDuration
* LastCheckpointWriteCheckpointEntryDuration
* LastCheckpointSplitAndSortPagesDuration
* CheckpointBeforeLockHistogram
* CheckpointLockWaitHistogram
* CheckpointListenersExecuteHistogram
* CheckpointMarkHistogram
* CheckpointLockHoldHistogram
* CheckpointPagesWriteHistogram
* CheckpointFsyncHistogram
* CheckpointWalRecordFsyncHistogram
* CheckpointWriteEntryHistogram
* CheckpointSplitAndSortPagesHistogram
* CheckpointHistogram
* TopologyVersion
* TotalNodes
* TotalBaselineNodes
* TotalServerNodes
* TotalClientNodes
* ActiveBaselineNodes
* OffHeapEntriesCount
* OffHeapBackupEntriesCount
* OffHeapPrimaryEntriesCount
* HeapEntriesCount
* CacheSize

--
This message was sent by Atlassian Jira (v8.3.4#803005)
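The histogram metrics in the list above (CheckpointHistogram and friends) bucket observed checkpoint durations rather than storing a single value. A toy sketch of that idea follows; this is not Ignite's implementation, and the bucket bounds here are made up:

```java
import java.util.Arrays;

// Toy model of a duration histogram metric such as CheckpointHistogram:
// each recorded value increments the counter of the first bucket whose
// upper bound covers it; the last bucket is unbounded.
public class DurationHistogram {
    final long[] bounds;   // upper bounds in milliseconds (assumed values)
    final long[] counts;   // one extra slot for the unbounded tail bucket

    DurationHistogram(long[] bounds) {
        this.bounds = bounds;
        this.counts = new long[bounds.length + 1];
    }

    void record(long durationMs) {
        int i = 0;
        while (i < bounds.length && durationMs > bounds[i])
            i++;
        counts[i]++;
    }

    public static void main(String[] args) {
        DurationHistogram h = new DurationHistogram(new long[] {100, 500, 1000});
        h.record(50);    // falls in the <=100 ms bucket
        h.record(700);   // falls in the <=1000 ms bucket
        h.record(5000);  // falls in the unbounded tail bucket
        System.out.println(Arrays.toString(h.counts)); // prints [1, 0, 1, 1]
    }
}
```

Documentation for each histogram metric would then describe its bucket bounds and what operation phase the recorded duration covers.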
[jira] [Created] (IGNITE-13952) Cache metrics without description
Alexand Polyakov created IGNITE-13952:

Summary: Cache metrics without description
Key: IGNITE-13952
URL: https://issues.apache.org/jira/browse/IGNITE-13952
Project: Ignite
Issue Type: Sub-task
Reporter: Alexand Polyakov
Assignee: Alexand Polyakov

List of metrics without description, registered under org.apache:group=CACHE_NAME,name="<MXBean implementation>":

org.apache.ignite.internal.processors.cache.CacheClusterMetricsMXBeanImpl:
* RebalancedKeys
* EstimatedRebalancingKeys
* RebalanceClearingPartitionsLeft
* EntryProcessorAverageInvocationTime
* EntryProcessorHitPercentage
* EntryProcessorHits
* EntryProcessorInvocations
* EntryProcessorMaxInvocationTime
* EntryProcessorMinInvocationTime
* EntryProcessorMisses
* EntryProcessorMissPercentage
* EntryProcessorPuts
* EntryProcessorReadOnlyInvocations
* EntryProcessorRemovals

org.apache.ignite.internal.processors.cache.CacheLocalMetricsMXBeanImpl:
* RebalancedKeys
* EstimatedRebalancingKeys
* RebalanceClearingPartitionsLeft
* EntryProcessorAverageInvocationTime
* EntryProcessorHitPercentage
* EntryProcessorHits
* EntryProcessorInvocations
* EntryProcessorMaxInvocationTime
* EntryProcessorMinInvocationTime
* EntryProcessorMisses
* EntryProcessorMissPercentage
* EntryProcessorPuts
* EntryProcessorReadOnlyInvocations
* EntryProcessorRemovals
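The bean names in the list above follow one fixed pattern, so the JMX ObjectName for a given cache can be built programmatically. A small pure-JDK sketch, assuming exactly the org.apache:group=...,name="..." pattern shown in the ticket (the helper class CacheMBeanName is hypothetical, not Ignite code):

```java
import javax.management.ObjectName;

public class CacheMBeanName {
    // Builds the JMX ObjectName under which the metrics listed above are
    // registered; the pattern is taken from the ticket, where the group
    // key holds the cache name and the name key holds the MXBean class.
    static ObjectName cacheMetricsBean(String cacheName, boolean clusterWide) throws Exception {
        String impl = clusterWide
            ? "org.apache.ignite.internal.processors.cache.CacheClusterMetricsMXBeanImpl"
            : "org.apache.ignite.internal.processors.cache.CacheLocalMetricsMXBeanImpl";
        return new ObjectName("org.apache:group=" + cacheName + ",name=\"" + impl + "\"");
    }

    public static void main(String[] args) throws Exception {
        ObjectName name = cacheMetricsBean("test", true);
        System.out.println(name);
        // An attribute such as EntryProcessorInvocations could then be read
        // with MBeanServerConnection.getAttribute(name, "EntryProcessorInvocations")
        // against a running node's MBean server.
    }
}
```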
Re: Cache metrics on server nodes does not update correctly
Hi,

Apache Ignite 2.7.6 doesn't contain a bug with aggregation of cache hits/misses. I'm not sure that the described problem is related to IGNITE-3495 [1], so it makes sense to file an issue.

[1] https://issues.apache.org/jira/browse/IGNITE-3495

On Thu, Mar 12, 2020 at 8:21 PM Dominik Przybysz wrote:
>
> Hi,
> I used Ignite version 2.7.6 (but I have also seen this behaviour on other
> 2.7.x versions) and there is no near or local cache.
> I expect that if I ask a distributed cache for a key which does not exist,
> the miss metric will be incremented.
>
> On Wed, Mar 11, 2020 at 11:35, Andrey Gura wrote:
>>
>> Denis,
>>
>> I'm not sure I understand what the expected behavior should be.
>> There are local and aggregated cluster-wide metrics. I don't know
>> which one is used by Visor because I have never used it :)
>>
>> Also it would be great to know which version of Apache Ignite is used
>> in the described case. I remember a bug with metrics aggregation during
>> the discovery metrics message round trip.
>>
>> On Wed, Mar 11, 2020 at 12:05 AM Denis Magda wrote:
>> >
>> > @Nikolay Izhikov , @Andrey Gura ,
>> > could you folks check out this thread?
>> >
>> > I have a feeling that what Dominik is describing was talked out before
>> > and is rather some sort of a limitation than an issue with the current
>> > implementation.
>> >
>> > -
>> > Denis
>> >
>> > On Tue, Mar 3, 2020 at 11:41 PM Dominik Przybysz wrote:
>> >
>> > > Hi,
>> > > I am trying to use a partitioned cache on server nodes to which I
>> > > connect with a client node. Cache statistics in the cluster are
>> > > updated, but only for the hits metric - the misses metric is always 0.
>> > >
>> > > To reproduce this problem I created a cluster of two nodes:
>> > >
>> > > Server node 1 adds 100 random test entries and prints cache statistics
>> > > continuously:
>> > >
>> > > public class IgniteClusterNode1 {
>> > >     public static void main(String[] args) throws InterruptedException {
>> > >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
>> > >
>> > >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
>> > >         cacheConfiguration.setName("test");
>> > >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
>> > >         cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC);
>> > >         cacheConfiguration.setStatisticsEnabled(true);
>> > >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
>> > >
>> > >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
>> > >         communicationSpi.setLocalPort(47500);
>> > >         igniteConfiguration.setCommunicationSpi(communicationSpi);
>> > >
>> > >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>> > >         discoverySpi.setLocalPort(47100);
>> > >         discoverySpi.setLocalPortRange(100);
>> > >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
>> > >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
>> > >         discoverySpi.setIpFinder(ipFinder);
>> > >         igniteConfiguration.setDiscoverySpi(discoverySpi);
>> > >
>> > >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
>> > >             try (IgniteCache cache = ignite.getOrCreateCache("test")) {
>> > >                 new Random().ints(1000).map(i -> Math.abs(i % 1000))
>> > >                     .distinct().limit(100).forEach(i -> {
>> > >                         String key = "data_" + i;
>> > >                         String value = UUID.randomUUID().toString();
>> > >                         cache.put(key, value);
>> > >                     });
>> > >             }
>> > >             while (true) {
>> > >                 System.out.println(ignite.cache("test").metrics());
>> > >                 Thread.sleep(5000);
>> > >             }
>> > >         }
>> > >     }
>> > > }
>> > >
>> > > Server node 2 only prints cache statistics continuously:
>> > >
>> > > public class IgniteClusterNode2 {
>> > >     public static void main(String[] args) throws InterruptedException {
>> > >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
>> > >
>> > >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
>> > >         cacheConfiguration.setName("test");
>> > >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
>> > >         cacheConfiguration.setStatisticsEnabled(true);
>> > >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
>> > >
>> > >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
>> > >         communicationSpi.setLocalPort(48500);
>> > >         igniteConfiguration.setCommunicationSpi(communicationSpi);
>> > >
>> > >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>> > >         discoverySpi.setLocalPort(48100);
>> > >         discoverySpi.setLocalPortRange(100);
>> > >         TcpDiscoveryVmIpFinder
Re: Cache metrics on server nodes does not update correctly
Hi,

I used Ignite version 2.7.6 (but I have also seen this behaviour on other 2.7.x versions) and there is no near or local cache.

I expect that if I ask a distributed cache for a key which does not exist, the miss metric will be incremented.

On Wed, Mar 11, 2020 at 11:35, Andrey Gura wrote:
> Denis,
>
> I'm not sure I understand what the expected behavior should be.
> There are local and aggregated cluster-wide metrics. I don't know
> which one is used by Visor because I have never used it :)
>
> Also it would be great to know which version of Apache Ignite is used
> in the described case. I remember a bug with metrics aggregation during
> the discovery metrics message round trip.
>
> On Wed, Mar 11, 2020 at 12:05 AM Denis Magda wrote:
> >
> > @Nikolay Izhikov , @Andrey Gura ,
> > could you folks check out this thread?
> >
> > I have a feeling that what Dominik is describing was talked out before
> > and is rather some sort of a limitation than an issue with the current
> > implementation.
> >
> > -
> > Denis
> >
> > On Tue, Mar 3, 2020 at 11:41 PM Dominik Przybysz wrote:
> >
> > > Hi,
> > > I am trying to use a partitioned cache on server nodes to which I
> > > connect with a client node. Cache statistics in the cluster are
> > > updated, but only for the hits metric - the misses metric is always 0.
> > >
> > > To reproduce this problem I created a cluster of two nodes:
> > >
> > > Server node 1 adds 100 random test entries and prints cache statistics
> > > continuously:
> > >
> > > public class IgniteClusterNode1 {
> > >     public static void main(String[] args) throws InterruptedException {
> > >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
> > >
> > >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
> > >         cacheConfiguration.setName("test");
> > >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
> > >         cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC);
> > >         cacheConfiguration.setStatisticsEnabled(true);
> > >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
> > >
> > >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
> > >         communicationSpi.setLocalPort(47500);
> > >         igniteConfiguration.setCommunicationSpi(communicationSpi);
> > >
> > >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
> > >         discoverySpi.setLocalPort(47100);
> > >         discoverySpi.setLocalPortRange(100);
> > >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
> > >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
> > >         discoverySpi.setIpFinder(ipFinder);
> > >         igniteConfiguration.setDiscoverySpi(discoverySpi);
> > >
> > >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
> > >             try (IgniteCache cache = ignite.getOrCreateCache("test")) {
> > >                 new Random().ints(1000).map(i -> Math.abs(i % 1000))
> > >                     .distinct().limit(100).forEach(i -> {
> > >                         String key = "data_" + i;
> > >                         String value = UUID.randomUUID().toString();
> > >                         cache.put(key, value);
> > >                     });
> > >             }
> > >             while (true) {
> > >                 System.out.println(ignite.cache("test").metrics());
> > >                 Thread.sleep(5000);
> > >             }
> > >         }
> > >     }
> > > }
> > >
> > > Server node 2 only prints cache statistics continuously:
> > >
> > > public class IgniteClusterNode2 {
> > >     public static void main(String[] args) throws InterruptedException {
> > >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
> > >
> > >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
> > >         cacheConfiguration.setName("test");
> > >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
> > >         cacheConfiguration.setStatisticsEnabled(true);
> > >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
> > >
> > >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
> > >         communicationSpi.setLocalPort(48500);
> > >         igniteConfiguration.setCommunicationSpi(communicationSpi);
> > >
> > >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
> > >         discoverySpi.setLocalPort(48100);
> > >         discoverySpi.setLocalPortRange(100);
> > >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
> > >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
> > >         discoverySpi.setIpFinder(ipFinder);
> > >         igniteConfiguration.setDiscoverySpi(discoverySpi);
> > >
> > >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
> > >             while (true) {
> > >                 System.out.println(ignite.cache("test").metrics());
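The accounting Dominik expects from the reproducer above can be stated as a toy model (this is plain Java for illustration, not Ignite code): a lookup of an absent key increments the miss counter, a lookup of a present key increments the hit counter.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of the expected hit/miss semantics: the reported bug is
// that the Ignite-side equivalent of the misses counter stays at 0.
public class HitMissCounter {
    final Map<String, String> data = new HashMap<>();
    long hits, misses;

    String get(String key) {
        String v = data.get(key);
        if (v == null)
            misses++;   // absent key: expected to count as a miss
        else
            hits++;     // present key: counts as a hit
        return v;
    }

    public static void main(String[] args) {
        HitMissCounter c = new HitMissCounter();
        c.data.put("data_1", "value");
        c.get("data_1");   // hit
        c.get("missing");  // miss
        System.out.println("hits=" + c.hits + " misses=" + c.misses); // prints hits=1 misses=1
    }
}
```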
Re: Cache metrics on server nodes does not update correctly
Denis,

I'm not sure I understand what the expected behavior should be. There are local and aggregated cluster-wide metrics. I don't know which one is used by Visor because I have never used it :)

Also it would be great to know which version of Apache Ignite is used in the described case. I remember a bug with metrics aggregation during the discovery metrics message round trip.

On Wed, Mar 11, 2020 at 12:05 AM Denis Magda wrote:
>
> @Nikolay Izhikov , @Andrey Gura ,
> could you folks check out this thread?
>
> I have a feeling that what Dominik is describing was talked out before
> and is rather some sort of a limitation than an issue with the current
> implementation.
>
> -
> Denis
>
> On Tue, Mar 3, 2020 at 11:41 PM Dominik Przybysz wrote:
>
> > Hi,
> > I am trying to use a partitioned cache on server nodes to which I connect
> > with a client node. Cache statistics in the cluster are updated, but only
> > for the hits metric - the misses metric is always 0.
> >
> > To reproduce this problem I created a cluster of two nodes:
> >
> > Server node 1 adds 100 random test entries and prints cache statistics
> > continuously:
> >
> > public class IgniteClusterNode1 {
> >     public static void main(String[] args) throws InterruptedException {
> >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
> >
> >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
> >         cacheConfiguration.setName("test");
> >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
> >         cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC);
> >         cacheConfiguration.setStatisticsEnabled(true);
> >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
> >
> >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
> >         communicationSpi.setLocalPort(47500);
> >         igniteConfiguration.setCommunicationSpi(communicationSpi);
> >
> >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
> >         discoverySpi.setLocalPort(47100);
> >         discoverySpi.setLocalPortRange(100);
> >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
> >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
> >         discoverySpi.setIpFinder(ipFinder);
> >         igniteConfiguration.setDiscoverySpi(discoverySpi);
> >
> >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
> >             try (IgniteCache cache = ignite.getOrCreateCache("test")) {
> >                 new Random().ints(1000).map(i -> Math.abs(i % 1000))
> >                     .distinct().limit(100).forEach(i -> {
> >                         String key = "data_" + i;
> >                         String value = UUID.randomUUID().toString();
> >                         cache.put(key, value);
> >                     });
> >             }
> >             while (true) {
> >                 System.out.println(ignite.cache("test").metrics());
> >                 Thread.sleep(5000);
> >             }
> >         }
> >     }
> > }
> >
> > Server node 2 only prints cache statistics continuously:
> >
> > public class IgniteClusterNode2 {
> >     public static void main(String[] args) throws InterruptedException {
> >         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
> >
> >         CacheConfiguration cacheConfiguration = new CacheConfiguration();
> >         cacheConfiguration.setName("test");
> >         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
> >         cacheConfiguration.setStatisticsEnabled(true);
> >         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
> >
> >         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
> >         communicationSpi.setLocalPort(48500);
> >         igniteConfiguration.setCommunicationSpi(communicationSpi);
> >
> >         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
> >         discoverySpi.setLocalPort(48100);
> >         discoverySpi.setLocalPortRange(100);
> >         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
> >         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
> >         discoverySpi.setIpFinder(ipFinder);
> >         igniteConfiguration.setDiscoverySpi(discoverySpi);
> >
> >         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
> >             while (true) {
> >                 System.out.println(ignite.cache("test").metrics());
> >                 Thread.sleep(5000);
> >             }
> >         }
> >     }
> > }
> >
> > Next I start a client node which continuously reads data from the cluster:
> >
> > public class CacheClusterReader {
> >     public static void main(String[] args) throws InterruptedException {
> >         IgniteConfiguration cfg = new IgniteConfiguration();
> >         cfg.setClientMode(true);
> >
> >         TcpDiscoverySpi spi = new TcpDiscoverySpi();
> >         TcpDiscoveryVmIpFinder tcMp = new TcpDiscoveryVmIpFinder();
Re: Cache metrics on server nodes does not update correctly
@Nikolay Izhikov , @Andrey Gura , could you folks check out this thread?

I have a feeling that what Dominik is describing was talked out before and is rather some sort of a limitation than an issue with the current implementation.

-
Denis

On Tue, Mar 3, 2020 at 11:41 PM Dominik Przybysz wrote:
> Hi,
> I am trying to use a partitioned cache on server nodes to which I connect
> with a client node. Cache statistics in the cluster are updated, but only
> for the hits metric - the misses metric is always 0.
>
> To reproduce this problem I created a cluster of two nodes:
>
> Server node 1 adds 100 random test entries and prints cache statistics
> continuously:
>
> public class IgniteClusterNode1 {
>     public static void main(String[] args) throws InterruptedException {
>         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
>
>         CacheConfiguration cacheConfiguration = new CacheConfiguration();
>         cacheConfiguration.setName("test");
>         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
>         cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC);
>         cacheConfiguration.setStatisticsEnabled(true);
>         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
>
>         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
>         communicationSpi.setLocalPort(47500);
>         igniteConfiguration.setCommunicationSpi(communicationSpi);
>
>         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>         discoverySpi.setLocalPort(47100);
>         discoverySpi.setLocalPortRange(100);
>         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
>         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
>         discoverySpi.setIpFinder(ipFinder);
>         igniteConfiguration.setDiscoverySpi(discoverySpi);
>
>         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
>             try (IgniteCache cache = ignite.getOrCreateCache("test")) {
>                 new Random().ints(1000).map(i -> Math.abs(i % 1000))
>                     .distinct().limit(100).forEach(i -> {
>                         String key = "data_" + i;
>                         String value = UUID.randomUUID().toString();
>                         cache.put(key, value);
>                     });
>             }
>             while (true) {
>                 System.out.println(ignite.cache("test").metrics());
>                 Thread.sleep(5000);
>             }
>         }
>     }
> }
>
> Server node 2 only prints cache statistics continuously:
>
> public class IgniteClusterNode2 {
>     public static void main(String[] args) throws InterruptedException {
>         IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
>
>         CacheConfiguration cacheConfiguration = new CacheConfiguration();
>         cacheConfiguration.setName("test");
>         cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
>         cacheConfiguration.setStatisticsEnabled(true);
>         igniteConfiguration.setCacheConfiguration(cacheConfiguration);
>
>         TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
>         communicationSpi.setLocalPort(48500);
>         igniteConfiguration.setCommunicationSpi(communicationSpi);
>
>         TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>         discoverySpi.setLocalPort(48100);
>         discoverySpi.setLocalPortRange(100);
>         TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
>         ipFinder.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
>         discoverySpi.setIpFinder(ipFinder);
>         igniteConfiguration.setDiscoverySpi(discoverySpi);
>
>         try (Ignite ignite = Ignition.start(igniteConfiguration)) {
>             while (true) {
>                 System.out.println(ignite.cache("test").metrics());
>                 Thread.sleep(5000);
>             }
>         }
>     }
> }
>
> Next I start a client node which continuously reads data from the cluster:
>
> public class CacheClusterReader {
>     public static void main(String[] args) throws InterruptedException {
>         IgniteConfiguration cfg = new IgniteConfiguration();
>         cfg.setClientMode(true);
>
>         TcpDiscoverySpi spi = new TcpDiscoverySpi();
>         TcpDiscoveryVmIpFinder tcMp = new TcpDiscoveryVmIpFinder();
>         tcMp.setAddresses(Arrays.asList("127.0.0.1:47100..47200", "127.0.0.1:48100..48200"));
>         spi.setIpFinder(tcMp);
>         cfg.setDiscoverySpi(spi);
>
>         CacheConfiguration cacheConfig = new CacheConfiguration<>("test");
>         cacheConfig.setStatisticsEnabled(true);
>         cacheConfig.setCacheMode(CacheMode.PARTITIONED);
>         cfg.setCacheConfiguration(cacheConfig);
>
>         try (Ignite ignite = Ignition.start(cfg)) {
>             System.out.println(ignite.cacheNames());
>
>             while (true) {
>                 try (IgniteCache cache = ignite.getOrCreateCache(cacheConfig)) {
Re: MetaStorage key length limitations and Cache Metrics configuration
Ivan,

I also don't think this issue is a blocker for 2.8, as it affects only experimental functionality and only in special cases. Removing the key length limitations in MetaStorage seems like the more strategic approach to me, but depending on how we decide to approach it (as a local fix or as part of a broader improvement of the MetaStorage internal implementation) we may target it to 2.8.1 or 2.9. In the latter case it makes sense to implement key length validation [1] and include it in 2.8.1 to prevent users from performing destructive actions. Otherwise, if we decide to implement [2] earlier and remove this pesky limitation in 2.8.1, then I'm fine with closing [1] with a "Won't fix" resolution. Does that make sense to you?

[1] https://issues.apache.org/jira/browse/IGNITE-12721
[2] https://issues.apache.org/jira/browse/IGNITE-12726

On Fri, Feb 28, 2020 at 4:18 PM Maxim Muzafarov wrote:
> Ivan,
>
> This issue doesn't seem to be a blocker for the 2.8 release from my point
> of view. I think we will definitely have such bugs in the future, and
> 2.8.1 is our goal for them.
>
> Please let me know if we should wait for the fix and include it in 2.8.
>
> On Fri, 28 Feb 2020 at 15:40, Nikolay Izhikov wrote:
> >
> > Igniters,
> >
> > I think we can replace the cache name with the cache id.
> > This should solve the issue with the length limitation.
> >
> > What do you think?
> >
> > > On Feb 28, 2020, at 15:32, Ivan Bessonov wrote:
> > >
> > > Hello Igniters,
> > >
> > > we have an issue in the master branch and in the upcoming 2.8 release
> > > that is related to the new metrics functionality implemented in [1].
> > > You can't use the new "configureHistogramMetric" and
> > > "configureHitRateMetric" configuration methods on caches with long
> > > names. My estimation shows that a cache with 30 characters in its name
> > > will shut down your whole cluster via the failure handler if you try
> > > to change the metrics configuration for it using one of those methods.
> > >
> > > Initially we wanted to merge [2] to show a valid error message instead
> > > of failing the cluster, but it wasn't planned for 2.8 because we didn't
> > > know that it clashes with [1].
> > >
> > > I created issue [3] with plans to remove the MetaStorage key length
> > > limitations, but it requires some thoughtful MetaStorageTree rework.
> > > I mean that it can't be done in only a few days.
> > >
> > > What do you think? Does this issue affect the 2.8 release? AFAIK the
> > > new metrics are experimental and can have some known issues. Feel free
> > > to ask me for more details if needed.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-11987
> > > [2] https://issues.apache.org/jira/browse/IGNITE-12721
> > > [3] https://issues.apache.org/jira/browse/IGNITE-12726
> > >
> > > --
> > > Sincerely yours,
> > > Ivan Bessonov
Re: MetaStorage key length limitations and Cache Metrics configuration
Ivan,

This issue doesn't seem to be a blocker for the 2.8 release from my point of view. I think we will definitely have such bugs in the future, and 2.8.1 is our goal for them.

Please let me know if we should wait for the fix and include it in 2.8.

On Fri, 28 Feb 2020 at 15:40, Nikolay Izhikov wrote:
>
> Igniters,
>
> I think we can replace the cache name with the cache id.
> This should solve the issue with the length limitation.
>
> What do you think?
>
> > On Feb 28, 2020, at 15:32, Ivan Bessonov wrote:
> >
> > Hello Igniters,
> >
> > we have an issue in the master branch and in the upcoming 2.8 release
> > that is related to the new metrics functionality implemented in [1].
> > You can't use the new "configureHistogramMetric" and
> > "configureHitRateMetric" configuration methods on caches with long
> > names. My estimation shows that a cache with 30 characters in its name
> > will shut down your whole cluster via the failure handler if you try to
> > change the metrics configuration for it using one of those methods.
> >
> > Initially we wanted to merge [2] to show a valid error message instead
> > of failing the cluster, but it wasn't planned for 2.8 because we didn't
> > know that it clashes with [1].
> >
> > I created issue [3] with plans to remove the MetaStorage key length
> > limitations, but it requires some thoughtful MetaStorageTree rework.
> > I mean that it can't be done in only a few days.
> >
> > What do you think? Does this issue affect the 2.8 release? AFAIK the
> > new metrics are experimental and can have some known issues. Feel free
> > to ask me for more details if needed.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-11987
> > [2] https://issues.apache.org/jira/browse/IGNITE-12721
> > [3] https://issues.apache.org/jira/browse/IGNITE-12726
> >
> > --
> > Sincerely yours,
> > Ivan Bessonov
Re: MetaStorage key length limitations and Cache Metrics configuration
Igniters,

I think we can replace the cache name with the cache id. This should solve the issue with the length limitation.

What do you think?

> On Feb 28, 2020, at 15:32, Ivan Bessonov wrote:
>
> Hello Igniters,
>
> we have an issue in the master branch and in the upcoming 2.8 release
> that is related to the new metrics functionality implemented in [1].
> You can't use the new "configureHistogramMetric" and
> "configureHitRateMetric" configuration methods on caches with long names.
> My estimation shows that a cache with 30 characters in its name will shut
> down your whole cluster via the failure handler if you try to change the
> metrics configuration for it using one of those methods.
>
> Initially we wanted to merge [2] to show a valid error message instead of
> failing the cluster, but it wasn't planned for 2.8 because we didn't know
> that it clashes with [1].
>
> I created issue [3] with plans to remove the MetaStorage key length
> limitations, but it requires some thoughtful MetaStorageTree rework.
> I mean that it can't be done in only a few days.
>
> What do you think? Does this issue affect the 2.8 release? AFAIK the new
> metrics are experimental and can have some known issues. Feel free to ask
> me for more details if needed.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-11987
> [2] https://issues.apache.org/jira/browse/IGNITE-12721
> [3] https://issues.apache.org/jira/browse/IGNITE-12726
>
> --
> Sincerely yours,
> Ivan Bessonov
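Nikolay's suggestion above, sketched: derive a fixed-size id from the cache name so the MetaStorage key length no longer depends on the name length. To the best of my knowledge Ignite's internal cache id is the name's hashCode mapped away from zero; treat the exact rule here as an assumption, not a statement about the actual implementation.

```java
public class CacheId {
    // Sketch of a fixed-size cache id derived from the name. A 30+ character
    // cache name collapses to a 4-byte int, so a MetaStorage key built from
    // the id has a constant length. The zero->one remapping is assumed.
    static int cacheId(String cacheName) {
        int id = cacheName.hashCode();
        return id == 0 ? 1 : id;
    }

    public static void main(String[] args) {
        System.out.println(cacheId("veryLongCacheNameThatWouldOverflowTheMetaStorageKey"));
    }
}
```

The trade-off of such a scheme is that ids are not reversible to names and can in principle collide, which is why it is only a sketch of the idea under discussion.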
MetaStorage key length limitations and Cache Metrics configuration
Hello Igniters,

we have an issue in the master branch and in the upcoming 2.8 release that is related to the new metrics functionality implemented in [1]. You can't use the new "configureHistogramMetric" and "configureHitRateMetric" configuration methods on caches with long names. My estimation shows that a cache with 30 characters in its name will shut down your whole cluster via the failure handler if you try to change the metrics configuration for it using one of those methods.

Initially we wanted to merge [2] to show a valid error message instead of failing the cluster, but it wasn't planned for 2.8 because we didn't know that it clashes with [1].

I created issue [3] with plans to remove the MetaStorage key length limitations, but it requires some thoughtful MetaStorageTree rework. I mean that it can't be done in only a few days.

What do you think? Does this issue affect the 2.8 release? AFAIK the new metrics are experimental and can have some known issues. Feel free to ask me for more details if needed.

[1] https://issues.apache.org/jira/browse/IGNITE-11987
[2] https://issues.apache.org/jira/browse/IGNITE-12721
[3] https://issues.apache.org/jira/browse/IGNITE-12726

--
Sincerely yours,
Ivan Bessonov
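Until the limitation is removed, the interim fix discussed here amounts to validating the key length up front and failing with a clear message instead of tripping the failure handler. A hypothetical sketch of that guard; the limit value, class name, and method name are all assumptions for illustration, not the actual patch:

```java
public class MetaStorageKeyValidator {
    // Hypothetical limit: the real MetaStorage maximum key length depends on
    // its internal tree/page layout and is not stated in this thread.
    static final int MAX_KEY_LEN = 64;

    // Rejects over-long keys (e.g. metric-configuration keys built from long
    // cache names) with a descriptive error instead of a cluster shutdown.
    static void validateKey(String key) {
        if (key.length() > MAX_KEY_LEN)
            throw new IllegalArgumentException("MetaStorage key is too long ("
                + key.length() + " > " + MAX_KEY_LEN + "): " + key);
    }
}
```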
Re: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hi Roman, I suppose that we can resolve the ticket with 2.8 fix version if you have no objections. чт, 26 дек. 2019 г. в 10:52, : > > Hi Ivan, > Does it mean that the problem is gone and I should close the JIRA > IGNITE-12445 ? > > > -Original Message- > From: Ivan Pavlukhin > Sent: Monday, December 16, 2019 11:05 PM > To: dev > Subject: Re: When Cache Metrics are switched on (statisticsEnabled = true) > the empty cache events arrive to the client nodes > > I also checked the reproducer with current master. It seems that the problem > is fixed there. > > пн, 16 дек. 2019 г. в 19:36, Ilya Kasnacheev : > > > > Hello! > > > > Is there a chance you are using Zk? > > > > I believe it's https://issues.apache.org/jira/browse/IGNITE-6564 > > > > Regards, > > -- > > Ilya Kasnacheev > > > > > > пт, 13 дек. 2019 г. в 12:24, : > > > > > Hi Community, > > > > > > I’d like to ask you about the following behavior of Apache Ignite: > > > > > > > > > If we want to react on some PUT or READ cache operations first of > > > all we need to turn on the appropriate cache events on the server > > > node and catch those events on the client nodes using remote approach > > > with two listeners. > > > It works well until we switch on statisticsEnabled on the server > > > node, it will lead to the situation when we get empty CacheEvent objects. > > > > > > The example that demonstrates this issue is in the attachments. This > > > example is consists of three nodes: 1 server node with cache and 2 > > > clients. One client is filling the cache and the second one is > > > listening PUT operations. When we turn on Cache Metrics on the server > > > node: > > > cacheConfig.setStatisticsEnabled(true); in EventServerCache.java we > > > get empty events (Sometimes CacheEvent objects with null fields. > > > Sometimes there are no events at all) > > > > > > My suppose is there is some Exception in > > > GridCacheEventManager.addEvent() when Cache Metrics is turned on. 
RE: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hi Ivan,

Does it mean that the problem is gone and I should close JIRA IGNITE-12445?
Re: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
I also checked the reproducer with current master. It seems that the problem is fixed there.
Re: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hello!

Is there a chance you are using Zk (ZooKeeper)? I believe it's https://issues.apache.org/jira/browse/IGNITE-6564

Regards,
--
Ilya Kasnacheev
Re: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hi Roman,

Thank you for reporting this! I looked into it, and on my machine I was able to receive events on the client-handler node, but an exception occurred in the local listener, in the following line:

System.out.println("Received event [evt=" + evt.name() + ", cacheName=" + evt.cacheName() + ", key=" + evt.key().toString());

This indeed looks like a weird bug: the event appears broken after deserialization on the listener side, after it is received from a server.
--
Best regards,
Ivan Pavlukhin
[jira] [Created] (IGNITE-12445) When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Roman Koriakov created IGNITE-12445: --- Summary: When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes Key: IGNITE-12445 URL: https://issues.apache.org/jira/browse/IGNITE-12445 Project: Ignite Issue Type: Bug Components: general Affects Versions: 2.7.6 Environment: OS Name Microsoft Windows 10 Pro, java version "1.8.0_231", OpenJDK 64-Bit Server VM 11+28 Reporter: Roman Koriakov

To react to PUT or READ cache operations, we first need to enable the appropriate cache events on the server node and catch those events on the client nodes using the remote approach with two listeners. This works well until we enable *statisticsEnabled* on the server node, after which we receive empty *CacheEvent* objects.

The example that demonstrates this issue is in the attachments. It consists of three nodes: one server node with a cache and two clients. One client fills the cache and the second one listens for PUT operations. When we enable cache metrics on the server node via *cacheConfig.setStatisticsEnabled(true);* in *EventServerCache.java*, we get empty events (sometimes CacheEvent objects with null fields, sometimes no events at all).

My guess is that some exception is thrown in GridCacheEventManager.addEvent() when cache metrics are enabled.
catch (Exception e) {
    if (!cctx.cacheObjectContext().kernalContext().cacheObjects().isBinaryEnabled(cctx.config()))
        throw e;

    if (log.isDebugEnabled())
        log.debug("Failed to unmarshall cache object value for the event notification: " + e);

    if (!forceKeepBinary)
        LT.warn(log, "Failed to unmarshall cache object value for the event notification " +
            "(all further notifications will keep binary object format).");

    forceKeepBinary = true;

    key0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(key, true, false);

    val0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(newVal, true, false);

    oldVal0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(oldVal, true, false);
}
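For reference, the server-side settings the report describes (enabling cache statistics plus the PUT event type so that remote listeners receive events) amount to roughly the following configuration fragment. This is a sketch only: the cache name and variable names are illustrative, not taken from the attachment.

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.EventType;

public class ServerConfigSketch {
    static IgniteConfiguration buildConfig() {
        IgniteConfiguration igniteCfg = new IgniteConfiguration();

        // Fire PUT events so remote listeners on client nodes receive them.
        igniteCfg.setIncludeEventTypes(EventType.EVT_CACHE_OBJECT_PUT);

        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("myCache");

        // The setting that, per the report, triggers the empty-event behavior.
        cacheCfg.setStatisticsEnabled(true);

        igniteCfg.setCacheConfiguration(cacheCfg);

        return igniteCfg;
    }
}
```

This is a configuration fragment, not a runnable reproducer; the full example with the two client nodes is in the ticket's attachments.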
When Cache Metrics are switched on (statisticsEnabled = true) the empty cache events arrive to the client nodes
Hi Community,

I'd like to ask about the following behavior of Apache Ignite.

To react to PUT or READ cache operations, we first need to enable the appropriate cache events on the server node and catch those events on the client nodes using the remote approach with two listeners. This works well until we enable statisticsEnabled on the server node, after which we receive empty CacheEvent objects.

The example that demonstrates this issue is in the attachments. It consists of three nodes: one server node with a cache and two clients. One client fills the cache and the second one listens for PUT operations. When we enable cache metrics on the server node via cacheConfig.setStatisticsEnabled(true); in EventServerCache.java, we get empty events (sometimes CacheEvent objects with null fields, sometimes no events at all).

My guess is that some exception is thrown in GridCacheEventManager.addEvent() when cache metrics are enabled:

catch (Exception e) {
    if (!cctx.cacheObjectContext().kernalContext().cacheObjects().isBinaryEnabled(cctx.config()))
        throw e;

    if (log.isDebugEnabled())
        log.debug("Failed to unmarshall cache object value for the event notification: " + e);

    if (!forceKeepBinary)
        LT.warn(log, "Failed to unmarshall cache object value for the event notification " +
            "(all further notifications will keep binary object format).");

    forceKeepBinary = true;

    key0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(key, true, false);

    val0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(newVal, true, false);

    oldVal0 = cctx.cacheObjectContext().unwrapBinaryIfNeeded(oldVal, true, false);
}

Could I publish this point in JIRA?
Best regards,

T-Systems RUS
Point of Production
Roman Koriakov
Software Developer
Kirova 11, Voronezh, Russia
Tel: +7 473 200 15 30
E-mail: roman.koria...@t-systems.com
http://www.t-systems.com

-----Original Message-----
From: Ilya Kasnacheev
Sent: Thursday, December 12, 2019 6:35 PM
To: dev
Subject: Re: joining

Hello! You will need to register on https://issues.apache.org/jira/ first. Please tell me when you do.

Regards,
--
Ilya Kasnacheev

Thu, Dec 12, 2019 at 18:09, roman.koria...@t-systems.com:

> Hi Ilya,
>
> it'd be nice if it were rkoriakov

-----Original Message-----
From: Ilya Kasnacheev
Sent: Thursday, December 12, 2019 5:25 PM
To: dev
Subject: Re: joining

Hello! I will need an Apache JIRA username to add you to contributors. Can you provide it?

Regards,
--
Ilya Kasnacheev

Thu, Dec 12, 2019 at 17:20, roman.koria...@t-systems.com:

> Hi everyone,
> I'd like to participate in this project!
[jira] [Created] (IGNITE-12196) [Phase-4] Deprecate old rebalancing cache metrics
Maxim Muzafarov created IGNITE-12196: Summary: [Phase-4] Deprecate old rebalancing cache metrics Key: IGNITE-12196 URL: https://issues.apache.org/jira/browse/IGNITE-12196 Project: Ignite Issue Type: Sub-task Reporter: Maxim Muzafarov

We need to mark the rebalancing CacheMetrics deprecated and remove them from the newly introduced metrics framework (IGNITE-11961). Such cache metrics should be implemented in the old-fashioned way (as they were before the metrics framework was added) to keep backwards compatibility. Remove them in Apache Ignite 3.0.
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Denis,

I measured the impact of metrics collection on my laptop: it is about 5 seconds on each node to collect the metrics of 1000 caches (all caches in one cache group) with 32000 partitions. All this time tcp-disco-msg-worker is blocked.

Guys, thanks for your proposals, I've filed the ticket [1].

[1]: https://issues.apache.org/jira/browse/IGNITE-10642

On Tue, Dec 4, 2018 at 7:22 PM Vladimir Ozerov wrote:

> Hi Alex,
>
> Agree with you. Most of the time this distribution of metrics is not needed. In the future we will have more and more information which potentially needs to be shared between nodes, e.g. IO statistics, SQL statistics for the query optimizer, SQL execution history, etc. We need common mechanics for this, so I vote for your proposal:
> 1) Data is collected locally.
> 2) If a node needs to collect data from the cluster, it sends an explicit request over communication SPI.
> 3) For performance reasons we may consider caching: return previously collected metrics without re-requesting them again if they are not too old (configurable).

On Tue, Dec 4, 2018 at 12:46 PM Alex Plehanov wrote:

> Hi Igniters,
>
> In the current implementation, cache metrics are collected on each node and sent across the whole cluster with a discovery message (TcpDiscoveryMetricsUpdateMessage) at a configured frequency (MetricsUpdateFrequency, 2 seconds by default), even if no one requested them. If there are a lot of caches and a lot of nodes in the cluster, the metrics update message (which contains each metric for each cache on each node) can reach a critical size.
>
> Frequently collecting all cache metrics also has a negative performance impact (some of them just read values from an AtomicLong, but some of them need an iteration over all cache partitions). The only way now to disable cache metrics collection and sending with the discovery message is to disable statistics for each cache. But this also makes it impossible to request some cache metrics locally (for the current node only). Requesting a limited set of cache metrics on the current node doesn't have the performance impact of frequent collection of all cache metrics, but sometimes it's enough for diagnostic purposes.
>
> As a workaround I have filed and implemented ticket [1], which introduces a new system property to disable cache metrics sending with TcpDiscoveryMetricsUpdateMessage (if this property is set, the message will contain only node metrics). But a system property is not good as a permanent solution. Perhaps it's better to move such a property to the public API (to IgniteConfiguration, for example).
>
> Also, maybe we should change the cache metrics distribution strategy? For example, collect metrics by request via communication SPI, or subscribe to a limited set of caches/metrics, etc.
>
> Thoughts?
>
> [1]: https://issues.apache.org/jira/browse/IGNITE-10172
[jira] [Created] (IGNITE-10642) Cache metrics distribution mechanism should be changed from broadcast to request-response communication pattern
Aleksey Plekhanov created IGNITE-10642: -- Summary: Cache metrics distribution mechanism should be changed from broadcast to request-response communication pattern Key: IGNITE-10642 URL: https://issues.apache.org/jira/browse/IGNITE-10642 Project: Ignite Issue Type: Improvement Affects Versions: 2.7 Reporter: Aleksey Plekhanov

In the current implementation, all cache metrics are collected on each node for all caches and sent across the whole cluster with a discovery message ({{TcpDiscoveryMetricsUpdateMessage}}) at a configured frequency (MetricsUpdateFrequency, 2 seconds by default), even if no one requested them. This mechanism should be changed in the following ways:
* Local cache metrics should be available (if configured) on each node
* If a node needs to collect data from the cluster, it sends an explicit request over communication SPI (the request should contain a limited set of caches and/or metrics)
* For performance reasons, collected cluster-wide values must be cached: previously collected metrics should be returned without re-requesting them if they are not too old (configurable)
* The mechanism should be easily adaptable for other types of statistics which may need to be shared between nodes in the future (IO statistics, SQL statistics, SQL execution history, etc.)
* The message format should be carefully designed to minimize message size (a cluster can contain thousands of caches and hundreds of nodes)
* There must be an opportunity to configure metrics at runtime
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Hi,

One of the problems with metrics is their huge size when a lot of caches are started on a node (for example, I have seen 7000 caches). We have to think about how to compact them. Not all metrics change frequently, so we may store them locally and send over the wire only the difference from the previous collection. And we should think carefully about the storage format: for example, if current cache metrics were passed as a JSON object, then 70% of it would be strings with metric names.

-- Alexey Kuznetsov
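The "send only the difference from the previous collection" idea can be illustrated with a minimal sketch. `MetricsDelta` and its map-based metric representation are hypothetical stand-ins, not Ignite classes:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: compute the subset of metric values that changed
 * since the previous collection, so only the delta goes over the wire.
 */
class MetricsDelta {
    static Map<String, Long> delta(Map<String, Long> previous, Map<String, Long> current) {
        Map<String, Long> changed = new HashMap<>();

        for (Map.Entry<String, Long> e : current.entrySet()) {
            // Include a metric only if it is new or its value differs from last time.
            if (!e.getValue().equals(previous.get(e.getKey())))
                changed.put(e.getKey(), e.getValue());
        }

        return changed;
    }
}
```

A real wire format would additionally replace the metric-name strings with integer IDs agreed on once per topology, which addresses the "70% of it will be strings" concern.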
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Hi Alex,

Agree with you. Most of the time this distribution of metrics is not needed. In the future we will have more and more information that potentially needs to be shared between nodes, e.g. IO statistics, SQL statistics for the query optimizer, SQL execution history, etc. We need common mechanics for this, so I vote for your proposal:

1) Data is collected locally.
2) If a node needs to collect data from the cluster, it sends an explicit request over communication SPI.
3) For performance reasons we may consider caching: return previously collected metrics without re-requesting them if they are not too old (configurable).
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
hi, Alex. imo:

1. metrics through discovery require refactoring.
2. local cache metrics should be available (if configured) on each node.
3. there must be an opportunity to configure metrics at runtime.

thanks.

-- Zhenya Stanilovsky
Re: [DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Alex,

Did you measure the impact of metrics collection? What is the overhead you are trying to avoid?

Just to make it clear, MetricsUpdateMessage-s are used as heartbeats, so they are sent anyway, even if no metrics are distributed between nodes.

Denis
[DISCUSSION] Performance issue with cluster-wide cache metrics distribution
Hi Igniters,

In the current implementation, cache metrics are collected on each node and sent across the whole cluster with a discovery message (TcpDiscoveryMetricsUpdateMessage) at a configured frequency (MetricsUpdateFrequency, 2 seconds by default), even if no one requested them. If there are a lot of caches and a lot of nodes in the cluster, the metrics update message (which contains each metric for each cache on each node) can reach a critical size.

Also, frequently collecting all cache metrics has a negative performance impact (some of them just read values from an AtomicLong, but some need an iteration over all cache partitions).

The only way now to disable cache metrics collection and sending with the discovery message is to disable statistics for each cache. But this also makes it impossible to request some cache metrics locally (for the current node only). Requesting a limited set of cache metrics on the current node doesn't have the same performance impact as the frequent collection of all cache metrics, but sometimes it's enough for diagnostic purposes.

As a workaround I have filed and implemented ticket [1], which introduces a new system property to disable cache metrics sending with TcpDiscoveryMetricsUpdateMessage (if this property is set, the message will contain only node metrics). But a system property is not good for a permanent solution. Perhaps it's better to move such a property to the public API (to IgniteConfiguration, for example).

Also, maybe we should change the cache metrics distribution strategy? For example, collect metrics by request via communication SPI, or subscribe to a limited set of caches/metrics, etc.

Thoughts?

[1]: https://issues.apache.org/jira/browse/IGNITE-10172
Unable to get the ignite cache metrics
Hi,

I brought up an Ignite server on a k8s cluster and set the below property for a cache whose metrics I wanted to check. Then I started the client and pushed data into the Ignite cache. I can see the data in the cache, but the values for the following metrics are 0. Can someone tell me why this is?

ignite_org_apache_ignite_internal_processors_cache_cachelocalmetricsmxbeanimpl_cacheputs = 0
ignite_org_apache_ignite_internal_processors_cache_cachelocalmetricsmxbeanimpl_averageputtime = 0

-- Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
Unable to get the ignite cache metrics
Hi,

I am new to Ignite. I brought up the Ignite servers on k8s and enabled cache-level metrics by setting the below property in the Ignite config XML for a specific cache. From the Ignite client (brought up as another pod) I am putting data into the cache, but when I check the below cache metrics I get a value of 0. Can someone help me understand why?

ignite_org_apache_ignite_internal_processors_cache_cacheclustermetricsmxbeanimpl_cacheputs = 0
ignite_org_apache_ignite_internal_processors_cache_cachelocalmetricsmxbeanimpl_averageputtime = 0

But this metric does give the number of entries in the cache:

ignite_org_apache_ignite_internal_processors_cache_cacheclustermetricsmxbeanimpl_keysize = 5060
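The XML property referenced in both questions above did not survive the archive. Assuming the posters meant per-cache statistics (the {{statisticsEnabled}} flag that the JMX cache metrics depend on), a typical Spring XML fragment would look roughly like this (the cache name is a placeholder):

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <!-- Placeholder cache name. -->
    <property name="name" value="myCache"/>
    <!-- Enables collection of cache metrics such as CachePuts and AveragePutTime. -->
    <property name="statisticsEnabled" value="true"/>
</bean>
```

Note also that, as discussed elsewhere in this archive, time-based metrics such as AveragePutTime are updated on the node that initiates the operation, so a server-side exporter may still report 0 for them when all puts come from a client node.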
[jira] [Created] (IGNITE-9224) MVCC SQL: Cache metrics
Ivan Pavlukhin created IGNITE-9224: -- Summary: MVCC SQL: Cache metrics Key: IGNITE-9224 URL: https://issues.apache.org/jira/browse/IGNITE-9224 Project: Ignite Issue Type: Improvement Reporter: Ivan Pavlukhin Assignee: Ivan Pavlukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8554) Cache metrics: expose metrics with rebalance info about keys
Alexey Kuznetsov created IGNITE-8554: Summary: Cache metrics: expose metrics with rebalance info about keys Key: IGNITE-8554 URL: https://issues.apache.org/jira/browse/IGNITE-8554 Project: Ignite Issue Type: Improvement Reporter: Alexey Kuznetsov Assignee: Alexey Kuznetsov In order to show info about rebalance progress we need to expose estimatedRebalancingKeys and rebalancedKeys metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] ignite pull request #3369: IGNITE-6923 Cache metrics optimization
Github user alex-plekhanov closed the pull request at: https://github.com/apache/ignite/pull/3369 ---
[GitHub] ignite pull request #3369: IGNITE-6923 Cache metrics optimization
GitHub user alex-plekhanov opened a pull request: https://github.com/apache/ignite/pull/3369 IGNITE-6923 Cache metrics optimization

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alex-plekhanov/ignite IGNITE-6923

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3369.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3369

commit eef3fed408d7cfbeedcabcb354989c64be773724 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-19T14:44:06Z IGNITE-6923 Optimized nonHeapMemoryUsed
commit 122467d0ca4cfe859e2fc5af276b20c4f50dc89c Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-20T16:13:34Z IGNITE-6923 getTotalPartitionsCount, getRebalancingPartitionsCount optimization
commit e20d842f9ccf9c4d1e4703a52cd723d8e37ddbea Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-22T08:56:36Z IGNITE-6923 Cluster metrics optimization (proxy class implemented)
commit bff0a1845799ad90c6ecdd3812e84418ba45bd07 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-25T08:40:24Z IGNITE-6923 Partitions metrics optimization
commit 6dd59b9961f9fc073d4616367e36607430f113b8 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-26T12:57:01Z IGNITE-6923 Cache metrics optimization
commit 96b0396f05c7aacb24bd9ee88e2564d240fae0be Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-26T12:58:30Z IGNITE-6923 Cache metrics optimization
commit 79067428f7252700bb23d29f3e3b4b7dfa5586bf Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-27T07:48:56Z IGNITE-6923 Disable cache metrics update flag
commit 5e5d675f6fb6a25eda57fcbf53e49ec87fda3ba2 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2017-12-27T08:07:57Z IGNITE-6923 License header
commit e90d778d9fab0b1e467ef1314505060494fea1db Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2018-01-11T20:57:29Z IGNITE-6923 Bugfix
commit 77e50a74dadc6ae40d301f63cd0e4b73b6203303 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2018-01-12T14:43:29Z IGNITE-6923 Unit test
commit 39f7c653e8b91ec7b02244e2633e27ea9103793d Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2018-01-12T14:47:35Z Revert "IGNITE-6923 Disable cache metrics update flag" This reverts commit 9bb904f
commit 62ea9f5d6524eff6e9a69fbc8ca2ac0c95325796 Author: Aleksey Plekhanov <plehanov.alex@...> Date: 2018-01-12T19:24:57Z IGNITE-6923 Test comment added
---
[GitHub] ignite pull request #3356: IGNITE-7126: add new cache metrics parameters
Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/3356 ---
[GitHub] ignite pull request #3356: IGNITE-7126: add new cache metrics parameters
GitHub user AlexeyRokhin opened a pull request: https://github.com/apache/ignite/pull/3356 IGNITE-7126: add new cache metrics parameters Required parameters were added. You can merge this pull request into a Git repository by running: $ git pull https://github.com/AlexeyRokhin/ignite ignite-7126 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3356.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3356 commit 6cde9dad36af56926e54304edb7189ae58c69b78 Author: Alexey Rokhin <arokhin@...> Date: 2018-01-10T22:10:26Z IGNITE-7126: add new cache metrics parameters ---
[jira] [Created] (IGNITE-7106) Add option to VisorNodeDataCollectorTask to not collect cache metrics
Alexey Kuznetsov created IGNITE-7106: Summary: Add option to VisorNodeDataCollectorTask to not collect cache metrics Key: IGNITE-7106 URL: https://issues.apache.org/jira/browse/IGNITE-7106 Project: Ignite Issue Type: Bug Reporter: Alexey Kuznetsov On large clusters with > 100 nodes and > 1000 caches this task can collect a huge amount of data. We can add an option to collect this info on demand. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6925) Simplify cache metrics activation
Denis Magda created IGNITE-6925: --- Summary: Simplify cache metrics activation Key: IGNITE-6925 URL: https://issues.apache.org/jira/browse/IGNITE-6925 Project: Ignite Issue Type: Bug Security Level: Public (Viewable by anyone) Reporter: Denis Magda

The user needs to do 3 things to enable cache metrics:
- set {{statisticsEnabled}} to {{true}};
- set a non-dummy {{EventsStorageSpi}};
- list the metrics of interest.

This process has to be reduced to 2 steps or, preferably, to 1. More details are here: http://apache-ignite-developers.2346864.n4.nabble.com/Annoying-extra-steps-for-enabling-metrics-td21865.html

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6923) Cache metrics are updated in timeout-worker potentially delaying critical code execution due to current implementation issues.
Alexei Scherbakov created IGNITE-6923: - Summary: Cache metrics are updated in timeout-worker potentially delaying critical code execution due to current implementation issues. Key: IGNITE-6923 URL: https://issues.apache.org/jira/browse/IGNITE-6923 Project: Ignite Issue Type: Bug Security Level: Public (Viewable by anyone) Affects Versions: 2.3 Reporter: Alexei Scherbakov Fix For: 2.4

Some metrics use a full cache iteration for their calculation. See the stack trace for an example:

{noformat}
"grid-timeout-worker-#39%DPL_GRID%DplGridNodeName%" #152 prio=5 os_prio=0 tid=0x7f1009a03000 nid=0x5caa runnable [0x7f0f059d9000]
   java.lang.Thread.State: RUNNABLE
	at java.util.HashMap.containsKey(HashMap.java:595)
	at java.util.HashSet.contains(HashSet.java:203)
	at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032)
	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$3.apply(IgniteCacheOffheapManagerImpl.java:339)
	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$3.apply(IgniteCacheOffheapManagerImpl.java:337)
	at org.apache.ignite.internal.util.lang.gridfunc.TransformFilteringIterator.hasNext(TransformFilteringIterator.java:90)
	at org.apache.ignite.internal.util.lang.GridIteratorAdapter.hasNext(GridIteratorAdapter.java:45)
	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.cacheEntriesCount(IgniteCacheOffheapManagerImpl.java:293)
	at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.getOffHeapPrimaryEntriesCount(CacheMetricsImpl.java:240)
	at org.apache.ignite.internal.processors.cache.CacheMetricsSnapshot.<init>(CacheMetricsSnapshot.java:271)
	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.localMetrics(GridCacheAdapter.java:3217)
	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.cacheMetrics(GridDiscoveryManager.java:1151)
	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.nonHeapMemoryUsed(GridDiscoveryManager.java:1121)
	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.metrics(GridDiscoveryManager.java:1087)
	at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode.metrics(TcpDiscoveryNode.java:269)
	at org.apache.ignite.internal.IgniteKernal$3.run(IgniteKernal.java:1175)
	at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$CancelableTask.onTimeout(GridTimeoutProcessor.java:256)
	- locked <0x7f115f5bf890> (a org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$CancelableTask)
	at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:158)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
	at java.lang.Thread.run(Thread.java:748)
{noformat}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
IGNITE-6679 Clean up some deprecated cache metrics
Hello, Igniters. I have removed deprecated metrics [1]. Please, review [2]. Tests look good [3]. 1. https://issues.apache.org/jira/browse/IGNITE-6679 2. https://reviews.ignite.apache.org/ignite/review/IGNT-CR-390 3. https://ci.ignite.apache.org/project.html?projectId=Ignite20Tests=projectOverview_Ignite20Tests=pull%2F2962%2Fhead -- Best wishes, Amelchev Nikita
[jira] [Created] (IGNITE-6679) Clean up some deprecated cache metrics
Sergey Puchnin created IGNITE-6679: -- Summary: Clean up some deprecated cache metrics Key: IGNITE-6679 URL: https://issues.apache.org/jira/browse/IGNITE-6679 Project: Ignite Issue Type: Improvement Security Level: Public (Viewable by anyone) Components: cache Reporter: Sergey Puchnin Priority: Trivial

The old optimistic serializable mode implementation was removed in 2.0, but some of its cache metrics are still present in the CacheMetrics interface. We need to clean up and remove these metrics:

*TxCommitQueueSize*
*TxPrepareQueueSize*
*TxStartVersionCountsSize*
*TxDhtCommitQueueSize*
*TxDhtPrepareQueueSize*
*TxDhtStartVersionCountsSize*

The algorithm for page eviction was also changed, and the metric *DhtEvictQueueCurrentSize* should be removed as well.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6630) Incorrect time units of average transaction commit/rollback duration cache metrics.
Pavel Pereslegin created IGNITE-6630: Summary: Incorrect time units of average transaction commit/rollback duration cache metrics. Key: IGNITE-6630 URL: https://issues.apache.org/jira/browse/IGNITE-6630 Project: Ignite Issue Type: Bug Reporter: Pavel Pereslegin Assignee: Pavel Pereslegin Priority: Minor

The AverageTxCommitTime and AverageTxRollbackTime metrics in CacheMetrics are calculated in milliseconds instead of microseconds as stated in the javadoc. Simple JUnit repro:

{code:java}
public class CacheMetricsTxAvgTimeTest extends GridCommonAbstractTest {
    /** */
    private <K, V> CacheConfiguration<K, V> cacheConfiguration(String name) {
        CacheConfiguration<K, V> cacheConfiguration = new CacheConfiguration<>(name);

        cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
        cacheConfiguration.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
        cacheConfiguration.setStatisticsEnabled(true);

        return cacheConfiguration;
    }

    /** */
    public void testTxCommitDuration() throws Exception {
        try (Ignite node = startGrid(0)) {
            IgniteCache<Object, Object> cache = node.createCache(cacheConfiguration(DEFAULT_CACHE_NAME));

            try (Transaction tx = node.transactions().txStart()) {
                cache.put(1, 1);

                // Await 1 second.
                U.sleep(1_000);

                tx.commit();
            }

            // Documentation says that this metric is in microseconds.
            float commitTime = cache.metrics().getAverageTxCommitTime();

            // But this assertion will fail because it is in milliseconds and returns only ~1000.
            assert commitTime >= 1_000_000;
        }
    }
}
{code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6584) .NET: Propagate new cache metrics
Pavel Tupitsyn created IGNITE-6584: -- Summary: .NET: Propagate new cache metrics Key: IGNITE-6584 URL: https://issues.apache.org/jira/browse/IGNITE-6584 Project: Ignite Issue Type: Improvement Components: platforms Reporter: Pavel Tupitsyn Priority: Trivial Some properties are missing in {{ICacheMetrics}} that exist in {{CacheMetrics}} on Java side, such as rebalancing-related stuff (see IGNITE-6583). Add them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6565) Use long type for size and keySize in cache metrics
Ilya Kasnacheev created IGNITE-6565: --- Summary: Use long type for size and keySize in cache metrics Key: IGNITE-6565 URL: https://issues.apache.org/jira/browse/IGNITE-6565 Project: Ignite Issue Type: Bug Affects Versions: 2.2 Reporter: Ilya Kasnacheev

Currently they are int, so for large caches there is no way to convey the correct value. We should introduce getSizeLong() and getKeySizeLong(). We should also introduce the same in .NET and make sure that compatibility is not broken when passing OP_LOCAL_METRICS and OP_GLOBAL_METRICS. BTW, do we need keySize at all? What is it for?

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
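The motivation for a long-typed getter can be shown with a tiny sketch of what an int-typed size metric does once a cache exceeds Integer.MAX_VALUE entries (SizeOverflow is a hypothetical helper, not Ignite code):

```java
/**
 * Hypothetical helper showing why an int-typed size metric breaks
 * once a cache holds more than Integer.MAX_VALUE entries.
 */
class SizeOverflow {
    /** What a legacy int-typed metric effectively does: the value wraps around. */
    static int intSize(long entries) {
        return (int) entries;
    }

    /** A getSizeLong()-style metric keeps the real value. */
    static long longSize(long entries) {
        return entries;
    }
}
```

For 3 billion entries the int-typed value wraps into a negative number, while the long-typed getter reports the true size.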
[jira] [Created] (IGNITE-6564) Incorrect calculation size and keySize for cluster cache metrics
Ilya Kasnacheev created IGNITE-6564: --- Summary: Incorrect calculation size and keySize for cluster cache metrics Key: IGNITE-6564 URL: https://issues.apache.org/jira/browse/IGNITE-6564 Project: Ignite Issue Type: Bug Affects Versions: 2.2 Reporter: Ilya Kasnacheev Priority: Minor They are currently not passed by ring and therefore only taken from current node, which returns incorrect (local) value. See CacheMetricsSnapshot class. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: Cache Metrics
Den,

I see at least two problems here:

1. Metrics meaning for the end user: how should the user interpret the metrics in this case? Moreover, an average is a bad gauge for monitoring because it hides actual latencies. The user should be able to get accurate metrics in order to build monitoring that can create percentile-based charts, for example, and accuracy is a very important property for such cases.

2. It just makes the code more complex, and we will have metrics-related logic in two places instead of one.
>>>> >>>> On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин >>>> <slava.kopti...@gmail.com> wrote: >>>>> Hi Experts, >>>>> >>>>> I am working on https://issues.apache.org/jira/browse/IGNITE-3495 >>>>> >>>>> A few words about this issue: >>>>> It is about that the process of gathering/updating of cache metrics is >>>>> inconsistent in some cases. >>>>> Let's consider the following simple topology which contains only two >>>>> nodes: >>>>> first node is a client node and the second is a server. >>>>> And client node starts requests to the server node, for instance >>>>> cache.put(), cache.putAll(), cache.get() etc. >>>>> In that case, metrics which are related to counters (cache hits, cache >>>>> misses, removals and puts) are calculated on the server side, >>>>> while time metrics are updated on the client node. >>>>> >>>>> I think that both metrics (counters and time) should be calculated on the >>>>> same node. So, there are two obvious solution: >>>>> >>>>> #1 Node that starts some operation is responsible for updating the cache >>>>> metrics. >>>>> Pro: >>>>> - it will allow to get more accurate results of metrics. >>>>> Contra: >>>>> - this approach does not work in particular cases. for example, >>>>> partitioned >>>>> cache with FULL_ASYNC write synchronization mode. >>>>> - needs to extend response messages (GridNearAtomicUpdateResponse, >>>>> GridNearGetResponse etc) >>>>> in order to provide additional information from remote node: cache hits, >>>>> number of removal etc. >>>>> So, it will lead to additional pressure on communication channel. >>>>> Perhaps, this impact will be small - 4 bytes per message or something like >>>>> that. 
>>>>> - backward incompatibility (this is a consequence of the previous point) >>>>> >>>>> #2 Primary node (node that actually executes a request) >>>>> Pro: >>>>> - easy to implement >>>>> - backward compatible >>>>> Contra: >>>>> - time metrics will not include the time of communication between nodes, >>>>> so >>>>> the results will be less accurate. >>>>> - perhaps we need to provide additional metric which will allow to get avg >>>>> time of communication between nodes. >>>>> >>>>> Please let me know about your thoughts. >>>>> Perhaps, both alternatives are not so good... >>>>> >>>>> Regards, >>>>> Slava. >>> >
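Solution #1 above (the initiating node updates the time metrics) amounts to wrapping each operation in a local clock. The sketch below is a hypothetical stand-in, not Ignite code: a ConcurrentHashMap plays the role of the cache, and in a real client the measured time would include the network round-trip.

```java
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of solution #1: the node that initiates an operation
 * measures its duration, so network hops would be included in a real client.
 * A ConcurrentHashMap stands in for the (possibly remote) cache.
 */
class ClientSideTimer {
    private final ConcurrentHashMap<Integer, Integer> cache = new ConcurrentHashMap<>();
    private long totalNanos;
    private long count;

    /** Performs a put and records how long it took from the caller's point of view. */
    void timedPut(int key, int val) {
        long start = System.nanoTime();
        cache.put(key, val); // in a real client this would be a network round-trip
        totalNanos += System.nanoTime() - start;
        count++;
    }

    /** Average put time in nanoseconds, as seen by the initiating node. */
    double averagePutNanos() {
        return count == 0 ? 0 : (double) totalNanos / count;
    }

    int size() {
        return cache.size();
    }
}
```

This also shows the FULL_ASYNC caveat from the thread: if the call returns before the work is done, the locally measured duration no longer reflects the operation cost.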
Re: Cache Metrics
Andrey, I would simply take an average if a mixed clients-servers cluster group is used. In general, the goal of the ticket was to fix the time-based metrics on the server side. As far as I understand they are already calculated properly on the client’s considering network contribution, right? So, all that’s left to do is to count the same on the servers so that those metrics no longer return 0. — Denis > On Jul 25, 2017, at 6:53 AM, Andrey Gura <ag...@apache.org> wrote: > > Den, > > doesn't make sense from my point if view. And we create new problem: > how should we aggregate this metrics when user requests metrics for > cluster group. > > On Mon, Jul 24, 2017 at 8:48 PM, Denis Magda <dma...@apache.org> wrote: >> Guys, >> >> What if we calculate it on both sides? The client will keep the total time >> needed to complete an operation including network hoops while a server >> (primary or backup) will count only local time. >> >> — >> Denis >> >>> On Jul 17, 2017, at 7:07 AM, Andrey Gura <ag...@apache.org> wrote: >>> >>> Hi, >>> >>> I believe that the first solution is better than second because it >>> takes into account network communication time. Average time of >>> communication between nodes doesn't make sense from my point of view. >>> >>> So I vote for #1. >>> >>> On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин >>> <slava.kopti...@gmail.com> wrote: >>>> Hi Experts, >>>> >>>> I am working on https://issues.apache.org/jira/browse/IGNITE-3495 >>>> >>>> A few words about this issue: >>>> It is about that the process of gathering/updating of cache metrics is >>>> inconsistent in some cases. >>>> Let's consider the following simple topology which contains only two nodes: >>>> first node is a client node and the second is a server. >>>> And client node starts requests to the server node, for instance >>>> cache.put(), cache.putAll(), cache.get() etc. 
>>>> In that case, metrics which are related to counters (cache hits, cache >>>> misses, removals and puts) are calculated on the server side, >>>> while time metrics are updated on the client node. >>>> >>>> I think that both metrics (counters and time) should be calculated on the >>>> same node. So, there are two obvious solution: >>>> >>>> #1 Node that starts some operation is responsible for updating the cache >>>> metrics. >>>> Pro: >>>> - it will allow to get more accurate results of metrics. >>>> Contra: >>>> - this approach does not work in particular cases. for example, partitioned >>>> cache with FULL_ASYNC write synchronization mode. >>>> - needs to extend response messages (GridNearAtomicUpdateResponse, >>>> GridNearGetResponse etc) >>>> in order to provide additional information from remote node: cache hits, >>>> number of removal etc. >>>> So, it will lead to additional pressure on communication channel. >>>> Perhaps, this impact will be small - 4 bytes per message or something like >>>> that. >>>> - backward incompatibility (this is a consequence of the previous point) >>>> >>>> #2 Primary node (node that actually executes a request) >>>> Pro: >>>> - easy to implement >>>> - backward compatible >>>> Contra: >>>> - time metrics will not include the time of communication between nodes, so >>>> the results will be less accurate. >>>> - perhaps we need to provide additional metric which will allow to get avg >>>> time of communication between nodes. >>>> >>>> Please let me know about your thoughts. >>>> Perhaps, both alternatives are not so good... >>>> >>>> Regards, >>>> Slava. >>
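For context, the aggregation Andrey refers to corresponds to reading cache metrics for a cluster group. A minimal sketch of what that looks like; the cache name and group choice are illustrative assumptions, not taken from the thread, and this is a sketch against the public Ignite API, not the fix itself:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMetrics;
import org.apache.ignite.cluster.ClusterGroup;
import org.apache.ignite.configuration.CacheConfiguration;

public class ClusterGroupMetricsSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Object, Object> cfg = new CacheConfiguration<>("myCache");
            cfg.setStatisticsEnabled(true); // metrics stay at 0 without this

            IgniteCache<Object, Object> cache = ignite.getOrCreateCache(cfg);

            cache.put("k", "v");
            cache.get("k");

            // Metrics aggregated over server nodes only. Mixing client and
            // server nodes in one group is exactly what makes averaging the
            // time-based metrics tricky, as discussed above.
            ClusterGroup servers = ignite.cluster().forServers();
            CacheMetrics metrics = cache.metrics(servers);

            System.out.println(metrics.getAveragePutTime());
        }
    }
}
```

The thread's open question is which node's clock contributes to `getAveragePutTime()` in such an aggregate: the initiating client (option #1) or the primary server (option #2).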
Re: Cache Metrics
Den, it doesn't make sense from my point of view. And we create a new problem: how should we aggregate these metrics when a user requests metrics for a cluster group. On Mon, Jul 24, 2017 at 8:48 PM, Denis Magda <dma...@apache.org> wrote: > Guys, > > What if we calculate it on both sides? The client will keep the total time > needed to complete an operation including network hoops while a server > (primary or backup) will count only local time. > > — > Denis > >> On Jul 17, 2017, at 7:07 AM, Andrey Gura <ag...@apache.org> wrote: >> >> Hi, >> >> I believe that the first solution is better than second because it >> takes into account network communication time. Average time of >> communication between nodes doesn't make sense from my point of view. >> >> So I vote for #1. >> >> On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин >> <slava.kopti...@gmail.com> wrote: >>> Hi Experts, >>> >>> I am working on https://issues.apache.org/jira/browse/IGNITE-3495 >>> >>> A few words about this issue: >>> It is about that the process of gathering/updating of cache metrics is >>> inconsistent in some cases. >>> Let's consider the following simple topology which contains only two nodes: >>> first node is a client node and the second is a server. >>> And client node starts requests to the server node, for instance >>> cache.put(), cache.putAll(), cache.get() etc. >>> In that case, metrics which are related to counters (cache hits, cache >>> misses, removals and puts) are calculated on the server side, >>> while time metrics are updated on the client node. >>> >>> I think that both metrics (counters and time) should be calculated on the >>> same node. So, there are two obvious solution: >>> >>> #1 Node that starts some operation is responsible for updating the cache >>> metrics. >>> Pro: >>> - it will allow to get more accurate results of metrics. >>> Contra: >>> - this approach does not work in particular cases. for example, partitioned >>> cache with FULL_ASYNC write synchronization mode.
>>> - needs to extend response messages (GridNearAtomicUpdateResponse, >>> GridNearGetResponse etc) >>> in order to provide additional information from remote node: cache hits, >>> number of removal etc. >>> So, it will lead to additional pressure on communication channel. >>> Perhaps, this impact will be small - 4 bytes per message or something like >>> that. >>> - backward incompatibility (this is a consequence of the previous point) >>> >>> #2 Primary node (node that actually executes a request) >>> Pro: >>> - easy to implement >>> - backward compatible >>> Contra: >>> - time metrics will not include the time of communication between nodes, so >>> the results will be less accurate. >>> - perhaps we need to provide additional metric which will allow to get avg >>> time of communication between nodes. >>> >>> Please let me know about your thoughts. >>> Perhaps, both alternatives are not so good... >>> >>> Regards, >>> Slava. >
Re: Cache Metrics
Guys, What if we calculate it on both sides? The client will keep the total time needed to complete an operation including network hops while a server (primary or backup) will count only local time. — Denis > On Jul 17, 2017, at 7:07 AM, Andrey Gura <ag...@apache.org> wrote: > > Hi, > > I believe that the first solution is better than second because it > takes into account network communication time. Average time of > communication between nodes doesn't make sense from my point of view. > > So I vote for #1. > > On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин > <slava.kopti...@gmail.com> wrote: >> Hi Experts, >> >> I am working on https://issues.apache.org/jira/browse/IGNITE-3495 >> >> A few words about this issue: >> It is about that the process of gathering/updating of cache metrics is >> inconsistent in some cases. >> Let's consider the following simple topology which contains only two nodes: >> first node is a client node and the second is a server. >> And client node starts requests to the server node, for instance >> cache.put(), cache.putAll(), cache.get() etc. >> In that case, metrics which are related to counters (cache hits, cache >> misses, removals and puts) are calculated on the server side, >> while time metrics are updated on the client node. >> >> I think that both metrics (counters and time) should be calculated on the >> same node. So, there are two obvious solution: >> >> #1 Node that starts some operation is responsible for updating the cache >> metrics. >> Pro: >> - it will allow to get more accurate results of metrics. >> Contra: >> - this approach does not work in particular cases. for example, partitioned >> cache with FULL_ASYNC write synchronization mode. >> - needs to extend response messages (GridNearAtomicUpdateResponse, >> GridNearGetResponse etc) >> in order to provide additional information from remote node: cache hits, >> number of removal etc. >> So, it will lead to additional pressure on communication channel. 
>> Perhaps, this impact will be small - 4 bytes per message or something like >> that. >> - backward incompatibility (this is a consequence of the previous point) >> >> #2 Primary node (node that actually executes a request) >> Pro: >> - easy to implement >> - backward compatible >> Contra: >> - time metrics will not include the time of communication between nodes, so >> the results will be less accurate. >> - perhaps we need to provide additional metric which will allow to get avg >> time of communication between nodes. >> >> Please let me know about your thoughts. >> Perhaps, both alternatives are not so good... >> >> Regards, >> Slava.
Re: Cache Metrics
Hi, I believe that the first solution is better than the second because it takes into account network communication time. Average time of communication between nodes doesn't make sense from my point of view. So I vote for #1. On Thu, Jul 13, 2017 at 11:52 PM, Вячеслав Коптилин <slava.kopti...@gmail.com> wrote: > Hi Experts, > > I am working on https://issues.apache.org/jira/browse/IGNITE-3495 > > A few words about this issue: > It is about that the process of gathering/updating of cache metrics is > inconsistent in some cases. > Let's consider the following simple topology which contains only two nodes: > first node is a client node and the second is a server. > And client node starts requests to the server node, for instance > cache.put(), cache.putAll(), cache.get() etc. > In that case, metrics which are related to counters (cache hits, cache > misses, removals and puts) are calculated on the server side, > while time metrics are updated on the client node. > > I think that both metrics (counters and time) should be calculated on the > same node. So, there are two obvious solution: > > #1 Node that starts some operation is responsible for updating the cache > metrics. > Pro: > - it will allow to get more accurate results of metrics. > Contra: > - this approach does not work in particular cases. for example, partitioned > cache with FULL_ASYNC write synchronization mode. > - needs to extend response messages (GridNearAtomicUpdateResponse, > GridNearGetResponse etc) > in order to provide additional information from remote node: cache hits, > number of removal etc. > So, it will lead to additional pressure on communication channel. > Perhaps, this impact will be small - 4 bytes per message or something like > that. 
> - backward incompatibility (this is a consequence of the previous point) > > #2 Primary node (node that actually executes a request) > Pro: > - easy to implement > - backward compatible > Contra: > - time metrics will not include the time of communication between nodes, so > the results will be less accurate. > - perhaps we need to provide additional metric which will allow to get avg > time of communication between nodes. > > Please let me know about your thoughts. > Perhaps, both alternatives are not so good... > > Regards, > Slava.
Cache Metrics
Hi Experts, I am working on https://issues.apache.org/jira/browse/IGNITE-3495 A few words about this issue: the process of gathering/updating cache metrics is inconsistent in some cases. Let's consider the following simple topology, which contains only two nodes: the first node is a client node and the second is a server. The client node sends requests to the server node, for instance cache.put(), cache.putAll(), cache.get(), etc. In that case, metrics that are related to counters (cache hits, cache misses, removals and puts) are calculated on the server side, while time metrics are updated on the client node. I think that both kinds of metrics (counters and time) should be calculated on the same node. So, there are two obvious solutions: #1 The node that starts an operation is responsible for updating the cache metrics. Pro: - it allows getting more accurate metric results. Contra: - this approach does not work in particular cases, for example, a partitioned cache with the FULL_ASYNC write synchronization mode. - it needs to extend response messages (GridNearAtomicUpdateResponse, GridNearGetResponse, etc.) in order to provide additional information from the remote node: cache hits, number of removals, etc. So, it will lead to additional pressure on the communication channel. Perhaps this impact will be small - 4 bytes per message or something like that. - backward incompatibility (this is a consequence of the previous point) #2 The primary node (the node that actually executes a request) updates the metrics. Pro: - easy to implement - backward compatible Contra: - time metrics will not include the time of communication between nodes, so the results will be less accurate. - perhaps we need to provide an additional metric that allows getting the avg time of communication between nodes. Please let me know your thoughts. Perhaps both alternatives are not so good... Regards, Slava.
[GitHub] ignite pull request #2133: IGNITE-5492: Local cache metrics are broken.
Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/2133 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] ignite pull request #2133: IGNITE-5492: Local cache metrics are broken.
GitHub user shroman opened a pull request: https://github.com/apache/ignite/pull/2133 IGNITE-5492: Local cache metrics are broken. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shroman/ignite IGNITE-5492 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/2133.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2133 commit 01a6b8cb3f57a390cc74692c7099f4bed5131b6d Author: shroman <rsht...@yahoo.com> Date: 2017-06-15T09:41:01Z IGNITE-5492: Local cache metrics are broken.
[GitHub] ignite pull request #1800: Obsolete cache metrics removed, test fixed
Github user sergey-chugunov-1985 closed the pull request at: https://github.com/apache/ignite/pull/1800
[GitHub] ignite pull request #1800: Obsolete cache metrics removed, test fixed
GitHub user sergey-chugunov-1985 opened a pull request: https://github.com/apache/ignite/pull/1800 Obsolete cache metrics removed, test fixed You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-4536-later-fixes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/1800.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1800 commit 0caa761fdf38e476364bc6714e98929742098bfc Author: Sergey Chugunov <sergey.chugu...@gmail.com> Date: 2017-04-14T15:03:03Z IGNITE-4536 two more obsolete cache metrics were removed, test for memory allocation was fixed
Fwd: Cache Metrics
Cross sending this to dev. Igniters, why does the metrics stuff have to be so confusing? Looks like if "statisticsEnabled" is false, then metrics return all 0s. Can we at least have a warning in the log stating that the metrics are disabled, and explaining how to enable them? D. -- Forwarded message -- From: Alper Tekinalp <al...@evam.com> Date: Tue, Dec 20, 2016 at 3:10 AM Subject: Re: Cache Metrics To: u...@ignite.apache.org Hi all. Thanks for your replies. Enabling statistics fixed it. On Tue, Dec 20, 2016 at 12:39 PM, Andrey Mashenkov <amashen...@gridgain.com> wrote: > Hi Alper, > > May be it is not obvious, but to enable offheap you need to > setOffheapMaxMemory to zero (unlimited) or above zero. > Also metrics is disabled by default, you need call > setStatisticsEnabled(true); > > On Tue, Dec 20, 2016 at 11:41 AM, Alper Tekinalp <al...@evam.com> wrote: > >> Hi all. >> >> I have the following code: >> IgniteConfiguration igniteConfiguration = new >> IgniteConfiguration(); >> igniteConfiguration.setGridName("alper"); >> Ignite start = Ignition.start(igniteConfiguration); >> >> CacheConfiguration configuration = new CacheConfiguration(); >> configuration.setAtomicityMode(CacheAtomicityMode.ATOMIC) >> .setCacheMode(CacheMode.PARTITIONED) >> .setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED) >> .setRebalanceMode(CacheRebalanceMode.SYNC) >> .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC) >> .setRebalanceThrottle(100) >> .setRebalanceBatchSize(2*1024*1024) >> .setBackups(1) >> .setName("cemil") >> .setEagerTtl(false); >> start.getOrCreateCache(configuration); >> >> IgniteCache<Object, Object> cemil = start.getOrCreateCache("cemil"); >> >> cemil.put("1", "10"); >> cemil.put("2", "10"); >> cemil.put("3", "10"); >> cemil.put("4", "10"); >> >> System.out.println(cemil.metrics().getOffHeapAllocatedSize()); >> System.out.println(cemil.metrics().getOffHeapBackupEntriesCount()); >> System.out.println(cemil.metrics().getOffHeapGets()); >>
System.out.println(cemil.metrics().getOffHeapHits()); >> System.out.println(cemil.metrics().getOffHeapMisses()); >> System.out.println(cemil.metrics().getOffHeapPuts()); >> System.out.println(cemil.metrics().getOffHeapEvictions()); >> System.out.println(cemil.metrics().getOffHeapHitPercentage()); >> >> All prints 0. Is that normal? Am i doing something wrong? >> >> -- >> Alper Tekinalp >> >> Software Developer >> Evam Streaming Analytics >> >> Atatürk Mah. Turgut Özal Bulv. >> Gardenya 5 Plaza K:6 Ataşehir >> 34758 İSTANBUL >> >> Tel: +90 216 455 01 53 Fax: +90 216 455 01 54 >> www.evam.com.tr >> <http://www.evam.com> >> > > -- Alper Tekinalp Software Developer Evam Streaming Analytics Atatürk Mah. Turgut Özal Bulv. Gardenya 5 Plaza K:6 Ataşehir 34758 İSTANBUL Tel: +90 216 455 01 53 Fax: +90 216 455 01 54 www.evam.com.tr <http://www.evam.com>
Re: Cache Metrics
Hi, Alper! Could you try to set configuration.setStatisticsEnabled(true) and try once again? On Tue, Dec 20, 2016 at 3:41 PM, Alper Tekinalp wrote: > Hi all. > > I have the following code: > IgniteConfiguration igniteConfiguration = new > IgniteConfiguration(); > igniteConfiguration.setGridName("alper"); > Ignite start = Ignition.start(igniteConfiguration); > > CacheConfiguration configuration = new CacheConfiguration(); > configuration.setAtomicityMode(CacheAtomicityMode.ATOMIC) > .setCacheMode(CacheMode.PARTITIONED) > .setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED) > .setRebalanceMode(CacheRebalanceMode.SYNC) > > .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC) > .setRebalanceThrottle(100) > .setRebalanceBatchSize(2*1024*1024) > .setBackups(1) > .setName("cemil") > .setEagerTtl(false); > start.getOrCreateCache(configuration); > > IgniteCache
Cache Metrics
Hi all. I have the following code: IgniteConfiguration igniteConfiguration = new IgniteConfiguration(); igniteConfiguration.setGridName("alper"); Ignite start = Ignition.start(igniteConfiguration); CacheConfiguration configuration = new CacheConfiguration(); configuration.setAtomicityMode(CacheAtomicityMode.ATOMIC) .setCacheMode(CacheMode.PARTITIONED) .setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED) .setRebalanceMode(CacheRebalanceMode.SYNC) .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC) .setRebalanceThrottle(100) .setRebalanceBatchSize(2*1024*1024) .setBackups(1) .setName("cemil") .setEagerTtl(false); start.getOrCreateCache(configuration); IgniteCache
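As the replies above note, every cache metric in this thread printed 0 because cache statistics are disabled by default. A minimal sketch of the fix, reusing the cache name from the thread; the surrounding configuration is trimmed to the essentials, so treat it as an illustration rather than the exact original program:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;

public class StatisticsEnabledSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Object, Object> cfg = new CacheConfiguration<>("cemil");

            // Without this flag, all CacheMetrics values stay at 0.
            cfg.setStatisticsEnabled(true);

            IgniteCache<Object, Object> cache = ignite.getOrCreateCache(cfg);

            cache.put("1", "10");
            cache.get("1");

            // Non-zero now that statistics are enabled.
            System.out.println(cache.metrics().getCachePuts());
            System.out.println(cache.metrics().getCacheGets());
        }
    }
}
```

The same switch is available in XML configuration via the `statisticsEnabled` cache property.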
[GitHub] ignite pull request #1263: IGNITE-4264: fix cache metrics error between serv...
GitHub user wmz7year opened a pull request: https://github.com/apache/ignite/pull/1263 IGNITE-4264: fix cache metrics error between server and client. You can merge this pull request into a Git repository by running: $ git pull https://github.com/wmz7year/ignite ignite-4264 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/1263.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1263 commit af739867d50218dd47990ce17e81cce06615db85 Author: jiangwei <jiang...@caifuzhinan.com> Date: 2016-11-23T03:01:37Z fix cache metrics error between server and client.
Re: Cache metrics return incorrect values
Hmmm... Ok, every time we perform a put we go to offheap, because it can already contain this key. So my statement about one offheap get per one cache.get is wrong. Anyway, a get operation should update the offheap gets metric. See usages of CacheMetricsImpl#onOffheapRead. On Wed, May 25, 2016 at 4:06 PM, Vladislav Pyatkov wrote: > Andrey, > > I can see offheap gets metric increments every time when get > > > Unfortunately not. When cache configured as OFFHEAP_TIERED it does not > work. > About increment Get when Put takes place: > > org.apache.ignite.internal.processors.cache.local.CacheLocalOffHeapAndSwapMetricsSelfTest#testOffHeapMetrics > The logic existed is a long time and were covered tests. > > for (int i = 0; i < KEYS_CNT; i++) > cache.put(i, i); > > assertEquals(KEYS_CNT, cache.localMetrics().getOffHeapGets()); > > We execute only put, but get counter also incremented. > > Is anyone has another opinion? > > > > On Wed, May 25, 2016 at 2:51 PM, Andrey Gura wrote: > > > Denis, > > > > I disagree. readOffheapPointer doesn't touch offheap get/put metrics > > deliberately. User should have exactly one offheap get operation per > > cache.get call. > > > > Vlad, > > > > as I can see offheap gets metric increments every time when get, > contains, > > etc operations perform, so it should work. If you have more then one node > > then cluster metrics should be updated eventually with discovery message > > and immediately for local node. So if local node isn't primary for your > key > > you can get metrics with some delay. > > > > If particular metric doesn't change then we need find method that should > be > > responsible for update of this metric. > > > > > > On Tue, May 24, 2016 at 4:27 PM, Denis Magda > wrote: > > > > > Hi Vlad, > > > > > > In my understanding this should work or implemented this way for > > > OFFHEAP_TIRED cache. 
> > > > > > CacheMetrics.getCacheEvictions - incremented on every put & get > operation > > > because an entry “goes through” heap memory and evicted from there when > > > it’s no longer needed (usually at the end of get or put operation). > > > > > > CacheMetrics.getOffHeapGets - should be incremented every time the > > > off-heap layer is accessed for a particular key. This can be an > ordinary > > > cache.get() call or during a cache.put() that unswaps an entry before > the > > > new value is put. In my understanding you can increase this statistics > > > exactly in this method - GridCacheSwapManager#readOffheapPointer. > > > > > > CacheMetrics.getOffHeapPuts - should be incremented every time a put > > > operations happens and an entry is moved to off heap. > > > > > > — > > > Denis > > > > > > > On May 24, 2016, at 2:47 PM, Vladislav Pyatkov < > vpyat...@gridgain.com> > > > wrote: > > > > > > > > I try to understand how statistics work and fixe some problem. > > > > I first case: > > > > cache.put(46744, "val 46744"); > > > > cache.get(46744); > > > > In statistic I see: > > > > 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) > > entries > > > > count 0 > > > > 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) > > > > entries count 1 > > > > 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) > > > > > > > > In brackets Hit and Miss values. > > > > > > > > But I asume OffHeap get must to be one, because cache configured as > > > > OFFHEAP_TIERED and swapEnabled - false. > > > > > > > > My investigation has lead to method > > > > > > > > > > org.apache.ignite.internal.processors.cache.GridCacheSwapManager#readOffheapPointer. > > > > The method read only pointer from heap, but not get bytes of value > and > > > not > > > > increase any statistic. 
> > > > If each receive pointer increase statistic (OffHeap get I mean), then > > > each > > > > OffHeap put will increased OffHeap get, because readOffheapPointer > take > > > > place on OffHeap put. > > > > > > > > The thing confuses my: > > > > Has any rules metrics works? > > > > Where works with metrics value must take place? > > > > > > > > > > > > -- > > Andrey Gura > > GridGain Systems, Inc. > > www.gridgain.com > > > -- Andrey Gura GridGain Systems, Inc. www.gridgain.com
Re: Cache metrics return incorrect values
Denis, I disagree. readOffheapPointer doesn't touch offheap get/put metrics deliberately. A user should have exactly one offheap get operation per cache.get call. Vlad, as I can see, the offheap gets metric increments every time get, contains, etc. operations are performed, so it should work. If you have more than one node, then cluster metrics should be updated eventually with a discovery message and immediately for the local node. So if the local node isn't primary for your key, you can get metrics with some delay. If a particular metric doesn't change, then we need to find the method that should be responsible for updating this metric. On Tue, May 24, 2016 at 4:27 PM, Denis Magda wrote: > Hi Vlad, > > In my understanding this should work or implemented this way for > OFFHEAP_TIRED cache. > > CacheMetrics.getCacheEvictions - incremented on every put & get operation > because an entry “goes through” heap memory and evicted from there when > it’s no longer needed (usually at the end of get or put operation). > > CacheMetrics.getOffHeapGets - should be incremented every time the > off-heap layer is accessed for a particular key. This can be an ordinary > cache.get() call or during a cache.put() that unswaps an entry before the > new value is put. In my understanding you can increase this statistics > exactly in this method - GridCacheSwapManager#readOffheapPointer. > > CacheMetrics.getOffHeapPuts - should be incremented every time a put > operations happens and an entry is moved to off heap. > > — > Denis > > > On May 24, 2016, at 2:47 PM, Vladislav Pyatkov > wrote: > > > > I try to understand how statistics work and fixe some problem. 
> > I first case: > > cache.put(46744, "val 46744"); > > cache.get(46744); > > In statistic I see: > > 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries > > count 0 > > 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) > > entries count 1 > > 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) > > > > In brackets Hit and Miss values. > > > > But I asume OffHeap get must to be one, because cache configured as > > OFFHEAP_TIERED and swapEnabled - false. > > > > My investigation has lead to method > > > org.apache.ignite.internal.processors.cache.GridCacheSwapManager#readOffheapPointer. > > The method read only pointer from heap, but not get bytes of value and > not > > increase any statistic. > > If each receive pointer increase statistic (OffHeap get I mean), then > each > > OffHeap put will increased OffHeap get, because readOffheapPointer take > > place on OffHeap put. > > > > The thing confuses my: > > Has any rules metrics works? > > Where works with metrics value must take place? > > -- Andrey Gura GridGain Systems, Inc. www.gridgain.com
[jira] [Created] (IGNITE-3190) OffHeap cache metrics do not detected get from OffHeap
Vladislav Pyatkov created IGNITE-3190: - Summary: OffHeap cache metrics do not detected get from OffHeap Key: IGNITE-3190 URL: https://issues.apache.org/jira/browse/IGNITE-3190 Project: Ignite Issue Type: Bug Reporter: Vladislav Pyatkov Assignee: Vladislav Pyatkov A simple cache configured with OffHeap tiered mode (statistics must be enabled) never increases the OffHeap get counter (CacheMetrics#getOffHeapGets is always 0) {code} cache.put(46744, "val 46744"); cache.get(46744); {code} {noformat} 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries count 0 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) entries count 1 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Cache metrics return incorrect values
Hi Vlad, In my understanding this should work or be implemented this way for an OFFHEAP_TIERED cache. CacheMetrics.getCacheEvictions - incremented on every put & get operation because an entry “goes through” heap memory and is evicted from there when it’s no longer needed (usually at the end of a get or put operation). CacheMetrics.getOffHeapGets - should be incremented every time the off-heap layer is accessed for a particular key. This can be an ordinary cache.get() call or a cache.put() that unswaps an entry before the new value is put. In my understanding you can increment this statistic exactly in this method - GridCacheSwapManager#readOffheapPointer. CacheMetrics.getOffHeapPuts - should be incremented every time a put operation happens and an entry is moved to off heap. — Denis > On May 24, 2016, at 2:47 PM, Vladislav Pyatkov wrote: > > I try to understand how statistics work and fixe some problem. > I first case: > cache.put(46744, "val 46744"); > cache.get(46744); > In statistic I see: > 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries > count 0 > 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) > entries count 1 > 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) > > In brackets Hit and Miss values. > > But I asume OffHeap get must to be one, because cache configured as > OFFHEAP_TIERED and swapEnabled - false. > > My investigation has lead to method > org.apache.ignite.internal.processors.cache.GridCacheSwapManager#readOffheapPointer. > The method read only pointer from heap, but not get bytes of value and not > increase any statistic. > If each receive pointer increase statistic (OffHeap get I mean), then each > OffHeap put will increased OffHeap get, because readOffheapPointer take > place on OffHeap put. > > The thing confuses my: > Has any rules metrics works? > Where works with metrics value must take place?
Cache metrics return incorrect values
I am trying to understand how statistics work and fix some problems. In the first case: cache.put(46744, "val 46744"); cache.get(46744); In the statistics I see: 2016-05-24 14:19:31 INFO ServerNode:78 - Swap put 0 get 0 (0, 0) entries count 0 2016-05-24 14:19:31 INFO ServerNode:81 - OffHeap put 1 get 0 (0, 0) entries count 1 2016-05-24 14:19:31 INFO ServerNode:84 - OnHeap put 1 get 1 (1, 0) In brackets are the Hit and Miss values. But I assume the OffHeap get count should be one, because the cache is configured as OFFHEAP_TIERED and swapEnabled is false. My investigation has led to the method org.apache.ignite.internal.processors.cache.GridCacheSwapManager#readOffheapPointer. The method reads only the pointer from the heap; it does not get the bytes of the value and does not increase any statistic. If each pointer read increased the statistic (the OffHeap get, I mean), then each OffHeap put would also increase OffHeap gets, because readOffheapPointer takes place on an OffHeap put. The things that confuse me: Are there any rules for how the metrics should work? Where should updates to the metric values take place?
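The scenario above can be reproduced with a snippet along these lines. This is a sketch against the Ignite 1.x API used throughout this thread (CacheMemoryMode was removed in later versions); the cache name is illustrative and statistics are enabled explicitly, since the metrics are otherwise all zero:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMemoryMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class OffHeapMetricsRepro {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("testCache");
            cfg.setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED); // entries live off-heap
            cfg.setStatisticsEnabled(true);                    // metrics are all 0 otherwise

            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(cfg);

            cache.put(46744, "val 46744");
            cache.get(46744);

            // Expected to be >= 1 for an OFFHEAP_TIERED cache; the bug
            // reported in IGNITE-3190 is that this stayed at 0.
            System.out.println(cache.metrics().getOffHeapGets());
        }
    }
}
```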
Re: Stream API doesn't update cache metrics.
What is your use case? On Friday, April 8, 2016, Dmitry Karachentsev wrote: > Yes, switching it to 'true' does the magic. > > Thanks! > > On 08.04.2016 13:52, Yakov Zhdanov wrote: > >> Is allowOverwrite set to 'false'? >> >> Thanks! >> -- >> Yakov Zhdanov, Director R >> *GridGain Systems* >> www.gridgain.com >> >> 2016-04-08 10:30 GMT+03:00 Dmitry Karachentsev < >> dkarachent...@gridgain.com>: >> >> Hi all. >>> >>> Adding data to cache via streamer doesn't update cache metrics like >>> AveragePutTime, CachePuts. Was it made intentionally? >>> >>> Thanks! >>> Dmitry. >>> >>> > -- --Yakov
Re: Stream API doesn't update cache metrics.
Yes, switching it to 'true' does the magic. Thanks! On 08.04.2016 13:52, Yakov Zhdanov wrote: Is allowOverwrite set to 'false'? Thanks! -- Yakov Zhdanov, Director R *GridGain Systems* www.gridgain.com 2016-04-08 10:30 GMT+03:00 Dmitry Karachentsev <dkarachent...@gridgain.com>: Hi all. Adding data to cache via streamer doesn't update cache metrics like AveragePutTime, CachePuts. Was it made intentionally? Thanks! Dmitry.
Re: Stream API doesn't update cache metrics.
Is allowOverwrite set to 'false'? Thanks! -- Yakov Zhdanov, Director R *GridGain Systems* www.gridgain.com
Stream API doesn't update cache metrics.
Hi all. Adding data to the cache via the streamer doesn't update cache metrics such as AveragePutTime and CachePuts. Was this done intentionally? Thanks! Dmitry.
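As the replies above conclude, the streamer bypasses the regular cache update path (and hence its metrics) when allowOverwrite is left at its default of false. A minimal sketch of the fix, assuming a hypothetical cache name "myCache":

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class StreamerMetricsExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("myCache"); // hypothetical cache name

            try (IgniteDataStreamer<Integer, String> stmr = ignite.dataStreamer("myCache")) {
                // With the default allowOverwrite(false) the streamer uses an
                // optimized batch path that skips metrics such as CachePuts and
                // AveragePutTime; 'true' routes entries through the regular
                // cache update path, at the cost of some throughput.
                stmr.allowOverwrite(true);

                for (int i = 0; i < 1000; i++)
                    stmr.addData(i, "val " + i);
            } // close() flushes remaining buffered entries
        }
    }
}
```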
[jira] [Created] (IGNITE-2731) Cache metrics documentation on readme.io
Denis Magda created IGNITE-2731: --- Summary: Cache metrics documentation on readme.io Key: IGNITE-2731 URL: https://issues.apache.org/jira/browse/IGNITE-2731 Project: Ignite Issue Type: Bug Components: documentation Affects Versions: 1.5.0.final Reporter: Denis Magda Fix For: 1.6 The cache metrics topic is becoming hot on the user list: 1) http://apache-ignite-users.70518.x6.nabble.com/Monitoring-Cache-Data-counters-Cache-Data-Size-td3203.html#a3211 2) http://apache-ignite-users.70518.x6.nabble.com/Is-there-a-way-to-get-cache-metrics-for-all-the-nodes-in-cluster-combined-td2674.html 3) http://apache-ignite-users.70518.x6.nabble.com/Metrics-for-backup-caches-td2689.html#a2692 The time has come to add a dedicated article on this topic.
[jira] [Created] (IGNITE-2636) Server cache metrics for put-get-remove avg time are incorrect for case when request sent from client
Vladimir Ershov created IGNITE-2636: --- Summary: Server cache metrics for put-get-remove avg time are incorrect for case when request sent from client Key: IGNITE-2636 URL: https://issues.apache.org/jira/browse/IGNITE-2636 Project: Ignite Issue Type: Bug Components: cache Affects Versions: 1.5.0.final Reporter: Vladimir Ershov Server cache metrics for put/get/remove average time are incorrect when the request is sent from a client. We should add methods like CacheMetrics#addPutAndGetTimeNanos to all flows where cache modification requests are processed, for all cache types.
[GitHub] ignite pull request: IGNTIE-2483 Cache metrics functionality for c...
GitHub user VladimirErshov opened a pull request: https://github.com/apache/ignite/pull/479 IGNTIE-2483 Cache metrics functionality for client nodes should be developed. Added a new version of CacheMetricsSnapshot. Fixed merging logic. Added proper put/get/remove time counting on the client side. You can merge this pull request into a Git repository by running: $ git pull https://github.com/VladimirErshov/ignite ignite-2483 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/479.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #479 commit 0bb74d8afcd3523aa51659a791b4078114f73bd3 Author: vershov <vers...@gridgain.com> Date: 2016-02-11T17:43:33Z IGNTIE-2483 added metrics on client. Fixed upTime. Redesigned base method and gathering logic.
Client Cache metrics API design discussion.
Igniters! There is a spot in our current cache metrics API that begs for improvement and, due to its significance, should be discussed community-wide. The current CacheMetrics interface is confusing when accessed from a client node. One of the typical questions is: *what should CacheMetrics#getSize return on a client node for a non-Near, non-Local cache?* Here are some options:

1. Zero. As it works now, it is just 0, since there are no entries on the client node.
2. The number of all entries for this cache across the cluster.
3. Or, and here comes the interesting part, the number of values that were, for example, created through this client node, since that is useful for #getAveragePutTime.
4. Your variant?

The same goes for the rest of the API: getCacheHits (0, cluster, client), getTxDhtCommitQueueSize (0, cluster, for client keys, UnsupportedOperationException?).

This point should give a good start to our discussion: there are use cases that demand metrics be gathered for a client node separately. For example, a user can measure latency between nodes by comparing #getAveragePutTime on the client and server side. Thus I consider it reasonable to implement a specific ClientCacheMetricsImpl with client-side logic, but the actual questions are: what should methods like getSize and getHits return? Is it necessary to maintain backward compatibility for the metrics API? Does the community think it is worth putting our effort into this task, and that we want to support cache metrics on a client node?

Thoughts?
Re: Client Cache metrics API design discussion.
Vladimir, As I already suggested in the ticket [1], I think that by default we should return metrics for the whole cluster. Now we collect them only from the local node, which is confusing, especially on the client. If one needs metrics from one node or from a subset of nodes, the metrics(ClusterGroup) method can be used. So as for the size, I'm definitely for option 2. Option 3 is more about 'getCachePuts()', not 'getSize()', no? Where do we increment this counter - on the client or on the primary node? If on the client, this metric will work just as you described when you get metrics for a particular client using metrics(ClusterGroup). Probably it would also be useful to add a localMetrics() shortcut method. [1] https://issues.apache.org/jira/browse/IGNITE-2483 -Val
Re: Client Cache metrics API design discussion.
Agree. All metrics should return the data for the whole cache, unless the user specifically requests otherwise. D.
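The direction agreed in the thread above (cluster-wide by default, with explicit local and group-scoped variants) can be sketched as follows. This is an assumption-laden illustration: the cache name "myCache" is hypothetical, and the localMetrics() shortcut proposed here only appeared in later Ignite releases:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMetrics;

public class ClientMetricsExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache("myCache");

            // Cluster-wide view: aggregates metrics from all nodes caching data
            // (option 2 from the discussion).
            CacheMetrics clusterWide = cache.metrics();

            // Scoped view: only the nodes in a given cluster group.
            CacheMetrics servers = cache.metrics(ignite.cluster().forServers());

            // Local view: this node only (sizes are 0 on a pure client node,
            // option 1 from the discussion).
            CacheMetrics local = cache.localMetrics();

            System.out.println("cluster size=" + clusterWide.getSize()
                + ", server gets=" + servers.getCacheGets()
                + ", local size=" + local.getSize());
        }
    }
}
```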
[jira] [Created] (IGNITE-2483) Cache metrics bugs
Valentin Kulichenko created IGNITE-2483: --- Summary: Cache metrics bugs Key: IGNITE-2483 URL: https://issues.apache.org/jira/browse/IGNITE-2483 Project: Ignite Issue Type: Bug Components: cache Reporter: Valentin Kulichenko Fix For: 1.6 User list discussion: http://apache-ignite-users.70518.x6.nabble.com/Is-there-a-way-to-get-cache-metrics-for-all-the-nodes-in-cluster-combined-td2674.html Currently there are at least three issues with cache metrics:

# When metrics are acquired on a client, average put times are always zero. This happens because timings are calculated on the client, but puts are counted on the servers.
# Size and keySize are always zero even if the cache is not empty.
# The default metrics() method, which doesn't take a cluster group, provides metrics for the local node only, so if it's called on a client they are always empty. It should calculate metrics for the whole cluster instead.

Also, this code looks very undertested; coverage should be significantly improved.