Re: Should Geode stats conform to backwards compatibility constraints?
Here's a PR for a revert of the offending commits. https://github.com/apache/geode/pull/3195 Seems valuable to revert for now and decide a path forward rather than rush to get in 1.9 -michael On Wed, Feb 13, 2019 at 9:51 PM Jacob Barrett wrote: > How about mod MAX_INT? It would wrap back to 0 and make it more consistent > with at least SNMP counters that roll over to 0 when maxed. A monitoring > and graphing system can account for this by recognizing the current value > is less than the previous and typically uses the previous and current > values to recreate the true value. > > -Jake > > > > On Feb 13, 2019, at 8:22 PM, Ryan McMahon wrote: > > > > Sorry that should read “and if the value exceeds MAX_INT” > > > >> On Wed, Feb 13, 2019 at 8:21 PM Ryan McMahon > wrote: > >> > >> +1 to introducing a new method which returns the stat as long per Jake’s > >> suggestion. I vote for getInt() to downcast as an int, and the value > >> exceeds MAX_INT, return MAX_INT and possibly add a warning message which > >> points users to the new long version of the method. I think throwing an > >> exception might be a bit much personally. > >> > >> Ryan > >> > >>> On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols > wrote: > >>> > >>> Is it possible for the underlying counter to be maintained as a long, > and > >>> change the getInt method to return as normal when the value is within > >>> MAX_INT, otherwise throw an exception and/or return MAX_INT when the > long > >>> value would overflow an int? > >>> > >>> For the MX Bean, should we keep (but deprecate) the existing attribute, > >>> and add a new attribute TotalNetSearchCompletedAsLong? > >>> > On Feb 13, 2019, at 3:58 PM, Kirk Lund wrote: > > Quite a few Geode stats are currently defined as IntCounters which > very > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to > not > wrap to negative MAX_VALUE, so my team defined the following two > tickets > and changed them to LongCounters on the develop branch: > > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value > https://issues.apache.org/jira/browse/GEODE-6345 > > GEODE-6334: CachePerfStats operation count stats may wrap to negative > >>> values > https://issues.apache.org/jira/browse/GEODE-6334 > > Some people are now concerned that this may break backwards > >>> compatibility > and API for existing users. > > There are two potential ways for a user to *experience* this as an API > change: > > 1) MemberMXBean in JMX > > *- int getTotalNetSearchCompleted();* > *+ long getTotalNetSearchCompleted();* > > As you can see in GEODE-6334, we changed quite a few stats from > integer > >>> to > long in CachePerfStats. The only one directly exposed as an attribute > on > MemberMXBean is TotalNetSearchCompleted. > > Users will find that this API method changed. > > 2) Statistics API with undocumented strings > > If we assume that users might know or discover that we have a > statistics > text id of "cachePerfStats" with an integer stat of name "puts" then > >>> they > could use our Statistics API to get the value: > > * 1:CacheFactory cacheFactory = new CacheFactory();* > * 2:Cache cache = cacheFactory.create();* > * 3:* > * 4:Region region = cache. String>createRegionFactory(PARTITION).create("region");* > * 5:* > * 6:Statistics[] statistics = > > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");* > * 7:int oldPuts = statistics[0].getInt("puts");* > * 8:* > * 9:region.put("some", "value");* > *10:* > *11:int newPuts = statistics[0].getInt("puts");* > *12:* > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);* > > The above works in Geode 1.8 and earlier but will begin to throw the > following in Geode 1.9 (when develop branch is released): > > *java.lang.IllegalArgumentException: The statistic puts with id 23 is > of > type long and it was expected to be an int.* > * at > > >>> > org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)* > * at > > >>> > org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)* > * at > > >>> > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)* > * at > > >>> > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)* > * at com.user.example.Example.example(Example.java7)* > > In order to avoid the above exception, a user would have to change the > >>> code > on lines 7 and 11 to use *getLong* instead of *getInt*. > > Should Geode stats be considered a form of API contract and should > they > conform to backwards
Re: Should Geode stats conform to backwards compatibility constraints?
How about mod MAX_INT? It would wrap back to 0 and make it more consistent with at least SNMP counters that roll over to 0 when maxed. A monitoring and graphing system can account for this by recognizing the current value is less than the previous and typically uses the previous and current values to recreate the true value. -Jake > On Feb 13, 2019, at 8:22 PM, Ryan McMahon wrote: > > Sorry that should read “and if the value exceeds MAX_INT” > >> On Wed, Feb 13, 2019 at 8:21 PM Ryan McMahon wrote: >> >> +1 to introducing a new method which returns the stat as long per Jake’s >> suggestion. I vote for getInt() to downcast as an int, and the value >> exceeds MAX_INT, return MAX_INT and possibly add a warning message which >> points users to the new long version of the method. I think throwing an >> exception might be a bit much personally. >> >> Ryan >> >>> On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols wrote: >>> >>> Is it possible for the underlying counter to be maintained as a long, and >>> change the getInt method to return as normal when the value is within >>> MAX_INT, otherwise throw an exception and/or return MAX_INT when the long >>> value would overflow an int? >>> >>> For the MX Bean, should we keep (but deprecate) the existing attribute, >>> and add a new attribute TotalNetSearchCompletedAsLong? >>> On Feb 13, 2019, at 3:58 PM, Kirk Lund wrote: Quite a few Geode stats are currently defined as IntCounters which very easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not wrap to negative MAX_VALUE, so my team defined the following two tickets and changed them to LongCounters on the develop branch: GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value https://issues.apache.org/jira/browse/GEODE-6345 GEODE-6334: CachePerfStats operation count stats may wrap to negative >>> values https://issues.apache.org/jira/browse/GEODE-6334 Some people are now concerned that this may break backwards >>> compatibility and API for existing users. There are two potential ways for a user to *experience* this as an API change: 1) MemberMXBean in JMX *- int getTotalNetSearchCompleted();* *+ long getTotalNetSearchCompleted();* As you can see in GEODE-6334, we changed quite a few stats from integer >>> to long in CachePerfStats. The only one directly exposed as an attribute on MemberMXBean is TotalNetSearchCompleted. Users will find that this API method changed. 2) Statistics API with undocumented strings If we assume that users might know or discover that we have a statistics text id of "cachePerfStats" with an integer stat of name "puts" then >>> they could use our Statistics API to get the value: * 1:CacheFactory cacheFactory = new CacheFactory();* * 2:Cache cache = cacheFactory.create();* * 3:* * 4:Region region = cache.>>> String>createRegionFactory(PARTITION).create("region");* * 5:* * 6:Statistics[] statistics = cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");* * 7:int oldPuts = statistics[0].getInt("puts");* * 8:* * 9:region.put("some", "value");* *10:* *11:int newPuts = statistics[0].getInt("puts");* *12:* *13:assertThat(newPuts).isEqualTo(oldPuts + 1);* The above works in Geode 1.8 and earlier but will begin to throw the following in Geode 1.9 (when develop branch is released): *java.lang.IllegalArgumentException: The statistic puts with id 23 is of type long and it was expected to be an int.* * at >>> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)* * at >>> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)* * at >>> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)* * at >>> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)* * at com.user.example.Example.example(Example.java7)* In order to avoid the above exception, a user would have to change the >>> code on lines 7 and 11 to use *getLong* instead of *getInt*. Should Geode stats be considered a form of API contract and should they conform to backwards compatibility constraints? Should we change these from LongCounters back to IntCounters? Someone did suggest that we could change them back to IntCounters and >>> then change our statistics internals to skip to zero anytime a stat attempts >>> to increment above MAX_VALUE, however, I think that if we decide that stats cannot be changed from IntCounters to LongCounters then we should not >>> make this change either. Thanks
Re: Should Geode stats conform to backwards compatibility constraints?
Sorry that should read “and if the value exceeds MAX_INT” On Wed, Feb 13, 2019 at 8:21 PM Ryan McMahon wrote: > +1 to introducing a new method which returns the stat as long per Jake’s > suggestion. I vote for getInt() to downcast as an int, and the value > exceeds MAX_INT, return MAX_INT and possibly add a warning message which > points users to the new long version of the method. I think throwing an > exception might be a bit much personally. > > Ryan > > On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols wrote: > >> Is it possible for the underlying counter to be maintained as a long, and >> change the getInt method to return as normal when the value is within >> MAX_INT, otherwise throw an exception and/or return MAX_INT when the long >> value would overflow an int? >> >> For the MX Bean, should we keep (but deprecate) the existing attribute, >> and add a new attribute TotalNetSearchCompletedAsLong? >> >> > On Feb 13, 2019, at 3:58 PM, Kirk Lund wrote: >> > >> > Quite a few Geode stats are currently defined as IntCounters which very >> > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not >> > wrap to negative MAX_VALUE, so my team defined the following two tickets >> > and changed them to LongCounters on the develop branch: >> > >> > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value >> > https://issues.apache.org/jira/browse/GEODE-6345 >> > >> > GEODE-6334: CachePerfStats operation count stats may wrap to negative >> values >> > https://issues.apache.org/jira/browse/GEODE-6334 >> > >> > Some people are now concerned that this may break backwards >> compatibility >> > and API for existing users. >> > >> > There are two potential ways for a user to *experience* this as an API >> > change: >> > >> > 1) MemberMXBean in JMX >> > >> > *- int getTotalNetSearchCompleted();* >> > *+ long getTotalNetSearchCompleted();* >> > >> > As you can see in GEODE-6334, we changed quite a few stats from integer >> to >> > long in CachePerfStats. The only one directly exposed as an attribute on >> > MemberMXBean is TotalNetSearchCompleted. >> > >> > Users will find that this API method changed. >> > >> > 2) Statistics API with undocumented strings >> > >> > If we assume that users might know or discover that we have a statistics >> > text id of "cachePerfStats" with an integer stat of name "puts" then >> they >> > could use our Statistics API to get the value: >> > >> > * 1:CacheFactory cacheFactory = new CacheFactory();* >> > * 2:Cache cache = cacheFactory.create();* >> > * 3:* >> > * 4:Region region = cache.> > String>createRegionFactory(PARTITION).create("region");* >> > * 5:* >> > * 6:Statistics[] statistics = >> > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");* >> > * 7:int oldPuts = statistics[0].getInt("puts");* >> > * 8:* >> > * 9:region.put("some", "value");* >> > *10:* >> > *11:int newPuts = statistics[0].getInt("puts");* >> > *12:* >> > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);* >> > >> > The above works in Geode 1.8 and earlier but will begin to throw the >> > following in Geode 1.9 (when develop branch is released): >> > >> > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of >> > type long and it was expected to be an int.* >> > * at >> > >> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)* >> > * at >> > >> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)* >> > * at >> > >> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)* >> > * at >> > >> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)* >> > * at com.user.example.Example.example(Example.java7)* >> > >> > In order to avoid the above exception, a user would have to change the >> code >> > on lines 7 and 11 to use *getLong* instead of *getInt*. >> > >> > Should Geode stats be considered a form of API contract and should they >> > conform to backwards compatibility constraints? >> > >> > Should we change these from LongCounters back to IntCounters? >> > >> > Someone did suggest that we could change them back to IntCounters and >> then >> > change our statistics internals to skip to zero anytime a stat attempts >> to >> > increment above MAX_VALUE, however, I think that if we decide that stats >> > cannot be changed from IntCounters to LongCounters then we should not >> make >> > this change either. >> > >> > Thanks for any input or opinions, >> > Kirk >> >>
Re: Should Geode stats conform to backwards compatibility constraints?
+1 to introducing a new method which returns the stat as long per Jake’s suggestion. I vote for getInt() to downcast as an int, and the value exceeds MAX_INT, return MAX_INT and possibly add a warning message which points users to the new long version of the method. I think throwing an exception might be a bit much personally. Ryan On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols wrote: > Is it possible for the underlying counter to be maintained as a long, and > change the getInt method to return as normal when the value is within > MAX_INT, otherwise throw an exception and/or return MAX_INT when the long > value would overflow an int? > > For the MX Bean, should we keep (but deprecate) the existing attribute, > and add a new attribute TotalNetSearchCompletedAsLong? > > > On Feb 13, 2019, at 3:58 PM, Kirk Lund wrote: > > > > Quite a few Geode stats are currently defined as IntCounters which very > > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not > > wrap to negative MAX_VALUE, so my team defined the following two tickets > > and changed them to LongCounters on the develop branch: > > > > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value > > https://issues.apache.org/jira/browse/GEODE-6345 > > > > GEODE-6334: CachePerfStats operation count stats may wrap to negative > values > > https://issues.apache.org/jira/browse/GEODE-6334 > > > > Some people are now concerned that this may break backwards compatibility > > and API for existing users. > > > > There are two potential ways for a user to *experience* this as an API > > change: > > > > 1) MemberMXBean in JMX > > > > *- int getTotalNetSearchCompleted();* > > *+ long getTotalNetSearchCompleted();* > > > > As you can see in GEODE-6334, we changed quite a few stats from integer > to > > long in CachePerfStats. The only one directly exposed as an attribute on > > MemberMXBean is TotalNetSearchCompleted. > > > > Users will find that this API method changed. > > > > 2) Statistics API with undocumented strings > > > > If we assume that users might know or discover that we have a statistics > > text id of "cachePerfStats" with an integer stat of name "puts" then they > > could use our Statistics API to get the value: > > > > * 1:CacheFactory cacheFactory = new CacheFactory();* > > * 2:Cache cache = cacheFactory.create();* > > * 3:* > > * 4:Region region = cache. > String>createRegionFactory(PARTITION).create("region");* > > * 5:* > > * 6:Statistics[] statistics = > > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");* > > * 7:int oldPuts = statistics[0].getInt("puts");* > > * 8:* > > * 9:region.put("some", "value");* > > *10:* > > *11:int newPuts = statistics[0].getInt("puts");* > > *12:* > > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);* > > > > The above works in Geode 1.8 and earlier but will begin to throw the > > following in Geode 1.9 (when develop branch is released): > > > > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of > > type long and it was expected to be an int.* > > * at > > > org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)* > > * at > > > org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)* > > * at > > > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)* > > * at > > > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)* > > * at com.user.example.Example.example(Example.java7)* > > > > In order to avoid the above exception, a user would have to change the > code > > on lines 7 and 11 to use *getLong* instead of *getInt*. > > > > Should Geode stats be considered a form of API contract and should they > > conform to backwards compatibility constraints? > > > > Should we change these from LongCounters back to IntCounters? > > > > Someone did suggest that we could change them back to IntCounters and > then > > change our statistics internals to skip to zero anytime a stat attempts > to > > increment above MAX_VALUE, however, I think that if we decide that stats > > cannot be changed from IntCounters to LongCounters then we should not > make > > this change either. > > > > Thanks for any input or opinions, > > Kirk > >
Re: Should Geode stats conform to backwards compatibility constraints?
Is it possible for the underlying counter to be maintained as a long, and change the getInt method to return as normal when the value is within MAX_INT, otherwise throw an exception and/or return MAX_INT when the long value would overflow an int? For the MX Bean, should we keep (but deprecate) the existing attribute, and add a new attribute TotalNetSearchCompletedAsLong? > On Feb 13, 2019, at 3:58 PM, Kirk Lund wrote: > > Quite a few Geode stats are currently defined as IntCounters which very > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not > wrap to negative MAX_VALUE, so my team defined the following two tickets > and changed them to LongCounters on the develop branch: > > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value > https://issues.apache.org/jira/browse/GEODE-6345 > > GEODE-6334: CachePerfStats operation count stats may wrap to negative values > https://issues.apache.org/jira/browse/GEODE-6334 > > Some people are now concerned that this may break backwards compatibility > and API for existing users. > > There are two potential ways for a user to *experience* this as an API > change: > > 1) MemberMXBean in JMX > > *- int getTotalNetSearchCompleted();* > *+ long getTotalNetSearchCompleted();* > > As you can see in GEODE-6334, we changed quite a few stats from integer to > long in CachePerfStats. The only one directly exposed as an attribute on > MemberMXBean is TotalNetSearchCompleted. > > Users will find that this API method changed. > > 2) Statistics API with undocumented strings > > If we assume that users might know or discover that we have a statistics > text id of "cachePerfStats" with an integer stat of name "puts" then they > could use our Statistics API to get the value: > > * 1:CacheFactory cacheFactory = new CacheFactory();* > * 2:Cache cache = cacheFactory.create();* > * 3:* > * 4:Region region = cache. String>createRegionFactory(PARTITION).create("region");* > * 5:* > * 6:Statistics[] statistics = > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");* > * 7:int oldPuts = statistics[0].getInt("puts");* > * 8:* > * 9:region.put("some", "value");* > *10:* > *11:int newPuts = statistics[0].getInt("puts");* > *12:* > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);* > > The above works in Geode 1.8 and earlier but will begin to throw the > following in Geode 1.9 (when develop branch is released): > > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of > type long and it was expected to be an int.* > * at > org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)* > * at > org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)* > * at > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)* > * at > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)* > * at com.user.example.Example.example(Example.java7)* > > In order to avoid the above exception, a user would have to change the code > on lines 7 and 11 to use *getLong* instead of *getInt*. > > Should Geode stats be considered a form of API contract and should they > conform to backwards compatibility constraints? > > Should we change these from LongCounters back to IntCounters? > > Someone did suggest that we could change them back to IntCounters and then > change our statistics internals to skip to zero anytime a stat attempts to > increment above MAX_VALUE, however, I think that if we decide that stats > cannot be changed from IntCounters to LongCounters then we should not make > this change either. > > Thanks for any input or opinions, > Kirk
Re: Should Geode stats conform to backwards compatibility constraints?
+1 to what Jake said about MemberMXBean - that is definitely a public API class, so we need to deprecate the old method and introduce a new one. I'm also not clear about the stats. Technically, they are discoverable through the public API, so maybe? It seems like they are mix of things users might want and stuff that is more internal/debugging oriented. If the downcasting trick Jake mentioned isn't that hard, I'd say we should do that. That said, I suspect we've renamed some stats in the past. One other thing to consider - will changing the size of the stats affect any tooling like VSD or gfsh or ?? that may have to deal with stat data from multiple geode versions? -Dan On Wed, Feb 13, 2019 at 3:59 PM Kirk Lund wrote: > Quite a few Geode stats are currently defined as IntCounters which very > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not > wrap to negative MAX_VALUE, so my team defined the following two tickets > and changed them to LongCounters on the develop branch: > > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value > https://issues.apache.org/jira/browse/GEODE-6345 > > GEODE-6334: CachePerfStats operation count stats may wrap to negative > values > https://issues.apache.org/jira/browse/GEODE-6334 > > Some people are now concerned that this may break backwards compatibility > and API for existing users. > > There are two potential ways for a user to *experience* this as an API > change: > > 1) MemberMXBean in JMX > > *- int getTotalNetSearchCompleted();* > *+ long getTotalNetSearchCompleted();* > > As you can see in GEODE-6334, we changed quite a few stats from integer to > long in CachePerfStats. The only one directly exposed as an attribute on > MemberMXBean is TotalNetSearchCompleted. > > Users will find that this API method changed. > > 2) Statistics API with undocumented strings > > If we assume that users might know or discover that we have a statistics > text id of "cachePerfStats" with an integer stat of name "puts" then they > could use our Statistics API to get the value: > > * 1:CacheFactory cacheFactory = new CacheFactory();* > * 2:Cache cache = cacheFactory.create();* > * 3:* > * 4:Region region = cache. String>createRegionFactory(PARTITION).create("region");* > * 5:* > * 6:Statistics[] statistics = > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");* > * 7:int oldPuts = statistics[0].getInt("puts");* > * 8:* > * 9:region.put("some", "value");* > *10:* > *11:int newPuts = statistics[0].getInt("puts");* > *12:* > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);* > > The above works in Geode 1.8 and earlier but will begin to throw the > following in Geode 1.9 (when develop branch is released): > > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of > type long and it was expected to be an int.* > * at > > org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)* > * at > > org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)* > * at > > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)* > * at > > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)* > * at com.user.example.Example.example(Example.java7)* > > In order to avoid the above exception, a user would have to change the code > on lines 7 and 11 to use *getLong* instead of *getInt*. > > Should Geode stats be considered a form of API contract and should they > conform to backwards compatibility constraints? > > Should we change these from LongCounters back to IntCounters? > > Someone did suggest that we could change them back to IntCounters and then > change our statistics internals to skip to zero anytime a stat attempts to > increment above MAX_VALUE, however, I think that if we decide that stats > cannot be changed from IntCounters to LongCounters then we should not make > this change either. > > Thanks for any input or opinions, > Kirk >
Re: Should Geode stats conform to backwards compatibility constraints?
I don’t consider the stats API a public API. I think it is a leaked internal API. Do we document the interactions with this API? Do we document the stat names? I consider documentation the API. I can discover all sorts of exposed methods in a library but shouldn’t consider them contracted API. If however we come to a conclusion that they are API then we could work around the issues below. We could changes the stat API to be a little forgiving on the getInt and related calls if the internal storage is a long. We simply downcast the long to an int. The behavior to the end user does not change but if they use the getLong they get the new improved stat. For the JMX we just add a new method, getTotalNetSearchCompletedLong() to get the long version of the stat. -Jake > On Feb 13, 2019, at 3:58 PM, Kirk Lund wrote: > > Quite a few Geode stats are currently defined as IntCounters which very > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not > wrap to negative MAX_VALUE, so my team defined the following two tickets > and changed them to LongCounters on the develop branch: > > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value > https://issues.apache.org/jira/browse/GEODE-6345 > > GEODE-6334: CachePerfStats operation count stats may wrap to negative values > https://issues.apache.org/jira/browse/GEODE-6334 > > Some people are now concerned that this may break backwards compatibility > and API for existing users. > > There are two potential ways for a user to *experience* this as an API > change: > > 1) MemberMXBean in JMX > > *- int getTotalNetSearchCompleted();* > *+ long getTotalNetSearchCompleted();* > > As you can see in GEODE-6334, we changed quite a few stats from integer to > long in CachePerfStats. The only one directly exposed as an attribute on > MemberMXBean is TotalNetSearchCompleted. > > Users will find that this API method changed. > > 2) Statistics API with undocumented strings > > If we assume that users might know or discover that we have a statistics > text id of "cachePerfStats" with an integer stat of name "puts" then they > could use our Statistics API to get the value: > > * 1:CacheFactory cacheFactory = new CacheFactory();* > * 2:Cache cache = cacheFactory.create();* > * 3:* > * 4:Region region = cache. String>createRegionFactory(PARTITION).create("region");* > * 5:* > * 6:Statistics[] statistics = > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");* > * 7:int oldPuts = statistics[0].getInt("puts");* > * 8:* > * 9:region.put("some", "value");* > *10:* > *11:int newPuts = statistics[0].getInt("puts");* > *12:* > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);* > > The above works in Geode 1.8 and earlier but will begin to throw the > following in Geode 1.9 (when develop branch is released): > > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of > type long and it was expected to be an int.* > * at > org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)* > * at > org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)* > * at > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)* > * at > org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)* > * at com.user.example.Example.example(Example.java7)* > > In order to avoid the above exception, a user would have to change the code > on lines 7 and 11 to use *getLong* instead of *getInt*. > > Should Geode stats be considered a form of API contract and should they > conform to backwards compatibility constraints? > > Should we change these from LongCounters back to IntCounters? > > Someone did suggest that we could change them back to IntCounters and then > change our statistics internals to skip to zero anytime a stat attempts to > increment above MAX_VALUE, however, I think that if we decide that stats > cannot be changed from IntCounters to LongCounters then we should not make > this change either. > > Thanks for any input or opinions, > Kirk
Should Geode stats conform to backwards compatibility constraints?
Quite a few Geode stats are currently defined as IntCounters which very easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not wrap to negative MAX_VALUE, so my team defined the following two tickets and changed them to LongCounters on the develop branch: GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value https://issues.apache.org/jira/browse/GEODE-6345 GEODE-6334: CachePerfStats operation count stats may wrap to negative values https://issues.apache.org/jira/browse/GEODE-6334 Some people are now concerned that this may break backwards compatibility and API for existing users. There are two potential ways for a user to *experience* this as an API change: 1) MemberMXBean in JMX *- int getTotalNetSearchCompleted();* *+ long getTotalNetSearchCompleted();* As you can see in GEODE-6334, we changed quite a few stats from integer to long in CachePerfStats. The only one directly exposed as an attribute on MemberMXBean is TotalNetSearchCompleted. Users will find that this API method changed. 2) Statistics API with undocumented strings If we assume that users might know or discover that we have a statistics text id of "cachePerfStats" with an integer stat of name "puts" then they could use our Statistics API to get the value: * 1:CacheFactory cacheFactory = new CacheFactory();* * 2:Cache cache = cacheFactory.create();* * 3:* * 4:Region region = cache.createRegionFactory(PARTITION).create("region");* * 5:* * 6:Statistics[] statistics = cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");* * 7:int oldPuts = statistics[0].getInt("puts");* * 8:* * 9:region.put("some", "value");* *10:* *11:int newPuts = statistics[0].getInt("puts");* *12:* *13:assertThat(newPuts).isEqualTo(oldPuts + 1);* The above works in Geode 1.8 and earlier but will begin to throw the following in Geode 1.9 (when develop branch is released): *java.lang.IllegalArgumentException: The statistic puts with id 23 is of type long and it was expected to be an int.* * at org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)* * at org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)* * at org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)* * at org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)* * at com.user.example.Example.example(Example.java7)* In order to avoid the above exception, a user would have to change the code on lines 7 and 11 to use *getLong* instead of *getInt*. Should Geode stats be considered a form of API contract and should they conform to backwards compatibility constraints? Should we change these from LongCounters back to IntCounters? Someone did suggest that we could change them back to IntCounters and then change our statistics internals to skip to zero anytime a stat attempts to increment above MAX_VALUE, however, I think that if we decide that stats cannot be changed from IntCounters to LongCounters then we should not make this change either. Thanks for any input or opinions, Kirk