Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-14 Thread Michael Oleske
Here's a PR for a revert of the offending commits.

https://github.com/apache/geode/pull/3195

Seems valuable to revert for now and decide a path forward rather than rush
to get in 1.9

-michael

On Wed, Feb 13, 2019 at 9:51 PM Jacob Barrett  wrote:

> How about mod MAX_INT? It would wrap back to 0 and make it more consistent
> with at least SNMP counters that roll over to 0 when maxed. A monitoring
> and graphing system can account for this by recognizing the current value
> is less than the previous and typically uses the previous and current
> values to recreate the true value.
>
> -Jake
>
>
> > On Feb 13, 2019, at 8:22 PM, Ryan McMahon  wrote:
> >
> > Sorry that should read “and if the value exceeds MAX_INT”
> >
> >> On Wed, Feb 13, 2019 at 8:21 PM Ryan McMahon 
> wrote:
> >>
> >> +1 to introducing a new method which returns the stat as long per Jake’s
> >> suggestion.  I vote for getInt() to downcast as an int, and the value
> >> exceeds MAX_INT, return MAX_INT and possibly add a warning message which
> >> points users to the new long version of the method.  I think throwing an
> >> exception might be a bit much personally.
> >>
> >> Ryan
> >>
> >>> On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols 
> wrote:
> >>>
> >>> Is it possible for the underlying counter to be maintained as a long,
> and
> >>> change the getInt method to return as normal when the value is within
> >>> MAX_INT, otherwise throw an exception and/or return MAX_INT when the
> long
> >>> value would overflow an int?
> >>>
> >>> For the MX Bean, should we keep (but deprecate) the existing attribute,
> >>> and add a new attribute TotalNetSearchCompletedAsLong?
> >>>
>  On Feb 13, 2019, at 3:58 PM, Kirk Lund  wrote:
> 
>  Quite a few Geode stats are currently defined as IntCounters which
> very
>  easily wrap past Integer.MAX_VALUE. Some users wanted these stats to
> not
>  wrap to negative MAX_VALUE, so my team defined the following two
> tickets
>  and changed them to LongCounters on the develop branch:
> 
>  GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
>  https://issues.apache.org/jira/browse/GEODE-6345
> 
>  GEODE-6334: CachePerfStats operation count stats may wrap to negative
> >>> values
>  https://issues.apache.org/jira/browse/GEODE-6334
> 
>  Some people are now concerned that this may break backwards
> >>> compatibility
>  and API for existing users.
> 
>  There are two potential ways for a user to *experience* this as an API
>  change:
> 
>  1) MemberMXBean in JMX
> 
>  *-  int getTotalNetSearchCompleted();*
>  *+  long getTotalNetSearchCompleted();*
> 
>  As you can see in GEODE-6334, we changed quite a few stats from
> integer
> >>> to
>  long in CachePerfStats. The only one directly exposed as an attribute
> on
>  MemberMXBean is TotalNetSearchCompleted.
> 
>  Users will find that this API method changed.
> 
>  2) Statistics API with undocumented strings
> 
>  If we assume that users might know or discover that we have a
> statistics
>  text id of "cachePerfStats" with an integer stat of name "puts" then
> >>> they
>  could use our Statistics API to get the value:
> 
>  * 1:CacheFactory cacheFactory = new CacheFactory();*
>  * 2:Cache cache = cacheFactory.create();*
>  * 3:*
>  * 4:Region region = cache.  String>createRegionFactory(PARTITION).create("region");*
>  * 5:*
>  * 6:Statistics[] statistics =
> 
> cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
>  * 7:int oldPuts = statistics[0].getInt("puts");*
>  * 8:*
>  * 9:region.put("some", "value");*
>  *10:*
>  *11:int newPuts = statistics[0].getInt("puts");*
>  *12:*
>  *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
> 
>  The above works in Geode 1.8 and earlier but will begin to throw the
>  following in Geode 1.9 (when develop branch is released):
> 
>  *java.lang.IllegalArgumentException: The statistic puts with id 23 is
> of
>  type long and it was expected to be an int.*
>  * at
> 
> >>>
> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
>  * at
> 
> >>>
> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
>  * at
> 
> >>>
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
>  * at
> 
> >>>
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
>  * at com.user.example.Example.example(Example.java7)*
> 
>  In order to avoid the above exception, a user would have to change the
> >>> code
>  on lines 7 and 11 to use *getLong* instead of *getInt*.
> 
>  Should Geode stats be considered a form of API contract and should
> they
>  conform to backwards 

Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Jacob Barrett
How about mod MAX_INT? It would wrap back to 0 and make it more consistent with 
at least SNMP counters that roll over to 0 when maxed. A monitoring and 
graphing system can account for this by recognizing the current value is less 
than the previous and typically uses the previous and current values to 
recreate the true value. 

-Jake


> On Feb 13, 2019, at 8:22 PM, Ryan McMahon  wrote:
> 
> Sorry that should read “and if the value exceeds MAX_INT”
> 
>> On Wed, Feb 13, 2019 at 8:21 PM Ryan McMahon  wrote:
>> 
>> +1 to introducing a new method which returns the stat as long per Jake’s
>> suggestion.  I vote for getInt() to downcast as an int, and the value
>> exceeds MAX_INT, return MAX_INT and possibly add a warning message which
>> points users to the new long version of the method.  I think throwing an
>> exception might be a bit much personally.
>> 
>> Ryan
>> 
>>> On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols  wrote:
>>> 
>>> Is it possible for the underlying counter to be maintained as a long, and
>>> change the getInt method to return as normal when the value is within
>>> MAX_INT, otherwise throw an exception and/or return MAX_INT when the long
>>> value would overflow an int?
>>> 
>>> For the MX Bean, should we keep (but deprecate) the existing attribute,
>>> and add a new attribute TotalNetSearchCompletedAsLong?
>>> 
 On Feb 13, 2019, at 3:58 PM, Kirk Lund  wrote:
 
 Quite a few Geode stats are currently defined as IntCounters which very
 easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
 wrap to negative MAX_VALUE, so my team defined the following two tickets
 and changed them to LongCounters on the develop branch:
 
 GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
 https://issues.apache.org/jira/browse/GEODE-6345
 
 GEODE-6334: CachePerfStats operation count stats may wrap to negative
>>> values
 https://issues.apache.org/jira/browse/GEODE-6334
 
 Some people are now concerned that this may break backwards
>>> compatibility
 and API for existing users.
 
 There are two potential ways for a user to *experience* this as an API
 change:
 
 1) MemberMXBean in JMX
 
 *-  int getTotalNetSearchCompleted();*
 *+  long getTotalNetSearchCompleted();*
 
 As you can see in GEODE-6334, we changed quite a few stats from integer
>>> to
 long in CachePerfStats. The only one directly exposed as an attribute on
 MemberMXBean is TotalNetSearchCompleted.
 
 Users will find that this API method changed.
 
 2) Statistics API with undocumented strings
 
 If we assume that users might know or discover that we have a statistics
 text id of "cachePerfStats" with an integer stat of name "puts" then
>>> they
 could use our Statistics API to get the value:
 
 * 1:CacheFactory cacheFactory = new CacheFactory();*
 * 2:Cache cache = cacheFactory.create();*
 * 3:*
 * 4:Region region = cache.>>> String>createRegionFactory(PARTITION).create("region");*
 * 5:*
 * 6:Statistics[] statistics =
 cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
 * 7:int oldPuts = statistics[0].getInt("puts");*
 * 8:*
 * 9:region.put("some", "value");*
 *10:*
 *11:int newPuts = statistics[0].getInt("puts");*
 *12:*
 *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
 
 The above works in Geode 1.8 and earlier but will begin to throw the
 following in Geode 1.9 (when develop branch is released):
 
 *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
 type long and it was expected to be an int.*
 * at
 
>>> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
 * at
 
>>> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
 * at
 
>>> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
 * at
 
>>> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
 * at com.user.example.Example.example(Example.java7)*
 
 In order to avoid the above exception, a user would have to change the
>>> code
 on lines 7 and 11 to use *getLong* instead of *getInt*.
 
 Should Geode stats be considered a form of API contract and should they
 conform to backwards compatibility constraints?
 
 Should we change these from LongCounters back to IntCounters?
 
 Someone did suggest that we could change them back to IntCounters and
>>> then
 change our statistics internals to skip to zero anytime a stat attempts
>>> to
 increment above MAX_VALUE, however, I think that if we decide that stats
 cannot be changed from IntCounters to LongCounters then we should not
>>> make
 this change either.
 
 Thanks 

Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Ryan McMahon
Sorry that should read “and if the value exceeds MAX_INT”

On Wed, Feb 13, 2019 at 8:21 PM Ryan McMahon  wrote:

> +1 to introducing a new method which returns the stat as long per Jake’s
> suggestion.  I vote for getInt() to downcast as an int, and the value
> exceeds MAX_INT, return MAX_INT and possibly add a warning message which
> points users to the new long version of the method.  I think throwing an
> exception might be a bit much personally.
>
> Ryan
>
> On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols  wrote:
>
>> Is it possible for the underlying counter to be maintained as a long, and
>> change the getInt method to return as normal when the value is within
>> MAX_INT, otherwise throw an exception and/or return MAX_INT when the long
>> value would overflow an int?
>>
>> For the MX Bean, should we keep (but deprecate) the existing attribute,
>> and add a new attribute TotalNetSearchCompletedAsLong?
>>
>> > On Feb 13, 2019, at 3:58 PM, Kirk Lund  wrote:
>> >
>> > Quite a few Geode stats are currently defined as IntCounters which very
>> > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
>> > wrap to negative MAX_VALUE, so my team defined the following two tickets
>> > and changed them to LongCounters on the develop branch:
>> >
>> > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
>> > https://issues.apache.org/jira/browse/GEODE-6345
>> >
>> > GEODE-6334: CachePerfStats operation count stats may wrap to negative
>> values
>> > https://issues.apache.org/jira/browse/GEODE-6334
>> >
>> > Some people are now concerned that this may break backwards
>> compatibility
>> > and API for existing users.
>> >
>> > There are two potential ways for a user to *experience* this as an API
>> > change:
>> >
>> > 1) MemberMXBean in JMX
>> >
>> > *-  int getTotalNetSearchCompleted();*
>> > *+  long getTotalNetSearchCompleted();*
>> >
>> > As you can see in GEODE-6334, we changed quite a few stats from integer
>> to
>> > long in CachePerfStats. The only one directly exposed as an attribute on
>> > MemberMXBean is TotalNetSearchCompleted.
>> >
>> > Users will find that this API method changed.
>> >
>> > 2) Statistics API with undocumented strings
>> >
>> > If we assume that users might know or discover that we have a statistics
>> > text id of "cachePerfStats" with an integer stat of name "puts" then
>> they
>> > could use our Statistics API to get the value:
>> >
>> > * 1:CacheFactory cacheFactory = new CacheFactory();*
>> > * 2:Cache cache = cacheFactory.create();*
>> > * 3:*
>> > * 4:Region region = cache.> > String>createRegionFactory(PARTITION).create("region");*
>> > * 5:*
>> > * 6:Statistics[] statistics =
>> > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
>> > * 7:int oldPuts = statistics[0].getInt("puts");*
>> > * 8:*
>> > * 9:region.put("some", "value");*
>> > *10:*
>> > *11:int newPuts = statistics[0].getInt("puts");*
>> > *12:*
>> > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
>> >
>> > The above works in Geode 1.8 and earlier but will begin to throw the
>> > following in Geode 1.9 (when develop branch is released):
>> >
>> > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
>> > type long and it was expected to be an int.*
>> > * at
>> >
>> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
>> > * at
>> >
>> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
>> > * at
>> >
>> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
>> > * at
>> >
>> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
>> > * at com.user.example.Example.example(Example.java7)*
>> >
>> > In order to avoid the above exception, a user would have to change the
>> code
>> > on lines 7 and 11 to use *getLong* instead of *getInt*.
>> >
>> > Should Geode stats be considered a form of API contract and should they
>> > conform to backwards compatibility constraints?
>> >
>> > Should we change these from LongCounters back to IntCounters?
>> >
>> > Someone did suggest that we could change them back to IntCounters and
>> then
>> > change our statistics internals to skip to zero anytime a stat attempts
>> to
>> > increment above MAX_VALUE, however, I think that if we decide that stats
>> > cannot be changed from IntCounters to LongCounters then we should not
>> make
>> > this change either.
>> >
>> > Thanks for any input or opinions,
>> > Kirk
>>
>>


Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Ryan McMahon
+1 to introducing a new method which returns the stat as long per Jake’s
suggestion.  I vote for getInt() to downcast as an int, and the value
exceeds MAX_INT, return MAX_INT and possibly add a warning message which
points users to the new long version of the method.  I think throwing an
exception might be a bit much personally.

Ryan

On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols  wrote:

> Is it possible for the underlying counter to be maintained as a long, and
> change the getInt method to return as normal when the value is within
> MAX_INT, otherwise throw an exception and/or return MAX_INT when the long
> value would overflow an int?
>
> For the MX Bean, should we keep (but deprecate) the existing attribute,
> and add a new attribute TotalNetSearchCompletedAsLong?
>
> > On Feb 13, 2019, at 3:58 PM, Kirk Lund  wrote:
> >
> > Quite a few Geode stats are currently defined as IntCounters which very
> > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
> > wrap to negative MAX_VALUE, so my team defined the following two tickets
> > and changed them to LongCounters on the develop branch:
> >
> > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
> > https://issues.apache.org/jira/browse/GEODE-6345
> >
> > GEODE-6334: CachePerfStats operation count stats may wrap to negative
> values
> > https://issues.apache.org/jira/browse/GEODE-6334
> >
> > Some people are now concerned that this may break backwards compatibility
> > and API for existing users.
> >
> > There are two potential ways for a user to *experience* this as an API
> > change:
> >
> > 1) MemberMXBean in JMX
> >
> > *-  int getTotalNetSearchCompleted();*
> > *+  long getTotalNetSearchCompleted();*
> >
> > As you can see in GEODE-6334, we changed quite a few stats from integer
> to
> > long in CachePerfStats. The only one directly exposed as an attribute on
> > MemberMXBean is TotalNetSearchCompleted.
> >
> > Users will find that this API method changed.
> >
> > 2) Statistics API with undocumented strings
> >
> > If we assume that users might know or discover that we have a statistics
> > text id of "cachePerfStats" with an integer stat of name "puts" then they
> > could use our Statistics API to get the value:
> >
> > * 1:CacheFactory cacheFactory = new CacheFactory();*
> > * 2:Cache cache = cacheFactory.create();*
> > * 3:*
> > * 4:Region region = cache. > String>createRegionFactory(PARTITION).create("region");*
> > * 5:*
> > * 6:Statistics[] statistics =
> > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
> > * 7:int oldPuts = statistics[0].getInt("puts");*
> > * 8:*
> > * 9:region.put("some", "value");*
> > *10:*
> > *11:int newPuts = statistics[0].getInt("puts");*
> > *12:*
> > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
> >
> > The above works in Geode 1.8 and earlier but will begin to throw the
> > following in Geode 1.9 (when develop branch is released):
> >
> > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
> > type long and it was expected to be an int.*
> > * at
> >
> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
> > * at
> >
> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
> > * at
> >
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
> > * at
> >
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
> > * at com.user.example.Example.example(Example.java7)*
> >
> > In order to avoid the above exception, a user would have to change the
> code
> > on lines 7 and 11 to use *getLong* instead of *getInt*.
> >
> > Should Geode stats be considered a form of API contract and should they
> > conform to backwards compatibility constraints?
> >
> > Should we change these from LongCounters back to IntCounters?
> >
> > Someone did suggest that we could change them back to IntCounters and
> then
> > change our statistics internals to skip to zero anytime a stat attempts
> to
> > increment above MAX_VALUE, however, I think that if we decide that stats
> > cannot be changed from IntCounters to LongCounters then we should not
> make
> > this change either.
> >
> > Thanks for any input or opinions,
> > Kirk
>
>


Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Owen Nichols
Is it possible for the underlying counter to be maintained as a long, and 
change the getInt method to return as normal when the value is within MAX_INT, 
otherwise throw an exception and/or return MAX_INT when the long value would 
overflow an int?

For the MX Bean, should we keep (but deprecate) the existing attribute, and add 
a new attribute TotalNetSearchCompletedAsLong?

> On Feb 13, 2019, at 3:58 PM, Kirk Lund  wrote:
> 
> Quite a few Geode stats are currently defined as IntCounters which very
> easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
> wrap to negative MAX_VALUE, so my team defined the following two tickets
> and changed them to LongCounters on the develop branch:
> 
> GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
> https://issues.apache.org/jira/browse/GEODE-6345
> 
> GEODE-6334: CachePerfStats operation count stats may wrap to negative values
> https://issues.apache.org/jira/browse/GEODE-6334
> 
> Some people are now concerned that this may break backwards compatibility
> and API for existing users.
> 
> There are two potential ways for a user to *experience* this as an API
> change:
> 
> 1) MemberMXBean in JMX
> 
> *-  int getTotalNetSearchCompleted();*
> *+  long getTotalNetSearchCompleted();*
> 
> As you can see in GEODE-6334, we changed quite a few stats from integer to
> long in CachePerfStats. The only one directly exposed as an attribute on
> MemberMXBean is TotalNetSearchCompleted.
> 
> Users will find that this API method changed.
> 
> 2) Statistics API with undocumented strings
> 
> If we assume that users might know or discover that we have a statistics
> text id of "cachePerfStats" with an integer stat of name "puts" then they
> could use our Statistics API to get the value:
> 
> * 1:CacheFactory cacheFactory = new CacheFactory();*
> * 2:Cache cache = cacheFactory.create();*
> * 3:*
> * 4:Region region = cache. String>createRegionFactory(PARTITION).create("region");*
> * 5:*
> * 6:Statistics[] statistics =
> cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
> * 7:int oldPuts = statistics[0].getInt("puts");*
> * 8:*
> * 9:region.put("some", "value");*
> *10:*
> *11:int newPuts = statistics[0].getInt("puts");*
> *12:*
> *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
> 
> The above works in Geode 1.8 and earlier but will begin to throw the
> following in Geode 1.9 (when develop branch is released):
> 
> *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
> type long and it was expected to be an int.*
> * at
> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
> * at com.user.example.Example.example(Example.java7)*
> 
> In order to avoid the above exception, a user would have to change the code
> on lines 7 and 11 to use *getLong* instead of *getInt*.
> 
> Should Geode stats be considered a form of API contract and should they
> conform to backwards compatibility constraints?
> 
> Should we change these from LongCounters back to IntCounters?
> 
> Someone did suggest that we could change them back to IntCounters and then
> change our statistics internals to skip to zero anytime a stat attempts to
> increment above MAX_VALUE, however, I think that if we decide that stats
> cannot be changed from IntCounters to LongCounters then we should not make
> this change either.
> 
> Thanks for any input or opinions,
> Kirk



Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Dan Smith
+1 to what Jake said about MemberMXBean - that is definitely a public API
class, so we need to deprecate the old method and introduce a new one.

I'm also not clear about the stats. Technically, they are discoverable
through the public API, so maybe? It seems like they are mix of things
users might want and stuff that is more internal/debugging oriented. If the
downcasting trick Jake mentioned isn't that hard, I'd say we should do
that. That said, I suspect we've renamed some stats in the past.

One other thing to consider - will changing the size of the stats affect
any tooling like VSD or gfsh or ?? that may have to deal with stat data
from multiple geode versions?

-Dan

On Wed, Feb 13, 2019 at 3:59 PM Kirk Lund  wrote:

> Quite a few Geode stats are currently defined as IntCounters which very
> easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
> wrap to negative MAX_VALUE, so my team defined the following two tickets
> and changed them to LongCounters on the develop branch:
>
> GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
> https://issues.apache.org/jira/browse/GEODE-6345
>
> GEODE-6334: CachePerfStats operation count stats may wrap to negative
> values
> https://issues.apache.org/jira/browse/GEODE-6334
>
> Some people are now concerned that this may break backwards compatibility
> and API for existing users.
>
> There are two potential ways for a user to *experience* this as an API
> change:
>
> 1) MemberMXBean in JMX
>
> *-  int getTotalNetSearchCompleted();*
> *+  long getTotalNetSearchCompleted();*
>
> As you can see in GEODE-6334, we changed quite a few stats from integer to
> long in CachePerfStats. The only one directly exposed as an attribute on
> MemberMXBean is TotalNetSearchCompleted.
>
> Users will find that this API method changed.
>
> 2) Statistics API with undocumented strings
>
> If we assume that users might know or discover that we have a statistics
> text id of "cachePerfStats" with an integer stat of name "puts" then they
> could use our Statistics API to get the value:
>
> * 1:CacheFactory cacheFactory = new CacheFactory();*
> * 2:Cache cache = cacheFactory.create();*
> * 3:*
> * 4:Region region = cache. String>createRegionFactory(PARTITION).create("region");*
> * 5:*
> * 6:Statistics[] statistics =
> cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
> * 7:int oldPuts = statistics[0].getInt("puts");*
> * 8:*
> * 9:region.put("some", "value");*
> *10:*
> *11:int newPuts = statistics[0].getInt("puts");*
> *12:*
> *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
>
> The above works in Geode 1.8 and earlier but will begin to throw the
> following in Geode 1.9 (when develop branch is released):
>
> *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
> type long and it was expected to be an int.*
> * at
>
> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
> * at
>
> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
> * at
>
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
> * at
>
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
> * at com.user.example.Example.example(Example.java7)*
>
> In order to avoid the above exception, a user would have to change the code
> on lines 7 and 11 to use *getLong* instead of *getInt*.
>
> Should Geode stats be considered a form of API contract and should they
> conform to backwards compatibility constraints?
>
> Should we change these from LongCounters back to IntCounters?
>
> Someone did suggest that we could change them back to IntCounters and then
> change our statistics internals to skip to zero anytime a stat attempts to
> increment above MAX_VALUE, however, I think that if we decide that stats
> cannot be changed from IntCounters to LongCounters then we should not make
> this change either.
>
> Thanks for any input or opinions,
> Kirk
>


Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Jacob Barrett
I don’t consider the stats API a public API. I think it is a leaked internal 
API. Do we document the interactions with this API? Do we document the stat 
names? I consider documentation the API. I can discover all sorts of exposed 
methods in a library but shouldn’t consider them contracted API.

If however we come to a conclusion that they are API then we could work around 
the issues below. We could changes the stat API to be a little forgiving on the 
getInt and related calls if the internal storage is a long. We simply downcast 
the long to an int. The behavior to the end user does not change but if they 
use the getLong they get the new improved stat. For the JMX we just add a new 
method, getTotalNetSearchCompletedLong() to get the long version of the stat.

-Jake

> On Feb 13, 2019, at 3:58 PM, Kirk Lund  wrote:
> 
> Quite a few Geode stats are currently defined as IntCounters which very
> easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
> wrap to negative MAX_VALUE, so my team defined the following two tickets
> and changed them to LongCounters on the develop branch:
> 
> GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
> https://issues.apache.org/jira/browse/GEODE-6345
> 
> GEODE-6334: CachePerfStats operation count stats may wrap to negative values
> https://issues.apache.org/jira/browse/GEODE-6334
> 
> Some people are now concerned that this may break backwards compatibility
> and API for existing users.
> 
> There are two potential ways for a user to *experience* this as an API
> change:
> 
> 1) MemberMXBean in JMX
> 
> *-  int getTotalNetSearchCompleted();*
> *+  long getTotalNetSearchCompleted();*
> 
> As you can see in GEODE-6334, we changed quite a few stats from integer to
> long in CachePerfStats. The only one directly exposed as an attribute on
> MemberMXBean is TotalNetSearchCompleted.
> 
> Users will find that this API method changed.
> 
> 2) Statistics API with undocumented strings
> 
> If we assume that users might know or discover that we have a statistics
> text id of "cachePerfStats" with an integer stat of name "puts" then they
> could use our Statistics API to get the value:
> 
> * 1:CacheFactory cacheFactory = new CacheFactory();*
> * 2:Cache cache = cacheFactory.create();*
> * 3:*
> * 4:Region region = cache. String>createRegionFactory(PARTITION).create("region");*
> * 5:*
> * 6:Statistics[] statistics =
> cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
> * 7:int oldPuts = statistics[0].getInt("puts");*
> * 8:*
> * 9:region.put("some", "value");*
> *10:*
> *11:int newPuts = statistics[0].getInt("puts");*
> *12:*
> *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
> 
> The above works in Geode 1.8 and earlier but will begin to throw the
> following in Geode 1.9 (when develop branch is released):
> 
> *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
> type long and it was expected to be an int.*
> * at
> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
> * at
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
> * at com.user.example.Example.example(Example.java7)*
> 
> In order to avoid the above exception, a user would have to change the code
> on lines 7 and 11 to use *getLong* instead of *getInt*.
> 
> Should Geode stats be considered a form of API contract and should they
> conform to backwards compatibility constraints?
> 
> Should we change these from LongCounters back to IntCounters?
> 
> Someone did suggest that we could change them back to IntCounters and then
> change our statistics internals to skip to zero anytime a stat attempts to
> increment above MAX_VALUE, however, I think that if we decide that stats
> cannot be changed from IntCounters to LongCounters then we should not make
> this change either.
> 
> Thanks for any input or opinions,
> Kirk



Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Kirk Lund
Quite a few Geode stats are currently defined as IntCounters which very
easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
wrap to negative MAX_VALUE, so my team defined the following two tickets
and changed them to LongCounters on the develop branch:

GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
https://issues.apache.org/jira/browse/GEODE-6345

GEODE-6334: CachePerfStats operation count stats may wrap to negative values
https://issues.apache.org/jira/browse/GEODE-6334

Some people are now concerned that this may break backwards compatibility
and API for existing users.

There are two potential ways for a user to *experience* this as an API
change:

1) MemberMXBean in JMX

*-  int getTotalNetSearchCompleted();*
*+  long getTotalNetSearchCompleted();*

As you can see in GEODE-6334, we changed quite a few stats from integer to
long in CachePerfStats. The only one directly exposed as an attribute on
MemberMXBean is TotalNetSearchCompleted.

Users will find that this API method changed.

2) Statistics API with undocumented strings

If we assume that users might know or discover that we have a statistics
text id of "cachePerfStats" with an integer stat of name "puts" then they
could use our Statistics API to get the value:

* 1:CacheFactory cacheFactory = new CacheFactory();*
* 2:Cache cache = cacheFactory.create();*
* 3:*
* 4:Region region = cache.createRegionFactory(PARTITION).create("region");*
* 5:*
* 6:Statistics[] statistics =
cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
* 7:int oldPuts = statistics[0].getInt("puts");*
* 8:*
* 9:region.put("some", "value");*
*10:*
*11:int newPuts = statistics[0].getInt("puts");*
*12:*
*13:assertThat(newPuts).isEqualTo(oldPuts + 1);*

The above works in Geode 1.8 and earlier but will begin to throw the
following in Geode 1.9 (when develop branch is released):

*java.lang.IllegalArgumentException: The statistic puts with id 23 is of
type long and it was expected to be an int.*
* at
org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
* at
org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
* at
org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
* at
org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
* at com.user.example.Example.example(Example.java7)*

In order to avoid the above exception, a user would have to change the code
on lines 7 and 11 to use *getLong* instead of *getInt*.

Should Geode stats be considered a form of API contract and should they
conform to backwards compatibility constraints?

Should we change these from LongCounters back to IntCounters?

Someone did suggest that we could change them back to IntCounters and then
change our statistics internals to skip to zero anytime a stat attempts to
increment above MAX_VALUE, however, I think that if we decide that stats
cannot be changed from IntCounters to LongCounters then we should not make
this change either.

Thanks for any input or opinions,
Kirk