Odg: Odg: negative ActiveCQCount

2020-07-17 Thread Mario Kevo
Hi devs,

Just reminder if someone is familiar with this, or someone has some idea how to 
resolve this issue.

Thanks and BR,
Mario

Šalje: Mario Kevo 
Poslano: 7. srpnja 2020. 15:24
Prima: dev@geode.apache.org 
Predmet: Odg: Odg: negative ActiveCQCount

Hi,

Thank you all for the response!

What I got for now is that when I register CQ on the one server it 
processMessage to the other server through FilterProfile and in the message 
opType is REGISTER_CQ.
In fromData() method in FilterProfile.java states following:
if (isCqOp(this.opType)) {
this.serverCqName = in.readUTF();
if (this.opType == operationType.REGISTER_CQ || this.opType == 
operationType.SET_CQ_STATE) {
  this.cq = CqServiceProvider.readCq(in);
}
And there it register cq on the other server and not increment cqActiveCount, 
which is ok as redundancy is 0. But it now has on both server different 
instances of ServerCqImpl for the same cq. The ones created with constructor 
with arguments at the execute cq and another with empty constructor while 
deserializing the message with opType=REGISTER_CQ. For me this is ok as we need 
to follow up all changes on both servers as maybe some fullfil CQ condition on 
the other server. Correct me if I'm wrong.

But when it is going to close cq it executes it on both server, for me it is ok 
that what is started should be closed. But in the close method we have 
decrement if stateBeforeClosing is RUNNING. So it will be good if we can 
somehow process cq_state of this ServerCqImpl instance which is created by 
constructor with parameters before closing this created by deserialization.
Does anyone has an idea how to get this? Or some other idea to solve this issue?

BR,
Mario


Šalje: Kirk Lund 
Poslano: 1. srpnja 2020. 19:52
Prima: dev@geode.apache.org 
Predmet: Re: Odg: negative ActiveCQCount

Yeah, https://issues.apache.org/jira/browse/GEODE-8293 sounds like a
statistic decrement bug for activeCqCount. Somewhere, each Server is
decrementing it once too many times.

You could find the statistics class containing activeCqCount and try adding
some debugging log statements or even add some breakpoints for debugger if
it's easily reproduced.

On Wed, Jul 1, 2020 at 5:52 AM Mario Kevo  wrote:

> Hi Kirk, thanks for the response!
>
> I just realized that I wrongly describe the problem as I tried so many
> case. Sorry!
>
> We have system with two servers. If the redundancy is 0 then we have
> properly that on the first server is activeCqCount=1 and on the second is
> activeCqCount=0.
> After close CQ we got on first server activeCqCount=0 and on the second is
> activeCqCount=-1.
> gfsh>show metrics --categories=query
> Cluster-wide Metrics
>
> Category |  Metric  | Value
>  |  | -
> query| activeCQCount| -1
>  | queryRequestRate | 0.0
>
>
> In case we set redundancy to 1 it increments properly as expected, on both
> servers by one. But when cq is closed we got on both servers
> activeCqCount=-1. And show metrics command has the following output
> gfsh>show metrics --categories=query
> Cluster-wide Metrics
>
> Category |  Metric  | Value
>  |  | -
> query| activeCQCount| -1
>  | queryRequestRate | 0.0
>
> What I found is that when server register cq on one server it send message
> to other servers in the system with opType=REGISTER_CQ and in that case it
> creates new instance of ServerCqImpl on second server(with empty
> constructor of ServerCqImpl). When we close CQ there is two different
> instances on servers and it closed both of them, but as they are in RUNNING
> state before closing, it decrements activeCqCount on both of them.
>
> BR,
> Mario
>
> 
> Šalje: Kirk Lund 
> Poslano: 30. lipnja 2020. 19:54
> Prima: dev@geode.apache.org 
> Predmet: Re: negative ActiveCQCount
>
> I think *show metrics --categories=query* is showing you the query stats
> from DistributedSystemMXBean (see
> ShowMetricsCommand#writeSystemWideMetricValues). DistributedSystemMXBean
> aggregates values across all members in the cluster, so I would have
> expected activeCQCount to initially show a value of 2 after you create a
> ServerCQImpl in 2 servers. Then after closing the CQ, it should drop to a
> value of 0.
>
> When you create a CQ on a Server, it should be reflected asynchronously on
> the CacheServerMXBean in that Server. Each Server has its own
> CacheServerMXBean. Over on the Locator (JMX Manager), the
> DistributedSystemMXBean aggregates the count of active CQs in
> ServerClusterStatsMonitor by invoking
> DistributedSystemBridge#updateCacheServer when the CacheServerMXBean state
> is federated to the Locator (

Odg: Odg: negative ActiveCQCount

2020-07-07 Thread Mario Kevo
Hi,

Thank you all for the response!

What I got for now is that when I register CQ on the one server it 
processMessage to the other server through FilterProfile and in the message 
opType is REGISTER_CQ.
In fromData() method in FilterProfile.java states following:
if (isCqOp(this.opType)) {
this.serverCqName = in.readUTF();
if (this.opType == operationType.REGISTER_CQ || this.opType == 
operationType.SET_CQ_STATE) {
  this.cq = CqServiceProvider.readCq(in);
}
And there it register cq on the other server and not increment cqActiveCount, 
which is ok as redundancy is 0. But it now has on both server different 
instances of ServerCqImpl for the same cq. The ones created with constructor 
with arguments at the execute cq and another with empty constructor while 
deserializing the message with opType=REGISTER_CQ. For me this is ok as we need 
to follow up all changes on both servers as maybe some fullfil CQ condition on 
the other server. Correct me if I'm wrong.

But when it is going to close cq it executes it on both server, for me it is ok 
that what is started should be closed. But in the close method we have 
decrement if stateBeforeClosing is RUNNING. So it will be good if we can 
somehow process cq_state of this ServerCqImpl instance which is created by 
constructor with parameters before closing this created by deserialization.
Does anyone has an idea how to get this? Or some other idea to solve this issue?

BR,
Mario


Šalje: Kirk Lund 
Poslano: 1. srpnja 2020. 19:52
Prima: dev@geode.apache.org 
Predmet: Re: Odg: negative ActiveCQCount

Yeah, https://issues.apache.org/jira/browse/GEODE-8293 sounds like a
statistic decrement bug for activeCqCount. Somewhere, each Server is
decrementing it once too many times.

You could find the statistics class containing activeCqCount and try adding
some debugging log statements or even add some breakpoints for debugger if
it's easily reproduced.

On Wed, Jul 1, 2020 at 5:52 AM Mario Kevo  wrote:

> Hi Kirk, thanks for the response!
>
> I just realized that I wrongly describe the problem as I tried so many
> case. Sorry!
>
> We have system with two servers. If the redundancy is 0 then we have
> properly that on the first server is activeCqCount=1 and on the second is
> activeCqCount=0.
> After close CQ we got on first server activeCqCount=0 and on the second is
> activeCqCount=-1.
> gfsh>show metrics --categories=query
> Cluster-wide Metrics
>
> Category |  Metric  | Value
>  |  | -
> query| activeCQCount| -1
>  | queryRequestRate | 0.0
>
>
> In case we set redundancy to 1 it increments properly as expected, on both
> servers by one. But when cq is closed we got on both servers
> activeCqCount=-1. And show metrics command has the following output
> gfsh>show metrics --categories=query
> Cluster-wide Metrics
>
> Category |  Metric  | Value
>  |  | -
> query| activeCQCount| -1
>  | queryRequestRate | 0.0
>
> What I found is that when server register cq on one server it send message
> to other servers in the system with opType=REGISTER_CQ and in that case it
> creates new instance of ServerCqImpl on second server(with empty
> constructor of ServerCqImpl). When we close CQ there is two different
> instances on servers and it closed both of them, but as they are in RUNNING
> state before closing, it decrements activeCqCount on both of them.
>
> BR,
> Mario
>
> 
> Šalje: Kirk Lund 
> Poslano: 30. lipnja 2020. 19:54
> Prima: dev@geode.apache.org 
> Predmet: Re: negative ActiveCQCount
>
> I think *show metrics --categories=query* is showing you the query stats
> from DistributedSystemMXBean (see
> ShowMetricsCommand#writeSystemWideMetricValues). DistributedSystemMXBean
> aggregates values across all members in the cluster, so I would have
> expected activeCQCount to initially show a value of 2 after you create a
> ServerCQImpl in 2 servers. Then after closing the CQ, it should drop to a
> value of 0.
>
> When you create a CQ on a Server, it should be reflected asynchronously on
> the CacheServerMXBean in that Server. Each Server has its own
> CacheServerMXBean. Over on the Locator (JMX Manager), the
> DistributedSystemMXBean aggregates the count of active CQs in
> ServerClusterStatsMonitor by invoking
> DistributedSystemBridge#updateCacheServer when the CacheServerMXBean state
> is federated to the Locator (JMX Manager).
>
> Based on what I see in code and in the description on GEODE-8293, I think
> you might want to see if increment has a problem instead of decrement.
>
> I don't see anything that would limit the activeCQCount to only count the
> CQs on primaries. So, I would expect redundancy=1 to result in a value of
> 2. Does anyone else have different info about this?
>
> On Tue, Jun 30, 2020 at 5:31 AM Mario Kevo  wrote:
>
> > Hi geode-dev,
> >
> > I have a question