Hello Anthony,

Yes, I am running on virtualized infrastructure. But when I checked %id and
%st and graphed them over time, I see %st at 0.0 throughout and %id in the
95-98 range most of the time.
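For reference, this is roughly how I am sampling those counters (a minimal sketch, assuming a Linux host: it reads the aggregate "cpu" line from /proc/stat, where field 5 is cumulative idle time and field 9 is cumulative steal time, in jiffies):

```shell
# Print cumulative idle and steal jiffies from the aggregate "cpu" line.
# /^cpu / matches only the summary line, not the per-core cpu0/cpu1 lines.
awk '/^cpu /{printf "idle=%s steal=%s\n", $5, $9}' /proc/stat
```

Sampling this periodically and taking deltas gives the same picture as the %id/%st columns in top or mpstat.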

Could tuning the number of connections for each client app, member-timeout,
or ack-wait-threshold help here?
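To be concrete, these are the kinds of settings I have in mind, in gemfire.properties on the servers (the values below are illustrative guesses for discussion, not something I have validated; I believe the defaults are member-timeout=5000 ms and ack-wait-threshold=15 s):

```
# gemfire.properties -- illustrative values only
# member-timeout: milliseconds of missed heartbeats before a member is suspected
member-timeout=10000
# ack-wait-threshold: seconds to wait for operation acks before warning of an
# unresponsive member
ack-wait-threshold=30
```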

Thanks,
- Dharam Thacker


On Thu, Sep 27, 2018 at 8:37 PM Anthony Baker <[email protected]> wrote:

> Are you running on cloud or virtualized infrastructure?  If so, check
> your steal time stats—you may have “noisy neighbors” causing members to
> become unresponsive.  Geode detects this and fences off the unhealthy
> members to maintain consistency and availability.
>
> Anthony
>
>
> On Sep 27, 2018, at 10:31 AM, Dharam Thacker <[email protected]>
> wrote:
>
> Hi Team,
>
> I have the following topology for Geode currently, and all regions are
> replicated.
>
> Note : Unfortunately I am still on version 1.1.1
>
> *Host1*:
> Locator1
> Server1.1 (Group1) -- 24G
> Server2.1 (Group2) -- 24G
> Client1 (CQ listener only -- 20 CQs registered via locator pool)
> Client2 (Fires OQL queries and functions only via locator pool)
>
> *Host2*:
> Locator2
> Server1.2 (Group1) -- 24G
> Server2.2 (Group2) -- 24G
>
> As shown above, I have Spring Boot web app Geode clients (client1 and
> client2) only on HOST1.
>
> If I scale them by putting them on HOST2 as well, it works.
>
> Now I see 40 CQs registered for the CQ listener client.
>
> But now I frequently see a "GMS Membership error" complaining about "No
> heartbeat request" and forced disconnection of members on all server nodes.
>
> It is transient, but really painful!
>
> Somehow with 1.1.1 it can't auto-reconnect, which I know is fixed in a
> later version, but that's still fine.
>
> I analyzed GC, CPU load, and memory thoroughly, and at least these three
> look quite healthy, as expected.
>
> What could be the other possible reasons why scaling the client apps might
> result in this?
>
> Or can you suggest anything else to look at?
>
> Thanks,
> Dharam
>
>
>
