Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread Mark Hanson
I think this cache.close() discussion is off topic. I’m not sure that’s the 
case, but it’s not at the root of my question.

The problem: when run with gfsh -e from a gfsh rule in a test, stop server does 
not block the way the rest of the API seems to. 

I’m looking for a better understanding of the desired interface.

The suggestion put forth is to use the existing indicators to make it 
synchronous. That seems right to me, but I could be wrong.

Thanks,
Mark

> On Sep 10, 2019, at 7:12 PM, John Blum  wrote:
> 
> @Mike - Who said anything about...
> 
> "*masking it in an early return from the shutdown command doesn't seem like
> the appropriate action.*"
> 
> I think you missed the point.  You explicitly have to break out of the
> wait, which is a conscious decision when *Gfsh* is run interactively.
> 
> The command, as I previously stated, is "blocking", or "synchronous", with
> respect to cache.close(), which is "ultimately" what gets called whether
> the stop happens in-process or out-of-process (for that matter).  So, in a
> non-interactive mode, the real issue is: why is the cache not completely
> and properly closed/shut down after a cache.close() call?
> 
> Fix cache.close() then!  Don't simply band-aid this thing with yet another
> unreliable means to ascertain whether the cache was completely and properly
> shut down.  And don't put the responsibility on the user to have to register
> for and receive notification of complete shutdown, or some other ridiculous
> means, either.


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread John Blum
@Mike - Who said anything about...

"*masking it in an early return from the shutdown command doesn't seem like
the appropriate action.*"

I think you missed the point.  You explicitly have to break out of the
wait, which is a conscious decision when *Gfsh* is run interactively.

The command, as I previously stated, is "blocking", or "synchronous", with
respect to cache.close(), which is "ultimately" what gets called whether
the stop happens in-process or out-of-process (for that matter).  So, in a
non-interactive mode, the real issue is: why is the cache not completely
and properly closed/shut down after a cache.close() call?

Fix cache.close() then!  Don't simply band-aid this thing with yet another
unreliable means to ascertain whether the cache was completely and properly
shut down.  And don't put the responsibility on the user to have to register
for and receive notification of complete shutdown, or some other ridiculous
means, either.


On Tue, Sep 10, 2019 at 6:15 PM Michael Stolz  wrote:

> I understand that issue John, but masking it in an early return from the
> shutdown command doesn't seem like the appropriate action.
> Maybe we should consider that nearly all gfsh commands are not blocking,
> and rather, have a way to determine which ones are still waiting for
> completion?
>
> --
> Mike Stolz
> Principal Engineer, Pivotal Cloud Cache
> Mobile: +1-631-835-4771


-- 
-John
john.blum10101 (skype)


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread Michael Stolz
I understand that issue John, but masking it in an early return from the
shutdown command doesn't seem like the appropriate action.
Maybe we should consider that nearly all gfsh commands are not blocking,
and rather, have a way to determine which ones are still waiting for
completion?
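
One way to read this suggestion, sketched in Python purely for illustration:
commands start in the background, and the caller can ask which ones have not
finished yet. CommandTracker and its methods are hypothetical names invented
for this example; nothing here is an existing gfsh API.

```python
from concurrent.futures import ThreadPoolExecutor


class CommandTracker:
    """Runs commands in the background and reports which are unfinished.

    Hypothetical helper for this sketch; it is not part of gfsh.
    """

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}

    def submit(self, name, fn, *args):
        """Start fn(*args) without blocking and remember it under name."""
        self._futures[name] = self._pool.submit(fn, *args)

    def pending(self):
        """Return the names of commands that have not completed yet."""
        return sorted(n for n, f in self._futures.items() if not f.done())

    def wait(self, name, timeout_s=None):
        """Block until the named command completes and return its result."""
        return self._futures[name].result(timeout=timeout_s)

    def close(self):
        """Wait for outstanding commands and release the worker threads."""
        self._pool.shutdown(wait=True)
```

Under this reading, an interactive session could return to the prompt right
after submit(), while a script would call wait() before moving on.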

--
Mike Stolz
Principal Engineer, Pivotal Cloud Cache
Mobile: +1-631-835-4771



On Tue, Sep 10, 2019 at 9:13 PM John Blum  wrote:

> @Anil - I hear your argument when you are "scripting" with *Gfsh*, but
> blocking absolutely may be less desirable when using *Gfsh* interactively.
> There are, after all, many non-cluster-based commands.
>
> @Mark - I see.  I have generally found in my own testing, especially
> automated testing, that a cache instance is not fully closed and has not
> properly released all its resources even after cache.close() returns.
>
> -j


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread John Blum
@Anil - I hear your argument when you are "scripting" with *Gfsh*, but
blocking absolutely may be less desirable when using *Gfsh* interactively.
There are, after all, many non-cluster-based commands.

@Mark - I see.  I have generally found in my own testing, especially
automated testing, that a cache instance is not fully closed and has not
properly released all its resources even after cache.close() returns.

-j

On Tue, Sep 10, 2019 at 5:02 PM Mark Hanson  wrote:

> Hi John,
>
> Kirk and I found in our testing that stop server was returning before the
> server was fully stopped. I have a change that seems viable: it waits for
> the pid file to disappear from the server's subdirectory. I am not a fan. I
> would prefer to wait for the pid itself to disappear, but that doesn’t seem
> like it will be cross-platform friendly.
>
> Thanks,
> Mark

-- 
-John
john.blum10101 (skype)


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread Mark Hanson
Hi John, 

Kirk and I found in our testing that stop server was returning before the 
server was fully stopped. I have a change that seems viable: it waits for the 
pid file to disappear from the server's subdirectory. I am not a fan. I would 
prefer to wait for the pid itself to disappear, but that doesn’t seem like it 
will be cross-platform friendly.
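
For illustration only, a minimal sketch of that pid-file wait, written in
Python rather than Geode's Java. The function name, the timeout and polling
values, and the pid file path are assumptions made for the example; this is
not the actual change.

```python
import os
import time


def wait_for_pid_file_removal(pid_file, timeout_s=30.0, poll_s=0.1):
    """Block until pid_file no longer exists.

    Returns True if the file disappeared before the timeout, False otherwise.
    The name and defaults are hypothetical, chosen for this sketch.
    """
    deadline = time.monotonic() + timeout_s
    while os.path.exists(pid_file):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True
```

A test could call this right after issuing stop server and fail if it returns
False. Checking for a file works the same way on every platform, which is the
appeal; the trade-off is that the file is an indirect indicator rather than
the process itself.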

Thanks,
Mark

> On Sep 10, 2019, at 3:31 PM, John Blum  wrote:
> 
> `stop server` is synchronous (with an option to break out of the wait using
> CTRL^C) AFAIR.
> 
> Way deep down inside, it simply relies on GemFireCache.close() to return
> (in-process).
> 
> As Darrel mentioned, there is no "true" signal that the server was
> successfully stopped.
> 
> -j



Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread Anilkumar Gingade
It's a good option. But do we see any use cases where the user doesn't want to
wait for a server stop (if it's taking a long time) and would rather proceed
with other operations (say, executing commands on other servers)?
Also, I could not make out how this is related to GEODE-7017; the test case
seems to be related to starting the server...

-Anil.


On Tue, Sep 10, 2019 at 3:32 PM John Blum  wrote:

> `stop server` is synchronous (with an option to break out of the wait using
> CTRL^C) AFAIR.
>
> Way deep down inside, it simply relies on GemFireCache.close() to return
> (in-process).
>
> As Darrel mentioned, there is no "true" signal that the server was
> successfully stopped.
>
> -j


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread John Blum
`stop server` is synchronous (with an option to break out of the wait using
CTRL^C) AFAIR.

Way deep down inside, it simply relies on GemFireCache.close() to return
(in-process).

As Darrel mentioned, there is no "true" signal that the server was
successfully stopped.

-j


On Tue, Sep 10, 2019 at 3:23 PM Darrel Schneider 
wrote:

> I think it would be good for stop server to confirm in some way that the
> server has stopped before returning.


-- 
-John
john.blum10101 (skype)


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread Darrel Schneider
I think it would be good for stop server to confirm in some way that the
server has stopped before returning.

On Tue, Sep 10, 2019 at 3:08 PM Mark Hanson  wrote:

> Hello All,
>
> I would like to propose that we make the gfsh “stop server” command
> synchronous. The current behavior is causing issues in some tests: the rest
> of the calls are blocking, whereas stop returns immediately by comparison.
> This causes issues as shown in GEODE-7017 specifically.
>
> GEODE-7017 CI failure:
> org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest >
> startupReportsOnlineOnlyAfterRedundancyRestored
> https://issues.apache.org/jira/browse/GEODE-7017
>
>
> What do people think?
>
> Thanks,
> Mark


[Proposal] Make gfsh "stop server" command synchronous

2019-09-10 Thread Mark Hanson
Hello All,

I would like to propose that we make the gfsh “stop server” command 
synchronous. The current behavior is causing issues in some tests: the rest of 
the calls are blocking, whereas stop returns immediately by comparison.
This causes issues as shown in GEODE-7017 specifically.

GEODE-7017 CI failure: 
org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest > 
startupReportsOnlineOnlyAfterRedundancyRestored
https://issues.apache.org/jira/browse/GEODE-7017 



What do people think?

Thanks,
Mark

Re: [DISCUSS] RFC - Move membership code to a separate gradle sub-project

2019-09-10 Thread Dan Smith
It looks like there is consensus to move this RFC forward and we are past
the review-by date. I'll go ahead and move this RFC into the "Under
Development" state. Thanks to all who provided feedback! If you have
additional feedback, we'll still be watching the RFC and this thread for
further comments/questions.

-Dan

On Fri, Sep 6, 2019 at 11:08 AM Udo Kohlmeyer  wrote:

> I've reviewed and commented on the RFC.
>
> +1 on the thought / notion of extracting modules.
>
> I'm less convinced on the initial extraction of the geode-serialization
> module and believe some attention is to be given to this, once a
> decision is made to convert the serialization to a standalone module.
>
> --Udo
>
> On 8/30/19 3:50 PM, Dan Smith wrote:
> > Hi all,
> >
> > We added the following RFC to the wiki about moving Geode's membership
> > system to a separate gradle sub-project. Please review and comment by
> > 9/6/2019.
> >
> > https://cwiki.apache.org/confluence/x/WRB4Bw
> >
> > Thanks!
> > -Dan
> >
>