Errored: apache/geode-native#2084 (release/1.10.0 - 0668f6b)

2019-09-11 Thread Travis CI
Build Update for apache/geode-native
------------------------------------

Build: #2084
Status: Errored

Duration: 2 hrs, 0 mins, and 18 secs
Commit: 0668f6b (release/1.10.0)
Author: Owen Nichols
Message: GEODE-7182: fix a warning in TcpSslConn.cpp that prevents successful 
compilation on gcc 8.3 (#515)


(cherry picked from commit 9b1c5ab31558c9d0ead6927398cb608521a5c20d)

View the changeset: 
https://github.com/apache/geode-native/compare/e1fba6056375...0668f6b7272a

View the full build log and details: 
https://travis-ci.org/apache/geode-native/builds/583862466?utm_medium=notification_source=email




important change in backward-compatibility testing

2019-09-11 Thread Bruce Schuchardt
To fix GEODE-7168 I have modified the backward-compatibility framework 
to preserve periods in version names.  Prior to the change versions were 
in the form "100", "110", "120", etc.  Now they are "1.0.0-incubating", 
"1.1.0", "1.2.0", etc.


When requesting a VM that's running a particular version of Geode you 
need to use the dot notation.


    host.getVM(1, "1.9.0")    (not host.getVM(1, "190"))

There is a new method in VersionManager for transforming a version 
string into the old notation if you need it.  See 
VersionManager.versionWithNoDots(versionString).  Use that if you need 
to parse a version into an integer.
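
As a rough illustration of the two notations, the sketch below drops a qualifier such as "-incubating" and then removes the periods from a dotted version string. It is a hypothetical stand-in written for this note, not the actual VersionManager.versionWithNoDots code; the class name VersionNotation, the helper versionOrdinal, and the assumption that a qualifier starts at the first '-' are all illustrative.

```java
// Hypothetical helper for illustration; NOT the real VersionManager code.
final class VersionNotation {
    private VersionNotation() {}

    /** Converts a dotted version such as 1.9.0 to the old dotless form. */
    static String versionWithNoDots(String version) {
        // Assumption: any qualifier (e.g. "-incubating") starts at the first '-'.
        int dash = version.indexOf('-');
        String numeric = dash >= 0 ? version.substring(0, dash) : version;
        return numeric.replace(".", "");
    }

    /** Parses the dotless form into an integer. */
    static int versionOrdinal(String version) {
        return Integer.parseInt(versionWithNoDots(version));
    }

    public static void main(String[] args) {
        System.out.println(versionWithNoDots("1.9.0"));
        System.out.println(versionOrdinal("1.0.0-incubating"));
    }
}
```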


On another note, please don't expect that these versions are reflected 
in the Versions class, which is primarily for noting changes in 
serialization.  For instance, our recent release 1.9.1 has no 
corresponding Version instance.




Re: [DISCUSS] Improvements on client function execution API

2019-09-11 Thread Dan Smith
+1 - Ok, I think I've come around to option (a). We can go ahead and add a
new execute(timeout, TimeUnit) method to the Java API that is blocking. We
can leave the existing execute() method alone, except for documenting what
it is doing.

I would like to implement execute(timeout, TimeUnit) on the server side as
well. Since this Execution class is shared between both client and server
APIs, it would be unfortunate to have a method on Execution that simply
doesn't work on the server side.
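
One way to picture a blocking execute(timeout, TimeUnit) sitting next to the existing execute() is the sketch below. It is only an illustration under stated assumptions: TimedExecution is a hypothetical class, not Geode's Execution interface, and the function is modeled as a plain Supplier run on the common pool. The timeout here bounds the whole operation rather than a socket read.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical stand-in for illustration; NOT Geode's Execution interface.
final class TimedExecution<T> {
    private final Supplier<T> function;

    TimedExecution(Supplier<T> function) {
        this.function = function;
    }

    /** Blocks until the function completes, with no time limit. */
    T execute() {
        return function.get();
    }

    /** Blocks until the function completes or the timeout elapses. */
    T execute(long timeout, TimeUnit unit)
            throws TimeoutException, InterruptedException, ExecutionException {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(function);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Marks the future cancelled; work already running is not interrupted.
            future.cancel(true);
            throw e;
        }
    }
}
```

Note that a timed-out task keeps running in the background in this sketch, which mirrors the resource-holding concern raised for the C++ client in the quoted proposal.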

-Dan


On Thu, Sep 5, 2019 at 9:25 AM Alberto Gomez  wrote:

> Hi all,
>
> First of all, thanks a lot Dan and Jacob for your feedback.
>
> As we are getting close to the deadline I am adding here some conclusions
> and a refined proposal in order to get some more feedback and if possible
> some voting on the two alternatives proposed (or any other in between if
> you feel any of them is lacking something).
>
> I also add some draft code to try to clarify a bit the more complex of the
> alternatives.
>
>
> Proposal summary (needs a decision on which option to implement):
>
> ---
>
> In order to make the API more coherent two alternatives are proposed:
>
> a) Remove the timeout from the ResultCollector::getResult() / document
> that the timeout has no effect, taking into account that
> Execution::execute() is always blocking.
> Additionally we could add the timeout parameter to the
> Execution::execute() method of the Java API in order to align it with the
> native client APIs. This timeout would not be the read timeout on the
> socket but a timeout for the execution of the operation.
>
> b) Change the implementation of the Execution::execute() method without
> timeout to be non-blocking on both the Java and native APIs. This change
> has backward compatibility implications, would probably bring some
> performance decrease and could pose some difficulties in the implementation
> on the C++ side (in the  handling of timed out operations that hold
> resources).
>
>
> The first option (a) is less risky and does not have impacts regarding
> backward compatibility and performance.
>
> The second one (b) is the preferred alternative in terms of the behavior
> users of the API would expect. This option is more complex to implement
> and, as mentioned above, has performance and backward compatibility
> issues that are not easy to solve.
>
> Following is a draft version of the implementation of b) on the Java
> client:
>
> https://github.com/Nordix/geode/commit/507a795e34c6083c129bda7e976b9223d1a893da
>
> Following is a draft version of the implementation of b) on the C++ native
> client:
>
> https://github.com/apache/geode-native/commit/a03a56f229bb8d75ee71044cf6196df07f43150d
>
> Note that the above implementation of b) in the C++ client implies that
> the Execution object returned by the FunctionService cannot be destroyed
> until the thread executing the function asynchronously has finished. If the
> function times out, the Execution object must be kept until the thread
> finishes.
>
>
> Other considerations
> --------------------
>
> * Currently, the Java client offers no way to set a timeout for the
> execution of functions. The closest thing is the read timeout, which may
> be set globally for function executions, but this is not really a timeout
> for operations.
>
> * Even if the API for function execution is the same on clients and
> servers, the implementation is different between them. On the clients, the
> execute() methods are blocking while on the servers it is non-blocking and
> the invoker of the function blocks on the getResult() method of the
> ResultCollector returned by the execute() method.
> Even if having both blocking and non-blocking implementation of execute()
> in both clients and servers sounds desirable from the point of view of
> orthogonality, this  could bring complications in terms of backward
> compatibility. Besides, a need for a blocking version of function execution
> has not been found.
>
> -Alberto G.
>
> On 29/8/19 16:48, Alberto Gomez wrote:
>
> Sorry, some corrections on my comments after revisiting the native
> client code.
>
> When I said that the timeout used in the execute() method (by means of
> a system property) was to set a read timeout on the socket, I was only
> talking about the Java client. In the case of the native clients, the
> timeout set in the execute() method is not translated into a socket
> timeout but it is the time to wait for the operation to complete, i.e.,
> to get all the results back.
>
> Things being so, I would change my proposal to:
>
> - Change the implementation of execute() on both Java and native clients
> to be non-blocking (having the blocking/non-blocking behavior
> configurable in the release this is introduced and leaving only the
> non-blocking behavior in the next release).
>
> - Either remove the execute() with timeout 

Re: Question about excluding serialized classes

2019-09-11 Thread Dale Emery
As far as I can tell, the things that execute functions use the public API to 
find the function to execute. So if we unwrap the functions in the public API, 
only the un-instrumented functions will be executed.

—
Dale Emery
dem...@pivotal.io



> On Sep 11, 2019, at 1:38 PM, Dan Smith  wrote:
> 
> I think you could still use your decorator approach if you also unwrap the
> Functions when returning them from the public APIs like getFunction etc. In
> that case your TimingFunction could probably safely be added to
> excludedClasses.txt.
> 
> You will miss collecting metrics about functions that aren't registered and
> are invoked using Execution.execute(Function) but maybe that's intentional?
> 
> -Dan
> 
> On Wed, Sep 11, 2019 at 1:24 PM Mark Hanson  wrote:
> 
>> They would be the specific functions, but this doesn’t get us around they
>> other problem. I think this approach is not going to work and we are about
>> to start looking into other ways.
>> 
>> Thanks,
>> Mark
>> 
>>> On Sep 11, 2019, at 12:14 PM, Anthony Baker  wrote:
>>> 
>>> I think the Decorator approach you outlined could have other impacts as
>> well.  Would I still be able to see specific function executions in
>> statistics or would they all become “TImingFunction”?
>>> 
>>> Anthony
>>> 
>>> 
 On Sep 11, 2019, at 12:00 PM, Aaron Lindsey 
>> wrote:
 
 Thanks for your response, Dan.
 
 The second scenario you mentioned (i.e. users calling
 FunctionService.getFunction(String)) worries me because our PR changes
>> the
 FunctionService so they would now get back an instance of the decorator
 function instead of the specific instance they registered by calling
 FunctionService.registerFunction(Function). Therefore, any explicit
>> casts
 to a Function implementation like (MyFunction)
 FunctionService.getFunction("MyFunction") would fail. Does that mean
>> this
 be a breaking change? The FunctionService class does not specify that
 getFunction must return the same type function as the one passed to
 registerFunction, but I could see how users might be relying on that
 behavior since there is no other way to get a specific function type
>> out of
 the FunctionService without doing a cast.
 
 - Aaron
 
 
 On Wed, Sep 11, 2019 at 10:52 AM Dan Smith  wrote:
 
> Functions are serialized when you call Execution.execute(Function)
>> instead
> of Execution.execute(String). Invoking execute on a function object
> serializes the function and executes it on the remote side. Functions
> executed this way don't have be registered.
> 
> Users can also get registered function objects directly from the
>> function
> service using FunctionService.getFunction(String) and do whatever they
>> want
> with them, which I guess could include serializing them.
> 
> Hope that helps!
> -Dan
> 
> On Wed, Sep 11, 2019 at 10:27 AM Aaron Lindsey 
> wrote:
> 
>> As part of a PR to add Micrometer timers for function executions
>> , we implemented a
>> decorator
>> Function that wraps all registered non-internal functions and adds
>> instrumentation. This PR is
>> failing AnalyzeSerializablesJUnitTest.testSerializables because the
>> decorator class is a new Serializable.
>> 
>> I'm not sure if it would be OK to add this class to
>> excludedClasses.txt
>> because I don't know whether this function will ever be serialized.
>> If it
>> will be serialized, then I'm concerned that this might break backwards
>> compatibility because we're changing the serialized form of registered
>> functions. If this is the case, then we could implement custom logic
>> for
>> serializing the decorator class which would replace its serialized
>> form
>> with the serialized form of the inner class. Again, I'm not sure if
>> that
>> would be necessary because I don't know the conditions under which a
>> function would be serialized.
>> 
>> Could someone help me understand when functions would be persisted or
> sent
>> over the wire so I can determine if this change would break
> compatibility?
>> 
>> Thanks,
>> Aaron
>> 
> 
>>> 
>> 
>> 



Re: Question about excluding serialized classes

2019-09-11 Thread Dale Emery
The stats use the ID of the function, and each TimingFunction reports the same 
ID as the function it wraps. So I think the stats would look like they always 
did.
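
The ID delegation described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: SimpleFunction is a hypothetical stand-in for Geode's Function interface, and the fields and method bodies of this TimingFunction are assumptions.

```java
// Hypothetical minimal interface standing in for Geode's Function.
interface SimpleFunction {
    String getId();

    void execute(String argument);
}

// Sketch of a timing decorator; illustrative only, not the PR's class.
final class TimingFunction implements SimpleFunction {
    private final SimpleFunction inner;
    private long totalNanos;

    TimingFunction(SimpleFunction inner) {
        this.inner = inner;
    }

    @Override
    public String getId() {
        // Report the wrapped function's ID so ID-keyed statistics are unchanged.
        return inner.getId();
    }

    @Override
    public void execute(String argument) {
        long start = System.nanoTime();
        try {
            inner.execute(argument);
        } finally {
            // Record elapsed time even if the wrapped function throws.
            totalNanos += System.nanoTime() - start;
        }
    }

    long totalNanos() {
        return totalNanos;
    }
}
```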

Dale

—
Dale Emery
dem...@pivotal.io



> On Sep 11, 2019, at 12:14 PM, Anthony Baker  wrote:
> 
> I think the Decorator approach you outlined could have other impacts as well. 
>  Would I still be able to see specific function executions in statistics or 
> would they all become “TImingFunction”?
> 
> Anthony
> 
> 
>> On Sep 11, 2019, at 12:00 PM, Aaron Lindsey  wrote:
>> 
>> Thanks for your response, Dan.
>> 
>> The second scenario you mentioned (i.e. users calling
>> FunctionService.getFunction(String)) worries me because our PR changes the
>> FunctionService so they would now get back an instance of the decorator
>> function instead of the specific instance they registered by calling
>> FunctionService.registerFunction(Function). Therefore, any explicit casts
>> to a Function implementation like (MyFunction)
>> FunctionService.getFunction("MyFunction") would fail. Does that mean this
>> be a breaking change? The FunctionService class does not specify that
>> getFunction must return the same type function as the one passed to
>> registerFunction, but I could see how users might be relying on that
>> behavior since there is no other way to get a specific function type out of
>> the FunctionService without doing a cast.
>> 
>> - Aaron
>> 
>> 
>> On Wed, Sep 11, 2019 at 10:52 AM Dan Smith  wrote:
>> 
>>> Functions are serialized when you call Execution.execute(Function) instead
>>> of Execution.execute(String). Invoking execute on a function object
>>> serializes the function and executes it on the remote side. Functions
>>> executed this way don't have be registered.
>>> 
>>> Users can also get registered function objects directly from the function
>>> service using FunctionService.getFunction(String) and do whatever they want
>>> with them, which I guess could include serializing them.
>>> 
>>> Hope that helps!
>>> -Dan
>>> 
>>> On Wed, Sep 11, 2019 at 10:27 AM Aaron Lindsey 
>>> wrote:
>>> 
 As part of a PR to add Micrometer timers for function executions
 , we implemented a decorator
 Function that wraps all registered non-internal functions and adds
 instrumentation. This PR is
 failing AnalyzeSerializablesJUnitTest.testSerializables because the
 decorator class is a new Serializable.
 
 I'm not sure if it would be OK to add this class to excludedClasses.txt
 because I don't know whether this function will ever be serialized. If it
 will be serialized, then I'm concerned that this might break backwards
 compatibility because we're changing the serialized form of registered
 functions. If this is the case, then we could implement custom logic for
 serializing the decorator class which would replace its serialized form
 with the serialized form of the inner class. Again, I'm not sure if that
 would be necessary because I don't know the conditions under which a
 function would be serialized.
 
 Could someone help me understand when functions would be persisted or
>>> sent
 over the wire so I can determine if this change would break
>>> compatibility?
 
 Thanks,
 Aaron
 
>>> 
> 



Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-11 Thread Mark Hanson
Good question. I will have to look into that.

Thanks,
Mark
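
For readers following the pid-file idea quoted below, a minimal polling sketch might look like this. PidFileWaiter is a hypothetical helper, not part of gfsh, and it assumes the pid file is on a filesystem the caller can read, which is exactly the limitation the remote --member question raises.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

// Hypothetical helper for illustration; NOT part of gfsh.
final class PidFileWaiter {
    private PidFileWaiter() {}

    /** Polls until pidFile no longer exists; returns false if the timeout elapses first. */
    static boolean awaitStopped(Path pidFile, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (Files.exists(pidFile)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // file still present when time ran out
            }
            Thread.sleep(pollInterval.toMillis());
        }
        return true;
    }
}
```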

> On Sep 11, 2019, at 10:53 AM, Dan Smith  wrote:
> 
>> The idea I am working with at the moment that Kirk pointed me at was to
> use the pid file in the directory as indicator. Once that file disappears
> the server is stopped.
> 
> How will this work if stop server --member is invoked from a different
> machine than the member that is being stopped?
> 
> -Dan
> 
> On Wed, Sep 11, 2019 at 10:28 AM Mark Hanson  wrote:
> 
>> The idea I am working with at the moment that Kirk pointed me at was to
>> use the pid file in the directory as indicator. Once that file disappears
>> the server is stopped.
>> 
>> That seems to work in my testing.
>> 
>> Thoughts?
>> 
>> Thanks,
>> Mark
>> 
>>> On Sep 11, 2019, at 10:23 AM, Dan Smith  wrote:
>>> 
>>> It does seem like we should make stop synchronous, or at least make start
>>> wait for the old process to die as Bruce suggested. Otherwise it is
>>> difficult for someone to script the restart of a server.
>>> 
>>> Looking at the code, it does look like gfsh stop is asynchronous. There
>> are
>>> multiple ways to stop a server:
>>> * gfsh stop --dir - it looks like we write out some stop file and return
>>> immediately. Or, if we can connect over JMX, we invoke the
>>> MemberMBean.shutDownMember method, which launches a thread to close the
>>> cache, which is also asynchronous.
>>> * gfsh stop --pid - this seems to be similar to --dir
>>> * With a member name - this appears to go to the
>> MemberMBean.shutDownMember
>>> method as well.
>>> 
>>> I think one issue is that the JMX methods to stopping the server may be
>>> hard to ensure the process is really gone, because they can be invoked
>>> remotely. That may be why they are asynchronous - they need to return
>>> something to the caller before shutting down. So maybe Bruce's suggestion
>>> is better.
>>> 
>>> As Jens pointed out - tests should generally just use port 0 for servers.
>>> 
>>> -Dan
>>> 
>>> On Wed, Sep 11, 2019 at 8:46 AM Jens Deppe  wrote:
>>> 
 To circle back to the original test failure that prompted this
>> discussion -
 the failing test was getting intermittent bind exceptions on subsequent
 server restarts.
 
 I believe it's quite likely that a process' ports will remain
>> unavailable
 even after it is gone (I'm not sure if we create listening sockets with
 SO_REUSEADDR). So, as to John's comment that gfsh is already
>> synchronous, I
 don't think that adding extra functionality to gfsh, to ultimately just
 wait longer before exiting, is really solving the problem. I'd suggest
>> you
 adjust the tests to always start servers with `--server-port=0` so that
 there are no port conflicts and let the OS handle it.
 
 --Jens
 
 On Wed, Sep 11, 2019 at 8:17 AM Bruce Schuchardt <
>> bschucha...@pivotal.io>
 wrote:
 
> Blocking or non-blocking, I don't have a strong opinion.  What I'd
> really like to have gfsh ensure, though, is that no-one is able to
>> start
> a new instance of a server while the old process is still around.
>> Maybe
> the PID file is the way to do that.
> 
> On 9/10/19 3:08 PM, Mark Hanson wrote:
>> Hello All,
>> 
>> I would like to propose that we make the gfsh “stop server” command
> synchronous. It is causing some issues with some tests as the rest of
>> the
> calls are blocking. Stop on the other hand immediately returns by
> comparison.
>> This causes issues as shown in GEODE-7017 specifically.
>> 
>> GEODE:7017 CI failure:
> org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest >
> startupReportsOnlineOnlyAfterRedundancyRestored
>> https://issues.apache.org/jira/browse/GEODE-7017 <
> https://issues.apache.org/jira/browse/GEODE-7017>
>> 
>> 
>> What do people think?
>> 
>> Thanks,
>> Mark
> 
 
>> 
>> 



Re: Question about excluding serialized classes

2019-09-11 Thread Dan Smith
I think you could still use your decorator approach if you also unwrap the
Functions when returning them from the public APIs like getFunction etc. In
that case your TimingFunction could probably safely be added to
excludedClasses.txt.

You will miss collecting metrics about functions that aren't registered and
are invoked using Execution.execute(Function) but maybe that's intentional?
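
A rough sketch of the unwrap-on-lookup idea follows. All types here are hypothetical and are not Geode's FunctionService. The sketch also assumes the execution path can use a separate internal lookup (getInstrumented) to reach the wrapper, which the thread does not confirm.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical types for illustration; NOT Geode's FunctionService.
interface RegisteredFunction {
    String getId();
}

final class TimedWrapper implements RegisteredFunction {
    final RegisteredFunction inner;

    TimedWrapper(RegisteredFunction inner) {
        this.inner = inner;
    }

    @Override
    public String getId() {
        return inner.getId();
    }
}

final class FunctionRegistry {
    private final Map<String, TimedWrapper> byId = new ConcurrentHashMap<>();

    /** Stores the instrumented wrapper internally. */
    void registerFunction(RegisteredFunction function) {
        byId.put(function.getId(), new TimedWrapper(function));
    }

    /** Public lookup: hands back the caller's original object, unwrapped. */
    RegisteredFunction getFunction(String id) {
        TimedWrapper wrapper = byId.get(id);
        return wrapper == null ? null : wrapper.inner;
    }

    /** Assumed internal lookup for the execution path, so timing still applies. */
    RegisteredFunction getInstrumented(String id) {
        return byId.get(id);
    }
}
```

Because the public lookup returns the original object, callers never see the wrapper class through that path.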

-Dan

On Wed, Sep 11, 2019 at 1:24 PM Mark Hanson  wrote:

> They would be the specific functions, but this doesn’t get us around they
> other problem. I think this approach is not going to work and we are about
> to start looking into other ways.
>
> Thanks,
> Mark
>
> > On Sep 11, 2019, at 12:14 PM, Anthony Baker  wrote:
> >
> > I think the Decorator approach you outlined could have other impacts as
> well.  Would I still be able to see specific function executions in
> statistics or would they all become “TImingFunction”?
> >
> > Anthony
> >
> >
> >> On Sep 11, 2019, at 12:00 PM, Aaron Lindsey 
> wrote:
> >>
> >> Thanks for your response, Dan.
> >>
> >> The second scenario you mentioned (i.e. users calling
> >> FunctionService.getFunction(String)) worries me because our PR changes
> the
> >> FunctionService so they would now get back an instance of the decorator
> >> function instead of the specific instance they registered by calling
> >> FunctionService.registerFunction(Function). Therefore, any explicit
> casts
> >> to a Function implementation like (MyFunction)
> >> FunctionService.getFunction("MyFunction") would fail. Does that mean
> this
> >> be a breaking change? The FunctionService class does not specify that
> >> getFunction must return the same type function as the one passed to
> >> registerFunction, but I could see how users might be relying on that
> >> behavior since there is no other way to get a specific function type
> out of
> >> the FunctionService without doing a cast.
> >>
> >> - Aaron
> >>
> >>
> >> On Wed, Sep 11, 2019 at 10:52 AM Dan Smith  wrote:
> >>
> >>> Functions are serialized when you call Execution.execute(Function)
> instead
> >>> of Execution.execute(String). Invoking execute on a function object
> >>> serializes the function and executes it on the remote side. Functions
> >>> executed this way don't have be registered.
> >>>
> >>> Users can also get registered function objects directly from the
> function
> >>> service using FunctionService.getFunction(String) and do whatever they
> want
> >>> with them, which I guess could include serializing them.
> >>>
> >>> Hope that helps!
> >>> -Dan
> >>>
> >>> On Wed, Sep 11, 2019 at 10:27 AM Aaron Lindsey 
> >>> wrote:
> >>>
>  As part of a PR to add Micrometer timers for function executions
>  , we implemented a
> decorator
>  Function that wraps all registered non-internal functions and adds
>  instrumentation. This PR is
>  failing AnalyzeSerializablesJUnitTest.testSerializables because the
>  decorator class is a new Serializable.
> 
>  I'm not sure if it would be OK to add this class to
> excludedClasses.txt
>  because I don't know whether this function will ever be serialized.
> If it
>  will be serialized, then I'm concerned that this might break backwards
>  compatibility because we're changing the serialized form of registered
>  functions. If this is the case, then we could implement custom logic
> for
>  serializing the decorator class which would replace its serialized
> form
>  with the serialized form of the inner class. Again, I'm not sure if
> that
>  would be necessary because I don't know the conditions under which a
>  function would be serialized.
> 
>  Could someone help me understand when functions would be persisted or
> >>> sent
>  over the wire so I can determine if this change would break
> >>> compatibility?
> 
>  Thanks,
>  Aaron
> 
> >>>
> >
>
>


Re: Question about excluding serialized classes

2019-09-11 Thread Mark Hanson
They would be the specific functions, but this doesn’t get us around the other 
problem. I think this approach is not going to work and we are about to start 
looking into other ways.

Thanks,
Mark

> On Sep 11, 2019, at 12:14 PM, Anthony Baker  wrote:
> 
> I think the Decorator approach you outlined could have other impacts as well. 
>  Would I still be able to see specific function executions in statistics or 
> would they all become “TImingFunction”?
> 
> Anthony
> 
> 
>> On Sep 11, 2019, at 12:00 PM, Aaron Lindsey  wrote:
>> 
>> Thanks for your response, Dan.
>> 
>> The second scenario you mentioned (i.e. users calling
>> FunctionService.getFunction(String)) worries me because our PR changes the
>> FunctionService so they would now get back an instance of the decorator
>> function instead of the specific instance they registered by calling
>> FunctionService.registerFunction(Function). Therefore, any explicit casts
>> to a Function implementation like (MyFunction)
>> FunctionService.getFunction("MyFunction") would fail. Does that mean this
>> be a breaking change? The FunctionService class does not specify that
>> getFunction must return the same type function as the one passed to
>> registerFunction, but I could see how users might be relying on that
>> behavior since there is no other way to get a specific function type out of
>> the FunctionService without doing a cast.
>> 
>> - Aaron
>> 
>> 
>> On Wed, Sep 11, 2019 at 10:52 AM Dan Smith  wrote:
>> 
>>> Functions are serialized when you call Execution.execute(Function) instead
>>> of Execution.execute(String). Invoking execute on a function object
>>> serializes the function and executes it on the remote side. Functions
>>> executed this way don't have be registered.
>>> 
>>> Users can also get registered function objects directly from the function
>>> service using FunctionService.getFunction(String) and do whatever they want
>>> with them, which I guess could include serializing them.
>>> 
>>> Hope that helps!
>>> -Dan
>>> 
>>> On Wed, Sep 11, 2019 at 10:27 AM Aaron Lindsey 
>>> wrote:
>>> 
 As part of a PR to add Micrometer timers for function executions
 , we implemented a decorator
 Function that wraps all registered non-internal functions and adds
 instrumentation. This PR is
 failing AnalyzeSerializablesJUnitTest.testSerializables because the
 decorator class is a new Serializable.
 
 I'm not sure if it would be OK to add this class to excludedClasses.txt
 because I don't know whether this function will ever be serialized. If it
 will be serialized, then I'm concerned that this might break backwards
 compatibility because we're changing the serialized form of registered
 functions. If this is the case, then we could implement custom logic for
 serializing the decorator class which would replace its serialized form
 with the serialized form of the inner class. Again, I'm not sure if that
 would be necessary because I don't know the conditions under which a
 function would be serialized.
 
 Could someone help me understand when functions would be persisted or
>>> sent
 over the wire so I can determine if this change would break
>>> compatibility?
 
 Thanks,
 Aaron
 
>>> 
> 



Re: Question about excluding serialized classes

2019-09-11 Thread Anthony Baker
I think the Decorator approach you outlined could have other impacts as well.  
Would I still be able to see specific function executions in statistics or 
would they all become “TimingFunction”?

Anthony


> On Sep 11, 2019, at 12:00 PM, Aaron Lindsey  wrote:
> 
> Thanks for your response, Dan.
> 
> The second scenario you mentioned (i.e. users calling
> FunctionService.getFunction(String)) worries me because our PR changes the
> FunctionService so they would now get back an instance of the decorator
> function instead of the specific instance they registered by calling
> FunctionService.registerFunction(Function). Therefore, any explicit casts
> to a Function implementation like (MyFunction)
> FunctionService.getFunction("MyFunction") would fail. Does that mean this
> be a breaking change? The FunctionService class does not specify that
> getFunction must return the same type function as the one passed to
> registerFunction, but I could see how users might be relying on that
> behavior since there is no other way to get a specific function type out of
> the FunctionService without doing a cast.
> 
> - Aaron
> 
> 
> On Wed, Sep 11, 2019 at 10:52 AM Dan Smith  wrote:
> 
>> Functions are serialized when you call Execution.execute(Function) instead
>> of Execution.execute(String). Invoking execute on a function object
>> serializes the function and executes it on the remote side. Functions
>> executed this way don't have be registered.
>> 
>> Users can also get registered function objects directly from the function
>> service using FunctionService.getFunction(String) and do whatever they want
>> with them, which I guess could include serializing them.
>> 
>> Hope that helps!
>> -Dan
>> 
>> On Wed, Sep 11, 2019 at 10:27 AM Aaron Lindsey 
>> wrote:
>> 
>>> As part of a PR to add Micrometer timers for function executions
>>> , we implemented a decorator
>>> Function that wraps all registered non-internal functions and adds
>>> instrumentation. This PR is
>>> failing AnalyzeSerializablesJUnitTest.testSerializables because the
>>> decorator class is a new Serializable.
>>> 
>>> I'm not sure if it would be OK to add this class to excludedClasses.txt
>>> because I don't know whether this function will ever be serialized. If it
>>> will be serialized, then I'm concerned that this might break backwards
>>> compatibility because we're changing the serialized form of registered
>>> functions. If this is the case, then we could implement custom logic for
>>> serializing the decorator class which would replace its serialized form
>>> with the serialized form of the inner class. Again, I'm not sure if that
>>> would be necessary because I don't know the conditions under which a
>>> function would be serialized.
>>> 
>>> Could someone help me understand when functions would be persisted or
>> sent
>>> over the wire so I can determine if this change would break
>> compatibility?
>>> 
>>> Thanks,
>>> Aaron
>>> 
>> 



Re: Question about excluding serialized classes

2019-09-11 Thread Dan Smith
Yeah, I would expect that FunctionService.getFunction() would return the
same function object I registered with FunctionService.registerFunction.

-Dan

On Wed, Sep 11, 2019 at 12:01 PM Aaron Lindsey  wrote:

> Thanks for your response, Dan.
>
> The second scenario you mentioned (i.e. users calling
> FunctionService.getFunction(String)) worries me because our PR changes the
> FunctionService so they would now get back an instance of the decorator
> function instead of the specific instance they registered by calling
> FunctionService.registerFunction(Function). Therefore, any explicit casts
> to a Function implementation like (MyFunction)
> FunctionService.getFunction("MyFunction") would fail. Does that mean this
> be a breaking change? The FunctionService class does not specify that
> getFunction must return the same type function as the one passed to
> registerFunction, but I could see how users might be relying on that
> behavior since there is no other way to get a specific function type out of
> the FunctionService without doing a cast.
>
> - Aaron
>
>
> On Wed, Sep 11, 2019 at 10:52 AM Dan Smith  wrote:
>
> > Functions are serialized when you call Execution.execute(Function)
> instead
> > of Execution.execute(String). Invoking execute on a function object
> > serializes the function and executes it on the remote side. Functions
> > executed this way don't have be registered.
> >
> > Users can also get registered function objects directly from the function
> > service using FunctionService.getFunction(String) and do whatever they
> want
> > with them, which I guess could include serializing them.
> >
> > Hope that helps!
> > -Dan
> >
> > On Wed, Sep 11, 2019 at 10:27 AM Aaron Lindsey 
> > wrote:
> >
> > > As part of a PR to add Micrometer timers for function executions
> > > , we implemented a
> decorator
> > > Function that wraps all registered non-internal functions and adds
> > > instrumentation. This PR is
> > > failing AnalyzeSerializablesJUnitTest.testSerializables because the
> > > decorator class is a new Serializable.
> > >
> > > I'm not sure if it would be OK to add this class to excludedClasses.txt
> > > because I don't know whether this function will ever be serialized. If
> it
> > > will be serialized, then I'm concerned that this might break backwards
> > > compatibility because we're changing the serialized form of registered
> > > functions. If this is the case, then we could implement custom logic
> for
> > > serializing the decorator class which would replace its serialized form
> > > with the serialized form of the inner class. Again, I'm not sure if
> that
> > > would be necessary because I don't know the conditions under which a
> > > function would be serialized.
> > >
> > > Could someone help me understand when functions would be persisted or
> > sent
> > > over the wire so I can determine if this change would break
> > compatibility?
> > >
> > > Thanks,
> > > Aaron
> > >
> >
>


Re: Question about excluding serialized classes

2019-09-11 Thread Aaron Lindsey
Thanks for your response, Dan.

The second scenario you mentioned (i.e. users calling
FunctionService.getFunction(String)) worries me because our PR changes the
FunctionService so they would now get back an instance of the decorator
function instead of the specific instance they registered by calling
FunctionService.registerFunction(Function). Therefore, any explicit casts
to a Function implementation like (MyFunction)
FunctionService.getFunction("MyFunction") would fail. Does that mean this
would be a breaking change? The FunctionService class does not specify that
getFunction must return the same type of function as the one passed to
registerFunction, but I could see how users might be relying on that
behavior since there is no other way to get a specific function type out of
the FunctionService without doing a cast.
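
To make the concern concrete, here is a minimal, self-contained sketch of the failure mode. Fn, MyFunction, TimingFn and Registry are hypothetical stand-ins invented for illustration; they are not Geode's Function or FunctionService, and the real PR may wrap functions differently.

```java
import java.util.HashMap;
import java.util.Map;

class CastDemo {
  // Returns true if casting the looked-up function back to MyFunction fails.
  static boolean castFails() {
    Registry.register(new MyFunction());
    try {
      MyFunction f = (MyFunction) Registry.get("MyFunction");
      return f == null;
    } catch (ClassCastException e) {
      // The registry handed back the decorator, not the registered instance.
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("cast fails: " + castFails());
  }
}

// Hypothetical function interface (not Geode's Function).
interface Fn {
  String getId();
}

class MyFunction implements Fn {
  public String getId() {
    return "MyFunction";
  }
}

// Hypothetical instrumenting decorator that delegates to the wrapped function.
class TimingFn implements Fn {
  private final Fn inner;

  TimingFn(Fn inner) {
    this.inner = inner;
  }

  public String getId() {
    return inner.getId();
  }
}

// Hypothetical registry that stores the decorator instead of the original instance.
class Registry {
  private static final Map<String, Fn> FUNCTIONS = new HashMap<>();

  static void register(Fn fn) {
    FUNCTIONS.put(fn.getId(), new TimingFn(fn));
  }

  static Fn get(String id) {
    return FUNCTIONS.get(id);
  }
}
```

Callers that only use the interface type are unaffected in this sketch; only the downcast to the concrete class breaks.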

- Aaron


On Wed, Sep 11, 2019 at 10:52 AM Dan Smith  wrote:

> Functions are serialized when you call Execution.execute(Function) instead
> of Execution.execute(String). Invoking execute on a function object
> serializes the function and executes it on the remote side. Functions
> executed this way don't have to be registered.
>
> Users can also get registered function objects directly from the function
> service using FunctionService.getFunction(String) and do whatever they want
> with them, which I guess could include serializing them.
>
> Hope that helps!
> -Dan
>
> On Wed, Sep 11, 2019 at 10:27 AM Aaron Lindsey 
> wrote:
>
> > As part of a PR to add Micrometer timers for function executions
> > , we implemented a decorator
> > Function that wraps all registered non-internal functions and adds
> > instrumentation. This PR is
> > failing AnalyzeSerializablesJUnitTest.testSerializables because the
> > decorator class is a new Serializable.
> >
> > I'm not sure if it would be OK to add this class to excludedClasses.txt
> > because I don't know whether this function will ever be serialized. If it
> > will be serialized, then I'm concerned that this might break backwards
> > compatibility because we're changing the serialized form of registered
> > functions. If this is the case, then we could implement custom logic for
> > serializing the decorator class which would replace its serialized form
> > with the serialized form of the inner class. Again, I'm not sure if that
> > would be necessary because I don't know the conditions under which a
> > function would be serialized.
> >
> > Could someone help me understand when functions would be persisted or
> sent
> > over the wire so I can determine if this change would break
> compatibility?
> >
> > Thanks,
> > Aaron
> >
>


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-11 Thread Dan Smith
> The idea I am working with at the moment that Kirk pointed me at was to
use the pid file in the directory as indicator. Once that file disappears
the server is stopped.

How will this work if stop server --member is invoked from a different
machine than the member that is being stopped?

-Dan

On Wed, Sep 11, 2019 at 10:28 AM Mark Hanson  wrote:

> The idea I am working with at the moment that Kirk pointed me at was to
> use the pid file in the directory as indicator. Once that file disappears
> the server is stopped.
>
> That seems to work in my testing.
>
> Thoughts?
>
> Thanks,
> Mark
>
> > On Sep 11, 2019, at 10:23 AM, Dan Smith  wrote:
> >
> > It does seem like we should make stop synchronous, or at least make start
> > wait for the old process to die as Bruce suggested. Otherwise it is
> > difficult for someone to script the restart of a server.
> >
> > Looking at the code, it does look like gfsh stop is asynchronous. There
> are
> > multiple ways to stop a server:
> > * gfsh stop --dir - it looks like we write out some stop file and return
> > immediately. Or, if we can connect over JMX, we invoke the
> > MemberMBean.shutDownMember method, which launches a thread to close the
> > cache, which is also asynchronous.
> > * gfsh stop --pid - this seems to be similar to --dir
> > * With a member name - this appears to go to the
> MemberMBean.shutDownMember
> > method as well.
> >
> > I think one issue is that the JMX methods to stopping the server may be
> > hard to ensure the process is really gone, because they can be invoked
> > remotely. That may be why they are asynchronous - they need to return
> > something to the caller before shutting down. So maybe Bruce's suggestion
> > is better.
> >
> > As Jens pointed out - tests should generally just use port 0 for servers.
> >
> > -Dan
> >
> > On Wed, Sep 11, 2019 at 8:46 AM Jens Deppe  wrote:
> >
> >> To circle back to the original test failure that prompted this
> discussion -
> >> the failing test was getting intermittent bind exceptions on subsequent
> >> server restarts.
> >>
> >> I believe it's quite likely that a process' ports will remain
> unavailable
> >> even after it is gone (I'm not sure if we create listening sockets with
> >> SO_REUSEADDR). So, as to John's comment that gfsh is already
> synchronous, I
> >> don't think that adding extra functionality to gfsh, to ultimately just
> >> wait longer before exiting, is really solving the problem. I'd suggest
> you
> >> adjust the tests to always start servers with `--server-port=0` so that
> >> there are no port conflicts and let the OS handle it.
> >>
> >> --Jens
> >>
> >> On Wed, Sep 11, 2019 at 8:17 AM Bruce Schuchardt <
> bschucha...@pivotal.io>
> >> wrote:
> >>
> >>> Blocking or non-blocking, I don't have a strong opinion.  What I'd
> >>> really like to have gfsh ensure, though, is that no-one is able to
> start
> >>> a new instance of a server while the old process is still around.
> Maybe
> >>> the PID file is the way to do that.
> >>>
> >>> On 9/10/19 3:08 PM, Mark Hanson wrote:
> >>>> Hello All,
> >>>>
> >>>> I would like to propose that we make the gfsh “stop server” command
> >>> synchronous. It is causing some issues with some tests as the rest of
> the
> >>> calls are blocking. Stop on the other hand immediately returns by
> >>> comparison.
> >>>> This causes issues as shown in GEODE-7017 specifically.
> >>>>
> >>>> GEODE:7017 CI failure:
> >>> org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest >
> >>> startupReportsOnlineOnlyAfterRedundancyRestored
> >>>> https://issues.apache.org/jira/browse/GEODE-7017 <
> >>> https://issues.apache.org/jira/browse/GEODE-7017>
> >>>>
> >>>>
> >>>> What do people think?
> >>>>
> >>>> Thanks,
> >>>> Mark
> >>>
> >>
>
>


Re: Question about excluding serialized classes

2019-09-11 Thread Dan Smith
Functions are serialized when you call Execution.execute(Function) instead
of Execution.execute(String). Invoking execute on a function object
serializes the function and executes it on the remote side. Functions
executed this way don't have to be registered.

Users can also get registered function objects directly from the function
service using FunctionService.getFunction(String) and do whatever they want
with them, which I guess could include serializing them.

Hope that helps!
-Dan

On Wed, Sep 11, 2019 at 10:27 AM Aaron Lindsey  wrote:

> As part of a PR to add Micrometer timers for function executions
> , we implemented a decorator
> Function that wraps all registered non-internal functions and adds
> instrumentation. This PR is
> failing AnalyzeSerializablesJUnitTest.testSerializables because the
> decorator class is a new Serializable.
>
> I'm not sure if it would be OK to add this class to excludedClasses.txt
> because I don't know whether this function will ever be serialized. If it
> will be serialized, then I'm concerned that this might break backwards
> compatibility because we're changing the serialized form of registered
> functions. If this is the case, then we could implement custom logic for
> serializing the decorator class which would replace its serialized form
> with the serialized form of the inner class. Again, I'm not sure if that
> would be necessary because I don't know the conditions under which a
> function would be serialized.
>
> Could someone help me understand when functions would be persisted or sent
> over the wire so I can determine if this change would break compatibility?
>
> Thanks,
> Aaron
>


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-11 Thread John Blum
+1 to Bruce's comments as well.

This is exactly the kind of thing I needed to do (handle) inside of the *Spring
Test for Apache Geode* (STDG) project from a framework perspective, to
ensure that other projects relying on STDG (e.g. SBDG, SSDG) for their
integration testing purposes (e.g. client/server integration test cases)
with Apache Geode, were reliable, in an automated fashion.  This meant
ensuring things like the Apache Geode process is completely and properly
shut down (using the appropriate checks) and all resources used by a cache
instance are released before another test class is allowed to continue and
do its work.  FWIW.

-j

On Wed, Sep 11, 2019 at 10:28 AM Mark Hanson  wrote:

> The idea I am working with at the moment that Kirk pointed me at was to
> use the pid file in the directory as indicator. Once that file disappears
> the server is stopped.
>
> That seems to work in my testing.
>
> Thoughts?
>
> Thanks,
> Mark
>
> > On Sep 11, 2019, at 10:23 AM, Dan Smith  wrote:
> >
> > It does seem like we should make stop synchronous, or at least make start
> > wait for the old process to die as Bruce suggested. Otherwise it is
> > difficult for someone to script the restart of a server.
> >
> > Looking at the code, it does look like gfsh stop is asynchronous. There
> are
> > multiple ways to stop a server:
> > * gfsh stop --dir - it looks like we write out some stop file and return
> > immediately. Or, if we can connect over JMX, we invoke the
> > MemberMBean.shutDownMember method, which launches a thread to close the
> > cache, which is also asynchronous.
> > * gfsh stop --pid - this seems to be similar to --dir
> > * With a member name - this appears to go to the
> MemberMBean.shutDownMember
> > method as well.
> >
> > I think one issue is that the JMX methods to stopping the server may be
> > hard to ensure the process is really gone, because they can be invoked
> > remotely. That may be why they are asynchronous - they need to return
> > something to the caller before shutting down. So maybe Bruce's suggestion
> > is better.
> >
> > As Jens pointed out - tests should generally just use port 0 for servers.
> >
> > -Dan
> >
> > On Wed, Sep 11, 2019 at 8:46 AM Jens Deppe  wrote:
> >
> >> To circle back to the original test failure that prompted this
> discussion -
> >> the failing test was getting intermittent bind exceptions on subsequent
> >> server restarts.
> >>
> >> I believe it's quite likely that a process' ports will remain
> unavailable
> >> even after it is gone (I'm not sure if we create listening sockets with
> >> SO_REUSEADDR). So, as to John's comment that gfsh is already
> synchronous, I
> >> don't think that adding extra functionality to gfsh, to ultimately just
> >> wait longer before exiting, is really solving the problem. I'd suggest
> you
> >> adjust the tests to always start servers with `--server-port=0` so that
> >> there are no port conflicts and let the OS handle it.
> >>
> >> --Jens
> >>
> >> On Wed, Sep 11, 2019 at 8:17 AM Bruce Schuchardt <
> bschucha...@pivotal.io>
> >> wrote:
> >>
> >>> Blocking or non-blocking, I don't have a strong opinion.  What I'd
> >>> really like to have gfsh ensure, though, is that no-one is able to
> start
> >>> a new instance of a server while the old process is still around.
> Maybe
> >>> the PID file is the way to do that.
> >>>
> >>> On 9/10/19 3:08 PM, Mark Hanson wrote:
> >>>> Hello All,
> >>>>
> >>>> I would like to propose that we make the gfsh “stop server” command
> >>> synchronous. It is causing some issues with some tests as the rest of
> the
> >>> calls are blocking. Stop on the other hand immediately returns by
> >>> comparison.
> >>>> This causes issues as shown in GEODE-7017 specifically.
> >>>>
> >>>> GEODE:7017 CI failure:
> >>> org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest >
> >>> startupReportsOnlineOnlyAfterRedundancyRestored
> >>>> https://issues.apache.org/jira/browse/GEODE-7017 <
> >>> https://issues.apache.org/jira/browse/GEODE-7017>
> >>>>
> >>>>
> >>>> What do people think?
> >>>>
> >>>> Thanks,
> >>>> Mark
> >>>
> >>
>
>

-- 
-John
john.blum10101 (skype)


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-11 Thread Mark Hanson
The idea I am working with at the moment, which Kirk pointed me to, is to use
the pid file in the directory as an indicator. Once that file disappears, the
server is stopped.

That seems to work in my testing.
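
A rough sketch of what such a wait could look like, assuming the caller can see the member's working directory. The class name, method name, 50 ms poll interval and the default "server.pid" path are all hypothetical choices for illustration; this is not gfsh's actual implementation.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;

class PidFileWait {
  /**
   * Polls until pidFile no longer exists or the timeout elapses.
   * Returns true if the file disappeared in time, false on timeout or interrupt.
   */
  static boolean awaitPidFileGone(Path pidFile, Duration timeout) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (Files.exists(pidFile)) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // still present after the timeout
      }
      try {
        Thread.sleep(50); // hypothetical poll interval
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // The path is an assumption; pass the real pid file location as the first argument.
    Path pidFile = Paths.get(args.length > 0 ? args[0] : "server.pid");
    boolean stopped = awaitPidFileGone(pidFile, Duration.ofSeconds(30));
    System.out.println(stopped ? "pid file gone" : "timed out waiting for pid file");
  }
}
```

The timeout keeps a caller from blocking forever if the file is never removed.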

Thoughts?

Thanks,
Mark

> On Sep 11, 2019, at 10:23 AM, Dan Smith  wrote:
> 
> It does seem like we should make stop synchronous, or at least make start
> wait for the old process to die as Bruce suggested. Otherwise it is
> difficult for someone to script the restart of a server.
> 
> Looking at the code, it does look like gfsh stop is asynchronous. There are
> multiple ways to stop a server:
> * gfsh stop --dir - it looks like we write out some stop file and return
> immediately. Or, if we can connect over JMX, we invoke the
> MemberMBean.shutDownMember method, which launches a thread to close the
> cache, which is also asynchronous.
> * gfsh stop --pid - this seems to be similar to --dir
> * With a member name - this appears to go to the MemberMBean.shutDownMember
> method as well.
> 
> I think one issue is that the JMX methods to stopping the server may be
> hard to ensure the process is really gone, because they can be invoked
> remotely. That may be why they are asynchronous - they need to return
> something to the caller before shutting down. So maybe Bruce's suggestion
> is better.
> 
> As Jens pointed out - tests should generally just use port 0 for servers.
> 
> -Dan
> 
> On Wed, Sep 11, 2019 at 8:46 AM Jens Deppe  wrote:
> 
>> To circle back to the original test failure that prompted this discussion -
>> the failing test was getting intermittent bind exceptions on subsequent
>> server restarts.
>> 
>> I believe it's quite likely that a process' ports will remain unavailable
>> even after it is gone (I'm not sure if we create listening sockets with
>> SO_REUSEADDR). So, as to John's comment that gfsh is already synchronous, I
>> don't think that adding extra functionality to gfsh, to ultimately just
>> wait longer before exiting, is really solving the problem. I'd suggest you
>> adjust the tests to always start servers with `--server-port=0` so that
>> there are no port conflicts and let the OS handle it.
>> 
>> --Jens
>> 
>> On Wed, Sep 11, 2019 at 8:17 AM Bruce Schuchardt 
>> wrote:
>> 
>>> Blocking or non-blocking, I don't have a strong opinion.  What I'd
>>> really like to have gfsh ensure, though, is that no-one is able to start
>>> a new instance of a server while the old process is still around.  Maybe
>>> the PID file is the way to do that.
>>> 
>>> On 9/10/19 3:08 PM, Mark Hanson wrote:
>>>> Hello All,
>>>>
>>>> I would like to propose that we make the gfsh “stop server” command
>>> synchronous. It is causing some issues with some tests as the rest of the
>>> calls are blocking. Stop on the other hand immediately returns by
>>> comparison.
>>>> This causes issues as shown in GEODE-7017 specifically.
>>>>
>>>> GEODE:7017 CI failure:
>>> org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest >
>>> startupReportsOnlineOnlyAfterRedundancyRestored
>>>> https://issues.apache.org/jira/browse/GEODE-7017 <
>>> https://issues.apache.org/jira/browse/GEODE-7017>
>>>>
>>>>
>>>> What do people think?
>>>>
>>>> Thanks,
>>>> Mark
>>> 
>> 



Question about excluding serialized classes

2019-09-11 Thread Aaron Lindsey
As part of a PR to add Micrometer timers for function executions, we
implemented a decorator Function that wraps all registered non-internal
functions and adds instrumentation. This PR is failing
AnalyzeSerializablesJUnitTest.testSerializables because the decorator class
is a new Serializable.

I'm not sure if it would be OK to add this class to excludedClasses.txt
because I don't know whether this function will ever be serialized. If it
will be serialized, then I'm concerned that this might break backwards
compatibility because we're changing the serialized form of registered
functions. If this is the case, then we could implement custom logic for
serializing the decorator class which would replace its serialized form
with the serialized form of the inner class. Again, I'm not sure if that
would be necessary because I don't know the conditions under which a
function would be serialized.
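
For reference, with standard java.io serialization the "replace the serialized form with the inner class" idea can be sketched with a writeReplace hook. SimpleFunction, UserFunction and TimedFunction are hypothetical names for illustration, and this says nothing about how Geode's own serialization paths would treat the decorator.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class WriteReplaceDemo {
  // Serializes and then deserializes an object using standard Java serialization.
  static Object roundTrip(Object o) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(o);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Object result = roundTrip(new TimedFunction(new UserFunction()));
    System.out.println(result.getClass().getSimpleName());
  }
}

// Hypothetical function interface (not Geode's Function).
interface SimpleFunction extends Serializable {
  String getId();
}

class UserFunction implements SimpleFunction {
  private static final long serialVersionUID = 1L;

  public String getId() {
    return "UserFunction";
  }
}

// Hypothetical decorator whose serialized form is that of the wrapped function.
class TimedFunction implements SimpleFunction {
  private static final long serialVersionUID = 1L;
  private final SimpleFunction inner;

  TimedFunction(SimpleFunction inner) {
    this.inner = inner;
  }

  public String getId() {
    return inner.getId();
  }

  // Java serialization calls this hook and writes the returned object instead of this one.
  private Object writeReplace() {
    return inner;
  }
}
```

In this sketch the receiving side never sees the decorator class, so the bytes on the wire are the same as if the inner function had been serialized directly.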

Could someone help me understand when functions would be persisted or sent
over the wire so I can determine if this change would break compatibility?

Thanks,
Aaron


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-11 Thread Dan Smith
It does seem like we should make stop synchronous, or at least make start
wait for the old process to die as Bruce suggested. Otherwise it is
difficult for someone to script the restart of a server.

Looking at the code, it does look like gfsh stop is asynchronous. There are
multiple ways to stop a server:
* gfsh stop --dir - it looks like we write out some stop file and return
immediately. Or, if we can connect over JMX, we invoke the
MemberMBean.shutDownMember method, which launches a thread to close the
cache, which is also asynchronous.
* gfsh stop --pid - this seems to be similar to --dir
* With a member name - this appears to go to the MemberMBean.shutDownMember
method as well.

I think one issue is that with the JMX methods for stopping the server it may
be hard to ensure the process is really gone, because they can be invoked
remotely. That may be why they are asynchronous - they need to return
something to the caller before shutting down. So maybe Bruce's suggestion
is better.

As Jens pointed out - tests should generally just use port 0 for servers.

-Dan

On Wed, Sep 11, 2019 at 8:46 AM Jens Deppe  wrote:

> To circle back to the original test failure that prompted this discussion -
> the failing test was getting intermittent bind exceptions on subsequent
> server restarts.
>
> I believe it's quite likely that a process' ports will remain unavailable
> even after it is gone (I'm not sure if we create listening sockets with
> SO_REUSEADDR). So, as to John's comment that gfsh is already synchronous, I
> don't think that adding extra functionality to gfsh, to ultimately just
> wait longer before exiting, is really solving the problem. I'd suggest you
> adjust the tests to always start servers with `--server-port=0` so that
> there are no port conflicts and let the OS handle it.
>
> --Jens
>
> On Wed, Sep 11, 2019 at 8:17 AM Bruce Schuchardt 
> wrote:
>
> > Blocking or non-blocking, I don't have a strong opinion.  What I'd
> > really like to have gfsh ensure, though, is that no-one is able to start
> > a new instance of a server while the old process is still around.  Maybe
> > the PID file is the way to do that.
> >
> > On 9/10/19 3:08 PM, Mark Hanson wrote:
> > > Hello All,
> > >
> > > I would like to propose that we make the gfsh “stop server” command
> > synchronous. It is causing some issues with some tests as the rest of the
> > calls are blocking. Stop on the other hand immediately returns by
> > comparison.
> > > This causes issues as shown in GEODE-7017 specifically.
> > >
> > > GEODE:7017 CI failure:
> > org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest >
> > startupReportsOnlineOnlyAfterRedundancyRestored
> > > https://issues.apache.org/jira/browse/GEODE-7017 <
> > https://issues.apache.org/jira/browse/GEODE-7017>
> > >
> > >
> > > What do people think?
> > >
> > > Thanks,
> > > Mark
> >
>


Re: resource manager requirements & recommendations

2019-09-11 Thread Anthony Baker
The challenge with designing a good approach for managing heap use in Java is 
that we *can’t* know how much of the current heap use is really garbage.  That 
means that it can be really easy to evict too much or too little data.

With the CMS engine there are tuning parameters like occupancy fraction that 
you can set to match the eviction threshold.  This leads to a fairly 
predictable approach to managing heap memory.  With G1GC, the challenge is 
harder since the entire heap might fill up with garbage before any collections 
occur. 
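
As a small illustration of the measurement problem, the sketch below reads heap occupancy through the standard MemoryMXBean and compares it to a threshold. The class name, helper names and the 60% threshold are hypothetical, and this is not how Geode's resource manager is implemented. The point is that the "used" figure counts live objects and uncollected garbage together.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapThresholdDemo {
  // Percentage of the heap that is occupied (live objects plus uncollected garbage).
  static double usedPercent(long usedBytes, long maxBytes) {
    if (maxBytes <= 0) {
      throw new IllegalArgumentException("max heap must be positive");
    }
    return 100.0 * usedBytes / maxBytes;
  }

  // True when occupancy has reached the given threshold percentage.
  static boolean exceeds(long usedBytes, long maxBytes, double thresholdPercent) {
    return usedPercent(usedBytes, maxBytes) >= thresholdPercent;
  }

  public static void main(String[] args) {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    // getMax() may return -1 when the maximum is undefined; fall back to committed memory.
    long max = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
    double evictionThreshold = 60.0; // hypothetical threshold for illustration
    System.out.printf(
        "used %.1f%% of heap, at or above threshold: %b%n",
        usedPercent(heap.getUsed(), max),
        exceeds(heap.getUsed(), max, evictionThreshold));
  }
}
```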

Despite CMS being deprecated, I think it’s currently the best choice to control 
heap use in Geode.  As noted in JEP 291 [1] and subsequent discussion [2]:  
"For some applications CMS is a very good fit and might always outperform G1”.  
I also think we need to do more work in this area to make G1 perform as well as 
CMS.

Anthony

[1] http://openjdk.java.net/jeps/291
[2] http://mail.openjdk.java.net/pipermail/jdk9-dev/2017-April/thread.html#start

> On Sep 11, 2019, at 9:14 AM, Alberto Bustamante Reyes 
>  wrote:
> 
> Hi all,
> 
> Im interested on using the resource manager with G1 garbage collector. To 
> check if it is possible, I have been reading documentation about heap memory 
> management and I came up with some questions because there are some points in 
> the documentation where it is not clear for me if they are describing 
> requirements or recommendations.
> 
> As far as I understood, the requirements for using the Resource Manager are 
> only two:
> 
>  *   set the critical heap percentage
>  *   configure your GC properly in order to work before the eviction 
> procedure starts.
> 
> Am I right? There are three points in the documentation that makes me 
> question if I'm correct:
> 
> 
>  1.  The first chapter in 
> https://geode.apache.org/docs/guide/19/managing/heap_use/heap_management.html 
> states how to configure your GC for improving performance, but it only talks 
> about CMS, there is no info about other GCs.
>  2.  In the steps of how to configure ResourceManager, when talking about 
> tuning GC parameters, it talks again only about CMS.
>  3.  In the documentation of ResourceManager class, setCriticalHeapPercentage 
> method, it is stated the following:
> 
> Many virtual machine implementations have additional VM switches to control 
> the behavior of the garbage collector. We suggest that you investigate tuning 
> the garbage collector when using this type of eviction controller. A 
> collector that frequently collects is needed to keep our heap usage up to 
> date. In particular, on the Sun HotSpot VM, the -XX:+UseConcMarkSweepGC flag 
> needs to be set, [...]
> 
> So it seems that CMS is a requirement, but I have not found in the code any 
> limitation about using only CMS.
> 
> If my previous statement about the requirements is fine, then I suppose the 
> documentation needs a review to distinguish between generic requirements and 
> the CMS specific use case.
> 
> Other question that come to my mind is about the lack of info about G1. As 
> CMS is deprecated since Java 9, are there any plans to test and document G1 
> configuration?
> 
> Thanks in advance for your comments!
> 
> Alberto B.
> 
> 
> 
> 
> 
> 



resource manager requirements & recommendations

2019-09-11 Thread Alberto Bustamante Reyes
Hi all,

I'm interested in using the resource manager with the G1 garbage collector. To
check whether that is possible, I have been reading the documentation about
heap memory management, and I came up with some questions because at some
points in the documentation it is not clear to me whether they describe
requirements or recommendations.

As far as I understood, the requirements for using the Resource Manager are 
only two:

  *   set the critical heap percentage
  *   configure your GC properly in order to work before the eviction procedure 
starts.

Am I right? There are three points in the documentation that make me question 
whether I'm correct:


  1.  The first chapter in 
https://geode.apache.org/docs/guide/19/managing/heap_use/heap_management.html 
states how to configure your GC for improving performance, but it only talks 
about CMS, there is no info about other GCs.
  2.  In the steps of how to configure ResourceManager, when talking about 
tuning GC parameters, it talks again only about CMS.
  3.  In the documentation of ResourceManager class, setCriticalHeapPercentage 
method, it is stated the following:

Many virtual machine implementations have additional VM switches to control the 
behavior of the garbage collector. We suggest that you investigate tuning the 
garbage collector when using this type of eviction controller. A collector that 
frequently collects is needed to keep our heap usage up to date. In particular, 
on the Sun HotSpot VM, the -XX:+UseConcMarkSweepGC flag needs to be set, [...]

So it seems that CMS is a requirement, but I have not found in the code any 
limitation about using only CMS.

If my previous statement about the requirements is fine, then I suppose the 
documentation needs a review to distinguish between generic requirements and 
the CMS specific use case.

Another question that comes to mind is about the lack of info about G1. As CMS 
has been deprecated since Java 9, are there any plans to test and document G1 
configuration?

Thanks in advance for your comments!

Alberto B.








Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-11 Thread Jens Deppe
To circle back to the original test failure that prompted this discussion -
the failing test was getting intermittent bind exceptions on subsequent
server restarts.

I believe it's quite likely that a process' ports will remain unavailable
even after it is gone (I'm not sure if we create listening sockets with
SO_REUSEADDR). So, as to John's comment that gfsh is already synchronous, I
don't think that adding extra functionality to gfsh, to ultimately just
wait longer before exiting, is really solving the problem. I'd suggest you
adjust the tests to always start servers with `--server-port=0` so that
there are no port conflicts and let the OS handle it.
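
For anyone unfamiliar with what port 0 does at the socket level, here is a minimal plain-Java sketch. The class and method names are made up for illustration, and this does not show how Geode itself creates its listening sockets.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class EphemeralPortDemo {
  // Binds a listening socket to port 0 so the OS picks a free port, and returns that port.
  static int bindToFreePort() {
    try (ServerSocket socket = new ServerSocket()) {
      // SO_REUSEADDR has to be set before bind to take effect.
      socket.setReuseAddress(true);
      socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("OS assigned port " + bindToFreePort());
  }
}
```

A real server would keep the socket open and report the assigned port; the sketch closes it right away, so the number it prints is only a demonstration.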

--Jens

On Wed, Sep 11, 2019 at 8:17 AM Bruce Schuchardt 
wrote:

> Blocking or non-blocking, I don't have a strong opinion.  What I'd
> really like to have gfsh ensure, though, is that no-one is able to start
> a new instance of a server while the old process is still around.  Maybe
> the PID file is the way to do that.
>
> On 9/10/19 3:08 PM, Mark Hanson wrote:
> > Hello All,
> >
> > I would like to propose that we make the gfsh “stop server” command
> synchronous. It is causing some issues with some tests as the rest of the
> calls are blocking. Stop on the other hand immediately returns by
> comparison.
> > This causes issues as shown in GEODE-7017 specifically.
> >
> > GEODE:7017 CI failure:
> org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest >
> startupReportsOnlineOnlyAfterRedundancyRestored
> > https://issues.apache.org/jira/browse/GEODE-7017 <
> https://issues.apache.org/jira/browse/GEODE-7017>
> >
> >
> > What do people think?
> >
> > Thanks,
> > Mark
>


Re: [Proposal] Make gfsh "stop server" command synchronous

2019-09-11 Thread Bruce Schuchardt
Blocking or non-blocking, I don't have a strong opinion.  What I'd 
really like to have gfsh ensure, though, is that no-one is able to start 
a new instance of a server while the old process is still around.  Maybe 
the PID file is the way to do that.


On 9/10/19 3:08 PM, Mark Hanson wrote:

Hello All,

I would like to propose that we make the gfsh “stop server” command 
synchronous. It is causing some issues with some tests as the rest of the calls 
are blocking. Stop on the other hand immediately returns by comparison.
This causes issues as shown in GEODE-7017 specifically.

GEODE-7017 CI failure: 
org.apache.geode.launchers.ServerStartupValueRecoveryNotificationTest > 
startupReportsOnlineOnlyAfterRedundancyRestored
https://issues.apache.org/jira/browse/GEODE-7017 



What do people think?

Thanks,
Mark