from:"Ryan McMahon"

Re: [VOTE] Adding new AEQ feature to release/1.10.0

2019-09-13 Thread Ryan McMahon

+1

On Fri, Sep 13, 2019 at 3:58 PM Donal Evans  wrote:

> +1
>
> On Fri, Sep 13, 2019 at 3:37 PM Benjamin Ross  wrote:
>
> > +1
> >
> > On Fri, Sep 13, 2019 at 3:25 PM Anilkumar Gingade 
> > wrote:
> >
> > > +1. This is needed for spring data-geode; whose upcoming release is
> based
> > > on older geode version.
> > >
> > > -Anil.
> > >
> > >
> > > On Fri, Sep 13, 2019 at 3:23 PM Nabarun Nag  wrote:
> > >
> > > > Hi Geode Community ,
> > > >
> > > > [GEODE-7121]
> > > >
> > > > I would like to include the new feature of creating AEQs with a
> paused
> > > > event processor to the release 1.10 branch. This also includes the
> > > feature
> > > > to resume the AEQ at a later point in time.
> > > > This feature includes addition of new/modified APIs and gfsh
> commands.
> > > >
> > > > [All details about this feature has been discussed in a previous
> > discuss
> > > > thread]
> > > >
> > > > These are the commits that needs to be in release 1.10.0 branch.
> > > > f6e11084daa30791f7bbf9a8187f6d1bc9c4b91a
> > > > 615d3399d24810126a6d57b5163f7afcd06366f7
> > > > 1440a95e266e671679a623f93865c5e7e683244f
> > > > 42e07dc9054794657acb40c292f3af74b79a1ea6
> > > > e1f200e2f9e77e986d250fde3848dc004b26a7c2
> > > > 5f70160fba08a06c7e1fc48c7099e63dd1a0502b
> > > > 0645446ec626bc351a2c881e4df6a4ae2e75fbfc
> > > > 575c6bac115112df1e84455b052566c75764b0be
> > > > 3d9627ff16443f4aa513a67bcc284e68953aff8a
> > > > ea22e72916f8e34455800d347690e483727f9bf5
> > > > 8d26d595f5fb94ff703116eb91bb747e9ba7f536
> > > >
> > > > Will create a PR ASAP.
> > > >
> > > > Regards
> > > > Nabarun Nag
> > > >
> > >
> >
>

Re: [VOTE] Apache Geode 1.9.1 RC2

2019-08-29 Thread Ryan McMahon

+1

On Thu, Aug 29, 2019 at 5:11 PM Kirk Lund  wrote:

> +1
>
> On Thu, Aug 29, 2019 at 5:02 PM Owen Nichols  wrote:
>
> > Hello Geode dev community,
> >
> > This is a release candidate for Apache Geode, version 1.9.1.RC2.
> > Thanks to all the community members for their contributions to this
> > release!
> >
> > Please do a review and give your feedback. The deadline is 3PM PST Wed,
> > September 04 2019.
> > Release notes can be found at:
> >
> https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.1
> > <
> >
> https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.1
> >
> >
> >
> > Please note that we are voting upon the source tags: rel/v1.9.1.RC2
> >
> > Apache Geode:
> > https://github.com/apache/geode/tree/rel/v1.9.1.RC2 <
> > https://github.com/apache/geode/tree/rel/v1.9.1.RC2>
> > Apache Geode examples:
> > https://github.com/apache/geode-examples/tree/rel/v1.9.1.RC2 <
> > https://github.com/apache/geode-examples/tree/rel/v1.9.1.RC2>
> > Apache Geode native:
> > https://github.com/apache/geode-native/tree/rel/v1.9.1.RC2 <
> > https://github.com/apache/geode-native/tree/rel/v1.9.1.RC2>
> >
> > Source and binary files:
> > https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC2/ <
> > https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC2/>
> >
> > Maven staging repo:
> > https://repository.apache.org/content/repositories/orgapachegeode-1056 <
> > https://repository.apache.org/content/repositories/orgapachegeode-1056>
> >
> > Geode's KEYS file containing PGP keys we use to sign the release:
> > https://github.com/apache/geode/blob/develop/KEYS <
> > https://github.com/apache/geode/blob/develop/KEYS>
> >
> > PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=
> > https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC2
> > -PgeodeRepositoryUrl=
> > https://repository.apache.org/content/repositories/orgapachegeode-1056
> > build runAll
> >
> > Regards
> > Owen & Kirk
>

Re: [DISCUSS] Pulling the current proposed 1.10 release until we can agree on develop being stable

2019-08-28 Thread Ryan McMahon

+1 to continuing with the release branch we have.

I don't think that anybody is aiming to cause instability on the develop
branch.  We are all designing, writing code, refactoring, and writing
exhaustive tests to the best of our ability to ensure high quality.  We
have to accept that there will sometimes be unforeseen consequences; that
is the reality of working on a complex enterprise product.  I think that it
is unfortunate on this particular release we have seen as many critical
issues as we have, but I do not think there is anything fundamentally wrong
with our workflow.  I think if anything, cutting the release early and
allowing ample time to identify issues is a very good thing.  Yes, in
theory the develop branch should be releasable at any given moment, but in
reality there is sometimes a feedback delay where issues arise after some
time.  I think our current process of cutting a branch and cherry-picking
fixes for issues found allows us to deal with this delayed feedback.

The issues here to me are more around the delayed feedback time (issues
found by very rare race conditions that don't arise often) and a lack of
existing test coverage for many of our features.  I do believe that
everyone who has been committing refactored or new code is doing their due
diligence to write tests, but it is not always easy to identify where test
coverage gaps exist.  We will of course continue to fill those gaps as we
find them.  Do you have other ideas on how to make develop more stable?
Again, I personally feel people are doing their best to write new tests and
review code - to me this is not a process problem (more like a tech
debt/ease of refactoring problem) but I'm definitely open to discussion.

Ryan

Re: Proposal to include GEODE-7088 and GEODE-7089 in 1.10.0

2019-08-26 Thread Ryan McMahon

The cherry-pick for GEODE-7088 is clean so I didn't open a PR for that
one.  The cherry-pick for GEODE-7089 required manual merging due to several
unrelated stats changes added to develop recently.  The PR to merge that
one into release/1.10.0 is here:
https://github.com/apache/geode/pull/3976

The original PR for this change when it was added to develop is here:
https://github.com/apache/geode/pull/3929

Thanks all for reading and considering.

Ryan

On Mon, Aug 26, 2019 at 4:22 PM Udo Kohlmeyer  wrote:

> Thank you Ryan,
>
> +1 for inclusion
>
> On 8/26/19 3:33 PM, Ryan McMahon wrote:
> > Udo,
> >
> > Here are inline answers to your questions:
> >
> > *Is this an existing issue?*
> >
> > Short answer - yes, but it has never been in a release version of Geode.
> > The leak was introduced as part of some changes to address handling
> > multiple concurrent registration requests for a given client on a single
> > server.  The issue is only seen if client registration fails which is not
> > particularly common, which is why we are only seeing it now after months
> of
> > testing.  The commit for that was introduced here on April 30th.
> >
> https://github.com/apache/geode/commit/bc2a2fa5af374cfedfba4dc1abe6cbc2a7b719c8
> > The ConcurrentModificationException issue (which ultimately causes the
> > registration to fail) was introduced on April 22nd here:
> >
> https://github.com/apache/geode/commit/afc311c04f6908a8f725834cdf2c49ce6e902b3f
> >
> >
> > *Why is it more critical to squeeze it into an existing (almost
> > release) version of Apache Geode?*
> >
> > Not sure I totally understand this question, but it is critical because
> the
> > leak can cause servers to crash due to OOM.  Again, because the problems
> > were introduced in late April and we haven't released Geode since then,
> so
> > I think it is very important to merge these fixes into 1.10.0 before we
> > release.
> >
> >
> >
> > *What guarantees do we have that this fix makes the application more
> stable
> > compared to adding another hidden issue, which we will discover in
> another
> > few weeks from now?*
> >
> > I added numerous tests for this scenario to ensure that the leak would
> > never happen regardless of the cause of a failed client registration.
> > There obviously is no way to 100% guarantee that there will be no issues
> > that arise from the fixes themselves, but our existing test coverage plus
> > the new tests I wrote should give us the confidence we need.
> >
> > Thanks,
> > Ryan
> >
> > On Mon, Aug 26, 2019 at 3:17 PM Udo Kohlmeyer  wrote:
> >
> >> In order to better understand this request:
> >>
> >> Is this an existing issue?
> >>
> >> Why is it more critical to squeeze it into an existing (almost release)
> >> version of Apache Geode?
> >>
> >> What guarantees do we have that this fix makes the application more
> >> stable compared to adding another hidden issue, which we will discover
> >> in another few weeks from now?
> >>
> >>
> >> --Udo
> >>
> >> On 8/26/19 3:10 PM, Ryan McMahon wrote:
> >>> Hi all,
> >>>
> >>> I would like to propose cherry-picking GEODE-7088 and GEODE-7089 to the
> >>> 1.10.0 release branch.  The two JIRAs are related to the same root
> >> problem,
> >>> which I would classify as critical.  We discovered a case where a
> failed
> >>> client registration could lead to a memory leak in a server, eventually
> >>> causing the server to crash due to lack of memory.
> >>>
> >>> The issue is instigated by a ConcurrentModificationException due to
> >>> iteration of a non-thread safe collection while it is being mutated
> >>> (GEODE-7088).  This exception occurs when the client's queue image is
> >> being
> >>> copied from one server to the next during client registration, and it
> >>> causes the client's registration to fail.  The client would likely
> >> succeed
> >>> if it retried registration with that same server, but if it registers
> >> with
> >>> a different server, we end up leaking events to the client's
> registration
> >>> queue on the original server (GEODE-7089).
> >>>
> >>> The fix for GEODE-7088 is to use thread-safe collections for interested
> >>> clients in client update messages.  The fix for GEODE-7089 is to always
> >>> drain and remove the registration queue regardless of success or
> failure.
> >>> Together, these fixes prevent the failed registrations and memory leak.
> >>>
> >>> The SHAs for the fixes and tests in develop are:
> >>>
> >>> GEODE-7088
> >>> - 174af1d23fb7e09eb2bc2fa55479df854850fadb
> >>> - 5bb753a8f4ff2886acd8e62d6f51fea58e37881d
> >>>
> >>> GEODE-7089
> >>> - 5d0153ad4adb1612a1083673f98b1982819a6589
> >>>
> >>> This proposal is to cherry-pick these commits to 1.10.0 release branch.
> >>>
> >>> Thanks,
> >>> Ryan McMahon
> >>>
>

Re: Proposal to include GEODE-7088 and GEODE-7089 in 1.10.0

2019-08-26 Thread Ryan McMahon

Udo,

Here are inline answers to your questions:

*Is this an existing issue?*

Short answer - yes, but it has never been in a release version of Geode.
The leak was introduced as part of some changes to address handling
multiple concurrent registration requests for a given client on a single
server.  The issue is only seen if client registration fails which is not
particularly common, which is why we are only seeing it now after months of
testing.  The commit for that was introduced here on April 30th.
https://github.com/apache/geode/commit/bc2a2fa5af374cfedfba4dc1abe6cbc2a7b719c8
The ConcurrentModificationException issue (which ultimately causes the
registration to fail) was introduced on April 22nd here:
https://github.com/apache/geode/commit/afc311c04f6908a8f725834cdf2c49ce6e902b3f

*Why is it more critical to squeeze it into an existing (almost
release) version of Apache Geode?*

Not sure I totally understand this question, but it is critical because the
leak can cause servers to crash due to OOM.  Again, because the problems
were introduced in late April and we haven't released Geode since then, so
I think it is very important to merge these fixes into 1.10.0 before we
release.

*What guarantees do we have that this fix makes the application more stable
compared to adding another hidden issue, which we will discover in another
few weeks from now?*

I added numerous tests for this scenario to ensure that the leak would
never happen regardless of the cause of a failed client registration.
There obviously is no way to 100% guarantee that there will be no issues
that arise from the fixes themselves, but our existing test coverage plus
the new tests I wrote should give us the confidence we need.

Thanks,
Ryan

On Mon, Aug 26, 2019 at 3:17 PM Udo Kohlmeyer  wrote:

> In order to better understand this request:
>
> Is this an existing issue?
>
> Why is it more critical to squeeze it into an existing (almost release)
> version of Apache Geode?
>
> What guarantees do we have that this fix makes the application more
> stable compared to adding another hidden issue, which we will discover
> in another few weeks from now?
>
>
> --Udo
>
> On 8/26/19 3:10 PM, Ryan McMahon wrote:
> > Hi all,
> >
> > I would like to propose cherry-picking GEODE-7088 and GEODE-7089 to the
> > 1.10.0 release branch.  The two JIRAs are related to the same root
> problem,
> > which I would classify as critical.  We discovered a case where a failed
> > client registration could lead to a memory leak in a server, eventually
> > causing the server to crash due to lack of memory.
> >
> > The issue is instigated by a ConcurrentModificationException due to
> > iteration of a non-thread safe collection while it is being mutated
> > (GEODE-7088).  This exception occurs when the client's queue image is
> being
> > copied from one server to the next during client registration, and it
> > causes the client's registration to fail.  The client would likely
> succeed
> > if it retried registration with that same server, but if it registers
> with
> > a different server, we end up leaking events to the client's registration
> > queue on the original server (GEODE-7089).
> >
> > The fix for GEODE-7088 is to use thread-safe collections for interested
> > clients in client update messages.  The fix for GEODE-7089 is to always
> > drain and remove the registration queue regardless of success or failure.
> > Together, these fixes prevent the failed registrations and memory leak.
> >
> > The SHAs for the fixes and tests in develop are:
> >
> > GEODE-7088
> > - 174af1d23fb7e09eb2bc2fa55479df854850fadb
> > - 5bb753a8f4ff2886acd8e62d6f51fea58e37881d
> >
> > GEODE-7089
> > - 5d0153ad4adb1612a1083673f98b1982819a6589
> >
> > This proposal is to cherry-pick these commits to 1.10.0 release branch.
> >
> > Thanks,
> > Ryan McMahon
> >
>

Proposal to include GEODE-7088 and GEODE-7089 in 1.10.0

2019-08-26 Thread Ryan McMahon

Hi all,

I would like to propose cherry-picking GEODE-7088 and GEODE-7089 to the
1.10.0 release branch.  The two JIRAs are related to the same root problem,
which I would classify as critical.  We discovered a case where a failed
client registration could lead to a memory leak in a server, eventually
causing the server to crash due to lack of memory.

The issue is instigated by a ConcurrentModificationException due to
iteration of a non-thread safe collection while it is being mutated
(GEODE-7088).  This exception occurs when the client's queue image is being
copied from one server to the next during client registration, and it
causes the client's registration to fail.  The client would likely succeed
if it retried registration with that same server, but if it registers with
a different server, we end up leaking events to the client's registration
queue on the original server (GEODE-7089).

The fix for GEODE-7088 is to use thread-safe collections for interested
clients in client update messages.  The fix for GEODE-7089 is to always
drain and remove the registration queue regardless of success or failure.
Together, these fixes prevent the failed registrations and memory leak.

The SHAs for the fixes and tests in develop are:

GEODE-7088
- 174af1d23fb7e09eb2bc2fa55479df854850fadb
- 5bb753a8f4ff2886acd8e62d6f51fea58e37881d

GEODE-7089
- 5d0153ad4adb1612a1083673f98b1982819a6589

This proposal is to cherry-pick these commits to 1.10.0 release branch.

Thanks,
Ryan McMahon

Re: [DISCUSS] Controlling event dispatch to AsyncEventListener (review by Aug 22)

2019-08-20 Thread Ryan McMahon

+1 to Juan's comment.  I wonder if it would make sense just to have a
variation on the existing create() method.  Either have a createPaused()
method or add an overload for create() that takes a startPaused boolean.
That will really drive home that the AsyncEventQueue will be created in a
paused state.  I personally would prefer the overload with extra parameter,
but I think either approach would be fine.

Thanks,
Ryan

On Tue, Aug 20, 2019 at 8:56 AM Juan José Ramos  wrote:

> Hello Anil,
>
> +1 for the proposed solution.
> I'd change the method name from *pauseEventDispatchToListener* to something
> more meaningful and understandable for our users, maybe *startPaused*?,
> *setManualStart* (as we currently have for the *GatewaySenderFactory*)?,
> *startWithEventDispatcherPaused*?.
> Best regards.
>
>
>
> On Sat, Aug 17, 2019 at 12:55 AM Anilkumar Gingade 
> wrote:
>
> > I have updated the wiki based on Dan's comment.
> > Changes with api:
> >
> > *On "AsyncEventQueueFactory" interface - *
> >
> > *AsyncEventQueueFactory pauseEventDispatchToListener();  *// This causes
> > AEQ to be created with paused state.
> >
> >
> >
> >
> > On Fri, Aug 16, 2019 at 4:36 PM Anilkumar Gingade 
> > wrote:
> >
> > > Dan,
> > >
> > > If you look into the API; the AEQ will be created with the pause state.
> > > The user (application) has to call resume to dispatch the events.
> > >
> > > It will be slightly different from GatewaySender behavior; where
> > > GatewaySender will be created with run mode and then application has to
> > > call pause on it. Here in this case AEQ will be created with paused
> > state.
> > >
> > > -Anil.
> > >
> > >
> > > On Fri, Aug 16, 2019 at 4:31 PM Dan Smith  wrote:
> > >
> > >> Hi Anil,
> > >>
> > >> While I like the idea of matching the API of GatewaySender, I'm not
> > sure I
> > >> see how this solves the problem. Is it required of the user to call
> > pause
> > >> on the AsyncEventQueue as soon as it is created? How would someone do
> > that
> > >> when creating AEQs with xml or cluster configuration? Maybe it would
> be
> > >> better to not dispatch any events until we are done creating all
> > regions?
> > >>
> > >> -Dan
> > >>
> > >> On Fri, Aug 16, 2019 at 2:31 PM Anilkumar Gingade <
> aging...@pivotal.io>
> > >> wrote:
> > >>
> > >> > Proposal to support controlling capability with event dispatch to
> > >> > AsyncEventQueue Listener.
> > >> >
> > >> > Wiki proposal page:
> > >> >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/GEODE/%5BDraft%5D+Controlling+event+dispatch+to+AsyncEventListener
> > >> >
> > >> > Here is the details from the wiki page:
> > >> > *Problem*
> > >> >
> > >> > *The Geode system requires AEQs to be configured before regions are
> > >> > created. If an AEQ listener is operating on a secondary region, this
> > >> could
> > >> > cause listener to operate on a region which is not yet created or
> > fully
> > >> > initialized (for region with co-located regions) which could result
> in
> > >> > missing events or dead-lock scenario between region (co-located
> > region)
> > >> > creation threads. This scenario is likely to happen during
> persistence
> > >> > recovery; when AEQs are created in the start, the recovered AEQ
> events
> > >> are
> > >> > dispatched immediately, thus invoking the AEQ listeners.*
> > >> > Anti-Goals
> > >> >
> > >> > None
> > >> > *Solution*
> > >> >
> > >> > *The proposed solution is to provide a way to control dispatching
> AEQ
> > >> > events to the AEQ Listeners, this could be done by adding "pause"
> and
> > >> > "resume" capability to the AEQ, which will allow application to
> decide
> > >> when
> > >> > to dispatch events to the listeners. *
> > >> >
> > >> >
> > >> > *The proposal is similar to existing "pause" and "resume" behavior
> on
> > >> the
> > >> > GatewaySender, on which the AEQ is based on (AEQ implementation is a
> > >> > wrapper around GatewaySender). *
> > >> > Changes and Additions to Public Interfaces
> > >> >
> > >> > *The proposed APIs are:*
> > >> >
> > >> > *On "AsyncEventQueueFactory" interface - *
> > >> >
> > >> > *AsyncEventQueue pauseEventDispatchToListener();*
> > >> >
> > >> > *On "AsyncEventQueue" interface -*
> > >> >
> > >> > *boolean resumeEventDispatchToListener(); **returns true or false if
> > the
> > >> > event dispatch is resumed successfully.*
> > >> >
> > >> >
> > >> > *The constraints on the pauseEventDispatchToListener() will remain
> > >> similar
> > >> > to as in "GatewaySender.pause()" :*
> > >> >
> > >> > "It should be kept in mind that the events will still be getting
> > queued
> > >> > into the queue. The scope of this operation is the VM on which it is
> > >> > invoked. In case the AEQ is parallel, the AEQ will be paused on
> > >> individual
> > >> > node where this API is called and the AEQ on other VM's can still
> > >> dispatch
> > >> > events. In case the AEQ is not parallel, and the running AEQ on
> which
> > >> this
> > >> > API is invoked is not primary then

Re: Hardcoded list of jars for Jetty tests

2019-08-19 Thread Ryan McMahon

Hi Kirk,

I’m curious if you’ve found an alternative so we don’t have to hard code
this list?  We actually recently ran into an issue where this test list was
updated due to missing dependencies, but our documentation was not updated
and therefore it was missing the new dependencies.  I currently have a PR
which includes a comment above this list indicating that the documentation
should always be updated when the list is updated.  Not an ideal solution
but it’s better than nothing.  Here is the PR:
https://github.com/apache/geode/pull/3886

Ryan

On Mon, Aug 19, 2019 at 3:48 PM Kirk Lund  wrote:

> PS: it's in TomcatInstall...
>
>   private static final String[] tomcatRequiredJars =
>   {"antlr", "commons-io", "commons-lang", "commons-validator",
> "fastutil", "geode-common",
>   "geode-core", "geode-log4j", "geode-management",
> "javax.transaction-api", "jgroups",
>   "log4j-api", "log4j-core", "log4j-jul", "micrometer",
> "shiro-core", "jetty-server",
>   "jetty-util", "jetty-http", "jetty-io"};
>
> On Mon, Aug 19, 2019 at 3:39 PM Kirk Lund  wrote:
>
> > Can anyone point me at the hardcoded list of jars for the Jetty and
> Tomcat
> > tests? I can't seem to find it this time despite grepping for several
> > things that should be in the list. Maybe it was changed since I last had
> to
> > look for it...
> >
> > I found the one in dunit ProcessManager:
> >
> >   dunitClasspath =
> >   removeModulesFromPath(dunitClasspath, "geode-core", "geode-cq",
> > "geode-common",
> >   "geode-json", "geode-log4j", "geode-lucene", "geode-wan");
> >
> > I would love to remove all of these hardcoded lists of jars... they're
> bad
> > news.
> >
> > By the way, the fact that we leave geode-dunit on the classpath (on
> > purpose) for tests that have JVMs running old versions of Geode is a
> > nightmare to anyone trying to move classes around for something like
> > geode-log4j. Please try to think of a different way to handle this.
> >
> >
>

Re: Unit tests are hanging?

2019-08-08 Thread Ryan McMahon

I have a PR up for this now.
https://github.com/apache/geode/pull/3900

It bumps the timeout to 20 minutes but also changes the CALL_STACK_TIMEOUT
to be 1140 seconds (19 minutes).  The latter configuration parameter
controls when we declare the task "hung" and dump stacks.  We were not
dumping stacks at all before these changes because the value was 1800
seconds which is well above the timeout of 10 minutes.  With my proposed
change we will dump stacks before the task times out so we will have
something to look at in the test artifacts to identify the hang.

Thanks,
Ryan

On Thu, Aug 8, 2019 at 11:28 AM Ryan McMahon  wrote:

> Looks like we have a general consensus from the community.  I'll go ahead
> and make a PR for the changes.
>
> Thanks,
> Ryan
>
> On Thu, Aug 8, 2019 at 11:03 AM Juan José Ramos  wrote:
>
>> +1
>>
>> On Thu, Aug 8, 2019 at 6:55 PM Kirk Lund  wrote:
>>
>> > +1
>> >
>> > On Thu, Aug 8, 2019 at 10:14 AM Dan Smith  wrote:
>> >
>> > > > With all that, I propose we permanently bump the timeouts on
>> UnitTestX
>> > > jobs
>> > > > across the board (main pipeline, PR pipeline, etc) from 10 to 20
>> > minutes
>> > > to
>> > > > be more tolerant of these types of degradations.
>> > > >
>> > >
>> > > +1
>> > >
>> > > -Dan
>> > >
>> >
>>
>>
>> --
>> Juan José Ramos Cassella
>> Senior Technical Support Engineer
>> Email: jra...@pivotal.io
>> Office#: +353 21 4238611
>> Mobile#: +353 87 2074066
>> After Hours Contact#: +1 877 477 2269
>> Office Hours: Mon - Thu 08:30 - 17:00 GMT. Fri 08:30 - 16:00 GMT
>> How to upload artifacts:
>> https://support.pivotal.io/hc/en-us/articles/204369073
>> How to escalate a ticket:
>> https://support.pivotal.io/hc/en-us/articles/203809556
>>
>> [image: support] <https://support.pivotal.io/> [image: twitter]
>> <https://twitter.com/pivotal> [image: linkedin]
>> <https://www.linkedin.com/company/3048967> [image: facebook]
>> <https://www.facebook.com/pivotalsoftware> [image: google plus]
>> <https://plus.google.com/+Pivotal> [image: youtube]
>> <https://www.youtube.com/playlist?list=PLAdzTan_eSPScpj2J50ErtzR9ANSzv3kl
>> >
>>
>

Re: Unit tests are hanging?

2019-08-08 Thread Ryan McMahon

Looks like we have a general consensus from the community.  I'll go ahead
and make a PR for the changes.

Thanks,
Ryan

On Thu, Aug 8, 2019 at 11:03 AM Juan José Ramos  wrote:

> +1
>
> On Thu, Aug 8, 2019 at 6:55 PM Kirk Lund  wrote:
>
> > +1
> >
> > On Thu, Aug 8, 2019 at 10:14 AM Dan Smith  wrote:
> >
> > > > With all that, I propose we permanently bump the timeouts on
> UnitTestX
> > > jobs
> > > > across the board (main pipeline, PR pipeline, etc) from 10 to 20
> > minutes
> > > to
> > > > be more tolerant of these types of degradations.
> > > >
> > >
> > > +1
> > >
> > > -Dan
> > >
> >
>
>
> --
> Juan José Ramos Cassella
> Senior Technical Support Engineer
> Email: jra...@pivotal.io
> Office#: +353 21 4238611
> Mobile#: +353 87 2074066
> After Hours Contact#: +1 877 477 2269
> Office Hours: Mon - Thu 08:30 - 17:00 GMT. Fri 08:30 - 16:00 GMT
> How to upload artifacts:
> https://support.pivotal.io/hc/en-us/articles/204369073
> How to escalate a ticket:
> https://support.pivotal.io/hc/en-us/articles/203809556
>
> [image: support]  [image: twitter]
>  [image: linkedin]
>  [image: facebook]
>  [image: google plus]
>  [image: youtube]
> 
>

Re: Another change for 1.10.0 release

2019-08-08 Thread Ryan McMahon

+1

On Thu, Aug 8, 2019 at 10:40 AM John Blum  wrote:

> +1 for Dan's changes.
>
> On Thu, Aug 8, 2019 at 10:28 AM Owen Nichols  wrote:
>
> > Hi Dan, thank you for bringing your concern.
> >
> > Our “critical fixes” rule allows critical fixes to be brought to the
> > release branch by proposal on the dev list [as you have just done].  If
> > there is consensus from the Geode community that this GEODE-7055 fix
> > satisfies the “critical fixes” rule, Dick or I will be happy to bring it
> to
> > the 1.10.0 release branch.
> >
> > -Owen
> >
> >
> > > On Aug 8, 2019, at 10:11 AM, Dan Smith  wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to get the fix for GEODE-7055 (Don't send failure replies
> from a
> > > P2P reader thread) into the 1.10.0 release branch.
> > >
> > > This bug was causing a hang on startup for users of the session
> > replication
> > > module that didn't put the session jars on the classpath of the
> locator.
> > > The hang doesn't happen with 1.9.0, so I'd I think we should make sure
> it
> > > won't happen with 1.10.0 as well.
> > >
> > > -Dan
> >
> >
>
> --
> -John
> john.blum10101 (skype)
>

Re: Unit tests are hanging?

2019-08-08 Thread Ryan McMahon

So we temporarily bumped the timeout from 10 minutes to 2 hours on the
UnitTestOpenJDK11 execute_tests Concourse task, originally with the
intention of logging into the heavy lifter to debug further.  However,
after doing that we see that the jobs are all succeeding in roughly the
same amount of time as they did before we started seeing these timeouts
(total job time 25-30 minutes, total execute_tests task time of 7.5ish
minutes).  This makes me think that we were seeing some sort of performance
degradation at the infrastructure level, as opposed to some recent change
to Geode which is causing the timeouts.

With all that, I propose we permanently bump the timeouts on UnitTestX jobs
across the board (main pipeline, PR pipeline, etc) from 10 to 20 minutes to
be more tolerant of these types of degradations.  Please let me know if
anyone is opposed to doing this.

Thanks,
Ryan

On Wed, Aug 7, 2019 at 11:39 AM Ryan McMahon  wrote:

> Still trying to identify the cause of the unit test hangs.stay tuned.
> For other people who might not know this, the reason the entire CI job
> fails rather than the hanging test is because we don't have any global
> test-level timeouts, so a hanging test will run up until the Concourse
> timeout is reached.  I wrote a story to explore adding test level timeouts
> so we can get actual feedback in the CI about which test is hanging without
> waiting for the whole job to timeout, downloading artifacts, looking at the
> stack dumps, etc etc.  I think we could have some reasonable global upper
> limit (30 minutes?) for any test.  We could always tune that as needed.
>
> https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7057?filter=allopenissues
>
> Ryan
>
> On Wed, Aug 7, 2019 at 11:21 AM Bruce Schuchardt 
> wrote:
>
>> Yeah, that test passed on my branch in unit tests and stress tests but
>> for some reason is hanging after the merge to develop. I've pushed an
>> @Ignore for the test that you should pick up.
>>
>> On 8/7/19 10:38 AM, Kirk Lund wrote:
>> > Yep, that's the same test I'm seeing in the callstacks of my dunit tgz
>> from
>> > concourse...
>> >
>> > Started @ 2019-08-07 07:25:18.494 +
>> > 2019-08-07 07:51:25.252 +
>> >   org.apache.geode.cache30.DistributedMulticastRegionDUnitTest
>> > testMulticastAfterReconnect
>> > Ended @ 2019-08-07 08:28:16.591 +
>> >
>> > Thanks for digging into it!
>> >
>> > On Wed, Aug 7, 2019 at 10:16 AM Ryan McMahon 
>> wrote:
>> >
>> >> I think the reflection and PowerMock warnings here are probably a red
>> >> herring.  We pulled down the artifacts and found that the DUnit job is
>> >> hanging due to stuck threads in a newer DUnit test.  I am not sure why
>> it
>> >> isn't failing the test but rather failing the entire job.  I believe
>> Bruce
>> >> Schuchart is going to dig into this a bit.  Analysis from Lynn
>> >> Hughes-Godfrey:
>> >>
>> >> So, I downloaded
>> >>
>> >>
>> >>
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/961
>> >>
>> >> to
>> >>
>> >>
>> >>
>> /Users/lhughesgodfrey/Downloads/distributedtestfiles-OpenJDK8-1.11.0-SNAPSHOT.0010/geode-core/build/distributedTest
>> >>
>> >> In that callstacks directory, you'll see dunit-hangs.txt ... and this
>> >> test has been running for over an hour ...
>> >>
>> >> ```
>> >>
>> >> Started @ 2019-08-07 07:54:10.272 +
>> >>
>> >> 2019-08-07 08:22:25.440 +
>> >> org.apache.geode.cache30.DistributedMulticastRegionDUnitTest
>> >> testMulticastAfterReconnect
>> >>
>> >> Ended @ 2019-08-07 08:58:50.243 +
>> >>
>> >> ```
>> >>
>> >> When I look at the callstacks (at the top level), I see this:
>> >>
>> >> ```
>> >>
>> >> "Pooled Waiting Message Processor 1" #213 daemon prio=5 os_prio=0
>> >> tid=0x7fc738024000 nid=0x488 waiting for monitor entry
>> >> [0x7fc7ec5f7000]
>> >>
>> >> java.lang.Thread.State: BLOCKED (on object monitor)
>> >>
>> >>  at
>> >>
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.startEventProcessing(GMSMembershipManager.java:1179)
>> >>
>> >>  - waiting to lock <0xe03a75e0> (a java.lan

Re: Unit tests are hanging?

2019-08-07 Thread Ryan McMahon

Still trying to identify the cause of the unit test hangs.stay tuned.
For other people who might not know this, the reason the entire CI job
fails rather than the hanging test is because we don't have any global
test-level timeouts, so a hanging test will run up until the Concourse
timeout is reached.  I wrote a story to explore adding test level timeouts
so we can get actual feedback in the CI about which test is hanging without
waiting for the whole job to timeout, downloading artifacts, looking at the
stack dumps, etc etc.  I think we could have some reasonable global upper
limit (30 minutes?) for any test.  We could always tune that as needed.
https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7057?filter=allopenissues

Ryan

On Wed, Aug 7, 2019 at 11:21 AM Bruce Schuchardt 
wrote:

> Yeah, that test passed on my branch in unit tests and stress tests but
> for some reason is hanging after the merge to develop. I've pushed an
> @Ignore for the test that you should pick up.
>
> On 8/7/19 10:38 AM, Kirk Lund wrote:
> > Yep, that's the same test I'm seeing in the callstacks of my dunit tgz
> from
> > concourse...
> >
> > Started @ 2019-08-07 07:25:18.494 +
> > 2019-08-07 07:51:25.252 +
> >   org.apache.geode.cache30.DistributedMulticastRegionDUnitTest
> > testMulticastAfterReconnect
> > Ended @ 2019-08-07 08:28:16.591 +
> >
> > Thanks for digging into it!
> >
> > On Wed, Aug 7, 2019 at 10:16 AM Ryan McMahon 
> wrote:
> >
> >> I think the reflection and PowerMock warnings here are probably a red
> >> herring.  We pulled down the artifacts and found that the DUnit job is
> >> hanging due to stuck threads in a newer DUnit test.  I am not sure why
> it
> >> isn't failing the test but rather failing the entire job.  I believe
> Bruce
> >> Schuchart is going to dig into this a bit.  Analysis from Lynn
> >> Hughes-Godfrey:
> >>
> >> So, I downloaded
> >>
> >>
> >>
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/961
> >>
> >> to
> >>
> >>
> >>
> /Users/lhughesgodfrey/Downloads/distributedtestfiles-OpenJDK8-1.11.0-SNAPSHOT.0010/geode-core/build/distributedTest
> >>
> >> In that callstacks directory, you'll see dunit-hangs.txt ... and this
> >> test has been running for over an hour ...
> >>
> >> ```
> >>
> >> Started @ 2019-08-07 07:54:10.272 +
> >>
> >> 2019-08-07 08:22:25.440 +
> >> org.apache.geode.cache30.DistributedMulticastRegionDUnitTest
> >> testMulticastAfterReconnect
> >>
> >> Ended @ 2019-08-07 08:58:50.243 +
> >>
> >> ```
> >>
> >> When I look at the callstacks (at the top level), I see this:
> >>
> >> ```
> >>
> >> "Pooled Waiting Message Processor 1" #213 daemon prio=5 os_prio=0
> >> tid=0x7fc738024000 nid=0x488 waiting for monitor entry
> >> [0x7fc7ec5f7000]
> >>
> >> java.lang.Thread.State: BLOCKED (on object monitor)
> >>
> >>  at
> >>
> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.startEventProcessing(GMSMembershipManager.java:1179)
> >>
> >>  - waiting to lock <0xe03a75e0> (a java.lang.Object)
> >>
> >>  at
> >>
> org.apache.geode.distributed.internal.ClusterDistributionManager.readyForMessages(ClusterDistributionManager.java:1152)
> >>
> >>  at
> >>
> org.apache.geode.distributed.internal.ClusterDistributionManager.lambda$startThreads$10(ClusterDistributionManager.java:1129)
> >>
> >>  at
> >>
> org.apache.geode.distributed.internal.ClusterDistributionManager$$Lambda$87/1408912936.run(Unknown
> >> Source)
> >>
> >>  at
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> >>
> >>  at
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> >>
> >>  at
> >>
> org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:961)
> >>
> >>  at
> >>
> org.apache.geode.distributed.internal.ClusterDistributionManager.doWaitingThread(ClusterDistributionManager.java:851)
> >>
> >>  at
> >>
> org.apache.geode.distributed.internal.Cluster

Re: Unit tests are hanging?

2019-08-07 Thread Ryan McMahon

I think the reflection and PowerMock warnings here are probably a red
herring.  We pulled down the artifacts and found that the DUnit job is
hanging due to stuck threads in a newer DUnit test.  I am not sure why it
isn't failing the test but rather failing the entire job.  I believe Bruce
Schuchart is going to dig into this a bit.  Analysis from Lynn
Hughes-Godfrey:

So, I downloaded

https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/961

to

/Users/lhughesgodfrey/Downloads/distributedtestfiles-OpenJDK8-1.11.0-SNAPSHOT.0010/geode-core/build/distributedTest

In that callstacks directory, you'll see dunit-hangs.txt ... and this
test has been running for over an hour ...

```

Started @ 2019-08-07 07:54:10.272 +

2019-08-07 08:22:25.440 +
org.apache.geode.cache30.DistributedMulticastRegionDUnitTest
testMulticastAfterReconnect

Ended @ 2019-08-07 08:58:50.243 +

```

When I look at the callstacks (at the top level), I see this:

```

"Pooled Waiting Message Processor 1" #213 daemon prio=5 os_prio=0
tid=0x7fc738024000 nid=0x488 waiting for monitor entry
[0x7fc7ec5f7000]

   java.lang.Thread.State: BLOCKED (on object monitor)

at 
org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.startEventProcessing(GMSMembershipManager.java:1179)

- waiting to lock <0xe03a75e0> (a java.lang.Object)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.readyForMessages(ClusterDistributionManager.java:1152)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.lambda$startThreads$10(ClusterDistributionManager.java:1129)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager$$Lambda$87/1408912936.run(Unknown
Source)

at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:961)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.doWaitingThread(ClusterDistributionManager.java:851)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager$$Lambda$53/274751999.invoke(Unknown
Source)

at 
org.apache.geode.internal.logging.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:121)

at 
org.apache.geode.internal.logging.LoggingThreadFactory$$Lambda$49/1824093464.run(Unknown
Source)

at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:

- <0xe03a7660> (a
java.util.concurrent.ThreadPoolExecutor$Worker)

```

This thread holds the lock:

```

"ReconnectThread" #193 prio=5 os_prio=0 tid=0x7fc7cc219000
nid=0x46a waiting on condition [0x7fc78bdfb000]

   java.lang.Thread.State: TIMED_WAITING (sleeping)

at java.lang.Thread.sleep(Native Method)

at 
org.apache.geode.internal.tcp.Connection.createSender(Connection.java:959)

at 
org.apache.geode.internal.tcp.ConnectionTable.handleNewPendingConnection(ConnectionTable.java:297)

at 
org.apache.geode.internal.tcp.ConnectionTable.getSharedConnection(ConnectionTable.java:408)

at 
org.apache.geode.internal.tcp.ConnectionTable.get(ConnectionTable.java:593)

at 
org.apache.geode.internal.tcp.TCPConduit.getConnection(TCPConduit.java:825)

at 
org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:537)

at 
org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:326)

at 
org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:241)

at 
org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:596)

at 
org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1558)

at 
org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.send(GMSMembershipManager.java:1739)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2849)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:2776)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2813)

at 
org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1523)

at 
org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:174)

at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.sendReply(DistributedCacheOperation.java:1269)

at

Re: Unnecessary uses of final on local variables

2019-06-13 Thread Ryan McMahon

I agree with this sentiment, and have generally only been using final on
class fields and method parameters where I want to guarantee immutability
as of late.  However, I was at one time adding final to local variables,
and I know that I have checked in code with final locals.  Should we
actively be removing finals when we see it on locals?

One case where I still have been using final for locals is when the
variable is effectively constant and I want to show that intent.  For
instance:
final String resourceId =
"someStaticResourceIdThatReallyShouldntBeReassignedEver";

Is this a valid use case or should we view this as unnecessary verbosity as
well?

Thanks,
Ryan

On Thu, Jun 13, 2019 at 1:31 PM Kirk Lund  wrote:

> According to Effective Java 3rd Edition, all local variables are implicitly
> made final by the JVM (and thus receiving the same JVM optimizations as a
> final variable) without requiring use of the keyword as long as you only
> set the value once. So this becomes an unnecessary use of the keyword
> final which
> really just adds noise to the code (one more word on many lines) much like
> unnecessary uses of this. on references to class fields. The only context
> where it's still valuable to use final is on class fields and constants
> where it provides concurrency guarantees.
>
> It may be useful to temporarily mark a local variable as final while
> performing a refactoring. See Martin Fowler's book for various refactorings
> in which he does this.
>
> There really is no value in retaining the keyword on a local variable any
> longer, so I think we should avoid using it in the same way we avoid
> unnecessary uses of this. on references to class fields. It's all just more
> noise in the code.
>

Re: "Output path is not specified for module" (IntelliJ)

2019-06-06 Thread Ryan McMahon

Fair warning - some have wiped their .idea file and it works temporarily,
but inevitably will return.  I would recommend the setting changes as
described above if it recurs.

Ryan

On Thu, Jun 6, 2019 at 12:04 PM Peter Tran  wrote:

> Awesome - a wipe of my .idea file worked. Thank you!
>
> On Thu, Jun 6, 2019 at 2:54 PM Jacob Barrett  wrote:
>
> > This seems to happen frequently with 2019.1. The easiest way I found to
> > fix this was to start over with the IJ project.  Some have had luck
> > switching between the gradle builder and the IJ builder in the cradle
> > configuration panel.
> >
> > > On Jun 6, 2019, at 11:47 AM, Peter Tran  wrote:
> > >
> > > Hello,
> > >
> > > Has anyone had an issue in IntelliJ running tests with the error:
> > >
> > > "Cannot start compilation: the output path is not specified for module"
> > >
> > > I setup intelliJ using the BUILDING.md instructions and haven't come
> > across
> > > this problem. I haven't updated intelliJ since the last time I ran
> tests
> > > successfully. I haven't changed my JDK and am experiencing it on
> develop
> > >
> > > Thanks
> >
>

Re: "Output path is not specified for module" (IntelliJ)

2019-06-06 Thread Ryan McMahon

I have seen this issue and this seems to be a pretty reliable solution:

1. Upgrade to IntelliJ 2019.1.3 (the latest and greatest) if you haven't
already.
2. Reimport the Geode project as described in the "Setting up IntelliJ"
section of the BUILDING.MD
3. Set your "Build and run using" setting in Delegate settings to 'Gradle',
as shown in the screenshot below.  These settings are found under
IntelliJ->Preferences->Build, Execution, and Deployment->Build
Tools->Gradle.

[image: image.png]

Let me know if this solves the problem for you.  I can add this to the
BUILDING.MD if it resolves the issue.

Ryan

On Thu, Jun 6, 2019 at 11:54 AM Jacob Barrett  wrote:

> This seems to happen frequently with 2019.1. The easiest way I found to
> fix this was to start over with the IJ project.  Some have had luck
> switching between the gradle builder and the IJ builder in the cradle
> configuration panel.
>
> > On Jun 6, 2019, at 11:47 AM, Peter Tran  wrote:
> >
> > Hello,
> >
> > Has anyone had an issue in IntelliJ running tests with the error:
> >
> > "Cannot start compilation: the output path is not specified for module"
> >
> > I setup intelliJ using the BUILDING.md instructions and haven't come
> across
> > this problem. I haven't updated intelliJ since the last time I ran tests
> > successfully. I haven't changed my JDK and am experiencing it on develop
> >
> > Thanks
>

Re: [DISCUSS] Criteria for PMC, committers

2019-05-29 Thread Ryan McMahon

My two cents...

1) What is our criteria for becoming a committer and PMC member?
I don't have anything particularly enlightening to say here, and I think I
am just reinforcing what we already do, but here goes...

I think this piece from the ASF notes is key for measuring candidacy for
committer/PMC status, but still quite subjective:

contributions from non-committers - both specific code patches, bugs
> reported or commented on, or just helpful interaction on their project
> lists - [are used] to evaluate contributors as potential committers

It is hard to quantify the value of contributions from non-committers
(number of code patches?  quality of code patches?  number of feature
ideas/bugs reported?  number of times replied on a dev list thread?).  I
think we have to consider each candidate on a case-by-case basis with these
things in mind.  I'm not sure we can fairly create hard and fast rules like
a candidate must have filed X tickets and written Y comments.  As is
pointed out in your third bullet point, we have to decide whether we've
gained confidence and trust the contributor to make the right decision when
considering things like merging code and PR reviewing.  Again, this is all
very subjective so I feel it has to be evaluated on a case-by-case basis,
which is basically what we have always done.  I would love to hear of more
objective/quantitative ways of measuring this sort of thing though, if
people have ideas.

2) Do we have separate criteria for committers and PMC members (and thus
should elect them separately)?
I think we should continue to bundle the two.  I haven't heard a compelling
reason for separating the two, and bundling them simplifies the process
somewhat.

Ryan

On Wed, May 29, 2019 at 10:17 AM Anthony Baker  wrote:

> I think it’s time to re-establish consensus around two things:
>
> 1) What is our criteria for becoming a committer and PMC member?
> 2) Do we have separate criteria for committers and PMC members (and thus
> should elect them separately)?
>
> The ASF notes that projects are free to chose the approach that works best
> for them [1]:
>
> > PMCs are free to set the bar for merit within their projects, as long as
> decision making is done in a collaborative fashion as in the Apache Way.
> Healthy PMCs will regularly review contributions from non-committers - both
> specific code patches, bugs reported or commented on, or just helpful
> interaction on their project lists - to evaluate contributors as potential
> committers. Ensuring that PMC members are helping to mentor helpful new
> contributors to their projects helps to ensure a healthy and growing
> project community.
> >
> > PMCs vary significantly in the level of commitment and work expected to
> be considered for a committership. Some PMCs vote in new PMC members
> typically from their existing committers (i.e. the progression is
> contributor -> vote -> committer -> vote -> PMC), while other PMCs always
> elect new committers into the PMC simultaneously (contributor -> vote ->
> committer & PMC member).
>
> To date, we’ve been mostly following the “bundling” approach of combining
> committers / PMC’s votes.  This is not explicitly spelled out in our wiki
> however (see [2][3]).  We established the current criteria back in  2016
> [4].  The private@ thread [5] that spawned this discussion included some
> great advice from our project mentors (Roman, Kos, Niall, William Rowe).
> If I can summarize it here, it basically boils down to:
>
> - Set the bar for inclusion as low as possible
> - Read the definition of Merit [5]
> - Is the person trustworthy with code, community, etc.
>
> Thoughts?
>
>
> Anthony
>
>
> [1] https://apache.org/foundation/governance/pmcs.html
> [2] https://cwiki.apache.org/confluence/display/GEODE/Becoming+a+committer
> [3]
> https://cwiki.apache.org/confluence/display/GEODE/Nominating+a+Committer+and+PMC+Member
> [4]
> https://lists.apache.org/thread.html/819a140349394833cf1c51b653672ab7a950d48891cf6abef245b8a7@1452130249@%3Cdev.geode.apache.org%3E
> [5] http://theapacheway.com/merit/
>
>
>

Re: Backwards compatibility issue with JSONFormatter

2019-05-14 Thread Ryan McMahon

Fixed in
https://github.com/apache/geode/commit/cc0b37a504480f554b1884491f44a3cca4113ef5

On Tue, May 14, 2019 at 2:06 PM Ryan McMahon  wrote:

> Thanks Anthony!  Have we considered using a compliance checker in our CI
> like the one in the first link?
>
> Side note - I think that varargs is an interesting case that I don't see
> called out in those links.  Since varargs is just syntactic sugar, it is
> ultimately the varargs parameter is converted to an array in the bytecode.
> So in the JSON formatter change:
>
> fromJSON(byte[] jsonByteArray, String... identityFields)
>
>
> becomes
>
> fromJSON(byte[] jsonByteArray, String[] identityFields)
>
>
> Hence the breakage of the ABI.
>
> Ryan
>
> On Tue, May 14, 2019 at 1:33 PM Anthony Baker  wrote:
>
>> Here are a few links on API compatibility:
>>
>> https://lvc.github.io/japi-compliance-checker/#Examples
>> https://wiki.eclipse.org/Evolving_Java-based_APIs
>> https://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html
>>
>>
>> Anthony
>>
>>
>> > On May 14, 2019, at 12:45 PM, Dan Smith  wrote:
>> >
>> > Sounds good! Yeah, this is a bit of an interesting case where code will
>> > still compile against the new API without change, but I think
>> maintaining
>> > compatibility of the binary API is also important. Thanks for looking
>> into
>> > it.
>> >
>> > -Dan
>> >
>> > On Tue, May 14, 2019 at 11:22 AM Ryan McMahon 
>> wrote:
>> >
>> >> Hi all,
>> >>
>> >> This is my mistake - it was a misunderstanding of what constitutes a
>> >> breaking public API change.  If a client app were to recompile using
>> the
>> >> new bits with the new method signature, there wouldn't be an issue
>> because
>> >> the new signature would work with 0 varargs.  The problem arises when
>> you
>> >> hot swap the geode dependencies for that client app without
>> recompiling, as
>> >> the bytecode no longer matches.  Since we do support the hot swap use
>> case
>> >> for CVEs etc, I see now this is considered a breaking API change.
>> >>
>> >> I'll change this to use a method overload instead, it should be a
>> pretty
>> >> simple fix.  Luckily, this issue didn't make it into any Geode
>> releases.
>> >>
>> >> Thanks,
>> >> Ryan
>> >>
>> >>
>> >> On Tue, May 14, 2019 at 10:47 AM Udo Kohlmeyer  wrote:
>> >>
>> >>> How did this slip the review process that the signature of a public
>> API
>> >>> was changed?
>> >>>
>> >>> Well done Gester for finding this!!
>> >>>
>> >>> --Udo
>> >>>
>> >>> On 5/14/19 10:40, Dan Smith wrote:
>> >>>> I think the changes for GEODE-6327 broke backwards compatibility in
>> >>>> JSONFormatter with the change from fromJSON(String jsonString) to
>> >>>> fromJSON(String jsonString, String... identityFields)
>> >>>>
>> >>>> Adding an additional varargs parameter to the method breaks code that
>> >> was
>> >>>> compiled against the non-varargs version. I think we need to overload
>> >> the
>> >>>> method instead.
>> >>>>
>> >>>> Thanks to Gester for discovering this with his test!
>> >>>>
>> >>>> -Dan
>> >>>>
>> >>>
>> >>
>>
>>

Re: Backwards compatibility issue with JSONFormatter

2019-05-14 Thread Ryan McMahon

Thanks Anthony!  Have we considered using a compliance checker in our CI
like the one in the first link?

Side note - I think that varargs is an interesting case that I don't see
called out in those links.  Since varargs is just syntactic sugar, it is
ultimately the varargs parameter is converted to an array in the bytecode.
So in the JSON formatter change:

fromJSON(byte[] jsonByteArray, String... identityFields)


becomes

fromJSON(byte[] jsonByteArray, String[] identityFields)


Hence the breakage of the ABI.

Ryan

On Tue, May 14, 2019 at 1:33 PM Anthony Baker  wrote:

> Here are a few links on API compatibility:
>
> https://lvc.github.io/japi-compliance-checker/#Examples
> https://wiki.eclipse.org/Evolving_Java-based_APIs
> https://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html
>
>
> Anthony
>
>
> > On May 14, 2019, at 12:45 PM, Dan Smith  wrote:
> >
> > Sounds good! Yeah, this is a bit of an interesting case where code will
> > still compile against the new API without change, but I think maintaining
> > compatibility of the binary API is also important. Thanks for looking
> into
> > it.
> >
> > -Dan
> >
> > On Tue, May 14, 2019 at 11:22 AM Ryan McMahon 
> wrote:
> >
> >> Hi all,
> >>
> >> This is my mistake - it was a misunderstanding of what constitutes a
> >> breaking public API change.  If a client app were to recompile using the
> >> new bits with the new method signature, there wouldn't be an issue
> because
> >> the new signature would work with 0 varargs.  The problem arises when
> you
> >> hot swap the geode dependencies for that client app without
> recompiling, as
> >> the bytecode no longer matches.  Since we do support the hot swap use
> case
> >> for CVEs etc, I see now this is considered a breaking API change.
> >>
> >> I'll change this to use a method overload instead, it should be a pretty
> >> simple fix.  Luckily, this issue didn't make it into any Geode releases.
> >>
> >> Thanks,
> >> Ryan
> >>
> >>
> >> On Tue, May 14, 2019 at 10:47 AM Udo Kohlmeyer  wrote:
> >>
> >>> How did this slip the review process that the signature of a public API
> >>> was changed?
> >>>
> >>> Well done Gester for finding this!!
> >>>
> >>> --Udo
> >>>
> >>> On 5/14/19 10:40, Dan Smith wrote:
> >>>> I think the changes for GEODE-6327 broke backwards compatibility in
> >>>> JSONFormatter with the change from fromJSON(String jsonString) to
> >>>> fromJSON(String jsonString, String... identityFields)
> >>>>
> >>>> Adding an additional varargs parameter to the method breaks code that
> >> was
> >>>> compiled against the non-varargs version. I think we need to overload
> >> the
> >>>> method instead.
> >>>>
> >>>> Thanks to Gester for discovering this with his test!
> >>>>
> >>>> -Dan
> >>>>
> >>>
> >>
>
>

Re: Backwards compatibility issue with JSONFormatter

2019-05-14 Thread Ryan McMahon

Hi all,

This is my mistake - it was a misunderstanding of what constitutes a
breaking public API change.  If a client app were to recompile using the
new bits with the new method signature, there wouldn't be an issue because
the new signature would work with 0 varargs.  The problem arises when you
hot swap the geode dependencies for that client app without recompiling, as
the bytecode no longer matches.  Since we do support the hot swap use case
for CVEs etc, I see now this is considered a breaking API change.

I'll change this to use a method overload instead, it should be a pretty
simple fix.  Luckily, this issue didn't make it into any Geode releases.

Thanks,
Ryan

On Tue, May 14, 2019 at 10:47 AM Udo Kohlmeyer  wrote:

> How did this slip the review process that the signature of a public API
> was changed?
>
> Well done Gester for finding this!!
>
> --Udo
>
> On 5/14/19 10:40, Dan Smith wrote:
> > I think the changes for GEODE-6327 broke backwards compatibility in
> > JSONFormatter with the change from fromJSON(String jsonString) to
> > fromJSON(String jsonString, String... identityFields)
> >
> > Adding an additional varargs parameter to the method breaks code that was
> > compiled against the non-varargs version. I think we need to overload the
> > method instead.
> >
> > Thanks to Gester for discovering this with his test!
> >
> > -Dan
> >
>

Re: [VOTE] Apache Geode 1.9.0 RC4

2019-04-24 Thread Ryan McMahon

+1 Verified signatures, compiled from source, and ran all examples.

Ryan

On Tue, Apr 23, 2019 at 3:44 PM Owen Nichols  wrote:

> +1
>
> I used gpg --verify to check all signatures, compiled geode from source,
> ran all examples, and set up a cluster with 2 locators, 3 servers, a simple
> client and SSL enabled.
>
> -Owen
>
>
> > On Apr 23, 2019, at 1:32 PM, Dan Smith  wrote:
> >
> > +1
> >
> > Looks good to me. I ran the geode-release-check against the repo.
> >
> > -Dan
> >
> > On Fri, Apr 19, 2019 at 5:00 PM Owen Nichols  > wrote:
> >
> >> Hello, Geode dev community,
> >>
> >>
> >> This is the fourth release candidate for Apache Geode, version 1.9.0.
> >>
> >> Thanks to all the community members for their contributions to this
> >> release!
> >>
> >>
> >> Please do a review and give your feedback. The deadline is before 3 PM
> PST
> >> Wed April 24th, 2019.
> >>
> >>
> >> 1.9.0 resolves 296 issues on Apache Geode JIRA system. Release notes
> >> can be found at:
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.0
> >> <
> >>
> https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.0
> <
> https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.0
> >
> >>>
> >>
> >>
> >> Please note that we are voting upon the source tags: rel/v1.9.0.RC4
> >>
> >>
> >> Apache Geode:
> >>
> >> https://github.com/apache/geode/tree/rel/v1.9.0.RC4 <
> https://github.com/apache/geode/tree/rel/v1.9.0.RC4> <
> >> https://github.com/apache/geode/tree/rel/v1.9.0.RC4 <
> https://github.com/apache/geode/tree/rel/v1.9.0.RC4>>
> >>
> >> Apache Geode examples:
> >>
> >> https://github.com/apache/geode-examples/tree/rel/v1.9.0.RC4 <
> https://github.com/apache/geode-examples/tree/rel/v1.9.0.RC4> <
> >> https://github.com/apache/geode-examples/tree/rel/v1.9.0.RC4 <
> https://github.com/apache/geode-examples/tree/rel/v1.9.0.RC4>>
> >>
> >> Apache Geode Native:
> >>
> >> https://github.com/apache/geode-native/tree/rel/v1.9.0.RC4 <
> https://github.com/apache/geode-native/tree/rel/v1.9.0.RC4> <
> >> https://github.com/apache/geode-native/tree/rel/v1.9.0.RC4 <
> https://github.com/apache/geode-native/tree/rel/v1.9.0.RC4>>
> >>
> >>
> >> Commit ID:
> >>
> >> Apache Geode: c0a73d1cb84986d432003bd12e70175520e63597
> >>
> >> Apache Geode Examples: b58b2720e9cb1fae9e2cbca7053bc9c748dab79d
> >>
> >> Apache Geode Native: add53da376c3044feb2fb22dc37a30989c271f19
> >>
> >>
> >> Source and binary files:
> >>
> >> https://dist.apache.org/repos/dist/dev/geode/1.9.0.RC4 <
> https://dist.apache.org/repos/dist/dev/geode/1.9.0.RC4> <
> >> https://dist.apache.org/repos/dist/dev/geode/1.9.0.RC4 <
> https://dist.apache.org/repos/dist/dev/geode/1.9.0.RC4>>
> >>
> >> Maven staging repo:
> >>
> >> https://repository.apache.org/content/repositories/orgapachegeode-1054
>  <
> >> https://repository.apache.org/content/repositories/orgapachegeode-1054
> >
> >>
> >> Geode's KEYS file containing PGP keys we use to sign the release:
> >>
> >> https://github.com/apache/geode/blob/develop/KEYS <
> https://github.com/apache/geode/blob/develop/KEYS> <
> >> https://github.com/apache/geode/blob/develop/KEYS <
> https://github.com/apache/geode/blob/develop/KEYS>>
> >>
> >>
> >> Signed the release with GPG fingerprint:
> >>
> >> DB5476815A475574577D442B468A4800EAFB2498
> >>
> >> PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=
> >> https://dist.apache.org/repos/dist/dev/geode/1.9.0.RC4 <
> https://dist.apache.org/repos/dist/dev/geode/1.9.0.RC4> <
> >> https://dist.apache.org/repos/dist/dev/geode/1.9.0.RC4 <
> https://dist.apache.org/repos/dist/dev/geode/1.9.0.RC4>>
> >> -PgeodeRepositoryUrl=
> >> https://repository.apache.org/content/repositories/orgapachegeode-1054
>  <
> >> https://repository.apache.org/content/repositories/orgapachegeode-1054
> >
> >> build
> >> runAll
> >>
> >>
> >> Regards,
> >> - Owen & Dan
>
>

Re: 1.9 release date

2019-03-01 Thread Ryan McMahon

+1 to prioritizing quality over releasing on the desired cadence.  The
quarterly release cadence is a good goal, but it shouldn't be a strict rule
if there is more work to be done to ensure good quality.

Ryan

On Fri, Mar 1, 2019 at 8:02 AM Anthony Baker  wrote:

> IMHO we start release work based on a quarterly schedule and we finish it
> based on meeting quality goals.  So right now I’m less worried about when
> the release will be done (because uncertainty) and more focused on ensuring
> we have demonstrated stability on the release branch.  Hopefully that will
> happen sooner than 4/1…but it could take longer too.
>
> Anthony
>
>
> > On Feb 28, 2019, at 6:00 PM, Alexander Murmann 
> wrote:
> >
> > Hi everyone,
> >
> > According to our wiki we were aiming for a March 1st release date for our
> > 1.9 release. We cut the release branch about two weeks late and see
> unusual
> > amounts of merges still going into the branch. I propose that we give
> > ourselves some more time to validate what's there. My proposal is to aim
> > for last week of March or maybe even week of April 1st.
> >
> > What do you all think?
>
>

Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Ryan McMahon

Sorry that should read “and if the value exceeds MAX_INT”

On Wed, Feb 13, 2019 at 8:21 PM Ryan McMahon  wrote:

> +1 to introducing a new method which returns the stat as long per Jake’s
> suggestion.  I vote for getInt() to downcast as an int, and the value
> exceeds MAX_INT, return MAX_INT and possibly add a warning message which
> points users to the new long version of the method.  I think throwing an
> exception might be a bit much personally.
>
> Ryan
>
> On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols  wrote:
>
>> Is it possible for the underlying counter to be maintained as a long, and
>> change the getInt method to return as normal when the value is within
>> MAX_INT, otherwise throw an exception and/or return MAX_INT when the long
>> value would overflow an int?
>>
>> For the MX Bean, should we keep (but deprecate) the existing attribute,
>> and add a new attribute TotalNetSearchCompletedAsLong?
>>
>> > On Feb 13, 2019, at 3:58 PM, Kirk Lund  wrote:
>> >
>> > Quite a few Geode stats are currently defined as IntCounters which very
>> > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
>> > wrap to negative MAX_VALUE, so my team defined the following two tickets
>> > and changed them to LongCounters on the develop branch:
>> >
>> > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
>> > https://issues.apache.org/jira/browse/GEODE-6345
>> >
>> > GEODE-6334: CachePerfStats operation count stats may wrap to negative
>> values
>> > https://issues.apache.org/jira/browse/GEODE-6334
>> >
>> > Some people are now concerned that this may break backwards
>> compatibility
>> > and API for existing users.
>> >
>> > There are two potential ways for a user to *experience* this as an API
>> > change:
>> >
>> > 1) MemberMXBean in JMX
>> >
>> > *-  int getTotalNetSearchCompleted();*
>> > *+  long getTotalNetSearchCompleted();*
>> >
>> > As you can see in GEODE-6334, we changed quite a few stats from integer
>> to
>> > long in CachePerfStats. The only one directly exposed as an attribute on
>> > MemberMXBean is TotalNetSearchCompleted.
>> >
>> > Users will find that this API method changed.
>> >
>> > 2) Statistics API with undocumented strings
>> >
>> > If we assume that users might know or discover that we have a statistics
>> > text id of "cachePerfStats" with an integer stat of name "puts" then
>> they
>> > could use our Statistics API to get the value:
>> >
>> > * 1:CacheFactory cacheFactory = new CacheFactory();*
>> > * 2:Cache cache = cacheFactory.create();*
>> > * 3:*
>> > * 4:Region region = cache.> > String>createRegionFactory(PARTITION).create("region");*
>> > * 5:*
>> > * 6:Statistics[] statistics =
>> > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
>> > * 7:int oldPuts = statistics[0].getInt("puts");*
>> > * 8:*
>> > * 9:region.put("some", "value");*
>> > *10:*
>> > *11:int newPuts = statistics[0].getInt("puts");*
>> > *12:*
>> > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
>> >
>> > The above works in Geode 1.8 and earlier but will begin to throw the
>> > following in Geode 1.9 (when develop branch is released):
>> >
>> > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
>> > type long and it was expected to be an int.*
>> > * at
>> >
>> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
>> > * at
>> >
>> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
>> > * at
>> >
>> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
>> > * at
>> >
>> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
>> > * at com.user.example.Example.example(Example.java7)*
>> >
>> > In order to avoid the above exception, a user would have to change the
>> code
>> > on lines 7 and 11 to use *getLong* instead of *getInt*.
>> >
>> > Should Geode stats be considered a form of API contract and should they
>> > conform to backwards compatibility constraints?
>> >
>> > Should we change these from LongCounters back to IntCounters?
>> >
>> > Someone did suggest that we could change them back to IntCounters and
>> then
>> > change our statistics internals to skip to zero anytime a stat attempts
>> to
>> > increment above MAX_VALUE, however, I think that if we decide that stats
>> > cannot be changed from IntCounters to LongCounters then we should not
>> make
>> > this change either.
>> >
>> > Thanks for any input or opinions,
>> > Kirk
>>
>>

Re: Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Ryan McMahon

+1 to introducing a new method which returns the stat as long per Jake’s
suggestion.  I vote for getInt() to downcast as an int, and the value
exceeds MAX_INT, return MAX_INT and possibly add a warning message which
points users to the new long version of the method.  I think throwing an
exception might be a bit much personally.

Ryan

On Wed, Feb 13, 2019 at 4:24 PM Owen Nichols  wrote:

> Is it possible for the underlying counter to be maintained as a long, and
> change the getInt method to return as normal when the value is within
> MAX_INT, otherwise throw an exception and/or return MAX_INT when the long
> value would overflow an int?
>
> For the MX Bean, should we keep (but deprecate) the existing attribute,
> and add a new attribute TotalNetSearchCompletedAsLong?
>
> > On Feb 13, 2019, at 3:58 PM, Kirk Lund  wrote:
> >
> > Quite a few Geode stats are currently defined as IntCounters which very
> > easily wrap past Integer.MAX_VALUE. Some users wanted these stats to not
> > wrap to negative MAX_VALUE, so my team defined the following two tickets
> > and changed them to LongCounters on the develop branch:
> >
> > GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
> > https://issues.apache.org/jira/browse/GEODE-6345
> >
> > GEODE-6334: CachePerfStats operation count stats may wrap to negative
> values
> > https://issues.apache.org/jira/browse/GEODE-6334
> >
> > Some people are now concerned that this may break backwards compatibility
> > and API for existing users.
> >
> > There are two potential ways for a user to *experience* this as an API
> > change:
> >
> > 1) MemberMXBean in JMX
> >
> > *-  int getTotalNetSearchCompleted();*
> > *+  long getTotalNetSearchCompleted();*
> >
> > As you can see in GEODE-6334, we changed quite a few stats from integer
> to
> > long in CachePerfStats. The only one directly exposed as an attribute on
> > MemberMXBean is TotalNetSearchCompleted.
> >
> > Users will find that this API method changed.
> >
> > 2) Statistics API with undocumented strings
> >
> > If we assume that users might know or discover that we have a statistics
> > text id of "cachePerfStats" with an integer stat of name "puts" then they
> > could use our Statistics API to get the value:
> >
> > * 1:CacheFactory cacheFactory = new CacheFactory();*
> > * 2:Cache cache = cacheFactory.create();*
> > * 3:*
> > * 4:Region region = cache. > String>createRegionFactory(PARTITION).create("region");*
> > * 5:*
> > * 6:Statistics[] statistics =
> > cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");*
> > * 7:int oldPuts = statistics[0].getInt("puts");*
> > * 8:*
> > * 9:region.put("some", "value");*
> > *10:*
> > *11:int newPuts = statistics[0].getInt("puts");*
> > *12:*
> > *13:assertThat(newPuts).isEqualTo(oldPuts + 1);*
> >
> > The above works in Geode 1.8 and earlier but will begin to throw the
> > following in Geode 1.9 (when develop branch is released):
> >
> > *java.lang.IllegalArgumentException: The statistic puts with id 23 is of
> > type long and it was expected to be an int.*
> > * at
> >
> org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)*
> > * at
> >
> org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)*
> > * at
> >
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)*
> > * at
> >
> org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)*
> > * at com.user.example.Example.example(Example.java7)*
> >
> > In order to avoid the above exception, a user would have to change the
> code
> > on lines 7 and 11 to use *getLong* instead of *getInt*.
> >
> > Should Geode stats be considered a form of API contract and should they
> > conform to backwards compatibility constraints?
> >
> > Should we change these from LongCounters back to IntCounters?
> >
> > Someone did suggest that we could change them back to IntCounters and
> then
> > change our statistics internals to skip to zero anytime a stat attempts
> to
> > increment above MAX_VALUE, however, I think that if we decide that stats
> > cannot be changed from IntCounters to LongCounters then we should not
> make
> > this change either.
> >
> > Thanks for any input or opinions,
> > Kirk
>
>

Re: [Proposal] Change default gemfire.memoryEventTolerance from 1 to 0

2019-01-24 Thread Ryan McMahon

Hi Alexander,

Thanks for bring up those concerns.  I think in this case the default of 1
is potentially masking behavior that the user would definitely want to see
and react to (such as heap memory read outliers/spikes), so it is a bit
different than other cases where changing the default would be a
nice-to-have in 2.0.  The user can still adjust their parameter to deal
with any bad reads they are observing, but it would be best if they did
that explicitly instead of us hiding the behavior by default silently.  So
I would argue it is a fair change in this case in a minor release, as it
could be viewed as a "bug fix" i.e. masking the bad reads by default.

Ryan

On Thu, Jan 24, 2019 at 9:49 AM David Wisler  wrote:

> I talked to him.  He is fine with our new change.
>
> :-)He just wanted to have it on the table.
>
> So you missed out on this 1 minute discussion.  haha
>
> On Thu, Jan 24, 2019 at 9:38 AM Ryan McMahon  wrote:
>
>> Hey David,
>>
>> Alexander responded to the dev list, but not sure you got a copy.
>> Basically he is asking if we can wait to change the default tolerance until
>> Geode 2.0, since it is a behavior change and apparently there are several
>> config parameters for which we'd like to change the default, but we pushed
>> it to 2.0.  I don't feel super strongly either way.  Do you want to reply
>> to this thread or talk to Alexander directly?
>>
>> Ryan
>>
>> -- Forwarded message -
>> From: Alexander Murmann 
>> Date: Wed, Jan 23, 2019 at 4:13 PM
>> Subject: Re: [Proposal] Change default gemfire.memoryEventTolerance from
>> 1 to 0
>> To: 
>> Cc: Ryan McMahon 
>>
>>
>> Ryan, thank you so much for the great explanation of your proposal!
>>
>> This seems very sound and you and David got me convinced that it's the
>> right thing to change the default. To me the question now is one of
>> timing.
>> Is this something we can change in a minor release or do we have to wait
>> for Geode 2.0? I think we have quite a few defaults we'd like to update.
>> However, a user might have a system in prod that relies on defaults being
>> a
>> certain way and upgrading to the next minor shouldn't require any work on
>> their end to prevent any negative impact on their system.
>>
>> Thoughts?
>>
>> On Tue, Jan 22, 2019 at 12:33 PM David Wisler  wrote:
>>
>> > I would add that, by changing the default to 0, we can then skip all of
>> the
>> > "special" logic that almost no customers use.With a default of 1,
>> we go
>> > into this logic every time unnecessarily, even when customers have not
>> > explicitly told us to "tolerate" an eviction or critical state change.
>>   I
>> > am in favor of this default change to 0, and also add that there are no
>> > customers who would even realize such a change in behavior has occurred.
>> > I would also suggest that tolerating 1 critical reading, delaying the
>> > subsequent behaviors in GemFire when above critical, could make us more
>> > vulnerable to OOME's than would be the case by immediately transitioning
>> > state.
>> >
>> > My 2 cents.  Thanks for the email Ryan.
>> >
>> >
>> >
>> > On Tue, Jan 22, 2019 at 10:22 AM Ryan McMahon 
>> wrote:
>> >
>> > > Hi all,
>> > >
>> > > I am currently fixing a bug
>> > > <https://issues.apache.org/jira/browse/GEODE-6304> with the
>> > > HeapMemoryMonitor event tolerance feature, and came across a decision
>> > that
>> > > I thought would be more appropriate for the Geode dev list.
>> > >
>> > > For those familiar with the feature, we are proposing that the default
>> > > gemfire.memoryEventTolerance config parameter value is changed from 1
>> to
>> > 0
>> > > so state transitions from normal to eviction or critical occur
>> > immediately
>> > > after reading a single heap-used-bytes event above threshold.  If you
>> are
>> > > unfamiliar with the feature, read on.
>> > >
>> > > The memory event tolerance feature addresses issues with some JVM
>> distros
>> > > that result in sporadic, erroneously high heap-bytes-used readings.
>> The
>> > > feature was introduced to address this issue in the JRockit JVM, but
>> it
>> > has
>> > > been found that other JVM distros are susceptible to this problem as
>> > well.
>> > >
>> > > The feature prevents an "unexpect

[Proposal] Change default gemfire.memoryEventTolerance from 1 to 0

2019-01-22 Thread Ryan McMahon

Hi all,

I am currently fixing a bug
 with the
HeapMemoryMonitor event tolerance feature, and came across a decision that
I thought would be more appropriate for the Geode dev list.

For those familiar with the feature, we are proposing that the default
gemfire.memoryEventTolerance config parameter value is changed from 1 to 0
so state transitions from normal to eviction or critical occur immediately
after reading a single heap-used-bytes event above threshold.  If you are
unfamiliar with the feature, read on.

The memory event tolerance feature addresses issues with some JVM distros
that result in sporadic, erroneously high heap-bytes-used readings.  The
feature was introduced to address this issue in the JRockit JVM, but it has
been found that other JVM distros are susceptible to this problem as well.

The feature prevents an "unexpected" state transition from a normal state
to an eviction or critical state by requiring N (configurable) consecutive
heap-used-byte events above threshold before changing states.  The current
default configuration is N = 5 for JRockit and N = 1 for all other JVMs.
In a non-JRockit JVM, this configuration permits a single event above
threshold WITHOUT causing a state transition.  In other words, by default,
we allow for a single bad outlier heap-used-bytes reading without going
into an eviction or critical state.

As part of this bug fix (which involves a failure to reset the tolerance
counter under some conditions), we opted to remove the special handling for
JRockit because JRockit is no longer supported.  After removing the JRockit
handling, we started re-evaluating if a default value of 1 is appropriate
for all other JVMs.  We are considering changing the default to 0, so state
transitions would occur immediately if an event above the threshold is
received.  If a user is facing one of these problematic JVMs, they can then
change the gemfire.memoryEventTolerance config parameter to increase the
tolerance.  Our concern is that the default today is potentially masking
bad heap readings without the user ever knowing.

To summarize, if we change the default from 1 to 0 it would potentially be
a change in behavior in that we would no longer be masking a single bad
heap-used-bytes reading i.e. no longer permitting a single outlier without
changing states.  The user can then decide whether to configure a non-zero
tolerance to address the situation.  Any thoughts on this change in
behavior?

Thanks,
Ryan

Re: [VOTE] Apache Geode 1.8.0 RC2

2018-12-10 Thread Ryan McMahon

+1

- Ran all examples
- Ran gfsh, created region, performed basic entry operations
- Verified SHAs
- Verified signatures

Ryan

On Mon, Dec 10, 2018 at 4:01 PM Alexander Murmann 
wrote:

> A reminder for everyone that the vote is scheduled to end tonight. Please
> verify and vote!
>
> @Galen I'll take a look and fix the handwritten note.
>

Re: PowerMock and mock ClassLoader

2018-12-04 Thread Ryan McMahon

+1 to a spotless rule.  Unless anyone objects, I’ll look into doing that
after PowerMock is eliminated.

Ryan

On Tue, Dec 4, 2018 at 3:50 PM Helena Bales  wrote:

> Once we have refactored tests currently using PowerMock, it might be
> prudent to introduce a spotless rule to prohibit the reintroduction of
> PowerMock.
>
> On Tue, Dec 4, 2018 at 3:32 PM Ryan McMahon 
> wrote:
>
> > I am interested in contributing to this effort.  I removed PowerMock
> usage
> > from one set of tests (GEODE-6052), and at that time I took a quick
> glance
> > at other usages.  I’ll assign GEODE-6143 to myself and see about removing
> > the remaining usages.
> >
> > Ryan
> >
> > On Tue, Dec 4, 2018 at 3:08 PM Kirk Lund  wrote:
> >
> > > I filed GEODE-6143: PowerMock should not be used in Geode tests.
> > >
> > > We need everyone to stop using PowerMock in new tests. If anyone sees a
> > PR
> > > attempting to use PowerMock please let the contributor know about
> > > GEODE-6143. The alternative is to refactor product code such that
> > > dependencies are passed into the constructor instead of reaching out to
> > > singletons and to avoid using static methods or static final fields for
> > > anything would would make testing more difficult.
> > >
> > > On Tue, Dec 4, 2018 at 11:20 AM Dan Smith  wrote:
> > >
> > > > +1 to removing PowerMock. Any situation that needs PowerMock needs
> > > > refactoring more.
> > > >
> > > > -Dan
> > > >
> > > > On Tue, Dec 4, 2018 at 10:27 AM Kirk Lund  wrote:
> > > >
> > > > > Anyone have any ideas which unit test is using PowerMock and is
> > > > injecting a
> > > > > mock ClassLoader? This keeps failing in my precheckin runs. I think
> > we
> > > > need
> > > > > to a) remove all uses of PowerMock and b) forbid its use going
> > forward.
> > > > >
> > > > > 2018-12-04 18:11:36,258 Distributed system shutdown hook ERROR
> Could
> > > not
> > > > > reconfigure JMX java.lang.LinkageError: loader constraint
> violation:
> > > > loader
> > > > > (instance of org/powermock/core/classloader/MockClassLoader)
> > previously
> > > > > initiated loading for a different type with name
> > > > > "javax/management/MBeanServer"
> > > > > at java.lang.ClassLoader.defineClass1(Native Method)
> > > > > at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.powermock.core.classloader.MockClassLoader.loadUnmockedClass(MockClassLoader.java:262)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.powermock.core.classloader.MockClassLoader.loadModifiedClass(MockClassLoader.java:206)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass1(DeferSupportingClassLoader.java:89)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass(DeferSupportingClassLoader.java:79)
> > > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.unregisterAllMatching(Server.java:337)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.unregisterLoggerContext(Server.java:261)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:165)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:141)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:558)
> > > > >

Re: PowerMock and mock ClassLoader

2018-12-04 Thread Ryan McMahon

I am interested in contributing to this effort.  I removed PowerMock usage
from one set of tests (GEODE-6052), and at that time I took a quick glance
at other usages.  I’ll assign GEODE-6143 to myself and see about removing
the remaining usages.

Ryan

On Tue, Dec 4, 2018 at 3:08 PM Kirk Lund  wrote:

> I filed GEODE-6143: PowerMock should not be used in Geode tests.
>
> We need everyone to stop using PowerMock in new tests. If anyone sees a PR
> attempting to use PowerMock please let the contributor know about
> GEODE-6143. The alternative is to refactor product code such that
> dependencies are passed into the constructor instead of reaching out to
> singletons and to avoid using static methods or static final fields for
> anything would would make testing more difficult.
>
> On Tue, Dec 4, 2018 at 11:20 AM Dan Smith  wrote:
>
> > +1 to removing PowerMock. Any situation that needs PowerMock needs
> > refactoring more.
> >
> > -Dan
> >
> > On Tue, Dec 4, 2018 at 10:27 AM Kirk Lund  wrote:
> >
> > > Anyone have any ideas which unit test is using PowerMock and is
> > injecting a
> > > mock ClassLoader? This keeps failing in my precheckin runs. I think we
> > need
> > > to a) remove all uses of PowerMock and b) forbid its use going forward.
> > >
> > > 2018-12-04 18:11:36,258 Distributed system shutdown hook ERROR Could
> not
> > > reconfigure JMX java.lang.LinkageError: loader constraint violation:
> > loader
> > > (instance of org/powermock/core/classloader/MockClassLoader) previously
> > > initiated loading for a different type with name
> > > "javax/management/MBeanServer"
> > > at java.lang.ClassLoader.defineClass1(Native Method)
> > > at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> > > at
> > >
> > >
> >
> org.powermock.core.classloader.MockClassLoader.loadUnmockedClass(MockClassLoader.java:262)
> > > at
> > >
> > >
> >
> org.powermock.core.classloader.MockClassLoader.loadModifiedClass(MockClassLoader.java:206)
> > > at
> > >
> > >
> >
> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass1(DeferSupportingClassLoader.java:89)
> > > at
> > >
> > >
> >
> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass(DeferSupportingClassLoader.java:79)
> > > at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.unregisterAllMatching(Server.java:337)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.unregisterLoggerContext(Server.java:261)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:165)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:141)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:558)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:619)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:636)
> > > at
> > >
> org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:231)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:243)
> > > at
> > >
> > >
> >
> org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
> > > at org.apache.logging.log4j.LogManager.getContext(LogManager.java:174)
> > > at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:661)
> > > at
> > >
> >
> org.apache.geode.internal.logging.LogService.getLogger(LogService.java:64)
> > > at
> > >
> > >
> >
> org.apache.geode.internal.tcp.ConnectionTable.(ConnectionTable.java:61)
> > > at
> > >
> > >
> >
> org.apache.geode.distributed.DistributedSystem.setThreadsSocketPolicy(DistributedSystem.java:263)
> > > at
> > >
> > >
> >
> org.apache.geode.distributed.internal.InternalDistributedSystem.lambda$static$0(InternalDistributedSystem.java:2317)
> > > at java.lang.Thread.run(Thread.java:748)
> > >
> >
>

Re: PowerMock unit test errors

2018-11-21 Thread Ryan McMahon

Just thought I'd share - after a second pass at this I was able to avoid
PowerMock when modifying the class under test to use constructor DI.  I
think initially I was a bit reluctant to modify the production code here
but on a second look I think it was the right thing to do.  Just wanted to
share the lesson I learned!

Ryan

On Wed, Nov 14, 2018 at 1:34 PM Bruce Schuchardt 
wrote:

> I don't think you necessarily need to redo your work Ryan.  I just think
> something has been left behind by the test.  It looks like a mock made
> its way into  the JMX "Server" in the log4j code.
>
> On 11/14/18 10:57 AM, Ryan McMahon wrote:
> > I will write up a story to address the use of PowerMock in these JMX
> tests
> > in particular.  I remember attempting to avoid PowerMock when writing
> this
> > test, because I agree that it should be avoided.  I just want to explain
> my
> > thinking so that we can discuss what would have been a better approach.
> >
> > In this particular set of tests, I used it to mock two static methods:
> > 1. InternalDistributedSystem.getConnectedInstance()
> > 2. LogService.getLogger()
> >
> > I considered making these instance methods as part of this ticket.  When
> I
> > started down that path, I found there are 41 calls to
> > InternalDistributedSystem.getConnectedInstance(), and many of the callers
> > have no reference to an instance of InternalDistributedSystem.  While I
> > think this is a solvable problem, it would have vastly increased the
> scope
> > of the work involved.  Would this ticket have been the place to do that
> > work?  I am under the impression that we have some Jira tickets
> > specifically related to removing statics.  Are those more aptly fit for
> > that work?
> >
> > I agree that the LogService could have been injected in the
> MBeanJMXAdapter
> > constructor, so that would have been a reasonable API change to make this
> > testable without mocking the static LogService.  However, I'm still
> > interested in what people think about the InternalDistributedSystem case
> > described above.
> >
> > There are ways I could have changed the public API of
> > InternalDistributedSystem and related classes to make them testable
> without
> > the use of PowerMock.  There are a lot of ways to do this, but they all
> > boil down to adding test hooks in public API.  Is that better than using
> > PowerMock?  I see the following options:
> >
> > 1. Don't test at this level which requires mocking statics
> > 2. Use PowerMock
> > 3. Add test hooks (perhaps make InternalDistributedSystem wrap a
> singleton
> > which you can swap for a test object, but there are other ways to do
> this)
> > 4. Do major refactoring to eliminate the statics involved
> >
> > I don't think that not testing at this level is a great solution.  We've
> > discussed the disadvantages of PowerMock already, so the cons of #2 are
> > well known.  I am not a fan of adding public API test hooks, and in this
> > case I'd rather use PowerMock honestly, but that is highly subjective.
> And
> > #4 is the "ideal world" case, but isn't always practically possible given
> > the scope of the work involved.
> >
> > I'm curious what other people think about this topic?
> >
> > Thanks,
> > Ryan
> >
> >
> >
> >
> >
> >
> > On Tue, Nov 6, 2018 at 10:40 AM Kirk Lund  wrote:
> >
> >> I think we should try really hard to avoid using PowerMock. If you have
> >> some code that you need to use PowerMock to make it testable, please
> >> consider refactoring the code to a) avoid statics, b) pass all
> dependencies
> >> in via the constructor.
> >>
> >> The following was spit out by one of my unit test runs. None of the
> logging
> >> or log4j tests in my PR involve PowerMock. Also, this unit test run had
> no
> >> FAILED tests, so this would appear to be generated by some unit test
> that
> >> uses PowerMock and PASSED.
> >>
> >>> Task :geode-core:test
> >> 2018-11-06 18:03:49,443 Distributed system shutdown hook ERROR Could not
> >> reconfigure JMX java.lang.LinkageError: loader constraint violation:
> loader
> >> (instance of org/powermock/core/classloader/MockClassLoader) previously
> >> initiated loading for a different type with name
> >> "javax/management/MBeanServer"
> >> at java.lang.ClassLoader.defineClass1(Native Method)
> >> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> >> at
> >>
> >

Re: Release branch for Apache Geode 1.8.0 has been cut

2018-11-21 Thread Ryan McMahon

HI Alexander,

I have another fix that I believe would be beneficial for the 1.8 release.
We have seen that issues related to intermittent DNS lookup failures can
cause WAN event processing to cease where a retry would have been more
appropriate.  The fix is already merged to develop.  May I cherry pick it
back to 1.8 release branch as well?

https://issues.apache.org/jira/browse/GEODE-6065

Thanks,
Ryan

On Thu, Nov 15, 2018 at 10:19 AM Alexander Murmann 
wrote:

> Ryan, Jason & Nabarun, these all look like critical issues to me that we
> should address in 1.8 to not introduce a regression. If nobody else has an
> objection, I am fine with you cherry-picking these into the relase branch.
>
> Thanks!
>
> On Wed, Nov 14, 2018 at 5:19 PM Nabarun Nag  wrote:
>
> > I would also like to put in the fix for GEODE-6053. It's in the review
> > phase and soon will be in the develop and the release branch.
> >
> > Regards
> > Nabarun Nag
> >
> > > On Nov 14, 2018, at 10:28 AM, Jason Huynh  wrote:
> > >
> > > I'd like to cherry pick the following two commits to release/1.8.
> > >
> > >
> > >
> >
> https://github.com/apache/geode/commit/6d9e026feb584309dff269e593417082a71434fc
> > >
> > >
> >
> https://github.com/apache/geode/commit/d22e83e167f261e933b79b2619367d2e1b788db6
> > >
> > >
> > > Reason: There was a previous commit that modified the way
> > > FunctionInvocationTargetExceptions were being wrapped or sent back to
> the
> > > client.  This could negatively impact a user that was drilling down
> into
> > > exceptions.getCause() methods as sometimes it would be null.  These two
> > > commits restore the behavior for any client between 1.0-1.7.
> > >
> > >
> > > Thanks,
> > >
> > > -Jason
> > >
> > > On Mon, Nov 12, 2018 at 5:47 PM Ryan McMahon 
> > wrote:
> > >
> > >> Hi Alexander,
> > >>
> > >> I would like to cherry pick the following commits from develop to
> > >> release/1.8.0:
> > >>
> > >>
> > >>
> >
> https://github.com/apache/geode/commit/e9ea18e18c85b977b91192d4edbb9a4e18b2643e
> > >> *Reason*: This was a revert of a previous commit which addressed data
> > >> inconsistencies between WAN sites, but it was found the fix introduced
> > >> several other inconsistencies.
> > >>
> > >>
> > >>
> >
> https://github.com/apache/geode/commit/aab0198e8478d4246042b2eb889c8ce7e28bb52e
> > >> *Reason: *This fixes a race condition in the QueryMonitor which causes
> > an
> > >> unexpected RejectedExecutionException in monitorQuery()
> > >>
> > >> Please let me know if this sounds reasonable and I can go ahead and
> > begin
> > >> the cherry picking process.
> > >>
> > >> Thanks!
> > >> Ryan
> > >>
> > >>
> > >> On Thu, Nov 8, 2018 at 1:54 PM Alexander Murmann  >
> > >> wrote:
> > >>
> > >>> Hello everyone,
> > >>>
> > >>> As discussed previously created a new release branch for Apache Geode
> > >> 1.8.0
> > >>> - "release/1.8.0"
> > >>>
> > >>> Please do review and raise any concern with the release branch.
> > >>> If no concerns are raised, we will start with the voting for the
> > release
> > >>> candidate soon.
> > >>>
> > >>> This also means that all tickets JIRA that get resolved on develop
> > should
> > >>> be marked with 1.9.0 as their fix version.
> > >>>
> > >>> Thanks!
> > >>>
> > >>> Alexander
> > >>>
> > >>
> >
> >
>

Re: [DISCUSS] - Create new repository for geode benchmarks

2018-11-15 Thread Ryan McMahon

+1 I think a separate repo makes sense, and I'm excited for this effort!


On Thu, Nov 15, 2018 at 10:47 AM Dan Smith  wrote:

> Hi all,
>
> We (Naba, Sean, Brian and I) would like to add some benchmarks for geode,
> with a goal of eventually running them as part of our concourse build and
> detecting performance changes.
>
> We think it makes sense to put these benchmarks in a separate
> geode-benchmarks repository. That will make them less tightly coupled to a
> specific revision of geode. What do you all think? If folks are okay with
> this, I will go ahead and create the repository.
>
> We have some prototype code in this repository with a simple client/server
> put benchmark:  https://github.com/upthewaterspout/geode-performance.
>
> -Dan
>

Re: PowerMock unit test errors

2018-11-14 Thread Ryan McMahon

I've created a Jira to track the elimination of PowerMock from these tests
in particular, which will probably involve doing the major refactoring
mentioned in item #4 in my previous email.

https://issues.apache.org/jira/browse/GEODE-6052

On Wed, Nov 14, 2018 at 10:57 AM Ryan McMahon 
wrote:

> I will write up a story to address the use of PowerMock in these JMX tests
> in particular.  I remember attempting to avoid PowerMock when writing this
> test, because I agree that it should be avoided.  I just want to explain my
> thinking so that we can discuss what would have been a better approach.
>
> In this particular set of tests, I used it to mock two static methods:
> 1. InternalDistributedSystem.getConnectedInstance()
> 2. LogService.getLogger()
>
> I considered making these instance methods as part of this ticket.  When I
> started down that path, I found there are 41 calls to
> InternalDistributedSystem.getConnectedInstance(), and many of the callers
> have no reference to an instance of InternalDistributedSystem.  While I
> think this is a solvable problem, it would have vastly increased the scope
> of the work involved.  Would this ticket have been the place to do that
> work?  I am under the impression that we have some Jira tickets
> specifically related to removing statics.  Are those more aptly fit for
> that work?
>
> I agree that the LogService could have been injected in the
> MBeanJMXAdapter constructor, so that would have been a reasonable API
> change to make this testable without mocking the static LogService.
> However, I'm still interested in what people think about the
> InternalDistributedSystem case described above.
>
> There are ways I could have changed the public API of
> InternalDistributedSystem and related classes to make them testable without
> the use of PowerMock.  There are a lot of ways to do this, but they all
> boil down to adding test hooks in public API.  Is that better than using
> PowerMock?  I see the following options:
>
> 1. Don't test at this level which requires mocking statics
> 2. Use PowerMock
> 3. Add test hooks (perhaps make InternalDistributedSystem wrap a singleton
> which you can swap for a test object, but there are other ways to do this)
> 4. Do major refactoring to eliminate the statics involved
>
> I don't think that not testing at this level is a great solution.  We've
> discussed the disadvantages of PowerMock already, so the cons of #2 are
> well known.  I am not a fan of adding public API test hooks, and in this
> case I'd rather use PowerMock honestly, but that is highly subjective.  And
> #4 is the "ideal world" case, but isn't always practically possible given
> the scope of the work involved.
>
> I'm curious what other people think about this topic?
>
> Thanks,
> Ryan
>
>
>
>
>
>
> On Tue, Nov 6, 2018 at 10:40 AM Kirk Lund  wrote:
>
>> I think we should try really hard to avoid using PowerMock. If you have
>> some code that you need to use PowerMock to make it testable, please
>> consider refactoring the code to a) avoid statics, b) pass all
>> dependencies
>> in via the constructor.
>>
>> The following was spit out by one of my unit test runs. None of the
>> logging
>> or log4j tests in my PR involve PowerMock. Also, this unit test run had no
>> FAILED tests, so this would appear to be generated by some unit test that
>> uses PowerMock and PASSED.
>>
>> > Task :geode-core:test
>> 2018-11-06 18:03:49,443 Distributed system shutdown hook ERROR Could not
>> reconfigure JMX java.lang.LinkageError: loader constraint violation:
>> loader
>> (instance of org/powermock/core/classloader/MockClassLoader) previously
>> initiated loading for a different type with name
>> "javax/management/MBeanServer"
>> at java.lang.ClassLoader.defineClass1(Native Method)
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>> at
>>
>> org.powermock.core.classloader.MockClassLoader.loadUnmockedClass(MockClassLoader.java:262)
>> at
>>
>> org.powermock.core.classloader.MockClassLoader.loadModifiedClass(MockClassLoader.java:206)
>> at
>>
>> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass1(DeferSupportingClassLoader.java:89)
>> at
>>
>> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass(DeferSupportingClassLoader.java:79)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at
>>
>> org.apache.logging.log4j.core.jmx.Server.unregisterAllMatching(Server.java:335)
>> at
>>
>> org.apache.logging.log4j.core.jmx.Server.unregisterLoggerContext(Server.java:259)
>> at
>>
>> o

Re: PowerMock unit test errors

2018-11-14 Thread Ryan McMahon

I will write up a story to address the use of PowerMock in these JMX tests
in particular.  I remember attempting to avoid PowerMock when writing this
test, because I agree that it should be avoided.  I just want to explain my
thinking so that we can discuss what would have been a better approach.

In this particular set of tests, I used it to mock two static methods:
1. InternalDistributedSystem.getConnectedInstance()
2. LogService.getLogger()

I considered making these instance methods as part of this ticket.  When I
started down that path, I found there are 41 calls to
InternalDistributedSystem.getConnectedInstance(), and many of the callers
have no reference to an instance of InternalDistributedSystem.  While I
think this is a solvable problem, it would have vastly increased the scope
of the work involved.  Would this ticket have been the place to do that
work?  I am under the impression that we have some Jira tickets
specifically related to removing statics.  Are those more aptly fit for
that work?

I agree that the LogService could have been injected in the MBeanJMXAdapter
constructor, so that would have been a reasonable API change to make this
testable without mocking the static LogService.  However, I'm still
interested in what people think about the InternalDistributedSystem case
described above.

There are ways I could have changed the public API of
InternalDistributedSystem and related classes to make them testable without
the use of PowerMock.  There are a lot of ways to do this, but they all
boil down to adding test hooks in public API.  Is that better than using
PowerMock?  I see the following options:

1. Don't test at this level which requires mocking statics
2. Use PowerMock
3. Add test hooks (perhaps make InternalDistributedSystem wrap a singleton
which you can swap for a test object, but there are other ways to do this)
4. Do major refactoring to eliminate the statics involved

I don't think that not testing at this level is a great solution.  We've
discussed the disadvantages of PowerMock already, so the cons of #2 are
well known.  I am not a fan of adding public API test hooks, and in this
case I'd rather use PowerMock honestly, but that is highly subjective.  And
#4 is the "ideal world" case, but isn't always practically possible given
the scope of the work involved.

I'm curious what other people think about this topic?

Thanks,
Ryan

On Tue, Nov 6, 2018 at 10:40 AM Kirk Lund  wrote:

> I think we should try really hard to avoid using PowerMock. If you have
> some code that you need to use PowerMock to make it testable, please
> consider refactoring the code to a) avoid statics, b) pass all dependencies
> in via the constructor.
>
> The following was spit out by one of my unit test runs. None of the logging
> or log4j tests in my PR involve PowerMock. Also, this unit test run had no
> FAILED tests, so this would appear to be generated by some unit test that
> uses PowerMock and PASSED.
>
> > Task :geode-core:test
> 2018-11-06 18:03:49,443 Distributed system shutdown hook ERROR Could not
> reconfigure JMX java.lang.LinkageError: loader constraint violation: loader
> (instance of org/powermock/core/classloader/MockClassLoader) previously
> initiated loading for a different type with name
> "javax/management/MBeanServer"
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> at
>
> org.powermock.core.classloader.MockClassLoader.loadUnmockedClass(MockClassLoader.java:262)
> at
>
> org.powermock.core.classloader.MockClassLoader.loadModifiedClass(MockClassLoader.java:206)
> at
>
> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass1(DeferSupportingClassLoader.java:89)
> at
>
> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass(DeferSupportingClassLoader.java:79)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at
>
> org.apache.logging.log4j.core.jmx.Server.unregisterAllMatching(Server.java:335)
> at
>
> org.apache.logging.log4j.core.jmx.Server.unregisterLoggerContext(Server.java:259)
> at
>
> org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:164)
> at
>
> org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:140)
> at
>
> org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:558)
> at
>
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:619)
> at
>
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:636)
> at
> org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:231)
> at
>
> org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:243)
> at
>
> org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
> at org.apache.logging.log4j.LogManager.getContext(LogManager.java:174)
> at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:661)
>

Re: [DISCUSS] Disable merge for failing pull requests

2018-11-13 Thread Ryan McMahon

+1 I like this idea, but I recognize that it will be a challenge when there
is still some flakiness to the pipeline.  I think we'd need clear
guidelines on what to do if your PR fails due to something seemingly
unrelated.  For instance, we ran into GEODE-5943 (flaky EvictionDUnitTest)
in our last PR, and saw that there was already an open ticket for the
flakiness (we even reverted our change and reproduced to be sure).  So we
triggered another PR pipeline and it passed the second time.  Is rerunning
the pipeline again sufficient in this case?  Or should we have stopped what
we were doing and took up GEODE-5943, assuming it wasn't assigned to
someone?  If it was already assigned to someone, do we wait until that bug
is fixed before we run through the PR pipeline again?

I think if anything this will help us treat bugs that are causing a red
pipeline as critical, and I think that is a good thing.  So I'm still +1
despite the flakiness.  Just curious what people think about how we should
handle cases where there is a known failure and it is indeed unrelated to
our PR.

Ryan


On Mon, Nov 12, 2018 at 2:49 PM Jinmei Liao  wrote:

> Just to clarify, that flaky EvictionDUnitTest is old flaky. The PR to
> refactor the test passed all checks, even the repeatTest as well. I had a
> closed PR that just touched the "un-refactored" EvictionDUnitTest, it
> wouldn't even pass the repeatTest at all.
>
> On Mon, Nov 12, 2018 at 2:04 PM Dan Smith  wrote:
>
> > To be clear, I don't think we have an issue of untrustworthy committers
> > pushing code they know is broken to develop.
> >
> > The problem is that it is all to easy to look at a PR with some failures
> > and *assume* your code didn't cause the failures and merge it anyway. I
> > think we should all be at least rerunning the tests and not merging the
> PR
> > until everything passes. If the merge button is greyed out, that might
> help
> > communicate that standard to everyone.
> >
> > Looking at the OpenJDK 8 metrics, it looks to me like most of the issues
> > are recently introduced (builds 81-84 and the EvictionDUnitTest), not old
> > flaky tests. So I think we were a little more disciplined always
> listening
> > to what the checks are telling us we would have less noise in the long
> run.
> >
> >
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-metrics/jobs/GeodeDistributedTestOpenJDK8Metrics/builds/23
> >
> > -Dan
> >
> > On Mon, Nov 12, 2018 at 11:21 AM Udo Kohlmeyer  wrote:
> >
> > > 0
> > >
> > > Patrick does make a point. The committers on the project have been
> voted
> > > in as "responsible, trustworthy and best of breed" and rights and
> > > privileges according to those beliefs have been bestowed.
> > >
> > > We should live those words and trust our committers. In the end.. If
> > > there is a "rotten" apple in the mix this should be addressed, maybe
> not
> > > as public flogging, but with stern communications.
> > >
> > > On the other side, I've also seen the model where the submitter of PR
> is
> > > not allowed to merge + commit their own PR's. That befalls to another
> > > committer to complete this task, avoiding the whole, "I'll just quickly
> > > commit without due diligence".
> > >
> > > --Udo
> > >
> > >
> > > On 11/12/18 10:23, Patrick Rhomberg wrote:
> > > > -1
> > > >
> > > > I really don't think this needs to be codified.  If people are
> merging
> > > red
> > > > PRs, that is a failing as a responsible developer.  Defensive
> > programming
> > > > is all well and good, but this seems like it's a bit beyond the pale
> in
> > > > that regard.  I foresee it making the correction of a red main
> pipeline
> > > > must more difficult that it needs to be.  And while it's much better
> > than
> > > > it had been, I am (anecdotally) still seeing plenty of flakiness in
> my
> > > PRs,
> > > > either resulting from infra failures or test classes that need to be
> > > > refactored or reevaluated.
> > > >
> > > > If someone is merging truly broken code that has failed precheckin,
> > that
> > > > seems to me to be a discussion to have with that person.   If it
> > > > persists, perhaps a public flogging would be in order.  But at
> the
> > > end
> > > > of the day, the onus is on us to be responsible developers, and no
> > amount
> > > > of gatekeeping is going to be a substitute for that.
> > > >
> > > > On Mon, Nov 12, 2018 at 9:38 AM, Galen O'Sullivan <
> > gosulli...@pivotal.io
> > > >
> > > > wrote:
> > > >
> > > >> I'm in favor of this change, but only if we have a way to restart
> > > failing
> > > >> PR builds without being the PR submitter. Any committer should be
> able
> > > to
> > > >> restart the build. The pipeline is still flaky enough and I want to
> > > avoid
> > > >> the situation where a new contributor is asked repeatedly to push
> > empty
> > > >> commits to trigger a PR build (in between actual PR review) and
> their
> > PR
> > > >> gets delayed by days if not weeks. That's a real bad experience

Re: Release branch for Apache Geode 1.8.0 has been cut

2018-11-12 Thread Ryan McMahon

Hi Alexander,

I would like to cherry pick the following commits from develop to
release/1.8.0:

https://github.com/apache/geode/commit/e9ea18e18c85b977b91192d4edbb9a4e18b2643e
*Reason*: This was a revert of a previous commit which addressed data
inconsistencies between WAN sites, but it was found the fix introduced
several other inconsistencies.

https://github.com/apache/geode/commit/aab0198e8478d4246042b2eb889c8ce7e28bb52e
*Reason: *This fixes a race condition in the QueryMonitor which causes an
unexpected RejectedExecutionException in monitorQuery()

Please let me know if this sounds reasonable and I can go ahead and begin
the cherry picking process.

Thanks!
Ryan

On Thu, Nov 8, 2018 at 1:54 PM Alexander Murmann 
wrote:

> Hello everyone,
>
> As discussed previously created a new release branch for Apache Geode 1.8.0
> - "release/1.8.0"
>
> Please do review and raise any concern with the release branch.
> If no concerns are raised, we will start with the voting for the release
> candidate soon.
>
> This also means that all tickets JIRA that get resolved on develop should
> be marked with 1.9.0 as their fix version.
>
> Thanks!
>
> Alexander
>

Re: [DISCUSS] Cutting 1.8 release branch

2018-11-02 Thread Ryan McMahon

Bill Burcham and I have been working on a data inconsistency issue which
involves a lost event across WAN sites during rebalance on the originating
site.  We are currently performing root cause analysis.  Below is a Geode
ticket which we will update with more details as we learn more.

https://issues.apache.org/jira/browse/GEODE-5972

On Thu, Nov 1, 2018 at 5:23 PM Ernest Burghardt 
wrote:

> and PR 390 has been approved and merged
>
> On Thu, Nov 1, 2018 at 5:10 PM Ernest Burghardt 
> wrote:
>
> > geode-native fixes are in
> https://github.com/apache/geode-native/pull/390
> >
> > On Thu, Nov 1, 2018 at 4:06 PM Anthony Baker  wrote:
> >
> >> The geode-native source headers I mentioned in [1] need to be cleaned
> up.
> >>
> >> Anthony
> >>
> >> [1]
> >>
> https://lists.apache.org/thread.html/8c9da19d7c0ef0149b1ed79bf0cecde38f17a854ecfa0f0a42f1ff0b@%3Cdev.geode.apache.org%3E
> >>
> >> > On Nov 1, 2018, at 2:01 PM, Bruce Schuchardt 
> >> wrote:
> >> >
> >> > This PR has been merged to develop
> >> >
> >> >
> >> > On 11/1/18 1:35 PM, Bruce Schuchardt wrote:
> >> >> I would like to get this PR in the release:
> >> https://github.com/apache/geode/pull/2757
> >> >>
> >> >> I'm testing the merge to develop now
> >> >>
> >> >>
> >> >> On 11/1/18 1:18 PM, Sai Boorlagadda wrote:
> >> >>> Sure! agree that we should hold the releases unless there is a
> >> >>> critical issue.
> >> >>>
> >> >>> This is not a gating issue and the code is already committed to
> >> develop.
> >> >>>
> >> >>> Sai
> >> >>>
> >> >>> On Thu, Nov 1, 2018 at 1:03 PM Alexander Murmann <
> amurm...@pivotal.io
> >> >
> >> >>> wrote:
> >> >>>
> >>  In the spirit of the previously discussed timed releases, we should
> >> only
> >>  hold cutting the release for critical issues, that for some reason
> >> (not
> >>  sure how that might be) should not be fixed after the branch has
> >> been cut.
> >>  Waiting for features leads us down the slippery slope we are trying
> >> to
> >>  avoid by having timed releases. Does that make sense?
> >> 
> >>  On Thu, Nov 1, 2018 at 12:45 PM Sai Boorlagadda <
> >> sai.boorlaga...@gmail.com
> >>  wrote:
> >> 
> >> > I would like to resolve GEODE-5338 as it is currently waiting for
> >> > doc update.
> >> >
> >> > Sai
> >> >
> >> > On Thu, Nov 1, 2018 at 10:00 AM Alexander Murmann <
> >> amurm...@pivotal.io>
> >> > wrote:
> >> >
> >> >> Hi everyone,
> >> >>
> >> >> It's time to cut the release branch, since we are moving to time
> >> based
> >> >> releases. Are there any reasons why a release branch should not
> be
> >> cut
> >>  as
> >> >> soon as possible?
> >> >>
> >> >>
> >> >
> >>
> >>
>

Re: [VOTE] Time-based release schedule for minor releases

2018-10-10 Thread Ryan McMahon

I’m with Sai that it seems like we need to clear up our definitions of
minor versus patch releases.  The referenced SemVer definition indicates
that any backwards compatible bug fix qualifies for a patch release.  But
it was stated earlier that only security-related or critidal bug fixes
justify a patch release.  I personally prefer SemVer’s definition, but it’s
up for debate.

Perhaps we can do 3-month release cycles, and determine whether the release
would be patch or minor based on the changes added since the last release
(bug fixes vs new functionality).

Ryan

On Wed, Oct 10, 2018 at 7:48 AM Addison Huddy  wrote:

> +1 for a cadence based release cycle that using 3-month between minor
> releases.
>
> On Wed, Oct 10, 2018 at 7:29 AM Sai Boorlagadda  >
> wrote:
>
> > After looking at these definitions are we not talking about time-based
> > patch releases?
> > It is again subjective how much functionality makes a MINOR release.
> >
> > Though this thread is seeking consensus on time-based scheduled it is
> > specifically for minors.
> > it appears to me like we need to update our definitions before we make a
> > decision on schedule.
> >
> > On Tue, Oct 9, 2018 at 10:06 AM Alexander Murmann 
> > wrote:
> >
> > > Bruce, agreed. I think we should eliminate all the fluff language
> around
> > > "significant improvements". What's "significant" is entirely
> subjective.
> > > I'd prefer to stick to the much simpler definition from semver.org:
> > > >
> > > > Given a version number MAJOR.MINOR.PATCH, increment the:
> > > >
> > > >1. MAJOR version when you make incompatible API changes,
> > > >2. MINOR version when you add functionality in a
> > backwards-compatible
> > > >manner, and
> > > >3. PATCH version when you make backwards-compatible bug fixes.
> > > >
> > > >
> > > On Tue, Oct 9, 2018 at 9:03 AM Bruce Schuchardt <
> bschucha...@pivotal.io>
> > > wrote:
> > >
> > > > -1
> > > >
> > > > To me the definition of major vs minor releases is too muddy.  Our
> > > > Versioning and Branching
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/GEODE/Versioning+and+Branching
> > > >
> > > >
> > > > page has requirements for a Minor release that seem orthogonal to
> this
> > > > discussion.  A Minor release "must definitely include significant
> > > > improvements to current API or components that justify not be
> > configured
> > > > (sic) as /maintenance/ changes".
> > > >
> > > > The page also describes a Maintenance release that seems to be more
> > what
> > > > we're talking about here.
> > > >
> > > > I think we need a new proposal for Major / Minor / Maintenance
> release
> > > > definitions.  That's what we should be voting on.
> > > >
> > > >
> > > >
> > > > On 10/8/18 2:24 PM, Alexander Murmann wrote:
> > > > > Hi everyone,
> > > > >
> > > > > As discussed in "Predictable minor release cadence", I'd like us to
> > > find
> > > > > agreement on releasing a new minor version every three months.
> There
> > > are
> > > > > more details in the other thread and I should have captured
> > everything
> > > > > relevant on the wiki:
> > > > > https://cwiki.apache.org/confluence/display/GEODE/Release+Schedule
> > > > >
> > > > > There are also some discussions about patch releases. Let's please
> > > focus
> > > > > this vote on the proposed minor release schedule and carry on other
> > > > > discussions in the [DISCUSS] thread.
> > > > >
> > > > > Thank you all!
> > > > >
> > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Predictable minor release cadence

2018-10-04 Thread Ryan McMahon

+1 for scheduled releases and cutting the branch one month prior to
release. Given the time it took to fully root out, classify, and solve
issues with this 1.7 release, I think a month is the right amount of time
between cutting the branch and releasing.  If it ends up being too much or
too little, we can always adjust.

I don’t feel strongly about the release cadence, but I generally favor more
frequent releases if possible (3 month over 6 month).  That way new
features can get into the hands of users sooner, assuming the feature takes
less than 3 months to complete.  Again, we can adjust the cadence if
necessary if it is too frequent or infrequent.

Ryan

On Thu, Oct 4, 2018 at 4:18 PM Alexander Murmann 
wrote:

> Anil, releasing every 3 months should give us 3 months of dev work. Don't
> forget that there will be one month during which the next release is
> already cut, but the vast majority of the work is going to the release
> after that. So while we cut 1.7 one month ago and release 1.7 today, we
> already have one month of work on develop again. It's not going to work out
> for this first release though, due to my suggestion to cut a month early to
> avoid holidays. If I recall correctly our past problem was that it took
> longer to iron out issue on the release branch than expected. The solution
> to that would be to cut the release even earlier, but 1 month feels very
> generous.
>
> On Thu, Oct 4, 2018 at 4:04 PM Anilkumar Gingade 
> wrote:
>
> > If I remember from earlier discussion; the plan was to deliver a release
> > once 3 months. But from the past release history we had difficulty in
> > achieving that, either the features are not completely ready or the
> > bug-fixes have taken more time. We need verify what is right for Apache
> > Geode, 3, 4 or 6 months; and there is any community dev/activity that
> > depends on Geode release.
> > My vote will be for 4 or 6 months, as it provides at least 3+ month for
> dev
> > activity and 1 month for QA.
> >
> > -Anil.
> >
> >
> > On Thu, Oct 4, 2018 at 2:43 PM Dan Smith  wrote:
> >
> > > +1 I definitely like the idea of scheduled releases.
> > >
> > > I wonder if cutting the release branch a month ahead of time is
> overkill,
> > > but I guess we do seem to keep finding issues after the branch is cut.
> > >
> > > -Dan
> > >
> > > On Thu, Oct 4, 2018 at 1:25 PM Alexander Murmann 
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I want to propose shipping Geode on a regular cadence. My concrete
> > > proposal
> > > > is to ship Geode every 3 months on the first weekday. To make sure we
> > hit
> > > > that date we would cut the release 1 months prior to that day.
> > > >
> > > > *Why?*
> > > > Knowing on what day the release will get cut and on what day we ship
> > > allows
> > > > community members to plan their contributions. If I want my feature
> to
> > be
> > > > in the next release I know by when I need to have it merged to
> develop
> > > and
> > > > can plan accordingly. As a user who is waiting for a particular
> feature
> > > or
> > > > fix that's already on develop, I know when to expect the release that
> > > > includes this work and again, can plan accordingly.
> > > >
> > > > This makes working and using Apache Geode more predictable which
> makes
> > > all
> > > > our lives less stressful. To make this work, it's important to be
> > strict
> > > > about cutting the release branch on the set date and only allow
> > critical
> > > > fixes after the release has been cut. Once we start compromising on
> > this,
> > > > we go down a slippery slope that ultimately leads to not getting the
> > > > predictability benefits described here.
> > > >
> > > > Some other successful Apache projects share similar approaches:
> > > >
> > > >- Kafka
> > > ><
> > > https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan>
> > > >releases every 4 months and cuts the release 1 month prior
> > > >- PredictionIO <
> https://predictionio.apache.org/resources/release/>
> > > > releases
> > > >every 2 months
> > > >- Spark  does
> not
> > > seem
> > > >to have a hard date, but aims to ship every 6 months, so there is
> at
> > > > least
> > > >a goal date
> > > >
> > > >
> > > > *What?*
> > > > As stated above, I suggest to release every three months. Given we
> just
> > > > shipped the next release would go out on January 2nd. That timing in
> > > > unfortunate, due to the holidays. Therefore I propose to aim for a
> > > December
> > > > 3rd (1st Monday in December) release. In order to meet that date, we
> > > should
> > > > cut the release branch on November 1st. That also means that we
> should
> > > > start finding a volunteer to manager the release on October 25th. I
> > know
> > > > this seems really close, given we just shipped, but keep in mind that
> > > this
> > > > is to avoid the holidays and that we already have close to a month
> > worth
> > > of
>

Re: [VOTE] Apache Geode 1.7.0 RC2

2018-10-01 Thread Ryan McMahon

+1

- Verified clean build from source
- Built and ran all geode-examples

On Fri, Sep 28, 2018 at 8:52 PM Sai Boorlagadda 
wrote:

> +1
>
> * verified signatures
>
> * verified source distribution builds
>
> * ran basic gfsh commands
>
> * started pulse
>
> * verified examples work with rc2
>
>
> Sai
>
> On Fri, Sep 28, 2018 at 8:57 AM Dave Barnes  wrote:
>
> > +1
> > Downloaded the release & successfully ran some representative gfsh
> > commands.
> > There is still a problem that I noted in my RC1 review: the user guide
> > config files incorrectly specify v1.8.
> > This is not a showstopper, but this time I have created a JIRA ticket
> > (GEODE-5795) so if (heaven forfend) we roll an RC3, the correction will
> be
> > included.
> >
> > On Thu, Sep 27, 2018 at 7:13 PM, Nabarun Nag  wrote:
> >
> > > Following checks completed:
> > > - checked signatures -src.tgz -examples.tgz -.tgz
> > > - checked SHA's -src.tgz -examples.tgz -.tgz
> > > - builds from source
> > > - run gfsh - start locator, server - create region - do put and get -
> > > execute OQL query
> > > - examples run cleanly
> > > - the correct version in gfsh command version
> > > - out scripts are now present in the source distribution
> > > -> ./ci/resource-types/gce-instances-resource/out
> > > -> ./ci/resource-types/concourse-metadata-resource/files/out
> > >
> > > Regrads
> > > Nabarun Nag
> > >
> > >
> > > On Thu, Sep 27, 2018 at 6:54 PM Nabarun Nag  wrote:
> > >
> > > > This is the second release candidate for Apache Geode, version 1.7.0.
> > > > Thanks to all the community members for their contributions to this
> > > > release!
> > > >
> > > > Please do a review and give your feedback. The deadline is the end of
> > day
> > > > 2nd October 2018.
> > > >
> > > > It resolves 351 issues on Apache Geode JIRA system. Release notes can
> > be
> > > > found at:
> > > > https://cwiki.apache.org/confluence/display/GEODE/
> > > > Release+Notes#ReleaseNotes-1.7.0
> > > >
> > > >
> > > > Please note that we are voting upon the source tags: rel/v1.7.0.RC2
> > > > Apache Geode:
> > > > https://github.com/apache/geode/tree/rel/v1.7.0.RC2
> > > > Apache Geode examples:
> > > > https://github.com/apache/geode-examples/tree/rel/v1.7.0.RC2
> > > >
> > > > Commit ID:
> > > > Apache Geode:
> > > > 48061cdcc293d42020b8695a5a23c4dd360550ff
> > > > Apache Geode Examples:
> > > > 9dca29c7c10fa8d44abc893271420476cfc0808f
> > > >
> > > >
> > > > Source and binary files:
> > > > https://dist.apache.org/repos/dist/dev/geode/1.7.0.RC2/
> > > >
> > > > Maven staging repo:
> > > >
> https://repository.apache.org/content/repositories/orgapachegeode-1046
> > > >
> > > > Geode's KEYS file containing PGP keys we use to sign the release:
> > > > https://github.com/apache/geode/blob/develop/KEYS
> > > >
> > > > Signed the release with fingerprint:
> > > > rsa4096 2018-01-04 [SC] [expires: 2022-01-04]
> > > > CE6CD0A89480B1B9FCB98699274C66710770C135
> > > >
> > > > rsa4096 2018-01-04 [SC] [expires: 2022-01-04]
> > > >
> > > > I do apologize if there was an oversight.
> > > >
> > > > PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=
> > > > https://dist.apache.org/repos/dist/dev/geode/1.7.0.RC2
> > > > -PgeodeRepositoryUrl=
> > > >
> https://repository.apache.org/content/repositories/orgapachegeode-1046
> > > > build runAll
> > > >
> > > > Regards
> > > > Nabarun Nag
> > > >
> > >
> >
>

Re: LoggingTest category

2018-09-25 Thread Ryan McMahon

This information is now in the TESTING.md in the Git repo as well FYI.

Ryan

On Tue, Sep 25, 2018 at 11:51 AM Anthony Baker  wrote:

> Add a property like this on the command line when you execute a test task:
>
> -PtestCategory=org.apache.geode.test.junit.categories.GfshTest
>
> Anthony
>
>
> > On Sep 25, 2018, at 2:21 PM, Kirk Lund  wrote:
> >
> > I've made sure that all logging related tests have the LoggingTest
> category
> > but now I can no longer find any mechanism in our gradle files to execute
> > all tests of a specific component category. Is there a way to run all
> tests
> > with this category or did this get deleted?
>
>

Re: [VOTE] Apache Geode 1.7.0 RC1

2018-09-18 Thread Ryan McMahon

+1

- Read release notes and verified no incorrect/misspelled content
- Verified clean build from source
- Built and ran all geode-examples

Ryan



On Mon, Sep 17, 2018 at 5:33 PM, Anthony Baker  wrote:

> +1
>
> Reviewed LICENSE and NOTICE*
> Verified signatures and hashes
> Checked source releases for binaries
> Verified that source releases build successfully
> Confirmed that geode-examples run correctly
> Verified GitHub tag
>
> Anthony
>
> * I found a few places that need copyright date updates and will tweak
> those for the next release
>
>
> > On Sep 17, 2018, at 5:06 PM, Nabarun Nag  wrote:
> >
> > *DEADLINE :* Kindly complete your vote by Wednesday, 19th September 2018.
> > End of day PDT.
> >
> > Following checks completed:
> >
> > - checked signatures
> > - checked sha’s
> > - builds from source
> > - run gfsh - start locator, server - create region - do put and get -
> > execute OQL query
> > - examples run cleanly
> >
> > @John Blum  , I was looking into the mail chain for
> > Apache Geode 1.5.0 RC and I saw that you had some concerns regarding SD
> > Geode's behavior with the changes made to Log4j in Apache Geode. In this
> > release, too we have made some upgrades to log4j and we would to hear
> your
> > feedback / get a green light on these changes.
> >
> > Regards
> > Nabarun Nag
> >
> >
> >
> > On Fri, Sep 14, 2018 at 2:50 PM Nabarun Nag  wrote:
> >
> >> Correction :
> >> Please note that we are voting upon the source tags: rel/v1.7.0.RC1
> >>
> >> On Fri, Sep 14, 2018 at 2:49 PM Nabarun Nag  wrote:
> >>
> >>> This is the first release candidate for Apache Geode, version 1.7.0.
> >>> Thanks to all the community members for their contributions to this
> >>> release!
> >>>
> >>> Please do review and give your feedback.
> >>>
> >>> It resolves 351 issues on Apache Geode JIRA system. Release notes can
> be
> >>> found at:
> >>> https://cwiki.apache.org/confluence/display/GEODE/
> >>> Release+Notes#ReleaseNotes-1.7.0
> >>>
> >>>
> >>> Please note that we are voting upon the source tags: rel/v1.6.0.RC1
> >>> Apache Geode:
> >>> https://github.com/apache/geode/tree/rel/v1.7.0.RC1
> >>> Apache Geode examples:
> >>> https://github.com/apache/geode-examples/tree/rel/v1.7.0.RC1
> >>>
> >>> Commit ID:
> >>> Apache Geode:
> >>> f9abdeb489c9278f96afd37f72a7a9d14cb0f154
> >>> Apache Geode Examples:
> >>> 9dca29c7c10fa8d44abc893271420476cfc0808f
> >>>
> >>>
> >>> Source and binary files:
> >>> https://dist.apache.org/repos/dist/dev/geode/1.7.0.RC1/
> >>>
> >>> Maven staging repo:
> >>> https://repository.apache.org/content/repositories/orgapachegeode-1042
> >>>
> >>> Geode's KEYS file containing PGP keys we use to sign the release:
> >>> https://github.com/apache/geode/blob/develop/KEYS
> >>>
> >>> Signed the release with fingerprint:
> >>> rsa4096 2018-01-04 [SC] [expires: 2022-01-04]
> >>> CE6CD0A89480B1B9FCB98699274C66710770C135
> >>>
> >>> rsa4096 2018-01-04 [SC] [expires: 2022-01-04]
> >>>
> >>> I do apologize if there was an oversight.
> >>>
> >>> Regards
> >>> Nabarun Nag
> >>>
> >>
>
>

Updating instructions to run tests

2018-09-17 Thread Ryan McMahon

Hi all,

I recently added instructions on how to setup Geode in IntelliJ
,
and as part of that effort realized there is some duplication between the
Wiki [1

, 2
]
and the BUILDING.md
.

I am removing this duplication and noticed that we have some outdated
instructions on how to run tests in those Wiki sections.  Before I update
these instructions, I wanted to propose we move these instructions to
somewhere in the repo.  Here are some options:

1. Add a TESTING.md with instructions on running tests and link to it from
the Wiki
2. Add instructions on running tests to the existing BUILDING.md
3. Put all build and test instructions in the README.md, and get rid of
BUILDING.md.  I mention this option because this is what Apache Spark and
Apache Kafka do.  This might be hard for us because we have examples and
other content in the README.md already, so it would become quite bloated.
4. Leave the testing instructions where they are in the Wiki but update
them to be accurate

I would vote for #1.  I am open to other options as well if none of these
seem ideal.

Thanks,
Ryan

Re: Instructions for Setting Up IntelliJ

2018-09-11 Thread Ryan McMahon

Kirk - I have a PR open here which has the "Setting up IntelliJ" section
you've described:
https://github.com/apache/geode/pull/2456

I would be happy to use your version if you think it is more comprehensive,
or we can revise my PR to include any details you feel are missing.

Ryan

On Tue, Sep 11, 2018 at 4:15 PM, Jacob Barrett  wrote:

> Put it with the source! BUILDING.md or other file.
>
> > On Sep 11, 2018, at 4:11 PM, Kirk Lund  wrote:
> >
> > I don't care which location (wiki or part of the readme) but I do have
> > up-to-date instructions that I could update either location with. Maybe a
> > detailed "Setting up IntelliJ" section on BUILDING.md? Just let me know
> if
> > you'd like my version.
> >
> >> On Tue, Sep 11, 2018 at 3:36 PM, Jianxia Chen 
> wrote:
> >>
> >> +1 to revise the wiki to link to README.md/BUILDING.md
> >>
> >> On Tue, Sep 11, 2018 at 3:22 PM Ryan McMahon 
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I am looking to add more comprehensive instructions for how to bring
> >> Geode
> >>> into IntelliJ.  I've written the instructions but am now looking at
> where
> >>> to put them.
> >>>
> >>> There appears to be duplicate information in these sections of the
> Geode
> >>> wiki and the README.md/BUILDING.md in the Geode Git repository.
> >>>
> >>>
> >>> https://cwiki.apache.org/confluence/display/GEODE/
> >> Getting+Started+for+Geode+Developers
> >>>
> >>> https://cwiki.apache.org/confluence/display/GEODE/
> >> Building+and+Running+Geode+from+Source
> >>>
> >>> I'm a fan of one source of truth, but I wanted to see if anyone felt
> >>> strongly about where the information lives.  I propose we revise the
> wiki
> >>> to simply link to the README.md/BUILDING.md and eliminate the duplicate
> >>> information (how to run gradlew, etc).  Any thoughts?
> >>>
> >>> Thanks,
> >>> Ryan
> >>>
> >>
>

Instructions for Setting Up IntelliJ

2018-09-11 Thread Ryan McMahon

Hi all,

I am looking to add more comprehensive instructions for how to bring Geode
into IntelliJ.  I've written the instructions but am now looking at where
to put them.

There appears to be duplicate information in these sections of the Geode
wiki and the README.md/BUILDING.md in the Geode Git repository.

https://cwiki.apache.org/confluence/display/GEODE/Getting+Started+for+Geode+Developers
https://cwiki.apache.org/confluence/display/GEODE/Building+and+Running+Geode+from+Source

I'm a fan of one source of truth, but I wanted to see if anyone felt
strongly about where the information lives.  I propose we revise the wiki
to simply link to the README.md/BUILDING.md and eliminate the duplicate
information (how to run gradlew, etc).  Any thoughts?

Thanks,
Ryan

New Committers: Use GitBox to Link Apache Account to GitHub Account

2018-09-07 Thread Ryan McMahon

Hi Everyone,

I just wanted to relay some information to new committers.  We determined
that in order to get write access to the Geode repository, you need to go
to this site and go through the linking process.

https://gitbox.apache.org/setup/

If you don't to do this, you will be added to the "Apache Committers" team
but not the "geode committers" team.

Ryan
mcmellaw...@apache.org

Re: 2 minute gateway startup time due to GEODE-5591

2018-09-05 Thread Ryan McMahon

+1 for reverting in both places.

I see that there is already an isGatewayReceiver flag in the AcceptorImpl
constructor.  It's not ideal, but could we use this flag to prevent the 2
minute retry logic for happening if this flag is true?

Ryan

On Wed, Sep 5, 2018 at 10:01 AM, Lynn Hughes-Godfrey <
lhughesgodf...@pivotal.io> wrote:

> +1 for reverting in both places.
>
> On Wed, Sep 5, 2018 at 9:50 AM, Dan Smith  wrote:
>
> > +1 for reverting in both places. The current fix is not better, that's
> why
> > we are reverting it on the release branch!
> >
> > -Dan
> >
> > On Wed, Sep 5, 2018 at 9:47 AM, Jacob Barrett 
> wrote:
> >
> > > I’m not ok with reverting in develop. Revert in 1.7 and modify in
> > develop.
> > > We shouldn’t go backwards in develop. The current fix is better than
> the
> > > bug it fixes.
> > >
> > > > On Sep 5, 2018, at 9:40 AM, Nabarun Nag  wrote:
> > > >
> > > > If everyone is okay with it, I will revert that change in develop and
> > > then
> > > > cherry pick it to release/1.7.0 branch.
> > > > Please do comment.
> > > >
> > > > Regards
> > > > Nabarun Nag
> > > >
> > > >
> > > >> On Wed, Sep 5, 2018 at 9:30 AM Dan Smith  wrote:
> > > >>
> > > >> +1 to yank it and rework the fix.
> > > >>
> > > >> Gester's change helps, but it just means that you will sometimes
> > > randomly
> > > >> have a 2 minute delay starting up a gateway receiver. I don't think
> > > that is
> > > >> a great user experience either.
> > > >>
> > > >> -Dan
> > > >>
> > > >> On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt <
> > > bschucha...@pivotal.io>
> > > >> wrote:
> > > >>
> > > >>> Let's yank it
> > > >>>
> > > >>>
> > > >>>
> > >  On 9/4/18 5:04 PM, Sean Goller wrote:
> > > 
> > >  If it's to get the release out, I'm fine with reverting. I don't
> > like
> > > >> it,
> > >  but I'm not willing to die on that hill. :)
> > > 
> > >  -S.
> > > 
> > >  On Tue, Sep 4, 2018 at 4:38 PM Dan Smith 
> wrote:
> > > 
> > >  Spitting this into a separate thread.
> > > >
> > > > I see the issue. The two minute timeout is the constructor for
> > > > AcceptorImpl, where it retries to bind for 2 minutes.
> > > >
> > > > That behavior makes sense for CacheServer.start.
> > > >
> > > > But it doesn't make sense for the new logic in
> > > GatewayReceiver.start()
> > > > from
> > > > GEODE-5591. That code is trying to use CacheServer.start to scan
> > for
> > > an
> > > > available port, trying each port in a range. That free port
> finding
> > > >> logic
> > > > really doesn't want to have two minutes of retries for each port.
> > It
> > > > seems
> > > > like we need to rework the fix for GEODE-5591.
> > > >
> > > > Does it make sense to hold up the release to rework this fix, or
> > > should
> > > > we
> > > > just revert it? Have we switched concourse over to using alpine
> > > linux,
> > > > which I think was the original motivation for this fix?
> > > >
> > > > -Dan
> > > >
> > > > On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith 
> > wrote:
> > > >
> > > > Why is it waiting at all in this case? Where is this 2 minute
> > timeout
> > > >> coming from?
> > > >>
> > > >> -Dan
> > > >>
> > > >> On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda <
> > > >>
> > > > sai.boorlaga...@gmail.com
> > > >
> > > >> wrote:
> > > >>> So the issue is that it takes longer to start than previous
> > > releases?
> > > >>> Also, is this wait time only when using Gfsh to create
> > > >>> gateway-receiver?
> > > >>>
> > > >>> On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag 
> > > wrote:
> > > >>>
> > > >>> Currently we have a minor issue in the release branch as
> pointed
> > > out
> > > 
> > > >>> by
> > > >
> > > >> Barry O.
> > >  We will wait till a resolution is figured out for this issue.
> > > 
> > >  Steps:
> > >  1. create locator
> > >  2. start server --name=server1 --server-port=40404
> > >  3. start server --name=server2 --server-port=40405
> > >  4. create gateway-receiver --member=server1
> > >  5. create gateway-receiver --member=server2 `This gets stuck
> > for 2
> > > 
> > > >>> minutes`
> > > >>>
> > >  Is the 2 minute wait time acceptable? Should we document it?
> > When
> > > we
> > > 
> > > >>> revert
> > > >>>
> > >  GEODE-5591, this issue does not happen.
> > > 
> > >  Regards
> > >  Nabarun Nag
> > > 
> > > 
> > > >>>
> > > >>
> > >
> >
>

Re: Geode Wiki Write/Edit Permissions

2018-09-04 Thread Ryan McMahon

Follow up: looks like my wiki username is just 'rmcmahon'.

Thanks,
Ryan McMahon
rmcma...@pivotal.io

On Tue, Sep 4, 2018 at 9:21 AM, Ryan McMahon  wrote:

> Hello,
>
> I'd like to request write/edit permissions on the Geode wiki.  My Apache
> username is mcmellawatt and the email address I used with Confluence is
> rmcma...@pivotal.io.
>
> Thanks,
> Ryan McMahon
> rmcma...@pivotal.io
>

Geode Wiki Write/Edit Permissions

2018-09-04 Thread Ryan McMahon

Hello,

I'd like to request write/edit permissions on the Geode wiki.  My Apache
username is mcmellawatt and the email address I used with Confluence is
rmcma...@pivotal.io.

Thanks,
Ryan McMahon
rmcma...@pivotal.io

Re: no test category, rename tests?

2018-08-21 Thread Ryan McMahon

+1 for the naming convention of *Test, *IntegrationTest, and
*DistributedTest.  It’s mostly personal preference, but I also agree with
Kirk’s comment that it makes it easier to find the test you are looking for
by file name.  I also find it easier to see the test type I’m working with
in the file tabs of my IDE without thinking about the source set.  Again,
it’s mostly personal preference but I think it’s a good convention.

Ryan

On Tue, Aug 21, 2018 at 12:00 PM Kirk Lund  wrote:

> Imagine if all the dunits and integration tests were named FooTest and now
> you finally want to write a unit test for Foo. You'll need to create a 3rd
> test named FootTest or rename the existing FooTests to FooIntegrationTest
> or FooDistributedTest. Since we have separate source sets for all three,
> naming all three FooTest is possible. But in my opinion this is just going
> to cause a lot of suffering when you're searching for the right test to
> modify for a code change or to debug a CI failure. What do you prefer?
>
> On Tue, Aug 21, 2018 at 11:51 AM, Sai Boorlagadda <
> sai.boorlaga...@gmail.com
> > wrote:
>
> > Are there any benefits of naming them specifically to reflect the
> category?
> > I am used to look at test category annotation to differentiate a test is
> a
> > single-vm or a multi-vm.
> >
> > On Tue, Aug 21, 2018 at 11:29 AM Anthony Baker 
> wrote:
> >
> > > The original purpose of specific names was because we filtered test
> > > execution based on the name pattern (then we switched to categories and
> > now
> > > we use source sets / dirs).
> > >
> > > I like the idea of changing *JUnitTest to *Test.  I’m not as excited
> > about
> > > *IntegrationTest and friends.  But I’m willing to go with group
> consensus
> > > on this (aka +0).
> > >
> > > Anthony
> > >
> > >
> > > > On Aug 21, 2018, at 10:11 AM, Kirk Lund  wrote:
> > > >
> > > > +1 Yes! This naming scheme matches what we already have on the Apache
> > > Geode
> > > > Wiki [1] for naming tests.
> > > >
> > > > Side note: Looks like we need to update the pages on the Wiki to
> match
> > > our
> > > > recent changes involving JUnit categories.
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/GEODE/Writing+tests
> > > >
> > > > On Tue, Aug 21, 2018 at 9:52 AM, Sai Boorlagadda <
> > > sai_boorlaga...@apache.org
> > > >> wrote:
> > > >
> > > >> Many tests file names are *JUnitTest.java[1] even though it is an
> > > >> integration test or distributed test. After we segregated into
> > > respective
> > > >> folders and removed the test category it is not visible which test
> it
> > is
> > > >> unless we look back into the path of the source directory
> > > (integrationTest
> > > >> or distributedTest).
> > > >>
> > > >> So I wanted to rename tests to reflect test category. Let me know if
> > > there
> > > >> is okay and if I do rename are there any downstream impacts (eg:
> > > metrics,
> > > >> auto diagnosis etc)
> > > >>
> > > >> Any integration test will be renamed as *IntegrationTest.java
> > > >> Any distributed test will be renamed as *DistributedTest.java
> > > >> Any unit test will be renamed as *Test.java.
> > > >>
> > > >> [1]
> > > >> https://github.com/apache/geode/blob/develop/geode-core/
> > > >> src/integrationTest/java/org/apache/geode/internal/cache/
> > > >> ComplexDiskRegionJUnitTest.java
> > > >>
> > >
> > >
> >
>

Re: trying to implement SSL configuration

2018-06-20 Thread Ryan McMahon

Hi Liron,


The first thing that jumps out to me when you say that GFSH could not
connect to the JMX manager is that you need to have `jmx` in addition to
`locator` in your `ssl-enabled-components` Geode system property.  For
example, you'd need ssl-enabled-components=locator,jmx at a minimum for
GFSH to connect.  it's a bit different if you pass --use-http to your
`connect` command, but it doesn't appear you are doing that.


Ryan

On Wed, Jun 20, 2018 at 8:46 AM, Liron Ben Ari 
wrote:

> Hi ,
> Well , I managed!! All my processes are talking with SSL configuration
> (hip hip Horay ☺)
> I figure out – that I need client authentication and server authentication
> in the server certificate EKU , and that I need a single  depth hierarchy ,
> I am not sure it will be the case when I wil need to implement it in the
> customer site…
>
> Does anyone have id why it was used like this?
>
>
> Last question…
> I am trying to configure the gfsh to connect to my locator.
> I’ve added to the connect command the needed properties…
>
>
> ${GEMFIRE_HOME}/bin/gfsh -e "connect --locator=192.168.2.100[1028]
> --use-ssl  --security-properties-file=$GF_SERVER_DIR/properties/
> gemfire.sec.properties
>
> I can see that he is able to connect to the locator – but I see that it is
> trying to connect to the manager without success.
> Does anyone know if I need to add another certificate or key for the
> manager?
>
>
> 1) Executing - connect --locator=192.168.2.100[1028] --use-ssl
> --security-properties-file=/users/xpiwrk1/GemFire/Server/
> properties/gemfire.sec.properties
>
> Connecting to Locator at [host=192.168.2.100, port=1028] ..
> Connecting to Manager at [host=eaasrt, port=1029] ..
> Could not connect to : [host=eaasrt, port=1029]. Failed to retrieve
> RMIServer stub: javax.naming.CommunicationException [Root exception is
> java.rmi.ConnectIOException: error during JRMP connection establishment;
> nested exception is:
> javax.net.ssl.SSLHandshakeException: Received fatal alert:
> handshake_failure]
>
>
>
> Thank you so much!!!
> From: Ernest Burghardt [mailto:eburgha...@pivotal.io]
> Sent: Tuesday, June 12, 2018 7:27 PM
> To: u...@geode.apache.org
> Cc: Udo Kohlmeyer ; dev@geode.apache.org; Gregory
> Vortman ; Vladi Polonsky
> ; Alon Bar-Lev 
> Subject: Re: trying to implement SSL configuration
>
> Hello,
>
> For "native" C++ interaction have a look at geode-native/cppcache/
> integration-test/testThinClientSSL
> This should provide an example of connecting with SSL enabled...
>
> EB
>
> On Tue, Jun 12, 2018 at 2:48 AM, Liron Ben Ari  mailto:liron.ben...@amdocs.com>> wrote:
>
> We check  - the PKCS12 works  - (as  we saw it in the s_client)
> It looks like the server did not found  a valid certificate...
>
> Maybe you have a working example? When the client is native c++?
>
> Thanks!!
>
> -Original Message-
> From: Liron Ben Ari
> Sent: Tuesday, June 12, 2018 11:25 AM
> To: Udo Kohlmeyer mailto:ukohlme...@pivotal.io>>;
> dev@geode.apache.org; u...@geode.apache.org
> 
> Cc: Gregory Vortman  gregory.vort...@amdocs.com>>; Vladi Polonsky  mailto:vladi.polon...@amdocs.com>>; Alon Bar-Lev  mailto:alon.bar...@amdocs.com>>
> Subject: RE: trying to implement SSL configuration
>
> Hi ,
> Thanks you for the quick respond.
> So according to the link you send, the keystore type is jks as well.
> I will try  and update...
> But according the client configuration (I found this document for it:
> http://pubs.vmware.com/vfabric53/topic/com.vmware.
> ICbase/PDF/vfabric-gemfire-nc-ug-7.0.1.pdf)
>
> The  keystore for the native client should be in PEM format.
>
>
>
> -Original Message-
> From: Udo Kohlmeyer [mailto:ukohlme...@pivotal.io ukohlme...@pivotal.io>]
> Sent: Tuesday, June 12, 2018 1:49 AM
> To: dev@geode.apache.org; Liron Ben Ari <
> liron.ben...@amdocs.com>;
> u...@geode.apache.org
> Cc: Gregory Vortman  gregory.vort...@amdocs.com>>; Vladi Polonsky  mailto:vladi.polon...@amdocs.com>>; Alon Bar-Lev  mailto:alon.bar...@amdocs.com>>
> Subject: Re: trying to implement SSL configuration
>
> Hi there,
>
> Have you tried the following?
>
> https://docs.oracle.com/cd/E19798-01/821-1841/gjrgy/index.html
>
> I have not tried to use a PKCS12 keystore type. Was there a particular
> reason why you are using it? Could you try with a JKS?
>
> --Udo
>
> On 6/11/18 03:31, Liron Ben Ari wrote:
> > Hello team.
> > I am trying to move my Client server to work with SSL as part of
> Security POC we are running .
> > I was moving on GEODE documents  (there are a lot! :)) and there was a
> lot of different options...
> >
> >
> >
> > This is the configuration  I used:
> >
> > I've generated Keystore & certificate using a private tool (that uses
> > the openssl + Keytools)
> >
> > For client:
> >   A file containing PEM encoded X.509 certificate and PEM encoded
> > PKCS#8 encrypted

Re: Geode JIRA permissions

2018-03-14 Thread Ryan McMahon

Hi Dan,

My username is rmcmahon according to my profile.  Not sure how the second
username got associated with my email address.

Thanks,
Ryan

On Wed, Mar 14, 2018 at 11:24 AM, Dan Smith <dsm...@pivotal.io> wrote:

> Hi Ryan,
>
> What's your JIRA username? I see two usernames linked to this email
> address mcmellawatt and rmcmahon.
>
> -Dan
>
> On Wed, Mar 14, 2018 at 11:09 AM, Ryan McMahon <rmcma...@pivotal.io>
> wrote:
>
>> Hello,
>>
>> I need permissions to update tickets in Geode JIRA.  If someone can grant
>> me those permissions it would be appreciated.
>>
>> Thanks,
>> Ryan
>>
>
>

Geode JIRA permissions

2018-03-14 Thread Ryan McMahon

Hello,

I need permissions to update tickets in Geode JIRA.  If someone can grant
me those permissions it would be appreciated.

Thanks,
Ryan

54 matches

Mail list logo