Re: [VOTE] change Default branch for geode-examples to 'develop'

2020-07-14 Thread Anthony Baker
-1.  Happy to change my mind if there’s a user-friendly way to deal with the 
scenario I mentioned below.

Anthony


On Jul 14, 2020, at 8:40 AM, Owen Nichols wrote:

Hi Anthony, there is a separate discuss thread [1] for this topic.  This is the 
vote thread for the ASF INFRA ticket [2] to make this one specific change that 
came out of the discussion.  Your input is valuable and I encourage you to both 
vote on this thread and continue the conversation on the discuss thread.

[1] https://lists.apache.org/thread.html/rfec15c0a7d5d6d57beed90868dbb53e3bfcaabca67589b28585556ee%40%3Cdev.geode.apache.org%3E
[2] https://issues.apache.org/jira/browse/INFRA-20510


On 7/14/20, 7:16 AM, "Anthony Baker" wrote:

   Consider the use case of an application developer who wants to run 
geode-examples against the latest geode release:

   1) brew install apache-geode
   2) git clone geode-examples
   3) Get some runtime errors because geode-examples won’t connect to a 
previous geode release

   At this point, you have to do some detective work to either download the 
geode-examples from the corresponding source release or switch over to the 
appropriate git tag.

   I think there’s value in maintaining a default branch of geode-examples that 
tracks the latest release.


   Anthony


On Jul 9, 2020, at 9:39 PM, Owen Nichols wrote:

A fresh checkout of geode and all but one geode- repos checks 
out develop as the Default branch.

The lone exception is geode-examples.  Please vote +1 if you are in favor of 
changing its Default branch to develop for consistency with the other repos and 
other reasons as per recent discussion[1].

[1] https://lists.apache.org/x/thread.html/rfec15c0a7d5d6d57beed90868dbb53e3bfcaabca67589b28585556ee@%3Cdev.geode.apache.org%3E



Re: Odg: [INFO] Latest test run of 200 DistributedTestOpenJDK8 passes

2020-07-14 Thread Alexander Murmann
Thanks for looking into this, Mario!

You are probably right that the underlying issue may have been pre-existing
and that the test is merely surfacing it. I am glad you are investigating,
though, because a fail rate close to 30% is a problem. Something like this
happens every once in a while, and then someone has to do unplanned work to
resolve historical problems.

Thanks!

On Tue, Jul 14, 2020 at 1:06 AM Mario Ivanac  wrote:

> Hi,
>
> After adding additional checks to the failing test, I can now see that the
> tests are failing because some batches are still distributed while the GW
> sender is being stopped.
> Because of that, I suspect that this problem existed prior to this PR, and
> that this PR is simply the first to introduce a test that checks for it.
>
> I will continue to investigate this fault, but I cannot reproduce it
> locally, which is slowing down troubleshooting.
>
> BR,
> Mario
> 
> From: Alexander Murmann
> Sent: 14 July 2020 1:11
> To: Alexander Murmann
> Cc: dev@geode.apache.org; Mario Ivanac
> Subject: Re: [INFO] Latest test run of 200 DistributedTestOpenJDK8 passes
>
> We continue to see these WAN tests adding a fail rate of just below 30% in
> our mass test runs
> <https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/create-mass-test-run-report/builds/10>.
>
> That's a very significant fail rate that impacts our ability to get our
> code committed with confidence.
>
> Can we resolve this issue? Otherwise, I think we need to consider reverting
> GEODE-7458.
>
> On Fri, Jun 19, 2020 at 3:28 PM Alexander Murmann wrote:
>
> > Looking more into this, it looks like this was introduced by the changes
> > for GEODE-7458 - "Adding additional option in gfsh command "start gateway
> > sender" to control clearing of existing queues".
> >
> > That happened about a month ago, but it's inherent to those flaky tests
> > that we discover them only after a while. Nonetheless, they become paper
> > cuts that ultimately slow us down substantially if they persist.
> >
> > @Mario Ivanac If I am correct and GEODE-7458 introduced this you were the
> > one making that change. Might you be able to take a look at making that
> > test more reliable or reverting the change?
> >
> > Thank you!
> >
> > On Fri, Jun 19, 2020 at 7:57 AM Alexander Murmann wrote:
> >
> >> Thank you so much for sharing this, Mark!
> >>
> >> It looks like there is a big cluster around WAN Gateway. Is anyone
> >> already looking into the WAN issues?
> >>
> >> On Thu, Jun 18, 2020 at 10:06 PM Mark Hanson wrote:
> >>
> >>> FYI, the build success rate was around 90% or so about two months ago.
> >>>
> >>> Here are the DUnit tests that are currently failing in our tests, most
> >>> likely in CI, and PR pipelines.
> >>>
> >>> Please let me know if you have any questions.
> >>>
> >>> Thanks,
> >>> Mark
> >>>
> >>>
> >>> ***
> >>>
> >>>  Overall build success rate: 78.0% (156 of 200)
> >>>
> >>> ***
> >>>
> >>> The following test methods see failures in more than one class.  There
> >>> may be a failing *TestBase class
> >>>
> >>> *.testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:
> >>> 12 failures :
> >>>
> >>>   SerialWANPersistenceEnabledGatewaySenderDUnitTest:  8 failures (96.000% success rate)
> >>>
> >>>   SerialWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  4 failures (98.000% success rate)
> >>>
> >>> *.testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:
> >>> 12 failures :
> >>>
> >>>   ParallelWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  5 failures (97.500% success rate)
> >>>
> >>>   ParallelWANPersistenceEnabledGatewaySenderDUnitTest:  7 failures (96.500% success rate)
> >>>
> >>> *.testPingWrongServer:  4 failures :
> >>>
> >>>   ClientServerMiscSelectorDUnitTest:  3 failures (98.500% success rate)
> >>>
> >>>   ClientServerMiscDUnitTest:  1 failures (99.500% success rate)
> >>>
> >>> ***
> >>>
> >>> org.apache.geode.internal.cache.wan.serial.SerialWANPersistenceEnabledGatewaySenderDUnitTest:
> >>> 8 failures (96.000% success rate)
> >>>
> >>> testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
> >>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3539
> >>>
> >>> testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
> >>> https://concourse.apachegeode-ci.info/teams/main/pipe

Re: [VOTE] change Default branch for geode-examples to 'develop'

2020-07-14 Thread Owen Nichols
Hi Anthony, there is a separate discuss thread [1] for this topic.  This is the 
vote thread for the ASF INFRA ticket [2] to make this one specific change that 
came out of the discussion.  Your input is valuable and I encourage you to both 
vote on this thread and continue the conversation on the discuss thread.

[1] https://lists.apache.org/thread.html/rfec15c0a7d5d6d57beed90868dbb53e3bfcaabca67589b28585556ee%40%3Cdev.geode.apache.org%3E
[2] https://issues.apache.org/jira/browse/INFRA-20510


On 7/14/20, 7:16 AM, "Anthony Baker"  wrote:

Consider the use case of an application developer who wants to run 
geode-examples against the latest geode release:

1) brew install apache-geode
2) git clone geode-examples
3) Get some runtime errors because geode-examples won’t connect to a 
previous geode release

At this point, you have to do some detective work to either download the 
geode-examples from the corresponding source release or switch over to the 
appropriate git tag.

I think there’s value in maintaining a default branch of geode-examples 
that tracks the latest release.


Anthony


> On Jul 9, 2020, at 9:39 PM, Owen Nichols wrote:
> 
> A fresh checkout of geode and all but one geode- repos checks
> out develop as the Default branch.
> 
> The lone exception is geode-examples.  Please vote +1 if you are in favor of
> changing its Default branch to develop for consistency with the other repos
> and other reasons as per recent discussion[1].
> 
> [1] https://lists.apache.org/x/thread.html/rfec15c0a7d5d6d57beed90868dbb53e3bfcaabca67589b28585556ee@%3Cdev.geode.apache.org%3E




Re: [VOTE] change Default branch for geode-examples to 'develop'

2020-07-14 Thread Ju@N
+1

On Fri, 10 Jul 2020 at 15:52, Alberto Bustamante Reyes wrote:

> +1
> 
> From: Joris Melchior
> Sent: Friday, 10 July 2020 15:54
> To: dev@geode.apache.org
> Subject: Re: [VOTE] change Default branch for geode-examples to 'develop'
>
> +1
>
> On 2020-07-10, 12:39 AM, "Owen Nichols"  wrote:
>
> A fresh checkout of geode and all but one geode- repos
> checks out develop as the Default branch.
>
> The lone exception is geode-examples.  Please vote +1 if you are in
> favor of changing its Default branch to develop for consistency with the
> other repos and other reasons as per recent discussion[1].
>
> [1] https://lists.apache.org/x/thread.html/rfec15c0a7d5d6d57beed90868dbb53e3bfcaabca67589b28585556ee@%3Cdev.geode.apache.org%3E
>
>

-- 
Ju@N


Re: [VOTE] change Default branch for geode-examples to 'develop'

2020-07-14 Thread Anthony Baker
Consider the use case of an application developer who wants to run 
geode-examples against the latest geode release:

1) brew install apache-geode
2) git clone geode-examples
3) Get some runtime errors because geode-examples won’t connect to a previous 
geode release

At this point, you have to do some detective work to either download the 
geode-examples from the corresponding source release or switch over to the 
appropriate git tag.
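
A minimal sketch of the tag switch described above, written in Python so the command can be inspected before it is run. The tag pattern `rel/v<version>`, the helper names, and the example version 1.12.0 are assumptions for illustration and are not stated in this thread; running `git tag --list` in a clone shows the actual tag names.

```python
from typing import List

# Public GitHub mirror of the examples repository.
EXAMPLES_REPO = "https://github.com/apache/geode-examples.git"


def release_tag(version: str) -> str:
    """Map an installed Geode version to a tag name.

    ASSUMPTION: release tags follow the pattern rel/v<version>.
    """
    return "rel/v" + version


def clone_command(version: str, dest: str = "geode-examples") -> List[str]:
    """Build a git command that clones the examples at the matching tag.

    `git clone --branch` accepts a tag as well as a branch name.
    """
    return ["git", "clone", "--branch", release_tag(version), EXAMPLES_REPO, dest]


if __name__ == "__main__":
    # Hypothetical installed version; substitute the one actually installed.
    print(" ".join(clone_command("1.12.0")))
```

The printed command can be pasted into a shell in place of step 2.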

I think there’s value in maintaining a default branch of geode-examples that 
tracks the latest release.


Anthony


> On Jul 9, 2020, at 9:39 PM, Owen Nichols  wrote:
> 
> A fresh checkout of geode and all but one geode- repos checks 
> out develop as the Default branch.
> 
> The lone exception is geode-examples.  Please vote +1 if you are in favor of 
> changing its Default branch to develop for consistency with the other repos 
> and other reasons as per recent discussion[1].
> 
> [1] 
> https://lists.apache.org/x/thread.html/rfec15c0a7d5d6d57beed90868dbb53e3bfcaabca67589b28585556ee@%3Cdev.geode.apache.org%3E



Odg: [INFO] Latest test run of 200 DistributedTestOpenJDK8 passes

2020-07-14 Thread Mario Ivanac
Hi,

After adding additional checks to the failing test, I can now see that the 
tests are failing because some batches are still distributed while the GW 
sender is being stopped.
Because of that, I suspect that this problem existed prior to this PR, and 
that this PR is simply the first to introduce a test that checks for it.

I will continue to investigate this fault, but I cannot reproduce it locally, 
which is slowing down troubleshooting.

BR,
Mario

From: Alexander Murmann
Sent: 14 July 2020 1:11
To: Alexander Murmann
Cc: dev@geode.apache.org; Mario Ivanac
Subject: Re: [INFO] Latest test run of 200 DistributedTestOpenJDK8 passes

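We continue to see these WAN tests adding a fail rate of just below 30% in
our mass test runs
<https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/create-mass-test-run-report/builds/10>.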

That's a very significant fail rate that impacts our ability to get our
code committed with confidence.
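
As a rough illustration of how the percentages in the quoted June report relate to its raw counts, the sketch below recomputes them from the failure counts and the 200-run total given there. The function and variable names are hypothetical, and this is not the script that generated the report.

```python
def success_rate(failures: int, runs: int) -> float:
    """Percentage of runs that did not fail."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= failures <= runs:
        raise ValueError("failures must be between 0 and runs")
    return 100.0 * (runs - failures) / runs


RUNS = 200  # size of the mass test run in the quoted report

# Per-class failure counts copied from the quoted report.
class_failures = {
    "SerialWANPersistenceEnabledGatewaySenderDUnitTest": 8,
    "SerialWANPersistenceEnabledGatewaySenderOffHeapDUnitTest": 4,
    "ParallelWANPersistenceEnabledGatewaySenderOffHeapDUnitTest": 5,
    "ParallelWANPersistenceEnabledGatewaySenderDUnitTest": 7,
    "ClientServerMiscSelectorDUnitTest": 3,
    "ClientServerMiscDUnitTest": 1,
}

for name, failures in class_failures.items():
    rate = success_rate(failures, RUNS)
    print(f"{name}: {failures} failures ({rate:.3f}% success rate)")

# The report lists 156 passing builds out of 200.
overall = success_rate(RUNS - 156, RUNS)
print(f"Overall build success rate: {overall:.1f}% (156 of {RUNS})")
```

Each class looks healthy on its own at 96% or better, yet the overall build rate in that report is only 78%, because a build fails when any one of its tests fails.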

Can we resolve this issue? Otherwise, I think we need to consider reverting
GEODE-7458.

On Fri, Jun 19, 2020 at 3:28 PM Alexander Murmann wrote:

> Looking more into this, it looks like this was introduced by the changes
> for GEODE-7458 - "Adding additional option in gfsh command "start gateway
> sender" to control clearing of existing queues".
>
> That happened about a month ago, but it's inherent to those flaky tests
> that we discover them only after a while. Nonetheless, they become paper
> cuts that ultimately slow us down substantially if they persist.
>
> @Mario Ivanac If I am correct and GEODE-7458 introduced this you were the
> one making that change. Might you be able to take a look at making that
> test more reliable or reverting the change?
>
> Thank you!
>
> On Fri, Jun 19, 2020 at 7:57 AM Alexander Murmann wrote:
>
>> Thank you so much for sharing this, Mark!
>>
>> It looks like there is a big cluster around WAN Gateway. Is anyone
>> already looking into the WAN issues?
>>
>> On Thu, Jun 18, 2020 at 10:06 PM Mark Hanson  wrote:
>>
>>> FYI, the build success rate was around 90% or so about two months ago.
>>>
>>> Here are the DUnit tests that are currently failing in our tests, most
>>> likely in CI, and PR pipelines.
>>>
>>> Please let me know if you have any questions.
>>>
>>> Thanks,
>>> Mark
>>>
>>>
>>>
>>> ***
>>>
>>>  Overall build success rate: 78.0% (156 of 200)
>>>
>>>
>>> ***
>>>
>>>
>>>
>>> The following test methods see failures in more than one class.  There
>>> may be a failing *TestBase class
>>>
>>>
>>>
>>> *.testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:
>>> 12 failures :
>>>
>>>   SerialWANPersistenceEnabledGatewaySenderDUnitTest:  8 failures
>>> (96.000% success rate)
>>>
>>>   SerialWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  4 failures
>>> (98.000% success rate)
>>>
>>>
>>>
>>> *.testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:
>>> 12 failures :
>>>
>>>   ParallelWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  5
>>> failures (97.500% success rate)
>>>
>>>   ParallelWANPersistenceEnabledGatewaySenderDUnitTest:  7 failures
>>> (96.500% success rate)
>>>
>>>
>>>
>>> *.testPingWrongServer:  4 failures :
>>>
>>>   ClientServerMiscSelectorDUnitTest:  3 failures (98.500% success rate)
>>>
>>>   ClientServerMiscDUnitTest:  1 failures (99.500% success rate)
>>>
>>>
>>>
>>>
>>> ***
>>>
>>>
>>>
>>>
>>>
>>> org.apache.geode.internal.cache.wan.serial.SerialWANPersistenceEnabledGatewaySenderDUnitTest:
>>> 8 failures (96.000% success rate)
>>>
>>>
>>>
>>>
>>>  
>>> testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
>>>
>>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3539
>>>
>>>
>>>  
>>> testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
>>>
>>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3526
>>>
>>>
>>>  
>>> testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
>>>
>>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3505
>>>
>>>
>>>  
>>> testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
>>>
>>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3435
>>>
>>>
>>>  
>>> testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
>>>
>>> https://concourse.apachegeode-ci.info/teams/main/pipelin