Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-26 Thread Andor Molnar
Hi all, I’ve updated a bunch of old tickets under this umbrella to reflect the most up-to-date situation: https://issues.apache.org/jira/browse/ZOOKEEPER-3170 Flakies which are not flakies anymore have been closed and will create new

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-22 Thread Andor Molnár
Thanks Bogdan, so far so good. testNodeDataChanged is an old beast, I've a possible fix for that from @afine: https://github.com/apache/zookeeper/pull/300 Would be great if we could review it and get rid of this flaky. Andor On 10/20/18 06:41, Bogdan Kanivets wrote: > I think the argument

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-19 Thread Bogdan Kanivets
I think the argument for keeping concurrency is that it may expose some unknown problems in the code. Maybe a middle ground: move the largest offenders into a separate JUnit tag and run them after the rest of the tests with threads=1. Hopefully this will make life better for PRs. On the note of

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-18 Thread Michael Han
It's a good idea to reduce the concurrency to eliminate flakiness. Looks like single-threaded unit tests on trunk are pretty stable https://builds.apache.org/job/zookeeper-trunk-single-thread/ (some failures are due to C tests). The build time is longer, but not too bad (for pre-commit build,

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-18 Thread Andor Molnar
Lots of ConnectionLoss and "Address already in use" failures on branch34_java9. Looks like they are specific to Jenkins slave H22. Andor On Mon, Oct 15, 2018 at 2:50 PM, Andor Molnar wrote: > +1 > > > > On Mon, Oct 15, 2018 at 1:55 PM, Enrico Olivelli > wrote: > >> On Mon, Oct 15, 2018 at

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-15 Thread Andor Molnar
+1 On Mon, Oct 15, 2018 at 1:55 PM, Enrico Olivelli wrote: > On Mon, Oct 15, 2018 at 12:46, Andor Molnar > wrote: > > > > Thank you guys. This is great help. > > > > I remember your efforts Bogdan, as far as I remember you observed thread > starvation in multiple runs on

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-15 Thread Enrico Olivelli
On Mon, Oct 15, 2018 at 12:46, Andor Molnar wrote: > > Thank you guys. This is great help. > > I remember your efforts Bogdan, as far as I remember you observed thread > starvation in multiple runs on Apache Jenkins. Correct me if I’m wrong. > > I’ve created an umbrella Jira to

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-15 Thread Andor Molnar
Thank you guys. This is great help. I remember your efforts Bogdan, as far as I remember you observed thread starvation in multiple runs on Apache Jenkins. Correct me if I’m wrong. I’ve created an umbrella Jira to capture all flaky test fixing efforts here:

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-15 Thread Bogdan Kanivets
Fangmin, Those are good ideas. FYI, I've started running tests continuously on an AWS m1.xlarge. https://github.com/lavacat/zookeeper-tests-lab So far, I've done ~12 runs of trunk. Same common offenders as in Flaky dash: testManyChildWatchersAutoReset, testPurgeWhenLogRollingInProgress I'll do

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-14 Thread Fangmin Lv
Internally, we also did some work to reduce the flakiness. Here are the main things we've done: * using a retry rule to retry in case the zk client lost its connection; this could happen if the quorum tests are running in an unstable environment and a leader election happened. * using random port
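
A minimal sketch of the retry idea described above. The message does not show the actual rule, so this is a framework-free illustration rather than the JUnit rule used internally: `RetrySketch`, `retryOnConnectionLoss`, and `ConnectionLossException` are hypothetical names, and the exception class is only a stand-in for whatever connection-loss error the zk client raises.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    /** Hypothetical stand-in for a client connection-loss error. */
    public static class ConnectionLossException extends Exception {
        public ConnectionLossException(String msg) {
            super(msg);
        }
    }

    /**
     * Runs the body up to maxAttempts times. Only a connection loss triggers
     * a retry; any other exception propagates immediately so that real test
     * failures are not hidden.
     */
    public static <T> T retryOnConnectionLoss(int maxAttempts, Callable<T> body)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConnectionLossException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return body.call();
            } catch (ConnectionLossException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt lost the connection
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated test body: loses the connection twice, then succeeds.
        String result = retryOnConnectionLoss(3, () -> {
            if (++calls[0] < 3) {
                throw new ConnectionLossException("simulated connection loss");
            }
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Retrying only on the connection-loss type is a deliberate choice in this sketch: a blanket retry would also paper over the real concurrency bugs that the multi-threaded runs might be exposing.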

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-13 Thread Bogdan Kanivets
I looked into flakiness a couple of months ago (with special attention to testManyChildWatchersAutoReset). In my opinion the problem is a) and c). Unfortunately I don't have data to back this claim. I don't remember seeing many 'port binding' exceptions. Unless 'port assignment' issue manifested as some

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-13 Thread Enrico Olivelli
On Fri, Oct 12, 2018, 23:17 Benjamin Reed wrote: > i think the unique port assignment (d) is more problematic than it > appears. there is a race between finding a free port and actually > grabbing it. i think that contributes to the flakiness. > This is very hard to solve for our test cases,

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-12 Thread Benjamin Reed
i think the unique port assignment (d) is more problematic than it appears. there is a race between finding a free port and actually grabbing it. i think that contributes to the flakiness. ben On Fri, Oct 12, 2018 at 8:50 AM Andor Molnar wrote: > > That is a completely valid point. I started to
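
The race described above can be sketched in plain Java. This is an illustration only, not ZooKeeper's actual port-assignment code; `PortRaceSketch`, `findFreePortRacy`, and `bindEphemeral` are hypothetical names. The first method probes for a free port and releases it before returning, so another process or test thread can take the port before the caller binds it. The second keeps the bound socket open, so there is no gap between finding the port and grabbing it.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class PortRaceSketch {
    /**
     * Racy pattern: ask the OS for a free port, close the probe socket,
     * and return only the number. Between the close and the caller's own
     * bind, anything else on the host may take the same port.
     */
    public static int findFreePortRacy() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        } // port is released here
    }

    /**
     * Race-free alternative: bind to port 0 and hand back the socket that
     * is already bound, so the port stays reserved for the caller.
     */
    public static ServerSocket bindEphemeral() throws IOException {
        return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
    }

    public static void main(String[] args) throws IOException {
        int probed = findFreePortRacy();
        System.out.println("probed port (may already be taken): " + probed);
        try (ServerSocket held = bindEphemeral()) {
            System.out.println("held port: " + held.getLocalPort());
        }
    }
}
```

The second pattern only helps when the code under test can accept an already-bound socket. When a server insists on receiving a port number and binding it itself, the gap remains, which may be why this is hard to fix in the test suite.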

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-12 Thread Andor Molnar
That is a completely valid point. I started to investigate flakies for exactly the same reason, if you remember the thread that I started a while ago. It was later abandoned unfortunately, because I’ve run into a few issues: - We nailed down that in order to release 3.5 stable, we have to make

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-12 Thread Patrick Hunt
iirc the number of threads was increased to improve performance. Reducing is fine, but do we understand why it's failing? Perhaps it's finding real issues as a result of the artificial concurrency/load. Patrick On Fri, Oct 12, 2018 at 7:12 AM Andor Molnar wrote: > Thanks for the feedback. >

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-12 Thread Andor Molnar
Thanks for the feedback. I'm running a few tests now: branch-3.5 on 2 threads and trunk on 4 threads to see what the impact on the build time is. The Github PR job is hard to configure, because its settings are hard-coded into a shell script in the codebase. I have to open a PR for that. Andor On

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-12 Thread Norbert Kalmar
+1, running the tests locally with 1 thread always passes (well, I ran it about 5 times, but still). On the other hand, running it on 8 threads yields similarly flaky results as the Apache runs. (It is much faster, but if we sometimes have to run 6-8-10 times to get a green run...) Norbert

Re: Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-12 Thread Enrico Olivelli
+1 Enrico On Fri, Oct 12, 2018, 13:52 Andor Molnar wrote: > Hi, > > What do you think of changing number of threads running unit tests in > Jenkins from current 8 to 4 or even 2? > > Running unit tests inside Cloudera environment on a single thread shows the > builds much more stable. That

Decrease number of threads in Jenkins builds to reduce flakyness

2018-10-12 Thread Andor Molnar
Hi, What do you think of changing the number of threads running unit tests in Jenkins from the current 8 to 4 or even 2? Running unit tests inside the Cloudera environment on a single thread shows the builds are much more stable. That would probably be too slow, but maybe running at least fewer threads would