That is a completely valid point. I started investigating flaky tests for
exactly the same reason, if you remember the thread that I started a while ago.
Unfortunately I abandoned it later, because I ran into a few issues:
- We nailed down that in order to release 3.5 stable, we have to make sure it’s
not worse than 3.4 by comparing the builds. But these builds are not
comparable, because the 3.4 tests run single-threaded while the 3.5 tests run
multithreaded, which exposes problems that might also exist on 3.4,
- Neither of them runs the C++ tests for some reason, but that’s not really an
issue here,
- It looks like the tests on 3.5 are just as solid as on 3.4, because running
them in a dedicated, single-threaded environment shows almost all tests
succeeding,
- I think the root cause of failing unit tests could be one (or more) of the
following:
a) Environmental: the Jenkins slave gets overloaded with other builds, and
multithreaded test runs make things even worse: JDK threads starve and ZK
instances (both clients and servers) are unable to operate.
b) Conceptual: ZK unit tests were not designed to run on multiple
threads. I investigated the unique port assignment feature, which looks
good, but there could be other gaps that make the tests unreliable when
running concurrently.
c) Bad testing: testing ZK in the wrong way, making bad assumptions
(e.g. not syncing clients), etc.
d) Bug in the server.
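
To illustrate the unique port assignment idea from b): a minimal sketch,
assuming each test worker gets its own disjoint slice of a global port range.
The class name PortRangeSketch and all of its parameters are hypothetical; this
is not ZooKeeper's actual port assignment code, and it does not check whether a
port is really free on the host.

```java
/**
 * Hypothetical sketch: gives each test worker a disjoint slice of a global
 * port range, so tests running concurrently do not hand out the same port.
 * Not taken from the ZooKeeper codebase.
 */
public final class PortRangeSketch {
    private final int min;
    private final int max;
    private int next;

    public PortRangeSketch(int workerIndex, int workerCount,
                           int globalMin, int globalMax) {
        if (workerCount < 1 || workerIndex < 0 || workerIndex >= workerCount) {
            throw new IllegalArgumentException("invalid worker index or count");
        }
        // Equal-sized slices; any remainder at the top of the range is unused.
        int size = (globalMax - globalMin + 1) / workerCount;
        if (size < 1) {
            throw new IllegalArgumentException("port range too small");
        }
        this.min = globalMin + workerIndex * size;
        this.max = this.min + size - 1;
        this.next = this.min;
    }

    /** Returns the next port in this worker's slice, wrapping at the end. */
    public synchronized int nextPort() {
        int port = next;
        next = (next == max) ? min : next + 1;
        return port;
    }

    public int min() { return min; }

    public int max() { return max; }
}
```

With 4 workers sharing ports 11000-11999, worker 0 would draw from 11000-11249
and worker 1 from 11250-11499. Disjoint slices only rule out collisions between
workers; they do nothing about CPU starvation as in a).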
I feel that finding case d) with these tests is super hard, because a test
report doesn’t give any information on what went wrong inside ZooKeeper.
Guessing is more or less your only option.
Finding c) is a little bit easier; I’m trying to submit patches for those and
hopefully making some progress.
The huge pain in the arse, though, is a) and b): people desperately keep
commenting “please retest this” on GitHub to get a green build, while testing
drifts in a direction that hides real problems. I mean people have stopped
caring about a failing build, because “it must be some flaky unrelated to my
patch”. Which is bad, but the shame is that it’s true in 90% of cases.
I’m just trying to find some ways - besides fixing the c) and d) flakies - to
get more reliable and more informative Jenkins builds. I don’t want to make a
drastic change, but I think if we can get a significantly more reliable build
for the price of a slightly longer build time by running on 4 threads instead
of 8, I say let’s do it.
As always, any help from the community is more than welcome and appreciated.
Thanks,
Andor
> On 2018. Oct 12., at 16:52, Patrick Hunt <[email protected]> wrote:
>
> iirc the number of threads was increased to improve performance. Reducing
> is fine, but do we understand why it's failing? Perhaps it's finding real
> issues as a result of the artificial concurrency/load.
>
> Patrick
>
> On Fri, Oct 12, 2018 at 7:12 AM Andor Molnar <[email protected]>
> wrote:
>
>> Thanks for the feedback.
>> I'm running a few tests now: branch-3.5 on 2 threads and trunk on 4 threads
>> to see what's the impact on the build time.
>>
>> Github PR job is hard to configure, because its settings are hard coded
>> into a shell script in the codebase. I have to open PR for that.
>>
>> Andor
>>
>>
>>
>> On Fri, Oct 12, 2018 at 2:46 PM, Norbert Kalmar <
>> [email protected]> wrote:
>>
>>> +1, running the tests locally with 1 thread always passes (well, I run it
>>> about 5 times, but still)
>>> On the other hand, running it on 8 threads yields similarly flaky results
>>> as Apache runs. (Although it is much faster, but if we have to run 6-8-10
>>> times sometimes to get a green run...)
>>>
>>> Norbert
>>>
>>> On Fri, Oct 12, 2018 at 2:05 PM Enrico Olivelli <[email protected]>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> Enrico
>>>>
>>>> Il ven 12 ott 2018, 13:52 Andor Molnar <[email protected]> ha scritto:
>>>>
>>>>> Hi,
>>>>>
>>>>> What do you think of changing number of threads running unit tests in
>>>>> Jenkins from current 8 to 4 or even 2?
>>>>>
>>>>> Running unit tests inside Cloudera environment on a single thread shows
>>>>> the builds much more stable. That would be probably too slow, but maybe
>>>>> running at least less threads would improve the situation.
>>>>>
>>>>> It's getting very annoying that I cannot get a green build on GitHub with
>>>>> only a few retests.
>>>>>
>>>>> Regards,
>>>>> Andor
>>>>>
>>>> --
>>>>
>>>>
>>>> -- Enrico Olivelli
>>>>
>>>
>>