On 07.10.19 14:16, Thomas Huth wrote:
> On 04/10/2019 14.44, Max Reitz wrote:
>> On 04.10.19 12:19, Kevin Wolf wrote:
>>> Am 02.10.2019 um 19:47 hat Max Reitz geschrieben:
>>>> On 02.10.19 18:44, Kevin Wolf wrote:
>>>>> Am 02.10.2019 um 13:57 hat Max Reitz geschrieben:
>>>>>> It usually worked fine for me because it’s rather rare that non-block
>>>>>> patches break the iotests.
>>>>>
>>>>> I disagree. It happened all the time that someone else broke the iotests
>>>>> in master and we needed to fix it up.
>>>>
>>>> OK.
>>>>
>>>> (In my experience, it’s still mostly block patches, only that they tend
>>>> to go through non-block trees.)
>>>
>>> Fair enough, it's usually code that touches block code. I assumed "block
>>> patches" to mean patches that go through one of the block trees and for
>>> which iotests are run before sending a pull request.
>>>
>>> In the end, I don't care what code these patches touched. I do an
>>> innocent git pull, and when I finally see that it's master that breaks
>>> iotests and not my patches on top of it, I'm annoyed.
>>
>> Hm.  Part of my point was that this still happens all the time.
>>
>> Which is why I’d prefer to have more tests run in CI than a handful
>> of not-very-useful ones in make check.
> 
> Ok, so let's try to add some more useful tests to the "auto" group. Kevin
> mentioned 030, 040 and 041, and I think it should be ok to enable them
> (IIRC the only issue was that they run a little bit longer, but if they
> are very useful, we should include them anyway).

I agree on those.  (Maybe not 040, but definitely 030 and 041.)
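
For reference, enabling a test in “auto” just means adding that tag on
the test’s line in tests/qemu-iotests/group; roughly like this (the
other group tags here are from memory, so don’t quote me on them):

    # tests/qemu-iotests/group: test number, followed by its group tags
    030 rw auto backing
    041 rw auto backing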

Maybe one of the issues was the “path too long” thing for Unix sockets?
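
For context: on Linux, sun_path in struct sockaddr_un is only 108
bytes, so tests that put their sockets under a deeply nested scratch
directory can fail to bind.  A quick Python illustration, with a
deliberately overlong made-up path:

    import socket

    # sun_path in struct sockaddr_un is 108 bytes on Linux (including
    # the trailing NUL); Python rejects longer paths before bind(2).
    sock_path = '/tmp/' + 'x' * 120 + '/qmp.sock'  # hypothetical path
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.bind(sock_path)
    except OSError as e:
        print('bind failed:', e)  # "AF_UNIX path too long"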

> Do you have any other tests in mind which could be very useful?

I’d like a test for iothreads; it doesn’t look like there currently is
one in auto.  (The problem, of course, is that they have a higher chance
of being flaky, but I think they also have a higher probability of
breaking because of non-block stuff.)

127 and 256 look promising to me.
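
To sketch what I mean, the core of such a test is tiny.  This is from
memory against the iotests.py helpers, so treat it as a rough sketch
rather than a ready-made test:

    import iotests

    # Smoke-test sketch: attach a virtio-blk device to a dedicated
    # iothread and check that the iothread shows up in QMP.
    disk = iotests.file_path('disk.img')
    iotests.qemu_img_create('-f', iotests.imgfmt, disk, '1M')

    vm = iotests.VM()
    vm.add_object('iothread,id=iothread0')
    vm.add_drive(disk, interface='none')  # first drive gets id=drive0
    vm.add_device('virtio-blk,drive=drive0,iothread=iothread0')
    vm.launch()

    result = vm.qmp('query-iothreads')
    assert any(t['id'] == 'iothread0' for t in result['return'])
    vm.shutdown()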


Also, I don’t see any migration tests in auto (156 doesn’t really count).

The ones that look interesting here are 091, 181 (though I seem to
remember that 181 is flaky under load; I should investigate that), 183,
and 203 (which also tests iothreads).
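
Again from memory, the skeleton those migration tests share, assuming
the usual iotests.py helpers behave as I remember:

    import iotests

    # Skeleton of a live-migration iotest: two VMs on the same image,
    # migrating over a Unix socket.
    disk = iotests.file_path('disk.img')
    mig_sock = iotests.file_path('migrate.sock')
    iotests.qemu_img_create('-f', iotests.imgfmt, disk, '1M')

    vm_a = iotests.VM(path_suffix='a').add_drive(disk)
    vm_b = iotests.VM(path_suffix='b').add_drive(disk)
    vm_b.add_incoming('unix:' + mig_sock)

    vm_a.launch()
    vm_b.launch()
    vm_a.qmp('migrate', uri='unix:' + mig_sock)
    vm_a.event_wait('STOP')    # source pauses once migration completes
    vm_b.event_wait('RESUME')  # destination resumes the guest
    vm_a.shutdown()
    vm_b.shutdown()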


Those are the two areas that I spontaneously think of when considering
what is more likely to be broken by non-block patches.  Unfortunately,
they are also the two areas whose tests tend to be the flakiest, I
guess...

> [...]
>>> So if you merge that revert, when iotests break in master, I take it I
>>> can rely on you to fix it up before it impacts my working branch?
>>
>> This is not a game; I’m talking purely from my standpoint:
>> (I put that badly, but more on that below)
>>
>> Whenever make check fails, it’s urgent.  Without iotests running in make
>> check, we had some time to sort the situation out.  That’s no longer the
>> case.
>>
>> That means we need to take care of everything immediately.  And I purely
>> want help with that.
> 
> While that sounds noble at first glance, I think you maybe have too
> high expectations of yourself here? I mean, all the other maintainers
> of subsystems with tests have the same "problem" if the tests for
> their subsystem run in "make check".

The difference is that the iotests generally seem much less
deterministic to me than the other tests we have.

> The "normal" behavior that I've
> seen so far (and which I think is the right way to deal with it):
> 
> 1) If a pull request gets rejected due to a "make check" failure, simply
> drop the offending patches from the pull request, send a v2 pull req
> without them, and tell the author of the offending patches to fix the
> problem (unless the fix is completely trivial and could immediately be
> applied to the v2 pull req). You really don't have to fix each and every
> issue on your own as a maintainer and can redirect this to the patch
> authors again.
> 
> 2) If a test fails occasionally during "make check" (due to race
> conditions or whatever), it gets disabled from "make check" if it can't
> be fixed easily (e.g. in the qtests it gets moved to the "SPEED=slow"
> category, or in iotests it would get removed from the "auto" group).

Well, we’ll see how it goes.  I suppose in practice it won’t be too big
of a problem to just temporarily remove tests from auto if the issue
isn’t clear immediately but the test does seem important.  (I don’t
think there’s too much danger of forgetting them.)

>>> Yes, making iotests stable on CI probably involves some pain, especially
>>> initially. However, if we don't fix them upstream, they'll end up on our
>>> desk again downstream, because people run tests on machines other
>>> than your laptop or mine.
>>>
>>> I think we might actually save time by fixing them once and for all,
>>> even if they are problems in the test cases and not in QEMU, and making
>>> new test cases portable from the start, instead of getting individual
>>> reports one at a time and having to look at the same test cases again
>>> and again.
>>
>> You should really know I’m all for that and that I’ve done my share of
>> work there.
>>
>> But from my perspective we’ve said and tried that for six years and we
>> still aren’t there.  So to me “We should just fix it” of course sounds
>> correct, but I also don’t believe it’ll happen any time soon.  Mostly
>> because we generally don’t know what to fix before it breaks, but also
>> because that’s exactly what we’ve tried to do for at least six years.
> 
> Well, I guess we won't ever get there if we don't try.

That was my point: we have tried.  It isn’t like we’ve failed; it just
looks like a never-ending mission to me.

And it isn’t like I’m not working on improving the situation, even when
I’m saying that I don’t think it’ll ever be perfectly deterministic.

> And the hurdles
> will only get higher over the years as more and more tests are
> added ...
> 
>> OTOH, keeping the iotests in make check means we have to act on failure
>> reports immediately.  For example, someone™ needs to investigate the 130
>> failure Peter has reported.  I’ve just replied “I don’t know”, CC’d
>> Thomas, and waited to see whether anyone else would do anything.  Nobody did,
>> and that can’t work.  (I also wrote that I wouldn’t act on it, precisely
>> because I didn’t see the point.  I assumed that if anyone disagreed with
>> that statement, they would have said something.)
>>
>> So acting on such reports means pain, too.  I consider it greater than
>> the previous kind of pain, because I prefer “This sucks, I’ll have to
>> fix it this week or so” to “Oh crap, I’ll have to investigate now,
>> reproduce it, write an email as soon as possible, and fix it”.
> 
> I think that "I have to investigate now ..." mindset is not the right
> way to handle these kinds of issues. If a test is shaky and cannot be
> fixed easily, we should simply drop it from the "auto" group again.
> So if you like, I can send a patch to remove 130 from the "auto" group
> (personally, I'd rather wait and see whether it fails again anytime
> soon, or whether this was maybe just a one-time issue due to an unclean
> test system ...)

Waiting for another failure sounds OK to me.  (OTOH, 130 doesn’t seem
like something that needs to be in auto, in case we want to take
something out to save time for more important tests.)

But that reminds me that iotest failures should probably be reported
automatically to Kevin and me instead of Peter having to write an email
himself.  Would that be possible?

Max
