On 11/21/2012 11:52 AM, Eric Rescorla wrote:
On Wed, Nov 21, 2012 at 12:28 AM, <[email protected]> wrote:

 From what I'm getting from Randell and Ekr's feedback, it sounds like
getting the smoke tests for the full WebRTC stack is the priority, not
the crashtests. I think that makes sense as the primary focus area,
although given the high rate of crash bugs I do think we need to keep
expanding the crashtest automation to protect ourselves from crash
regressions. From the QA side, what I'd like to see is a combination of
both ideas: effective sunny-day automation plus protection against crash
regressions, so we get the benefits of both.

My suggestion would be something along these lines:

1. Any work Henrik already started for certain crashtests (since I know
he's got a few in queue) - finish those off

I think it depends on how long this takes. We are already suffering from
this now, so if it means that we won't see a start of functional testing
till next week (I'm assuming that Henrik isn't taking TG off), then I
would prefer to just suspend the crashtests.

I don't have a problem with this if it's relatively short (a couple of days), as you ask.

2. Henrik shifts focus towards smoke test automation for the full stack,
I'll finish off the gUM smoke test automation (which is 90% done).
3. We establish a review policy: any patch that can reasonably have a
crashtest associated with it must include that crashtest at check-in,
written by the developer building the patch. If a patch comes in without
a crashtest when one is clearly possible, it gets a review- until the test
is included. That balances validating that the patch actually works at
check-in with progressively building out a crashtest regression suite.

I don't really understand what this means. I agree that if a crashtest exists, 
it should pass before
we submit the patch. Do you mean something else?

I do think he means something else.  I think he means blocking checkin for any 
crash-bug patch without an associated crashtest (if it's possible to build one).

At this point in the project, we have *critical* needs for smoketests/mochitests of the 
full stack, plus some obvious stressors (and the more generic the stressors, generally 
the better).  I think some crash-bug-derived mochitests make sense right now, especially 
if they cover a fair bit of the stack or an important edge case.  True 
anti-specific-regression crashtests don't show much value to me at this point.  Many of 
them would likely either break quietly or become irrelevant due to API changes or 
refactorings.  Note: many crash bugs are currently coming from fairly "normal" 
use of the APIs, and so smoketests and sunny-day/common-failure mochitests should do a 
good job with them, and testcases for them can be turned into or added to those tests.

Where it's easy to write a test, I wouldn't stop anyone (crashtest or not), 
though I'd suggest mochitests always make more sense if there's any way to know 
we finished the test (connected a call, captured media, etc).  And instead of 
narrowing a crashtest down to the tightest case that makes it fail, I'd 
widen it to the broadest case we can easily run that still triggers the bug, at 
least currently.
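
For instance (just a sketch of the idea, not an actual in-tree test; it assumes 
the current prefixed mozGetUserMedia API and the SimpleTest mochitest harness, 
and the fake:true constraint is an assumption so it could run on bare test 
slaves), a capture test can tell the harness explicitly that we got all the way 
to a MediaStream:

    SimpleTest.waitForExplicitFinish();

    // The success callback only fires if capture actually started, which is
    // exactly the "we know we finished" property a crashtest doesn't give us.
    navigator.mozGetUserMedia({ audio: true, video: true, fake: true },
      function (stream) {
        ok(stream, "got a MediaStream, so capture actually completed");
        SimpleTest.finish();
      },
      function (err) {
        ok(false, "getUserMedia failed: " + err);
        SimpleTest.finish();
      });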

Please re-read my message.  That lays out what I think as module owner should 
be the priorities for tests, especially at this point.  I'm open to persuasion, 
but please indicate what the practical improvement will be relative to what I 
suggested.  Otherwise, that's the way we should proceed.

p.s. a few concrete examples of how I want us to approach test creation:

A very useful mochitest/stressor would be deleting a PeerConnection (or 
navigating away or reloading) immediately after creation, then repeating with 
progressively longer delays (run as a modification to a baseline test that 
starts a loopback call).  This would cover a huge range of possible failures 
and race conditions (and sunny-day paths).
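
Roughly like this (a sketch only, assuming the prefixed mozRTCPeerConnection 
constructor with the pref flipped in the harness; the real test would hang off 
the baseline loopback test rather than a bare PeerConnection):

    SimpleTest.waitForExplicitFinish();

    // Delay (ms) between creating the PeerConnection and tearing it down;
    // each pass waits a bit longer so we sweep across the startup race windows.
    var delays = [0, 10, 50, 250, 1000];
    var i = 0;

    function runOne() {
      if (i >= delays.length) {
        ok(true, "survived early teardown at every delay");
        SimpleTest.finish();
        return;
      }
      var pc = new mozRTCPeerConnection();
      setTimeout(function () {
        pc.close();   // drop it mid-startup
        pc = null;
        i++;
        runOne();
      }, delays[i]);
    }

    runOne();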

An example of a stressor that is of less practical use (though still important 
for security reasons perhaps) would be something that quickly creates a very 
large number of (fake) mediastreams, or peerconnections, datachannels, etc.
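
Something along these lines (again just a sketch; the count is made up, and it 
assumes the same prefixed API as above):

    // Rapidly allocate a pile of PeerConnections, then drop them all at once;
    // the point is simply to survive allocation and teardown under pressure.
    var pcs = [];
    for (var n = 0; n < 200; n++) {
      pcs.push(new mozRTCPeerConnection());
    }
    pcs.forEach(function (pc) { pc.close(); });
    pcs = null;
    ok(true, "created and released 200 PeerConnections without crashing");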

An example of a crashtest that's of less use right now would be bug 799191 (we 
used the number of video devices as the loop limit for looking at audio 
devices).  Yes, we can test this, but this specific bug is unlikely to recur, 
and we would end up testing for that from here to eternity on every push - and 
we have more important tests to get in.  And it's very likely that other 
standard smoketests/mochitests/sunny-day tests would cover this path well 
anyways.

An example of a reasonably useful "crashtest" (which should be coded as a 
mochitest) would be bug 798873: support for arbitrary-length SDP strings, though I doubt 
it would regress.  This might best be tested as part of a suite of SDP 
creation/manipulation tests (and that probably can only be fully tested from C++).

An even better example (because it covers a bunch of sunny-day cases) would be 
bug 802376 (selecting a video device other than the first one crashes), though 
it may be tough to test; or NSS startup; or bug 780790 (which really should be 
part of an API-conformance mochitest).  To be honest, I had trouble coming up 
with a *great* example here, though as we clean out more of the corners I'm sure 
we'll find some.  Probably the best current examples would come from fuzz bugs.

Does that work? I do understand the concerns Ekr and Randell raise, but I
also don't want to take too many risks on crash regressions given the high
crash bug rate, even with the feature currently preffed off.

--
Randell Jesup, Mozilla Corp

