The B2G emulator design is causing all sorts of problems. We just fixed the #2 orange, which was caused by the Audio channel StartPlaying() taking up to 20 seconds to run (and we "fixed" it by effectively removing some timeouts). However, we just wasted half a week trying to land AEC & MediaStreamGraph improvements. They still haven't landed due to yet another B2G emulator orange, and in any case the workaround we used for the M10 problem doesn't fix the fundamental problems with the B2G emulator.

Details:

We ran into huge problems getting the AEC/MediaStreamGraph changes (bug 818822 and things dependent on it) into the tree due to problems with B2G emulator debug M10 (permaorange timeouts). The change adds a fairly small amount of processing to input audio data (resampling to 44100Hz). A test that runs perfectly in emulator opt builds, and runs fine locally in M10 debug (10-12 seconds reported for the test in the logs, with or without the change), goes from taking 30-40 seconds on tbpl to 350-450(!) seconds (and then times out). Fix that one, and others fail even worse.

I contacted Gregor Wagner asking for help, and also jgriffin in #b2g. We found one problem (the emulator going to 'sleep' during mochitests, bug 992436); I have a patch up to enable the wakelock globally for mochitests. However, that just pushed the error a little deeper.

The fundamental problem is that the B2G emulator can't deal safely with any sort of realtime or semi-realtime data unless it is run on a fast machine. The architecture of the emulator setup means the effective CPU power depends on the machine running the test, and that varies a lot (and tbpl machines are WAY slower than my 2.5 year old desktop). Combine that with Debug being much slower, and it's a recipe for disaster for any sort of time-dependent test.

I worked around it for now by turning down the timers that push fake realtime data into the system. This will cause audio underruns in MediaStreamGraph, and it doesn't solve the problem of MediaStreamGraph potentially overloading itself for other reasons, or of breaking assumptions about being able to keep up with data streams. (MSG wants to run every 10ms or so.)

This problem also likely plays hell with the Web Audio tests, and it will play hell with WebRTC echo cancellation and the media reception code, which will start trying to insert loss-concealment data, and it will break timer-based packet loss recovery, bandwidth estimators, etc.

As to what to do?

That's a good question, as turning off the emulator tests isn't a realistic option.

One option (very, very painful, and even slower) would be a proper device simulator which simulates both the CPU and the system hardware (of *some* B2G phone). This would produce the most realistic result with an emulator.

Another option (likely not simple) would be to find a way to "slow down time" for the emulator, such as intercepting system calls and scaling up any time constants (multiplying timer values, timeout values passed to socket calls, etc.). For devices (audio, etc.), frequencies may need modifying or other adjustments may need to be made.

We could require a minimum of X BogoMIPS for the emulator to run at all, or to run a specific test suite.

We could segment out the tests that require higher performance and run them on faster VMs/etc.

We could turn off certain tests on tbpl and run them on separate dedicated test machines (a bit similar to PGO). There are downsides to this, of course.

Lastly, we could put in a bank of HW running B2G to run the tests, like the Android test boards/phones.

So, what do we do? Because if we do nothing, it will only get worse.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email