Re: Using Firefox to test the Visual C++ compiler

2016-12-01 Thread Gregory Szorc
Many of our tests have inconsistent results.

Generally speaking, developers only run a narrow subset of tests locally
and defer running the full suite of tests to our automation infrastructure.
Within that infrastructure, we have a database of known "intermittent"
failures and tools that attempt to auto-classify known intermittent
failures. People find intermittent failures that fall through the cracks.
In some cases, we outright disable tests that aren't reliable enough. In
other cases we set thresholds as to what their expected failure rate should
be. Unfortunately, we don't have a good answer for reproducing this
classification infrastructure outside of Mozilla :/

Furthermore, there are some tests that are so wonky that reproducing
behavior outside of our automation environment is difficult or impossible.
It is not uncommon to have to "check out" a machine from our automation
environment in order to debug a failure.

If your goal is to qualify pre-release compiler changes against Firefox,
I'd start by focusing on tests that run reliably. Generally speaking,
"headless" tests (tests not rendering a Firefox window) are more reliable.
This includes gtest, xpcshell, and many JS engine tests. Reftests,
crashtests, and web platform tests (WPT) are also pretty isolated and are
generally more consistent. Mochitests tend to have the most inconsistency
from my experience.

This general problem of inconsistent test execution is a complex topic and
has consumed thousands of person-hours at Mozilla. If you're intent on
running large parts of Firefox automation, it's probably worth a meeting or
video conference with some of us to go over it in more detail.

On Fri, Nov 25, 2016 at 3:36 AM, Gratian Lup  wrote:

> [snip - full message quoted below]

Re: Using Firefox to test the Visual C++ compiler

2016-11-28 Thread ryanvm
On Wednesday, November 16, 2016 at 3:27:45 AM UTC-5, Gratian Lup wrote:
> Hello,
> 
> *I posted this on multiple mailing lists; I'm not sure which is the right 
> place for these questions.
> 
> I'm a developer on the Microsoft Visual C++ compiler (code optimizer) and I'm 
> looking into extending our test suite with popular open-source projects. This 
> helps us to find bugs earlier, ensuring that these projects are not broken 
> by frontend or backend changes. It also helps the projects themselves, by 
> making upgrades to the newer compiler easier - ideally without any problems. 
> 
> I have several questions about building Firefox on Windows and especially 
> about running the correctness tests:
> 
> 1. What is the latest version of Visual Studio that should be used? I tried 
> first VS2015 Update 3 and there seems to be a C++ error about "constexpr". 
> Next I tried Update 2 and that one finishes the build.
> 
> 2. What are the tests that should be run to test correctness? The idea here 
> is to ensure that new optimizations don't introduce new bugs, which should be 
> exposed by new test failures in Firefox. I looked over the QA Automated 
> testing page and there seem to be a lot of different test suites - running 
> all is probably not feasible. Is there a subset that is run when a checkin is 
> done, for example?
> 
> 3. What Windows OS do you use for testing? Windows 10, Windows 8, etc? Any 
> other configuration work needed, such as disabling the antivirus/firewall?
> 
> After the build completed, I tried to run a few of the tests and there seem 
> to be many failures even with a debug build. I assume that the checked in 
> tests are supposed to pass, so it's likely a Windows config problem or 
> failure to access extra files from the Web. 
> 
> For example, ./mach reftests had about 700 failures. ./mach gtest, 
> jsapi-tests, jsbrowsertest had none. The tests were run under Windows 10 
> Anniversary update with the AV disabled.
> 
> Thanks,
> Gratian Lup
> Microsoft Visual C++ team

FYI, setting --enable-js-shell will get you js.exe as part of your Firefox 
build.
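
For anyone following along, that option goes in the mozconfig file at the top
of the source tree - a minimal sketch (the comment is illustrative):

```shell
# mozconfig - build the standalone SpiderMonkey shell (js.exe) alongside Firefox
ac_add_options --enable-js-shell
```

A regular ./mach build should then produce js.exe as part of the build, as
described above.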
___
dev-builds mailing list
dev-builds@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-builds


Re: Using Firefox to test the Visual C++ compiler

2016-11-28 Thread Gratian Lup
Hi,

Thanks a lot for your help! I think doing the testing on our machines is a
better approach overall, even from an engineering perspective - in case of
a failure you can just grab the machine and start debugging, recompiling,
etc. The legal work, if possible at all, would be quite complicated too.

This new approach of testing benefits both us and the tested projects. What
we are doing:
- If the failure is in the compiler frontend, it is either a compiler bug or a
problem in the code itself. There were a few cases where the code had
problems and we informed and helped those projects to solve them - this would
also be the case with Firefox.
- If the failure is in the compiler backend/optimizer, we fix the bug. This is
pretty much always a real bug, unless the source code triggers some case
of undefined behavior - we haven't found such a case yet.

I tried running the tests on Windows 7 this time, with the AV and firewall
disabled, just to be sure they don't interfere. When using the mach
commands, there are still quite a lot of failures, so I assume that
something is still not configured right. I was curious if it's different on
Linux, but tests are also failing there (I used Xubuntu 16.10 with KDE).

Most failures are in the mochitest, reftest and wpt suites. If it's not
possible to make all the tests pass, a good enough approach is to establish
a baseline of tests that are known to fail using a debug/non-optimized
build, and consider as a real failure only a failed test not in that set. We
already do this for a few other large projects and it seems like a good
compromise.

Is there a way to skip over the tests that are known to fail? With Gtest
this is easy using the --gtest_filter flag. If it's not possible, then a
script is needed to parse the results and ignore those known failures.
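
Until something better exists, that log-parsing script can be small. A minimal
sketch, assuming the suite logs mark failures with the usual
"TEST-UNEXPECTED-FAIL | path | message" lines (log formats vary a bit between
suites, so the regex may need adjusting):

```python
import re

# Failure lines in Mozilla test logs generally look like:
#   TEST-UNEXPECTED-FAIL | dom/base/test/test_foo.html | Test timed out.
FAIL_RE = re.compile(r"TEST-UNEXPECTED-FAIL \| ([^|]+?) \|")

def failing_tests(log_text):
    """Return the set of test paths that failed in a log."""
    return {m.group(1).strip() for m in FAIL_RE.finditer(log_text)}

def new_failures(baseline_log, candidate_log):
    """Failures in the candidate run that are not in the known-failure baseline."""
    return failing_tests(candidate_log) - failing_tests(baseline_log)
```

Running a debug/non-optimized build once to produce the baseline log, then
diffing each optimized run against it, gives exactly the "real failure = failed
test not in the baseline set" policy described above.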

Thanks,
Gratian

Here are some results from the tests running on Windows 7. You can find
some log files here: https://1drv.ms/f/s!AmGHUgGfTN19hjuFKlOeZ7VtpUsd

*o Tests without failures:*
mach gtest
mach marionette-test
mach firefox-ui-functional
mach jsapi-tests
mach jsbrowsertest

*o Spidermonkey tests*
mach check-spidermonkey doesn't seem to work because js.exe is not built by
default. To test it I built the js folder and followed the Spidermonkey
test instructions to run the JS and JIT tests - both pass without failures.

*o Tests with failures*
*- mach mochitest*

Part 1

1 INFO Passed:  675002

2 INFO Failed:  23

3 INFO Todo:1586

4 INFO Mode:e10s

Part 2

1 INFO Passed:  676509

2 INFO Failed:  16

3 INFO Todo:1592

4 INFO Mode:e10s


Example of failed
test: dom/base/test/test_noAudioNotificationOnMutedElement.html | Test
timed out.


*- mach reftest*

REFTEST INFO | Successful: 13436 (13416 pass, 20 load only)

REFTEST INFO | Unexpected: 114 (113 unexpected fail, 0 unexpected pass, 0
unexpected asserts, 1 failed load, 0 exception)

REFTEST INFO | Known problems: 700 (285 known fail, 0 known asserts, 69
random, 346 skipped, 0 slow)

Example of failed test:
c:/firefox/mozilla-central/layout/reftests/bugs/321402-4.xul
== file:///c:/firefox/mozilla-central/layout/reftests/bugs/321402-4-ref.xul
| image comparison, max difference: 32, number of differing pixels: 1


The images look identical to me, but 1 pixel is supposedly different - I'm
wondering why the test fails if the max. acceptable diff is 32. Some other
cases I looked into also had "identical" images.
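
One note on reading that log line: the numbers are observed values, not
allowed thresholds. A plain "==" reftest demands pixel-for-pixel equality, and
"max difference: 32, number of differing pixels: 1" just reports how far the
renderings deviated. A toy model of the decision (the fuzzy tolerance is
normally declared per-test in the reftest.list manifest via fuzzy()/fuzzy-if()
annotations; the function below is an illustration, not the actual harness
code):

```python
# Toy model of how a reftest "==" comparison decides pass/fail.
# pixel_diffs holds per-pixel channel differences between the two renderings;
# the log's "max difference" and "number of differing pixels" come from it.

def reftest_passes(pixel_diffs, fuzzy=None):
    """By default any differing pixel fails; a fuzzy annotation permits up to
    (max_difference, num_differing_pixels)."""
    max_diff = max(pixel_diffs, default=0)
    num_diff = sum(1 for d in pixel_diffs if d > 0)
    if fuzzy is None:
        return num_diff == 0          # exact match required
    allowed_diff, allowed_pixels = fuzzy
    return max_diff <= allowed_diff and num_diff <= allowed_pixels

# The case from the log: one pixel differing by 32 fails a plain "==" test,
# but would pass under a fuzzy(32,1)-style tolerance.
```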


*- mach web-platform-test*

*- mach crash-test*



On Mon, Nov 21, 2016 at 9:54 AM, Gregory Szorc  wrote:

> [snip - full messages quoted below]

Re: Using Firefox to test the Visual C++ compiler

2016-11-21 Thread Jonathan Wilson

> issues to work out. (I'm guessing Microsoft LCA will have an opinion on
> mostly-in-the-public-domain Mozilla infrastructure accessing pre-release
> Microsoft software.)

Microsoft does release "nightly" compiler builds these days
https://blogs.msdn.microsoft.com/vcblog/2016/04/26/stay-up-to-date-with-the-visual-c-tools-on-nuget/
so maybe it wouldn't be so much of a problem.



Re: Using Firefox to test the Visual C++ compiler

2016-11-16 Thread Gratian Lup
On Wednesday, November 16, 2016 at 5:23:58 AM UTC-8, Ted Mielczarek wrote:
> Gratian,
> 
> One of my coworkers reminded me of something that might be an option for
> you--we have scripts that would allow you to provide a Firefox build
> that you generated (at a publicly accessible URL) and trigger test jobs
> on that build in our CI infrastructure. If that's something that sounds
> useful to you we can definitely make that happen.
> 
> You'd have to produce a Firefox build, run the `mach package` and `mach
> package-tests` targets, upload a few zip files from the $objdir/dist
> directory to somewhere accessible via public HTTP and then run a Python
> script to schedule test jobs against those files.
> 
> -Ted

Hi Ted,

Thanks a lot for your help!
Using Windows 7 or 8 to do the tests should be fine - I actually don't see any 
reason it shouldn't be OK to also do other builds and tests on them when not 
used for Firefox. 

The idea of testing on your infrastructure is tempting, but it would probably 
consume too many resources, since this new testing system is intended to be 
used both overnight against the latest good compiler build and by every 
developer on their own while working on new features - this would be quite a lot 
of people. Doing a test build now would still be a good idea, though, at least 
to see if everything passes in the right environment.

I have a few more questions about running the tests:

1. How exactly should the build artifacts be copied to the test machine? 
Something like ./mach package? After copying them over, will running the tests 
with ./mach pick up the binaries, or is some ./mach "unpack" step needed? I 
assume the entire mozilla-central enlistment is also required.

2. Can I see on Treeherder the exact command line that was used to launch the 
test suite? I looked over the log files and didn't find anything like that.

I'm going to try again with Update 3 - I might have used the most recent 
internal build instead, which can indeed surface some new errors, since the 
frontend team makes a lot of changes.

Thanks,
Gratian


Re: Using Firefox to test the Visual C++ compiler

2016-11-16 Thread Ted Mielczarek
Gratian,

One of my coworkers reminded me of something that might be an option for
you--we have scripts that would allow you to provide a Firefox build
that you generated (at a publicly accessible URL) and trigger test jobs
on that build in our CI infrastructure. If that's something that sounds
useful to you we can definitely make that happen.

You'd have to produce a Firefox build, run the `mach package` and `mach
package-tests` targets, upload a few zip files from the $objdir/dist
directory to somewhere accessible via public HTTP and then run a Python
script to schedule test jobs against those files.

-Ted


Re: Using Firefox to test the Visual C++ compiler

2016-11-16 Thread Ted Mielczarek
On Wed, Nov 16, 2016, at 03:27 AM, lgrat...@gmail.com wrote:
> Hello,
> 
> *I posted this on multiple mailing lists; I'm not sure which is the right
> place for these questions.

Hi Gratian,

Finding the right communication channel always feels like a problem at
Mozilla. :) dev-builds is a fine venue for this discussion.

> I'm a developer on the Microsoft Visual C++ compiler (code optimizer) and
> I'm looking into extending our test suite with popular open-source
> projects. This helps us to find bugs earlier, ensuring that these
> projects are not broken by frontend or backend changes. It also helps the
> projects themselves, by making upgrades to the newer compiler easier -
> ideally without any problems. 

This sounds great! We always struggle with updating to a new toolchain,
whether with compiler bugs or issues in our codebase surfaced by
correctness fixes in the compiler.  For example, here's the dependency
tree of bugs we fixed that are marked as blocking our "Support Visual
C++ 2015" bug:
https://bugzilla.mozilla.org/showdependencytree.cgi?id=1119082&hide_resolved=0

> I have several questions about building Firefox on Windows and especially
> about running the correctness tests:
> 
> 1. What is the latest version of Visual Studio that should be used? I
> tried first VS2015 Update 3 and there seems to be a C++ error about
> "constexpr". Next I tried Update 2 and that one finishes the build.

We are using VS2015u3 for our builds in automation currently, so it
should work:
https://dxr.mozilla.org/mozilla-central/rev/79feeed4293336089590320a9f30a813fade8e3c/browser/config/tooltool-manifests/win32/releng.manifest#32

Are you building from the mozilla-central repository (our development
repository) or something else?

> 2. What are the tests that should be run to test correctness? The idea
> here is to ensure that new optimizations don't introduce new bugs, which
> should be exposed by new test failures in Firefox. I looked over the QA
> Automated testing page and there seem to be a lot of different test
> suites - running all is probably not feasible. Is there a subset that is
> run when a checkin is done, for example?

We run a lot of tests on every Firefox checkin. Here are all the Windows
build and test jobs from an arbitrary recent mozilla-central revision:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=f8ba9c9b401f57b0047ddd6932cb830190865b38&filter-searchStr=Windows

The secret decoder key for all those letters is here (I'm not sure if
it's fully kept up-to-date):
https://treeherder.mozilla.org/userguide.html

We do have an elaborate system that tries not to run every single test
on every single checkin, but toolchain changes have a way of showing up
as test failures in very strange places, especially with things like our
PGO builds. (Ask me sometime about debugging an issue that showed up
when I turned on PGO in the Firefox 3 release cycle...) That being said,
if you were going to run any tests I would probably recommend some of
the Mochitest test suites, since they launch a browser and run a lot of
HTML+JS, which exercises a good amount of our code, and the JIT tests,
which exercise the JavaScript engine pretty heavily. The JS team tends
to write more complicated C++ than other groups within Mozilla, so they
also hit a lot of compiler issues.

> 3. What Windows OS do you use for testing? Windows 10, Windows 8, etc? Any
> other configuration work needed, such as disabling the
> antivirus/firewall?

I will have to confirm specifics, but we currently run tests on Windows
7 (x86) and Windows 8 (x64), with just a small handful of tests running
on Windows 10 (x64). I'm fairly certain we do a lot of configuration on
our test machines, but I don't have the details handy. We generally
attempt to restrict network access from our test suites so that network
issues do not cause test failures, but we do run multiple servers on
localhost during testing, so at the very least that needs to be enabled.

Note also that we do our builds and tests on separate machines in
automation. Running the tests from a build on the same machine should
work fine, that's what developers do locally, it just doesn't scale well
in automation, and we also want to do our builds on Windows Server and run our
tests on a desktop OS.

> After the build completed, I tried to run a few of the tests and there
> seem to be many failures even with a debug build. I assume that the
> checked in tests are supposed to pass, so it's likely a Windows config
> problem or failure to access extra files from the Web. 
>
> For example, ./mach reftests had about 700 failures. ./mach gtest,
> jsapi-tests, jsbrowsertest had none. The tests were run under Windows 10
> Anniversary update with the AV disabled.

Many tests are extremely sensitive to the OS environment, unfortunately.
I don't think we have reftests running on Windows 10, so there are likely
to be issues there due to test expectations. You might have better luck
with them on a Windows 7 machine, which we do run