Re: rr chaos mode update
On Mon, Feb 15, 2016 at 6:26 PM, Kyle Huey wrote:
> FWIW, every failure that I've debugged to completion so far has been a bug
> in the test (although I have two fatal assertion bugs I'm working through
> that will obviously be flaws in Gecko). I think one of the things we
> really want to get a feeling for is how often we find actual bugs in the
> product.

Yes. So far I've found three Gecko bugs, but we'll find many bugs in tests.

Rob
--
lbir ye,ea yer.tnietoehr rdn rdsme,anea lurpr edna e hnysnenh hhe uresyf toD
selthor stor edna siewaoeodm or v sstvr esBa kbvted,t rdsme,aoreseoouoto
o l euetiuruewFa kbn e hnystoivateweh uresyf tulsa rehr rdm or rnea lurpr
.a war hsrer holsa rodvted,t nenh hneireseoouot.tniesiewaoeivatewt sstvr esn
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
Re: rr chaos mode update
I've got rr working under DigitalOcean and it works great there. We've built a harness for generating replays. Once a replay is generated I match the replay with the bug and comment in the bug looking for developers to investigate. When they respond they can investigate by ssh'ing. Example:

https://bugzilla.mozilla.org/show_bug.cgi?id=1223249#c12

If we can, we should prefer to have an ssh endpoint running rather than ship a large VM image. It's also my understanding that while rr works inside a VM, the trace will not replay if the VM has changed host. However, right now we've decided that it's really overkill for the time being. Producing interesting rr replays is really trivial at the moment. Finding enough engineers to analyze them is not.

On Mon, Feb 15, 2016 at 1:21 AM, Mike Hommey wrote:
> On Sun, Feb 14, 2016 at 09:25:58PM -0800, Bobby Holley wrote:
> > This is so. Damn. Exciting. Thank you roc for having the vision and
> > persistence to bring this dream to reality.
> >
> > How far are we from being able to use cloud (rather than local) machine
> > time to produce a trace of an intermittently-failing bug? Some one-click
> > procedure to produce a trace from a failure on treeherder seems like it
> > would lower the activation energy significantly.
>
> One limiting factor is the CPU features required, which are not
> virtualized on AWS (they are on DigitalOcean, and that's about the only
> cloud provider where they are, ttbomk).
>
> Relatedly, roc, is it possible to replay, on a different host, with
> possibly a different CPU, a recording that was taken on the cloud?
> Does using a VM make it possible? If yes, having "the cloud" (or
> a set of developers) try to reproduce intermittents, and then have
> developers download the recordings and corresponding VM would be very
> useful. If not, we'd need a system like we have for build/test slave
> loaners.
> Mike
Re: rr chaos mode update
On Sun, Feb 14, 2016 at 09:25:58PM -0800, Bobby Holley wrote:
> This is so. Damn. Exciting. Thank you roc for having the vision and
> persistence to bring this dream to reality.
>
> How far are we from being able to use cloud (rather than local) machine
> time to produce a trace of an intermittently-failing bug? Some one-click
> procedure to produce a trace from a failure on treeherder seems like it
> would lower the activation energy significantly.

One limiting factor is the CPU features required, which are not virtualized on AWS (they are on DigitalOcean, and that's about the only cloud provider where they are, ttbomk).

Relatedly, roc, is it possible to replay, on a different host, with possibly a different CPU, a recording that was taken on the cloud? Does using a VM make it possible? If yes, having "the cloud" (or a set of developers) try to reproduce intermittents, and then having developers download the recordings and the corresponding VM, would be very useful. If not, we'd need a system like we have for build/test slave loaners.

Mike
Re: rr chaos mode update
On Sun, Feb 14, 2016 at 9:37 PM, L. David Baron wrote:
> On Sunday 2016-02-14 21:26 -0800, Kyle Huey wrote:
> > On Sun, Feb 14, 2016 at 9:16 PM, Robert O'Callahan wrote:
> > > Over the last few days we have had a lot of positive experiences
> > > reproducing bugs with rr chaos mode. Kyle tells me that, in fact, he's been
> > > able to reproduce every single bug he tried with enough machine time thrown
> > > at it.
> >
> > Of five or so, but yes.
>
> How many of those were intermittents that were never actually
> reported on Linux on our test infrastructure (i.e., reported only on
> other platforms), but that you were able to reproduce in rr's chaos
> mode on Linux?

At least one, bug 1150737, had only appeared in any great quantity on 10.6, and may never have appeared on non-Mac tests in automation. Chaos mode reproduced it in a minute or two on Linux.

- Kyle
Re: rr chaos mode update
On Sunday 2016-02-14 21:26 -0800, Kyle Huey wrote:
> On Sun, Feb 14, 2016 at 9:16 PM, Robert O'Callahan wrote:
> > Over the last few days we have had a lot of positive experiences
> > reproducing bugs with rr chaos mode. Kyle tells me that, in fact, he's been
> > able to reproduce every single bug he tried with enough machine time thrown
> > at it.
>
> Of five or so, but yes.

How many of those were intermittents that were never actually reported on Linux on our test infrastructure (i.e., reported only on other platforms), but that you were able to reproduce in rr's chaos mode on Linux?

-David

--
L. David Baron                 http://dbaron.org/
Mozilla                        https://www.mozilla.org/
  Before I built a wall I'd ask to know
  What I was walling in or walling out,
  And to whom I was like to give offense.
    - Robert Frost, Mending Wall (1914)
Re: rr chaos mode update
This is so. Damn. Exciting. Thank you roc for having the vision and persistence to bring this dream to reality.

How far are we from being able to use cloud (rather than local) machine time to produce a trace of an intermittently-failing bug? Some one-click procedure to produce a trace from a failure on treeherder seems like it would lower the activation energy significantly.

On Sun, Feb 14, 2016 at 9:16 PM, Robert O'Callahan wrote:
> Over the last few days we have had a lot of positive experiences
> reproducing bugs with rr chaos mode. Kyle tells me that, in fact, he's been
> able to reproduce every single bug he tried with enough machine time thrown
> at it.
>
> At this point the limiting factor is getting developers to actually debug
> and fix recorded test failures. Anyone should be able to set up a VM on
> their local machine, build Firefox, record some failures and fix them. For
> best results, run just one test that's known intermittent, or possibly the
> whole directory of tests if there might be inter-test dependencies. Use
> --shuffle and --run-until-failure. The most convenient way to run rr with
> chaos mode is probably to create a script rr-chaos that prepends the
> --chaos option, and use --debugger rr-chaos.
>
> Lots of tests have been disabled for intermittency over the years. Now we
> have the ability to fix (at least some of) them without much pain, it may
> be worth revisiting them, though I don't know how to prioritize that.
>
> We might want to revisit our workflow. If we had the ability to mark tests
> as disabled-for-intermittency explicitly, maybe we could automatically
> disable intermittent tests as they show up and dedicate a pool of machines
> to reproducing them with rr.
> Rob
Re: rr chaos mode update
On Sun, Feb 14, 2016 at 9:16 PM, Robert O'Callahan wrote:
> Over the last few days we have had a lot of positive experiences
> reproducing bugs with rr chaos mode. Kyle tells me that, in fact, he's been
> able to reproduce every single bug he tried with enough machine time thrown
> at it.

Of five or so, but yes.

> At this point the limiting factor is getting developers to actually debug
> and fix recorded test failures. Anyone should be able to set up a VM on
> their local machine, build Firefox, record some failures and fix them. For
> best results, run just one test that's known intermittent, or possibly the
> whole directory of tests if there might be inter-test dependencies. Use
> --shuffle and --run-until-failure. The most convenient way to run rr with
> chaos mode is probably to create a script rr-chaos that prepends the
> --chaos option, and use --debugger rr-chaos.

You can generally pass --debugger=/path/to/rr --debugger-args="record -h" to mach to get things working.

> Lots of tests have been disabled for intermittency over the years. Now we
> have the ability to fix (at least some of) them without much pain, it may
> be worth revisiting them, though I don't know how to prioritize that.

FWIW, every failure that I've debugged to completion so far has been a bug in the test (although I have two fatal assertion bugs I'm working through that will obviously be flaws in Gecko). I think one of the things we really want to get a feeling for is how often we find actual bugs in the product.

- Kyle
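Kyle's suggestion can be sketched as a concrete mach invocation. This is an assumed example, not an exact transcript: the test path is a placeholder, and "-h" is taken to be the short form of rr's --chaos flag at the time.

```shell
# Hypothetical invocation following Kyle's note: point mach's --debugger at
# the rr binary and pass the record subcommand (with chaos mode) through
# --debugger-args. The rr path and test name below are placeholders.
MACH_CMD='./mach mochitest --debugger=/path/to/rr --debugger-args="record -h" dom/tests/some_intermittent_test.html'
echo "$MACH_CMD"
```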
rr chaos mode update
Over the last few days we have had a lot of positive experiences reproducing bugs with rr chaos mode. Kyle tells me that, in fact, he's been able to reproduce every single bug he tried with enough machine time thrown at it.

At this point the limiting factor is getting developers to actually debug and fix recorded test failures. Anyone should be able to set up a VM on their local machine, build Firefox, record some failures and fix them. For best results, run just one test that's known intermittent, or possibly the whole directory of tests if there might be inter-test dependencies. Use --shuffle and --run-until-failure. The most convenient way to run rr with chaos mode is probably to create a script rr-chaos that prepends the --chaos option, and use --debugger rr-chaos.

Lots of tests have been disabled for intermittency over the years. Now we have the ability to fix (at least some of) them without much pain, it may be worth revisiting them, though I don't know how to prioritize that.

We might want to revisit our workflow. If we had the ability to mark tests as disabled-for-intermittency explicitly, maybe we could automatically disable intermittent tests as they show up and dedicate a pool of machines to reproducing them with rr.

Rob
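The rr-chaos wrapper described above amounts to a two-line script. A minimal sketch, assuming rr is on $PATH and that the test harness accepts a --debugger executable:

```shell
# Minimal sketch of the rr-chaos wrapper: a script that prepends
# "record --chaos" so it can be handed to mach as --debugger.
cat > rr-chaos <<'EOF'
#!/bin/sh
# Record with chaotic scheduling, forwarding all arguments to the program.
exec rr record --chaos "$@"
EOF
chmod +x rr-chaos

# Assumed usage (the test path is a placeholder):
#   ./mach mochitest --debugger ./rr-chaos --shuffle --run-until-failure path/to/test
```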
Re: Presto: Comparing Firefox performance with other browsers (and e10s with non-e10s)
On Mon, Feb 15, 2016 at 1:12 PM, Martin Thomson wrote:
> On Mon, Feb 15, 2016 at 12:30 PM, Valentin Gosu wrote:
> >> Thumbnails, or columns on the right for each selected browser with
> >> median (or mean), with the best (for that site) in green, the worst in
> >> red would allow eyeballing the results and finding interesting
> >> differences without clicking on 100 links... (please!) Or to avoid
> >> overloading the page, one page with graphs like today, another with the
> >> columns I indicated (where clicking on the row takes you to the graph
> >> page for that side).
> >
> > What I noticed is that pages with lots of elements, and elements that come
> > from different sources, seem to have higher variability. So pages such as
> > flickr, with lots of images of various sizes, or pages that load various
> > ads.
>
> You currently graph every test result, sorted. This can be reduced to
> a single measurement. Here I think that you can take the 5th, 50th
> and 95th percentiles (the mean isn't particularly interesting, and you
> want to avoid extreme outliers). The x-axis can then be used for
> something else. The obvious choice is to turn this into a bar
> graph with browsers on the x-axis. You could probably remove the
> browser selector then.

Oh, to be a little less obtuse, I think that means that you get a column graph with error bars on each column. Your x-axis is by browser (and version) with two columns for each (first view and refresh).
Re: Presto: Comparing Firefox performance with other browsers (and e10s with non-e10s)
On Mon, Feb 15, 2016 at 12:30 PM, Valentin Gosu wrote:
>> Thumbnails, or columns on the right for each selected browser with
>> median (or mean), with the best (for that site) in green, the worst in
>> red would allow eyeballing the results and finding interesting
>> differences without clicking on 100 links... (please!) Or to avoid
>> overloading the page, one page with graphs like today, another with the
>> columns I indicated (where clicking on the row takes you to the graph
>> page for that side).
>
> What I noticed is that pages with lots of elements, and elements that come
> from different sources, seem to have higher variability. So pages such as
> flickr, with lots of images of various sizes, or pages that load various
> ads.

You currently graph every test result, sorted. This can be reduced to a single measurement. Here I think that you can take the 5th, 50th and 95th percentiles (the mean isn't particularly interesting, and you want to avoid extreme outliers). The x-axis can then be used for something else. The obvious choice is to turn this into a bar graph with browsers on the x-axis. You could probably remove the browser selector then.
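Martin's percentile suggestion is simple to compute. A sketch in plain JS; the function names and the nearest-rank method are my choices here, not part of the Presto harness:

```javascript
// Sketch: collapse an array of page-load samples into the 5th/50th/95th
// percentiles suggested above, one summary per browser column.
function percentile(sorted, p) {
  // Nearest-rank index into an ascending-sorted array, clamped to bounds.
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.min(sorted.length - 1, Math.max(0, idx))];
}

function summarize(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  return {
    p5: percentile(sorted, 5),    // lower error bar
    p50: percentile(sorted, 50),  // column height (median)
    p95: percentile(sorted, 95),  // upper error bar
  };
}
```

Each browser/version then gets one column (height p50) with error bars at p5 and p95, which is exactly the column-graph-with-error-bars reading Martin describes in his follow-up.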
Re: Presto: Comparing Firefox performance with other browsers (and e10s with non-e10s)
On 12 February 2016 at 17:11, Randell Jesup wrote:
> > - You can click on each individual point to go to the WPT run and view
> > the results in greater detail
>
> What does "run index" mean in the graphs? The values appear to be
> sorted from best to worst; so it's comparing best to best, next-best to
> next-best, etc? Ah, and I see "sorted" undoes the sorting.

In the sorted version, it is the index of that run in the sorted array and mostly meaningless. The run index in the unsorted version is the index of the run in chronological order, but the firstRun and repeatRun will share the same index.

> I'd think displaying mean/median and std-deviation (or a bell-curve-ish)
> might be easier to understand. But I'm no statistician. :-) It also
> likely is easier to read when the numbers of samples don't match (or you
> need to stretch them all to the same "width"); using a bell-curve plot of
> median/mean/std-dev avoids that problem.

I also think it would be quite useful, but my knowledge of statistics is pretty basic. I tried to get the number of samples to match, but for a few domains that didn't work. I assume there's a bug in my scripts.

> Thumbnails, or columns on the right for each selected browser with
> median (or mean), with the best (for that site) in green, the worst in
> red would allow eyeballing the results and finding interesting
> differences without clicking on 100 links... (please!) Or to avoid
> overloading the page, one page with graphs like today, another with the
> columns I indicated (where clicking on the row takes you to the graph
> page for that side).

What I noticed is that pages with lots of elements, and elements that come from different sources, seem to have higher variability. So pages such as flickr, with lots of images of various sizes, or pages that load various ads.

> > --
> > Error sources:
> >
> > Websites may return different content depending on the UA string. While
> > this optimization makes sense for a lot of websites, in this situation it
> > is difficult to determine if the browser's performance or the website's
> > optimizations have more impact on the page load.
>
> Might be interesting to force our UA for a series of tests to match
> theirs, or vice-versa, just to check which sites appear to care (and
> then mark them).

I've tried it for a few domains and it didn't make much of a difference. I'll try it for all the domains to see if there is a pattern we could make out.

> Great! It'll be interesting to track how these change over time as well
> (or as versions get added to the list). Again, medians/means/etc may
> help with evaluating and tracking this (or automating notices, ala Talos).

I thought about doing this, but Talos always uses static content on a local connection, whereas this goes over a real network whose performance may vary, and loads real websites which may change content or optimize for different situations. I expect it's useful for confirming certain properties, such as whether page loads are faster on Fx, Chrome or Nightly, and by how much, but probably can't get results that make sense over a longer period of time.
Re: Returning dictionaries in WebIDL
Martin Thomson:
> I know that this is not good practice, but is there something written
> down somewhere explaining why?

I don’t know if it’s written down somewhere. Using dictionary types for IDL attributes is forbidden by the spec, because it would mean that a new copy of the object would need to be returned each time the property were accessed. This is the case for sequence types too, where you can much more obviously encourage wasteful object creation:

  interface A {
    attribute sequence<double> values;
  };

  for (var i = 0; i < myA.values.length; i++) {
    …
  }

That would create a new JS Array each time around the loop. With dictionaries you might access a couple of properties from the return value of the getter property and have similar issues:

  dictionary D {
    double x;
    double y;
  };

  interface A {
    attribute D d;
  };

  Math.sqrt(myA.d.x * myA.d.x + myA.d.y * myA.d.y);

This would create four copies of the JS object for the dictionary.

Another point is that these sequence and dictionary objects can’t be monitored for changes, so for example you couldn’t write a spec that required the browser to do something when you assign to myA.d.x since that’s just a plain data property on the object returned from d. So for APIs where you do want to notice property value changes like this, you’ll need to use interfaces, and for array-ish things we’ve now got FrozenArray, which is an array reference type (as opposed to sequence’s (and dictionaries’) pass-by-value behaviour).

We don’t currently have a type that means “reference to an object that has a particular shape defined by a dictionary”. So for now if you really want an API that allows

  myA.d = { x: 1, y: 2 };

where the A object either immediately, or later, inspects the values on the object, then you have to use “object” as the type and invoke the type conversions or do the JS property getting in the spec yourself.
--
Cameron McCormack ≝ http://mcc.id.au/
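Cameron's copy-per-access point can be illustrated in plain JS. This only simulates the WebIDL binding behaviour with an ordinary getter (the names are made up; real bindings are implemented in the engine):

```javascript
// Plain-JS simulation (not real WebIDL bindings) of an attribute whose type
// is a dictionary: the getter must hand back a fresh copy on every access.
let copies = 0;

const myA = {
  _d: { x: 3, y: 4 },
  get d() {
    copies += 1;             // every access manufactures a new object
    return { ...this._d };   // pass-by-value: a fresh copy each time
  },
};

// The distance computation from the post reads .d four times,
// so it creates four throwaway dictionary objects:
const dist = Math.sqrt(myA.d.x * myA.d.x + myA.d.y * myA.d.y);

// And writes are invisible to the "browser": this mutates a fifth copy,
// which is thrown away immediately.
myA.d.x = 42;
```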
Re: Returning dictionaries in WebIDL
Yep, thanks for that. I somehow missed the bit where it says, very clearly, that attributes MUST NOT be dictionaries. I was looking for something to give someone else who doesn't like explanations and prefers reading from the scripture.

Your explanation matches my understanding.

BTW, there are some people who consider the fact that a dictionary becomes outside the control of the browser to be a feature. That is partly why WebRTC is how it is.

On Mon, Feb 15, 2016 at 10:11 AM, Cameron McCormack wrote:
> Martin Thomson:
> > I know that this is not good practice, but is there something written
> > down somewhere explaining why?
>
> I don’t know if it’s written down somewhere. Using dictionary types for
> IDL attributes is forbidden by the spec, because it would mean that a
> new copy of the object would need to be returned each time the property
> were accessed. This is the case for sequence types too, where you
> can much more obviously encourage wasteful object creation:
>
>   interface A {
>     attribute sequence<double> values;
>   };
>
>   for (var i = 0; i < myA.values.length; i++) {
>     …
>   }
>
> That would create a new JS Array each time around the loop. With
> dictionaries you might access a couple of properties from the return
> value of the getter property and have similar issues:
>
>   dictionary D {
>     double x;
>     double y;
>   };
>
>   interface A {
>     attribute D d;
>   };
>
>   Math.sqrt(myA.d.x * myA.d.x + myA.d.y * myA.d.y);
>
> This would create four copies of the JS object for the dictionary.
>
> Another point is that these sequence and dictionary objects can’t be
> monitored for changes, so for example you couldn’t write a spec that
> required the browser to do something when you assign to myA.d.x since
> that’s just a plain data property on the object returned from d. So for
> APIs where you do want to notice property value changes like this,
> you’ll need to use interfaces, and for array-ish things we’ve now got
> FrozenArray, which is an array reference type (as opposed to
> sequence’s (and dictionaries’) pass-by-value behaviour).
>
> We don’t currently have a type that means “reference to an object that
> has a particular shape defined by a dictionary”. So for now if you
> really want an API that allows
>
>   myA.d = { x: 1, y: 2 };
>
> where the A object either immediately, or later, inspects the values on
> the object, then you have to use “object” as the type and invoke the
> type conversions or do the JS property getting in the spec yourself.
>
> --
> Cameron McCormack ≝ http://mcc.id.au/
Re: Returning dictionaries in WebIDL
On 2/14/16 5:50 PM, Martin Thomson wrote:
> I know that this is not good practice

It really depends. There are some situations in which returning a dictionary is probably fine; ideally when what's meant is "a copy of the initialization data for this object". This is the situation with WebGLRenderingContext.getContextAttributes().

One thing worth keeping in mind in general, though, is the argument Allen makes in https://lists.w3.org/Archives/Public/public-script-coord/2014JanMar/0201.html

-Boris
Returning dictionaries in WebIDL
I know that this is not good practice, but is there something written down somewhere explaining why?