Re: proposal: replace talos with inline tests

2013-03-04 Thread Dave Mandelin
On Monday, March 4, 2013 5:15:56 AM UTC-8, Jim Mathies wrote:
 For metrofx we’ve been working on getting omtc and apzc running in the 
 browser. One of the things we need to be able to do is run performance tests 
 that tell us whether or not the work we’re doing is having a positive effect 
 on perf. We currently don’t have automated tests up and running for metrofx 
 and talos is even farther off.
 
 So to work around this I’ve been putting together some basic perf tests I can 
 use to measure performance using the mochitest framework. I’m wondering if 
 this might be a useful answer to our perf tests problems long term. 

I think this is an incredibly interesting proposal, and I'd love to see 
something like it go forward. Detailed reactions below.

 Putting together talos tests is a real pain. You have to write a new test 
 using the talos framework (which is a separate repo from mc), test the test 
 to be sure it’s working, file rel eng bugs on getting it integrated into 
 talos test runs, populated in graph server, and tested via staging to be sure 
 everything is working right. Overall the overhead here seems way too high.

Yup. And that's a big problem. Not only does this make your life harder, it 
makes people do less performance testing than they otherwise might. The JS 
team has found that making it incredibly easy to create correctness tests 
(with *zero* overhead in the common case) really helped get more tests 
written and used. So I think it would be great to make it a lot easier to 
write perf tests.

 Maybe we should consider changing this system so devs can write performance 
 tests that suit their needs and are integrated into our main repo? Basically:
 
 1) rework graphs server to be open ended so that it can accept data from test 
 runs within our normal test frameworks.

IIUC, something like this is a key requirement: letting any perf test feed into 
the reporting system. People have pointed out that the Talos tests run on 
selected machines, so the perf tests should probably run on them as well, 
rather than on the correctness test machines. But that's only a small change to 
the basic idea, right?

 2) develop a test module that can be included in tests and allows test 
 writers to post performance data to graph server.

Does that mean a mochitest module? This part seems optional, although certainly 
useful. Some tests will require non-mochitest frameworks.

I believe jmaher did some work to get in-browser standard JS benchmarks running 
automatically and reporting to graph-server. I'm curious how that would fit in 
with this idea--would the test module help at all, or could there be some other 
kind of more general module maybe, so that even things like standard benchmarks 
can be self-serve?
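
To make the reporting side concrete, here's roughly the kind of helper I have 
in mind, sketched in Python (Talos itself is Python). The endpoint URL, payload 
fields, and the report_perf_result name are all made up for illustration; this 
is not an existing graph-server API.

# Hypothetical sketch: a tiny "post one perf number" helper a harness could
# expose to test writers. The endpoint and payload schema are invented here;
# a real graph server would define its own.
import json
import urllib.request

GRAPH_SERVER = "https://graphs.example.mozilla.org/api/results"  # hypothetical

def report_perf_result(test_name, value, unit="ms", platform="win8-metro",
                       changeset=None, owner=None):
    """Send a single measurement to the reporting server."""
    payload = {
        "test": test_name,
        "value": value,
        "unit": unit,
        "platform": platform,
        "changeset": changeset,
        "owner": owner,
    }
    req = urllib.request.Request(
        GRAPH_SERVER,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status == 200

# A mochitest-style perf test would then just time its work and call, e.g.:
# report_perf_result("apzc-frame-time", 16.4, changeset="5f4a6a474455",
#                    owner="jmathies")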

 3) come up with a good way to manage the life cycle of active perf tests so 
 graph server doesn’t become polluted.

:-) How about optionally listing an owner for new tests, and then removing tests 
if no one is looking at them (according to web server logs) and there is either 
no owner of record or the owner doesn't confirm the tests are still needed?
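
Sketched out with made-up data shapes (the point is just that "nobody is looking 
at it and nobody owns it" is decidable from data we already have):

# Hypothetical sketch of the pruning rule: retire a test series if nobody has
# viewed its graphs recently AND no owner of record has confirmed it's needed.
# The data structures are invented for illustration.
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=180)

def tests_to_retire(last_viewed, owners, confirmed_needed, now=None):
    """
    last_viewed:      dict of test name -> datetime of last graph-server view
    owners:           dict of test name -> owner email, or None if unowned
    confirmed_needed: set of test names whose owners say they're still needed
    """
    now = now or datetime.utcnow()
    retire = []
    for test, viewed in last_viewed.items():
        unwatched = (now - viewed) > STALE_AFTER
        unvouched = owners.get(test) is None or test not in confirmed_needed
        if unwatched and unvouched:
            retire.append(test)
    return retire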

 4) port existing talos tests over to the mochitest framework.
 
 5) drop talos.

This seems like it's in the line of "fix Talos". I'm not sure if this 
particular 4+5 is the right way to go, but the idea has some merit.

To the extent that people don't pay attention to Talos, it seems we really 
don't need to do anything with it. If people are paying attention to and taking 
care of performance in their area, then we're covered. To take the example I 
happen to know best, the JS team uses AWFY to track JS performance on standard 
benchmarks and additional tests they've decided are useful. So Talos is not 
needed to track JS performance. Having all the features of the new graph server 
does sound pretty cool, though.

It appears that there are a few areas that are only covered by Talos for now, 
though. I think in that category we have warm startup time via Ts, and basic 
layout performance via Tp. I'm not sure about memory, because we do seem to 
detect increases via Talos, but we also have AWSY, and I don't know whether 
AWSY obviates Talos memory measurements or not.

For that kind of thing, I'm thinking maybe we should go with the same "teams 
take care of their own perf tests" idea. Performance is a natural owner for Ts. 
I'm not entirely sure about Tp, but it's probably layout or DOM. Then those 
teams could decide if they wanted to switch from Talos to a different 
framework. If everything's working properly, then whenever the difficulty of 
reproducing Talos tests locally caused enough problems to warrant it, the 
owning teams would notice and switch.

Dave


Re: proposal: replace talos with inline tests

2013-03-04 Thread Dave Mandelin
On Monday, March 4, 2013 5:42:39 AM UTC-8, Ed Morley wrote:
 (CCing auto-to...@mozilla.com)
 
 jmaher and jhammel will be able to comment more on the talos specifics, 
 but a few thoughts off the top of my head:
 
 It seems like we're conflating multiple issues here:
   1) [talos] is a separate repo from mc

And also

1a) Talos itself is a big pain for developers to use and debug regressions 
in, not to mention add tests to, which they basically don't.

It seems that some of this may have changed recently, especially around using 
the new framework--I haven't used it in a while. I think Talos still falls short 
when it comes to creating new tests, though, because lots of things just don't 
fit its assumptions.

   2) [it's a hassle to] test the test to be sure it’s working
   3) [it's a hassle to get results] populated in graph server
   4) [we need to] come up with a good way to manage the life cycle of 
 active perf tests so graph server doesn’t become polluted

 Switching from the talos harness to mochitest doesn't fix #2 (we still 
 have to test, and I don't see how it magically becomes any easier 
 without extra work - that could have been applied to talos instead) or 
 #3/#4 (orthogonal problem). It also seems like a brute force way of 
 fixing #1 (we could just check talos into mozilla-central).

I think that part was mostly supposed to address (1a).

 Instead, I think we should be asking:
 
 1) Is the best test framework for performance testing: [a] talos (with 
 improvements), [b] mochitest (with a significant amount of work to make 
 it compatible), or [c] a brand new framework?

I think that question doesn't have one answer. For JS, it's clearly something 
else, but it's not even really a framework--it's just running standard 
benchmarks. 

For other areas, there are likely different answers. That's why I was so 
excited about the self-serve idea. (Interestingly, I got schooled on this 
subject in a similar vein recently on bug tracking. :-) )

 2) Regardless of framework used, would checking it into mozilla-central 
 improve dev workflow enough to outweigh the downsides (see bug 787200 
 for history on that discussion)?

Thanks for the bug link. It seems like putting Talos itself into m-c has 
significant disadvantages. I'm not sure what to do with other/new perf tests.

 3) Regardless of framework used, how can we make the 
 development/testing/staging cycle less painful?

I liked the original proposal a lot for this.

 4) Regardless of framework used, who should be responsible for ensuring 
 we actively prune performance tests that are no longer relevant?

I gave an idea for how to do this in my reply to the original proposal. I 
didn't say who would do it, but I was assuming the maintainers/operators of 
graph-server, with the notion that they would be highly empowered to remove 
anything that no one asked them to keep or that didn't otherwise have a 
well-documented, easily understood rationale.

Dave


Re: proposal: replace talos with inline tests

2013-03-04 Thread Dave Mandelin
On Monday, March 4, 2013 5:17:29 PM UTC-8, Gregory Szorc wrote:
 On 3/4/13 5:09 PM, Dave Mandelin wrote:
 
  We already don't back out changes for regressing a benchmark like
  we back them out for regressing tests.  I think this is at least
  partially because of a general sentiment that not all of our benchmarks
  correlate strongly with what they're trying to measure.
 
  I know this has been a hot topic lately. I think getting more clarity on 
  this would be great, *if* of course we could have an answer that was both 
  operationally beneficial and clear, which seems to be incredibly difficult.
 
  But this thread gives me a new idea. If each test run in automation had an 
  owner (as I suggested elsewhere), how about also making the owners 
  responsible for informing the sheriffs about what to do in case of 
  regression? If the owners know the test is reliable and measures something 
  important, they can ask for monitoring and presumptive backout. If not, 
  they can ask sheriffs to ignore the test, inform and coordinate with the 
  owning team, inform the landing person only, or some other action.
 
 This should be annotated in the tests themselves, IMO. We could even 
 have said annotation influence the color on TBPL. 

I like it. We would need to make sure the annotations reflect active 
consideration by the test owners, but I suppose failures are likely to 
self-correct.
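
Purely as an illustration (this isn't an existing manifest format), the 
annotation could be a couple of keys next to each test entry, which the 
sheriffs' tooling could read:

# Illustrative only: a manifest-style annotation carrying the owner and the
# action sheriffs should take on a regression, parsed with stdlib configparser.
import configparser

MANIFEST = """
[tp5_scroll.html]
owner = layout-team@example.org
on_regression = backout        ; reliable test, presume backout

[experimental_apzc_fling.html]
owner = jmathies@example.org
on_regression = notify-owner   ; known-noisy, just tell the owning team
"""

def regression_policy(manifest_text):
    cfg = configparser.ConfigParser(inline_comment_prefixes=(";",))
    cfg.read_string(manifest_text)
    return {name: (cfg[name]["owner"], cfg[name]["on_regression"])
            for name in cfg.sections()}

print(regression_policy(MANIFEST))
# -> {'tp5_scroll.html': ('layout-team@example.org', 'backout'),
#     'experimental_apzc_fling.html': ('jmathies@example.org', 'notify-owner')}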

 IMO we should be focusing on lessening the burden on the 
 sheriffs and leaving them to focus on real problems.

Absolutely.

Dave


Re: The future of PGO on Windows

2013-02-06 Thread Dave Mandelin
On Mon, Feb 4, 2013 at 11:23 PM, Asa Dotzler a...@mozilla.com wrote:
 On 2/4/2013 6:59 PM, Dave Mandelin wrote:
 I was talking to Taras and Naveed about this today, and what also
 came up was:

 4. Do the work to make 64-bit JS jit perf as good as 32-bit JS jit
 perf, and then switch to x64 builds for Windows. There are of course
 many issues involved with such a switch,

 (These numbers are somewhat crude. I don't have Excel on this machine to do
 precise math from our blocklist data. Given my somewhat dated info on the 
 Windows 32 vs 64 bit breakdown, that level of precision isn't really 
 warranted.)

 Approximately 35% of our installs are on Windows XP. Microsoft has said that 
 less than 1% of XP installs are 64-bit. About 7% of our users are on Vista. 
 Microsoft said Vista's 64-bit percentage is about 11%. Just over 50% of our 
 Windows users are on Windows 7. Microsoft has said Windows 7 installs are 
 about 50% 32-bit and 50% 64-bit. That puts 32-bit Windows Firefox users at 
 around 65% of our total Windows user base.

 With nearly two thirds of our Windows users on 32-bit Windows versions, we 
 can't simply switch to x64 builds for Windows can we?

Definitely not. Thanks for reminding me. I guess there is still an option of 
trying to do PGO on 64-bit builds only (which apparently is definitely not 
possible for VS2010 and earlier) and letting more users pick up those benefits 
over time. But that's not very compelling.
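
For what it's worth, redoing Asa's arithmetic with the approximate shares he 
quotes gives about the same figure, so the two-thirds estimate looks solid:

# Rough check of the 32-bit share estimate, using the approximate numbers above.
windows_share = {"xp": 0.35, "vista": 0.07, "win7": 0.50}   # of our Windows users
share_64bit   = {"xp": 0.01, "vista": 0.11, "win7": 0.50}   # per Microsoft, roughly

frac_32bit = sum(windows_share[os] * (1 - share_64bit[os]) for os in windows_share)
print(round(frac_32bit, 2))   # ~0.66, i.e. roughly two thirds on 32-bit Windows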

Dave


Re: Proposed 2013 Platform Goals

2013-01-11 Thread Dave Mandelin
On Friday, January 11, 2013 3:51:57 PM UTC-8, Gary Kwong wrote:
  Thinking of making the top level goals bugs and hanging related work off
  them as deps.  What do people think of this idea?  Is it maintainable?
 
 Sounds reasonable, they could be meta bugs, marked with the meta keyword.

Maybe even a goal keyword to mark them specially?

I also mentioned to JP in conversation that I think the bugs should all be 
linked directly to the goal bug, rather than through intermediaries, so that 
you can always see it belongs to that goal right on the bug.

Dave


Re: Proposal: Remove Linux PGO Testing

2012-11-14 Thread Dave Mandelin
On Wednesday, November 14, 2012 10:53:37 AM UTC-8, Alex Keybl wrote:
 Discussions are ongoing as to whether disabling the test is our 
 best path forward here, given engineering opposition to disabling 
 PGO.

I strongly recommend disabling the test for 32-bit Linux PGO and moving on. Bug 
799295 has gotten attention and effort far out of proportion to its likely 
impact. We all have much better things to do with our time.

Dave


Re: Proposal for reorganizing test directories

2012-11-02 Thread Dave Mandelin
On Tuesday, October 30, 2012 7:26:34 AM UTC-7, Henrik Skupin wrote:
 As nearly all of you agreed, a flat folder structure makes a lot of 
 sense if only one type of test is present. I second that, and we
 shouldn't make use of a 'tests' subfolder in such a case. But it would
 be fantastic if we could agree on a reasonable naming scheme. Something
 we would propose is to use a folder name which matches the test
 framework. Mostly we are doing that already, but we are affected by small
 differences like 'mochitest' vs. 'mochitests'. The right usage would be
 'mochitest', and it would apply to the other frameworks too.

I didn't see anyone disagree with this, at least; just Karl's note that we 
might need separate names for the harness implementation and the test 
directories. What do people think? Is this OK? I would imagine that, taking 
Karl's comment into account, it is; and if no one speaks up now, I think you 
could just start submitting patches.

 Further if more tests are getting created which span multiple test
 frameworks, we should introduce the proposed 'tests' sub folder so all
 the tests can be kept together. Existing tests would have to be moved
 under tests, e.g. 'tests/mochitest' and 'tests/crashtest'. As far as I
 have seen we are doing that already in a couple of components.

I don't understand this piece very well. Do people understand this well enough 
to agree or disagree with it, and to do it if it's wanted?

Dave


Re: Proposal for reorganizing test directories

2012-11-01 Thread Dave Mandelin
At the last Tuesday meeting I foolishly agreed :-) to take charge of following 
up on this discussion and seeing if we can come to a decision. So, here goes:

First, I want to try to pour some gasoline on the dying embers and suggest that 
perhaps we should totally rearrange everything. As a developer user of our 
testing systems, I always found it incredibly irritating that there were test 
directories sprinkled throughout the tree that got copied to the build dirs as 
part of the build process, with no clear mapping between the build path of the 
test, the source path of the test, and the path you had to pass to mochitests 
to actually run the test. 

I would prefer something like this:

|-- tests/
    |-- browser-chrome/
        |-- topic1 (omit this level if there would be only one)
        |-- topic2
        |-- [...]
    |-- chrome/
    |-- crashtests/
    |-- marionette/
    |-- mochitests/
    |-- reftests/
    |-- xpcshell/
    |-- [..]/

That way, when working with a given testing harness, you know where to find all 
the files for it (source, metadata, and tests), and can easily figure out how 
to work with specific tests and directories.

This is approximately what SpiderMonkey uses, and it seems to be approximately 
what Chromium and WebKit use.

This does seem to be about the transpose of what you asked for, Henrik, but it 
makes sense to me, so I'm curious what you think. 

Dave


PGO: another test + PGO topcrashes

2012-11-01 Thread Dave Mandelin
I'm still thinking about PGO:

1. I did another test. I wanted to know the effect on games, so I played 
BananaBread and eyeballed modal fps. (If anyone knows of a more accurate way to 
measure fps in the game, let me know.) I got:

 opt  38 fps
 pgo  41 fps

Similar magnitude to other domains. Super-unscientific test, though.

2. PGO-related topcrashes

The reason I started studying PGO was that PGO-related topcrashes are really 
vexing for developers, and I wanted to be sure dealing with all that vexation 
was actually paying off. It seems that it is, but now I want to ask, can it vex 
less?

The lifecycle of a PGO-related topcrash seems to be something like this:

 1. Someone changes the code base in a way that exposes a bug in the PGO 
compiler system.
 2. A topcrash gets noticed.
 3. Developers look at the topcrash and conclude it is a PGO bug.
 4. Developers patch code around the crash point by turning off optimizations 
locally, or randomly frobulating the code.
 5. The topcrash goes away.

I assume that some time after that:

 6. Someone changes the code base so that that site no longer triggers the 
compiler bug, but the random change or deoptimization stays behind.

What I'm getting at is that it seems like we are adding permanent gunk to the 
code base that only temporarily fixes problems. I can't be sure that's what's 
happening, but it seems plausible. Anyway, I am fairly sure we are currently 
playing whack-a-mole.

 (a) How about building Windows with a newer version of MSVC, say 2012? (What 
version are we using now, anyway? The build instructions page says 2010 is 
official, but a tbpl log showed Visual Studio 9.0 on the path.) Maybe they have 
fixed bugs in PGO.

 (b) Failing that, how about not fixing PGO bugs unless they are reproducible, 
on a trial basis? If my lifecycle theory is correct, then the total crash rate 
would stay roughly constant. And I assume that if the crash rate doesn't 
actually go up, that's OK. If it does, and especially if that can be shown to 
be due to an increasing number of outstanding PGO bugs, that would show that 
we do benefit from fixing them.

Dave


Re: Proposal for reorganizing test directories

2012-11-01 Thread Dave Mandelin
On Thursday, November 1, 2012 6:33:39 PM UTC-7, therealbr...@gmail.com wrote:
 On Thursday, November 1, 2012 5:47:57 PM UTC-7, Dave Mandelin wrote:
 
  At the last Tuesday meeting I foolishly agreed :-) to take charge of 
  following up on this discussion and seeing if we can come to a decision. 
  So, here goes:
 
 ...
 
  I would prefer something like this:

  |-- tests/
      |-- browser-chrome/
          |-- topic1 (omit this level if there would be only one)
          |-- topic2
          |-- [...]
      |-- chrome/
      |-- crashtests/
      |-- marionette/
      |-- mochitests/
      |-- reftests/
      |-- xpcshell/
      |-- [..]/
 
 ...
 
  This is approximately what SpiderMonkey uses, and it seems to be 
  approximately what Chromium and WebKit use.
 
 How about js/src/tests and other *tests subdirectories? Do they get
 moved out to a remote top-level, where SpiderMonkey-only embedders 
 will miss them?

 The tyranny of hierarchy never ends. Either we have subsidiarity for 
 js and other modules, or not. If Gecko is one big module -- ok, I 
 get it. But you need a principle for giving js its own tests while 
 hoisting all others.

It looks like V8 keeps JS tests in a separate directory, but JSC has them in 
common with WebKit, presumably since V8 promotes itself as an embeddable 
component while JSC I believe does not.

I don't think it would affect SpiderMonkey development much to move the tests 
to a new home. They are designed to be independent of the source code and the 
path to the program being tested. So I wouldn't mind. Even for independent 
distribution purposes, it makes more sense to me to collect files into new 
places to prepare a distro than to do so as part of the per-compile testing 
process.

 This seems small potatoes either way, BTW. I've been through both 
 approaches. Neither is obviously winning.

Sure, it's not some grand thing. I just like things to be nicely organized. And 
I really did find mochitest paths a hassle and a (small) tax on development 
effort.

Dave


Re: Benefits of PGO on Windows

2012-10-19 Thread Dave Mandelin
On Wednesday, October 17, 2012 11:00:13 PM UTC-7, Mike Hommey wrote:
 If you copy omni.ja from the PGO build to the opt build, you'll be able
 to see if everything comes from that. We're planning to make that
 currently PGO-only optimization run on all builds. (bug 773171)

Excellent suggestion, plus it made me repeat the experiment. The repeat turned 
up somewhat more confusing data that still seems to support PGO for Windows 
startup. I did 2-3 tests with each of 4 configurations (I botched one trial and 
didn't bother rebooting to test it again), and got this:

 pgo with pgo omni.ja   1.6 - 1.7 seconds (1.6, 1.7)
 pgo with opt omni.ja   1.4 - 1.6 seconds (1.4, 1.6, 1.6)
 opt with pgo omni.ja   1.3 - 8.0 seconds (1.3, 1.4, 8.0)
 opt with opt omni.ja   2.9 - 8.7 seconds (2.9, 6.2, 8.7)

The number of trials is too small to conclude very much. If we really wanted to 
know, either someone would have to spend some time doing this over and over, or 
we'd have to use Telemetry with some A/B testing.
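
For reference, here's the same trial data reduced to a median per configuration 
(take it with a grain of salt; with two or three samples per cell the median 
isn't much better than eyeballing):

# The raw cold-startup trials quoted above, reduced to a median per configuration.
from statistics import median

trials = {
    "pgo build / pgo omni.ja": [1.6, 1.7],
    "pgo build / opt omni.ja": [1.4, 1.6, 1.6],
    "opt build / pgo omni.ja": [1.3, 1.4, 8.0],
    "opt build / opt omni.ja": [2.9, 6.2, 8.7],
}
for config, times in trials.items():
    print("%s: median %.1fs over %d trials" % (config, median(times), len(times)))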

It's very weird to me that despite the new weirdness, opt/opt was always slower 
than pgo/pgo, and by about the same amount as my first experiment (in the best 
case for opt/opt). 

Dave


Re: Benefits of PGO on Windows

2012-10-19 Thread Dave Mandelin
On Thursday, October 18, 2012 4:59:10 AM UTC-7, Ted Mielczarek wrote:
 If you're interested in the benchmark side of things, it's fairly easy 
 to compare now that we build both PGO and non-PGO builds on a regular 
 basis. I'm having a little trouble getting graphserver to give me recent 
 data, but you can pick arbitrary tests that we run on Talos and graph 
 them side-by-side for the PGO and non-PGO cases. For example, here's Ts 
 and Tp5 MozAfterPaint for Windows 7 on both PGO and non-PGO builds 
 (the data ends in February for some reason):
 
 http://graphs.mozilla.org/graph.html#tests=[[16,1,12],[115,1,12],[16,94,12],[115,94,12]]&sel=none&displayrange=365&datatype=running
 
 You can see that there's a pretty solid 10-20% advantage to PGO in these 
 tests.

Ah. That answers my question about more data.

For Ts, I see a difference of only 70ms (e.g., 520 vs. 590 at the last point). 
That's borderline trivial, but the differences I measure are much greater. What 
does Ts actually measure, anyway? Is it measuring only from main() starting to 
first paint, or something like that?

For Tp5, I see a difference of 80ms (330 vs. 410 and such). I'm not really sure 
what to make of that. By itself, it doesn't necessarily seem like it would be 
that noticeable, but the fraction is big enough that if it holds up for longer 
bigger pages, I could see it slightly improving pageloads and probably also 
reducing some pauses for layout and such. From what I understand about Tp5, 
it's not really measuring modern pageloads (ignores network and isn't focused 
on popular sites). I wish we had something more representative so we could draw 
better conclusions (and not just about PGO).

 Here's Dromaeo (DOM) which displays a similar 20% advantage:
 
 http://graphs.mozilla.org/graph.html#tests=[[73,94,12],[73,1,12]]&sel=none&displayrange=365&datatype=running
 
 It's certainly hard to draw a conclusion about your hypothesis from just 
 benchmarks, but when almost all of our benchmarks display 10-20% 
 reductions on PGO builds it seems fair to say that that's likely to be 
 user-visible. 

It seems fair to me to say that core browser CPU-bound tasks are likely to be 
10-20% faster. There is probably some of that users can notice, although I'm 
not sure exactly what it would be. The JS benchmarks do run faster in the PGO 
builds, but I haven't tested other JS-based things to see if it's noticeable. I 
guess I should be testing game framerates or something like that too.

 We've spent hundreds of man-hours for perf gains far less than that.

Yes, we need to get more judicious about how we apply our perf efforts. :-)

 On a related note, Will Lachance has been tasked with getting our 
 Eideticker performance measurement framework working with Windows, so we 
 should be able to experimentally measure user-visible responsiveness in 
 the near future.

I'm curious to see what kinds of tests it will enable.

Dave


Benefits of PGO on Windows

2012-10-17 Thread Dave Mandelin
Following the recent discussion about PGO, I really wanted to understand what 
benefits PGO gives Firefox on Windows, if any--I was skeptical. Rafael (IIRC) 
posted some Talos numbers, but I didn't know how to interpret them. So I 
decided to try a few simple experiments to try to falsify the hypothesis that 
PGO has user-perceivable benefits.

Experimental setup: Windows builds from 
http://hg.mozilla.org/mozilla-central/rev/5f4a6a474455 on a Windows 7 Xeon. I 
took opt and pgo builds from the tbpl links.

Experiment 1: cold startup time

  I used a camera to measure the time from pressing enter on a
  command line until the Fx window was completely shown.

 results:
  opt: 3.025 seconds
  pgo: 1.841

  - A clear win for PGO. I'm told that there is a startup
time optimization that orders omni.ja that only runs 
in PGO builds. So it's not necessarily from the PGO
itself, but at least it means the current PGO builds
really are better.

Experiment 2: JS benchmarks

  I ran SunSpider and V8. I would have run Kraken too, but
  it takes longer to run and I already had significant
  results by then. I did 1-2 runs. Below I show the average,
  rounded off to not show noise digits.

 results:
              opt   pgo
  SunSpider   250   200  (milliseconds; lower is better)
  V8         8900  9400  (score; higher is better)

  - Another clear win for PGO.
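
Putting rough percentages on these wins (and folding in the cold startup numbers 
from experiment 1):

# Relative improvements implied by the numbers above (only 1-2 runs each, so
# these are ballpark figures).
sunspider_opt, sunspider_pgo = 250, 200      # total time; lower is better
v8_opt, v8_pgo = 8900, 9400                  # score; higher is better
startup_opt, startup_pgo = 3.025, 1.841      # cold startup, seconds (experiment 1)

print("SunSpider: %.0f%% less time" % (100.0 * (sunspider_opt - sunspider_pgo) / sunspider_opt))
print("V8 score:  %.0f%% higher" % (100.0 * (v8_pgo - v8_opt) / v8_opt))
print("Startup:   %.0f%% faster" % (100.0 * (startup_opt - startup_pgo) / startup_opt))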

(Side note: I've recorded startup times for myself, with my normal profile, of 
~30 seconds. I assumed that was just normal, so today I looked on Telemetry and 
saw that only 5-8% of startup times are that long. (I wish I knew what % of 
cold startups that is.) Today's results were with a clean profile, so it seems 
like my normal profile must be busting my startups (and others') badly. It 
would be really nice to make startup time independent of profile.)

Dave


Re: Minimum Required Python Version

2012-09-10 Thread Dave Mandelin
On Sunday, September 9, 2012 12:54:29 PM UTC-7, Gregory Szorc wrote:
 So, 2.6 or 2.7?

Thanks for bringing this up! Count me as another vote for 2.7. I don't like 
using obsolete language versions outside of necessity, and I've never found it 
difficult to install Python. 

I think MozillaBuild is currently on 2.6.5, so that will need to be updated.

Dave


Re: The current state of Talos benchmarks

2012-08-29 Thread Dave Mandelin
On Wednesday, August 29, 2012 4:03:24 PM UTC-7, Ehsan Akhgari wrote:
 Hi everyone,
 
 The way the current situation happens is that many of the developers 
 ignore the Talos regression emails that go to dev-tree-management, 

Talos is widely disliked and distrusted by developers, because it's hard to 
understand what it's really measuring, and there are lots of false alarms. 
Metrics and A-Team have been doing a ton of work to improve this. In 
particular, I told them that some existing Talos JS tests were not useful to 
us, and they deleted them. And v2 is going to have exactly the tests we want, 
with regression alarms. So Talos can (and will) be fixed for developers.

 and in many cases regressions of a few percents slide in without being 
 tracked.  This trend of relatively big performance regressions becomes 
 more evident every time we do an uplift, which means that 6 weeks worth 
 of development get compared to the previous version.
 
 A few people (myself included) have tried to go through these emails and 
 notify the people responsible in the past.  This process has proved to 
 be ineffective, because (1) the problem is not officially owned by 
 anyone (currently the only person going through those emails is 
 mbrubeck), and (2) because of problems such as the difficulty of 
 diagnosing and reproducing performance regressions, many people think 
 that their patches are unlikely to have caused a regression, and 
 therefore no investigation gets done.

Yeah, that's no good.

 Some people have noted in the past that some Talos measurements are not 
 representative of something that the users would see, the Talos numbers 
 are noisy, and we don't have good tools to deal with these types of 
 regressions.  There might be some truth to all of these, but I believe 
 that the bigger problem is that nobody owns watching over these numbers, 
 and as a result we take regressions in some benchmarks which can 
 actually be representative of what our users experience.

The interesting thing is that we basically have no idea if that's true for any 
given Talos alarm.

 I don't believe that the current situation is acceptable, especially 
 with the recent focus on performance (through the Snappy project), and I 
 would like to ask people if they have any ideas on what we can do to fix 
 this.  The fix might be turning off some Talos tests if they're really 
 not useful, asking someone or a group of people to go over these test 
 results, get better tools with them, etc.  But _something_ needs to 
 happen here.

I would say:

- First, and most important, fix the test suite so that it measures only things 
that are useful and meaningful to developers and users. We can easily take a 
first cut at this if engineering teams go over the tests related to their work, 
and tell A-Team which are not useful. Over time, I think we need to get a solid 
understanding of what performance looks like to users, what things to test, and 
how to test them soundly. This may require dedicated performance engineers or a 
performance product manager.

- Second, as you say, get an owner for performance regressions. There are lots 
of ways we could do this. I think it would integrate fairly easily into our 
existing processes if we (automatically or by a designated person) filed a bug 
for each regression and marked it tracking (so the release managers would own 
followup). Alternately, we could have a designated person own followup. I'm not 
sure if that has any advantages, but release managers would probably know. But 
doing any of this is going to severely annoy engineers unless we get the false 
positive rate under control.

- Speaking of false positives, we should seriously start tracking them. We 
should keep track of each Talos regression found and its outcome. (It would be 
great to track false negatives too but it's a lot harder to catch them and 
record them accurately.) That way we'd actually know whether we have a few 
false positives or a lot, or whether the false positives were coming up on 
certain tests. And we could use that information to improve the false positive 
rate over time.
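
Concretely, the bookkeeping could be as simple as recording each alert and its 
eventual outcome, then computing a rate per test; a sketch (all names made up):

# Sketch of minimal alert bookkeeping: record each Talos alert and its outcome,
# then report the false-positive rate per test.
from collections import defaultdict

def false_positive_rates(alerts):
    """alerts: list of (test_name, outcome), where outcome is
    'real-regression', 'false-positive', or 'unknown'."""
    counts = defaultdict(lambda: {"fp": 0, "total": 0})
    for test, outcome in alerts:
        counts[test]["total"] += 1
        if outcome == "false-positive":
            counts[test]["fp"] += 1
    return {test: c["fp"] / float(c["total"]) for test, c in counts.items()}

alerts = [
    ("ts",  "false-positive"),
    ("ts",  "real-regression"),
    ("tp5", "false-positive"),
]
print(false_positive_rates(alerts))   # {'ts': 0.5, 'tp5': 1.0}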

Dave


Re: telemetry data retention strategy

2012-08-15 Thread Dave Mandelin
On Wednesday, August 15, 2012 2:03:38 PM UTC-7, Taras Glek wrote:
 Hi,
 
 According to metrics we have about 1TB of telemetry data in hadoop. This 
 is almost a year worth of telemetry data. Our telemetry ping packets 
 keep growing as we add more probes. As the hadoop database gets bigger, 
 query times get worse, etc. We need to decide on what data we can throw 
 away and when.

Most of this sounds fine to me, except I wanna ask: how much does it cost to 
store 1TB of data? Next to nothing, right? I'd say move it out of the primary 
database to an archive area if you need to for performance, but why not keep 
all the archives?
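
Back-of-the-envelope, with per-GB prices that are purely my assumptions rather 
than quotes from anyone: even at generous rates, a terabyte is on the order of 
tens to a hundred dollars a month, which seems small next to the value of the 
data.

# Assumed storage prices for illustration only; not actual quotes.
TOTAL_GB = 1024   # roughly 1 TB
assumed_price_per_gb_month = {
    "raw disk, amortized":  0.01,   # assumed $/GB/month
    "managed + replicated": 0.10,   # assumed $/GB/month
}
for kind, price in assumed_price_per_gb_month.items():
    monthly = TOTAL_GB * price
    print("%s: ~$%.0f/month (~$%.0f/year)" % (kind, monthly, monthly * 12))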

Dave


Re: Switching nsnull to nullptr

2012-07-26 Thread Dave Mandelin
On Thursday, July 26, 2012 12:55:15 AM UTC-7, Aryeh Gregor wrote:
 On Wednesday, July 25, 2012 8:45:22 PM UTC+3, Dave Mandelin wrote:
 > SpiderMonkey officially has a C++ API now, so nullptr should be OK. 
 > There is at least one wrinkle, which is that we need to support jsd for a 
 > while yet, which is C. There are a few |NULL|s in jsapi.h that look like they 
 > are exposed to C, so just a mass-replace wouldn't work, but there are 
 > a couple of ways to get jsd to work and I don't think there's 
 > anything too complicated.
 
 nullptr is already defined to be 0L/0LL if unsupported.  It should also be 
 defined that way for C generally, although it's currently not (which is a 
 bug in the patch that landed).  Then it will work in C too, albeit without 
 the type-safety, for anything that includes nscore.h.  I guess we could 
 define it to be NULL instead of 0L/0LL, but surely there was some reason we 
 didn't do that for nsnull to begin with?

  #define nullptr 0L

for C sounds fine to me. I never particularly thought that using 
NULL/nsnull/whatever instead of 0 was really good for anything anyway.

Dave


Re: Switching nsnull to nullptr

2012-07-25 Thread Dave Mandelin
On Wednesday, July 25, 2012 2:19:43 PM UTC-7, Ehsan Akhgari wrote:
 On 12-07-25 1:45 PM, Dave Mandelin wrote:
 > On Wednesday, July 25, 2012 7:45:43 AM UTC-7, Bobby Holley wrote:
 >> On Wed, Jul 25, 2012 at 4:21 PM, Aryeh Gregor wrote:
 >>
 >>> On Wednesday, July 25, 2012 3:04:31 PM UTC+3, Justin Lebar wrote:
 >>>>> The next step is to s/nsnull/nullptr/ in the codebase, and get rid
 >>>>> of nsnull.
 >>>>
 >>>> Forgive my ignorance, but how does this affect NULL?  Would that be
 >>>> deprecated in favor of nullptr as well?  Should we use nsnull instead
 >>>> of NULL in new code, in anticipation of the nsnull --> nullptr switch?
 >>>
 >>> That would be a logical next step, for sure.  I'd definitely say nullptr
 >>> should be used instead of NULL where possible, because the extra type
 >>> safety is valuable.  I guess it would make sense to try mass-changing NULL
 >>> to nullptr,
 >>
 >>
 >> What about the JS engine (which uses NULL), and code that uses js-engine
 >> style, such as (parts of) XPConnect? I imagine that the embedding situation
 >> makes things more complicated there than a simple find/replace.
 >
 > SpiderMonkey officially has a C++ API now, so nullptr should be OK. 
 > There is at least one wrinkle, which is that we need to support jsd for a 
 > while yet, which is C. There are a few |NULL|s in jsapi.h that look like they 
 > are exposed to C, so just a mass-replace wouldn't work, but there are a 
 > couple of ways to get jsd to work and I don't think there's anything 
 > too complicated.
 
 Then maybe we can do something like:
 
 #ifndef __cplusplus
 #define nullptr NULL
 #endif
 
 And use nullptr everywhere else in jsd.h?
 
 Cheers,
 Ehsan

That would be one way to do it. (I could imagine that some C file somewhere 
includes that and uses nullptr in a different way but things like that seem 
very unlikely.) We could also switch jsd to C++ if we wanted to, although that 
would be more work.