Re: BHR Project Status

2017-09-21 Thread Calixte Denizet
> >
> > Calixte made a tool to associate a newly reported crash with recent
> > changes.  Would it be possible to look at the same kind of tools based on
> > the reports of the BHR, with a given buildId?
> >
>
> Do you have a link to this tool, so I can understand a little better?
>
>

This tool (see [1]) maps the files that appear in a crash backtrace
(available on Socorro) to the patches pushed in the last 3 days (according
to the pushdate on hg.mozilla.org).
This helps identify the patch that caused a regression (see metabug [2]).


[1] https://github.com/calixteman/clouseau/blob/master/clouseau/guiltypatches.py
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1396527
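
To make the idea concrete, here is a minimal Python sketch of the matching
step described above. It is not the clouseau implementation: the function
name, the patch dictionaries, and the sample paths are all hypothetical, and
it assumes the backtrace files and the pushed patches have already been
fetched from Socorro and hg.mozilla.org.

```python
from datetime import datetime, timedelta


def candidate_patches(backtrace_files, patches, reference_time, window_days=3):
    """Return (node, touched_files) for patches pushed within `window_days`
    before `reference_time` that touch a file seen in the backtrace."""
    wanted = set(backtrace_files)
    earliest = reference_time - timedelta(days=window_days)
    hits = []
    for patch in patches:
        # Skip patches pushed outside the look-back window.
        if not (earliest <= patch["pushdate"] <= reference_time):
            continue
        touched = wanted & set(patch["files"])
        if touched:
            hits.append((patch["node"], sorted(touched)))
    return hits


# Made-up sample data, for illustration only.
sample_patches = [
    {"node": "p1", "pushdate": datetime(2017, 9, 20), "files": ["src/foo.cpp"]},
    {"node": "p2", "pushdate": datetime(2017, 9, 10), "files": ["src/foo.cpp"]},
    {"node": "p3", "pushdate": datetime(2017, 9, 19), "files": ["src/baz.cpp"]},
]
result = candidate_patches(
    ["src/foo.cpp", "src/bar.cpp"], sample_patches, datetime(2017, 9, 21)
)
print(result)
```

Only p1 is reported here: p2 touches a backtrace file but was pushed too
early, and p3 is recent but touches no file in the backtrace.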


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: BHR Project Status

2017-09-20 Thread Doug Thayer
On Wed, Sep 20, 2017 at 9:06 AM, Nicolas B. Pierron <
nicolas.b.pier...@mozilla.com> wrote:

> > What impact has a stack which is being reported at 1% / 0.5% / 0% ?
>
> I see that the histograms on top are changing each time I highlight a new
> line.  When I mouse-over the histograms buckets I notice the  "ms/h" metric.
>
> What does it mean?  Does that mean that in any browsing session, any user
> will see the given signature for X ms per hour?
>

As a mean value, and for the date in question, yes. If a particular prefix
list of stack frames is responsible for one 20-second hang for one in 10
users every 5 hours (pretending all users use Nightly for the same number
of hours per day), you'll see a value of 20,000 ms / 10 / 5 == 400 ms/h.
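
The same arithmetic as a small Python sketch. The function and parameter
names are hypothetical and not part of hangs.html; the sketch keeps the
simplifying assumption that all users run Nightly for the same number of
hours.

```python
def mean_hang_ms_per_hour(hang_ms, one_in_n_users, every_n_hours):
    """Mean hang time per usage hour, averaged over all users."""
    # Only one in `one_in_n_users` sees the hang, and it recurs once
    # every `every_n_hours` hours of usage.
    return hang_ms / one_in_n_users / every_n_hours


print(mean_hang_ms_per_hour(20_000, 10, 5))  # 400.0
```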


>
> Should the left column of the profiles report the time in  ms/h  instead
> of unit-less % ?
>

That's a good suggestion. I'll work on getting units in there.


>
> > Is the histogram indexed by build-id / report-date?
>

Build ID.


> > Is there a way to get the URL on which these hangs are reported?
>

If I'm understanding your question correctly, I believe that's a privacy
issue. Maybe if we end up going with RAPPOR, we'll be able to see some
aggregate stats on the hanginess of particular domains, though. Or we could
prompt Nightly users to opt into sending more sensitive information, but
that brings biases with it.


>
> > Can we have a larger history of hangs per signature? (similar to
> crash-stats [1])
>

I'm working on keeping a dataset which will always go back to Sept. 1st.
[1] I think we should be able to maintain this; however, the performance of
the UI will need to be addressed in some way, as even now it's a bit of a
hog.


>
> Calixte made a tool to associate a newly reported crash with recent
> changes.  Would it be possible to look at the same kind of tools based on
> the reports of the BHR, with a given buildId?
>

Do you have a link to this tool, so I can understand a little better?


[1] https://arewesmoothyet.com/?category=all=2048_65536_


Re: BHR Project Status

2017-09-20 Thread Nicolas B. Pierron

On 09/20/2017 01:52 PM, Michael Layzell wrote:

> Doug Thayer has written a visualizer for the collected data called
> hangs.html (https://arewesmoothyet.com), based on the perf.html profiler
> viewer. This interface allows analysis of the change in frequency of
> specific hangs over time, lots of tools for filtering through hang
> information, as well as a profiler-like interface for drilling into
> specific hang stacks to determine what might be causing the problems.


This looks like a really interesting tool, and I am glad we have this
information in a much more readable format now! While I think I got a rough
idea of the content which is presented, I want to make sure I read this
content correctly.


Thus, I have a few questions/comments, related to reading these data,
to investigating these reports, and to determining the importance
(impact/priority) that we should give to each of these issues.



> What impact has a stack which is being reported at 1% / 0.5% / 0% ?

I see that the histograms on top are changing each time I highlight a new 
line.  When I mouse-over the histograms buckets I notice the  "ms/h" metric.


What does it mean?  Does that mean that in any browsing session, any user 
will see the given signature for X ms per hour?


Should the left column of the profiles report the time in  ms/h  instead of 
unit-less % ?


> Is the histogram indexed by build-id / report-date?

> Is there a way to get the URL on which these hangs are reported?

> Can we have a larger history of hangs per signature? (similar to
crash-stats [1])


Calixte made a tool to associate a newly reported crash with recent changes.
Would it be possible to look at the same kind of tools based on the
reports of the BHR, with a given buildId?


[1] https://crash-stats.mozilla.com/signature/?product=Firefox=57.0a1=RtlEnterCriticalSection%20%7C%20mozilla%3A%3Anet%3A%3ACacheStorageService%3A%3ACacheQueueSize=%3E%3D2017-03-20T15%3A55%3A32.000Z=%3C2017-09-20T15%3A55%3A32.000Z#graphs


--
Nicolas B. Pierron


BHR Project Status

2017-09-20 Thread Michael Layzell
In the last few months we've been putting work into making the data which
we collect from the Background Hang Reporter (BHR) more usable and
actionable. We use BHR to measure the frequency and cause of browser hangs
(when the main thread's event loop doesn't process events for 128ms or
longer). The goal is to collect information which lets us improve
Firefox's responsiveness by reducing the frequency of main thread hangs.
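
As an illustration of the 128ms definition only, here is a minimal Python
sketch that scans a list of event-processing timestamps for gaps at or
above the threshold. This is not how BHR itself detects hangs; the function
name and the sample timestamps are hypothetical.

```python
HANG_THRESHOLD_MS = 128  # threshold from the definition above


def find_hangs(event_times_ms, threshold_ms=HANG_THRESHOLD_MS):
    """Return (start_ms, duration_ms) for each gap between consecutive
    processed events that lasts at least `threshold_ms`."""
    hangs = []
    for prev, cur in zip(event_times_ms, event_times_ms[1:]):
        gap = cur - prev
        if gap >= threshold_ms:
            hangs.append((prev, gap))
    return hangs


# Made-up timestamps (ms) at which the event loop processed an event.
print(find_hangs([0, 16, 32, 200, 216, 344]))  # [(32, 168), (216, 128)]
```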

On the data collection side, the BHR stack walking code has been rewritten
to take advantage of Gecko Profiler internals. This reduces code
duplication and lets us use Gecko Profiler features like JS stack
interleaving. In addition, the ping submission logic has been rewritten to
perform less work on the main thread and to submit hang information
outside of the main ping. This let us begin collecting much more data,
including interleaved chrome-js/native stack frame information for all
hangs, and information about the browser's state, such as pending input
events. Platform support has also been expanded from win32 to include
linux64, win64 and macOS.

Doug Thayer has written a visualizer for the collected data called
hangs.html (https://arewesmoothyet.com), based on the perf.html profiler
viewer. This interface allows analysis of the change in frequency of
specific hangs over time, offers many tools for filtering through hang
information, and provides a profiler-like interface for drilling into
specific hang stacks to determine what might be causing the problems. Doug
is actively working on adding new features to the UI to improve filtering
and make it easier to get good results from the data, but we're already
finding and fixing important bugs. Bugs fixed so far include
bug 1393597, where we discovered that a synchronous GC in an edge case was
having more of a performance impact than we expected, and bug 1381465, where
we observed and prioritized the fixing of main thread I/O in the content
process.