Re: [Mesa-dev] Chromium - Application-level nouveau blacklist

2019-01-08 Thread Stéphane Marchesin
On Tue, Jan 8, 2019 at 1:11 AM Eero Tamminen  wrote:
>
> Hi,
>
> On 8.1.2019 8.56, Stéphane Marchesin wrote:
> > Yes I think the Chrome-side is very simple here: because there isn't
> > time or means for in-depth investigation, if a driver crashes too
> > much, it gets blacklisted. The situation is not unique, the GPU
> > blacklist file is 1700 lines:
> > https://chromium.googlesource.com/chromium/src/gpu/+/master/config/software_rendering_list.json
> >
> > Anyway, IMO if the biggest crashers can be fixed, I think we could
> > eventually make a case to reenable.
>
> Can the Chrome crash tracking system provide the following information:
> * which www-page caused the crash
> * upstream Mesa commit used
> * upstream kernel commit used
> * if crash is from ChromeOS
>   - what patches have been applied on top of kernel & Mesa

Note that for Chrome OS, we handle the crashes ourselves already. But
of course there is no nouveau-based chromebook so that wouldn't help
with the issue at hand.

> * if crash is from a Linux distro
>   - distro version and windowing system setup (X/Wayland, compositor)
>   - package versions for kernel, Mesa, display server and compositor
>     (to find what kernel & Mesa patches are applied)
> * detailed HW information
> * backtrace

The problem with the backtrace is that Chrome doesn't have the symbols
for the driver, so although we (internally, at Google) can get a
backtrace, we still need to symbolize it somehow.

> ?
>
> So that developers could try the corresponding combination?

The Chrome team has a relationship with a lot of vendors on other
OSes. If there is interest, I am sure we can work out something
similar between the Chrome team and the nouveau devs, and I'd be happy
to play middle man if needed. I think the main roadblock is about
symbolizing the backtraces which we have. We can probably start with
bugs that happen on "common" distros, for which it should be easy to
get symbols.

Stéphane
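
As a rough illustration of what symbolizing a backtrace involves (an
editorial sketch, not code from Chrome or Mesa): given raw offsets from
a crash report and a symbol table taken from a distro's debug-symbol
package, each offset is mapped to the nearest preceding symbol. The
symbol names and offsets below are made up for the example.

```python
import bisect

# Hypothetical symbol table for one driver library, sorted by start offset.
# In practice this would come from the distro's debug-symbol package.
SYMBOLS = [
    (0x1000, "driver_init"),
    (0x1400, "driver_draw"),
    (0x2200, "driver_flush"),
]


def symbolize(offset, symbols=SYMBOLS):
    """Return 'name+0xNN' for the symbol at or before offset, or None."""
    starts = [start for start, _ in symbols]
    # Index of the last symbol whose start is <= offset.
    i = bisect.bisect_right(starts, offset) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return f"{name}+{offset - start:#x}"


def symbolize_backtrace(offsets, symbols=SYMBOLS):
    """Symbolize every frame, keeping unresolved frames as raw offsets."""
    return [symbolize(o, symbols) or f"??@{o:#x}" for o in offsets]
```

The lookup only works if the symbol table matches the exact binary that
crashed, which is why starting with "common" distros, whose packages
are easy to obtain, is the practical first step.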


>
>
> - Eero
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Chromium - Application-level nouveau blacklist

2019-01-08 Thread Eero Tamminen

Hi,

On 8.1.2019 8.56, Stéphane Marchesin wrote:

> Yes I think the Chrome-side is very simple here: because there isn't
> time or means for in-depth investigation, if a driver crashes too
> much, it gets blacklisted. The situation is not unique, the GPU
> blacklist file is 1700 lines:
> https://chromium.googlesource.com/chromium/src/gpu/+/master/config/software_rendering_list.json
>
> Anyway, IMO if the biggest crashers can be fixed, I think we could
> eventually make a case to reenable.


Can the Chrome crash tracking system provide the following information:
* which www-page caused the crash
* upstream Mesa commit used
* upstream kernel commit used
* if crash is from ChromeOS
  - what patches have been applied on top of kernel & Mesa
* if crash is from a Linux distro
  - distro version and windowing system setup (X/Wayland, compositor)
  - package versions for kernel, Mesa, display server and compositor
    (to find what kernel & Mesa patches are applied)
* detailed HW information
* backtrace
?

So that developers could try the corresponding combination?


- Eero
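
The list above could be captured as a simple record, sketched here for
illustration only. The field names are invented for this example and do
not correspond to any real Chrome crash-report schema; the helper just
shows which pieces are still missing before a developer could try to
reproduce a crash.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CrashReport:
    """One field per item in the list above (all names are hypothetical)."""
    page_url: Optional[str] = None
    mesa_commit: Optional[str] = None
    kernel_commit: Optional[str] = None
    is_chromeos: Optional[bool] = None
    downstream_patches: Optional[str] = None
    distro_version: Optional[str] = None
    windowing_setup: Optional[str] = None   # X/Wayland, compositor
    package_versions: Optional[str] = None  # kernel, Mesa, display server
    hw_info: Optional[str] = None
    backtrace: Optional[str] = None


def missing_fields(report):
    """Return the names of all fields that were not filled in."""
    return [f.name for f in fields(report) if getattr(report, f.name) is None]
```

For example, a report carrying only a page URL and a backtrace would
list the other eight fields as missing.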


Re: [Mesa-dev] Chromium - Application-level nouveau blacklist

2019-01-07 Thread Stéphane Marchesin
On Sat, Jan 5, 2019 at 11:37 PM Jason Ekstrand  wrote:
>
> On Sat, Jan 5, 2019 at 2:40 PM Ilia Mirkin  wrote:
>>
>> It looks like as of Chromium 71, nouveau is completely blacklisted.
>
>
> That's rather unfortunate. :-(  The intel mesa drivers were also blacklisted 
> for quite some time a while back.  I'm not really sure what we did to get 
> blacklisted or what we did to get unblacklisted.
>

One major difference is that we have shipped Chromebooks with
intel-based GPUs for ~8 years, so we (collectively, intel and Chrome
OS folks) have fixed the long tail of Chrome bugs for Chrome OS +
Intel, and Linux benefited as a side effect.


>>
>> I don't really see a way back from this, since they don't cite any
>> easily reproducible issues, except that some people had some issues
>> with indeterminate hardware and indeterminate versions of mesa.
>>
>> In the bug that triggered this
>> (https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
>> I might have slightly lost my cool, they (at the end) suggested that
>> we try to make nouveau a first-class citizen with chromium. However I
>> will never be able to present concrete evidence that inconcrete issues
>> are resolved. I did run the WebGL CTS suite, but that resulted in some
>> hangs from the max-texture-size-equivalent test, and some
>> browser-level weirdness after some tests where later tests all fail
>> (due to what I have to assume is a browser bug). I don't think I
>> managed to properly track down the true reason why. I didn't want to
>> reach out to them with such results, as that's just further evidence
>> of nouveau not working perfectly.
>
>
> If you want concrete bugs to fix, I highly recommend OpenGL[ES] conformance 
> tests, dEQP, and the WebGL CTS (which is mostly a re-hash of the OpenGL ES 
> 3.0 CTS).  Google cares quite a bit about driver conformance and are much 
> more likely to consider nouveau to be high-quality if those test suites are 
> in good shape.  Years of experience dealing with Google says that dEQP 
> results speak much louder than philosophical arguments about who should 
> decide whether or not Chromium should accept the distro GL.  Fortunately for 
> you, the well funded driver teams (Intel and AMD) have already done a lot of 
> the painful work of getting a lot of the bugs and "bugs" out of core mesa and 
> gallium.  What's left are likely real back-end driver bugs which may be
> affecting some user somewhere so they're worth fixing.

The cause of this blacklist is not (lack of) deqp conformance, but
instead mostly automated crash reports. In other words, crashes in the
field where we have a backtrace but not necessarily a good repro case.
For someone building an application like Chrome, the multitude of
kernel+user space drivers+OS version+compositor combinations basically
makes each bug a very, very long investigation. I argued a long time
ago that we should try to get more communication going between Chrome
folks and Linux GPU driver folks to fix this, but quickly realized
that the task at hand is huge. You can only make a dent in it by being
very systematic about it. If someone wants to commit the time to do
that, I would be happy to help communication around these efforts.


>
>>
>> In the meanwhile, end users are losing accelerated WebGL which in
>> practice worked just fine (at least in my usage of it), and probably
>> some other functionality.
>>
>> One idea is to flip GL_VENDOR to some random string if chromium is
>> running. I don't like this idea, but I also don't have any great
>> alternatives. We can also just take this, as yet-another nail in the
>> nouveau coffin.
>
>
> You asked for opinions, so here you go. :-P  In my personal (and rather 
> disinterested) opinion, I would recommend against such measures.  The last 
> thing anyone needs is an arms race between nouveau and Chromium teams.  I 
> think the better short-term thing to do would be to provide some 
> documentation about WebGL and educate users about Chromium's 
> --ignore-gpu-blacklist option.  This documentation could go on the mesa 
> website or, likely more usefully, it could go in various distro wiki entries 
> about nouveau and/or general nvidia issues.  In the long term, what's needed 
> is improving nouveau quality and stability and re-building trust with the 
> Chromium team.  I'm not trying to attack nouveau here but the fact is that 
> trust has been lost due to an unfortunate history of mis-filed (against 
> Chromium) bugs.  That trust doesn't get re-built by nuclear solutions.


Yes I think the Chrome-side is very simple here: because there isn't
time or means for in-depth investigation, if a driver crashes too
much, it gets blacklisted. The situation is not unique, the GPU
blacklist file is 1700 lines:
https://chromium.googlesource.com/chromium/src/gpu/+/master/config/software_rendering_list.json

Anyway, IMO if the biggest crashers can be fixed, I think we could
eventually make a case to reenable.

Stéphane
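
To illustrate what "the biggest crashers" could mean in practice, here
is a minimal sketch that groups crash records by configuration and top
frame and counts them. This is an assumption about how such a ranking
might be done, not a description of Chrome's actual crash pipeline, and
the records are made-up sample data.

```python
from collections import Counter

# Hypothetical crash records:
# (driver, mesa version, kernel version, top stack frame)
CRASHES = [
    ("nouveau", "18.2", "4.19", "frame_a"),
    ("nouveau", "18.2", "4.19", "frame_a"),
    ("nouveau", "18.0", "4.15", "frame_b"),
    ("i965", "18.2", "4.19", "frame_c"),
]


def biggest_crashers(crashes, driver, top=3):
    """Count identical crash signatures for one driver, most frequent first."""
    counts = Counter(c for c in crashes if c[0] == driver)
    return counts.most_common(top)
```

With real data the number of distinct kernel + user space driver + OS
combinations is what makes the tail of this list so long.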


Re: [Mesa-dev] Chromium - Application-level nouveau blacklist

2019-01-06 Thread Tapani Pälli



On 1/6/19 9:37 AM, Jason Ekstrand wrote:
> On Sat, Jan 5, 2019 at 2:40 PM Ilia Mirkin wrote:
>
>> It looks like as of Chromium 71, nouveau is completely blacklisted.
>
> That's rather unfortunate. :-(  The intel mesa drivers were also
> blacklisted for quite some time a while back.  I'm not really sure what
> we did to get blacklisted or what we did to get unblacklisted.


We had lots of GPU hangs from WebGL tests. We fixed things until at some
point the tests were passing, and our web team sent a patch to Chromium
to re-enable it. This is probably the best route to get Nouveau
enabled as well.


I have to note that we currently have some WebGL issues on i965 too;
we should take a look at some point.




>> I don't really see a way back from this, since they don't cite any
>> easily reproducible issues, except that some people had some issues
>> with indeterminate hardware and indeterminate versions of mesa.
>>
>> In the bug that triggered this
>> (https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
>> I might have slightly lost my cool, they (at the end) suggested that
>> we try to make nouveau a first-class citizen with chromium. However I
>> will never be able to present concrete evidence that inconcrete issues
>> are resolved. I did run the WebGL CTS suite, but that resulted in some
>> hangs from the max-texture-size-equivalent test, and some
>> browser-level weirdness after some tests where later tests all fail
>> (due to what I have to assume is a browser bug). I don't think I
>> managed to properly track down the true reason why. I didn't want to
>> reach out to them with such results, as that's just further evidence
>> of nouveau not working perfectly.
>
> If you want concrete bugs to fix, I highly recommend OpenGL[ES]
> conformance tests, dEQP, and the WebGL CTS (which is mostly a re-hash of
> the OpenGL ES 3.0 CTS).  Google cares quite a bit about driver
> conformance and are much more likely to consider nouveau to be
> high-quality if those test suites are in good shape.  Years of
> experience dealing with Google says that dEQP results speak much louder
> than philosophical arguments about who should decide whether or not
> Chromium should accept the distro GL.  Fortunately for you, the well
> funded driver teams (Intel and AMD) have already done a lot of the
> painful work of getting a lot of the bugs and "bugs" out of core mesa
> and gallium.  What's left are likely real back-end driver bugs which may
> be affecting some user somewhere so they're worth fixing.
>
>> In the meanwhile, end users are losing accelerated WebGL which in
>> practice worked just fine (at least in my usage of it), and probably
>> some other functionality.
>>
>> One idea is to flip GL_VENDOR to some random string if chromium is
>> running. I don't like this idea, but I also don't have any great
>> alternatives. We can also just take this, as yet-another nail in the
>> nouveau coffin.
>
> You asked for opinions, so here you go. :-P  In my personal (and rather
> disinterested) opinion, I would recommend against such measures.  The
> last thing anyone needs is an arms race between nouveau and Chromium
> teams.  I think the better short-term thing to do would be to provide
> some documentation about WebGL and educate users about Chromium's
> --ignore-gpu-blacklist option.  This documentation could go on the mesa
> website or, likely more usefully, it could go in various distro wiki
> entries about nouveau and/or general nvidia issues.  In the long term,
> what's needed is improving nouveau quality and stability and re-building
> trust with the Chromium team.  I'm not trying to attack nouveau here but
> the fact is that trust has been lost due to an unfortunate history of
> mis-filed (against Chromium) bugs.  That trust doesn't get re-built by
> nuclear solutions.
>
> --Jason



Re: [Mesa-dev] Chromium - Application-level nouveau blacklist

2019-01-06 Thread Rob Clark
On Sat, Jan 5, 2019 at 3:40 PM Ilia Mirkin  wrote:
>
> It looks like as of Chromium 71, nouveau is completely blacklisted.
>
> I don't really see a way back from this, since they don't cite any
> easily reproducible issues, except that some people had some issues
> with indeterminate hardware and indeterminate versions of mesa.
>
> In the bug that triggered this
> (https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
> I might have slightly lost my cool, they (at the end) suggested that
> we try to make nouveau a first-class citizen with chromium. However I
> will never be able to present concrete evidence that inconcrete issues
> are resolved. I did run the WebGL CTS suite, but that resulted in some
> hangs from the max-texture-size-equivalent test, and some
> browser-level weirdness after some tests where later tests all fail
> (due to what I have to assume is a browser bug). I don't think I
> managed to properly track down the true reason why. I didn't want to
> reach out to them with such results, as that's just further evidence
> of nouveau not working perfectly.
>
> In the meanwhile, end users are losing accelerated WebGL which in
> practice worked just fine (at least in my usage of it), and probably
> some other functionality.
>
> One idea is to flip GL_VENDOR to some random string if chromium is
> running. I don't like this idea, but I also don't have any great
> alternatives. We can also just take this, as yet-another nail in the
> nouveau coffin.
>

I think this would be a really bad idea.

A better idea might be to ask chromium to whitelist nouveau for
pairs of nv generation + mesa version that are known to pass (or at
least come reasonably close to passing?) WebGL CTS.  Maybe set up a
wiki page or trello or bz or whatever w/ some pointers to info about
how to disable the gpu blacklist (to run the cts tests in the first
place) and how to run cts, and a table of nv generations.  I guess you
don't have hw or time to test everything yourself, but this is something
that distros and users can help with.  The idea for
wiki/trello/whatever is to help coordinate that and track open bugs
for failing CTS tests.


BR,
-R
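
A minimal sketch of the tracking table suggested above, for
illustration only. The generation names, Mesa version and failure
counts are placeholders rather than real test results, and the
`max_failures` threshold is an assumption standing in for "reasonably
close to passing".

```python
# Hypothetical results table:
# (nv generation, mesa version) -> number of failing WebGL CTS tests
CTS_RESULTS = {
    ("gen_a", "18.3"): 0,
    ("gen_b", "18.3"): 4,
    ("gen_c", "18.3"): 250,
}


def is_whitelist_candidate(generation, mesa_version, max_failures=0,
                           results=CTS_RESULTS):
    """True if this pair was tested and has at most max_failures failures."""
    failures = results.get((generation, mesa_version))
    # Untested combinations are never candidates.
    return failures is not None and failures <= max_failures
```

Keying on the exact pair keeps untested combinations out by default,
which matches the idea of only whitelisting what is known to pass.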


Re: [Mesa-dev] Chromium - Application-level nouveau blacklist

2019-01-05 Thread Jason Ekstrand
On Sat, Jan 5, 2019 at 2:40 PM Ilia Mirkin  wrote:

> It looks like as of Chromium 71, nouveau is completely blacklisted.
>

That's rather unfortunate. :-(  The intel mesa drivers were also
blacklisted for quite some time a while back.  I'm not really sure what we
did to get blacklisted or what we did to get unblacklisted.


> I don't really see a way back from this, since they don't cite any
> easily reproducible issues, except that some people had some issues
> with indeterminate hardware and indeterminate versions of mesa.
>
> In the bug that triggered this
> (https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
> I might have slightly lost my cool, they (at the end) suggested that
> we try to make nouveau a first-class citizen with chromium. However I
> will never be able to present concrete evidence that inconcrete issues
> are resolved. I did run the WebGL CTS suite, but that resulted in some
> hangs from the max-texture-size-equivalent test, and some
> browser-level weirdness after some tests where later tests all fail
> (due to what I have to assume is a browser bug). I don't think I
> managed to properly track down the true reason why. I didn't want to
> reach out to them with such results, as that's just further evidence
> of nouveau not working perfectly.
>

If you want concrete bugs to fix, I highly recommend OpenGL[ES] conformance
tests, dEQP, and the WebGL CTS (which is mostly a re-hash of the OpenGL ES
3.0 CTS).  Google cares quite a bit about driver conformance and are much
more likely to consider nouveau to be high-quality if those test suites are
in good shape.  Years of experience dealing with Google says that dEQP
results speak much louder than philosophical arguments about who should
decide whether or not Chromium should accept the distro GL.  Fortunately
for you, the well funded driver teams (Intel and AMD) have already done a
lot of the painful work of getting a lot of the bugs and "bugs" out of core
mesa and gallium.  What's left are likely real back-end driver bugs which
may be affecting some user somewhere so they're worth fixing.


> In the meanwhile, end users are losing accelerated WebGL which in
> practice worked just fine (at least in my usage of it), and probably
> some other functionality.
>
> One idea is to flip GL_VENDOR to some random string if chromium is
> running. I don't like this idea, but I also don't have any great
> alternatives. We can also just take this, as yet-another nail in the
> nouveau coffin.
>

You asked for opinions, so here you go. :-P  In my personal (and rather
disinterested) opinion, I would recommend against such measures.  The last
thing anyone needs is an arms race between nouveau and Chromium teams.  I
think the better short-term thing to do would be to provide some
documentation about WebGL and educate users about Chromium's
--ignore-gpu-blacklist option.  This documentation could go on the mesa
website or, likely more usefully, it could go in various distro wiki
entries about nouveau and/or general nvidia issues.  In the long term,
what's needed is improving nouveau quality and stability and re-building
trust with the Chromium team.  I'm not trying to attack nouveau here but
the fact is that trust has been lost due to an unfortunate history of
mis-filed (against Chromium) bugs.  That trust doesn't get re-built by
nuclear solutions.

--Jason
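
For documentation purposes, the short-term workaround mentioned above
could be shown roughly as follows. The --ignore-gpu-blacklist option is
the one named in this message; the binary name `chromium` is an
assumption and differs between distros (for example `chromium-browser`
or `google-chrome`). Opening chrome://gpu lets the user check whether
acceleration is actually in effect.

```python
import shlex


def chromium_command(binary="chromium", url=None):
    """Build an argv list that starts Chromium with the GPU blacklist ignored.

    The default binary name is an assumption; adjust it for the distro.
    """
    argv = [binary, "--ignore-gpu-blacklist"]
    if url:
        argv.append(url)
    return argv


def as_shell_line(argv):
    """Render the argv list as a copy-pasteable shell command."""
    return shlex.join(argv)
```

The list returned by `chromium_command(url="chrome://gpu")` can be
passed to `subprocess.run`, or rendered with `as_shell_line` for a wiki
page.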


[Mesa-dev] Chromium - Application-level nouveau blacklist

2019-01-05 Thread Ilia Mirkin
It looks like as of Chromium 71, nouveau is completely blacklisted.

I don't really see a way back from this, since they don't cite any
easily reproducible issues, except that some people had some issues
with indeterminate hardware and indeterminate versions of mesa.

In the bug that triggered this
(https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
I might have slightly lost my cool, they (at the end) suggested that
we try to make nouveau a first-class citizen with chromium. However I
will never be able to present concrete evidence that inconcrete issues
are resolved. I did run the WebGL CTS suite, but that resulted in some
hangs from the max-texture-size-equivalent test, and some
browser-level weirdness after some tests where later tests all fail
(due to what I have to assume is a browser bug). I don't think I
managed to properly track down the true reason why. I didn't want to
reach out to them with such results, as that's just further evidence
of nouveau not working perfectly.

In the meanwhile, end users are losing accelerated WebGL which in
practice worked just fine (at least in my usage of it), and probably
some other functionality.

One idea is to flip GL_VENDOR to some random string if chromium is
running. I don't like this idea, but I also don't have any great
alternatives. We can also just take this, as yet-another nail in the
nouveau coffin.

Opinions welcome.

Cheers,

  -ilia