[PATCH v2 00/22] Enable gpu switching on the MacBook Pro

2015-08-26 Thread Lukas Wunner
Hi Daniel,

On Tue, Aug 25, 2015 at 10:21:20AM +0200, Daniel Vetter wrote:
> On Tue, Aug 25, 2015 at 09:36:47AM +0200, Lukas Wunner wrote:
> > I've overhauled locking once more because previously all EDID reads
> > were serialized even on machines which don't use vga_switcheroo at all.
> > Seems it's impossible to fix this with mutexes and still be race-free
> > and deadlock-free, so I've changed vgasr_mutex to an rwlock:
> > https://github.com/l1k/linux/commit/ffa4546d85244a53541eed6799f93e8eea7f29e8
> > 
> > There are a number of vga_switcheroo functions added by Takashi Iwai
> > and yourself which don't lock anything at all, I believe this was in
> > part to avoid deadlocks (vga_switcheroo_runtime_resume_hdmi_audio locks
> > vgasr_mutex, wakes up the GPU with vga_switcheroo_runtime_resume
> > which locks vgasr_mutex once again). After changing vgasr_mutex to an
> > rwlock we can safely use locking in those functions as well:
> > https://github.com/l1k/linux/commit/cdea8f5c92039329dde47cadf20b67c7d9af0d62
> > 
> > With this change, on machines which don't use vga_switcheroo, the
> > overhead in drm_get_edid() is exactly 1 read_lock + 1 read_unlock
> > and EDID reads can happen in parallel. Thierry Reding and Alex Deucher
> > are in favor of adding a separate drm_get_edid_switcheroo() and changing
> > the relevant drivers (i915, nouveau, radeon, amdgpu) to use that in lieu
> > of drm_get_edid() so that drivers which don't use vga_switcheroo avoid
> > the lock_ddc/unlock_ddc calls. Performance-wise these additional calls
> > should be negligible, so I think the motivation can only be better
> > readability/clarity. One should also keep in mind that drivers which
> > don't use vga_switcheroo currently might do so in the future, e.g.
> > if mobile GPUs use a big.LITTLE concept like ARM CPUs already do.
> 
> Just a quick aside: Switching to rwlocks to make lockdep happy is a
> fallacy - lockdep unfortunately doesn't accurately track read lock
> depencies. Which means very likely you didn't fix the locking inversions
> but just made lockdep shut up about them. [...]
> I'd highly suggest you try fixing the vga-switcheroo locking without
> resorting to rw lock primitives - that just means we need to manually
> prove the locking for many cases which is tons of work.

mutex is not the right tool for the job:

To make DDC switching bullet proof you need to block the handler from
unregistering between lock_ddc and unlock_ddc. Solution: ddc_lock grabs
a mutex, ddc_unlock releases it, unregister_handler grabs that same lock.

Downside: All EDID reads are serialized, you can't probe EDID in parallel
if you have multiple displays. Which is ugly.

One could be tempted to "solve" this for non-muxed machines by immediately
releasing the lock in lock_ddc if there's no handler, and by checking in
unlock_ddc if mutex_is_locked before releasing it. But on muxed machines
this would be racy: GPU driver A calls lock_ddc, acquires the mutex, sees
there's no handler registered, releases the mutex. Between this and the
call to unlock_ddc, a handler registers and GPU driver B calls lock_ddc.
This time the mutex is actually locked and when driver A calls unlock_ddc,
it will release it, even though the other driver had acquired it.

Of course one could come up with all sorts of ugly hacks like remembering
for which client the mutex was acquired, but this wouldn't be atomic and
100% bullet proof, unlike rwlocks.

So it seems there's no way with mutexes to achieve parallel EDID reads
and still be race-free and deadlock-free.

rwlocks have the right semantics for this use case: Registering and
unregistering requires write access to the lock, thus happens exclusively.
In lock_ddc we acquire read access to the rwlock and in unlock_ddc we
release it. So the handler can't disappear while we've switched DDC.
We use a mutex ddc_lock to lock the DDC lines but we only acquire that
lock if there's actually a handler. So on non-muxed machines, EDID reads
*can* happen in parallel, there's just the (negligible) overhead of
1 read_lock + 1 read_unlock:
https://github.com/l1k/linux/commit/d0b6f659ae8f4b8b94f4b3ded9fc80e93950d0b3

Best regards,

Lukas


[PATCH v2 00/22] Enable gpu switching on the MacBook Pro

2015-08-25 Thread Daniel Vetter
On Tue, Aug 25, 2015 at 09:36:47AM +0200, Lukas Wunner wrote:
> Hi Dave,
> 
> as requested I've pushed the MacBook Pro GPU switching stuff to GitHub
> to ease browsing/reviewing, latest branch is gmux-v5:
> https://github.com/l1k/linux/commits/gmux-v5
> 
> (I had applied for a freedesktop.org account in April with bug 89906
> but it's still pending, hence GitHub.)
> 
> I've overhauled locking once more because previously all EDID reads
> were serialized even on machines which don't use vga_switcheroo at all.
> Seems it's impossible to fix this with mutexes and still be race-free
> and deadlock-free, so I've changed vgasr_mutex to an rwlock:
> https://github.com/l1k/linux/commit/ffa4546d85244a53541eed6799f93e8eea7f29e8
> 
> There are a number of vga_switcheroo functions added by Takashi Iwai
> and yourself which don't lock anything at all, I believe this was in
> part to avoid deadlocks (vga_switcheroo_runtime_resume_hdmi_audio locks
> vgasr_mutex, wakes up the GPU with vga_switcheroo_runtime_resume
> which locks vgasr_mutex once again). After changing vgasr_mutex to an
> rwlock we can safely use locking in those functions as well:
> https://github.com/l1k/linux/commit/cdea8f5c92039329dde47cadf20b67c7d9af0d62
> 
> With this change, on machines which don't use vga_switcheroo, the
> overhead in drm_get_edid() is exactly 1 read_lock + 1 read_unlock
> and EDID reads can happen in parallel. Thierry Reding and Alex Deucher
> are in favor of adding a separate drm_get_edid_switcheroo() and changing
> the relevant drivers (i915, nouveau, radeon, amdgpu) to use that in lieu
> of drm_get_edid() so that drivers which don't use vga_switcheroo avoid
> the lock_ddc/unlock_ddc calls. Performance-wise these additional calls
> should be negligible, so I think the motivation can only be better
> readability/clarity. One should also keep in mind that drivers which
> don't use vga_switcheroo currently might do so in the future, e.g.
> if mobile GPUs use a big.LITTLE concept like ARM CPUs already do.

Just a quick aside: Switching to rwlocks to make lockdep happy is a
fallacy - lockdep unfortunately doesn't accurately track read lock
depencies. Which means very likely you didn't fix the locking inversions
but just made lockdep shut up about them. Example:

thread A grabs read-lock 1 and mutex 2.

thread B grabs mutex 2 and write-lock 1.

lockdep won't complain here, but if thread A has tries to get mutex 2
while thread B tries to write-lock 1 then there's an obvious deadlock.

I'd highly suggest you try fixing the vga-switcheroo locking without
resorting to rw lock primitives - that just means we need to manually
prove the locking for many cases which is tons of work.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH v2 00/22] Enable gpu switching on the MacBook Pro

2015-08-25 Thread Lukas Wunner
Hi Dave,

as requested I've pushed the MacBook Pro GPU switching stuff to GitHub
to ease browsing/reviewing, latest branch is gmux-v5:
https://github.com/l1k/linux/commits/gmux-v5

(I had applied for a freedesktop.org account in April with bug 89906
but it's still pending, hence GitHub.)

I've overhauled locking once more because previously all EDID reads
were serialized even on machines which don't use vga_switcheroo at all.
Seems it's impossible to fix this with mutexes and still be race-free
and deadlock-free, so I've changed vgasr_mutex to an rwlock:
https://github.com/l1k/linux/commit/ffa4546d85244a53541eed6799f93e8eea7f29e8

There are a number of vga_switcheroo functions added by Takashi Iwai
and yourself which don't lock anything at all, I believe this was in
part to avoid deadlocks (vga_switcheroo_runtime_resume_hdmi_audio locks
vgasr_mutex, wakes up the GPU with vga_switcheroo_runtime_resume
which locks vgasr_mutex once again). After changing vgasr_mutex to an
rwlock we can safely use locking in those functions as well:
https://github.com/l1k/linux/commit/cdea8f5c92039329dde47cadf20b67c7d9af0d62

With this change, on machines which don't use vga_switcheroo, the
overhead in drm_get_edid() is exactly 1 read_lock + 1 read_unlock
and EDID reads can happen in parallel. Thierry Reding and Alex Deucher
are in favor of adding a separate drm_get_edid_switcheroo() and changing
the relevant drivers (i915, nouveau, radeon, amdgpu) to use that in lieu
of drm_get_edid() so that drivers which don't use vga_switcheroo avoid
the lock_ddc/unlock_ddc calls. Performance-wise these additional calls
should be negligible, so I think the motivation can only be better
readability/clarity. One should also keep in mind that drivers which
don't use vga_switcheroo currently might do so in the future, e.g.
if mobile GPUs use a big.LITTLE concept like ARM CPUs already do.

Best regards,

Lukas


[Intel-gfx] [PATCH v2 00/22] Enable gpu switching on the MacBook Pro

2015-08-16 Thread Lukas Wunner
Hi Daniel,

On Wed, Aug 12, 2015 at 11:10:59PM +0200, Daniel Vetter wrote:
> Yes just squash and mention that the patch is based on work from
> $list_of_other_authors, plus cc them. There's not much point in
> acknowledging when people write broken patches ;-)

As you requested I've squashed the first 7 patches and I'll post
the resulting 3 replacement patches to dri-devel immediately after
sending this e-mail.

I've also overhauled locking.

These 3 patches lay the groundwork for DDC switching with gmux.
The subsequent patches in the series mostly concern reprobing
and though I've made locking-related changes to these as well,
I don't want to spam the list with them again until we have
reached consensus how reprobing should be implemented:


On Thu, Aug 13, 2015 at 08:50:47AM +0200, Daniel Vetter wrote:
> EDEFERREDPROBE isn't something that gets returned to userspace, it's
> just internal handling so that the kernel knows there's a depency
> issue and it needs to retry probing once other drivers have finished
> loading. It is the generic means linux has to handle cross-driver
> depencies which aren't reflected in the bus hierarchy.
> 
> I.e. it's just something to make sure that apple-gmux is fully loaded
> before i915/nouveau. The driver _will_ be initialized eventually.

Aha, so you want to stall initialization of i915/nouveau/radeon
*completely* until apple-gmux is loaded.

So how do you determine if initialization should be stalled?
You would have to hardcode DMIs for all MacBook Pro models.
I just counted it, there are 5 DMIs which require apple-gmux,
2 DMIs which require nouveau and 1 DMI which requires radeon
(they require nouveau/radeon for proxying, apple-gmux won't
help these models). And every year you would have to add another
DMI for the latest MacBook Pro model. People using the latest
model with an older kernel won't be able to use switching and
may open support tickets.

And if the module required for initialization is not installed
or has a different version than the kernel, i915 won't initialize
at all, which will be another source for support cases that you'll
have to deal with.

I think this doesn't scale and it shows that it's the wrong
approach.

vga_switcheroo *knows* when the handler registers, or the driver
to proxy AUX, and it can *notify* the inactive GPU's driver.
No need to hardcode DMIs, keep it simple.

And there *is* already a callback to notify drivers that they
should reprobe their outputs:

 * struct drm_mode_config_funcs - basic driver provided mode setting functions
 * @output_poll_changed: function to handle output configuration changes


> On Thu, Aug 13, 2015 at 1:37 AM, Lukas Wunner  wrote:
> > vga_switcheroo calls drm_kms_helper_hotplug_event() when the handler
> > registers (patch 11), which will invoke ->output_poll_changed.
> 
> Oh I didn't spot that one. This kind of layering inversions generally
> leads to deadlocks and fun stuff.

Please elaborate why you think it's a layering inversion to call
drm_kms_helper_hotplug_event() from vga_switcheroo.


> Also reprobing lvds/edp is just a
> side-effect when you have fbdev emulation enabled.

No, even though ->output_poll_changed is most commonly used to
update the fbcon output configuration, it gets called even if
fbdev emulation is disabled, and I've changed i915's callback
in patch 13 so that the LVDS/eDP connectors are reprobed even
if CONFIG_DRM_I915_FBDEV is not set.


> If we go with this re-probing approach then we definitely
> need a new hook in vga-switcheroo

Why? From my point of view, drm_kms_helper_hotplug_event()
already does the job. The only problem is that i915 removes
LVDS and eDP connectors from the mode configuration if they
can't be probed. By contrast, nouveau (and I think radeon)
will just keep them, but with status disconnected. I changed
i915 with patch 13 to behave identically to nouveau/radeon.
(Sorry, I've written this before but I feel like I need to
overcommunicate in this medium.)

The question is, do you consider it harmful to keep a modeless
LVDS or eDP connector in the mode configuration (with status
disconnected)? On the MacBooks it works fine but I have no idea
if it causes issues on other platforms.

If you absolutely positively believe that this causes issues
and don't want to change i915's behaviour, then yes indeed we
may need a separate vga_switcheroo callback.


> and even then there's still the locking problem.

The only lock held when calling drm_kms_helper_hotplug_event()
from vga_switcheroo is vgasr_mutex. When can this deadlock?
Whenever we call a vga_switcheroo function from the driver
upon probing outputs, specifically:

- vga_switcheroo_client_fb_set()
  gets called if the fb is recreated on reprobe
- vga_switcheroo_lock_ddc() / _unlock_ddc()
  get called when probing DDC and retrieving the EDID
- vga_switcheroo_set_ddc() / _get_ddc() / _set_aux() / get_aux()
  get called for proxying

So we can't lock vgasr_mutex in those functions.

In the case of _lock_ddc() / 

[Intel-gfx] [PATCH v2 00/22] Enable gpu switching on the MacBook Pro

2015-08-13 Thread Daniel Vetter
On Thu, Aug 13, 2015 at 1:37 AM, Lukas Wunner  wrote:
> Hi Daniel,
>
> On Wed, Aug 12, 2015 at 04:16:25PM +0200, Daniel Vetter wrote:
>> > * Reprobing if the inactive GPU initializes before the apple-gmux module:
>> >   v1 used Matthew Garrett's approach of adding a driver callback.
>> >   v2 simply generates a hotplug event instead. nouveau polls its outputs
>> >   every 10 seconds so we want it to poll immediately once apple-gmux
>> >   registers. That is achieved by the hotplug event. The i915 driver is
>> >   changed to behave identically to nouveau. (Right now it deletes LVDS
>> >   and eDP connectors from the mode configuration if they can't be probed,
>> >   deeming them to be ghosts.)
>>
>> I thought -EDEFERREDPROBE is what we should be using if drivers don't get
>> loaded in the right order? Hand-rolling depency avoidance stuff is imo a
>> horrible idea.
> [...]
>> I think just reading edid and the relevant dp aux data in apple-gmux or
>> somewhere like that and stalling driver load until that's ready is the
>> only clean option.
>
> I'm afraid we can't stall initialization of a driver like that because
> even though the GPU may not be switched to the panel, it may still have
> an external monitor attached.
>
> All MacBook Pros have external DP and/or HDMI ports and these are
> either soldered to the discrete GPU (model year 2011 and onwards)
> or switchable between the discrete and integrated GPU (until 2010;
> I think they are even switchable separately from the panel).
>
> So basically we'd have to initialize the driver normally, and if
> intel_lvds_init() or intel_edp_init_connector() fail we'd have to
> somehow pass that up the call chain so that i915_pci_probe() can
> return -EPROBE_DEFER. And whenever we're asked to reprobe we'd
> repeat initialization of the LVDS or eDP connector.
>
> I'm wondering what the benefit is compared to just keeping the
> connector in the mode configuration, but with status disconnected,
> and reprobing it when the ->output_poll_changed callback gets invoked?
> Because that's what nouveau already does, and what I've changed i915
> to do with patch 13.
>
> vga_switcheroo calls drm_kms_helper_hotplug_event() when the handler
> registers (patch 11), which will invoke ->output_poll_changed.
> So we're talking about the Official DRM Callback [tm] to probe
> outputs, not "hand-rolling depency avoidance". :-)

Oh I didn't spot that one. This kind of layering inversions generally
leads to deadlocks and fun stuff. Also reprobing lvds/edp is just a
side-effect when you have fbdev emulation enabled. If we go with this
re-probing approach then we definitely need a new hook in
vga-switcheroo, and even then there's still the locking problem.

>> > * Framebuffer recreation if the inactive GPU initializes before the
>> >   apple-gmux module (i.e. discarding the default 1024x768 fb and replacing
>> >   with a new one with the actual panel resolution): v1 only supported this
>> >   for i915, v2 has a generic solution which works with nouveau and radeon
>> >   as well. (Necessary if the discrete GPU is forced to be the inactive one
>> >   on boot via the EFI variable.)
>>
>> Would completely remove the need for this ;-)
>
> Unfortunately not: We'd still have to initialize the driver to be able
> to drive external displays. If there are initially no connectors with
> modes, we'll once again end up with the 1024x768 fb.

EDEFERREDPROBE isn't something that gets returned to userspace, it's
just internal handling so that the kernel knows there's a depency
issue and it needs to retry probing once other drivers have finished
loading. It is the generic means linux has to handle cross-driver
depencies which aren't reflected in the bus hierarchy.

I.e. it's just something to make sure that apple-gmux is fully loaded
before i915/nouveau. The driver _will_ be initialized eventually.

>> You can't share the dp aux like that. It's true that we need a bit more
>> data (there's a few eDP related feature blocsk we need), but sharing the
>> aux channel entirely is no-go. If you do you get drivers trying to link
>> train and at best this fails and at worst you'll upset the configuration
>> of the other driver and piss of the panel enough to need a hard reset
>> until it works again.
>
> Yes, so far proxying of the AUX channel is read-only. In those cases
> when writing is necessary for setting up the output, I'm adding code
> to check if the current content of the DPCD is identical to what's
> being written and if so, skip the write. We'll see if this stategy is
> sufficient for the drivers to set up their outputs.

You need to block anything that would write _much_ earlier. By the
time we're doing link-training it's way too late.

>> I think the real tricky bit here with vgaswitcheroo is locking, I need to
>> take a separate lock at the patches for that.
>
> Locking when switching only the DDC lines is facilitated by the ddc_lock
> attribute of struct vgasr_priv. This is all local to 

[Intel-gfx] [PATCH v2 00/22] Enable gpu switching on the MacBook Pro

2015-08-13 Thread Lukas Wunner
Hi Daniel,

On Wed, Aug 12, 2015 at 04:16:25PM +0200, Daniel Vetter wrote:
> > * Reprobing if the inactive GPU initializes before the apple-gmux module:
> >   v1 used Matthew Garrett's approach of adding a driver callback.
> >   v2 simply generates a hotplug event instead. nouveau polls its outputs
> >   every 10 seconds so we want it to poll immediately once apple-gmux
> >   registers. That is achieved by the hotplug event. The i915 driver is
> >   changed to behave identically to nouveau. (Right now it deletes LVDS
> >   and eDP connectors from the mode configuration if they can't be probed,
> >   deeming them to be ghosts.)
> 
> I thought -EDEFERREDPROBE is what we should be using if drivers don't get
> loaded in the right order? Hand-rolling depency avoidance stuff is imo a
> horrible idea.
[...]
> I think just reading edid and the relevant dp aux data in apple-gmux or
> somewhere like that and stalling driver load until that's ready is the
> only clean option.

I'm afraid we can't stall initialization of a driver like that because
even though the GPU may not be switched to the panel, it may still have
an external monitor attached.

All MacBook Pros have external DP and/or HDMI ports and these are
either soldered to the discrete GPU (model year 2011 and onwards)
or switchable between the discrete and integrated GPU (until 2010;
I think they are even switchable separately from the panel).

So basically we'd have to initialize the driver normally, and if
intel_lvds_init() or intel_edp_init_connector() fail we'd have to
somehow pass that up the call chain so that i915_pci_probe() can
return -EPROBE_DEFER. And whenever we're asked to reprobe we'd
repeat initialization of the LVDS or eDP connector.

I'm wondering what the benefit is compared to just keeping the
connector in the mode configuration, but with status disconnected,
and reprobing it when the ->output_poll_changed callback gets invoked?
Because that's what nouveau already does, and what I've changed i915
to do with patch 13.

vga_switcheroo calls drm_kms_helper_hotplug_event() when the handler
registers (patch 11), which will invoke ->output_poll_changed.
So we're talking about the Official DRM Callback [tm] to probe
outputs, not "hand-rolling depency avoidance". :-)


> > * Framebuffer recreation if the inactive GPU initializes before the
> >   apple-gmux module (i.e. discarding the default 1024x768 fb and replacing
> >   with a new one with the actual panel resolution): v1 only supported this
> >   for i915, v2 has a generic solution which works with nouveau and radeon
> >   as well. (Necessary if the discrete GPU is forced to be the inactive one
> >   on boot via the EFI variable.)
> 
> Would completely remove the need for this ;-)

Unfortunately not: We'd still have to initialize the driver to be able
to drive external displays. If there are initially no connectors with
modes, we'll once again end up with the 1024x768 fb.


> You can't share the dp aux like that. It's true that we need a bit more
> data (there's a few eDP related feature blocsk we need), but sharing the
> aux channel entirely is no-go. If you do you get drivers trying to link
> train and at best this fails and at worst you'll upset the configuration
> of the other driver and piss of the panel enough to need a hard reset
> until it works again.

Yes, so far proxying of the AUX channel is read-only. In those cases
when writing is necessary for setting up the output, I'm adding code
to check if the current content of the DPCD is identical to what's
being written and if so, skip the write. We'll see if this stategy is
sufficient for the drivers to set up their outputs.


> I think the real tricky bit here with vgaswitcheroo is locking, I need to
> take a separate lock at the patches for that.

Locking when switching only the DDC lines is facilitated by the ddc_lock
attribute of struct vgasr_priv. This is all local to vga_switcheroo.c
and contained in patches 5 and 6.

Locking when proxying the AUX channel is facilitated by the hw_mutex
attribute of struct drm_dp_aux. nouveau has its own locking mechanism
contained in drivers/gpu/drm/nouveau/nvkm/subdev/i2c/pad*.c. Thus,
when proxying via nouveau, there are two locking mechanisms at work
(drm_dp_aux hw_mutex as outer lock + pad as inner lock). This is
nothing introduced by this patch set, all existing code.

Locking of access to the struct vgasr_priv is facilitated by the
vgasr_mutex in vga_switcheroo.c. Also existing code.


Best regards,

Lukas


[Intel-gfx] [PATCH v2 00/22] Enable gpu switching on the MacBook Pro

2015-08-12 Thread Daniel Vetter
On Tue, Aug 11, 2015 at 12:29:17PM +0200, Lukas Wunner wrote:
> This is a follow-up to the v1 posted in April:
> http://lists.freedesktop.org/archives/dri-devel/2015-April/081515.html
> 
> 
> Patches 1 - 17 enable GPU switching on the pre-retina MacBook Pro.
> These were tested successfully by multiple people and solve two
> tickets in Bugzilla:
> https://bugzilla.kernel.org/show_bug.cgi?id=88861
> https://bugs.freedesktop.org/show_bug.cgi?id=61115
> 
> Patches 18 - 22 are a preview of how we're tackling retina support.
> Those are marked experimental and are NOT ready to be merged yet.
> Feedback on them is welcome.
> 
> The patches are based on drm-next.
> 
> They were tested on the following hardware (thanks a lot everyone!):
> Tested-by: Paul Hordiienko 
> [MBP  6,2 2010  intel ILK + nvidia GT216  pre-retina]
> Tested-by: William Brown 
> [MBP  8,2 2011  intel SNB + amd turks pre-retina]
> Tested-by: Lukas Wunner 
> [MBP  9,1 2012  intel IVB + nvidia GK107  pre-retina]
> Tested-by: Bruno Bierbaumer 
> [MBP 11,3 2013  intel HSW + nvidia GK107  retina -- work in progress]
> 
> 
> What's new:
> 
> * By default the MBP boots with the display switched to the discrete GPU
>   but it can be forced to the integrated GPU with an EFI boot variable.
>   Here's a handy tool for that: https://github.com/0xbb/gpu-switch
>   v1 didn't work in this configuration, v2 does.
> 
> * Reprobing if the inactive GPU initializes before the apple-gmux module:
>   v1 used Matthew Garrett's approach of adding a driver callback.
>   v2 simply generates a hotplug event instead. nouveau polls its outputs
>   every 10 seconds so we want it to poll immediately once apple-gmux
>   registers. That is achieved by the hotplug event. The i915 driver is
>   changed to behave identically to nouveau. (Right now it deletes LVDS
>   and eDP connectors from the mode configuration if they can't be probed,
>   deeming them to be ghosts.)

I thought -EDEFERREDPROBE is what we should be using if drivers don't get
loaded in the right order? Hand-rolling depency avoidance stuff is imo a
horrible idea.

> * Framebuffer recreation if the inactive GPU initializes before the
>   apple-gmux module (i.e. discarding the default 1024x768 fb and replacing
>   with a new one with the actual panel resolution): v1 only supported this
>   for i915, v2 has a generic solution which works with nouveau and radeon
>   as well. (Necessary if the discrete GPU is forced to be the inactive one
>   on boot via the EFI variable.)

Would completely remove the need for this ;-)

> * Generally lots of rough edges were smoothed to hopefully make the
>   patches more suitable for merging. E.g. there's a bug in i915 where
>   the SSC status set by BIOS is preserved too late and v1 only contained
>   a workaround, whereas v2 contains a proper fix in a separate commit.
> 
> 
> The long journey towards retina support:
> 
> The pixel clock required for retina resolution is not supported by LVDS
> (which was used on pre-retinas), necessitating eDP instead. Problem is,
> the gmux register which switches the DDC lines on pre-retinas doesn't
> switch the AUX channel on retinas. Disassembling the OS X driver revealed
> that the gmux in retina MBPs gained an additional register 0x7f which gets
> written to when setting up the eDP configuration. There was some hope that
> this might switch the AUX channel. Alas, we tried writing various values
> to that register but were unable to get the inactive GPU to talk to the
> panel. The purpose of register 0x7f remains a mystery.
> 
> Teardowns of the first generation retina MBP name the NXP CBTL06142 and
> TI HD3SS212 as multiplexers and according to the data sheets I've found,
> neither supports switching the AUX channel separately from the main link.
> 
> Matthew Garrett had the idea of having the active GPU stash the EDID and
> the first 8 bytes of the DPCD (receiver capabilities) and letting the
> inactive GPU retrieve that data. I rebased and rewrote his patches and
> got everything working, only to discover that the drivers are unhappy
> with just 8 bytes of DPCD. They need full access to the DPCD to set up
> their outputs. We could stash the entire DPCD but some parts of it are
> mutable so the stashed data may become stale when the active GPU performs
> writes to the DPCD.
> 
> So I had the idea of using the active GPU as a proxy to talk to the panel,
> thus emulating switching of the AUX channel in software. We can leverage
> the struct drm_dp_aux and i2c_adapter (for DDC) to achieve this, swapping
> the inactive GPU's structs with those of the active GPU on the fly.
> That approach is implemented in patches 18 - 22 but there are still some
> driver issues that I'm debugging. The current status as per the latest
> logs Bruno sent me is that i915 rejects the mode retrieved via proxying
> with CLOCK_HIGH and nouveau aborts link training halfway through.
> Bottom line is that it's not yet working but we're getting closer.

[PATCH v2 00/22] Enable gpu switching on the MacBook Pro

2015-08-11 Thread Lukas Wunner
This is a follow-up to the v1 posted in April:
http://lists.freedesktop.org/archives/dri-devel/2015-April/081515.html


Patches 1 - 17 enable GPU switching on the pre-retina MacBook Pro.
These were tested successfully by multiple people and solve two
tickets in Bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=88861
https://bugs.freedesktop.org/show_bug.cgi?id=61115

Patches 18 - 22 are a preview of how we're tackling retina support.
Those are marked experimental and are NOT ready to be merged yet.
Feedback on them is welcome.

The patches are based on drm-next.

They were tested on the following hardware (thanks a lot everyone!):
Tested-by: Paul Hordiienko 
[MBP  6,2 2010  intel ILK + nvidia GT216  pre-retina]
Tested-by: William Brown 
[MBP  8,2 2011  intel SNB + amd turks pre-retina]
Tested-by: Lukas Wunner 
[MBP  9,1 2012  intel IVB + nvidia GK107  pre-retina]
Tested-by: Bruno Bierbaumer 
[MBP 11,3 2013  intel HSW + nvidia GK107  retina -- work in progress]


What's new:

* By default the MBP boots with the display switched to the discrete GPU
  but it can be forced to the integrated GPU with an EFI boot variable.
  Here's a handy tool for that: https://github.com/0xbb/gpu-switch
  v1 didn't work in this configuration, v2 does.

* Reprobing if the inactive GPU initializes before the apple-gmux module:
  v1 used Matthew Garrett's approach of adding a driver callback.
  v2 simply generates a hotplug event instead. nouveau polls its outputs
  every 10 seconds so we want it to poll immediately once apple-gmux
  registers. That is achieved by the hotplug event. The i915 driver is
  changed to behave identically to nouveau. (Right now it deletes LVDS
  and eDP connectors from the mode configuration if they can't be probed,
  deeming them to be ghosts.)

* Framebuffer recreation if the inactive GPU initializes before the
  apple-gmux module (i.e. discarding the default 1024x768 fb and replacing
  with a new one with the actual panel resolution): v1 only supported this
  for i915, v2 has a generic solution which works with nouveau and radeon
  as well. (Necessary if the discrete GPU is forced to be the inactive one
  on boot via the EFI variable.)

* Generally lots of rough edges were smoothed to hopefully make the
  patches more suitable for merging. E.g. there's a bug in i915 where
  the SSC status set by BIOS is preserved too late and v1 only contained
  a workaround, whereas v2 contains a proper fix in a separate commit.


The long journey towards retina support:

The pixel clock required for retina resolution is not supported by LVDS
(which was used on pre-retinas), necessitating eDP instead. Problem is,
the gmux register which switches the DDC lines on pre-retinas doesn't
switch the AUX channel on retinas. Disassembling the OS X driver revealed
that the gmux in retina MBPs gained an additional register 0x7f which gets
written to when setting up the eDP configuration. There was some hope that
this might switch the AUX channel. Alas, we tried writing various values
to that register but were unable to get the inactive GPU to talk to the
panel. The purpose of register 0x7f remains a mystery.

Teardowns of the first generation retina MBP name the NXP CBTL06142 and
TI HD3SS212 as multiplexers and according to the data sheets I've found,
neither supports switching the AUX channel separately from the main link.

Matthew Garrett had the idea of having the active GPU stash the EDID and
the first 8 bytes of the DPCD (receiver capabilities) and letting the
inactive GPU retrieve that data. I rebased and rewrote his patches and
got everything working, only to discover that the drivers are unhappy
with just 8 bytes of DPCD. They need full access to the DPCD to set up
their outputs. We could stash the entire DPCD but some parts of it are
mutable so the stashed data may become stale when the active GPU performs
writes to the DPCD.

So I had the idea of using the active GPU as a proxy to talk to the panel,
thus emulating switching of the AUX channel in software. We can leverage
the struct drm_dp_aux and i2c_adapter (for DDC) to achieve this, swapping
the inactive GPU's structs with those of the active GPU on the fly.
That approach is implemented in patches 18 - 22 but there are still some
driver issues that I'm debugging. The current status as per the latest
logs Bruno sent me is that i915 rejects the mode retrieved via proxying
with CLOCK_HIGH and nouveau aborts link training halfway through.
Bottom line is that it's not yet working but we're getting closer.

As a side effect, the pre-retinas gain a second way to initialize their
outputs: They can either use gmux to switch the DDC lines, or use the
active GPU as a proxy for the DDC communication. Which method gets used
depends on the order in which the drivers initialize, the inactive GPU
will happily use whatever is available and it automatically receives
a hotplug event once either method becomes ready for use.

But, once again, the patches