Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On 24/06/15 11:29, Daniel Vetter wrote: On Fri, Jun 19, 2015 at 09:43:11AM +0100, Dave Gordon wrote: On 18/06/15 15:49, Daniel Vetter wrote: On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote: On 17/06/15 13:05, Daniel Vetter wrote: On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides generic support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com Given that I'm just shredding the synchronization used by the dmc loader I'm not convinced this is a good idea. Abstraction has cost, and a bit of copy-paste for similar sounding but slightly different things doesn't sound awful to me. And the critical bit in all the firmware loading I've seen thus far is in synchronizing the loading with other operations, hiding that isn't a good idea. Worse if we enforce stuff like requiring dev-struct_mutex. -Daniel It's precisely because it's in some sense trivial-but-tricky that we should write it once, get it right, and use it everywhere. Copypaste /does/ sound awful; I've seen how the code this was derived from had already been cloned into three flavours, all different and all wrong. It's a very simple abstraction: one early call to kick things off as early as possible, no locking required. One late call with the struct_mutex held to complete the synchronisation and actually do the work, thus guaranteeing that the transfer to the target uC is done in a controlled fashion, at a time of the caller's choice, and by the driver's mainline thread, NOT by an asynchronous thread racing with other activity (which was one of the things wrong with the original version). Yeah I've seen the origins of this in the display code, and that code gets the syncing wrong. The only thing that one has do to is grab a runtime pm reference for the appropriate power well to prevent dc5 entry, and release it when the firmware is loaded and initialized. Agreed. Which means any kind of firmware loader which requires/uses dev-struct_mutex get stuff wrong and is not appropriate everywhere. BUT, the loading of the firmware into any uC MUST be done in a controlled manner i.e. at a time when no other thread is touching the h/w. Otherwise the f/w load and whatever else is concurrently accessing the h/w could in some cases interfere disastrously. Examples of interference might be: * interleaved accesses to the ELSP (in the case of the GuC) * incorrect handover of power management (DMC, GuC) * erroneous management of forcewake state In general the f/w that is just starting on the uC may have certain expectations about the initial state of the h/w, which may not be met if other threads are accessing various bits of h/w while the uC is booting up. So we absolutely need to guarantee that the f/w load is done by a thread which has exclusive ownership of any bit of the h/w that the f/w is going to make assumptions about. With the current locking structure of the driver, that means holding the struct_mutex (it shouldn't really, there should be a separate mutex for h/w register access vs. driver-private data structures, but there isn't). If you really need this guarantee (and I seriously hope not) then the only option is a synchronous firmware load at driver init _before_ we launch any of the asynchronous setup code. And there is already a lot of that, and we're adding more all the time. What I expect we need is synchronization of just the revelant part with the firmware loading, which necessarily needs to be somewhat async to be able to support cros/android requirements. And yes that needs to be done in a controlled manner, but most likely we need very specific solutions for the problem at hand. Unconditionally holding dev-struct_mutex isn't that solution. The other problem with dev-struct_mutex is that it's a giantic lock with ill defined coverage and semantics. It's imo the biggest piece of technical debt we carry around in i915.ko, and we pay the price for that dearlydaily. Which means that since a few years any kind of code which extended dev-struct_mutex to anything not clearly core gem data structures was rejected. Oh, I quite agree that the struct_mutex is an abomination and would certainly like to eliminate it. But at the moment it's the only sufficiently large-scale synchronisation operation available to ensure that (for example) we don't try to load the f/w at the same time that another thread is trying to reset the h/w. None of this loader code really needs the struct_mutex specifically; the WARN_ON macros were just there to help callers know what degree of synchronisation they need to organise before calling these
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Mon, Jul 06, 2015 at 01:44:10PM +0100, Dave Gordon wrote: On 24/06/15 11:29, Daniel Vetter wrote: On Fri, Jun 19, 2015 at 09:43:11AM +0100, Dave Gordon wrote: On 18/06/15 15:49, Daniel Vetter wrote: On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote: On 17/06/15 13:05, Daniel Vetter wrote: On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides generic support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com Given that I'm just shredding the synchronization used by the dmc loader I'm not convinced this is a good idea. Abstraction has cost, and a bit of copy-paste for similar sounding but slightly different things doesn't sound awful to me. And the critical bit in all the firmware loading I've seen thus far is in synchronizing the loading with other operations, hiding that isn't a good idea. Worse if we enforce stuff like requiring dev-struct_mutex. -Daniel It's precisely because it's in some sense trivial-but-tricky that we should write it once, get it right, and use it everywhere. Copypaste /does/ sound awful; I've seen how the code this was derived from had already been cloned into three flavours, all different and all wrong. It's a very simple abstraction: one early call to kick things off as early as possible, no locking required. One late call with the struct_mutex held to complete the synchronisation and actually do the work, thus guaranteeing that the transfer to the target uC is done in a controlled fashion, at a time of the caller's choice, and by the driver's mainline thread, NOT by an asynchronous thread racing with other activity (which was one of the things wrong with the original version). Yeah I've seen the origins of this in the display code, and that code gets the syncing wrong. The only thing that one has do to is grab a runtime pm reference for the appropriate power well to prevent dc5 entry, and release it when the firmware is loaded and initialized. Agreed. Which means any kind of firmware loader which requires/uses dev-struct_mutex get stuff wrong and is not appropriate everywhere. BUT, the loading of the firmware into any uC MUST be done in a controlled manner i.e. at a time when no other thread is touching the h/w. Otherwise the f/w load and whatever else is concurrently accessing the h/w could in some cases interfere disastrously. Examples of interference might be: * interleaved accesses to the ELSP (in the case of the GuC) * incorrect handover of power management (DMC, GuC) * erroneous management of forcewake state In general the f/w that is just starting on the uC may have certain expectations about the initial state of the h/w, which may not be met if other threads are accessing various bits of h/w while the uC is booting up. So we absolutely need to guarantee that the f/w load is done by a thread which has exclusive ownership of any bit of the h/w that the f/w is going to make assumptions about. With the current locking structure of the driver, that means holding the struct_mutex (it shouldn't really, there should be a separate mutex for h/w register access vs. driver-private data structures, but there isn't). If you really need this guarantee (and I seriously hope not) then the only option is a synchronous firmware load at driver init _before_ we launch any of the asynchronous setup code. And there is already a lot of that, and we're adding more all the time. What I expect we need is synchronization of just the revelant part with the firmware loading, which necessarily needs to be somewhat async to be able to support cros/android requirements. And yes that needs to be done in a controlled manner, but most likely we need very specific solutions for the problem at hand. Unconditionally holding dev-struct_mutex isn't that solution. The other problem with dev-struct_mutex is that it's a giantic lock with ill defined coverage and semantics. It's imo the biggest piece of technical debt we carry around in i915.ko, and we pay the price for that dearlydaily. Which means that since a few years any kind of code which extended dev-struct_mutex to anything not clearly core gem data structures was rejected. Oh, I quite agree that the struct_mutex is an abomination and would certainly like to eliminate it. But at the moment it's the only sufficiently large-scale synchronisation operation available to ensure that (for example) we don't try to load the f/w at the same time that another thread is trying to reset the h/w. I guess this is the crux here - for me part of the big problems around
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Fri, Jul 03, 2015 at 01:30:24PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides common support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. v2: Local functions should pass dev_priv rather than dev [Chris Wilson] Various other improvements per Chris Wilson's review comments v3: Defeatured version without async prefetch [Daniel Vetter] Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com --- drivers/gpu/drm/i915/Makefile | 3 + drivers/gpu/drm/i915/intel_uc_loader.c | 310 + drivers/gpu/drm/i915/intel_uc_loader.h | 92 ++ 3 files changed, 405 insertions(+) create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index de21965..f1f80fc 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -39,6 +39,9 @@ i915-y += i915_cmd_parser.o \ intel_ringbuffer.o \ intel_uncore.o +# generic ancilliary microcontroller support +i915-y += intel_uc_loader.o + # autogenerated null render state i915-y += intel_renderstate_gen6.o \ intel_renderstate_gen7.o \ diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c b/drivers/gpu/drm/i915/intel_uc_loader.c new file mode 100644 index 000..a8fc1dd --- /dev/null +++ b/drivers/gpu/drm/i915/intel_uc_loader.c @@ -0,0 +1,310 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Author: + * Dave Gordon david.s.gor...@intel.com + */ +#include linux/firmware.h +#include i915_drv.h +#include intel_uc_loader.h + +/** + * DOC: Common functions for embedded microcontroller (uC) firmware loading + * + * The functions in this file provide common support code for loading the + * firmware that may be required by an embedded microcontroller (uC). + * + * The function intel_uc_fw_init() should be called first; it requires no + * locking, and can be called even before GEM has been initialised. It just + * initialises the tracking data and stores its parameters for the subsequent + * call to intel_uc_fw_fetch(). + * + * At some convenient point after GEM initialisation, the driver should call + * intel_uc_fw_fetch(). The first time, this will use the Linux kernel's + * request_firmware() call to open and read the firmware image into memory. + * (On subsequent calls, this is skipped, as either the firmware has already + * been fetched into memory, or it is already known that no valid firmware + * image could be found). + * + * The callback() function passed to intel_uc_fw_fetch() can further check + * the firmware image before it is saved. This function can reject the image + * by returning a negative error code (e.g. -ENOEXEC), or accept it. In the + * latter case, it can return INTEL_UC_FW_GOOD (which is also the default if + * no callback() is supplied), and the common code here will save the whole + * of the firmware image in a (pageable, shmfs-backed) GEM object. + * + * If saving the whole image unmodified is not appropriate (for example, if + * only a small part of the image is needed later, or the data needs to be + * reorganised before saving), the callback() function can instead make its + * own arrangements for saving the required data in a GEM object or otherwise + * and then return INTEL_UC_FW_SAVED. + * + * (If such a callback does stash (some of) the image data in a GEM object, + * it can use the
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Mon, Jul 06, 2015 at 07:24:37PM +0100, Dave Gordon wrote: On 06/07/15 15:06, Daniel Vetter wrote: On Fri, Jul 03, 2015 at 01:30:24PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides common support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. v2: Local functions should pass dev_priv rather than dev [Chris Wilson] Various other improvements per Chris Wilson's review comments v3: Defeatured version without async prefetch [Daniel Vetter] Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com --- drivers/gpu/drm/i915/Makefile | 3 + drivers/gpu/drm/i915/intel_uc_loader.c | 310 + drivers/gpu/drm/i915/intel_uc_loader.h | 92 ++ 3 files changed, 405 insertions(+) create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index de21965..f1f80fc 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -39,6 +39,9 @@ i915-y += i915_cmd_parser.o \ intel_ringbuffer.o \ intel_uncore.o +# generic ancilliary microcontroller support +i915-y += intel_uc_loader.o + # autogenerated null render state i915-y += intel_renderstate_gen6.o \ intel_renderstate_gen7.o \ diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c b/drivers/gpu/drm/i915/intel_uc_loader.c new file mode 100644 index 000..a8fc1dd --- /dev/null +++ b/drivers/gpu/drm/i915/intel_uc_loader.c @@ -0,0 +1,310 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Author: + * Dave Gordon david.s.gor...@intel.com + */ +#include linux/firmware.h +#include i915_drv.h +#include intel_uc_loader.h + +/** + * DOC: Common functions for embedded microcontroller (uC) firmware loading + * + * The functions in this file provide common support code for loading the + * firmware that may be required by an embedded microcontroller (uC). + * + * The function intel_uc_fw_init() should be called first; it requires no + * locking, and can be called even before GEM has been initialised. It just + * initialises the tracking data and stores its parameters for the subsequent + * call to intel_uc_fw_fetch(). + * + * At some convenient point after GEM initialisation, the driver should call + * intel_uc_fw_fetch(). The first time, this will use the Linux kernel's + * request_firmware() call to open and read the firmware image into memory. + * (On subsequent calls, this is skipped, as either the firmware has already + * been fetched into memory, or it is already known that no valid firmware + * image could be found). + * + * The callback() function passed to intel_uc_fw_fetch() can further check + * the firmware image before it is saved. This function can reject the image + * by returning a negative error code (e.g. -ENOEXEC), or accept it. In the + * latter case, it can return INTEL_UC_FW_GOOD (which is also the default if + * no callback() is supplied), and the common code here will save the whole + * of the firmware image in a (pageable, shmfs-backed) GEM object. + * + * If saving the whole image unmodified is not appropriate (for example, if + * only a small part of the image is needed later, or the data needs to be + * reorganised before saving), the callback() function can instead make its + * own arrangements for saving the required data in a GEM object or otherwise + * and then return
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On 06/07/15 15:06, Daniel Vetter wrote: On Fri, Jul 03, 2015 at 01:30:24PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides common support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. v2: Local functions should pass dev_priv rather than dev [Chris Wilson] Various other improvements per Chris Wilson's review comments v3: Defeatured version without async prefetch [Daniel Vetter] Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com --- drivers/gpu/drm/i915/Makefile | 3 + drivers/gpu/drm/i915/intel_uc_loader.c | 310 + drivers/gpu/drm/i915/intel_uc_loader.h | 92 ++ 3 files changed, 405 insertions(+) create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index de21965..f1f80fc 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -39,6 +39,9 @@ i915-y += i915_cmd_parser.o \ intel_ringbuffer.o \ intel_uncore.o +# generic ancilliary microcontroller support +i915-y += intel_uc_loader.o + # autogenerated null render state i915-y += intel_renderstate_gen6.o \ intel_renderstate_gen7.o \ diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c b/drivers/gpu/drm/i915/intel_uc_loader.c new file mode 100644 index 000..a8fc1dd --- /dev/null +++ b/drivers/gpu/drm/i915/intel_uc_loader.c @@ -0,0 +1,310 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Author: + * Dave Gordon david.s.gor...@intel.com + */ +#include linux/firmware.h +#include i915_drv.h +#include intel_uc_loader.h + +/** + * DOC: Common functions for embedded microcontroller (uC) firmware loading + * + * The functions in this file provide common support code for loading the + * firmware that may be required by an embedded microcontroller (uC). + * + * The function intel_uc_fw_init() should be called first; it requires no + * locking, and can be called even before GEM has been initialised. It just + * initialises the tracking data and stores its parameters for the subsequent + * call to intel_uc_fw_fetch(). + * + * At some convenient point after GEM initialisation, the driver should call + * intel_uc_fw_fetch(). The first time, this will use the Linux kernel's + * request_firmware() call to open and read the firmware image into memory. + * (On subsequent calls, this is skipped, as either the firmware has already + * been fetched into memory, or it is already known that no valid firmware + * image could be found). + * + * The callback() function passed to intel_uc_fw_fetch() can further check + * the firmware image before it is saved. This function can reject the image + * by returning a negative error code (e.g. -ENOEXEC), or accept it. In the + * latter case, it can return INTEL_UC_FW_GOOD (which is also the default if + * no callback() is supplied), and the common code here will save the whole + * of the firmware image in a (pageable, shmfs-backed) GEM object. + * + * If saving the whole image unmodified is not appropriate (for example, if + * only a small part of the image is needed later, or the data needs to be + * reorganised before saving), the callback() function can instead make its + * own arrangements for saving the required data in a GEM object or otherwise + * and then return INTEL_UC_FW_SAVED. + * + * (If such a callback does stash (some of) the image data in a GEM object, + * it can use the uc_fw_obj and uc_fw_size fields; the common framework will
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Fri, Jun 19, 2015 at 09:43:11AM +0100, Dave Gordon wrote: On 18/06/15 15:49, Daniel Vetter wrote: On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote: On 17/06/15 13:05, Daniel Vetter wrote: On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides generic support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com Given that I'm just shredding the synchronization used by the dmc loader I'm not convinced this is a good idea. Abstraction has cost, and a bit of copy-paste for similar sounding but slightly different things doesn't sound awful to me. And the critical bit in all the firmware loading I've seen thus far is in synchronizing the loading with other operations, hiding that isn't a good idea. Worse if we enforce stuff like requiring dev-struct_mutex. -Daniel It's precisely because it's in some sense trivial-but-tricky that we should write it once, get it right, and use it everywhere. Copypaste /does/ sound awful; I've seen how the code this was derived from had already been cloned into three flavours, all different and all wrong. It's a very simple abstraction: one early call to kick things off as early as possible, no locking required. One late call with the struct_mutex held to complete the synchronisation and actually do the work, thus guaranteeing that the transfer to the target uC is done in a controlled fashion, at a time of the caller's choice, and by the driver's mainline thread, NOT by an asynchronous thread racing with other activity (which was one of the things wrong with the original version). Yeah I've seen the origins of this in the display code, and that code gets the syncing wrong. The only thing that one has do to is grab a runtime pm reference for the appropriate power well to prevent dc5 entry, and release it when the firmware is loaded and initialized. Agreed. Which means any kind of firmware loader which requires/uses dev-struct_mutex get stuff wrong and is not appropriate everywhere. BUT, the loading of the firmware into any uC MUST be done in a controlled manner i.e. at a time when no other thread is touching the h/w. Otherwise the f/w load and whatever else is concurrently accessing the h/w could in some cases interfere disastrously. Examples of interference might be: * interleaved accesses to the ELSP (in the case of the GuC) * incorrect handover of power management (DMC, GuC) * erroneous management of forcewake state In general the f/w that is just starting on the uC may have certain expectations about the initial state of the h/w, which may not be met if other threads are accessing various bits of h/w while the uC is booting up. So we absolutely need to guarantee that the f/w load is done by a thread which has exclusive ownership of any bit of the h/w that the f/w is going to make assumptions about. With the current locking structure of the driver, that means holding the struct_mutex (it shouldn't really, there should be a separate mutex for h/w register access vs. driver-private data structures, but there isn't). If you really need this guarantee (and I seriously hope not) then the only option is a synchronous firmware load at driver init _before_ we launch any of the asynchronous setup code. And there is already a lot of that, and we're adding more all the time. What I expect we need is synchronization of just the revelant part with the firmware loading, which necessarily needs to be somewhat async to be able to support cros/android requirements. And yes that needs to be done in a controlled manner, but most likely we need very specific solutions for the problem at hand. Unconditionally holding dev-struct_mutex isn't that solution. The other problem with dev-struct_mutex is that it's a giantic lock with ill defined coverage and semantics. It's imo the biggest piece of technical debt we carry around in i915.ko, and we pay the price for that dearlydaily. Which means that since a few years any kind of code which extended dev-struct_mutex to anything not clearly core gem data structures was rejected. We should convert the DMC loader to use this too, so there need be only one bit of code in the whole driver that needs to understand how to use completions to get correct handover from a free-running no-locks-held thread to the properly disciplined environment of driver mainline for purposes of programming the h/w. Nack on using this for dmc, since I want them to convert it to the above synchronization, since that's how all the other async power
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On 18/06/15 15:49, Daniel Vetter wrote: On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote: On 17/06/15 13:05, Daniel Vetter wrote: On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides generic support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com Given that I'm just shredding the synchronization used by the dmc loader I'm not convinced this is a good idea. Abstraction has cost, and a bit of copy-paste for similar sounding but slightly different things doesn't sound awful to me. And the critical bit in all the firmware loading I've seen thus far is in synchronizing the loading with other operations, hiding that isn't a good idea. Worse if we enforce stuff like requiring dev-struct_mutex. -Daniel It's precisely because it's in some sense trivial-but-tricky that we should write it once, get it right, and use it everywhere. Copypaste /does/ sound awful; I've seen how the code this was derived from had already been cloned into three flavours, all different and all wrong. It's a very simple abstraction: one early call to kick things off as early as possible, no locking required. One late call with the struct_mutex held to complete the synchronisation and actually do the work, thus guaranteeing that the transfer to the target uC is done in a controlled fashion, at a time of the caller's choice, and by the driver's mainline thread, NOT by an asynchronous thread racing with other activity (which was one of the things wrong with the original version). Yeah I've seen the origins of this in the display code, and that code gets the syncing wrong. The only thing that one has do to is grab a runtime pm reference for the appropriate power well to prevent dc5 entry, and release it when the firmware is loaded and initialized. Agreed. Which means any kind of firmware loader which requires/uses dev-struct_mutex get stuff wrong and is not appropriate everywhere. BUT, the loading of the firmware into any uC MUST be done in a controlled manner i.e. at a time when no other thread is touching the h/w. Otherwise the f/w load and whatever else is concurrently accessing the h/w could in some cases interfere disastrously. Examples of interference might be: * interleaved accesses to the ELSP (in the case of the GuC) * incorrect handover of power management (DMC, GuC) * erroneous management of forcewake state In general the f/w that is just starting on the uC may have certain expectations about the initial state of the h/w, which may not be met if other threads are accessing various bits of h/w while the uC is booting up. So we absolutely need to guarantee that the f/w load is done by a thread which has exclusive ownership of any bit of the h/w that the f/w is going to make assumptions about. With the current locking structure of the driver, that means holding the struct_mutex (it shouldn't really, there should be a separate mutex for h/w register access vs. driver-private data structures, but there isn't). We should convert the DMC loader to use this too, so there need be only one bit of code in the whole driver that needs to understand how to use completions to get correct handover from a free-running no-locks-held thread to the properly disciplined environment of driver mainline for purposes of programming the h/w. Nack on using this for dmc, since I want them to convert it to the above synchronization, since that's how all the other async power initialization is done. Guc is different since we really must have it ready for execbuf, and for that usecase a completion at drm_open time sounds like the right thing. As a rule of thumb for refactoring and share infastructure we use the following recipe in drm: - first driver implements things as straightforward as possible - 2nd user copypastes - 3rd one has the duty to figure out whether some refactoring is in order or not. Imo that approach leads a really good balance between avoiding overengineering and having maintainable code. -Daniel We've already been through these phases; the code has already been cloned twice (and then changed, but not enough to fix the problems with the original) and then when I found the issues with the GuC loader and noticed the hilarious ownership dance it was doing during handover I realised it was time to fix it in one place rather than several, and posted a patchset to the internal mailing list on 2015-02-24 with this commentary: The GuC loader uses an asynchronous thread to fetch the firmware image (aka binary blob) from a file and load it into the GuC's memory. Unfortunately
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On 17/06/15 13:05, Daniel Vetter wrote: On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides generic support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com Given that I'm just shredding the synchronization used by the dmc loader I'm not convinced this is a good idea. Abstraction has cost, and a bit of copy-paste for similar sounding but slightly different things doesn't sound awful to me. And the critical bit in all the firmware loading I've seen thus far is in synchronizing the loading with other operations, hiding that isn't a good idea. Worse if we enforce stuff like requiring dev-struct_mutex. -Daniel It's precisely because it's in some sense trivial-but-tricky that we should write it once, get it right, and use it everywhere. Copypaste /does/ sound awful; I've seen how the code this was derived from had already been cloned into three flavours, all different and all wrong. It's a very simple abstraction: one early call to kick things off as early as possible, no locking required. One late call with the struct_mutex held to complete the synchronisation and actually do the work, thus guaranteeing that the transfer to the target uC is done in a controlled fashion, at a time of the caller's choice, and by the driver's mainline thread, NOT by an asynchronous thread racing with other activity (which was one of the things wrong with the original version). We should convert the DMC loader to use this too, so there need be only one bit of code in the whole driver that needs to understand how to use completions to get correct handover from a free-running no-locks-held thread to the properly disciplined environment of driver mainline for purposes of programming the h/w. .Dave. --- drivers/gpu/drm/i915/Makefile |3 + drivers/gpu/drm/i915/intel_uc_loader.c | 312 drivers/gpu/drm/i915/intel_uc_loader.h | 82 + 3 files changed, 397 insertions(+) create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index b7ddf48..607fa2a 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -38,6 +38,9 @@ i915-y += i915_cmd_parser.o \ intel_ringbuffer.o \ intel_uncore.o +# generic ancilliary microcontroller support +i915-y += intel_uc_loader.o + # autogenerated null render state i915-y += intel_renderstate_gen6.o \ intel_renderstate_gen7.o \ diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c b/drivers/gpu/drm/i915/intel_uc_loader.c new file mode 100644 index 000..26f0fbe --- /dev/null +++ b/drivers/gpu/drm/i915/intel_uc_loader.c @@ -0,0 +1,312 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Author: + * Dave Gordon david.s.gor...@intel.com + */ +#include linux/firmware.h +#include i915_drv.h +#include intel_uc_loader.h + +/** + * DOC: Generic embedded microcontroller (uC) firmware loading support + * + * The functions in this file provide a generic way to load the firmware that + * may be required by an embedded microcontroller (uC). + * + * The function intel_uc_fw_init() should be called early, and will initiate + * an asynchronous request to fetch the firmware image (aka binary blob). + * When the image has been fetched into memory, the kernel will call back to + *
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Thu, Jun 18, 2015 at 01:11:34PM +0100, Dave Gordon wrote: On 17/06/15 13:05, Daniel Vetter wrote: On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides generic support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com Given that I'm just shredding the synchronization used by the dmc loader I'm not convinced this is a good idea. Abstraction has cost, and a bit of copy-paste for similar sounding but slightly different things doesn't sound awful to me. And the critical bit in all the firmware loading I've seen thus far is in synchronizing the loading with other operations, hiding that isn't a good idea. Worse if we enforce stuff like requiring dev-struct_mutex. -Daniel It's precisely because it's in some sense trivial-but-tricky that we should write it once, get it right, and use it everywhere. Copypaste /does/ sound awful; I've seen how the code this was derived from had already been cloned into three flavours, all different and all wrong. It's a very simple abstraction: one early call to kick things off as early as possible, no locking required. One late call with the struct_mutex held to complete the synchronisation and actually do the work, thus guaranteeing that the transfer to the target uC is done in a controlled fashion, at a time of the caller's choice, and by the driver's mainline thread, NOT by an asynchronous thread racing with other activity (which was one of the things wrong with the original version). Yeah I've seen the origins of this in the display code, and that code gets the syncing wrong. The only thing that one has do to is grab a runtime pm reference for the appropriate power well to prevent dc5 entry, and release it when the firmware is loaded and initialized. Which means any kind of firmware loader which requires/uses dev-struct_mutex get stuff wrong and is not appropriate everywhere. We should convert the DMC loader to use this too, so there need be only one bit of code in the whole driver that needs to understand how to use completions to get correct handover from a free-running no-locks-held thread to the properly disciplined environment of driver mainline for purposes of programming the h/w. Nack on using this for dmc, since I want them to convert it to the above synchronization, since that's how all the other async power initialization is done. Guc is different since we really must have it ready for execbuf, and for that usecase a completion at drm_open time sounds like the right thing. As a rule of thumb for refactoring and share infastructure we use the following recipe in drm: - first driver implements things as straightforward as possible - 2nd user copypastes - 3rd one has the duty to figure out whether some refactoring is in order or not. Imo that approach leads a really good balance between avoiding overengineering and having maintainable code. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote: Guc is different since we really must have it ready for execbuf, and for that usecase a completion at drm_open time sounds like the right thing. But do we? It would be nice if we had a definite answer that the hw was ready before we started using it in anger, but I don't see any reason why we would have to delay userspace for a slow microcode update... (This presupposes that userspace batches are unaffected by GuC/execlist setup, which for userspace sanity I hope they are - or at least using predicate registers and conditional execution.) -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Thu, Jun 18, 2015 at 04:27:52PM +0100, Chris Wilson wrote: On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote: Guc is different since we really must have it ready for execbuf, and for that usecase a completion at drm_open time sounds like the right thing. But do we? It would be nice if we had a definite answer that the hw was ready before we started using it in anger, but I don't see any reason why we would have to delay userspace for a slow microcode update... (This presupposes that userspace batches are unaffected by GuC/execlist setup, which for userspace sanity I hope they are - or at least using predicate registers and conditional execution.) Well I figured a wait_completion or flush_work unconditionally in execbuf is not to your liking, and it's better to keep that in open. But I think we should be able to get away with this at execbuf time. Might even be better since this wouldn't block sw-rendered boot-splashs. But either way should be suitable I think. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Thu, Jun 18, 2015 at 05:35:29PM +0200, Daniel Vetter wrote: On Thu, Jun 18, 2015 at 04:27:52PM +0100, Chris Wilson wrote: On Thu, Jun 18, 2015 at 04:49:49PM +0200, Daniel Vetter wrote: Guc is different since we really must have it ready for execbuf, and for that usecase a completion at drm_open time sounds like the right thing. But do we? It would be nice if we had a definite answer that the hw was ready before we started using it in anger, but I don't see any reason why we would have to delay userspace for a slow microcode update... (This presupposes that userspace batches are unaffected by GuC/execlist setup, which for userspace sanity I hope they are - or at least using predicate registers and conditional execution.) Well I figured a wait_completion or flush_work unconditionally in execbuf is not to your liking, and it's better to keep that in open. But I think we should be able to get away with this at execbuf time. Might even be better since this wouldn't block sw-rendered boot-splashs. But either way should be suitable I think. I am optimistic that we can make the request interface robust enough to be able queue up not only the ring initialisation and ppgtt initialisation requests, but also userspace requests. If it all works out, we only need to truly worry about microcode completion in hangcheck. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 02/15] drm/i915: Embedded microcontroller (uC) firmware loading support
On Mon, Jun 15, 2015 at 07:36:20PM +0100, Dave Gordon wrote: Current devices may contain one or more programmable microcontrollers that need to have a firmware image (aka binary blob) loaded from an external medium and transferred to the device's memory. This file provides generic support functions for doing this; they can then be used by each uC-specific loader, thus reducing code duplication and testing effort. Signed-off-by: Dave Gordon david.s.gor...@intel.com Signed-off-by: Alex Dai yu@intel.com Given that I'm just shredding the synchronization used by the dmc loader I'm not convinced this is a good idea. Abstraction has cost, and a bit of copy-paste for similar sounding but slightly different things doesn't sound awful to me. And the critical bit in all the firmware loading I've seen thus far is in synchronizing the loading with other operations, hiding that isn't a good idea. Worse if we enforce stuff like requiring dev-struct_mutex. -Daniel --- drivers/gpu/drm/i915/Makefile |3 + drivers/gpu/drm/i915/intel_uc_loader.c | 312 drivers/gpu/drm/i915/intel_uc_loader.h | 82 + 3 files changed, 397 insertions(+) create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.c create mode 100644 drivers/gpu/drm/i915/intel_uc_loader.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index b7ddf48..607fa2a 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -38,6 +38,9 @@ i915-y += i915_cmd_parser.o \ intel_ringbuffer.o \ intel_uncore.o +# generic ancilliary microcontroller support +i915-y += intel_uc_loader.o + # autogenerated null render state i915-y += intel_renderstate_gen6.o \ intel_renderstate_gen7.o \ diff --git a/drivers/gpu/drm/i915/intel_uc_loader.c b/drivers/gpu/drm/i915/intel_uc_loader.c new file mode 100644 index 000..26f0fbe --- /dev/null +++ b/drivers/gpu/drm/i915/intel_uc_loader.c @@ -0,0 +1,312 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Author: + * Dave Gordon david.s.gor...@intel.com + */ +#include linux/firmware.h +#include i915_drv.h +#include intel_uc_loader.h + +/** + * DOC: Generic embedded microcontroller (uC) firmware loading support + * + * The functions in this file provide a generic way to load the firmware that + * may be required by an embedded microcontroller (uC). + * + * The function intel_uc_fw_init() should be called early, and will initiate + * an asynchronous request to fetch the firmware image (aka binary blob). + * When the image has been fetched into memory, the kernel will call back to + * uc_fw_fetch_callback() whose function is simply to record the completion + * status, and stash the firmware blob for later. + * + * At some convenient point after GEM initialisation, the driver should call + * intel_uc_fw_check(); this will check whether the asynchronous thread has + * completed and wait for it if not, check whether the image was successfully + * fetched; and then allow the callback() function (if provided) to validate + * the image and/or save the data in a GEM object. + * + * Thereafter the uC-specific code can transfer the data in the GEM object + * to the uC's memory (in some uC-specific way, not handled here). + * + * During driver shutdown, or if driver load is aborted, intel_uc_fw_fini() + * should be called to release any remaining resources. + */ + + +/* + * Called once per uC, late in driver initialisation. GEM is now ready, and so + * we can now create a GEM object to hold the uC firmware. But first, we must + * synchronise with the firmware-fetching thread that was initiated during + * early driver load, in intel_uc_fw_init(), and see whether it successfully + * fetched the firmware