Re: [PATCH 3/3] MMC: FSL SDHC: Add support for hard-wired (permanent) card. Kernel version 3.4.47
Hi Dirk,

You are absolutely right. I will revise my patch series to reflect the change. Basically, I will call the generic mmc_of_parse from the probe function of Freescale's driver. That will handle all the additional capabilities.

Thanks
Oded

On 06/10/2013 09:29 AM, Dirk Behme wrote:
> On 02.06.2013 08:38, Oded Gabbay wrote:
>> This patch adds support of recognizing hard-wired (permanent) cards to Freescale's SDHC host driver. This is done by adding the option fsl,card-wired to the SDHC device-tree entry. Detection of this option is done in the probe function.
>>
>> Update documentation in file fsl-esdhc.txt
>
> Why don't you want to introduce fsl,card-wired? Why don't you use non-removable?
>
> To my understanding the patch
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7f217794ffa72f208a250b79ab0b7ea3de19677f
>
> explicitly removed fsl,card-wired. So I don't think re-introducing it is a good idea?
>
> Best regards
> Dirk

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] MMC: FSL SDHC: Add support for hard-wired (permanent) card. Kernel version 3.4.47
Hi All,

I just noticed that 3.4.47/8 doesn't have mmc_of_parse (compared to 3.9.4). Therefore, I will not use it and will just fix the code to recognize the non-removable property.

Best regards,
Oded

On 06/10/2013 04:43 PM, Oded Gabbay wrote:
> Hi Dirk,
>
> You are absolutely right. I will revise my patch series to reflect the change. Basically, I will call the generic mmc_of_parse from the probe function of Freescale's driver. That will handle all the additional capabilities.
>
> Thanks
> Oded
>
> On 06/10/2013 09:29 AM, Dirk Behme wrote:
>> On 02.06.2013 08:38, Oded Gabbay wrote:
>>> This patch adds support of recognizing hard-wired (permanent) cards to Freescale's SDHC host driver. This is done by adding the option fsl,card-wired to the SDHC device-tree entry. Detection of this option is done in the probe function.
>>>
>>> Update documentation in file fsl-esdhc.txt
>>
>> Why don't you want to introduce fsl,card-wired? Why don't you use non-removable?
>>
>> To my understanding the patch
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7f217794ffa72f208a250b79ab0b7ea3de19677f
>>
>> explicitly removed fsl,card-wired. So I don't think re-introducing it is a good idea?
>>
>> Best regards
>> Dirk
[PATCH] MMC: FSL SDHC: Add support for non-removable card. Kernel version 3.4.48
This patch adds support of recognizing non-removable cards to Freescale's SDHC host driver. This is done by detecting the attribute non-removable in the probe function.

This patch depends on patch[2/3] from 6-jun-2013: https://patchwork.kernel.org/patch/2649381/
This patch is instead of patch[3/3] from 6-jun-2013: https://patchwork.kernel.org/patch/2649231/

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index e70f22f..a6e068c 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -222,6 +222,10 @@ static int __devinit sdhci_esdhc_probe(struct platform_device *pdev)
 		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
 	}
 
+	/* Check if card is non-removable */
+	if (of_find_property(np, "non-removable", NULL))
+		host->caps |= MMC_CAP_NONREMOVABLE;
+
 	ret = sdhci_add_host(host);
 	if (ret)
 		sdhci_pltfm_free(pdev);
--
1.8.3
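The probe-time check in this patch follows a common pattern: a boolean device-tree property is turned into a capability bit. Below is a minimal user-space sketch of that pattern, not driver code. The names toy_node, toy_has_property and toy_probe_caps are hypothetical stand-ins for the device-tree node and the property lookup, and the flag value is illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag value; the real capability flag lives in the kernel headers. */
#define TOY_CAP_NONREMOVABLE (1u << 8)

/* Hypothetical stand-in for a device-tree node: a list of property names. */
struct toy_node {
	const char *const *props;
	size_t nprops;
};

/* Returns 1 if the node carries the named property, 0 otherwise. */
static int toy_has_property(const struct toy_node *np, const char *name)
{
	size_t i;

	assert(np != NULL && name != NULL);
	for (i = 0; i < np->nprops; i++)
		if (strcmp(np->props[i], name) == 0)
			return 1;
	return 0;
}

/* Mirrors the probe-time check: a boolean property sets one capability bit. */
static unsigned int toy_probe_caps(const struct toy_node *np, unsigned int caps)
{
	if (toy_has_property(np, "non-removable"))
		caps |= TOY_CAP_NONREMOVABLE;
	return caps;
}
```

Calling toy_probe_caps() on a node whose property list contains "non-removable" returns the incoming caps with the extra bit set; any other node returns caps unchanged.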
[PATCH] MDIO: FSL_PQ_MDIO: Fix bug on incorrect offset of tbipa register
This patch fixes a bug in the fsl_pq_mdio.c module and in the relevant device-tree files regarding the offset of the tbipa register in the eTSEC controller in some of Freescale's PQ3 and QorIQ SoCs.

The bug happens when the mdio node in the device tree is configured to be compatible with fsl,gianfar-tbi. Because the mdio device in the device tree points to address 25520, 26520 or 27520 (depending on the controller ID), the variable priv->map in function fsl_pq_mdio_probe points to that address. However, later in the function there is a write to the tbipa register, which is actually located at 25030, 26030 or 27030. Because the correct address is not I/O mapped, the contents are written to a different register in the controller.

The fix sets the address of the mdio device to start at 25000, 26000 or 27000 and changes the mii_offset field to 0x520 in the relevant entry (fsl,gianfar-tbi) of the fsl_pq_mdio_match array.

Note: This patch may break MDIO functionality of some old Freescale SoCs until Freescale fixes their device tree files. Basically, every device tree which contains an mdio device that is compatible with fsl,gianfar-tbi should be examined.
Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi    | 4 ++--
 arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi    | 4 ++--
 arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi    | 4 ++--
 arch/powerpc/boot/dts/ge_imp3a.dts             | 4 ++--
 arch/powerpc/boot/dts/mpc8536ds.dtsi           | 4 ++--
 arch/powerpc/boot/dts/mpc8544ds.dtsi           | 2 +-
 arch/powerpc/boot/dts/mpc8548cds.dtsi          | 6 +++---
 arch/powerpc/boot/dts/mpc8568mds.dts           | 2 +-
 arch/powerpc/boot/dts/mpc8572ds.dtsi           | 6 +++---
 arch/powerpc/boot/dts/mpc8572ds_camp_core0.dts | 4 ++--
 arch/powerpc/boot/dts/mpc8572ds_camp_core1.dts | 2 +-
 arch/powerpc/boot/dts/p2020ds.dtsi             | 4 ++--
 arch/powerpc/boot/dts/p2020rdb-pc.dtsi         | 4 ++--
 arch/powerpc/boot/dts/p2020rdb.dts             | 4 ++--
 arch/powerpc/boot/dts/ppa8548.dts              | 6 +++---
 drivers/net/ethernet/freescale/fsl_pq_mdio.c   | 2 +-
 16 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi b/arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi
index 96693b4..d38bf63 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi
@@ -46,9 +46,9 @@ ethernet@25000 {
 	interrupts = <35 2 0 0 36 2 0 0 40 2 0 0>;
 };
 
-mdio@25520 {
+mdio@25000 {
 	#address-cells = <1>;
 	#size-cells = <0>;
 	compatible = "fsl,gianfar-tbi";
-	reg = <0x25520 0x20>;
+	reg = <0x25000 0x1000>;
 };
diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi b/arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi
index 6b3fab1..6290b49 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi
@@ -46,9 +46,9 @@ ethernet@26000 {
 	interrupts = <31 2 0 0 32 2 0 0 33 2 0 0>;
 };
 
-mdio@26520 {
+mdio@26000 {
 	#address-cells = <1>;
 	#size-cells = <0>;
 	compatible = "fsl,gianfar-tbi";
-	reg = <0x26520 0x20>;
+	reg = <0x26000 0x1000>;
 };
diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi b/arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi
index 0da592d..5296811 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi
@@ -46,9 +46,9 @@ ethernet@27000 {
 	interrupts = <37 2 0 0 38 2 0 0 39 2 0 0>;
 };
 
-mdio@27520 {
+mdio@27000 {
 	#address-cells = <1>;
 	#size-cells = <0>;
 	compatible = "fsl,gianfar-tbi";
-	reg = <0x27520 0x20>;
+	reg = <0x27000 0x1000>;
 };
diff --git a/arch/powerpc/boot/dts/ge_imp3a.dts b/arch/powerpc/boot/dts/ge_imp3a.dts
index fefae41..49d9b4e 100644
--- a/arch/powerpc/boot/dts/ge_imp3a.dts
+++ b/arch/powerpc/boot/dts/ge_imp3a.dts
@@ -174,14 +174,14 @@
 			};
 		};
 
-		mdio@25520 {
+		mdio@25000 {
 			tbi1: tbi-phy@11 {
 				reg = <0x11>;
 				device_type = "tbi-phy";
 			};
 		};
 
-		mdio@26520 {
+		mdio@26000 {
 			status = "disabled";
 		};
 
diff --git a/arch/powerpc/boot/dts/mpc8536ds.dtsi b/arch/powerpc/boot/dts/mpc8536ds.dtsi
index 7c3dde8..c4df5a1 100644
--- a/arch/powerpc/boot/dts/mpc8536ds.dtsi
+++ b/arch/powerpc/boot/dts/mpc8536ds.dtsi
@@ -227,11 +227,11 @@
 		phy-connection-type = "rgmii-id";
 	};
 
-	mdio@26520 {
+	mdio@26000 {
 		#address-cells = <1>;
 		#size-cells = <0>;
 		compatible = "fsl,gianfar-tbi";
-		reg = <0x26520 0x20>;
+		reg = <0x26000 0x1000>;
 
 		tbi1: tbi-phy@11 {
 			reg = <0x11>;
diff --git a/arch/powerpc/boot/dts/mpc8544ds.dtsi b/arch/powerpc/boot
Re: [PATCH] MDIO: FSL_PQ_MDIO: Fix bug on incorrect offset of tbipa register
On 06/12/2013 04:04 PM, Timur Tabi wrote:
> Oded Gabbay wrote:
>> Note: This patch may break MDIO functionality of some old Freescale SoCs until Freescale fixes their device tree files. Basically, every device tree which contains an mdio device that is compatible with fsl,gianfar-tbi should be examined.
>
> I haven't had a chance to review the patch in detail, but I can tell you that breaking compatibility with older device trees is unacceptable. You need to add some code, even if it's an ugly hack, to support those trees.

I generally agree with this statement, except that without this patch almost ALL of Freescale's SoCs that use fsl,gianfar-tbi are broken, including the older ones. At least this patch fixes some of the device trees. Because I don't work at Freescale, I have very limited access to the few SoCs on which I could test this patch. I think it is Freescale's responsibility to release a complementary patch to fix the rest of the SoC device trees.

Oded
[PATCH 1/2] MMC: P2020 SDHC: Add support for 8-bit bus width and non-removable card
This patch adds support of connecting an MMC media using an 8-bit bus width connection to Freescale's P2020 H/W SDHC controller. During the probe function, the generic function mmc_of_parse is called to detect whether the controller is configured with 8-bit bus width. Also, the generic function detects if the non-removable property is set in the device tree.

The function esdhc_pltfm_bus_width was added because the bus width configuration is platform specific.

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-esdhc.h    |  7 +++
 drivers/mmc/host/sdhci-of-esdhc.c | 44 ++-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-esdhc.h b/drivers/mmc/host/sdhci-esdhc.h
index d25f9ab..6f9a018 100644
--- a/drivers/mmc/host/sdhci-esdhc.h
+++ b/drivers/mmc/host/sdhci-esdhc.h
@@ -36,6 +36,13 @@
 /* pltfm-specific */
 #define ESDHC_HOST_CONTROL_LE	0x20
 
+/*
+ * P2020 interpretation of the SDHCI_HOST_CONTROL register
+ */
+#define ESDHC_CTRL_4BITBUS		(0x1 << 1)
+#define ESDHC_CTRL_8BITBUS		(0x2 << 1)
+#define ESDHC_CTRL_BUSWIDTH_MASK	(0x3 << 1)
+
 /* OF-specific */
 #define ESDHC_DMA_SYSCTL	0x40c
 #define ESDHC_DMA_SNOOP		0x0040
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 5e68adc..fd149a0 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -13,6 +13,7 @@
  * your option) any later version.
  */
 
+#include <linux/err.h>
 #include <linux/io.h>
 #include <linux/of.h>
 #include <linux/delay.h>
@@ -230,6 +231,30 @@ static void esdhc_of_platform_init(struct sdhci_host *host)
 		host->quirks &= ~SDHCI_QUIRK_NO_BUSY_IRQ;
 }
 
+static int esdhc_pltfm_bus_width(struct sdhci_host *host, int width)
+{
+	u32 ctrl;
+
+	switch (width) {
+	case MMC_BUS_WIDTH_8:
+		ctrl = ESDHC_CTRL_8BITBUS;
+		break;
+
+	case MMC_BUS_WIDTH_4:
+		ctrl = ESDHC_CTRL_4BITBUS;
+		break;
+
+	default:
+		ctrl = 0;
+		break;
+	}
+
+	clrsetbits_be32(host->ioaddr + SDHCI_HOST_CONTROL,
+			ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
+
+	return 0;
+}
+
 static const struct sdhci_ops sdhci_esdhc_ops = {
 	.read_l = esdhc_readl,
 	.read_w = esdhc_readw,
@@ -247,6 +272,7 @@ static const struct sdhci_ops sdhci_esdhc_ops = {
 	.platform_resume = esdhc_of_resume,
 #endif
 	.adma_workaround = esdhci_of_adma_workaround,
+	.platform_bus_width = esdhc_pltfm_bus_width,
 };
 
 static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
@@ -262,7 +288,23 @@ static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 
 static int sdhci_esdhc_probe(struct platform_device *pdev)
 {
-	return sdhci_pltfm_register(pdev, &sdhci_esdhc_pdata);
+	struct sdhci_host *host;
+	int ret = 0;
+
+	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
+	if (IS_ERR(host))
+		return PTR_ERR(host);
+
+	sdhci_get_of_property(pdev);
+
+	/* call to generic mmc_of_parse to support additional capabilities */
+	mmc_of_parse(host->mmc);
+
+	ret = sdhci_add_host(host);
+	if (ret)
+		sdhci_pltfm_free(pdev);
+
+	return ret;
 }
 
 static int sdhci_esdhc_remove(struct platform_device *pdev)
--
1.8.3.1
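The bus-width selection in esdhc_pltfm_bus_width is a read-modify-write on a two-bit field. The following user-space sketch models it on a plain variable. It assumes the shift operators lost in the archived text were "<< 1", it ignores the big-endian register access that clrsetbits_be32 performs, and toy_width, toy_clrsetbits32 and toy_set_bus_width are hypothetical names rather than kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field values from the patch, assuming the lost operators were "<< 1". */
#define ESDHC_CTRL_4BITBUS       (0x1u << 1)
#define ESDHC_CTRL_8BITBUS       (0x2u << 1)
#define ESDHC_CTRL_BUSWIDTH_MASK (0x3u << 1)

/* Hypothetical width selector; the driver uses MMC_BUS_WIDTH_* instead. */
enum toy_width { TOY_WIDTH_1, TOY_WIDTH_4, TOY_WIDTH_8 };

/* Plain-memory model of a clear-then-set update: drop 'clear', then OR in 'set'. */
static void toy_clrsetbits32(uint32_t *reg, uint32_t clear, uint32_t set)
{
	assert(reg != NULL);
	*reg = (*reg & ~clear) | set;
}

/* Selects the bus width field without disturbing any other bit. */
static void toy_set_bus_width(uint32_t *host_control, enum toy_width width)
{
	uint32_t ctrl;

	switch (width) {
	case TOY_WIDTH_8:
		ctrl = ESDHC_CTRL_8BITBUS;
		break;
	case TOY_WIDTH_4:
		ctrl = ESDHC_CTRL_4BITBUS;
		break;
	default:
		ctrl = 0;	/* 1-bit bus: both field bits cleared */
		break;
	}

	toy_clrsetbits32(host_control, ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
}
```

Clearing the whole mask before setting the new value is what lets the function switch from 8-bit back to 4-bit or 1-bit without leaving a stale bit behind.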
[PATCH 2/2] MMC: P2020 SDHC: Fix bug when writing to SDHCI_HOST_CONTROL register
The P2020 has a non-standard implementation of the SDHCI_HOST_CONTROL register. This patch adds a QUIRK in the SDHCI header to signal that a host controller has a non-standard SDHCI_HOST_CONTROL register.

The patch adds a check to the function esdhc_writeb in file sdhci-of-esdhc.c: if the write is done to the SDHCI_HOST_CONTROL register and the host has the above-mentioned QUIRK, the function simply returns instead of writing to the register.

The patch also detects if the processor is P2020 (by looking in the device tree) and if so, adds the QUIRK to host->quirks2.

This patch depends on the first patch of this set (total of 2 patches)

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 14 ++
 include/linux/mmc/sdhci.h         |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index fd149a0..ca88529 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -121,6 +121,11 @@ static void esdhc_writeb(struct sdhci_host *host, u8 val, int reg)
 	if (reg == SDHCI_HOST_CONTROL) {
 		u32 dma_bits;
 
+		/* If host control register is not standard, exit
+		 * this function */
+		if (host->quirks2 & SDHCI_QUIRK2_BROKEN_HOST_CONTROL)
+			return;
+
 		/* DMA select is 22,23 bits in Protocol Control Register */
 		dma_bits = (val & SDHCI_CTRL_DMA_MASK) << 5;
 		clrsetbits_be32(host->ioaddr + reg , SDHCI_CTRL_DMA_MASK << 5,
@@ -289,6 +294,7 @@ static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 static int sdhci_esdhc_probe(struct platform_device *pdev)
 {
 	struct sdhci_host *host;
+	struct device_node *np;
 	int ret = 0;
 
 	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
@@ -297,6 +303,14 @@ static int sdhci_esdhc_probe(struct platform_device *pdev)
 
 	sdhci_get_of_property(pdev);
 
+	np = pdev->dev.of_node;
+	if (of_device_is_compatible(np, "fsl,p2020-esdhc")) {
+		/* Freescale messed up with P2020 as it has a non-standard
+		 * host control register
+		 */
+		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
+	}
+
 	/* call to generic mmc_of_parse to support additional capabilities */
 	mmc_of_parse(host->mmc);
 
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index b838ffc..b73dbdd 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -95,6 +95,8 @@ struct sdhci_host {
 /* The system physically doesn't support 1.8v, even if the host does */
 #define SDHCI_QUIRK2_NO_1_8_V				(1<<2)
 #define SDHCI_QUIRK2_PRESET_VALUE_BROKEN		(1<<3)
+/* Controller has a non-standard host control register */
+#define SDHCI_QUIRK2_BROKEN_HOST_CONTROL		(1<<4)
 
 	int irq;		/* Device IRQ */
 	void __iomem *ioaddr;	/* Mapped address */
--
1.8.3.1
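The quirk described in this patch boils down to an early return in the byte-write accessor. Here is a minimal user-space sketch of that gating, under stated assumptions: toy_host and toy_writeb are hypothetical, the register file is an ordinary array, the register offset and quirk bit position are illustrative, and the DMA bit handling of the real esdhc_writeb is left out. Unlike the driver function, the sketch returns a status so the skip is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HOST_CONTROL               0x28        /* register offset, illustrative */
#define TOY_QUIRK2_BROKEN_HOST_CONTROL (1u << 4)   /* quirk bit, illustrative */

/* Hypothetical host: a quirk word plus a small byte-addressed register file. */
struct toy_host {
	unsigned int quirks2;
	uint8_t regs[0x40];
};

/*
 * Byte write that skips the host control register when the quirk is set.
 * Returns 1 if the register was written, 0 if the write was suppressed.
 */
static int toy_writeb(struct toy_host *host, uint8_t val, int reg)
{
	assert(host != NULL);
	assert(reg >= 0 && (size_t)reg < sizeof(host->regs));

	if (reg == TOY_HOST_CONTROL &&
	    (host->quirks2 & TOY_QUIRK2_BROKEN_HOST_CONTROL))
		return 0;

	host->regs[reg] = val;
	return 1;
}
```

Writes to every other register go through unchanged, so only the one register with the non-standard layout is protected from generic code.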
Re: [PATCH] MDIO: FSL_PQ_MDIO: Fix bug on incorrect offset of tbipa register
On 06/12/2013 09:31 PM, Scott Wood wrote:
> On 06/12/2013 10:08:29 AM, Sebastian Andrzej Siewior wrote:
>> On 06/12/2013 02:47 PM, Oded Gabbay wrote:
>>> This patch fixes a bug in the fsl_pq_mdio.c module and in the relevant device-tree files regarding the offset of the tbipa register in the eTSEC controller in some of Freescale's PQ3 and QorIQ SoCs.
>>>
>>> The bug happens when the mdio node in the device tree is configured to be compatible with fsl,gianfar-tbi. Because the mdio device in the device tree points to address 25520, 26520 or 27520 (depending on the controller ID), the variable priv->map in function fsl_pq_mdio_probe points to that address. However, later in the function there is a write to the tbipa register, which is actually located at 25030, 26030 or 27030. Because the correct address is not I/O mapped, the contents are written to a different register in the controller.
>>>
>>> The fix sets the address of the mdio device to start at 25000, 26000 or 27000 and changes the mii_offset field to 0x520 in the relevant entry (fsl,gianfar-tbi) of the fsl_pq_mdio_match array.
>>>
>>> Note: This patch may break MDIO functionality of some old Freescale SoCs until Freescale fixes their device tree files. Basically, every device tree which contains an mdio device that is compatible with fsl,gianfar-tbi should be examined.
>>
>> Not as is. Please add a check for the original address. If it has 0x520 at the end, print a warning and fix it up.
>> Please add to the patch description which register is modified instead if this patch is not applied. Depending on how critical this is, it might have to go to stable.
>
> I'm not sure it's stable material if this is something that has never worked...
>
> The device tree binding will also need to be fixed to note the difference in "reg" between fsl,gianfar-mdio and fsl,gianfar-tbi -- and should give an example of the latter.
>
> -Scott

I read the two comments and I'm not sure what the best way to move ahead is.

I would like to describe the impact of not accepting this patch:

When you connect any eTSEC, except the first one, using SGMII, you must configure the TBIPA register, because the MII management configuration uses the TBIPA address as part of the SGMII initialization sequence, as described in the P2020 Reference Manual. So, if that register is not initialized, the sequence is broken and the eTSEC does not function (it cannot send or receive packets).

I still think the best way to fix it is what I did:

1. Point priv->map to the start of the whole register range of the eTSEC
2. Set mii_offset to 0x520 in the gianfar-tbi entry of the fsl_pq_mdio_match array.
3. Fix all the usages of gianfar-tbi in the device tree files - change the starting address and reg range

I think this is the best way because it is stated in the fsl_pq_mdio_probe function that:

	/*
	 * Some device tree nodes represent only the MII registers, and
	 * others represent the MAC and MII registers. The 'mii_offset' field
	 * contains the offset of the MII registers inside the mapped register
	 * space.
	 */

and that's why we have priv->map and priv->regs. So my fix follows the current design of the driver.

-Oded
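The addresses quoted in this thread can be checked with a small user-space sketch. It only models the arithmetic: toy_region, toy_in_region and toy_mii_addr are hypothetical names, the numbers are the first-controller values from the patch description, and the idea that the old table entry used an offset of 0 is an assumption rather than something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a mapped register window: start address and length. */
struct toy_region {
	uint32_t base;
	uint32_t size;
};

/* Returns 1 if 'addr' falls inside the mapped window, 0 otherwise. */
static int toy_in_region(const struct toy_region *r, uint32_t addr)
{
	assert(r != NULL);
	return addr >= r->base && addr - r->base < r->size;
}

/* Address of the MII registers: start of the window plus the MII offset. */
static uint32_t toy_mii_addr(const struct toy_region *r, uint32_t mii_offset)
{
	assert(r != NULL);
	return r->base + mii_offset;
}
```

With the old node (base 0x25520, size 0x20) the tbipa address 0x25030 lies below the window, so toy_in_region() reports it as unmapped. With the new node (base 0x25000, size 0x1000) tbipa is inside the window, and an offset of 0x520 still lands the MII registers at 0x25520.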
[PATCH v2 1/2] MMC: P2020 SDHC: Add support for 8-bit bus width and non-removable card
This patch adds support of connecting an MMC media using an 8-bit bus width connection to Freescale's P2020 H/W SDHC controller. During the probe function, the generic function mmc_of_parse is called to detect whether the controller is configured with 8-bit bus width. Also, the generic function detects if the non-removable property is set in the device tree.

The function esdhc_pltfm_bus_width was added because the bus width configuration is platform specific.

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
Reviewed-by: Anton Vorontsov <an...@enomsg.org>
---
 drivers/mmc/host/sdhci-esdhc.h    |  7 +++
 drivers/mmc/host/sdhci-of-esdhc.c | 44 ++-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-esdhc.h b/drivers/mmc/host/sdhci-esdhc.h
index d25f9ab..6f9a018 100644
--- a/drivers/mmc/host/sdhci-esdhc.h
+++ b/drivers/mmc/host/sdhci-esdhc.h
@@ -36,6 +36,13 @@
 /* pltfm-specific */
 #define ESDHC_HOST_CONTROL_LE	0x20
 
+/*
+ * P2020 interpretation of the SDHCI_HOST_CONTROL register
+ */
+#define ESDHC_CTRL_4BITBUS		(0x1 << 1)
+#define ESDHC_CTRL_8BITBUS		(0x2 << 1)
+#define ESDHC_CTRL_BUSWIDTH_MASK	(0x3 << 1)
+
 /* OF-specific */
 #define ESDHC_DMA_SYSCTL	0x40c
 #define ESDHC_DMA_SNOOP		0x0040
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 5e68adc..fd149a0 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -13,6 +13,7 @@
  * your option) any later version.
  */
 
+#include <linux/err.h>
 #include <linux/io.h>
 #include <linux/of.h>
 #include <linux/delay.h>
@@ -230,6 +231,30 @@ static void esdhc_of_platform_init(struct sdhci_host *host)
 		host->quirks &= ~SDHCI_QUIRK_NO_BUSY_IRQ;
 }
 
+static int esdhc_pltfm_bus_width(struct sdhci_host *host, int width)
+{
+	u32 ctrl;
+
+	switch (width) {
+	case MMC_BUS_WIDTH_8:
+		ctrl = ESDHC_CTRL_8BITBUS;
+		break;
+
+	case MMC_BUS_WIDTH_4:
+		ctrl = ESDHC_CTRL_4BITBUS;
+		break;
+
+	default:
+		ctrl = 0;
+		break;
+	}
+
+	clrsetbits_be32(host->ioaddr + SDHCI_HOST_CONTROL,
+			ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
+
+	return 0;
+}
+
 static const struct sdhci_ops sdhci_esdhc_ops = {
 	.read_l = esdhc_readl,
 	.read_w = esdhc_readw,
@@ -247,6 +272,7 @@ static const struct sdhci_ops sdhci_esdhc_ops = {
 	.platform_resume = esdhc_of_resume,
 #endif
 	.adma_workaround = esdhci_of_adma_workaround,
+	.platform_bus_width = esdhc_pltfm_bus_width,
 };
 
 static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
@@ -262,7 +288,23 @@ static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 
 static int sdhci_esdhc_probe(struct platform_device *pdev)
 {
-	return sdhci_pltfm_register(pdev, &sdhci_esdhc_pdata);
+	struct sdhci_host *host;
+	int ret;
+
+	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
+	if (IS_ERR(host))
+		return PTR_ERR(host);
+
+	sdhci_get_of_property(pdev);
+
+	/* call to generic mmc_of_parse to support additional capabilities */
+	mmc_of_parse(host->mmc);
+
+	ret = sdhci_add_host(host);
+	if (ret)
+		sdhci_pltfm_free(pdev);
+
+	return ret;
 }
 
 static int sdhci_esdhc_remove(struct platform_device *pdev)
--
1.8.3.1
[PATCH v2 2/2] MMC: P2020 SDHC: Fix bug when writing to SDHCI_HOST_CONTROL register
The P2020 has a non-standard implementation of the SDHCI_HOST_CONTROL register. This patch adds a QUIRK in the SDHCI header to signal that a host controller has a non-standard SDHCI_HOST_CONTROL register.

The patch adds a check to the function esdhc_writeb in file sdhci-of-esdhc.c: if the write is done to the SDHCI_HOST_CONTROL register and the host has the above-mentioned QUIRK, the function simply returns instead of writing to the register.

The patch also detects if the processor is P2020 (by looking in the device tree) and if so, adds the QUIRK to host->quirks2.

This patch depends on the first patch of this set (total of 2 patches)

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
Reviewed-by: Anton Vorontsov <an...@enomsg.org>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 14 ++
 include/linux/mmc/sdhci.h         |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index fd149a0..ca88529 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -121,6 +121,13 @@ static void esdhc_writeb(struct sdhci_host *host, u8 val, int reg)
 	if (reg == SDHCI_HOST_CONTROL) {
 		u32 dma_bits;
 
+		/*
+		 * If host control register is not standard, exit
+		 * this function
+		 */
+		if (host->quirks2 & SDHCI_QUIRK2_BROKEN_HOST_CONTROL)
+			return;
+
 		/* DMA select is 22,23 bits in Protocol Control Register */
 		dma_bits = (val & SDHCI_CTRL_DMA_MASK) << 5;
 		clrsetbits_be32(host->ioaddr + reg , SDHCI_CTRL_DMA_MASK << 5,
@@ -289,6 +296,7 @@ static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 static int sdhci_esdhc_probe(struct platform_device *pdev)
 {
 	struct sdhci_host *host;
+	struct device_node *np;
 	int ret;
 
 	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
@@ -297,6 +304,15 @@ static int sdhci_esdhc_probe(struct platform_device *pdev)
 
 	sdhci_get_of_property(pdev);
 
+	np = pdev->dev.of_node;
+	if (of_device_is_compatible(np, "fsl,p2020-esdhc")) {
+		/*
+		 * Freescale messed up with P2020 as it has a non-standard
+		 * host control register
+		 */
+		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
+	}
+
 	/* call to generic mmc_of_parse to support additional capabilities */
 	mmc_of_parse(host->mmc);
 
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index b838ffc..b73dbdd 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -95,6 +95,8 @@ struct sdhci_host {
 /* The system physically doesn't support 1.8v, even if the host does */
 #define SDHCI_QUIRK2_NO_1_8_V				(1<<2)
 #define SDHCI_QUIRK2_PRESET_VALUE_BROKEN		(1<<3)
+/* Controller has a non-standard host control register */
+#define SDHCI_QUIRK2_BROKEN_HOST_CONTROL		(1<<4)
 
 	int irq;		/* Device IRQ */
 	void __iomem *ioaddr;	/* Mapped address */
--
1.8.3.1
[PATCH 3/3] MMC: FSL SDHC: Add support for hard-wired (permanent) card. Kernel version 3.4.47
This patch adds support of recognizing hard-wired (permanent) cards to Freescale's SDHC host driver. This is done by adding the option fsl,card-wired to the SDHC device-tree entry. Detection of this option is done in the probe function.

Update documentation in file fsl-esdhc.txt

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 Documentation/devicetree/bindings/mmc/fsl-esdhc.txt | 3 +++
 drivers/mmc/host/sdhci-of-esdhc.c                   | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
index 64bcb8b..6f0eefa 100644
--- a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
+++ b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
@@ -16,6 +16,9 @@ Required properties:
     only handle 1-bit data transfers.
   - sdhci,auto-cmd12: (optional) specifies that a controller can
     only handle auto CMD12.
+  - fsl,card-wired : (optional) specifies that the card is
+    a permanent card and should not be detected for insertion or
+    removal
 
 Example:
 
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index e70f22f..2f79ec2 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -222,6 +222,10 @@ static int __devinit sdhci_esdhc_probe(struct platform_device *pdev)
 		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
 	}
 
+	/* If card is permanent, add capability of non-removable */
+	if (of_get_property(np, "fsl,card-wired", NULL))
+		host->mmc->caps |= MMC_CAP_NONREMOVABLE;
+
 	ret = sdhci_add_host(host);
 	if (ret)
 		sdhci_pltfm_free(pdev);
--
1.7.11.7
[PATCH 2/3] MMC: P2020 SDHC: Fix bug when writing to SDHCI_HOST_CONTROL register. Kernel version 3.4.47
The P2020 has a non-standard implementation of the SDHCI_HOST_CONTROL register. This patch adds a QUIRK in the SDHCI header to signal that a host controller has a non-standard SDHCI_HOST_CONTROL register.

The patch adds a check to the function esdhc_writeb in file sdhci-of-esdhc.c: if the write is done to the SDHCI_HOST_CONTROL register and the host has the above-mentioned QUIRK, the function simply returns instead of writing to the register.

The patch also detects if the processor is P2020 (by looking in the device tree) and if so, adds the QUIRK to host->quirks2.

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 10 ++
 include/linux/mmc/sdhci.h         |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 6f433b8..e70f22f 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -82,6 +82,11 @@ static void esdhc_writeb(struct sdhci_host *host, u8 val, int reg)
 	if (reg == SDHCI_HOST_CONTROL) {
 		u32 dma_bits;
 
+		/* If host control register is not standard, exit
+		 * this function */
+		if (host->quirks2 & SDHCI_QUIRK2_BROKEN_HOST_CONTROL)
+			return;
+
 		/* DMA select is 22,23 bits in Protocol Control Register */
 		dma_bits = (val & SDHCI_CTRL_DMA_MASK) << 5;
 		clrsetbits_be32(host->ioaddr + reg , SDHCI_CTRL_DMA_MASK << 5,
@@ -210,6 +215,11 @@ static int __devinit sdhci_esdhc_probe(struct platform_device *pdev)
 	if (of_device_is_compatible(np, "fsl,p2020-esdhc")) {
 		/* P2020 has capability of 8 bit bus width */
 		host->mmc->caps |= MMC_CAP_8_BIT_DATA;
+
+		/* Freescale messed up with P2020 as it has a non-standard
+		 * host control register
+		 */
+		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
 	}
 
 	ret = sdhci_add_host(host);
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index e9051e1..2742134 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -91,6 +91,8 @@ struct sdhci_host {
 	unsigned int quirks2;	/* More deviations from spec. */
 
 #define SDHCI_QUIRK2_HOST_OFF_CARD_ON			(1<<0)
+/* Controller has a non-standard host control register */
+#define SDHCI_QUIRK2_BROKEN_HOST_CONTROL		(1<<1)
 
 	int irq;		/* Device IRQ */
 	void __iomem *ioaddr;	/* Mapped address */
--
1.7.11.7
[PATCH 1/3] MMC: P2020 SDHC: Add support for 8-bit bus width connection. Kernel version 3.4.47
This patch adds support of connecting an MMC media using an 8-bit bus width connection to Freescale's P2020 H/W SDHC controller. During the probe function, it detects if the processor is P2020 (by looking at device tree) and if so, it adds the MMC_CAP_8_BIT_DATA to the MMC caps.

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-esdhc.h    |  7 ++
 drivers/mmc/host/sdhci-of-esdhc.c | 49 ++-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-esdhc.h b/drivers/mmc/host/sdhci-esdhc.h
index d25f9ab..6f9a018 100644
--- a/drivers/mmc/host/sdhci-esdhc.h
+++ b/drivers/mmc/host/sdhci-esdhc.h
@@ -36,6 +36,13 @@
 /* pltfm-specific */
 #define ESDHC_HOST_CONTROL_LE	0x20
 
+/*
+ * P2020 interpretation of the SDHCI_HOST_CONTROL register
+ */
+#define ESDHC_CTRL_4BITBUS		(0x1 << 1)
+#define ESDHC_CTRL_8BITBUS		(0x2 << 1)
+#define ESDHC_CTRL_BUSWIDTH_MASK	(0x3 << 1)
+
 /* OF-specific */
 #define ESDHC_DMA_SYSCTL	0x40c
 #define ESDHC_DMA_SNOOP		0x0040
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index f8eb1fb..6f433b8 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -13,6 +13,7 @@
  * your option) any later version.
  */
 
+#include <linux/err.h>
 #include <linux/io.h>
 #include <linux/of.h>
 #include <linux/delay.h>
@@ -143,6 +144,31 @@ static void esdhc_of_resume(struct sdhci_host *host)
 }
 #endif
 
+static int esdhc_pltfm_bus_width(struct sdhci_host *host, int width)
+{
+	u32 ctrl;
+
+	switch (width) {
+	case MMC_BUS_WIDTH_8:
+		ctrl = ESDHC_CTRL_8BITBUS;
+		break;
+
+	case MMC_BUS_WIDTH_4:
+		ctrl = ESDHC_CTRL_4BITBUS;
+		break;
+
+	default:
+		ctrl = 0;
+		break;
+	}
+
+	clrsetbits_be32(host->ioaddr + SDHCI_HOST_CONTROL,
+			ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
+
+	return 0;
+}
+
+
 static struct sdhci_ops sdhci_esdhc_ops = {
 	.read_l = sdhci_be32bs_readl,
 	.read_w = esdhc_readw,
@@ -158,6 +184,7 @@ static struct sdhci_ops sdhci_esdhc_ops = {
 	.platform_suspend = esdhc_of_suspend,
 	.platform_resume = esdhc_of_resume,
 #endif
+	.platform_8bit_width = esdhc_pltfm_bus_width
 };
 
 static struct sdhci_pltfm_data sdhci_esdhc_pdata = {
@@ -169,7 +196,27 @@ static struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 
 static int __devinit sdhci_esdhc_probe(struct platform_device *pdev)
 {
-	return sdhci_pltfm_register(pdev, &sdhci_esdhc_pdata);
+	struct sdhci_host *host;
+	struct device_node *np;
+	int ret = 0;
+
+	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
+	if (IS_ERR(host))
+		return PTR_ERR(host);
+
+	sdhci_get_of_property(pdev);
+
+	np = pdev->dev.of_node;
+	if (of_device_is_compatible(np, "fsl,p2020-esdhc")) {
+		/* P2020 has capability of 8 bit bus width */
+		host->mmc->caps |= MMC_CAP_8_BIT_DATA;
+	}
+
+	ret = sdhci_add_host(host);
+	if (ret)
+		sdhci_pltfm_free(pdev);
+
+	return ret;
 }
 
 static int __devexit sdhci_esdhc_remove(struct platform_device *pdev)
--
1.7.11.7
[PATCH 00/83] AMD HSA kernel driver
. Each mm_struct is associated with a unique PASID, allowing the IOMMUv2 to
make userspace process memory accessible to the GPU.

Next step is for the application to collect topology information via sysfs.
This gives userspace enough information to be able to identify specific nodes
(processors) in subsequent queue management calls. Application processes can
create queues on multiple processors, and processors support queues from
multiple processes.

At this point the application can create work queues in userspace memory and
pass them through the usermode library to kfd to have them mapped onto HW
queue slots so that commands written to the queues can be executed by the GPU.
Queue operations specify a processor node, and so the bulk of this code is
device-specific.

Written by John Bridgman john.bridg...@amd.com

Alexey Skidanov (4):
  hsa/radeon: 32-bit processes support
  hsa/radeon: NULL pointer dereference bug workaround
  hsa/radeon: HSA64/HSA32 modes support
  hsa/radeon: Add local memory to topology

Andrew Lewycky (3):
  hsa/radeon: Make binding of process to device permanent
  hsa/radeon: Implement hsaKmtSetMemoryPolicy
  mm: Change timing of notification to IOMMUs about a page to be invalidated

Ben Goz (20):
  hsa/radeon: Add queue and hw_pointer_store modules
  hsa/radeon: Add support allocating kernel doorbells
  hsa/radeon: Add mqd_manager module
  hsa/radeon: Add kernel queue support for KFD
  hsa/radeon: Add module parameter of scheduling policy
  hsa/radeon: Add packet manager module
  hsa/radeon: Add process queue manager module
  hsa/radeon: Add device queue manager module
  hsa/radeon: Switch to new queue scheduler
  hsa/radeon: Add IOCTL for update queue
  hsa/radeon: Queue Management integration with Memory Management
  hsa/radeon: update queue fault handling
  hsa/radeon: fixing a bug to support 32b processes
  hsa/radeon: Fix number of pipes per ME
  hsa/radeon: Removing hw pointer store module
  hsa/radeon: Adding some error messages
  hsa/radeon: Fixing minor issues with kernel queues (DIQ)
  drm/radeon: Add register access functions to kfd2kgd interface
  hsa/radeon: Eliminating all direct register accesses
  drm/radeon: Remove lock functions from kfd2kgd interface

Evgeny Pinchuk (9):
  hsa/radeon: fix the OEMID assignment in kfd_topology
  drm/radeon: extending kfd-kgd interface
  hsa/radeon: implementing IOCTL for clock counters
  drm/radeon: adding synchronization for GRBM GFX
  hsa/radeon: fixing clock counters bug
  drm/radeon: Extending kfd interface
  hsa/radeon: Adding max clock speeds to topology
  hsa/radeon: Alternating the source of max clock
  hsa/radeon: Exclusive access for perf. counters

Michael Varga (1):
  hsa/radeon: debugging print statements

Oded Gabbay (45):
  mm: Add kfd_process pointer to mm_struct
  drm/radeon: reduce number of free VMIDs and pipes in KV
  drm/radeon: Report doorbell configuration to kfd
  drm/radeon: Add radeon <--> kfd interface
  drm/radeon: Add kfd-->kgd interface to get virtual ram size
  drm/radeon: Add kfd-->kgd interfaces of memory allocation/mapping
  drm/radeon: Add kfd-->kgd interface of locking srbm_gfx_cntl register
  drm/radeon: Add calls to initialize and finalize kfd from radeon
  hsa/radeon: Add code base of hsa driver for AMD's GPUs
  hsa/radeon: Add initialization and unmapping of doorbell aperture
  hsa/radeon: Add scheduler code
  hsa/radeon: Add kfd mmap handler
  hsa/radeon: Add 2 new IOCTL to kfd, CREATE_QUEUE and DESTROY_QUEUE
  hsa/radeon: Update MAINTAINERS and CREDITS files
  hsa/radeon: Add interrupt handling module
  hsa/radeon: Add the isr function of the KFD scehduler
  hsa/radeon: Handle deactivation of queues using interrupts
  hsa/radeon: Enable interrupts in KFD scheduler
  hsa/radeon: Enable/Disable KFD interrupt module
  hsa/radeon: Add interrupt callback function to kgd2kfd interface
  hsa/radeon: Add kgd-->kfd interfaces for suspend and resume
  drm/radeon: Add calls to suspend and resume of kfd driver
  drm/radeon/cik: Don't touch int of pipes 1-7
  drm/radeon/cik: Call kfd isr function
  hsa/radeon: Fix memory size allocated for HPD
  hsa/radeon: Fix list of supported devices
  hsa/radeon: Fix coding style in cik_int.h
  hsa/radeon: Print ioctl commnad only in debug mode
  hsa/radeon: Print ISR info only in debug mode
  hsa/radeon: Workaround for a bug in amd_iommu
  hsa/radeon: Eliminate warnings in compilation
  hsa/radeon: Various kernel styling fixes
  hsa/radeon: Rearrange structures in kfd_ioctl.h
  hsa/radeon: change another pr_info to pr_debug
  hsa/radeon: Fix timeout calculation in sync_with_hw
  hsa/radeon: Update module information and version
  hsa/radeon: Update module version to 0.6.0
  hsa/radeon: Fix initialization of sh_mem registers
  hsa/radeon: Fix compilation warnings
  hsa/radeon: Remove old scheduler code
  hsa/radeon: Static analysis (smatch) fixes
  hsa/radeon: Check oversubscription before destroying runlist
  hsa/radeon: Don't verify cksum when parsing CRAT
[PATCH 01/83] mm: Add kfd_process pointer to mm_struct
This patch enables the KFD to retrieve the kfd_process object from the
process's mm_struct. This is needed because the kfd_process lifespan is bound
to the process's mm_struct lifespan.

When KFD is notified about an mm_struct tear-down, it checks if the
kfd_process pointer is valid. If so, it releases the kfd_process object and
all relevant resources.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 include/linux/mm_types.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 678097c..6179107 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -20,6 +20,10 @@
 struct hmm;
 #endif

+#ifdef CONFIG_HSA_RADEON
+struct kfd_process;
+#endif
+
 #ifndef AT_VECTOR_SIZE_ARCH
 #define AT_VECTOR_SIZE_ARCH 0
 #endif
@@ -439,6 +443,16 @@ struct mm_struct {
 	 */
 	struct hmm *hmm;
 #endif
+#if defined(CONFIG_HSA_RADEON) || defined(CONFIG_HSA_RADEON_MODULE)
+	/*
+	 * kfd always registers an mmu_notifier; we rely on the mmu notifier to
+	 * keep a refcount on the mm struct as well as forbidding registering
+	 * kfd on a dying mm
+	 *
+	 * This field is set with mmap_sem held in write mode.
+	 */
+	struct kfd_process *kfd_process;
+#endif
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	pgtable_t pmd_huge_pte; /* protected by page_table_lock */
 #endif
--
1.9.1
[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV
To support HSA on KV, we need to limit the number of VMIDs and pipes that
are available for radeon's use with KV.

This patch reserves VMIDs 8-15 for KFD (so radeon can only use VMIDs 0-7)
and also makes radeon think that KV has only a single MEC with a single
pipe in it.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/cik.c | 48 ++--
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 4bfc2c0..e0c8052 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4662,12 +4662,11 @@ static int cik_mec_init(struct radeon_device *rdev)
 	/*
 	 * KV:    2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
 	 * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
+	 * Nonetheless, we assign only 1 pipe because all other pipes will
+	 * be handled by KFD
 	 */
-	if (rdev->family == CHIP_KAVERI)
-		rdev->mec.num_mec = 2;
-	else
-		rdev->mec.num_mec = 1;
-	rdev->mec.num_pipe = 4;
+	rdev->mec.num_mec = 1;
+	rdev->mec.num_pipe = 1;
 	rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;

 	if (rdev->mec.hpd_eop_obj == NULL) {
@@ -4809,28 +4808,24 @@ static int cik_cp_compute_resume(struct radeon_device *rdev)

 	/* init the pipes */
 	mutex_lock(&rdev->srbm_mutex);
-	for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
-		int me = (i < 4) ? 1 : 2;
-		int pipe = (i < 4) ? i : (i - 4);

-		eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i * MEC_HPD_SIZE * 2);
+	eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;

-		cik_srbm_select(rdev, me, pipe, 0, 0);
+	cik_srbm_select(rdev, 0, 0, 0, 0);

-		/* write the EOP addr */
-		WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
-		WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);
+	/* write the EOP addr */
+	WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
+	WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);

-		/* set the VMID assigned */
-		WREG32(CP_HPD_EOP_VMID, 0);
+	/* set the VMID assigned */
+	WREG32(CP_HPD_EOP_VMID, 0);
+
+	/* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
+	tmp = RREG32(CP_HPD_EOP_CONTROL);
+	tmp &= ~EOP_SIZE_MASK;
+	tmp |= order_base_2(MEC_HPD_SIZE / 8);
+	WREG32(CP_HPD_EOP_CONTROL, tmp);

-		/* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
-		tmp = RREG32(CP_HPD_EOP_CONTROL);
-		tmp &= ~EOP_SIZE_MASK;
-		tmp |= order_base_2(MEC_HPD_SIZE / 8);
-		WREG32(CP_HPD_EOP_CONTROL, tmp);
-	}
-	cik_srbm_select(rdev, 0, 0, 0, 0);
 	mutex_unlock(&rdev->srbm_mutex);

 	/* init the queues.  Just two for now. */
@@ -5876,8 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib)
  */
 int cik_vm_init(struct radeon_device *rdev)
 {
-	/* number of VMs */
-	rdev->vm_manager.nvm = 16;
+	/*
+	 * number of VMs
+	 * VMID 0 is reserved for Graphics
+	 * radeon compute will use VMIDs 1-7
+	 * KFD will use VMIDs 8-15
+	 */
+	rdev->vm_manager.nvm = 8;
 	/* base offset of vram pages */
 	if (rdev->flags & RADEON_IS_IGP) {
 		u64 tmp = RREG32(MC_VM_FB_OFFSET);
--
1.9.1
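The arithmetic behind this patch (queues per configuration, and which VMIDs are left for KFD) can be checked with a small stand-alone sketch. The helper names mec_num_queues and kfd_vmid_bitmap are hypothetical and exist only for illustration; the constants come from the comments and values in the diff.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUES_PER_PIPE 8	/* "8 Queues/Pipe" in the cik_mec_init() comment */
#define TOTAL_VMIDS     16	/* vm_manager.nvm before the patch */
#define RADEON_VMIDS    8	/* vm_manager.nvm after the patch */

/* Total hardware queues for a given MEC/pipe configuration. */
static unsigned int mec_num_queues(unsigned int num_mec, unsigned int num_pipe)
{
	return num_mec * num_pipe * QUEUES_PER_PIPE;
}

/* Bitmap of the VMIDs left for KFD: bits RADEON_VMIDS..TOTAL_VMIDS-1. */
static uint16_t kfd_vmid_bitmap(void)
{
	uint32_t all = (1u << TOTAL_VMIDS) - 1;
	uint32_t radeon = (1u << RADEON_VMIDS) - 1;

	assert(RADEON_VMIDS <= TOTAL_VMIDS);
	return (uint16_t)(all & ~radeon);
}
```

With these definitions, the original KV configuration (2 MECs, 4 pipes each) gives 64 queues, while the configuration radeon keeps after the patch (1 MEC, 1 pipe) gives 8; the remaining VMIDs form the mask 0xFF00.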
[PATCH 06/83] drm/radeon: Add kfd-->kgd interfaces of memory allocation/mapping
This patch adds new interfaces to kfd2kgd_calls structure. The new interfaces allow the kfd driver to : 1. Allocated video memory through the radeon driver 2. Map and unmap video memory with GPUVM through the radeon driver 3. Map and unmap system memory with GPUVM through the radeon driver Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/drm/radeon/radeon_kfd.c | 129 include/linux/radeon_kfd.h | 23 +++ 2 files changed, 152 insertions(+) diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c index 1b859b5..66ee36b 100644 --- a/drivers/gpu/drm/radeon/radeon_kfd.c +++ b/drivers/gpu/drm/radeon/radeon_kfd.c @@ -25,9 +25,31 @@ #include drm/drmP.h #include radeon.h +struct kgd_mem { + struct radeon_bo *bo; + u32 domain; +}; + +static int allocate_mem(struct kgd_dev *kgd, size_t size, size_t alignment, + enum kgd_memory_pool pool, struct kgd_mem **memory_handle); + +static void free_mem(struct kgd_dev *kgd, struct kgd_mem *memory_handle); + +static int gpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, uint64_t *vmid0_address); +static void ungpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem); + +static int kmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, void **ptr); +static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem); + static uint64_t get_vmem_size(struct kgd_dev *kgd); static const struct kfd2kgd_calls kfd2kgd = { + .allocate_mem = allocate_mem, + .free_mem = free_mem, + .gpumap_mem = gpumap_mem, + .ungpumap_mem = ungpumap_mem, + .kmap_mem = kmap_mem, + .unkmap_mem = unkmap_mem, .get_vmem_size = get_vmem_size, }; @@ -96,6 +118,113 @@ void radeon_kfd_device_fini(struct radeon_device *rdev) } } +static u32 pool_to_domain(enum kgd_memory_pool p) +{ + switch (p) { + case KGD_POOL_FRAMEBUFFER: return RADEON_GEM_DOMAIN_VRAM; + default: return RADEON_GEM_DOMAIN_GTT; + } +} + +static int allocate_mem(struct kgd_dev *kgd, size_t size, size_t alignment, + enum kgd_memory_pool pool, struct kgd_mem **memory_handle) 
+{ + struct radeon_device *rdev = (struct radeon_device *)kgd; + struct kgd_mem *mem; + int r; + + mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); + if (!mem) + return -ENOMEM; + + mem-domain = pool_to_domain(pool); + + r = radeon_bo_create(rdev, size, alignment, true, mem-domain, NULL, mem-bo); + if (r) { + kfree(mem); + return r; + } + + *memory_handle = mem; + return 0; +} + +static void free_mem(struct kgd_dev *kgd, struct kgd_mem *mem) +{ + /* Assume that KFD will never free gpumapped or kmapped memory. This is not quite settled. */ + radeon_bo_unref(mem-bo); + kfree(mem); +} + +static int gpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, uint64_t *vmid0_address) +{ + int r; + + r = radeon_bo_reserve(mem-bo, true); + + /* +* ttm_bo_reserve can only fail if the buffer reservation lock +* is held in circumstances that would deadlock +*/ + BUG_ON(r != 0); + r = radeon_bo_pin(mem-bo, mem-domain, vmid0_address); + radeon_bo_unreserve(mem-bo); + + return r; +} + +static void ungpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem) +{ + int r; + + r = radeon_bo_reserve(mem-bo, true); + + /* +* ttm_bo_reserve can only fail if the buffer reservation lock +* is held in circumstances that would deadlock +*/ + BUG_ON(r != 0); + r = radeon_bo_unpin(mem-bo); + + /* +* This unpin only removed NO_EVICT placement flags +* and should never fail +*/ + BUG_ON(r != 0); + radeon_bo_unreserve(mem-bo); +} + +static int kmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, void **ptr) +{ + int r; + + r = radeon_bo_reserve(mem-bo, true); + + /* +* ttm_bo_reserve can only fail if the buffer reservation lock +* is held in circumstances that would deadlock +*/ + BUG_ON(r != 0); + r = radeon_bo_kmap(mem-bo, ptr); + radeon_bo_unreserve(mem-bo); + + return r; +} + +static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem) +{ + int r; + + r = radeon_bo_reserve(mem-bo, true); + /* +* ttm_bo_reserve can only fail if the buffer reservation lock +* is held in circumstances that would 
deadlock +*/ + BUG_ON(r != 0); + radeon_bo_kunmap(mem-bo); + radeon_bo_unreserve(mem-bo); +} + static uint64_t get_vmem_size(struct kgd_dev *kgd) { struct radeon_device *rdev = (struct radeon_device *)kgd; diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h index 28cddf5..c7997d4 100644 --- a/include/linux/radeon_kfd.h +++ b/include/linux/radeon_kfd.h @@ -36,6 +36,14
[PATCH 03/83] drm/radeon: Report doorbell configuration to kfd
Radeon and KFD share the doorbell aperture. Radeon sets it up, takes the
doorbells required for its own rings and reports the setup to KFD.
Radeon reserved doorbells are at the start of the doorbell aperture.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon.h        |  4
 drivers/gpu/drm/radeon/radeon_device.c | 31 +++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 7cda75d..4e7e41f 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -676,6 +676,10 @@ struct radeon_doorbell {

 int radeon_doorbell_get(struct radeon_device *rdev, u32 *page);
 void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell);
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+				  phys_addr_t *aperture_base,
+				  size_t *aperture_size,
+				  size_t *start_offset);

 /*
  * IRQS.
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 03686fa..98538d2 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -328,6 +328,37 @@ void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell)
 	__clear_bit(doorbell, rdev->doorbell.used);
 }

+/**
+ * radeon_doorbell_get_kfd_info - Report doorbell configuration required to
+ *                                setup KFD
+ *
+ * @rdev: radeon_device pointer
+ * @aperture_base: output returning doorbell aperture base physical address
+ * @aperture_size: output returning doorbell aperture size in bytes
+ * @start_offset: output returning # of doorbell bytes reserved for radeon.
+ *
+ * Radeon and the KFD share the doorbell aperture. Radeon sets it up,
+ * takes doorbells required for its own rings and reports the setup to KFD.
+ * Radeon reserved doorbells are at the start of the doorbell aperture.
+ */
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+				  phys_addr_t *aperture_base,
+				  size_t *aperture_size,
+				  size_t *start_offset)
+{
+	/* The first num_doorbells are used by radeon.
+	 * KFD takes whatever's left in the aperture. */
+	if (rdev->doorbell.size > rdev->doorbell.num_doorbells * sizeof(u32)) {
+		*aperture_base = rdev->doorbell.base;
+		*aperture_size = rdev->doorbell.size;
+		*start_offset = rdev->doorbell.num_doorbells * sizeof(u32);
+	} else {
+		*aperture_base = 0;
+		*aperture_size = 0;
+		*start_offset = 0;
+	}
+}
+
 /*
  * radeon_wb_*()
  * Writeback is the the method by which the the GPU updates special pages
--
1.9.1
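To make the split of the aperture concrete, the following user-space sketch mirrors the decision made in radeon_doorbell_get_kfd_info(). The struct doorbell_info and the function doorbell_get_kfd_info are simplified, hypothetical stand-ins for the radeon structures, assuming (as in the patch) that each doorbell is one 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the doorbell state kept by radeon. */
struct doorbell_info {
	uint64_t base;		/* physical base address of the aperture */
	size_t size;		/* aperture size in bytes */
	size_t num_doorbells;	/* doorbells reserved for radeon's own rings */
};

/*
 * Report the aperture to KFD only if something is left after radeon's
 * reservation at the start; otherwise report an empty aperture.
 */
static void doorbell_get_kfd_info(const struct doorbell_info *db,
				  uint64_t *aperture_base,
				  size_t *aperture_size,
				  size_t *start_offset)
{
	size_t reserved = db->num_doorbells * sizeof(uint32_t);

	assert(aperture_base && aperture_size && start_offset);

	if (db->size > reserved) {
		*aperture_base = db->base;
		*aperture_size = db->size;
		*start_offset = reserved;	/* KFD doorbells start here */
	} else {
		*aperture_base = 0;
		*aperture_size = 0;
		*start_offset = 0;
	}
}
```

Note that the reported size is the size of the whole aperture, not of KFD's share; the caller is expected to skip the first start_offset bytes.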
[PATCH 08/83] drm/radeon: Add calls to initialize and finalize kfd from radeon
The KFD driver should be loaded when the radeon driver is loaded and
should be finalized when the radeon driver is removed.

This patch adds a function call to initialize kfd from radeon_init and a
function call to finalize kfd from radeon_exit.

If the KFD driver is not present in the system, the initialize call fails
and the radeon driver continues normally.

This patch also adds calls to probe, initialize and finalize a kfd device
per radeon device using the kgd-->kfd interface.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_drv.c | 6 ++
 drivers/gpu/drm/radeon/radeon_kms.c | 9 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index cb14213..88a45a0 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -151,6 +151,9 @@ static inline void radeon_register_atpx_handler(void) {}
 static inline void radeon_unregister_atpx_handler(void) {}
 #endif

+extern bool radeon_kfd_init(void);
+extern void radeon_kfd_fini(void);
+
 int radeon_no_wb;
 int radeon_modeset = -1;
 int radeon_dynclks = -1;
@@ -630,12 +633,15 @@ static int __init radeon_init(void)
 #endif
 	}

+	radeon_kfd_init();
+
 	/* let modprobe override vga console setting */
 	return drm_pci_init(driver, pdriver);
 }

 static void __exit radeon_exit(void)
 {
+	radeon_kfd_fini();
 	drm_pci_exit(driver, pdriver);
 	radeon_unregister_atpx_handler();
 }
diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index 35d9318..0748284 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -34,6 +34,10 @@
 #include <linux/slab.h>
 #include <linux/pm_runtime.h>

+extern void radeon_kfd_device_probe(struct radeon_device *rdev);
+extern void radeon_kfd_device_init(struct radeon_device *rdev);
+extern void radeon_kfd_device_fini(struct radeon_device *rdev);
+
 #if defined(CONFIG_VGA_SWITCHEROO)
 bool radeon_has_atpx(void);
 #else
@@ -63,6 +67,8 @@ int radeon_driver_unload_kms(struct drm_device *dev)

 	pm_runtime_get_sync(dev->dev);

+	radeon_kfd_device_fini(rdev);
+
 	radeon_acpi_fini(rdev);

 	radeon_modeset_fini(rdev);
@@ -142,6 +148,9 @@ int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags)
 			"Error during ACPI methods call\n");
 	}

+	radeon_kfd_device_probe(rdev);
+	radeon_kfd_device_init(rdev);
+
 	if (radeon_is_px(dev)) {
 		pm_runtime_use_autosuspend(dev->dev);
 		pm_runtime_set_autosuspend_delay(dev->dev, 5000);
--
1.9.1
[PATCH 07/83] drm/radeon: Add kfd-->kgd interface of locking srbm_gfx_cntl register
This patch adds a new interface to kfd2kgd_calls structure, which allows
the kfd to lock and unlock the srbm_gfx_cntl register.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 20
 include/linux/radeon_kfd.h          |  4
 2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 66ee36b..594020e 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -43,6 +43,10 @@ static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);

 static uint64_t get_vmem_size(struct kgd_dev *kgd);

+static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
+static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
+
+
 static const struct kfd2kgd_calls kfd2kgd = {
 	.allocate_mem = allocate_mem,
 	.free_mem = free_mem,
@@ -51,6 +55,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.kmap_mem = kmap_mem,
 	.unkmap_mem = unkmap_mem,
 	.get_vmem_size = get_vmem_size,
+	.lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
+	.unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
 };

 static const struct kgd2kfd_calls *kgd2kfd;
@@ -233,3 +239,17 @@ static uint64_t get_vmem_size(struct kgd_dev *kgd)

 	return rdev->mc.real_vram_size;
 }
+
+static void lock_srbm_gfx_cntl(struct kgd_dev *kgd)
+{
+	struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+	mutex_lock(&rdev->srbm_mutex);
+}
+
+static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd)
+{
+	struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+	mutex_unlock(&rdev->srbm_mutex);
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index c7997d4..40b691c 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -81,6 +81,10 @@ struct kfd2kgd_calls {
 	void (*unkmap_mem)(struct kgd_dev *kgd, struct kgd_mem *mem);

 	uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
+
+	/* SRBM_GFX_CNTL mutex */
+	void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
+	void (*unlock_srbm_gfx_cntl)(struct kgd_dev *kgd);
 };

 bool kgd2kfd_init(unsigned interface_version,
--
1.9.1
[PATCH 13/83] hsa/radeon: Add 2 new IOCTL to kfd, CREATE_QUEUE and DESTROY_QUEUE
This patch adds 2 new IOCTL to kfd driver. The first IOCTL is KFD_IOC_CREATE_QUEUE that is used by the user-mode application to create a compute queue on the GPU. The second IOCTL is KFD_IOC_DESTROY_QUEUE that is used by the user-mode application to destroy an existing compute queue on the GPU. Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/kfd_chardev.c | 155 ++ drivers/gpu/hsa/radeon/kfd_doorbell.c | 11 +++ include/uapi/linux/kfd_ioctl.h| 69 +++ 3 files changed, 235 insertions(+) create mode 100644 include/uapi/linux/kfd_ioctl.h diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c index 0b5bc74..4e7d5d0 100644 --- a/drivers/gpu/hsa/radeon/kfd_chardev.c +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c @@ -27,11 +27,13 @@ #include linux/sched.h #include linux/slab.h #include linux/uaccess.h +#include uapi/linux/kfd_ioctl.h #include kfd_priv.h #include kfd_scheduler.h static long kfd_ioctl(struct file *, unsigned int, unsigned long); static int kfd_open(struct inode *, struct file *); +static int kfd_mmap(struct file *, struct vm_area_struct *); static const char kfd_dev_name[] = kfd; @@ -108,17 +110,170 @@ kfd_open(struct inode *inode, struct file *filep) return 0; } +static long +kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *arg) +{ + struct kfd_ioctl_create_queue_args args; + struct kfd_dev *dev; + int err = 0; + unsigned int queue_id; + struct kfd_queue *queue; + struct kfd_process_device *pdd; + + if (copy_from_user(args, arg, sizeof(args))) + return -EFAULT; + + dev = radeon_kfd_device_by_id(args.gpu_id); + if (dev == NULL) + return -EINVAL; + + queue = kzalloc( + offsetof(struct kfd_queue, scheduler_queue) + dev-device_info-scheduler_class-queue_size, + GFP_KERNEL); + + if (!queue) + return -ENOMEM; + + queue-dev = dev; + + mutex_lock(p-mutex); + + pdd = radeon_kfd_bind_process_to_device(dev, p); + if (IS_ERR(pdd) 0) { + err = PTR_ERR(pdd); + goto err_bind_pasid; + } + + 
pr_debug(kfd: creating queue number %d for PASID %d on GPU 0x%x\n, + pdd-queue_count, + p-pasid, + dev-id); + + if (pdd-queue_count++ == 0) { + err = dev-device_info-scheduler_class-register_process(dev-scheduler, p, pdd-scheduler_process); + if (err 0) + goto err_register_process; + } + + if (!radeon_kfd_allocate_queue_id(p, queue_id)) + goto err_allocate_queue_id; + + err = dev-device_info-scheduler_class-create_queue(dev-scheduler, pdd-scheduler_process, + queue-scheduler_queue, + (void __user *)args.ring_base_address, + args.ring_size, + (void __user *)args.read_pointer_address, + (void __user *)args.write_pointer_address, + radeon_kfd_queue_id_to_doorbell(dev, p, queue_id)); + if (err) + goto err_create_queue; + + radeon_kfd_install_queue(p, queue_id, queue); + + args.queue_id = queue_id; + args.doorbell_address = (uint64_t)(uintptr_t)radeon_kfd_get_doorbell(filep, p, dev, queue_id); + + if (copy_to_user(arg, args, sizeof(args))) { + err = -EFAULT; + goto err_copy_args_out; + } + + mutex_unlock(p-mutex); + + pr_debug(kfd: queue id %d was created successfully.\n + ring buffer address == 0x%016llX\n + read ptr address== 0x%016llX\n + write ptr address == 0x%016llX\n + doorbell address== 0x%016llX\n, + args.queue_id, + args.ring_base_address, + args.read_pointer_address, + args.write_pointer_address, + args.doorbell_address); + + return 0; + +err_copy_args_out: + dev-device_info-scheduler_class-destroy_queue(dev-scheduler, queue-scheduler_queue); +err_create_queue: + radeon_kfd_remove_queue(p, queue_id); +err_allocate_queue_id: + if (--pdd-queue_count == 0) { + dev-device_info-scheduler_class-deregister_process(dev-scheduler, pdd-scheduler_process); + pdd-scheduler_process = NULL; + } +err_register_process: +err_bind_pasid: + kfree(queue); + mutex_unlock(p-mutex); + return err
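The create-queue path above relies on goto-based unwinding together with a per-device queue counter: the first queue registers the process with the scheduler, and undoing the last queue deregisters it. A minimal stand-alone sketch of that pattern follows. Everything in it is hypothetical (struct process_device, create_queue, the id_available flag); it also assigns an explicit error code before jumping to the unwind label, since in the excerpt the queue-id allocation failure path jumps without an assignment to err being visible. The choice of -ENOMEM here is an assumption, not something the patch states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-process, per-device bookkeeping. */
struct process_device {
	unsigned int queue_count;	/* queues currently owned */
	bool registered;		/* registered with the scheduler? */
};

/*
 * Create one queue. 'id_available' simulates whether a queue id could be
 * allocated. Returns 0 on success or a negative errno value on failure,
 * leaving 'pdd' exactly as it was before the call when it fails.
 */
static int create_queue(struct process_device *pdd, bool id_available)
{
	int err;

	assert(pdd);

	/* The first queue on this device registers the process. */
	if (pdd->queue_count++ == 0)
		pdd->registered = true;

	if (!id_available) {
		err = -ENOMEM;	/* set the code before unwinding */
		goto err_allocate_queue_id;
	}

	return 0;

err_allocate_queue_id:
	/* Undo the increment; deregister if this was the only queue. */
	if (--pdd->queue_count == 0)
		pdd->registered = false;
	return err;
}
```

The labels run in reverse order of acquisition, so each failure point only undoes the steps that actually completed.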
[PATCH 15/83] hsa/radeon: Add interrupt handling module
This patch adds the interrupt handling module, in kfd_interrupt.c, and its related members in different data structures to the KFD driver. The KFD interrupt module maintains an internal interrupt ring per kfd device. The internal interrupt ring contains interrupts that needs further handling.The extra handling is deferred to a later time through a workqueue. There's no acknowledgment for the interrupts we use. The hardware simply queues a new interrupt each time without waiting. The fixed-size internal queue means that it's possible for us to lose interrupts because we have no back-pressure to the hardware. Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile| 2 +- drivers/gpu/hsa/radeon/kfd_device.c| 1 + drivers/gpu/hsa/radeon/kfd_interrupt.c | 179 + drivers/gpu/hsa/radeon/kfd_priv.h | 18 drivers/gpu/hsa/radeon/kfd_scheduler.h | 3 + 5 files changed, 202 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/hsa/radeon/kfd_interrupt.c diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index 28da10c..5422e6a 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -5,6 +5,6 @@ radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \ - kfd_vidmem.o + kfd_vidmem.o kfd_interrupt.o obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c index 465c822..b2d2861 100644 --- a/drivers/gpu/hsa/radeon/kfd_device.c +++ b/drivers/gpu/hsa/radeon/kfd_device.c @@ -30,6 +30,7 @@ static const struct kfd_device_info bonaire_device_info = { .scheduler_class = radeon_kfd_cik_static_scheduler_class, .max_pasid_bits = 16, + .ih_ring_entry_size = 4 * sizeof(uint32_t) }; struct kfd_deviceid { diff --git a/drivers/gpu/hsa/radeon/kfd_interrupt.c b/drivers/gpu/hsa/radeon/kfd_interrupt.c new file mode 100644 index 000..2179780 --- 
/dev/null +++ b/drivers/gpu/hsa/radeon/kfd_interrupt.c @@ -0,0 +1,179 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/* + * KFD Interrupts. + * + * AMD GPUs deliver interrupts by pushing an interrupt description onto the + * interrupt ring and then sending an interrupt. KGD receives the interrupt + * in ISR and sends us a pointer to each new entry on the interrupt ring. + * + * We generally can't process interrupt-signaled events from ISR, so we call + * out to each interrupt client module (currently only the scheduler) to ask if + * each interrupt is interesting. If they return true, then it requires further + * processing so we copy it to an internal interrupt ring and call each + * interrupt client again from a work-queue. + * + * There's no acknowledgment for the interrupts we use. The hardware simply + * queues a new interrupt each time without waiting. 
+ * + * The fixed-size internal queue means that it's possible for us to lose + * interrupts because we have no back-pressure to the hardware. + */ + +#include linux/slab.h +#include linux/device.h +#include kfd_priv.h +#include kfd_scheduler.h + +#define KFD_INTERRUPT_RING_SIZE 256 + +static void interrupt_wq(struct work_struct *); + +int +radeon_kfd_interrupt_init(struct kfd_dev *kfd) +{ + void *interrupt_ring = kmalloc_array(KFD_INTERRUPT_RING_SIZE, + kfd-device_info-ih_ring_entry_size, + GFP_KERNEL); + if (!interrupt_ring) + return -ENOMEM; + + kfd-interrupt_ring = interrupt_ring; + kfd-interrupt_ring_size = + KFD_INTERRUPT_RING_SIZE * kfd-device_info-ih_ring_entry_size; + atomic_set(kfd
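The point about losing interrupts follows directly from using a fixed-size ring with no back-pressure: when the ring is full, the producer can only drop the new entry. The sketch below illustrates that behaviour in user space. It is an assumption-laden illustration, not the module's code: struct irq_ring, ring_enqueue and ring_dequeue are hypothetical names, the ring is deliberately tiny (the patch uses 256 entries), and the real module is expected to need atomic read/write pointers because producer and consumer run in different contexts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_ENTRIES 4				/* tiny on purpose */
#define ENTRY_SIZE   (4 * sizeof(uint32_t))	/* as ih_ring_entry_size */

struct irq_ring {
	uint8_t data[RING_ENTRIES * ENTRY_SIZE];
	unsigned int rptr;	/* next entry to read */
	unsigned int wptr;	/* next entry to write */
};

/*
 * Producer side (ISR context in a real driver). Returns false when the
 * ring is full, in which case the entry is simply dropped.
 */
static bool ring_enqueue(struct irq_ring *r, const void *entry)
{
	unsigned int next = (r->wptr + 1) % RING_ENTRIES;

	assert(r && entry);
	if (next == r->rptr)
		return false;	/* full: one slot is always kept empty */

	memcpy(&r->data[r->wptr * ENTRY_SIZE], entry, ENTRY_SIZE);
	r->wptr = next;
	return true;
}

/* Consumer side (workqueue context). Returns false when the ring is empty. */
static bool ring_dequeue(struct irq_ring *r, void *entry)
{
	assert(r && entry);
	if (r->rptr == r->wptr)
		return false;

	memcpy(entry, &r->data[r->rptr * ENTRY_SIZE], ENTRY_SIZE);
	r->rptr = (r->rptr + 1) % RING_ENTRIES;
	return true;
}
```

Keeping one slot empty lets full and empty be told apart using only the two pointers, at the cost of one entry of capacity.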
[PATCH 17/83] hsa/radeon: Handle deactivation of queues using interrupts
This patch modifies the scheduler code to use interrupts to handle the
deactivation of queues. We prefer interrupts because deactivation can
take a long time: we need to wait for the wavefronts to finish executing
before the queue is deactivated.

There is an array of wait queues, one per pipe; each entry serves the
queues of that pipe. When a queue should be deactivated, the
deactivating thread sleeps on the wait queue of that queue's pipe. The
event that wakes the wait queue is a dequeue-complete interrupt that
arrives through the isr function of the scheduler.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/cik_regs.h             |  1 +
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 45 +--
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
index ef1d7ab..9c3ce97 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -166,6 +166,7 @@
 #define CP_HQD_DEQUEUE_REQUEST		0xC974
 #define	DEQUEUE_REQUEST_DRAIN		1
+#define	DEQUEUE_INT			(1U << 8)

 #define CP_HQD_SEMA_CMD			0xC97Cu
 #define CP_HQD_MSG_TYPE			0xC980u
diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index f86f958..5d42e88 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -139,6 +139,13 @@ struct cik_static_private {
 	/* Queue q on pipe p is at bit QUEUES_PER_PIPE * p + q. */
 	unsigned long free_queues[DIV_ROUND_UP(CIK_MAX_PIPES * CIK_QUEUES_PER_PIPE, BITS_PER_LONG)];

+	/*
+	 * Dequeue waits for waves to finish so it could take a long time. We
+	 * defer through an interrupt. dequeue_wait is woken when a dequeue-
+	 * complete interrupt comes for that pipe.
+	 */
+	wait_queue_head_t dequeue_wait[CIK_MAX_PIPES];
+
 	kfd_mem_obj hpd_mem;	/* Single allocation for HPDs for all KFD pipes. */
 	kfd_mem_obj mqd_mem;	/* Single allocation for all MQDs for all KFD
 				 * pipes. This is actually struct cik_mqd_padded. */
@@ -411,6 +418,9 @@ static int cik_static_create(struct kfd_dev *dev, struct kfd_scheduler **schedul

 	priv->free_vmid_mask = dev->shared_resources.compute_vmid_bitmap;

+	for (i = 0; i < priv->num_pipes; i++)
+		init_waitqueue_head(&priv->dequeue_wait[i]);
+
 	/*
 	 * Allocate memory for the HPDs. This is hardware-owned per-pipe data.
 	 * The driver never accesses this memory after zeroing it. It doesn't even have
@@ -712,15 +722,18 @@ static void activate_queue(struct cik_static_private *priv, struct cik_static_qu
 	unlock_srbm_index(priv);
 }

-static void drain_hqd(struct cik_static_private *priv)
+static bool queue_inactive(struct cik_static_private *priv, struct cik_static_queue *queue)
 {
-	WRITE_REG(priv->dev, CP_HQD_DEQUEUE_REQUEST, DEQUEUE_REQUEST_DRAIN);
-}
+	bool inactive;

-static void wait_hqd_inactive(struct cik_static_private *priv)
-{
-	while (READ_REG(priv->dev, CP_HQD_ACTIVE) != 0)
-		cpu_relax();
+	lock_srbm_index(priv);
+	queue_select(priv, queue->queue);
+
+	inactive = (READ_REG(priv->dev, CP_HQD_ACTIVE) == 0);
+
+	unlock_srbm_index(priv);
+
+	return inactive;
 }

 static void deactivate_queue(struct cik_static_private *priv, struct cik_static_queue *queue)
@@ -728,10 +741,12 @@ static void deactivate_queue(struct cik_static_private *priv, struct cik_static_
 	lock_srbm_index(priv);
 	queue_select(priv, queue->queue);

-	drain_hqd(priv);
-	wait_hqd_inactive(priv);
+	WRITE_REG(priv->dev, CP_HQD_DEQUEUE_REQUEST, DEQUEUE_REQUEST_DRAIN | DEQUEUE_INT);

 	unlock_srbm_index(priv);
+
+	wait_event(priv->dequeue_wait[queue->queue/CIK_QUEUES_PER_PIPE],
+		   queue_inactive(priv, queue));
 }

 #define BIT_MASK_64(high, low) (((1ULL << (high)) - 1) & ~((1ULL << (low)) - 1))
@@ -791,6 +806,14 @@ cik_static_destroy_queue(struct kfd_scheduler *scheduler, struct kfd_scheduler_q
 	release_hqd(priv, hwq->queue);
 }

+static void
+dequeue_int_received(struct cik_static_private *priv, uint32_t pipe_id)
+{
+	/* The waiting threads will check CP_HQD_ACTIVE to see whether their
+	 * queue completed. */
+	wake_up_all(&priv->dequeue_wait[pipe_id]);
+}
+
 /* Figure out the KFD compute pipe ID for an interrupt ring entry.
  * Returns true if it's a KFD compute pipe, false otherwise. */
 static bool int_compute_pipe(const struct cik_static_private *priv,
@@ -829,6 +852,10 @@ cik_static_interrupt_isr(struct kfd_scheduler *scheduler, const void *ih_ring_en
 	ihre->source_id, ihre->data, pipe_id, ihre->vmid
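The wait queue that deactivate_queue() sleeps on is picked by integer division of the flat queue index, which is the inverse of the "QUEUES_PER_PIPE * p + q" layout described in the free_queues comment. A minimal user-space sketch of that arithmetic follows. It assumes 8 queues per pipe (the figure given in a driver comment elsewhere in this series), and the helper names are hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: the series' comments describe 8 queues per pipe. */
#define QUEUES_PER_PIPE 8u

/* Flat index of queue q on pipe p (layout from the free_queues comment). */
static inline uint32_t flat_queue_index(uint32_t pipe, uint32_t q)
{
	assert(q < QUEUES_PER_PIPE);
	return QUEUES_PER_PIPE * pipe + q;
}

/* Pipe whose wait queue a given flat queue index would sleep on. */
static inline uint32_t queue_to_pipe(uint32_t flat)
{
	return flat / QUEUES_PER_PIPE;
}
```

Because every queue of a pipe shares one wait queue, a wake-up may reach threads whose own queue is still active, which is why the patch re-checks CP_HQD_ACTIVE per queue in the wait condition.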
[PATCH 16/83] hsa/radeon: Add the isr function of the KFD scheduler
This patch adds the isr function to the KFD scheduler code. This
function is called from the kgd2kfd_interrupt function, which runs in
interrupt context. The purpose of the isr function is to determine
whether the interrupt that arrived is interesting, i.e. whether some
action needs to be taken.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/cik_int.h              | 50
 drivers/gpu/hsa/radeon/cik_regs.h             |  2 +
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 56 +++
 3 files changed, 108 insertions(+)
 create mode 100644 drivers/gpu/hsa/radeon/cik_int.h

diff --git a/drivers/gpu/hsa/radeon/cik_int.h b/drivers/gpu/hsa/radeon/cik_int.h
new file mode 100644
index 0000000..e98551d
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/cik_int.h
@@ -0,0 +1,50 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef HSA_RADEON_CIK_INT_H_INCLUDED
+#define HSA_RADEON_CIK_INT_H_INCLUDED
+
+#include <linux/types.h>
+
+struct cik_ih_ring_entry {
+	uint32_t source_id : 8;
+	uint32_t reserved1 : 8;
+	uint32_t reserved2 : 16;
+
+	uint32_t data : 28;
+	uint32_t reserved3 : 4;
+
+	/* pipeid, meid and unused3 are officially called RINGID,
+	 * but for our purposes, they always decode into pipe and ME. */
+	uint32_t pipeid : 2;
+	uint32_t meid : 2;
+	uint32_t reserved4 : 4;
+	uint32_t vmid : 8;
+	uint32_t pasid : 16;
+
+	uint32_t reserved5;
+};
+
+#define CIK_INTSRC_DEQUEUE_COMPLETE	0xC6
+
+#endif
+
diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
index d0cdc57..ef1d7ab 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -23,6 +23,8 @@
 #ifndef CIK_REGS_H
 #define CIK_REGS_H

+#define IH_VMID_0_LUT			0x3D40u
+
 #define BIF_DOORBELL_CNTL		0x530Cu

 #define	SRBM_GFX_CNTL			0xE44
diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index b986ff9..f86f958 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -25,9 +25,12 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/uaccess.h>
+#include <linux/device.h>
+#include <linux/sched.h>
 #include "kfd_priv.h"
 #include "kfd_scheduler.h"
 #include "cik_regs.h"
+#include "cik_int.h"

 /* CIK CP hardware is arranged with 8 queues per pipe and 8 pipes per MEC (microengine for compute).
  * The first MEC is ME 1 with the GFX ME as ME 0.
@@ -273,6 +276,8 @@ static void set_vmid_pasid_mapping(struct cik_static_private *priv, unsigned int
 	while (!(READ_REG(priv->dev, ATC_VMID_PASID_MAPPING_UPDATE_STATUS) & (1U << vmid)))
 		cpu_relax();
 	WRITE_REG(priv->dev, ATC_VMID_PASID_MAPPING_UPDATE_STATUS, 1U << vmid);
+
+	WRITE_REG(priv->dev, IH_VMID_0_LUT + vmid*sizeof(uint32_t), pasid);
 }

 static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
@@ -786,6 +791,54 @@ cik_static_destroy_queue(struct kfd_scheduler *scheduler, struct kfd_scheduler_q
 	release_hqd(priv, hwq->queue);
 }

+/* Figure out the KFD compute pipe ID for an interrupt ring entry.
+ * Returns true if it's a KFD compute pipe, false otherwise. */
+static bool int_compute_pipe(const struct cik_static_private *priv,
+			     const struct cik_ih_ring_entry *ih_ring_entry,
+			     uint32_t *kfd_pipe)
+{
+	uint32_t pipe_id;
+
+	if (ih_ring_entry->meid == 0) /* Ignore graphics interrupts - compute only. */
+		return false;
+
+	pipe_id = (ih_ring_entry->meid - 1) * CIK_PIPES_PER_MEC + ih_ring_entry->pipeid
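The visible part of int_compute_pipe() flattens the (ME, pipe) pair from the ring entry into one compute pipe index, skipping ME 0 because that is the graphics engine. A stand-alone sketch of just that step follows. The struct and function names are hypothetical, the number of pipes per MEC is passed in because its value is not shown in this excerpt, and whatever checks the real function performs after this line are not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two RINGID fields the patch reads. */
struct ring_id_fields {
	uint32_t pipeid; /* two bits in the ring entry */
	uint32_t meid;   /* 0 = graphics ME, 1 and up = compute MECs */
};

/* Returns true and stores a flat compute pipe index, or returns false
 * (leaving *out untouched) for a graphics interrupt. */
static bool flat_compute_pipe(struct ring_id_fields f, uint32_t pipes_per_mec,
			      uint32_t *out)
{
	assert(out != 0);
	if (f.meid == 0)
		return false;
	*out = (f.meid - 1) * pipes_per_mec + f.pipeid;
	return true;
}
```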
[PATCH 25/83] hsa/radeon: fix the OEMID assignment in kfd_topology
From: Evgeny Pinchuk <evgeny.pinc...@amd.com>

The OEMID from the CRAT table is assigned into a 64-bit variable, but
the OEMID is only 48 bits wide in the CRAT. This fix makes sure that
only 48 bits are assigned for the OEMID value from the CRAT table.

Signed-off-by: Evgeny Pinchuk <evgeny.pinc...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_crat.h     | 2 ++
 drivers/gpu/hsa/radeon/kfd_topology.c | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_crat.h b/drivers/gpu/hsa/radeon/kfd_crat.h
index 587455d..a374fa3 100644
--- a/drivers/gpu/hsa/radeon/kfd_crat.h
+++ b/drivers/gpu/hsa/radeon/kfd_crat.h
@@ -42,6 +42,8 @@
 #define CRAT_OEMTABLEID_LENGTH	8
 #define CRAT_RESERVED_LENGTH	6

+#define CRAT_OEMID_64BIT_MASK ((1ULL << (CRAT_OEMID_LENGTH * 8)) - 1)
+
 struct crat_header {
 	uint32_t	signature;
 	uint32_t	length;
diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index 6acac25..2ee5444 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -467,10 +467,10 @@ static int kfd_parse_crat_table(void *crat_image)
 		if (!top_dev) {
 			kfd_release_live_view();
 			return -ENOMEM;
+		}
 	}
-}

-	sys_props.platform_id = *((uint64_t *)crat_table->oem_id);
+	sys_props.platform_id = (*((uint64_t *)crat_table->oem_id)) & CRAT_OEMID_64BIT_MASK;
 	sys_props.platform_oem = *((uint64_t *)crat_table->oem_table_id);
 	sys_props.platform_rev = crat_table->revision;

--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
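The mask in this patch keeps the low 48 bits of a 64-bit load. The sketch below shows the same arithmetic in user space, under two assumptions that are mine rather than the patch's: the host is little-endian, so the first six bytes of the field land in the low 48 bits, and the buffer has at least eight readable bytes. The helper names are hypothetical, and memcpy replaces the pointer cast only to avoid alignment concerns in a portable example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OEMID_LENGTH 6 /* bytes, i.e. 48 bits as the commit message states */

/* Mask covering the low nbytes bytes of a 64-bit value. */
static uint64_t low_bytes_mask(unsigned int nbytes)
{
	assert(nbytes <= 8);
	if (nbytes == 8)
		return UINT64_MAX; /* a 64-bit shift by 64 would be undefined */
	return (1ULL << (nbytes * 8)) - 1;
}

/* Read an OEMID from a buffer holding at least 8 readable bytes. */
static uint64_t read_oemid(const unsigned char *buf)
{
	uint64_t raw;

	memcpy(&raw, buf, sizeof(raw)); /* no alignment assumption */
	return raw & low_bytes_mask(OEMID_LENGTH);
}
```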
[PATCH 27/83] hsa/radeon: Implement hsaKmtSetMemoryPolicy
From: Andrew Lewycky <andrew.lewy...@amd.com>

This patch adds support in KFD for the hsaKmtSetMemoryPolicy HSA thunk
API call.

Signed-off-by: Andrew Lewycky <andrew.lewy...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/cik_regs.h             |  1 +
 drivers/gpu/hsa/radeon/kfd_chardev.c          | 59 +
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 91 +--
 drivers/gpu/hsa/radeon/kfd_scheduler.h        | 12
 include/uapi/linux/kfd_ioctl.h                | 13
 5 files changed, 172 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
index 813cdc4..93f7b34 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -54,6 +54,7 @@
 #define	APE1_MTYPE(x)				((x) << 7)

 /* valid for both DEFAULT_MTYPE and APE1_MTYPE */
+#define	MTYPE_CACHED				0
 #define	MTYPE_NONCACHED				3

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index e0b276d..ddaf357 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -231,6 +231,61 @@ kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p, void __user *a
 }

 static long
+kfd_ioctl_set_memory_policy(struct file *filep, struct kfd_process *p, void __user *arg)
+{
+	struct kfd_ioctl_set_memory_policy_args args;
+	struct kfd_dev *dev;
+	int err = 0;
+	struct kfd_process_device *pdd;
+	enum cache_policy default_policy, alternate_policy;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	if (args.default_policy != KFD_IOC_CACHE_POLICY_COHERENT
+	    && args.default_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+		return -EINVAL;
+	}
+
+	if (args.alternate_policy != KFD_IOC_CACHE_POLICY_COHERENT
+	    && args.alternate_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+		return -EINVAL;
+	}
+
+	dev = radeon_kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	mutex_lock(&p->mutex);
+
+	pdd = radeon_kfd_bind_process_to_device(dev, p);
+	if (IS_ERR(pdd) < 0) {
+		err = PTR_ERR(pdd);
+		goto out;
+	}
+
+	default_policy = (args.default_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+			 ? cache_policy_coherent : cache_policy_noncoherent;
+
+	alternate_policy = (args.alternate_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+			   ? cache_policy_coherent : cache_policy_noncoherent;
+
+	if (!dev->device_info->scheduler_class->set_cache_policy(dev->scheduler,
+								 pdd->scheduler_process,
+								 default_policy,
+								 alternate_policy,
+								 (void __user *)args.alternate_aperture_base,
+								 args.alternate_aperture_size))
+		err = -EINVAL;
+
+out:
+	mutex_unlock(&p->mutex);
+
+	return err;
+}
+
+
+static long
 kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct kfd_process *process;
@@ -253,6 +308,10 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 		err = kfd_ioctl_destroy_queue(filep, process, (void __user *)arg);
 		break;

+	case KFD_IOC_SET_MEMORY_POLICY:
+		err = kfd_ioctl_set_memory_policy(filep, process, (void __user *)arg);
+		break;
+
 	default:
 		dev_err(kfd_device,
 			"unknown ioctl cmd 0x%x, arg 0x%lx)\n",
diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 9add5e5..3c3e7d6 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -162,6 +162,10 @@ struct cik_static_private {
 struct cik_static_process {
 	unsigned int vmid;
 	pasid_t pasid;
+
+	uint32_t sh_mem_config;
+	uint32_t ape1_base;
+	uint32_t ape1_limit;
 };

 struct cik_static_queue {
@@ -346,6 +350,7 @@ static void init_ats(struct cik_static_private *priv)

 	sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
 	sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
+	sh_mem_config |= APE1_MTYPE(MTYPE_NONCACHED);

 	WRITE_REG(priv->dev, SH_MEM_CONFIG, sh_mem_config);

@@ -562,14 +567,26 @@ static void release_vmid(struct cik_static_private *priv, unsigned int vmid
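The ioctl above rejects any policy value it does not recognise before it touches the device, then maps the user-facing constant onto the scheduler's enum. A compact sketch of that validate-then-translate step is shown here. The numeric values of the ioctl constants are hypothetical, since the uapi header hunk is not part of this excerpt, and translate_policy is an illustrative name rather than a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical numeric values; the real ones live in kfd_ioctl.h. */
enum { IOC_POLICY_COHERENT = 0, IOC_POLICY_NONCOHERENT = 1 };

enum cache_policy { cache_policy_coherent, cache_policy_noncoherent };

/* Translate a user-supplied policy; returns false (and leaves *out
 * untouched) for anything unrecognised. */
static bool translate_policy(uint32_t user_value, enum cache_policy *out)
{
	assert(out != 0);
	switch (user_value) {
	case IOC_POLICY_COHERENT:
		*out = cache_policy_coherent;
		return true;
	case IOC_POLICY_NONCOHERENT:
		*out = cache_policy_noncoherent;
		return true;
	default:
		return false;
	}
}
```

Folding the check and the mapping into one switch keeps the two from drifting apart if a third policy is ever added; the patch itself keeps them as separate steps.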
[PATCH 26/83] hsa/radeon: Make binding of process to device permanent
From: Andrew Lewycky <andrew.lewy...@amd.com>

Permanently bind the process to the device. The binding survives even
when all queues are destroyed. Process exit and device removal terminate
the binding.

Signed-off-by: Andrew Lewycky <andrew.lewy...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 27 +++
 drivers/gpu/hsa/radeon/kfd_priv.h    |  3 ---
 drivers/gpu/hsa/radeon/kfd_process.c | 21 ++---
 3 files changed, 13 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 4e7d5d0..e0b276d 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -141,20 +141,13 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a
 	pdd = radeon_kfd_bind_process_to_device(dev, p);
 	if (IS_ERR(pdd) < 0) {
 		err = PTR_ERR(pdd);
-		goto err_bind_pasid;
+		goto err_bind_process;
 	}

-	pr_debug("kfd: creating queue number %d for PASID %d on GPU 0x%x\n",
-			pdd->queue_count,
+	pr_debug("kfd: creating queue for PASID %d on GPU 0x%x\n",
 			p->pasid,
 			dev->id);

-	if (pdd->queue_count++ == 0) {
-		err = dev->device_info->scheduler_class->register_process(dev->scheduler, p, &pdd->scheduler_process);
-		if (err < 0)
-			goto err_register_process;
-	}
-
 	if (!radeon_kfd_allocate_queue_id(p, &queue_id))
 		goto err_allocate_queue_id;

@@ -198,12 +191,7 @@ err_copy_args_out:
 err_create_queue:
 	radeon_kfd_remove_queue(p, queue_id);
 err_allocate_queue_id:
-	if (--pdd->queue_count == 0) {
-		dev->device_info->scheduler_class->deregister_process(dev->scheduler, pdd->scheduler_process);
-		pdd->scheduler_process = NULL;
-	}
-err_register_process:
-err_bind_pasid:
+err_bind_process:
 	kfree(queue);
 	mutex_unlock(&p->mutex);
 	return err;
@@ -215,7 +203,6 @@ kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p, void __user *a
 	struct kfd_ioctl_destroy_queue_args args;
 	struct kfd_queue *queue;
 	struct kfd_dev *dev;
-	struct kfd_process_device *pdd;

 	if (copy_from_user(&args, arg, sizeof(args)))
 		return -EFAULT;
@@ -239,14 +226,6 @@ kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p, void __user *a

 	kfree(queue);

-	pdd = radeon_kfd_get_process_device_data(dev, p);
-	BUG_ON(pdd == NULL); /* Because a queue exists. */
-
-	if (--pdd->queue_count == 0) {
-		dev->device_info->scheduler_class->deregister_process(dev->scheduler, pdd->scheduler_process);
-		pdd->scheduler_process = NULL;
-	}
-
 	mutex_unlock(&p->mutex);
 	return 0;
 }
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index 630d690..bca9cce 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -166,9 +166,6 @@ struct kfd_process_device {
 	/* The user-mode address of the doorbell mapping for this device. */
 	doorbell_t __user *doorbell_mapping;

-	/* The number of queues created by this process for this device. */
-	uint32_t queue_count;
-
 	/* Scheduler process data for this device. */
 	struct kfd_scheduler_process *scheduler_process;

diff --git a/drivers/gpu/hsa/radeon/kfd_process.c b/drivers/gpu/hsa/radeon/kfd_process.c
index 145ee38..f89f855 100644
--- a/drivers/gpu/hsa/radeon/kfd_process.c
+++ b/drivers/gpu/hsa/radeon/kfd_process.c
@@ -120,15 +120,6 @@ destroy_queues(struct kfd_process *p, struct kfd_dev *dev_filter)
 				dev->device_info->scheduler_class->destroy_queue(dev->scheduler, queue->scheduler_queue);

 				kfree(queue);
-
-				BUG_ON(pdd->queue_count == 0);
-				BUG_ON(pdd->scheduler_process == NULL);
-
-				if (--pdd->queue_count == 0) {
-					dev->device_info->scheduler_class->deregister_process(dev->scheduler,
-							pdd->scheduler_process);
-					pdd->scheduler_process = NULL;
-				}
 			}
 		}
 	}
@@ -144,6 +135,8 @@ static void free_process(struct kfd_process *p)
 	/* doorbell mappings: automatic */

 	list_for_each_entry_safe(pdd, temp, &p->per_device_data, per_device_list) {
+		pdd->dev->device_info->scheduler_class->deregister_process(pdd->dev->scheduler, pdd->scheduler_process);
+		pdd->scheduler_process = NULL;
 		amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
 		list_del(&pdd->per_device_list);
 		kfree(pdd);
@@ -255,6 +248,12 @@ struct
[PATCH 22/83] drm/radeon: Add calls to suspend and resume of kfd driver
The radeon driver can suspend and resume its device. For each device it
suspends or resumes, it should inform the kfd, so that the kfd can
perform the relevant actions for that device.

This patch adds the calls to kfd's suspend and resume functions. The
device is passed as an argument.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/drm/radeon/cik.c        |  7 +++
 drivers/gpu/drm/radeon/radeon_kfd.c | 16
 2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index e0c8052..b1c50f4 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -138,6 +138,8 @@ static void cik_fini_pg(struct radeon_device *rdev);
 static void cik_fini_cg(struct radeon_device *rdev);
 static void cik_enable_gui_idle_interrupt(struct radeon_device *rdev,
 					  bool enable);
+extern void radeon_kfd_suspend(struct radeon_device *rdev);
+extern int radeon_kfd_resume(struct radeon_device *rdev);

 /* get temperature in millidegrees */
 int ci_get_temp(struct radeon_device *rdev)
@@ -8429,6 +8431,10 @@ static int cik_startup(struct radeon_device *rdev)
 	if (r)
 		return r;

+	r = radeon_kfd_resume(rdev);
+	if (r)
+		return r;
+
 	return 0;
 }

@@ -8477,6 +8483,7 @@ int cik_resume(struct radeon_device *rdev)
  */
 int cik_suspend(struct radeon_device *rdev)
 {
+	radeon_kfd_suspend(rdev);
 	radeon_pm_suspend(rdev);
 	dce6_audio_fini(rdev);
 	radeon_vm_manager_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 594020e..e3af85b 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -124,6 +124,22 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
 	}
 }

+void radeon_kfd_suspend(struct radeon_device *rdev)
+{
+	if (rdev->kfd)
+		kgd2kfd->suspend(rdev->kfd);
+}
+
+int radeon_kfd_resume(struct radeon_device *rdev)
+{
+	int r = 0;
+
+	if (rdev->kfd)
+		r = kgd2kfd->resume(rdev->kfd);
+
+	return r;
+}
+
 static u32 pool_to_domain(enum kgd_memory_pool p)
 {
 	switch (p) {
--
1.9.1
[PATCH 21/83] hsa/radeon: Add kgd-->kfd interfaces for suspend and resume
This patch adds two new interfaces to the kgd2kfd structure. These
interfaces suspend and resume a kfd device when its matching radeon
device is suspended or resumed.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/Makefile     |  2 +-
 drivers/gpu/hsa/radeon/kfd_module.c |  2 ++
 drivers/gpu/hsa/radeon/kfd_pm.c     | 43 +
 drivers/gpu/hsa/radeon/kfd_priv.h   |  4
 include/linux/radeon_kfd.h          |  2 ++
 5 files changed, 52 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_pm.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 5422e6a..935f9b7 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -5,6 +5,6 @@
 radeon_kfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o \
 		kfd_pasid.o kfd_topology.o kfd_process.o \
 		kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
-		kfd_vidmem.o kfd_interrupt.o
+		kfd_vidmem.o kfd_interrupt.o kfd_pm.o

 obj-$(CONFIG_HSA_RADEON)	+= radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index ad21c6d..a03743a 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -39,6 +39,8 @@ static const struct kgd2kfd_calls kgd2kfd = {
 	.device_init	= kgd2kfd_device_init,
 	.device_exit	= kgd2kfd_device_exit,
 	.interrupt	= kgd2kfd_interrupt,
+	.suspend	= kgd2kfd_suspend,
+	.resume		= kgd2kfd_resume,
 };

 bool kgd2kfd_init(unsigned interface_version,
diff --git a/drivers/gpu/hsa/radeon/kfd_pm.c b/drivers/gpu/hsa/radeon/kfd_pm.c
new file mode 100644
index 0000000..783311f
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_pm.c
@@ -0,0 +1,43 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Oded Gabbay
+ */
+
+#include <linux/device.h>
+#include "kfd_priv.h"
+#include "kfd_scheduler.h"
+
+void kgd2kfd_suspend(struct kfd_dev *kfd)
+{
+	BUG_ON(kfd == NULL);
+
+	kfd->device_info->scheduler_class->stop(kfd->scheduler);
+}
+
+int kgd2kfd_resume(struct kfd_dev *kfd)
+{
+	BUG_ON(kfd == NULL);
+
+	kfd->device_info->scheduler_class->start(kfd->scheduler);
+
+	return 0;
+}
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index 5b6611f..630d690 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -247,4 +247,8 @@ int radeon_kfd_interrupt_init(struct kfd_dev *dev);
 void radeon_kfd_interrupt_exit(struct kfd_dev *dev);
 void kgd2kfd_interrupt(struct kfd_dev *dev, const void *ih_ring_entry);

+/* Power Management */
+void kgd2kfd_suspend(struct kfd_dev *dev);
+int kgd2kfd_resume(struct kfd_dev *dev);
+
 #endif
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 2f4f7c0..63b7bac 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -63,6 +63,8 @@ struct kgd2kfd_calls {
 	bool (*device_init)(struct kfd_dev *kfd, const struct kgd2kfd_shared_resources *gpu_resources);
 	void (*device_exit)(struct kfd_dev *kfd);
 	void (*interrupt)(struct kfd_dev *kfd, const void *ih_ring_entry);
+	void (*suspend)(struct kfd_dev *kfd);
+	int (*resume)(struct kfd_dev *kfd);
 };

 struct kfd2kgd_calls {
--
1.9.1
[PATCH 24/83] drm/radeon/cik: Call kfd isr function
When radeon handles interrupts for cik, propagate each interrupt to kfd.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/drm/radeon/cik.c        | 4
 drivers/gpu/drm/radeon/radeon_kfd.c | 6 ++
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 803d0cb..6f4999a 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -140,6 +140,7 @@ static void cik_enable_gui_idle_interrupt(struct radeon_device *rdev,
 					  bool enable);
 extern void radeon_kfd_suspend(struct radeon_device *rdev);
 extern int radeon_kfd_resume(struct radeon_device *rdev);
+extern void radeon_kfd_interrupt(struct radeon_device *rdev, const void *ih_ring_entry);

 /* get temperature in millidegrees */
 int ci_get_temp(struct radeon_device *rdev)
@@ -7703,6 +7704,9 @@ restart_ih:
 	while (rptr != wptr) {
 		/* wptr/rptr are in bytes! */
 		ring_index = rptr / 4;
+
+		radeon_kfd_interrupt(rdev, (const void *) &rdev->ih.ring[ring_index]);
+
 		src_id = le32_to_cpu(rdev->ih.ring[ring_index]) & 0xff;
 		src_data = le32_to_cpu(rdev->ih.ring[ring_index + 1]) & 0xfffffff;
 		ring_id = le32_to_cpu(rdev->ih.ring[ring_index + 2]) & 0xff;
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index e3af85b..f4cc3c5 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -124,6 +124,12 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
 	}
 }

+void radeon_kfd_interrupt(struct radeon_device *rdev, const void *ih_ring_entry)
+{
+	if (rdev->kfd)
+		kgd2kfd->interrupt(rdev->kfd, ih_ring_entry);
+}
+
 void radeon_kfd_suspend(struct radeon_device *rdev)
 {
 	if (rdev->kfd)
--
1.9.1
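The context lines of the cik.c hunk decode a ring entry by masking individual dwords, while kfd receives a pointer to the same dwords and reads them through the cik_ih_ring_entry bitfields from patch 16. The sketch below mirrors the mask-based decode in user space. The struct and function names are hypothetical, the input is assumed to be already in CPU byte order (the kernel code uses le32_to_cpu for that), and the 28-bit data mask is chosen to match the "data : 28" bitfield.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three fields the loop extracts. */
struct ih_fields {
	uint32_t src_id;
	uint32_t src_data;
	uint32_t ring_id;
};

/* entry points at the dwords of one ring entry, in CPU byte order. */
static struct ih_fields decode_ih_entry(const uint32_t *entry)
{
	struct ih_fields f;

	assert(entry != 0);
	f.src_id = entry[0] & 0xffu;        /* low 8 bits of dword 0 */
	f.src_data = entry[1] & 0xfffffffu; /* low 28 bits of dword 1 */
	f.ring_id = entry[2] & 0xffu;       /* low 8 bits of dword 2 */
	return f;
}
```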
[PATCH 28/83] mm: Change timing of notification to IOMMUs about a page to be invalidated
From: Andrew Lewycky <andrew.lewy...@amd.com>

This patch changes the location of the mmu_notifier_invalidate_page
function call inside try_to_unmap_one. The mmu_notifier_invalidate_page
function call tells the IOMMU that a page should be invalidated.

The location is changed from after releasing the physical page to before
releasing the physical page.

This change should prevent the bug that would occur in the (rare) case
where the GPU attempts to access a page while the CPU attempts to swap
out that page (or discard it if it is not dirty).

Signed-off-by: Andrew Lewycky <andrew.lewy...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 mm/rmap.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 196cd0c..73d4c3d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1231,13 +1231,17 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	} else
 		dec_mm_counter(mm, MM_FILEPAGES);

+	pte_unmap_unlock(pte, ptl);
+
+	mmu_notifier_invalidate_page(vma, address, event);
+
 	page_remove_rmap(page);
 	page_cache_release(page);

+	return ret;
+
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
-	if (ret != SWAP_FAIL && !(flags & TTU_MUNLOCK))
-		mmu_notifier_invalidate_page(vma, address, event);
 out:
 	return ret;

--
1.9.1
[PATCH 23/83] drm/radeon/cik: Don't touch int of pipes 1-7
HSA radeon driver (kfd) should set interrupts for pipes 1-7.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/drm/radeon/cik.c | 71 +---
 1 file changed, 1 insertion(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index b1c50f4..803d0cb 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -7272,8 +7272,7 @@ static int cik_irq_init(struct radeon_device *rdev)
 int cik_irq_set(struct radeon_device *rdev)
 {
 	u32 cp_int_cntl;
-	u32 cp_m1p0, cp_m1p1, cp_m1p2, cp_m1p3;
-	u32 cp_m2p0, cp_m2p1, cp_m2p2, cp_m2p3;
+	u32 cp_m1p0;
 	u32 crtc1 = 0, crtc2 = 0, crtc3 = 0, crtc4 = 0, crtc5 = 0, crtc6 = 0;
 	u32 hpd1, hpd2, hpd3, hpd4, hpd5, hpd6;
 	u32 grbm_int_cntl = 0;
@@ -7307,13 +7306,6 @@ int cik_irq_set(struct radeon_device *rdev)
 	dma_cntl1 = RREG32(SDMA0_CNTL + SDMA1_REGISTER_OFFSET) & ~TRAP_ENABLE;

 	cp_m1p0 = RREG32(CP_ME1_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-	cp_m1p1 = RREG32(CP_ME1_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-	cp_m1p2 = RREG32(CP_ME1_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-	cp_m1p3 = RREG32(CP_ME1_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-	cp_m2p0 = RREG32(CP_ME2_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-	cp_m2p1 = RREG32(CP_ME2_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-	cp_m2p2 = RREG32(CP_ME2_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-	cp_m2p3 = RREG32(CP_ME2_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;

 	if (rdev->flags & RADEON_IS_IGP)
 		thermal_int = RREG32_SMC(CG_THERMAL_INT_CTRL) &
@@ -7335,33 +7327,6 @@ int cik_irq_set(struct radeon_device *rdev)
 			case 0:
 				cp_m1p0 |= TIME_STAMP_INT_ENABLE;
 				break;
-			case 1:
-				cp_m1p1 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 2:
-				cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 3:
-				cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-				break;
-			default:
-				DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe %d\n", ring->pipe);
-				break;
-			}
-		} else if (ring->me == 2) {
-			switch (ring->pipe) {
-			case 0:
-				cp_m2p0 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 1:
-				cp_m2p1 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 2:
-				cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 3:
-				cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-				break;
 			default:
 				DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe %d\n", ring->pipe);
 				break;
@@ -7378,33 +7343,6 @@ int cik_irq_set(struct radeon_device *rdev)
 			case 0:
 				cp_m1p0 |= TIME_STAMP_INT_ENABLE;
 				break;
-			case 1:
-				cp_m1p1 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 2:
-				cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 3:
-				cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-				break;
-			default:
-				DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe %d\n", ring->pipe);
-				break;
-			}
-		} else if (ring->me == 2) {
-			switch (ring->pipe) {
-			case 0:
-				cp_m2p0 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 1:
-				cp_m2p1 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 2:
-				cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-				break;
-			case 3:
-				cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-				break;
 			default:
 				DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe %d\n", ring->pipe);
 				break;
@@ -7487,13 +7425,6 @@ int cik_irq_set(struct radeon_device *rdev)
 	WREG32(SDMA0_CNTL
[PATCH 18/83] hsa/radeon: Enable interrupts in KFD scheduler
This patch enables the use of interrupts in the KFD scheduler when the
scheduler performs its initialization. It also disables the interrupts
when the scheduler stops its work.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 28 +++
 1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 5d42e88..9add5e5 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -486,6 +486,32 @@ static void cik_static_destroy(struct kfd_scheduler *scheduler)
 	kfree(priv);
 }

+static void
+enable_interrupts(struct cik_static_private *priv)
+{
+	unsigned int i;
+
+	lock_srbm_index(priv);
+	for (i = 0; i < priv->num_pipes; i++) {
+		pipe_select(priv, i);
+		WRITE_REG(priv->dev, CPC_INT_CNTL, DEQUEUE_REQUEST_INT_ENABLE);
+	}
+	unlock_srbm_index(priv);
+}
+
+static void
+disable_interrupts(struct cik_static_private *priv)
+{
+	unsigned int i;
+
+	lock_srbm_index(priv);
+	for (i = 0; i < priv->num_pipes; i++) {
+		pipe_select(priv, i);
+		WRITE_REG(priv->dev, CPC_INT_CNTL, 0);
+	}
+	unlock_srbm_index(priv);
+}
+
 static void cik_static_start(struct kfd_scheduler *scheduler)
 {
 	struct cik_static_private *priv = kfd_scheduler_to_private(scheduler);
@@ -495,6 +521,7 @@ static void cik_static_start(struct kfd_scheduler *scheduler)

 	init_pipes(priv);
 	init_ats(priv);
+	enable_interrupts(priv);
 }

 static void cik_static_stop(struct kfd_scheduler *scheduler)
@@ -502,6 +529,7 @@ static void cik_static_stop(struct kfd_scheduler *scheduler)
 	struct cik_static_private *priv = kfd_scheduler_to_private(scheduler);

 	exit_ats(priv);
+	disable_interrupts(priv);

 	radeon_kfd_vidmem_ungpumap(priv->dev, priv->hpd_mem);
 	radeon_kfd_vidmem_ungpumap(priv->dev, priv->mqd_mem);
--
1.9.1
[PATCH 32/83] hsa/radeon: implementing IOCTL for clock counters
From: Evgeny Pinchuk <evgeny.pinc...@amd.com>

Implemented new IOCTL to query the CPU and GPU clock counters.

Signed-off-by: Evgeny Pinchuk <evgeny.pinc...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 37 
 include/uapi/linux/kfd_ioctl.h       |  9 +
 2 files changed, 46 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index ddaf357..d6fa980 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -28,6 +28,7 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <uapi/linux/kfd_ioctl.h>
+#include <linux/time.h>
 #include "kfd_priv.h"
 #include "kfd_scheduler.h"
 
@@ -284,6 +285,38 @@ out:
 	return err;
 }
 
+static long
+kfd_ioctl_get_clock_counters(struct file *filep, struct kfd_process *p, void __user *arg)
+{
+	struct kfd_ioctl_get_clock_counters_args args;
+	struct kfd_dev *dev;
+	struct timespec time;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	dev = radeon_kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	/* Reading GPU clock counter from KGD */
+	args.gpu_clock_counter = kfd2kgd->get_gpu_clock_counter(dev->kgd);
+
+	/* No access to rdtsc. Using raw monotonic time */
+	getrawmonotonic(&time);
+	args.cpu_clock_counter = time.tv_nsec;
+
+	get_monotonic_boottime(&time);
+	args.system_clock_counter = time.tv_nsec;
+
+	/* Since the counter is in nano-seconds we use 1GHz frequency */
+	args.system_clock_freq = 1000000000;
+
+	if (copy_to_user(arg, &args, sizeof(args)))
+		return -EFAULT;
+
+	return 0;
+}
 
 static long
 kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
@@ -312,6 +345,10 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 		err = kfd_ioctl_set_memory_policy(filep, process, (void __user *)arg);
 		break;
 
+	case KFD_IOC_GET_CLOCK_COUNTERS:
+		err = kfd_ioctl_get_clock_counters(filep, process, (void __user *)arg);
+		break;
+
 	default:
 		dev_err(kfd_device,
 			"unknown ioctl cmd 0x%x, arg 0x%lx)\n",
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 928e628..5b9517e 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -70,12 +70,21 @@ struct kfd_ioctl_set_memory_policy_args {
 	uint64_t alternate_aperture_size;	/* to KFD */
 };
 
+struct kfd_ioctl_get_clock_counters_args {
+	uint32_t gpu_id;		/* to KFD */
+	uint64_t gpu_clock_counter;	/* from KFD */
+	uint64_t cpu_clock_counter;	/* from KFD */
+	uint64_t system_clock_counter;	/* from KFD */
+	uint64_t system_clock_freq;	/* from KFD */
+};
+
 #define KFD_IOC_MAGIC 'K'
 
 #define KFD_IOC_GET_VERSION		_IOR(KFD_IOC_MAGIC, 1, struct kfd_ioctl_get_version_args)
 #define KFD_IOC_CREATE_QUEUE		_IOWR(KFD_IOC_MAGIC, 2, struct kfd_ioctl_create_queue_args)
 #define KFD_IOC_DESTROY_QUEUE		_IOWR(KFD_IOC_MAGIC, 3, struct kfd_ioctl_destroy_queue_args)
 #define KFD_IOC_SET_MEMORY_POLICY	_IOW(KFD_IOC_MAGIC, 4, struct kfd_ioctl_set_memory_policy_args)
+#define KFD_IOC_GET_CLOCK_COUNTERS	_IOWR(KFD_IOC_MAGIC, 5, struct kfd_ioctl_get_clock_counters_args)
 
 #pragma pack(pop)
-- 
1.9.1
[PATCH 30/83] hsa/radeon: Fix list of supported devices
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_device.c | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index b627e57..a21c095 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -27,7 +27,7 @@
 #include "kfd_priv.h"
 #include "kfd_scheduler.h"
 
-static const struct kfd_device_info bonaire_device_info = {
+static const struct kfd_device_info kaveri_device_info = {
 	.scheduler_class = &radeon_kfd_cik_static_scheduler_class,
 	.max_pasid_bits = 16,
 	.ih_ring_entry_size = 4 * sizeof(uint32_t)
@@ -40,10 +40,28 @@ struct kfd_deviceid {
 
 /* Please keep this sorted by increasing device id. */
 static const struct kfd_deviceid supported_devices[] = {
-	{ 0x1305, &bonaire_device_info },	/* Kaveri */
-	{ 0x1307, &bonaire_device_info },	/* Kaveri */
-	{ 0x130F, &bonaire_device_info },	/* Kaveri */
-	{ 0x665C, &bonaire_device_info },	/* Bonaire */
+	{ 0x1304, &kaveri_device_info },	/* Kaveri */
+	{ 0x1305, &kaveri_device_info },	/* Kaveri */
+	{ 0x1306, &kaveri_device_info },	/* Kaveri */
+	{ 0x1307, &kaveri_device_info },	/* Kaveri */
+	{ 0x1309, &kaveri_device_info },	/* Kaveri */
+	{ 0x130A, &kaveri_device_info },	/* Kaveri */
+	{ 0x130B, &kaveri_device_info },	/* Kaveri */
+	{ 0x130C, &kaveri_device_info },	/* Kaveri */
+	{ 0x130D, &kaveri_device_info },	/* Kaveri */
+	{ 0x130E, &kaveri_device_info },	/* Kaveri */
+	{ 0x130F, &kaveri_device_info },	/* Kaveri */
+	{ 0x1310, &kaveri_device_info },	/* Kaveri */
+	{ 0x1311, &kaveri_device_info },	/* Kaveri */
+	{ 0x1312, &kaveri_device_info },	/* Kaveri */
+	{ 0x1313, &kaveri_device_info },	/* Kaveri */
+	{ 0x1315, &kaveri_device_info },	/* Kaveri */
+	{ 0x1316, &kaveri_device_info },	/* Kaveri */
+	{ 0x1317, &kaveri_device_info },	/* Kaveri */
+	{ 0x1318, &kaveri_device_info },	/* Kaveri */
+	{ 0x131B, &kaveri_device_info },	/* Kaveri */
+	{ 0x131C, &kaveri_device_info },	/* Kaveri */
+	{ 0x131D, &kaveri_device_info },	/* Kaveri */
 };
 
 static const struct kfd_device_info *
-- 
1.9.1
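The comment in the table asks for it to stay sorted by increasing device id. The patch does not show how the table is searched, so the following is only a sketch of one reason a sorted table is convenient: it can be searched with a binary search. The types and names here (struct device_info, struct device_id, lookup_device_info) are hypothetical user-space stand-ins, not the driver's own, and only a few of the ids are repeated.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's types; not the real kfd structs. */
struct device_info {
	unsigned int max_pasid_bits;
};

struct device_id {
	uint16_t did;			/* PCI device id */
	const struct device_info *info;
};

static const struct device_info kaveri_info = { .max_pasid_bits = 16 };

/* Must stay sorted by increasing device id for bsearch() to work. */
static const struct device_id supported[] = {
	{ 0x1304, &kaveri_info },
	{ 0x1305, &kaveri_info },
	{ 0x130F, &kaveri_info },
	{ 0x131D, &kaveri_info },
};

/* Three-way comparison of a key id against a table entry. */
static int cmp_did(const void *key, const void *elem)
{
	uint16_t k = *(const uint16_t *)key;
	uint16_t d = ((const struct device_id *)elem)->did;

	return (k > d) - (k < d);
}

/* Returns the matching device info, or NULL for an unsupported id. */
static const struct device_info *lookup_device_info(uint16_t did)
{
	const struct device_id *hit = bsearch(&did, supported,
			sizeof(supported) / sizeof(supported[0]),
			sizeof(supported[0]), cmp_did);

	return hit ? hit->info : NULL;
}
```

With this sketch, looking up 0x1305 returns the Kaveri entry, while 0x665C (the Bonaire id removed by the patch) returns NULL.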
[PATCH 20/83] hsa/radeon: Add interrupt callback function to kgd2kfd interface
This patch adds a new callback function to the kgd2kfd interface. The new
callback is for propagating interrupts from the radeon driver to the kfd
driver.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_module.c | 1 +
 include/linux/radeon_kfd.h          | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index 6978bc0..ad21c6d 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -38,6 +38,7 @@ static const struct kgd2kfd_calls kgd2kfd = {
 	.probe		= kgd2kfd_probe,
 	.device_init	= kgd2kfd_device_init,
 	.device_exit	= kgd2kfd_device_exit,
+	.interrupt	= kgd2kfd_interrupt,
 };
 
 bool kgd2kfd_init(unsigned interface_version,
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 40b691c..2f4f7c0 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -62,6 +62,7 @@ struct kgd2kfd_calls {
 	struct kfd_dev* (*probe)(struct kgd_dev *kgd, struct pci_dev *pdev);
 	bool (*device_init)(struct kfd_dev *kfd, const struct kgd2kfd_shared_resources *gpu_resources);
 	void (*device_exit)(struct kfd_dev *kfd);
+	void (*interrupt)(struct kfd_dev *kfd, const void *ih_ring_entry);
 };
 
 struct kfd2kgd_calls {
-- 
1.9.1
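For readers unfamiliar with this style of interface, the following is a minimal user-space sketch of a callback table with an interrupt entry and a caller that forwards to it. It is not the radeon or kfd code: struct dev, struct ops, count_interrupt and forward_interrupt are hypothetical names, and the NULL check is an assumption about how a caller might guard an optional callback.

```c
#include <stddef.h>

struct dev;	/* opaque device handle, hypothetical */

/* A table of callbacks that one driver exports to another. */
struct ops {
	void (*interrupt)(struct dev *d, const void *ih_ring_entry);
};

static int irq_count;

/* Example callback: only counts how many interrupts were delivered. */
static void count_interrupt(struct dev *d, const void *ih_ring_entry)
{
	(void)d;
	(void)ih_ring_entry;
	irq_count++;
}

static const struct ops table = {
	.interrupt = count_interrupt,
};

/* Forwards one interrupt; returns 1 if a callback was invoked, else 0. */
static int forward_interrupt(const struct ops *ops, struct dev *d,
			     const void *ih_ring_entry)
{
	if (!ops || !ops->interrupt)
		return 0;

	ops->interrupt(d, ih_ring_entry);
	return 1;
}
```

The ring entry is passed as const void * in the sketch, mirroring the signature in the patch, so the caller does not need to know the entry layout.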
[PATCH 36/83] hsa/radeon: fixing clock counters bug
From: Evgeny Pinchuk <evgeny.pinc...@amd.com>

Fixed wrong reporting of timestamps in kfd_ioctl_get_clock_counters.

Signed-off-by: Evgeny Pinchuk <evgeny.pinc...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index dba6084..75fe11f 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -304,10 +304,10 @@ kfd_ioctl_get_clock_counters(struct file *filep, struct kfd_process *p, void __u
 
 	/* No access to rdtsc. Using raw monotonic time */
 	getrawmonotonic(&time);
-	args.cpu_clock_counter = time.tv_nsec;
+	args.cpu_clock_counter = (uint64_t)timespec_to_ns(&time);
 
 	get_monotonic_boottime(&time);
-	args.system_clock_counter = time.tv_nsec;
+	args.system_clock_counter = (uint64_t)timespec_to_ns(&time);
 
 	/* Since the counter is in nano-seconds we use 1GHz frequency */
 	args.system_clock_freq = 1000000000;
-- 
1.9.1
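The reason tv_nsec alone is a wrong timestamp is that it only holds the sub-second part and wraps back to zero every second. A small user-space sketch of the conversion, assuming POSIX struct timespec; ts_to_ns is a hypothetical analogue of the kernel's timespec_to_ns(), not the kernel helper itself:

```c
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

/* Full timestamp in nanoseconds: whole seconds plus the sub-second part. */
static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * NS_PER_SEC + ts->tv_nsec;
}
```

For two samples taken at 1.999999999 s and 2.000000005 s, the raw tv_nsec values go backwards (999999999, then 5), while the converted values increase as expected.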
[PATCH 47/83] hsa/radeon: Add support allocating kernel doorbells
From: Ben Goz <ben@amd.com>

This patch adds infrastructure to allocate doorbells which are not exposed
to user space.

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_doorbell.c | 76 ++-
 drivers/gpu/hsa/radeon/kfd_priv.h     |  5 +++
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_doorbell.c b/drivers/gpu/hsa/radeon/kfd_doorbell.c
index 3de8a02..abf4cb0 100644
--- a/drivers/gpu/hsa/radeon/kfd_doorbell.c
+++ b/drivers/gpu/hsa/radeon/kfd_doorbell.c
@@ -23,6 +23,16 @@
 #include "kfd_priv.h"
 #include <linux/mm.h>
 #include <linux/mman.h>
+#include <linux/slab.h>
+
+/*
+ * This extension supports a kernel level doorbells management for the kernel queues
+ * basically the last doorbells page is devoted to kernel queues and that's assures
+ * that any user process won't get access to the kernel doorbells page
+ */
+static DEFINE_MUTEX(doorbell_mutex);
+static unsigned long doorbell_available_index[DIV_ROUND_UP(MAX_PROCESS_QUEUES, BITS_PER_LONG)] = { 0 };
+#define KERNEL_DOORBELL_PASID 1
 
 /*
  * Each device exposes a doorbell aperture, a PCI MMIO aperture that
@@ -67,7 +77,22 @@ void radeon_kfd_doorbell_init(struct kfd_dev *kfd)
 
 	kfd->doorbell_base = kfd->shared_resources.doorbell_physical_address + doorbell_start_offset;
 	kfd->doorbell_id_offset = doorbell_start_offset / sizeof(doorbell_t);
-	kfd->doorbell_process_limit = doorbell_process_limit;
+	kfd->doorbell_process_limit = doorbell_process_limit - 1;
+
+	kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base, doorbell_process_allocation());
+	BUG_ON(!kfd->doorbell_kernel_ptr);
+
+	pr_debug("kfd: doorbell initialization\n"
+		 "doorbell base == 0x%08lX\n"
+		 "doorbell_id_offset == 0x%08lu\n"
+		 "doorbell_process_limit == 0x%08lu\n"
+		 "doorbell_kernel_offset == 0x%08lX\n"
+		 "doorbell aperture size == 0x%08lX\n"
+		 "doorbell kernel address == 0x%08lX\n",
+		 (uintptr_t)kfd->doorbell_base, kfd->doorbell_id_offset, doorbell_process_limit,
+		 (uintptr_t)kfd->doorbell_base, kfd->shared_resources.doorbell_aperture_size,
+		 (uintptr_t)kfd->doorbell_kernel_ptr);
+
 }
 
 /* This is the /dev/kfd mmap (for doorbell) implementation. We intend that this is only called through map_doorbells,
@@ -136,6 +161,53 @@ map_doorbells(struct file *devkfd, struct kfd_process *process, struct kfd_dev *
 	return 0;
 }
 
+/* get kernel iomem pointer for a doorbell */
+u32 __iomem *radeon_kfd_get_kernel_doorbell(struct kfd_dev *kfd, unsigned int *doorbell_off)
+{
+	u32 inx;
+
+	BUG_ON(!kfd || !doorbell_off);
+
+	mutex_lock(&doorbell_mutex);
+	inx = find_first_zero_bit(doorbell_available_index, MAX_PROCESS_QUEUES);
+	__set_bit(inx, doorbell_available_index);
+	mutex_unlock(&doorbell_mutex);
+
+	if (inx >= MAX_PROCESS_QUEUES)
+		return NULL;
+
+	/* caluculating the kernel doorbell offset using faked kernel pasid that allocated for kernel queues only */
+	*doorbell_off = KERNEL_DOORBELL_PASID * (doorbell_process_allocation()/sizeof(doorbell_t)) + inx;
+
+	pr_debug("kfd: get kernel queue doorbell\n"
+		 "doorbell offset == 0x%08d\n"
+		 "kernel address== 0x%08lX\n",
+		 *doorbell_off, (uintptr_t)(kfd->doorbell_kernel_ptr + inx));
+
+	return kfd->doorbell_kernel_ptr + inx;
+}
+
+void radeon_kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 __iomem *db_addr)
+{
+	unsigned int inx;
+
+	BUG_ON(!kfd || !db_addr);
+
+	inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);
+
+	mutex_lock(&doorbell_mutex);
+	__clear_bit(inx, doorbell_available_index);
+	mutex_unlock(&doorbell_mutex);
+}
+
+inline void write_kernel_doorbell(u32 __iomem *db, u32 value)
+{
+	if (db) {
+		writel(value, db);
+		pr_debug("writing %d to doorbell address 0x%p\n", value, db);
+	}
+}
+
 /* Get the user-mode address of a doorbell. Assumes that the process mutex is being held. */
 doorbell_t __user *radeon_kfd_get_doorbell(struct file *devkfd, struct kfd_process *process, struct kfd_dev *dev,
 					    unsigned int doorbell_index)
@@ -152,6 +224,8 @@ doorbell_t __user *radeon_kfd_get_doorbell(struct file *devkfd, struct kfd_proce
 	pdd = radeon_kfd_get_process_device_data(dev, process);
 	BUG_ON(pdd == NULL); /* map_doorbells would have failed otherwise */
 
+	pr_debug("doorbell value on creation 0x%x\n", pdd->doorbell_mapping[doorbell_index]);
+
 	return pdd
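The kernel doorbell allocator in this patch is a first-free-bit search over a bitmap. Below is a minimal user-space sketch of that technique, not the driver code: MAX_SLOTS is a deliberately tiny hypothetical stand-in for MAX_PROCESS_QUEUES, the helper names are invented, and the locking the patch does with doorbell_mutex is omitted. One deliberate difference: the sketch checks the index range before setting the bit, so a full bitmap is never written past its last valid bit.

```c
#include <limits.h>
#include <stddef.h>

#define MAX_SLOTS 4u	/* hypothetical; stands in for MAX_PROCESS_QUEUES */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NUM_WORDS ((MAX_SLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

static unsigned long slot_bitmap[NUM_WORDS];

/* Returns the first free slot index, or MAX_SLOTS if every slot is taken. */
static unsigned int find_first_zero(void)
{
	unsigned int i;

	for (i = 0; i < MAX_SLOTS; i++)
		if (!(slot_bitmap[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))))
			return i;

	return MAX_SLOTS;
}

/* Allocates the lowest free slot; returns -1 when the bitmap is full. */
static int slot_alloc(void)
{
	unsigned int i = find_first_zero();

	if (i >= MAX_SLOTS)
		return -1;

	slot_bitmap[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
	return (int)i;
}

/* Marks a slot as free again; out-of-range indices are ignored. */
static void slot_free(unsigned int i)
{
	if (i < MAX_SLOTS)
		slot_bitmap[i / BITS_PER_WORD] &= ~(1UL << (i % BITS_PER_WORD));
}
```

A freed slot is handed out again by the next allocation because the search always starts from bit 0.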
[PATCH 37/83] hsa/radeon: Print ISR info only in debug mode
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 5bfde5c..7573d25 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -899,7 +899,7 @@ cik_static_interrupt_isr(struct kfd_scheduler *scheduler, const void *ih_ring_en
 	if (!int_compute_pipe(priv, ihre, &pipe_id))
 		return false;
 
-	dev_info(radeon_kfd_chardev(), "INT(ISR): src=%02x, data=0x%x, pipe=%u, vmid=%u, pasid=%u\n",
+	dev_dbg(radeon_kfd_chardev(), "INT(ISR): src=%02x, data=0x%x, pipe=%u, vmid=%u, pasid=%u\n",
 		 ihre->source_id, ihre->data, pipe_id, ihre->vmid, ihre->pasid);
 
 	switch (source_id) {
-- 
1.9.1
[PATCH 39/83] drm/radeon: Extending kfd interface
From: Evgeny Pinchuk <evgeny.pinc...@amd.com>

Adding new function to the interface used by kfd. The new function
retrieves the max engine clock speed.

Signed-off-by: Evgeny Pinchuk <evgeny.pinc...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 11 +++
 include/linux/radeon_kfd.h          |  2 ++
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 6dba170..8b6d497 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -50,6 +50,8 @@ static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
 static void lock_grbm_gfx_idx(struct kgd_dev *kgd);
 static void unlock_grbm_gfx_idx(struct kgd_dev *kgd);
 
+static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
+
 
 static const struct kfd2kgd_calls kfd2kgd = {
 	.allocate_mem = allocate_mem,
@@ -64,6 +66,7 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
 	.lock_grbm_gfx_idx = lock_grbm_gfx_idx,
 	.unlock_grbm_gfx_idx = unlock_grbm_gfx_idx,
+	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
 };
 
 static const struct kgd2kfd_calls *kgd2kfd;
@@ -307,3 +310,11 @@ static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
 
 	return rdev->asic->get_gpu_clock_counter(rdev);
 }
+
+static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
+{
+	struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+	/* The sclk is in quantas of 10kHz */
+	return rdev->pm.power_state->clock_info->sclk / 100;
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 4c7e923..4114c8e 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -93,6 +93,8 @@ struct kfd2kgd_calls {
 	/* GRBM_GFX_INDEX mutex */
 	void (*lock_grbm_gfx_idx)(struct kgd_dev *kgd);
 	void (*unlock_grbm_gfx_idx)(struct kgd_dev *kgd);
+
+	uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
 };
 
 bool kgd2kfd_init(unsigned interface_version,
-- 
1.9.1
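The division by 100 follows from the unit stated in the patch comment: sclk counts units of 10 kHz, and 100 such units make 1 MHz. A tiny worked sketch; the helper name and the sample value are hypothetical:

```c
#include <stdint.h>

/* sclk is expressed in units of 10 kHz, so 100 units equal 1 MHz. */
static uint32_t sclk_10khz_to_mhz(uint32_t sclk)
{
	return sclk / 100;	/* integer division drops any fraction of a MHz */
}
```

For example, an sclk value of 72000 means 72000 x 10 kHz = 720 MHz.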
[PATCH 51/83] hsa/radeon: Add packet manager module
From: Ben Goz <ben@amd.com>

The packet manager module builds PM4 packets for the sole use of the CP
scheduler. Those packets are used by the HIQ to submit runlists to the CP.

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/Makefile             |   2 +-
 drivers/gpu/hsa/radeon/kfd_packet_manager.c | 473 
 2 files changed, 474 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_packet_manager.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index f06d925..4978915 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -7,6 +7,6 @@ radeon_kfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o \
 		kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
 		kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
 		kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \
-		kfd_kernel_queue.o
+		kfd_kernel_queue.o kfd_packet_manager.o
 
 obj-$(CONFIG_HSA_RADEON)	+= radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_packet_manager.c b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
new file mode 100644
index 000..4967b7c
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
@@ -0,0 +1,473 @@
+/*
+ * packet_manager.c
+ *
+ * Created on: Mar 16, 2014
+ * Author: ben
+ */
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include "kfd_device_queue_manager.h"
+#include "kfd_kernel_queue.h"
+#include "kfd_priv.h"
+#include "kfd_pm4_headers.h"
+#include "kfd_pm4_opcodes.h"
+#include "cik_mqds.h"
+
+static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes, unsigned int buffer_size_bytes)
+{
+	unsigned int temp = *wptr + increment_bytes / sizeof(uint32_t);
+
+	BUG_ON((temp * sizeof(uint32_t)) > buffer_size_bytes);
+	*wptr = temp;
+}
+
+static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size)
+{
+	PM4_TYPE_3_HEADER header;
+
+	header.u32all = 0;
+	header.opcode = opcode;
+	header.count = packet_size/sizeof(uint32_t) - 2;
+	header.type = PM4_TYPE_3;
+
+	return header.u32all;
+}
+
+static void pm_calc_rlib_size(struct packet_manager *pm, unsigned int *rlib_size, bool *over_subscription)
+{
+	unsigned int process_count, queue_count;
+
+	BUG_ON(!pm || !rlib_size || !over_subscription);
+
+	process_count = pm->dqm->processes_count;
+	queue_count = pm->dqm->queue_count;
+
+	/* check if there is over subscription*/
+	*over_subscription = false;
+	if ((process_count >= VMID_PER_DEVICE) ||
+		queue_count >= PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE) {
+		*over_subscription = true;
+		pr_debug("kfd: over subscribed runlist\n");
+	}
+
+	/* calculate run list ib allocation size */
+	*rlib_size = process_count * sizeof(struct pm4_map_process) +
+		     queue_count * sizeof(struct pm4_map_queues);
+
+	/* increase the allocation size in case we need a chained run list when over subscription */
+	if (*over_subscription)
+		*rlib_size += sizeof(struct pm4_runlist);
+
+	pr_debug("kfd: runlist ib size %d\n", *rlib_size);
+}
+
+static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int **rl_buffer, uint64_t *rl_gpu_buffer,
+				unsigned int *rl_buffer_size, bool *is_over_subscription)
+{
+	int retval;
+
+	BUG_ON(!pm);
+	BUG_ON(pm->allocated == true);
+
+	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
+	if (is_over_subscription &&
+			sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION)
+		return -EFAULT;
+
+	retval = radeon_kfd_vidmem_alloc_map(pm->dqm->dev, &pm->ib_buffer_obj, (void **)rl_buffer,
+					rl_gpu_buffer, ALIGN(*rl_buffer_size, PAGE_SIZE));
+	if (retval != 0) {
+		pr_err("kfd: failed to allocate runlist IB\n");
+		return retval;
+	}
+
+	memset(*rl_buffer, 0, *rl_buffer_size);
+	pm->allocated = true;
+	return retval;
+}
+
+static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
+				uint64_t ib, size_t ib_size_in_dwords, bool chain)
+{
+	struct pm4_runlist *packet;
+
+	BUG_ON(!pm || !buffer || !ib);
+
+	packet = (struct pm4_runlist *)buffer;
+
+	memset(buffer, 0, sizeof(struct pm4_runlist));
+	packet->header.u32all = build_pm4_header(IT_RUN_LIST, sizeof(struct pm4_runlist));
+
+	packet->bitfields4.ib_size = ib_size_in_dwords;
+	packet->bitfields4.chain = chain ? 1 : 0;
+	packet->bitfields4.offload_polling = 0;
+	packet->bitfields4.valid = 1;
+	packet->bitfields4.vmid = 0;
+	packet->ordinal2 = lower_32(ib);
+	packet->bitfields3.ib_base_hi = upper_32(ib
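build_pm4_header() fills a type-3 header through a bitfield union that is not shown in this excerpt. The sketch below packs the same three fields with explicit shifts instead. The bit positions (type in bits 31:30, count in bits 29:16, opcode in bits 15:8) are an assumption taken from the usual PM4 type-3 layout rather than from this patch, and the opcode used in the usage note is a placeholder, not the real IT_RUN_LIST value.

```c
#include <stddef.h>
#include <stdint.h>

#define PM4_TYPE_3 3u

/*
 * Packs a type-3 header for a packet of packet_size_bytes bytes.
 * As in the patch, count is the packet length in dwords minus 2:
 * one dword for the header itself, and the field stores (body dwords - 1).
 * Assumes packet_size_bytes covers at least two dwords.
 */
static uint32_t pm4_type3_header(uint32_t opcode, size_t packet_size_bytes)
{
	uint32_t count = (uint32_t)(packet_size_bytes / sizeof(uint32_t)) - 2;

	return (PM4_TYPE_3 << 30) |
	       ((count & 0x3FFFu) << 16) |
	       ((opcode & 0xFFu) << 8);
}
```

For a 16-byte packet (4 dwords) with the placeholder opcode 0x10, count is 2 and the header word is 0xC0021000.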
[PATCH 45/83] hsa/radeon: debugging print statements
From: Michael Varga <michael.va...@amd.com>

Added debug print statements so critical errors during init may be
debugged more easily.

Signed-off-by: Michael Varga <michael.va...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_device.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index 2e7d50d..82febf4 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -107,20 +107,30 @@ device_iommu_pasid_init(struct kfd_dev *kfd)
 	int err;
 
 	err = amd_iommu_device_info(kfd->pdev, &iommu_info);
-	if (err < 0)
+	if (err < 0) {
+		dev_err(kfd_device, "error getting iommu info. is the iommu enabled?\n");
 		return false;
+	}
 
-	if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags)
+	if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
+		dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
+		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
+		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
+		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0);
 		return false;
+	}
 
 	pasid_limit = min_t(pasid_t, (pasid_t)1 << kfd->device_info->max_pasid_bits, iommu_info.max_pasids);
 	pasid_limit = min_t(pasid_t, pasid_limit, kfd->doorbell_process_limit);
 
 	err = amd_iommu_init_device(kfd->pdev, pasid_limit);
-	if (err < 0)
+	if (err < 0) {
+		dev_err(kfd_device, "error initializing iommu device\n");
 		return false;
+	}
 
 	if (!radeon_kfd_set_pasid_limit(pasid_limit)) {
+		dev_err(kfd_device, "error setting pasid limit\n");
 		amd_iommu_free_device(kfd->pdev);
 		return false;
 	}
@@ -166,6 +176,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	kfd->device_info->scheduler_class->start(kfd->scheduler);
 
 	kfd->init_complete = true;
+	dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
+		 kfd->pdev->device);
 
 	return true;
 }
-- 
1.9.1
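The capability test in device_iommu_pasid_init() is the usual "all required bits present" mask check. A small sketch of that check in isolation; the FLAG_* values are placeholders chosen for illustration and are not claimed to match the AMD_IOMMU_DEVICE_FLAG_* constants:

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder capability bits, for illustration only. */
#define FLAG_ATS	0x1u
#define FLAG_PRI	0x2u
#define FLAG_PASID	0x4u

/* True only if every bit set in required is also set in flags. */
static bool has_all_flags(uint32_t flags, uint32_t required)
{
	return (flags & required) == required;
}
```

Comparing against required, rather than testing for a non-zero result, is what makes a device with only some of the capabilities fail the check.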
[PATCH 52/83] hsa/radeon: Add process queue manager module
From: Ben Goz <ben@amd.com>

The queue scheduler is divided into two sections: one is process-bound and
the other is device-bound. The process-bound section is handled by this
module. The PQM handles HSA queue setup, updates and tear-down.

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/Makefile                    |   3 +-
 drivers/gpu/hsa/radeon/kfd_priv.h                  |  29 ++
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 370 +
 3 files changed, 401 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 4978915..341fa67 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -7,6 +7,7 @@ radeon_kfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o \
 		kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
 		kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
 		kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \
-		kfd_kernel_queue.o kfd_packet_manager.o
+		kfd_kernel_queue.o kfd_packet_manager.o \
+		kfd_process_queue_manager.o
 
 obj-$(CONFIG_HSA_RADEON)	+= radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index b3889aa..e716745 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -311,6 +311,9 @@ struct kfd_process_device {
 	/* Scheduler process data for this device. */
 	struct kfd_scheduler_process *scheduler_process;
 
+	/* per-process-per device QCM data structure */
+	struct qcm_process_device qpd;
+
 	/* Is this process/pasid bound to this device? (amd_iommu_bind_pasid) */
 	bool bound;
 
@@ -342,6 +345,11 @@ struct kfd_process {
 	/* List of kfd_process_device structures, one for each device the process is using. */
 	struct list_head per_device_data;
 
+	struct hw_pointer_store_properties write_ptr;
+	struct hw_pointer_store_properties read_ptr;
+
+	struct process_queue_manager pqm;
+
 	/* The process's queues. */
 	size_t queue_array_size;
 	struct kfd_queue **queues;	/* Size is queue_array_size, up to MAX_PROCESS_QUEUES. */
@@ -431,6 +439,27 @@ struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type, struct kfd_dev *dev
 struct kernel_queue *kernel_queue_init(struct kfd_dev *dev, enum kfd_queue_type type);
 void kernel_queue_uninit(struct kernel_queue *kq);
 
+/* Process Queue Manager */
+struct process_queue_node {
+	struct queue *q;
+	struct kernel_queue *kq;
+	struct list_head process_queue_list;
+};
+
+int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p);
+void pqm_uninit(struct process_queue_manager *pqm);
+int pqm_create_queue(struct process_queue_manager *pqm,
+			    struct kfd_dev *dev,
+			    struct file *f,
+			    struct queue_properties *properties,
+			    unsigned int flags,
+			    enum kfd_queue_type type,
+			    unsigned int *qid);
+int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid);
+int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, struct queue_properties *p);
+struct kernel_queue *pqm_get_kernel_queue(struct process_queue_manager *pqm, unsigned int qid);
+void test_diq(struct kfd_dev *dev, struct process_queue_manager *pqm);
+
 /* Packet Manager */
 
 #define KFD_HIQ_TIMEOUT (500)
diff --git a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
new file mode 100644
index 000..6e38ca4
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
@@ -0,0 +1,370 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT
[PATCH 82/83] drm/radeon: Remove lock functions from kfd2kgd interface
From: Ben Goz <ben@amd.com>

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 44 -
 include/linux/radeon_kfd.h          | 10 -
 2 files changed, 54 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 738c2b3..7e8e041 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -115,12 +115,6 @@ static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);
 static uint64_t get_vmem_size(struct kgd_dev *kgd);
 static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
-static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
-static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
-
-static void lock_grbm_gfx_idx(struct kgd_dev *kgd);
-static void unlock_grbm_gfx_idx(struct kgd_dev *kgd);
-
 static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
 
 /*
@@ -146,10 +140,6 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.unkmap_mem = unkmap_mem,
 	.get_vmem_size = get_vmem_size,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
-	.lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
-	.unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
-	.lock_grbm_gfx_idx = lock_grbm_gfx_idx,
-	.unlock_grbm_gfx_idx = unlock_grbm_gfx_idx,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
 	.program_sh_mem_settings = kgd_program_sh_mem_settings,
 	.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
@@ -200,8 +190,6 @@ void radeon_kfd_device_init(struct radeon_device *rdev)
 {
 	if (rdev->kfd) {
 		struct kgd2kfd_shared_resources gpu_resources = {
-			.mmio_registers = rdev->rmmio,
-
 			.compute_vmid_bitmap = 0xFF00,
 
 			.first_compute_pipe = 1,
@@ -363,38 +351,6 @@ static uint64_t get_vmem_size(struct kgd_dev *kgd)
 	return rdev->mc.real_vram_size;
 }
 
-static void lock_srbm_gfx_cntl(struct kgd_dev *kgd)
-{
-	struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-	mutex_lock(&rdev->srbm_mutex);
-}
-
-static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd)
-{
-	struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-	mutex_unlock(&rdev->srbm_mutex);
-}
-
-static void lock_grbm_gfx_idx(struct kgd_dev *kgd)
-{
-	struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-	BUG_ON(kgd == NULL);
-
-	mutex_lock(&rdev->grbm_idx_mutex);
-}
-
-static void unlock_grbm_gfx_idx(struct kgd_dev *kgd)
-{
-	struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-	BUG_ON(kgd == NULL);
-
-	mutex_unlock(&rdev->grbm_idx_mutex);
-}
-
 static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
 {
 	struct radeon_device *rdev = (struct radeon_device *)kgd;
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index aa021fb..2fffe32 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -45,8 +45,6 @@ enum kgd_memory_pool {
 };
 
 struct kgd2kfd_shared_resources {
-	void __iomem *mmio_registers; /* Mapped pointer to GFX MMIO registers. */
-
 	unsigned int compute_vmid_bitmap; /* Bit n == 1 means VMID n is available for KFD. */
 
 	unsigned int first_compute_pipe; /* Compute pipes are counted starting from MEC0/pipe0 as 0. */
@@ -86,14 +84,6 @@ struct kfd2kgd_calls {
 	uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
 	uint64_t (*get_gpu_clock_counter)(struct kgd_dev *kgd);
 
-	/* SRBM_GFX_CNTL mutex */
-	void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
-	void (*unlock_srbm_gfx_cntl)(struct kgd_dev *kgd);
-
-	/* GRBM_GFX_INDEX mutex */
-	void (*lock_grbm_gfx_idx)(struct kgd_dev *kgd);
-	void (*unlock_grbm_gfx_idx)(struct kgd_dev *kgd);
-
 	uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
 
 	/* Register access functions */
-- 
1.9.1
[PATCH 62/83] hsa/radeon: Fix timeout calculation in sync_with_hw
This patch fixes a bug in the timeout calculation done in the sync_with_hw
function. The original code assumed that jiffies is incremented once per
millisecond.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_kernel_queue.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c b/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
index 25528b3..ce3261b 100644
--- a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
+++ b/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
@@ -222,12 +222,18 @@ static void submit_packet(struct kernel_queue *kq)
 
 static int sync_with_hw(struct kernel_queue *kq, unsigned long timeout_ms)
 {
+	unsigned long org_timeout_ms;
+
 	BUG_ON(!kq);
-	timeout_ms += jiffies;
+
+	org_timeout_ms = timeout_ms;
+	timeout_ms += jiffies * 1000 / HZ;
 	while (*kq->wptr_kernel != *kq->rptr_kernel) {
-		if (time_after(jiffies, timeout_ms)) {
-			pr_err("kfd: kernel_queue %s timeout expired %lu\n", __func__, timeout_ms);
-			pr_err("kfd: wptr: %d rptr: %d\n", *kq->wptr_kernel, *kq->rptr_kernel);
+		if (time_after(jiffies * 1000 / HZ, timeout_ms)) {
+			pr_err("kfd: kernel_queue %s timeout expired %lu\n",
+				__func__, org_timeout_ms);
+			pr_err("kfd: wptr: %d rptr: %d\n",
+				*kq->wptr_kernel, *kq->rptr_kernel);
 			return -ETIME;
 		}
 		cpu_relax();
-- 
1.9.1
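To see why adding milliseconds directly to a tick counter is wrong, note that a tick lasts 1000 / HZ milliseconds, so with HZ = 250 a "500 ms" timeout added as 500 ticks would actually last 2 seconds. The patch fixes this by converting the tick counter to milliseconds. The sketch below shows the opposite direction, converting the timeout to ticks, which is also what the kernel's msecs_to_jiffies() helper is for. The names ms_to_ticks and tick_after are hypothetical user-space stand-ins, and hz is passed as a parameter instead of the kernel's HZ constant.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Converts milliseconds to ticks, rounding up so that a timeout never
 * expires early. hz is the number of ticks per second.
 * Note: ms * hz can overflow for very large inputs; this sketch ignores that.
 */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
	return (ms * hz + 999) / 1000;
}

/*
 * Wrap-safe "a is later than b", the same idea as the kernel's time_after():
 * the unsigned difference is reinterpreted as signed, so the comparison
 * still works after the counter wraps around.
 */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}
```

A caller would compute the deadline once as now + ms_to_ticks(timeout_ms, hz) and then poll with tick_after(now, deadline), keeping the original millisecond value around only for the error message.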
[PATCH 83/83] hsa/radeon: Update module version to 0.6.2
This version is intended for upstreaming to the Linux kernel 3.17.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_module.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index c706236..c783eeb 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -30,10 +30,10 @@
 
 #define KFD_DRIVER_AUTHOR	"AMD Inc. and others"
 #define KFD_DRIVER_DESC		"Standalone HSA driver for AMD's GPUs"
-#define KFD_DRIVER_DATE		"20140623"
+#define KFD_DRIVER_DATE		"20140710"
 #define KFD_DRIVER_MAJOR	0
 #define KFD_DRIVER_MINOR	6
-#define KFD_DRIVER_PATCHLEVEL	1
+#define KFD_DRIVER_PATCHLEVEL	2
 
 const struct kfd2kgd_calls *kfd2kgd;
 static const struct kgd2kfd_calls kgd2kfd = {
--
1.9.1
[PATCH 68/83] hsa/radeon: Update module version to 0.6.0
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index fbfcce6..33cee3c 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -32,7 +32,7 @@
 #define KFD_DRIVER_DESC		"Standalone HSA driver for AMD's GPUs"
 #define KFD_DRIVER_DATE		"20140424"
 #define KFD_DRIVER_MAJOR	0
-#define KFD_DRIVER_MINOR	5
+#define KFD_DRIVER_MINOR	6
 #define KFD_DRIVER_PATCHLEVEL	0
 
 const struct kfd2kgd_calls *kfd2kgd;
--
1.9.1
[PATCH 74/83] hsa/radeon: Adding some error messages
From: Ben Goz <ben@amd.com>

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 09c9a61..be89d26 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -137,11 +137,15 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a
 	if (copy_from_user(&args, arg, sizeof(args)))
 		return -EFAULT;
 
-	if (!access_ok(VERIFY_WRITE, args.read_pointer_address, sizeof(qptr_t)))
+	if (!access_ok(VERIFY_WRITE, args.read_pointer_address, sizeof(qptr_t))) {
+		pr_err("kfd: can't access read pointer");
 		return -EFAULT;
+	}
 
-	if (!access_ok(VERIFY_WRITE, args.write_pointer_address, sizeof(qptr_t)))
+	if (!access_ok(VERIFY_WRITE, args.write_pointer_address, sizeof(qptr_t))) {
+		pr_err("kfd: can't access write pointer");
 		return -EFAULT;
+	}
 
 	q_properties.is_interop = false;
 	q_properties.queue_percent = args.queue_percentage;
--
1.9.1
[PATCH 57/83] hsa/radeon: Eliminate warnings in compilation
Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/kfd_chardev.c | 6 +++--- drivers/gpu/hsa/radeon/kfd_kernel_queue.c | 4 ++-- drivers/gpu/hsa/radeon/kfd_queue.c| 2 +- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c index 9a77332..80b702e 100644 --- a/drivers/gpu/hsa/radeon/kfd_chardev.c +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c @@ -114,8 +114,8 @@ kfd_open(struct inode *inode, struct file *filep) process-is_32bit_user_mode = is_compat_task(); - dev_info(kfd_device, process %d opened, compat mode (32 bit) - %d\n, - process-pasid, process-is_32bit_user_mode); + dev_dbg(kfd_device, process %d opened, compat mode (32 bit) - %d\n, + process-pasid, process-is_32bit_user_mode); kfd_init_apertures(process); @@ -149,7 +149,7 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a pr_debug(%s Arguments: Queue Percentage (%d, %d)\n Queue Priority (%d, %d)\n Queue Address (0x%llX, 0x%llX)\n - Queue Size (%u64, %ll)\n, + Queue Size (%llX, %u)\n, __func__, q_properties.queue_percent, args.queue_percentage, q_properties.priority, args.queue_priority, diff --git a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c b/drivers/gpu/hsa/radeon/kfd_kernel_queue.c index aa64693e..25528b3 100644 --- a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c +++ b/drivers/gpu/hsa/radeon/kfd_kernel_queue.c @@ -89,8 +89,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev, prop.type = type; prop.vmid = 0; prop.queue_address = kq-pq_gpu_addr; - prop.read_ptr = kq-rptr_gpu_addr; - prop.write_ptr = kq-wptr_gpu_addr; + prop.read_ptr = (qptr_t *) kq-rptr_gpu_addr; + prop.write_ptr = (qptr_t *) kq-wptr_gpu_addr; if (init_queue(kq-queue, prop) != 0) goto err_init_queue; diff --git a/drivers/gpu/hsa/radeon/kfd_queue.c b/drivers/gpu/hsa/radeon/kfd_queue.c index 2d22cc1..646b6d1 100644 --- a/drivers/gpu/hsa/radeon/kfd_queue.c +++ 
b/drivers/gpu/hsa/radeon/kfd_queue.c @@ -67,7 +67,7 @@ void print_queue(struct queue *q) Queue Doorbell Pointer: 0x%p\n Queue Doorbell Offset: %u\n Queue MQD Address: 0x%p\n - Queue MQD Gart: 0x%p\n + Queue MQD Gart: 0x%llX\n Queue Process Address: 0x%p\n Queue Device Address: 0x%p\n, q-properties.type, -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 80/83] drm/radeon: Add register access functions to kfd2kgd interface
From: Ben Goz ben@amd.com This patch extends the kfd2kgd interface by adding functions that perform direct register access. These functions can be called from kfd and will allow to eliminate all direct register accesses from within the kfd. Signed-off-by: Ben Goz ben@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/drm/radeon/cikd.h | 51 +- drivers/gpu/drm/radeon/radeon_kfd.c | 354 include/linux/radeon_kfd.h | 11 ++ 3 files changed, 415 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/cikd.h b/drivers/gpu/drm/radeon/cikd.h index 0c6e1b5..0a2a403 100644 --- a/drivers/gpu/drm/radeon/cikd.h +++ b/drivers/gpu/drm/radeon/cikd.h @@ -1137,6 +1137,9 @@ #defineSH_MEM_ALIGNMENT_MODE_UNALIGNED 3 #defineDEFAULT_MTYPE(x)((x) 4) #defineAPE1_MTYPE(x) ((x) 7) +/* valid for both DEFAULT_MTYPE and APE1_MTYPE */ +#defineMTYPE_CACHED0 +#defineMTYPE_NONCACHED 3 #defineSX_DEBUG_1 0x9060 @@ -1447,6 +1450,16 @@ #define CP_HQD_ACTIVE 0xC91C #define CP_HQD_VMID 0xC920 +#define CP_HQD_PERSISTENT_STATE 0xC924u +#defineDEFAULT_CP_HQD_PERSISTENT_STATE (0x33U 8) + +#define CP_HQD_PIPE_PRIORITY 0xC928u +#define CP_HQD_QUEUE_PRIORITY 0xC92Cu +#define CP_HQD_QUANTUM 0xC930u +#defineQUANTUM_EN 1U +#defineQUANTUM_SCALE_1MS (1U 4) +#defineQUANTUM_DURATION(x) ((x) 8) + #define CP_HQD_PQ_BASE0xC934 #define CP_HQD_PQ_BASE_HI 0xC938 #define CP_HQD_PQ_RPTR0xC93C @@ -1474,12 +1487,32 @@ #definePRIV_STATE (1 30) #defineKMD_QUEUE (1 31) -#define CP_HQD_DEQUEUE_REQUEST 0xC974 +#define CP_HQD_IB_BASE_ADDR0xC95Cu +#define CP_HQD_IB_BASE_ADDR_HI 0xC960u +#define CP_HQD_IB_RPTR 0xC964u +#define CP_HQD_IB_CONTROL 0xC968u +#defineIB_ATC_EN (1U 23) +#defineDEFAULT_MIN_IB_AVAIL_SIZE (3U 20) + +#define CP_HQD_DEQUEUE_REQUEST 0xC974 +#defineDEQUEUE_REQUEST_DRAIN 1 +#define DEQUEUE_REQUEST_RESET 2 #define CP_MQD_CONTROL 0xC99C #defineMQD_VMID(x) ((x) 0) #defineMQD_VMID_MASK (0xf 0) +#define CP_HQD_SEMA_CMD0xC97Cu +#define CP_HQD_MSG_TYPE0xC980u +#define CP_HQD_ATOMIC0_PREOP_LO0xC984u 
+#define CP_HQD_ATOMIC0_PREOP_HI0xC988u +#define CP_HQD_ATOMIC1_PREOP_LO0xC98Cu +#define CP_HQD_ATOMIC1_PREOP_HI0xC990u +#define CP_HQD_HQ_SCHEDULER0 0xC994u +#define CP_HQD_HQ_SCHEDULER1 0xC998u + +#define SH_STATIC_MEM_CONFIG 0x9604u + #define DB_RENDER_CONTROL 0x28000 #define PA_SC_RASTER_CONFIG 0x28350 @@ -2069,4 +2102,20 @@ #define VCE_CMD_IB_AUTO0x0005 #define VCE_CMD_SEMAPHORE 0x0006 +#define ATC_VMID0_PASID_MAPPING0x339Cu +#defineATC_VMID_PASID_MAPPING_UPDATE_STATUS0x3398u +#defineATC_VMID_PASID_MAPPING_VALID(1U 31) + +#define ATC_VM_APERTURE0_CNTL 0x3310u +#defineATS_ACCESS_MODE_NEVER 0 +#defineATS_ACCESS_MODE_ALWAYS 1 + +#define ATC_VM_APERTURE0_CNTL2 0x3318u +#define ATC_VM_APERTURE0_HIGH_ADDR 0x3308u +#define ATC_VM_APERTURE0_LOW_ADDR 0x3300u +#define
[PATCH 73/83] hsa/radeon: Adding qcm fence return status
From: Yair Shachar <yair.shac...@amd.com>

Waiting on a fence now returns a status.

Signed-off-by: Yair Shachar <yair.shac...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 6 --
 drivers/gpu/hsa/radeon/kfd_priv.h | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 4931f8a..4c53e57 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -800,7 +800,7 @@ out:
 	return retval;
 }
 
-static void fence_wait_timeout(unsigned int *fence_addr, unsigned int fence_value, unsigned long timeout)
+int fence_wait_timeout(unsigned int *fence_addr, unsigned int fence_value, unsigned long timeout)
 {
 	BUG_ON(!fence_addr);
 	timeout += jiffies;
@@ -808,10 +808,12 @@ static void fence_wait_timeout(unsigned int *fence_addr, unsigned int fence_valu
 	while (*fence_addr != fence_value) {
 		if (time_after(jiffies, timeout)) {
 			pr_err("kfd: qcm fence wait loop timeout expired\n");
-			break;
+			return -ETIME;
 		}
 		cpu_relax();
 	}
+
+	return 0;
 }
 
 static int destroy_queues_cpsch(struct device_queue_manager *dqm)
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index 97bf58a..b61187a 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -463,6 +463,8 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, struct
 struct kernel_queue *pqm_get_kernel_queue(struct process_queue_manager *pqm, unsigned int qid);
 void test_diq(struct kfd_dev *dev, struct process_queue_manager *pqm);
 
+int fence_wait_timeout(unsigned int *fence_addr, unsigned int fence_value, unsigned long timeout);
+
 /* Packet Manager */
 
 #define KFD_HIQ_TIMEOUT (500)
--
1.9.1
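The change from a `void` wait to one that reports its outcome can be sketched in user space. This is an illustration under stated assumptions, not the driver's function: `sketch_fence_wait` is a hypothetical name, and a bounded poll count stands in for the jiffies-based deadline used by the patch.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the fence wait: poll *fence_addr until it equals
 * fence_value or max_polls attempts have been made.
 * Returns 0 when the fence was reached and -ETIME when the wait gave up,
 * mirroring the return values introduced by the patch.
 */
int sketch_fence_wait(const volatile unsigned int *fence_addr,
		      unsigned int fence_value, unsigned long max_polls)
{
	unsigned long polls;

	for (polls = 0; polls < max_polls; polls++) {
		if (*fence_addr == fence_value)
			return 0;
	}
	return -ETIME;
}
```

Returning a status lets a caller distinguish "the hardware signalled the fence" from "the wait timed out", which a `break` out of a `void` function could not express.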
[PATCH 76/83] hsa/radeon: Check oversubscription before destroying runlist
This patch fixes a bug that occurs in the CP hardware scheduling mode without oversubscription. The oversubscription check was performed _after_ the current runlist was destroyed, which caused the running HSA application to stop working. This patch moves the oversubscription check before the call that destroys the current runlist. If there is oversubscription, the function prints an error to dmesg and exits.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_packet_manager.c | 3 ---
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 9 +
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_packet_manager.c b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
index 5cd23b0..0aef907 100644
--- a/drivers/gpu/hsa/radeon/kfd_packet_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
@@ -88,9 +88,6 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int **rl_b
 	BUG_ON(is_over_subscription == NULL);
 
 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
-	if (*is_over_subscription &&
-			sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION)
-		return -EFAULT;
 
 	retval = radeon_kfd_vidmem_alloc_map(pm->dqm->dev, &pm->ib_buffer_obj, (void **)rl_buffer,
 			rl_gpu_buffer, ALIGN(*rl_buffer_size, PAGE_SIZE));
diff --git a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
index 5d7c46d..97b3cc6 100644
--- a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
@@ -174,6 +174,15 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 
 	switch (type) {
 	case KFD_QUEUE_TYPE_COMPUTE:
+		/* check if there is over subscription */
+		if ((sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
+		((dev->dqm->processes_count >= VMID_PER_DEVICE) ||
+		(dev->dqm->queue_count >= PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE))) {
+			pr_err("kfd: over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
+			retval = -EPERM;
+			goto err_create_queue;
+		}
+
 		retval = create_cp_queue(pqm, dev, &q, &q_properties, f, *qid);
 		if (retval != 0)
 			goto err_create_queue;
--
1.9.1
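The ordering this patch establishes (check first, modify state afterwards) can be sketched as a small user-space example. All names and limits below are hypothetical: `SKETCH_VMID_PER_DEVICE` and `SKETCH_MAX_HW_QUEUES` are placeholder values, not the driver's constants, and `sketch_create_queue` only models the bookkeeping, not runlist handling.

```c
#include <errno.h>
#include <stdbool.h>

/* Placeholder limits for the sketch; the driver's real values may differ. */
#define SKETCH_VMID_PER_DEVICE 8u
#define SKETCH_MAX_HW_QUEUES   32u

enum sketch_policy { SKETCH_HWS, SKETCH_HWS_NO_OVERSUBSCRIPTION };

struct sketch_dqm {
	unsigned int processes_count;
	unsigned int queue_count;
};

/* True when one more compute queue must be refused under the given policy. */
bool sketch_would_oversubscribe(enum sketch_policy policy,
				const struct sketch_dqm *dqm)
{
	return policy == SKETCH_HWS_NO_OVERSUBSCRIPTION &&
	       (dqm->processes_count >= SKETCH_VMID_PER_DEVICE ||
		dqm->queue_count >= SKETCH_MAX_HW_QUEUES);
}

/* Check before touching any state, so a refused request changes nothing. */
int sketch_create_queue(enum sketch_policy policy, struct sketch_dqm *dqm)
{
	if (sketch_would_oversubscribe(policy, dqm))
		return -EPERM;

	dqm->queue_count++;
	return 0;
}
```

Because the predicate runs before any state is modified, a rejected request leaves the existing queues untouched, which is the property the patch restores for the running application.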
[PATCH 59/83] hsa/radeon: Exclusive access for perf. counters
From: Evgeny Pinchuk <evgeny.pinc...@amd.com>

Introduce an IOCTL implementation for controlling exclusive access to performance counters. Exclusive access is granted per GPU device.

Signed-off-by: Evgeny Pinchuk <evgeny.pinc...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 61
 drivers/gpu/hsa/radeon/kfd_device.c | 2 ++
 drivers/gpu/hsa/radeon/kfd_priv.h | 5 +++
 drivers/gpu/hsa/radeon/kfd_process.c | 8 +++--
 include/uapi/linux/kfd_ioctl.h | 12 +++
 5 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 80b702e..b39df68 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -387,6 +387,59 @@ static int kfd_ioctl_get_process_apertures(struct file *filp, struct kfd_process
 	return 0;
 }
 
+static long
+kfd_ioctl_pmc_acquire_access(struct file *filp, struct kfd_process *p, void __user *arg)
+{
+	struct kfd_ioctl_pmc_acquire_access_args args;
+	struct kfd_dev *dev;
+	int err = -EBUSY;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	dev = radeon_kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	spin_lock(&dev->pmc_access_lock);
+	if (dev->pmc_locking_process == NULL) {
+		dev->pmc_locking_process = p;
+		dev->pmc_locking_trace = args.trace_id;
+		err = 0;
+	} else if (dev->pmc_locking_process == p && dev->pmc_locking_trace == args.trace_id) {
+		/* Same trace already has an access. Returning success */
+		err = 0;
+	}
+
+	spin_unlock(&dev->pmc_access_lock);
+
+	return err;
+}
+
+static long
+kfd_ioctl_pmc_release_access(struct file *filp, struct kfd_process *p, void __user *arg)
+{
+	struct kfd_ioctl_pmc_release_access_args args;
+	struct kfd_dev *dev;
+	int err = -EINVAL;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	dev = radeon_kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	spin_lock(&dev->pmc_access_lock);
+	if (dev->pmc_locking_process == p && dev->pmc_locking_trace == args.trace_id) {
+		dev->pmc_locking_process = NULL;
+		dev->pmc_locking_trace = 0;
+		err = 0;
+	}
+	spin_unlock(&dev->pmc_access_lock);
+
+	return err;
+}
 
 static long
 kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
@@ -427,6 +480,14 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 		err = kfd_ioctl_update_queue(filep, process, (void __user *)arg);
 		break;
 
+	case KFD_IOC_PMC_ACQUIRE_ACCESS:
+		err = kfd_ioctl_pmc_acquire_access(filep, process, (void __user *) arg);
+		break;
+
+	case KFD_IOC_PMC_RELEASE_ACCESS:
+		err = kfd_ioctl_pmc_release_access(filep, process, (void __user *) arg);
+		break;
+
 	default:
 		dev_err(kfd_device,
 			"unknown ioctl cmd 0x%x, arg 0x%lx)\n",
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index c602e16..9af812b 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -185,6 +185,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 		return false;
 	}
 
+	spin_lock_init(&kfd->pmc_access_lock);
+
 	kfd->init_complete = true;
 	dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
 		 kfd->pdev->device);
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index 049671b..e6d4993 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -135,6 +135,11 @@ struct kfd_dev {
 
 	/* QCM Device instance */
 	struct device_queue_manager *dqm;
+
+	/* Performance counters exclusivity lock */
+	spinlock_t pmc_access_lock;
+	struct kfd_process *pmc_locking_process;
+	uint64_t pmc_locking_trace;
 };
 
 /* KGD2KFD callbacks */
diff --git a/drivers/gpu/hsa/radeon/kfd_process.c b/drivers/gpu/hsa/radeon/kfd_process.c
index f967c15..9bb5cab 100644
--- a/drivers/gpu/hsa/radeon/kfd_process.c
+++ b/drivers/gpu/hsa/radeon/kfd_process.c
@@ -96,9 +96,13 @@ static void free_process(struct kfd_process *p)
 
 	BUG_ON(p == NULL);
 
-	/* doorbell mappings: automatic */
-
 	list_for_each_entry_safe(pdd, temp, &p->per_device_data, per_device_list) {
+		spin_lock(&pdd->dev->pmc_access_lock);
+		if (pdd->dev->pmc_locking_process == p) {
+			pdd->dev->pmc_locking_process = NULL;
+			pdd->dev->pmc_locking_trace = 0;
+		}
+		spin_unlock(&pdd
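The acquire/release rules in the IOCTL handlers above can be modelled as a small state machine. The sketch below is a user-space illustration with hypothetical names (`sketch_pmc_state`, `sketch_pmc_acquire`, `sketch_pmc_release`); it omits the spinlock, so it assumes the caller serializes access, whereas the patch takes `pmc_access_lock` around the same checks.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Per-device ownership record; owner is NULL while the counters are free. */
struct sketch_pmc_state {
	const void *owner;
	uint64_t trace_id;
};

/*
 * Grant access if the counters are free, or if the same owner asks again
 * with the same trace id. Any other request is refused with -EBUSY.
 */
int sketch_pmc_acquire(struct sketch_pmc_state *s, const void *proc,
		       uint64_t trace_id)
{
	if (s->owner == NULL) {
		s->owner = proc;
		s->trace_id = trace_id;
		return 0;
	}
	if (s->owner == proc && s->trace_id == trace_id)
		return 0;
	return -EBUSY;
}

/* Only the current owner, naming the same trace id, may release. */
int sketch_pmc_release(struct sketch_pmc_state *s, const void *proc,
		       uint64_t trace_id)
{
	if (s->owner == proc && s->trace_id == trace_id) {
		s->owner = NULL;
		s->trace_id = 0;
		return 0;
	}
	return -EINVAL;
}
```

Making a repeated acquire by the same owner succeed keeps the operation idempotent, while a release by anyone else fails without changing ownership.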
[PATCH 79/83] hsa/radeon: Update module version to 0.6.1
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_module.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index 33cee3c..c706236 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -30,10 +30,10 @@
 
 #define KFD_DRIVER_AUTHOR	"AMD Inc. and others"
 #define KFD_DRIVER_DESC		"Standalone HSA driver for AMD's GPUs"
-#define KFD_DRIVER_DATE		"20140424"
+#define KFD_DRIVER_DATE		"20140623"
 #define KFD_DRIVER_MAJOR	0
 #define KFD_DRIVER_MINOR	6
-#define KFD_DRIVER_PATCHLEVEL	0
+#define KFD_DRIVER_PATCHLEVEL	1
 
 const struct kfd2kgd_calls *kfd2kgd;
 static const struct kgd2kfd_calls kgd2kfd = {
--
1.9.1
[PATCH 81/83] hsa/radeon: Eliminating all direct register accesses
From: Ben Goz ben@amd.com This patch eliminates all direct register accesses from KFD and eliminate using of shared locks between KFD and radeon. The single exception is the doorbells that are used in both of the drivers. However, because they are located in separate pci bar pages, the danger of sharing registers between the drivers is minimal. Having said that, we are planning to move the doorbells as well to radeon. Signed-off-by: Ben Goz ben@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile | 2 +- drivers/gpu/hsa/radeon/kfd_device.c | 2 - drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 113 +++--- drivers/gpu/hsa/radeon/kfd_kernel_queue.c | 12 +- drivers/gpu/hsa/radeon/kfd_mqd_manager.c | 175 +- drivers/gpu/hsa/radeon/kfd_mqd_manager.h | 37 +++-- drivers/gpu/hsa/radeon/kfd_priv.h | 18 --- drivers/gpu/hsa/radeon/kfd_registers.c| 50 --- 8 files changed, 54 insertions(+), 355 deletions(-) delete mode 100644 drivers/gpu/hsa/radeon/kfd_registers.c diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index b5f05b4..d838bce 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -4,7 +4,7 @@ radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ - kfd_doorbell.o kfd_registers.o kfd_vidmem.o \ + kfd_doorbell.o kfd_vidmem.o \ kfd_interrupt.o kfd_aperture.o kfd_queue.o kfd_mqd_manager.o \ kfd_kernel_queue.o kfd_packet_manager.o \ kfd_process_queue_manager.o kfd_device_queue_manager.o diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c index 30558c9..0ff2241 100644 --- a/drivers/gpu/hsa/radeon/kfd_device.c +++ b/drivers/gpu/hsa/radeon/kfd_device.c @@ -157,8 +157,6 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, { kfd-shared_resources = *gpu_resources; - kfd-regs = gpu_resources-mmio_registers; - radeon_kfd_doorbell_init(kfd); if (radeon_kfd_interrupt_init(kfd)) diff --git 
a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c index 12b8b33..3eb5db3 100644 --- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c +++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c @@ -112,30 +112,15 @@ static void init_process_memory(struct device_queue_manager *dqm, struct qcm_pro static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm_process_device *qpd) { - struct mqd_manager *mqd; - - BUG_ON(qpd-vmid KFD_VMID_START_OFFSET); - - mqd = dqm-get_mqd_manager(dqm, KFD_MQD_TYPE_CIK_COMPUTE); - if (mqd == NULL) - return; - - mqd-acquire_hqd(mqd, 0, 0, qpd-vmid); - - WRITE_REG(dqm-dev, SH_MEM_CONFIG, qpd-sh_mem_config); - - WRITE_REG(dqm-dev, SH_MEM_APE1_BASE, qpd-sh_mem_ape1_base); - WRITE_REG(dqm-dev, SH_MEM_APE1_LIMIT, qpd-sh_mem_ape1_limit); - WRITE_REG(dqm-dev, SH_MEM_BASES, qpd-sh_mem_bases); - - mqd-release_hqd(mqd); + return kfd2kgd-program_sh_mem_settings(dqm-dev-kgd, qpd-vmid, qpd-sh_mem_config, + qpd-sh_mem_ape1_base, qpd-sh_mem_ape1_limit, qpd-sh_mem_bases); } static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *q, struct qcm_process_device *qpd, int *allocate_vmid) { bool set, is_new_vmid; - int bit, retval, pipe; + int bit, retval, pipe, i; struct mqd_manager *mqd; BUG_ON(!dqm || !q || !qpd || !allocate_vmid); @@ -171,8 +156,8 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue * q-properties.vmid = qpd-vmid; set = false; - for (pipe = dqm-next_pipe_to_allocate; pipe get_pipes_num(dqm); - pipe = (pipe + 1) % get_pipes_num(dqm)) { + for (i = 0, pipe = dqm-next_pipe_to_allocate; i get_pipes_num(dqm); + pipe = (pipe + i++) % get_pipes_num(dqm)) { if (dqm-allocated_queues[pipe] != 0) { bit = find_first_bit((unsigned long *)dqm-allocated_queues[pipe], QUEUES_PER_PIPE); clear_bit(bit, (unsigned long *)dqm-allocated_queues[pipe]); @@ -238,9 +223,7 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm, 
struct qcm_pr retval = -ENOMEM; goto out; } - mqd-acquire_hqd(mqd, q-pipe, q-queue, 0); - retval = mqd-destroy_mqd(mqd, q-mqd, KFD_PREEMPT_TYPE_WAVEFRONT, QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS); - mqd-release_hqd(mqd); + retval = mqd-destroy_mqd(mqd, false, QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS, q-pipe, q-queue); if (retval != 0) goto
[PATCH 77/83] hsa/radeon: Add local memory to topology
From: Alexey Skidanov <alexey.skida...@amd.com>

Signed-off-by: Alexey Skidanov <alexey.skida...@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_topology.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index 059b7db..d3aaad1 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -715,6 +715,8 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
 		sysfs_show_32bit_prop(buffer, "max_engine_clk_fcompute",
 			kfd2kgd->get_max_engine_clock_in_mhz(
 				dev->gpu->kgd));
+		sysfs_show_64bit_prop(buffer, "local_mem_size",
+			kfd2kgd->get_vmem_size(dev->gpu->kgd));
 		ret = sysfs_show_32bit_prop(buffer, "max_engine_clk_ccompute",
 			cpufreq_quick_get_max(0)/1000);
 	}
--
1.9.1
[PATCH 78/83] hsa/radeon: Don't verify cksum when parsing CRAT table
This patch removes the checksum verification done when parsing a CRAT table. The verification was both erroneous and redundant, as it is already done by another piece of kernel code.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_topology.c | 29 ++---
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index d3aaad1..b686b7e 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -38,21 +38,6 @@
 static struct kfd_system_properties sys_props;
 static DECLARE_RWSEM(topology_lock);
 
-
-static uint8_t checksum_image(const void *buf, size_t len)
-{
-	uint8_t *p = (uint8_t *)buf;
-	uint8_t sum = 0;
-
-	if (!buf)
-		return 0;
-
-	while (len-- > 0)
-		sum += *p++;
-
-	return sum;
- }
-
 struct kfd_dev *radeon_kfd_device_by_id(uint32_t gpu_id)
 {
 	struct kfd_topology_device *top_dev;
@@ -97,9 +82,9 @@ static int kfd_topology_get_crat_acpi(void *crat_image, size_t *size)
 	if (!size)
 		return -EINVAL;
 
-/*
+	/*
 	 * Fetch the CRAT table from ACPI
- */
+*/
 	status = acpi_get_table(CRAT_SIGNATURE, 0, &crat_table);
 	if (status == AE_NOT_FOUND) {
 		pr_warn("CRAT table not found\n");
@@ -111,16 +96,6 @@ static int kfd_topology_get_crat_acpi(void *crat_image, size_t *size)
 		return -EINVAL;
 	}
 
-	/*
-* The checksum of the table should be verified
-*/
-	if (checksum_image(crat_table, crat_table->length) ==
-		crat_table->checksum) {
-		pr_err("Bad checksum for the CRAT table\n");
-		return -EINVAL;
-}
-
-
 	if (*size >= crat_table->length && crat_image != 0)
 		memcpy(crat_image, crat_table, crat_table->length);
 
--
1.9.1
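For context on why comparing a byte sum against the stored checksum field is not the right test: ACPI system description tables choose the checksum byte so that all bytes of the table, the checksum byte included, add up to zero modulo 256. The sketch below shows that rule in user space. `sketch_byte_sum` and `sketch_table_checksum_ok` are hypothetical helper names, and this is not the verification the kernel's ACPI code performs internally, only an illustration of the rule.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum all bytes of a buffer, wrapping modulo 256. */
uint8_t sketch_byte_sum(const uint8_t *buf, size_t len)
{
	uint8_t sum = 0;

	while (len-- > 0)
		sum = (uint8_t)(sum + *buf++);

	return sum;
}

/* A table is valid when the sum over the whole table is zero. */
bool sketch_table_checksum_ok(const uint8_t *table, size_t len)
{
	return table != NULL && len > 0 && sketch_byte_sum(table, len) == 0;
}
```

For example, the three bytes 0x10, 0x20, 0xD0 sum to 0x100, which wraps to zero, so that buffer passes; changing any single byte makes it fail.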
[PATCH 75/83] hsa/radeon: Fixing minor issues with kernel queues (DIQ)
From: Ben Goz <ben@amd.com>

* Re-execute the runlist on kernel queue destruction.
* Delete kernel queues from the pqm's queue list on pqm uninit.

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 4
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 4c53e57..12b8b33 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -759,6 +759,10 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
 {
 	BUG_ON(!dqm || !kq);
 
+	pr_debug("kfd: In %s\n", __func__);
+
+	dqm->destroy_queues(dqm);
+
 	mutex_lock(&dqm->lock);
 	list_del(&kq->list);
 	dqm->queue_count--;
diff --git a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
index 89461ab..5d7c46d 100644
--- a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
@@ -273,10 +273,10 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 		if (retval != 0)
 			return retval;
 
-		list_del(&pqn->process_queue_list);
 		uninit_queue(pqn->q);
 	}
 
+	list_del(&pqn->process_queue_list);
 	kfree(pqn);
 	clear_bit(qid, pqm->queue_slot_bitmap);
 
--
1.9.1
[PATCH 71/83] hsa/radeon: Remove old scheduler code
Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile | 5 +- drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 987 -- 2 files changed, 2 insertions(+), 990 deletions(-) delete mode 100644 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index 26ce0ae..b5f05b4 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -4,9 +4,8 @@ radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ - kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \ - kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \ - kfd_queue.o kfd_mqd_manager.o \ + kfd_doorbell.o kfd_registers.o kfd_vidmem.o \ + kfd_interrupt.o kfd_aperture.o kfd_queue.o kfd_mqd_manager.o \ kfd_kernel_queue.o kfd_packet_manager.o \ kfd_process_queue_manager.o kfd_device_queue_manager.o diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c deleted file mode 100644 index d576d95..000 --- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c +++ /dev/null @@ -1,987 +0,0 @@ -/* - * Copyright 2014 Advanced Micro Devices, Inc. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the Software), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. 
- * - * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR - * OTHER DEALINGS IN THE SOFTWARE. - */ - -#include linux/log2.h -#include linux/mutex.h -#include linux/slab.h -#include linux/types.h -#include linux/uaccess.h -#include linux/device.h -#include linux/sched.h -#include kfd_priv.h -#include kfd_scheduler.h -#include cik_regs.h -#include cik_int.h - -/* CIK CP hardware is arranged with 8 queues per pipe and 8 pipes per MEC (microengine for compute). - * The first MEC is ME 1 with the GFX ME as ME 0. - * We split the CP with the KGD, they take the first N pipes and we take the rest. - */ -#define CIK_QUEUES_PER_PIPE 8 -#define CIK_PIPES_PER_MEC 4 - -#define CIK_MAX_PIPES (2 * CIK_PIPES_PER_MEC) - -#define CIK_NUM_VMID 16 - -#define CIK_HPD_SIZE_LOG2 11 -#define CIK_HPD_SIZE (1U CIK_HPD_SIZE_LOG2) -#define CIK_HPD_ALIGNMENT 256 -#define CIK_MQD_ALIGNMENT 4 - -#pragma pack(push, 4) - -struct cik_hqd_registers { - u32 cp_mqd_base_addr; - u32 cp_mqd_base_addr_hi; - u32 cp_hqd_active; - u32 cp_hqd_vmid; - u32 cp_hqd_persistent_state; - u32 cp_hqd_pipe_priority; - u32 cp_hqd_queue_priority; - u32 cp_hqd_quantum; - u32 cp_hqd_pq_base; - u32 cp_hqd_pq_base_hi; - u32 cp_hqd_pq_rptr; - u32 cp_hqd_pq_rptr_report_addr; - u32 cp_hqd_pq_rptr_report_addr_hi; - u32 cp_hqd_pq_wptr_poll_addr; - u32 cp_hqd_pq_wptr_poll_addr_hi; - u32 cp_hqd_pq_doorbell_control; - u32 cp_hqd_pq_wptr; - u32 cp_hqd_pq_control; - u32 cp_hqd_ib_base_addr; - u32 cp_hqd_ib_base_addr_hi; - u32 cp_hqd_ib_rptr; - u32 cp_hqd_ib_control; - u32 cp_hqd_iq_timer; - u32 cp_hqd_iq_rptr; - u32 cp_hqd_dequeue_request; 
- u32 cp_hqd_dma_offload; - u32 cp_hqd_sema_cmd; - u32 cp_hqd_msg_type; - u32 cp_hqd_atomic0_preop_lo; - u32 cp_hqd_atomic0_preop_hi; - u32 cp_hqd_atomic1_preop_lo; - u32 cp_hqd_atomic1_preop_hi; - u32 cp_hqd_hq_scheduler0; - u32 cp_hqd_hq_scheduler1; - u32 cp_mqd_control; -}; - -struct cik_mqd { - u32 header; - u32 dispatch_initiator; - u32 dimensions[3]; - u32 start_idx[3]; - u32 num_threads[3]; - u32 pipeline_stat_enable; - u32 perf_counter_enable; - u32 pgm[2]; - u32 tba[2]; - u32 tma[2]; - u32 pgm_rsrc[2]; - u32 vmid; - u32 resource_limits; - u32 static_thread_mgmt01[2]; - u32 tmp_ring_size; - u32 static_thread_mgmt23[2]; - u32 restart[3]; - u32 thread_trace_enable; - u32 reserved1
[PATCH 65/83] hsa/radeon: fixing a bug to support 32b processes
From: Ben Goz <ben@amd.com>

This commit fixes a bug in the support for 32-bit HSA processes.

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/cik_regs.h                 | 1 +
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 8 +++++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
index fa5ec01..a6404e3 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -45,6 +45,7 @@
 /* if PTR32, this is the upper limit of GPUVM */
 #define	SH_MEM_CONFIG				0x8C34
 #define	PTR32					(1 << 0)
+#define	PRIVATE_ATC				(1 << 1)
 #define	ALIGNMENT_MODE(x)			((x) << 2)
 #define	SH_MEM_ALIGNMENT_MODE_DWORD		0
 #define	SH_MEM_ALIGNMENT_MODE_DWORD_STRICT	1
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 01573b1..3e1def1 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -90,15 +90,17 @@ static void init_process_memory(struct device_queue_manager *dqm, struct qcm_pro
 	if (qpd->pqm->process->is_32bit_user_mode) {
 		temp = get_sh_mem_bases_32(qpd->pqm->process, dqm->dev);
 		qpd->sh_mem_bases = SHARED_BASE(temp);
+		qpd->sh_mem_config = PTR32;
 	} else {
 		temp = get_sh_mem_bases_nybble_64(qpd->pqm->process, dqm->dev);
 		qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
+		qpd->sh_mem_config = 0;
 	}

-	qpd->sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
+	qpd->sh_mem_config |= ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
 	qpd->sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
 	qpd->sh_mem_ape1_limit = 0;
-	qpd->sh_mem_ape1_base = 1;
+	qpd->sh_mem_ape1_base = 0;

 	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
 		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
@@ -854,7 +856,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm)
 	}

 	if (dqm->queue_count <= 0 || dqm->processes_count <= 0)
-		return 0;
+		return 0;

 	mutex_lock(&dqm->lock);
 	if (dqm->active_runlist) {
-- 
1.9.1
[PATCH 72/83] hsa/radeon: Static analysis (smatch) fixes
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_device.c               |  3 +++
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c |  2 +-
 drivers/gpu/hsa/radeon/kfd_mqd_manager.c          |  1 +
 drivers/gpu/hsa/radeon/kfd_packet_manager.c       |  3 ++-
 drivers/gpu/hsa/radeon/kfd_process.c              | 10 ++++++----
 5 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index 9af812b..30558c9 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -88,6 +88,9 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev)
 		return NULL;

 	kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
+	if (!kfd)
+		return NULL;
+
 	kfd->kgd = kgd;
 	kfd->device_info = device_info;
 	kfd->pdev = pdev;
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 56875f9..4931f8a 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -317,7 +317,7 @@ static struct mqd_manager *get_mqd_manager_nocpsch(struct device_queue_manager *
 {
 	struct mqd_manager *mqd;

-	BUG_ON(!dqm || type > KFD_MQD_TYPE_MAX);
+	BUG_ON(!dqm || type >= KFD_MQD_TYPE_MAX);

 	pr_debug("kfd: In func %s mqd type %d\n", __func__, type);

diff --git a/drivers/gpu/hsa/radeon/kfd_mqd_manager.c b/drivers/gpu/hsa/radeon/kfd_mqd_manager.c
index a3e9f7c..8c1192e 100644
--- a/drivers/gpu/hsa/radeon/kfd_mqd_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_mqd_manager.c
@@ -437,6 +437,7 @@ struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type, struct kfd_dev *dev
 		mqd->uninitialize = uninitialize;
 		break;
 	default:
+		kfree(mqd);
 		return NULL;
 		break;
 	}
diff --git a/drivers/gpu/hsa/radeon/kfd_packet_manager.c b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
index 621a720..5cd23b0 100644
--- a/drivers/gpu/hsa/radeon/kfd_packet_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
@@ -85,9 +85,10 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int **rl_b

 	BUG_ON(!pm);
 	BUG_ON(pm->allocated == true);
+	BUG_ON(is_over_subscription == NULL);

 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
-	if (is_over_subscription &&
+	if (*is_over_subscription &&
 	    sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION)
 		return -EFAULT;

diff --git a/drivers/gpu/hsa/radeon/kfd_process.c b/drivers/gpu/hsa/radeon/kfd_process.c
index eb30cb3..aacc7ef 100644
--- a/drivers/gpu/hsa/radeon/kfd_process.c
+++ b/drivers/gpu/hsa/radeon/kfd_process.c
@@ -146,15 +146,15 @@ static struct kfd_process *create_process(const struct task_struct *thread)

 	process = kzalloc(sizeof(*process), GFP_KERNEL);
 	if (!process)
-		goto err_alloc;
+		goto err_alloc_process;

 	process->queues = kmalloc_array(INITIAL_QUEUE_ARRAY_SIZE, sizeof(process->queues[0]), GFP_KERNEL);
 	if (!process->queues)
-		goto err_alloc;
+		goto err_alloc_queues;

 	process->pasid = radeon_kfd_pasid_alloc();
 	if (process->pasid == 0)
-		goto err_alloc;
+		goto err_alloc_pasid;

 	mutex_init(&process->mutex);

@@ -178,9 +178,11 @@ err_process_pqm_init:
 	radeon_kfd_pasid_free(process->pasid);
 	list_del(&process->processes_list);
 	thread->mm->kfd_process = NULL;
-err_alloc:
+err_alloc_pasid:
 	kfree(process->queues);
+err_alloc_queues:
 	kfree(process);
+err_alloc_process:
 	return ERR_PTR(err);
 }

-- 
1.9.1
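The kfd_process.c hunk in this patch replaces the single err_alloc label with one label per acquired resource, so that each failure path releases only what was already obtained. A minimal userspace sketch of that unwinding pattern follows. It is not the driver code: malloc/free stand in for kzalloc/kfree, and demo_process, demo_create_process, fail_at and live_allocs are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kfd_process; not the driver's type. */
struct demo_process {
	int *queues;
	unsigned int pasid;
};

/* Counts outstanding allocations so that leaks become observable. */
static int live_allocs;

static void *demo_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/*
 * fail_at simulates a failure: 1 = process allocation, 2 = queue array
 * allocation, 3 = PASID allocation, 0 = no failure.
 */
static struct demo_process *demo_create_process(int fail_at)
{
	struct demo_process *process;

	process = (fail_at == 1) ? NULL : demo_alloc(sizeof(*process));
	if (!process)
		goto err_alloc_process;

	process->queues = (fail_at == 2) ? NULL : demo_alloc(16 * sizeof(int));
	if (!process->queues)
		goto err_alloc_queues;

	process->pasid = (fail_at == 3) ? 0 : 1;
	if (process->pasid == 0)
		goto err_alloc_pasid;

	return process;

	/* Labels run in reverse order of acquisition and fall through. */
err_alloc_pasid:
	demo_free(process->queues);
err_alloc_queues:
	demo_free(process);
err_alloc_process:
	return NULL;
}

static void demo_destroy_process(struct demo_process *process)
{
	demo_free(process->queues);
	demo_free(process);
}
```

With a single shared label, a failure of the first allocation would have reached kfree(process->queues) through a NULL process pointer. Splitting the labels avoids that dereference, which is presumably what smatch reported.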
[PATCH 70/83] hsa/radeon: Fix compilation warnings
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 51f790f..09c9a61 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -148,21 +148,22 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a
 	q_properties.priority = args.queue_priority;
 	q_properties.queue_address = args.ring_base_address;
 	q_properties.queue_size = args.ring_size;
-	q_properties.read_ptr = args.read_pointer_address;
-	q_properties.write_ptr = args.write_pointer_address;
+	q_properties.read_ptr = (qptr_t *) args.read_pointer_address;
+	q_properties.write_ptr = (qptr_t *) args.write_pointer_address;

 	pr_debug("%s Arguments: Queue Percentage (%d, %d)\n"
 			"Queue Priority (%d, %d)\n"
 			"Queue Address (0x%llX, 0x%llX)\n"
-			"Queue Size (%llX, %u)\n",
-			"Queue r/w Pointers (%llX, %llX)\n",
+			"Queue Size (0x%llX, %u)\n"
+			"Queue r/w Pointers (0x%llX, 0x%llX)\n",
 			__func__,
 			q_properties.queue_percent, args.queue_percentage,
 			q_properties.priority, args.queue_priority,
 			q_properties.queue_address, args.ring_base_address,
 			q_properties.queue_size, args.ring_size,
-			q_properties.read_ptr, q_properties.write_ptr);
+			(uint64_t) q_properties.read_ptr,
+			(uint64_t) q_properties.write_ptr);

 	dev = radeon_kfd_device_by_id(args.gpu_id);
 	if (dev == NULL)
-- 
1.9.1
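The warnings addressed by this patch come from two places: assigning a 64-bit ioctl field to a pointer member without a cast, and passing pointers to a %llX conversion. A small userspace sketch of both conversions is given below. Note the swap: the patch casts directly with (qptr_t *) and (uint64_t), whereas this sketch goes through uintptr_t, which is an assumption on my side that keeps the casts warning-free on 32-bit builds as well. demo_qptr_t, demo_u64_to_ptr and demo_format_ptr are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's qptr_t. */
typedef uint32_t demo_qptr_t;

/* Convert a 64-bit address taken from an ioctl argument into a typed pointer. */
static demo_qptr_t *demo_u64_to_ptr(uint64_t address)
{
	/* The uintptr_t step avoids an integer-size mismatch on 32-bit builds. */
	return (demo_qptr_t *)(uintptr_t)address;
}

/* Format a pointer so that the argument type matches "0x%llX" exactly. */
static int demo_format_ptr(char *buf, size_t size, const demo_qptr_t *ptr)
{
	return snprintf(buf, size, "0x%llX",
			(unsigned long long)(uintptr_t)ptr);
}
```

The point of the explicit cast in the format call is that a variadic function receives no implicit conversion to the type the format string expects, so the caller has to supply it.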
[PATCH 69/83] hsa/radeon: Fix initialization of sh_mem registers
This patch fixes a bug in the code flow that caused the sh_mem registers to
be overridden. As a result, the sh_mem registers were not initialized
properly, and the sh_mem registers of VMID 0 (a VMID that belongs to non-HSA
processes) were overwritten.

Reviewed-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 48 ++++++++++++++++++++++++++----------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 5ec8da7..56875f9 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -87,21 +87,25 @@ static void init_process_memory(struct device_queue_manager *dqm, struct qcm_pro
 	unsigned int temp;
 	BUG_ON(!dqm || !qpd);

+	/* check if sh_mem_config register already configured */
+	if (qpd->sh_mem_config == 0) {
+		qpd->sh_mem_config =
+			ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED) |
+			DEFAULT_MTYPE(MTYPE_NONCACHED) |
+			APE1_MTYPE(MTYPE_NONCACHED);
+		qpd->sh_mem_ape1_limit = 0;
+		qpd->sh_mem_ape1_base = 0;
+	}
+
 	if (qpd->pqm->process->is_32bit_user_mode) {
 		temp = get_sh_mem_bases_32(qpd->pqm->process, dqm->dev);
 		qpd->sh_mem_bases = SHARED_BASE(temp);
-		qpd->sh_mem_config = PTR32;
+		qpd->sh_mem_config |= PTR32;
 	} else {
 		temp = get_sh_mem_bases_nybble_64(qpd->pqm->process, dqm->dev);
 		qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
-		qpd->sh_mem_config = 0;
 	}

-	qpd->sh_mem_config |= ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
-	qpd->sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
-	qpd->sh_mem_ape1_limit = 0;
-	qpd->sh_mem_ape1_base = 0;
-
 	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
 		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
 }
@@ -110,6 +114,8 @@ static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm
 {
 	struct mqd_manager *mqd;

+	BUG_ON(qpd->vmid < KFD_VMID_START_OFFSET);
+
 	mqd = dqm->get_mqd_manager(dqm, KFD_MQD_TYPE_CIK_COMPUTE);
 	if (mqd == NULL)
 		return;
@@ -139,12 +145,6 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *
 	print_queue(q);

 	mutex_lock(&dqm->lock);
-	/* later memory apertures should be initialized in lazy mode */
-	if (!is_mem_initialized)
-		if (init_memory(dqm) != 0) {
-			retval = -ENODATA;
-			goto init_memory_failed;
-		}

 	if (dqm->vmid_bitmap == 0 && qpd->vmid == 0) {
 		retval = -ENOMEM;
@@ -217,7 +217,6 @@ no_hqd:
 		*allocate_vmid = qpd->vmid = q->properties.vmid = 0;
 	}
 no_vmid:
-init_memory_failed:
 	mutex_unlock(&dqm->lock);
 	return retval;
 }
@@ -951,20 +950,25 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 		qpd->sh_mem_ape1_limit = limit >> 16;
 	}

-	default_mtype = (default_policy == cache_policy_coherent) ? MTYPE_NONCACHED : MTYPE_CACHED;
-	ape1_mtype = (alternate_policy == cache_policy_coherent) ? MTYPE_NONCACHED : MTYPE_CACHED;
+	default_mtype = (default_policy == cache_policy_coherent) ?
+			MTYPE_NONCACHED :
+			MTYPE_CACHED;
+
+	ape1_mtype = (alternate_policy == cache_policy_coherent) ?
+			MTYPE_NONCACHED :
+			MTYPE_CACHED;

-	qpd->sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED)
+	qpd->sh_mem_config = (qpd->sh_mem_config & PTR32)
+			| ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED)
 			| DEFAULT_MTYPE(default_mtype)
 			| APE1_MTYPE(ape1_mtype);

-
-	if (sched_policy == KFD_SCHED_POLICY_NO_HWS)
+	if ((sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
 		program_sh_mem_settings(dqm, qpd);

-
-	pr_debug("kfd: sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n", qpd->sh_mem_config,
-		qpd->sh_mem_ape1_base, qpd->sh_mem_ape1_limit);
+	pr_debug("kfd: sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
+		qpd->sh_mem_config, qpd->sh_mem_ape1_base,
+		qpd->sh_mem_ape1_limit);

 	mutex_unlock(&dqm->lock);
 	return true;
-- 
1.9.1
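The two ideas in this patch are that sh_mem_config gets its defaults only while it is still zero, and that a later cache policy update must keep the PTR32 bit instead of rebuilding the register value from scratch. The sketch below illustrates both in plain C. PTR32 and ALIGNMENT_MODE follow the cik_regs.h context quoted earlier in this series; the shifts used for DEFAULT_MTYPE and APE1_MTYPE and the values of MTYPE_CACHED, MTYPE_NONCACHED and SH_MEM_ALIGNMENT_MODE_UNALIGNED are assumptions, since the patch does not show them. struct demo_qpd and both functions are hypothetical, reduced stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTR32				(1u << 0)
#define ALIGNMENT_MODE(x)		((uint32_t)(x) << 2)
/* Assumed values: these definitions are not shown in the patch. */
#define SH_MEM_ALIGNMENT_MODE_UNALIGNED	3
#define DEFAULT_MTYPE(x)		((uint32_t)(x) << 4)
#define APE1_MTYPE(x)			((uint32_t)(x) << 7)
#define MTYPE_CACHED			0
#define MTYPE_NONCACHED			3

/* Hypothetical, reduced stand-in for struct qcm_process_device. */
struct demo_qpd {
	uint32_t sh_mem_config;
	bool is_32bit_user_mode;
};

/* Apply the defaults once, then add PTR32 for 32-bit processes. */
static void demo_init_process_memory(struct demo_qpd *qpd)
{
	if (qpd->sh_mem_config == 0)
		qpd->sh_mem_config =
			ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED) |
			DEFAULT_MTYPE(MTYPE_NONCACHED) |
			APE1_MTYPE(MTYPE_NONCACHED);

	if (qpd->is_32bit_user_mode)
		qpd->sh_mem_config |= PTR32;
}

/* Rebuild the policy bits while carrying over only the PTR32 bit. */
static void demo_set_cache_policy(struct demo_qpd *qpd,
				  bool default_coherent,
				  bool alternate_coherent)
{
	uint32_t default_mtype = default_coherent ?
				 MTYPE_NONCACHED : MTYPE_CACHED;
	uint32_t ape1_mtype = alternate_coherent ?
			      MTYPE_NONCACHED : MTYPE_CACHED;

	qpd->sh_mem_config = (qpd->sh_mem_config & PTR32)
			| ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED)
			| DEFAULT_MTYPE(default_mtype)
			| APE1_MTYPE(ape1_mtype);
}
```

Masking with PTR32 before OR-ing in the new fields is what lets a 32-bit process change its cache policy without silently becoming a 64-bit one from the hardware's point of view.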
[PATCH 66/83] hsa/radeon: Fix number of pipes per ME
From: Ben Goz <ben@amd.com>

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 2 +-
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.h | 2 +-
 drivers/gpu/hsa/radeon/kfd_packet_manager.c       | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 3e1def1..5ec8da7 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -55,7 +55,7 @@ static inline unsigned int get_first_pipe(struct device_queue_manager *dqm)

 static inline unsigned int get_pipes_num_cpsch(void)
 {
-	return PIPE_PER_ME_CP_SCHEDULING - 1;
+	return PIPE_PER_ME_CP_SCHEDULING;
 }

 static unsigned int get_sh_mem_bases_nybble_64(struct kfd_process *process, struct kfd_dev *dev)
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
index 57dc636..037eaf8 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
@@ -31,7 +31,7 @@

 #define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS	(500)
 #define QUEUES_PER_PIPE				(8)
-#define PIPE_PER_ME_CP_SCHEDULING		(4)
+#define PIPE_PER_ME_CP_SCHEDULING		(3)
 #define CIK_VMID_NUM				(8)
 #define KFD_VMID_START_OFFSET			(8)
 #define VMID_PER_DEVICE				CIK_VMID_NUM
diff --git a/drivers/gpu/hsa/radeon/kfd_packet_manager.c b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
index 3fc8c34..621a720 100644
--- a/drivers/gpu/hsa/radeon/kfd_packet_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
@@ -62,7 +62,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm, unsigned int *rlib_size
 	/* check if there is over subscription*/
 	*over_subscription = false;
 	if ((process_count >= VMID_PER_DEVICE) ||
-		queue_count >= PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE) {
+		queue_count > PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE) {
 		*over_subscription = true;
 		pr_debug("kfd: over subscribed runlist\n");
 	}
-- 
1.9.1
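To make the effect of the new constants concrete, the over-subscription test from pm_calc_rlib_size can be written as a standalone predicate. This is only a sketch: the constants are the post-patch values from kfd_device_queue_manager.h, the comparison operators are read from the quoted hunk (where the archive dropped some characters, so treat them as an assumption), and demo_is_over_subscribed is a hypothetical name.

```c
#include <stdbool.h>

#define QUEUES_PER_PIPE			8
#define PIPE_PER_ME_CP_SCHEDULING	3
#define VMID_PER_DEVICE			8

/*
 * A runlist counts as over-subscribed when there are at least as many
 * processes as VMIDs, or more queues than the scheduled pipes can hold.
 */
static bool demo_is_over_subscribed(unsigned int process_count,
				    unsigned int queue_count)
{
	return process_count >= VMID_PER_DEVICE ||
	       queue_count > PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE;
}
```

With 3 pipes of 8 queues each, 24 queues still fit and the 25th queue is the first one that trips the check.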
[PATCH 67/83] hsa/radeon: Removing hw pointer store module
From: Ben Goz ben@amd.com This module is unnecessary as we allocating read/write pointers from userspace thunk layer Signed-off-by: Ben Goz ben@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile| 2 +- drivers/gpu/hsa/radeon/kfd_chardev.c | 22 +-- drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c | 149 - drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h | 64 - drivers/gpu/hsa/radeon/kfd_priv.h | 10 +- drivers/gpu/hsa/radeon/kfd_process.c | 1 - drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 62 ++--- 7 files changed, 23 insertions(+), 287 deletions(-) delete mode 100644 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c delete mode 100644 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index 3409203..26ce0ae 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -6,7 +6,7 @@ radeon_kfd-y:= kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \ kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \ - kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \ + kfd_queue.o kfd_mqd_manager.o \ kfd_kernel_queue.o kfd_packet_manager.o \ kfd_process_queue_manager.o kfd_device_queue_manager.o diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c index b39df68..51f790f 100644 --- a/drivers/gpu/hsa/radeon/kfd_chardev.c +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c @@ -32,9 +32,9 @@ #include linux/time.h #include kfd_priv.h #include linux/mm.h +#include linux/uaccess.h #include uapi/asm-generic/mman-common.h #include asm/processor.h -#include kfd_hw_pointer_store.h #include kfd_device_queue_manager.h static long kfd_ioctl(struct file *, unsigned int, unsigned long); @@ -137,24 +137,32 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a if (copy_from_user(args, arg, sizeof(args))) return -EFAULT; - /* need 
to validate parameters */ + if (!access_ok(VERIFY_WRITE, args.read_pointer_address, sizeof(qptr_t))) + return -EFAULT; + + if (!access_ok(VERIFY_WRITE, args.write_pointer_address, sizeof(qptr_t))) + return -EFAULT; q_properties.is_interop = false; q_properties.queue_percent = args.queue_percentage; q_properties.priority = args.queue_priority; q_properties.queue_address = args.ring_base_address; q_properties.queue_size = args.ring_size; + q_properties.read_ptr = args.read_pointer_address; + q_properties.write_ptr = args.write_pointer_address; pr_debug(%s Arguments: Queue Percentage (%d, %d)\n Queue Priority (%d, %d)\n Queue Address (0x%llX, 0x%llX)\n Queue Size (%llX, %u)\n, + Queue r/w Pointers (%llX, %llX)\n, __func__, q_properties.queue_percent, args.queue_percentage, q_properties.priority, args.queue_priority, q_properties.queue_address, args.ring_base_address, - q_properties.queue_size, args.ring_size); + q_properties.queue_size, args.ring_size, + q_properties.read_ptr, q_properties.write_ptr); dev = radeon_kfd_device_by_id(args.gpu_id); if (dev == NULL) @@ -177,8 +185,6 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a goto err_create_queue; args.queue_id = queue_id; - args.read_pointer_address = (uint64_t)q_properties.read_ptr; - args.write_pointer_address = (uint64_t)q_properties.write_ptr; args.doorbell_address = (uint64_t)q_properties.doorbell_ptr; if (copy_to_user(arg, args, sizeof(args))) { @@ -515,11 +521,5 @@ kfd_mmap(struct file *filp, struct vm_area_struct *vma) if (pgoff = KFD_MMAP_DOORBELL_START pgoff KFD_MMAP_DOORBELL_END) return radeon_kfd_doorbell_mmap(process, vma); - if (pgoff = KFD_MMAP_RPTR_START pgoff KFD_MMAP_RPTR_END) - return radeon_kfd_hw_pointer_store_mmap(process-read_ptr, vma); - - if (pgoff = KFD_MMAP_WPTR_START pgoff KFD_MMAP_WPTR_END) - return radeon_kfd_hw_pointer_store_mmap(process-write_ptr, vma); - return -EINVAL; } diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c 
b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c deleted file mode 100644 index 4e71f7d..000 --- a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c +++ /dev/null @@ -1,149 +0,0 @@ -/* - * Copyright 2014
[PATCH 63/83] hsa/radeon: Update module information and version
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_module.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index 85069c5..fbfcce6 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -27,11 +27,13 @@
 #include <linux/device.h>
 #include "kfd_priv.h"

-#define DRIVER_AUTHOR		"Andrew Lewycky, Oded Gabbay, Evgeny Pinchuk, others."
+#define KFD_DRIVER_AUTHOR	"AMD Inc. and others"

-#define DRIVER_NAME		"kfd"
-#define DRIVER_DESC		"AMD HSA Kernel Fusion Driver"
-#define DRIVER_DATE		"20140127"
+#define KFD_DRIVER_DESC		"Standalone HSA driver for AMD's GPUs"
+#define KFD_DRIVER_DATE		"20140424"
+#define KFD_DRIVER_MAJOR	0
+#define KFD_DRIVER_MINOR	5
+#define KFD_DRIVER_PATCHLEVEL	0

 const struct kfd2kgd_calls *kfd2kgd;
 static const struct kgd2kfd_calls kgd2kfd = {
@@ -120,6 +122,9 @@ static void __exit kfd_module_exit(void)
 module_init(kfd_module_init);
 module_exit(kfd_module_exit);

-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
-MODULE_LICENSE("GPL");
+MODULE_AUTHOR(KFD_DRIVER_AUTHOR);
+MODULE_DESCRIPTION(KFD_DRIVER_DESC);
+MODULE_LICENSE("GPL and additional rights");
+MODULE_VERSION(__stringify(KFD_DRIVER_MAJOR) "."
+	       __stringify(KFD_DRIVER_MINOR) "."
+	       __stringify(KFD_DRIVER_PATCHLEVEL));
-- 
1.9.1
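MODULE_VERSION in this patch builds the version string at compile time from three numeric macros. The same two-level stringify technique works in ordinary C, as sketched below. DEMO_STRINGIFY, DEMO_KFD_VERSION and demo_kfd_version are hypothetical names used as userspace stand-ins for the kernel's __stringify() and the MODULE_VERSION argument.

```c
/* Two levels are needed so that the argument is macro-expanded before '#'. */
#define DEMO_STRINGIFY_1(x)	#x
#define DEMO_STRINGIFY(x)	DEMO_STRINGIFY_1(x)

#define KFD_DRIVER_MAJOR	0
#define KFD_DRIVER_MINOR	5
#define KFD_DRIVER_PATCHLEVEL	0

/* Adjacent string literals are concatenated by the compiler. */
#define DEMO_KFD_VERSION			\
	DEMO_STRINGIFY(KFD_DRIVER_MAJOR) "."	\
	DEMO_STRINGIFY(KFD_DRIVER_MINOR) "."	\
	DEMO_STRINGIFY(KFD_DRIVER_PATCHLEVEL)

/* Returns the string "0.5.0" for the values defined above. */
static const char *demo_kfd_version(void)
{
	return DEMO_KFD_VERSION;
}
```

A single-level macro would stringify the macro name itself and yield "KFD_DRIVER_MAJOR" rather than "0", which is why the indirection is there.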
[PATCH 64/83] hsa/radeon: update queue fault handling
From: Ben Goz <ben@amd.com>

This commit adds fault handling to the update queue path of the process
queue manager.

Signed-off-by: Ben Goz <ben@amd.com>
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
index fe74dd7..2034d2b 100644
--- a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
@@ -334,6 +334,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)

 int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, struct queue_properties *p)
 {
+	int retval;
 	struct process_queue_node *pqn;

 	BUG_ON(!pqm);
@@ -346,9 +347,17 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, struct
 	pqn->q->properties.queue_percent = p->queue_percent;
 	pqn->q->properties.priority = p->priority;

-	pqn->q->device->dqm->destroy_queues(pqn->q->device->dqm);
-	pqn->q->device->dqm->update_queue(pqn->q->device->dqm, pqn->q);
-	pqn->q->device->dqm->execute_queues(pqn->q->device->dqm);
+	retval = pqn->q->device->dqm->destroy_queues(pqn->q->device->dqm);
+	if (retval != 0)
+		return retval;
+
+	retval = pqn->q->device->dqm->update_queue(pqn->q->device->dqm, pqn->q);
+	if (retval != 0)
+		return retval;
+
+	retval = pqn->q->device->dqm->execute_queues(pqn->q->device->dqm);
+	if (retval != 0)
+		return retval;

 	return 0;
 }
-- 
1.9.1
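The change in this patch is that each of the three device queue manager calls now has its return value checked, and the first failure stops the sequence. A reduced sketch of that control flow with stub callbacks is shown below. struct demo_dqm, its calls counter and the demo_step_* stubs are hypothetical and exist only to make the early return observable; the real callbacks take more arguments.

```c
#include <errno.h>

/* Hypothetical, reduced stand-in for the device queue manager callbacks. */
struct demo_dqm {
	int (*destroy_queues)(struct demo_dqm *dqm);
	int (*update_queue)(struct demo_dqm *dqm);
	int (*execute_queues)(struct demo_dqm *dqm);
	int calls;	/* number of callbacks that actually ran */
};

/* Run the three steps in order and stop at the first non-zero result. */
static int demo_update_queue(struct demo_dqm *dqm)
{
	int retval;

	retval = dqm->destroy_queues(dqm);
	if (retval != 0)
		return retval;

	retval = dqm->update_queue(dqm);
	if (retval != 0)
		return retval;

	retval = dqm->execute_queues(dqm);
	if (retval != 0)
		return retval;

	return 0;
}

/* Stub callbacks: one that succeeds and one that fails with -EIO. */
static int demo_step_ok(struct demo_dqm *dqm)
{
	dqm->calls++;
	return 0;
}

static int demo_step_fail(struct demo_dqm *dqm)
{
	dqm->calls++;
	return -EIO;
}
```

Before the patch, a failed destroy_queues would still have been followed by update_queue and execute_queues, and the caller would have seen 0 regardless.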
[PATCH 60/83] hsa/radeon: Rearrange structures in kfd_ioctl.h
This patch rearranges the structures defined in kfd_ioctl.h so that, in each
structure, all the uint64_t fields come first, followed by all the uint32_t
fields.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 include/uapi/linux/kfd_ioctl.h | 51 +++++++++++++++++++++++++++------------------------
 1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 509c4a0..3cedd1a 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -42,15 +42,15 @@ struct kfd_ioctl_get_version_args {

 struct kfd_ioctl_create_queue_args {
 	uint64_t ring_base_address;	/* to KFD */
+	uint64_t write_pointer_address;	/* from KFD */
+	uint64_t read_pointer_address;	/* from KFD */
+	uint64_t doorbell_address;	/* from KFD */
+
 	uint32_t ring_size;		/* to KFD */
 	uint32_t gpu_id;		/* to KFD */
 	uint32_t queue_type;		/* to KFD */
 	uint32_t queue_percentage;	/* to KFD */
 	uint32_t queue_priority;	/* to KFD */
-
-	uint64_t write_pointer_address;	/* from KFD */
-	uint64_t read_pointer_address;	/* from KFD */
-	uint64_t doorbell_address;	/* from KFD */
 	uint32_t queue_id;		/* from KFD */
 };

@@ -59,8 +59,9 @@ struct kfd_ioctl_destroy_queue_args {
 };

 struct kfd_ioctl_update_queue_args {
-	uint32_t queue_id;		/* to KFD */
 	uint64_t ring_base_address;	/* to KFD */
+
+	uint32_t queue_id;		/* to KFD */
 	uint32_t ring_size;		/* to KFD */
 	uint32_t queue_percentage;	/* to KFD */
 	uint32_t queue_priority;	/* to KFD */
@@ -71,31 +72,33 @@ struct kfd_ioctl_update_queue_args {
 #define KFD_IOC_CACHE_POLICY_NONCOHERENT 1

 struct kfd_ioctl_set_memory_policy_args {
+	uint64_t alternate_aperture_base;	/* to KFD */
+	uint64_t alternate_aperture_size;	/* to KFD */
+
 	uint32_t gpu_id;		/* to KFD */
 	uint32_t default_policy;	/* to KFD */
 	uint32_t alternate_policy;	/* to KFD */
-	uint64_t alternate_aperture_base;	/* to KFD */
-	uint64_t alternate_aperture_size;	/* to KFD */
 };

 struct kfd_ioctl_get_clock_counters_args {
-	uint32_t gpu_id;		/* to KFD */
 	uint64_t gpu_clock_counter;	/* from KFD */
 	uint64_t cpu_clock_counter;	/* from KFD */
 	uint64_t system_clock_counter;	/* from KFD */
 	uint64_t system_clock_freq;	/* from KFD */
+
+	uint32_t gpu_id;		/* to KFD */
 };

 #define NUM_OF_SUPPORTED_GPUS 7

 struct kfd_process_device_apertures {
-	uint64_t lds_base;	/* from KFD */
-	uint64_t lds_limit;	/* from KFD */
-	uint64_t scratch_base;	/* from KFD */
-	uint64_t scratch_limit;	/* from KFD */
-	uint64_t gpuvm_base;	/* from KFD */
-	uint64_t gpuvm_limit;	/* from KFD */
-	uint32_t gpu_id;	/* from KFD */
+	uint64_t lds_base;		/* from KFD */
+	uint64_t lds_limit;		/* from KFD */
+	uint64_t scratch_base;		/* from KFD */
+	uint64_t scratch_limit;		/* from KFD */
+	uint64_t gpuvm_base;		/* from KFD */
+	uint64_t gpuvm_limit;		/* from KFD */
+	uint32_t gpu_id;		/* from KFD */
 };

 struct kfd_ioctl_get_process_apertures_args {
@@ -104,24 +107,24 @@ struct kfd_ioctl_get_process_apertures_args {
 };

 struct kfd_ioctl_pmc_acquire_access_args {
-	uint32_t gpu_id;	/* to KFD */
-	uint64_t trace_id;	/* to KFD */
+	uint64_t trace_id;		/* to KFD */
+	uint32_t gpu_id;		/* to KFD */
 };

 struct kfd_ioctl_pmc_release_access_args {
-	uint32_t gpu_id;	/* to KFD */
-	uint64_t trace_id;	/* to KFD */
+	uint64_t trace_id;		/* to KFD */
+	uint32_t gpu_id;		/* to KFD */
 };

 #define KFD_IOC_MAGIC 'K'

-#define KFD_IOC_GET_VERSION	_IOR(KFD_IOC_MAGIC, 1, struct kfd_ioctl_get_version_args)
-#define KFD_IOC_CREATE_QUEUE _IOWR(KFD_IOC_MAGIC, 2, struct kfd_ioctl_create_queue_args)
-#define KFD_IOC_DESTROY_QUEUE _IOWR(KFD_IOC_MAGIC, 3, struct kfd_ioctl_destroy_queue_args)
+#define KFD_IOC_GET_VERSION		_IOR(KFD_IOC_MAGIC, 1, struct kfd_ioctl_get_version_args)
+#define KFD_IOC_CREATE_QUEUE		_IOWR(KFD_IOC_MAGIC, 2, struct kfd_ioctl_create_queue_args)
+#define KFD_IOC_DESTROY_QUEUE		_IOWR(KFD_IOC_MAGIC, 3, struct kfd_ioctl_destroy_queue_args)
 #define KFD_IOC_SET_MEMORY_POLICY	_IOW(KFD_IOC_MAGIC, 4, struct kfd_ioctl_set_memory_policy_args)
 #define KFD_IOC_GET_CLOCK_COUNTERS	_IOWR(KFD_IOC_MAGIC, 5, struct kfd_ioctl_get_clock_counters_args)
-#define KFD_IOC_GET_PROCESS_APERTURES _IOR(KFD_IOC_MAGIC
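The patch does not state why the 64-bit fields are moved to the front. One plausible reading, offered here as an assumption rather than as the author's rationale, is that it removes ABI-dependent padding between members: a uint64_t that follows a single uint32_t sits at offset 8 on ABIs that align 64-bit integers to 8 bytes, but at offset 4 on ABIs that align them to 4 bytes (32-bit x86, for example), so a 32-bit process and a 64-bit kernel could disagree on the layout. The sketch below exposes the offsets for the PMC argument layout before and after the reordering; the demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Old ordering: the 64-bit field follows a lone 32-bit field. */
struct demo_pmc_args_old {
	uint32_t gpu_id;
	uint64_t trace_id;
};

/* New ordering: 64-bit fields first, then 32-bit fields. */
struct demo_pmc_args_new {
	uint64_t trace_id;
	uint32_t gpu_id;
};

/* Offset of trace_id in the old layout: depends on uint64_t alignment. */
static size_t demo_old_trace_id_offset(void)
{
	return offsetof(struct demo_pmc_args_old, trace_id);
}

/* Offset of gpu_id in the new layout: always directly after trace_id. */
static size_t demo_new_gpu_id_offset(void)
{
	return offsetof(struct demo_pmc_args_new, gpu_id);
}
```

Note that the reordering only fixes the member offsets. The total structure size can still differ between ABIs because of tail padding, so sizeof is deliberately not compared here.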
[PATCH 61/83] hsa/radeon: change another pr_info to pr_debug
Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_topology.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index 213ae7b..059b7db 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -1121,7 +1121,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)

 	gpu_id = kfd_generate_gpu_id(gpu);

-	pr_info("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
+	pr_debug("kfd: Adding new GPU (ID: 0x%x) to topology\n", gpu_id);

 	down_write(&topology_lock);
 	/*
-- 
1.9.1
[PATCH 55/83] hsa/radeon: Add IOCTL for update queue
From: Ben Goz ben@amd.com This patch adds a new IOCTL that enables the user to perform update to an HSA queue. Signed-off-by: Ben Goz ben@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/cik_mqds.h | 1 - drivers/gpu/hsa/radeon/kfd_chardev.c | 29 ++ drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 1 - drivers/gpu/hsa/radeon/kfd_device_queue_manager.h | 1 - drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c | 1 - drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h | 1 - drivers/gpu/hsa/radeon/kfd_kernel_queue.c | 1 - drivers/gpu/hsa/radeon/kfd_kernel_queue.h | 1 - drivers/gpu/hsa/radeon/kfd_mqd_manager.c | 1 - drivers/gpu/hsa/radeon/kfd_mqd_manager.h | 1 - drivers/gpu/hsa/radeon/kfd_packet_manager.c| 23 ++--- drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 1 - drivers/gpu/hsa/radeon/kfd_queue.c | 1 - include/uapi/linux/kfd_ioctl.h | 9 +++ 14 files changed, 58 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/hsa/radeon/cik_mqds.h b/drivers/gpu/hsa/radeon/cik_mqds.h index 58945c8..35a35b4 100644 --- a/drivers/gpu/hsa/radeon/cik_mqds.h +++ b/drivers/gpu/hsa/radeon/cik_mqds.h @@ -19,7 +19,6 @@ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. 
* - * Author: Ben Goz */ #ifndef CIK_MQDS_H_ diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c index bb2ef02..9a77332 100644 --- a/drivers/gpu/hsa/radeon/kfd_chardev.c +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c @@ -230,6 +230,31 @@ kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p, void __user *a return retval; } +static int +kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p, void __user *arg) +{ + int retval; + struct kfd_ioctl_update_queue_args args; + struct queue_properties properties; + + if (copy_from_user(args, arg, sizeof(args))) + return -EFAULT; + + properties.queue_address = args.ring_base_address; + properties.queue_size = args.ring_size; + properties.queue_percent = args.queue_percentage; + properties.priority = args.queue_priority; + + pr_debug(kfd: updating queue id %d for PASID %d\n, args.queue_id, p-pasid); + + mutex_lock(p-mutex); + + retval = pqm_update_queue(p-pqm, args.queue_id, properties); + + mutex_unlock(p-mutex); + + return retval; +} static long kfd_ioctl_set_memory_policy(struct file *filep, struct kfd_process *p, void __user *arg) @@ -398,6 +423,10 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) err = kfd_ioctl_get_process_apertures(filep, process, (void __user *)arg); break; + case KFD_IOC_UPDATE_QUEUE: + err = kfd_ioctl_update_queue(filep, process, (void __user *)arg); + break; + default: dev_err(kfd_device, unknown ioctl cmd 0x%x, arg 0x%lx)\n, diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c index 9e21074..c2d91c9 100644 --- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c +++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c @@ -19,7 +19,6 @@ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. 
* - * Author: Ben Goz */ #include linux/slab.h diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h index 0529a96..fe9ef10 100644 --- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h +++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h @@ -19,7 +19,6 @@ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. * - * Author: Ben Goz */ #ifndef DEVICE_QUEUE_MANAGER_H_ diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c index 1372fb2..4e71f7d 100644 --- a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c +++ b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c @@ -19,7 +19,6 @@ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. * - * Author: Ben Goz */ #include linux/types.h diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h index be1d6cb..f384b7f 100644 --- a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h +++ b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h @@ -19,7 +19,6 @@ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. * - * Author: Ben Goz */ #ifndef HW_POINTER_STORE_H_ diff --git a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c b/drivers/gpu/hsa
[PATCH 58/83] hsa/radeon: Various kernel styling fixes
Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/kfd_device_queue_manager.h | 6 +++--- drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h | 6 +++--- drivers/gpu/hsa/radeon/kfd_kernel_queue.h | 6 +++--- drivers/gpu/hsa/radeon/kfd_module.c | 8 drivers/gpu/hsa/radeon/kfd_mqd_manager.h | 6 +++--- drivers/gpu/hsa/radeon/kfd_pm4_headers.h | 11 ++- drivers/gpu/hsa/radeon/kfd_pm4_opcodes.h | 6 +++--- 7 files changed, 25 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h index fe9ef10..57dc636 100644 --- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h +++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h @@ -21,8 +21,8 @@ * */ -#ifndef DEVICE_QUEUE_MANAGER_H_ -#define DEVICE_QUEUE_MANAGER_H_ +#ifndef KFD_DEVICE_QUEUE_MANAGER_H_ +#define KFD_DEVICE_QUEUE_MANAGER_H_ #include linux/rwsem.h #include linux/list.h @@ -98,4 +98,4 @@ struct device_queue_manager { -#endif /* DEVICE_QUEUE_MANAGER_H_ */ +#endif /* KFD_DEVICE_QUEUE_MANAGER_H_ */ diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h index f384b7f..642703f 100644 --- a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h +++ b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h @@ -21,8 +21,8 @@ * */ -#ifndef HW_POINTER_STORE_H_ -#define HW_POINTER_STORE_H_ +#ifndef KFD_HW_POINTER_STORE_H_ +#define KFD_HW_POINTER_STORE_H_ #include linux/mutex.h @@ -61,4 +61,4 @@ radeon_kfd_hw_pointer_store_mmap(struct hw_pointer_store_properties *ptr, struct vm_area_struct *vma); -#endif /* HW_POINTER_STORE_H_ */ +#endif /* KFD_HW_POINTER_STORE_H_ */ diff --git a/drivers/gpu/hsa/radeon/kfd_kernel_queue.h b/drivers/gpu/hsa/radeon/kfd_kernel_queue.h index 963e861..abfb9c8 100644 --- a/drivers/gpu/hsa/radeon/kfd_kernel_queue.h +++ b/drivers/gpu/hsa/radeon/kfd_kernel_queue.h @@ -21,8 +21,8 @@ * */ -#ifndef KERNEL_QUEUE_H_ -#define KERNEL_QUEUE_H_ +#ifndef KFD_KERNEL_QUEUE_H_ +#define 
KFD_KERNEL_QUEUE_H_ #include linux/list.h #include linux/types.h @@ -63,4 +63,4 @@ struct kernel_queue { struct list_headlist; }; -#endif /* KERNEL_QUEUE_H_ */ +#endif /* KFD_KERNEL_QUEUE_H_ */ diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c index e8bb67c..85069c5 100644 --- a/drivers/gpu/hsa/radeon/kfd_module.c +++ b/drivers/gpu/hsa/radeon/kfd_module.c @@ -24,7 +24,7 @@ #include linux/sched.h #include linux/notifier.h #include linux/moduleparam.h - +#include linux/device.h #include kfd_priv.h #define DRIVER_AUTHOR Andrew Lewycky, Oded Gabbay, Evgeny Pinchuk, others. @@ -46,7 +46,7 @@ static const struct kgd2kfd_calls kgd2kfd = { int sched_policy = KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION; module_param(sched_policy, int, S_IRUSR | S_IWUSR); -MODULE_PARM_DESC(sched_policy, Kernel comline parameter define the kfd scheduling policy); +MODULE_PARM_DESC(sched_policy, Kernel cmdline parameter define the kfd scheduling policy); bool kgd2kfd_init(unsigned interface_version, const struct kfd2kgd_calls *f2g, @@ -95,7 +95,7 @@ static int __init kfd_module_init(void) if (err 0) goto err_topology; - pr_info([hsa] Initialized kfd module); + dev_info(kfd_device, Initialized module\n); return 0; err_topology: @@ -114,7 +114,7 @@ static void __exit kfd_module_exit(void) mmput_unregister_notifier(kfd_mmput_nb); radeon_kfd_chardev_exit(); radeon_kfd_pasid_exit(); - pr_info([hsa] Removed kfd module); + dev_info(kfd_device, Removed module\n); } module_init(kfd_module_init); diff --git a/drivers/gpu/hsa/radeon/kfd_mqd_manager.h b/drivers/gpu/hsa/radeon/kfd_mqd_manager.h index 8e7a5fd..314d490 100644 --- a/drivers/gpu/hsa/radeon/kfd_mqd_manager.h +++ b/drivers/gpu/hsa/radeon/kfd_mqd_manager.h @@ -21,8 +21,8 @@ * */ -#ifndef MQD_MANAGER_H_ -#define MQD_MANAGER_H_ +#ifndef KFD_MQD_MANAGER_H_ +#define KFD_MQD_MANAGER_H_ #include kfd_priv.h @@ -44,4 +44,4 @@ struct mqd_manager { }; -#endif /* MQD_MANAGER_H_ */ +#endif /* KFD_MQD_MANAGER_H_ */ diff --git 
a/drivers/gpu/hsa/radeon/kfd_pm4_headers.h b/drivers/gpu/hsa/radeon/kfd_pm4_headers.h index dae460f..3ffb3f4 100644 --- a/drivers/gpu/hsa/radeon/kfd_pm4_headers.h +++ b/drivers/gpu/hsa/radeon/kfd_pm4_headers.h @@ -21,8 +21,8 @@ * */ -#ifndef F32_MES_PM4_PACKETS_72_H -#define F32_MES_PM4_PACKETS_72_H +#ifndef KFD_PM4_HEADERS_H_ +#define KFD_PM4_HEADERS_H_ #ifndef PM4_HEADER_DEFINED #define PM4_HEADER_DEFINED @@ -657,7 +657,7 @@ typedef struct _PM4__SET_SH_REG { #ifndef _PM4__SET_CONFIG_REG_DEFINED #define _PM4__SET_CONFIG_REG_DEFINED -typedef struct _PM4__SET_CONFIG_REG { +struct pm4__set_config_reg { union { PM4_TYPE_3_HEADER header
[PATCH 53/83] hsa/radeon: Add device queue manager module
From: Ben Goz ben@amd.com

The queue scheduler is divided into two sections: one is process-bound and the other is device-bound. The device-bound section is handled by this module.

The DQM module handles queue setup, update and tear-down from the device side. It also supports suspend/resume operations.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile                   |    2 +-
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 1006 +
 drivers/gpu/hsa/radeon/kfd_priv.h                 |    2 +
 3 files changed, 1009 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 341fa67..3409203 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -8,6 +8,6 @@ radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \
 		kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
 		kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \
 		kfd_kernel_queue.o kfd_packet_manager.o \
-		kfd_process_queue_manager.o
+		kfd_process_queue_manager.o kfd_device_queue_manager.o

 obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
new file mode 100644
index 000..9e21074
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -0,0 +1,1006 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Ben Goz
+ */
+
+#include <linux/slab.h>
+#include <linux/list.h>
+#include <linux/types.h>
+#include <linux/printk.h>
+#include <linux/bitops.h>
+#include "kfd_priv.h"
+#include "kfd_device_queue_manager.h"
+#include "kfd_mqd_manager.h"
+#include "cik_regs.h"
+#include "kfd_kernel_queue.h"
+
+#define CIK_HPD_SIZE_LOG2 11
+#define CIK_HPD_SIZE (1U << CIK_HPD_SIZE_LOG2)
+
+static bool is_mem_initialized;
+
+static int init_memory(struct device_queue_manager *dqm);
+static int
+set_pasid_vmid_mapping(struct device_queue_manager *dqm, unsigned int pasid, unsigned int vmid);
+
+static inline unsigned int get_pipes_num(struct device_queue_manager *dqm)
+{
+	BUG_ON(!dqm || !dqm->dev);
+	return dqm->dev->shared_resources.compute_pipe_count;
+}
+
+static inline unsigned int get_first_pipe(struct device_queue_manager *dqm)
+{
+	BUG_ON(!dqm);
+	return dqm->dev->shared_resources.first_compute_pipe;
+}
+
+static inline unsigned int get_pipes_num_cpsch(void)
+{
+	return PIPE_PER_ME_CP_SCHEDULING - 1;
+}
+
+static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble);
+static void init_process_memory(struct device_queue_manager *dqm, struct qcm_process_device *qpd)
+{
+	BUG_ON(!dqm || !qpd);
+
+	qpd->sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
+	qpd->sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
+	qpd->sh_mem_bases = compute_sh_mem_bases_64bit(6);
+	qpd->sh_mem_ape1_limit = 0;
+	qpd->sh_mem_ape1_base = 1;
+}
+
+static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm_process_device *qpd)
+{
+	struct mqd_manager *mqd;
+
+	mqd = dqm->get_mqd_manager(dqm, KFD_MQD_TYPE_CIK_COMPUTE);
+	if (mqd == NULL)
+		return;
+
+	mqd->acquire_hqd(mqd, 0, 0, qpd->vmid);
+
+	WRITE_REG(dqm->dev, SH_MEM_CONFIG, qpd->sh_mem_config);
+
+	WRITE_REG(dqm->dev, SH_MEM_APE1_BASE, qpd->sh_mem_ape1_base);
+	WRITE_REG(dqm->dev, SH_MEM_APE1_LIMIT, qpd->sh_mem_ape1_limit);
+
+	mqd->release_hqd(mqd);
+}
+
+static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *q,
+	struct qcm_process_device *qpd, int *allocate_vmid)
+{
+	bool set
[PATCH 56/83] hsa/radeon: Queue Management integration with Memory Management
From: Ben Goz ben@amd.com

This patch adds support for the LDS aperture for user processes.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 41 +--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index c2d91c9..01573b1 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -58,16 +58,50 @@ static inline unsigned int get_pipes_num_cpsch(void)
 	return PIPE_PER_ME_CP_SCHEDULING - 1;
 }

+static unsigned int get_sh_mem_bases_nybble_64(struct kfd_process *process, struct kfd_dev *dev)
+{
+	struct kfd_process_device *pdd;
+	uint32_t nybble;
+
+	pdd = radeon_kfd_get_process_device_data(dev, process);
+	nybble = (pdd->lds_base >> 60) & 0x0E;
+
+	return nybble;
+
+}
+
+static unsigned int get_sh_mem_bases_32(struct kfd_process *process, struct kfd_dev *dev)
+{
+	struct kfd_process_device *pdd;
+	unsigned int shared_base;
+
+	pdd = radeon_kfd_get_process_device_data(dev, process);
+	shared_base = (pdd->lds_base >> 16) & 0xFF;
+
+	return shared_base;
+}
+
 static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble);
 static void init_process_memory(struct device_queue_manager *dqm, struct qcm_process_device *qpd)
 {
+	unsigned int temp;
 	BUG_ON(!dqm || !qpd);

+	if (qpd->pqm->process->is_32bit_user_mode) {
+		temp = get_sh_mem_bases_32(qpd->pqm->process, dqm->dev);
+		qpd->sh_mem_bases = SHARED_BASE(temp);
+	} else {
+		temp = get_sh_mem_bases_nybble_64(qpd->pqm->process, dqm->dev);
+		qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
+	}
+
 	qpd->sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
 	qpd->sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
-	qpd->sh_mem_bases = compute_sh_mem_bases_64bit(6);
 	qpd->sh_mem_ape1_limit = 0;
 	qpd->sh_mem_ape1_base = 1;
+
+	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
+		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
 }

 static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm_process_device *qpd)
@@ -84,6 +118,7 @@ static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm
 	WRITE_REG(dqm->dev, SH_MEM_APE1_BASE, qpd->sh_mem_ape1_base);
 	WRITE_REG(dqm->dev, SH_MEM_APE1_LIMIT, qpd->sh_mem_ape1_limit);
+	WRITE_REG(dqm->dev, SH_MEM_BASES, qpd->sh_mem_bases);

 	mqd->release_hqd(mqd);
 }
@@ -128,6 +163,8 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *
 		set_pasid_vmid_mapping(dqm, q->process->pasid, q->properties.vmid);
 		qpd->vmid = *allocate_vmid;
 		is_new_vmid = true;
+
+		program_sh_mem_settings(dqm, qpd);
 	}
 	q->properties.vmid = qpd->vmid;

@@ -418,7 +455,7 @@ static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
 	 * We don't bother to support different top nybbles for LDS/Scratch and GPUVM.
 	 */

-	BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE);
+	BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE || top_address_nybble == 0);

 	return PRIVATE_BASE(top_address_nybble << 12) | SHARED_BASE(top_address_nybble << 12);
 }
--
1.9.1
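The address arithmetic in this patch can be tried out in user space. The following is only a sketch under stated assumptions, not the driver code: it reads the shifts and masks as `>> 60`, `& 0x0E`, `>> 16`, `& 0xFF` and `<< 12` (the archive dropped the operators), it assumes PRIVATE_BASE occupies the low 16 bits and SHARED_BASE the high 16 bits of the register value, it uses assert() in place of BUG_ON(), and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions: private base in bits 0..15, shared base in bits 16..31. */
#define PRIVATE_BASE(x) ((uint32_t)(x) << 0)
#define SHARED_BASE(x)  ((uint32_t)(x) << 16)

/* Hypothetical helper: top nybble of a 64-bit LDS base, forced even by the 0x0E mask. */
static uint32_t lds_top_nybble_64(uint64_t lds_base)
{
	return (uint32_t)((lds_base >> 60) & 0x0E);
}

/* Hypothetical helper: bits 16..23 of the LDS base, used for 32-bit processes. */
static uint32_t shared_base_32(uint64_t lds_base)
{
	return (uint32_t)((lds_base >> 16) & 0xFF);
}

/* Mirrors the checks shown in the patch: the nybble must be even, non-zero and at most 0xE. */
static uint32_t sh_mem_bases_64bit(uint32_t top_address_nybble)
{
	assert(!(top_address_nybble & 1));
	assert(top_address_nybble <= 0xE);
	assert(top_address_nybble != 0);

	return PRIVATE_BASE(top_address_nybble << 12) |
	       SHARED_BASE(top_address_nybble << 12);
}
```

Under these assumptions, an LDS base whose top nybble is 6 yields the same register value as the hard-coded compute_sh_mem_bases_64bit(6) call that the patch removes, while a 32-bit process only contributes a SHARED_BASE field.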
[PATCH 54/83] hsa/radeon: Switch to new queue scheduler
From: Ben Goz ben@amd.com This patch makes the switch between the old KFD queue scheduler to the new KFD queue scheduler. The new scheduler supports H/W CP scheduling, over-subscription of queues and pre-emption of queues. Signed-off-by: Ben Goz ben@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/kfd_aperture.c | 1 - drivers/gpu/hsa/radeon/kfd_chardev.c | 107 +++-- drivers/gpu/hsa/radeon/kfd_device.c| 31 ++ drivers/gpu/hsa/radeon/kfd_interrupt.c | 4 +- drivers/gpu/hsa/radeon/kfd_priv.h | 2 + drivers/gpu/hsa/radeon/kfd_process.c | 56 - include/uapi/linux/kfd_ioctl.h | 4 +- 7 files changed, 88 insertions(+), 117 deletions(-) diff --git a/drivers/gpu/hsa/radeon/kfd_aperture.c b/drivers/gpu/hsa/radeon/kfd_aperture.c index 9e2d6da..2c72b21 100644 --- a/drivers/gpu/hsa/radeon/kfd_aperture.c +++ b/drivers/gpu/hsa/radeon/kfd_aperture.c @@ -32,7 +32,6 @@ #include uapi/linux/kfd_ioctl.h #include linux/time.h #include kfd_priv.h -#include kfd_scheduler.h #include linux/mm.h #include uapi/asm-generic/mman-common.h #include asm/processor.h diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c index 07cac88..bb2ef02 100644 --- a/drivers/gpu/hsa/radeon/kfd_chardev.c +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c @@ -31,10 +31,11 @@ #include uapi/linux/kfd_ioctl.h #include linux/time.h #include kfd_priv.h -#include kfd_scheduler.h #include linux/mm.h #include uapi/asm-generic/mman-common.h #include asm/processor.h +#include kfd_hw_pointer_store.h +#include kfd_device_queue_manager.h static long kfd_ioctl(struct file *, unsigned int, unsigned long); static int kfd_open(struct inode *, struct file *); @@ -128,24 +129,36 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a struct kfd_dev *dev; int err = 0; unsigned int queue_id; - struct kfd_queue *queue; struct kfd_process_device *pdd; + struct queue_properties q_properties; + + memset(q_properties, 0, sizeof(struct queue_properties)); if 
(copy_from_user(args, arg, sizeof(args))) return -EFAULT; - dev = radeon_kfd_device_by_id(args.gpu_id); - if (dev == NULL) - return -EINVAL; + /* need to validate parameters */ + + q_properties.is_interop = false; + q_properties.queue_percent = args.queue_percentage; + q_properties.priority = args.queue_priority; + q_properties.queue_address = args.ring_base_address; + q_properties.queue_size = args.ring_size; - queue = kzalloc( - offsetof(struct kfd_queue, scheduler_queue) + dev-device_info-scheduler_class-queue_size, - GFP_KERNEL); - if (!queue) - return -ENOMEM; + pr_debug(%s Arguments: Queue Percentage (%d, %d)\n + Queue Priority (%d, %d)\n + Queue Address (0x%llX, 0x%llX)\n + Queue Size (%u64, %ll)\n, + __func__, + q_properties.queue_percent, args.queue_percentage, + q_properties.priority, args.queue_priority, + q_properties.queue_address, args.ring_base_address, + q_properties.queue_size, args.ring_size); - queue-dev = dev; + dev = radeon_kfd_device_by_id(args.gpu_id); + if (dev == NULL) + return -EINVAL; mutex_lock(p-mutex); @@ -159,23 +172,14 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a p-pasid, dev-id); - if (!radeon_kfd_allocate_queue_id(p, queue_id)) - goto err_allocate_queue_id; - - err = dev-device_info-scheduler_class-create_queue(dev-scheduler, pdd-scheduler_process, - queue-scheduler_queue, - (void __user *)args.ring_base_address, - args.ring_size, - (void __user *)args.read_pointer_address, - (void __user *)args.write_pointer_address, - radeon_kfd_queue_id_to_doorbell(dev, p, queue_id)); - if (err) + err = pqm_create_queue(p-pqm, dev, filep, q_properties, 0, KFD_QUEUE_TYPE_COMPUTE, queue_id); + if (err != 0) goto err_create_queue; - radeon_kfd_install_queue(p, queue_id, queue); - args.queue_id = queue_id; - args.doorbell_address = (uint64_t)(uintptr_t)radeon_kfd_get_doorbell(filep, p, dev, queue_id); + args.read_pointer_address = (uint64_t
[PATCH 49/83] hsa/radeon: Add kernel queue support for KFD
From: Ben Goz ben@amd.com The kernel queue module enables the KFD to establish kernel queues, not exposed to user space. The kernel queues are used for HIQ (HSA Interface Queue) and DIQ (Debug Interface Queue) operations. Signed-off-by: Ben Goz ben@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile | 3 +- drivers/gpu/hsa/radeon/kfd_device_queue_manager.h | 102 drivers/gpu/hsa/radeon/kfd_kernel_queue.c | 302 ++ drivers/gpu/hsa/radeon/kfd_kernel_queue.h | 67 +++ drivers/gpu/hsa/radeon/kfd_pm4_headers.h | 681 ++ drivers/gpu/hsa/radeon/kfd_pm4_opcodes.h | 107 drivers/gpu/hsa/radeon/kfd_priv.h | 34 ++ drivers/gpu/hsa/radeon/kfd_scheduler.h| 5 - 8 files changed, 1295 insertions(+), 6 deletions(-) create mode 100644 drivers/gpu/hsa/radeon/kfd_device_queue_manager.h create mode 100644 drivers/gpu/hsa/radeon/kfd_kernel_queue.c create mode 100644 drivers/gpu/hsa/radeon/kfd_kernel_queue.h create mode 100644 drivers/gpu/hsa/radeon/kfd_pm4_headers.h create mode 100644 drivers/gpu/hsa/radeon/kfd_pm4_opcodes.h diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index c87b518..f06d925 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -6,6 +6,7 @@ radeon_kfd-y:= kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \ kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \ - kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o + kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \ + kfd_kernel_queue.o obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h new file mode 100644 index 000..0529a96 --- /dev/null +++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h @@ -0,0 +1,102 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ *
+ * Author: Ben Goz
+ */
+
+#ifndef DEVICE_QUEUE_MANAGER_H_
+#define DEVICE_QUEUE_MANAGER_H_
+
+#include <linux/rwsem.h>
+#include <linux/list.h>
+#include "kfd_priv.h"
+#include "kfd_mqd_manager.h"
+
+#define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS	(500)
+#define QUEUES_PER_PIPE				(8)
+#define PIPE_PER_ME_CP_SCHEDULING		(4)
+#define CIK_VMID_NUM				(8)
+#define KFD_VMID_START_OFFSET			(8)
+#define VMID_PER_DEVICE				CIK_VMID_NUM
+#define KFD_DQM_FIRST_PIPE			(0)
+
+struct device_process_node {
+	struct qcm_process_device *qpd;
+	struct list_head list;
+};
+
+struct device_queue_manager {
+	int (*create_queue)(struct device_queue_manager *dqm,
+				struct queue *q,
+				struct qcm_process_device *qpd,
+				int *allocate_vmid);
+	int (*destroy_queue)(struct device_queue_manager *dqm,
+				struct qcm_process_device *qpd,
+				struct queue *q);
+	int (*update_queue)(struct device_queue_manager *dqm,
+				struct queue *q);
+	int (*destroy_queues)(struct device_queue_manager *dqm);
+	struct mqd_manager * (*get_mqd_manager)(struct device_queue_manager *dqm,
+				enum KFD_MQD_TYPE type);
+	int (*execute_queues)(struct device_queue_manager *dqm);
+	int (*register_process)(struct device_queue_manager *dqm,
+				struct qcm_process_device *qpd);
+	int
[PATCH 50/83] hsa/radeon: Add module parameter of scheduling policy
From: Ben Goz ben@amd.com

This patch adds a new parameter to the KFD module. This parameter enables the user to select the scheduling policy of the CP. The choices are:

 * CP Scheduling with support for over-subscription
 * CP Scheduling without support for over-subscription
 * Without CP Scheduling

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_module.c |  5 +++
 drivers/gpu/hsa/radeon/kfd_priv.h   | 65 +
 2 files changed, 70 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index a03743a..e8bb67c 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -23,6 +23,7 @@
 #include <linux/module.h>
 #include <linux/sched.h>
 #include <linux/notifier.h>
+#include <linux/moduleparam.h>

 #include "kfd_priv.h"

@@ -43,6 +44,10 @@ static const struct kgd2kfd_calls kgd2kfd = {
 	.resume = kgd2kfd_resume,
 };

+int sched_policy = KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION;
+module_param(sched_policy, int, S_IRUSR | S_IWUSR);
+MODULE_PARM_DESC(sched_policy, "Kernel comline parameter define the kfd scheduling policy");
+
 bool kgd2kfd_init(unsigned interface_version,
 		const struct kfd2kgd_calls *f2g,
 		const struct kgd2kfd_calls **g2f)
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index 3a5cecf..b3889aa 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -70,6 +70,15 @@ struct kfd_scheduler_class;
 /* Macro for allocating structures */
 #define kfd_alloc_struct(ptr_to_struct) ((typeof(ptr_to_struct)) kzalloc(sizeof(*ptr_to_struct), GFP_KERNEL))

+/* Kernel module parameter to specify the scheduling policy */
+extern int sched_policy;
+
+enum kfd_sched_policy {
+	KFD_SCHED_POLICY_HWS = 0,
+	KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION,
+	KFD_SCHED_POLICY_NO_HWS
+};
+
 /* Large enough to hold the maximum usable pasid + 1.
 ** It must also be able to store the number of doorbells reported by a KFD device. */
 typedef unsigned int pasid_t;
@@ -243,6 +252,51 @@ enum KFD_MQD_TYPE {
 	KFD_MQD_TYPE_MAX
 };

+struct scheduling_resources {
+	unsigned int vmid_mask;
+	enum kfd_queue_type type;
+	uint64_t queue_mask;
+	uint64_t gws_mask;
+	uint32_t oac_mask;
+	uint32_t gds_heap_base;
+	uint32_t gds_heap_size;
+};
+
+struct process_queue_manager {
+	/* data */
+	struct kfd_process *process;
+	unsigned int num_concurrent_processes;
+	struct list_head queues;
+	unsigned long *queue_slot_bitmap;
+};
+
+struct qcm_process_device {
+	/* The Device Queue Manager that owns this data */
+	struct device_queue_manager *dqm;
+	struct process_queue_manager *pqm;
+	/* Device Queue Manager lock */
+	struct mutex *lock;
+	/* Queues list */
+	struct list_head queues_list;
+	struct list_head priv_queue_list;
+
+	unsigned int queue_count;
+	unsigned int vmid;
+	bool is_debug;
+	/*
+	 * All the memory management data should be here too
+	 */
+	uint64_t gds_context_area;
+	uint32_t sh_mem_config;
+	uint32_t sh_mem_bases;
+	uint32_t sh_mem_ape1_base;
+	uint32_t sh_mem_ape1_limit;
+	uint32_t page_table_base;
+	uint32_t gds_size;
+	uint32_t num_gws;
+	uint32_t num_oac;
+};
+
 /* Data that is per-process-per device. */
 struct kfd_process_device {
 	/* List of all per-device data for a process. Starts from kfd_process.per_device_data. */
@@ -374,6 +428,8 @@ void print_queue_properties(struct queue_properties *q);
 void print_queue(struct queue *q);

 struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type, struct kfd_dev *dev);
+struct kernel_queue *kernel_queue_init(struct kfd_dev *dev, enum kfd_queue_type type);
+void kernel_queue_uninit(struct kernel_queue *kq);

 /* Packet Manager */

@@ -391,4 +447,13 @@ struct packet_manager {
 	kfd_mem_obj ib_buffer_obj;
 };

+int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm);
+void pm_uninit(struct packet_manager *pm);
+int pm_send_set_resources(struct packet_manager *pm, struct scheduling_resources *res);
+int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues);
+int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address, uint32_t fence_value);
+int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
+			enum kfd_preempt_type_filter mode, uint32_t filter_param, bool reset);
+void pm_release_ib(struct packet_manager *pm);
+
 #endif
--
1.9.1
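To make the mapping between the integer parameter and the three policies concrete, here is a small user-space sketch. The enum values are copied from the patch; the two helper functions are hypothetical and are not part of the posted series, which (as far as this excerpt shows) does not range-check the parameter.

```c
#include <stddef.h>

/* Copied from the patch: the integer value of sched_policy selects one of these. */
enum kfd_sched_policy {
	KFD_SCHED_POLICY_HWS = 0,
	KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION,
	KFD_SCHED_POLICY_NO_HWS
};

/* Default taken from the patch: CP scheduling without over-subscription. */
static int sched_policy = KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION;

/* Hypothetical helper: reject values outside the enum range. */
static int sched_policy_is_valid(int policy)
{
	return policy >= KFD_SCHED_POLICY_HWS &&
	       policy <= KFD_SCHED_POLICY_NO_HWS;
}

/* Hypothetical helper: human-readable description, NULL for unknown values. */
static const char *sched_policy_describe(int policy)
{
	switch (policy) {
	case KFD_SCHED_POLICY_HWS:
		return "CP scheduling with over-subscription";
	case KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION:
		return "CP scheduling without over-subscription";
	case KFD_SCHED_POLICY_NO_HWS:
		return "no CP scheduling";
	default:
		return NULL;
	}
}
```

Since the module object is named radeon_kfd in the Makefile, the parameter would presumably be given at load time as sched_policy=<0|1|2>. The S_IRUSR | S_IWUSR permissions additionally make a module parameter readable and writable by root through sysfs.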
[PATCH 46/83] hsa/radeon: Add queue and hw_pointer_store modules
From: Ben Goz ben@amd.com The queue module enables allocating and initializing queues uniformly. The hw_pointer_store module handles allocation and assignment of read and write pointers to user HSA queues. Signed-off-by: Ben Goz ben@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile | 3 +- drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c | 150 ++ drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h | 65 +++ drivers/gpu/hsa/radeon/kfd_priv.h | 55 ++ drivers/gpu/hsa/radeon/kfd_queue.c| 110 +++ 5 files changed, 382 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c create mode 100644 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h create mode 100644 drivers/gpu/hsa/radeon/kfd_queue.c diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index 813b31f..18e1639 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -5,6 +5,7 @@ radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \ - kfd_vidmem.o kfd_interrupt.o kfd_aperture.o + kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \ + kfd_queue.o kfd_hw_pointer_store.o obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c new file mode 100644 index 000..1372fb2 --- /dev/null +++ b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c @@ -0,0 +1,150 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ *
+ * Author: Ben Goz
+ */
+
+#include <linux/types.h>
+#include <linux/version.h>
+#include <linux/kernel.h>
+#include <linux/mutex.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/slab.h>
+#include <linux/io.h>
+#include "kfd_hw_pointer_store.h"
+#include "kfd_priv.h"
+
+/* do the same trick as in map_doorbells() */
+static int hw_pointer_store_map(struct hw_pointer_store_properties *ptr,
+		struct file *devkfd)
+{
+	qptr_t __user *user_address;
+
+	BUG_ON(!ptr || !devkfd);
+
+	if (!ptr->page_mapping) {
+		if (!ptr->page_address)
+			return -EINVAL;
+
+		user_address = (qptr_t __user *)vm_mmap(devkfd, 0, PAGE_SIZE,
+			PROT_WRITE | PROT_READ , MAP_SHARED, ptr->offset);
+
+		if (IS_ERR(user_address))
+			return PTR_ERR(user_address);
+
+		ptr->page_mapping = user_address;
+	}
+
+	return 0;
+}
+
+int hw_pointer_store_init(struct hw_pointer_store_properties *ptr,
+		enum hw_pointer_store_type type)
+{
+	unsigned long *addr;
+
+	BUG_ON(!ptr);
+
+	/* using the offset value as a hint for mmap to distinguish between page types */
+	if (type == KFD_HW_POINTER_STORE_TYPE_RPTR)
+		ptr->offset = KFD_MMAP_RPTR_START << PAGE_SHIFT;
+	else if (type == KFD_HW_POINTER_STORE_TYPE_WPTR)
+		ptr->offset = KFD_MMAP_WPTR_START << PAGE_SHIFT;
+	else
+		return -EINVAL;
+
+	addr = (unsigned long *)get_zeroed_page(GFP_KERNEL);
+	if (!addr) {
+		pr_debug("Error allocating page\n");
+		return -ENOMEM;
+	}
+
+	ptr->page_address = addr;
+	ptr->page_mapping = NULL;
+
+	return 0;
+}
+
+void hw_pointer_store_destroy(struct hw_pointer_store_properties *ptr)
+{
+	BUG_ON(!ptr);
+	pr_debug("kfd in func: %s\n", __func__);
+	if (ptr->page_address)
+		free_page((unsigned long)ptr->page_address);
+	if (ptr->page_mapping)
+		vm_munmap((uintptr_t)ptr->page_mapping, PAGE_SIZE);
+	ptr
[PATCH 48/83] hsa/radeon: Add mqd_manager module
From: Ben Goz ben@amd.com The mqd_manager module handles MQD data structures. MQD stands for Memory Queue Descriptor, which is used by the H/W to keep the HSA queue state in memory. Signed-off-by: Ben Goz ben@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile | 2 +- drivers/gpu/hsa/radeon/cik_mqds.h | 251 ++ drivers/gpu/hsa/radeon/cik_regs.h | 1 + drivers/gpu/hsa/radeon/kfd_mqd_manager.c | 453 ++ drivers/gpu/hsa/radeon/kfd_mqd_manager.h | 48 +++ drivers/gpu/hsa/radeon/kfd_priv.h | 26 ++ drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 10 - drivers/gpu/hsa/radeon/kfd_vidmem.c | 36 ++ 8 files changed, 816 insertions(+), 11 deletions(-) create mode 100644 drivers/gpu/hsa/radeon/cik_mqds.h create mode 100644 drivers/gpu/hsa/radeon/kfd_mqd_manager.c create mode 100644 drivers/gpu/hsa/radeon/kfd_mqd_manager.h diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index 18e1639..c87b518 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -6,6 +6,6 @@ radeon_kfd-y:= kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \ kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \ - kfd_queue.o kfd_hw_pointer_store.o + kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o diff --git a/drivers/gpu/hsa/radeon/cik_mqds.h b/drivers/gpu/hsa/radeon/cik_mqds.h new file mode 100644 index 000..58945c8 --- /dev/null +++ b/drivers/gpu/hsa/radeon/cik_mqds.h @@ -0,0 +1,251 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + * Author: Ben Goz + */ + +#ifndef CIK_MQDS_H_ +#define CIK_MQDS_H_ + +#pragma pack(push, 4) + +struct cik_hpd_registers { + u32 cp_hpd_roq_offsets; + u32 cp_hpd_eop_base_addr; + u32 cp_hpd_eop_base_addr_hi; + u32 cp_hpd_eop_vmid; + u32 cp_hpd_eop_control; +}; + +struct cik_hqd_registers { + u32 cp_mqd_base_addr; + u32 cp_mqd_base_addr_hi; + u32 cp_hqd_active; + u32 cp_hqd_vmid; + u32 cp_hqd_persistent_state; + u32 cp_hqd_pipe_priority; + u32 cp_hqd_queue_priority; + u32 cp_hqd_quantum; + u32 cp_hqd_pq_base; + u32 cp_hqd_pq_base_hi; + u32 cp_hqd_pq_rptr; + u32 cp_hqd_pq_rptr_report_addr; + u32 cp_hqd_pq_rptr_report_addr_hi; + u32 cp_hqd_pq_wptr_poll_addr; + u32 cp_hqd_pq_wptr_poll_addr_hi; + u32 cp_hqd_pq_doorbell_control; + u32 cp_hqd_pq_wptr; + u32 cp_hqd_pq_control; + u32 cp_hqd_ib_base_addr; + u32 cp_hqd_ib_base_addr_hi; + u32 cp_hqd_ib_rptr; + u32 cp_hqd_ib_control; + u32 cp_hqd_iq_timer; + u32 cp_hqd_iq_rptr; + u32 cp_hqd_dequeue_request; + u32 cp_hqd_dma_offload; + u32 cp_hqd_sema_cmd; + u32 cp_hqd_msg_type; + u32 cp_hqd_atomic0_preop_lo; + u32 cp_hqd_atomic0_preop_hi; + u32 cp_hqd_atomic1_preop_lo; + u32 cp_hqd_atomic1_preop_hi; + u32 cp_hqd_hq_scheduler0; + u32 cp_hqd_hq_scheduler1; + u32 cp_mqd_control; +}; + +struct cik_mqd { + u32 header; + u32 dispatch_initiator; + u32 dimensions[3]; + u32 start_idx[3]; + u32 num_threads[3]; + u32 pipeline_stat_enable; + u32 perf_counter_enable; + u32 pgm[2]; + u32 tba[2]; + u32 tma[2]; + u32 pgm_rsrc[2]; + u32 vmid; + u32 resource_limits; + u32 static_thread_mgmt01[2]; + u32 tmp_ring_size; + u32 static_thread_mgmt23[2]; + u32 restart[3]; + u32 thread_trace_enable; + u32 reserved1; + u32 user_data
[PATCH 42/83] hsa/radeon: 32-bit processes support
From: Alexey Skidanov alexey.skida...@amd.com

Initializing compat_ioctl properly. All ioctls args are packed.

Signed-off-by: Alexey Skidanov alexey.skida...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 7 +++++--
 drivers/gpu/hsa/radeon/kfd_priv.h    | 4 ++++
 include/uapi/linux/kfd_ioctl.h       | 2 +-
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 75fe11f..e95d597 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -27,6 +27,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
+#include <linux/compat.h>
 #include <uapi/linux/kfd_ioctl.h>
 #include <linux/time.h>
 #include "kfd_priv.h"
@@ -41,6 +42,7 @@ static const char kfd_dev_name[] = "kfd";
 static const struct file_operations kfd_fops = {
 	.owner = THIS_MODULE,
 	.unlocked_ioctl = kfd_ioctl,
+	.compat_ioctl = kfd_ioctl,
 	.open = kfd_open,
 	.mmap = kfd_mmap,
 };
@@ -105,8 +107,9 @@ kfd_open(struct inode *inode, struct file *filep)
 	process = radeon_kfd_create_process(current);
 	if (IS_ERR(process))
 		return PTR_ERR(process);
-
-	pr_debug("\nkfd: process %d opened dev/kfd", process->pasid);
+	process->is_32bit_user_mode = is_compat_task();
+	dev_info(kfd_device, "process %d opened, compat mode (32 bit) - %d\n",
+		 process->pasid, process->is_32bit_user_mode);
 
 	return 0;
 }
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index 8b877ca..9d3b1fc 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -194,6 +194,10 @@ struct kfd_process {
 	size_t queue_array_size;
 	struct kfd_queue **queues; /* Size is queue_array_size, up to MAX_PROCESS_QUEUES. */
 	unsigned long allocated_queue_bitmap[DIV_ROUND_UP(MAX_PROCESS_QUEUES, BITS_PER_LONG)];
+
+	/*Is the user space process 32 bit?*/
+	bool is_32bit_user_mode;
+
 };
 
 struct kfd_process *radeon_kfd_create_process(const struct task_struct *);
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 5b9517e..a7c3abd 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -29,7 +29,7 @@
 #define KFD_IOCTL_CURRENT_VERSION 1
 
 /* The 64-bit ABI is the authoritative version. */
-#pragma pack(push, 8)
+#pragma pack(push, 1)
 
 struct kfd_ioctl_get_version_args {
 	uint32_t min_supported_version;	/* from KFD */
-- 
1.9.1
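For readers wondering why packing matters when one handler serves both unlocked_ioctl and compat_ioctl, here is a minimal userspace sketch. It assumes a hypothetical two-field argument struct (example_ioctl_args is not one of the real KFD ioctl structs). Under natural alignment a uint64_t member typically sits at offset 8 on x86-64 but at offset 4 on 32-bit x86, so the two ABIs would disagree on the layout; with pack(1) the layout is the same on both.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument block, for illustration only. */
#pragma pack(push, 1)
struct example_ioctl_args {
	uint32_t flags;     /* offset 0 on every ABI */
	uint64_t buffer_va; /* offset 4 on every ABI, because of pack(1) */
};
#pragma pack(pop)

/* Compile-time checks: the packed layout has no padding at all. */
_Static_assert(sizeof(struct example_ioctl_args) == 12,
	       "packed struct must be 4 + 8 bytes");
_Static_assert(offsetof(struct example_ioctl_args, buffer_va) == 4,
	       "64-bit member must directly follow the 32-bit member");

/* Size the kernel would copy from user space for this request. */
size_t example_args_size(void)
{
	return sizeof(struct example_ioctl_args);
}

/* Offset at which both 32-bit and 64-bit callers place buffer_va. */
size_t example_buffer_va_offset(void)
{
	return offsetof(struct example_ioctl_args, buffer_va);
}
```

With a fixed layout like this, a 32-bit process and a 64-bit process hand the driver byte-identical argument blocks, which is what allows the same kfd_ioctl function to be registered for both entry points.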
[PATCH 43/83] hsa/radeon: NULL pointer dereference bug workaround
From: Alexey Skidanov alexey.skida...@amd.com

Signed-off-by: Alexey Skidanov alexey.skida...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 7573d25..7ee8125 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -627,8 +627,10 @@ static void cik_static_deregister_process(struct kfd_scheduler *scheduler,
 	struct cik_static_private *priv = kfd_scheduler_to_private(scheduler);
 	struct cik_static_process *pp = kfd_process_to_private(scheduler_process);
 
-	release_vmid(priv, pp->vmid);
-	kfree(pp);
+	if (priv && pp) {
+		release_vmid(priv, pp->vmid);
+		kfree(pp);
+	}
 }
 
 static bool allocate_hqd(struct cik_static_private *priv, unsigned int *queue)
-- 
1.9.1
[PATCH 44/83] hsa/radeon: HSA64/HSA32 modes support
From: Alexey Skidanov alexey.skida...@amd.com Added apertures initialization and appropriate ioctl Signed-off-by: Alexey Skidanov alexey.skida...@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile | 2 +- drivers/gpu/hsa/radeon/kfd_aperture.c | 124 ++ drivers/gpu/hsa/radeon/kfd_chardev.c | 58 +++- drivers/gpu/hsa/radeon/kfd_priv.h | 18 drivers/gpu/hsa/radeon/kfd_process.c | 17 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 3 +- drivers/gpu/hsa/radeon/kfd_topology.c | 27 ++ include/uapi/linux/kfd_ioctl.h| 18 8 files changed, 264 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/hsa/radeon/kfd_aperture.c diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index 5422e6a..813b31f 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -5,6 +5,6 @@ radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \ - kfd_vidmem.o kfd_interrupt.o + kfd_vidmem.o kfd_interrupt.o kfd_aperture.o obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o diff --git a/drivers/gpu/hsa/radeon/kfd_aperture.c b/drivers/gpu/hsa/radeon/kfd_aperture.c new file mode 100644 index 000..9e2d6da --- /dev/null +++ b/drivers/gpu/hsa/radeon/kfd_aperture.c @@ -0,0 +1,124 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + */ + +#include linux/device.h +#include linux/export.h +#include linux/err.h +#include linux/fs.h +#include linux/sched.h +#include linux/slab.h +#include linux/uaccess.h +#include linux/compat.h +#include uapi/linux/kfd_ioctl.h +#include linux/time.h +#include kfd_priv.h +#include kfd_scheduler.h +#include linux/mm.h +#include uapi/asm-generic/mman-common.h +#include asm/processor.h + + +#define MAKE_GPUVM_APP_BASE(gpu_num) (((uint64_t)(gpu_num) 61) + 0x1) +#define MAKE_GPUVM_APP_LIMIT(base) (((uint64_t)(base) 0xFF00) | 0xFF) +#define MAKE_SCRATCH_APP_BASE(gpu_num) (((uint64_t)(gpu_num) 61) + 0x1) +#define MAKE_SCRATCH_APP_LIMIT(base) (((uint64_t)base 0x) | 0x) +#define MAKE_LDS_APP_BASE(gpu_num) (((uint64_t)(gpu_num) 61) + 0x0) +#define MAKE_LDS_APP_LIMIT(base) (((uint64_t)(base) 0x) | 0x) + +#define HSA_32BIT_LDS_APP_SIZE 0x1 +#define HSA_32BIT_LDS_APP_ALIGNMENT 0x1 + +static unsigned long kfd_reserve_aperture(struct kfd_process *process, unsigned long len, unsigned long alignment) +{ + + unsigned long addr = 0; + unsigned long start_address; + + /* +* Go bottom up and find the first available aligned address. +* We may narrow space to scan by getting mmap range limits. 
+*/ + for (start_address = alignment; start_address (TASK_SIZE - alignment); start_address += alignment) { + addr = vm_mmap(NULL, start_address, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, 0); + if (!IS_ERR_VALUE(addr)) { + if (addr == start_address) + return addr; + vm_munmap(addr, len); + } + } + return 0; + +} + +int kfd_init_apertures(struct kfd_process *process) +{ + uint8_t id = 0; + struct kfd_dev *dev; + struct kfd_process_device *pdd; + + mutex_lock(process-mutex); + + /*Iterating over all devices*/ + while ((dev = kfd_topology_enum_kfd_devices(id)) != NULL id NUM_OF_SUPPORTED_GPUS) { + + pdd = radeon_kfd_get_process_device_data(dev, process); + + /*for 64 bit process aperture will be statically reserved
[PATCH 41/83] hsa/radeon: Alternating the source of max clock
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Changing the source of the max engine clock value.

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 8b6d497..a28cf6b 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -316,5 +316,5 @@ static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
 	struct radeon_device *rdev = (struct radeon_device *)kgd;
 
 	/* The sclk is in quantas of 10kHz */
-	return rdev->pm.power_state->clock_info->sclk / 100;
+	return rdev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
 }
-- 
1.9.1
[PATCH 34/83] drm/radeon: adding synchronization for GRBM GFX
From: Evgeny Pinchuk evgeny.pinc...@amd.com Implementing a lock for selecting and accessing shader engines and arrays. This lock will make sure that drm/radeon and hsa/radeon are not colliding when accessing shader engines and arrays with GRBM_GFX_INDEX register. Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/drm/radeon/cik.c | 26 ++ drivers/gpu/drm/radeon/radeon.h| 2 ++ drivers/gpu/drm/radeon/radeon_device.c | 1 + drivers/gpu/drm/radeon/radeon_kfd.c| 23 +++ include/linux/radeon_kfd.h | 4 5 files changed, 56 insertions(+) diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index 6f4999a..fc560b0 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -1566,6 +1566,8 @@ static const u32 godavari_golden_registers[] = static void cik_init_golden_registers(struct radeon_device *rdev) { + /* Some of the registers might be dependant on GRBM_GFX_INDEX */ + mutex_lock(rdev-grbm_idx_mutex); switch (rdev-family) { case CHIP_BONAIRE: radeon_program_register_sequence(rdev, @@ -1640,6 +1642,7 @@ static void cik_init_golden_registers(struct radeon_device *rdev) default: break; } + mutex_unlock(rdev-grbm_idx_mutex); } /** @@ -3421,6 +3424,7 @@ static void cik_setup_rb(struct radeon_device *rdev, u32 disabled_rbs = 0; u32 enabled_rbs = 0; + mutex_lock(rdev-grbm_idx_mutex); for (i = 0; i se_num; i++) { for (j = 0; j sh_per_se; j++) { cik_select_se_sh(rdev, i, j); @@ -3432,6 +3436,7 @@ static void cik_setup_rb(struct radeon_device *rdev, } } cik_select_se_sh(rdev, 0x, 0x); + mutex_unlock(rdev-grbm_idx_mutex); mask = 1; for (i = 0; i max_rb_num_per_se * se_num; i++) { @@ -3442,6 +3447,7 @@ static void cik_setup_rb(struct radeon_device *rdev, rdev-config.cik.backend_enable_mask = enabled_rbs; + mutex_lock(rdev-grbm_idx_mutex); for (i = 0; i se_num; i++) { cik_select_se_sh(rdev, i, 0x); data = 0; @@ -3469,6 +3475,7 @@ static void cik_setup_rb(struct radeon_device *rdev, 
WREG32(PA_SC_RASTER_CONFIG, data); } cik_select_se_sh(rdev, 0x, 0x); + mutex_unlock(rdev-grbm_idx_mutex); } /** @@ -3686,6 +3693,12 @@ static void cik_gpu_init(struct radeon_device *rdev) /* set HW defaults for 3D engine */ WREG32(CP_MEQ_THRESHOLDS, MEQ1_START(0x30) | MEQ2_START(0x60)); + mutex_lock(rdev-grbm_idx_mutex); + /* +* making sure that the following register writes will be broadcasted +* to all the shaders +*/ + cik_select_se_sh(rdev, 0x, 0x); WREG32(SX_DEBUG_1, 0x20); WREG32(TA_CNTL_AUX, 0x0001); @@ -3741,6 +3754,7 @@ static void cik_gpu_init(struct radeon_device *rdev) WREG32(PA_CL_ENHANCE, CLIP_VTX_REORDER_ENA | NUM_CLIP_SEQ(3)); WREG32(PA_SC_ENHANCE, ENABLE_PA_SC_OUT_OF_ORDER); + mutex_unlock(rdev-grbm_idx_mutex); udelay(50); } @@ -6040,6 +6054,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device *rdev) u32 i, j, k; u32 mask; + mutex_lock(rdev-grbm_idx_mutex); for (i = 0; i rdev-config.cik.max_shader_engines; i++) { for (j = 0; j rdev-config.cik.max_sh_per_se; j++) { cik_select_se_sh(rdev, i, j); @@ -6051,6 +6066,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device *rdev) } } cik_select_se_sh(rdev, 0x, 0x); + mutex_unlock(rdev-grbm_idx_mutex); mask = SE_MASTER_BUSY_MASK | GC_MASTER_BUSY | TC0_MASTER_BUSY | TC1_MASTER_BUSY; for (k = 0; k rdev-usec_timeout; k++) { @@ -6185,10 +6201,12 @@ static int cik_rlc_resume(struct radeon_device *rdev) WREG32(RLC_LB_CNTR_INIT, 0); WREG32(RLC_LB_CNTR_MAX, 0x8000); + mutex_lock(rdev-grbm_idx_mutex); cik_select_se_sh(rdev, 0x, 0x); WREG32(RLC_LB_INIT_CU_MASK, 0x); WREG32(RLC_LB_PARAMS, 0x00600408); WREG32(RLC_LB_CNTL, 0x8004); + mutex_unlock(rdev-grbm_idx_mutex); WREG32(RLC_MC_CNTL, 0); WREG32(RLC_UCODE_CNTL, 0); @@ -6255,11 +6273,13 @@ static void cik_enable_cgcg(struct radeon_device *rdev, bool enable) tmp = cik_halt_rlc(rdev); + mutex_lock(rdev-grbm_idx_mutex); cik_select_se_sh(rdev, 0x, 0x); WREG32(RLC_SERDES_WR_CU_MASTER_MASK, 0x); WREG32(RLC_SERDES_WR_NONCU_MASTER_MASK, 0x); tmp2 = 
BPM_ADDR_MASK | CGCG_OVERRIDE_0
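The locking pattern this patch applies around GRBM_GFX_INDEX can be sketched in plain userspace C. This is only a model under stated assumptions: struct mock_gpu, the boolean lock flag, the 2x2 register array and the BROADCAST value 0xFFFFFFFF are all hypothetical stand-ins (the selector arguments are garbled in this archive copy), and a real driver would use mutex_lock()/mutex_unlock() on rdev->grbm_idx_mutex rather than a flag. The point it shows is the sequence: take the lock, select an engine/array, access the indexed registers, restore broadcast, release the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BROADCAST 0xFFFFFFFFu /* assumed "all engines / all arrays" selector */
#define NUM_SE 2
#define NUM_SH 2

/* Hypothetical model of a GPU with one index register and per-array state. */
struct mock_gpu {
	bool idx_locked;                   /* stands in for grbm_idx_mutex */
	uint32_t se, sh;                   /* stands in for GRBM_GFX_INDEX */
	uint32_t per_sh_reg[NUM_SE][NUM_SH];
};

static void idx_lock(struct mock_gpu *g)
{
	assert(!g->idx_locked); /* single-threaded model: no nesting allowed */
	g->idx_locked = true;
}

static void idx_unlock(struct mock_gpu *g)
{
	assert(g->idx_locked);
	g->idx_locked = false;
}

/* Changing the selection is only legal while the index lock is held. */
static void select_se_sh(struct mock_gpu *g, uint32_t se, uint32_t sh)
{
	assert(g->idx_locked);
	g->se = se;
	g->sh = sh;
}

/* A write lands on every array that matches the current selection. */
static void write_selected(struct mock_gpu *g, uint32_t val)
{
	assert(g->idx_locked);
	for (uint32_t se = 0; se < NUM_SE; se++)
		for (uint32_t sh = 0; sh < NUM_SH; sh++)
			if ((g->se == BROADCAST || g->se == se) &&
			    (g->sh == BROADCAST || g->sh == sh))
				g->per_sh_reg[se][sh] = val;
}

/* Select, access, restore broadcast: all inside one critical section. */
void program_one_array(struct mock_gpu *g, uint32_t se, uint32_t sh,
		       uint32_t val)
{
	idx_lock(g);
	select_se_sh(g, se, sh);
	write_selected(g, val);
	select_se_sh(g, BROADCAST, BROADCAST);
	idx_unlock(g);
}
```

Restoring the broadcast selection before unlocking means the next lock holder, whether drm/radeon or hsa/radeon, never inherits a stale engine selection.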
[PATCH 38/83] hsa/radeon: Workaround for a bug in amd_iommu
This patch creates a workaround for a bug in amd_iommu driver, where the driver doesn't save all necessary information when going to suspend. The workaround removes a device from the IOMMU device list on suspend and register a resumed device in the IOMMU device list. Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/hsa/radeon/Makefile | 2 +- drivers/gpu/hsa/radeon/kfd_device.c | 30 ++ drivers/gpu/hsa/radeon/kfd_pasid.c | 5 + drivers/gpu/hsa/radeon/kfd_pm.c | 43 - drivers/gpu/hsa/radeon/kfd_priv.h | 1 + 5 files changed, 37 insertions(+), 44 deletions(-) delete mode 100644 drivers/gpu/hsa/radeon/kfd_pm.c diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile index 935f9b7..5422e6a 100644 --- a/drivers/gpu/hsa/radeon/Makefile +++ b/drivers/gpu/hsa/radeon/Makefile @@ -5,6 +5,6 @@ radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \ kfd_pasid.o kfd_topology.o kfd_process.o \ kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \ - kfd_vidmem.o kfd_interrupt.o kfd_pm.o + kfd_vidmem.o kfd_interrupt.o obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c index a21c095..2e7d50d 100644 --- a/drivers/gpu/hsa/radeon/kfd_device.c +++ b/drivers/gpu/hsa/radeon/kfd_device.c @@ -188,3 +188,33 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd) kfree(kfd); } + +void kgd2kfd_suspend(struct kfd_dev *kfd) +{ + BUG_ON(kfd == NULL); + + if (kfd-init_complete) { + kfd-device_info-scheduler_class-stop(kfd-scheduler); + amd_iommu_free_device(kfd-pdev); + } +} + +int kgd2kfd_resume(struct kfd_dev *kfd) +{ + pasid_t pasid_limit; + int err; + + BUG_ON(kfd == NULL); + + pasid_limit = radeon_kfd_get_pasid_limit(); + + if (kfd-init_complete) { + err = amd_iommu_init_device(kfd-pdev, pasid_limit); + if (err 0) + return -ENXIO; + amd_iommu_set_invalidate_ctx_cb(kfd-pdev, iommu_pasid_shutdown_callback); + kfd-device_info-scheduler_class-start(kfd-scheduler); + } + + return 0; +} diff 
--git a/drivers/gpu/hsa/radeon/kfd_pasid.c b/drivers/gpu/hsa/radeon/kfd_pasid.c index d78bd00..8bd1562 100644 --- a/drivers/gpu/hsa/radeon/kfd_pasid.c +++ b/drivers/gpu/hsa/radeon/kfd_pasid.c @@ -68,6 +68,11 @@ bool radeon_kfd_set_pasid_limit(pasid_t new_limit) return true; } +inline pasid_t radeon_kfd_get_pasid_limit(void) +{ + return pasid_limit; +} + pasid_t radeon_kfd_pasid_alloc(void) { pasid_t found; diff --git a/drivers/gpu/hsa/radeon/kfd_pm.c b/drivers/gpu/hsa/radeon/kfd_pm.c deleted file mode 100644 index 783311f..000 --- a/drivers/gpu/hsa/radeon/kfd_pm.c +++ /dev/null @@ -1,43 +0,0 @@ -/* - * Copyright 2014 Advanced Micro Devices, Inc. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the Software), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR - * OTHER DEALINGS IN THE SOFTWARE. 
- * - * Author: Oded Gabbay - */ - -#include linux/device.h -#include kfd_priv.h -#include kfd_scheduler.h - -void kgd2kfd_suspend(struct kfd_dev *kfd) -{ - BUG_ON(kfd == NULL); - - kfd-device_info-scheduler_class-stop(kfd-scheduler); -} - -int kgd2kfd_resume(struct kfd_dev *kfd) -{ - BUG_ON(kfd == NULL); - - kfd-device_info-scheduler_class-start(kfd-scheduler); - - return 0; -} diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h index bca9cce..8b877ca 100644 --- a/drivers/gpu/hsa/radeon/kfd_priv.h +++ b/drivers/gpu/hsa/radeon/kfd_priv.h @@ -213,6 +213,7 @@ struct kfd_queue *radeon_kfd_get_queue(struct kfd_process *p, unsigned int queue int radeon_kfd_pasid_init(void); void radeon_kfd_pasid_exit(void); bool radeon_kfd_set_pasid_limit(pasid_t new_limit); +pasid_t radeon_kfd_get_pasid_limit(void); pasid_t
[PATCH 40/83] hsa/radeon: Adding max clock speeds to topology
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Adding support for CPU and GPU max clock speeds in node properties.

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_topology.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index 2ee5444..21bb66e 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -26,6 +26,7 @@
 #include <linux/errno.h>
 #include <linux/acpi.h>
 #include <linux/hash.h>
+#include <linux/cpufreq.h>
 
 #include "kfd_priv.h"
 #include "kfd_crat.h"
@@ -712,9 +713,10 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
 		sysfs_show_32bit_prop(buffer, "location_id",
 				dev->node_props.location_id);
 		sysfs_show_32bit_prop(buffer, "max_engine_clk_fcompute",
-				dev->node_props.max_engine_clk_fcompute);
+				kfd2kgd->get_max_engine_clock_in_mhz(
+					dev->gpu->kgd));
 		ret = sysfs_show_32bit_prop(buffer, "max_engine_clk_ccompute",
-				dev->node_props.max_engine_clk_ccompute);
+				cpufreq_quick_get_max(0)/1000);
 	}
 
 	return ret;
-- 
1.9.1
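Both properties end up in MHz, but they start from different units: the radeon sclk value is kept in units of 10 kHz (hence the division by 100 in get_max_engine_clock_in_mhz), while cpufreq reports frequencies in kHz (hence the division by 1000). A small sketch of the two conversions follows; the helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* GPU engine clock: input is in units of 10 kHz, so 100 units make 1 MHz. */
uint32_t sclk_10khz_to_mhz(uint32_t sclk)
{
	return sclk / 100;
}

/* CPU clock: input is in kHz, so 1000 units make 1 MHz. */
uint32_t khz_to_mhz(uint32_t khz)
{
	return khz / 1000;
}
```

For example, an sclk value of 80000 corresponds to 800 MHz, and a CPU maximum of 3700000 kHz corresponds to 3700 MHz. Both conversions use integer division, so any fractional MHz is truncated.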
[PATCH 19/83] hsa/radeon: Enable/Disable KFD interrupt module
This patch adds calls to initialize and finalize the KFD interrupt module. The calls are made per device initialize/finalize inside the kgd-->kfd interface.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/cik_regs.h   |  1 +
 drivers/gpu/hsa/radeon/kfd_device.c | 10 ++++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
index 9c3ce97..813cdc4 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -73,6 +73,7 @@
 #define CP_PQ_WPTR_POLL_CNTL 0xC20C
 #define	WPTR_POLL_EN (1 << 31)
 
+#define CPC_INT_CNTL 0xC2D0
 #define CP_ME1_PIPE0_INT_CNTL 0xC214
 #define CP_ME1_PIPE1_INT_CNTL 0xC218
 #define CP_ME1_PIPE2_INT_CNTL 0xC21C
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index b2d2861..b627e57 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -127,6 +127,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 
 	radeon_kfd_doorbell_init(kfd);
 
+	if (radeon_kfd_interrupt_init(kfd))
+		return false;
+
 	if (!device_iommu_pasid_init(kfd))
 		return false;
 
@@ -155,10 +158,13 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 
 	BUG_ON(err != 0);
 
-	if (kfd->init_complete) {
+	if (kfd->init_complete)
 		kfd->device_info->scheduler_class->stop(kfd->scheduler);
-		kfd->device_info->scheduler_class->destroy(kfd->scheduler);
 
+	radeon_kfd_interrupt_exit(kfd);
+
+	if (kfd->init_complete) {
+		kfd->device_info->scheduler_class->destroy(kfd->scheduler);
 		amd_iommu_free_device(kfd->pdev);
 	}
 
-- 
1.9.1
[PATCH 33/83] hsa/radeon: Fix coding style in cik_int.h
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/cik_int.h | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_int.h b/drivers/gpu/hsa/radeon/cik_int.h
index e98551d..350f0c2 100644
--- a/drivers/gpu/hsa/radeon/cik_int.h
+++ b/drivers/gpu/hsa/radeon/cik_int.h
@@ -26,20 +26,20 @@
 #include <linux/types.h>
 
 struct cik_ih_ring_entry {
-	uint32_t source_id : 8;
-	uint32_t reserved1 : 8;
-	uint32_t reserved2 : 16;
+	uint32_t source_id:8;
+	uint32_t reserved1:8;
+	uint32_t reserved2:16;
 
-	uint32_t data : 28;
-	uint32_t reserved3 : 4;
+	uint32_t data:28;
+	uint32_t reserved3:4;
 
 	/* pipeid, meid and unused3 are officially called RINGID,
 	 * but for our purposes, they always decode into pipe and ME. */
-	uint32_t pipeid : 2;
-	uint32_t meid : 2;
-	uint32_t reserved4 : 4;
-	uint32_t vmid : 8;
-	uint32_t pasid : 16;
+	uint32_t pipeid:2;
+	uint32_t meid:2;
+	uint32_t reserved4:4;
+	uint32_t vmid:8;
+	uint32_t pasid:16;
 
 	uint32_t reserved5;
 };
-- 
1.9.1
[PATCH 35/83] hsa/radeon: Print ioctl command only in debug mode
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index d6fa980..dba6084 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -324,9 +324,9 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	struct kfd_process *process;
 	long err = -EINVAL;
 
-	dev_info(kfd_device,
-		"ioctl cmd 0x%x (#%d), arg 0x%lx\n",
-		cmd, _IOC_NR(cmd), arg);
+	dev_dbg(kfd_device,
+		"ioctl cmd 0x%x (#%d), arg 0x%lx\n",
+		cmd, _IOC_NR(cmd), arg);
 
 	process = radeon_kfd_get_process(current);
 	if (IS_ERR(process))
-- 
1.9.1
[PATCH 29/83] hsa/radeon: Fix memory size allocated for HPD
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 3c3e7d6..5bfde5c 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -433,7 +433,7 @@ static int cik_static_create(struct kfd_dev *dev, struct kfd_scheduler **schedul
 	 * are no active queues.
 	 */
 	err = radeon_kfd_vidmem_alloc(dev,
-				      CIK_HPD_SIZE * priv->num_pipes * 2,
+				      CIK_HPD_SIZE * priv->num_pipes,
 				      PAGE_SIZE,
 				      KFD_MEMPOOL_SYSTEM_WRITECOMBINE,
 				      &priv->hpd_mem);
-- 
1.9.1
[PATCH 31/83] drm/radeon: extending kfd-kgd interface
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Adding API for KFD to be able to query the GPU clock counter.

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 9 +++++++++
 include/linux/radeon_kfd.h          | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index f4cc3c5..121e67b 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -42,6 +42,7 @@ static int kmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, void **ptr);
 static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);
 
 static uint64_t get_vmem_size(struct kgd_dev *kgd);
+static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
 static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
 static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
@@ -55,6 +56,7 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.kmap_mem = kmap_mem,
 	.unkmap_mem = unkmap_mem,
 	.get_vmem_size = get_vmem_size,
+	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
 	.unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
 };
@@ -275,3 +277,10 @@ static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd)
 
 	mutex_unlock(&rdev->srbm_mutex);
 }
+
+static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
+{
+	struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+	return rdev->asic->get_gpu_clock_counter(rdev);
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 63b7bac..fcb6c7a 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -84,6 +84,7 @@ struct kfd2kgd_calls {
 	void (*unkmap_mem)(struct kgd_dev *kgd, struct kgd_mem *mem);
 
 	uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
+	uint64_t (*get_gpu_clock_counter)(struct kgd_dev *kgd);
 
 	/* SRBM_GFX_CNTL mutex */
 	void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
-- 
1.9.1
[PATCH 04/83] drm/radeon: Add radeon -- kfd interface
This patch adds the interface between the radeon driver and the kfd driver. The interface implementation is contained in radeon_kfd.c and radeon_kfd.h. The interface itself is represented by a pointer to struct kfd_dev. The pointer is located inside radeon_device structure. Signed-off-by: Oded Gabbay oded.gab...@amd.com --- drivers/gpu/drm/radeon/Makefile | 1 + drivers/gpu/drm/radeon/radeon.h | 3 ++ drivers/gpu/drm/radeon/radeon_kfd.c | 94 + include/linux/radeon_kfd.h | 67 ++ 4 files changed, 165 insertions(+) create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.c create mode 100644 include/linux/radeon_kfd.h diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile index 1b04002..a1c913d 100644 --- a/drivers/gpu/drm/radeon/Makefile +++ b/drivers/gpu/drm/radeon/Makefile @@ -104,6 +104,7 @@ radeon-y += \ radeon_vce.o \ vce_v1_0.o \ vce_v2_0.o \ + radeon_kfd.o radeon-$(CONFIG_COMPAT) += radeon_ioc32.o radeon-$(CONFIG_VGA_SWITCHEROO) += radeon_atpx_handler.o diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 4e7e41f..90f66bb 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -2340,6 +2340,9 @@ struct radeon_device { struct dev_pm_domain vga_pm_domain; bool have_disp_power_ref; + + /* HSA KFD interface */ + struct kfd_dev *kfd; }; bool radeon_is_px(struct drm_device *dev); diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c new file mode 100644 index 000..7c7f808 --- /dev/null +++ b/drivers/gpu/drm/radeon/radeon_kfd.c @@ -0,0 +1,94 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ + +#include linux/module.h +#include linux/radeon_kfd.h +#include drm/drmP.h +#include radeon.h + +static const struct kfd2kgd_calls kfd2kgd = { +}; + +static const struct kgd2kfd_calls *kgd2kfd; + +bool radeon_kfd_init(void) +{ + bool (*kgd2kfd_init_p)(unsigned, const struct kfd2kgd_calls*, + const struct kgd2kfd_calls**); + + kgd2kfd_init_p = symbol_request(kgd2kfd_init); + + if (kgd2kfd_init_p == NULL) + return false; + + if (!kgd2kfd_init_p(KFD_INTERFACE_VERSION, kfd2kgd, kgd2kfd)) { + symbol_put(kgd2kfd_init); + kgd2kfd = NULL; + + return false; + } + + return true; +} + +void radeon_kfd_fini(void) +{ + if (kgd2kfd) { + kgd2kfd-exit(); + symbol_put(kgd2kfd_init); + } +} + +void radeon_kfd_device_probe(struct radeon_device *rdev) +{ + if (kgd2kfd) + rdev-kfd = kgd2kfd-probe((struct kgd_dev *)rdev, rdev-pdev); +} + +void radeon_kfd_device_init(struct radeon_device *rdev) +{ + if (rdev-kfd) { + struct kgd2kfd_shared_resources gpu_resources = { + .mmio_registers = rdev-rmmio, + + .compute_vmid_bitmap = 0xFF00, + + .first_compute_pipe = 1, + .compute_pipe_count = 8 - 1, + }; + + radeon_doorbell_get_kfd_info(rdev, + gpu_resources.doorbell_physical_address, + gpu_resources.doorbell_aperture_size, + gpu_resources.doorbell_start_offset); + + kgd2kfd-device_init(rdev-kfd, gpu_resources); + } +} + +void radeon_kfd_device_fini(struct radeon_device *rdev) +{ + if (rdev-kfd) { + kgd2kfd-device_exit(rdev-kfd); + rdev-kfd = NULL; + } +} diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h new file mode 100644 index 000..59785e9 --- /dev/null +++ b/include/linux/radeon_kfd.h @@ -0,0 +1,67
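The handshake in radeon_kfd_init() above, where each side hands the other a table of function pointers after a version check, can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the kfd implementation: the body of kgd2kfd_init, the single member in each call table, the value of KFD_INTERFACE_VERSION, and the helpers radeon_side_init/radeon_side_fini are all hypothetical, and symbol_request()/symbol_put() are not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define KFD_INTERFACE_VERSION 1 /* assumed value, for illustration */

/* Calls the kfd side may make into the graphics driver (one member only). */
struct kfd2kgd_calls {
	unsigned long long (*get_vmem_size)(void *kgd);
};

/* Calls the graphics driver may make into kfd (one member only). */
struct kgd2kfd_calls {
	void (*exit)(void);
};

/* ---- hypothetical kfd side ---- */
static const struct kfd2kgd_calls *saved_kfd2kgd;
static int exit_count;

static void kfd_exit(void)
{
	exit_count++;
}

static const struct kgd2kfd_calls kgd2kfd_table = { .exit = kfd_exit };

/* Refuse mismatched versions; otherwise swap call tables. */
bool kgd2kfd_init(unsigned interface_version,
		  const struct kfd2kgd_calls *f2g,
		  const struct kgd2kfd_calls **g2f)
{
	if (interface_version != KFD_INTERFACE_VERSION)
		return false;
	saved_kfd2kgd = f2g;
	*g2f = &kgd2kfd_table;
	return true;
}

/* ---- hypothetical graphics-driver side ---- */
static unsigned long long mock_vmem_size(void *kgd)
{
	(void)kgd;
	return 1ull << 30;
}

static const struct kfd2kgd_calls kfd2kgd = {
	.get_vmem_size = mock_vmem_size,
};
static const struct kgd2kfd_calls *kgd2kfd;

bool radeon_side_init(unsigned version)
{
	if (!kgd2kfd_init(version, &kfd2kgd, &kgd2kfd)) {
		kgd2kfd = NULL; /* leave the interface marked as absent */
		return false;
	}
	return true;
}

void radeon_side_fini(void)
{
	if (kgd2kfd)
		kgd2kfd->exit();
}
```

Keeping kgd2kfd NULL on failure is what lets later entry points, such as the probe and fini paths in the patch, degrade to no-ops when the kfd module is missing or speaks a different interface version.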
[PATCH 11/83] hsa/radeon: Add scheduler code
This patch adds the code base of the scheduler, which handles queue creation, deletion and scheduling on the CP of the GPU.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/Makefile               |   3 +-
 drivers/gpu/hsa/radeon/cik_regs.h             | 213 +++
 drivers/gpu/hsa/radeon/kfd_device.c           |   1 +
 drivers/gpu/hsa/radeon/kfd_registers.c        |  50 ++
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 800 ++
 drivers/gpu/hsa/radeon/kfd_vidmem.c           |  61 ++
 6 files changed, 1127 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/cik_regs.h
 create mode 100644 drivers/gpu/hsa/radeon/kfd_registers.c
 create mode 100644 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
 create mode 100644 drivers/gpu/hsa/radeon/kfd_vidmem.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 989518a..28da10c 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -4,6 +4,7 @@

 radeon_kfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o \
 		kfd_pasid.o kfd_topology.o kfd_process.o \
-		kfd_doorbell.o
+		kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
+		kfd_vidmem.o

 obj-$(CONFIG_HSA_RADEON)	+= radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
new file mode 100644
index 000..d0cdc57
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -0,0 +1,213 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef CIK_REGS_H
+#define CIK_REGS_H
+
+#define BIF_DOORBELL_CNTL			0x530Cu
+
+#define	SRBM_GFX_CNTL				0xE44
+#define	PIPEID(x)				((x) << 0)
+#define	MEID(x)					((x) << 2)
+#define	VMID(x)					((x) << 4)
+#define	QUEUEID(x)				((x) << 8)
+
+#define	SQ_CONFIG				0x8C00
+
+#define	SH_MEM_BASES				0x8C28
+/* if PTR32, these are the bases for scratch and lds */
+#define	PRIVATE_BASE(x)				((x) << 0) /* scratch */
+#define	SHARED_BASE(x)				((x) << 16) /* LDS */
+#define	SH_MEM_APE1_BASE			0x8C2C
+/* if PTR32, this is the base location of GPUVM */
+#define	SH_MEM_APE1_LIMIT			0x8C30
+/* if PTR32, this is the upper limit of GPUVM */
+#define	SH_MEM_CONFIG				0x8C34
+#define	PTR32					(1 << 0)
+#define	ALIGNMENT_MODE(x)			((x) << 2)
+#define	SH_MEM_ALIGNMENT_MODE_DWORD		0
+#define	SH_MEM_ALIGNMENT_MODE_DWORD_STRICT	1
+#define	SH_MEM_ALIGNMENT_MODE_STRICT		2
+#define	SH_MEM_ALIGNMENT_MODE_UNALIGNED		3
+#define	DEFAULT_MTYPE(x)			((x) << 4)
+#define	APE1_MTYPE(x)				((x) << 7)
+
+/* valid for both DEFAULT_MTYPE and APE1_MTYPE */
+#define	MTYPE_NONCACHED				3
+
+
+#define SH_STATIC_MEM_CONFIG			0x9604u
+
+#define	TC_CFG_L1_LOAD_POLICY0			0xAC68
+#define	TC_CFG_L1_LOAD_POLICY1			0xAC6C
+#define	TC_CFG_L1_STORE_POLICY			0xAC70
+#define	TC_CFG_L2_LOAD_POLICY0			0xAC74
+#define	TC_CFG_L2_LOAD_POLICY1			0xAC78
+#define	TC_CFG_L2_STORE_POLICY0
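For readers following the register definitions, here is a minimal sketch of how the SRBM_GFX_CNTL field macros might be combined into a single register value. It assumes each macro is a left shift by the listed amount (0, 2, 4, 8). The helper name srbm_gfx_cntl_value and the bounds it checks are hypothetical: they are inferred only from the gaps between the shift amounts and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Field macros, assuming each one is a left shift by the listed amount. */
#define PIPEID(x)	((uint32_t)(x) << 0)
#define MEID(x)		((uint32_t)(x) << 2)
#define VMID(x)		((uint32_t)(x) << 4)
#define QUEUEID(x)	((uint32_t)(x) << 8)

/* Hypothetical helper: pack the four selectors into one 32-bit value. */
static uint32_t srbm_gfx_cntl_value(uint32_t pipe, uint32_t me,
				    uint32_t vmid, uint32_t queue)
{
	/* Assumed bounds, inferred only from the gaps between shift amounts. */
	assert(pipe < 4 && me < 4 && vmid < 16);

	return PIPEID(pipe) | MEID(me) | VMID(vmid) | QUEUEID(queue);
}
```

With pipe 1, ME 1, VMID 8 and queue 2 the helper yields 0x285 (0x1 | 0x4 | 0x80 | 0x200). A real driver would write such a value to the register through its own MMIO accessor, which is not shown here.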
[PATCH 14/83] hsa/radeon: Update MAINTAINERS and CREDITS files
Update MAINTAINERS and CREDITS files with kfd driver information

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 CREDITS     | 7 +++
 MAINTAINERS | 8
 2 files changed, 15 insertions(+)

diff --git a/CREDITS b/CREDITS
index 03343bf..c5f0aeae 100644
--- a/CREDITS
+++ b/CREDITS
@@ -1197,6 +1197,13 @@ S: R. Tocantins, 89 - Cristo Rei
 S: 80050-430 - Curitiba - Paraná
 S: Brazil

+N: Oded Gabbay
+E: oded.gab...@gmail.com
+D: AMD HSA Radeon (KFD) driver maintainer
+S: 12 Shraga Raphaeli
+S: Petah-Tikva, 4906418
+S: Israel
+
 N: Kumar Gala
 E: ga...@kernel.crashing.org
 D: Embedded PowerPC 6xx/7xx/74xx/82xx/83xx/85xx support
diff --git a/MAINTAINERS b/MAINTAINERS
index 3efbeaf..bf1081f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -592,6 +592,14 @@ F:	drivers/crypto/geode*
 F:	drivers/video/fbdev/geode/
 F:	arch/x86/include/asm/geode.h

+AMD HSA RADEON DRIVER (KFD)
+M:	Oded Gabbay <oded.gab...@amd.com>
+L:	dri-de...@lists.freedesktop.org
+S:	Supported
+F:	drivers/gpu/hsa/radeon
+F:	include/linux/radeon_kfd.h
+F:	include/linux/uapi/linux/kfd_ioctl.h
+
 AMD IOMMU (AMD-VI)
 M:	Joerg Roedel <j...@8bytes.org>
 L:	io...@lists.linux-foundation.org
-- 
1.9.1
[PATCH 05/83] drm/radeon: Add kfd--kgd interface to get virtual ram size
This patch adds a new interface to kfd2kgd_calls structure so that the kfd driver could get the virtual ram size of a specific radeon device.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 12
 include/linux/radeon_kfd.h          |  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 7c7f808..1b859b5 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -25,7 +25,10 @@
 #include <drm/drmP.h>
 #include "radeon.h"

+static uint64_t get_vmem_size(struct kgd_dev *kgd);
+
 static const struct kfd2kgd_calls kfd2kgd = {
+	.get_vmem_size = get_vmem_size,
 };

 static const struct kgd2kfd_calls *kgd2kfd;
@@ -92,3 +95,12 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
 		rdev->kfd = NULL;
 	}
 }
+
+static uint64_t get_vmem_size(struct kgd_dev *kgd)
+{
+	struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+	BUG_ON(kgd == NULL);
+
+	return rdev->mc.real_vram_size;
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 59785e9..28cddf5 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -57,6 +57,7 @@ struct kgd2kfd_calls {
 };

 struct kfd2kgd_calls {
+	uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
 };

 bool kgd2kfd_init(unsigned interface_version,
-- 
1.9.1
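The callback-table pattern this patch extends can be sketched in plain user-space C. This is only an illustration under stated assumptions: struct mock_gfx_device, mock_get_vmem_size, mock_kfd2kgd and kfd_query_vmem_size are hypothetical stand-ins, not names from the patch, and the kernel's BUG_ON is replaced by assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct kgd_dev;	/* opaque handle, as seen from the kfd side */

/* Table of calls the compute side may make into the graphics driver. */
struct kfd2kgd_calls {
	uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
};

/* Hypothetical stand-in for the graphics driver's device structure. */
struct mock_gfx_device {
	uint64_t real_vram_size;
};

/* Provider side: recover the concrete device from the opaque handle. */
static uint64_t mock_get_vmem_size(struct kgd_dev *kgd)
{
	struct mock_gfx_device *dev = (struct mock_gfx_device *)kgd;

	assert(kgd != NULL);

	return dev->real_vram_size;
}

static const struct kfd2kgd_calls mock_kfd2kgd = {
	.get_vmem_size = mock_get_vmem_size,
};

/* Consumer side: it only ever sees the table and the opaque handle. */
static uint64_t kfd_query_vmem_size(const struct kfd2kgd_calls *calls,
				    struct kgd_dev *kgd)
{
	return calls->get_vmem_size(kgd);
}
```

The point of the design is that the consumer never needs the provider's headers: growing the interface means adding one function pointer to the table and one implementation on the provider side, which is the shape of the change in this patch.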
[PATCH 10/83] hsa/radeon: Add initialization and unmapping of doorbell aperture
This patch adds initialization of the doorbell aperture when initializing a kfd device.

It also adds a call to unmap the doorbell when a process unbinds from the kfd

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/Makefile       |  3 +-
 drivers/gpu/hsa/radeon/kfd_device.c   |  2 +
 drivers/gpu/hsa/radeon/kfd_doorbell.c | 72 +++
 3 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_doorbell.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index ba16a09..989518a 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -3,6 +3,7 @@
 #

 radeon_kfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o \
-		kfd_pasid.o kfd_topology.o kfd_process.o
+		kfd_pasid.o kfd_topology.o kfd_process.o \
+		kfd_doorbell.o

 obj-$(CONFIG_HSA_RADEON)	+= radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index d122920..4e9fe6c 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -123,6 +123,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,

 	kfd->regs = gpu_resources->mmio_registers;

+	radeon_kfd_doorbell_init(kfd);
+
 	if (!device_iommu_pasid_init(kfd))
 		return false;

diff --git a/drivers/gpu/hsa/radeon/kfd_doorbell.c b/drivers/gpu/hsa/radeon/kfd_doorbell.c
new file mode 100644
index 000..79a9d4b
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_doorbell.c
@@ -0,0 +1,72 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "kfd_priv.h"
+#include <linux/mm.h>
+#include <linux/mman.h>
+
+/*
+ * Each device exposes a doorbell aperture, a PCI MMIO aperture that
+ * receives 32-bit writes that are passed to queues as wptr values.
+ * The doorbells are intended to be written by applications as part
+ * of queueing work on user-mode queues.
+ * We assign doorbells to applications in PAGE_SIZE-sized and aligned chunks.
+ * We map the doorbell address space into user-mode when a process creates
+ * its first queue on each device.
+ * Although the mapping is done by KFD, it is equivalent to an mmap of
+ * the /dev/kfd with the particular device encoded in the mmap offset.
+ * There will be other uses for mmap of /dev/kfd, so only a range of
+ * offsets (KFD_MMAP_DOORBELL_START-END) is used for doorbells.
+ */
+
+/* # of doorbell bytes allocated for each process. */
+static inline size_t doorbell_process_allocation(void)
+{
+	return roundup(sizeof(doorbell_t) * MAX_PROCESS_QUEUES, PAGE_SIZE);
+}
+
+/* Doorbell calculations for device init. */
+void radeon_kfd_doorbell_init(struct kfd_dev *kfd)
+{
+	size_t doorbell_start_offset;
+	size_t doorbell_aperture_size;
+	size_t doorbell_process_limit;
+
+	/* We start with calculations in bytes because the input data might only be byte-aligned.
+	** Only after we have done the rounding can we assume any alignment. */
+
+	doorbell_start_offset = roundup(kfd->shared_resources.doorbell_start_offset,
+					doorbell_process_allocation());
+	doorbell_aperture_size = rounddown(kfd->shared_resources.doorbell_aperture_size,
+					doorbell_process_allocation());
+
+	if (doorbell_aperture_size > doorbell_start_offset)
+		doorbell_process_limit =
+			(doorbell_aperture_size - doorbell_start_offset) / doorbell_process_allocation();
+	else
+		doorbell_process_limit = 0;
+
+	kfd->doorbell_base = kfd->shared_resources.doorbell_physical_address + doorbell_start_offset;
+	kfd->doorbell_id_offset = doorbell_start_offset / sizeof(doorbell_t);
+	kfd->doorbell_process_limit = doorbell_process_limit;
+}
+
-- 
1.9.1
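The rounding arithmetic in radeon_kfd_doorbell_init can be tried outside the kernel with a small user-space sketch. Everything below is an assumption made for illustration: a 4096-byte page, 1024 queues per process, a 32-bit doorbell_t, a "greater than" comparison between aperture size and start offset, and the hypothetical names doorbell_layout, round_up_to, round_down_to, process_allocation and doorbell_layout_compute. The real constants live in kernel headers that are not part of this excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; the real ones come from kernel headers and kfd_priv.h. */
#define SKETCH_PAGE_SIZE		4096u
#define SKETCH_MAX_PROCESS_QUEUES	1024u

typedef uint32_t doorbell_t;	/* doorbells receive 32-bit writes */

struct doorbell_layout {
	size_t start_offset;	/* first usable byte inside the aperture */
	size_t id_offset;	/* the same position, counted in doorbells */
	size_t process_limit;	/* how many per-process chunks fit */
};

static size_t round_up_to(size_t value, size_t multiple)
{
	assert(multiple != 0);
	return ((value + multiple - 1) / multiple) * multiple;
}

static size_t round_down_to(size_t value, size_t multiple)
{
	assert(multiple != 0);
	return (value / multiple) * multiple;
}

/* Bytes of doorbell space reserved per process, rounded to whole pages. */
static size_t process_allocation(void)
{
	return round_up_to(sizeof(doorbell_t) * SKETCH_MAX_PROCESS_QUEUES,
			   SKETCH_PAGE_SIZE);
}

static struct doorbell_layout doorbell_layout_compute(size_t start_offset,
						      size_t aperture_size)
{
	struct doorbell_layout layout;
	size_t chunk = process_allocation();
	/* Work in bytes first; alignment can only be assumed after rounding. */
	size_t start = round_up_to(start_offset, chunk);
	size_t end = round_down_to(aperture_size, chunk);

	layout.start_offset = start;
	layout.id_offset = start / sizeof(doorbell_t);
	layout.process_limit = end > start ? (end - start) / chunk : 0;
	return layout;
}
```

Under these assumptions, a start offset of 100 bytes in an 8 MiB aperture rounds up to 4096, which gives a doorbell index offset of 1024 and room for 2047 processes. An aperture smaller than one chunk gives a process limit of 0.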
[PATCH 12/83] hsa/radeon: Add kfd mmap handler
This patch adds the kfd mmap handler that maps the physical address of a doorbell page to a user-space virtual address. That virtual address belongs to the process that uses the doorbell page.

This mmap handler is called only from within the kernel and not to be called from user-mode mmap of /dev/kfd.

Signed-off-by: Oded Gabbay <oded.gab...@amd.com>
---
 drivers/gpu/hsa/radeon/kfd_chardev.c  | 20 +
 drivers/gpu/hsa/radeon/kfd_doorbell.c | 85 +++
 2 files changed, 105 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 7a56a8f..0b5bc74 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -39,6 +39,7 @@ static const struct file_operations kfd_fops = {
 	.owner = THIS_MODULE,
 	.unlocked_ioctl = kfd_ioctl,
 	.open = kfd_open,
+	.mmap = kfd_mmap,
 };

 static int kfd_char_dev_major = -1;
@@ -131,3 +132,22 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)

 	return err;
 }
+
+static int
+kfd_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	unsigned long pgoff = vma->vm_pgoff;
+	struct kfd_process *process;
+
+	process = radeon_kfd_get_process(current);
+	if (IS_ERR(process))
+		return PTR_ERR(process);
+
+	if (pgoff < KFD_MMAP_DOORBELL_START)
+		return -EINVAL;
+
+	if (pgoff < KFD_MMAP_DOORBELL_END)
+		return radeon_kfd_doorbell_mmap(process, vma);
+
+	return -EINVAL;
+}
diff --git a/drivers/gpu/hsa/radeon/kfd_doorbell.c b/drivers/gpu/hsa/radeon/kfd_doorbell.c
index 79a9d4b..e1d8506 100644
--- a/drivers/gpu/hsa/radeon/kfd_doorbell.c
+++ b/drivers/gpu/hsa/radeon/kfd_doorbell.c
@@ -70,3 +70,88 @@ void radeon_kfd_doorbell_init(struct kfd_dev *kfd)
 	kfd->doorbell_process_limit = doorbell_process_limit;
 }

+/* This is the /dev/kfd mmap (for doorbell) implementation. We intend that this is only called through map_doorbells,
+** not through user-mode mmap of /dev/kfd. */
+int radeon_kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
+{
+	unsigned int device_index;
+	struct kfd_dev *dev;
+	phys_addr_t start;
+
+	BUG_ON(vma->vm_pgoff < KFD_MMAP_DOORBELL_START || vma->vm_pgoff >= KFD_MMAP_DOORBELL_END);
+
+	/* For simplicity we only allow mapping of the entire doorbell allocation of a single device & process. */
+	if (vma->vm_end - vma->vm_start != doorbell_process_allocation())
+		return -EINVAL;
+
+	/* device_index must be GPU ID!! */
+	device_index = vma->vm_pgoff - KFD_MMAP_DOORBELL_START;
+
+	dev = radeon_kfd_device_by_id(device_index);
+	if (dev == NULL)
+		return -EINVAL;
+
+	vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE | VM_DONTDUMP | VM_PFNMAP;
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+	start = dev->doorbell_base + process->pasid * doorbell_process_allocation();
+
+	pr_debug("kfd: mapping doorbell page in radeon_kfd_doorbell_mmap\n"
+		 " target user address == 0x%016llX\n"
+		 " physical address== 0x%016llX\n"
+		 " vm_flags== 0x%08lX\n"
+		 " size== 0x%08lX\n",
+		 (long long unsigned int) vma->vm_start, start, vma->vm_flags,
+		 doorbell_process_allocation());
+
+	return io_remap_pfn_range(vma,
+				vma->vm_start,
+				start >> PAGE_SHIFT,
+				doorbell_process_allocation(),
+				vma->vm_page_prot);
+}
+
+/* Map the doorbells for a single process & device. This will indirectly call radeon_kfd_doorbell_mmap.
+** This assumes that the process mutex is being held. */
+static int
+map_doorbells(struct file *devkfd, struct kfd_process *process, struct kfd_dev *dev)
+{
+	struct kfd_process_device *pdd = radeon_kfd_get_process_device_data(dev, process);
+
+	if (pdd == NULL)
+		return -ENOMEM;
+
+	if (pdd->doorbell_mapping == NULL) {
+		unsigned long offset = (KFD_MMAP_DOORBELL_START + dev->id) << PAGE_SHIFT;
+		doorbell_t __user *doorbell_mapping;
+
+		doorbell_mapping = (doorbell_t __user *)vm_mmap(devkfd, 0, doorbell_process_allocation(), PROT_WRITE,
+								MAP_SHARED, offset);
+		if (IS_ERR(doorbell_mapping))
+			return PTR_ERR(doorbell_mapping);
+
+		pdd->doorbell_mapping = doorbell_mapping;
+	}
+
+	return 0;
+}
+
+/* Get the user-mode address of a doorbell. Assumes that the process mutex is being held. */
+doorbell_t __user *radeon_kfd_get_doorbell(struct file *devkfd, struct kfd_process *process, struct kfd_dev *dev
Re: [PATCH v2 00/25] AMDKFD kernel driver
On 20/07/14 20:46, Jerome Glisse wrote:

On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote:

Forgot to cc mailing list on cover letter. Sorry.

As a continuation to the existing discussion, here is a v2 patch series restructured with a cleaner history and no totally-different-early-versions of the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them are modifications to radeon driver and 18 of them include only amdkfd code. There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under drm/radeon/amdkfd. This move was done to emphasize the fact that this driver is an AMD-only driver at this point. Having said that, we do foresee a generic hsa framework being implemented in the future and in that case, we will adjust amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to keep it as a separate driver from radeon. Therefore, the amdkfd code is contained in its own folder. The amdkfd folder was put under the radeon folder because the only AMD gfx driver in the Linux kernel at this point is the radeon driver. Having said that, we will probably need to move it (maybe to be directly under drm) after we integrate with additional AMD gfx drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay oded.gab...@amd.com

So quick comments before i finish going over all patches. There is many things that need more documentation especially as of right now there is no userspace i can go look at.

So quick comments on some of your questions but first of all, thanks for the time you dedicated to review the code.

There few show stopper, biggest one is gpu memory pinning this is a big no, that would need serious arguments for any hope of convincing me on that side.

We only do gpu memory pinning for kernel objects. There are no userspace objects that are pinned on the gpu memory in our driver. If that is the case, is it still a show stopper ?

The kernel objects are:
- pipelines (4 per device)
- mqd per hiq (only 1 per device)
- mqd per userspace queue. On KV, we support up to 1K queues per process, for a total of 512K queues. Each mqd is 151 bytes, but the allocation is done in 256 alignment. So total *possible* memory is 128MB
- kernel queue (only 1 per device)
- fence address for kernel queue
- runlists for the CP (1 or 2 per device)

It might be better to add a drivers/gpu/drm/amd directory and add common stuff there.

Given that this is not intended to be final HSA api AFAICT then i would say this far better to avoid the whole kfd module and add ioctl to radeon. This would avoid crazy communication btw radeon and kfd.

The whole aperture business needs some serious explanation. Especially as you want to use userspace address there is nothing to prevent userspace program from allocating things at address you reserve for lds, scratch, ... only sane way would be to move those lds, scratch inside the virtual address reserved for kernel (see kernel memory map).

The whole business of locking performance counter for exclusive per process access is a big NO. Which leads me to the questionable usefulness of user space command ring.

That's like saying: "Which leads me to the questionable usefulness of HSA". I find it analogous to a situation where a network maintainer nacking a driver for a network card, which is slower than a different network card. Doesn't seem reasonable this situation would happen. He would still put both the drivers in the kernel because people want to use the H/W and its features. So, I don't think this is a valid reason to NACK the driver.

I only see issues with that. First and foremost i would need to see solid figures that kernel ioctl or syscall has a higher overhead that is measurable in any meaningful way against a simple function call. I know the userspace command ring is a big marketing features that please ignorant userspace programmer. But really this only brings issues and for absolutely not upside afaict.

Really ? You think that doing a context switch to kernel space, with all its overhead, is _not_ more expensive than just calling a function in userspace which only puts a buffer on a ring and writes a doorbell ?

So i would rather see a very simple ioctl that write the doorbell and might do more than that in case of ring/queue overcommit where it would first have to wait for a free ring/queue to schedule stuff. This would also allow sane implementation of things like performance counter that could be acquire by kernel for duration of a job submitted by userspace. While still not optimal this would be better that userspace locking.

I might have more thoughts once i am done with all the patches.

Cheers,
Jérôme

Original Cover Letter: This patch set implements
Re: [PATCH v2 00/25] AMDKFD kernel driver
On 21/07/14 16:39, Christian König wrote:

On 21.07.2014 14:36, Oded Gabbay wrote:

On 20/07/14 20:46, Jerome Glisse wrote:

On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote:

Forgot to cc mailing list on cover letter. Sorry.

As a continuation to the existing discussion, here is a v2 patch series restructured with a cleaner history and no totally-different-early-versions of the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them are modifications to radeon driver and 18 of them include only amdkfd code. There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under drm/radeon/amdkfd. This move was done to emphasize the fact that this driver is an AMD-only driver at this point. Having said that, we do foresee a generic hsa framework being implemented in the future and in that case, we will adjust amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to keep it as a separate driver from radeon. Therefore, the amdkfd code is contained in its own folder. The amdkfd folder was put under the radeon folder because the only AMD gfx driver in the Linux kernel at this point is the radeon driver. Having said that, we will probably need to move it (maybe to be directly under drm) after we integrate with additional AMD gfx drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay oded.gab...@amd.com

So quick comments before i finish going over all patches. There is many things that need more documentation especially as of right now there is no userspace i can go look at.

So quick comments on some of your questions but first of all, thanks for the time you dedicated to review the code.

There few show stopper, biggest one is gpu memory pinning this is a big no, that would need serious arguments for any hope of convincing me on that side.

We only do gpu memory pinning for kernel objects. There are no userspace objects that are pinned on the gpu memory in our driver. If that is the case, is it still a show stopper ?

The kernel objects are:
- pipelines (4 per device)
- mqd per hiq (only 1 per device)
- mqd per userspace queue. On KV, we support up to 1K queues per process, for a total of 512K queues. Each mqd is 151 bytes, but the allocation is done in 256 alignment. So total *possible* memory is 128MB
- kernel queue (only 1 per device)
- fence address for kernel queue
- runlists for the CP (1 or 2 per device)

The main questions here are if it's avoidable to pin down the memory and if the memory is pinned down at driver load, by request from userspace or by anything else.

As far as I can see only the mqd per userspace queue might be a bit questionable, everything else sounds reasonable.

Christian.

Most of the pin downs are done on device initialization. The mqd per userspace is done per userspace queue creation. However, as I said, it has an upper limit of 128MB on KV, and considering the 2G local memory, I think it is OK. The runlists are also done on userspace queue creation/deletion, but we only have 1 or 2 runlists per device, so it is not that bad.

Oded

It might be better to add a drivers/gpu/drm/amd directory and add common stuff there.

Given that this is not intended to be final HSA api AFAICT then i would say this far better to avoid the whole kfd module and add ioctl to radeon. This would avoid crazy communication btw radeon and kfd.

The whole aperture business needs some serious explanation. Especially as you want to use userspace address there is nothing to prevent userspace program from allocating things at address you reserve for lds, scratch, ... only sane way would be to move those lds, scratch inside the virtual address reserved for kernel (see kernel memory map).

The whole business of locking performance counter for exclusive per process access is a big NO. Which leads me to the questionable usefulness of user space command ring.

That's like saying: "Which leads me to the questionable usefulness of HSA". I find it analogous to a situation where a network maintainer nacking a driver for a network card, which is slower than a different network card. Doesn't seem reasonable this situation would happen. He would still put both the drivers in the kernel because people want to use the H/W and its features. So, I don't think this is a valid reason to NACK the driver.

I only see issues with that. First and foremost i would need to see solid figures that kernel ioctl or syscall has a higher overhead that is measurable in any meaningful way against a simple function call. I know the userspace command ring is a big marketing features that please ignorant userspace programmer. But really this only brings issues and for absolutely not upside afaict.

Really ? You think
Re: [PATCH v2 00/25] AMDKFD kernel driver
On 21/07/14 20:05, Daniel Vetter wrote:

On Mon, Jul 21, 2014 at 11:58:52AM -0400, Jerome Glisse wrote:

On Mon, Jul 21, 2014 at 05:25:11PM +0200, Daniel Vetter wrote:

On Mon, Jul 21, 2014 at 03:39:09PM +0200, Christian König wrote:

On 21.07.2014 14:36, Oded Gabbay wrote:

On 20/07/14 20:46, Jerome Glisse wrote:

On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote:

Forgot to cc mailing list on cover letter. Sorry.

As a continuation to the existing discussion, here is a v2 patch series restructured with a cleaner history and no totally-different-early-versions of the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them are modifications to radeon driver and 18 of them include only amdkfd code. There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under drm/radeon/amdkfd. This move was done to emphasize the fact that this driver is an AMD-only driver at this point. Having said that, we do foresee a generic hsa framework being implemented in the future and in that case, we will adjust amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to keep it as a separate driver from radeon. Therefore, the amdkfd code is contained in its own folder. The amdkfd folder was put under the radeon folder because the only AMD gfx driver in the Linux kernel at this point is the radeon driver. Having said that, we will probably need to move it (maybe to be directly under drm) after we integrate with additional AMD gfx drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay oded.gab...@amd.com

So quick comments before i finish going over all patches. There is many things that need more documentation especially as of right now there is no userspace i can go look at.

So quick comments on some of your questions but first of all, thanks for the time you dedicated to review the code.

There few show stopper, biggest one is gpu memory pinning this is a big no, that would need serious arguments for any hope of convincing me on that side.

We only do gpu memory pinning for kernel objects. There are no userspace objects that are pinned on the gpu memory in our driver. If that is the case, is it still a show stopper ?

The kernel objects are:
- pipelines (4 per device)
- mqd per hiq (only 1 per device)
- mqd per userspace queue. On KV, we support up to 1K queues per process, for a total of 512K queues. Each mqd is 151 bytes, but the allocation is done in 256 alignment. So total *possible* memory is 128MB
- kernel queue (only 1 per device)
- fence address for kernel queue
- runlists for the CP (1 or 2 per device)

The main questions here are if it's avoidable to pin down the memory and if the memory is pinned down at driver load, by request from userspace or by anything else.

As far as I can see only the mqd per userspace queue might be a bit questionable, everything else sounds reasonable.

Aside, i915 perspective again (i.e. how we solved this): When scheduling away from contexts we unpin them and put them into the lru. And in the shrinker we have a last-ditch callback to switch to a default context (since you can't ever have no context once you've started) which means we can evict any context object if it's getting in the way.

So Intel hardware report through some interrupt or some channel when it is not using a context ? ie kernel side get notification when some user context is done executing ?

Yes, as long as we do the scheduling with the cpu we get interrupts for context switches. The mechanic is already published in the execlist patches currently floating around. We get a special context switch interrupt.

But we have this unpin logic already on the current code where we switch contexts through in-line cs commands from the kernel. There we obviously use the normal batch completion events.

The issue with radeon hardware AFAICT is that the hardware do not report any thing about the userspace context running ie you do not get notification when a context is not use. Well AFAICT. Maybe hardware do provide that.

I'm not sure whether we can do the same trick with the hw scheduler. But then unpinning hw contexts will drain the pipeline anyway, so I guess we can just stop feeding the hw scheduler until it runs dry. And then unpin and evict.

So, I'm afraid but we can't do this for AMD Kaveri because:

a. The hw scheduler doesn't inform us which queues it is going to execute next. We feed it a runlist of queues, which can be very large (we have a test that runs 1000 queues on the same runlist, but we can put a lot more). All the MQDs of those queues must be pinned in memory as long as the runlist is in effect. The runlist is in effect until either
Re: [PATCH v2 00/25] AMDKFD kernel driver
On 21/07/14 18:54, Jerome Glisse wrote:

On Mon, Jul 21, 2014 at 05:12:06PM +0300, Oded Gabbay wrote:

On 21/07/14 16:39, Christian König wrote:

On 21.07.2014 14:36, Oded Gabbay wrote:

On 20/07/14 20:46, Jerome Glisse wrote:

On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote:

Forgot to cc mailing list on cover letter. Sorry.

As a continuation to the existing discussion, here is a v2 patch series restructured with a cleaner history and no totally-different-early-versions of the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them are modifications to radeon driver and 18 of them include only amdkfd code. There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under drm/radeon/amdkfd. This move was done to emphasize the fact that this driver is an AMD-only driver at this point. Having said that, we do foresee a generic hsa framework being implemented in the future and in that case, we will adjust amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to keep it as a separate driver from radeon. Therefore, the amdkfd code is contained in its own folder. The amdkfd folder was put under the radeon folder because the only AMD gfx driver in the Linux kernel at this point is the radeon driver. Having said that, we will probably need to move it (maybe to be directly under drm) after we integrate with additional AMD gfx drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay oded.gab...@amd.com

So quick comments before i finish going over all patches. There is many things that need more documentation especially as of right now there is no userspace i can go look at.

So quick comments on some of your questions but first of all, thanks for the time you dedicated to review the code.

There few show stopper, biggest one is gpu memory pinning this is a big no, that would need serious arguments for any hope of convincing me on that side.

We only do gpu memory pinning for kernel objects. There are no userspace objects that are pinned on the gpu memory in our driver. If that is the case, is it still a show stopper ?

The kernel objects are:
- pipelines (4 per device)
- mqd per hiq (only 1 per device)
- mqd per userspace queue. On KV, we support up to 1K queues per process, for a total of 512K queues. Each mqd is 151 bytes, but the allocation is done in 256 alignment. So total *possible* memory is 128MB
- kernel queue (only 1 per device)
- fence address for kernel queue
- runlists for the CP (1 or 2 per device)

The main questions here are if it's avoidable to pin down the memory and if the memory is pinned down at driver load, by request from userspace or by anything else.

As far as I can see only the mqd per userspace queue might be a bit questionable, everything else sounds reasonable.

Christian.

Most of the pin downs are done on device initialization. The mqd per userspace is done per userspace queue creation. However, as I said, it has an upper limit of 128MB on KV, and considering the 2G local memory, I think it is OK. The runlists are also done on userspace queue creation/deletion, but we only have 1 or 2 runlists per device, so it is not that bad.

2G local memory ? You can not assume anything on userside configuration some one might build an hsa computer with 512M and still expect a functioning desktop.

First of all, I'm only considering Kaveri computer, not hsa computer. Second, I would imagine we can build some protection around it, like checking total local memory and limit number of queues based on some percentage of that total local memory. So, if someone will have only 512M, he will be able to open less queues.

I need to go look into what all this mqd is for, what it does and what it is about. But pinning is really bad and this is an issue with userspace command scheduling an issue that obviously AMD fails to take into account in design phase.

Maybe, but that is the H/W design nonetheless. We can't very well change the H/W.

Oded

Oded

It might be better to add a drivers/gpu/drm/amd directory and add common stuff there.

Given that this is not intended to be final HSA api AFAICT then i would say this far better to avoid the whole kfd module and add ioctl to radeon. This would avoid crazy communication btw radeon and kfd.

The whole aperture business needs some serious explanation. Especially as you want to use userspace address there is nothing to prevent userspace program from allocating things at address you reserve for lds, scratch, ... only sane way would be to move those lds, scratch inside the virtual address reserved for kernel (see kernel memory map).

The whole business
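The MQD figures quoted in this thread, and the suggestion to cap the number of queues by a percentage of local memory, can be checked with a small sketch. This is not driver code. The 151-byte MQD size, the 256-byte allocation granularity and the 512K queue total come from the messages above; the function names and the 5% budget used in the usage note are hypothetical, since the thread names no concrete percentage.

```c
#include <assert.h>
#include <stdint.h>

#define MQD_SIZE_BYTES	151u	/* per the thread */
#define MQD_ALIGN_BYTES	256u	/* allocation granularity, per the thread */

static uint64_t align_up(uint64_t value, uint64_t alignment)
{
	assert(alignment != 0);
	return ((value + alignment - 1) / alignment) * alignment;
}

/* Worst-case pinned memory if every one of `queues` MQDs is allocated. */
static uint64_t mqd_pinned_bytes(uint64_t queues)
{
	return queues * align_up(MQD_SIZE_BYTES, MQD_ALIGN_BYTES);
}

/*
 * Hypothetical policy sketch: allow MQDs to pin at most `percent` of local
 * memory and derive the queue limit from that budget.
 */
static uint64_t max_queues_for_budget(uint64_t local_mem_bytes,
				      unsigned int percent)
{
	uint64_t budget = local_mem_bytes * percent / 100;

	return budget / align_up(MQD_SIZE_BYTES, MQD_ALIGN_BYTES);
}
```

For 512K queues (524288), mqd_pinned_bytes gives 134217728 bytes, which is the 128MB upper bound quoted above. With the hypothetical 5% budget, a device with 512 MiB of local memory would be limited to 104857 queues.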