Re: [PATCH 3/3] MMC: FSL SDHC: Add support for hard-wired (permanent) card. Kernel version 3.4.47

2013-06-10 Thread Oded Gabbay

Hi Dirk,

You are absolutely right.
I will revise my patch series to reflect the change.
Basically, I will call the generic mmc_of_parse from the probe function 
of Freescale's driver.

That will handle all the additional capabilities.

Thanks
Oded

On 06/10/2013 09:29 AM, Dirk Behme wrote:
> On 02.06.2013 08:38, Oded Gabbay wrote:
>> This patch adds support of recognizing hard-wired (permanent) cards
>> to Freescale's SDHC host driver. This is done by adding the option
>> fsl,card-wired to the SDHC device-tree entry. Detection of this
>> option is done in the probe function. Update documentation in file
>> fsl-esdhc.txt
>
> Why do you want to introduce fsl,card-wired? Why don't you use
> non-removable?
>
> To my understanding the patch
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7f217794ffa72f208a250b79ab0b7ea3de19677f
>
> explicitly removed fsl,card-wired. So I don't think re-introducing
> it is a good idea?
>
> Best regards
>
> Dirk


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] MMC: FSL SDHC: Add support for hard-wired (permanent) card. Kernel version 3.4.47

2013-06-10 Thread Oded Gabbay

Hi All,

I just noticed that 3.4.47/3.4.48 does not have mmc_of_parse (unlike
3.9.4).
Therefore, I will not use it and will instead fix the code to recognize
the non-removable property directly.


Best regards,
Oded

On 06/10/2013 04:43 PM, Oded Gabbay wrote:
> Hi Dirk,
>
> You are absolutely right.
> I will revise my patch series to reflect the change.
> Basically, I will call the generic mmc_of_parse from the probe
> function of Freescale's driver.
>
> That will handle all the additional capabilities.
>
> Thanks
> Oded
>
> On 06/10/2013 09:29 AM, Dirk Behme wrote:
>> On 02.06.2013 08:38, Oded Gabbay wrote:
>>> This patch adds support of recognizing hard-wired (permanent) cards
>>> to Freescale's SDHC host driver. This is done by adding the option
>>> fsl,card-wired to the SDHC device-tree entry. Detection of this
>>> option is done in the probe function. Update documentation in file
>>> fsl-esdhc.txt
>>
>> Why do you want to introduce fsl,card-wired? Why don't you use
>> non-removable?
>>
>> To my understanding the patch
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7f217794ffa72f208a250b79ab0b7ea3de19677f
>>
>> explicitly removed fsl,card-wired. So I don't think re-introducing
>> it is a good idea?
>>
>> Best regards
>>
>> Dirk






[PATCH] MMC: FSL SDHC: Add support for non-removable card. Kernel version 3.4.48

2013-06-10 Thread Oded Gabbay
This patch adds support for recognizing non-removable cards
to Freescale's SDHC host driver. This is done by detecting the
non-removable property in the probe function.

This patch depends on patch [2/3] from 6-jun-2013:
https://patchwork.kernel.org/patch/2649381/

This patch replaces patch [3/3] from 6-jun-2013:
https://patchwork.kernel.org/patch/2649231/

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index e70f22f..a6e068c 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -222,6 +222,10 @@ static int __devinit sdhci_esdhc_probe(struct platform_device *pdev)
 		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
 	}
 
+	/* Check if card is non-removable */
+	if (of_find_property(np, "non-removable", NULL))
+		host->caps |= MMC_CAP_NONREMOVABLE;
+
 	ret = sdhci_add_host(host);
 	if (ret)
 		sdhci_pltfm_free(pdev);
-- 
1.8.3



[PATCH] MDIO: FSL_PQ_MDIO: Fix bug on incorrect offset of tbipa register

2013-06-12 Thread Oded Gabbay
This patch fixes a bug in the fsl_pq_mdio.c module and in the relevant
device-tree files regarding the offset of the tbipa register in the eTSEC
controller of some of Freescale's PQ3 and QorIQ SoCs.
The bug occurs when the mdio node in the device tree is configured as
compatible with fsl,gianfar-tbi. Because the mdio node points to address
0x25520, 0x26520 or 0x27520 (depending on the controller ID), the variable
priv->map in fsl_pq_mdio_probe points to that address. However, later in the
function there is a write to the tbipa register, which is actually located at
0x25030, 0x26030 or 0x27030. Because the correct address is not I/O mapped,
the value is written to a different register in the controller.
The fix sets the address of the mdio node to start at 0x25000, 0x26000 or
0x27000 and changes the mii_offset field to 0x520 in the relevant entry
(fsl,gianfar-tbi) of the fsl_pq_mdio_match array.

Note: This patch may break MDIO functionality on some older Freescale SoCs
until Freescale fixes their device tree files. Basically, every device tree
that contains an mdio node compatible with fsl,gianfar-tbi should be
examined.
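
The address arithmetic behind the bug can be checked in isolation. The
following is a user-space sketch, not driver code: struct reg_window and both
helpers are hypothetical names, and it assumes the driver reaches tbipa by
adding offset 0x30 to the start of the mapped window, which is what the
addresses quoted above imply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets inside one eTSEC register block, taken from the description. */
#define ETSEC_TBIPA_OFFSET 0x030u /* tbipa, e.g. 0x25030 for the first eTSEC */
#define ETSEC_MII_OFFSET   0x520u /* MII registers, e.g. 0x25520 */

/* Hypothetical stand-in for the I/O-mapped "reg" window of the mdio node. */
struct reg_window {
	uint32_t start; /* first mapped address */
	uint32_t size;  /* number of mapped bytes */
};

/* True when addr lies inside the mapped window. */
static bool window_contains(const struct reg_window *w, uint32_t addr)
{
	assert(w->size > 0);
	return addr >= w->start && addr - w->start < w->size;
}

/* Address hit when the tbipa offset is added to the window start (assumed). */
static uint32_t tbipa_write_target(const struct reg_window *w)
{
	return w->start + ETSEC_TBIPA_OFFSET;
}
```

With the old window {0x25520, 0x20}, window_contains() reports that 0x25030 is
not mapped and tbipa_write_target() yields 0x25550. With the new window
{0x25000, 0x1000}, the target is 0x25030 and the MII registers at 0x25520
remain inside the mapping.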

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi    | 4 ++--
 arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi    | 4 ++--
 arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi    | 4 ++--
 arch/powerpc/boot/dts/ge_imp3a.dts             | 4 ++--
 arch/powerpc/boot/dts/mpc8536ds.dtsi           | 4 ++--
 arch/powerpc/boot/dts/mpc8544ds.dtsi           | 2 +-
 arch/powerpc/boot/dts/mpc8548cds.dtsi          | 6 +++---
 arch/powerpc/boot/dts/mpc8568mds.dts           | 2 +-
 arch/powerpc/boot/dts/mpc8572ds.dtsi           | 6 +++---
 arch/powerpc/boot/dts/mpc8572ds_camp_core0.dts | 4 ++--
 arch/powerpc/boot/dts/mpc8572ds_camp_core1.dts | 2 +-
 arch/powerpc/boot/dts/p2020ds.dtsi             | 4 ++--
 arch/powerpc/boot/dts/p2020rdb-pc.dtsi         | 4 ++--
 arch/powerpc/boot/dts/p2020rdb.dts             | 4 ++--
 arch/powerpc/boot/dts/ppa8548.dts              | 6 +++---
 drivers/net/ethernet/freescale/fsl_pq_mdio.c   | 2 +-
 16 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi b/arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi
index 96693b4..d38bf63 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec1-1.dtsi
@@ -46,9 +46,9 @@ ethernet@25000 {
 	interrupts = <35 2 0 0 36 2 0 0 40 2 0 0>;
 };
 
-mdio@25520 {
+mdio@25000 {
 	#address-cells = <1>;
 	#size-cells = <0>;
 	compatible = "fsl,gianfar-tbi";
-	reg = <0x25520 0x20>;
+	reg = <0x25000 0x1000>;
 };
diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi b/arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi
index 6b3fab1..6290b49 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec1-2.dtsi
@@ -46,9 +46,9 @@ ethernet@26000 {
 	interrupts = <31 2 0 0 32 2 0 0 33 2 0 0>;
 };
 
-mdio@26520 {
+mdio@26000 {
 	#address-cells = <1>;
 	#size-cells = <0>;
 	compatible = "fsl,gianfar-tbi";
-	reg = <0x26520 0x20>;
+	reg = <0x26000 0x1000>;
 };
diff --git a/arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi b/arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi
index 0da592d..5296811 100644
--- a/arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi
+++ b/arch/powerpc/boot/dts/fsl/pq3-etsec1-3.dtsi
@@ -46,9 +46,9 @@ ethernet@27000 {
 	interrupts = <37 2 0 0 38 2 0 0 39 2 0 0>;
 };
 
-mdio@27520 {
+mdio@27000 {
 	#address-cells = <1>;
 	#size-cells = <0>;
 	compatible = "fsl,gianfar-tbi";
-	reg = <0x27520 0x20>;
+	reg = <0x27000 0x1000>;
 };
diff --git a/arch/powerpc/boot/dts/ge_imp3a.dts b/arch/powerpc/boot/dts/ge_imp3a.dts
index fefae41..49d9b4e 100644
--- a/arch/powerpc/boot/dts/ge_imp3a.dts
+++ b/arch/powerpc/boot/dts/ge_imp3a.dts
@@ -174,14 +174,14 @@
 			};
 		};
 
-		mdio@25520 {
+		mdio@25000 {
 			tbi1: tbi-phy@11 {
 				reg = <0x11>;
 				device_type = "tbi-phy";
 			};
 		};
 
-		mdio@26520 {
+		mdio@26000 {
 			status = "disabled";
 		};
 
diff --git a/arch/powerpc/boot/dts/mpc8536ds.dtsi b/arch/powerpc/boot/dts/mpc8536ds.dtsi
index 7c3dde8..c4df5a1 100644
--- a/arch/powerpc/boot/dts/mpc8536ds.dtsi
+++ b/arch/powerpc/boot/dts/mpc8536ds.dtsi
@@ -227,11 +227,11 @@
 		phy-connection-type = "rgmii-id";
 	};
 
-	mdio@26520 {
+	mdio@26000 {
 		#address-cells = <1>;
 		#size-cells = <0>;
 		compatible = "fsl,gianfar-tbi";
-		reg = <0x26520 0x20>;
+		reg = <0x26000 0x1000>;
 
 		tbi1: tbi-phy@11 {
 			reg = <0x11>;
diff --git a/arch/powerpc/boot/dts/mpc8544ds.dtsi 
b/arch/powerpc/boot

Re: [PATCH] MDIO: FSL_PQ_MDIO: Fix bug on incorrect offset of tbipa register

2013-06-12 Thread Oded Gabbay

On 06/12/2013 04:04 PM, Timur Tabi wrote:
> Oded Gabbay wrote:
>> Note: This patch may break MDIO functionality on some older Freescale
>> SoCs until Freescale fixes their device tree files. Basically, every
>> device tree that contains an mdio node compatible with fsl,gianfar-tbi
>> should be examined.
>
> I haven't had a chance to review the patch in detail, but I can tell
> you that breaking compatibility with older device trees is
> unacceptable.  You need to add some code, even if it's an ugly hack,
> to support those trees.

I generally agree with this statement, except that without this patch
almost ALL of the Freescale SoCs that use fsl,gianfar-tbi are broken,
including the older ones. At least this patch fixes some of the device
trees. Because I'm not working at Freescale, I have very limited
access to the few SoCs I could test this patch on. I think it is
Freescale's responsibility to release a complementary patch to fix the
rest of the SoC device trees.


Oded


[PATCH 1/2] MMC: P2020 SDHC: Add support for 8-bit bus width and non-removable card

2013-06-12 Thread Oded Gabbay
This patch adds support for connecting MMC media over an 8-bit bus to
Freescale's P2020 SDHC controller. In the probe function, the generic
function mmc_of_parse is called to detect whether the controller is
configured with an 8-bit bus width. The generic function also detects
whether the non-removable property is set in the device tree.
The function esdhc_pltfm_bus_width is added because the bus width
configuration is platform specific.

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-esdhc.h    |  7 +++++++
 drivers/mmc/host/sdhci-of-esdhc.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-esdhc.h b/drivers/mmc/host/sdhci-esdhc.h
index d25f9ab..6f9a018 100644
--- a/drivers/mmc/host/sdhci-esdhc.h
+++ b/drivers/mmc/host/sdhci-esdhc.h
@@ -36,6 +36,13 @@
 /* pltfm-specific */
 #define ESDHC_HOST_CONTROL_LE	0x20
 
+/*
+ * P2020 interpretation of the SDHCI_HOST_CONTROL register
+ */
+#define ESDHC_CTRL_4BITBUS		(0x1 << 1)
+#define ESDHC_CTRL_8BITBUS		(0x2 << 1)
+#define ESDHC_CTRL_BUSWIDTH_MASK	(0x3 << 1)
+
 /* OF-specific */
 #define ESDHC_DMA_SYSCTL	0x40c
 #define ESDHC_DMA_SNOOP		0x0040
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 5e68adc..fd149a0 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -13,6 +13,7 @@
  * your option) any later version.
  */
 
+#include <linux/err.h>
 #include <linux/io.h>
 #include <linux/of.h>
 #include <linux/delay.h>
@@ -230,6 +231,30 @@ static void esdhc_of_platform_init(struct sdhci_host *host)
 		host->quirks &= ~SDHCI_QUIRK_NO_BUSY_IRQ;
 }
 
+static int esdhc_pltfm_bus_width(struct sdhci_host *host, int width)
+{
+	u32 ctrl;
+
+	switch (width) {
+	case MMC_BUS_WIDTH_8:
+		ctrl = ESDHC_CTRL_8BITBUS;
+		break;
+
+	case MMC_BUS_WIDTH_4:
+		ctrl = ESDHC_CTRL_4BITBUS;
+		break;
+
+	default:
+		ctrl = 0;
+		break;
+	}
+
+	clrsetbits_be32(host->ioaddr + SDHCI_HOST_CONTROL,
+			ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
+
+	return 0;
+}
+
 static const struct sdhci_ops sdhci_esdhc_ops = {
 	.read_l = esdhc_readl,
 	.read_w = esdhc_readw,
@@ -247,6 +272,7 @@ static const struct sdhci_ops sdhci_esdhc_ops = {
 	.platform_resume = esdhc_of_resume,
 #endif
 	.adma_workaround = esdhci_of_adma_workaround,
+	.platform_bus_width = esdhc_pltfm_bus_width,
 };
 
 static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
@@ -262,7 +288,23 @@ static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 
 static int sdhci_esdhc_probe(struct platform_device *pdev)
 {
-	return sdhci_pltfm_register(pdev, &sdhci_esdhc_pdata);
+	struct sdhci_host *host;
+	int ret = 0;
+
+	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
+	if (IS_ERR(host))
+		return PTR_ERR(host);
+
+	sdhci_get_of_property(pdev);
+
+	/* call to generic mmc_of_parse to support additional capabilities */
+	mmc_of_parse(host->mmc);
+
+	ret = sdhci_add_host(host);
+	if (ret)
+		sdhci_pltfm_free(pdev);
+
+	return ret;
 }
 
 static int sdhci_esdhc_remove(struct platform_device *pdev)
-- 
1.8.3.1
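
The read-modify-write performed by esdhc_pltfm_bus_width can be tried out in
user space. This is only a sketch under stated assumptions: the register is
modelled as a plain 32-bit value, clrsetbits() is a local helper that mimics
the clear-then-set behaviour of clrsetbits_be32 without any byte swapping or
I/O access, and enum bus_width is a hypothetical stand-in for the kernel's
MMC_BUS_WIDTH_* constants.

```c
#include <assert.h>
#include <stdint.h>

/* Bus-width field of the P2020 host control register, as in the patch. */
#define ESDHC_CTRL_4BITBUS       (0x1u << 1)
#define ESDHC_CTRL_8BITBUS       (0x2u << 1)
#define ESDHC_CTRL_BUSWIDTH_MASK (0x3u << 1)

/* Hypothetical width selectors; not the kernel's MMC_BUS_WIDTH_* values. */
enum bus_width { BUS_WIDTH_1, BUS_WIDTH_4, BUS_WIDTH_8 };

/* Clear the bits in 'clear', then set the bits in 'set'. */
static uint32_t clrsetbits(uint32_t reg, uint32_t clear, uint32_t set)
{
	return (reg & ~clear) | set;
}

/* Return the new register value for the requested bus width. */
static uint32_t apply_bus_width(uint32_t reg, enum bus_width width)
{
	uint32_t ctrl;

	switch (width) {
	case BUS_WIDTH_8:
		ctrl = ESDHC_CTRL_8BITBUS;
		break;
	case BUS_WIDTH_4:
		ctrl = ESDHC_CTRL_4BITBUS;
		break;
	default:
		ctrl = 0; /* 1-bit mode: both width bits cleared */
		break;
	}

	/* The value written must never touch bits outside the width field. */
	assert((ctrl & ~ESDHC_CTRL_BUSWIDTH_MASK) == 0);
	return clrsetbits(reg, ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
}
```

Only bits 1 and 2 change: apply_bus_width(0xF6, BUS_WIDTH_4) clears the old
width bits and sets the 4-bit value, leaving the upper bits untouched.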



[PATCH 2/2] MMC: P2020 SDHC: Fix bug when writing to SDHCI_HOST_CONTROL register

2013-06-12 Thread Oded Gabbay
The P2020 has a non-standard implementation of the SDHCI_HOST_CONTROL
register. This patch adds a quirk in the SDHCI header to signal that
a host controller has a non-standard SDHCI_HOST_CONTROL register. The
patch adds a check to the function esdhc_writeb in file
sdhci-of-esdhc.c: if the write targets the SDHCI_HOST_CONTROL register
and the host has the above-mentioned quirk, the function simply returns
instead of writing to the register.
The patch also detects whether the processor is a P2020 (by looking in
the device tree) and, if so, adds the quirk to host->quirks2.

This patch depends on the first patch of this set (total of 2 patches)

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 14 ++++++++++++++
 include/linux/mmc/sdhci.h         |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index fd149a0..ca88529 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -121,6 +121,11 @@ static void esdhc_writeb(struct sdhci_host *host, u8 val, int reg)
 	if (reg == SDHCI_HOST_CONTROL) {
 		u32 dma_bits;
 
+		/* If host control register is not standard, exit
+		 * this function */
+		if (host->quirks2 & SDHCI_QUIRK2_BROKEN_HOST_CONTROL)
+			return;
+
 		/* DMA select is 22,23 bits in Protocol Control Register */
 		dma_bits = (val & SDHCI_CTRL_DMA_MASK) << 5;
 		clrsetbits_be32(host->ioaddr + reg , SDHCI_CTRL_DMA_MASK << 5,
@@ -289,6 +294,7 @@ static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 static int sdhci_esdhc_probe(struct platform_device *pdev)
 {
 	struct sdhci_host *host;
+	struct device_node *np;
 	int ret = 0;
 
 	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
@@ -297,6 +303,14 @@ static int sdhci_esdhc_probe(struct platform_device *pdev)
 
 	sdhci_get_of_property(pdev);
 
+	np = pdev->dev.of_node;
+	if (of_device_is_compatible(np, "fsl,p2020-esdhc")) {
+		/* Freescale messed up with P2020 as it has a non-standard
+		 * host control register
+		 */
+		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
+	}
+
 	/* call to generic mmc_of_parse to support additional capabilities */
 	mmc_of_parse(host->mmc);
 
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index b838ffc..b73dbdd 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -95,6 +95,8 @@ struct sdhci_host {
 /* The system physically doesn't support 1.8v, even if the host does */
 #define SDHCI_QUIRK2_NO_1_8_V				(1<<2)
 #define SDHCI_QUIRK2_PRESET_VALUE_BROKEN		(1<<3)
+/* Controller has a non-standard host control register */
+#define SDHCI_QUIRK2_BROKEN_HOST_CONTROL		(1<<4)
 
 	int irq;		/* Device IRQ */
 	void __iomem *ioaddr;	/* Mapped address */
-- 
1.8.3.1
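
The effect of the quirk check can be illustrated with a small user-space
model. Everything here is a simplified assumption rather than the driver's
code: struct fake_host and fake_writeb are hypothetical, the register file is
a byte array, and the DMA-bit handling that the real esdhc_writeb performs
for this register is left out. Only the guard itself mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position used by the patch for the new quirk. */
#define QUIRK2_BROKEN_HOST_CONTROL (1u << 4)
/* Offset of the host control register in the SDHCI register map. */
#define REG_HOST_CONTROL 0x28u

/* Hypothetical model of a host: quirk flags plus a byte-wide register file. */
struct fake_host {
	uint32_t quirks2;
	uint8_t regs[0x100];
};

/* Returns true if the byte was written, false if the quirk suppressed it. */
static bool fake_writeb(struct fake_host *host, uint8_t val, unsigned int reg)
{
	assert(reg < sizeof(host->regs));

	/* Skip writes to a non-standard host control register. */
	if (reg == REG_HOST_CONTROL &&
	    (host->quirks2 & QUIRK2_BROKEN_HOST_CONTROL))
		return false;

	host->regs[reg] = val;
	return true;
}
```

A host with the quirk set ignores writes to offset 0x28 but still accepts
writes to every other register; a host without the quirk accepts both.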



Re: [PATCH] MDIO: FSL_PQ_MDIO: Fix bug on incorrect offset of tbipa register

2013-06-13 Thread Oded Gabbay

On 06/12/2013 09:31 PM, Scott Wood wrote:
> On 06/12/2013 10:08:29 AM, Sebastian Andrzej Siewior wrote:
>> On 06/12/2013 02:47 PM, Oded Gabbay wrote:
>>> This patch fixes a bug in the fsl_pq_mdio.c module and in relevant
>>> device-tree files regarding the correct offset of the tbipa register
>>> in the eTSEC controller in some of Freescale's PQ3 and QorIQ SoC.
>>> The bug happens when the mdio in the device tree is configured to be
>>> compatible to fsl,gianfar-tbi. Because the mdio device in the device
>>> tree points to addresses 25520, 26520 or 27520 (depends on the
>>> controller ID), the variable priv->map at function fsl_pq_mdio_probe,
>>> points to that address. However, later in the function there is a
>>> write to register tbipa that is actually located at 25030, 26030 or
>>> 27030. Because the correct address is not io mapped, the contents are
>>> written to a different register in the controller.
>>> The fix sets the address of the mdio device to start at 25000, 26000
>>> or 27000 and changes the mii_offset field to 0x520 in the relevant
>>> entry (fsl,gianfar-tbi) of the fsl_pq_mdio_match array.
>>>
>>> Note: This patch may break MDIO functionality of some old Freescale's
>>> SoC until Freescale will fix their device tree files. Basically, every
>>> device tree which contains an mdio device that is compatible to
>>> fsl,gianfar-tbi should be examined.
>>
>> Not as is.
>> Please add a check for the original address. If it has 0x520 at the end
>> print a warning and fix it up. Please add to the patch description
>> which register is modified instead if this patch is not applied.
>> Depending on how critical this is, it might have to go to stable.
>
> I'm not sure it's stable material if this is something that has never
> worked...
>
> The device tree binding will also need to be fixed to note the
> difference in reg between fsl,gianfar-mdio and fsl,gianfar-tbi
> -- and should give an example of the latter.
>
> -Scott

I read the two comments and I'm not sure what the best way to move ahead is.

I would like to describe the impact of not accepting this patch:
When you connect any eTSEC except the first one using SGMII, you must
configure the TBIPA register, because the MII management configuration uses
the TBIPA address as part of the SGMII initialization sequence, as described
in the P2020 Reference Manual.
So, if that register is not initialized, the sequence is broken and the
eTSEC does not function (it cannot send or receive packets).
I still think the best way to fix it is what I did:
1. Point priv->map to the start of the whole register range of the eTSEC.
2. Set mii_offset to 0x520 in the gianfar-tbi entry of the
   fsl_pq_mdio_match array.
3. Fix all the usages of gianfar-tbi in the device tree files: change the
   starting address and the reg range.


I think this is the best way because it is stated in fsl_pq_mdio_probe 
function that:

/*
 * Some device tree nodes represent only the MII registers, and
 * others represent the MAC and MII registers.  The 'mii_offset' field
 * contains the offset of the MII registers inside the mapped register
 * space.
 */
and that's why we have priv->map and priv->regs. So my fix follows the
current design of the driver.


-Oded


[PATCH v2 1/2] MMC: P2020 SDHC: Add support for 8-bit bus width and non-removable card

2013-06-16 Thread Oded Gabbay
This patch adds support for connecting MMC media over an 8-bit bus to
Freescale's P2020 SDHC controller. In the probe function, the generic
function mmc_of_parse is called to detect whether the controller is
configured with an 8-bit bus width. The generic function also detects
whether the non-removable property is set in the device tree.
The function esdhc_pltfm_bus_width is added because the bus width
configuration is platform specific.

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
Reviewed-by: Anton Vorontsov <an...@enomsg.org>
---
 drivers/mmc/host/sdhci-esdhc.h    |  7 +++++++
 drivers/mmc/host/sdhci-of-esdhc.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-esdhc.h b/drivers/mmc/host/sdhci-esdhc.h
index d25f9ab..6f9a018 100644
--- a/drivers/mmc/host/sdhci-esdhc.h
+++ b/drivers/mmc/host/sdhci-esdhc.h
@@ -36,6 +36,13 @@
 /* pltfm-specific */
 #define ESDHC_HOST_CONTROL_LE	0x20
 
+/*
+ * P2020 interpretation of the SDHCI_HOST_CONTROL register
+ */
+#define ESDHC_CTRL_4BITBUS		(0x1 << 1)
+#define ESDHC_CTRL_8BITBUS		(0x2 << 1)
+#define ESDHC_CTRL_BUSWIDTH_MASK	(0x3 << 1)
+
 /* OF-specific */
 #define ESDHC_DMA_SYSCTL	0x40c
 #define ESDHC_DMA_SNOOP		0x0040
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 5e68adc..fd149a0 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -13,6 +13,7 @@
  * your option) any later version.
  */
 
+#include <linux/err.h>
 #include <linux/io.h>
 #include <linux/of.h>
 #include <linux/delay.h>
@@ -230,6 +231,30 @@ static void esdhc_of_platform_init(struct sdhci_host *host)
 		host->quirks &= ~SDHCI_QUIRK_NO_BUSY_IRQ;
 }
 
+static int esdhc_pltfm_bus_width(struct sdhci_host *host, int width)
+{
+	u32 ctrl;
+
+	switch (width) {
+	case MMC_BUS_WIDTH_8:
+		ctrl = ESDHC_CTRL_8BITBUS;
+		break;
+
+	case MMC_BUS_WIDTH_4:
+		ctrl = ESDHC_CTRL_4BITBUS;
+		break;
+
+	default:
+		ctrl = 0;
+		break;
+	}
+
+	clrsetbits_be32(host->ioaddr + SDHCI_HOST_CONTROL,
+			ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
+
+	return 0;
+}
+
 static const struct sdhci_ops sdhci_esdhc_ops = {
 	.read_l = esdhc_readl,
 	.read_w = esdhc_readw,
@@ -247,6 +272,7 @@ static const struct sdhci_ops sdhci_esdhc_ops = {
 	.platform_resume = esdhc_of_resume,
 #endif
 	.adma_workaround = esdhci_of_adma_workaround,
+	.platform_bus_width = esdhc_pltfm_bus_width,
 };
 
 static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
@@ -262,7 +288,23 @@ static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 
 static int sdhci_esdhc_probe(struct platform_device *pdev)
 {
-	return sdhci_pltfm_register(pdev, &sdhci_esdhc_pdata);
+	struct sdhci_host *host;
+	int ret;
+
+	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
+	if (IS_ERR(host))
+		return PTR_ERR(host);
+
+	sdhci_get_of_property(pdev);
+
+	/* call to generic mmc_of_parse to support additional capabilities */
+	mmc_of_parse(host->mmc);
+
+	ret = sdhci_add_host(host);
+	if (ret)
+		sdhci_pltfm_free(pdev);
+
+	return ret;
 }
 
 static int sdhci_esdhc_remove(struct platform_device *pdev)
-- 
1.8.3.1



[PATCH v2 2/2] MMC: P2020 SDHC: Fix bug when writing to SDHCI_HOST_CONTROL register

2013-06-16 Thread Oded Gabbay
The P2020 has a non-standard implementation of the SDHCI_HOST_CONTROL
register. This patch adds a quirk in the SDHCI header to signal that
a host controller has a non-standard SDHCI_HOST_CONTROL register. The
patch adds a check to the function esdhc_writeb in file
sdhci-of-esdhc.c: if the write targets the SDHCI_HOST_CONTROL register
and the host has the above-mentioned quirk, the function simply returns
instead of writing to the register.
The patch also detects whether the processor is a P2020 (by looking in
the device tree) and, if so, adds the quirk to host->quirks2.

This patch depends on the first patch of this set (total of 2 patches)

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
Reviewed-by: Anton Vorontsov <an...@enomsg.org>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 14 ++++++++++++++
 include/linux/mmc/sdhci.h         |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index fd149a0..ca88529 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -121,6 +121,13 @@ static void esdhc_writeb(struct sdhci_host *host, u8 val, int reg)
 	if (reg == SDHCI_HOST_CONTROL) {
 		u32 dma_bits;
 
+		/*
+		 * If host control register is not standard, exit
+		 * this function
+		 */
+		if (host->quirks2 & SDHCI_QUIRK2_BROKEN_HOST_CONTROL)
+			return;
+
 		/* DMA select is 22,23 bits in Protocol Control Register */
 		dma_bits = (val & SDHCI_CTRL_DMA_MASK) << 5;
 		clrsetbits_be32(host->ioaddr + reg , SDHCI_CTRL_DMA_MASK << 5,
@@ -289,6 +296,7 @@ static const struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 static int sdhci_esdhc_probe(struct platform_device *pdev)
 {
 	struct sdhci_host *host;
+	struct device_node *np;
 	int ret;
 
 	host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
@@ -297,6 +304,15 @@ static int sdhci_esdhc_probe(struct platform_device *pdev)
 
 	sdhci_get_of_property(pdev);
 
+	np = pdev->dev.of_node;
+	if (of_device_is_compatible(np, "fsl,p2020-esdhc")) {
+		/*
+		 * Freescale messed up with P2020 as it has a non-standard
+		 * host control register
+		 */
+		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
+	}
+
 	/* call to generic mmc_of_parse to support additional capabilities */
 	mmc_of_parse(host->mmc);
 
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index b838ffc..b73dbdd 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -95,6 +95,8 @@ struct sdhci_host {
 /* The system physically doesn't support 1.8v, even if the host does */
 #define SDHCI_QUIRK2_NO_1_8_V				(1<<2)
 #define SDHCI_QUIRK2_PRESET_VALUE_BROKEN		(1<<3)
+/* Controller has a non-standard host control register */
+#define SDHCI_QUIRK2_BROKEN_HOST_CONTROL		(1<<4)
 
 	int irq;		/* Device IRQ */
 	void __iomem *ioaddr;	/* Mapped address */
-- 
1.8.3.1



[PATCH 3/3] MMC: FSL SDHC: Add support for hard-wired (permanent) card. Kernel version 3.4.47

2013-06-02 Thread Oded Gabbay
This patch adds support of recognizing hard-wired (permanent) cards
to Freescale's SDHC host driver. This is done by adding the option
fsl,card-wired to the SDHC device-tree entry. Detection of this
option is done in the probe function. Update documentation in file
fsl-esdhc.txt

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 Documentation/devicetree/bindings/mmc/fsl-esdhc.txt | 3 +++
 drivers/mmc/host/sdhci-of-esdhc.c                   | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
index 64bcb8b..6f0eefa 100644
--- a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
+++ b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
@@ -16,6 +16,9 @@ Required properties:
     only handle 1-bit data transfers.
   - sdhci,auto-cmd12: (optional) specifies that a controller can
     only handle auto CMD12.
+  - fsl,card-wired : (optional) specifies that the card is
+    a permanent card and should not be detected for insertion or
+    removal
 
 Example:
 
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index e70f22f..2f79ec2 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -222,6 +222,10 @@ static int __devinit sdhci_esdhc_probe(struct platform_device *pdev)
 		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
 	}
 
+	/* If card is permanent, add capability of non-removable */
+	if (of_get_property(np, "fsl,card-wired", NULL))
+		host->mmc->caps |= MMC_CAP_NONREMOVABLE;
+
 	ret = sdhci_add_host(host);
 	if (ret)
 		sdhci_pltfm_free(pdev);
-- 
1.7.11.7



[PATCH 2/3] MMC: P2020 SDHC: Fix bug when writing to SDHCI_HOST_CONTROL register. Kernel version 3.4.47

2013-06-02 Thread Oded Gabbay
The P2020 has a non-standard implementation of the SDHCI_HOST_CONTROL
register. This patch adds a quirk in the SDHCI header to signal that
a host controller has a non-standard SDHCI_HOST_CONTROL register. The
patch adds a check to the function esdhc_writeb in file
sdhci-of-esdhc.c: if the write targets the SDHCI_HOST_CONTROL register
and the host has the above-mentioned quirk, the function simply returns
instead of writing to the register.
The patch also detects whether the processor is a P2020 (by looking in
the device tree) and, if so, adds the quirk to host->quirks2.

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-of-esdhc.c | 10 ++++++++++
 include/linux/mmc/sdhci.h         |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 6f433b8..e70f22f 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -82,6 +82,11 @@ static void esdhc_writeb(struct sdhci_host *host, u8 val, int reg)
 	if (reg == SDHCI_HOST_CONTROL) {
 		u32 dma_bits;
 
+		/* If host control register is not standard, exit
+		 * this function */
+		if (host->quirks2 & SDHCI_QUIRK2_BROKEN_HOST_CONTROL)
+			return;
+
 		/* DMA select is 22,23 bits in Protocol Control Register */
 		dma_bits = (val & SDHCI_CTRL_DMA_MASK) << 5;
 		clrsetbits_be32(host->ioaddr + reg , SDHCI_CTRL_DMA_MASK << 5,
@@ -210,6 +215,11 @@ static int __devinit sdhci_esdhc_probe(struct platform_device *pdev)
 	if (of_device_is_compatible(np, "fsl,p2020-esdhc")) {
 		/* P2020 has capability of 8 bit bus width */
 		host->mmc->caps |= MMC_CAP_8_BIT_DATA;
+
+		/* Freescale messed up with P2020 as it has a non-standard
+		 * host control register
+		 */
+		host->quirks2 |= SDHCI_QUIRK2_BROKEN_HOST_CONTROL;
 	}
 
 	ret = sdhci_add_host(host);
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index e9051e1..2742134 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -91,6 +91,8 @@ struct sdhci_host {
 	unsigned int quirks2;	/* More deviations from spec. */
 
 #define SDHCI_QUIRK2_HOST_OFF_CARD_ON			(1<<0)
+/* Controller has a non-standard host control register */
+#define SDHCI_QUIRK2_BROKEN_HOST_CONTROL		(1<<1)
 
 	int irq;		/* Device IRQ */
 	void __iomem *ioaddr;	/* Mapped address */
-- 
1.7.11.7



[PATCH 1/3] MMC: P2020 SDHC: Add support for 8-bit bus width connection. Kernel version 3.4.47

2013-06-02 Thread Oded Gabbay
This patch adds support for connecting MMC media over an 8-bit bus to
Freescale's P2020 SDHC controller. In the probe function, it detects
whether the processor is a P2020 (by looking at the device tree) and,
if so, adds MMC_CAP_8_BIT_DATA to the MMC caps.

Signed-off-by: Oded Gabbay <ogab...@advaoptical.com>
---
 drivers/mmc/host/sdhci-esdhc.h    |  7 +++++++
 drivers/mmc/host/sdhci-of-esdhc.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-esdhc.h b/drivers/mmc/host/sdhci-esdhc.h
index d25f9ab..6f9a018 100644
--- a/drivers/mmc/host/sdhci-esdhc.h
+++ b/drivers/mmc/host/sdhci-esdhc.h
@@ -36,6 +36,13 @@
 /* pltfm-specific */
 #define ESDHC_HOST_CONTROL_LE  0x20
 
+/*
+ * P2020 interpretation of the SDHCI_HOST_CONTROL register
+ */
+#define ESDHC_CTRL_4BITBUS  (0x1 << 1)
+#define ESDHC_CTRL_8BITBUS  (0x2 << 1)
+#define ESDHC_CTRL_BUSWIDTH_MASK(0x3 << 1)
+
 /* OF-specific */
 #define ESDHC_DMA_SYSCTL   0x40c
 #define ESDHC_DMA_SNOOP0x0040
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c 
b/drivers/mmc/host/sdhci-of-esdhc.c
index f8eb1fb..6f433b8 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -13,6 +13,7 @@
  * your option) any later version.
  */
 
+#include <linux/err.h>
 #include <linux/io.h>
 #include <linux/of.h>
 #include <linux/delay.h>
@@ -143,6 +144,31 @@ static void esdhc_of_resume(struct sdhci_host *host)
 }
 #endif
 
+static int esdhc_pltfm_bus_width(struct sdhci_host *host, int width)
+{
+   u32 ctrl;
+
+   switch (width) {
+   case MMC_BUS_WIDTH_8:
+   ctrl = ESDHC_CTRL_8BITBUS;
+   break;
+
+   case MMC_BUS_WIDTH_4:
+   ctrl = ESDHC_CTRL_4BITBUS;
+   break;
+
+   default:
+   ctrl = 0;
+   break;
+   }
+
+   clrsetbits_be32(host->ioaddr + SDHCI_HOST_CONTROL,
+   ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
+
+   return 0;
+}
+
+
 static struct sdhci_ops sdhci_esdhc_ops = {
.read_l = sdhci_be32bs_readl,
.read_w = esdhc_readw,
@@ -158,6 +184,7 @@ static struct sdhci_ops sdhci_esdhc_ops = {
.platform_suspend = esdhc_of_suspend,
.platform_resume = esdhc_of_resume,
 #endif
+   .platform_8bit_width = esdhc_pltfm_bus_width
 };
 
 static struct sdhci_pltfm_data sdhci_esdhc_pdata = {
@@ -169,7 +196,27 @@ static struct sdhci_pltfm_data sdhci_esdhc_pdata = {
 
 static int __devinit sdhci_esdhc_probe(struct platform_device *pdev)
 {
-   return sdhci_pltfm_register(pdev, &sdhci_esdhc_pdata);
+   struct sdhci_host *host;
+   struct device_node *np;
+   int ret = 0;
+
+   host = sdhci_pltfm_init(pdev, &sdhci_esdhc_pdata);
+   if (IS_ERR(host))
+   return PTR_ERR(host);
+
+   sdhci_get_of_property(pdev);
+
+   np = pdev->dev.of_node;
+   if (of_device_is_compatible(np, "fsl,p2020-esdhc")) {
+   /* P2020 has capability of 8 bit bus width */
+   host->mmc->caps |= MMC_CAP_8_BIT_DATA;
+   }
+
+   ret = sdhci_add_host(host);
+   if (ret)
+   sdhci_pltfm_free(pdev);
+
+   return ret;
 }
 
 static int __devexit sdhci_esdhc_remove(struct platform_device *pdev)
-- 
1.7.11.7
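
For readers following the bus-width handling in the patch above, here is a minimal user-space sketch of the same clear-then-set update on a 2-bit register field. The ESDHC_* values are taken from the patch; the enum, clrsetbits() and set_bus_width() are hypothetical stand-ins that operate on a plain integer instead of a big-endian MMIO register, so this is an illustration rather than driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout from the patch: a 2-bit bus-width field at bits 1..2. */
#define ESDHC_CTRL_4BITBUS       (0x1u << 1)
#define ESDHC_CTRL_8BITBUS       (0x2u << 1)
#define ESDHC_CTRL_BUSWIDTH_MASK (0x3u << 1)

/* Hypothetical stand-ins for the kernel's MMC_BUS_WIDTH_* constants. */
enum bus_width { BUS_WIDTH_1, BUS_WIDTH_4, BUS_WIDTH_8 };

/* Clear the bits in 'clear', then set the bits in 'set'. This mirrors the
 * clear/set idea only; there is no MMIO access or byte swapping here. */
static uint32_t clrsetbits(uint32_t reg, uint32_t clear, uint32_t set)
{
    return (reg & ~clear) | set;
}

/* Return the register value with only the bus-width field updated. */
static uint32_t set_bus_width(uint32_t reg, enum bus_width width)
{
    uint32_t ctrl;

    switch (width) {
    case BUS_WIDTH_8:
        ctrl = ESDHC_CTRL_8BITBUS;
        break;
    case BUS_WIDTH_4:
        ctrl = ESDHC_CTRL_4BITBUS;
        break;
    default:
        ctrl = 0; /* 1-bit mode: both field bits cleared */
        break;
    }

    /* The new value must never spill outside the field being replaced. */
    assert((ctrl & ~ESDHC_CTRL_BUSWIDTH_MASK) == 0);
    return clrsetbits(reg, ESDHC_CTRL_BUSWIDTH_MASK, ctrl);
}
```

Masking first is what keeps a switch from 8-bit back to 4-bit from leaving a stale bit behind in the field.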



[PATCH 00/83] AMD HSA kernel driver

2014-07-10 Thread Oded Gabbay
. Each mm_struct is associated 
with a unique PASID, allowing the IOMMUv2 to make userspace process memory 
accessible to the GPU. 

Next step is for the application to collect topology information via sysfs. 
This gives userspace enough information to be able to identify specific 
nodes (processors) in subsequent queue management calls. Application 
processes can create queues on multiple processors, and processors support 
queues from multiple processes. 

At this point the application can create work queues in userspace memory and 
pass them through the usermode library to kfd to have them mapped onto HW 
queue slots so that commands written to the queues can be executed by the 
GPU. Queue operations specify a processor node, and so the bulk of this code 
is device-specific. 

Written by John Bridgman john.bridg...@amd.com

Alexey Skidanov (4):
  hsa/radeon: 32-bit processes support
  hsa/radeon: NULL pointer dereference bug workaround
  hsa/radeon: HSA64/HSA32 modes support
  hsa/radeon: Add local memory to topology

Andrew Lewycky (3):
  hsa/radeon: Make binding of process to device permanent
  hsa/radeon: Implement hsaKmtSetMemoryPolicy
  mm: Change timing of notification to IOMMUs about a page to be
invalidated

Ben Goz (20):
  hsa/radeon: Add queue and hw_pointer_store modules
  hsa/radeon: Add support allocating kernel doorbells
  hsa/radeon: Add mqd_manager module
  hsa/radeon: Add kernel queue support for KFD
  hsa/radeon: Add module parameter of scheduling policy
  hsa/radeon: Add packet manager module
  hsa/radeon: Add process queue manager module
  hsa/radeon: Add device queue manager module
  hsa/radeon: Switch to new queue scheduler
  hsa/radeon: Add IOCTL for update queue
  hsa/radeon: Queue Management integration with Memory Management
  hsa/radeon: update queue fault handling
  hsa/radeon: fixing a bug to support 32b processes
  hsa/radeon: Fix number of pipes per ME
  hsa/radeon: Removing hw pointer store module
  hsa/radeon: Adding some error messages
  hsa/radeon: Fixing minor issues with kernel queues (DIQ)
  drm/radeon: Add register access functions to kfd2kgd interface
  hsa/radeon: Eliminating all direct register accesses
  drm/radeon: Remove lock functions from kfd2kgd interface

Evgeny Pinchuk (9):
  hsa/radeon: fix the OEMID assignment in kfd_topology
  drm/radeon: extending kfd-kgd interface
  hsa/radeon: implementing IOCTL for clock counters
  drm/radeon: adding synchronization for GRBM GFX
  hsa/radeon: fixing clock counters bug
  drm/radeon: Extending kfd interface
  hsa/radeon: Adding max clock speeds to topology
  hsa/radeon: Alternating the source of max clock
  hsa/radeon: Exclusive access for perf. counters

Michael Varga (1):
  hsa/radeon: debugging print statements

Oded Gabbay (45):
  mm: Add kfd_process pointer to mm_struct
  drm/radeon: reduce number of free VMIDs and pipes in KV
  drm/radeon: Report doorbell configuration to kfd
  drm/radeon: Add radeon <--> kfd interface
  drm/radeon: Add kfd-->kgd interface to get virtual ram size
  drm/radeon: Add kfd-->kgd interfaces of memory allocation/mapping
  drm/radeon: Add kfd-->kgd interface of locking srbm_gfx_cntl register
  drm/radeon: Add calls to initialize and finalize kfd from radeon
  hsa/radeon: Add code base of hsa driver for AMD's GPUs
  hsa/radeon: Add initialization and unmapping of doorbell aperture
  hsa/radeon: Add scheduler code
  hsa/radeon: Add kfd mmap handler
  hsa/radeon: Add 2 new IOCTL to kfd, CREATE_QUEUE and DESTROY_QUEUE
  hsa/radeon: Update MAINTAINERS and CREDITS files
  hsa/radeon: Add interrupt handling module
  hsa/radeon: Add the isr function of the KFD scehduler
  hsa/radeon: Handle deactivation of queues using interrupts
  hsa/radeon: Enable interrupts in KFD scheduler
  hsa/radeon: Enable/Disable KFD interrupt module
  hsa/radeon: Add interrupt callback function to kgd2kfd interface
  hsa/radeon: Add kgd-->kfd interfaces for suspend and resume
  drm/radeon: Add calls to suspend and resume of kfd driver
  drm/radeon/cik: Don't touch int of pipes 1-7
  drm/radeon/cik: Call kfd isr function
  hsa/radeon: Fix memory size allocated for HPD
  hsa/radeon: Fix list of supported devices
  hsa/radeon: Fix coding style in cik_int.h
  hsa/radeon: Print ioctl commnad only in debug mode
  hsa/radeon: Print ISR info only in debug mode
  hsa/radeon: Workaround for a bug in amd_iommu
  hsa/radeon: Eliminate warnings in compilation
  hsa/radeon: Various kernel styling fixes
  hsa/radeon: Rearrange structures in kfd_ioctl.h
  hsa/radeon: change another pr_info to pr_debug
  hsa/radeon: Fix timeout calculation in sync_with_hw
  hsa/radeon: Update module information and version
  hsa/radeon: Update module version to 0.6.0
  hsa/radeon: Fix initialization of sh_mem registers
  hsa/radeon: Fix compilation warnings
  hsa/radeon: Remove old scheduler code
  hsa/radeon: Static analysis (smatch) fixes
  hsa/radeon: Check oversubscription before destroying runlist
  hsa/radeon: Don't verify cksum when parsing CRAT

[PATCH 01/83] mm: Add kfd_process pointer to mm_struct

2014-07-10 Thread Oded Gabbay
This patch enables the KFD to retrieve the kfd_process
object from the process's mm_struct. This is needed because kfd_process
lifespan is bound to the process's mm_struct lifespan.

When KFD is notified about an mm_struct tear-down, it checks if the
kfd_process pointer is valid. If so, it releases the kfd_process object
and all relevant resources.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 include/linux/mm_types.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 678097c..6179107 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -20,6 +20,10 @@
 struct hmm;
 #endif
 
+#ifdef CONFIG_HSA_RADEON
+struct kfd_process;
+#endif
+
 #ifndef AT_VECTOR_SIZE_ARCH
 #define AT_VECTOR_SIZE_ARCH 0
 #endif
@@ -439,6 +443,16 @@ struct mm_struct {
 */
struct hmm *hmm;
 #endif
+#if defined(CONFIG_HSA_RADEON) || defined(CONFIG_HSA_RADEON_MODULE)
+   /*
+* kfd always registers an mmu_notifier. We rely on the mmu notifier
+* to keep a refcount on the mm struct as well as to forbid
+* registering kfd on a dying mm.
+*
+* This field is set with mmap_sem held in write mode.
+*/
+   struct kfd_process *kfd_process;
+#endif
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
pgtable_t pmd_huge_pte; /* protected by page_table_lock */
 #endif
-- 
1.9.1
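
The lifetime rule described in this patch (a per-process object hangs off the mm and is released on mm tear-down only if the pointer is set) can be sketched in user space as follows. All names here are hypothetical stand-ins, not the kernel's structures, and the sketch ignores the locking and mmu_notifier machinery the real code depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kfd_process and mm_struct. */
struct kfd_process_sketch {
    unsigned int pasid;
};

struct mm_sketch {
    struct kfd_process_sketch *kfd_process; /* NULL until the process uses KFD */
};

/* Attach a process object to the mm on first use; later calls are no-ops. */
static bool mm_sketch_bind(struct mm_sketch *mm, unsigned int pasid)
{
    assert(mm != NULL);
    if (mm->kfd_process)
        return true;

    mm->kfd_process = malloc(sizeof(*mm->kfd_process));
    if (!mm->kfd_process)
        return false;

    mm->kfd_process->pasid = pasid;
    return true;
}

/* On mm tear-down: release the object only if the pointer is valid.
 * Returns true when something was actually released. */
static bool mm_sketch_teardown(struct mm_sketch *mm)
{
    assert(mm != NULL);
    if (!mm->kfd_process)
        return false;

    free(mm->kfd_process);
    mm->kfd_process = NULL;
    return true;
}
```

Clearing the pointer after the free makes a second tear-down notification harmless, which is the property the commit message relies on.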



[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-10 Thread Oded Gabbay
To support HSA on KV, we need to limit the number of VMIDs and pipes
that are available for radeon's use with KV.

This patch reserves VMIDs 8-15 for KFD (so radeon can only use VMIDs
0-7) and also makes radeon think that KV has only a single MEC with a
single pipe in it.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/cik.c | 48 ++--
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 4bfc2c0..e0c8052 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4662,12 +4662,11 @@ static int cik_mec_init(struct radeon_device *rdev)
/*
 * KV:2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
 * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
+* Nonetheless, we assign only 1 pipe because all other pipes will
+* be handled by KFD
 */
-   if (rdev->family == CHIP_KAVERI)
-   rdev->mec.num_mec = 2;
-   else
-   rdev->mec.num_mec = 1;
-   rdev->mec.num_pipe = 4;
+   rdev->mec.num_mec = 1;
+   rdev->mec.num_pipe = 1;
rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;
 
if (rdev->mec.hpd_eop_obj == NULL) {
@@ -4809,28 +4808,24 @@ static int cik_cp_compute_resume(struct radeon_device 
*rdev)
 
/* init the pipes */
mutex_lock(&rdev->srbm_mutex);
-   for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
-   int me = (i < 4) ? 1 : 2;
-   int pipe = (i < 4) ? i : (i - 4);
 
-   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i * MEC_HPD_SIZE * 2);
+   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;
 
-   cik_srbm_select(rdev, me, pipe, 0, 0);
+   cik_srbm_select(rdev, 0, 0, 0, 0);
 
-   /* write the EOP addr */
-   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
-   WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);
+   /* write the EOP addr */
+   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
+   WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);
 
-   /* set the VMID assigned */
-   WREG32(CP_HPD_EOP_VMID, 0);
+   /* set the VMID assigned */
+   WREG32(CP_HPD_EOP_VMID, 0);
+
+   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
+   tmp = RREG32(CP_HPD_EOP_CONTROL);
+   tmp &= ~EOP_SIZE_MASK;
+   tmp |= order_base_2(MEC_HPD_SIZE / 8);
+   WREG32(CP_HPD_EOP_CONTROL, tmp);
 
-   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
-   tmp = RREG32(CP_HPD_EOP_CONTROL);
-   tmp &= ~EOP_SIZE_MASK;
-   tmp |= order_base_2(MEC_HPD_SIZE / 8);
-   WREG32(CP_HPD_EOP_CONTROL, tmp);
-   }
-   cik_srbm_select(rdev, 0, 0, 0, 0);
mutex_unlock(&rdev->srbm_mutex);
 
/* init the queues.  Just two for now. */
@@ -5876,8 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct 
radeon_ib *ib)
  */
 int cik_vm_init(struct radeon_device *rdev)
 {
-   /* number of VMs */
-   rdev-vm_manager.nvm = 16;
+   /*
+* number of VMs
+* VMID 0 is reserved for Graphics
+* radeon compute will use VMIDs 1-7
+* KFD will use VMIDs 8-15
+*/
+   rdev-vm_manager.nvm = 8;
/* base offset of vram pages */
if (rdev->flags & RADEON_IS_IGP) {
u64 tmp = RREG32(MC_VM_FB_OFFSET);
-- 
1.9.1
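
The resource split in this patch comes down to simple arithmetic, sketched below under the numbers quoted in its comments: 8 queues per pipe, VMID 0 for graphics, VMIDs 1-7 for radeon compute and VMIDs 8-15 for KFD. The function and enum names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

#define QUEUES_PER_PIPE 8u /* from the patch comment: 8 Queues/Pipe */

/* Hypothetical labels for who owns a VMID after this patch. */
enum vmid_owner {
    VMID_GRAPHICS,
    VMID_RADEON_COMPUTE,
    VMID_KFD,
    VMID_INVALID
};

/* Same product the driver uses for rdev->mec.num_queue. */
static unsigned int mec_num_queues(unsigned int num_mec, unsigned int num_pipe)
{
    return num_mec * num_pipe * QUEUES_PER_PIPE;
}

/* VMID ranges as listed in the new comment in cik_vm_init(). */
static enum vmid_owner vmid_to_owner(unsigned int vmid)
{
    assert(QUEUES_PER_PIPE == 8u); /* sanity check on the constant above */
    if (vmid == 0)
        return VMID_GRAPHICS;
    if (vmid <= 7)
        return VMID_RADEON_COMPUTE;
    if (vmid <= 15)
        return VMID_KFD;
    return VMID_INVALID;
}
```

With num_mec = 1 and num_pipe = 1, radeon is left with 8 compute queues, down from the 64 available on KV when it owned both MECs.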



[PATCH 06/83] drm/radeon: Add kfd-->kgd interfaces of memory allocation/mapping

2014-07-10 Thread Oded Gabbay
This patch adds new interfaces to kfd2kgd_calls structure.

The new interfaces allow the kfd driver to:

1. Allocate video memory through the radeon driver
2. Map and unmap video memory with GPUVM through the radeon driver
3. Map and unmap system memory with GPUVM through the radeon driver
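
As an illustration of the call-table idea behind these interfaces, here is a user-space sketch in which one driver exposes allocate/free operations through a struct of function pointers and the other side only ever calls through that table. Everything in it is a hypothetical stand-in: the *_sketch names are invented, and malloc() takes the place of radeon buffer objects.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical memory pools, loosely mirroring a framebuffer/system split. */
enum pool_sketch { POOL_SYSTEM, POOL_FRAMEBUFFER };

/* Opaque-to-the-caller handle wrapping the backing allocation. */
struct mem_sketch {
    enum pool_sketch pool;
    size_t size;
    void *backing;
};

/* The call table: the consumer sees only these entry points. */
struct calls_sketch {
    int (*allocate_mem)(size_t size, enum pool_sketch pool,
                        struct mem_sketch **memory_handle);
    void (*free_mem)(struct mem_sketch *mem);
};

static int sketch_allocate(size_t size, enum pool_sketch pool,
                           struct mem_sketch **memory_handle)
{
    struct mem_sketch *mem;

    assert(memory_handle != NULL);
    if (size == 0)
        return -EINVAL;

    mem = calloc(1, sizeof(*mem));
    if (!mem)
        return -ENOMEM;

    mem->pool = pool;
    mem->size = size;
    mem->backing = malloc(size);
    if (!mem->backing) {
        free(mem); /* undo the wrapper allocation before reporting failure */
        return -ENOMEM;
    }

    *memory_handle = mem;
    return 0;
}

static void sketch_free(struct mem_sketch *mem)
{
    free(mem->backing);
    free(mem);
}

static const struct calls_sketch sketch_calls = {
    .allocate_mem = sketch_allocate,
    .free_mem = sketch_free,
};
```

The handle is written to the output pointer only on success, so a caller that checks the return value never sees a half-built object.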

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 129 
 include/linux/radeon_kfd.h  |  23 +++
 2 files changed, 152 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
b/drivers/gpu/drm/radeon/radeon_kfd.c
index 1b859b5..66ee36b 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -25,9 +25,31 @@
 #include <drm/drmP.h>
 #include "radeon.h"
 
+struct kgd_mem {
+   struct radeon_bo *bo;
+   u32 domain;
+};
+
+static int allocate_mem(struct kgd_dev *kgd, size_t size, size_t alignment,
+   enum kgd_memory_pool pool, struct kgd_mem **memory_handle);
+
+static void free_mem(struct kgd_dev *kgd, struct kgd_mem *memory_handle);
+
+static int gpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, uint64_t 
*vmid0_address);
+static void ungpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);
+
+static int kmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, void **ptr);
+static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);
+
 static uint64_t get_vmem_size(struct kgd_dev *kgd);
 
 static const struct kfd2kgd_calls kfd2kgd = {
+   .allocate_mem = allocate_mem,
+   .free_mem = free_mem,
+   .gpumap_mem = gpumap_mem,
+   .ungpumap_mem = ungpumap_mem,
+   .kmap_mem = kmap_mem,
+   .unkmap_mem = unkmap_mem,
.get_vmem_size = get_vmem_size,
 };
 
@@ -96,6 +118,113 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
}
 }
 
+static u32 pool_to_domain(enum kgd_memory_pool p)
+{
+   switch (p) {
+   case KGD_POOL_FRAMEBUFFER: return RADEON_GEM_DOMAIN_VRAM;
+   default: return RADEON_GEM_DOMAIN_GTT;
+   }
+}
+
+static int allocate_mem(struct kgd_dev *kgd, size_t size, size_t alignment,
+   enum kgd_memory_pool pool, struct kgd_mem **memory_handle)
+{
+   struct radeon_device *rdev = (struct radeon_device *)kgd;
+   struct kgd_mem *mem;
+   int r;
+
+   mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
+   if (!mem)
+   return -ENOMEM;
+
+   mem->domain = pool_to_domain(pool);
+
+   r = radeon_bo_create(rdev, size, alignment, true, mem->domain, NULL, 
&mem->bo);
+   if (r) {
+   kfree(mem);
+   return r;
+   }
+
+   *memory_handle = mem;
+   return 0;
+}
+
+static void free_mem(struct kgd_dev *kgd, struct kgd_mem *mem)
+{
+   /* Assume that KFD will never free gpumapped or kmapped memory. This is 
not quite settled. */
+   radeon_bo_unref(&mem->bo);
+   kfree(mem);
+}
+
+static int gpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, uint64_t 
*vmid0_address)
+{
+   int r;
+
+   r = radeon_bo_reserve(mem->bo, true);
+
+   /*
+* ttm_bo_reserve can only fail if the buffer reservation lock
+* is held in circumstances that would deadlock
+*/
+   BUG_ON(r != 0);
+   r = radeon_bo_pin(mem->bo, mem->domain, vmid0_address);
+   radeon_bo_unreserve(mem->bo);
+
+   return r;
+}
+
+static void ungpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem)
+{
+   int r;
+
+   r = radeon_bo_reserve(mem->bo, true);
+
+   /*
+* ttm_bo_reserve can only fail if the buffer reservation lock
+* is held in circumstances that would deadlock
+*/
+   BUG_ON(r != 0);
+   r = radeon_bo_unpin(mem->bo);
+
+   /*
+* This unpin only removed NO_EVICT placement flags
+* and should never fail
+*/
+   BUG_ON(r != 0);
+   radeon_bo_unreserve(mem->bo);
+}
+
+static int kmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, void **ptr)
+{
+   int r;
+
+   r = radeon_bo_reserve(mem->bo, true);
+
+   /*
+* ttm_bo_reserve can only fail if the buffer reservation lock
+* is held in circumstances that would deadlock
+*/
+   BUG_ON(r != 0);
+   r = radeon_bo_kmap(mem->bo, ptr);
+   radeon_bo_unreserve(mem->bo);
+
+   return r;
+}
+
+static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem)
+{
+   int r;
+
+   r = radeon_bo_reserve(mem->bo, true);
+   /*
+* ttm_bo_reserve can only fail if the buffer reservation lock
+* is held in circumstances that would deadlock
+*/
+   BUG_ON(r != 0);
+   radeon_bo_kunmap(mem->bo);
+   radeon_bo_unreserve(mem->bo);
+}
+
 static uint64_t get_vmem_size(struct kgd_dev *kgd)
 {
struct radeon_device *rdev = (struct radeon_device *)kgd;
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 28cddf5..c7997d4 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -36,6 +36,14

[PATCH 03/83] drm/radeon: Report doorbell configuration to kfd

2014-07-10 Thread Oded Gabbay
Radeon and KFD share the doorbell aperture.
Radeon sets it up, takes the doorbells required for its own rings
and reports the setup to KFD.
Radeon reserved doorbells are at the start of the doorbell aperture.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon.h|  4 
 drivers/gpu/drm/radeon/radeon_device.c | 31 +++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 7cda75d..4e7e41f 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -676,6 +676,10 @@ struct radeon_doorbell {
 
 int radeon_doorbell_get(struct radeon_device *rdev, u32 *page);
 void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell);
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+ phys_addr_t *aperture_base,
+ size_t *aperture_size,
+ size_t *start_offset);
 
 /*
  * IRQS.
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index 03686fa..98538d2 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -328,6 +328,37 @@ void radeon_doorbell_free(struct radeon_device *rdev, u32 
doorbell)
__clear_bit(doorbell, rdev->doorbell.used);
 }
 
+/**
+ * radeon_doorbell_get_kfd_info - Report doorbell configuration required to
+ *setup KFD
+ *
+ * @rdev: radeon_device pointer
+ * @aperture_base: output returning doorbell aperture base physical address
+ * @aperture_size: output returning doorbell aperture size in bytes
+ * @start_offset: output returning # of doorbell bytes reserved for radeon.
+ *
+ * Radeon and the KFD share the doorbell aperture. Radeon sets it up,
+ * takes doorbells required for its own rings and reports the setup to KFD.
+ * Radeon reserved doorbells are at the start of the doorbell aperture.
+ */
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+ phys_addr_t *aperture_base,
+ size_t *aperture_size,
+ size_t *start_offset)
+{
+   /* The first num_doorbells are used by radeon.
+* KFD takes whatever's left in the aperture. */
+   if (rdev->doorbell.size > rdev->doorbell.num_doorbells * sizeof(u32)) {
+   *aperture_base = rdev->doorbell.base;
+   *aperture_size = rdev->doorbell.size;
+   *start_offset = rdev->doorbell.num_doorbells * sizeof(u32);
+   } else {
+   *aperture_base = 0;
+   *aperture_size = 0;
+   *start_offset = 0;
+   }
+}
+
 /*
  * radeon_wb_*()
  * Writeback is the the method by which the the GPU updates special pages
-- 
1.9.1
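
The doorbell split reported to KFD can be sketched as a pure function, shown below. It assumes the comparison in the patch is "aperture size greater than the bytes radeon reserves" (the operator did not survive in this archive copy), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; the patch returns these through output pointers. */
struct doorbell_info_sketch {
    uint64_t aperture_base; /* physical base of the doorbell aperture */
    size_t aperture_size;   /* size of the whole aperture in bytes */
    size_t start_offset;    /* bytes at the start reserved for radeon */
};

/* The first num_doorbells 32-bit doorbells belong to radeon; KFD gets the
 * remainder. If nothing would be left over, report an empty aperture. */
static struct doorbell_info_sketch
doorbell_kfd_info_sketch(uint64_t base, size_t size, size_t num_doorbells)
{
    struct doorbell_info_sketch info = { 0, 0, 0 };
    size_t reserved = num_doorbells * sizeof(uint32_t);

    if (size > reserved) {
        info.aperture_base = base;
        info.aperture_size = size;
        info.start_offset = reserved;
    }

    /* Either nothing is reported, or the offset lies inside the aperture. */
    assert(info.start_offset <= info.aperture_size);
    return info;
}
```

Reporting all zeros when radeon's doorbells fill the aperture gives the consumer one simple condition to test before using any of the three values.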



[PATCH 08/83] drm/radeon: Add calls to initialize and finalize kfd from radeon

2014-07-10 Thread Oded Gabbay
The KFD driver should be loaded when the radeon driver is loaded and
should be finalized when the radeon driver is removed.

This patch adds a function call to initialize kfd from radeon_init
and a function call to finalize kfd from radeon_exit.

If the KFD driver is not present in the system, the initialize call
fails and the radeon driver continues normally.

This patch also adds calls to probe, initialize and finalize a kfd device
per radeon device using the kgd-->kfd interface.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_drv.c | 6 ++
 drivers/gpu/drm/radeon/radeon_kms.c | 9 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index cb14213..88a45a0 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -151,6 +151,9 @@ static inline void radeon_register_atpx_handler(void) {}
 static inline void radeon_unregister_atpx_handler(void) {}
 #endif
 
+extern bool radeon_kfd_init(void);
+extern void radeon_kfd_fini(void);
+
 int radeon_no_wb;
 int radeon_modeset = -1;
 int radeon_dynclks = -1;
@@ -630,12 +633,15 @@ static int __init radeon_init(void)
 #endif
}
 
+   radeon_kfd_init();
+
/* let modprobe override vga console setting */
return drm_pci_init(driver, pdriver);
 }
 
 static void __exit radeon_exit(void)
 {
+   radeon_kfd_fini();
drm_pci_exit(driver, pdriver);
radeon_unregister_atpx_handler();
 }
diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index 35d9318..0748284 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -34,6 +34,10 @@
 #include <linux/slab.h>
 #include <linux/pm_runtime.h>
 
+extern void radeon_kfd_device_probe(struct radeon_device *rdev);
+extern void radeon_kfd_device_init(struct radeon_device *rdev);
+extern void radeon_kfd_device_fini(struct radeon_device *rdev);
+
 #if defined(CONFIG_VGA_SWITCHEROO)
 bool radeon_has_atpx(void);
 #else
@@ -63,6 +67,8 @@ int radeon_driver_unload_kms(struct drm_device *dev)
 
pm_runtime_get_sync(dev->dev);
 
+   radeon_kfd_device_fini(rdev);
+
radeon_acpi_fini(rdev);

radeon_modeset_fini(rdev);
@@ -142,6 +148,9 @@ int radeon_driver_load_kms(struct drm_device *dev, unsigned 
long flags)
"Error during ACPI methods call\n");
}
 
+   radeon_kfd_device_probe(rdev);
+   radeon_kfd_device_init(rdev);
+
if (radeon_is_px(dev)) {
pm_runtime_use_autosuspend(dev->dev);
pm_runtime_set_autosuspend_delay(dev->dev, 5000);
-- 
1.9.1



[PATCH 07/83] drm/radeon: Add kfd-->kgd interface of locking srbm_gfx_cntl register

2014-07-10 Thread Oded Gabbay
This patch adds a new interface to kfd2kgd_calls structure, which
allows the kfd to lock and unlock the srbm_gfx_cntl register

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 20 
 include/linux/radeon_kfd.h  |  4 
 2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
b/drivers/gpu/drm/radeon/radeon_kfd.c
index 66ee36b..594020e 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -43,6 +43,10 @@ static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem 
*mem);
 
 static uint64_t get_vmem_size(struct kgd_dev *kgd);
 
+static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
+static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
+
+
 static const struct kfd2kgd_calls kfd2kgd = {
.allocate_mem = allocate_mem,
.free_mem = free_mem,
@@ -51,6 +55,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.kmap_mem = kmap_mem,
.unkmap_mem = unkmap_mem,
.get_vmem_size = get_vmem_size,
+   .lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
+   .unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
 };
 
 static const struct kgd2kfd_calls *kgd2kfd;
@@ -233,3 +239,17 @@ static uint64_t get_vmem_size(struct kgd_dev *kgd)
 
return rdev->mc.real_vram_size;
 }
+
+static void lock_srbm_gfx_cntl(struct kgd_dev *kgd)
+{
+   struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+   mutex_lock(&rdev->srbm_mutex);
+}
+
+static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd)
+{
+   struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+   mutex_unlock(&rdev->srbm_mutex);
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index c7997d4..40b691c 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -81,6 +81,10 @@ struct kfd2kgd_calls {
void (*unkmap_mem)(struct kgd_dev *kgd, struct kgd_mem *mem);
 
uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
+
+   /* SRBM_GFX_CNTL mutex */
+   void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
+   void (*unlock_srbm_gfx_cntl)(struct kgd_dev *kgd);
 };
 
 bool kgd2kfd_init(unsigned interface_version,
-- 
1.9.1



[PATCH 13/83] hsa/radeon: Add 2 new IOCTL to kfd, CREATE_QUEUE and DESTROY_QUEUE

2014-07-10 Thread Oded Gabbay
This patch adds 2 new IOCTL to kfd driver.

The first IOCTL is KFD_IOC_CREATE_QUEUE that is used by the user-mode
application to create a compute queue on the GPU.

The second IOCTL is KFD_IOC_DESTROY_QUEUE that is used by the
user-mode application to destroy an existing compute queue on the GPU.
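
The create path in this patch acquires several resources in sequence and unwinds them with goto labels on failure. The sketch below shows that general pattern in user space; it is not the patch's exact control flow. The struct, the fail_at parameter and the error codes chosen are hypothetical, picked so each failure point can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process, per-device bookkeeping. */
struct pdd_sketch {
    int queue_count;      /* queues this process has on the device */
    bool registered;      /* process registered with the scheduler */
    int queues_installed; /* queues that were fully created */
};

/* fail_at: 0 = succeed, 1 = registration fails, 2 = queue creation fails. */
static int create_queue_sketch(struct pdd_sketch *pdd, int fail_at)
{
    int err;

    assert(pdd != NULL);

    /* The first queue for this process triggers scheduler registration. */
    if (pdd->queue_count++ == 0) {
        if (fail_at == 1) {
            err = -ENODEV;
            goto err_register;
        }
        pdd->registered = true;
    }

    if (fail_at == 2) {
        err = -ENOMEM;
        goto err_create;
    }

    pdd->queues_installed++;
    return 0;

err_create:
    /* Deregister only if this call was the one that registered. */
    if (pdd->queue_count == 1)
        pdd->registered = false;
err_register:
    pdd->queue_count--;
    return err;
}
```

Labels are ordered so that each one undoes a single step and then falls through to the earlier ones, which keeps every failure path returning the bookkeeping to its prior state.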

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c  | 155 ++
 drivers/gpu/hsa/radeon/kfd_doorbell.c |  11 +++
 include/uapi/linux/kfd_ioctl.h|  69 +++
 3 files changed, 235 insertions(+)
 create mode 100644 include/uapi/linux/kfd_ioctl.h

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 0b5bc74..4e7d5d0 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -27,11 +27,13 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
+#include <uapi/linux/kfd_ioctl.h>
 #include "kfd_priv.h"
 #include "kfd_scheduler.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
+static int kfd_mmap(struct file *, struct vm_area_struct *);
 
 static const char kfd_dev_name[] = "kfd";
 
@@ -108,17 +110,170 @@ kfd_open(struct inode *inode, struct file *filep)
return 0;
 }
 
+static long
+kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user 
*arg)
+{
+   struct kfd_ioctl_create_queue_args args;
+   struct kfd_dev *dev;
+   int err = 0;
+   unsigned int queue_id;
+   struct kfd_queue *queue;
+   struct kfd_process_device *pdd;
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+   return -EFAULT;
+
+   dev = radeon_kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+   return -EINVAL;
+
+   queue = kzalloc(
+   offsetof(struct kfd_queue, scheduler_queue) + 
dev->device_info->scheduler_class->queue_size,
+   GFP_KERNEL);
+
+   if (!queue)
+   return -ENOMEM;
+
+   queue->dev = dev;
+
+   mutex_lock(&p->mutex);
+
+   pdd = radeon_kfd_bind_process_to_device(dev, p);
+   if (IS_ERR(pdd) < 0) {
+   err = PTR_ERR(pdd);
+   goto err_bind_pasid;
+   }
+
+   pr_debug("kfd: creating queue number %d for PASID %d on GPU 0x%x\n",
+   pdd->queue_count,
+   p->pasid,
+   dev->id);
+
+   if (pdd->queue_count++ == 0) {
+   err = 
dev->device_info->scheduler_class->register_process(dev->scheduler, p, 
&pdd->scheduler_process);
+   if (err < 0)
+   goto err_register_process;
+   }
+
+   if (!radeon_kfd_allocate_queue_id(p, &queue_id))
+   goto err_allocate_queue_id;
+
+   err = dev->device_info->scheduler_class->create_queue(dev->scheduler, 
pdd->scheduler_process,
+ 
&queue->scheduler_queue,
+ (void __user 
*)args.ring_base_address,
+ args.ring_size,
+ (void __user 
*)args.read_pointer_address,
+ (void __user 
*)args.write_pointer_address,
+ 
radeon_kfd_queue_id_to_doorbell(dev, p, queue_id));
+   if (err)
+   goto err_create_queue;
+
+   radeon_kfd_install_queue(p, queue_id, queue);
+
+   args.queue_id = queue_id;
+   args.doorbell_address = 
(uint64_t)(uintptr_t)radeon_kfd_get_doorbell(filep, p, dev, queue_id);
+
+   if (copy_to_user(arg, &args, sizeof(args))) {
+   err = -EFAULT;
+   goto err_copy_args_out;
+   }
+
+   mutex_unlock(&p->mutex);
+
+   pr_debug("kfd: queue id %d was created successfully.\n"
+ "ring buffer address == 0x%016llX\n"
+ "read ptr address== 0x%016llX\n"
+ "write ptr address   == 0x%016llX\n"
+ "doorbell address== 0x%016llX\n",
+   args.queue_id,
+   args.ring_base_address,
+   args.read_pointer_address,
+   args.write_pointer_address,
+   args.doorbell_address);
+
+   return 0;
+
+err_copy_args_out:
+   dev->device_info->scheduler_class->destroy_queue(dev->scheduler, 
&queue->scheduler_queue);
+err_create_queue:
+   radeon_kfd_remove_queue(p, queue_id);
+err_allocate_queue_id:
+   if (--pdd->queue_count == 0) {
+   
dev->device_info->scheduler_class->deregister_process(dev->scheduler, 
pdd->scheduler_process);
+   pdd->scheduler_process = NULL;
+   }
+err_register_process:
+err_bind_pasid:
+   kfree(queue);
+   mutex_unlock(&p->mutex);
+   return err

[PATCH 15/83] hsa/radeon: Add interrupt handling module

2014-07-10 Thread Oded Gabbay
This patch adds the interrupt handling module, in kfd_interrupt.c,
and its related members in different data structures to the KFD
driver.

The KFD interrupt module maintains an internal interrupt ring per kfd
device. The internal interrupt ring contains interrupts that need further
handling. The extra handling is deferred to a later time through a workqueue.

There's no acknowledgment for the interrupts we use. The hardware simply queues 
a new interrupt each time without waiting.

The fixed-size internal queue means that it's possible for us to lose 
interrupts because we have no back-pressure to the hardware.
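
The trade-off described above (a fixed-size internal ring with no back-pressure, so entries can be lost when it fills) can be sketched in user space as below. The ring size and all names are hypothetical; only the entry size of four 32-bit words comes from the patch (ih_ring_entry_size). The sketch is single-threaded and leaves out the synchronization a real ISR/workqueue pair would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_ENTRIES 4u /* hypothetical; tiny so overflow is easy to see */
#define ENTRY_WORDS  4u /* 4 * sizeof(uint32_t), as in the patch */

struct ih_ring_sketch {
    uint32_t entries[RING_ENTRIES][ENTRY_WORDS];
    unsigned int rptr;    /* free-running read counter */
    unsigned int wptr;    /* free-running write counter */
    unsigned int dropped; /* entries lost because the ring was full */
};

/* Producer side (the ISR in the real driver): copy the entry in, or drop it
 * and count the loss when no slot is free. */
static bool ring_push(struct ih_ring_sketch *r, const uint32_t *entry)
{
    assert(r != NULL && entry != NULL);
    if (r->wptr - r->rptr == RING_ENTRIES) {
        r->dropped++;
        return false;
    }
    memcpy(r->entries[r->wptr % RING_ENTRIES], entry, sizeof(r->entries[0]));
    r->wptr++;
    return true;
}

/* Consumer side (the deferred work): copy the oldest entry out, if any. */
static bool ring_pop(struct ih_ring_sketch *r, uint32_t *entry)
{
    assert(r != NULL && entry != NULL);
    if (r->rptr == r->wptr)
        return false;
    memcpy(entry, r->entries[r->rptr % RING_ENTRIES], sizeof(r->entries[0]));
    r->rptr++;
    return true;
}
```

Keeping a drop counter does not recover lost interrupts, but it makes the loss observable, which is about all a producer can do when it cannot stall the hardware.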

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile|   2 +-
 drivers/gpu/hsa/radeon/kfd_device.c|   1 +
 drivers/gpu/hsa/radeon/kfd_interrupt.c | 179 +
 drivers/gpu/hsa/radeon/kfd_priv.h  |  18 
 drivers/gpu/hsa/radeon/kfd_scheduler.h |   3 +
 5 files changed, 202 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_interrupt.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 28da10c..5422e6a 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -5,6 +5,6 @@
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
-   kfd_vidmem.o
+   kfd_vidmem.o kfd_interrupt.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c 
b/drivers/gpu/hsa/radeon/kfd_device.c
index 465c822..b2d2861 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -30,6 +30,7 @@
 static const struct kfd_device_info bonaire_device_info = {
.scheduler_class = &radeon_kfd_cik_static_scheduler_class,
.max_pasid_bits = 16,
+   .ih_ring_entry_size = 4 * sizeof(uint32_t)
 };
 
 struct kfd_deviceid {
diff --git a/drivers/gpu/hsa/radeon/kfd_interrupt.c 
b/drivers/gpu/hsa/radeon/kfd_interrupt.c
new file mode 100644
index 000..2179780
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_interrupt.c
@@ -0,0 +1,179 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/*
+ * KFD Interrupts.
+ *
+ * AMD GPUs deliver interrupts by pushing an interrupt description onto the
+ * interrupt ring and then sending an interrupt. KGD receives the interrupt
+ * in ISR and sends us a pointer to each new entry on the interrupt ring.
+ *
+ * We generally can't process interrupt-signaled events from ISR, so we call
+ * out to each interrupt client module (currently only the scheduler) to ask if
+ * each interrupt is interesting. If they return true, then it requires further
+ * processing so we copy it to an internal interrupt ring and call each
+ * interrupt client again from a work-queue.
+ *
+ * There's no acknowledgment for the interrupts we use. The hardware simply
+ * queues a new interrupt each time without waiting.
+ *
+ * The fixed-size internal queue means that it's possible for us to lose
+ * interrupts because we have no back-pressure to the hardware.
+ */
+
+#include <linux/slab.h>
+#include <linux/device.h>
+#include "kfd_priv.h"
+#include "kfd_scheduler.h"
+
+#define KFD_INTERRUPT_RING_SIZE 256
+
+static void interrupt_wq(struct work_struct *);
+
+int
+radeon_kfd_interrupt_init(struct kfd_dev *kfd)
+{
+   void *interrupt_ring = kmalloc_array(KFD_INTERRUPT_RING_SIZE,
+                                        kfd->device_info->ih_ring_entry_size,
+                                        GFP_KERNEL);
+   if (!interrupt_ring)
+           return -ENOMEM;
+
+   kfd->interrupt_ring = interrupt_ring;
+   kfd->interrupt_ring_size =
+           KFD_INTERRUPT_RING_SIZE * kfd->device_info->ih_ring_entry_size;
+   atomic_set(kfd

[PATCH 17/83] hsa/radeon: Handle deactivation of queues using interrupts

2014-07-10 Thread Oded Gabbay
This patch modifies the scheduler code to use interrupts to handle the
deactivation of queues. We prefer to use interrupts because the
deactivation could take a long time since we need to wait for the
wavefront to finish executing before deactivating the queue.

There is an array of waitqueues, where each cell represents the queues of a
specific pipe. When a queue should be deactivated, the calling thread waits
on that pipe's waitqueue. The event that wakes the waitqueue is a
dequeue-complete interrupt that arrives through the isr function of the
scheduler.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
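A small sketch of how a global queue index selects its per-pipe waitqueue cell,
assuming 8 queues per pipe as the driver comments state. The helper names are
hypothetical; the patch itself only uses the expression
queue / CIK_QUEUES_PER_PIPE to index dequeue_wait[].

```c
#include <assert.h>

#define QUEUES_PER_PIPE 8 /* assumed value of CIK_QUEUES_PER_PIPE */

/* Pipe (and therefore waitqueue cell) that owns a global queue index. */
static unsigned int queue_to_pipe(unsigned int queue)
{
	return queue / QUEUES_PER_PIPE;
}

/* Position of the queue inside its pipe. */
static unsigned int queue_in_pipe(unsigned int queue)
{
	return queue % QUEUES_PER_PIPE;
}

/* Inverse mapping: queue q on pipe p sits at QUEUES_PER_PIPE * p + q. */
static unsigned int global_queue(unsigned int pipe, unsigned int q)
{
	assert(q < QUEUES_PER_PIPE);
	return QUEUES_PER_PIPE * pipe + q;
}
```

Because one cell is shared by all queues of a pipe, every waiter on that pipe
is woken by a dequeue-complete interrupt and each one rechecks whether its own
queue has gone inactive.
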
 drivers/gpu/hsa/radeon/cik_regs.h |  1 +
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 45 +--
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_regs.h 
b/drivers/gpu/hsa/radeon/cik_regs.h
index ef1d7ab..9c3ce97 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -166,6 +166,7 @@
 
 #define CP_HQD_DEQUEUE_REQUEST 0xC974
 #define        DEQUEUE_REQUEST_DRAIN   1
+#define        DEQUEUE_INT             (1U << 8)
 
 #define CP_HQD_SEMA_CMD0xC97Cu
 #define CP_HQD_MSG_TYPE0xC980u
diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c 
b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index f86f958..5d42e88 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -139,6 +139,13 @@ struct cik_static_private {
 /* Queue q on pipe p is at bit QUEUES_PER_PIPE * p + q. */
unsigned long free_queues[DIV_ROUND_UP(CIK_MAX_PIPES * 
CIK_QUEUES_PER_PIPE, BITS_PER_LONG)];
 
+   /*
+* Dequeue waits for waves to finish so it could take a long time. We
+* defer through an interrupt. dequeue_wait is woken when a dequeue-
+* complete interrupt comes for that pipe.
+*/
+   wait_queue_head_t dequeue_wait[CIK_MAX_PIPES];
+
kfd_mem_obj hpd_mem;/* Single allocation for HPDs for all KFD 
pipes. */
kfd_mem_obj mqd_mem;/* Single allocation for all MQDs for all KFD
 * pipes. This is actually struct 
cik_mqd_padded. */
@@ -411,6 +418,9 @@ static int cik_static_create(struct kfd_dev *dev, struct 
kfd_scheduler **schedul
 
    priv->free_vmid_mask = dev->shared_resources.compute_vmid_bitmap;
 
+   for (i = 0; i < priv->num_pipes; i++)
+           init_waitqueue_head(&priv->dequeue_wait[i]);
+
/*
 * Allocate memory for the HPDs. This is hardware-owned per-pipe data.
 * The driver never accesses this memory after zeroing it. It doesn't 
even have
@@ -712,15 +722,18 @@ static void activate_queue(struct cik_static_private 
*priv, struct cik_static_qu
unlock_srbm_index(priv);
 }
 
-static void drain_hqd(struct cik_static_private *priv)
+static bool queue_inactive(struct cik_static_private *priv, struct cik_static_queue *queue)
 {
-   WRITE_REG(priv->dev, CP_HQD_DEQUEUE_REQUEST, DEQUEUE_REQUEST_DRAIN);
-}
+   bool inactive;
 
-static void wait_hqd_inactive(struct cik_static_private *priv)
-{
-   while (READ_REG(priv->dev, CP_HQD_ACTIVE) != 0)
-           cpu_relax();
+   lock_srbm_index(priv);
+   queue_select(priv, queue->queue);
+
+   inactive = (READ_REG(priv->dev, CP_HQD_ACTIVE) == 0);
+
+   unlock_srbm_index(priv);
+
+   return inactive;
 }
 
 static void deactivate_queue(struct cik_static_private *priv, struct 
cik_static_queue *queue)
@@ -728,10 +741,12 @@ static void deactivate_queue(struct cik_static_private 
*priv, struct cik_static_
    lock_srbm_index(priv);
    queue_select(priv, queue->queue);
 
-   drain_hqd(priv);
-   wait_hqd_inactive(priv);
+   WRITE_REG(priv->dev, CP_HQD_DEQUEUE_REQUEST, DEQUEUE_REQUEST_DRAIN | DEQUEUE_INT);
 
    unlock_srbm_index(priv);
+
+   wait_event(priv->dequeue_wait[queue->queue/CIK_QUEUES_PER_PIPE],
+              queue_inactive(priv, queue));
 }
 
 #define BIT_MASK_64(high, low) (((1ULL << (high)) - 1) & ~((1ULL << (low)) - 1))
@@ -791,6 +806,14 @@ cik_static_destroy_queue(struct kfd_scheduler *scheduler, 
struct kfd_scheduler_q
    release_hqd(priv, hwq->queue);
 }
 
+static void
+dequeue_int_received(struct cik_static_private *priv, uint32_t pipe_id)
+{
+   /* The waiting threads will check CP_HQD_ACTIVE to see whether their
+* queue completed. */
+   wake_up_all(&priv->dequeue_wait[pipe_id]);
+}
+
 /* Figure out the KFD compute pipe ID for an interrupt ring entry.
  * Returns true if it's a KFD compute pipe, false otherwise. */
 static bool int_compute_pipe(const struct cik_static_private *priv,
@@ -829,6 +852,10 @@ cik_static_interrupt_isr(struct kfd_scheduler *scheduler, 
const void *ih_ring_en
             ihre->source_id, ihre->data, pipe_id, ihre->vmid

[PATCH 16/83] hsa/radeon: Add the isr function of the KFD scheduler

2014-07-10 Thread Oded Gabbay
This patch adds the isr function to the KFD scheduler code. This
function is called from the kgd2kfd_interrupt function, which is
an interrupt-context function.

The purpose of the isr function is to determine whether the interrupt
that arrived is interesting, i.e. whether some action needs to be taken.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
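A user-space sketch of decoding one interrupt ring entry and deriving the KFD
pipe ID, assuming the bitfields of struct cik_ih_ring_entry are packed starting
from the least significant bit of each dword (the usual little-endian GCC
layout). The names ih_fields, ih_decode and compute_pipe are hypothetical, and
pipes_per_mec is passed in rather than hard-coded. The sketch stops at the
pipe computation shown in the patch and adds no further range checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of interest, unpacked from the 4-dword ring entry. */
struct ih_fields {
	uint32_t source_id; /* dword 0, bits 0..7 */
	uint32_t data;      /* dword 1, bits 0..27 */
	uint32_t pipeid;    /* dword 2, bits 0..1 */
	uint32_t meid;      /* dword 2, bits 2..3 */
	uint32_t vmid;      /* dword 2, bits 8..15 */
	uint32_t pasid;     /* dword 2, bits 16..31 */
};

static struct ih_fields ih_decode(const uint32_t dw[4])
{
	struct ih_fields f;

	assert(dw != NULL);
	f.source_id = dw[0] & 0xff;
	f.data = dw[1] & 0x0fffffff;
	f.pipeid = dw[2] & 0x3;
	f.meid = (dw[2] >> 2) & 0x3;
	f.vmid = (dw[2] >> 8) & 0xff;
	f.pasid = dw[2] >> 16;
	return f;
}

/* ME 0 is graphics, so it is ignored; compute MEs start at 1. */
static bool compute_pipe(const struct ih_fields *f, uint32_t pipes_per_mec,
			 uint32_t *kfd_pipe)
{
	assert(f != NULL && kfd_pipe != NULL);
	if (f->meid == 0)
		return false;
	*kfd_pipe = (f->meid - 1) * pipes_per_mec + f->pipeid;
	return true;
}
```

An ISR built this way can reject graphics interrupts cheaply and only look at
source_id (for example CIK_INTSRC_DEQUEUE_COMPLETE) for compute pipes.
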
 drivers/gpu/hsa/radeon/cik_int.h  | 50 
 drivers/gpu/hsa/radeon/cik_regs.h |  2 +
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 56 +++
 3 files changed, 108 insertions(+)
 create mode 100644 drivers/gpu/hsa/radeon/cik_int.h

diff --git a/drivers/gpu/hsa/radeon/cik_int.h b/drivers/gpu/hsa/radeon/cik_int.h
new file mode 100644
index 000..e98551d
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/cik_int.h
@@ -0,0 +1,50 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef HSA_RADEON_CIK_INT_H_INCLUDED
+#define HSA_RADEON_CIK_INT_H_INCLUDED
+
+#include <linux/types.h>
+
+struct cik_ih_ring_entry {
+   uint32_t source_id  : 8;
+   uint32_t reserved1  : 8;
+   uint32_t reserved2  : 16;
+
+   uint32_t data   : 28;
+   uint32_t reserved3  : 4;
+
+   /* pipeid, meid and unused3 are officially called RINGID,
+* but for our purposes, they always decode into pipe and ME. */
+   uint32_t pipeid : 2;
+   uint32_t meid   : 2;
+   uint32_t reserved4  : 4;
+   uint32_t vmid   : 8;
+   uint32_t pasid  : 16;
+
+   uint32_t reserved5;
+};
+
+#define CIK_INTSRC_DEQUEUE_COMPLETE    0xC6
+
+#endif
+
diff --git a/drivers/gpu/hsa/radeon/cik_regs.h 
b/drivers/gpu/hsa/radeon/cik_regs.h
index d0cdc57..ef1d7ab 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -23,6 +23,8 @@
 #ifndef CIK_REGS_H
 #define CIK_REGS_H
 
+#define IH_VMID_0_LUT  0x3D40u
+
 #define BIF_DOORBELL_CNTL  0x530Cu
 
 #define        SRBM_GFX_CNTL   0xE44
diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c 
b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index b986ff9..f86f958 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -25,9 +25,12 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/uaccess.h>
+#include <linux/device.h>
+#include <linux/sched.h>
 #include "kfd_priv.h"
 #include "kfd_scheduler.h"
 #include "cik_regs.h"
+#include "cik_int.h"
 
 /* CIK CP hardware is arranged with 8 queues per pipe and 8 pipes per MEC 
(microengine for compute).
  * The first MEC is ME 1 with the GFX ME as ME 0.
@@ -273,6 +276,8 @@ static void set_vmid_pasid_mapping(struct 
cik_static_private *priv, unsigned int
    while (!(READ_REG(priv->dev, ATC_VMID_PASID_MAPPING_UPDATE_STATUS) & (1U << vmid)))
            cpu_relax();
    WRITE_REG(priv->dev, ATC_VMID_PASID_MAPPING_UPDATE_STATUS, 1U << vmid);
+
+   WRITE_REG(priv->dev, IH_VMID_0_LUT + vmid*sizeof(uint32_t), pasid);
 }
 
 static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
@@ -786,6 +791,54 @@ cik_static_destroy_queue(struct kfd_scheduler *scheduler, 
struct kfd_scheduler_q
    release_hqd(priv, hwq->queue);
 }
 
+/* Figure out the KFD compute pipe ID for an interrupt ring entry.
+ * Returns true if it's a KFD compute pipe, false otherwise. */
+static bool int_compute_pipe(const struct cik_static_private *priv,
+const struct cik_ih_ring_entry *ih_ring_entry,
+uint32_t *kfd_pipe)
+{
+   uint32_t pipe_id;
+
+   if (ih_ring_entry->meid == 0) /* Ignore graphics interrupts - compute only. */
+           return false;
+
+   pipe_id = (ih_ring_entry->meid - 1) * CIK_PIPES_PER_MEC + ih_ring_entry->pipeid

[PATCH 25/83] hsa/radeon: fix the OEMID assignment in kfd_topology

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

The OEMID from the CRAT table is assigned into a 64-bit variable, but the
OEMID is only 48 bits wide in the CRAT.
This fix makes sure that only 48 bits are assigned for the OEMID value from
the CRAT table.

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
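A user-space sketch of the masking described above, assuming a little-endian
host and a 6-byte OEMID field (48 bits, as stated in the commit message). The
function name oemid_to_u64 is hypothetical, and memcpy is used here instead of
the pointer cast in the patch to sidestep alignment concerns in a standalone
example. The caller must still guarantee that 8 bytes are readable at oem_id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OEMID_LENGTH 6 /* bytes; assumed value of CRAT_OEMID_LENGTH */
#define OEMID_64BIT_MASK ((1ULL << (OEMID_LENGTH * 8)) - 1)

/* Load 8 bytes starting at the OEMID field and keep only the low 48 bits,
 * discarding the 2 bytes that belong to the following field. */
static uint64_t oemid_to_u64(const uint8_t *oem_id)
{
	uint64_t raw;

	assert(oem_id != NULL);
	memcpy(&raw, oem_id, sizeof(raw));
	return raw & OEMID_64BIT_MASK;
}
```

Without the mask, the two bytes following the OEMID in the table header would
leak into the upper 16 bits of platform_id.
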
 drivers/gpu/hsa/radeon/kfd_crat.h | 2 ++
 drivers/gpu/hsa/radeon/kfd_topology.c | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_crat.h 
b/drivers/gpu/hsa/radeon/kfd_crat.h
index 587455d..a374fa3 100644
--- a/drivers/gpu/hsa/radeon/kfd_crat.h
+++ b/drivers/gpu/hsa/radeon/kfd_crat.h
@@ -42,6 +42,8 @@
 #define CRAT_OEMTABLEID_LENGTH 8
 #define CRAT_RESERVED_LENGTH   6
 
+#define CRAT_OEMID_64BIT_MASK ((1ULL << (CRAT_OEMID_LENGTH * 8)) - 1)
+
 struct crat_header {
uint32_tsignature;
uint32_tlength;
diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c 
b/drivers/gpu/hsa/radeon/kfd_topology.c
index 6acac25..2ee5444 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -467,10 +467,10 @@ static int kfd_parse_crat_table(void *crat_image)
if (!top_dev) {
kfd_release_live_view();
return -ENOMEM;
+   }
}
-}
 
-   sys_props.platform_id = *((uint64_t *)crat_table->oem_id);
+   sys_props.platform_id = (*((uint64_t *)crat_table->oem_id)) & CRAT_OEMID_64BIT_MASK;
    sys_props.platform_oem = *((uint64_t *)crat_table->oem_table_id);
    sys_props.platform_rev = crat_table->revision;
 
-- 
1.9.1


[PATCH 27/83] hsa/radeon: Implement hsaKmtSetMemoryPolicy

2014-07-10 Thread Oded Gabbay
From: Andrew Lewycky andrew.lewy...@amd.com

This patch adds support in KFD for the hsaKmtSetMemoryPolicy
HSA thunk API call.

Signed-off-by: Andrew Lewycky andrew.lewy...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
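A sketch of the validate-then-translate step the new ioctl performs on each
policy argument. The numeric values of the IOC_POLICY_* constants are
assumptions made for this example (the uapi header is not shown in this
message), and translate_policy is a hypothetical helper; the patch does the
same work inline with two checks and two ternaries.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed user-visible values; the real ones live in kfd_ioctl.h. */
enum {
	IOC_POLICY_COHERENT = 0,
	IOC_POLICY_NONCOHERENT = 1
};

/* Internal scheduler-facing policy, as named in the patch. */
enum cache_policy {
	cache_policy_coherent,
	cache_policy_noncoherent
};

/* Returns 0 and fills *out for a known policy, -EINVAL otherwise. */
static int translate_policy(uint32_t ioc_policy, enum cache_policy *out)
{
	assert(out != NULL);
	switch (ioc_policy) {
	case IOC_POLICY_COHERENT:
		*out = cache_policy_coherent;
		return 0;
	case IOC_POLICY_NONCOHERENT:
		*out = cache_policy_noncoherent;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Rejecting unknown values before taking the process mutex keeps the error path
trivial, which matches the ordering used in kfd_ioctl_set_memory_policy.
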
 drivers/gpu/hsa/radeon/cik_regs.h |  1 +
 drivers/gpu/hsa/radeon/kfd_chardev.c  | 59 +
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 91 +--
 drivers/gpu/hsa/radeon/kfd_scheduler.h| 12 
 include/uapi/linux/kfd_ioctl.h| 13 
 5 files changed, 172 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_regs.h 
b/drivers/gpu/hsa/radeon/cik_regs.h
index 813cdc4..93f7b34 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -54,6 +54,7 @@
 #define        APE1_MTYPE(x)   ((x) << 7)
 
 /* valid for both DEFAULT_MTYPE and APE1_MTYPE */
+#define        MTYPE_CACHED    0
 #define        MTYPE_NONCACHED 3
 
 
diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
b/drivers/gpu/hsa/radeon/kfd_chardev.c
index e0b276d..ddaf357 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -231,6 +231,61 @@ kfd_ioctl_destroy_queue(struct file *filp, struct 
kfd_process *p, void __user *a
 }
 
 static long
+kfd_ioctl_set_memory_policy(struct file *filep, struct kfd_process *p, void 
__user *arg)
+{
+   struct kfd_ioctl_set_memory_policy_args args;
+   struct kfd_dev *dev;
+   int err = 0;
+   struct kfd_process_device *pdd;
+   enum cache_policy default_policy, alternate_policy;
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+           return -EFAULT;
+
+   if (args.default_policy != KFD_IOC_CACHE_POLICY_COHERENT
+       && args.default_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+           return -EINVAL;
+   }
+
+   if (args.alternate_policy != KFD_IOC_CACHE_POLICY_COHERENT
+       && args.alternate_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+           return -EINVAL;
+   }
+
+   dev = radeon_kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+   return -EINVAL;
+
+   mutex_lock(&p->mutex);
+
+   pdd = radeon_kfd_bind_process_to_device(dev, p);
+   if (IS_ERR(pdd)  0) {
+   err = PTR_ERR(pdd);
+   goto out;
+   }
+
+   default_policy = (args.default_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+? cache_policy_coherent : cache_policy_noncoherent;
+
+   alternate_policy = (args.alternate_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+                      ? cache_policy_coherent : cache_policy_noncoherent;
+
+   if (!dev->device_info->scheduler_class->set_cache_policy(dev->scheduler,
+                                                            pdd->scheduler_process,
+                                                            default_policy,
+                                                            alternate_policy,
+                                                            (void __user *)args.alternate_aperture_base,
+                                                            args.alternate_aperture_size))
+           err = -EINVAL;
+
+out:
+   mutex_unlock(&p->mutex);
+
+   return err;
+}
+
+
+static long
 kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
struct kfd_process *process;
@@ -253,6 +308,10 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned 
long arg)
err = kfd_ioctl_destroy_queue(filep, process, (void __user 
*)arg);
break;
 
+   case KFD_IOC_SET_MEMORY_POLICY:
+   err = kfd_ioctl_set_memory_policy(filep, process, (void __user 
*)arg);
+   break;
+
default:
            dev_err(kfd_device,
                    "unknown ioctl cmd 0x%x, arg 0x%lx)\n",
diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c 
b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 9add5e5..3c3e7d6 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -162,6 +162,10 @@ struct cik_static_private {
 struct cik_static_process {
unsigned int vmid;
pasid_t pasid;
+
+   uint32_t sh_mem_config;
+   uint32_t ape1_base;
+   uint32_t ape1_limit;
 };
 
 struct cik_static_queue {
@@ -346,6 +350,7 @@ static void init_ats(struct cik_static_private *priv)
 
    sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
    sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
+   sh_mem_config |= APE1_MTYPE(MTYPE_NONCACHED);
 
    WRITE_REG(priv->dev, SH_MEM_CONFIG, sh_mem_config);
 
@@ -562,14 +567,26 @@ static void release_vmid(struct cik_static_private *priv, 
unsigned int vmid

[PATCH 26/83] hsa/radeon: Make binding of process to device permanent

2014-07-10 Thread Oded Gabbay
From: Andrew Lewycky andrew.lewy...@amd.com

Permanently bind the process to the device.
The binding survives even when all queues are destroyed.
Process exit and device removal terminate the binding.

Signed-off-by: Andrew Lewycky andrew.lewy...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
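A toy model of the lifetime rule this patch introduces, assuming that
registration with the scheduler now happens once at bind time (that hunk is
cut off in this message, so this is an inference from the commit text). All
type and function names here are hypothetical. Destroying queues no longer
affects the registration; only freeing the process removes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct scheduler {
	int registered_processes;
};

struct process_device {
	struct scheduler *sched;
	bool bound;
	int live_queues;
};

/* Idempotent: only the first bind registers with the scheduler. */
static void bind_process(struct process_device *pd, struct scheduler *s)
{
	assert(pd != NULL && s != NULL);
	if (!pd->bound) {
		pd->sched = s;
		s->registered_processes++;
		pd->bound = true;
	}
}

static void create_queue(struct process_device *pd, struct scheduler *s)
{
	bind_process(pd, s);
	pd->live_queues++;
}

/* Dropping the last queue leaves the binding untouched. */
static void destroy_queue(struct process_device *pd)
{
	assert(pd != NULL && pd->live_queues > 0);
	pd->live_queues--;
}

/* Process exit is what finally deregisters. */
static void free_process(struct process_device *pd)
{
	assert(pd != NULL);
	if (pd->bound) {
		pd->sched->registered_processes--;
		pd->bound = false;
	}
}
```

This is why the queue_count bookkeeping and its BUG_ON checks can be deleted
from the create, destroy and error paths in the diff below.
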
 drivers/gpu/hsa/radeon/kfd_chardev.c | 27 +++
 drivers/gpu/hsa/radeon/kfd_priv.h|  3 ---
 drivers/gpu/hsa/radeon/kfd_process.c | 21 ++---
 3 files changed, 13 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 4e7d5d0..e0b276d 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -141,20 +141,13 @@ kfd_ioctl_create_queue(struct file *filep, struct 
kfd_process *p, void __user *a
pdd = radeon_kfd_bind_process_to_device(dev, p);
if (IS_ERR(pdd)  0) {
err = PTR_ERR(pdd);
-   goto err_bind_pasid;
+   goto err_bind_process;
}
 
-   pr_debug("kfd: creating queue number %d for PASID %d on GPU 0x%x\n",
-           pdd->queue_count,
+   pr_debug("kfd: creating queue for PASID %d on GPU 0x%x\n",
            p->pasid,
            dev->id);
 
-   if (pdd->queue_count++ == 0) {
-           err = dev->device_info->scheduler_class->register_process(dev->scheduler, p, &pdd->scheduler_process);
-           if (err < 0)
-                   goto err_register_process;
-   }
-
    if (!radeon_kfd_allocate_queue_id(p, &queue_id))
            goto err_allocate_queue_id;
 
@@ -198,12 +191,7 @@ err_copy_args_out:
 err_create_queue:
    radeon_kfd_remove_queue(p, queue_id);
 err_allocate_queue_id:
-   if (--pdd->queue_count == 0) {
-           dev->device_info->scheduler_class->deregister_process(dev->scheduler, pdd->scheduler_process);
-           pdd->scheduler_process = NULL;
-   }
-err_register_process:
-err_bind_pasid:
+err_bind_process:
    kfree(queue);
    mutex_unlock(&p->mutex);
    return err;
@@ -215,7 +203,6 @@ kfd_ioctl_destroy_queue(struct file *filp, struct 
kfd_process *p, void __user *a
struct kfd_ioctl_destroy_queue_args args;
struct kfd_queue *queue;
struct kfd_dev *dev;
-   struct kfd_process_device *pdd;
 
    if (copy_from_user(&args, arg, sizeof(args)))
return -EFAULT;
@@ -239,14 +226,6 @@ kfd_ioctl_destroy_queue(struct file *filp, struct 
kfd_process *p, void __user *a
 
kfree(queue);
 
-   pdd = radeon_kfd_get_process_device_data(dev, p);
-   BUG_ON(pdd == NULL); /* Because a queue exists. */
-
-   if (--pdd->queue_count == 0) {
-           dev->device_info->scheduler_class->deregister_process(dev->scheduler, pdd->scheduler_process);
-           pdd->scheduler_process = NULL;
-   }
-
    mutex_unlock(&p->mutex);
return 0;
 }
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h 
b/drivers/gpu/hsa/radeon/kfd_priv.h
index 630d690..bca9cce 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -166,9 +166,6 @@ struct kfd_process_device {
/* The user-mode address of the doorbell mapping for this device. */
doorbell_t __user *doorbell_mapping;
 
-   /* The number of queues created by this process for this device. */
-   uint32_t queue_count;
-
/* Scheduler process data for this device. */
struct kfd_scheduler_process *scheduler_process;
 
diff --git a/drivers/gpu/hsa/radeon/kfd_process.c 
b/drivers/gpu/hsa/radeon/kfd_process.c
index 145ee38..f89f855 100644
--- a/drivers/gpu/hsa/radeon/kfd_process.c
+++ b/drivers/gpu/hsa/radeon/kfd_process.c
@@ -120,15 +120,6 @@ destroy_queues(struct kfd_process *p, struct kfd_dev 
*dev_filter)

dev-device_info-scheduler_class-destroy_queue(dev-scheduler, 
queue-scheduler_queue);
 
kfree(queue);
-
-           BUG_ON(pdd->queue_count == 0);
-           BUG_ON(pdd->scheduler_process == NULL);
-
-           if (--pdd->queue_count == 0) {
-                   dev->device_info->scheduler_class->deregister_process(dev->scheduler,
-                                   pdd->scheduler_process);
-                   pdd->scheduler_process = NULL;
-           }
}
}
 }
@@ -144,6 +135,8 @@ static void free_process(struct kfd_process *p)
/* doorbell mappings: automatic */
 
    list_for_each_entry_safe(pdd, temp, &p->per_device_data, per_device_list) {
+           pdd->dev->device_info->scheduler_class->deregister_process(pdd->dev->scheduler, pdd->scheduler_process);
+           pdd->scheduler_process = NULL;
            amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
            list_del(&pdd->per_device_list);
            kfree(pdd);
@@ -255,6 +248,12 @@ struct

[PATCH 22/83] drm/radeon: Add calls to suspend and resume of kfd driver

2014-07-10 Thread Oded Gabbay
The radeon driver can suspend and resume its devices. For each device it
suspends or resumes, it should inform kfd, so that kfd can perform the
relevant actions for that device.

This patch adds the calls to kfd's suspend and resume functions. The
device is passed as an argument.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
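A user-space sketch of the caller-side pattern this patch adds: the radeon
wrappers forward to kfd through a function table only when a kfd device is
attached, and resume propagates the callee's return code. The struct layouts
and the demo_* callbacks are hypothetical stand-ins, not the driver's types.

```c
#include <assert.h>
#include <stddef.h>

struct kfd_dev {
	int running;
	int resume_result; /* what the demo resume callback should return */
};

struct kgd2kfd_ops {
	void (*suspend)(struct kfd_dev *kfd);
	int (*resume)(struct kfd_dev *kfd);
};

static void demo_suspend(struct kfd_dev *kfd)
{
	kfd->running = 0;
}

static int demo_resume(struct kfd_dev *kfd)
{
	if (kfd->resume_result == 0)
		kfd->running = 1;
	return kfd->resume_result;
}

static const struct kgd2kfd_ops demo_ops = {
	.suspend = demo_suspend,
	.resume = demo_resume,
};

struct gpu_dev {
	struct kfd_dev *kfd; /* NULL when no kfd device is attached */
	const struct kgd2kfd_ops *ops;
};

/* Mirrors radeon_kfd_suspend: a no-op without an attached kfd device. */
static void gpu_kfd_suspend(struct gpu_dev *g)
{
	assert(g != NULL);
	if (g->kfd)
		g->ops->suspend(g->kfd);
}

/* Mirrors radeon_kfd_resume: returns 0 when there is nothing to resume. */
static int gpu_kfd_resume(struct gpu_dev *g)
{
	int r = 0;

	assert(g != NULL);
	if (g->kfd)
		r = g->ops->resume(g->kfd);
	return r;
}
```

The NULL guard lets GPUs without kfd support share the same suspend and
startup paths without special cases at the call sites.
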
 drivers/gpu/drm/radeon/cik.c|  7 +++
 drivers/gpu/drm/radeon/radeon_kfd.c | 16 
 2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index e0c8052..b1c50f4 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -138,6 +138,8 @@ static void cik_fini_pg(struct radeon_device *rdev);
 static void cik_fini_cg(struct radeon_device *rdev);
 static void cik_enable_gui_idle_interrupt(struct radeon_device *rdev,
  bool enable);
+extern void radeon_kfd_suspend(struct radeon_device *rdev);
+extern int radeon_kfd_resume(struct radeon_device *rdev);
 
 /* get temperature in millidegrees */
 int ci_get_temp(struct radeon_device *rdev)
@@ -8429,6 +8431,10 @@ static int cik_startup(struct radeon_device *rdev)
if (r)
return r;
 
+   r = radeon_kfd_resume(rdev);
+   if (r)
+   return r;
+
return 0;
 }
 
@@ -8477,6 +8483,7 @@ int cik_resume(struct radeon_device *rdev)
  */
 int cik_suspend(struct radeon_device *rdev)
 {
+   radeon_kfd_suspend(rdev);
radeon_pm_suspend(rdev);
dce6_audio_fini(rdev);
radeon_vm_manager_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
b/drivers/gpu/drm/radeon/radeon_kfd.c
index 594020e..e3af85b 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -124,6 +124,22 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
}
 }
 
+void radeon_kfd_suspend(struct radeon_device *rdev)
+{
+   if (rdev->kfd)
+           kgd2kfd->suspend(rdev->kfd);
+}
+
+int radeon_kfd_resume(struct radeon_device *rdev)
+{
+   int r = 0;
+
+   if (rdev->kfd)
+           r = kgd2kfd->resume(rdev->kfd);
+
+   return r;
+}
+
 static u32 pool_to_domain(enum kgd_memory_pool p)
 {
switch (p) {
-- 
1.9.1


[PATCH 21/83] hsa/radeon: Add kgd-->kfd interfaces for suspend and resume

2014-07-10 Thread Oded Gabbay
This patch adds two new interfaces to the kgd2kfd structure. These
interfaces suspend and resume a kfd device when its matching radeon
device is suspended and resumed.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
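A sketch of the delegation inside kfd that this patch sets up: suspend and
resume reach the scheduler through the per-device scheduler class, and resume
reports success unconditionally. The struct definitions are trimmed,
hypothetical versions of the driver's types, and demo_start and demo_stop
stand in for a real scheduler implementation.

```c
#include <assert.h>
#include <stddef.h>

struct scheduler {
	int active;
};

struct scheduler_class {
	void (*start)(struct scheduler *s);
	void (*stop)(struct scheduler *s);
};

struct device_info {
	const struct scheduler_class *scheduler_class;
};

struct kfd_dev {
	const struct device_info *device_info;
	struct scheduler *scheduler;
};

static void demo_start(struct scheduler *s) { s->active = 1; }
static void demo_stop(struct scheduler *s) { s->active = 0; }

static const struct scheduler_class demo_class = {
	.start = demo_start,
	.stop = demo_stop,
};

static const struct device_info demo_info = {
	.scheduler_class = &demo_class,
};

/* Suspend simply stops the device's scheduler. */
static void kfd_suspend(struct kfd_dev *kfd)
{
	assert(kfd != NULL);
	kfd->device_info->scheduler_class->stop(kfd->scheduler);
}

/* Resume restarts it; like the patch, it always returns 0. */
static int kfd_resume(struct kfd_dev *kfd)
{
	assert(kfd != NULL);
	kfd->device_info->scheduler_class->start(kfd->scheduler);
	return 0;
}
```

Keeping an int return on resume leaves room for a scheduler whose start can
fail, even though the version in this patch cannot.
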
 drivers/gpu/hsa/radeon/Makefile |  2 +-
 drivers/gpu/hsa/radeon/kfd_module.c |  2 ++
 drivers/gpu/hsa/radeon/kfd_pm.c | 43 +
 drivers/gpu/hsa/radeon/kfd_priv.h   |  4 
 include/linux/radeon_kfd.h  |  2 ++
 5 files changed, 52 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_pm.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 5422e6a..935f9b7 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -5,6 +5,6 @@
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
-   kfd_vidmem.o kfd_interrupt.o
+   kfd_vidmem.o kfd_interrupt.o kfd_pm.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_module.c 
b/drivers/gpu/hsa/radeon/kfd_module.c
index ad21c6d..a03743a 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -39,6 +39,8 @@ static const struct kgd2kfd_calls kgd2kfd = {
.device_init= kgd2kfd_device_init,
.device_exit= kgd2kfd_device_exit,
.interrupt  = kgd2kfd_interrupt,
+   .suspend= kgd2kfd_suspend,
+   .resume = kgd2kfd_resume,
 };
 
 bool kgd2kfd_init(unsigned interface_version,
diff --git a/drivers/gpu/hsa/radeon/kfd_pm.c b/drivers/gpu/hsa/radeon/kfd_pm.c
new file mode 100644
index 000..783311f
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_pm.c
@@ -0,0 +1,43 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Oded Gabbay
+ */
+
+#include <linux/device.h>
+#include "kfd_priv.h"
+#include "kfd_scheduler.h"
+
+void kgd2kfd_suspend(struct kfd_dev *kfd)
+{
+   BUG_ON(kfd == NULL);
+
+   kfd->device_info->scheduler_class->stop(kfd->scheduler);
+}
+
+int kgd2kfd_resume(struct kfd_dev *kfd)
+{
+   BUG_ON(kfd == NULL);
+
+   kfd->device_info->scheduler_class->start(kfd->scheduler);
+
+   return 0;
+}
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h 
b/drivers/gpu/hsa/radeon/kfd_priv.h
index 5b6611f..630d690 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -247,4 +247,8 @@ int radeon_kfd_interrupt_init(struct kfd_dev *dev);
 void radeon_kfd_interrupt_exit(struct kfd_dev *dev);
 void kgd2kfd_interrupt(struct kfd_dev *dev, const void *ih_ring_entry);
 
+/* Power Management */
+void kgd2kfd_suspend(struct kfd_dev *dev);
+int kgd2kfd_resume(struct kfd_dev *dev);
+
 #endif
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 2f4f7c0..63b7bac 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -63,6 +63,8 @@ struct kgd2kfd_calls {
bool (*device_init)(struct kfd_dev *kfd, const struct 
kgd2kfd_shared_resources *gpu_resources);
void (*device_exit)(struct kfd_dev *kfd);
void (*interrupt)(struct kfd_dev *kfd, const void *ih_ring_entry);
+   void (*suspend)(struct kfd_dev *kfd);
+   int (*resume)(struct kfd_dev *kfd);
 };
 
 struct kfd2kgd_calls {
-- 
1.9.1


[PATCH 24/83] drm/radeon/cik: Call kfd isr function

2014-07-10 Thread Oded Gabbay
When radeon handles interrupts for cik, propagate each interrupt to kfd.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
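A user-space sketch of the ring walk that the new call is hooked into. It
assumes 16-byte ring entries, read and write pointers measured in bytes, a
power-of-two ring with a wrap mask, and a little-endian host (so le32_to_cpu
is omitted). walk_ih_ring and sum_source_ids are hypothetical names; the
callback plays the role of radeon_kfd_interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*ih_handler)(const uint32_t *entry, void *ctx);

/* Example handler: accumulate the 8-bit source ID of every entry. */
static void sum_source_ids(const uint32_t *entry, void *ctx)
{
	*(unsigned int *)ctx += entry[0] & 0xff;
}

/* Visit every entry between rptr and wptr, handing each one to fn.
 * Returns the number of entries visited. */
static unsigned int walk_ih_ring(const uint32_t *ring, uint32_t ptr_mask,
				 uint32_t rptr, uint32_t wptr,
				 ih_handler fn, void *ctx)
{
	unsigned int visited = 0;

	assert(ring != NULL && fn != NULL);
	while (rptr != wptr) {
		/* Pointers are in bytes, the ring is an array of dwords. */
		uint32_t ring_index = rptr / 4;

		fn(&ring[ring_index], ctx);
		rptr = (rptr + 16) & ptr_mask;
		visited++;
	}
	return visited;
}
```

Passing a pointer to the start of the entry, rather than decoded fields, lets
kfd apply its own view of the entry layout.
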
 drivers/gpu/drm/radeon/cik.c| 4 
 drivers/gpu/drm/radeon/radeon_kfd.c | 6 ++
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 803d0cb..6f4999a 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -140,6 +140,7 @@ static void cik_enable_gui_idle_interrupt(struct 
radeon_device *rdev,
  bool enable);
 extern void radeon_kfd_suspend(struct radeon_device *rdev);
 extern int radeon_kfd_resume(struct radeon_device *rdev);
+extern void radeon_kfd_interrupt(struct radeon_device *rdev, const void 
*ih_ring_entry);
 
 /* get temperature in millidegrees */
 int ci_get_temp(struct radeon_device *rdev)
@@ -7703,6 +7704,9 @@ restart_ih:
while (rptr != wptr) {
/* wptr/rptr are in bytes! */
ring_index = rptr / 4;
+
+           radeon_kfd_interrupt(rdev, (const void *) &rdev->ih.ring[ring_index]);
+
            src_id =  le32_to_cpu(rdev->ih.ring[ring_index]) & 0xff;
            src_data = le32_to_cpu(rdev->ih.ring[ring_index + 1]) & 0xfff;
            ring_id = le32_to_cpu(rdev->ih.ring[ring_index + 2]) & 0xff;
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
b/drivers/gpu/drm/radeon/radeon_kfd.c
index e3af85b..f4cc3c5 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -124,6 +124,12 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
}
 }
 
+void radeon_kfd_interrupt(struct radeon_device *rdev, const void 
*ih_ring_entry)
+{
+   if (rdev->kfd)
+           kgd2kfd->interrupt(rdev->kfd, ih_ring_entry);
+}
+
 void radeon_kfd_suspend(struct radeon_device *rdev)
 {
if (rdev-kfd)
-- 
1.9.1


[PATCH 28/83] mm: Change timing of notification to IOMMUs about a page to be invalidated

2014-07-10 Thread Oded Gabbay
From: Andrew Lewycky andrew.lewy...@amd.com

This patch changes the location of the mmu_notifier_invalidate_page function
call inside try_to_unmap_one. The mmu_notifier_invalidate_page function
call tells the IOMMU that a page should be invalidated.

The location is changed from after releasing the physical page to
before releasing the physical page.

This change should prevent the bug that would occur in the
(rare) case where the GPU attempts to access a page while the CPU
attempts to swap out that page (or discard it if it is not dirty).

Signed-off-by: Andrew Lewycky andrew.lewy...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 mm/rmap.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 196cd0c..73d4c3d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1231,13 +1231,17 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
} else
dec_mm_counter(mm, MM_FILEPAGES);
 
+   pte_unmap_unlock(pte, ptl);
+
+   mmu_notifier_invalidate_page(vma, address, event);
+
page_remove_rmap(page);
page_cache_release(page);
 
+   return ret;
+
 out_unmap:
pte_unmap_unlock(pte, ptl);
-   if (ret != SWAP_FAIL && !(flags & TTU_MUNLOCK))
-   mmu_notifier_invalidate_page(vma, address, event);
 out:
return ret;
 
-- 
1.9.1



[PATCH 23/83] drm/radeon/cik: Don't touch int of pipes 1-7

2014-07-10 Thread Oded Gabbay
The HSA radeon driver (kfd) should set the interrupts for pipes 1-7, so the
radeon driver no longer touches them.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/cik.c | 71 +---
 1 file changed, 1 insertion(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index b1c50f4..803d0cb 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -7272,8 +7272,7 @@ static int cik_irq_init(struct radeon_device *rdev)
 int cik_irq_set(struct radeon_device *rdev)
 {
u32 cp_int_cntl;
-   u32 cp_m1p0, cp_m1p1, cp_m1p2, cp_m1p3;
-   u32 cp_m2p0, cp_m2p1, cp_m2p2, cp_m2p3;
+   u32 cp_m1p0;
u32 crtc1 = 0, crtc2 = 0, crtc3 = 0, crtc4 = 0, crtc5 = 0, crtc6 = 0;
u32 hpd1, hpd2, hpd3, hpd4, hpd5, hpd6;
u32 grbm_int_cntl = 0;
@@ -7307,13 +7306,6 @@ int cik_irq_set(struct radeon_device *rdev)
dma_cntl1 = RREG32(SDMA0_CNTL + SDMA1_REGISTER_OFFSET) & ~TRAP_ENABLE;
 
cp_m1p0 = RREG32(CP_ME1_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p1 = RREG32(CP_ME1_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p2 = RREG32(CP_ME1_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p3 = RREG32(CP_ME1_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p0 = RREG32(CP_ME2_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p1 = RREG32(CP_ME2_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p2 = RREG32(CP_ME2_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p3 = RREG32(CP_ME2_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
 
if (rdev->flags & RADEON_IS_IGP)
thermal_int = RREG32_SMC(CG_THERMAL_INT_CTRL) &
@@ -7335,33 +7327,6 @@ int cik_irq_set(struct radeon_device *rdev)
case 0:
cp_m1p0 |= TIME_STAMP_INT_ENABLE;
break;
-   case 1:
-   cp_m1p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   default:
-   DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe %d\n", ring->pipe);
-   break;
-   }
-   } else if (ring->me == 2) {
-   switch (ring->pipe) {
-   case 0:
-   cp_m2p0 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 1:
-   cp_m2p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
default:
DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe %d\n", ring->pipe);
break;
@@ -7378,33 +7343,6 @@ int cik_irq_set(struct radeon_device *rdev)
case 0:
cp_m1p0 |= TIME_STAMP_INT_ENABLE;
break;
-   case 1:
-   cp_m1p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   default:
-   DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe %d\n", ring->pipe);
-   break;
-   }
-   } else if (ring->me == 2) {
-   switch (ring->pipe) {
-   case 0:
-   cp_m2p0 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 1:
-   cp_m2p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
default:
DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe %d\n", ring->pipe);
break;
@@ -7487,13 +7425,6 @@ int cik_irq_set(struct radeon_device *rdev)
WREG32(SDMA0_CNTL

[PATCH 18/83] hsa/radeon: Enable interrupts in KFD scheduler

2014-07-10 Thread Oded Gabbay
This patch enables the use of interrupts in the KFD scheduler when the
scheduler performs its initialization.

It also disables the interrupts when the scheduler stops its work.
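
The per-pipe pattern described above (select a pipe, then write its interrupt
control register) can be sketched in user space as follows. This is only a
model: the toy_ names, the pipe count and the enable bit value are
hypothetical, and the register-index locking that the driver performs around
the loop is left out.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_PIPES 8u				/* hypothetical pipe count */
#define TOY_DEQUEUE_REQUEST_INT_ENABLE (1u << 13)	/* hypothetical bit */

/* Model of a device whose interrupt control register is banked per pipe. */
struct toy_dev {
	unsigned int selected_pipe;
	uint32_t cpc_int_cntl[TOY_NUM_PIPES];
};

/* Select which pipe subsequent register writes are routed to. */
void toy_pipe_select(struct toy_dev *dev, unsigned int pipe)
{
	assert(pipe < TOY_NUM_PIPES);
	dev->selected_pipe = pipe;
}

/* Write the interrupt control register of the currently selected pipe. */
void toy_write_int_cntl(struct toy_dev *dev, uint32_t value)
{
	dev->cpc_int_cntl[dev->selected_pipe] = value;
}

/* Apply one value to every pipe: an enable mask on start, 0 on stop. */
void toy_set_interrupts(struct toy_dev *dev, uint32_t value)
{
	for (unsigned int i = 0; i < TOY_NUM_PIPES; i++) {
		toy_pipe_select(dev, i);
		toy_write_int_cntl(dev, value);
	}
}
```

Calling toy_set_interrupts() with the enable bit mirrors the start path, and
calling it with 0 mirrors the stop path.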

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 28 +++
 1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 5d42e88..9add5e5 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -486,6 +486,32 @@ static void cik_static_destroy(struct kfd_scheduler *scheduler)
kfree(priv);
 }
 
+static void
+enable_interrupts(struct cik_static_private *priv)
+{
+   unsigned int i;
+
+   lock_srbm_index(priv);
+   for (i = 0; i < priv->num_pipes; i++) {
+       pipe_select(priv, i);
+       WRITE_REG(priv->dev, CPC_INT_CNTL, DEQUEUE_REQUEST_INT_ENABLE);
+   }
+   unlock_srbm_index(priv);
+}
+
+static void
+disable_interrupts(struct cik_static_private *priv)
+{
+   unsigned int i;
+
+   lock_srbm_index(priv);
+   for (i = 0; i < priv->num_pipes; i++) {
+       pipe_select(priv, i);
+       WRITE_REG(priv->dev, CPC_INT_CNTL, 0);
+   }
+   unlock_srbm_index(priv);
+}
+
 static void cik_static_start(struct kfd_scheduler *scheduler)
 {
struct cik_static_private *priv = kfd_scheduler_to_private(scheduler);
@@ -495,6 +521,7 @@ static void cik_static_start(struct kfd_scheduler *scheduler)
 
init_pipes(priv);
init_ats(priv);
+   enable_interrupts(priv);
 }
 
 static void cik_static_stop(struct kfd_scheduler *scheduler)
@@ -502,6 +529,7 @@ static void cik_static_stop(struct kfd_scheduler *scheduler)
struct cik_static_private *priv = kfd_scheduler_to_private(scheduler);
 
exit_ats(priv);
+   disable_interrupts(priv);
 
radeon_kfd_vidmem_ungpumap(priv->dev, priv->hpd_mem);
radeon_kfd_vidmem_ungpumap(priv->dev, priv->mqd_mem);
-- 
1.9.1



[PATCH 32/83] hsa/radeon: implementing IOCTL for clock counters

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Implemented new IOCTL to query the CPU and GPU clock counters.

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 37 
 include/uapi/linux/kfd_ioctl.h   |  9 +
 2 files changed, 46 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index ddaf357..d6fa980 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -28,6 +28,7 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <uapi/linux/kfd_ioctl.h>
+#include <linux/time.h>
 #include "kfd_priv.h"
 #include "kfd_scheduler.h"
 
@@ -284,6 +285,38 @@ out:
return err;
 }
 
+static long
+kfd_ioctl_get_clock_counters(struct file *filep, struct kfd_process *p, void __user *arg)
+{
+   struct kfd_ioctl_get_clock_counters_args args;
+   struct kfd_dev *dev;
+   struct timespec time;
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+       return -EFAULT;
+
+   dev = radeon_kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+       return -EINVAL;
+
+   /* Reading GPU clock counter from KGD */
+   args.gpu_clock_counter = kfd2kgd->get_gpu_clock_counter(dev->kgd);
+
+   /* No access to rdtsc. Using raw monotonic time */
+   getrawmonotonic(&time);
+   args.cpu_clock_counter = time.tv_nsec;
+
+   get_monotonic_boottime(&time);
+   args.system_clock_counter = time.tv_nsec;
+
+   /* Since the counter is in nano-seconds we use 1GHz frequency */
+   args.system_clock_freq = 1000000000;
+
+   if (copy_to_user(arg, &args, sizeof(args)))
+   return -EFAULT;
+
+   return 0;
+}
 
 static long
 kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
@@ -312,6 +345,10 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
err = kfd_ioctl_set_memory_policy(filep, process, (void __user *)arg);
break;
 
+   case KFD_IOC_GET_CLOCK_COUNTERS:
+       err = kfd_ioctl_get_clock_counters(filep, process, (void __user *)arg);
+       break;
+
default:
dev_err(kfd_device,
"unknown ioctl cmd 0x%x, arg 0x%lx)\n",
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 928e628..5b9517e 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -70,12 +70,21 @@ struct kfd_ioctl_set_memory_policy_args {
uint64_t alternate_aperture_size;   /* to KFD */
 };
 
+struct kfd_ioctl_get_clock_counters_args {
+   uint32_t gpu_id;/* to KFD */
+   uint64_t gpu_clock_counter; /* from KFD */
+   uint64_t cpu_clock_counter; /* from KFD */
+   uint64_t system_clock_counter;  /* from KFD */
+   uint64_t system_clock_freq; /* from KFD */
+};
+
 #define KFD_IOC_MAGIC 'K'
 
 #define KFD_IOC_GET_VERSION        _IOR(KFD_IOC_MAGIC, 1, struct kfd_ioctl_get_version_args)
 #define KFD_IOC_CREATE_QUEUE       _IOWR(KFD_IOC_MAGIC, 2, struct kfd_ioctl_create_queue_args)
 #define KFD_IOC_DESTROY_QUEUE      _IOWR(KFD_IOC_MAGIC, 3, struct kfd_ioctl_destroy_queue_args)
 #define KFD_IOC_SET_MEMORY_POLICY  _IOW(KFD_IOC_MAGIC, 4, struct kfd_ioctl_set_memory_policy_args)
+#define KFD_IOC_GET_CLOCK_COUNTERS _IOWR(KFD_IOC_MAGIC, 5, struct kfd_ioctl_get_clock_counters_args)
 
 #pragma pack(pop)
 
-- 
1.9.1



[PATCH 30/83] hsa/radeon: Fix list of supported devices

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device.c | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index b627e57..a21c095 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -27,7 +27,7 @@
 #include "kfd_priv.h"
 #include "kfd_scheduler.h"
 
-static const struct kfd_device_info bonaire_device_info = {
+static const struct kfd_device_info kaveri_device_info = {
.scheduler_class = &radeon_kfd_cik_static_scheduler_class,
.max_pasid_bits = 16,
.ih_ring_entry_size = 4 * sizeof(uint32_t)
@@ -40,10 +40,28 @@ struct kfd_deviceid {
 
 /* Please keep this sorted by increasing device id. */
 static const struct kfd_deviceid supported_devices[] = {
-   { 0x1305, &bonaire_device_info },   /* Kaveri */
-   { 0x1307, &bonaire_device_info },   /* Kaveri */
-   { 0x130F, &bonaire_device_info },   /* Kaveri */
-   { 0x665C, &bonaire_device_info },   /* Bonaire */
+   { 0x1304, &kaveri_device_info },    /* Kaveri */
+   { 0x1305, &kaveri_device_info },    /* Kaveri */
+   { 0x1306, &kaveri_device_info },    /* Kaveri */
+   { 0x1307, &kaveri_device_info },    /* Kaveri */
+   { 0x1309, &kaveri_device_info },    /* Kaveri */
+   { 0x130A, &kaveri_device_info },    /* Kaveri */
+   { 0x130B, &kaveri_device_info },    /* Kaveri */
+   { 0x130C, &kaveri_device_info },    /* Kaveri */
+   { 0x130D, &kaveri_device_info },    /* Kaveri */
+   { 0x130E, &kaveri_device_info },    /* Kaveri */
+   { 0x130F, &kaveri_device_info },    /* Kaveri */
+   { 0x1310, &kaveri_device_info },    /* Kaveri */
+   { 0x1311, &kaveri_device_info },    /* Kaveri */
+   { 0x1312, &kaveri_device_info },    /* Kaveri */
+   { 0x1313, &kaveri_device_info },    /* Kaveri */
+   { 0x1315, &kaveri_device_info },    /* Kaveri */
+   { 0x1316, &kaveri_device_info },    /* Kaveri */
+   { 0x1317, &kaveri_device_info },    /* Kaveri */
+   { 0x1318, &kaveri_device_info },    /* Kaveri */
+   { 0x131B, &kaveri_device_info },    /* Kaveri */
+   { 0x131C, &kaveri_device_info },    /* Kaveri */
+   { 0x131D, &kaveri_device_info },    /* Kaveri */
 };
 
 static const struct kfd_device_info *
-- 
1.9.1



[PATCH 20/83] hsa/radeon: Add interrupt callback function to kgd2kfd interface

2014-07-10 Thread Oded Gabbay
This patch adds a new callback function to the kgd2kfd interface. The
new callback is for propagating interrupts from radeon driver to the kfd
driver.
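
As a rough illustration of the callback-table pattern, the sketch below shows
a table of function pointers with an interrupt entry and a forwarding helper
that calls it only when a kfd device is attached. Every toy_ name is
hypothetical; the real interface is the kgd2kfd_calls structure in the diff
below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kfd device state. */
struct toy_kfd_dev {
	unsigned int interrupts_seen;
};

/* Hypothetical stand-in for the kgd2kfd call table. */
struct toy_kgd2kfd_calls {
	void (*interrupt)(struct toy_kfd_dev *kfd, const void *ih_ring_entry);
};

/* kfd side: the handler installed in the table. */
void toy_kfd_interrupt(struct toy_kfd_dev *kfd, const void *ih_ring_entry)
{
	(void)ih_ring_entry;	/* a real handler would decode the entry */
	kfd->interrupts_seen++;
}

const struct toy_kgd2kfd_calls toy_calls = {
	.interrupt = toy_kfd_interrupt,
};

/* Graphics-driver side: forward only if a kfd device is attached. */
void toy_forward_interrupt(const struct toy_kgd2kfd_calls *calls,
			   struct toy_kfd_dev *kfd, const void *ih_ring_entry)
{
	if (kfd)
		calls->interrupt(kfd, ih_ring_entry);
}
```

Keeping the NULL check on the forwarding side means the graphics driver can
call the helper unconditionally from its interrupt loop.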

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_module.c | 1 +
 include/linux/radeon_kfd.h  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index 6978bc0..ad21c6d 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -38,6 +38,7 @@ static const struct kgd2kfd_calls kgd2kfd = {
.probe  = kgd2kfd_probe,
.device_init= kgd2kfd_device_init,
.device_exit= kgd2kfd_device_exit,
+   .interrupt  = kgd2kfd_interrupt,
 };
 
 bool kgd2kfd_init(unsigned interface_version,
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 40b691c..2f4f7c0 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -62,6 +62,7 @@ struct kgd2kfd_calls {
struct kfd_dev* (*probe)(struct kgd_dev *kgd, struct pci_dev *pdev);
bool (*device_init)(struct kfd_dev *kfd, const struct kgd2kfd_shared_resources *gpu_resources);
void (*device_exit)(struct kfd_dev *kfd);
+   void (*interrupt)(struct kfd_dev *kfd, const void *ih_ring_entry);
 };
 
 struct kfd2kgd_calls {
-- 
1.9.1



[PATCH 36/83] hsa/radeon: fixing clock counters bug

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Fixed wrong reporting of timestamps in kfd_ioctl_get_clock_counters.
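
The diff below replaces reads of tv_nsec with timespec_to_ns(). A small
user-space sketch of the difference follows; the toy_ helpers are
hypothetical and only mimic the arithmetic, they are not the kernel
functions.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define TOY_NSEC_PER_SEC 1000000000LL

/* Full conversion: seconds and nanoseconds both contribute. */
int64_t toy_timespec_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * TOY_NSEC_PER_SEC + ts->tv_nsec;
}

/* What reading tv_nsec alone reports: only the sub-second part. */
int64_t toy_nsec_field_only(const struct timespec *ts)
{
	return ts->tv_nsec;
}
```

With the field-only variant, a later timestamp can compare as smaller than an
earlier one whenever a second boundary has been crossed, which is not what a
monotonic counter should report.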

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index dba6084..75fe11f 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -304,10 +304,10 @@ kfd_ioctl_get_clock_counters(struct file *filep, struct kfd_process *p, void __u
 
/* No access to rdtsc. Using raw monotonic time */
getrawmonotonic(&time);
-   args.cpu_clock_counter = time.tv_nsec;
+   args.cpu_clock_counter = (uint64_t)timespec_to_ns(&time);
 
get_monotonic_boottime(&time);
-   args.system_clock_counter = time.tv_nsec;
+   args.system_clock_counter = (uint64_t)timespec_to_ns(&time);
 
/* Since the counter is in nano-seconds we use 1GHz frequency */
args.system_clock_freq = 1000000000;
-- 
1.9.1



[PATCH 47/83] hsa/radeon: Add support allocating kernel doorbells

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This patch adds infrastructure to allocate doorbells which are not exposed to
user space.
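
The allocation scheme in the diff below is a bitmap of doorbell slots: find
the first clear bit, set it, and clear it again on release. A minimal
user-space sketch of that idea is shown here. The toy_ names and the slot
count are hypothetical, and the mutex that the patch takes around the bitmap
is omitted.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_MAX_SLOTS 64u	/* hypothetical number of kernel doorbells */
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS ((TOY_MAX_SLOTS + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* One bit per slot; a set bit means the slot is in use. */
struct toy_slot_map {
	unsigned long bits[TOY_WORDS];
};

/* Returns the allocated slot index, or -1 if every slot is taken. */
int toy_slot_alloc(struct toy_slot_map *map)
{
	for (unsigned int i = 0; i < TOY_MAX_SLOTS; i++) {
		unsigned long mask = 1ul << (i % TOY_BITS_PER_LONG);
		unsigned long *word = &map->bits[i / TOY_BITS_PER_LONG];

		if (!(*word & mask)) {
			*word |= mask;	/* claim the first free slot */
			return (int)i;
		}
	}
	return -1;
}

/* Mark a previously allocated slot as free again. */
void toy_slot_free(struct toy_slot_map *map, unsigned int i)
{
	assert(i < TOY_MAX_SLOTS);
	map->bits[i / TOY_BITS_PER_LONG] &= ~(1ul << (i % TOY_BITS_PER_LONG));
}
```

In this sketch a slot is only marked as used once it is known to be in range,
so a full map is reported with -1 and the bitmap is left untouched.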

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_doorbell.c | 76 ++-
 drivers/gpu/hsa/radeon/kfd_priv.h |  5 +++
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_doorbell.c b/drivers/gpu/hsa/radeon/kfd_doorbell.c
index 3de8a02..abf4cb0 100644
--- a/drivers/gpu/hsa/radeon/kfd_doorbell.c
+++ b/drivers/gpu/hsa/radeon/kfd_doorbell.c
@@ -23,6 +23,16 @@
 #include "kfd_priv.h"
 #include <linux/mm.h>
 #include <linux/mman.h>
+#include <linux/slab.h>
+
+/*
+ * This extension supports a kernel level doorbells management for the kernel queues
+ * basically the last doorbells page is devoted to kernel queues and that's assures
+ * that any user process won't get access to the kernel doorbells page
+ */
+static DEFINE_MUTEX(doorbell_mutex);
+static unsigned long doorbell_available_index[DIV_ROUND_UP(MAX_PROCESS_QUEUES, BITS_PER_LONG)] = { 0 };
+#define KERNEL_DOORBELL_PASID 1
 
 /*
  * Each device exposes a doorbell aperture, a PCI MMIO aperture that
@@ -67,7 +77,22 @@ void radeon_kfd_doorbell_init(struct kfd_dev *kfd)
 
kfd->doorbell_base = kfd->shared_resources.doorbell_physical_address + doorbell_start_offset;
kfd->doorbell_id_offset = doorbell_start_offset / sizeof(doorbell_t);
-   kfd->doorbell_process_limit = doorbell_process_limit;
+   kfd->doorbell_process_limit = doorbell_process_limit - 1;
+
+   kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base, doorbell_process_allocation());
+   BUG_ON(!kfd->doorbell_kernel_ptr);
+
+   pr_debug("kfd: doorbell initialization\n"
+            " doorbell base   == 0x%08lX\n"
+            " doorbell_id_offset  == 0x%08lu\n"
+            " doorbell_process_limit  == 0x%08lu\n"
+            " doorbell_kernel_offset  == 0x%08lX\n"
+            " doorbell aperture size  == 0x%08lX\n"
+            " doorbell kernel address == 0x%08lX\n",
+            (uintptr_t)kfd->doorbell_base, kfd->doorbell_id_offset, doorbell_process_limit,
+            (uintptr_t)kfd->doorbell_base, kfd->shared_resources.doorbell_aperture_size,
+            (uintptr_t)kfd->doorbell_kernel_ptr);
+
+
 }
 
 /* This is the /dev/kfd mmap (for doorbell) implementation. We intend that this is only called through map_doorbells,
@@ -136,6 +161,53 @@ map_doorbells(struct file *devkfd, struct kfd_process *process, struct kfd_dev *
return 0;
 }
 
+/* get kernel iomem pointer for a doorbell */
+u32 __iomem *radeon_kfd_get_kernel_doorbell(struct kfd_dev *kfd, unsigned int *doorbell_off)
+{
+   u32 inx;
+
+   BUG_ON(!kfd || !doorbell_off);
+
+   mutex_lock(&doorbell_mutex);
+   inx = find_first_zero_bit(doorbell_available_index, MAX_PROCESS_QUEUES);
+   __set_bit(inx, doorbell_available_index);
+   mutex_unlock(&doorbell_mutex);
+
+   if (inx >= MAX_PROCESS_QUEUES)
+       return NULL;
+
+   /* calculating the kernel doorbell offset using faked kernel pasid that allocated for kernel queues only */
+   *doorbell_off = KERNEL_DOORBELL_PASID * (doorbell_process_allocation()/sizeof(doorbell_t)) + inx;
+
+   pr_debug("kfd: get kernel queue doorbell\n"
+            " doorbell offset   == 0x%08d\n"
+            " kernel address== 0x%08lX\n",
+            *doorbell_off, (uintptr_t)(kfd->doorbell_kernel_ptr + inx));
+
+   return kfd->doorbell_kernel_ptr + inx;
+}
+
+void radeon_kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 __iomem *db_addr)
+{
+   unsigned int inx;
+
+   BUG_ON(!kfd || !db_addr);
+
+   inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);
+
+   mutex_lock(&doorbell_mutex);
+   __clear_bit(inx, doorbell_available_index);
+   mutex_unlock(&doorbell_mutex);
+}
+
+inline void write_kernel_doorbell(u32 __iomem *db, u32 value)
+{
+   if (db) {
+   writel(value, db);
+       pr_debug("writing %d to doorbell address 0x%p\n", value, db);
+   }
+}
+
 /* Get the user-mode address of a doorbell. Assumes that the process mutex is being held. */
 doorbell_t __user *radeon_kfd_get_doorbell(struct file *devkfd, struct kfd_process *process, struct kfd_dev *dev,
    unsigned int doorbell_index)
@@ -152,6 +224,8 @@ doorbell_t __user *radeon_kfd_get_doorbell(struct file *devkfd, struct kfd_proce
pdd = radeon_kfd_get_process_device_data(dev, process);
BUG_ON(pdd == NULL); /* map_doorbells would have failed otherwise */
 
+   pr_debug("doorbell value on creation 0x%x\n", pdd->doorbell_mapping[doorbell_index]);
+
return pdd

[PATCH 37/83] hsa/radeon: Print ISR info only in debug mode

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 5bfde5c..7573d25 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -899,7 +899,7 @@ cik_static_interrupt_isr(struct kfd_scheduler *scheduler, const void *ih_ring_en
if (!int_compute_pipe(priv, ihre, pipe_id))
return false;
 
-   dev_info(radeon_kfd_chardev(), "INT(ISR): src=%02x, data=0x%x, pipe=%u, vmid=%u, pasid=%u\n",
+   dev_dbg(radeon_kfd_chardev(), "INT(ISR): src=%02x, data=0x%x, pipe=%u, vmid=%u, pasid=%u\n",
 ihre->source_id, ihre->data, pipe_id, ihre->vmid, ihre->pasid);
 
switch (source_id) {
-- 
1.9.1



[PATCH 39/83] drm/radeon: Extending kfd interface

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Adding new function to the interface used by kfd.
The new function retrieves the max engine clock speed.
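
The conversion itself is simple arithmetic. As a sketch, assuming (as the
comment in the diff below states) that the stored clock value is in units of
10 kHz, a hypothetical helper would look like this:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an engine clock expressed in units of 10 kHz to MHz.
 * 1 MHz = 100 * 10 kHz, and integer division truncates any remainder.
 */
uint32_t toy_sclk_to_mhz(uint32_t sclk_10khz)
{
	return sclk_10khz / 100;
}
```

For example, a stored value of 80000 (800,000 kHz) corresponds to 800 MHz.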

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 11 +++
 include/linux/radeon_kfd.h  |  2 ++
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 6dba170..8b6d497 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -50,6 +50,8 @@ static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
 static void lock_grbm_gfx_idx(struct kgd_dev *kgd);
 static void unlock_grbm_gfx_idx(struct kgd_dev *kgd);
 
+static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
+
 
 static const struct kfd2kgd_calls kfd2kgd = {
.allocate_mem = allocate_mem,
@@ -64,6 +66,7 @@ static const struct kfd2kgd_calls kfd2kgd = {
.unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
.lock_grbm_gfx_idx = lock_grbm_gfx_idx,
.unlock_grbm_gfx_idx = unlock_grbm_gfx_idx,
+   .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
 };
 
 static const struct kgd2kfd_calls *kgd2kfd;
@@ -307,3 +310,11 @@ static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
 
return rdev->asic->get_gpu_clock_counter(rdev);
 }
+
+static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
+{
+   struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+   /* The sclk is in units of 10 kHz */
+   return rdev->pm.power_state->clock_info->sclk / 100;
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 4c7e923..4114c8e 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -93,6 +93,8 @@ struct kfd2kgd_calls {
/* GRBM_GFX_INDEX mutex */
void (*lock_grbm_gfx_idx)(struct kgd_dev *kgd);
void (*unlock_grbm_gfx_idx)(struct kgd_dev *kgd);
+
+   uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
 };
 
 bool kgd2kfd_init(unsigned interface_version,
-- 
1.9.1



[PATCH 51/83] hsa/radeon: Add packet manager module

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

The packet manager module builds PM4 packets for the sole use of the CP
scheduler. Those packets are used by the HIQ to submit runlists to the CP.
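
As a rough sketch of what building a PM4 type-3 header involves: the count
field holds the packet size in dwords minus two, as in build_pm4_header in
the diff below. The bit positions used here (type in bits 31:30, count in
bits 29:16, opcode in bits 15:8) are an assumption based on the radeon
driver's PACKET3 macro, and the toy_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PM4_TYPE_3 3u

/* Pack a type-3 header for a packet of the given size in bytes. */
uint32_t toy_build_pm4_header(uint32_t opcode, size_t packet_size_bytes)
{
	size_t dwords = packet_size_bytes / sizeof(uint32_t);
	uint32_t count;

	assert(dwords >= 2);			/* count would underflow otherwise */
	count = (uint32_t)(dwords - 2);		/* count = size in dwords - 2 */

	return (TOY_PM4_TYPE_3 << 30) |
	       ((count & 0x3FFFu) << 16) |
	       ((opcode & 0xFFu) << 8);
}

/* Advance a dword write pointer by a byte count, refusing to overrun. */
void toy_inc_wptr(unsigned int *wptr, unsigned int increment_bytes,
		  unsigned int buffer_size_bytes)
{
	unsigned int next = *wptr +
		(unsigned int)(increment_bytes / sizeof(uint32_t));

	assert(next * sizeof(uint32_t) <= buffer_size_bytes);
	*wptr = next;
}
```

A caller would write the header at the current write pointer, fill in the
rest of the packet, and then advance the pointer by the packet size.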

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile |   2 +-
 drivers/gpu/hsa/radeon/kfd_packet_manager.c | 473 
 2 files changed, 474 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_packet_manager.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index f06d925..4978915 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -7,6 +7,6 @@ radeon_kfd-y:= kfd_module.o kfd_device.o kfd_chardev.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \
-   kfd_kernel_queue.o
+   kfd_kernel_queue.o kfd_packet_manager.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_packet_manager.c b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
new file mode 100644
index 0000000..4967b7c
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
@@ -0,0 +1,473 @@
+/*
+ * packet_manager.c
+ *
+ *  Created on: Mar 16, 2014
+ *  Author: ben
+ */
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include "kfd_device_queue_manager.h"
+#include "kfd_kernel_queue.h"
+#include "kfd_priv.h"
+#include "kfd_pm4_headers.h"
+#include "kfd_pm4_opcodes.h"
+#include "cik_mqds.h"
+
+static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes, unsigned int buffer_size_bytes)
+{
+   unsigned int temp = *wptr + increment_bytes / sizeof(uint32_t);
+
+   BUG_ON((temp * sizeof(uint32_t)) > buffer_size_bytes);
+   *wptr = temp;
+}
+
+static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size)
+{
+   PM4_TYPE_3_HEADER header;
+
+   header.u32all = 0;
+   header.opcode = opcode;
+   header.count = packet_size/sizeof(uint32_t) - 2;
+   header.type = PM4_TYPE_3;
+
+   return header.u32all;
+}
+
+static void pm_calc_rlib_size(struct packet_manager *pm, unsigned int *rlib_size, bool *over_subscription)
+{
+   unsigned int process_count, queue_count;
+
+   BUG_ON(!pm || !rlib_size || !over_subscription);
+
+   process_count = pm->dqm->processes_count;
+   queue_count = pm->dqm->queue_count;
+
+   /* check if there is over subscription*/
+   *over_subscription = false;
+   if ((process_count >= VMID_PER_DEVICE) ||
+       queue_count >= PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE) {
+       *over_subscription = true;
+       pr_debug("kfd: over subscribed runlist\n");
+   }
+
+   /* calculate run list ib allocation size */
+   *rlib_size = process_count * sizeof(struct pm4_map_process) +
+                queue_count * sizeof(struct pm4_map_queues);
+
+   /* increase the allocation size in case we need a chained run list when over subscription */
+   if (*over_subscription)
+       *rlib_size += sizeof(struct pm4_runlist);
+
+   pr_debug("kfd: runlist ib size %d\n", *rlib_size);
+}
+
+static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int **rl_buffer, uint64_t *rl_gpu_buffer,
+   unsigned int *rl_buffer_size, bool *is_over_subscription)
+{
+   int retval;
+
+   BUG_ON(!pm);
+   BUG_ON(pm->allocated == true);
+
+   pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
+   if (is_over_subscription &&
+       sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION)
+       return -EFAULT;
+
+   retval = radeon_kfd_vidmem_alloc_map(pm->dqm->dev, pm->ib_buffer_obj, (void **)rl_buffer,
+                                        rl_gpu_buffer, ALIGN(*rl_buffer_size, PAGE_SIZE));
+   if (retval != 0) {
+       pr_err("kfd: failed to allocate runlist IB\n");
+       return retval;
+   }
+
+   memset(*rl_buffer, 0, *rl_buffer_size);
+   pm->allocated = true;
+   return retval;
+}
+
+static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
+   uint64_t ib, size_t ib_size_in_dwords, bool chain)
+{
+   struct pm4_runlist *packet;
+
+   BUG_ON(!pm || !buffer || !ib);
+
+   packet = (struct pm4_runlist *)buffer;
+
+   memset(buffer, 0, sizeof(struct pm4_runlist));
+   packet->header.u32all = build_pm4_header(IT_RUN_LIST, sizeof(struct pm4_runlist));
+
+   packet->bitfields4.ib_size = ib_size_in_dwords;
+   packet->bitfields4.chain = chain ? 1 : 0;
+   packet->bitfields4.offload_polling = 0;
+   packet->bitfields4.valid = 1;
+   packet->bitfields4.vmid = 0;
+   packet->ordinal2 = lower_32(ib);
+   packet->bitfields3.ib_base_hi = upper_32(ib

[PATCH 45/83] hsa/radeon: debugging print statements

2014-07-10 Thread Oded Gabbay
From: Michael Varga michael.va...@amd.com

Added debug print statements so critical errors during init may be debugged
more easily.

Signed-off-by: Michael Varga michael.va...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index 2e7d50d..82febf4 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -107,20 +107,30 @@ device_iommu_pasid_init(struct kfd_dev *kfd)
int err;
 
err = amd_iommu_device_info(kfd->pdev, &iommu_info);
-   if (err < 0)
+   if (err < 0) {
+       dev_err(kfd_device, "error getting iommu info. is the iommu enabled?\n");
return false;
+   }
 
-   if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags)
+   if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
+       dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
+              (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
+              (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
+              (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0);
return false;
+   }
 
pasid_limit = min_t(pasid_t, (pasid_t)1 << kfd->device_info->max_pasid_bits, iommu_info.max_pasids);
pasid_limit = min_t(pasid_t, pasid_limit, kfd->doorbell_process_limit);
 
err = amd_iommu_init_device(kfd->pdev, pasid_limit);
-   if (err < 0)
+   if (err < 0) {
+       dev_err(kfd_device, "error initializing iommu device\n");
return false;
+   }
 
if (!radeon_kfd_set_pasid_limit(pasid_limit)) {
+       dev_err(kfd_device, "error setting pasid limit\n");
amd_iommu_free_device(kfd->pdev);
return false;
}
@@ -166,6 +176,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
kfd->device_info->scheduler_class->start(kfd->scheduler);
 
kfd->init_complete = true;
+   dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
+            kfd->pdev->device);
 
return true;
 }
-- 
1.9.1



[PATCH 52/83] hsa/radeon: Add process queue manager module

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

The queue scheduler is divided into two sections: one section is process-bound
and the other section is device-bound.
The process-bound section is handled by this module. The process queue manager
(PQM) handles HSA queue setup, updates and tear-down.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile|   3 +-
 drivers/gpu/hsa/radeon/kfd_priv.h  |  29 ++
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 370 +
 3 files changed, 401 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 4978915..341fa67 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -7,6 +7,7 @@ radeon_kfd-y:= kfd_module.o kfd_device.o kfd_chardev.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \
-   kfd_kernel_queue.o kfd_packet_manager.o
+   kfd_kernel_queue.o kfd_packet_manager.o \
+   kfd_process_queue_manager.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index b3889aa..e716745 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -311,6 +311,9 @@ struct kfd_process_device {
/* Scheduler process data for this device. */
struct kfd_scheduler_process *scheduler_process;
 
+   /* per-process-per device QCM data structure */
+   struct qcm_process_device qpd;
+
/* Is this process/pasid bound to this device? (amd_iommu_bind_pasid) */
bool bound;
 
@@ -342,6 +345,11 @@ struct kfd_process {
/* List of kfd_process_device structures, one for each device the 
process is using. */
struct list_head per_device_data;
 
+   struct hw_pointer_store_properties write_ptr;
+   struct hw_pointer_store_properties read_ptr;
+
+   struct process_queue_manager pqm;
+
/* The process's queues. */
size_t queue_array_size;
struct kfd_queue **queues;  /* Size is queue_array_size, up to 
MAX_PROCESS_QUEUES. */
@@ -431,6 +439,27 @@ struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE 
type, struct kfd_dev *dev
 struct kernel_queue *kernel_queue_init(struct kfd_dev *dev, enum 
kfd_queue_type type);
 void kernel_queue_uninit(struct kernel_queue *kq);
 
+/* Process Queue Manager */
+struct process_queue_node {
+   struct queue *q;
+   struct kernel_queue *kq;
+   struct list_head process_queue_list;
+};
+
+int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p);
+void pqm_uninit(struct process_queue_manager *pqm);
+int pqm_create_queue(struct process_queue_manager *pqm,
+   struct kfd_dev *dev,
+   struct file *f,
+   struct queue_properties *properties,
+   unsigned int flags,
+   enum kfd_queue_type type,
+   unsigned int *qid);
+int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid);
+int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, 
struct queue_properties *p);
+struct kernel_queue *pqm_get_kernel_queue(struct process_queue_manager *pqm, 
unsigned int qid);
+void test_diq(struct kfd_dev *dev, struct process_queue_manager *pqm);
+
 /* Packet Manager */
 
 #define KFD_HIQ_TIMEOUT (500)
diff --git a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c 
b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
new file mode 100644
index 000..6e38ca4
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
@@ -0,0 +1,370 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT

[PATCH 82/83] drm/radeon: Remove lock functions from kfd2kgd interface

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 44 -
 include/linux/radeon_kfd.h  | 10 -
 2 files changed, 54 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
b/drivers/gpu/drm/radeon/radeon_kfd.c
index 738c2b3..7e8e041 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -115,12 +115,6 @@ static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem 
*mem);
 static uint64_t get_vmem_size(struct kgd_dev *kgd);
 static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
-static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
-static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
-
-static void lock_grbm_gfx_idx(struct kgd_dev *kgd);
-static void unlock_grbm_gfx_idx(struct kgd_dev *kgd);
-
 static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
 
 /*
@@ -146,10 +140,6 @@ static const struct kfd2kgd_calls kfd2kgd = {
.unkmap_mem = unkmap_mem,
.get_vmem_size = get_vmem_size,
.get_gpu_clock_counter = get_gpu_clock_counter,
-   .lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
-   .unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
-   .lock_grbm_gfx_idx = lock_grbm_gfx_idx,
-   .unlock_grbm_gfx_idx = unlock_grbm_gfx_idx,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
@@ -200,8 +190,6 @@ void radeon_kfd_device_init(struct radeon_device *rdev)
 {
if (rdev->kfd) {
struct kgd2kfd_shared_resources gpu_resources = {
-   .mmio_registers = rdev->rmmio,
-
.compute_vmid_bitmap = 0xFF00,
 
.first_compute_pipe = 1,
@@ -363,38 +351,6 @@ static uint64_t get_vmem_size(struct kgd_dev *kgd)
return rdev->mc.real_vram_size;
 }
 
-static void lock_srbm_gfx_cntl(struct kgd_dev *kgd)
-{
-   struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-   mutex_lock(&rdev->srbm_mutex);
-}
-
-static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd)
-{
-   struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-   mutex_unlock(&rdev->srbm_mutex);
-}
-
-static void lock_grbm_gfx_idx(struct kgd_dev *kgd)
-{
-   struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-   BUG_ON(kgd == NULL);
-
-   mutex_lock(&rdev->grbm_idx_mutex);
-}
-
-static void unlock_grbm_gfx_idx(struct kgd_dev *kgd)
-{
-   struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-   BUG_ON(kgd == NULL);
-
-   mutex_unlock(&rdev->grbm_idx_mutex);
-}
-
 static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
 {
struct radeon_device *rdev = (struct radeon_device *)kgd;
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index aa021fb..2fffe32 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -45,8 +45,6 @@ enum kgd_memory_pool {
 };
 
 struct kgd2kfd_shared_resources {
-   void __iomem *mmio_registers; /* Mapped pointer to GFX MMIO registers. 
*/
-
unsigned int compute_vmid_bitmap; /* Bit n == 1 means VMID n is 
available for KFD. */
 
unsigned int first_compute_pipe; /* Compute pipes are counted starting 
from MEC0/pipe0 as 0. */
@@ -86,14 +84,6 @@ struct kfd2kgd_calls {
uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
uint64_t (*get_gpu_clock_counter)(struct kgd_dev *kgd);
 
-   /* SRBM_GFX_CNTL mutex */
-   void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
-   void (*unlock_srbm_gfx_cntl)(struct kgd_dev *kgd);
-
-   /* GRBM_GFX_INDEX mutex */
-   void (*lock_grbm_gfx_idx)(struct kgd_dev *kgd);
-   void (*unlock_grbm_gfx_idx)(struct kgd_dev *kgd);
-
uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
 
/* Register access functions */
-- 
1.9.1



[PATCH 62/83] hsa/radeon: Fix timeout calculation in sync_with_hw

2014-07-10 Thread Oded Gabbay
This patch fixes a bug in the timeout calculation done in sync_with_hw.
The original code assumed that jiffies is incremented once per millisecond,
which holds only when HZ is 1000.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_kernel_queue.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c 
b/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
index 25528b3..ce3261b 100644
--- a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
+++ b/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
@@ -222,12 +222,18 @@ static void submit_packet(struct kernel_queue *kq)
 
 static int sync_with_hw(struct kernel_queue *kq, unsigned long timeout_ms)
 {
+   unsigned long org_timeout_ms;
+
BUG_ON(!kq);
-   timeout_ms += jiffies;
+
+   org_timeout_ms = timeout_ms;
+   timeout_ms += jiffies * 1000 / HZ;
while (*kq->wptr_kernel != *kq->rptr_kernel) {
-   if (time_after(jiffies, timeout_ms)) {
-   pr_err("kfd: kernel_queue %s timeout expired %lu\n",
__func__, timeout_ms);
-   pr_err("kfd: wptr: %d rptr: %d\n", *kq->wptr_kernel,
*kq->rptr_kernel);
+   if (time_after(jiffies * 1000 / HZ, timeout_ms)) {
+   pr_err("kfd: kernel_queue %s timeout expired %lu\n",
+   __func__, org_timeout_ms);
+   pr_err("kfd: wptr: %d rptr: %d\n",
+   *kq->wptr_kernel, *kq->rptr_kernel);
return -ETIME;
}
cpu_relax();
-- 
1.9.1



[PATCH 83/83] hsa/radeon: Update module version to 0.6.2

2014-07-10 Thread Oded Gabbay
This version is intended for upstreaming to the Linux kernel 3.17

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_module.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c 
b/drivers/gpu/hsa/radeon/kfd_module.c
index c706236..c783eeb 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -30,10 +30,10 @@
 #define KFD_DRIVER_AUTHOR  "AMD Inc. and others"
 
 #define KFD_DRIVER_DESC    "Standalone HSA driver for AMD's GPUs"
-#define KFD_DRIVER_DATE    "20140623"
+#define KFD_DRIVER_DATE    "20140710"
 #define KFD_DRIVER_MAJOR   0
 #define KFD_DRIVER_MINOR   6
-#define KFD_DRIVER_PATCHLEVEL  1
+#define KFD_DRIVER_PATCHLEVEL  2
 
 const struct kfd2kgd_calls *kfd2kgd;
 static const struct kgd2kfd_calls kgd2kfd = {
-- 
1.9.1



[PATCH 68/83] hsa/radeon: Update module version to 0.6.0

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c 
b/drivers/gpu/hsa/radeon/kfd_module.c
index fbfcce6..33cee3c 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -32,7 +32,7 @@
 #define KFD_DRIVER_DESC    "Standalone HSA driver for AMD's GPUs"
 #define KFD_DRIVER_DATE    "20140424"
 #define KFD_DRIVER_MAJOR   0
-#define KFD_DRIVER_MINOR   5
+#define KFD_DRIVER_MINOR   6
 #define KFD_DRIVER_PATCHLEVEL  0
 
 const struct kfd2kgd_calls *kfd2kgd;
-- 
1.9.1



[PATCH 74/83] hsa/radeon: Adding some error messages

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 09c9a61..be89d26 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -137,11 +137,15 @@ kfd_ioctl_create_queue(struct file *filep, struct 
kfd_process *p, void __user *a
if (copy_from_user(&args, arg, sizeof(args)))
return -EFAULT;
 
-   if (!access_ok(VERIFY_WRITE, args.read_pointer_address, sizeof(qptr_t)))
+   if (!access_ok(VERIFY_WRITE, args.read_pointer_address, 
sizeof(qptr_t))) {
+   pr_err("kfd: can't access read pointer");
return -EFAULT;
+   }
 
-   if (!access_ok(VERIFY_WRITE, args.write_pointer_address, 
sizeof(qptr_t)))
+   if (!access_ok(VERIFY_WRITE, args.write_pointer_address, 
sizeof(qptr_t))) {
+   pr_err("kfd: can't access write pointer");
return -EFAULT;
+   }
 
q_properties.is_interop = false;
q_properties.queue_percent = args.queue_percentage;
-- 
1.9.1



[PATCH 57/83] hsa/radeon: Eliminate warnings in compilation

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c  | 6 +++---
 drivers/gpu/hsa/radeon/kfd_kernel_queue.c | 4 ++--
 drivers/gpu/hsa/radeon/kfd_queue.c| 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 9a77332..80b702e 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -114,8 +114,8 @@ kfd_open(struct inode *inode, struct file *filep)
 
process->is_32bit_user_mode = is_compat_task();
 
-   dev_info(kfd_device, "process %d opened, compat mode (32 bit) - %d\n",
-   process->pasid, process->is_32bit_user_mode);
+   dev_dbg(kfd_device, "process %d opened, compat mode (32 bit) - %d\n",
+   process->pasid, process->is_32bit_user_mode);
 
kfd_init_apertures(process);
 
@@ -149,7 +149,7 @@ kfd_ioctl_create_queue(struct file *filep, struct 
kfd_process *p, void __user *a
pr_debug("%s Arguments: Queue Percentage (%d, %d)\n"
"Queue Priority (%d, %d)\n"
"Queue Address (0x%llX, 0x%llX)\n"
-   "Queue Size (%u64, %ll)\n",
+   "Queue Size (%llX, %u)\n",
__func__,
q_properties.queue_percent, args.queue_percentage,
q_properties.priority, args.queue_priority,
diff --git a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c 
b/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
index aa64693e..25528b3 100644
--- a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
+++ b/drivers/gpu/hsa/radeon/kfd_kernel_queue.c
@@ -89,8 +89,8 @@ static bool initialize(struct kernel_queue *kq, struct 
kfd_dev *dev,
prop.type = type;
prop.vmid = 0;
prop.queue_address = kq->pq_gpu_addr;
-   prop.read_ptr = kq->rptr_gpu_addr;
-   prop.write_ptr = kq->wptr_gpu_addr;
+   prop.read_ptr = (qptr_t *) kq->rptr_gpu_addr;
+   prop.write_ptr = (qptr_t *) kq->wptr_gpu_addr;
 
if (init_queue(&kq->queue, prop) != 0)
goto err_init_queue;
diff --git a/drivers/gpu/hsa/radeon/kfd_queue.c 
b/drivers/gpu/hsa/radeon/kfd_queue.c
index 2d22cc1..646b6d1 100644
--- a/drivers/gpu/hsa/radeon/kfd_queue.c
+++ b/drivers/gpu/hsa/radeon/kfd_queue.c
@@ -67,7 +67,7 @@ void print_queue(struct queue *q)
"Queue Doorbell Pointer: 0x%p\n"
"Queue Doorbell Offset: %u\n"
"Queue MQD Address: 0x%p\n"
-   "Queue MQD Gart: 0x%p\n"
+   "Queue MQD Gart: 0x%llX\n"
"Queue Process Address: 0x%p\n"
"Queue Device Address: 0x%p\n",
q->properties.type,
-- 
1.9.1



[PATCH 80/83] drm/radeon: Add register access functions to kfd2kgd interface

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This patch extends the kfd2kgd interface by adding functions
that perform direct register access.

These functions can be called from kfd and make it possible to
eliminate all direct register accesses from within the kfd.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/cikd.h   |  51 +-
 drivers/gpu/drm/radeon/radeon_kfd.c | 354 
 include/linux/radeon_kfd.h  |  11 ++
 3 files changed, 415 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/cikd.h b/drivers/gpu/drm/radeon/cikd.h
index 0c6e1b5..0a2a403 100644
--- a/drivers/gpu/drm/radeon/cikd.h
+++ b/drivers/gpu/drm/radeon/cikd.h
@@ -1137,6 +1137,9 @@
 #define SH_MEM_ALIGNMENT_MODE_UNALIGNED        3
 #define DEFAULT_MTYPE(x)                       ((x) << 4)
 #define APE1_MTYPE(x)                          ((x) << 7)
+/* valid for both DEFAULT_MTYPE and APE1_MTYPE */
+#define MTYPE_CACHED                           0
+#define MTYPE_NONCACHED                        3
 
 #define SX_DEBUG_1                             0x9060
 
@@ -1447,6 +1450,16 @@
 #define CP_HQD_ACTIVE 0xC91C
 #define CP_HQD_VMID   0xC920
 
+#define CP_HQD_PERSISTENT_STATE                0xC924u
+#define DEFAULT_CP_HQD_PERSISTENT_STATE        (0x33U << 8)
+
+#define CP_HQD_PIPE_PRIORITY                   0xC928u
+#define CP_HQD_QUEUE_PRIORITY                  0xC92Cu
+#define CP_HQD_QUANTUM                         0xC930u
+#define QUANTUM_EN                             1U
+#define QUANTUM_SCALE_1MS                      (1U << 4)
+#define QUANTUM_DURATION(x)                    ((x) << 8)
+
 #define CP_HQD_PQ_BASE0xC934
 #define CP_HQD_PQ_BASE_HI 0xC938
 #define CP_HQD_PQ_RPTR0xC93C
@@ -1474,12 +1487,32 @@
 #define PRIV_STATE                             (1 << 30)
 #define KMD_QUEUE                              (1 << 31)
 
-#define CP_HQD_DEQUEUE_REQUEST                 0xC974
+#define CP_HQD_IB_BASE_ADDR                    0xC95Cu
+#define CP_HQD_IB_BASE_ADDR_HI                 0xC960u
+#define CP_HQD_IB_RPTR                         0xC964u
+#define CP_HQD_IB_CONTROL                      0xC968u
+#define IB_ATC_EN                              (1U << 23)
+#define DEFAULT_MIN_IB_AVAIL_SIZE              (3U << 20)
+
+#define CP_HQD_DEQUEUE_REQUEST                 0xC974
+#define DEQUEUE_REQUEST_DRAIN                  1
+#define DEQUEUE_REQUEST_RESET                  2
 
 #define CP_MQD_CONTROL                         0xC99C
 #define MQD_VMID(x)                            ((x) << 0)
 #define MQD_VMID_MASK                          (0xf << 0)
 
+#define CP_HQD_SEMA_CMD                        0xC97Cu
+#define CP_HQD_MSG_TYPE                        0xC980u
+#define CP_HQD_ATOMIC0_PREOP_LO                0xC984u
+#define CP_HQD_ATOMIC0_PREOP_HI                0xC988u
+#define CP_HQD_ATOMIC1_PREOP_LO                0xC98Cu
+#define CP_HQD_ATOMIC1_PREOP_HI                0xC990u
+#define CP_HQD_HQ_SCHEDULER0                   0xC994u
+#define CP_HQD_HQ_SCHEDULER1                   0xC998u
+
+#define SH_STATIC_MEM_CONFIG                   0x9604u
+
 #define DB_RENDER_CONTROL   0x28000
 
 #define PA_SC_RASTER_CONFIG 0x28350
@@ -2069,4 +2102,20 @@
 #define VCE_CMD_IB_AUTO0x0005
 #define VCE_CMD_SEMAPHORE  0x0006
 
+#define ATC_VMID0_PASID_MAPPING                0x339Cu
+#define ATC_VMID_PASID_MAPPING_UPDATE_STATUS   0x3398u
+#define ATC_VMID_PASID_MAPPING_VALID           (1U << 31)
+
+#define ATC_VM_APERTURE0_CNTL                  0x3310u
+#define ATS_ACCESS_MODE_NEVER                  0
+#define ATS_ACCESS_MODE_ALWAYS                 1
+
+#define ATC_VM_APERTURE0_CNTL2                 0x3318u
+#define ATC_VM_APERTURE0_HIGH_ADDR             0x3308u
+#define ATC_VM_APERTURE0_LOW_ADDR              0x3300u
+#define

[PATCH 73/83] hsa/radeon: Adding qcm fence return status

2014-07-10 Thread Oded Gabbay
From: Yair Shachar yair.shac...@amd.com

Make the QCM fence wait return a status: 0 on success, -ETIME on timeout.

Signed-off-by: Yair Shachar yair.shac...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 6 --
 drivers/gpu/hsa/radeon/kfd_priv.h | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c 
b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 4931f8a..4c53e57 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -800,7 +800,7 @@ out:
return retval;
 }
 
-static void fence_wait_timeout(unsigned int *fence_addr, unsigned int 
fence_value, unsigned long timeout)
+int fence_wait_timeout(unsigned int *fence_addr, unsigned int fence_value, 
unsigned long timeout)
 {
BUG_ON(!fence_addr);
timeout += jiffies;
@@ -808,10 +808,12 @@ static void fence_wait_timeout(unsigned int *fence_addr, 
unsigned int fence_valu
while (*fence_addr != fence_value) {
if (time_after(jiffies, timeout)) {
pr_err("kfd: qcm fence wait loop timeout expired\n");
-   break;
+   return -ETIME;
}
cpu_relax();
}
+
+   return 0;
 }
 
 static int destroy_queues_cpsch(struct device_queue_manager *dqm)
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h 
b/drivers/gpu/hsa/radeon/kfd_priv.h
index 97bf58a..b61187a 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -463,6 +463,8 @@ int pqm_update_queue(struct process_queue_manager *pqm, 
unsigned int qid, struct
 struct kernel_queue *pqm_get_kernel_queue(struct process_queue_manager *pqm, 
unsigned int qid);
 void test_diq(struct kfd_dev *dev, struct process_queue_manager *pqm);
 
+int fence_wait_timeout(unsigned int *fence_addr, unsigned int fence_value, 
unsigned long timeout);
+
 /* Packet Manager */
 
 #define KFD_HIQ_TIMEOUT (500)
-- 
1.9.1
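
The change in return type can be illustrated with a small userspace sketch. This is only an approximation under stated assumptions: fence_wait_polls is a hypothetical name, and a poll budget (max_polls) stands in for the jiffies-based deadline used by the driver. The point is that the caller can now tell a timeout apart from a signalled fence.

```c
#include <errno.h>

/*
 * Poll *fence_addr until it equals fence_value or the poll budget runs out.
 * max_polls is a stand-in for the driver's jiffies deadline.
 * Returns 0 when the fence was observed, -ETIME when the budget expired.
 */
static int fence_wait_polls(const volatile unsigned int *fence_addr,
			    unsigned int fence_value,
			    unsigned long max_polls)
{
	unsigned long polls = 0;

	while (*fence_addr != fence_value) {
		if (polls++ >= max_polls)
			return -ETIME;
	}

	return 0;
}
```

A caller would check the result and propagate the error instead of silently continuing after a timeout.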



[PATCH 76/83] hsa/radeon: Check oversubscription before destroying runlist

2014-07-10 Thread Oded Gabbay
This patch fixes a bug when using the mode of CP hardware
scheduling without oversubscription.

The bug was that the oversubscription check was performed
_after_ the current runlist was destroyed, which caused
the current HSA application to stop working.

This patch moves the oversubscription check before the call
to destroy the current runlist. If there is oversubscription,
the function prints an error to dmesg and simply exits.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_packet_manager.c| 3 ---
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 9 +
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_packet_manager.c 
b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
index 5cd23b0..0aef907 100644
--- a/drivers/gpu/hsa/radeon/kfd_packet_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
@@ -88,9 +88,6 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm, 
unsigned int **rl_b
BUG_ON(is_over_subscription == NULL);
 
pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
-   if (*is_over_subscription &&
-   sched_policy ==
KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION)
-   return -EFAULT;
 
retval = radeon_kfd_vidmem_alloc_map(pm-dqm-dev, pm-ib_buffer_obj, 
(void **)rl_buffer,
 rl_gpu_buffer, 
ALIGN(*rl_buffer_size, PAGE_SIZE));
diff --git a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c 
b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
index 5d7c46d..97b3cc6 100644
--- a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
@@ -174,6 +174,15 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 
switch (type) {
case KFD_QUEUE_TYPE_COMPUTE:
+   /* check if there is over subscription */
+   if ((sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
+   ((dev->dqm->processes_count >= VMID_PER_DEVICE) ||
+   (dev->dqm->queue_count >= PIPE_PER_ME_CP_SCHEDULING *
QUEUES_PER_PIPE))) {
+   pr_err("kfd: over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
+   retval = -EPERM;
+   goto err_create_queue;
+   }
+
retval = create_cp_queue(pqm, dev, q, q_properties, f, *qid);
if (retval != 0)
goto err_create_queue;
-- 
1.9.1
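
The ordering described in the commit message (check first, touch the runlist only afterwards) can be sketched as a standalone predicate. The limits below are hypothetical placeholders, since the real values of VMID_PER_DEVICE, PIPE_PER_ME_CP_SCHEDULING and QUEUES_PER_PIPE are not shown in this patch, and the enum and function names are made up for the example. The only value taken from the patch is that the no-oversubscription policy corresponds to sched_policy == 1.

```c
#include <stdbool.h>

/* Hypothetical limits for this sketch; the driver's real values are not shown here. */
#define SKETCH_VMID_PER_DEVICE 8u
#define SKETCH_MAX_HW_QUEUES   24u

enum sketch_sched_policy {
	SKETCH_POLICY_HWS = 0,                     /* assumed name for policy 0 */
	SKETCH_POLICY_HWS_NO_OVERSUBSCRIPTION = 1, /* matches sched_policy == 1 */
};

/*
 * Decide whether one more compute queue may be created.
 * Called before any existing state (such as the runlist) is torn down.
 */
static bool queue_create_allowed(enum sketch_sched_policy policy,
				 unsigned int processes_count,
				 unsigned int queue_count)
{
	if (policy != SKETCH_POLICY_HWS_NO_OVERSUBSCRIPTION)
		return true;

	return processes_count < SKETCH_VMID_PER_DEVICE &&
	       queue_count < SKETCH_MAX_HW_QUEUES;
}
```

Because the predicate has no side effects, a rejected request leaves the currently running queues untouched, which is the behaviour the patch is after.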



[PATCH 59/83] hsa/radeon: Exclusive access for perf. counters

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Introducing IOCTL implementation for controlling exclusive access to
performance counters.
The exclusive access is per GPU device.

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 61 
 drivers/gpu/hsa/radeon/kfd_device.c  |  2 ++
 drivers/gpu/hsa/radeon/kfd_priv.h|  5 +++
 drivers/gpu/hsa/radeon/kfd_process.c |  8 +++--
 include/uapi/linux/kfd_ioctl.h   | 12 +++
 5 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 80b702e..b39df68 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -387,6 +387,59 @@ static int kfd_ioctl_get_process_apertures(struct file 
*filp, struct kfd_process
return 0;
 }
 
+static long
+kfd_ioctl_pmc_acquire_access(struct file *filp, struct kfd_process *p, void 
__user *arg)
+{
+   struct kfd_ioctl_pmc_acquire_access_args args;
+   struct kfd_dev *dev;
+   int err = -EBUSY;
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+   return -EFAULT;
+
+   dev = radeon_kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+   return -EINVAL;
+
+   spin_lock(&dev->pmc_access_lock);
+   if (dev->pmc_locking_process == NULL) {
+   dev->pmc_locking_process = p;
+   dev->pmc_locking_trace = args.trace_id;
+   err = 0;
+   } else if (dev->pmc_locking_process == p && dev->pmc_locking_trace ==
args.trace_id) {
+   /* Same trace already has an access. Returning success */
+   err = 0;
+   }
+
+   spin_unlock(&dev->pmc_access_lock);
+
+   return err;
+}
+
+static long
+kfd_ioctl_pmc_release_access(struct file *filp, struct kfd_process *p, void 
__user *arg)
+{
+   struct kfd_ioctl_pmc_release_access_args args;
+   struct kfd_dev *dev;
+   int err = -EINVAL;
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+   return -EFAULT;
+
+   dev = radeon_kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+   return -EINVAL;
+
+   spin_lock(&dev->pmc_access_lock);
+   if (dev->pmc_locking_process == p && dev->pmc_locking_trace ==
args.trace_id) {
+   dev->pmc_locking_process = NULL;
+   dev->pmc_locking_trace = 0;
+   err = 0;
+   }
+   spin_unlock(&dev->pmc_access_lock);
+
+   return err;
+}
 
 static long
 kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
@@ -427,6 +480,14 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned 
long arg)
err = kfd_ioctl_update_queue(filep, process, (void __user 
*)arg);
break;
 
+   case KFD_IOC_PMC_ACQUIRE_ACCESS:
+   err = kfd_ioctl_pmc_acquire_access(filep, process, (void __user 
*) arg);
+   break;
+
+   case KFD_IOC_PMC_RELEASE_ACCESS:
+   err = kfd_ioctl_pmc_release_access(filep, process, (void __user 
*) arg);
+   break;
+
default:
dev_err(kfd_device,
"unknown ioctl cmd 0x%x, arg 0x%lx)\n",
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c 
b/drivers/gpu/hsa/radeon/kfd_device.c
index c602e16..9af812b 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -185,6 +185,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
return false;
}
 
+   spin_lock_init(&kfd->pmc_access_lock);
+
kfd-init_complete = true;
dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
 kfd->pdev->device);
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h 
b/drivers/gpu/hsa/radeon/kfd_priv.h
index 049671b..e6d4993 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -135,6 +135,11 @@ struct kfd_dev {
 
/* QCM Device instance */
struct device_queue_manager *dqm;
+
+   /* Performance counters exclusivity lock */
+   spinlock_t pmc_access_lock;
+   struct kfd_process *pmc_locking_process;
+   uint64_t pmc_locking_trace;
 };
 
 /* KGD2KFD callbacks */
diff --git a/drivers/gpu/hsa/radeon/kfd_process.c 
b/drivers/gpu/hsa/radeon/kfd_process.c
index f967c15..9bb5cab 100644
--- a/drivers/gpu/hsa/radeon/kfd_process.c
+++ b/drivers/gpu/hsa/radeon/kfd_process.c
@@ -96,9 +96,13 @@ static void free_process(struct kfd_process *p)
 
BUG_ON(p == NULL);
 
-   /* doorbell mappings: automatic */
-
list_for_each_entry_safe(pdd, temp, &p->per_device_data,
per_device_list) {
+   spin_lock(&pdd->dev->pmc_access_lock);
+   if (pdd->dev->pmc_locking_process == p) {
+   pdd->dev->pmc_locking_process = NULL;
+   pdd->dev->pmc_locking_trace = 0;
+   }
+   spin_unlock(pdd
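
The acquire/release rules implemented by the two ioctls above can be summarised in a small userspace sketch. It is a model only: struct pmc_owner, pmc_acquire and pmc_release are hypothetical names, the process is represented by an opaque pointer, and the spinlock that guards these fields in the driver is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the per-device ownership fields; locking is omitted in this sketch. */
struct pmc_owner {
	const void *process;	/* NULL when nobody holds the counters */
	uint64_t trace_id;
};

/* Returns 0 if the caller now holds (or already held) the counters, -EBUSY otherwise. */
static int pmc_acquire(struct pmc_owner *o, const void *process, uint64_t trace_id)
{
	if (o->process == NULL) {
		o->process = process;
		o->trace_id = trace_id;
		return 0;
	}

	/* Re-acquiring with the same process and trace is treated as success. */
	if (o->process == process && o->trace_id == trace_id)
		return 0;

	return -EBUSY;
}

/* Returns 0 if the caller was the owner and ownership was dropped, -EINVAL otherwise. */
static int pmc_release(struct pmc_owner *o, const void *process, uint64_t trace_id)
{
	if (o->process != process || o->trace_id != trace_id)
		return -EINVAL;

	o->process = NULL;
	o->trace_id = 0;
	return 0;
}
```

Keying ownership on both the process and the trace id means a second trace in the same process is refused with -EBUSY until the first one releases.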

[PATCH 79/83] hsa/radeon: Update module version to 0.6.1

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_module.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c 
b/drivers/gpu/hsa/radeon/kfd_module.c
index 33cee3c..c706236 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -30,10 +30,10 @@
 #define KFD_DRIVER_AUTHOR  "AMD Inc. and others"
 
 #define KFD_DRIVER_DESC    "Standalone HSA driver for AMD's GPUs"
-#define KFD_DRIVER_DATE    "20140424"
+#define KFD_DRIVER_DATE    "20140623"
 #define KFD_DRIVER_MAJOR   0
 #define KFD_DRIVER_MINOR   6
-#define KFD_DRIVER_PATCHLEVEL  0
+#define KFD_DRIVER_PATCHLEVEL  1
 
 const struct kfd2kgd_calls *kfd2kgd;
 static const struct kgd2kfd_calls kgd2kfd = {
-- 
1.9.1



[PATCH 81/83] hsa/radeon: Eliminating all direct register accesses

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This patch eliminates all direct register accesses from KFD
and eliminates the use of shared locks between KFD and radeon.

The single exception is the doorbells, which are used by
both drivers. However, because they are located
in separate PCI BAR pages, the danger of sharing registers
between the drivers is minimal.

Having said that, we are planning to move the doorbells to
radeon as well.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile   |   2 +-
 drivers/gpu/hsa/radeon/kfd_device.c   |   2 -
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 113 +++---
 drivers/gpu/hsa/radeon/kfd_kernel_queue.c |  12 +-
 drivers/gpu/hsa/radeon/kfd_mqd_manager.c  | 175 +-
 drivers/gpu/hsa/radeon/kfd_mqd_manager.h  |  37 +++--
 drivers/gpu/hsa/radeon/kfd_priv.h |  18 ---
 drivers/gpu/hsa/radeon/kfd_registers.c|  50 ---
 8 files changed, 54 insertions(+), 355 deletions(-)
 delete mode 100644 drivers/gpu/hsa/radeon/kfd_registers.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index b5f05b4..d838bce 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -4,7 +4,7 @@
 
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
-   kfd_doorbell.o kfd_registers.o kfd_vidmem.o \
+   kfd_doorbell.o kfd_vidmem.o \
kfd_interrupt.o kfd_aperture.o kfd_queue.o kfd_mqd_manager.o \
kfd_kernel_queue.o kfd_packet_manager.o \
kfd_process_queue_manager.o kfd_device_queue_manager.o
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c 
b/drivers/gpu/hsa/radeon/kfd_device.c
index 30558c9..0ff2241 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -157,8 +157,6 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 {
kfd->shared_resources = *gpu_resources;
 
-   kfd->regs = gpu_resources->mmio_registers;
-
radeon_kfd_doorbell_init(kfd);
 
if (radeon_kfd_interrupt_init(kfd))
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c 
b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 12b8b33..3eb5db3 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -112,30 +112,15 @@ static void init_process_memory(struct 
device_queue_manager *dqm, struct qcm_pro
 
 static void program_sh_mem_settings(struct device_queue_manager *dqm, struct 
qcm_process_device *qpd)
 {
-   struct mqd_manager *mqd;
-
-   BUG_ON(qpd->vmid < KFD_VMID_START_OFFSET);
-
-   mqd = dqm->get_mqd_manager(dqm, KFD_MQD_TYPE_CIK_COMPUTE);
-   if (mqd == NULL)
-   return;
-
-   mqd->acquire_hqd(mqd, 0, 0, qpd->vmid);
-
-   WRITE_REG(dqm->dev, SH_MEM_CONFIG, qpd->sh_mem_config);
-
-   WRITE_REG(dqm->dev, SH_MEM_APE1_BASE, qpd->sh_mem_ape1_base);
-   WRITE_REG(dqm->dev, SH_MEM_APE1_LIMIT, qpd->sh_mem_ape1_limit);
-   WRITE_REG(dqm->dev, SH_MEM_BASES, qpd->sh_mem_bases);
-
-   mqd->release_hqd(mqd);
+   return kfd2kgd->program_sh_mem_settings(dqm->dev->kgd, qpd->vmid,
qpd->sh_mem_config,
+   qpd->sh_mem_ape1_base, qpd->sh_mem_ape1_limit,
qpd->sh_mem_bases);
 }
 
 static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue 
*q,
struct qcm_process_device *qpd, int *allocate_vmid)
 {
bool set, is_new_vmid;
-   int bit, retval, pipe;
+   int bit, retval, pipe, i;
struct mqd_manager *mqd;
 
BUG_ON(!dqm || !q || !qpd || !allocate_vmid);
@@ -171,8 +156,8 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm, struct queue *
q->properties.vmid = qpd->vmid;
 
set = false;
-   for (pipe = dqm->next_pipe_to_allocate; pipe < get_pipes_num(dqm);
-   pipe = (pipe + 1) % get_pipes_num(dqm)) {
+   for (i = 0, pipe = dqm->next_pipe_to_allocate; i < get_pipes_num(dqm);
+   pipe = (pipe + i++) % get_pipes_num(dqm)) {
if (dqm->allocated_queues[pipe] != 0) {
bit = find_first_bit((unsigned long
*)&dqm->allocated_queues[pipe], QUEUES_PER_PIPE);
clear_bit(bit, (unsigned long
*)&dqm->allocated_queues[pipe]);
@@ -238,9 +223,7 @@ static int destroy_queue_nocpsch(struct 
device_queue_manager *dqm, struct qcm_pr
retval = -ENOMEM;
goto out;
}
-   mqd-acquire_hqd(mqd, q-pipe, q-queue, 0);
-   retval = mqd-destroy_mqd(mqd, q-mqd, KFD_PREEMPT_TYPE_WAVEFRONT, 
QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS);
-   mqd-release_hqd(mqd);
+   retval = mqd-destroy_mqd(mqd, false, QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS, 
q-pipe, q-queue);
if (retval != 0)
goto

[PATCH 77/83] hsa/radeon: Add local memory to topology

2014-07-10 Thread Oded Gabbay
From: Alexey Skidanov alexey.skida...@amd.com

Signed-off-by: Alexey Skidanov alexey.skida...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_topology.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index 059b7db..d3aaad1 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -715,6 +715,8 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
 			sysfs_show_32bit_prop(buffer, "max_engine_clk_fcompute",
 					kfd2kgd->get_max_engine_clock_in_mhz(
 						dev->gpu->kgd));
+			sysfs_show_64bit_prop(buffer, "local_mem_size",
+					kfd2kgd->get_vmem_size(dev->gpu->kgd));
 			ret = sysfs_show_32bit_prop(buffer, "max_engine_clk_ccompute",
 					cpufreq_quick_get_max(0)/1000);
 		}
-- 
1.9.1



[PATCH 78/83] hsa/radeon: Don't verify cksum when parsing CRAT table

2014-07-10 Thread Oded Gabbay
This patch removes the checksum verification done when
parsing a CRAT table. The verification was both erroneous and
redundant, as it is done by another piece of kernel code.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_topology.c | 29 ++---
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index d3aaad1..b686b7e 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -38,21 +38,6 @@ static struct kfd_system_properties sys_props;
 
 static DECLARE_RWSEM(topology_lock);
 
-
-static uint8_t checksum_image(const void *buf, size_t len)
-{
-	uint8_t *p = (uint8_t *)buf;
-	uint8_t sum = 0;
-
-	if (!buf)
-		return 0;
-
-	while (len-- > 0)
-		sum += *p++;
-
-	return sum;
-	}
-
 struct kfd_dev *radeon_kfd_device_by_id(uint32_t gpu_id)
 {
 	struct kfd_topology_device *top_dev;
@@ -97,9 +82,9 @@ static int kfd_topology_get_crat_acpi(void *crat_image, size_t *size)
 	if (!size)
 		return -EINVAL;
 
-/*
+	/*
 	 * Fetch the CRAT table from ACPI
- */
+	 */
 	status = acpi_get_table(CRAT_SIGNATURE, 0, &crat_table);
 	if (status == AE_NOT_FOUND) {
 		pr_warn("CRAT table not found\n");
@@ -111,16 +96,6 @@ static int kfd_topology_get_crat_acpi(void *crat_image, size_t *size)
 		return -EINVAL;
 	}
 
-	/*
-	 * The checksum of the table should be verified
-	 */
-	if (checksum_image(crat_table, crat_table->length) ==
-		crat_table->checksum) {
-		pr_err("Bad checksum for the CRAT table\n");
-		return -EINVAL;
-}
-
-
 	if (*size >= crat_table->length && crat_image != 0)
 		memcpy(crat_image, crat_table, crat_table->length);
 
-- 
1.9.1
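
Editorial note on the patch above: the commit message calls the removed check erroneous. One way to see why, as a minimal userspace sketch rather than the kernel's own code: ACPI tables are conventionally valid when all of their bytes, the checksum field included, sum to zero modulo 256, whereas the removed code reported an error when the byte sum equalled the checksum field. The function names below are hypothetical and only illustrate the convention.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum every byte of the buffer modulo 256; a NULL buffer sums to 0. */
uint8_t table_byte_sum(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint8_t sum = 0;

	if (!buf)
		return 0;

	while (len-- > 0)
		sum = (uint8_t)(sum + *p++);

	return sum;
}

/*
 * ACPI convention: a table is valid when all of its bytes,
 * including the checksum byte itself, add up to zero modulo 256.
 */
bool table_checksum_ok(const void *table, size_t len)
{
	return table != NULL && len > 0 && table_byte_sum(table, len) == 0;
}
```

A producer picks the checksum byte so that the total wraps to zero; a consumer only has to sum the whole table and compare the result with zero.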



[PATCH 75/83] hsa/radeon: Fixing minor issues with kernel queues (DIQ)

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

* Re-execute the runlist on kernel queue destruction.
* Delete kernel queues from the pqm's queue list on pqm uninit.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c  | 4 
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 4c53e57..12b8b33 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -759,6 +759,10 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
 {
 	BUG_ON(!dqm || !kq);
 
+	pr_debug("kfd: In %s\n", __func__);
+
+	dqm->destroy_queues(dqm);
+
 	mutex_lock(&dqm->lock);
 	list_del(&kq->list);
 	dqm->queue_count--;
diff --git a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
index 89461ab..5d7c46d 100644
--- a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
@@ -273,10 +273,10 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 		if (retval != 0)
 			return retval;
 
-		list_del(&pqn->process_queue_list);
 		uninit_queue(pqn->q);
 	}
 
+	list_del(&pqn->process_queue_list);
 	kfree(pqn);
 	clear_bit(qid, pqm->queue_slot_bitmap);
 
-- 
1.9.1



[PATCH 71/83] hsa/radeon: Remove old scheduler code

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile   |   5 +-
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 987 --
 2 files changed, 2 insertions(+), 990 deletions(-)
 delete mode 100644 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 26ce0ae..b5f05b4 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -4,9 +4,8 @@
 
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
-   kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
-   kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
-   kfd_queue.o kfd_mqd_manager.o \
+   kfd_doorbell.o kfd_registers.o kfd_vidmem.o \
+   kfd_interrupt.o kfd_aperture.o kfd_queue.o kfd_mqd_manager.o \
kfd_kernel_queue.o kfd_packet_manager.o \
kfd_process_queue_manager.o kfd_device_queue_manager.o
 
diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
deleted file mode 100644
index d576d95..000
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ /dev/null
@@ -1,987 +0,0 @@
-/*
- * Copyright 2014 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- */
-
-#include <linux/log2.h>
-#include <linux/mutex.h>
-#include <linux/slab.h>
-#include <linux/types.h>
-#include <linux/uaccess.h>
-#include <linux/device.h>
-#include <linux/sched.h>
-#include "kfd_priv.h"
-#include "kfd_scheduler.h"
-#include "cik_regs.h"
-#include "cik_int.h"
-
-/* CIK CP hardware is arranged with 8 queues per pipe and 8 pipes per MEC (microengine for compute).
- * The first MEC is ME 1 with the GFX ME as ME 0.
- * We split the CP with the KGD, they take the first N pipes and we take the rest.
- */
-#define CIK_QUEUES_PER_PIPE 8
-#define CIK_PIPES_PER_MEC 4
-
-#define CIK_MAX_PIPES (2 * CIK_PIPES_PER_MEC)
-
-#define CIK_NUM_VMID 16
-
-#define CIK_HPD_SIZE_LOG2 11
-#define CIK_HPD_SIZE (1U << CIK_HPD_SIZE_LOG2)
-#define CIK_HPD_ALIGNMENT 256
-#define CIK_MQD_ALIGNMENT 4
-
-#pragma pack(push, 4)
-
-struct cik_hqd_registers {
-   u32 cp_mqd_base_addr;
-   u32 cp_mqd_base_addr_hi;
-   u32 cp_hqd_active;
-   u32 cp_hqd_vmid;
-   u32 cp_hqd_persistent_state;
-   u32 cp_hqd_pipe_priority;
-   u32 cp_hqd_queue_priority;
-   u32 cp_hqd_quantum;
-   u32 cp_hqd_pq_base;
-   u32 cp_hqd_pq_base_hi;
-   u32 cp_hqd_pq_rptr;
-   u32 cp_hqd_pq_rptr_report_addr;
-   u32 cp_hqd_pq_rptr_report_addr_hi;
-   u32 cp_hqd_pq_wptr_poll_addr;
-   u32 cp_hqd_pq_wptr_poll_addr_hi;
-   u32 cp_hqd_pq_doorbell_control;
-   u32 cp_hqd_pq_wptr;
-   u32 cp_hqd_pq_control;
-   u32 cp_hqd_ib_base_addr;
-   u32 cp_hqd_ib_base_addr_hi;
-   u32 cp_hqd_ib_rptr;
-   u32 cp_hqd_ib_control;
-   u32 cp_hqd_iq_timer;
-   u32 cp_hqd_iq_rptr;
-   u32 cp_hqd_dequeue_request;
-   u32 cp_hqd_dma_offload;
-   u32 cp_hqd_sema_cmd;
-   u32 cp_hqd_msg_type;
-   u32 cp_hqd_atomic0_preop_lo;
-   u32 cp_hqd_atomic0_preop_hi;
-   u32 cp_hqd_atomic1_preop_lo;
-   u32 cp_hqd_atomic1_preop_hi;
-   u32 cp_hqd_hq_scheduler0;
-   u32 cp_hqd_hq_scheduler1;
-   u32 cp_mqd_control;
-};
-
-struct cik_mqd {
-   u32 header;
-   u32 dispatch_initiator;
-   u32 dimensions[3];
-   u32 start_idx[3];
-   u32 num_threads[3];
-   u32 pipeline_stat_enable;
-   u32 perf_counter_enable;
-   u32 pgm[2];
-   u32 tba[2];
-   u32 tma[2];
-   u32 pgm_rsrc[2];
-   u32 vmid;
-   u32 resource_limits;
-   u32 static_thread_mgmt01[2];
-   u32 tmp_ring_size;
-   u32 static_thread_mgmt23[2];
-   u32 restart[3];
-   u32 thread_trace_enable;
-   u32 reserved1

[PATCH 65/83] hsa/radeon: fixing a bug to support 32b processes

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This commit fixes a bug in the support for 32-bit HSA processes.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/cik_regs.h | 1 +
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 8 +---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
index fa5ec01..a6404e3 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -45,6 +45,7 @@
 /* if PTR32, this is the upper limit of GPUVM */
 #define	SH_MEM_CONFIG				0x8C34
 #define		PTR32				(1 << 0)
+#define PRIVATE_ATC				(1 << 1)
 #define		ALIGNMENT_MODE(x)		((x) << 2)
 #define		SH_MEM_ALIGNMENT_MODE_DWORD	0
 #define		SH_MEM_ALIGNMENT_MODE_DWORD_STRICT	1
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 01573b1..3e1def1 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -90,15 +90,17 @@ static void init_process_memory(struct device_queue_manager *dqm, struct qcm_pro
 	if (qpd->pqm->process->is_32bit_user_mode) {
 		temp = get_sh_mem_bases_32(qpd->pqm->process, dqm->dev);
 		qpd->sh_mem_bases = SHARED_BASE(temp);
+		qpd->sh_mem_config = PTR32;
 	} else {
 		temp = get_sh_mem_bases_nybble_64(qpd->pqm->process, dqm->dev);
 		qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
+		qpd->sh_mem_config = 0;
 	}
 
-	qpd->sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
+	qpd->sh_mem_config |= ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
 	qpd->sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
 	qpd->sh_mem_ape1_limit = 0;
-	qpd->sh_mem_ape1_base = 1;
+	qpd->sh_mem_ape1_base = 0;
 
 	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
 		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
@@ -854,7 +856,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm)
 	}
 
 	if (dqm->queue_count <= 0 || dqm->processes_count <= 0)
-		return 0;
+		return 0;
 
 	mutex_lock(&dqm->lock);
 	if (dqm->active_runlist) {
-- 
1.9.1



[PATCH 72/83] hsa/radeon: Static analysis (smatch) fixes

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device.c   |  3 +++
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c |  2 +-
 drivers/gpu/hsa/radeon/kfd_mqd_manager.c  |  1 +
 drivers/gpu/hsa/radeon/kfd_packet_manager.c   |  3 ++-
 drivers/gpu/hsa/radeon/kfd_process.c  | 10 ++
 5 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index 9af812b..30558c9 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -88,6 +88,9 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev)
 		return NULL;
 
 	kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
+	if (!kfd)
+		return NULL;
+
 	kfd->kgd = kgd;
 	kfd->device_info = device_info;
 	kfd->pdev = pdev;
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 56875f9..4931f8a 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -317,7 +317,7 @@ static struct mqd_manager *get_mqd_manager_nocpsch(struct device_queue_manager *
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || type > KFD_MQD_TYPE_MAX);
+	BUG_ON(!dqm || type >= KFD_MQD_TYPE_MAX);
 
 	pr_debug("kfd: In func %s mqd type %d\n", __func__, type);
 
diff --git a/drivers/gpu/hsa/radeon/kfd_mqd_manager.c b/drivers/gpu/hsa/radeon/kfd_mqd_manager.c
index a3e9f7c..8c1192e 100644
--- a/drivers/gpu/hsa/radeon/kfd_mqd_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_mqd_manager.c
@@ -437,6 +437,7 @@ struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type, struct kfd_dev *dev
 		mqd->uninitialize = uninitialize;
 		break;
 	default:
+		kfree(mqd);
 		return NULL;
 		break;
 	}
diff --git a/drivers/gpu/hsa/radeon/kfd_packet_manager.c b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
index 621a720..5cd23b0 100644
--- a/drivers/gpu/hsa/radeon/kfd_packet_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
@@ -85,9 +85,10 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int **rl_b
 
 	BUG_ON(!pm);
 	BUG_ON(pm->allocated == true);
+	BUG_ON(is_over_subscription == NULL);
 
 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
-	if (is_over_subscription &&
+	if (*is_over_subscription &&
 			sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION)
 		return -EFAULT;
 
diff --git a/drivers/gpu/hsa/radeon/kfd_process.c b/drivers/gpu/hsa/radeon/kfd_process.c
index eb30cb3..aacc7ef 100644
--- a/drivers/gpu/hsa/radeon/kfd_process.c
+++ b/drivers/gpu/hsa/radeon/kfd_process.c
@@ -146,15 +146,15 @@ static struct kfd_process *create_process(const struct task_struct *thread)
 	process = kzalloc(sizeof(*process), GFP_KERNEL);
 
 	if (!process)
-		goto err_alloc;
+		goto err_alloc_process;
 
 	process->queues = kmalloc_array(INITIAL_QUEUE_ARRAY_SIZE, sizeof(process->queues[0]), GFP_KERNEL);
 	if (!process->queues)
-		goto err_alloc;
+		goto err_alloc_queues;
 
 	process->pasid = radeon_kfd_pasid_alloc();
 	if (process->pasid == 0)
-		goto err_alloc;
+		goto err_alloc_pasid;
 
 	mutex_init(&process->mutex);
 
@@ -178,9 +178,11 @@ err_process_pqm_init:
 	radeon_kfd_pasid_free(process->pasid);
 	list_del(&process->processes_list);
 	thread->mm->kfd_process = NULL;
-err_alloc:
+err_alloc_pasid:
 	kfree(process->queues);
+err_alloc_queues:
 	kfree(process);
+err_alloc_process:
 	return ERR_PTR(err);
 }
 
-- 
1.9.1
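
Editorial note on the kfd_process.c hunk above: with the single err_alloc label, a failed first allocation jumped to kfree(process->queues) while process was still NULL. The patch splits the label so each failure point only undoes what was actually set up. A minimal userspace sketch of the same unwinding pattern follows; struct demo_process, the fail_at parameter and the use of calloc/free in place of kzalloc/kfree are assumptions made purely for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver's process structure. */
struct demo_process {
	int *queues;
	int pasid;
};

/*
 * fail_at selects which step fails: 0 = none, 1 = process allocation,
 * 2 = queues allocation, 3 = pasid allocation.
 */
struct demo_process *demo_create_process(int fail_at)
{
	struct demo_process *process;

	process = (fail_at == 1) ? NULL : calloc(1, sizeof(*process));
	if (!process)
		goto err_alloc_process;

	process->queues = (fail_at == 2) ? NULL :
			calloc(16, sizeof(process->queues[0]));
	if (!process->queues)
		goto err_alloc_queues;

	process->pasid = (fail_at == 3) ? 0 : 1;
	if (process->pasid == 0)
		goto err_alloc_pasid;

	return process;

	/* Labels run in reverse order of setup; each one undoes one step. */
err_alloc_pasid:
	free(process->queues);
err_alloc_queues:
	free(process);
err_alloc_process:
	return NULL;
}

void demo_destroy_process(struct demo_process *process)
{
	if (!process)
		return;
	free(process->queues);
	free(process);
}
```

Because every label falls through to the ones below it, jumping to a later label skips the cleanup for steps that never ran, which is the property the original single label lacked.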



[PATCH 70/83] hsa/radeon: Fix compilation warnings

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 51f790f..09c9a61 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -148,21 +148,22 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a
 	q_properties.priority = args.queue_priority;
 	q_properties.queue_address = args.ring_base_address;
 	q_properties.queue_size = args.ring_size;
-	q_properties.read_ptr = args.read_pointer_address;
-	q_properties.write_ptr = args.write_pointer_address;
+	q_properties.read_ptr = (qptr_t *) args.read_pointer_address;
+	q_properties.write_ptr = (qptr_t *) args.write_pointer_address;
 
 
 	pr_debug("%s Arguments: Queue Percentage (%d, %d)\n"
 			"Queue Priority (%d, %d)\n"
 			"Queue Address (0x%llX, 0x%llX)\n"
-			"Queue Size (%llX, %u)\n",
-			"Queue r/w Pointers (%llX, %llX)\n",
+			"Queue Size (0x%llX, %u)\n"
+			"Queue r/w Pointers (0x%llX, 0x%llX)\n",
 			__func__,
 			q_properties.queue_percent, args.queue_percentage,
 			q_properties.priority, args.queue_priority,
 			q_properties.queue_address, args.ring_base_address,
 			q_properties.queue_size, args.ring_size,
-			q_properties.read_ptr, q_properties.write_ptr);
+			(uint64_t) q_properties.read_ptr,
+			(uint64_t) q_properties.write_ptr);
 
 	dev = radeon_kfd_device_by_id(args.gpu_id);
 	if (dev == NULL)
-- 
1.9.1
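
Editorial note on the patch above: the warnings come from moving values between a 64-bit integer field and a pointer, and from printing pointers with an integer format. A minimal userspace sketch of one portable way to do both conversions is shown below. It goes through uintptr_t, which the patch itself does not do, and the qptr_t typedef and helper names are assumptions for illustration only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t qptr_t; /* assumed element type, for illustration */

/* Convert a 64-bit address carried in an ioctl argument to a pointer. */
qptr_t *user_addr_to_qptr(uint64_t addr)
{
	return (qptr_t *)(uintptr_t)addr;
}

/* Convert a pointer back to a 64-bit integer, e.g. for logging. */
uint64_t qptr_to_user_addr(const qptr_t *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

/* Format both pointers with a format specifier that matches uint64_t. */
int format_rw_pointers(char *buf, size_t len,
		       const qptr_t *rptr, const qptr_t *wptr)
{
	return snprintf(buf, len,
			"Queue r/w Pointers (0x%" PRIX64 ", 0x%" PRIX64 ")",
			qptr_to_user_addr(rptr), qptr_to_user_addr(wptr));
}
```

Casting through uintptr_t keeps the conversion warning-free even where pointers are narrower than 64 bits.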



[PATCH 69/83] hsa/radeon: Fix initialization of sh_mem registers

2014-07-10 Thread Oded Gabbay
This patch fixes a bug in the code flow that caused the sh_mem registers to
be overridden.
Because of the bug, the sh_mem registers were not initialized properly and
the sh_mem registers of vmid 0 (the vmid used by non-HSA processes) were
overwritten.

Reviewed-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 48 ---
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 5ec8da7..56875f9 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -87,21 +87,25 @@ static void init_process_memory(struct device_queue_manager *dqm, struct qcm_pro
 	unsigned int temp;
 	BUG_ON(!dqm || !qpd);
 
+	/* check if sh_mem_config register already configured */
+	if (qpd->sh_mem_config == 0) {
+		qpd->sh_mem_config =
+			ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED) |
+			DEFAULT_MTYPE(MTYPE_NONCACHED) |
+			APE1_MTYPE(MTYPE_NONCACHED);
+		qpd->sh_mem_ape1_limit = 0;
+		qpd->sh_mem_ape1_base = 0;
+	}
+
 	if (qpd->pqm->process->is_32bit_user_mode) {
 		temp = get_sh_mem_bases_32(qpd->pqm->process, dqm->dev);
 		qpd->sh_mem_bases = SHARED_BASE(temp);
-		qpd->sh_mem_config = PTR32;
+		qpd->sh_mem_config |= PTR32;
 	} else {
 		temp = get_sh_mem_bases_nybble_64(qpd->pqm->process, dqm->dev);
 		qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
-		qpd->sh_mem_config = 0;
 	}
 
-	qpd->sh_mem_config |= ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
-	qpd->sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
-	qpd->sh_mem_ape1_limit = 0;
-	qpd->sh_mem_ape1_base = 0;
-
 	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
 		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
 }
@@ -110,6 +114,8 @@ static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm
 {
 	struct mqd_manager *mqd;
 
+	BUG_ON(qpd->vmid < KFD_VMID_START_OFFSET);
+
 	mqd = dqm->get_mqd_manager(dqm, KFD_MQD_TYPE_CIK_COMPUTE);
 	if (mqd == NULL)
 		return;
@@ -139,12 +145,6 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *
 	print_queue(q);
 
 	mutex_lock(&dqm->lock);
-	/* later memory apertures should be initialized in lazy mode */
-	if (!is_mem_initialized)
-		if (init_memory(dqm) != 0) {
-			retval = -ENODATA;
-			goto init_memory_failed;
-		}
 
 	if (dqm->vmid_bitmap == 0 && qpd->vmid == 0) {
 		retval = -ENOMEM;
@@ -217,7 +217,6 @@ no_hqd:
 		*allocate_vmid = qpd->vmid = q->properties.vmid = 0;
 	}
 no_vmid:
-init_memory_failed:
 	mutex_unlock(&dqm->lock);
 	return retval;
 }
@@ -951,20 +950,25 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 		qpd->sh_mem_ape1_limit = limit >> 16;
 	}
 
-	default_mtype = (default_policy == cache_policy_coherent) ? MTYPE_NONCACHED : MTYPE_CACHED;
-	ape1_mtype = (alternate_policy == cache_policy_coherent) ? MTYPE_NONCACHED : MTYPE_CACHED;
+	default_mtype = (default_policy == cache_policy_coherent) ?
+			MTYPE_NONCACHED :
+			MTYPE_CACHED;
+
+	ape1_mtype = (alternate_policy == cache_policy_coherent) ?
+			MTYPE_NONCACHED :
+			MTYPE_CACHED;
 
-	qpd->sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED)
+	qpd->sh_mem_config = (qpd->sh_mem_config & PTR32)
+			| ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED)
 			| DEFAULT_MTYPE(default_mtype)
 			| APE1_MTYPE(ape1_mtype);
 
-
-	if (sched_policy == KFD_SCHED_POLICY_NO_HWS)
+	if ((sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
 		program_sh_mem_settings(dqm, qpd);
 
-
-	pr_debug("kfd: sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n", qpd->sh_mem_config,
-		 qpd->sh_mem_ape1_base, qpd->sh_mem_ape1_limit);
+	pr_debug("kfd: sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
+		qpd->sh_mem_config, qpd->sh_mem_ape1_base,
+		qpd->sh_mem_ape1_limit);
 
 	mutex_unlock(&dqm->lock);
 	return true;
-- 
1.9.1
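
Editorial note on the set_cache_memory_policy hunk above: the new expression rebuilds sh_mem_config from scratch but carries the PTR32 bit over from the old value, so changing the cache policy no longer turns a 32-bit process into a 64-bit one. A minimal standalone sketch of that masking follows. PTR32 and ALIGNMENT_MODE mirror the definitions visible in cik_regs.h earlier in this series; the MTYPE shifts and numeric values are assumptions for illustration, and rebuild_sh_mem_config is a hypothetical helper, not a driver function.

```c
#include <stdint.h>

#define PTR32                            (1u << 0)
#define ALIGNMENT_MODE(x)                ((uint32_t)(x) << 2)
#define SH_MEM_ALIGNMENT_MODE_UNALIGNED  3u                     /* assumed value */
#define DEFAULT_MTYPE(x)                 ((uint32_t)(x) << 4)   /* assumed shift */
#define APE1_MTYPE(x)                    ((uint32_t)(x) << 7)   /* assumed shift */
#define MTYPE_CACHED                     0u                     /* assumed value */
#define MTYPE_NONCACHED                  3u                     /* assumed value */

/*
 * Rebuild the config word for a new cache policy while preserving only
 * the PTR32 bit of the previous value; every other old bit is dropped.
 */
uint32_t rebuild_sh_mem_config(uint32_t old_config,
			       uint32_t default_mtype, uint32_t ape1_mtype)
{
	return (old_config & PTR32)
		| ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED)
		| DEFAULT_MTYPE(default_mtype)
		| APE1_MTYPE(ape1_mtype);
}
```

Masking with PTR32 before OR-ing in the new fields is what keeps stale MTYPE bits from an earlier policy out of the result.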



[PATCH 66/83] hsa/radeon: Fix number of pipes per ME

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 2 +-
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.h | 2 +-
 drivers/gpu/hsa/radeon/kfd_packet_manager.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 3e1def1..5ec8da7 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -55,7 +55,7 @@ static inline unsigned int get_first_pipe(struct device_queue_manager *dqm)
 
 static inline unsigned int get_pipes_num_cpsch(void)
 {
-	return PIPE_PER_ME_CP_SCHEDULING - 1;
+	return PIPE_PER_ME_CP_SCHEDULING;
 }
 
 static unsigned int get_sh_mem_bases_nybble_64(struct kfd_process *process, struct kfd_dev *dev)
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
index 57dc636..037eaf8 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
@@ -31,7 +31,7 @@
 
 #define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS	(500)
 #define QUEUES_PER_PIPE				(8)
-#define PIPE_PER_ME_CP_SCHEDULING		(4)
+#define PIPE_PER_ME_CP_SCHEDULING		(3)
 #define CIK_VMID_NUM				(8)
 #define KFD_VMID_START_OFFSET			(8)
 #define VMID_PER_DEVICE				CIK_VMID_NUM
diff --git a/drivers/gpu/hsa/radeon/kfd_packet_manager.c b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
index 3fc8c34..621a720 100644
--- a/drivers/gpu/hsa/radeon/kfd_packet_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_packet_manager.c
@@ -62,7 +62,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm, unsigned int *rlib_size
 	/* check if there is over subscription*/
 	*over_subscription = false;
 	if ((process_count >= VMID_PER_DEVICE) ||
-			queue_count >= PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE) {
+			queue_count > PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE) {
 		*over_subscription = true;
 		pr_debug("kfd: over subscribed runlist\n");
 	}
-- 
1.9.1
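
Editorial note on the patch above: with 3 pipes of 8 queues each, exactly 24 queues still fit, so the queue comparison becomes strict. A minimal standalone sketch of the resulting check is given below. The constants are the post-patch values from the header hunk, the comparison operators are this editor's reading of the archived diff, and runlist_is_over_subscribed is a hypothetical helper name.

```c
#include <stdbool.h>

#define QUEUES_PER_PIPE            8u
#define PIPE_PER_ME_CP_SCHEDULING  3u
#define VMID_PER_DEVICE            8u

/*
 * A runlist is over-subscribed when there are too many processes for the
 * available VMIDs, or more queues than the scheduling pipes can hold.
 */
bool runlist_is_over_subscribed(unsigned int process_count,
				unsigned int queue_count)
{
	return process_count >= VMID_PER_DEVICE ||
	       queue_count > PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE;
}
```

With these values, 24 queues are accepted and the 25th triggers over-subscription.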

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 67/83] hsa/radeon: Removing hw pointer store module

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This module is unnecessary because the read/write pointers are allocated
by the userspace thunk layer.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile|   2 +-
 drivers/gpu/hsa/radeon/kfd_chardev.c   |  22 +--
 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c  | 149 -
 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h  |  64 -
 drivers/gpu/hsa/radeon/kfd_priv.h  |  10 +-
 drivers/gpu/hsa/radeon/kfd_process.c   |   1 -
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c |  62 ++---
 7 files changed, 23 insertions(+), 287 deletions(-)
 delete mode 100644 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
 delete mode 100644 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 3409203..26ce0ae 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -6,7 +6,7 @@ radeon_kfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o \
 		kfd_pasid.o kfd_topology.o kfd_process.o \
 		kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
 		kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
-		kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \
+		kfd_queue.o kfd_mqd_manager.o \
 		kfd_kernel_queue.o kfd_packet_manager.o \
 		kfd_process_queue_manager.o kfd_device_queue_manager.o
 
diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index b39df68..51f790f 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -32,9 +32,9 @@
 #include <linux/time.h>
 #include "kfd_priv.h"
 #include <linux/mm.h>
+#include <linux/uaccess.h>
 #include <uapi/asm-generic/mman-common.h>
 #include <asm/processor.h>
-#include "kfd_hw_pointer_store.h"
 #include "kfd_device_queue_manager.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
@@ -137,24 +137,32 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a
 	if (copy_from_user(&args, arg, sizeof(args)))
 		return -EFAULT;
 
-	/* need to validate parameters */
+	if (!access_ok(VERIFY_WRITE, args.read_pointer_address, sizeof(qptr_t)))
+		return -EFAULT;
+
+	if (!access_ok(VERIFY_WRITE, args.write_pointer_address, sizeof(qptr_t)))
+		return -EFAULT;
 
 	q_properties.is_interop = false;
 	q_properties.queue_percent = args.queue_percentage;
 	q_properties.priority = args.queue_priority;
 	q_properties.queue_address = args.ring_base_address;
 	q_properties.queue_size = args.ring_size;
+	q_properties.read_ptr = args.read_pointer_address;
+	q_properties.write_ptr = args.write_pointer_address;
 
 
 	pr_debug("%s Arguments: Queue Percentage (%d, %d)\n"
 			"Queue Priority (%d, %d)\n"
 			"Queue Address (0x%llX, 0x%llX)\n"
 			"Queue Size (%llX, %u)\n",
+			"Queue r/w Pointers (%llX, %llX)\n",
 			__func__,
 			q_properties.queue_percent, args.queue_percentage,
 			q_properties.priority, args.queue_priority,
 			q_properties.queue_address, args.ring_base_address,
-			q_properties.queue_size, args.ring_size);
+			q_properties.queue_size, args.ring_size,
+			q_properties.read_ptr, q_properties.write_ptr);
 
 	dev = radeon_kfd_device_by_id(args.gpu_id);
 	if (dev == NULL)
@@ -177,8 +185,6 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a
 		goto err_create_queue;
 
 	args.queue_id = queue_id;
-	args.read_pointer_address = (uint64_t)q_properties.read_ptr;
-	args.write_pointer_address = (uint64_t)q_properties.write_ptr;
 	args.doorbell_address = (uint64_t)q_properties.doorbell_ptr;
 
 	if (copy_to_user(arg, &args, sizeof(args))) {
@@ -515,11 +521,5 @@ kfd_mmap(struct file *filp, struct vm_area_struct *vma)
 	if (pgoff >= KFD_MMAP_DOORBELL_START && pgoff < KFD_MMAP_DOORBELL_END)
 		return radeon_kfd_doorbell_mmap(process, vma);
 
-	if (pgoff >= KFD_MMAP_RPTR_START && pgoff < KFD_MMAP_RPTR_END)
-		return radeon_kfd_hw_pointer_store_mmap(&process->read_ptr, vma);
-
-	if (pgoff >= KFD_MMAP_WPTR_START && pgoff < KFD_MMAP_WPTR_END)
-		return radeon_kfd_hw_pointer_store_mmap(&process->write_ptr, vma);
-
 	return -EINVAL;
 }
diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
deleted file mode 100644
index 4e71f7d..000
--- a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
+++ /dev/null
@@ -1,149 +0,0 @@
-/*
- * Copyright 2014

[PATCH 63/83] hsa/radeon: Update module information and version

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_module.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index 85069c5..fbfcce6 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -27,11 +27,13 @@
 #include <linux/device.h>
 #include "kfd_priv.h"
 
-#define DRIVER_AUTHOR		"Andrew Lewycky, Oded Gabbay, Evgeny Pinchuk, others."
+#define KFD_DRIVER_AUTHOR	"AMD Inc. and others"
 
-#define DRIVER_NAME		"kfd"
-#define DRIVER_DESC		"AMD HSA Kernel Fusion Driver"
-#define DRIVER_DATE		"20140127"
+#define KFD_DRIVER_DESC		"Standalone HSA driver for AMD's GPUs"
+#define KFD_DRIVER_DATE		"20140424"
+#define KFD_DRIVER_MAJOR	0
+#define KFD_DRIVER_MINOR	5
+#define KFD_DRIVER_PATCHLEVEL	0
 
 const struct kfd2kgd_calls *kfd2kgd;
 static const struct kgd2kfd_calls kgd2kfd = {
@@ -120,6 +122,9 @@ static void __exit kfd_module_exit(void)
 module_init(kfd_module_init);
 module_exit(kfd_module_exit);
 
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
-MODULE_LICENSE("GPL");
+MODULE_AUTHOR(KFD_DRIVER_AUTHOR);
+MODULE_DESCRIPTION(KFD_DRIVER_DESC);
+MODULE_LICENSE("GPL and additional rights");
+MODULE_VERSION(__stringify(KFD_DRIVER_MAJOR) "."
+	       __stringify(KFD_DRIVER_MINOR) "."
+	       __stringify(KFD_DRIVER_PATCHLEVEL));
-- 
1.9.1
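
Editorial note on the MODULE_VERSION lines above: the version string is assembled at compile time by stringifying the three numeric macros and relying on adjacent string literals being concatenated. A minimal userspace sketch of the same trick follows; DEMO_STRINGIFY is a hypothetical stand-in for the kernel's __stringify, and demo_kfd_version is an illustrative name.

```c
/* Two-level macro so that the argument is expanded before being stringified. */
#define DEMO_STRINGIFY_1(x) #x
#define DEMO_STRINGIFY(x)   DEMO_STRINGIFY_1(x)

#define KFD_DRIVER_MAJOR      0
#define KFD_DRIVER_MINOR      5
#define KFD_DRIVER_PATCHLEVEL 0

/* Adjacent string literals are joined by the compiler. */
const char demo_kfd_version[] =
	DEMO_STRINGIFY(KFD_DRIVER_MAJOR) "."
	DEMO_STRINGIFY(KFD_DRIVER_MINOR) "."
	DEMO_STRINGIFY(KFD_DRIVER_PATCHLEVEL);
```

The indirection through DEMO_STRINGIFY_1 matters: a single-level macro would stringify the macro name itself instead of its value.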



[PATCH 64/83] hsa/radeon: update queue fault handling

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This commit adds fault handling to the update queue function of the process queue manager.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c 
b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
index fe74dd7..2034d2b 100644
--- a/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_process_queue_manager.c
@@ -334,6 +334,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, 
unsigned int qid)
 
 int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, struct queue_properties *p)
 {
+   int retval;
    struct process_queue_node *pqn;
 
    BUG_ON(!pqm);
@@ -346,9 +347,17 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, struct
    pqn->q->properties.queue_percent = p->queue_percent;
    pqn->q->properties.priority = p->priority;
 
-   pqn->q->device->dqm->destroy_queues(pqn->q->device->dqm);
-   pqn->q->device->dqm->update_queue(pqn->q->device->dqm, pqn->q);
-   pqn->q->device->dqm->execute_queues(pqn->q->device->dqm);
+   retval = pqn->q->device->dqm->destroy_queues(pqn->q->device->dqm);
+   if (retval != 0)
+       return retval;
+
+   retval = pqn->q->device->dqm->update_queue(pqn->q->device->dqm, pqn->q);
+   if (retval != 0)
+       return retval;
+
+   retval = pqn->q->device->dqm->execute_queues(pqn->q->device->dqm);
+   if (retval != 0)
+       return retval;
 
return 0;
 }
-- 
1.9.1



[PATCH 60/83] hsa/radeon: Rearrange structures in kfd_ioctl.h

2014-07-10 Thread Oded Gabbay
This patch rearranges the structures defined in kfd_ioctl.h so that all
uint64_t members are placed at the start of each structure, followed by all
uint32_t members.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
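For illustration only (not part of the patch): a small sketch of what the
reordering does to member offsets, using one of the two-member argument
structures as a model. The struct names pmc_args_before and pmc_args_after are
hypothetical. The commit message does not state the motivation; one plausible
effect, shown here as an assumption, is that member offsets stop depending on
how much padding an ABI inserts in front of a uint64_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the old layout: 32-bit member first. */
struct pmc_args_before {
	uint32_t gpu_id;
	uint64_t trace_id;	/* offset depends on the ABI's uint64_t alignment */
};

/* Hypothetical copy of the new layout: 64-bit member first. */
struct pmc_args_after {
	uint64_t trace_id;	/* at offset 0 */
	uint32_t gpu_id;	/* at offset 8, no padding needed in front */
};

/* Offset of trace_id in the old layout: 8 where uint64_t is 8-byte aligned,
 * 4 where it is only 4-byte aligned inside structures. */
static size_t before_trace_id_offset(void)
{
	return offsetof(struct pmc_args_before, trace_id);
}

/* Offset of gpu_id in the new layout. */
static size_t after_gpu_id_offset(void)
{
	return offsetof(struct pmc_args_after, gpu_id);
}
```

Note that sizeof() of the new layout can still differ between ABIs because of
tail padding; only the member offsets become independent of alignment.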
 include/uapi/linux/kfd_ioctl.h | 51 ++
 1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 509c4a0..3cedd1a 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -42,15 +42,15 @@ struct kfd_ioctl_get_version_args {
 
 struct kfd_ioctl_create_queue_args {
uint64_t ring_base_address; /* to KFD */
+   uint64_t write_pointer_address; /* from KFD */
+   uint64_t read_pointer_address;  /* from KFD */
+   uint64_t doorbell_address;  /* from KFD */
+
uint32_t ring_size; /* to KFD */
uint32_t gpu_id;/* to KFD */
uint32_t queue_type;/* to KFD */
uint32_t queue_percentage;  /* to KFD */
uint32_t queue_priority;/* to KFD */
-
-   uint64_t write_pointer_address; /* from KFD */
-   uint64_t read_pointer_address;  /* from KFD */
-   uint64_t doorbell_address;  /* from KFD */
uint32_t queue_id;  /* from KFD */
 };
 
@@ -59,8 +59,9 @@ struct kfd_ioctl_destroy_queue_args {
 };
 
 struct kfd_ioctl_update_queue_args {
-   uint32_t queue_id;  /* to KFD */
uint64_t ring_base_address; /* to KFD */
+
+   uint32_t queue_id;  /* to KFD */
uint32_t ring_size; /* to KFD */
uint32_t queue_percentage;  /* to KFD */
uint32_t queue_priority;/* to KFD */
@@ -71,31 +72,33 @@ struct kfd_ioctl_update_queue_args {
 #define KFD_IOC_CACHE_POLICY_NONCOHERENT 1
 
 struct kfd_ioctl_set_memory_policy_args {
+   uint64_t alternate_aperture_base;   /* to KFD */
+   uint64_t alternate_aperture_size;   /* to KFD */
+
uint32_t gpu_id;/* to KFD */
uint32_t default_policy;/* to KFD */
uint32_t alternate_policy;  /* to KFD */
-   uint64_t alternate_aperture_base;   /* to KFD */
-   uint64_t alternate_aperture_size;   /* to KFD */
 };
 
 struct kfd_ioctl_get_clock_counters_args {
-   uint32_t gpu_id;/* to KFD */
uint64_t gpu_clock_counter; /* from KFD */
uint64_t cpu_clock_counter; /* from KFD */
uint64_t system_clock_counter;  /* from KFD */
uint64_t system_clock_freq; /* from KFD */
+
+   uint32_t gpu_id;/* to KFD */
 };
 
 #define NUM_OF_SUPPORTED_GPUS 7
 
 struct kfd_process_device_apertures {
-   uint64_t lds_base;/* from KFD */
-   uint64_t lds_limit;/* from KFD */
-   uint64_t scratch_base;/* from KFD */
-   uint64_t scratch_limit;/* from KFD */
-   uint64_t gpuvm_base;/* from KFD */
-   uint64_t gpuvm_limit;/* from KFD */
-   uint32_t gpu_id;/* from KFD */
+   uint64_t lds_base;  /* from KFD */
+   uint64_t lds_limit; /* from KFD */
+   uint64_t scratch_base;  /* from KFD */
+   uint64_t scratch_limit; /* from KFD */
+   uint64_t gpuvm_base;/* from KFD */
+   uint64_t gpuvm_limit;   /* from KFD */
+   uint32_t gpu_id;/* from KFD */
 };
 
 struct kfd_ioctl_get_process_apertures_args {
@@ -104,24 +107,24 @@ struct kfd_ioctl_get_process_apertures_args {
 };
 
 struct kfd_ioctl_pmc_acquire_access_args {
-   uint32_t gpu_id;/* to KFD */
-   uint64_t trace_id;  /* to KFD */
+   uint64_t trace_id;  /* to KFD */
+   uint32_t gpu_id;/* to KFD */
 };
 
 struct kfd_ioctl_pmc_release_access_args {
-   uint32_t gpu_id;/* to KFD */
-   uint64_t trace_id;  /* to KFD */
+   uint64_t trace_id;  /* to KFD */
+   uint32_t gpu_id;/* to KFD */
 };
 
 #define KFD_IOC_MAGIC 'K'
 
-#define KFD_IOC_GET_VERSION    _IOR(KFD_IOC_MAGIC, 1, struct kfd_ioctl_get_version_args)
-#define KFD_IOC_CREATE_QUEUE   _IOWR(KFD_IOC_MAGIC, 2, struct kfd_ioctl_create_queue_args)
-#define KFD_IOC_DESTROY_QUEUE  _IOWR(KFD_IOC_MAGIC, 3, struct kfd_ioctl_destroy_queue_args)
+#define KFD_IOC_GET_VERSION    _IOR(KFD_IOC_MAGIC, 1, struct kfd_ioctl_get_version_args)
+#define KFD_IOC_CREATE_QUEUE   _IOWR(KFD_IOC_MAGIC, 2, struct kfd_ioctl_create_queue_args)
+#define KFD_IOC_DESTROY_QUEUE  _IOWR(KFD_IOC_MAGIC, 3, struct kfd_ioctl_destroy_queue_args)
 #define KFD_IOC_SET_MEMORY_POLICY  _IOW(KFD_IOC_MAGIC, 4, struct kfd_ioctl_set_memory_policy_args)
 #define KFD_IOC_GET_CLOCK_COUNTERS _IOWR(KFD_IOC_MAGIC, 5, struct kfd_ioctl_get_clock_counters_args)
-#define KFD_IOC_GET_PROCESS_APERTURES _IOR(KFD_IOC_MAGIC

[PATCH 61/83] hsa/radeon: change another pr_info to pr_debug

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_topology.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index 213ae7b..059b7db 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -1121,7 +1121,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
 
gpu_id = kfd_generate_gpu_id(gpu);
 
-   pr_info("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
+   pr_debug("kfd: Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
 
    down_write(&topology_lock);
/*
-- 
1.9.1



[PATCH 55/83] hsa/radeon: Add IOCTL for update queue

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This patch adds a new IOCTL that lets the user update the properties of an
existing HSA queue.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
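For illustration only (not part of the patch): a user-space sketch of the
translation step the new handler performs, copying the ioctl arguments into a
queue properties structure before the update call. Both struct definitions
below are hypothetical, reduced mock-ups; in particular the member types of
mock_queue_properties are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Mock of the argument members the handler reads. */
struct mock_update_queue_args {
	uint64_t ring_base_address;
	uint32_t queue_id;
	uint32_t ring_size;
	uint32_t queue_percentage;
	uint32_t queue_priority;
};

/* Hypothetical subset of struct queue_properties. */
struct mock_queue_properties {
	uint64_t queue_address;
	uint64_t queue_size;
	uint32_t queue_percent;
	uint32_t priority;
};

/* Copy the four updatable values; queue_id is passed separately by the caller. */
static void args_to_properties(const struct mock_update_queue_args *args,
			       struct mock_queue_properties *props)
{
	props->queue_address = args->ring_base_address;
	props->queue_size = args->ring_size;
	props->queue_percent = args->queue_percentage;
	props->priority = args->queue_priority;
}
```

In the handler itself this copy happens after copy_from_user() succeeds and
before the process mutex is taken.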
 drivers/gpu/hsa/radeon/cik_mqds.h  |  1 -
 drivers/gpu/hsa/radeon/kfd_chardev.c   | 29 ++
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c  |  1 -
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.h  |  1 -
 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c  |  1 -
 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h  |  1 -
 drivers/gpu/hsa/radeon/kfd_kernel_queue.c  |  1 -
 drivers/gpu/hsa/radeon/kfd_kernel_queue.h  |  1 -
 drivers/gpu/hsa/radeon/kfd_mqd_manager.c   |  1 -
 drivers/gpu/hsa/radeon/kfd_mqd_manager.h   |  1 -
 drivers/gpu/hsa/radeon/kfd_packet_manager.c| 23 ++---
 drivers/gpu/hsa/radeon/kfd_process_queue_manager.c |  1 -
 drivers/gpu/hsa/radeon/kfd_queue.c |  1 -
 include/uapi/linux/kfd_ioctl.h |  9 +++
 14 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_mqds.h b/drivers/gpu/hsa/radeon/cik_mqds.h
index 58945c8..35a35b4 100644
--- a/drivers/gpu/hsa/radeon/cik_mqds.h
+++ b/drivers/gpu/hsa/radeon/cik_mqds.h
@@ -19,7 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  *
- * Author: Ben Goz
  */
 
 #ifndef CIK_MQDS_H_
diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index bb2ef02..9a77332 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -230,6 +230,31 @@ kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p, void __user *a
    return retval;
 }
 
+static int
+kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p, void __user *arg)
+{
+   int retval;
+   struct kfd_ioctl_update_queue_args args;
+   struct queue_properties properties;
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+       return -EFAULT;
+
+   properties.queue_address = args.ring_base_address;
+   properties.queue_size = args.ring_size;
+   properties.queue_percent = args.queue_percentage;
+   properties.priority = args.queue_priority;
+
+   pr_debug("kfd: updating queue id %d for PASID %d\n", args.queue_id, p->pasid);
+
+   mutex_lock(&p->mutex);
+
+   retval = pqm_update_queue(&p->pqm, args.queue_id, &properties);
+
+   mutex_unlock(&p->mutex);
+
+   return retval;
+}
 
 static long
 kfd_ioctl_set_memory_policy(struct file *filep, struct kfd_process *p, void __user *arg)
@@ -398,6 +423,10 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
        err = kfd_ioctl_get_process_apertures(filep, process, (void __user *)arg);
        break;
 
+   case KFD_IOC_UPDATE_QUEUE:
+       err = kfd_ioctl_update_queue(filep, process, (void __user *)arg);
+       break;
+
+
    default:
        dev_err(kfd_device,
            "unknown ioctl cmd 0x%x, arg 0x%lx)\n",
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index 9e21074..c2d91c9 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -19,7 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  *
- * Author: Ben Goz
  */
 
 #include <linux/slab.h>
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
index 0529a96..fe9ef10 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
@@ -19,7 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  *
- * Author: Ben Goz
  */
 
 #ifndef DEVICE_QUEUE_MANAGER_H_
diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
index 1372fb2..4e71f7d 100644
--- a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
+++ b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
@@ -19,7 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  *
- * Author: Ben Goz
  */
 
 #include <linux/types.h>
diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h
index be1d6cb..f384b7f 100644
--- a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h
+++ b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h
@@ -19,7 +19,6 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  *
- * Author: Ben Goz
  */
 
 #ifndef HW_POINTER_STORE_H_
diff --git a/drivers/gpu/hsa/radeon/kfd_kernel_queue.c 
b/drivers/gpu/hsa

[PATCH 58/83] hsa/radeon: Various kernel styling fixes

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.h |  6 +++---
 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h |  6 +++---
 drivers/gpu/hsa/radeon/kfd_kernel_queue.h |  6 +++---
 drivers/gpu/hsa/radeon/kfd_module.c   |  8 
 drivers/gpu/hsa/radeon/kfd_mqd_manager.h  |  6 +++---
 drivers/gpu/hsa/radeon/kfd_pm4_headers.h  | 11 ++-
 drivers/gpu/hsa/radeon/kfd_pm4_opcodes.h  |  6 +++---
 7 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
index fe9ef10..57dc636 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
@@ -21,8 +21,8 @@
  *
  */
 
-#ifndef DEVICE_QUEUE_MANAGER_H_
-#define DEVICE_QUEUE_MANAGER_H_
+#ifndef KFD_DEVICE_QUEUE_MANAGER_H_
+#define KFD_DEVICE_QUEUE_MANAGER_H_
 
 #include <linux/rwsem.h>
 #include <linux/list.h>
@@ -98,4 +98,4 @@ struct device_queue_manager {
 
 
 
-#endif /* DEVICE_QUEUE_MANAGER_H_ */
+#endif /* KFD_DEVICE_QUEUE_MANAGER_H_ */
diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h
index f384b7f..642703f 100644
--- a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h
+++ b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h
@@ -21,8 +21,8 @@
  *
  */
 
-#ifndef HW_POINTER_STORE_H_
-#define HW_POINTER_STORE_H_
+#ifndef KFD_HW_POINTER_STORE_H_
+#define KFD_HW_POINTER_STORE_H_
 
 #include <linux/mutex.h>
 
@@ -61,4 +61,4 @@ radeon_kfd_hw_pointer_store_mmap(struct hw_pointer_store_properties *ptr,
        struct vm_area_struct *vma);
 
 
-#endif /* HW_POINTER_STORE_H_ */
+#endif /* KFD_HW_POINTER_STORE_H_ */
diff --git a/drivers/gpu/hsa/radeon/kfd_kernel_queue.h b/drivers/gpu/hsa/radeon/kfd_kernel_queue.h
index 963e861..abfb9c8 100644
--- a/drivers/gpu/hsa/radeon/kfd_kernel_queue.h
+++ b/drivers/gpu/hsa/radeon/kfd_kernel_queue.h
@@ -21,8 +21,8 @@
  *
  */
 
-#ifndef KERNEL_QUEUE_H_
-#define KERNEL_QUEUE_H_
+#ifndef KFD_KERNEL_QUEUE_H_
+#define KFD_KERNEL_QUEUE_H_
 
 #include <linux/list.h>
 #include <linux/types.h>
@@ -63,4 +63,4 @@ struct kernel_queue {
    struct list_head    list;
 };
 
-#endif /* KERNEL_QUEUE_H_ */
+#endif /* KFD_KERNEL_QUEUE_H_ */
diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index e8bb67c..85069c5 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -24,7 +24,7 @@
 #include <linux/sched.h>
 #include <linux/notifier.h>
 #include <linux/moduleparam.h>
-
+#include <linux/device.h>
 #include "kfd_priv.h"
 
 #define DRIVER_AUTHOR  "Andrew Lewycky, Oded Gabbay, Evgeny Pinchuk, others."
@@ -46,7 +46,7 @@ static const struct kgd2kfd_calls kgd2kfd = {
 
 int sched_policy = KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION;
 module_param(sched_policy, int, S_IRUSR | S_IWUSR);
-MODULE_PARM_DESC(sched_policy, "Kernel comline parameter define the kfd scheduling policy");
+MODULE_PARM_DESC(sched_policy, "Kernel cmdline parameter define the kfd scheduling policy");
 
 bool kgd2kfd_init(unsigned interface_version,
  const struct kfd2kgd_calls *f2g,
@@ -95,7 +95,7 @@ static int __init kfd_module_init(void)
    if (err < 0)
        goto err_topology;
 
-   pr_info("[hsa] Initialized kfd module");
+   dev_info(kfd_device, "Initialized module\n");
 
return 0;
 err_topology:
@@ -114,7 +114,7 @@ static void __exit kfd_module_exit(void)
    mmput_unregister_notifier(&kfd_mmput_nb);
    radeon_kfd_chardev_exit();
    radeon_kfd_pasid_exit();
-   pr_info("[hsa] Removed kfd module");
+   dev_info(kfd_device, "Removed module\n");
 }
 
 module_init(kfd_module_init);
diff --git a/drivers/gpu/hsa/radeon/kfd_mqd_manager.h b/drivers/gpu/hsa/radeon/kfd_mqd_manager.h
index 8e7a5fd..314d490 100644
--- a/drivers/gpu/hsa/radeon/kfd_mqd_manager.h
+++ b/drivers/gpu/hsa/radeon/kfd_mqd_manager.h
@@ -21,8 +21,8 @@
  *
  */
 
-#ifndef MQD_MANAGER_H_
-#define MQD_MANAGER_H_
+#ifndef KFD_MQD_MANAGER_H_
+#define KFD_MQD_MANAGER_H_
 
 #include "kfd_priv.h"
 
@@ -44,4 +44,4 @@ struct mqd_manager {
 };
 
 
-#endif /* MQD_MANAGER_H_ */
+#endif /* KFD_MQD_MANAGER_H_ */
diff --git a/drivers/gpu/hsa/radeon/kfd_pm4_headers.h b/drivers/gpu/hsa/radeon/kfd_pm4_headers.h
index dae460f..3ffb3f4 100644
--- a/drivers/gpu/hsa/radeon/kfd_pm4_headers.h
+++ b/drivers/gpu/hsa/radeon/kfd_pm4_headers.h
@@ -21,8 +21,8 @@
  *
  */
 
-#ifndef F32_MES_PM4_PACKETS_72_H
-#define F32_MES_PM4_PACKETS_72_H
+#ifndef KFD_PM4_HEADERS_H_
+#define KFD_PM4_HEADERS_H_
 
 #ifndef PM4_HEADER_DEFINED
 #define PM4_HEADER_DEFINED
@@ -657,7 +657,7 @@ typedef struct _PM4__SET_SH_REG {
 #ifndef _PM4__SET_CONFIG_REG_DEFINED
 #define _PM4__SET_CONFIG_REG_DEFINED
 
-typedef struct _PM4__SET_CONFIG_REG {
+struct pm4__set_config_reg {
union {
PM4_TYPE_3_HEADER header

[PATCH 53/83] hsa/radeon: Add device queue manager module

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

The queue scheduler is divided into two parts: one is process-bound and the
other is device-bound.
This module, the device queue manager (DQM), handles the device-bound part:
queue setup, update and tear-down from the device side.
It also supports suspend/resume operations.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
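For illustration only (not part of the patch): a worked example of two
constants this module relies on, assuming the operator in the CIK_HPD_SIZE
definition is the left shift `<<` and taking PIPE_PER_ME_CP_SCHEDULING as 4,
the value given in kfd_device_queue_manager.h elsewhere in this series.

```c
#include <assert.h>

#define CIK_HPD_SIZE_LOG2 11
#define CIK_HPD_SIZE (1U << CIK_HPD_SIZE_LOG2)	/* 1 << 11 = 2048 */
#define PIPE_PER_ME_CP_SCHEDULING (4)

/* Same arithmetic as get_pipes_num_cpsch() in the new file: 4 - 1 = 3. */
static unsigned int get_pipes_num_cpsch(void)
{
	return PIPE_PER_ME_CP_SCHEDULING - 1;
}
```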
 drivers/gpu/hsa/radeon/Makefile   |2 +-
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 1006 +
 drivers/gpu/hsa/radeon/kfd_priv.h |2 +
 3 files changed, 1009 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 341fa67..3409203 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -8,6 +8,6 @@ radeon_kfd-y:= kfd_module.o kfd_device.o kfd_chardev.o \
kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \
kfd_kernel_queue.o kfd_packet_manager.o \
-   kfd_process_queue_manager.o
+   kfd_process_queue_manager.o kfd_device_queue_manager.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
new file mode 100644
index 000..9e21074
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -0,0 +1,1006 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Ben Goz
+ */
+
+#include <linux/slab.h>
+#include <linux/list.h>
+#include <linux/types.h>
+#include <linux/printk.h>
+#include <linux/bitops.h>
+#include "kfd_priv.h"
+#include "kfd_device_queue_manager.h"
+#include "kfd_mqd_manager.h"
+#include "cik_regs.h"
+#include "kfd_kernel_queue.h"
+
+#define CIK_HPD_SIZE_LOG2 11
+#define CIK_HPD_SIZE (1U << CIK_HPD_SIZE_LOG2)
+
+static bool is_mem_initialized;
+
+static int init_memory(struct device_queue_manager *dqm);
+static int
+set_pasid_vmid_mapping(struct device_queue_manager *dqm, unsigned int pasid, unsigned int vmid);
+
+static inline unsigned int get_pipes_num(struct device_queue_manager *dqm)
+{
+   BUG_ON(!dqm || !dqm->dev);
+   return dqm->dev->shared_resources.compute_pipe_count;
+}
+
+static inline unsigned int get_first_pipe(struct device_queue_manager *dqm)
+{
+   BUG_ON(!dqm);
+   return dqm->dev->shared_resources.first_compute_pipe;
+}
+
+static inline unsigned int get_pipes_num_cpsch(void)
+{
+   return PIPE_PER_ME_CP_SCHEDULING - 1;
+}
+
+static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble);
+static void init_process_memory(struct device_queue_manager *dqm, struct qcm_process_device *qpd)
+{
+   BUG_ON(!dqm || !qpd);
+
+   qpd->sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
+   qpd->sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
+   qpd->sh_mem_bases = compute_sh_mem_bases_64bit(6);
+   qpd->sh_mem_ape1_limit = 0;
+   qpd->sh_mem_ape1_base = 1;
+}
+
+static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm_process_device *qpd)
+{
+   struct mqd_manager *mqd;
+
+   mqd = dqm->get_mqd_manager(dqm, KFD_MQD_TYPE_CIK_COMPUTE);
+   if (mqd == NULL)
+       return;
+
+   mqd->acquire_hqd(mqd, 0, 0, qpd->vmid);
+
+   WRITE_REG(dqm->dev, SH_MEM_CONFIG, qpd->sh_mem_config);
+
+   WRITE_REG(dqm->dev, SH_MEM_APE1_BASE, qpd->sh_mem_ape1_base);
+   WRITE_REG(dqm->dev, SH_MEM_APE1_LIMIT, qpd->sh_mem_ape1_limit);
+
+   mqd->release_hqd(mqd);
+}
+
+static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *q,
+   struct qcm_process_device *qpd, int *allocate_vmid)
+{
+   bool set

[PATCH 56/83] hsa/radeon: Queue Management integration with Memory Management

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

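This patch adds support for the LDS aperture for user processes. The
SH_MEM_BASES value is now derived from the LDS base address of the process
instead of a fixed nybble.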

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
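For illustration only (not part of the patch): a user-space sketch of the bit
arithmetic used by the two new helpers and of the tightened BUG_ON condition.
It assumes the operators lost from the archived diff are `>>` and `&`; the
function names below are hypothetical and take the LDS base address directly
rather than looking it up per process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Top nybble of a 64-bit LDS base address, with bit 0 masked off. */
static unsigned int sh_mem_bases_nybble_64(uint64_t lds_base)
{
	return (unsigned int)((lds_base >> 60) & 0x0E);
}

/* Bits 16..23 of the LDS base, the value used for 32-bit processes. */
static unsigned int sh_mem_bases_32(uint64_t lds_base)
{
	return (unsigned int)((lds_base >> 16) & 0xFF);
}

/* Mirrors the new BUG_ON: the nybble must be even, non-zero and <= 0xE. */
static bool nybble_is_valid(unsigned int nybble)
{
	return !(nybble & 1) && nybble <= 0xE && nybble != 0;
}
```

For example, an LDS base of 0x6000000000000000 yields nybble 6, the same value
that was hard-coded before this patch.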
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.c | 41 +--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
index c2d91c9..01573b1 100644
--- a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.c
@@ -58,16 +58,50 @@ static inline unsigned int get_pipes_num_cpsch(void)
return PIPE_PER_ME_CP_SCHEDULING - 1;
 }
 
+static unsigned int get_sh_mem_bases_nybble_64(struct kfd_process *process, struct kfd_dev *dev)
+{
+   struct kfd_process_device *pdd;
+   uint32_t nybble;
+
+   pdd = radeon_kfd_get_process_device_data(dev, process);
+   nybble = (pdd->lds_base >> 60) & 0x0E;
+
+   return nybble;
+
+}
+
+static unsigned int get_sh_mem_bases_32(struct kfd_process *process, struct kfd_dev *dev)
+{
+   struct kfd_process_device *pdd;
+   unsigned int shared_base;
+
+   pdd = radeon_kfd_get_process_device_data(dev, process);
+   shared_base = (pdd->lds_base >> 16) & 0xFF;
+
+   return shared_base;
+}
+
 static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble);
 static void init_process_memory(struct device_queue_manager *dqm, struct qcm_process_device *qpd)
 {
+   unsigned int temp;
    BUG_ON(!dqm || !qpd);
 
+   if (qpd->pqm->process->is_32bit_user_mode) {
+       temp = get_sh_mem_bases_32(qpd->pqm->process, dqm->dev);
+       qpd->sh_mem_bases = SHARED_BASE(temp);
+   } else {
+       temp = get_sh_mem_bases_nybble_64(qpd->pqm->process, dqm->dev);
+       qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
+   }
+
    qpd->sh_mem_config = ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED);
    qpd->sh_mem_config |= DEFAULT_MTYPE(MTYPE_NONCACHED);
-   qpd->sh_mem_bases = compute_sh_mem_bases_64bit(6);
    qpd->sh_mem_ape1_limit = 0;
    qpd->sh_mem_ape1_base = 1;
+
+   pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
+       qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
 }
 
 static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm_process_device *qpd)
@@ -84,6 +118,7 @@ static void program_sh_mem_settings(struct device_queue_manager *dqm, struct qcm
 
    WRITE_REG(dqm->dev, SH_MEM_APE1_BASE, qpd->sh_mem_ape1_base);
    WRITE_REG(dqm->dev, SH_MEM_APE1_LIMIT, qpd->sh_mem_ape1_limit);
+   WRITE_REG(dqm->dev, SH_MEM_BASES, qpd->sh_mem_bases);
 
mqd-release_hqd(mqd);
 }
@@ -128,6 +163,8 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *
        set_pasid_vmid_mapping(dqm, q->process->pasid, q->properties.vmid);
        qpd->vmid = *allocate_vmid;
        is_new_vmid = true;
+
+       program_sh_mem_settings(dqm, qpd);
    }
    q->properties.vmid = qpd->vmid;
 
@@ -418,7 +455,7 @@ static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
     * We don't bother to support different top nybbles for LDS/Scratch and GPUVM.
     */
 
-   BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE);
+   BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE || top_address_nybble == 0);
 
    return PRIVATE_BASE(top_address_nybble << 12) | SHARED_BASE(top_address_nybble << 12);
 }
-- 
1.9.1



[PATCH 54/83] hsa/radeon: Switch to new queue scheduler

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This patch switches from the old KFD queue scheduler to the new KFD queue
scheduler. The new scheduler supports H/W CP scheduling, over-subscription of
queues and pre-emption of queues.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_aperture.c  |   1 -
 drivers/gpu/hsa/radeon/kfd_chardev.c   | 107 +++--
 drivers/gpu/hsa/radeon/kfd_device.c|  31 ++
 drivers/gpu/hsa/radeon/kfd_interrupt.c |   4 +-
 drivers/gpu/hsa/radeon/kfd_priv.h  |   2 +
 drivers/gpu/hsa/radeon/kfd_process.c   |  56 -
 include/uapi/linux/kfd_ioctl.h |   4 +-
 7 files changed, 88 insertions(+), 117 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_aperture.c b/drivers/gpu/hsa/radeon/kfd_aperture.c
index 9e2d6da..2c72b21 100644
--- a/drivers/gpu/hsa/radeon/kfd_aperture.c
+++ b/drivers/gpu/hsa/radeon/kfd_aperture.c
@@ -32,7 +32,6 @@
 #include <uapi/linux/kfd_ioctl.h>
 #include <linux/time.h>
 #include "kfd_priv.h"
-#include "kfd_scheduler.h"
 #include <linux/mm.h>
 #include <uapi/asm-generic/mman-common.h>
 #include <asm/processor.h>
diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 07cac88..bb2ef02 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -31,10 +31,11 @@
 #include <uapi/linux/kfd_ioctl.h>
 #include <linux/time.h>
 #include "kfd_priv.h"
-#include "kfd_scheduler.h"
 #include <linux/mm.h>
 #include <uapi/asm-generic/mman-common.h>
 #include <asm/processor.h>
+#include "kfd_hw_pointer_store.h"
+#include "kfd_device_queue_manager.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -128,24 +129,36 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a
    struct kfd_dev *dev;
    int err = 0;
    unsigned int queue_id;
-   struct kfd_queue *queue;
    struct kfd_process_device *pdd;
+   struct queue_properties q_properties;
+
+   memset(&q_properties, 0, sizeof(struct queue_properties));
 
    if (copy_from_user(&args, arg, sizeof(args)))
        return -EFAULT;
 
-   dev = radeon_kfd_device_by_id(args.gpu_id);
-   if (dev == NULL)
-   return -EINVAL;
+   /* need to validate parameters */
+
+   q_properties.is_interop = false;
+   q_properties.queue_percent = args.queue_percentage;
+   q_properties.priority = args.queue_priority;
+   q_properties.queue_address = args.ring_base_address;
+   q_properties.queue_size = args.ring_size;
 
-   queue = kzalloc(
-       offsetof(struct kfd_queue, scheduler_queue) + dev->device_info->scheduler_class->queue_size,
-       GFP_KERNEL);
 
-   if (!queue)
-   return -ENOMEM;
+   pr_debug("%s Arguments: Queue Percentage (%d, %d)\n"
+       "Queue Priority (%d, %d)\n"
+       "Queue Address (0x%llX, 0x%llX)\n"
+       "Queue Size (%u64, %ll)\n",
+       __func__,
+       q_properties.queue_percent, args.queue_percentage,
+       q_properties.priority, args.queue_priority,
+       q_properties.queue_address, args.ring_base_address,
+       q_properties.queue_size, args.ring_size);
 
-   queue->dev = dev;
+   dev = radeon_kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+   return -EINVAL;
 
    mutex_lock(&p->mutex);
 
@@ -159,23 +172,14 @@ kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *a
            p->pasid,
            dev->id);
 
-   if (!radeon_kfd_allocate_queue_id(p, &queue_id))
-       goto err_allocate_queue_id;
-
-   err = dev->device_info->scheduler_class->create_queue(dev->scheduler, pdd->scheduler_process,
-                             &queue->scheduler_queue,
-                             (void __user *)args.ring_base_address,
-                             args.ring_size,
-                             (void __user *)args.read_pointer_address,
-                             (void __user *)args.write_pointer_address,
-                             radeon_kfd_queue_id_to_doorbell(dev, p, queue_id));
-   if (err)
+   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, 0, KFD_QUEUE_TYPE_COMPUTE, &queue_id);
+   if (err != 0)
        goto err_create_queue;
 
-   radeon_kfd_install_queue(p, queue_id, queue);
-
    args.queue_id = queue_id;
-   args.doorbell_address = (uint64_t)(uintptr_t)radeon_kfd_get_doorbell(filep, p, dev, queue_id);
+   args.read_pointer_address = (uint64_t

[PATCH 49/83] hsa/radeon: Add kernel queue support for KFD

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

The kernel queue module enables the KFD to establish kernel queues, not exposed
to user space. The kernel queues are used for HIQ (HSA Interface Queue) and DIQ
(Debug Interface Queue) operations.

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
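For illustration only (not part of the patch): the header added below declares
struct device_queue_manager as a table of function pointers. A reduced
user-space sketch of that design follows, where a table is filled once and
callers only go through the pointers. All mock_ names, the reduced callback
signatures and the queue_count member are hypothetical; only the idea of a
"nocpsch" back end is taken from this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal queue object. */
struct mock_queue {
	int id;
	int active;
};

/* Hypothetical, reduced version of the callback table in the header. */
struct mock_dqm {
	int (*create_queue)(struct mock_dqm *dqm, struct mock_queue *q);
	int (*destroy_queue)(struct mock_dqm *dqm, struct mock_queue *q);
	int queue_count;
};

/* One possible back end: mark the queue active and count it. */
static int mock_create_queue_nocpsch(struct mock_dqm *dqm, struct mock_queue *q)
{
	q->active = 1;
	dqm->queue_count++;
	return 0;
}

static int mock_destroy_queue_nocpsch(struct mock_dqm *dqm, struct mock_queue *q)
{
	q->active = 0;
	dqm->queue_count--;
	return 0;
}

/* Fill the table once; a different back end would install other functions. */
static void mock_dqm_init_nocpsch(struct mock_dqm *dqm)
{
	dqm->create_queue = mock_create_queue_nocpsch;
	dqm->destroy_queue = mock_destroy_queue_nocpsch;
	dqm->queue_count = 0;
}
```

The caller never names the back end directly, so the same calling code can
drive different scheduling implementations.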
 drivers/gpu/hsa/radeon/Makefile   |   3 +-
 drivers/gpu/hsa/radeon/kfd_device_queue_manager.h | 102 
 drivers/gpu/hsa/radeon/kfd_kernel_queue.c | 302 ++
 drivers/gpu/hsa/radeon/kfd_kernel_queue.h |  67 +++
 drivers/gpu/hsa/radeon/kfd_pm4_headers.h  | 681 ++
 drivers/gpu/hsa/radeon/kfd_pm4_opcodes.h  | 107 
 drivers/gpu/hsa/radeon/kfd_priv.h |  34 ++
 drivers/gpu/hsa/radeon/kfd_scheduler.h|   5 -
 8 files changed, 1295 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
 create mode 100644 drivers/gpu/hsa/radeon/kfd_kernel_queue.c
 create mode 100644 drivers/gpu/hsa/radeon/kfd_kernel_queue.h
 create mode 100644 drivers/gpu/hsa/radeon/kfd_pm4_headers.h
 create mode 100644 drivers/gpu/hsa/radeon/kfd_pm4_opcodes.h

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index c87b518..f06d925 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -6,6 +6,7 @@ radeon_kfd-y:= kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
-   kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o
+   kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o \
+   kfd_kernel_queue.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
new file mode 100644
index 000..0529a96
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_device_queue_manager.h
@@ -0,0 +1,102 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Ben Goz
+ */
+
+#ifndef DEVICE_QUEUE_MANAGER_H_
+#define DEVICE_QUEUE_MANAGER_H_
+
+#include <linux/rwsem.h>
+#include <linux/list.h>
+#include "kfd_priv.h"
+#include "kfd_mqd_manager.h"
+
+#define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS   (500)
+#define QUEUES_PER_PIPE    (8)
+#define PIPE_PER_ME_CP_SCHEDULING  (4)
+#define CIK_VMID_NUM   (8)
+#define KFD_VMID_START_OFFSET  (8)
+#define VMID_PER_DEVICE    CIK_VMID_NUM
+#define KFD_DQM_FIRST_PIPE (0)
+
+struct device_process_node {
+   struct qcm_process_device *qpd;
+   struct list_head list;
+};
+
+struct device_queue_manager {
+   int (*create_queue)(struct device_queue_manager *dqm,
+   struct queue *q,
+   struct qcm_process_device *qpd,
+   int *allocate_vmid);
+   int (*destroy_queue)(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd,
+   struct queue *q);
+   int (*update_queue)(struct device_queue_manager *dqm,
+   struct queue *q);
+   int (*destroy_queues)(struct device_queue_manager *dqm);
+   struct mqd_manager * (*get_mqd_manager)(struct device_queue_manager *dqm,
+   enum KFD_MQD_TYPE type);
+   int (*execute_queues)(struct device_queue_manager *dqm);
+   int (*register_process)(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd);
+   int

[PATCH 50/83] hsa/radeon: Add module parameter of scheduling policy

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

This patch adds a new parameter to the KFD module. This parameter enables the
user to select the scheduling policy of the CP. The choices are:

* CP Scheduling with support for over-subscription
* CP Scheduling without support for over-subscription
* Without CP Scheduling
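
As a rough illustration of how a driver could sanity-check such a parameter, the sketch below mirrors the three choices with the enum this patch adds to kfd_priv.h. The mapping of list items to enum values is inferred from their order, and sched_policy_is_valid() and sched_policy_name() are hypothetical helpers that are not part of this series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same enumerators, in the same order, as the enum added by this patch. */
enum kfd_sched_policy {
	KFD_SCHED_POLICY_HWS = 0,                 /* CP scheduling, over-subscription */
	KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION, /* CP scheduling, no over-subscription */
	KFD_SCHED_POLICY_NO_HWS                   /* no CP scheduling */
};

/* Hypothetical helper: true if the raw parameter value names a known policy. */
static bool sched_policy_is_valid(int policy)
{
	return policy >= KFD_SCHED_POLICY_HWS &&
	       policy <= KFD_SCHED_POLICY_NO_HWS;
}

/* Hypothetical helper: short name for log messages, NULL if unknown. */
static const char *sched_policy_name(int policy)
{
	switch (policy) {
	case KFD_SCHED_POLICY_HWS:
		return "hws";
	case KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION:
		return "hws-no-oversubscription";
	case KFD_SCHED_POLICY_NO_HWS:
		return "no-hws";
	default:
		return NULL;
	}
}
```

A check like this would let the module fall back to the default policy when the user passes an out-of-range integer.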

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_module.c |  5 +++
 drivers/gpu/hsa/radeon/kfd_priv.h   | 65 +
 2 files changed, 70 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c 
b/drivers/gpu/hsa/radeon/kfd_module.c
index a03743a..e8bb67c 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -23,6 +23,7 @@
 #include <linux/module.h>
 #include <linux/sched.h>
 #include <linux/notifier.h>
+#include <linux/moduleparam.h>
 
 #include "kfd_priv.h"
 
@@ -43,6 +44,10 @@ static const struct kgd2kfd_calls kgd2kfd = {
.resume = kgd2kfd_resume,
 };
 
+int sched_policy = KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION;
+module_param(sched_policy, int, S_IRUSR | S_IWUSR);
+MODULE_PARM_DESC(sched_policy, "Kernel cmdline parameter that defines the kfd scheduling policy");
+
 bool kgd2kfd_init(unsigned interface_version,
  const struct kfd2kgd_calls *f2g,
  const struct kgd2kfd_calls **g2f)
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h 
b/drivers/gpu/hsa/radeon/kfd_priv.h
index 3a5cecf..b3889aa 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -70,6 +70,15 @@ struct kfd_scheduler_class;
 /* Macro for allocating structures */
 #define kfd_alloc_struct(ptr_to_struct) ((typeof(ptr_to_struct)) kzalloc(sizeof(*ptr_to_struct), GFP_KERNEL))
 
+/* Kernel module parameter to specify the scheduling policy */
+extern int sched_policy;
+
+enum kfd_sched_policy {
+   KFD_SCHED_POLICY_HWS = 0,
+   KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION,
+   KFD_SCHED_POLICY_NO_HWS
+};
+
 /* Large enough to hold the maximum usable pasid + 1.
 ** It must also be able to store the number of doorbells reported by a KFD 
device. */
 typedef unsigned int pasid_t;
@@ -243,6 +252,51 @@ enum KFD_MQD_TYPE {
KFD_MQD_TYPE_MAX
 };
 
+struct scheduling_resources {
+   unsigned int vmid_mask;
+   enum kfd_queue_type type;
+   uint64_t queue_mask;
+   uint64_t gws_mask;
+   uint32_t oac_mask;
+   uint32_t gds_heap_base;
+   uint32_t gds_heap_size;
+};
+
+struct process_queue_manager {
+   /* data */
+   struct kfd_process  *process;
+   unsigned int    num_concurrent_processes;
+   struct list_head    queues;
+   unsigned long   *queue_slot_bitmap;
+};
+
+struct qcm_process_device {
+   /* The Device Queue Manager that owns this data */
+   struct device_queue_manager *dqm;
+   struct process_queue_manager *pqm;
+   /* Device Queue Manager lock */
+   struct mutex *lock;
+   /* Queues list */
+   struct list_head queues_list;
+   struct list_head priv_queue_list;
+
+   unsigned int queue_count;
+   unsigned int vmid;
+   bool is_debug;
+   /*
+* All the memory management data should be here too
+*/
+   uint64_t gds_context_area;
+   uint32_t sh_mem_config;
+   uint32_t sh_mem_bases;
+   uint32_t sh_mem_ape1_base;
+   uint32_t sh_mem_ape1_limit;
+   uint32_t page_table_base;
+   uint32_t gds_size;
+   uint32_t num_gws;
+   uint32_t num_oac;
+};
+
 /* Data that is per-process-per device. */
 struct kfd_process_device {
/* List of all per-device data for a process. Starts from 
kfd_process.per_device_data. */
@@ -374,6 +428,8 @@ void print_queue_properties(struct queue_properties *q);
 void print_queue(struct queue *q);
 
 struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type, struct kfd_dev 
*dev);
+struct kernel_queue *kernel_queue_init(struct kfd_dev *dev, enum 
kfd_queue_type type);
+void kernel_queue_uninit(struct kernel_queue *kq);
 
 /* Packet Manager */
 
@@ -391,4 +447,13 @@ struct packet_manager {
kfd_mem_obj ib_buffer_obj;
 };
 
+int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm);
+void pm_uninit(struct packet_manager *pm);
+int pm_send_set_resources(struct packet_manager *pm, struct 
scheduling_resources *res);
+int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues);
+int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address, 
uint32_t fence_value);
+int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
+   enum kfd_preempt_type_filter mode, uint32_t 
filter_param, bool reset);
+void pm_release_ib(struct packet_manager *pm);
+
 #endif
-- 
1.9.1


[PATCH 46/83] hsa/radeon: Add queue and hw_pointer_store modules

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

The queue module enables allocating and initializing queues uniformly.
The hw_pointer_store module handles allocation and assignment of read and write
pointers to user HSA queues.
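
The patch appears to use the mmap offset as a hint that tells read-pointer pages from write-pointer pages. A minimal sketch of that idea follows. The EX_* constants and helper names are hypothetical; the real KFD_MMAP_RPTR_START and KFD_MMAP_WPTR_START values are not shown in this mail.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT 12            /* assumption: 4 KiB pages */
#define EX_MMAP_RPTR_START 0x100u   /* hypothetical page-offset marker */
#define EX_MMAP_WPTR_START 0x200u   /* hypothetical page-offset marker */

enum ex_ptr_type { EX_PTR_NONE, EX_PTR_RPTR, EX_PTR_WPTR };

/* Encode the page type into a byte offset suitable for mmap(). */
static uint64_t ex_offset_for(enum ex_ptr_type type)
{
	switch (type) {
	case EX_PTR_RPTR:
		return (uint64_t)EX_MMAP_RPTR_START << EX_PAGE_SHIFT;
	case EX_PTR_WPTR:
		return (uint64_t)EX_MMAP_WPTR_START << EX_PAGE_SHIFT;
	default:
		return 0;
	}
}

/* Decode the page type again, as an mmap handler might do. */
static enum ex_ptr_type ex_type_from_offset(uint64_t offset)
{
	uint64_t pgoff = offset >> EX_PAGE_SHIFT;

	if (pgoff == EX_MMAP_RPTR_START)
		return EX_PTR_RPTR;
	if (pgoff == EX_MMAP_WPTR_START)
		return EX_PTR_WPTR;
	return EX_PTR_NONE;
}
```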

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile   |   3 +-
 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c | 150 ++
 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h |  65 +++
 drivers/gpu/hsa/radeon/kfd_priv.h |  55 ++
 drivers/gpu/hsa/radeon/kfd_queue.c| 110 +++
 5 files changed, 382 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
 create mode 100644 drivers/gpu/hsa/radeon/kfd_hw_pointer_store.h
 create mode 100644 drivers/gpu/hsa/radeon/kfd_queue.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 813b31f..18e1639 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -5,6 +5,7 @@
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
-   kfd_vidmem.o kfd_interrupt.o kfd_aperture.o
+   kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
+   kfd_queue.o kfd_hw_pointer_store.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c 
b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
new file mode 100644
index 000..1372fb2
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_hw_pointer_store.c
@@ -0,0 +1,150 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Ben Goz
+ */
+
+#include <linux/types.h>
+#include <linux/version.h>
+#include <linux/kernel.h>
+#include <linux/mutex.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/slab.h>
+#include <linux/io.h>
+#include "kfd_hw_pointer_store.h"
+#include "kfd_priv.h"
+
+/* do the same trick as in map_doorbells() */
+static int hw_pointer_store_map(struct hw_pointer_store_properties *ptr,
+   struct file *devkfd)
+{
+   qptr_t __user *user_address;
+
+   BUG_ON(!ptr || !devkfd);
+
+   if (!ptr->page_mapping) {
+   if (!ptr->page_address)
+   return -EINVAL;
+
+   user_address = (qptr_t __user *)vm_mmap(devkfd, 0, PAGE_SIZE,
+   PROT_WRITE | PROT_READ , MAP_SHARED, ptr->offset);
+
+   if (IS_ERR(user_address))
+   return PTR_ERR(user_address);
+
+   ptr->page_mapping = user_address;
+   }
+
+   return 0;
+}
+
+int hw_pointer_store_init(struct hw_pointer_store_properties *ptr,
+   enum hw_pointer_store_type type)
+{
+   unsigned long *addr;
+
+   BUG_ON(!ptr);
+
+   /* using the offset value as a hint for mmap to distinguish between page types */
+   if (type == KFD_HW_POINTER_STORE_TYPE_RPTR)
+   ptr->offset = KFD_MMAP_RPTR_START << PAGE_SHIFT;
+   else if (type == KFD_HW_POINTER_STORE_TYPE_WPTR)
+   ptr->offset = KFD_MMAP_WPTR_START << PAGE_SHIFT;
+   else
+   return -EINVAL;
+
+   addr = (unsigned long *)get_zeroed_page(GFP_KERNEL);
+   if (!addr) {
+   pr_debug("Error allocating page\n");
+   return -ENOMEM;
+   }
+
+   ptr->page_address = addr;
+   ptr->page_mapping = NULL;
+
+   return 0;
+}
+
+void hw_pointer_store_destroy(struct hw_pointer_store_properties *ptr)
+{
+   BUG_ON(!ptr);
+   pr_debug("kfd in func: %s\n", __func__);
+   if (ptr->page_address)
+   free_page((unsigned long)ptr->page_address);
+   if (ptr->page_mapping)
+   vm_munmap((uintptr_t)ptr->page_mapping, PAGE_SIZE);
+   ptr

[PATCH 48/83] hsa/radeon: Add mqd_manager module

2014-07-10 Thread Oded Gabbay
From: Ben Goz ben@amd.com

The mqd_manager module handles MQD data structures. MQD stands for Memory Queue
Descriptor, which is used by the H/W to keep the HSA queue state in memory.
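
Many MQD fields in this patch come in lo/hi pairs (for example cp_mqd_base_addr and cp_mqd_base_addr_hi). As a sketch only, the helpers below show one way a 64-bit address can be split across two 32-bit fields and joined again. The ex_* names are hypothetical, and any alignment or shift the hardware might require is not covered in this mail and is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for a lo/hi register pair in an MQD. */
struct ex_mqd_addr {
	uint32_t base_addr;    /* lower 32 bits of the address */
	uint32_t base_addr_hi; /* upper 32 bits of the address */
};

/* Split a 64-bit address into the two 32-bit halves. */
static struct ex_mqd_addr ex_split_addr(uint64_t addr)
{
	struct ex_mqd_addr a;

	a.base_addr = (uint32_t)(addr & 0xffffffffu);
	a.base_addr_hi = (uint32_t)(addr >> 32);
	return a;
}

/* Rebuild the 64-bit address from the two halves. */
static uint64_t ex_join_addr(struct ex_mqd_addr a)
{
	return ((uint64_t)a.base_addr_hi << 32) | a.base_addr;
}
```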

Signed-off-by: Ben Goz ben@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile   |   2 +-
 drivers/gpu/hsa/radeon/cik_mqds.h | 251 ++
 drivers/gpu/hsa/radeon/cik_regs.h |   1 +
 drivers/gpu/hsa/radeon/kfd_mqd_manager.c  | 453 ++
 drivers/gpu/hsa/radeon/kfd_mqd_manager.h  |  48 +++
 drivers/gpu/hsa/radeon/kfd_priv.h |  26 ++
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c |  10 -
 drivers/gpu/hsa/radeon/kfd_vidmem.c   |  36 ++
 8 files changed, 816 insertions(+), 11 deletions(-)
 create mode 100644 drivers/gpu/hsa/radeon/cik_mqds.h
 create mode 100644 drivers/gpu/hsa/radeon/kfd_mqd_manager.c
 create mode 100644 drivers/gpu/hsa/radeon/kfd_mqd_manager.h

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 18e1639..c87b518 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -6,6 +6,6 @@ radeon_kfd-y:= kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
kfd_vidmem.o kfd_interrupt.o kfd_aperture.o \
-   kfd_queue.o kfd_hw_pointer_store.o
+   kfd_queue.o kfd_hw_pointer_store.o kfd_mqd_manager.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/cik_mqds.h 
b/drivers/gpu/hsa/radeon/cik_mqds.h
new file mode 100644
index 000..58945c8
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/cik_mqds.h
@@ -0,0 +1,251 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Ben Goz
+ */
+
+#ifndef CIK_MQDS_H_
+#define CIK_MQDS_H_
+
+#pragma pack(push, 4)
+
+struct cik_hpd_registers {
+   u32 cp_hpd_roq_offsets;
+   u32 cp_hpd_eop_base_addr;
+   u32 cp_hpd_eop_base_addr_hi;
+   u32 cp_hpd_eop_vmid;
+   u32 cp_hpd_eop_control;
+};
+
+struct cik_hqd_registers {
+   u32 cp_mqd_base_addr;
+   u32 cp_mqd_base_addr_hi;
+   u32 cp_hqd_active;
+   u32 cp_hqd_vmid;
+   u32 cp_hqd_persistent_state;
+   u32 cp_hqd_pipe_priority;
+   u32 cp_hqd_queue_priority;
+   u32 cp_hqd_quantum;
+   u32 cp_hqd_pq_base;
+   u32 cp_hqd_pq_base_hi;
+   u32 cp_hqd_pq_rptr;
+   u32 cp_hqd_pq_rptr_report_addr;
+   u32 cp_hqd_pq_rptr_report_addr_hi;
+   u32 cp_hqd_pq_wptr_poll_addr;
+   u32 cp_hqd_pq_wptr_poll_addr_hi;
+   u32 cp_hqd_pq_doorbell_control;
+   u32 cp_hqd_pq_wptr;
+   u32 cp_hqd_pq_control;
+   u32 cp_hqd_ib_base_addr;
+   u32 cp_hqd_ib_base_addr_hi;
+   u32 cp_hqd_ib_rptr;
+   u32 cp_hqd_ib_control;
+   u32 cp_hqd_iq_timer;
+   u32 cp_hqd_iq_rptr;
+   u32 cp_hqd_dequeue_request;
+   u32 cp_hqd_dma_offload;
+   u32 cp_hqd_sema_cmd;
+   u32 cp_hqd_msg_type;
+   u32 cp_hqd_atomic0_preop_lo;
+   u32 cp_hqd_atomic0_preop_hi;
+   u32 cp_hqd_atomic1_preop_lo;
+   u32 cp_hqd_atomic1_preop_hi;
+   u32 cp_hqd_hq_scheduler0;
+   u32 cp_hqd_hq_scheduler1;
+   u32 cp_mqd_control;
+};
+
+struct cik_mqd {
+   u32 header;
+   u32 dispatch_initiator;
+   u32 dimensions[3];
+   u32 start_idx[3];
+   u32 num_threads[3];
+   u32 pipeline_stat_enable;
+   u32 perf_counter_enable;
+   u32 pgm[2];
+   u32 tba[2];
+   u32 tma[2];
+   u32 pgm_rsrc[2];
+   u32 vmid;
+   u32 resource_limits;
+   u32 static_thread_mgmt01[2];
+   u32 tmp_ring_size;
+   u32 static_thread_mgmt23[2];
+   u32 restart[3];
+   u32 thread_trace_enable;
+   u32 reserved1;
+   u32 user_data

[PATCH 42/83] hsa/radeon: 32-bit processes support

2014-07-10 Thread Oded Gabbay
From: Alexey Skidanov alexey.skida...@amd.com

Initialize compat_ioctl properly. All ioctl args are packed.
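
Since the same handler serves both .unlocked_ioctl and .compat_ioctl, the argument structs need an identical layout for 32-bit and 64-bit user space. The sketch below shows, under that assumption, how #pragma pack(push, 1) removes the padding that could otherwise differ between ABIs. struct ex_ioctl_args and its fields are hypothetical and not taken from kfd_ioctl.h.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
/* Hypothetical ioctl argument block; packed, so no padding is inserted. */
struct ex_ioctl_args {
	uint32_t queue_id;          /* offset 0 */
	uint64_t ring_base_address; /* offset 4, no 4-byte hole before it */
	uint32_t flags;             /* offset 12 */
};
#pragma pack(pop)

/* Size of the packed struct; 16 bytes whatever the natural alignment is. */
static size_t ex_ioctl_args_size(void)
{
	return sizeof(struct ex_ioctl_args);
}
```

Without the pragma, a 64-bit build would typically align the uint64_t member to 8 bytes while a 32-bit x86 build would not, so the two sides would disagree about field offsets.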

Signed-off-by: Alexey Skidanov alexey.skida...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 7 +--
 drivers/gpu/hsa/radeon/kfd_priv.h| 4 
 include/uapi/linux/kfd_ioctl.h   | 2 +-
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 75fe11f..e95d597 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -27,6 +27,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
+#include <linux/compat.h>
 #include <uapi/linux/kfd_ioctl.h>
 #include <linux/time.h>
 #include "kfd_priv.h"
@@ -41,6 +42,7 @@ static const char kfd_dev_name[] = "kfd";
 static const struct file_operations kfd_fops = {
.owner = THIS_MODULE,
.unlocked_ioctl = kfd_ioctl,
+   .compat_ioctl = kfd_ioctl,
.open = kfd_open,
.mmap = kfd_mmap,
 };
@@ -105,8 +107,9 @@ kfd_open(struct inode *inode, struct file *filep)
process = radeon_kfd_create_process(current);
if (IS_ERR(process))
return PTR_ERR(process);
-
-   pr_debug("\nkfd: process %d opened dev/kfd", process->pasid);
+   process->is_32bit_user_mode = is_compat_task();
+   dev_info(kfd_device, "process %d opened, compat mode (32 bit) - %d\n",
+   process->pasid, process->is_32bit_user_mode);
 
return 0;
 }
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h 
b/drivers/gpu/hsa/radeon/kfd_priv.h
index 8b877ca..9d3b1fc 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -194,6 +194,10 @@ struct kfd_process {
size_t queue_array_size;
struct kfd_queue **queues;  /* Size is queue_array_size, up to 
MAX_PROCESS_QUEUES. */
unsigned long allocated_queue_bitmap[DIV_ROUND_UP(MAX_PROCESS_QUEUES, 
BITS_PER_LONG)];
+
+   /*Is the user space process 32 bit?*/
+   bool is_32bit_user_mode;
+
 };
 
 struct kfd_process *radeon_kfd_create_process(const struct task_struct *);
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 5b9517e..a7c3abd 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -29,7 +29,7 @@
 #define KFD_IOCTL_CURRENT_VERSION 1
 
 /* The 64-bit ABI is the authoritative version. */
-#pragma pack(push, 8)
+#pragma pack(push, 1)
 
 struct kfd_ioctl_get_version_args {
uint32_t min_supported_version; /* from KFD */
-- 
1.9.1



[PATCH 43/83] hsa/radeon: NULL pointer dereference bug workaround

2014-07-10 Thread Oded Gabbay
From: Alexey Skidanov alexey.skida...@amd.com

Signed-off-by: Alexey Skidanov alexey.skida...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c 
b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 7573d25..7ee8125 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -627,8 +627,10 @@ static void cik_static_deregister_process(struct 
kfd_scheduler *scheduler,
struct cik_static_private *priv = kfd_scheduler_to_private(scheduler);
struct cik_static_process *pp = 
kfd_process_to_private(scheduler_process);
 
-   release_vmid(priv, pp->vmid);
-   kfree(pp);
+   if (priv && pp) {
+   release_vmid(priv, pp->vmid);
+   kfree(pp);
+   }
 }
 
 static bool allocate_hqd(struct cik_static_private *priv, unsigned int *queue)
-- 
1.9.1



[PATCH 44/83] hsa/radeon: HSA64/HSA32 modes support

2014-07-10 Thread Oded Gabbay
From: Alexey Skidanov alexey.skida...@amd.com

Add aperture initialization and the corresponding ioctl.

Signed-off-by: Alexey Skidanov alexey.skida...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile   |   2 +-
 drivers/gpu/hsa/radeon/kfd_aperture.c | 124 ++
 drivers/gpu/hsa/radeon/kfd_chardev.c  |  58 +++-
 drivers/gpu/hsa/radeon/kfd_priv.h |  18 
 drivers/gpu/hsa/radeon/kfd_process.c  |  17 
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c |   3 +-
 drivers/gpu/hsa/radeon/kfd_topology.c |  27 ++
 include/uapi/linux/kfd_ioctl.h|  18 
 8 files changed, 264 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_aperture.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 5422e6a..813b31f 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -5,6 +5,6 @@
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
-   kfd_vidmem.o kfd_interrupt.o
+   kfd_vidmem.o kfd_interrupt.o kfd_aperture.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_aperture.c 
b/drivers/gpu/hsa/radeon/kfd_aperture.c
new file mode 100644
index 000..9e2d6da
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_aperture.c
@@ -0,0 +1,124 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/device.h>
+#include <linux/export.h>
+#include <linux/err.h>
+#include <linux/fs.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/compat.h>
+#include <uapi/linux/kfd_ioctl.h>
+#include <linux/time.h>
+#include "kfd_priv.h"
+#include "kfd_scheduler.h"
+#include <linux/mm.h>
+#include <uapi/asm-generic/mman-common.h>
+#include <asm/processor.h>
+
+
+#define MAKE_GPUVM_APP_BASE(gpu_num) (((uint64_t)(gpu_num)  61) + 
0x1)
+#define MAKE_GPUVM_APP_LIMIT(base) (((uint64_t)(base)  0xFF00) | 
0xFF)
+#define MAKE_SCRATCH_APP_BASE(gpu_num) (((uint64_t)(gpu_num)  61) + 
0x1)
+#define MAKE_SCRATCH_APP_LIMIT(base) (((uint64_t)base  0x) | 
0x)
+#define MAKE_LDS_APP_BASE(gpu_num) (((uint64_t)(gpu_num)  61) + 0x0)
+#define MAKE_LDS_APP_LIMIT(base) (((uint64_t)(base)  0x) | 
0x)
+
+#define HSA_32BIT_LDS_APP_SIZE 0x1
+#define HSA_32BIT_LDS_APP_ALIGNMENT 0x1
+
+static unsigned long kfd_reserve_aperture(struct kfd_process *process, 
unsigned long len, unsigned long alignment)
+{
+
+   unsigned long addr = 0;
+   unsigned long start_address;
+
+   /*
+* Go bottom up and find the first available aligned address.
+* We may narrow space to scan by getting mmap range limits.
+*/
+   for (start_address = alignment; start_address < (TASK_SIZE - alignment); start_address += alignment) {
+   addr = vm_mmap(NULL, start_address, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, 0);
+   if (!IS_ERR_VALUE(addr)) {
+   if (addr == start_address)
+   return addr;
+   vm_munmap(addr, len);
+   }
+   }
+   return 0;
+
+}
+
+int kfd_init_apertures(struct kfd_process *process)
+{
+   uint8_t id  = 0;
+   struct kfd_dev *dev;
+   struct kfd_process_device *pdd;
+
+   mutex_lock(&process->mutex);
+
+   /*Iterating over all devices*/
+   while ((dev = kfd_topology_enum_kfd_devices(id)) != NULL && id < NUM_OF_SUPPORTED_GPUS) {
+
+   pdd = radeon_kfd_get_process_device_data(dev, process);
+
+   /*for 64 bit process aperture will be statically reserved

[PATCH 41/83] hsa/radeon: Alternating the source of max clock

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Changing the source of the max engine clock value.
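
For reference, the function touched here divides by 100 because, per its comment, sclk is kept in units of 10 kHz. A tiny sketch of that conversion follows; ex_sclk_10khz_to_mhz() is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/*
 * Convert a clock given in 10 kHz units to MHz.
 * 1 MHz = 100 * 10 kHz; integer division rounds down.
 */
static uint32_t ex_sclk_10khz_to_mhz(uint32_t sclk_10khz)
{
	return sclk_10khz / 100;
}
```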

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
b/drivers/gpu/drm/radeon/radeon_kfd.c
index 8b6d497..a28cf6b 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -316,5 +316,5 @@ static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev 
*kgd)
struct radeon_device *rdev = (struct radeon_device *)kgd;
 
/* The sclk is in quantas of 10kHz */
-   return rdev->pm.power_state->clock_info->sclk / 100;
+   return rdev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
 }
-- 
1.9.1



[PATCH 34/83] drm/radeon: adding synchronization for GRBM GFX

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Implementing a lock for selecting and accessing shader engines and arrays.
This lock will make sure that drm/radeon and hsa/radeon are not colliding when
accessing shader engines and arrays with GRBM_GFX_INDEX register.
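
The locking pattern can be pictured in user space as below. This is a sketch with a pthread mutex standing in for the kernel mutex and a plain array standing in for the banked registers. All ex_* names are hypothetical; the point is only that selecting the index and touching the banked register happen under one lock.

```c
#include <pthread.h>
#include <stdint.h>

#define EX_NUM_SE 2 /* hypothetical number of shader engines */
#define EX_NUM_SH 2 /* hypothetical number of shader arrays per engine */

struct ex_device {
	pthread_mutex_t grbm_idx_mutex;        /* guards index and banked[] */
	uint32_t grbm_gfx_index;               /* stand-in for GRBM_GFX_INDEX */
	uint32_t banked[EX_NUM_SE][EX_NUM_SH]; /* stand-in for banked registers */
};

/* Select which engine/array later banked accesses will hit. */
static void ex_select_se_sh(struct ex_device *dev, uint32_t se, uint32_t sh)
{
	dev->grbm_gfx_index = (se << 16) | sh;
}

/* Select and write in one critical section so nobody can re-select between. */
static void ex_write_banked(struct ex_device *dev, uint32_t se, uint32_t sh,
			    uint32_t value)
{
	pthread_mutex_lock(&dev->grbm_idx_mutex);
	ex_select_se_sh(dev, se, sh);
	dev->banked[dev->grbm_gfx_index >> 16][dev->grbm_gfx_index & 0xffff] = value;
	pthread_mutex_unlock(&dev->grbm_idx_mutex);
}
```

If the select and the write were locked separately, a second caller could change the index in between and the write would land in the wrong bank.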

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/cik.c   | 26 ++
 drivers/gpu/drm/radeon/radeon.h|  2 ++
 drivers/gpu/drm/radeon/radeon_device.c |  1 +
 drivers/gpu/drm/radeon/radeon_kfd.c| 23 +++
 include/linux/radeon_kfd.h |  4 
 5 files changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 6f4999a..fc560b0 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -1566,6 +1566,8 @@ static const u32 godavari_golden_registers[] =
 
 static void cik_init_golden_registers(struct radeon_device *rdev)
 {
+   /* Some of the registers might be dependent on GRBM_GFX_INDEX */
+   mutex_lock(&rdev->grbm_idx_mutex);
switch (rdev->family) {
case CHIP_BONAIRE:
radeon_program_register_sequence(rdev,
@@ -1640,6 +1642,7 @@ static void cik_init_golden_registers(struct 
radeon_device *rdev)
default:
break;
}
+   mutex_unlock(&rdev->grbm_idx_mutex);
 }
 
 /**
@@ -3421,6 +3424,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
u32 disabled_rbs = 0;
u32 enabled_rbs = 0;
 
+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < se_num; i++) {
for (j = 0; j < sh_per_se; j++) {
cik_select_se_sh(rdev, i, j);
@@ -3432,6 +3436,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
}
}
cik_select_se_sh(rdev, 0x, 0x);
+   mutex_unlock(&rdev->grbm_idx_mutex);
 
mask = 1;
for (i = 0; i < max_rb_num_per_se * se_num; i++) {
@@ -3442,6 +3447,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
 
rdev->config.cik.backend_enable_mask = enabled_rbs;
 
+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < se_num; i++) {
cik_select_se_sh(rdev, i, 0x);
data = 0;
@@ -3469,6 +3475,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
WREG32(PA_SC_RASTER_CONFIG, data);
}
cik_select_se_sh(rdev, 0x, 0x);
+   mutex_unlock(&rdev->grbm_idx_mutex);
 }
 
 /**
@@ -3686,6 +3693,12 @@ static void cik_gpu_init(struct radeon_device *rdev)
/* set HW defaults for 3D engine */
WREG32(CP_MEQ_THRESHOLDS, MEQ1_START(0x30) | MEQ2_START(0x60));
 
+   mutex_lock(&rdev->grbm_idx_mutex);
+   /*
+* making sure that the following register writes will be broadcasted
+* to all the shaders
+*/
+   cik_select_se_sh(rdev, 0x, 0x);
WREG32(SX_DEBUG_1, 0x20);
 
WREG32(TA_CNTL_AUX, 0x0001);
@@ -3741,6 +3754,7 @@ static void cik_gpu_init(struct radeon_device *rdev)
 
WREG32(PA_CL_ENHANCE, CLIP_VTX_REORDER_ENA | NUM_CLIP_SEQ(3));
WREG32(PA_SC_ENHANCE, ENABLE_PA_SC_OUT_OF_ORDER);
+   mutex_unlock(&rdev->grbm_idx_mutex);
 
udelay(50);
 }
@@ -6040,6 +6054,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device 
*rdev)
u32 i, j, k;
u32 mask;
 
+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < rdev->config.cik.max_shader_engines; i++) {
for (j = 0; j < rdev->config.cik.max_sh_per_se; j++) {
cik_select_se_sh(rdev, i, j);
@@ -6051,6 +6066,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device 
*rdev)
}
}
cik_select_se_sh(rdev, 0x, 0x);
+   mutex_unlock(&rdev->grbm_idx_mutex);
 
mask = SE_MASTER_BUSY_MASK | GC_MASTER_BUSY | TC0_MASTER_BUSY | TC1_MASTER_BUSY;
for (k = 0; k < rdev->usec_timeout; k++) {
@@ -6185,10 +6201,12 @@ static int cik_rlc_resume(struct radeon_device *rdev)
WREG32(RLC_LB_CNTR_INIT, 0);
WREG32(RLC_LB_CNTR_MAX, 0x8000);
 
+   mutex_lock(&rdev->grbm_idx_mutex);
cik_select_se_sh(rdev, 0x, 0x);
WREG32(RLC_LB_INIT_CU_MASK, 0x);
WREG32(RLC_LB_PARAMS, 0x00600408);
WREG32(RLC_LB_CNTL, 0x8004);
+   mutex_unlock(&rdev->grbm_idx_mutex);
 
WREG32(RLC_MC_CNTL, 0);
WREG32(RLC_UCODE_CNTL, 0);
@@ -6255,11 +6273,13 @@ static void cik_enable_cgcg(struct radeon_device *rdev, 
bool enable)
 
tmp = cik_halt_rlc(rdev);
 
+   mutex_lock(&rdev->grbm_idx_mutex);
cik_select_se_sh(rdev, 0x, 0x);
WREG32(RLC_SERDES_WR_CU_MASTER_MASK, 0x);
WREG32(RLC_SERDES_WR_NONCU_MASTER_MASK, 0x);
tmp2 = BPM_ADDR_MASK | CGCG_OVERRIDE_0

[PATCH 38/83] hsa/radeon: Workaround for a bug in amd_iommu

2014-07-10 Thread Oded Gabbay
This patch creates a workaround for a bug in the amd_iommu driver, where the
driver doesn't save all necessary information when going to suspend.
The workaround removes the device from the IOMMU device list on suspend and
registers it again in the IOMMU device list on resume.
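
The ordering of the workaround can be sketched as follows. The ex_* types and functions are hypothetical stand-ins (the real patch calls amd_iommu_free_device() and amd_iommu_init_device()); the stubs only track state so the sequence can be followed.

```c
#include <stdbool.h>

/* Hypothetical device state used only for this illustration. */
struct ex_dev {
	bool init_complete;
	bool scheduler_running;
	bool iommu_registered;
};

/* Stand-in for registering the device with the IOMMU driver. */
static int ex_iommu_register(struct ex_dev *d)
{
	d->iommu_registered = true;
	return 0;
}

/* Stand-in for removing the device from the IOMMU driver. */
static void ex_iommu_unregister(struct ex_dev *d)
{
	d->iommu_registered = false;
}

/* Suspend: stop the scheduler first, then drop the IOMMU registration. */
static void ex_suspend(struct ex_dev *d)
{
	if (!d->init_complete)
		return;
	d->scheduler_running = false;
	ex_iommu_unregister(d);
}

/* Resume: register with the IOMMU again before restarting the scheduler. */
static int ex_resume(struct ex_dev *d)
{
	if (!d->init_complete)
		return 0;
	if (ex_iommu_register(d) < 0)
		return -1;
	d->scheduler_running = true;
	return 0;
}
```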

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile |  2 +-
 drivers/gpu/hsa/radeon/kfd_device.c | 30 ++
 drivers/gpu/hsa/radeon/kfd_pasid.c  |  5 +
 drivers/gpu/hsa/radeon/kfd_pm.c | 43 -
 drivers/gpu/hsa/radeon/kfd_priv.h   |  1 +
 5 files changed, 37 insertions(+), 44 deletions(-)
 delete mode 100644 drivers/gpu/hsa/radeon/kfd_pm.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 935f9b7..5422e6a 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -5,6 +5,6 @@
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
-   kfd_vidmem.o kfd_interrupt.o kfd_pm.o
+   kfd_vidmem.o kfd_interrupt.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c 
b/drivers/gpu/hsa/radeon/kfd_device.c
index a21c095..2e7d50d 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -188,3 +188,33 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 
kfree(kfd);
 }
+
+void kgd2kfd_suspend(struct kfd_dev *kfd)
+{
+   BUG_ON(kfd == NULL);
+
+   if (kfd->init_complete) {
+   kfd->device_info->scheduler_class->stop(kfd->scheduler);
+   amd_iommu_free_device(kfd->pdev);
+   }
+}
+
+int kgd2kfd_resume(struct kfd_dev *kfd)
+{
+   pasid_t pasid_limit;
+   int err;
+
+   BUG_ON(kfd == NULL);
+
+   pasid_limit = radeon_kfd_get_pasid_limit();
+
+   if (kfd->init_complete) {
+   err = amd_iommu_init_device(kfd->pdev, pasid_limit);
+   if (err < 0)
+   return -ENXIO;
+   amd_iommu_set_invalidate_ctx_cb(kfd->pdev, iommu_pasid_shutdown_callback);
+   kfd->device_info->scheduler_class->start(kfd->scheduler);
+   }
+
+   return 0;
+}
diff --git a/drivers/gpu/hsa/radeon/kfd_pasid.c 
b/drivers/gpu/hsa/radeon/kfd_pasid.c
index d78bd00..8bd1562 100644
--- a/drivers/gpu/hsa/radeon/kfd_pasid.c
+++ b/drivers/gpu/hsa/radeon/kfd_pasid.c
@@ -68,6 +68,11 @@ bool radeon_kfd_set_pasid_limit(pasid_t new_limit)
return true;
 }
 
+inline pasid_t radeon_kfd_get_pasid_limit(void)
+{
+   return pasid_limit;
+}
+
 pasid_t radeon_kfd_pasid_alloc(void)
 {
pasid_t found;
diff --git a/drivers/gpu/hsa/radeon/kfd_pm.c b/drivers/gpu/hsa/radeon/kfd_pm.c
deleted file mode 100644
index 783311f..000
--- a/drivers/gpu/hsa/radeon/kfd_pm.c
+++ /dev/null
@@ -1,43 +0,0 @@
-/*
- * Copyright 2014 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the Software),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Author: Oded Gabbay
- */
-
-#include <linux/device.h>
-#include "kfd_priv.h"
-#include "kfd_scheduler.h"
-
-void kgd2kfd_suspend(struct kfd_dev *kfd)
-{
-	BUG_ON(kfd == NULL);
-
-	kfd->device_info->scheduler_class->stop(kfd->scheduler);
-}
-
-int kgd2kfd_resume(struct kfd_dev *kfd)
-{
-	BUG_ON(kfd == NULL);
-
-	kfd->device_info->scheduler_class->start(kfd->scheduler);
-
-	return 0;
-}
diff --git a/drivers/gpu/hsa/radeon/kfd_priv.h b/drivers/gpu/hsa/radeon/kfd_priv.h
index bca9cce..8b877ca 100644
--- a/drivers/gpu/hsa/radeon/kfd_priv.h
+++ b/drivers/gpu/hsa/radeon/kfd_priv.h
@@ -213,6 +213,7 @@ struct kfd_queue *radeon_kfd_get_queue(struct kfd_process *p, unsigned int queue
 int radeon_kfd_pasid_init(void);
 void radeon_kfd_pasid_exit(void);
 bool radeon_kfd_set_pasid_limit(pasid_t new_limit);
+pasid_t radeon_kfd_get_pasid_limit(void);
 pasid_t

[PATCH 40/83] hsa/radeon: Adding max clock speeds to topology

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Adding support for CPU and GPU max clock speeds in node properties.

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_topology.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_topology.c b/drivers/gpu/hsa/radeon/kfd_topology.c
index 2ee5444..21bb66e 100644
--- a/drivers/gpu/hsa/radeon/kfd_topology.c
+++ b/drivers/gpu/hsa/radeon/kfd_topology.c
@@ -26,6 +26,7 @@
 #include <linux/errno.h>
 #include <linux/acpi.h>
 #include <linux/hash.h>
+#include <linux/cpufreq.h>
 
 #include "kfd_priv.h"
 #include "kfd_crat.h"
@@ -712,9 +713,10 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
 		sysfs_show_32bit_prop(buffer, "location_id",
 				dev->node_props.location_id);
 		sysfs_show_32bit_prop(buffer, "max_engine_clk_fcompute",
-				dev->node_props.max_engine_clk_fcompute);
+				kfd2kgd->get_max_engine_clock_in_mhz(
+					dev->gpu->kgd));
 		ret = sysfs_show_32bit_prop(buffer, "max_engine_clk_ccompute",
-				dev->node_props.max_engine_clk_ccompute);
+				cpufreq_quick_get_max(0)/1000);
 	}
 
 	return ret;
-- 
1.9.1
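
Editorial note: the patch divides the result of cpufreq_quick_get_max() by
1000 because cpufreq reports frequencies in kHz, while the topology node
property is expressed in MHz. Below is a minimal user-space sketch of that
conversion only. The helper names khz_to_mhz() and khz_to_mhz_selfcheck()
and the sample values are hypothetical and are not part of the patch.

```c
#include <assert.h>

/* Hypothetical helper: integer kHz -> MHz, truncating like the patch's "/1000". */
static unsigned int khz_to_mhz(unsigned int khz)
{
	return khz / 1000u;
}

/* Optional self-check; the input values are illustrative, not measured. */
static void khz_to_mhz_selfcheck(void)
{
	assert(khz_to_mhz(3700000u) == 3700u);
	assert(khz_to_mhz(999u) == 0u);	/* sub-MHz values truncate to 0 */
}
```

Note that a kHz value of 0 maps to 0 MHz, so a failed frequency lookup would
simply show up as a zero clock in the node properties.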



[PATCH 19/83] hsa/radeon: Enable/Disable KFD interrupt module

2014-07-10 Thread Oded Gabbay
This patch adds calls to initialize and finalize the KFD interrupt
module.

The calls are made during per-device initialization and finalization,
inside the kgd-->kfd interface.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/cik_regs.h   |  1 +
 drivers/gpu/hsa/radeon/kfd_device.c | 10 ++++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
index 9c3ce97..813cdc4 100644
--- a/drivers/gpu/hsa/radeon/cik_regs.h
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -73,6 +73,7 @@
 #define CP_PQ_WPTR_POLL_CNTL   0xC20C
 #define	WPTR_POLL_EN			(1 << 31)
 
+#define CPC_INT_CNTL   0xC2D0
 #define CP_ME1_PIPE0_INT_CNTL  0xC214
 #define CP_ME1_PIPE1_INT_CNTL  0xC218
 #define CP_ME1_PIPE2_INT_CNTL  0xC21C
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index b2d2861..b627e57 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -127,6 +127,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 
 	radeon_kfd_doorbell_init(kfd);
 
+	if (radeon_kfd_interrupt_init(kfd))
+		return false;
+
 	if (!device_iommu_pasid_init(kfd))
 		return false;
 
@@ -155,10 +158,13 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 
 	BUG_ON(err != 0);
 
-	if (kfd->init_complete) {
+	if (kfd->init_complete)
 		kfd->device_info->scheduler_class->stop(kfd->scheduler);
-		kfd->device_info->scheduler_class->destroy(kfd->scheduler);
 
+	radeon_kfd_interrupt_exit(kfd);
+
+	if (kfd->init_complete) {
+		kfd->device_info->scheduler_class->destroy(kfd->scheduler);
 		amd_iommu_free_device(kfd->pdev);
 	}
 
-- 
1.9.1



[PATCH 33/83] hsa/radeon: Fix coding style in cik_int.h

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/cik_int.h | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/cik_int.h b/drivers/gpu/hsa/radeon/cik_int.h
index e98551d..350f0c2 100644
--- a/drivers/gpu/hsa/radeon/cik_int.h
+++ b/drivers/gpu/hsa/radeon/cik_int.h
@@ -26,20 +26,20 @@
 #include <linux/types.h>
 
 struct cik_ih_ring_entry {
-   uint32_t source_id  : 8;
-   uint32_t reserved1  : 8;
-   uint32_t reserved2  : 16;
+   uint32_t source_id:8;
+   uint32_t reserved1:8;
+   uint32_t reserved2:16;
 
-   uint32_t data   : 28;
-   uint32_t reserved3  : 4;
+   uint32_t data:28;
+   uint32_t reserved3:4;
 
/* pipeid, meid and unused3 are officially called RINGID,
 * but for our purposes, they always decode into pipe and ME. */
-   uint32_t pipeid : 2;
-   uint32_t meid   : 2;
-   uint32_t reserved4  : 4;
-   uint32_t vmid   : 8;
-   uint32_t pasid  : 16;
+   uint32_t pipeid:2;
+   uint32_t meid:2;
+   uint32_t reserved4:4;
+   uint32_t vmid:8;
+   uint32_t pasid:16;
 
uint32_t reserved5;
 };
-- 
1.9.1
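
Editorial note: the structure touched by this patch describes one interrupt
ring entry as four 32-bit words. The sketch below repeats the field layout in
user-space C and decodes the third word with explicit shifts. It assumes the
first-declared bit-field occupies the least significant bits, as GCC does on
little-endian targets; bit-field packing is implementation-defined. The
ih_dword2_*() helpers are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same field widths and order as struct cik_ih_ring_entry in the patch. */
struct cik_ih_ring_entry {
	uint32_t source_id:8;
	uint32_t reserved1:8;
	uint32_t reserved2:16;

	uint32_t data:28;
	uint32_t reserved3:4;

	uint32_t pipeid:2;
	uint32_t meid:2;
	uint32_t reserved4:4;
	uint32_t vmid:8;
	uint32_t pasid:16;

	uint32_t reserved5;
};

/* Each group of bit-fields sums to 32 bits, so the entry should be 16 bytes. */
static_assert(sizeof(struct cik_ih_ring_entry) == 4 * sizeof(uint32_t),
	      "ring entry is expected to span four dwords");

/* Hypothetical portable decoders for the third dword (LSB-first assumption). */
static unsigned int ih_dword2_pipeid(uint32_t dw2) { return dw2 & 0x3u; }
static unsigned int ih_dword2_meid(uint32_t dw2)   { return (dw2 >> 2) & 0x3u; }
static unsigned int ih_dword2_vmid(uint32_t dw2)   { return (dw2 >> 8) & 0xFFu; }
static unsigned int ih_dword2_pasid(uint32_t dw2)  { return dw2 >> 16; }
```

The shift-based helpers give the same answer on any compiler, which is why
they are shown next to the bit-field form.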



[PATCH 35/83] hsa/radeon: Print ioctl command only in debug mode

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index d6fa980..dba6084 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -324,9 +324,9 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	struct kfd_process *process;
 	long err = -EINVAL;
 
-	dev_info(kfd_device,
-		 "ioctl cmd 0x%x (#%d), arg 0x%lx\n",
-		 cmd, _IOC_NR(cmd), arg);
+	dev_dbg(kfd_device,
+		"ioctl cmd 0x%x (#%d), arg 0x%lx\n",
+		cmd, _IOC_NR(cmd), arg);
 
 	process = radeon_kfd_get_process(current);
 	if (IS_ERR(process))
-- 
1.9.1



[PATCH 29/83] hsa/radeon: Fix memory size allocated for HPD

2014-07-10 Thread Oded Gabbay
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
index 3c3e7d6..5bfde5c 100644
--- a/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
+++ b/drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
@@ -433,7 +433,7 @@ static int cik_static_create(struct kfd_dev *dev, struct kfd_scheduler **schedul
 	 * are no active queues.
 	 */
 	err = radeon_kfd_vidmem_alloc(dev,
-				      CIK_HPD_SIZE * priv->num_pipes * 2,
+				      CIK_HPD_SIZE * priv->num_pipes,
 				      PAGE_SIZE,
 				      KFD_MEMPOOL_SYSTEM_WRITECOMBINE,
 				      &priv->hpd_mem);
-- 
1.9.1



[PATCH 31/83] drm/radeon: extending kfd-kgd interface

2014-07-10 Thread Oded Gabbay
From: Evgeny Pinchuk evgeny.pinc...@amd.com

Adding API for KFD to be able to query the GPU clock counter.

Signed-off-by: Evgeny Pinchuk evgeny.pinc...@amd.com
Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 9 +
 include/linux/radeon_kfd.h  | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index f4cc3c5..121e67b 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -42,6 +42,7 @@ static int kmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, void **ptr);
 static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);
 
 static uint64_t get_vmem_size(struct kgd_dev *kgd);
+static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
 static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
 static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
@@ -55,6 +56,7 @@ static const struct kfd2kgd_calls kfd2kgd = {
.kmap_mem = kmap_mem,
.unkmap_mem = unkmap_mem,
.get_vmem_size = get_vmem_size,
+   .get_gpu_clock_counter = get_gpu_clock_counter,
.lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
.unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
 };
@@ -275,3 +277,10 @@ static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd)
 
 	mutex_unlock(&rdev->srbm_mutex);
 }
+
+static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
+{
+	struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+	return rdev->asic->get_gpu_clock_counter(rdev);
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 63b7bac..fcb6c7a 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -84,6 +84,7 @@ struct kfd2kgd_calls {
void (*unkmap_mem)(struct kgd_dev *kgd, struct kgd_mem *mem);
 
uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
+   uint64_t (*get_gpu_clock_counter)(struct kgd_dev *kgd);
 
/* SRBM_GFX_CNTL mutex */
void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
-- 
1.9.1
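
Editorial note: the patch extends a table of function pointers (struct
kfd2kgd_calls) through which the kfd side calls back into the graphics
driver using an opaque struct kgd_dev handle. The following user-space
sketch shows that pattern in isolation. Everything prefixed with fake_ and
the query_clock() helper are hypothetical stand-ins, not code from radeon or
kfd, and the table is reduced to the single member added here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct kgd_dev;	/* opaque handle; only the provider knows the real type */

/* Reduced copy of the callback table: only the member this patch adds. */
struct kfd2kgd_calls {
	uint64_t (*get_gpu_clock_counter)(struct kgd_dev *kgd);
};

/* Hypothetical stand-in for struct radeon_device. */
struct fake_radeon_device {
	uint64_t clock_counter;
};

/* Provider side: cast the opaque handle back to the private device type. */
static uint64_t fake_get_gpu_clock_counter(struct kgd_dev *kgd)
{
	struct fake_radeon_device *rdev = (struct fake_radeon_device *)kgd;

	return rdev->clock_counter;
}

static const struct kfd2kgd_calls fake_kfd2kgd = {
	.get_gpu_clock_counter = fake_get_gpu_clock_counter,
};

/* Consumer side: goes through the table and never touches the device itself. */
static uint64_t query_clock(const struct kfd2kgd_calls *calls, struct kgd_dev *kgd)
{
	assert(calls != NULL && calls->get_gpu_clock_counter != NULL);
	return calls->get_gpu_clock_counter(kgd);
}
```

Keeping the handle opaque means the consumer can be built without the
provider's headers, at the cost of an unchecked cast on the provider side.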



[PATCH 04/83] drm/radeon: Add radeon <--> kfd interface

2014-07-10 Thread Oded Gabbay
This patch adds the interface between the radeon driver and the kfd
driver. The interface implementation is contained in
radeon_kfd.c and radeon_kfd.h.

The interface itself is represented by a pointer to struct
kfd_dev. The pointer is located inside radeon_device structure.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/Makefile |  1 +
 drivers/gpu/drm/radeon/radeon.h |  3 ++
 drivers/gpu/drm/radeon/radeon_kfd.c | 94 +
 include/linux/radeon_kfd.h  | 67 ++
 4 files changed, 165 insertions(+)
 create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.c
 create mode 100644 include/linux/radeon_kfd.h

diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 1b04002..a1c913d 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -104,6 +104,7 @@ radeon-y += \
radeon_vce.o \
vce_v1_0.o \
vce_v2_0.o \
+   radeon_kfd.o
 
 radeon-$(CONFIG_COMPAT) += radeon_ioc32.o
 radeon-$(CONFIG_VGA_SWITCHEROO) += radeon_atpx_handler.o
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 4e7e41f..90f66bb 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2340,6 +2340,9 @@ struct radeon_device {
 
struct dev_pm_domain vga_pm_domain;
bool have_disp_power_ref;
+
+   /* HSA KFD interface */
+   struct kfd_dev  *kfd;
 };
 
 bool radeon_is_px(struct drm_device *dev);
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
new file mode 100644
index 0000000..7c7f808
--- /dev/null
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -0,0 +1,94 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include <linux/module.h>
+#include <linux/radeon_kfd.h>
+#include <drm/drmP.h>
+#include "radeon.h"
+
+static const struct kfd2kgd_calls kfd2kgd = {
+};
+
+static const struct kgd2kfd_calls *kgd2kfd;
+
+bool radeon_kfd_init(void)
+{
+	bool (*kgd2kfd_init_p)(unsigned, const struct kfd2kgd_calls*,
+				const struct kgd2kfd_calls**);
+
+	kgd2kfd_init_p = symbol_request(kgd2kfd_init);
+
+	if (kgd2kfd_init_p == NULL)
+		return false;
+
+	if (!kgd2kfd_init_p(KFD_INTERFACE_VERSION, &kfd2kgd, &kgd2kfd)) {
+		symbol_put(kgd2kfd_init);
+		kgd2kfd = NULL;
+
+		return false;
+	}
+
+	return true;
+}
+
+void radeon_kfd_fini(void)
+{
+	if (kgd2kfd) {
+		kgd2kfd->exit();
+		symbol_put(kgd2kfd_init);
+	}
+}
+
+void radeon_kfd_device_probe(struct radeon_device *rdev)
+{
+	if (kgd2kfd)
+		rdev->kfd = kgd2kfd->probe((struct kgd_dev *)rdev, rdev->pdev);
+}
+
+void radeon_kfd_device_init(struct radeon_device *rdev)
+{
+	if (rdev->kfd) {
+		struct kgd2kfd_shared_resources gpu_resources = {
+			.mmio_registers = rdev->rmmio,
+
+			.compute_vmid_bitmap = 0xFF00,
+
+			.first_compute_pipe = 1,
+			.compute_pipe_count = 8 - 1,
+		};
+
+		radeon_doorbell_get_kfd_info(rdev,
+				&gpu_resources.doorbell_physical_address,
+				&gpu_resources.doorbell_aperture_size,
+				&gpu_resources.doorbell_start_offset);
+
+		kgd2kfd->device_init(rdev->kfd, &gpu_resources);
+	}
+}
+
+void radeon_kfd_device_fini(struct radeon_device *rdev)
+{
+	if (rdev->kfd) {
+		kgd2kfd->device_exit(rdev->kfd);
+		rdev->kfd = NULL;
+	}
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
new file mode 100644
index 0000000..59785e9
--- /dev/null
+++ b/include/linux/radeon_kfd.h
@@ -0,0 +1,67

[PATCH 11/83] hsa/radeon: Add scheduler code

2014-07-10 Thread Oded Gabbay
This patch adds the code base of the scheduler, which handles queue
creation, deletion and scheduling on the CP of the GPU.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile   |   3 +-
 drivers/gpu/hsa/radeon/cik_regs.h | 213 +++
 drivers/gpu/hsa/radeon/kfd_device.c   |   1 +
 drivers/gpu/hsa/radeon/kfd_registers.c|  50 ++
 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 800 ++
 drivers/gpu/hsa/radeon/kfd_vidmem.c   |  61 ++
 6 files changed, 1127 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/cik_regs.h
 create mode 100644 drivers/gpu/hsa/radeon/kfd_registers.c
 create mode 100644 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
 create mode 100644 drivers/gpu/hsa/radeon/kfd_vidmem.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index 989518a..28da10c 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -4,6 +4,7 @@
 
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
kfd_pasid.o kfd_topology.o kfd_process.o \
-   kfd_doorbell.o
+   kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
+   kfd_vidmem.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/cik_regs.h b/drivers/gpu/hsa/radeon/cik_regs.h
new file mode 100644
index 0000000..d0cdc57
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/cik_regs.h
@@ -0,0 +1,213 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef CIK_REGS_H
+#define CIK_REGS_H
+
+#define BIF_DOORBELL_CNTL			0x530Cu
+
+#define	SRBM_GFX_CNTL				0xE44
+#define	PIPEID(x)				((x) << 0)
+#define	MEID(x)					((x) << 2)
+#define	VMID(x)					((x) << 4)
+#define	QUEUEID(x)				((x) << 8)
+
+#define	SQ_CONFIG				0x8C00
+
+#define	SH_MEM_BASES				0x8C28
+/* if PTR32, these are the bases for scratch and lds */
+#define	PRIVATE_BASE(x)				((x) << 0) /* scratch */
+#define	SHARED_BASE(x)				((x) << 16) /* LDS */
+#define	SH_MEM_APE1_BASE			0x8C2C
+/* if PTR32, this is the base location of GPUVM */
+#define	SH_MEM_APE1_LIMIT			0x8C30
+/* if PTR32, this is the upper limit of GPUVM */
+#define	SH_MEM_CONFIG				0x8C34
+#define	PTR32					(1 << 0)
+#define	ALIGNMENT_MODE(x)			((x) << 2)
+#define	SH_MEM_ALIGNMENT_MODE_DWORD		0
+#define	SH_MEM_ALIGNMENT_MODE_DWORD_STRICT	1
+#define	SH_MEM_ALIGNMENT_MODE_STRICT		2
+#define	SH_MEM_ALIGNMENT_MODE_UNALIGNED		3
+#define	DEFAULT_MTYPE(x)			((x) << 4)
+#define	APE1_MTYPE(x)				((x) << 7)
+
+/* valid for both DEFAULT_MTYPE and APE1_MTYPE */
+#define	MTYPE_NONCACHED				3
+
+
+#define SH_STATIC_MEM_CONFIG			0x9604u
+
+#define	TC_CFG_L1_LOAD_POLICY0			0xAC68
+#define	TC_CFG_L1_LOAD_POLICY1			0xAC6C
+#define	TC_CFG_L1_STORE_POLICY			0xAC70
+#define	TC_CFG_L2_LOAD_POLICY0			0xAC74
+#define	TC_CFG_L2_LOAD_POLICY1			0xAC78
+#define	TC_CFG_L2_STORE_POLICY0

[PATCH 14/83] hsa/radeon: Update MAINTAINERS and CREDITS files

2014-07-10 Thread Oded Gabbay
Update MAINTAINERS and CREDITS files with kfd driver information

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 CREDITS | 7 +++
 MAINTAINERS | 8 
 2 files changed, 15 insertions(+)

diff --git a/CREDITS b/CREDITS
index 03343bf..c5f0aeae 100644
--- a/CREDITS
+++ b/CREDITS
@@ -1197,6 +1197,13 @@ S: R. Tocantins, 89 - Cristo Rei
 S: 80050-430 - Curitiba - Paraná
 S: Brazil
 
+N: Oded Gabbay
+E: oded.gab...@gmail.com
+D: AMD HSA Radeon (KFD) driver maintainer
+S: 12 Shraga Raphaeli
+S: Petah-Tikva, 4906418
+S: Israel
+
 N: Kumar Gala
 E: ga...@kernel.crashing.org
 D: Embedded PowerPC 6xx/7xx/74xx/82xx/83xx/85xx support
diff --git a/MAINTAINERS b/MAINTAINERS
index 3efbeaf..bf1081f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -592,6 +592,14 @@ F: drivers/crypto/geode*
 F: drivers/video/fbdev/geode/
 F: arch/x86/include/asm/geode.h
 
+AMD HSA RADEON DRIVER (KFD)
+M: Oded Gabbay oded.gab...@amd.com
+L: dri-de...@lists.freedesktop.org
+S: Supported
+F: drivers/gpu/hsa/radeon
+F: include/linux/radeon_kfd.h
+F: include/linux/uapi/linux/kfd_ioctl.h
+
 AMD IOMMU (AMD-VI)
 M: Joerg Roedel j...@8bytes.org
 L: io...@lists.linux-foundation.org
-- 
1.9.1



[PATCH 05/83] drm/radeon: Add kfd-->kgd interface to get virtual ram size

2014-07-10 Thread Oded Gabbay
This patch adds a new interface to kfd2kgd_calls structure so that
the kfd driver could get the virtual ram size of a specific
radeon device.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 12 
 include/linux/radeon_kfd.h  |  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 7c7f808..1b859b5 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -25,7 +25,10 @@
 #include <drm/drmP.h>
 #include "radeon.h"
 
+static uint64_t get_vmem_size(struct kgd_dev *kgd);
+
 static const struct kfd2kgd_calls kfd2kgd = {
+   .get_vmem_size = get_vmem_size,
 };
 
 static const struct kgd2kfd_calls *kgd2kfd;
@@ -92,3 +95,12 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
 		rdev->kfd = NULL;
}
 }
+
+static uint64_t get_vmem_size(struct kgd_dev *kgd)
+{
+	struct radeon_device *rdev = (struct radeon_device *)kgd;
+
+	BUG_ON(kgd == NULL);
+
+	return rdev->mc.real_vram_size;
+}
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index 59785e9..28cddf5 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -57,6 +57,7 @@ struct kgd2kfd_calls {
 };
 
 struct kfd2kgd_calls {
+   uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
 };
 
 bool kgd2kfd_init(unsigned interface_version,
-- 
1.9.1



[PATCH 10/83] hsa/radeon: Add initialization and unmapping of doorbell aperture

2014-07-10 Thread Oded Gabbay
This patch adds initialization of the doorbell aperture when
initializing a kfd device.

It also adds a call to unmap the doorbell when a process unbinds
from the kfd.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/Makefile   |  3 +-
 drivers/gpu/hsa/radeon/kfd_device.c   |  2 +
 drivers/gpu/hsa/radeon/kfd_doorbell.c | 72 +++
 3 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/hsa/radeon/kfd_doorbell.c

diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
index ba16a09..989518a 100644
--- a/drivers/gpu/hsa/radeon/Makefile
+++ b/drivers/gpu/hsa/radeon/Makefile
@@ -3,6 +3,7 @@
 #
 
 radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
-   kfd_pasid.o kfd_topology.o kfd_process.o
+   kfd_pasid.o kfd_topology.o kfd_process.o \
+   kfd_doorbell.o
 
 obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
index d122920..4e9fe6c 100644
--- a/drivers/gpu/hsa/radeon/kfd_device.c
+++ b/drivers/gpu/hsa/radeon/kfd_device.c
@@ -123,6 +123,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 
 	kfd->regs = gpu_resources->mmio_registers;
 
+	radeon_kfd_doorbell_init(kfd);
+
 	if (!device_iommu_pasid_init(kfd))
 		return false;
 
diff --git a/drivers/gpu/hsa/radeon/kfd_doorbell.c b/drivers/gpu/hsa/radeon/kfd_doorbell.c
new file mode 100644
index 0000000..79a9d4b
--- /dev/null
+++ b/drivers/gpu/hsa/radeon/kfd_doorbell.c
@@ -0,0 +1,72 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "kfd_priv.h"
+#include <linux/mm.h>
+#include <linux/mman.h>
+
+/*
+ * Each device exposes a doorbell aperture, a PCI MMIO aperture that
+ * receives 32-bit writes that are passed to queues as wptr values.
+ * The doorbells are intended to be written by applications as part
+ * of queueing work on user-mode queues.
+ * We assign doorbells to applications in PAGE_SIZE-sized and aligned chunks.
+ * We map the doorbell address space into user-mode when a process creates
+ * its first queue on each device.
+ * Although the mapping is done by KFD, it is equivalent to an mmap of
+ * the /dev/kfd with the particular device encoded in the mmap offset.
+ * There will be other uses for mmap of /dev/kfd, so only a range of
+ * offsets (KFD_MMAP_DOORBELL_START-END) is used for doorbells.
+ */
+
+/* # of doorbell bytes allocated for each process. */
+static inline size_t doorbell_process_allocation(void)
+{
+   return roundup(sizeof(doorbell_t) * MAX_PROCESS_QUEUES, PAGE_SIZE);
+}
+
+/* Doorbell calculations for device init. */
+void radeon_kfd_doorbell_init(struct kfd_dev *kfd)
+{
+	size_t doorbell_start_offset;
+	size_t doorbell_aperture_size;
+	size_t doorbell_process_limit;
+
+	/* We start with calculations in bytes because the input data might only be byte-aligned.
+	** Only after we have done the rounding can we assume any alignment. */
+
+	doorbell_start_offset = roundup(kfd->shared_resources.doorbell_start_offset,
+					doorbell_process_allocation());
+	doorbell_aperture_size = rounddown(kfd->shared_resources.doorbell_aperture_size,
+					doorbell_process_allocation());
+
+	if (doorbell_aperture_size > doorbell_start_offset)
+		doorbell_process_limit =
+			(doorbell_aperture_size - doorbell_start_offset) / doorbell_process_allocation();
+	else
+		doorbell_process_limit = 0;
+
+	kfd->doorbell_base = kfd->shared_resources.doorbell_physical_address + doorbell_start_offset;
+	kfd->doorbell_id_offset = doorbell_start_offset / sizeof(doorbell_t);
+	kfd->doorbell_process_limit = doorbell_process_limit;
+}
+
-- 
1.9.1
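
Editorial note: the doorbell setup in this patch is plain integer
arithmetic, so it can be tried outside the kernel. The sketch below mirrors
the rounding and the per-process limit calculation of
radeon_kfd_doorbell_init(). The page size, the queue count standing in for
MAX_PROCESS_QUEUES, the 32-bit doorbell_t, struct doorbell_layout and the
helper names are all assumptions made for illustration; the patch does not
show their real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t doorbell_t;		/* assumption: one doorbell is a 32-bit write */

#define SKETCH_PAGE_SIZE	4096u	/* assumption: 4 KiB pages */
#define SKETCH_MAX_QUEUES	1024u	/* assumption: stands in for MAX_PROCESS_QUEUES */

struct doorbell_layout {
	size_t start_offset;	/* rounded-up byte offset of the first usable slice */
	size_t id_offset;	/* the same offset counted in doorbell_t units */
	size_t process_limit;	/* number of per-process slices that fit */
};

/* User-space equivalents of the kernel's roundup() and rounddown(). */
static size_t round_up_to(size_t x, size_t step)
{
	return ((x + step - 1) / step) * step;
}

static size_t round_down_to(size_t x, size_t step)
{
	return (x / step) * step;
}

/* Bytes of doorbell space handed to each process, in whole pages. */
static size_t process_allocation(void)
{
	return round_up_to(sizeof(doorbell_t) * SKETCH_MAX_QUEUES, SKETCH_PAGE_SIZE);
}

static struct doorbell_layout doorbell_layout_compute(size_t start_offset,
						      size_t aperture_size)
{
	struct doorbell_layout l;
	size_t slice = process_allocation();
	size_t start, size;

	assert(slice != 0);

	/* Work in bytes first; alignment can only be assumed after rounding. */
	start = round_up_to(start_offset, slice);
	size = round_down_to(aperture_size, slice);

	l.start_offset = start;
	l.id_offset = start / sizeof(doorbell_t);
	l.process_limit = (size > start) ? (size - start) / slice : 0;
	return l;
}
```

With these assumed constants each process gets exactly one page, so an
aperture of 8 MiB whose first 100 bytes are reserved leaves room for 2047
processes.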

[PATCH 12/83] hsa/radeon: Add kfd mmap handler

2014-07-10 Thread Oded Gabbay
This patch adds the kfd mmap handler that maps the physical address
of a doorbell page to a user-space virtual address. That virtual address
belongs to the process that uses the doorbell page.

This mmap handler is intended to be called only from within the kernel,
not through a user-mode mmap of /dev/kfd.

Signed-off-by: Oded Gabbay oded.gab...@amd.com
---
 drivers/gpu/hsa/radeon/kfd_chardev.c  | 20 +
 drivers/gpu/hsa/radeon/kfd_doorbell.c | 85 +++
 2 files changed, 105 insertions(+)

diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
index 7a56a8f..0b5bc74 100644
--- a/drivers/gpu/hsa/radeon/kfd_chardev.c
+++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
@@ -39,6 +39,7 @@ static const struct file_operations kfd_fops = {
.owner = THIS_MODULE,
.unlocked_ioctl = kfd_ioctl,
.open = kfd_open,
+   .mmap = kfd_mmap,
 };
 
 static int kfd_char_dev_major = -1;
@@ -131,3 +132,22 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 
 	return err;
 }
+
+static int
+kfd_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	unsigned long pgoff = vma->vm_pgoff;
+	struct kfd_process *process;
+
+	process = radeon_kfd_get_process(current);
+	if (IS_ERR(process))
+		return PTR_ERR(process);
+
+	if (pgoff < KFD_MMAP_DOORBELL_START)
+		return -EINVAL;
+
+	if (pgoff < KFD_MMAP_DOORBELL_END)
+		return radeon_kfd_doorbell_mmap(process, vma);
+
+	return -EINVAL;
+}
diff --git a/drivers/gpu/hsa/radeon/kfd_doorbell.c b/drivers/gpu/hsa/radeon/kfd_doorbell.c
index 79a9d4b..e1d8506 100644
--- a/drivers/gpu/hsa/radeon/kfd_doorbell.c
+++ b/drivers/gpu/hsa/radeon/kfd_doorbell.c
@@ -70,3 +70,88 @@ void radeon_kfd_doorbell_init(struct kfd_dev *kfd)
 	kfd->doorbell_process_limit = doorbell_process_limit;
 }
 
+/* This is the /dev/kfd mmap (for doorbell) implementation. We intend that this is only called through map_doorbells,
+** not through user-mode mmap of /dev/kfd. */
+int radeon_kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
+{
+	unsigned int device_index;
+	struct kfd_dev *dev;
+	phys_addr_t start;
+
+	BUG_ON(vma->vm_pgoff < KFD_MMAP_DOORBELL_START || vma->vm_pgoff >= KFD_MMAP_DOORBELL_END);
+
+	/* For simplicity we only allow mapping of the entire doorbell allocation of a single device & process. */
+	if (vma->vm_end - vma->vm_start != doorbell_process_allocation())
+		return -EINVAL;
+
+	/* device_index must be GPU ID!! */
+	device_index = vma->vm_pgoff - KFD_MMAP_DOORBELL_START;
+
+	dev = radeon_kfd_device_by_id(device_index);
+	if (dev == NULL)
+		return -EINVAL;
+
+	vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE | VM_DONTDUMP | VM_PFNMAP;
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+	start = dev->doorbell_base + process->pasid * doorbell_process_allocation();
+
+	pr_debug("kfd: mapping doorbell page in radeon_kfd_doorbell_mmap\n"
+		 " target user address == 0x%016llX\n"
+		 " physical address    == 0x%016llX\n"
+		 " vm_flags            == 0x%08lX\n"
+		 " size                == 0x%08lX\n",
+		 (long long unsigned int) vma->vm_start, start, vma->vm_flags,
+		 doorbell_process_allocation());
+
+	return io_remap_pfn_range(vma,
+				vma->vm_start,
+				start >> PAGE_SHIFT,
+				doorbell_process_allocation(),
+				vma->vm_page_prot);
+}
+
+/* Map the doorbells for a single process & device. This will indirectly call radeon_kfd_doorbell_mmap.
+** This assumes that the process mutex is being held. */
+static int
+map_doorbells(struct file *devkfd, struct kfd_process *process, struct kfd_dev *dev)
+{
+	struct kfd_process_device *pdd = radeon_kfd_get_process_device_data(dev, process);
+
+	if (pdd == NULL)
+		return -ENOMEM;
+
+	if (pdd->doorbell_mapping == NULL) {
+		unsigned long offset = (KFD_MMAP_DOORBELL_START + dev->id) << PAGE_SHIFT;
+		doorbell_t __user *doorbell_mapping;
+
+		doorbell_mapping = (doorbell_t __user *)vm_mmap(devkfd, 0, doorbell_process_allocation(), PROT_WRITE,
+								MAP_SHARED, offset);
+		if (IS_ERR(doorbell_mapping))
+			return PTR_ERR(doorbell_mapping);
+
+		pdd->doorbell_mapping = doorbell_mapping;
+	}
+
+	return 0;
+}
+
+/* Get the user-mode address of a doorbell. Assumes that the process mutex is being held. */
+doorbell_t __user *radeon_kfd_get_doorbell(struct file *devkfd, struct kfd_process *process, struct kfd_dev *dev
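
Editorial note: the mmap path above encodes the target device in the mmap
offset. map_doorbells() builds the offset from the device id, and the
handler turns the page offset back into a device index after a range check.
The user-space sketch below shows that round trip. The SKETCH_* constants
are hypothetical placeholders, since the real values of
KFD_MMAP_DOORBELL_START, KFD_MMAP_DOORBELL_END and PAGE_SHIFT are not shown
in this patch, and the two helper functions do not exist in the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12		/* assumption: 4 KiB pages */
#define SKETCH_DOORBELL_START	0x1000ull	/* hypothetical first doorbell page offset */
#define SKETCH_DOORBELL_END	0x2000ull	/* hypothetical end of range (exclusive) */

/* Encode: byte offset to pass to mmap for a given device id. */
static uint64_t doorbell_mmap_offset(uint32_t dev_id)
{
	return (SKETCH_DOORBELL_START + dev_id) << SKETCH_PAGE_SHIFT;
}

/* Decode: device index for a byte offset, or -EINVAL if outside the range. */
static long doorbell_device_index(uint64_t byte_offset)
{
	uint64_t pgoff = byte_offset >> SKETCH_PAGE_SHIFT;

	if (pgoff < SKETCH_DOORBELL_START || pgoff >= SKETCH_DOORBELL_END)
		return -EINVAL;

	return (long)(pgoff - SKETCH_DOORBELL_START);
}
```

Reserving a window of page offsets for doorbells leaves the rest of the
offset space free for other kinds of /dev/kfd mappings, which matches the
comment in kfd_doorbell.c about future uses of mmap.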

Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-21 Thread Oded Gabbay

On 20/07/14 20:46, Jerome Glisse wrote:

On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote:

Forgot to cc mailing list on cover letter. Sorry.

As a continuation to the existing discussion, here is a v2 patch series
restructured with a cleaner history and no totally-different-early-versions
of the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them
are modifications to radeon driver and 18 of them include only amdkfd code.
There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under
drm/radeon/amdkfd. This move was done to emphasize the fact that this driver
is an AMD-only driver at this point. Having said that, we do foresee a
generic hsa framework being implemented in the future and in that case, we
will adjust amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to
keep it as a separate driver from radeon. Therefore, the amdkfd code is
contained in its own folder. The amdkfd folder was put under the radeon
folder because the only AMD gfx driver in the Linux kernel at this point
is the radeon driver. Having said that, we will probably need to move it
(maybe to be directly under drm) after we integrate with additional AMD gfx
drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay oded.gab...@amd.com


So quick comments before I finish going over all patches. There are many
things that need more documentation, especially as right now there is
no userspace I can go look at.

So quick comments on some of your questions, but first of all, thanks for the
time you dedicated to reviewing the code.


There are a few show stoppers. The biggest one is GPU memory pinning: this is
a big no, and it would need serious arguments for any hope of convincing me on
that side.

We only do GPU memory pinning for kernel objects. There are no userspace objects
that are pinned in GPU memory in our driver. If that is the case, is it
still a show stopper?


The kernel objects are:
- pipelines (4 per device)
- mqd per hiq (only 1 per device)
- mqd per userspace queue. On KV (Kaveri), we support up to 1K queues per
process, for a total of 512K queues. Each mqd is 151 bytes, but allocations are
aligned to 256 bytes, so the total *possible* memory is 128MB
- kernel queue (only 1 per device)
- fence address for kernel queue
- runlists for the CP (1 or 2 per device)
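
The 128MB figure for the per-queue mqds can be checked with a few lines of C. This is only a sketch of the arithmetic, assuming that "K" means 1024 and that "256 alignment" means each 151-byte mqd is rounded up to a 256-byte allocation. The helper names align_up and mqd_pinned_bytes are made up for illustration and are not part of amdkfd.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of align.
 * align must be a non-zero power of two. */
static uint64_t align_up(uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Worst-case number of bytes pinned for mqds, given the number of
 * queues, the raw mqd size and the allocation alignment. */
static uint64_t mqd_pinned_bytes(uint64_t queues, uint64_t mqd_size,
				 uint64_t align)
{
	return queues * align_up(mqd_size, align);
}
```

With 512 * 1024 queues, a 151-byte mqd and 256-byte alignment, mqd_pinned_bytes() gives 128 * 1024 * 1024 bytes, which matches the upper bound quoted in the list.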



It might be better to add a drivers/gpu/drm/amd directory and add common
stuff there.

Given that this is not intended to be the final HSA API AFAICT, I would
say it is far better to avoid the whole kfd module and add ioctls to radeon.
This would avoid crazy communication between radeon and kfd.

The whole aperture business needs some serious explanation. Especially as
you want to use userspace addresses, there is nothing to prevent a userspace
program from allocating things at the addresses you reserve for lds, scratch,
... The only sane way would be to move those lds, scratch apertures inside the
virtual address range reserved for the kernel (see kernel memory map).

The whole business of locking performance counters for exclusive per-process
access is a big NO. Which leads me to the questionable usefulness of the user
space command ring.

That's like saying: "Which leads me to the questionable usefulness of HSA". I
find it analogous to a situation where a network maintainer NACKs a driver for
a network card because it is slower than a different network card. It doesn't
seem reasonable that this situation would happen. He would still put both drivers
in the kernel because people want to use the H/W and its features. So, I don't
think this is a valid reason to NACK the driver.



I only see issues with that. First and foremost, I would
need to see solid figures showing that a kernel ioctl or syscall has a higher
overhead that is measurable in any meaningful way against a simple
function call. I know the userspace command ring is a big marketing feature
that pleases ignorant userspace programmers. But really this only brings issues
and absolutely no upside AFAICT.

Really? You think that doing a context switch to kernel space, with all its
overhead, is _not_ more expensive than just calling a function in userspace
which only puts a buffer on a ring and writes a doorbell?
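
To make the userspace path in that comparison concrete, here is a rough C sketch of "put a buffer on a ring and write a doorbell". Everything in it is hypothetical: struct user_queue, its fields and queue_submit are illustrative names, not the amdkfd or HSA runtime interface, and the doorbell is modelled as ordinary memory written with a C11 atomic store rather than a real device mapping.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-mode queue; names and layout are illustrative only. */
struct user_queue {
	uint32_t *ring;              /* ring buffer shared with the device */
	uint32_t ring_dwords;        /* ring size in dwords, power of two */
	uint32_t wptr;               /* ever-increasing write index, in dwords */
	_Atomic uint32_t *doorbell;  /* doorbell location (plain memory here) */
};

/* Copy one packet into the ring and publish the new write index through
 * the doorbell. No system call is involved on this path. */
static void queue_submit(struct user_queue *q, const uint32_t *pkt,
			 uint32_t ndw)
{
	uint32_t mask = q->ring_dwords - 1;
	uint32_t i;

	assert(q->ring_dwords != 0 && (q->ring_dwords & mask) == 0);

	for (i = 0; i < ndw; i++)
		q->ring[(q->wptr + i) & mask] = pkt[i];
	q->wptr += ndw;

	/* Release ordering: the packet stores above become visible before
	 * the new doorbell value does. */
	atomic_store_explicit(q->doorbell, q->wptr, memory_order_release);
}
```

The sketch leaves out the free-space check a real queue would need before overwriting ring entries the consumer has not read yet, and it says nothing about overcommit, which is the case the ioctl proposal in this thread is concerned with.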


So I would rather see a very simple ioctl that writes the doorbell and might
do more than that in case of ring/queue overcommit, where it would first have
to wait for a free ring/queue to schedule stuff. This would also allow a sane
implementation of things like performance counters that could be acquired by
the kernel for the duration of a job submitted by userspace. While still not
optimal, this would be better than userspace locking.


I might have more thoughts once I am done with all the patches.

Cheers,
Jérôme



Original Cover Letter:

This patch set implements

Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-21 Thread Oded Gabbay

On 21/07/14 16:39, Christian König wrote:

The main questions here are whether it's avoidable to pin down the memory, and
whether the memory is pinned down at driver load, by request from userspace or
by anything else.

As far as I can see only the mqd per userspace queue might be a bit
questionable, everything else sounds reasonable.

Christian.


Most of the pin downs are done at device initialization.
The mqd per userspace queue is pinned at userspace queue creation. However, as I
said, it has an upper limit of 128MB on KV, and considering the 2G local memory,
I think it is OK.
The runlists are also pinned on userspace queue creation/deletion, but we only
have 1 or 2 runlists per device, so it is not that bad.


Oded





Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-21 Thread Oded Gabbay
On 21/07/14 20:05, Daniel Vetter wrote:
 On Mon, Jul 21, 2014 at 11:58:52AM -0400, Jerome Glisse wrote:
 On Mon, Jul 21, 2014 at 05:25:11PM +0200, Daniel Vetter wrote:
 On Mon, Jul 21, 2014 at 03:39:09PM +0200, Christian König wrote:
 The main questions here are whether it's avoidable to pin down the memory, and
 whether the memory is pinned down at driver load, by request from userspace or
 by anything else.

 As far as I can see only the mqd per userspace queue might be a bit
 questionable, everything else sounds reasonable.

 Aside, i915 perspective again (i.e. how we solved this): When scheduling
 away from contexts we unpin them and put them into the lru. And in the
 shrinker we have a last-ditch callback to switch to a default context
 (since you can't ever have no context once you've started) which means we
 can evict any context object if it's getting in the way.

 So does Intel hardware report through some interrupt or some channel when it is
 not using a context? I.e., does the kernel side get a notification when some
 user context is done executing?
 
 Yes, as long as we do the scheduling with the cpu we get interrupts for
 context switches. The mechanic is already published in the execlist
 patches currently floating around. We get a special context switch
 interrupt.
 
 But we have this unpin logic already on the current code where we switch
 contexts through in-line cs commands from the kernel. There we obviously
 use the normal batch completion events.
 
 The issue with radeon hardware AFAICT is that the hardware does not report
 anything about the userspace context running, i.e. you do not get a notification
 when a context is not in use. Well, AFAICT. Maybe the hardware does provide that.
 
 I'm not sure whether we can do the same trick with the hw scheduler. But
 then unpinning hw contexts will drain the pipeline anyway, so I guess we
 can just stop feeding the hw scheduler until it runs dry. And then unpin
 and evict.

So, I'm afraid we can't do this for AMD Kaveri because:

a. The hw scheduler doesn't inform us which queues it is going to
execute next. We feed it a runlist of queues, which can be very large
(we have a test that runs 1000 queues on the same runlist, but we can
put a lot more). All the MQDs of those queues must be pinned in memory
as long as the runlist is in effect. The runlist is in effect until
either

Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-21 Thread Oded Gabbay
On 21/07/14 18:54, Jerome Glisse wrote:
 On Mon, Jul 21, 2014 at 05:12:06PM +0300, Oded Gabbay wrote:
 On 21/07/14 16:39, Christian König wrote:
 The main questions here are whether it's avoidable to pin down the memory, and
 whether the memory is pinned down at driver load, by request from userspace or
 by anything else.

 As far as I can see only the mqd per userspace queue might be a bit
 questionable, everything else sounds reasonable.

 Christian.

 Most of the pin downs are done at device initialization.
 The mqd per userspace queue is pinned at userspace queue creation. However, as I
 said, it has an upper limit of 128MB on KV, and considering the 2G local
 memory, I think it is OK.
 The runlists are also pinned on userspace queue creation/deletion, but we only
 have 1 or 2 runlists per device, so it is not that bad.
 
 2G local memory? You cannot assume anything about the user-side configuration.
 Someone might build an HSA computer with 512M and still expect a functioning
 desktop.

First of all, I'm only considering Kaveri computers, not HSA computers.
Second, I would imagine we can build some protection around it, like
checking the total local memory and limiting the number of queues based on some
percentage of that total local memory. So, if someone has only
512M, he will be able to open fewer queues.
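
A protection like that could look roughly like the following C sketch. It is purely illustrative: max_queues_for_memory and its parameters are hypothetical names, and the percentage is a policy knob invented for the example, not something taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical policy: let mqds use at most `percent` of local memory,
 * and never exceed the hardware queue limit. */
static uint64_t max_queues_for_memory(uint64_t local_mem_bytes,
				      unsigned int percent,
				      uint64_t mqd_alloc_bytes,
				      uint64_t hw_max_queues)
{
	uint64_t budget, n;

	assert(percent <= 100 && mqd_alloc_bytes != 0);

	/* Divide first so the multiplication cannot overflow. */
	budget = local_mem_bytes / 100 * percent;
	n = budget / mqd_alloc_bytes;

	return n < hw_max_queues ? n : hw_max_queues;
}
```

With 2G of local memory, a 10% budget and 256-byte mqd allocations, the result is capped by the 512K hardware limit. With only 512M and a smaller budget, the memory term becomes the limiting factor and fewer queues can be opened.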


 
 I need to go look into what all this mqd is for, what it does and what it is
 about. But pinning is really bad, and this is an issue with userspace command
 scheduling, an issue that AMD obviously failed to take into account in the
 design phase.

Maybe, but that is the H/W design nonetheless. We can't very well
change the H/W.

Oded
 
