date:20130620



> -Original Message-
> From: Lucas Stach [mailto:l.stach at pengutronix.de]
> Sent: Thursday, June 20, 2013 7:11 PM
> To: Inki Dae
> Cc: 'Russell King - ARM Linux'; 'Inki Dae'; 'linux-fbdev'; 'YoungJun Cho';
> 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
> kernel at lists.infradead.org; linux-media at vger.kernel.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> Am Donnerstag, den 20.06.2013, 17:24 +0900 schrieb Inki Dae:
> [...]
> > > > In addition, please see the below more detail examples.
> > > >
> > > > The conventional way (without dmabuf-sync) is:
> > > > Task A
> > > > 
> > > >  1. CPU accesses buf
> > > >  2. Send the buf to Task B
> > > >  3. Wait for the buf from Task B
> > > >  4. go to 1
> > > >
> > > > Task B
> > > > ---
> > > > 1. Wait for the buf from Task A
> > > > 2. qbuf the buf
> > > > 2.1 insert the buf to incoming queue
> > > > 3. stream on
> > > > 3.1 dma_map_sg if ready, and move the buf to ready queue
> > > > 3.2 get the buf from ready queue, and dma start.
> > > > 4. dqbuf
> > > > 4.1 dma_unmap_sg after dma operation completion
> > > > 4.2 move the buf to outgoing queue
> > > > 5. back the buf to Task A
> > > > 6. go to 1
> > > >
> > > > In case that two tasks share buffers, and data flow goes from Task A
> to
> > > Task
> > > > B, we would need IPC operation to send and receive buffers properly
> > > between
> > > > those two tasks every time CPU or DMA access to buffers is started
> or
> > > > completed.
> > > >
> > > >
> > > > With dmabuf-sync:
> > > >
> > > > Task A
> > > > 
> > > >  1. dma_buf_sync_lock <- synpoint (call by user side)
> > > >  2. CPU accesses buf
> > > >  3. dma_buf_sync_unlock <- syncpoint (call by user side)
> > > >  4. Send the buf to Task B (just one time)
> > > >  5. go to 1
> > > >
> > > >
> > > > Task B
> > > > ---
> > > > 1. Wait for the buf from Task A (just one time)
> > > > 2. qbuf the buf
> > > > 1.1 insert the buf to incoming queue
> > > > 3. stream on
> > > > 3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
> > > > 3.2 dma_map_sg if ready, and move the buf to ready queue
> > > > 3.3 get the buf from ready queue, and dma start.
> > > > 4. dqbuf
> > > > 4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
> > > > 4.2 dma_unmap_sg after dma operation completion
> > > > 4.3 move the buf to outgoing queue
> > > > 5. go to 1
> > > >
> > > > On the other hand, in case of using dmabuf-sync, as you can see the
> > > above
> > > > example, we would need IPC operation just one time. That way, I
> think we
> > > > could not only reduce performance overhead but also make user
> > > application
> > > > simplified. Of course, this approach can be used for all DMA device
> > > drivers
> > > > such as DRM. I'm not a specialist in v4l2 world so there may be
> missing
> > > > point.
> > > >
> > >
> > > You already need some kind of IPC between the two tasks, as I suspect
> > > even in your example it wouldn't make much sense to queue the buffer
> > > over and over again in task B without task A writing anything to it.
> So
> > > task A has to signal task B there is new data in the buffer to be
> > > processed.
> > >
> > > There is no need to share the buffer over and over again just to get
> the
> > > two processes to work together on the same thing. Just share the fd
> > > between both and then do out-of-band completion signaling, as you need
> > > this anyway. Without this you'll end up with unpredictable behavior.
> > > Just because sync allows you to access the buffer doesn't mean it's
> > > valid for your use-case. Without completion signaling you could easily
> > > end up overwriting your data from task A multiple times before task B
> > > even tries to lock the buffer for processing.
> > >
> > > So the valid flow is (and this already works with the current APIs):
> > > Task ATask B
> > > ----
> > > CPU access buffer
> > >  --completion signal->
> > >   qbuf (dragging buffer into
> > >   device domain, flush caches,
> > >   reserve buffer etc.)
> > > |
> > >   wait for device operation to
> > >   complete
> > > |
> > >   dqbuf (dragging buffer back
> > >   into CPU domain, invalidate
> > >   caches, unreserve)
> > > <-completion signal
> > > CPU access buffer
> > >
> >
> >

[PATCH v2 8/8] of/documentation: update with clock information for exynos hdmi subsystem

Adding information about clocks to the binding documentation
for exynos mixer and hdmi.

Signed-off-by: Rahul Sharma 
---
 Documentation/devicetree/bindings/video/exynos_hdmi.txt  |   14 +-
 Documentation/devicetree/bindings/video/exynos_mixer.txt |4 
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/video/exynos_hdmi.txt 
b/Documentation/devicetree/bindings/video/exynos_hdmi.txt
index 584385a..827e64b 100644
--- a/Documentation/devicetree/bindings/video/exynos_hdmi.txt
+++ b/Documentation/devicetree/bindings/video/exynos_hdmi.txt
@@ -9,7 +9,19 @@ Required properties:
a) phandle of the gpio controller node.
b) pin number within the gpio controller.
c) optional flags and pull up/down.
-
+- clocks: list of clock IDs from SoC clock driver.
+   a) hdmi: It is required for gate operation on aclk_200_disp1 clock
+   which clocks the display1 block.
+   b) sclk_hdmi: It is required for gate operation on sclk_hdmi clock
+   which clocks hdmi IP.
+   c) sclk_pixel: Parent for mux mout_hdmi.
+   d) sclk_hdmiphy: Parent for mux mout_hdmi.
+   e) mout_hdmi: It is required by the driver to switch between the 2
+   parents i.e. sclk_pixel and sclk_hdmiphy. If hdmiphy is stable
+   after configuration, parent is set to sclk_hdmiphy else
+   sclk_pixel.
+- clock-names: aliases as per driver requirements for above clock IDs:
+   "hdmi", "sclk_hdmi", "sclk_pixel", "sclk_hdmiphy" and "mout_hdmi".
 Example:

hdmi {
diff --git a/Documentation/devicetree/bindings/video/exynos_mixer.txt 
b/Documentation/devicetree/bindings/video/exynos_mixer.txt
index 9b2ea03..54790a2 100644
--- a/Documentation/devicetree/bindings/video/exynos_mixer.txt
+++ b/Documentation/devicetree/bindings/video/exynos_mixer.txt
@@ -5,6 +5,10 @@ Required properties:
 - reg: physical base address of the mixer and length of memory mapped
region.
 - interrupts: interrupt number to the cpu.
+- clocks: list of clock IDs from SoC clock driver.
+   a) mixer: It is required for gate operation on aclk_200_disp1 clock
+   which clocks the display1 block.
+   b) sclk_hdmi: Parent for mux mout_mixer.

 Example:

-- 
1.7.10.4

[PATCH v2 7/8] of/documentation: Update hpd gpio property for exynos_hdmi

From: Sachin Kamat 

Exynos SoCs use pinctrl to configure GPIOs. Update the document
to reflect this change.

Signed-off-by: Sachin Kamat 
Signed-off-by: Rahul Sharma 
---
 Documentation/devicetree/bindings/video/exynos_hdmi.txt |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/Documentation/devicetree/bindings/video/exynos_hdmi.txt 
b/Documentation/devicetree/bindings/video/exynos_hdmi.txt
index 589edee..584385a 100644
--- a/Documentation/devicetree/bindings/video/exynos_hdmi.txt
+++ b/Documentation/devicetree/bindings/video/exynos_hdmi.txt
@@ -8,9 +8,7 @@ Required properties:
 - hpd-gpio: following information about the hotplug gpio pin.
a) phandle of the gpio controller node.
b) pin number within the gpio controller.
-   c) pin function mode.
-   d) optional flags and pull up/down.
-   e) drive strength.
+   c) optional flags and pull up/down.

 Example:

@@ -18,5 +16,5 @@ Example:
compatible = "samsung,exynos5-hdmi";
reg = <0x1453 0x10>;
interrupts = <0 95 0>;
-   hpd-gpio = < 7 0xf 1 3>;
+   hpd-gpio = < 7 1>;
};
-- 
1.7.10.4

[PATCH v2 6/8] ARM: dts: exynos5420: add hdmi hpd gpio pinctrl node

Add pinctrl node for hdmi-hpd gpio pin to exynos5420
device tree files.

Signed-off-by: Rahul Sharma 
---
 arch/arm/boot/dts/exynos5420-smdk5420.dts |   11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420-smdk5420.dts 
b/arch/arm/boot/dts/exynos5420-smdk5420.dts
index fcd0f69..0f98eab 100644
--- a/arch/arm/boot/dts/exynos5420-smdk5420.dts
+++ b/arch/arm/boot/dts/exynos5420-smdk5420.dts
@@ -31,9 +31,20 @@
};
};

+   pinctrl at 1340 {
+   hdmi_hpd_irq: hdmi-hpd-irq {
+   samsung,pins = "gpx3-7";
+   samsung,pin-function = <0>;
+   samsung,pin-pud = <1>;
+   samsung,pin-drv = <0>;
+   };
+   };
+
hdmi at 1453 {
status = "okay";
hpd-gpio = < 7 0>;
+   pinctrl-names = "default";
+   pinctrl-0 = <_hpd_irq>;
};

mixer at 1445 {
-- 
1.7.10.4

[PATCH v2 5/8] ARM: dts: exynos5420: add clocks for hdmi subsystem

Add clocks for hdmi and mixer for exynos5420 SoC.

Signed-off-by: Rahul Sharma 
---
 arch/arm/boot/dts/exynos5420.dtsi |6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420.dtsi 
b/arch/arm/boot/dts/exynos5420.dtsi
index 93caef7..5fa4093 100644
--- a/arch/arm/boot/dts/exynos5420.dtsi
+++ b/arch/arm/boot/dts/exynos5420.dtsi
@@ -180,9 +180,15 @@

hdmi at 1453 {
compatible = "samsung,exynos4212-hdmi";
+   clocks = < 413>, < 143>, < 144>,
+   < 158>, < 1024>;
+   clock-names = "hdmi", "sclk_hdmi", "sclk_pixel",
+   "sclk_hdmiphy", "mout_hdmi";
};

mixer at 1445 {
compatible = "samsung,exynos5420-mixer";
+   clocks = < 431>, < 143>;
+   clock-names = "mixer", "sclk_hdmi";
};
 };
-- 
1.7.10.4

[PATCH v2 4/8] ARM: dts: exynos5420: add dt nodes for hdmi subsystem

Add hdmi, mixer, ddc device tree nodes for Exynos 5420 SoC.

Signed-off-by: Rahul Sharma 
---
 arch/arm/boot/dts/exynos5420-smdk5420.dts |   20 
 arch/arm/boot/dts/exynos5420.dtsi |8 
 2 files changed, 28 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420-smdk5420.dts 
b/arch/arm/boot/dts/exynos5420-smdk5420.dts
index 08607df..fcd0f69 100644
--- a/arch/arm/boot/dts/exynos5420-smdk5420.dts
+++ b/arch/arm/boot/dts/exynos5420-smdk5420.dts
@@ -30,4 +30,24 @@
clock-frequency = <2400>;
};
};
+
+   hdmi at 1453 {
+   status = "okay";
+   hpd-gpio = < 7 0>;
+   };
+
+   mixer at 1445 {
+   status = "okay";
+   };
+
+   i2c_2: i2c at 12C8 {
+   samsung,i2c-sda-delay = <100>;
+   samsung,i2c-max-bus-freq = <66000>;
+   status = "okay";
+
+   hdmiddc at 50 {
+   compatible = "samsung,exynos4210-hdmiddc";
+   reg = <0x50>;
+   };
+   };
 };
diff --git a/arch/arm/boot/dts/exynos5420.dtsi 
b/arch/arm/boot/dts/exynos5420.dtsi
index 953f877..93caef7 100644
--- a/arch/arm/boot/dts/exynos5420.dtsi
+++ b/arch/arm/boot/dts/exynos5420.dtsi
@@ -177,4 +177,12 @@
pinctrl-names = "default";
pinctrl-0 = <_bus>;
};
+
+   hdmi at 1453 {
+   compatible = "samsung,exynos4212-hdmi";
+   };
+
+   mixer at 1445 {
+   compatible = "samsung,exynos5420-mixer";
+   };
 };
-- 
1.7.10.4

[PATCH v2 3/8] ARM: dts: exynos5250: move common hdmi properties to exynos5 dtsi

Hdmi Subsystem nodes shares many properties across exynos5 SoCs
(exynos5250 and exyno5420). Common code is moved to exynos5.dtsi
which is included in exyno5250 and exynos5420 SoC files.

It also renames the hdmi and mixer nodes as per dt naming
convention in the format name at phy_add.

Signed-off-by: Rahul Sharma 
---
 arch/arm/boot/dts/cros5250-common.dtsi|2 +-
 arch/arm/boot/dts/exynos5.dtsi|   12 
 arch/arm/boot/dts/exynos5250-arndale.dts  |7 ++-
 arch/arm/boot/dts/exynos5250-smdk5250.dts |7 ++-
 arch/arm/boot/dts/exynos5250-snow.dts |8 
 arch/arm/boot/dts/exynos5250.dtsi |8 ++--
 6 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/arch/arm/boot/dts/cros5250-common.dtsi 
b/arch/arm/boot/dts/cros5250-common.dtsi
index 3f0239e..abda0b9 100644
--- a/arch/arm/boot/dts/cros5250-common.dtsi
+++ b/arch/arm/boot/dts/cros5250-common.dtsi
@@ -299,7 +299,7 @@
status = "disabled";
};

-   hdmi {
+   hdmi at 1453 {
hpd-gpio = < 7 0>;
};

diff --git a/arch/arm/boot/dts/exynos5.dtsi b/arch/arm/boot/dts/exynos5.dtsi
index 1ae179e..dcb4943 100644
--- a/arch/arm/boot/dts/exynos5.dtsi
+++ b/arch/arm/boot/dts/exynos5.dtsi
@@ -144,4 +144,16 @@
#size-cells = <0>;
status = "disabled";
};
+
+   hdmi at 1453 {
+   reg = <0x1453 0x7>;
+   interrupts = <0 95 0>;
+   status = "disabled";
+   };
+
+   mixer at 1445 {
+   reg = <0x1445 0x1>;
+   interrupts = <0 94 0>;
+   status = "disabled";
+   };
 };
diff --git a/arch/arm/boot/dts/exynos5250-arndale.dts 
b/arch/arm/boot/dts/exynos5250-arndale.dts
index 5c7dd6d..c18c346 100644
--- a/arch/arm/boot/dts/exynos5250-arndale.dts
+++ b/arch/arm/boot/dts/exynos5250-arndale.dts
@@ -423,13 +423,18 @@
};
};

-   hdmi {
+   hdmi at 1453 {
+   status = "okay";
hpd-gpio = < 7 2>;
vdd_osc-supply = <_reg>;
vdd_pll-supply = <_reg>;
vdd-supply = <_reg>;
};

+   mixer at 1445 {
+   status = "okay";
+   };
+
mmc_reg: voltage-regulator {
compatible = "regulator-fixed";
regulator-name = "VDD_33ON_2.8V";
diff --git a/arch/arm/boot/dts/exynos5250-smdk5250.dts 
b/arch/arm/boot/dts/exynos5250-smdk5250.dts
index d3da369..c1b8b7f 100644
--- a/arch/arm/boot/dts/exynos5250-smdk5250.dts
+++ b/arch/arm/boot/dts/exynos5250-smdk5250.dts
@@ -221,10 +221,15 @@
status = "disabled";
};

-   hdmi {
+   hdmi at 1453 {
+   status = "okay";
hpd-gpio = < 7 0>;
};

+   mixer at 1445 {
+   status = "okay";
+   };
+
codec at 1100 {
samsung,mfc-r = <0x4300 0x80>;
samsung,mfc-l = <0x5100 0x80>;
diff --git a/arch/arm/boot/dts/exynos5250-snow.dts 
b/arch/arm/boot/dts/exynos5250-snow.dts
index e79331d..b1378af 100644
--- a/arch/arm/boot/dts/exynos5250-snow.dts
+++ b/arch/arm/boot/dts/exynos5250-snow.dts
@@ -196,4 +196,12 @@
clock-frequency = <2400>;
};
};
+
+   hdmi at 1453 {
+   status = "okay";
+   };
+
+   mixer at 1445 {
+   status = "okay";
+   };
 };
diff --git a/arch/arm/boot/dts/exynos5250.dtsi 
b/arch/arm/boot/dts/exynos5250.dtsi
index baf6f88..0fdbe55 100644
--- a/arch/arm/boot/dts/exynos5250.dtsi
+++ b/arch/arm/boot/dts/exynos5250.dtsi
@@ -578,20 +578,16 @@
clock-names = "gscl";
};

-   hdmi {
+   hdmi at 1453 {
compatible = "samsung,exynos5-hdmi";
-   reg = <0x1453 0x7>;
-   interrupts = <0 95 0>;
clocks = < 333>, < 136>, < 137>,
< 333>, < 333>;
clock-names = "hdmi", "sclk_hdmi", "sclk_pixel",
"sclk_hdmiphy", "hdmiphy";
};

-   mixer {
+   mixer at 1445 {
compatible = "samsung,exynos5-mixer";
-   reg = <0x1445 0x1>;
-   interrupts = <0 94 0>;
};

dp-controller {
-- 
1.7.10.4

[PATCH v2 2/8] ARM: dts: exynos5420: add i2c device nodes

From: Andrew Bresticker 

This adds device-tree nodes for the i2c busses on Exynos
5420 platforms.

Signed-off-by: Andrew Bresticker 
Signed-off-by: Rahul Sharma 
---
 arch/arm/boot/dts/exynos5420.dtsi |   32 
 1 file changed, 32 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420.dtsi 
b/arch/arm/boot/dts/exynos5420.dtsi
index 8c54c4b..953f877 100644
--- a/arch/arm/boot/dts/exynos5420.dtsi
+++ b/arch/arm/boot/dts/exynos5420.dtsi
@@ -24,6 +24,10 @@
pinctrl2 = _2;
pinctrl3 = _3;
pinctrl4 = _4;
+   i2c0 = _0;
+   i2c1 = _1;
+   i2c2 = _2;
+   i2c3 = _3;
};

cpus {
@@ -145,4 +149,32 @@
clocks = < 260>, < 131>;
clock-names = "uart", "clk_uart_baud0";
};
+
+   i2c_0: i2c at 12C6 {
+   clocks = < 261>;
+   clock-names = "i2c";
+   pinctrl-names = "default";
+   pinctrl-0 = <_bus>;
+   };
+
+   i2c_1: i2c at 12C7 {
+   clocks = < 262>;
+   clock-names = "i2c";
+   pinctrl-names = "default";
+   pinctrl-0 = <_bus>;
+   };
+
+   i2c_2: i2c at 12C8 {
+   clocks = < 263>;
+   clock-names = "i2c";
+   pinctrl-names = "default";
+   pinctrl-0 = <_bus>;
+   };
+
+   i2c_3: i2c at 12C9 {
+   clocks = < 264>;
+   clock-names = "i2c";
+   pinctrl-names = "default";
+   pinctrl-0 = <_bus>;
+   };
 };
-- 
1.7.10.4

[PATCH v2 1/8] ARM: dts: exynos5250: move common i2c properties to exynos5 dtsi

I2C nodes shares many properties across exynos5 SoCs (exynos5250
and exyno5420). Common code is moved to exynos5.dtsi which is
included in exyno5250 and exynos5420 SoC files.

Signed-off-by: Rahul Sharma 
---
 arch/arm/boot/dts/exynos5.dtsi|   36 +
 arch/arm/boot/dts/exynos5250-arndale.dts  |1 +
 arch/arm/boot/dts/exynos5250-smdk5250.dts |3 +++
 arch/arm/boot/dts/exynos5250.dtsi |   20 
 4 files changed, 40 insertions(+), 20 deletions(-)

diff --git a/arch/arm/boot/dts/exynos5.dtsi b/arch/arm/boot/dts/exynos5.dtsi
index f65e124..1ae179e 100644
--- a/arch/arm/boot/dts/exynos5.dtsi
+++ b/arch/arm/boot/dts/exynos5.dtsi
@@ -108,4 +108,40 @@
interrupts = <0 42 0>;
status = "disabled";
};
+
+   i2c_0: i2c at 12C6 {
+   compatible = "samsung,s3c2440-i2c";
+   reg = <0x12C6 0x100>;
+   interrupts = <0 56 0>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "disabled";
+   };
+
+   i2c_1: i2c at 12C7 {
+   compatible = "samsung,s3c2440-i2c";
+   reg = <0x12C7 0x100>;
+   interrupts = <0 57 0>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "disabled";
+   };
+
+   i2c_2: i2c at 12C8 {
+   compatible = "samsung,s3c2440-i2c";
+   reg = <0x12C8 0x100>;
+   interrupts = <0 58 0>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "disabled";
+   };
+
+   i2c_3: i2c at 12C9 {
+   compatible = "samsung,s3c2440-i2c";
+   reg = <0x12C9 0x100>;
+   interrupts = <0 59 0>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "disabled";
+   };
 };
diff --git a/arch/arm/boot/dts/exynos5250-arndale.dts 
b/arch/arm/boot/dts/exynos5250-arndale.dts
index abc7272..5c7dd6d 100644
--- a/arch/arm/boot/dts/exynos5250-arndale.dts
+++ b/arch/arm/boot/dts/exynos5250-arndale.dts
@@ -30,6 +30,7 @@
};

i2c at 12C6 {
+   status = "okay";
samsung,i2c-sda-delay = <100>;
samsung,i2c-max-bus-freq = <2>;
samsung,i2c-slave-addr = <0x66>;
diff --git a/arch/arm/boot/dts/exynos5250-smdk5250.dts 
b/arch/arm/boot/dts/exynos5250-smdk5250.dts
index 35a66de..d3da369 100644
--- a/arch/arm/boot/dts/exynos5250-smdk5250.dts
+++ b/arch/arm/boot/dts/exynos5250-smdk5250.dts
@@ -28,6 +28,7 @@
};

i2c at 12C6 {
+   status = "okay";
samsung,i2c-sda-delay = <100>;
samsung,i2c-max-bus-freq = <2>;

@@ -62,6 +63,7 @@
};

i2c at 12C7 {
+   status = "okay";
samsung,i2c-sda-delay = <100>;
samsung,i2c-max-bus-freq = <2>;

@@ -101,6 +103,7 @@
};

i2c at 12C8 {
+   status = "okay";
samsung,i2c-sda-delay = <100>;
samsung,i2c-max-bus-freq = <66000>;

diff --git a/arch/arm/boot/dts/exynos5250.dtsi 
b/arch/arm/boot/dts/exynos5250.dtsi
index d04ab0a..baf6f88 100644
--- a/arch/arm/boot/dts/exynos5250.dtsi
+++ b/arch/arm/boot/dts/exynos5250.dtsi
@@ -217,11 +217,6 @@
};

i2c_0: i2c at 12C6 {
-   compatible = "samsung,s3c2440-i2c";
-   reg = <0x12C6 0x100>;
-   interrupts = <0 56 0>;
-   #address-cells = <1>;
-   #size-cells = <0>;
clocks = < 294>;
clock-names = "i2c";
pinctrl-names = "default";
@@ -229,11 +224,6 @@
};

i2c_1: i2c at 12C7 {
-   compatible = "samsung,s3c2440-i2c";
-   reg = <0x12C7 0x100>;
-   interrupts = <0 57 0>;
-   #address-cells = <1>;
-   #size-cells = <0>;
clocks = < 295>;
clock-names = "i2c";
pinctrl-names = "default";
@@ -241,11 +231,6 @@
};

i2c_2: i2c at 12C8 {
-   compatible = "samsung,s3c2440-i2c";
-   reg = <0x12C8 0x100>;
-   interrupts = <0 58 0>;
-   #address-cells = <1>;
-   #size-cells = <0>;
clocks = < 296>;
clock-names = "i2c";
pinctrl-names = "default";
@@ -253,11 +238,6 @@
};

i2c_3: i2c at 12C9 {
-   compatible = "samsung,s3c2440-i2c";
-   reg = <0x12C9 0x100>;
-   interrupts = <0 59 0>;
-   #address-cells = <1>;
-   #size-cells = <0>;
clocks = < 297>;
clock-names = "i2c";
pinctrl-names = "default";
-- 
1.7.10.4

[PATCH v2 0/8] ARM: dts: exynos5420: add support for hdmi subsystem

Common properties for I2C and Hdmi Subsystem is moved to exynos5
dtsi file. It also adds Device tree nodes and clocks information
for exynos5420 SoC. It adds pinctrl node for hdmi hpd gpio and
update binding documents.

This patch is based on v3.11-next/soc-exynos5420-pinctrl branch
at http://git.kernel.org/cgit/linux/kernel/git/kgene/linux-samsung.git

device tree binding document for hdmi compatible type is update at
http://www.spinics.net/lists/dri-devel/msg39987.html

v2:
1) Added patch for moving common i2c properties to exynos5.dtsi
2) Added patch for moving common hdmi, mixer properties to exynos5.dtsi
3) moved hpd pinctrl node to board file.
4) Added Sachin's patch to update binding document for hdmi with hpd
information.

Andrew Bresticker (1):
  ARM: dts: exynos5420: add i2c device nodes

Rahul Sharma (6):
  ARM: dts: exynos5250: move common i2c properties to exynos5 dtsi
  ARM: dts: exynos5250: move common hdmi properties to exynos5 dtsi
  ARM: dts: exynos5420: add dt nodes for hdmi subsystem
  ARM: dts: exynos5420: add clocks for hdmi subsystem
  ARM: dts: exynos5420: add hdmi hpd gpio pinctrl node
  of/documentation: update with clock information for exynos hdmi
subsystem

Sachin Kamat (1):
  of/documentation: Update hpd gpio property for exynos_hdmi

 .../devicetree/bindings/video/exynos_hdmi.txt  |   20 ++--
 .../devicetree/bindings/video/exynos_mixer.txt |4 ++
 arch/arm/boot/dts/cros5250-common.dtsi |2 +-
 arch/arm/boot/dts/exynos5.dtsi |   48 
 arch/arm/boot/dts/exynos5250-arndale.dts   |8 +++-
 arch/arm/boot/dts/exynos5250-smdk5250.dts  |   10 +++-
 arch/arm/boot/dts/exynos5250-snow.dts  |8 
 arch/arm/boot/dts/exynos5250.dtsi  |   28 +---
 arch/arm/boot/dts/exynos5420-smdk5420.dts  |   31 +
 arch/arm/boot/dts/exynos5420.dtsi  |   46 +++
 10 files changed, 171 insertions(+), 34 deletions(-)

-- 
1.7.10.4

[RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework



> -Original Message-
> From: Lucas Stach [mailto:l.stach at pengutronix.de]
> Sent: Thursday, June 20, 2013 4:47 PM
> To: Inki Dae
> Cc: 'Russell King - ARM Linux'; 'Inki Dae'; 'linux-fbdev'; 'YoungJun Cho';
> 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
> kernel at lists.infradead.org; linux-media at vger.kernel.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> Am Donnerstag, den 20.06.2013, 15:43 +0900 schrieb Inki Dae:
> >
> > > -Original Message-
> > > From: dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org
> > > [mailto:dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org]
> On
> > > Behalf Of Russell King - ARM Linux
> > > Sent: Thursday, June 20, 2013 3:29 AM
> > > To: Inki Dae
> > > Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham;
> YoungJun
> > > Cho; linux-media at vger.kernel.org; linux-arm-kernel at 
> > > lists.infradead.org
> > > Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer
> synchronization
> > > framework
> > >
> > > On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
> > > > On the other hand, the below shows how we could enhance the
> conventional
> > > > way with my approach (just example):
> > > >
> > > > CPU -> DMA,
> > > > ioctl(qbuf command)  ioctl(streamon)
> > > >   |   |
> > > >   |   |
> > > > qbuf  <- dma_buf_sync_get   start streaming <- syncpoint
> > > >
> > > > dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
> > > And
> > > > the syncpoint is performed by calling dma_buf_sync_lock(), and then
> DMA
> > > > accesses the sync buffer.
> > > >
> > > > And DMA -> CPU,
> > > > ioctl(dqbuf command)
> > > >   |
> > > >   |
> > > > dqbuf <- nothing to do
> > > >
> > > > Actual syncpoint is when DMA operation is completed (in interrupt
> > > handler):
> > > > the syncpoint is performed by calling dma_buf_sync_unlock().
> > > > Hence,  my approach is to move the syncpoints into just before dma
> > > access
> > > > as long as possible.
> > >
> > > What you've just described does *not* work on architectures such as
> > > ARMv7 which do speculative cache fetches from memory at any time that
> > > that memory is mapped with a cacheable status, and will lead to data
> > > corruption.
> >
> > I didn't explain that enough. Sorry about that. 'nothing to do' means
> that a
> > dmabuf sync interface isn't called but existing functions are called. So
> > this may be explained again:
> > ioctl(dqbuf command)
> > |
> > |
> > dqbuf <- 1. dma_unmap_sg
> > 2. dma_buf_sync_unlock (syncpoint)
> >
> > The syncpoint I mentioned means lock mechanism; not doing cache
> operation.
> >
> > In addition, please see the below more detail examples.
> >
> > The conventional way (without dmabuf-sync) is:
> > Task A
> > 
> >  1. CPU accesses buf
> >  2. Send the buf to Task B
> >  3. Wait for the buf from Task B
> >  4. go to 1
> >
> > Task B
> > ---
> > 1. Wait for the buf from Task A
> > 2. qbuf the buf
> > 2.1 insert the buf to incoming queue
> > 3. stream on
> > 3.1 dma_map_sg if ready, and move the buf to ready queue
> > 3.2 get the buf from ready queue, and dma start.
> > 4. dqbuf
> > 4.1 dma_unmap_sg after dma operation completion
> > 4.2 move the buf to outgoing queue
> > 5. back the buf to Task A
> > 6. go to 1
> >
> > In case that two tasks share buffers, and data flow goes from Task A to
> Task
> > B, we would need IPC operation to send and receive buffers properly
> between
> > those two tasks every time CPU or DMA access to buffers is started or
> > completed.
> >
> >
> > With dmabuf-sync:
> >
> > Task A
> > 
> >  1. dma_buf_sync_lock <- synpoint (call by user side)
> >  2. CPU accesses buf
> >  3. dma_buf_sync_unlock <- syncpoint (call by user side)
> >  4. Send the buf to Task B (just one time)
> >  5. go to 1
> >
> >
> > Task B
> > ---
> > 1. Wait for the buf from Task A (just one time)
> > 2. qbuf the buf
> > 1.1 insert the buf to incoming queue
> > 3. stream on
> > 3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
> > 3.2 dma_map_sg if ready, and move the buf to ready queue
> > 3.3 get the buf from ready queue, and dma start.
> > 4. dqbuf
> > 4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
> > 4.2 dma_unmap_sg after dma operation completion
> > 4.3 move the buf to outgoing queue
> > 5. go to 1
> >
> > On the other hand, in case of using dmabuf-sync, as you can see the
> above
> > example, we would need IPC operation just one time. That way, I think we
> > could not only reduce performance overhead but also make user
> application
> >

[PATCH] drm: Improve manual IRQ installation documentation

2013-06-20 Thread Daniel Vetter

On Thu, Jun 20, 2013 at 01:00:36PM +0200, Thierry Reding wrote:
> On Thu, Jun 20, 2013 at 12:46:21PM +0200, Laurent Pinchart wrote:
> > Hi Thierry,
> > 
> > On Thursday 20 June 2013 12:40:26 Thierry Reding wrote:
> > > On Thu, Jun 20, 2013 at 12:17:25PM +0200, Laurent Pinchart wrote:
> > > > On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
> > > > > On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
> > > > > > Signed-off-by: Laurent Pinchart
> > > > > > 
> > > > > > ---
> > > > > > 
> > > > > >  Documentation/DocBook/drm.tmpl | 14 --
> > > > > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > > > > > 
> > > > > > diff --git a/Documentation/DocBook/drm.tmpl
> > > > > > b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
> > > > > > --- a/Documentation/DocBook/drm.tmpl
> > > > > > +++ b/Documentation/DocBook/drm.tmpl
> > > > > > @@ -186,11 +186,12 @@
> > > > > > 
> > > > > >
> > > > > >
> > > > > >  
> > > > > > DRIVER_HAVE_IRQDRIVER_IRQ_SHARED > > > > >  >
> > > > > >  
> > > > > > 
> > > > > > -  DRIVER_HAVE_IRQ indicates whether the driver has an 
> > > > > > IRQ
> > > > > > handler. The -  DRM core will automatically register an
> > > > > > interrupt handler when the -  flag is set.
> > > > > > DRIVER_IRQ_SHARED
> > > > > > indicates whether the device  -  handler support
> > > > > > shared
> > > > > > IRQs (note that this is required of PCI -  drivers).
> > > > > > +  DRIVER_HAVE_IRQ indicates whether the driver has an 
> > > > > > IRQ
> > > > > > handler +  managed by the DRM Core. The core will 
> > > > > > support
> > > > > > simple IRQ handler +  installation when the flag is set.
> > > > > > The
> > > > > > installation process is +  described in  > > > > > linkend="drm-irq-registration"/>. +
> > > > > > DRIVER_IRQ_SHARED indicates whether the device  handler +
> > > > > > 
> > > > > > support shared IRQs (note that this is required of PCI 
> > > > > > drivers).>
> > > > > > 
> > > > > >  
> > > > > >
> > > > > >
> > > > > >
> > > > > > 
> > > > > > @@ -344,7 +345,8 @@ char *date;
> > > > > > 
> > > > > >The DRM core tries to facilitate IRQ handler registration
> > > > > >and
> > > > > >unregistration by providing
> > > > > >drm_irq_install and
> > > > > >drm_irq_uninstall functions. Those
> > > > > >functions only
> > > > > > 
> > > > > > -  support a single interrupt per device.
> > > > > > +  support a single interrupt per device, devices that use
> > > > > > more
> > > > > > than one +  IRQs need to be handled manually.
> > > > > 
> > > > > Perhaps this should mention that if you handle IRQ installation 
> > > > > manually
> > > > > you also need to manually set drm->irq_enabled = 1, as otherwise 
> > > > > things
> > > > > like DRM_IOCTL_WAIT_VBLANK won't work properly.
> > > > 
> > > > That's only needed if DRIVER_HAVE_IRQ is set, otherwise the
> > > > drm_wait_vblank() function skips the irq_enabled check.
> > > 
> > > Not quite. There's another check for dev->irq_enabled in the
> > > DRM_WAIT_ON() so it'll never actually block.
> > 
> > Indeed.
> > 
> > > Arguably this could be solved by making the DRM_WAIT_ON() condition drop 
> > > the
> > > !dev->irq_enabled in case DRIVER_HAVE_IRQ isn't set.
> > 
> > Or we could force drivers to set drm->irq_enabled, I'm fine with that as 
> > well. 
> > I can fix the documentation if that would be the preferred solution.
> 
> Thinking about it some more, the latter seems more appropriate. Consider
> some driver that doesn't set DRIVER_HAVE_IRQ but also doesn't initialize
> interrupts manually. If so it'll cause DRM_IOCTL_WAIT_VBLANK to block
> indefinitely.
> 
> On the other hand perhaps the check at the beginning of drm_wait_vblank
> should be improved. Maybe make it something like this:
> 
>   if (drm_core_check_feature(dev, DRIVER_HAVE_IRQ))
>   if (!drm_dev_to_irq(dev))
>   return -EINVAL;
> 
>   if (!dev->irq_enabled)
>   return -EINVAL;

I think the vblank subsystem is ripe for rework, and I have a few plans
for that. But till that happens I guess we could just roll forward with
whatever hacks we currently have ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[PATCH v3] drm: Renesas R-Car Display Unit DRM driver

2013-06-20 Thread Daniel Vetter

On Wed, Jun 19, 2013 at 01:45:45AM +0200, Laurent Pinchart wrote:
> Hello Adam,
> 
> Ping ?
> 
> Daniel, would it help getting the driver in v3.11 if I resubmit it now with a 
> get_modes operation that just returns 0 ?

Yeah, I guess that works, too.
-Daniel

> 
> On Friday 14 June 2013 16:03:19 Daniel Vetter wrote:
> > On Fri, Jun 14, 2013 at 02:54:04AM +0200, Laurent Pinchart wrote:
> > > On Friday 07 June 2013 10:50:55 Daniel Vetter wrote:
> > > > On Fri, Jun 07, 2013 at 09:44:45AM +0200, Laurent Pinchart wrote:
> > > > > On Wednesday 05 June 2013 10:55:05 Daniel Vetter wrote:
> > > > > > On Wed, Jun 05, 2013 at 03:51:53AM +0200, Laurent Pinchart wrote:
> > > > > > > On Tuesday 04 June 2013 20:36:20 Daniel Vetter wrote:
> > > > > > > > On Tue, Jun 4, 2013 at 8:03 PM, Laurent Pinchart wrote:
> > > > > > > > > On Tuesday 04 June 2013 16:12:36 Daniel Vetter wrote:
> > > > > > > > >> On Tue, Jun 04, 2013 at 04:53:40AM +0200, Laurent Pinchart 
> wrote:
> > > > > > [snip]
> > > > > > 
> > > > > > > > >> > +static int rcar_du_vga_connector_get_modes(struct
> > > > > > > > >> > drm_connector
> > > > > > > > >> > *connector)
> > > > > > > > >> > +{
> > > > > > > > >> > +   return drm_add_modes_noedid(connector, 1280, 768);
> > > > > > > > >> > +}
> > > > > > > > >> 
> > > > > > > > >> This (and the dummy detect function below) looks a bit funny,
> > > > > > > > >> since it essentially overrides the default behaviour already
> > > > > > > > >> provided by the crtc helpers. Until rcar has at least proper
> > > > > > > > >> detect support for VGA
> > > > > > > > > 
> > > > > > > > > I would add that but the DDC signals are not connected on the
> > > > > > > > > boards I have access to :-/
> > > > > > > > > 
> > > > > > > > >> I'd just kill this and use the connector force support (and
> > > > > > > > >> manually adding the right resolutions).
> > > > > > > > > 
> > > > > > > > > Looks like that's a candidate for better documentation... How
> > > > > > > > > does force support work ?
> > > > > > > > 
> > > > > > > > Grep for DRM_FORCE_ON, iirc it can be set on the commandline,
> > > > > > > > where you can also force a specific mode. The best I could find
> > > > > > > > wrt docs is the kerneldoc for
> > > > > > > > drm_mode_parse_command_line_for_connector. With a bit more
> > > > > > > > reading it looks like it's intermingled with the fbdev helper
> > > > > > > > code, but should be fairly easy to extract and used by your
> > > > > > > > driver.
> > > > > > > 
> > > > > > > It makes sense to force the connector state from command line, but
> > > > > > > I'm not sure if the same mechanism is the best solution here. As
> > > > > > > the driver has no way to know the connector state, the best we can
> > > > > > > do is guess what modes are supported. I can just return 0 in the
> > > > > > > get_modes handler, but then the core will not call
> > > > > > > drm_add_modes_noedid(), and modes will need to be added manually.
> > > > > > > 
> > > > > > > Is your point that for a board on which the VGA connector state
> > > > > > > can't be detected, the user should always be responsible for
> > > > > > > adding all the modes supported by the VGA monitor on the command
> > > > > > > line ?
> > > > > > 
> > > > > > My point is that we already have both an established code for
> > > > > > connected outputs without EDID to add fallback modes and means to
> > > > > > force connectors to certain states. Your code here seems to reinvent
> > > > > > that wheel, so I wonder what we should/need to improve in the common
> > > > > > code to suit your needs.
> > > > > 
> > > > > The currently available code might suit my needs, it might just be
> > > > > that I fail to see how to use it properly.
> > > > > 
> > > > > Regarding the "code for connected outputs without EDID to add fallback
> > > > > modes" you're referring to, is that
> > > > > 
> > > > > if (count == 0 && connector->status ==
> > > > > connector_status_connected)
> > > > > count = drm_add_modes_noedid(connector, 1024, 768);
> > > > > 
> > > > > in drm_helper_probe_single_connector_modes() ? That function will only
> > > > > be called if the connector status is connector_status_connected. There
> > > > > are two ways to enforce that:
> > > > > 
> > > > > - returning connector_status_connected from the connector detect()
> > > > > operation, which seems to defeat the purpose of having
> > > > > connector_status_unknown completely.
> > > > 
> > > > We might want to add such a default mode also for unknown, I'm not sure.
> > > > Userspace policy is to first try to light up any connected outputs, and
> > > > if> there's none try to light up any unknown outputs. Not sure whether
> > > > userspace (i.e. X) will automatically add a default mode. fbcon might
> > > > also handle this less gracefully.
> > > > 
> > > > Personally I'm ok with extending this to unknown, it shouldn't really
> > > > hurt (since we already try really hard not to leak

[RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

> -Original Message-
> From: dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org
> [mailto:dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org] On
> Behalf Of Russell King - ARM Linux
> Sent: Thursday, June 20, 2013 3:29 AM
> To: Inki Dae
> Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
> Cho; linux-media at vger.kernel.org; linux-arm-kernel at lists.infradead.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
> > On the other hand, the below shows how we could enhance the conventional
> > way with my approach (just example):
> >
> > CPU -> DMA,
> > ioctl(qbuf command)  ioctl(streamon)
> >   |   |
> >   |   |
> > qbuf  <- dma_buf_sync_get   start streaming <- syncpoint
> >
> > dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
> And
> > the syncpoint is performed by calling dma_buf_sync_lock(), and then DMA
> > accesses the sync buffer.
> >
> > And DMA -> CPU,
> > ioctl(dqbuf command)
> >   |
> >   |
> > dqbuf <- nothing to do
> >
> > Actual syncpoint is when DMA operation is completed (in interrupt
> handler):
> > the syncpoint is performed by calling dma_buf_sync_unlock().
> > Hence,  my approach is to move the syncpoints into just before dma
> access
> > as long as possible.
> 
> What you've just described does *not* work on architectures such as
> ARMv7 which do speculative cache fetches from memory at any time that
> that memory is mapped with a cacheable status, and will lead to data
> corruption.

I didn't explain that enough. Sorry about that. 'nothing to do' means that a
dmabuf sync interface isn't called but existing functions are called. So
this may be explained again:
ioctl(dqbuf command)
|
|
dqbuf <- 1. dma_unmap_sg
2. dma_buf_sync_unlock (syncpoint)

The syncpoint I mentioned means lock mechanism; not doing cache operation.

In addition, please see the below more detail examples.

The conventional way (without dmabuf-sync) is:
Task A 

 1. CPU accesses buf  
 2. Send the buf to Task B  
 3. Wait for the buf from Task B
 4. go to 1

Task B
---
1. Wait for the buf from Task A
2. qbuf the buf 
2.1 insert the buf to incoming queue
3. stream on
3.1 dma_map_sg if ready, and move the buf to ready queue
3.2 get the buf from ready queue, and dma start.
4. dqbuf
4.1 dma_unmap_sg after dma operation completion
4.2 move the buf to outgoing queue
5. back the buf to Task A
6. go to 1

In case that two tasks share buffers, and data flow goes from Task A to Task
B, we would need IPC operation to send and receive buffers properly between
those two tasks every time CPU or DMA access to buffers is started or
completed.

With dmabuf-sync:

Task A 

 1. dma_buf_sync_lock <- synpoint (call by user side)
 2. CPU accesses buf  
 3. dma_buf_sync_unlock <- syncpoint (call by user side)
 4. Send the buf to Task B (just one time)
 5. go to 1

Task B
---
1. Wait for the buf from Task A (just one time)
2. qbuf the buf 
1.1 insert the buf to incoming queue
3. stream on
3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
3.2 dma_map_sg if ready, and move the buf to ready queue
3.3 get the buf from ready queue, and dma start.
4. dqbuf
4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
4.2 dma_unmap_sg after dma operation completion
4.3 move the buf to outgoing queue
5. go to 1

On the other hand, in case of using dmabuf-sync, as you can see the above
example, we would need IPC operation just one time. That way, I think we
could not only reduce performance overhead but also make user application
simplified. Of course, this approach can be used for all DMA device drivers
such as DRM. I'm not a specialist in v4l2 world so there may be missing
point.

Thanks,
Inki Dae

> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

[pull] radeon drm-fixes-3.10

2013-06-20 Thread alexdeuc...@gmail.com

From: Alex Deucher 

Hi Dave,

Just one more fix for radeon.  Patch from Jerome to avoid GPU
resets due to false positives when the ring is empty.

The following changes since commit c139b1ee4e36374af705660af6172d7477352792:

  drm/radeon: fix UVD on big endian (2013-06-14 17:05:57 -0400)

are available in the git repository at:
  git://people.freedesktop.org/~agd5f/linux drm-fixes-3.10

Jerome Glisse (1):
  drm/radeon: update lockup tracking when scheduling in empty ring

 drivers/gpu/drm/radeon/radeon_ring.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

[PATCH v5 2/7] mutex: add support for wound/wait style locks, v5

2013-06-20 Thread Ingo Molnar

* Maarten Lankhorst  wrote:

> Well they've helped me with some of the changes and contributed some 
> code and/or fixes, but if acked-by is preferred I'll use that..

Such contributions can be credited in the changelog, and/or copyright 
notices, and/or the code itself. The signoff chain on the other hand is 
strictly defined as a 'route the patch took', with a single point of 
origin, the main author. See Documentation/SubmittingPatches, pt 12.

[ A signoff chain _can_ signal multi-authored code where the code got 
  written by someone and then further fixed/developed by someone else - 
  who adds a SOB to the end - but in that case I expect to get the patch 
  from the last person in the signoff chain. ]

Thanks,

Ingo

[PATCH] drm/shmobile: Drop usage of removed drm_plane enabled field

On Thu, Jun 20, 2013 at 12:45 AM, Laurent Pinchart
 wrote:
> The enabled field has been removed from struct drm_plane. Don't use it
> in the driver.

Applied to -next,

Dave.

[PATCH 1/1] gpu:drm:tilcdc: get preferred_bpp value from DT

>>
>> Signed-off-by: Benoit Parrot 
>
> Acked-by: Rob Clark 

Applied to -next,

thanks,
Dave.

[PATCH v5 2/7] mutex: add support for wound/wait style locks, v5

Op 20-06-13 13:55, Ingo Molnar schreef:
> * Maarten Lankhorst  wrote:
>
>> Changes since RFC patch v1:
>>  - Updated to use atomic_long instead of atomic, since the reservation_id 
>> was a long.
>>  - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
>>  - removed mutex_locked_set_reservation_id (or w/e it was called)
>> Changes since RFC patch v2:
>>  - remove use of __mutex_lock_retval_arg, add warnings when using wrong 
>> combination of
>>mutex_(,reserve_)lock/unlock.
>> Changes since v1:
>>  - Add __always_inline to __mutex_lock_common, otherwise reservation paths 
>> can be
>>triggered from normal locks, because __builtin_constant_p might evaluate 
>> to false
>>for the constant 0 in that case. Tests for this have been added in the 
>> next patch.
>>  - Updated documentation slightly.
>> Changes since v2:
>>  - Renamed everything to ww_mutex. (mlankhorst)
>>  - Added ww_acquire_ctx and ww_class. (mlankhorst)
>>  - Added a lot of checks for wrong api usage. (mlankhorst)
>>  - Documentation updates. (danvet)
>> Changes since v3:
>>  - Small documentation fixes (robclark)
>>  - Memory barrier fix (danvet)
>> Changes since v4:
>>  - Remove ww_mutex_unlock_single and ww_mutex_lock_single.
>>  - Rename ww_mutex_trylock_single to ww_mutex_trylock.
>>  - Remove separate implementations of ww_mutex_lock_slow*, normal
>>functions can be used. Inline versions still exist for extra
>>debugging.
>>  - Cleanup unneeded memory barriers, add comment to the remaining
>>smp_mb().
> That's not a proper changelog. It should be a short description of what it 
> does, possibly referring to the new Documentation/ww-mutex-design.txt file 
> for more details.
Well they've helped me with some of the changes and contributed some code 
and/or fixes, but if acked-by is preferred I'll use that..
>> Signed-off-by: Maarten Lankhorst 
>> Signed-off-by: Daniel Vetter 
>> Signed-off-by: Rob Clark 
> That's not a valid signoff chain: the last signoff in the chain is the 
> person sending me the patch. The first signoff is the person who wrote the 
> patch. The other two gents should be Acked-by I suspect?
>
I guess so.

[Bug 65963] screen goes blank, Linux hangs - Radeon 7870, Gallium, Glamor

https://bugs.freedesktop.org/show_bug.cgi?id=65963

--- Comment #2 from Damian Nowak  ---
Thanks for taking a look at it. Package versions as follows. Pasting a
PKGBUILDs so you can see the flags for compilation.

extra/mesa 9.1.3-1

https://projects.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/mesa
extra/llvm-amdgpu-lib-snapshot 20130403-3

https://projects.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/llvm-amdgpu-snapshot

Let me know if you want me to install these from git as well.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/fdadba66/attachment.html>

[PATCH v5 2/7] mutex: add support for wound/wait style locks, v5

2013-06-20 Thread Ingo Molnar


* Maarten Lankhorst  wrote:

> +The algorithm that TTM came up with for dealing with this problem is quite
> +simple. [...]

'TTM' here reads like a person - but in reality it's the TTM graphics 
subsystem, right?

Please clarify this portion of the text.

Thanks,

Ingo

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

Alex Deucher  changed:

   What|Removed |Added

  Attachment #81105|text/plain  |image/jpeg
  mime type||

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/2ec56af5/attachment.html>

[PATCH v5 2/7] mutex: add support for wound/wait style locks, v5

2013-06-20 Thread Ingo Molnar


* Maarten Lankhorst  wrote:

> Changes since RFC patch v1:
>  - Updated to use atomic_long instead of atomic, since the reservation_id was 
> a long.
>  - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
>  - removed mutex_locked_set_reservation_id (or w/e it was called)
> Changes since RFC patch v2:
>  - remove use of __mutex_lock_retval_arg, add warnings when using wrong 
> combination of
>mutex_(,reserve_)lock/unlock.
> Changes since v1:
>  - Add __always_inline to __mutex_lock_common, otherwise reservation paths 
> can be
>triggered from normal locks, because __builtin_constant_p might evaluate 
> to false
>for the constant 0 in that case. Tests for this have been added in the 
> next patch.
>  - Updated documentation slightly.
> Changes since v2:
>  - Renamed everything to ww_mutex. (mlankhorst)
>  - Added ww_acquire_ctx and ww_class. (mlankhorst)
>  - Added a lot of checks for wrong api usage. (mlankhorst)
>  - Documentation updates. (danvet)
> Changes since v3:
>  - Small documentation fixes (robclark)
>  - Memory barrier fix (danvet)
> Changes since v4:
>  - Remove ww_mutex_unlock_single and ww_mutex_lock_single.
>  - Rename ww_mutex_trylock_single to ww_mutex_trylock.
>  - Remove separate implementations of ww_mutex_lock_slow*, normal
>functions can be used. Inline versions still exist for extra
>debugging.
>  - Cleanup unneeded memory barriers, add comment to the remaining
>smp_mb().

That's not a proper changelog. It should be a short description of what it 
does, possibly referring to the new Documentation/ww-mutex-design.txt file 
for more details.

> Signed-off-by: Maarten Lankhorst 
> Signed-off-by: Daniel Vetter 
> Signed-off-by: Rob Clark 

That's not a valid signoff chain: the last signoff in the chain is the 
person sending me the patch. The first signoff is the person who wrote the 
patch. The other two gents should be Acked-by I suspect?

Thanks,

Ingo

[Bug 65963] screen goes blank, Linux hangs - Radeon 7870, Gallium, Glamor

https://bugs.freedesktop.org/show_bug.cgi?id=65963

Alex Deucher  changed:

   What|Removed |Added

   Assignee|xorg-driver-ati at lists.x.org |dri-devel at 
lists.freedesktop
   ||.org
 QA Contact|xorg-team at lists.x.org   |
Product|xorg|Mesa
Version|git |unspecified
  Component|Driver/Radeon   |Drivers/Gallium/radeonsi

--- Comment #1 from Alex Deucher  ---
Is this a regression?  What version of mesa and llvm are you using?  Also, in
the future please attach your logs directly to the bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/6fee18aa/attachment.html>

[Bug 65958] GPU Lockup on Trinity 7500G

https://bugs.freedesktop.org/show_bug.cgi?id=65958

--- Comment #2 from Alex Deucher  ---
Does setting the env var RADEON_VA=0 in your /etc/environment fix the issue? 
This looks like a mesa issue.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/1823eb33/attachment.html>

[PATCH v5 7/7] locking-selftests: handle unexpected failures more strictly

When CONFIG_PROVE_LOCKING is not enabled, more tests are expected to
pass unexpectedly, but there no tests that should start to fail that
pass with CONFIG_PROVE_LOCKING enabled.

Signed-off-by: Maarten Lankhorst 
---
 lib/locking-selftest.c |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index d554f3f..aad024d 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -976,16 +976,18 @@ static void dotest(void (*testcase_fn)(void), int 
expected, int lockclass_mask)
/*
 * Filter out expected failures:
 */
-   if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
+   if (expected == FAILURE && debug_locks) {
expected_testcase_failures++;
printk("failed|");
-#else
+   }
+   else
+#endif
+   if (debug_locks != expected) {
unexpected_testcase_failures++;
printk("FAILED|");

dump_stack();
-#endif
} else {
testcase_successes++;
printk("  ok  |");

[PATCH v5 6/7] mutex: add more ww tests to test EDEADLK path handling

Signed-off-by: Maarten Lankhorst 
---
 lib/locking-selftest.c |  264 +++-
 1 file changed, 261 insertions(+), 3 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 37faefd..d554f3f 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -47,7 +47,7 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_WW0x10

 static struct ww_acquire_ctx t, t2;
-static struct ww_mutex o, o2;
+static struct ww_mutex o, o2, o3;

 /*
  * Normal standalone locks, for the circular and irq-context
@@ -947,12 +947,12 @@ static void reset_locks(void)

I1(A); I1(B); I1(C); I1(D);
I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
-   I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
+   I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base); I_WW(o3.base);
lockdep_reset();
I2(A); I2(B); I2(C); I2(D);
init_shared_classes();

-   ww_mutex_init(, _lockdep); ww_mutex_init(, _lockdep);
+   ww_mutex_init(, _lockdep); ww_mutex_init(, _lockdep); 
ww_mutex_init(, _lockdep);
memset(, 0, sizeof(t)); memset(, 0, sizeof(t2));
memset(_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
memset(_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
@@ -1292,6 +1292,251 @@ static void ww_test_object_lock_stale_context(void)
WWL(, );
 }

+static void ww_test_edeadlk_normal(void)
+{
+   int ret;
+
+   mutex_lock();
+   o2.ctx = 
+   mutex_release(_map, 1, _THIS_IP_);
+
+   WWAI();
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(, );
+   WARN_ON(ret);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+
+   o2.ctx = NULL;
+   mutex_acquire(_map, 0, 1, _THIS_IP_);
+   mutex_unlock();
+   WWU();
+
+   WWL(, );
+}
+
+static void ww_test_edeadlk_normal_slow(void)
+{
+   int ret;
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o2.ctx = 
+
+   WWAI();
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(, );
+   WARN_ON(ret);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+
+   o2.ctx = NULL;
+   mutex_acquire(_map, 0, 1, _THIS_IP_);
+   mutex_unlock();
+   WWU();
+
+   ww_mutex_lock_slow(, );
+}
+
+static void ww_test_edeadlk_no_unlock(void)
+{
+   int ret;
+
+   mutex_lock();
+   o2.ctx = 
+   mutex_release(_map, 1, _THIS_IP_);
+
+   WWAI();
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(, );
+   WARN_ON(ret);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+
+   o2.ctx = NULL;
+   mutex_acquire(_map, 0, 1, _THIS_IP_);
+   mutex_unlock();
+
+   WWL(, );
+}
+
+static void ww_test_edeadlk_no_unlock_slow(void)
+{
+   int ret;
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o2.ctx = 
+
+   WWAI();
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(, );
+   WARN_ON(ret);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+
+   o2.ctx = NULL;
+   mutex_acquire(_map, 0, 1, _THIS_IP_);
+   mutex_unlock();
+
+   ww_mutex_lock_slow(, );
+}
+
+static void ww_test_edeadlk_acquire_more(void)
+{
+   int ret;
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o2.ctx = 
+
+   WWAI();
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(, );
+   WARN_ON(ret);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+
+   ret = WWL(, );
+}
+
+static void ww_test_edeadlk_acquire_more_slow(void)
+{
+   int ret;
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o2.ctx = 
+
+   WWAI();
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(, );
+   WARN_ON(ret);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+
+   ww_mutex_lock_slow(, );
+}
+
+static void ww_test_edeadlk_acquire_more_edeadlk(void)
+{
+   int ret;
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o2.ctx = 
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o3.ctx = 
+
+   WWAI();
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(, );
+   WARN_ON(ret);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+}
+
+static void ww_test_edeadlk_acquire_more_edeadlk_slow(void)
+{
+   int ret;
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o2.ctx = 
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o3.ctx = 
+
+   WWAI();
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(, );
+   WARN_ON(ret);
+
+   ret = WWL(, );
+   WARN_ON(ret != -EDEADLK);
+
+   ww_mutex_lock_slow(, );
+}
+
+static void ww_test_edeadlk_acquire_wrong(void)
+{
+   int ret;
+
+   mutex_lock();
+   mutex_release(_map, 1, _THIS_IP_);
+   o2.ctx = 
+
+   WWAI();
+   t2 = t;

[PATCH v5 5/7] mutex: add more tests to lib/locking-selftest.c

None of the ww_mutex codepaths should be taken in the 'normal'
mutex calls. The easiest way to verify this is by using the
normal mutex calls, and making sure o.ctx is unmodified.

Signed-off-by: Maarten Lankhorst 
---
 lib/locking-selftest.c |   62 
 1 file changed, 62 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 9962262..37faefd 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -1162,6 +1162,67 @@ static void ww_test_fail_acquire(void)
 #endif
 }

+static void ww_test_normal(void)
+{
+   int ret;
+
+   WWAI();
+
+   /*
+* None of the ww_mutex codepaths should be taken in the 'normal'
+* mutex calls. The easiest way to verify this is by using the
+* normal mutex calls, and making sure o.ctx is unmodified.
+*/
+
+   /* mutex_lock (and indirectly, mutex_lock_nested) */
+   o.ctx = (void *)~0UL;
+   mutex_lock();
+   mutex_unlock();
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* mutex_lock_interruptible (and *_nested) */
+   o.ctx = (void *)~0UL;
+   ret = mutex_lock_interruptible();
+   if (!ret)
+   mutex_unlock();
+   else
+   WARN_ON(1);
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* mutex_lock_killable (and *_nested) */
+   o.ctx = (void *)~0UL;
+   ret = mutex_lock_killable();
+   if (!ret)
+   mutex_unlock();
+   else
+   WARN_ON(1);
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* trylock, succeeding */
+   o.ctx = (void *)~0UL;
+   ret = mutex_trylock();
+   WARN_ON(!ret);
+   if (ret)
+   mutex_unlock();
+   else
+   WARN_ON(1);
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* trylock, failing */
+   o.ctx = (void *)~0UL;
+   mutex_lock();
+   ret = mutex_trylock();
+   WARN_ON(ret);
+   mutex_unlock();
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* nest_lock */
+   o.ctx = (void *)~0UL;
+   mutex_lock_nest_lock(, );
+   mutex_unlock();
+   WARN_ON(o.ctx != (void *)~0UL);
+}
+
 static void ww_test_two_contexts(void)
 {
WWAI();
@@ -1415,6 +1476,7 @@ static void ww_tests(void)

print_testname("ww api failures");
dotest(ww_test_fail_acquire, SUCCESS, LOCKTYPE_WW);
+   dotest(ww_test_normal, SUCCESS, LOCKTYPE_WW);
dotest(ww_test_unneeded_slow, FAILURE, LOCKTYPE_WW);
printk("\n");

[PATCH v5 4/7] mutex: Add ww tests to lib/locking-selftest.c. v5

This stresses the lockdep code in some ways specifically useful to
ww_mutexes. It adds checks for most of the common locking errors.

Changes since v1:
 - Add tests to verify reservation_id is untouched.
 - Use L() and U() macros where possible.

Changes since v2:
 - Use the ww_mutex api directly.
 - Use macros for most of the code.
Changes since v3:
 - Rework tests for the api changes.
Changes since v4:
 - Added a test to cover ww_acquire_done being called multiple times.
 - Added a test for calling ww_acquire_fini being called before
   all locks have been unlocked.
 - Added a test for locking after ww_acquire_done has been called.
 - Added a test for unbalance for ctx->acquired dropping below zero.
 - Added a test for unlocked ww_mutex with ctx != NULL.

Signed-off-by: Maarten Lankhorst 
---
 lib/locking-selftest.c |  400 ++--
 1 file changed, 381 insertions(+), 19 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index c3eb261..9962262 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -26,6 +26,8 @@
  */
 static unsigned int debug_locks_verbose;

+static DEFINE_WW_CLASS(ww_lockdep);
+
 static int __init setup_debug_locks_verbose(char *str)
 {
get_option(, _locks_verbose);
@@ -42,6 +44,10 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_RWLOCK0x2
 #define LOCKTYPE_MUTEX 0x4
 #define LOCKTYPE_RWSEM 0x8
+#define LOCKTYPE_WW0x10
+
+static struct ww_acquire_ctx t, t2;
+static struct ww_mutex o, o2;

 /*
  * Normal standalone locks, for the circular and irq-context
@@ -193,6 +199,20 @@ static void init_shared_classes(void)
 #define RSU(x) up_read(_##x)
 #define RWSI(x)init_rwsem(_##x)

+#ifndef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+#define WWAI(x)ww_acquire_init(x, _lockdep)
+#else
+#define WWAI(x)do { ww_acquire_init(x, _lockdep); 
(x)->deadlock_inject_countdown = ~0U; } while (0)
+#endif
+#define WWAD(x)ww_acquire_done(x)
+#define WWAF(x)ww_acquire_fini(x)
+
+#define WWL(x, c)  ww_mutex_lock(x, c)
+#define WWT(x) ww_mutex_trylock(x)
+#define WWL1(x)ww_mutex_lock(x, NULL)
+#define WWU(x) ww_mutex_unlock(x)
+
+
 #define LOCK_UNLOCK_2(x,y) LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)

 /*
@@ -894,11 +914,13 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 # define I_RWLOCK(x)   lockdep_reset_lock(_##x.dep_map)
 # define I_MUTEX(x)lockdep_reset_lock(_##x.dep_map)
 # define I_RWSEM(x)lockdep_reset_lock(_##x.dep_map)
+# define I_WW(x)   lockdep_reset_lock(_map)
 #else
 # define I_SPINLOCK(x)
 # define I_RWLOCK(x)
 # define I_MUTEX(x)
 # define I_RWSEM(x)
+# define I_WW(x)
 #endif

 #define I1(x)  \
@@ -920,11 +942,20 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 static void reset_locks(void)
 {
local_irq_disable();
+   lockdep_free_key_range(_lockdep.acquire_key, 1);
+   lockdep_free_key_range(_lockdep.mutex_key, 1);
+
I1(A); I1(B); I1(C); I1(D);
I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
+   I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
lockdep_reset();
I2(A); I2(B); I2(C); I2(D);
init_shared_classes();
+
+   ww_mutex_init(, _lockdep); ww_mutex_init(, _lockdep);
+   memset(, 0, sizeof(t)); memset(, 0, sizeof(t2));
+   memset(_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
+   memset(_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
local_irq_enable();
 }

@@ -938,7 +969,6 @@ static int unexpected_testcase_failures;
 static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 {
unsigned long saved_preempt_count = preempt_count();
-   int expected_failure = 0;

WARN_ON(irqs_disabled());

@@ -946,26 +976,16 @@ static void dotest(void (*testcase_fn)(void), int 
expected, int lockclass_mask)
/*
 * Filter out expected failures:
 */
+   if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
-   if ((lockclass_mask & LOCKTYPE_SPIN) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_RWLOCK) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_MUTEX) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_RWSEM) && debug_locks != expected)
-   expected_failure = 1;
+   expected_testcase_failures++;
+   printk("failed|");
+#else
+   unexpected_testcase_failures++;
+   printk("FAILED|");
+
+   dump_stack();
 #endif
-   if (debug_locks != expected) {
-   if (expected_failure) {
-

[PATCH v5 3/7] mutex: w/w mutex slowpath debugging

From: Daniel Vetter 

Injects EDEADLK conditions at pseudo-random interval, with exponential
backoff up to UINT_MAX (to ensure that every lock operation still
completes in a reasonable time).

This way we can test the wound slowpath even for ww mutex users where
contention is never expected, and the ww deadlock avoidance algorithm
is only needed for correctness against malicious userspace. An example
would be protecting kernel modesetting properties, which thanks to
single-threaded X isn't really expected to contend, ever.

I've looked into using the CONFIG_FAULT_INJECTION infrastructure, but
decided against it for two reasons:

- EDEADLK handling is mandatory for ww mutex users and should never
  affect the outcome of a syscall. This is in contrast to -ENOMEM
  injection. So fine configurability isn't required.

- The fault injection framework only allows to set a simple
  probability for failure. Now the probability that a ww mutex acquire
  stage with N locks will never complete (due to too many injected
  EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock
  operations for the completely uncontended case would be O(exp(N)).
  The per-acuiqire ctx exponential backoff solution choosen here only
  results in O(log N) overhead due to injection and so O(log N * N)
  lock operations. This way we can fail with high probability (and so
  have good test coverage even for fancy backoff and lock acquisition
  paths) without running into patalogical cases.

Note that EDEADLK will only ever be injected when we managed to
acquire the lock. This prevents any behaviour changes for users which
rely on the EALREADY semantics.

v2: Drop the cargo-culted __sched (I should read docs next time
around) and annotate the non-debug case with inline to prevent gcc
from doing something horrible.

v3: Rebase on top of Maarten's latest patches.

v4: Actually make this stuff compile, I've misplace the hunk in the
wrong #ifdef block.

v5: Simplify ww_mutex_deadlock_injection definition, and fix
lib/locking-selftest.c warnings. Fix lib/Kconfig.debug definition
to work correctly. (mlankhorst)

v6:
Do not inject -EDEADLK when ctx->acquired == 0, because
the _slow paths are merged now. (mlankhorst)

Cc: Steven Rostedt 
Signed-off-by: Daniel Vetter 
Signed-off-by: Maarten Lankhorst 
---
 include/linux/mutex.h |8 
 kernel/mutex.c|   44 +---
 lib/Kconfig.debug |   13 +
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index f3ad181..2ff9178 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -95,6 +95,10 @@ struct ww_acquire_ctx {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   unsigned deadlock_inject_interval;
+   unsigned deadlock_inject_countdown;
+#endif
 };

 struct ww_mutex {
@@ -280,6 +284,10 @@ static inline void ww_acquire_init(struct ww_acquire_ctx 
*ctx,
 _class->acquire_key, 0);
mutex_acquire(>dep_map, 0, 0, _RET_IP_);
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   ctx->deadlock_inject_interval = 1;
+   ctx->deadlock_inject_countdown = ctx->stamp & 0xf;
+#endif
 }

 /**
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 75fc7c4..e40004b 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -508,22 +508,60 @@ mutex_lock_interruptible_nested(struct mutex *lock, 
unsigned int subclass)

 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);

+static inline int
+ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   unsigned tmp;
+
+   if (ctx->deadlock_inject_countdown-- == 0) {
+   tmp = ctx->deadlock_inject_interval;
+   if (tmp > UINT_MAX/4)
+   tmp = UINT_MAX;
+   else
+   tmp = tmp*2 + tmp + tmp/2;
+
+   ctx->deadlock_inject_interval = tmp;
+   ctx->deadlock_inject_countdown = tmp;
+   ctx->contending_lock = lock;
+
+   ww_mutex_unlock(lock);
+
+   return -EDEADLK;
+   }
+#endif
+
+   return 0;
+}

 int __sched
 __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+   int ret;
+
might_sleep();
-   return __mutex_lock_common(>base, TASK_UNINTERRUPTIBLE,
+   ret =  __mutex_lock_common(>base, TASK_UNINTERRUPTIBLE,
   0, >dep_map, _RET_IP_, ctx);
+   if (!ret && ctx->acquired > 0)
+   return ww_mutex_deadlock_injection(lock, ctx);
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(__ww_mutex_lock);

 int __sched
 __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx 
*ctx)
 {
+   int ret;
+
might_sleep();
-   return __mutex_lock_common(>base, TASK_INTERRUPTIBLE,
-

[PATCH v5 2/7] mutex: add support for wound/wait style locks, v5

Changes since RFC patch v1:
 - Updated to use atomic_long instead of atomic, since the reservation_id was a 
long.
 - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
 - removed mutex_locked_set_reservation_id (or w/e it was called)
Changes since RFC patch v2:
 - remove use of __mutex_lock_retval_arg, add warnings when using wrong 
combination of
   mutex_(,reserve_)lock/unlock.
Changes since v1:
 - Add __always_inline to __mutex_lock_common, otherwise reservation paths can 
be
   triggered from normal locks, because __builtin_constant_p might evaluate to 
false
   for the constant 0 in that case. Tests for this have been added in the next 
patch.
 - Updated documentation slightly.
Changes since v2:
 - Renamed everything to ww_mutex. (mlankhorst)
 - Added ww_acquire_ctx and ww_class. (mlankhorst)
 - Added a lot of checks for wrong api usage. (mlankhorst)
 - Documentation updates. (danvet)
Changes since v3:
 - Small documentation fixes (robclark)
 - Memory barrier fix (danvet)
Changes since v4:
 - Remove ww_mutex_unlock_single and ww_mutex_lock_single.
 - Rename ww_mutex_trylock_single to ww_mutex_trylock.
 - Remove separate implementations of ww_mutex_lock_slow*, normal
   functions can be used. Inline versions still exist for extra
   debugging.
 - Cleanup unneeded memory barriers, add comment to the remaining
   smp_mb().

Signed-off-by: Maarten Lankhorst 
Signed-off-by: Daniel Vetter 
Signed-off-by: Rob Clark 
---
 Documentation/ww-mutex-design.txt |  343 
 include/linux/mutex-debug.h   |1 
 include/linux/mutex.h |  355 +
 kernel/mutex.c|  318 +++--
 lib/debug_locks.c |2 
 5 files changed, 1002 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

diff --git a/Documentation/ww-mutex-design.txt 
b/Documentation/ww-mutex-design.txt
new file mode 100644
index 000..379739c
--- /dev/null
+++ b/Documentation/ww-mutex-design.txt
@@ -0,0 +1,343 @@
+Wait/Wound Deadlock-Proof Mutex Design
+==
+
+Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+
+Motivation for WW-Mutexes
+-
+
+GPU's do operations that commonly involve many buffers.  Those buffers
+can be shared across contexts/processes, exist in different memory
+domains (for example VRAM vs system memory), and so on.  And with
+PRIME / dmabuf, they can even be shared across devices.  So there are
+a handful of situations where the driver needs to wait for buffers to
+become ready.  If you think about this in terms of waiting on a buffer
+mutex for it to become available, this presents a problem because
+there is no way to guarantee that buffers appear in a execbuf/batch in
+the same order in all contexts.  That is directly under control of
+userspace, and a result of the sequence of GL calls that an application
+makes. Which results in the potential for deadlock.  The problem gets
+more complex when you consider that the kernel may need to migrate the
+buffer(s) into VRAM before the GPU operates on the buffer(s), which
+may in turn require evicting some other buffers (and you don't want to
+evict other buffers which are already queued up to the GPU), but for a
+simplified understanding of the problem you can ignore this.
+
+The algorithm that TTM came up with for dealing with this problem is quite
+simple.  For each group of buffers (execbuf) that need to be locked, the caller
+would be assigned a unique reservation id/ticket, from a global counter.  In
+case of deadlock while locking all the buffers associated with a execbuf, the
+one with the lowest reservation ticket (i.e. the oldest task) wins, and the one
+with the higher reservation id (i.e. the younger task) unlocks all of the
+buffers that it has already locked, and then tries again.
+
+In the RDBMS literature this deadlock handling approach is called wait/wound:
+The older tasks waits until it can acquire the contended lock. The younger 
tasks
+needs to back off and drop all the locks it is currently holding, i.e. the
+younger task is wounded.
+
+Concepts
+
+
+Compared to normal mutexes two additional concepts/objects show up in the lock
+interface for w/w mutexes:
+
+Acquire context: To ensure eventual forward progress it is important the a task
+trying to acquire locks doesn't grab a new reservation id, but keeps the one it
+acquired when starting the lock acquisition. This ticket is stored in the
+acquire context. Furthermore the acquire context keeps track of debugging state
+to catch w/w mutex interface abuse.
+
+W/w class: In contrast to normal mutexes the lock class needs to be explicit 
for
+w/w mutexes, since it is required to initialize the acquire context.
+
+Furthermore there are three different class of w/w lock acquire functions:
+
+* Normal lock acquisition with a context, using

[PATCH v5 1/7] arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.

This will allow me to call functions that have multiple arguments if fastpath 
fails.
This is required to support ticket mutexes, because they need to be able to 
pass an
extra argument to the fail function.

Originally I duplicated the functions, by adding 
__mutex_fastpath_lock_retval_arg.
This ended up being just a duplication of the existing function, so a way to 
test
if fastpath was called ended up being better.

This also cleaned up the reservation mutex patch some by being able to call an
atomic_set instead of atomic_xchg, and making it easier to detect if the wrong
unlock function was previously used.

Changes since v1, pointed out by Francesco Lavra:
- fix a small comment issue in mutex_32.h
- fix the __mutex_fastpath_lock_retval macro for mutex-null.h

Signed-off-by: Maarten Lankhorst 
---
 arch/ia64/include/asm/mutex.h|   10 --
 arch/powerpc/include/asm/mutex.h |   10 --
 arch/sh/include/asm/mutex-llsc.h |4 ++--
 arch/x86/include/asm/mutex_32.h  |   11 ---
 arch/x86/include/asm/mutex_64.h  |   11 ---
 include/asm-generic/mutex-dec.h  |   10 --
 include/asm-generic/mutex-null.h |2 +-
 include/asm-generic/mutex-xchg.h |   10 --
 kernel/mutex.c   |   32 ++--
 9 files changed, 41 insertions(+), 59 deletions(-)

diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a6..f41e66d 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -29,17 +29,15 @@ __mutex_fastpath_lock(atomic_t *count, void 
(*fail_fn)(atomic_t *))
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call  if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
-   return fail_fn(count);
+   return -1;
return 0;
 }

diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e..127ab23 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -82,17 +82,15 @@ __mutex_fastpath_lock(atomic_t *count, void 
(*fail_fn)(atomic_t *))
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call  if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
if (unlikely(__mutex_dec_return_lock(count) < 0))
-   return fail_fn(count);
+   return -1;
return 0;
 }

diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a..dad29b6 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -37,7 +37,7 @@ __mutex_fastpath_lock(atomic_t *count, void 
(*fail_fn)(atomic_t *))
 }

 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
int __done, __res;

@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *count, int 
(*fail_fn)(atomic_t *))
: "t");

if (unlikely(!__done || __res != 0))
-   __res = fail_fn(count);
+   __res = -1;

return __res;
 }
diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8..0208c3c 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -42,17 +42,14 @@ do {
\
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call  if it
- * wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function

[PATCH v5 0/7] add mutex wait/wound/style style locks

Changes since v4:
- Some documentation cleanups.
- Added a lot more tests to cover all the DEBUG_LOCKS_WARN_ON cases.
- Added EDEADLK tests.
- Split off the normal mutex tests to a separate patch.
- Added a patch to not allow tests to fail that succeed with PROVE_LOCKING 
enabled.

---

Daniel Vetter (1):
  mutex: w/w mutex slowpath debugging

Maarten Lankhorst (6):
  arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded 
or not.
  mutex: add support for wound/wait style locks, v5
  mutex: Add ww tests to lib/locking-selftest.c. v5
  mutex: add more tests to lib/locking-selftest.c
  mutex: add more ww tests to test EDEADLK path handling
  locking-selftests: handle unexpected failures more strictly


 Documentation/ww-mutex-design.txt |  343 ++
 arch/ia64/include/asm/mutex.h |   10 -
 arch/powerpc/include/asm/mutex.h  |   10 -
 arch/sh/include/asm/mutex-llsc.h  |4 
 arch/x86/include/asm/mutex_32.h   |   11 -
 arch/x86/include/asm/mutex_64.h   |   11 -
 include/asm-generic/mutex-dec.h   |   10 -
 include/asm-generic/mutex-null.h  |2 
 include/asm-generic/mutex-xchg.h  |   10 -
 include/linux/mutex-debug.h   |1 
 include/linux/mutex.h |  363 +++
 kernel/mutex.c|  384 ++--
 lib/Kconfig.debug |   13 +
 lib/debug_locks.c |2 
 lib/locking-selftest.c|  720 -
 15 files changed, 1802 insertions(+), 92 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

-- 
Signature

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

--- Comment #4 from Andreas Ringlstetter  ---
PS:
I will not be able to test with 9.0 or 9.1 since one of the shaders causes a
segfault while compiling in these version. This has only recently (last 1-2
months) been fixed in git.

This was caused by a faulty implementation of peephole_mul_omod() in
compiler/radeon_optimize.c, the SIGSEGV was thrown in
rc_variable_list_get_writers_one_reader due to writer_list beeing NULL.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/ef7d0ab4/attachment-0001.html>

[PATCH] drm: Improve manual IRQ installation documentation

On Thu, Jun 20, 2013 at 12:46:21PM +0200, Laurent Pinchart wrote:
> Hi Thierry,
> 
> On Thursday 20 June 2013 12:40:26 Thierry Reding wrote:
> > On Thu, Jun 20, 2013 at 12:17:25PM +0200, Laurent Pinchart wrote:
> > > On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
> > > > On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
> > > > > Signed-off-by: Laurent Pinchart
> > > > > <laurent.pinchart+renesas at ideasonboard.com>
> > > > > ---
> > > > > 
> > > > >  Documentation/DocBook/drm.tmpl | 14 --
> > > > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > > > > 
> > > > > diff --git a/Documentation/DocBook/drm.tmpl
> > > > > b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
> > > > > --- a/Documentation/DocBook/drm.tmpl
> > > > > +++ b/Documentation/DocBook/drm.tmpl
> > > > > @@ -186,11 +186,12 @@
> > > > > 
> > > > >
> > > > >
> > > > >  DRIVER_HAVE_IRQDRIVER_IRQ_SHARED > > > >  >
> > > > >  
> > > > > 
> > > > > -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
> > > > > handler. The -  DRM core will automatically register an
> > > > > interrupt handler when the -  flag is set.
> > > > > DRIVER_IRQ_SHARED
> > > > > indicates whether the device  -  handler support
> > > > > shared
> > > > > IRQs (note that this is required of PCI -  drivers).
> > > > > +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
> > > > > handler +  managed by the DRM Core. The core will support
> > > > > simple IRQ handler +  installation when the flag is set.
> > > > > The
> > > > > installation process is +  described in  > > > > linkend="drm-irq-registration"/>. +
> > > > > DRIVER_IRQ_SHARED indicates whether the device  handler +
> > > > > 
> > > > > support shared IRQs (note that this is required of PCI 
> > > > > drivers).>
> > > > > 
> > > > >  
> > > > >
> > > > >
> > > > >
> > > > > 
> > > > > @@ -344,7 +345,8 @@ char *date;
> > > > > 
> > > > >The DRM core tries to facilitate IRQ handler registration
> > > > >and
> > > > >unregistration by providing
> > > > >drm_irq_install and
> > > > >drm_irq_uninstall functions. Those
> > > > >functions only
> > > > > 
> > > > > -  support a single interrupt per device.
> > > > > +  support a single interrupt per device, devices that use
> > > > > more
> > > > > than one +  IRQs need to be handled manually.
> > > > 
> > > > Perhaps this should mention that if you handle IRQ installation manually
> > > > you also need to manually set drm->irq_enabled = 1, as otherwise things
> > > > like DRM_IOCTL_WAIT_VBLANK won't work properly.
> > > 
> > > That's only needed if DRIVER_HAVE_IRQ is set, otherwise the
> > > drm_wait_vblank() function skips the irq_enabled check.
> > 
> > Not quite. There's another check for dev->irq_enabled in the
> > DRM_WAIT_ON() so it'll never actually block.
> 
> Indeed.
> 
> > Arguably this could be solved by making the DRM_WAIT_ON() condition drop the
> > !dev->irq_enabled in case DRIVER_HAVE_IRQ isn't set.
> 
> Or we could force drivers to set drm->irq_enabled, I'm fine with that as 
> well. 
> I can fix the documentation if that would be the preferred solution.

Thinking about it some more, the latter seems more appropriate. Consider
some driver that doesn't set DRIVER_HAVE_IRQ but also doesn't initialize
interrupts manually. If so it'll cause DRM_IOCTL_WAIT_VBLANK to block
indefinitely.

On the other hand perhaps the check at the beginning of drm_wait_vblank
should be improved. Maybe make it something like this:

if (drm_core_check_feature(dev, DRIVER_HAVE_IRQ))
if (!drm_dev_to_irq(dev))
return -EINVAL;

if (!dev->irq_enabled)
return -EINVAL;

Thierry
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/2d9f4896/attachment.pgp>

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

Andreas Ringlstetter  changed:

   What|Removed |Added

  Attachment #81105|0   |1
is obsolete||

--- Comment #3 from Andreas Ringlstetter  ---
Created attachment 81108
  --> https://bugs.freedesktop.org/attachment.cgi?id=81108=edit
Example of corruption in PA. The skybox texture has been completely
overwritten, partly with textures from other programms, corruption in other
textures is already starting.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/21da2ace/attachment.html>

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

--- Comment #2 from Andreas Ringlstetter  ---
Created attachment 81107
  --> https://bugs.freedesktop.org/attachment.cgi?id=81107=edit
glxinfo log

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/82f8b661/attachment.html>

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

--- Comment #1 from Andreas Ringlstetter  ---
Created attachment 81106
  --> https://bugs.freedesktop.org/attachment.cgi?id=81106=edit
Xorg log

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/1b611fdd/attachment.html>

[Bug 65968] New: Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

Priority: medium
Bug ID: 65968
Assignee: dri-devel at lists.freedesktop.org
Summary: Massive memory corruption in Planetary Annihilation
Alpha
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: andreas.ringlstetter at gmail.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Drivers/DRI/r300
Product: Mesa

Created attachment 81105
--> https://bugs.freedesktop.org/attachment.cgi?id=81105=edit
Example of corruption in PA. The skybox texture has been completely
overwritten, partly with textures from other programms, corruption in other
textures is already starting.

Using the R300 driver (git version from 2013-06-19) on a Mobility Radeon X1400
(128MB dedicated ???), I get massive memory corruption which can be seen in the
attached screenshot when running the Planetary Annihilation Alpha.

The game makes use of virtual texturing, thats means a mega texture which won't
possibly fit in the RAM in one piece.

However, it appears like textures which are NOT part of the mega texture have
been mapped into the same address space. I could see other textures, and even
bitmaps from other applications.

In the screenshot, there are large grey stripes for example, however there is
no such texture in the game. The color does match the color of the window
border though. Performing further tests, I even managed to get parts of album
covers from Banshee into PA.

This issue is not only limited to Planetary Annihilation though and the
corruption also works other way around, where applications overwrite the
bitmaps of other applications.

The effects of the corruption are clearly visible in PA due to the large
textures. They are not deterministic, but appear very reliable, most likely due
to the high memory usage.

Using other applications which frequently allocate new textures (like Banshee
with album covers) speeds up the corruption and makes it even visible in other
applications like Firefox, Cinnamon etc., although not reliable.

Attached are:
Screenshot of corruption
Xorg-log
glxinfo output

--
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL:
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/5bce7ee3/attachment.html>

[PATCH] drm: Improve manual IRQ installation documentation

Hi Thierry,

On Thursday 20 June 2013 12:40:26 Thierry Reding wrote:
> On Thu, Jun 20, 2013 at 12:17:25PM +0200, Laurent Pinchart wrote:
> > On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
> > > On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
> > > > Signed-off-by: Laurent Pinchart
> > > > <laurent.pinchart+renesas at ideasonboard.com>
> > > > ---
> > > > 
> > > >  Documentation/DocBook/drm.tmpl | 14 --
> > > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/Documentation/DocBook/drm.tmpl
> > > > b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
> > > > --- a/Documentation/DocBook/drm.tmpl
> > > > +++ b/Documentation/DocBook/drm.tmpl
> > > > @@ -186,11 +186,12 @@
> > > > 
> > > >
> > > >
> > > >  DRIVER_HAVE_IRQDRIVER_IRQ_SHARED > > >  >
> > > >  
> > > > 
> > > > -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
> > > > handler. The -  DRM core will automatically register an
> > > > interrupt handler when the -  flag is set.
> > > > DRIVER_IRQ_SHARED
> > > > indicates whether the device  -  handler support
> > > > shared
> > > > IRQs (note that this is required of PCI -  drivers).
> > > > +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
> > > > handler +  managed by the DRM Core. The core will support
> > > > simple IRQ handler +  installation when the flag is set.
> > > > The
> > > > installation process is +  described in  > > > linkend="drm-irq-registration"/>. +
> > > > DRIVER_IRQ_SHARED indicates whether the device  handler +
> > > > 
> > > > support shared IRQs (note that this is required of PCI 
> > > > drivers).>
> > > > 
> > > >  
> > > >
> > > >
> > > >
> > > > 
> > > > @@ -344,7 +345,8 @@ char *date;
> > > > 
> > > >The DRM core tries to facilitate IRQ handler registration
> > > >and
> > > >unregistration by providing
> > > >drm_irq_install and
> > > >drm_irq_uninstall functions. Those
> > > >functions only
> > > > 
> > > > -  support a single interrupt per device.
> > > > +  support a single interrupt per device, devices that use
> > > > more
> > > > than one +  IRQs need to be handled manually.
> > > 
> > > Perhaps this should mention that if you handle IRQ installation manually
> > > you also need to manually set drm->irq_enabled = 1, as otherwise things
> > > like DRM_IOCTL_WAIT_VBLANK won't work properly.
> > 
> > That's only needed if DRIVER_HAVE_IRQ is set, otherwise the
> > drm_wait_vblank() function skips the irq_enabled check.
> 
> Not quite. There's another check for dev->irq_enabled in the
> DRM_WAIT_ON() so it'll never actually block.

Indeed.

> Arguably this could be solved by making the DRM_WAIT_ON() condition drop the
> !dev->irq_enabled in case DRIVER_HAVE_IRQ isn't set.

Or we could force drivers to set drm->irq_enabled, I'm fine with that as well. 
I can fix the documentation if that would be the preferred solution.

> I'll see if I can find the time to come up with a patch.

-- 
Regards,

Laurent Pinchart
-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: This is a digitally signed message part.
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/ccec4b09/attachment.pgp>

[PATCH] drm: Improve manual IRQ installation documentation

On Thu, Jun 20, 2013 at 12:17:25PM +0200, Laurent Pinchart wrote:
> Hi Thierry,
> 
> On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
> > On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
> > > Signed-off-by: Laurent Pinchart
> > > <laurent.pinchart+renesas at ideasonboard.com>
> > > ---
> > > 
> > >  Documentation/DocBook/drm.tmpl | 14 --
> > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/Documentation/DocBook/drm.tmpl
> > > b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
> > > --- a/Documentation/DocBook/drm.tmpl
> > > +++ b/Documentation/DocBook/drm.tmpl
> > > @@ -186,11 +186,12 @@
> > > 
> > >
> > >
> > >  DRIVER_HAVE_IRQDRIVER_IRQ_SHARED
> > >  
> > > 
> > > -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
> > > handler. The -  DRM core will automatically register an
> > > interrupt handler when the -  flag is set. DRIVER_IRQ_SHARED
> > > indicates whether the device  -  handler support shared
> > > IRQs (note that this is required of PCI -  drivers).
> > > +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
> > > handler +  managed by the DRM Core. The core will support
> > > simple IRQ handler +  installation when the flag is set. The
> > > installation process is +  described in  > > linkend="drm-irq-registration"/>. + 
> > > DRIVER_IRQ_SHARED indicates whether the device  handler + 
> > > support shared IRQs (note that this is required of PCI  
> > > drivers).> 
> > >  
> > >
> > >
> > >
> > > 
> > > @@ -344,7 +345,8 @@ char *date;
> > > 
> > >The DRM core tries to facilitate IRQ handler registration and
> > >unregistration by providing
> > >drm_irq_install and
> > >drm_irq_uninstall functions. Those
> > >functions only
> > > 
> > > -  support a single interrupt per device.
> > > +  support a single interrupt per device, devices that use more
> > > than one +  IRQs need to be handled manually.
> > 
> > Perhaps this should mention that if you handle IRQ installation manually
> > you also need to manually set drm->irq_enabled = 1, as otherwise things
> > like DRM_IOCTL_WAIT_VBLANK won't work properly.
> 
> That's only needed if DRIVER_HAVE_IRQ is set, otherwise the drm_wait_vblank() 
> function skips the irq_enabled check.

Not quite. There's another check for dev->irq_enabled in the
DRM_WAIT_ON() so it'll never actually block. Arguably this could be
solved by making the DRM_WAIT_ON() condition drop the !dev->irq_enabled
in case DRIVER_HAVE_IRQ isn't set.

I'll see if I can find the time to come up with a patch.

Thierry
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/dac54ffe/attachment-0001.pgp>

[PULL] drm-tegra-fixes

Hi Dave,

The following changes since commit c7788792a5e7b0d5d7f96d0766b4cb6112d47d75:

  Linux 3.10-rc2 (2013-05-20 14:37:38 -0700)

are available in the git repository at:

  git://anongit.freedesktop.org/tegra/linux.git drm/fixes

for you to fetch changes up to 2b15aa0fd466a4b2defdfd84c0b5168804b3eb33:

  gpu: host1x: Rework CPU syncpoint increment (2013-06-20 12:21:38 +0200)

I know this comes awfully late and is much bigger than I'd like. It's
all my fault, though, since I've been travelling a lot lately and just
didn't find the time to prepare this earlier. I did want to sort out
patches that didn't warrant to go in before the final 3.10 release, but
all met the criteria as far as I could tell.

The larger chunks are fixes for the firewall that checks job submissions
from userspace, which was unfortunately not working properly. All other
patches are mostly one-liners to fix things that were introduced in
3.10.

Thierry


Arto Merilainen (5):
  gpu: host1x: Check reloc table before usage
  gpu: host1x: Copy gathers before verification
  gpu: host1x: Fix memory access in syncpt request
  gpu: host1x: Fix client_managed type
  gpu: host1x: Rework CPU syncpoint increment

Emil Goode (1):
  drm/tegra: Include header drm/drm.h

Laurent Pinchart (1):
  drm/tegra: Remove DRIVER_BUS_PLATFORM from driver_features

Terje Bergstrom (2):
  gpu: host1x: Check INCR opcode correctly
  gpu: host1x: Don't reset firewall between gathers

Thierry Reding (5):
  drm/tegra: Don't disable unused planes
  drm/tegra: Explicitly set irq_enabled
  drm/tegra: Honor pixel-format changes
  MAINTAINERS: Update Tegra DRM entry
  drm/tegra: Fix return value

Wei Yongjun (2):
  drm/tegra: fix missing unlock on error
  drm/tegra: fix error return code in gr2d_submit()

 MAINTAINERS   |   7 +-
 drivers/gpu/host1x/dev.h  |   8 +--
 drivers/gpu/host1x/drm/dc.c   |   5 ++
 drivers/gpu/host1x/drm/drm.c  |  14 +++-
 drivers/gpu/host1x/drm/gr2d.c |  12 ++--
 drivers/gpu/host1x/hw/cdma_hw.c   |   2 +-
 drivers/gpu/host1x/hw/syncpt_hw.c |  12 ++--
 drivers/gpu/host1x/job.c  | 135 +-
 drivers/gpu/host1x/syncpt.c   |  26 +++-
 drivers/gpu/host1x/syncpt.h   |  13 ++--
 include/uapi/drm/tegra_drm.h  |   2 +
 11 files changed, 113 insertions(+), 123 deletions(-)
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/3fe39495/attachment.pgp>

[GIT PULL FOR v3.11] shmob-drm patches

Hi Dave,

Could you please pull the following four shmob-drm patches for v3.11 ?

The following changes since commit dc28aa072f502433b6adc5c9ae8f56955c07580a:

  gpu:drm:tilcdc: get preferred_bpp value from DT (2013-06-20 14:08:01 +1000)

are available in the git repository at:

  git://linuxtv.org/pinchartl/fbdev.git drm/shmob

for you to fetch changes up to 227c1fb28c7d01ccbd9203e42734c06b470d1744:

  drm/shmobile: Enable compilation on all ARM platforms (2013-06-20 10:07:14 
+0200)


Laurent Pinchart (4):
  drm/shmobile: Minor typo fix in debug message
  drm/shmobile: Use devm_* managed functions
  drm/shmobile: Add DRM PRIME support
  drm/shmobile: Enable compilation on all ARM platforms

 drivers/gpu/drm/shmobile/Kconfig   |  2 +-
 drivers/gpu/drm/shmobile/shmob_drm_drv.c   | 35 +
 drivers/gpu/drm/shmobile/shmob_drm_kms.c   |  2 +-
 drivers/gpu/drm/shmobile/shmob_drm_plane.c |  7 +--
 4 files changed, 18 insertions(+), 28 deletions(-)

-- 
Regards,

Laurent Pinchart

[PATCH] drm: Improve manual IRQ installation documentation

Hi Thierry,

On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
> On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
> > Signed-off-by: Laurent Pinchart
> > <laurent.pinchart+renesas at ideasonboard.com>
> > ---
> > 
> >  Documentation/DocBook/drm.tmpl | 14 --
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> > 
> > diff --git a/Documentation/DocBook/drm.tmpl
> > b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
> > --- a/Documentation/DocBook/drm.tmpl
> > +++ b/Documentation/DocBook/drm.tmpl
> > @@ -186,11 +186,12 @@
> > 
> >
> >
> >  DRIVER_HAVE_IRQDRIVER_IRQ_SHARED
> >  
> > 
> > -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
> > handler. The -  DRM core will automatically register an
> > interrupt handler when the -  flag is set. DRIVER_IRQ_SHARED
> > indicates whether the device  -  handler support shared
> > IRQs (note that this is required of PCI -  drivers).
> > +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
> > handler +  managed by the DRM Core. The core will support
> > simple IRQ handler +  installation when the flag is set. The
> > installation process is +  described in  > linkend="drm-irq-registration"/>. + 
> > DRIVER_IRQ_SHARED indicates whether the device  handler + 
> > support shared IRQs (note that this is required of PCI  drivers).> 
> >  
> >
> >
> >
> > 
> > @@ -344,7 +345,8 @@ char *date;
> > 
> >The DRM core tries to facilitate IRQ handler registration and
> >unregistration by providing
> >drm_irq_install and
> >drm_irq_uninstall functions. Those
> >functions only
> > 
> > -  support a single interrupt per device.
> > +  support a single interrupt per device, devices that use more
> > than one +  IRQs need to be handled manually.
> 
> Perhaps this should mention that if you handle IRQ installation manually
> you also need to manually set drm->irq_enabled = 1, as otherwise things
> like DRM_IOCTL_WAIT_VBLANK won't work properly.

That's only needed if DRIVER_HAVE_IRQ is set, otherwise the drm_wait_vblank() 
function skips the irq_enabled check.

-- 
Regards,

Laurent Pinchart
-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: This is a digitally signed message part.
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/04e70f6b/attachment.pgp>

[RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

Am Donnerstag, den 20.06.2013, 17:24 +0900 schrieb Inki Dae:
[...]
> > > In addition, please see the below more detail examples.
> > >
> > > The conventional way (without dmabuf-sync) is:
> > > Task A
> > > 
> > >  1. CPU accesses buf
> > >  2. Send the buf to Task B
> > >  3. Wait for the buf from Task B
> > >  4. go to 1
> > >
> > > Task B
> > > ---
> > > 1. Wait for the buf from Task A
> > > 2. qbuf the buf
> > > 2.1 insert the buf to incoming queue
> > > 3. stream on
> > > 3.1 dma_map_sg if ready, and move the buf to ready queue
> > > 3.2 get the buf from ready queue, and dma start.
> > > 4. dqbuf
> > > 4.1 dma_unmap_sg after dma operation completion
> > > 4.2 move the buf to outgoing queue
> > > 5. back the buf to Task A
> > > 6. go to 1
> > >
> > > In case that two tasks share buffers, and data flow goes from Task A to
> > Task
> > > B, we would need IPC operation to send and receive buffers properly
> > between
> > > those two tasks every time CPU or DMA access to buffers is started or
> > > completed.
> > >
> > >
> > > With dmabuf-sync:
> > >
> > > Task A
> > > 
> > >  1. dma_buf_sync_lock <- synpoint (call by user side)
> > >  2. CPU accesses buf
> > >  3. dma_buf_sync_unlock <- syncpoint (call by user side)
> > >  4. Send the buf to Task B (just one time)
> > >  5. go to 1
> > >
> > >
> > > Task B
> > > ---
> > > 1. Wait for the buf from Task A (just one time)
> > > 2. qbuf the buf
> > > 1.1 insert the buf to incoming queue
> > > 3. stream on
> > > 3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
> > > 3.2 dma_map_sg if ready, and move the buf to ready queue
> > > 3.3 get the buf from ready queue, and dma start.
> > > 4. dqbuf
> > > 4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
> > > 4.2 dma_unmap_sg after dma operation completion
> > > 4.3 move the buf to outgoing queue
> > > 5. go to 1
> > >
> > > On the other hand, in case of using dmabuf-sync, as you can see the
> > above
> > > example, we would need IPC operation just one time. That way, I think we
> > > could not only reduce performance overhead but also make user
> > application
> > > simplified. Of course, this approach can be used for all DMA device
> > drivers
> > > such as DRM. I'm not a specialist in v4l2 world so there may be missing
> > > point.
> > >
> > 
> > You already need some kind of IPC between the two tasks, as I suspect
> > even in your example it wouldn't make much sense to queue the buffer
> > over and over again in task B without task A writing anything to it. So
> > task A has to signal task B there is new data in the buffer to be
> > processed.
> > 
> > There is no need to share the buffer over and over again just to get the
> > two processes to work together on the same thing. Just share the fd
> > between both and then do out-of-band completion signaling, as you need
> > this anyway. Without this you'll end up with unpredictable behavior.
> > Just because sync allows you to access the buffer doesn't mean it's
> > valid for your use-case. Without completion signaling you could easily
> > end up overwriting your data from task A multiple times before task B
> > even tries to lock the buffer for processing.
> > 
> > So the valid flow is (and this already works with the current APIs):
> > Task ATask B
> > ----
> > CPU access buffer
> >  --completion signal->
> >   qbuf (dragging buffer into
> >   device domain, flush caches,
> >   reserve buffer etc.)
> > |
> >   wait for device operation to
> >   complete
> > |
> >   dqbuf (dragging buffer back
> >   into CPU domain, invalidate
> >   caches, unreserve)
> > <-completion signal
> > CPU access buffer
> > 
> 
> Correct. In case that data flow goes from A to B, it needs some kind
> of IPC between the two tasks every time as you said. Then, without
> dmabuf-sync, how do think about the case that two tasks share the same
> buffer but these tasks access the buffer(buf1) as write, and data of
> the buffer(buf1) isn't needed to be shared?
> 
Sorry, I don't see the point you are trying to solve here. If you share
a buffer and want its content to be clearly defined at every point in
time you have to synchronize the tasks working with the buffer, not just
the buffer accesses itself.

Easiest way to do so is doing sync through userspace with out-of-band
IPC, like in

[PATCH] drm: Improve manual IRQ installation documentation

On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas at ideasonboard.com>
> ---
>  Documentation/DocBook/drm.tmpl | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl
> index f9df3b8..738b727 100644
> --- a/Documentation/DocBook/drm.tmpl
> +++ b/Documentation/DocBook/drm.tmpl
> @@ -186,11 +186,12 @@
>
>  DRIVER_HAVE_IRQDRIVER_IRQ_SHARED
>  
> -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ 
> handler. The
> -  DRM core will automatically register an interrupt handler when 
> the
> -  flag is set. DRIVER_IRQ_SHARED indicates whether the device 
> 
> -  handler support shared IRQs (note that this is required of PCI
> -  drivers).
> +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ handler
> +  managed by the DRM Core. The core will support simple IRQ 
> handler
> +  installation when the flag is set. The installation process is
> +  described in .
> +  DRIVER_IRQ_SHARED indicates whether the device  
> handler
> +  support shared IRQs (note that this is required of PCI  
> drivers).
>  
>
>
> @@ -344,7 +345,8 @@ char *date;
>The DRM core tries to facilitate IRQ handler registration and
>unregistration by providing drm_irq_install 
> and
>drm_irq_uninstall functions. Those functions 
> only
> -  support a single interrupt per device.
> +  support a single interrupt per device, devices that use more than 
> one
> +  IRQs need to be handled manually.

Perhaps this should mention that if you handle IRQ installation manually
you also need to manually set drm->irq_enabled = 1, as otherwise things
like DRM_IOCTL_WAIT_VBLANK won't work properly.

Thierry
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/47a949f9/attachment.pgp>

[PATCH 2/2] drm/qxl: add support for cursor hotspot.

From: Dave Airlie 

This uses the cursor hotspot info from userspace and passes
it to the qxl hw layer.

Signed-off-by: Dave Airlie 
---
 drivers/gpu/drm/qxl/qxl_display.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c 
b/drivers/gpu/drm/qxl/qxl_display.c
index 823d29e..489e879 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -255,11 +255,11 @@ qxl_hide_cursor(struct qxl_device *qdev)
qxl_release_unreserve(qdev, release);
 }

-static int qxl_crtc_cursor_set(struct drm_crtc *crtc,
-  struct drm_file *file_priv,
-  uint32_t handle,
-  uint32_t width,
-  uint32_t height)
+static int qxl_crtc_cursor_set2(struct drm_crtc *crtc,
+   struct drm_file *file_priv,
+   uint32_t handle,
+   uint32_t width,
+   uint32_t height, int32_t hot_x, int32_t hot_y)
 {
struct drm_device *dev = crtc->dev;
struct qxl_device *qdev = dev->dev_private;
@@ -315,8 +315,8 @@ static int qxl_crtc_cursor_set(struct drm_crtc *crtc,
cursor->header.type = SPICE_CURSOR_TYPE_ALPHA;
cursor->header.width = 64;
cursor->header.height = 64;
-   cursor->header.hot_spot_x = 0;
-   cursor->header.hot_spot_y = 0;
+   cursor->header.hot_spot_x = hot_x;
+   cursor->header.hot_spot_y = hot_y;
cursor->data_size = size;
cursor->chunk.next_chunk = 0;
cursor->chunk.prev_chunk = 0;
@@ -397,7 +397,7 @@ static int qxl_crtc_cursor_move(struct drm_crtc *crtc,


 static const struct drm_crtc_funcs qxl_crtc_funcs = {
-   .cursor_set = qxl_crtc_cursor_set,
+   .cursor_set2 = qxl_crtc_cursor_set2,
.cursor_move = qxl_crtc_cursor_move,
.gamma_set = qxl_crtc_gamma_set,
.set_config = drm_crtc_helper_set_config,
-- 
1.8.2.1

[PATCH 1/2] drm: add hotspot support for cursors.

From: Dave Airlie 

So it looks like for virtual hw cursors on QXL we need to inform
the "hw" device what the cursor hotspot parameters are. This
makes sense if you think the host has to draw the cursor and interpret
clicks from it. However the current modesetting interface doesn't support
passing the hotspot information from userspace.

This implements a new cursor ioctl, that takes the hotspot info as well,
userspace can try calling the new interface and if it -ENOSYS, can just
fallback to the old non-hotspot interface.

Signed-off-by: Dave Airlie 
---
 drivers/gpu/drm/drm_crtc.c  | 35 +--
 drivers/gpu/drm/drm_drv.c   |  1 +
 include/drm/drm_crtc.h  |  5 +
 include/uapi/drm/drm.h  |  1 +
 include/uapi/drm/drm_mode.h | 13 +
 5 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index e7e9242..cc9eada 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -2099,10 +2099,10 @@ out:
return ret;
 }

-int drm_mode_cursor_ioctl(struct drm_device *dev,
-   void *data, struct drm_file *file_priv)
+static int drm_mode_cursor_common(struct drm_device *dev,
+ struct drm_mode_cursor2 *req,
+ struct drm_file *file_priv)
 {
-   struct drm_mode_cursor *req = data;
struct drm_mode_object *obj;
struct drm_crtc *crtc;
int ret = 0;
@@ -2122,13 +2122,17 @@ int drm_mode_cursor_ioctl(struct drm_device *dev,

mutex_lock(>mutex);
if (req->flags & DRM_MODE_CURSOR_BO) {
-   if (!crtc->funcs->cursor_set) {
+   if (!crtc->funcs->cursor_set && !crtc->funcs->cursor_set2) {
ret = -ENXIO;
goto out;
}
/* Turns off the cursor if handle is 0 */
-   ret = crtc->funcs->cursor_set(crtc, file_priv, req->handle,
- req->width, req->height);
+   if (crtc->funcs->cursor_set2)
+   ret = crtc->funcs->cursor_set2(crtc, file_priv, 
req->handle,
+ req->width, req->height, 
req->hot_x, req->hot_y);
+   else
+   ret = crtc->funcs->cursor_set(crtc, file_priv, 
req->handle,
+ req->width, req->height);
}

if (req->flags & DRM_MODE_CURSOR_MOVE) {
@@ -2143,6 +2147,25 @@ out:
mutex_unlock(>mutex);

return ret;
+
+}
+int drm_mode_cursor_ioctl(struct drm_device *dev,
+   void *data, struct drm_file *file_priv)
+{
+   struct drm_mode_cursor *req = data;
+   struct drm_mode_cursor2 new_req;
+
+   memcpy(_req, req, sizeof(struct drm_mode_cursor));
+   new_req.hot_x = new_req.hot_y = 0;
+
+   return drm_mode_cursor_common(dev, _req, file_priv);
+}
+
+int drm_mode_cursor2_ioctl(struct drm_device *dev,
+  void *data, struct drm_file *file_priv)
+{
+   struct drm_mode_cursor2 *req = data;
+   return drm_mode_cursor_common(dev, req, file_priv);
 }

 /* Original addfb only supported RGB formats, so figure out which one */
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 9cc247f..99fcd7c 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -166,6 +166,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROY_DUMB, drm_mode_destroy_dumb_ioctl, 
DRM_CONTROL_ALLOW|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_GETPROPERTIES, 
drm_mode_obj_get_properties_ioctl, DRM_CONTROL_ALLOW|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_SETPROPERTY, 
drm_mode_obj_set_property_ioctl, DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
+   DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR2, drm_mode_cursor2_ioctl, 
DRM_MASTER|DRM_CONTROL_ALLOW|DRM_UNLOCKED),
 };

 #define DRM_CORE_IOCTL_COUNT   ARRAY_SIZE( drm_ioctls )
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index adb3f9b..093c030 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -339,6 +339,9 @@ struct drm_crtc_funcs {
/* cursor controls */
int (*cursor_set)(struct drm_crtc *crtc, struct drm_file *file_priv,
  uint32_t handle, uint32_t width, uint32_t height);
+   int (*cursor_set2)(struct drm_crtc *crtc, struct drm_file *file_priv,
+  uint32_t handle, uint32_t width, uint32_t height,
+  int32_t hot_x, int32_t hot_y);
int (*cursor_move)(struct drm_crtc *crtc, int x, int y);

/* Set gamma on the CRTC */
@@ -1022,6 +1025,8 @@ extern int drm_mode_setplane(struct drm_device *dev,
   void *data, struct drm_file *file_priv);
 extern int drm_mode_cursor_ioctl(struct

[PATCH v3 2/3] drm/exynos: add support for exynos5420 mixer

2013-06-20 Thread 김승우

Hello Rahul,

This patch is exactly same with v2 2/4 and only rebased on v3 1/3, so my
ack is valid for this patch.

On 2013? 06? 19? 21:51, Rahul Sharma wrote:
> Add support for exynos5420 mixer IP in the drm mixer driver.
> 
> Signed-off-by: Rahul Sharma 

Acked-by: Seung-Woo Kim 

Anyway, this patch can be merged after merging [Patch v3 1/3] as like v2.

Best Regards,
- Seung-Woo Kim

> ---
>  .../devicetree/bindings/video/exynos_mixer.txt |1 +
>  drivers/gpu/drm/exynos/exynos_mixer.c  |   49 
> +++-
>  drivers/gpu/drm/exynos/regs-mixer.h|7 +++
>  3 files changed, 45 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/video/exynos_mixer.txt 
> b/Documentation/devicetree/bindings/video/exynos_mixer.txt
> index 9131b99..3334b0a 100644
> --- a/Documentation/devicetree/bindings/video/exynos_mixer.txt
> +++ b/Documentation/devicetree/bindings/video/exynos_mixer.txt
> @@ -5,6 +5,7 @@ Required properties:
>   1) "samsung,exynos5-mixer" 
>   2) "samsung,exynos4210-mixer"
>   3) "samsung,exynos5250-mixer"
> + 4) "samsung,exynos5420-mixer"
>  
>  - reg: physical base address of the mixer and length of memory mapped
>   region.
> diff --git a/drivers/gpu/drm/exynos/exynos_mixer.c 
> b/drivers/gpu/drm/exynos/exynos_mixer.c
> index 6225501..b1280b4 100644
> --- a/drivers/gpu/drm/exynos/exynos_mixer.c
> +++ b/drivers/gpu/drm/exynos/exynos_mixer.c
> @@ -78,6 +78,7 @@ struct mixer_resources {
>  enum mixer_version_id {
>   MXR_VER_0_0_0_16,
>   MXR_VER_16_0_33_0,
> + MXR_VER_128_0_0_184,
>  };
>  
>  struct mixer_context {
> @@ -283,17 +284,19 @@ static void mixer_cfg_scan(struct mixer_context *ctx, 
> unsigned int height)
>   val = (ctx->interlace ? MXR_CFG_SCAN_INTERLACE :
>   MXR_CFG_SCAN_PROGRASSIVE);
>  
> - /* choosing between porper HD and SD mode */
> - if (height <= 480)
> - val |= MXR_CFG_SCAN_NTSC | MXR_CFG_SCAN_SD;
> - else if (height <= 576)
> - val |= MXR_CFG_SCAN_PAL | MXR_CFG_SCAN_SD;
> - else if (height <= 720)
> - val |= MXR_CFG_SCAN_HD_720 | MXR_CFG_SCAN_HD;
> - else if (height <= 1080)
> - val |= MXR_CFG_SCAN_HD_1080 | MXR_CFG_SCAN_HD;
> - else
> - val |= MXR_CFG_SCAN_HD_720 | MXR_CFG_SCAN_HD;
> + if (ctx->mxr_ver != MXR_VER_128_0_0_184) {
> + /* choosing between proper HD and SD mode */
> + if (height <= 480)
> + val |= MXR_CFG_SCAN_NTSC | MXR_CFG_SCAN_SD;
> + else if (height <= 576)
> + val |= MXR_CFG_SCAN_PAL | MXR_CFG_SCAN_SD;
> + else if (height <= 720)
> + val |= MXR_CFG_SCAN_HD_720 | MXR_CFG_SCAN_HD;
> + else if (height <= 1080)
> + val |= MXR_CFG_SCAN_HD_1080 | MXR_CFG_SCAN_HD;
> + else
> + val |= MXR_CFG_SCAN_HD_720 | MXR_CFG_SCAN_HD;
> + }
>  
>   mixer_reg_writemask(res, MXR_CFG, val, MXR_CFG_SCAN_MASK);
>  }
> @@ -557,6 +560,14 @@ static void mixer_graph_buffer(struct mixer_context 
> *ctx, int win)
>   /* setup geometry */
>   mixer_reg_write(res, MXR_GRAPHIC_SPAN(win), win_data->fb_width);
>  
> + /* setup display size */
> + if (ctx->mxr_ver == MXR_VER_128_0_0_184 &&
> + win == MIXER_DEFAULT_WIN) {
> + val  = MXR_MXR_RES_HEIGHT(win_data->fb_height);
> + val |= MXR_MXR_RES_WIDTH(win_data->fb_width);
> + mixer_reg_write(res, MXR_RESOLUTION, val);
> + }
> +
>   val  = MXR_GRP_WH_WIDTH(win_data->crtc_width);
>   val |= MXR_GRP_WH_HEIGHT(win_data->crtc_height);
>   val |= MXR_GRP_WH_H_SCALE(x_ratio);
> @@ -581,7 +592,8 @@ static void mixer_graph_buffer(struct mixer_context *ctx, 
> int win)
>   mixer_cfg_layer(ctx, win, true);
>  
>   /* layer update mandatory for mixer 16.0.33.0 */
> - if (ctx->mxr_ver == MXR_VER_16_0_33_0)
> + if (ctx->mxr_ver == MXR_VER_16_0_33_0 ||
> + ctx->mxr_ver == MXR_VER_128_0_0_184)
>   mixer_layer_update(ctx);
>  
>   mixer_run(ctx);
> @@ -816,6 +828,7 @@ static void mixer_win_disable(void *ctx, int win)
>  
>  static int mixer_check_mode(void *ctx, struct drm_display_mode *mode)
>  {
> + struct mixer_context *mixer_ctx = ctx;
>   u32 w, h;
>  
>   w = mode->hdisplay;
> @@ -825,6 +838,10 @@ static int mixer_check_mode(void *ctx, struct 
> drm_display_mode *mode)
>   mode->hdisplay, mode->vdisplay, mode->vrefresh,
>   (mode->flags & DRM_MODE_FLAG_INTERLACE) ? 1 : 0);
>  
> + if (mixer_ctx->mxr_ver == MXR_VER_0_0_0_16 ||
> + mixer_ctx->mxr_ver == MXR_VER_128_0_0_184)
> + return 0;
> +
>   if ((w >= 464 && w <= 720 && h >= 261 && h <= 576) ||
>   (w >= 1024 && w <= 1280 && h >= 576 && h <= 720) ||
>   (w >= 1664 && w <= 1920 && h >= 936 && h

[RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

Am Donnerstag, den 20.06.2013, 09:17 +0100 schrieb Russell King - ARM
Linux:
> On Thu, Jun 20, 2013 at 09:47:07AM +0200, Lucas Stach wrote:
> > Am Donnerstag, den 20.06.2013, 15:43 +0900 schrieb Inki Dae:
> > > 
> > > > -Original Message-
> > > > From: dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org
> > > > [mailto:dri-devel-bounces+inki.dae=samsung.com at 
> > > > lists.freedesktop.org] On
> > > > Behalf Of Russell King - ARM Linux
> > > > Sent: Thursday, June 20, 2013 3:29 AM
> > > > To: Inki Dae
> > > > Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
> > > > Cho; linux-media at vger.kernel.org; linux-arm-kernel at 
> > > > lists.infradead.org
> > > > Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer 
> > > > synchronization
> > > > framework
> > > > 
> > > > On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
> > > > > On the other hand, the below shows how we could enhance the 
> > > > > conventional
> > > > > way with my approach (just example):
> > > > >
> > > > > CPU -> DMA,
> > > > > ioctl(qbuf command)  ioctl(streamon)
> > > > >   |   |
> > > > >   |   |
> > > > > qbuf  <- dma_buf_sync_get   start streaming <- syncpoint
> > > > >
> > > > > dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
> > > > And
> > > > > the syncpoint is performed by calling dma_buf_sync_lock(), and then 
> > > > > DMA
> > > > > accesses the sync buffer.
> > > > >
> > > > > And DMA -> CPU,
> > > > > ioctl(dqbuf command)
> > > > >   |
> > > > >   |
> > > > > dqbuf <- nothing to do
> > > > >
> > > > > Actual syncpoint is when DMA operation is completed (in interrupt
> > > > handler):
> > > > > the syncpoint is performed by calling dma_buf_sync_unlock().
> > > > > Hence,  my approach is to move the syncpoints into just before dma
> > > > access
> > > > > as long as possible.
> > > > 
> > > > What you've just described does *not* work on architectures such as
> > > > ARMv7 which do speculative cache fetches from memory at any time that
> > > > that memory is mapped with a cacheable status, and will lead to data
> > > > corruption.
> > > 
> > > I didn't explain that enough. Sorry about that. 'nothing to do' means 
> > > that a
> > > dmabuf sync interface isn't called but existing functions are called. So
> > > this may be explained again:
> > > ioctl(dqbuf command)
> > > |
> > > |
> > > dqbuf <- 1. dma_unmap_sg
> > > 2. dma_buf_sync_unlock (syncpoint)
> > > 
> > > The syncpoint I mentioned means lock mechanism; not doing cache operation.
> > > 
> > > In addition, please see the below more detail examples.
> > > 
> > > The conventional way (without dmabuf-sync) is:
> > > Task A 
> > > 
> > >  1. CPU accesses buf  
> > >  2. Send the buf to Task B  
> > >  3. Wait for the buf from Task B
> > >  4. go to 1
> > > 
> > > Task B
> > > ---
> > > 1. Wait for the buf from Task A
> > > 2. qbuf the buf 
> > > 2.1 insert the buf to incoming queue
> > > 3. stream on
> > > 3.1 dma_map_sg if ready, and move the buf to ready queue
> > > 3.2 get the buf from ready queue, and dma start.
> > > 4. dqbuf
> > > 4.1 dma_unmap_sg after dma operation completion
> > > 4.2 move the buf to outgoing queue
> > > 5. back the buf to Task A
> > > 6. go to 1
> > > 
> > > In case that two tasks share buffers, and data flow goes from Task A to 
> > > Task
> > > B, we would need IPC operation to send and receive buffers properly 
> > > between
> > > those two tasks every time CPU or DMA access to buffers is started or
> > > completed.
> > > 
> > > 
> > > With dmabuf-sync:
> > > 
> > > Task A 
> > > 
> > >  1. dma_buf_sync_lock <- synpoint (call by user side)
> > >  2. CPU accesses buf  
> > >  3. dma_buf_sync_unlock <- syncpoint (call by user side)
> > >  4. Send the buf to Task B (just one time)
> > >  5. go to 1
> > > 
> > > 
> > > Task B
> > > ---
> > > 1. Wait for the buf from Task A (just one time)
> > > 2. qbuf the buf 
> > > 1.1 insert the buf to incoming queue
> > > 3. stream on
> > > 3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
> > > 3.2 dma_map_sg if ready, and move the buf to ready queue
> > > 3.3 get the buf from ready queue, and dma start.
> > > 4. dqbuf
> > > 4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
> > > 4.2 dma_unmap_sg after dma operation completion
> > > 4.3 move the buf to outgoing queue
> > > 5. go to 1
> > > 
> > > On the other hand, in case of using dmabuf-sync, as you can see the above
> > > example, we would need IPC operation just one time.

[RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

Am Donnerstag, den 20.06.2013, 15:43 +0900 schrieb Inki Dae:
> 
> > -Original Message-
> > From: dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org
> > [mailto:dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org] On
> > Behalf Of Russell King - ARM Linux
> > Sent: Thursday, June 20, 2013 3:29 AM
> > To: Inki Dae
> > Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
> > Cho; linux-media at vger.kernel.org; linux-arm-kernel at lists.infradead.org
> > Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> > framework
> > 
> > On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
> > > On the other hand, the below shows how we could enhance the conventional
> > > way with my approach (just example):
> > >
> > > CPU -> DMA,
> > > ioctl(qbuf command)  ioctl(streamon)
> > >   |   |
> > >   |   |
> > > qbuf  <- dma_buf_sync_get   start streaming <- syncpoint
> > >
> > > dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
> > And
> > > the syncpoint is performed by calling dma_buf_sync_lock(), and then DMA
> > > accesses the sync buffer.
> > >
> > > And DMA -> CPU,
> > > ioctl(dqbuf command)
> > >   |
> > >   |
> > > dqbuf <- nothing to do
> > >
> > > Actual syncpoint is when DMA operation is completed (in interrupt
> > handler):
> > > the syncpoint is performed by calling dma_buf_sync_unlock().
> > > Hence,  my approach is to move the syncpoints into just before dma
> > access
> > > as long as possible.
> > 
> > What you've just described does *not* work on architectures such as
> > ARMv7 which do speculative cache fetches from memory at any time that
> > that memory is mapped with a cacheable status, and will lead to data
> > corruption.
> 
> I didn't explain that enough. Sorry about that. 'nothing to do' means that a
> dmabuf sync interface isn't called but existing functions are called. So
> this may be explained again:
> ioctl(dqbuf command)
> |
> |
> dqbuf <- 1. dma_unmap_sg
> 2. dma_buf_sync_unlock (syncpoint)
> 
> The syncpoint I mentioned means lock mechanism; not doing cache operation.
> 
> In addition, please see the below more detail examples.
> 
> The conventional way (without dmabuf-sync) is:
> Task A 
> 
>  1. CPU accesses buf  
>  2. Send the buf to Task B  
>  3. Wait for the buf from Task B
>  4. go to 1
> 
> Task B
> ---
> 1. Wait for the buf from Task A
> 2. qbuf the buf 
> 2.1 insert the buf to incoming queue
> 3. stream on
> 3.1 dma_map_sg if ready, and move the buf to ready queue
> 3.2 get the buf from ready queue, and dma start.
> 4. dqbuf
> 4.1 dma_unmap_sg after dma operation completion
> 4.2 move the buf to outgoing queue
> 5. back the buf to Task A
> 6. go to 1
> 
> In case that two tasks share buffers, and data flow goes from Task A to Task
> B, we would need IPC operation to send and receive buffers properly between
> those two tasks every time CPU or DMA access to buffers is started or
> completed.
> 
> 
> With dmabuf-sync:
> 
> Task A 
> 
>  1. dma_buf_sync_lock <- synpoint (call by user side)
>  2. CPU accesses buf  
>  3. dma_buf_sync_unlock <- syncpoint (call by user side)
>  4. Send the buf to Task B (just one time)
>  5. go to 1
> 
> 
> Task B
> ---
> 1. Wait for the buf from Task A (just one time)
> 2. qbuf the buf 
> 1.1 insert the buf to incoming queue
> 3. stream on
> 3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
> 3.2 dma_map_sg if ready, and move the buf to ready queue
> 3.3 get the buf from ready queue, and dma start.
> 4. dqbuf
> 4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
> 4.2 dma_unmap_sg after dma operation completion
> 4.3 move the buf to outgoing queue
> 5. go to 1
> 
> On the other hand, in case of using dmabuf-sync, as you can see the above
> example, we would need IPC operation just one time. That way, I think we
> could not only reduce performance overhead but also make user application
> simplified. Of course, this approach can be used for all DMA device drivers
> such as DRM. I'm not a specialist in v4l2 world so there may be missing
> point.
> 

You already need some kind of IPC between the two tasks, as I suspect
even in your example it wouldn't make much sense to queue the buffer
over and over again in task B without task A writing anything to it. So
task A has to signal task B there is new data in the buffer to be
processed.

There is no need to share the buffer over and over again just to get the
two processes to work together on the same

[PATCH v3 1/3] drm/exynos: add new compatible strings for hdmi subsystem

2013-06-20 Thread Tomasz Figa

Hi Rahul,

On Thursday 20 of June 2013 07:41:53 Rahul Sharma wrote:
> Hi Tomasz, Lucas,
> 
> How does this patch look to you ? Please share your views.

Looks fine now. Have my

Reviewed-by: Tomasz Figa 

for the whole series.

Best regards,
Tomasz

> regards,
> Rahul Sharma.
> 
> On Wed, Jun 19, 2013 at 6:21 PM, Rahul Sharma  
wrote:
> > This patch adds new combatible strings for hdmi, mixer, ddc
> > and hdmiphy. It follows the convention of using compatible string
> > which represent the SoC in which the IP was added for the first
> > time.
> > 
> > Drivers continue to support the previous compatible strings
> > but further addition of these compatible strings in device tree
> > is deprecated.
> > 
> > Signed-off-by: Rahul Sharma 
> > ---
> > 
> >  Documentation/devicetree/bindings/video/exynos_hdmi.txt   |7
> >  +-- .../devicetree/bindings/video/exynos_hdmiddc.txt  | 
> >7 +-- .../devicetree/bindings/video/exynos_hdmiphy.txt
> >   |7 +--
> >  Documentation/devicetree/bindings/video/exynos_mixer.txt  |8
> >  ++-- drivers/gpu/drm/exynos/exynos_ddc.c   |
> > 2 ++ drivers/gpu/drm/exynos/exynos_hdmi.c  | 
> >3 +++ drivers/gpu/drm/exynos/exynos_hdmiphy.c   | 
> >4  drivers/gpu/drm/exynos/exynos_mixer.c |
> >13 - 8 files changed, 38 insertions(+), 13
> >  deletions(-)
> > 
> > diff --git a/Documentation/devicetree/bindings/video/exynos_hdmi.txt
> > b/Documentation/devicetree/bindings/video/exynos_hdmi.txt index
> > 589edee..c71d0f0 100644
> > --- a/Documentation/devicetree/bindings/video/exynos_hdmi.txt
> > +++ b/Documentation/devicetree/bindings/video/exynos_hdmi.txt
> > @@ -1,7 +1,10 @@
> > 
> >  Device-Tree bindings for drm hdmi driver
> > 
> >  Required properties:
> > -- compatible: value should be "samsung,exynos5-hdmi".
> > +- compatible: value should be one among the following:
> > +   1) "samsung,exynos5-hdmi" 
> > +   2) "samsung,exynos4210-hdmi"
> > +   3) "samsung,exynos4212-hdmi"
> > 
> >  - reg: physical base address of the hdmi and length of memory mapped
> >  
> > region.
> >  
> >  - interrupts: interrupt number to the cpu.
> > 
> > @@ -15,7 +18,7 @@ Required properties:
> >  Example:
> > hdmi {
> > 
> > -   compatible = "samsung,exynos5-hdmi";
> > +   compatible = "samsung,exynos4212-hdmi";
> > 
> > reg = <0x1453 0x10>;
> > interrupts = <0 95 0>;
> > hpd-gpio = < 7 0xf 1 3>;
> > 
> > diff --git
> > a/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
> > b/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt index
> > fa166d9..41eee97 100644
> > --- a/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
> > +++ b/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
> > @@ -1,12 +1,15 @@
> > 
> >  Device-Tree bindings for hdmiddc driver
> > 
> >  Required properties:
> > -- compatible: value should be "samsung,exynos5-hdmiddc".
> > +- compatible: value should be one of the following
> > +   1) "samsung,exynos5-hdmiddc" 
> > +   2) "samsung,exynos4210-hdmiddc"
> > +
> > 
> >  - reg: I2C address of the hdmiddc device.
> >  
> >  Example:
> > hdmiddc {
> > 
> > -   compatible = "samsung,exynos5-hdmiddc";
> > +   compatible = "samsung,exynos4210-hdmiddc";
> > 
> > reg = <0x50>;
> > 
> > };
> > 
> > diff --git
> > a/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
> > b/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt index
> > 858f4f9..162f641 100644
> > --- a/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
> > +++ b/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
> > @@ -1,12 +1,15 @@
> > 
> >  Device-Tree bindings for hdmiphy driver
> > 
> >  Required properties:
> > -- compatible: value should be "samsung,exynos5-hdmiphy".
> > +- compatible: value should be one of the following:
> > +   1) "samsung,exynos5-hdmiphy" 
> > +   2) "samsung,exynos4210-hdmiphy".
> > +   3) "samsung,exynos4212-hdmiphy".
> > 
> >  - reg: I2C address of the hdmiphy device.
> >  
> >  Example:
> > hdmiphy {
> > 
> > -   compatible = "samsung,exynos5-hdmiphy";
> > +   compatible = "samsung,exynos4210-hdmiphy";
> > 
> > reg = <0x38>;
> > 
> > };
> > 
> > diff --git a/Documentation/devicetree/bindings/video/exynos_mixer.txt
> > b/Documentation/devicetree/bindings/video/exynos_mixer.txt index
> > 9b2ea03..9131b99 100644
> > --- a/Documentation/devicetree/bindings/video/exynos_mixer.txt
> > +++ b/Documentation/devicetree/bindings/video/exynos_mixer.txt
> > @@ -1,7 +1,11 @@
> > 
> >  Device-Tree bindings for mixer driver
> > 
> >  Required properties:
> > -- compatible: value should be "samsung,exynos5-mixer".
> > +- compatible: value should be one of the

[Bug 65958] GPU Lockup on Trinity 7500G

https://bugs.freedesktop.org/show_bug.cgi?id=65958

--- Comment #1 from Mike Lothian  ---
Created attachment 81099
  --> https://bugs.freedesktop.org/attachment.cgi?id=81099=edit
Lspci

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/5d354bf6/attachment-0001.html>

[Bug 65958] New: GPU Lockup on Trinity 7500G

https://bugs.freedesktop.org/show_bug.cgi?id=65958

  Priority: medium
Bug ID: 65958
  Assignee: dri-devel at lists.freedesktop.org
   Summary: GPU Lockup on Trinity 7500G
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: mike at fireburn.co.uk
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: XOrg CVS
 Component: DRM/Radeon
   Product: DRI

Created attachment 81098
  --> https://bugs.freedesktop.org/attachment.cgi?id=81098=edit
Dmesg

I'm running the latest git from Linus's tree with xf86-video-ati libdrm and
mesa from git

The lockup seems to happen when video is playing. It's Xv rather than vdpau
that seems to cause this

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/c4274402/attachment.html>

[RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-20 Thread Russell King - ARM Linux

On Thu, Jun 20, 2013 at 09:47:07AM +0200, Lucas Stach wrote:
> Am Donnerstag, den 20.06.2013, 15:43 +0900 schrieb Inki Dae:
> > 
> > > -Original Message-
> > > From: dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org
> > > [mailto:dri-devel-bounces+inki.dae=samsung.com at lists.freedesktop.org] 
> > > On
> > > Behalf Of Russell King - ARM Linux
> > > Sent: Thursday, June 20, 2013 3:29 AM
> > > To: Inki Dae
> > > Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
> > > Cho; linux-media at vger.kernel.org; linux-arm-kernel at 
> > > lists.infradead.org
> > > Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> > > framework
> > > 
> > > On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
> > > > On the other hand, the below shows how we could enhance the conventional
> > > > way with my approach (just example):
> > > >
> > > > CPU -> DMA,
> > > > ioctl(qbuf command)  ioctl(streamon)
> > > >   |   |
> > > >   |   |
> > > > qbuf  <- dma_buf_sync_get   start streaming <- syncpoint
> > > >
> > > > dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
> > > And
> > > > the syncpoint is performed by calling dma_buf_sync_lock(), and then DMA
> > > > accesses the sync buffer.
> > > >
> > > > And DMA -> CPU,
> > > > ioctl(dqbuf command)
> > > >   |
> > > >   |
> > > > dqbuf <- nothing to do
> > > >
> > > > Actual syncpoint is when DMA operation is completed (in interrupt
> > > handler):
> > > > the syncpoint is performed by calling dma_buf_sync_unlock().
> > > > Hence,  my approach is to move the syncpoints into just before dma
> > > access
> > > > as long as possible.
> > > 
> > > What you've just described does *not* work on architectures such as
> > > ARMv7 which do speculative cache fetches from memory at any time that
> > > that memory is mapped with a cacheable status, and will lead to data
> > > corruption.
> > 
> > I didn't explain that enough. Sorry about that. 'nothing to do' means that a
> > dmabuf sync interface isn't called but existing functions are called. So
> > this may be explained again:
> > ioctl(dqbuf command)
> > |
> > |
> > dqbuf <- 1. dma_unmap_sg
> > 2. dma_buf_sync_unlock (syncpoint)
> > 
> > The syncpoint I mentioned means lock mechanism; not doing cache operation.
> > 
> > In addition, please see the below more detail examples.
> > 
> > The conventional way (without dmabuf-sync) is:
> > Task A 
> > 
> >  1. CPU accesses buf  
> >  2. Send the buf to Task B  
> >  3. Wait for the buf from Task B
> >  4. go to 1
> > 
> > Task B
> > ---
> > 1. Wait for the buf from Task A
> > 2. qbuf the buf 
> > 2.1 insert the buf to incoming queue
> > 3. stream on
> > 3.1 dma_map_sg if ready, and move the buf to ready queue
> > 3.2 get the buf from ready queue, and dma start.
> > 4. dqbuf
> > 4.1 dma_unmap_sg after dma operation completion
> > 4.2 move the buf to outgoing queue
> > 5. back the buf to Task A
> > 6. go to 1
> > 
> > In case that two tasks share buffers, and data flow goes from Task A to Task
> > B, we would need IPC operation to send and receive buffers properly between
> > those two tasks every time CPU or DMA access to buffers is started or
> > completed.
> > 
> > 
> > With dmabuf-sync:
> > 
> > Task A 
> > 
> >  1. dma_buf_sync_lock <- synpoint (call by user side)
> >  2. CPU accesses buf  
> >  3. dma_buf_sync_unlock <- syncpoint (call by user side)
> >  4. Send the buf to Task B (just one time)
> >  5. go to 1
> > 
> > 
> > Task B
> > ---
> > 1. Wait for the buf from Task A (just one time)
> > 2. qbuf the buf 
> > 1.1 insert the buf to incoming queue
> > 3. stream on
> > 3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
> > 3.2 dma_map_sg if ready, and move the buf to ready queue
> > 3.3 get the buf from ready queue, and dma start.
> > 4. dqbuf
> > 4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
> > 4.2 dma_unmap_sg after dma operation completion
> > 4.3 move the buf to outgoing queue
> > 5. go to 1
> > 
> > On the other hand, in case of using dmabuf-sync, as you can see the above
> > example, we would need IPC operation just one time. That way, I think we
> > could not only reduce performance overhead but also make user application
> > simplified. Of course, this approach can be used for all DMA device drivers
> > such as DRM. I'm not a specialist in v4l2 world so there may be missing
> > point.
> > 
> 
> You already need some kind of IPC between the two tasks, as I suspect
> even in your

[PATCH -next] drm/i915: fix potential NULL pointer dereference in i915_gem_context_get_hang_stats()

2013-06-20 Thread Wei Yongjun

From: Wei Yongjun 

The dereference should be moved below the NULL test.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index ff47145..f32107e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -309,7 +309,7 @@ i915_gem_context_get_hang_stats(struct intel_ring_buffer 
*ring,
u32 id)
 {
struct drm_i915_private *dev_priv = ring->dev->dev_private;
-   struct drm_i915_file_private *file_priv = file->driver_priv;
+   struct drm_i915_file_private *file_priv;
struct i915_hw_context *to;

if (dev_priv->hw_contexts_disabled)
@@ -321,6 +321,7 @@ i915_gem_context_get_hang_stats(struct intel_ring_buffer 
*ring,
if (file == NULL)
return ERR_PTR(-EINVAL);

+   file_priv = file->driver_priv;
if (id == DEFAULT_CONTEXT_ID)
return _priv->hang_stats;

[PATCH v3 1/3] drm/exynos: add new compatible strings for hdmi subsystem

Hi Tomasz, Lucas,

How does this patch look to you ? Please share your views.

regards,
Rahul Sharma.

On Wed, Jun 19, 2013 at 6:21 PM, Rahul Sharma  
wrote:
> This patch adds new combatible strings for hdmi, mixer, ddc
> and hdmiphy. It follows the convention of using compatible string
> which represent the SoC in which the IP was added for the first
> time.
>
> Drivers continue to support the previous compatible strings
> but further addition of these compatible strings in device tree
> is deprecated.
>
> Signed-off-by: Rahul Sharma 
> ---
>  Documentation/devicetree/bindings/video/exynos_hdmi.txt   |7 +--
>  .../devicetree/bindings/video/exynos_hdmiddc.txt  |7 +--
>  .../devicetree/bindings/video/exynos_hdmiphy.txt  |7 +--
>  Documentation/devicetree/bindings/video/exynos_mixer.txt  |8 ++--
>  drivers/gpu/drm/exynos/exynos_ddc.c   |2 ++
>  drivers/gpu/drm/exynos/exynos_hdmi.c  |3 +++
>  drivers/gpu/drm/exynos/exynos_hdmiphy.c   |4 
>  drivers/gpu/drm/exynos/exynos_mixer.c |   13 
> -
>  8 files changed, 38 insertions(+), 13 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/video/exynos_hdmi.txt 
> b/Documentation/devicetree/bindings/video/exynos_hdmi.txt
> index 589edee..c71d0f0 100644
> --- a/Documentation/devicetree/bindings/video/exynos_hdmi.txt
> +++ b/Documentation/devicetree/bindings/video/exynos_hdmi.txt
> @@ -1,7 +1,10 @@
>  Device-Tree bindings for drm hdmi driver
>
>  Required properties:
> -- compatible: value should be "samsung,exynos5-hdmi".
> +- compatible: value should be one among the following:
> +   1) "samsung,exynos5-hdmi" 
> +   2) "samsung,exynos4210-hdmi"
> +   3) "samsung,exynos4212-hdmi"
>  - reg: physical base address of the hdmi and length of memory mapped
> region.
>  - interrupts: interrupt number to the cpu.
> @@ -15,7 +18,7 @@ Required properties:
>  Example:
>
> hdmi {
> -   compatible = "samsung,exynos5-hdmi";
> +   compatible = "samsung,exynos4212-hdmi";
> reg = <0x1453 0x10>;
> interrupts = <0 95 0>;
> hpd-gpio = < 7 0xf 1 3>;
> diff --git a/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt 
> b/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
> index fa166d9..41eee97 100644
> --- a/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
> +++ b/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
> @@ -1,12 +1,15 @@
>  Device-Tree bindings for hdmiddc driver
>
>  Required properties:
> -- compatible: value should be "samsung,exynos5-hdmiddc".
> +- compatible: value should be one of the following
> +   1) "samsung,exynos5-hdmiddc" 
> +   2) "samsung,exynos4210-hdmiddc"
> +
>  - reg: I2C address of the hdmiddc device.
>
>  Example:
>
> hdmiddc {
> -   compatible = "samsung,exynos5-hdmiddc";
> +   compatible = "samsung,exynos4210-hdmiddc";
> reg = <0x50>;
> };
> diff --git a/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt 
> b/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
> index 858f4f9..162f641 100644
> --- a/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
> +++ b/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
> @@ -1,12 +1,15 @@
>  Device-Tree bindings for hdmiphy driver
>
>  Required properties:
> -- compatible: value should be "samsung,exynos5-hdmiphy".
> +- compatible: value should be one of the following:
> +   1) "samsung,exynos5-hdmiphy" 
> +   2) "samsung,exynos4210-hdmiphy".
> +   3) "samsung,exynos4212-hdmiphy".
>  - reg: I2C address of the hdmiphy device.
>
>  Example:
>
> hdmiphy {
> -   compatible = "samsung,exynos5-hdmiphy";
> +   compatible = "samsung,exynos4210-hdmiphy";
> reg = <0x38>;
> };
> diff --git a/Documentation/devicetree/bindings/video/exynos_mixer.txt 
> b/Documentation/devicetree/bindings/video/exynos_mixer.txt
> index 9b2ea03..9131b99 100644
> --- a/Documentation/devicetree/bindings/video/exynos_mixer.txt
> +++ b/Documentation/devicetree/bindings/video/exynos_mixer.txt
> @@ -1,7 +1,11 @@
>  Device-Tree bindings for mixer driver
>
>  Required properties:
> -- compatible: value should be "samsung,exynos5-mixer".
> +- compatible: value should be one of the following:
> +   1) "samsung,exynos5-mixer" 
> +   2) "samsung,exynos4210-mixer"
> +   3) "samsung,exynos5250-mixer"
> +
>  - reg: physical base address of the mixer and length of memory mapped
> region.
>  - interrupts: interrupt number to the cpu.
> @@ -9,7 +13,7 @@ Required properties:
>  Example:
>
> mixer {
> -   compatible = "samsung,exynos5-mixer";
> +   compatible = "samsung,exynos5250-mixer";
> reg = <0x1445 0x1>;
>

[Bug 58021] [BISECTED] nouveau, nv50: External VGA not detected anymore

2013-06-20 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=58021


wippbox at gmx.net changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||CODE_FIX




--- Comment #6 from wippbox at gmx.net  2013-06-20 07:40:23 ---
Solved in https://bugs.freedesktop.org/show_bug.cgi?id=64904

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.

[RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

queue a buffer and thus dragging it
> back into CPU domain while there is still DMA ongoing. Equally the queue
> ioctrl should make sure caches are properly written back to memory. The
> results of reading/writing to the mmap of a V4L2 buffer while it is
> enqueued to the hardware is simply undefined and there is nothing
> suggesting that this is a valid usecase.


Thanks to comments. However, that's not definitely my point, and you
just say the conventional way. My point is to more enhance the conventional
way.

The conventional way is (sorry but I'm not really a good painter) :

CPU -> DMA,
ioctl(qbuf command)  ioctl(streamon)
   |  |
   |  |
qbuf  <- syncpoint   start streaming <- dma access

DMA accesses a queued buffer with start streaming if source and
destination queues are ready.

And DMA -> CPU,
ioctl(dqbuf command)
  |
  |
dqbuf <- syncpoint

Internally, dqbuf waits for until DMA opertion is completed. And if
completed then user process can access a dequeued buffer.


On the other hand, the below shows how we could enhance the conventional
way with my approach (just example):

CPU -> DMA,
ioctl(qbuf command)  ioctl(streamon)
  |   |
  |   |
qbuf  <- dma_buf_sync_get   start streaming <- syncpoint

dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object. And
the syncpoint is performed by calling dma_buf_sync_lock(), and then DMA
accesses the sync buffer.

And DMA -> CPU,
ioctl(dqbuf command)
  |
  |
dqbuf <- nothing to do

Actual syncpoint is when DMA operation is completed (in interrupt handler):
the syncpoint is performed by calling dma_buf_sync_unlock().
Hence,  my approach is to move the syncpoints into just before dma access
as long as possible. Of course, user process need to call dma-buf
interfaces or simliar things for buffer synchronization with DMA side.
However, as of now, there is no good idea: I had implemented the user
interfaces in dma-buf framework but that was just to show you so ugly. The
eventual purpose of my approach is to integrate sync interfaces with dmabuf
sync, and this approach can be used commonly for v4l2, drm drivers, user
processes, and so on: as I already mentioned in document file, this
approach is for DMA devices using system memory as DMA buffer, especially
most ARM SoCs.

Thanks,
Inki Dae


> Regards,
> Lucas
>
> --
> Pengutronix e.K.   | Lucas Stach |
> Industrial Linux Solutions | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
> Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |
>
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130620/7eb4db6c/attachment-0001.html>

[PATCH 0/5] clk/exynos5420: add clocks for hdmi subsystem

2013-06-20 Thread Kukjin Kim

On 06/19/13 13:04, Rahul Sharma wrote:
> + mike
>
Mike, I'm waiting for your reply on this. If you're OK, let me take this 
series on top of exynos5420 stuff in samsung tree.

Of course, if any concerns, please let me know.

Thanks,
- Kukjin

> On Tue, Jun 18, 2013 at 8:03 PM, Rahul Sharma  
> wrote:
>> Add clock changes for hdmi subsystem for exynos5250 SoC. These
>> include addition of new clocks like mout_hdmi and smmu_tv, associating
>> ID to clk_hdmiphy and some essential corrections.
>>
>> This set is based on kukjin's for-next branch at
>> http://git.kernel.org/cgit/linux/kernel/git/kgene/linux-samsung.git.
>>
>> This set dependents on the following patches which add support for Exynos5420
>> http://www.mail-archive.com/linux-samsung-soc at 
>> vger.kernel.org/msg19264.html
>>
>> Rahul Sharma (5):
>>clk/exynos5420: add sclk_hdmiphy to the list of special clocks
>>clk/exynos5420: add gate clock for tv sysmmu
>>clk/exynos5420: fix the order of parents of hdmi mux
>>clk/exynos5420: add hdmi mux to change parents in hdmi driver
>>clk/exynos5420: assign sclk_pixel id to pixel clock divider
>>
>>   .../devicetree/bindings/clock/exynos5420-clock.txt   |7 +++
>>   drivers/clk/samsung/clk-exynos5420.c |   18 
>> +++---
>>   2 files changed, 18 insertions(+), 7 deletions(-)
>>
>> --
>> 1.7.10.4

RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

 -Original Message-
 From: dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org
 [mailto:dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org] On
 Behalf Of Russell King - ARM Linux
 Sent: Thursday, June 20, 2013 3:29 AM
 To: Inki Dae
 Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
 Cho; linux-me...@vger.kernel.org; linux-arm-ker...@lists.infradead.org
 Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
 framework

 On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
  On the other hand, the below shows how we could enhance the conventional
  way with my approach (just example):

  CPU - DMA,
  ioctl(qbuf command)  ioctl(streamon)
|   |
|   |
  qbuf  - dma_buf_sync_get   start streaming - syncpoint

  dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
 And
  the syncpoint is performed by calling dma_buf_sync_lock(), and then DMA
  accesses the sync buffer.

  And DMA - CPU,
  ioctl(dqbuf command)
|
|
  dqbuf - nothing to do

  Actual syncpoint is when DMA operation is completed (in interrupt
 handler):
  the syncpoint is performed by calling dma_buf_sync_unlock().
  Hence,  my approach is to move the syncpoints into just before dma
 access
  as long as possible.

 What you've just described does *not* work on architectures such as
 ARMv7 which do speculative cache fetches from memory at any time that
 that memory is mapped with a cacheable status, and will lead to data
 corruption.

I didn't explain that enough. Sorry about that. 'nothing to do' means that a
dmabuf sync interface isn't called but existing functions are called. So
this may be explained again:
ioctl(dqbuf command)
|
|
dqbuf - 1. dma_unmap_sg
2. dma_buf_sync_unlock (syncpoint)

The syncpoint I mentioned means lock mechanism; not doing cache operation.

In addition, please see the below more detail examples.

The conventional way (without dmabuf-sync) is:
Task A 

 1. CPU accesses buf  
 2. Send the buf to Task B  
 3. Wait for the buf from Task B
 4. go to 1

Task B
---
1. Wait for the buf from Task A
2. qbuf the buf 
2.1 insert the buf to incoming queue
3. stream on
3.1 dma_map_sg if ready, and move the buf to ready queue
3.2 get the buf from ready queue, and dma start.
4. dqbuf
4.1 dma_unmap_sg after dma operation completion
4.2 move the buf to outgoing queue
5. back the buf to Task A
6. go to 1

In case that two tasks share buffers, and data flow goes from Task A to Task
B, we would need IPC operation to send and receive buffers properly between
those two tasks every time CPU or DMA access to buffers is started or
completed.

With dmabuf-sync:

Task A 

 1. dma_buf_sync_lock - synpoint (call by user side)
 2. CPU accesses buf  
 3. dma_buf_sync_unlock - syncpoint (call by user side)
 4. Send the buf to Task B (just one time)
 5. go to 1

Task B
---
1. Wait for the buf from Task A (just one time)
2. qbuf the buf 
1.1 insert the buf to incoming queue
3. stream on
3.1 dma_buf_sync_lock - syncpoint (call by kernel side)
3.2 dma_map_sg if ready, and move the buf to ready queue
3.3 get the buf from ready queue, and dma start.
4. dqbuf
4.1 dma_buf_sync_unlock - syncpoint (call by kernel side)
4.2 dma_unmap_sg after dma operation completion
4.3 move the buf to outgoing queue
5. go to 1

On the other hand, in case of using dmabuf-sync, as you can see the above
example, we would need IPC operation just one time. That way, I think we
could not only reduce performance overhead but also make user application
simplified. Of course, this approach can be used for all DMA device drivers
such as DRM. I'm not a specialist in v4l2 world so there may be missing
point.

Thanks,
Inki Dae

 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v3 1/3] drm/exynos: add new compatible strings for hdmi subsystem

2013-06-20 Thread Tomasz Figa

Hi Rahul,

On Thursday 20 of June 2013 07:41:53 Rahul Sharma wrote:
 Hi Tomasz, Lucas,
 
 How does this patch look to you ? Please share your views.

Looks fine now. Have my

Reviewed-by: Tomasz Figa t.f...@samsung.com

for the whole series.

Best regards,
Tomasz

 regards,
 Rahul Sharma.
 
 On Wed, Jun 19, 2013 at 6:21 PM, Rahul Sharma rahul.sha...@samsung.com 
wrote:
  This patch adds new combatible strings for hdmi, mixer, ddc
  and hdmiphy. It follows the convention of using compatible string
  which represent the SoC in which the IP was added for the first
  time.
  
  Drivers continue to support the previous compatible strings
  but further addition of these compatible strings in device tree
  is deprecated.
  
  Signed-off-by: Rahul Sharma rahul.sha...@samsung.com
  ---
  
   Documentation/devicetree/bindings/video/exynos_hdmi.txt   |7
   +-- .../devicetree/bindings/video/exynos_hdmiddc.txt  | 
 7 +-- .../devicetree/bindings/video/exynos_hdmiphy.txt
|7 +--
   Documentation/devicetree/bindings/video/exynos_mixer.txt  |8
   ++-- drivers/gpu/drm/exynos/exynos_ddc.c   |
  2 ++ drivers/gpu/drm/exynos/exynos_hdmi.c  | 
 3 +++ drivers/gpu/drm/exynos/exynos_hdmiphy.c   | 
 4  drivers/gpu/drm/exynos/exynos_mixer.c |
 13 - 8 files changed, 38 insertions(+), 13
   deletions(-)
  
  diff --git a/Documentation/devicetree/bindings/video/exynos_hdmi.txt
  b/Documentation/devicetree/bindings/video/exynos_hdmi.txt index
  589edee..c71d0f0 100644
  --- a/Documentation/devicetree/bindings/video/exynos_hdmi.txt
  +++ b/Documentation/devicetree/bindings/video/exynos_hdmi.txt
  @@ -1,7 +1,10 @@
  
   Device-Tree bindings for drm hdmi driver
  
   Required properties:
  -- compatible: value should be samsung,exynos5-hdmi.
  +- compatible: value should be one among the following:
  +   1) samsung,exynos5-hdmi DEPRECATED
  +   2) samsung,exynos4210-hdmi
  +   3) samsung,exynos4212-hdmi
  
   - reg: physical base address of the hdmi and length of memory mapped
   
  region.
   
   - interrupts: interrupt number to the cpu.
  
  @@ -15,7 +18,7 @@ Required properties:
   Example:
  hdmi {
  
  -   compatible = samsung,exynos5-hdmi;
  +   compatible = samsung,exynos4212-hdmi;
  
  reg = 0x1453 0x10;
  interrupts = 0 95 0;
  hpd-gpio = gpx3 7 0xf 1 3;
  
  diff --git
  a/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
  b/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt index
  fa166d9..41eee97 100644
  --- a/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
  +++ b/Documentation/devicetree/bindings/video/exynos_hdmiddc.txt
  @@ -1,12 +1,15 @@
  
   Device-Tree bindings for hdmiddc driver
  
   Required properties:
  -- compatible: value should be samsung,exynos5-hdmiddc.
  +- compatible: value should be one of the following
  +   1) samsung,exynos5-hdmiddc DEPRECATED
  +   2) samsung,exynos4210-hdmiddc
  +
  
   - reg: I2C address of the hdmiddc device.
   
   Example:
  hdmiddc {
  
  -   compatible = samsung,exynos5-hdmiddc;
  +   compatible = samsung,exynos4210-hdmiddc;
  
  reg = 0x50;
  
  };
  
  diff --git
  a/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
  b/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt index
  858f4f9..162f641 100644
  --- a/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
  +++ b/Documentation/devicetree/bindings/video/exynos_hdmiphy.txt
  @@ -1,12 +1,15 @@
  
   Device-Tree bindings for hdmiphy driver
  
   Required properties:
  -- compatible: value should be samsung,exynos5-hdmiphy.
  +- compatible: value should be one of the following:
  +   1) samsung,exynos5-hdmiphy DEPRECATED
  +   2) samsung,exynos4210-hdmiphy.
  +   3) samsung,exynos4212-hdmiphy.
  
   - reg: I2C address of the hdmiphy device.
   
   Example:
  hdmiphy {
  
  -   compatible = samsung,exynos5-hdmiphy;
  +   compatible = samsung,exynos4210-hdmiphy;
  
  reg = 0x38;
  
  };
  
  diff --git a/Documentation/devicetree/bindings/video/exynos_mixer.txt
  b/Documentation/devicetree/bindings/video/exynos_mixer.txt index
  9b2ea03..9131b99 100644
  --- a/Documentation/devicetree/bindings/video/exynos_mixer.txt
  +++ b/Documentation/devicetree/bindings/video/exynos_mixer.txt
  @@ -1,7 +1,11 @@
  
   Device-Tree bindings for mixer driver
  
   Required properties:
  -- compatible: value should be samsung,exynos5-mixer.
  +- compatible: value should be one of the following:
  +   1) samsung,exynos5-mixer DEPRECATED
  +   2) samsung,exynos4210-mixer
  +   3) samsung,exynos5250-mixer
  +
  
   - reg: physical base address of the mixer and length of memory

[Bug 58021] [BISECTED] nouveau, nv50: External VGA not detected anymore

https://bugzilla.kernel.org/show_bug.cgi?id=58021


wipp...@gmx.net changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||CODE_FIX




--- Comment #6 from wipp...@gmx.net  2013-06-20 07:40:23 ---
Solved in https://bugs.freedesktop.org/show_bug.cgi?id=64904

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

Am Donnerstag, den 20.06.2013, 15:43 +0900 schrieb Inki Dae:

  -Original Message-
  From: dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org
  [mailto:dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org] On
  Behalf Of Russell King - ARM Linux
  Sent: Thursday, June 20, 2013 3:29 AM
  To: Inki Dae
  Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
  Cho; linux-me...@vger.kernel.org; linux-arm-ker...@lists.infradead.org
  Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
  framework

  On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
   On the other hand, the below shows how we could enhance the conventional
   way with my approach (just example):

   CPU - DMA,
   ioctl(qbuf command)  ioctl(streamon)
 |   |
 |   |
   qbuf  - dma_buf_sync_get   start streaming - syncpoint

   dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
  And
   the syncpoint is performed by calling dma_buf_sync_lock(), and then DMA
   accesses the sync buffer.

   And DMA - CPU,
   ioctl(dqbuf command)
 |
 |
   dqbuf - nothing to do

   Actual syncpoint is when DMA operation is completed (in interrupt
  handler):
   the syncpoint is performed by calling dma_buf_sync_unlock().
   Hence,  my approach is to move the syncpoints into just before dma
  access
   as long as possible.

  What you've just described does *not* work on architectures such as
  ARMv7 which do speculative cache fetches from memory at any time that
  that memory is mapped with a cacheable status, and will lead to data
  corruption.

 I didn't explain that enough. Sorry about that. 'nothing to do' means that a
 dmabuf sync interface isn't called but existing functions are called. So
 this may be explained again:
 ioctl(dqbuf command)
 |
 |
 dqbuf - 1. dma_unmap_sg
 2. dma_buf_sync_unlock (syncpoint)

 The syncpoint I mentioned means lock mechanism; not doing cache operation.

 In addition, please see the below more detail examples.

 The conventional way (without dmabuf-sync) is:
 Task A 

  1. CPU accesses buf  
  2. Send the buf to Task B  
  3. Wait for the buf from Task B
  4. go to 1

 Task B
 ---
 1. Wait for the buf from Task A
 2. qbuf the buf 
 2.1 insert the buf to incoming queue
 3. stream on
 3.1 dma_map_sg if ready, and move the buf to ready queue
 3.2 get the buf from ready queue, and dma start.
 4. dqbuf
 4.1 dma_unmap_sg after dma operation completion
 4.2 move the buf to outgoing queue
 5. back the buf to Task A
 6. go to 1

 In case that two tasks share buffers, and data flow goes from Task A to Task
 B, we would need IPC operation to send and receive buffers properly between
 those two tasks every time CPU or DMA access to buffers is started or
 completed.

 With dmabuf-sync:

 Task A 

  1. dma_buf_sync_lock - synpoint (call by user side)
  2. CPU accesses buf  
  3. dma_buf_sync_unlock - syncpoint (call by user side)
  4. Send the buf to Task B (just one time)
  5. go to 1

 Task B
 ---
 1. Wait for the buf from Task A (just one time)
 2. qbuf the buf 
 1.1 insert the buf to incoming queue
 3. stream on
 3.1 dma_buf_sync_lock - syncpoint (call by kernel side)
 3.2 dma_map_sg if ready, and move the buf to ready queue
 3.3 get the buf from ready queue, and dma start.
 4. dqbuf
 4.1 dma_buf_sync_unlock - syncpoint (call by kernel side)
 4.2 dma_unmap_sg after dma operation completion
 4.3 move the buf to outgoing queue
 5. go to 1

 On the other hand, in case of using dmabuf-sync, as you can see the above
 example, we would need IPC operation just one time. That way, I think we
 could not only reduce performance overhead but also make user application
 simplified. Of course, this approach can be used for all DMA device drivers
 such as DRM. I'm not a specialist in v4l2 world so there may be missing
 point.

You already need some kind of IPC between the two tasks, as I suspect
even in your example it wouldn't make much sense to queue the buffer
over and over again in task B without task A writing anything to it. So
task A has to signal task B there is new data in the buffer to be
processed.

There is no need to share the buffer over and over again just to get the
two processes to work together on the same thing. Just share the fd
between both and then do out-of-band completion signaling, as you need
this anyway. Without this you'll end up with unpredictable behavior.
Just because sync allows you to access the

RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

 -Original Message-
 From: Lucas Stach [mailto:l.st...@pengutronix.de]
 Sent: Thursday, June 20, 2013 4:47 PM
 To: Inki Dae
 Cc: 'Russell King - ARM Linux'; 'Inki Dae'; 'linux-fbdev'; 'YoungJun Cho';
 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
 ker...@lists.infradead.org; linux-me...@vger.kernel.org
 Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
 framework

 Am Donnerstag, den 20.06.2013, 15:43 +0900 schrieb Inki Dae:

   -Original Message-
   From: dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org
   [mailto:dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org]
 On
   Behalf Of Russell King - ARM Linux
   Sent: Thursday, June 20, 2013 3:29 AM
   To: Inki Dae
   Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham;
 YoungJun
   Cho; linux-me...@vger.kernel.org; linux-arm-ker...@lists.infradead.org
   Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer
 synchronization
   framework

   On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
On the other hand, the below shows how we could enhance the
 conventional
way with my approach (just example):

CPU - DMA,
ioctl(qbuf command)  ioctl(streamon)
  |   |
  |   |
qbuf  - dma_buf_sync_get   start streaming - syncpoint

dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
   And
the syncpoint is performed by calling dma_buf_sync_lock(), and then
 DMA
accesses the sync buffer.

And DMA - CPU,
ioctl(dqbuf command)
  |
  |
dqbuf - nothing to do

Actual syncpoint is when DMA operation is completed (in interrupt
   handler):
the syncpoint is performed by calling dma_buf_sync_unlock().
Hence,  my approach is to move the syncpoints into just before dma
   access
as long as possible.

   What you've just described does *not* work on architectures such as
   ARMv7 which do speculative cache fetches from memory at any time that
   that memory is mapped with a cacheable status, and will lead to data
   corruption.

  I didn't explain that enough. Sorry about that. 'nothing to do' means
 that a
  dmabuf sync interface isn't called but existing functions are called. So
  this may be explained again:
  ioctl(dqbuf command)
  |
  |
  dqbuf - 1. dma_unmap_sg
  2. dma_buf_sync_unlock (syncpoint)

  The syncpoint I mentioned means lock mechanism; not doing cache
 operation.

  In addition, please see the below more detail examples.

  The conventional way (without dmabuf-sync) is:
  Task A

   1. CPU accesses buf
   2. Send the buf to Task B
   3. Wait for the buf from Task B
   4. go to 1

  Task B
  ---
  1. Wait for the buf from Task A
  2. qbuf the buf
  2.1 insert the buf to incoming queue
  3. stream on
  3.1 dma_map_sg if ready, and move the buf to ready queue
  3.2 get the buf from ready queue, and dma start.
  4. dqbuf
  4.1 dma_unmap_sg after dma operation completion
  4.2 move the buf to outgoing queue
  5. back the buf to Task A
  6. go to 1

  In case that two tasks share buffers, and data flow goes from Task A to
 Task
  B, we would need IPC operation to send and receive buffers properly
 between
  those two tasks every time CPU or DMA access to buffers is started or
  completed.

  With dmabuf-sync:

  Task A

   1. dma_buf_sync_lock - synpoint (call by user side)
   2. CPU accesses buf
   3. dma_buf_sync_unlock - syncpoint (call by user side)
   4. Send the buf to Task B (just one time)
   5. go to 1

  Task B
  ---
  1. Wait for the buf from Task A (just one time)
  2. qbuf the buf
  1.1 insert the buf to incoming queue
  3. stream on
  3.1 dma_buf_sync_lock - syncpoint (call by kernel side)
  3.2 dma_map_sg if ready, and move the buf to ready queue
  3.3 get the buf from ready queue, and dma start.
  4. dqbuf
  4.1 dma_buf_sync_unlock - syncpoint (call by kernel side)
  4.2 dma_unmap_sg after dma operation completion
  4.3 move the buf to outgoing queue
  5. go to 1

  On the other hand, in case of using dmabuf-sync, as you can see the
 above
  example, we would need IPC operation just one time. That way, I think we
  could not only reduce performance overhead but also make user
 application
  simplified. Of course, this approach can be used for all DMA device
 drivers
  such as DRM. I'm not a specialist in v4l2 world so there may be missing
  point.

 You already need some kind of IPC between the two tasks, as I suspect
 even in your example it wouldn't make much sense to queue the buffer
 over and over again in task B without task A writing

Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

Am Donnerstag, den 20.06.2013, 09:17 +0100 schrieb Russell King - ARM
Linux:
 On Thu, Jun 20, 2013 at 09:47:07AM +0200, Lucas Stach wrote:
  Am Donnerstag, den 20.06.2013, 15:43 +0900 schrieb Inki Dae:
   
-Original Message-
From: dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org
[mailto:dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org] On
Behalf Of Russell King - ARM Linux
Sent: Thursday, June 20, 2013 3:29 AM
To: Inki Dae
Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
Cho; linux-me...@vger.kernel.org; linux-arm-ker...@lists.infradead.org
Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer 
synchronization
framework

On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
 On the other hand, the below shows how we could enhance the 
 conventional
 way with my approach (just example):

 CPU - DMA,
 ioctl(qbuf command)  ioctl(streamon)
   |   |
   |   |
 qbuf  - dma_buf_sync_get   start streaming - syncpoint

 dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
And
 the syncpoint is performed by calling dma_buf_sync_lock(), and then 
 DMA
 accesses the sync buffer.

 And DMA - CPU,
 ioctl(dqbuf command)
   |
   |
 dqbuf - nothing to do

 Actual syncpoint is when DMA operation is completed (in interrupt
handler):
 the syncpoint is performed by calling dma_buf_sync_unlock().
 Hence,  my approach is to move the syncpoints into just before dma
access
 as long as possible.

What you've just described does *not* work on architectures such as
ARMv7 which do speculative cache fetches from memory at any time that
that memory is mapped with a cacheable status, and will lead to data
corruption.
   
   I didn't explain that enough. Sorry about that. 'nothing to do' means 
   that a
   dmabuf sync interface isn't called but existing functions are called. So
   this may be explained again:
   ioctl(dqbuf command)
   |
   |
   dqbuf - 1. dma_unmap_sg
   2. dma_buf_sync_unlock (syncpoint)
   
   The syncpoint I mentioned means lock mechanism; not doing cache operation.
   
   In addition, please see the below more detail examples.
   
   The conventional way (without dmabuf-sync) is:
   Task A 
   
1. CPU accesses buf  
2. Send the buf to Task B  
3. Wait for the buf from Task B
4. go to 1
   
   Task B
   ---
   1. Wait for the buf from Task A
   2. qbuf the buf 
   2.1 insert the buf to incoming queue
   3. stream on
   3.1 dma_map_sg if ready, and move the buf to ready queue
   3.2 get the buf from ready queue, and dma start.
   4. dqbuf
   4.1 dma_unmap_sg after dma operation completion
   4.2 move the buf to outgoing queue
   5. back the buf to Task A
   6. go to 1
   
   In case that two tasks share buffers, and data flow goes from Task A to 
   Task
   B, we would need IPC operation to send and receive buffers properly 
   between
   those two tasks every time CPU or DMA access to buffers is started or
   completed.
   
   
   With dmabuf-sync:
   
   Task A 
   
1. dma_buf_sync_lock - synpoint (call by user side)
2. CPU accesses buf  
3. dma_buf_sync_unlock - syncpoint (call by user side)
4. Send the buf to Task B (just one time)
5. go to 1
   
   
   Task B
   ---
   1. Wait for the buf from Task A (just one time)
   2. qbuf the buf 
   1.1 insert the buf to incoming queue
   3. stream on
   3.1 dma_buf_sync_lock - syncpoint (call by kernel side)
   3.2 dma_map_sg if ready, and move the buf to ready queue
   3.3 get the buf from ready queue, and dma start.
   4. dqbuf
   4.1 dma_buf_sync_unlock - syncpoint (call by kernel side)
   4.2 dma_unmap_sg after dma operation completion
   4.3 move the buf to outgoing queue
   5. go to 1
   
   On the other hand, in case of using dmabuf-sync, as you can see the above
   example, we would need IPC operation just one time. That way, I think we
   could not only reduce performance overhead but also make user application
   simplified. Of course, this approach can be used for all DMA device 
   drivers
   such as DRM. I'm not a specialist in v4l2 world so there may be missing
   point.
   
  
  You already need some kind of IPC between the two tasks, as I suspect
  even in your example it wouldn't make much sense to queue the buffer
  over and over again in task B without task A writing

[Bug 65958] New: GPU Lockup on Trinity 7500G

https://bugs.freedesktop.org/show_bug.cgi?id=65958

  Priority: medium
Bug ID: 65958
  Assignee: dri-devel@lists.freedesktop.org
   Summary: GPU Lockup on Trinity 7500G
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: m...@fireburn.co.uk
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: XOrg CVS
 Component: DRM/Radeon
   Product: DRI

Created attachment 81098
  -- https://bugs.freedesktop.org/attachment.cgi?id=81098action=edit
Dmesg

I'm running the latest git from Linus's tree with xf86-video-ati libdrm and
mesa from git

The lockup seems to happen when video is playing. It's Xv rather than vdpau
that seems to cause this

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65958] GPU Lockup on Trinity 7500G

https://bugs.freedesktop.org/show_bug.cgi?id=65958

--- Comment #1 from Mike Lothian m...@fireburn.co.uk ---
Created attachment 81099
  -- https://bugs.freedesktop.org/attachment.cgi?id=81099action=edit
Lspci

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm: Improve manual IRQ installation documentation

On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
 Signed-off-by: Laurent Pinchart laurent.pinchart+rene...@ideasonboard.com
 ---
  Documentation/DocBook/drm.tmpl | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)
 
 diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl
 index f9df3b8..738b727 100644
 --- a/Documentation/DocBook/drm.tmpl
 +++ b/Documentation/DocBook/drm.tmpl
 @@ -186,11 +186,12 @@
varlistentry
  termDRIVER_HAVE_IRQ/termtermDRIVER_IRQ_SHARED/term
  listitempara
 -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ 
 handler. The
 -  DRM core will automatically register an interrupt handler when 
 the
 -  flag is set. DRIVER_IRQ_SHARED indicates whether the device 
 amp;
 -  handler support shared IRQs (note that this is required of PCI
 -  drivers).
 +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ handler
 +  managed by the DRM Core. The core will support simple IRQ 
 handler
 +  installation when the flag is set. The installation process is
 +  described in xref linkend=drm-irq-registration/./para
 +  paraDRIVER_IRQ_SHARED indicates whether the device amp; 
 handler
 +  support shared IRQs (note that this is required of PCI  
 drivers).
  /para/listitem
/varlistentry
varlistentry
 @@ -344,7 +345,8 @@ char *date;/synopsis
The DRM core tries to facilitate IRQ handler registration and
unregistration by providing functiondrm_irq_install/function 
 and
functiondrm_irq_uninstall/function functions. Those functions 
 only
 -  support a single interrupt per device.
 +  support a single interrupt per device, devices that use more than 
 one
 +  IRQs need to be handled manually.

Perhaps this should mention that if you handle IRQ installation manually
you also need to manually set drm-irq_enabled = 1, as otherwise things
like DRM_IOCTL_WAIT_VBLANK won't work properly.

Thierry


pgp69cIkMuIoB.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

Am Donnerstag, den 20.06.2013, 17:24 +0900 schrieb Inki Dae:
[...]
   In addition, please see the below more detail examples.
  
   The conventional way (without dmabuf-sync) is:
   Task A
   
1. CPU accesses buf
2. Send the buf to Task B
3. Wait for the buf from Task B
4. go to 1
  
   Task B
   ---
   1. Wait for the buf from Task A
   2. qbuf the buf
   2.1 insert the buf to incoming queue
   3. stream on
   3.1 dma_map_sg if ready, and move the buf to ready queue
   3.2 get the buf from ready queue, and dma start.
   4. dqbuf
   4.1 dma_unmap_sg after dma operation completion
   4.2 move the buf to outgoing queue
   5. back the buf to Task A
   6. go to 1
  
   In case that two tasks share buffers, and data flow goes from Task A to
  Task
   B, we would need IPC operation to send and receive buffers properly
  between
   those two tasks every time CPU or DMA access to buffers is started or
   completed.
  
  
   With dmabuf-sync:
  
   Task A
   
1. dma_buf_sync_lock - synpoint (call by user side)
2. CPU accesses buf
3. dma_buf_sync_unlock - syncpoint (call by user side)
4. Send the buf to Task B (just one time)
5. go to 1
  
  
   Task B
   ---
   1. Wait for the buf from Task A (just one time)
   2. qbuf the buf
   1.1 insert the buf to incoming queue
   3. stream on
   3.1 dma_buf_sync_lock - syncpoint (call by kernel side)
   3.2 dma_map_sg if ready, and move the buf to ready queue
   3.3 get the buf from ready queue, and dma start.
   4. dqbuf
   4.1 dma_buf_sync_unlock - syncpoint (call by kernel side)
   4.2 dma_unmap_sg after dma operation completion
   4.3 move the buf to outgoing queue
   5. go to 1
  
   On the other hand, in case of using dmabuf-sync, as you can see the
  above
   example, we would need IPC operation just one time. That way, I think we
   could not only reduce performance overhead but also make user
  application
   simplified. Of course, this approach can be used for all DMA device
  drivers
   such as DRM. I'm not a specialist in v4l2 world so there may be missing
   point.
  
  
  You already need some kind of IPC between the two tasks, as I suspect
  even in your example it wouldn't make much sense to queue the buffer
  over and over again in task B without task A writing anything to it. So
  task A has to signal task B there is new data in the buffer to be
  processed.
  
  There is no need to share the buffer over and over again just to get the
  two processes to work together on the same thing. Just share the fd
  between both and then do out-of-band completion signaling, as you need
  this anyway. Without this you'll end up with unpredictable behavior.
  Just because sync allows you to access the buffer doesn't mean it's
  valid for your use-case. Without completion signaling you could easily
  end up overwriting your data from task A multiple times before task B
  even tries to lock the buffer for processing.
  
  So the valid flow is (and this already works with the current APIs):
  Task ATask B
  ----
  CPU access buffer
   --completion signal-
qbuf (dragging buffer into
device domain, flush caches,
reserve buffer etc.)
  |
wait for device operation to
complete
  |
dqbuf (dragging buffer back
into CPU domain, invalidate
caches, unreserve)
  -completion signal
  CPU access buffer
  
 
 Correct. In case that data flow goes from A to B, it needs some kind
 of IPC between the two tasks every time as you said. Then, without
 dmabuf-sync, how do think about the case that two tasks share the same
 buffer but these tasks access the buffer(buf1) as write, and data of
 the buffer(buf1) isn't needed to be shared?
 
Sorry, I don't see the point you are trying to solve here. If you share
a buffer and want its content to be clearly defined at every point in
time you have to synchronize the tasks working with the buffer, not just
the buffer accesses itself.

Easiest way to do so is doing sync through userspace with out-of-band
IPC, like in the example above. A more advanced way to achieve this
would be using cross-device fences to avoid going through userspace for
every syncpoint.

 
 With dmabuf-sync is:
 
  Task A
  
  1. dma_buf_sync_lock - synpoint (call by user side)
  2. CPU writes

Re: [PATCH] drm: Improve manual IRQ installation documentation

Hi Thierry,

On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
 On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
  Signed-off-by: Laurent Pinchart
  laurent.pinchart+rene...@ideasonboard.com
  ---
  
   Documentation/DocBook/drm.tmpl | 14 --
   1 file changed, 8 insertions(+), 6 deletions(-)
  
  diff --git a/Documentation/DocBook/drm.tmpl
  b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
  --- a/Documentation/DocBook/drm.tmpl
  +++ b/Documentation/DocBook/drm.tmpl
  @@ -186,11 +186,12 @@
  
 varlistentry
 
   termDRIVER_HAVE_IRQ/termtermDRIVER_IRQ_SHARED/term
   listitempara
  
  -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
  handler. The -  DRM core will automatically register an
  interrupt handler when the -  flag is set. DRIVER_IRQ_SHARED
  indicates whether the device amp; -  handler support shared
  IRQs (note that this is required of PCI -  drivers).
  +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
  handler +  managed by the DRM Core. The core will support
  simple IRQ handler +  installation when the flag is set. The
  installation process is +  described in xref
  linkend=drm-irq-registration/./para + 
  paraDRIVER_IRQ_SHARED indicates whether the device amp; handler + 
  support shared IRQs (note that this is required of PCI  drivers). 
   /para/listitem
 
 /varlistentry
 varlistentry
  
  @@ -344,7 +345,8 @@ char *date;/synopsis
  
 The DRM core tries to facilitate IRQ handler registration and
 unregistration by providing
 functiondrm_irq_install/function and
 functiondrm_irq_uninstall/function functions. Those
 functions only
  
  -  support a single interrupt per device.
  +  support a single interrupt per device, devices that use more
  than one +  IRQs need to be handled manually.
 
 Perhaps this should mention that if you handle IRQ installation manually
 you also need to manually set drm-irq_enabled = 1, as otherwise things
 like DRM_IOCTL_WAIT_VBLANK won't work properly.

That's only needed if DRIVER_HAVE_IRQ is set, otherwise the drm_wait_vblank() 
function skips the irq_enabled check.

-- 
Regards,

Laurent Pinchart


signature.asc
Description: This is a digitally signed message part.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[GIT PULL FOR v3.11] shmob-drm patches

Hi Dave,

Could you please pull the following four shmob-drm patches for v3.11 ?

The following changes since commit dc28aa072f502433b6adc5c9ae8f56955c07580a:

  gpu:drm:tilcdc: get preferred_bpp value from DT (2013-06-20 14:08:01 +1000)

are available in the git repository at:

  git://linuxtv.org/pinchartl/fbdev.git drm/shmob

for you to fetch changes up to 227c1fb28c7d01ccbd9203e42734c06b470d1744:

  drm/shmobile: Enable compilation on all ARM platforms (2013-06-20 10:07:14 
+0200)


Laurent Pinchart (4):
  drm/shmobile: Minor typo fix in debug message
  drm/shmobile: Use devm_* managed functions
  drm/shmobile: Add DRM PRIME support
  drm/shmobile: Enable compilation on all ARM platforms

 drivers/gpu/drm/shmobile/Kconfig   |  2 +-
 drivers/gpu/drm/shmobile/shmob_drm_drv.c   | 35 +
 drivers/gpu/drm/shmobile/shmob_drm_kms.c   |  2 +-
 drivers/gpu/drm/shmobile/shmob_drm_plane.c |  7 +--
 4 files changed, 18 insertions(+), 28 deletions(-)

-- 
Regards,

Laurent Pinchart

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[PULL] drm-tegra-fixes

Hi Dave,

The following changes since commit c7788792a5e7b0d5d7f96d0766b4cb6112d47d75:

  Linux 3.10-rc2 (2013-05-20 14:37:38 -0700)

are available in the git repository at:

  git://anongit.freedesktop.org/tegra/linux.git drm/fixes

for you to fetch changes up to 2b15aa0fd466a4b2defdfd84c0b5168804b3eb33:

  gpu: host1x: Rework CPU syncpoint increment (2013-06-20 12:21:38 +0200)

I know this comes awfully late and is much bigger than I'd like. It's
all my fault, though, since I've been travelling a lot lately and just
didn't find the time to prepare this earlier. I did want to sort out
patches that didn't warrant to go in before the final 3.10 release, but
all met the criteria as far as I could tell.

The larger chunks are fixes for the firewall that checks job submissions
from userspace, which was unfortunately not working properly. All other
patches are mostly one-liners to fix things that were introduced in
3.10.

Thierry


Arto Merilainen (5):
  gpu: host1x: Check reloc table before usage
  gpu: host1x: Copy gathers before verification
  gpu: host1x: Fix memory access in syncpt request
  gpu: host1x: Fix client_managed type
  gpu: host1x: Rework CPU syncpoint increment

Emil Goode (1):
  drm/tegra: Include header drm/drm.h

Laurent Pinchart (1):
  drm/tegra: Remove DRIVER_BUS_PLATFORM from driver_features

Terje Bergstrom (2):
  gpu: host1x: Check INCR opcode correctly
  gpu: host1x: Don't reset firewall between gathers

Thierry Reding (5):
  drm/tegra: Don't disable unused planes
  drm/tegra: Explicitly set irq_enabled
  drm/tegra: Honor pixel-format changes
  MAINTAINERS: Update Tegra DRM entry
  drm/tegra: Fix return value

Wei Yongjun (2):
  drm/tegra: fix missing unlock on error
  drm/tegra: fix error return code in gr2d_submit()

 MAINTAINERS   |   7 +-
 drivers/gpu/host1x/dev.h  |   8 +--
 drivers/gpu/host1x/drm/dc.c   |   5 ++
 drivers/gpu/host1x/drm/drm.c  |  14 +++-
 drivers/gpu/host1x/drm/gr2d.c |  12 ++--
 drivers/gpu/host1x/hw/cdma_hw.c   |   2 +-
 drivers/gpu/host1x/hw/syncpt_hw.c |  12 ++--
 drivers/gpu/host1x/job.c  | 135 +-
 drivers/gpu/host1x/syncpt.c   |  26 +++-
 drivers/gpu/host1x/syncpt.h   |  13 ++--
 include/uapi/drm/tegra_drm.h  |   2 +
 11 files changed, 113 insertions(+), 123 deletions(-)


pgpxRlgojC2qk.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm: Improve manual IRQ installation documentation

On Thu, Jun 20, 2013 at 12:17:25PM +0200, Laurent Pinchart wrote:
 Hi Thierry,
 
 On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
  On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
   Signed-off-by: Laurent Pinchart
   laurent.pinchart+rene...@ideasonboard.com
   ---
   
Documentation/DocBook/drm.tmpl | 14 --
1 file changed, 8 insertions(+), 6 deletions(-)
   
   diff --git a/Documentation/DocBook/drm.tmpl
   b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
   --- a/Documentation/DocBook/drm.tmpl
   +++ b/Documentation/DocBook/drm.tmpl
   @@ -186,11 +186,12 @@
   
  varlistentry
  
termDRIVER_HAVE_IRQ/termtermDRIVER_IRQ_SHARED/term
listitempara
   
   -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
   handler. The -  DRM core will automatically register an
   interrupt handler when the -  flag is set. DRIVER_IRQ_SHARED
   indicates whether the device amp; -  handler support shared
   IRQs (note that this is required of PCI -  drivers).
   +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
   handler +  managed by the DRM Core. The core will support
   simple IRQ handler +  installation when the flag is set. The
   installation process is +  described in xref
   linkend=drm-irq-registration/./para + 
   paraDRIVER_IRQ_SHARED indicates whether the device amp; handler + 
   support shared IRQs (note that this is required of PCI  
   drivers). 
/para/listitem
  
  /varlistentry
  varlistentry
   
   @@ -344,7 +345,8 @@ char *date;/synopsis
   
  The DRM core tries to facilitate IRQ handler registration and
  unregistration by providing
  functiondrm_irq_install/function and
  functiondrm_irq_uninstall/function functions. Those
  functions only
   
   -  support a single interrupt per device.
   +  support a single interrupt per device, devices that use more
   than one +  IRQs need to be handled manually.
  
  Perhaps this should mention that if you handle IRQ installation manually
  you also need to manually set drm-irq_enabled = 1, as otherwise things
  like DRM_IOCTL_WAIT_VBLANK won't work properly.
 
 That's only needed if DRIVER_HAVE_IRQ is set, otherwise the drm_wait_vblank() 
 function skips the irq_enabled check.

Not quite. There's another check for dev-irq_enabled in the
DRM_WAIT_ON() so it'll never actually block. Arguably this could be
solved by making the DRM_WAIT_ON() condition drop the !dev-irq_enabled
in case DRIVER_HAVE_IRQ isn't set.

I'll see if I can find the time to come up with a patch.

Thierry


pgpxHkjsTlMjM.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm: Improve manual IRQ installation documentation

Hi Thierry,

On Thursday 20 June 2013 12:40:26 Thierry Reding wrote:
 On Thu, Jun 20, 2013 at 12:17:25PM +0200, Laurent Pinchart wrote:
  On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
   On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
Signed-off-by: Laurent Pinchart
laurent.pinchart+rene...@ideasonboard.com
---

 Documentation/DocBook/drm.tmpl | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/DocBook/drm.tmpl
b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
--- a/Documentation/DocBook/drm.tmpl
+++ b/Documentation/DocBook/drm.tmpl
@@ -186,11 +186,12 @@

   varlistentry
   
 termDRIVER_HAVE_IRQ/termtermDRIVER_IRQ_SHARED/term
 
 listitempara

-  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
handler. The -  DRM core will automatically register an
interrupt handler when the -  flag is set.
DRIVER_IRQ_SHARED
indicates whether the device amp; -  handler support
shared
IRQs (note that this is required of PCI -  drivers).
+  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
handler +  managed by the DRM Core. The core will support
simple IRQ handler +  installation when the flag is set.
The
installation process is +  described in xref
linkend=drm-irq-registration/./para +
paraDRIVER_IRQ_SHARED indicates whether the device amp; handler +

support shared IRQs (note that this is required of PCI 
drivers).

 /para/listitem
   
   /varlistentry
   varlistentry

@@ -344,7 +345,8 @@ char *date;/synopsis

   The DRM core tries to facilitate IRQ handler registration
   and
   unregistration by providing
   functiondrm_irq_install/function and
   functiondrm_irq_uninstall/function functions. Those
   functions only

-  support a single interrupt per device.
+  support a single interrupt per device, devices that use
more
than one +  IRQs need to be handled manually.
   
   Perhaps this should mention that if you handle IRQ installation manually
   you also need to manually set drm-irq_enabled = 1, as otherwise things
   like DRM_IOCTL_WAIT_VBLANK won't work properly.
  
  That's only needed if DRIVER_HAVE_IRQ is set, otherwise the
  drm_wait_vblank() function skips the irq_enabled check.
 
 Not quite. There's another check for dev-irq_enabled in the
 DRM_WAIT_ON() so it'll never actually block.

Indeed.

 Arguably this could be solved by making the DRM_WAIT_ON() condition drop the
 !dev-irq_enabled in case DRIVER_HAVE_IRQ isn't set.

Or we could force drivers to set drm-irq_enabled, I'm fine with that as well. 
I can fix the documentation if that would be the preferred solution.

 I'll see if I can find the time to come up with a patch.

-- 
Regards,

Laurent Pinchart


signature.asc
Description: This is a digitally signed message part.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm: Improve manual IRQ installation documentation

On Thu, Jun 20, 2013 at 12:46:21PM +0200, Laurent Pinchart wrote:
 Hi Thierry,
 
 On Thursday 20 June 2013 12:40:26 Thierry Reding wrote:
  On Thu, Jun 20, 2013 at 12:17:25PM +0200, Laurent Pinchart wrote:
   On Thursday 20 June 2013 12:10:47 Thierry Reding wrote:
On Wed, Jun 19, 2013 at 02:00:45PM +0200, Laurent Pinchart wrote:
 Signed-off-by: Laurent Pinchart
 laurent.pinchart+rene...@ideasonboard.com
 ---
 
  Documentation/DocBook/drm.tmpl | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)
 
 diff --git a/Documentation/DocBook/drm.tmpl
 b/Documentation/DocBook/drm.tmpl index f9df3b8..738b727 100644
 --- a/Documentation/DocBook/drm.tmpl
 +++ b/Documentation/DocBook/drm.tmpl
 @@ -186,11 +186,12 @@
 
varlistentry

  termDRIVER_HAVE_IRQ/termtermDRIVER_IRQ_SHARED/term
  
  listitempara
 
 -  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
 handler. The -  DRM core will automatically register an
 interrupt handler when the -  flag is set.
 DRIVER_IRQ_SHARED
 indicates whether the device amp; -  handler support
 shared
 IRQs (note that this is required of PCI -  drivers).
 +  DRIVER_HAVE_IRQ indicates whether the driver has an IRQ
 handler +  managed by the DRM Core. The core will support
 simple IRQ handler +  installation when the flag is set.
 The
 installation process is +  described in xref
 linkend=drm-irq-registration/./para +
 paraDRIVER_IRQ_SHARED indicates whether the device amp; handler +
 
 support shared IRQs (note that this is required of PCI 
 drivers).
 
  /para/listitem

/varlistentry
varlistentry
 
 @@ -344,7 +345,8 @@ char *date;/synopsis
 
The DRM core tries to facilitate IRQ handler registration
and
unregistration by providing
functiondrm_irq_install/function and
functiondrm_irq_uninstall/function functions. Those
functions only
 
 -  support a single interrupt per device.
 +  support a single interrupt per device, devices that use
 more
 than one +  IRQs need to be handled manually.

Perhaps this should mention that if you handle IRQ installation manually
you also need to manually set drm-irq_enabled = 1, as otherwise things
like DRM_IOCTL_WAIT_VBLANK won't work properly.
   
   That's only needed if DRIVER_HAVE_IRQ is set, otherwise the
   drm_wait_vblank() function skips the irq_enabled check.
  
  Not quite. There's another check for dev-irq_enabled in the
  DRM_WAIT_ON() so it'll never actually block.
 
 Indeed.
 
  Arguably this could be solved by making the DRM_WAIT_ON() condition drop the
  !dev-irq_enabled in case DRIVER_HAVE_IRQ isn't set.
 
 Or we could force drivers to set drm-irq_enabled, I'm fine with that as 
 well. 
 I can fix the documentation if that would be the preferred solution.

Thinking about it some more, the latter seems more appropriate. Consider
some driver that doesn't set DRIVER_HAVE_IRQ but also doesn't initialize
interrupts manually. If so it'll cause DRM_IOCTL_WAIT_VBLANK to block
indefinitely.

On the other hand perhaps the check at the beginning of drm_wait_vblank
should be improved. Maybe make it something like this:

if (drm_core_check_feature(dev, DRIVER_HAVE_IRQ))
if (!drm_dev_to_irq(dev))
return -EINVAL;

if (!dev-irq_enabled)
return -EINVAL;

Thierry


pgp2XSrMV_o9c.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

 -Original Message-
 From: Lucas Stach [mailto:l.st...@pengutronix.de]
 Sent: Thursday, June 20, 2013 7:11 PM
 To: Inki Dae
 Cc: 'Russell King - ARM Linux'; 'Inki Dae'; 'linux-fbdev'; 'YoungJun Cho';
 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
 ker...@lists.infradead.org; linux-me...@vger.kernel.org
 Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
 framework

 Am Donnerstag, den 20.06.2013, 17:24 +0900 schrieb Inki Dae:
 [...]
In addition, please see the below more detail examples.

The conventional way (without dmabuf-sync) is:
Task A

 1. CPU accesses buf
 2. Send the buf to Task B
 3. Wait for the buf from Task B
 4. go to 1

Task B
---
1. Wait for the buf from Task A
2. qbuf the buf
2.1 insert the buf to incoming queue
3. stream on
3.1 dma_map_sg if ready, and move the buf to ready queue
3.2 get the buf from ready queue, and dma start.
4. dqbuf
4.1 dma_unmap_sg after dma operation completion
4.2 move the buf to outgoing queue
5. back the buf to Task A
6. go to 1

In case that two tasks share buffers, and data flow goes from Task A
 to
   Task
B, we would need IPC operation to send and receive buffers properly
   between
those two tasks every time CPU or DMA access to buffers is started
 or
completed.

With dmabuf-sync:

Task A

 1. dma_buf_sync_lock - synpoint (call by user side)
 2. CPU accesses buf
 3. dma_buf_sync_unlock - syncpoint (call by user side)
 4. Send the buf to Task B (just one time)
 5. go to 1

Task B
---
1. Wait for the buf from Task A (just one time)
2. qbuf the buf
1.1 insert the buf to incoming queue
3. stream on
3.1 dma_buf_sync_lock - syncpoint (call by kernel side)
3.2 dma_map_sg if ready, and move the buf to ready queue
3.3 get the buf from ready queue, and dma start.
4. dqbuf
4.1 dma_buf_sync_unlock - syncpoint (call by kernel side)
4.2 dma_unmap_sg after dma operation completion
4.3 move the buf to outgoing queue
5. go to 1

On the other hand, in case of using dmabuf-sync, as you can see the
   above
example, we would need IPC operation just one time. That way, I
 think we
could not only reduce performance overhead but also make user
   application
simplified. Of course, this approach can be used for all DMA device
   drivers
such as DRM. I'm not a specialist in v4l2 world so there may be
 missing
point.

   You already need some kind of IPC between the two tasks, as I suspect
   even in your example it wouldn't make much sense to queue the buffer
   over and over again in task B without task A writing anything to it.
 So
   task A has to signal task B there is new data in the buffer to be
   processed.

   There is no need to share the buffer over and over again just to get
 the
   two processes to work together on the same thing. Just share the fd
   between both and then do out-of-band completion signaling, as you need
   this anyway. Without this you'll end up with unpredictable behavior.
   Just because sync allows you to access the buffer doesn't mean it's
   valid for your use-case. Without completion signaling you could easily
   end up overwriting your data from task A multiple times before task B
   even tries to lock the buffer for processing.

   So the valid flow is (and this already works with the current APIs):
   Task ATask B
   ----
   CPU access buffer
--completion signal-
 qbuf (dragging buffer into
 device domain, flush caches,
 reserve buffer etc.)
   |
 wait for device operation to
 complete
   |
 dqbuf (dragging buffer back
 into CPU domain, invalidate
 caches, unreserve)
   -completion signal
   CPU access buffer

  Correct. In case that data flow goes from A to B, it needs some kind
  of IPC between the two tasks every time as you said. Then, without
  dmabuf-sync, how do think about the case that two tasks share the same
  buffer but these tasks access the buffer(buf1) as write, and data of
  the buffer(buf1) isn't needed to be shared?

 Sorry, I don't see the point you are trying to solve here. If you share
 a

[PATCH v5 1/7] arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.

This will allow me to call functions that have multiple arguments if fastpath 
fails.
This is required to support ticket mutexes, because they need to be able to 
pass an
extra argument to the fail function.

Originally I duplicated the functions, by adding 
__mutex_fastpath_lock_retval_arg.
This ended up being just a duplication of the existing function, so a way to 
test
if fastpath was called ended up being better.

This also cleaned up the reservation mutex patch some by being able to call an
atomic_set instead of atomic_xchg, and making it easier to detect if the wrong
unlock function was previously used.

Changes since v1, pointed out by Francesco Lavra:
- fix a small comment issue in mutex_32.h
- fix the __mutex_fastpath_lock_retval macro for mutex-null.h

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 arch/ia64/include/asm/mutex.h|   10 --
 arch/powerpc/include/asm/mutex.h |   10 --
 arch/sh/include/asm/mutex-llsc.h |4 ++--
 arch/x86/include/asm/mutex_32.h  |   11 ---
 arch/x86/include/asm/mutex_64.h  |   11 ---
 include/asm-generic/mutex-dec.h  |   10 --
 include/asm-generic/mutex-null.h |2 +-
 include/asm-generic/mutex-xchg.h |   10 --
 kernel/mutex.c   |   32 ++--
 9 files changed, 41 insertions(+), 59 deletions(-)

diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a6..f41e66d 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -29,17 +29,15 @@ __mutex_fastpath_lock(atomic_t *count, void 
(*fail_fn)(atomic_t *))
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
-   return fail_fn(count);
+   return -1;
return 0;
 }
 
diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e..127ab23 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -82,17 +82,15 @@ __mutex_fastpath_lock(atomic_t *count, void 
(*fail_fn)(atomic_t *))
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
if (unlikely(__mutex_dec_return_lock(count)  0))
-   return fail_fn(count);
+   return -1;
return 0;
 }
 
diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a..dad29b6 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -37,7 +37,7 @@ __mutex_fastpath_lock(atomic_t *count, void 
(*fail_fn)(atomic_t *))
 }
 
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
int __done, __res;
 
@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *count, int 
(*fail_fn)(atomic_t *))
: t);
 
if (unlikely(!__done || __res != 0))
-   __res = fail_fn(count);
+   __res = -1;
 
return __res;
 }
diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8..0208c3c 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -42,17 +42,14 @@ do {
\
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if it
- * wasn't 1 originally. This function returns 0 if the

[PATCH v5 0/7] add mutex wait/wound/style style locks

Changes since v4:
- Some documentation cleanups.
- Added a lot more tests to cover all the DEBUG_LOCKS_WARN_ON cases.
- Added EDEADLK tests.
- Split off the normal mutex tests to a separate patch.
- Added a patch to not allow tests to fail that succeed with PROVE_LOCKING 
enabled.

---

Daniel Vetter (1):
  mutex: w/w mutex slowpath debugging

Maarten Lankhorst (6):
  arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded 
or not.
  mutex: add support for wound/wait style locks, v5
  mutex: Add ww tests to lib/locking-selftest.c. v5
  mutex: add more tests to lib/locking-selftest.c
  mutex: add more ww tests to test EDEADLK path handling
  locking-selftests: handle unexpected failures more strictly


 Documentation/ww-mutex-design.txt |  343 ++
 arch/ia64/include/asm/mutex.h |   10 -
 arch/powerpc/include/asm/mutex.h  |   10 -
 arch/sh/include/asm/mutex-llsc.h  |4 
 arch/x86/include/asm/mutex_32.h   |   11 -
 arch/x86/include/asm/mutex_64.h   |   11 -
 include/asm-generic/mutex-dec.h   |   10 -
 include/asm-generic/mutex-null.h  |2 
 include/asm-generic/mutex-xchg.h  |   10 -
 include/linux/mutex-debug.h   |1 
 include/linux/mutex.h |  363 +++
 kernel/mutex.c|  384 ++--
 lib/Kconfig.debug |   13 +
 lib/debug_locks.c |2 
 lib/locking-selftest.c|  720 -
 15 files changed, 1802 insertions(+), 92 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

-- 
Signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v5 2/7] mutex: add support for wound/wait style locks, v5

Changes since RFC patch v1:
 - Updated to use atomic_long instead of atomic, since the reservation_id was a 
long.
 - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
 - removed mutex_locked_set_reservation_id (or w/e it was called)
Changes since RFC patch v2:
 - remove use of __mutex_lock_retval_arg, add warnings when using wrong 
combination of
   mutex_(,reserve_)lock/unlock.
Changes since v1:
 - Add __always_inline to __mutex_lock_common, otherwise reservation paths can 
be
   triggered from normal locks, because __builtin_constant_p might evaluate to 
false
   for the constant 0 in that case. Tests for this have been added in the next 
patch.
 - Updated documentation slightly.
Changes since v2:
 - Renamed everything to ww_mutex. (mlankhorst)
 - Added ww_acquire_ctx and ww_class. (mlankhorst)
 - Added a lot of checks for wrong api usage. (mlankhorst)
 - Documentation updates. (danvet)
Changes since v3:
 - Small documentation fixes (robclark)
 - Memory barrier fix (danvet)
Changes since v4:
 - Remove ww_mutex_unlock_single and ww_mutex_lock_single.
 - Rename ww_mutex_trylock_single to ww_mutex_trylock.
 - Remove separate implementations of ww_mutex_lock_slow*, normal
   functions can be used. Inline versions still exist for extra
   debugging.
 - Cleanup unneeded memory barriers, add comment to the remaining
   smp_mb().

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Rob Clark robdcl...@gmail.com
---
 Documentation/ww-mutex-design.txt |  343 
 include/linux/mutex-debug.h   |1 
 include/linux/mutex.h |  355 +
 kernel/mutex.c|  318 +++--
 lib/debug_locks.c |2 
 5 files changed, 1002 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

diff --git a/Documentation/ww-mutex-design.txt 
b/Documentation/ww-mutex-design.txt
new file mode 100644
index 000..379739c
--- /dev/null
+++ b/Documentation/ww-mutex-design.txt
@@ -0,0 +1,343 @@
+Wait/Wound Deadlock-Proof Mutex Design
+==
+
+Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+
+Motivation for WW-Mutexes
+-
+
+GPU's do operations that commonly involve many buffers.  Those buffers
+can be shared across contexts/processes, exist in different memory
+domains (for example VRAM vs system memory), and so on.  And with
+PRIME / dmabuf, they can even be shared across devices.  So there are
+a handful of situations where the driver needs to wait for buffers to
+become ready.  If you think about this in terms of waiting on a buffer
+mutex for it to become available, this presents a problem because
+there is no way to guarantee that buffers appear in a execbuf/batch in
+the same order in all contexts.  That is directly under control of
+userspace, and a result of the sequence of GL calls that an application
+makes. Which results in the potential for deadlock.  The problem gets
+more complex when you consider that the kernel may need to migrate the
+buffer(s) into VRAM before the GPU operates on the buffer(s), which
+may in turn require evicting some other buffers (and you don't want to
+evict other buffers which are already queued up to the GPU), but for a
+simplified understanding of the problem you can ignore this.
+
+The algorithm that TTM came up with for dealing with this problem is quite
+simple.  For each group of buffers (execbuf) that need to be locked, the caller
+would be assigned a unique reservation id/ticket, from a global counter.  In
+case of deadlock while locking all the buffers associated with a execbuf, the
+one with the lowest reservation ticket (i.e. the oldest task) wins, and the one
+with the higher reservation id (i.e. the younger task) unlocks all of the
+buffers that it has already locked, and then tries again.
+
+In the RDBMS literature this deadlock handling approach is called wait/wound:
+The older tasks waits until it can acquire the contended lock. The younger 
tasks
+needs to back off and drop all the locks it is currently holding, i.e. the
+younger task is wounded.
+
+Concepts
+
+
+Compared to normal mutexes two additional concepts/objects show up in the lock
+interface for w/w mutexes:
+
+Acquire context: To ensure eventual forward progress it is important the a task
+trying to acquire locks doesn't grab a new reservation id, but keeps the one it
+acquired when starting the lock acquisition. This ticket is stored in the
+acquire context. Furthermore the acquire context keeps track of debugging state
+to catch w/w mutex interface abuse.
+
+W/w class: In contrast to normal mutexes the lock class needs to be explicit 
for
+w/w mutexes, since it is required to initialize the acquire context.
+
+Furthermore there are three different class of w/w lock

[PATCH v5 4/7] mutex: Add ww tests to lib/locking-selftest.c. v5

This stresses the lockdep code in some ways specifically useful to
ww_mutexes. It adds checks for most of the common locking errors.

Changes since v1:
 - Add tests to verify reservation_id is untouched.
 - Use L() and U() macros where possible.

Changes since v2:
 - Use the ww_mutex api directly.
 - Use macros for most of the code.
Changes since v3:
 - Rework tests for the api changes.
Changes since v4:
 - Added a test to cover ww_acquire_done being called multiple times.
 - Added a test for calling ww_acquire_fini being called before
   all locks have been unlocked.
 - Added a test for locking after ww_acquire_done has been called.
 - Added a test for unbalance for ctx-acquired dropping below zero.
 - Added a test for unlocked ww_mutex with ctx != NULL.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 lib/locking-selftest.c |  400 ++--
 1 file changed, 381 insertions(+), 19 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index c3eb261..9962262 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -26,6 +26,8 @@
  */
 static unsigned int debug_locks_verbose;
 
+static DEFINE_WW_CLASS(ww_lockdep);
+
 static int __init setup_debug_locks_verbose(char *str)
 {
get_option(str, debug_locks_verbose);
@@ -42,6 +44,10 @@ __setup(debug_locks_verbose=, setup_debug_locks_verbose);
 #define LOCKTYPE_RWLOCK0x2
 #define LOCKTYPE_MUTEX 0x4
 #define LOCKTYPE_RWSEM 0x8
+#define LOCKTYPE_WW0x10
+
+static struct ww_acquire_ctx t, t2;
+static struct ww_mutex o, o2;
 
 /*
  * Normal standalone locks, for the circular and irq-context
@@ -193,6 +199,20 @@ static void init_shared_classes(void)
 #define RSU(x) up_read(rwsem_##x)
 #define RWSI(x)init_rwsem(rwsem_##x)
 
+#ifndef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+#define WWAI(x)ww_acquire_init(x, ww_lockdep)
+#else
+#define WWAI(x)do { ww_acquire_init(x, ww_lockdep); 
(x)-deadlock_inject_countdown = ~0U; } while (0)
+#endif
+#define WWAD(x)ww_acquire_done(x)
+#define WWAF(x)ww_acquire_fini(x)
+
+#define WWL(x, c)  ww_mutex_lock(x, c)
+#define WWT(x) ww_mutex_trylock(x)
+#define WWL1(x)ww_mutex_lock(x, NULL)
+#define WWU(x) ww_mutex_unlock(x)
+
+
 #define LOCK_UNLOCK_2(x,y) LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)
 
 /*
@@ -894,11 +914,13 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 # define I_RWLOCK(x)   lockdep_reset_lock(rwlock_##x.dep_map)
 # define I_MUTEX(x)lockdep_reset_lock(mutex_##x.dep_map)
 # define I_RWSEM(x)lockdep_reset_lock(rwsem_##x.dep_map)
+# define I_WW(x)   lockdep_reset_lock(x.dep_map)
 #else
 # define I_SPINLOCK(x)
 # define I_RWLOCK(x)
 # define I_MUTEX(x)
 # define I_RWSEM(x)
+# define I_WW(x)
 #endif
 
 #define I1(x)  \
@@ -920,11 +942,20 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 static void reset_locks(void)
 {
local_irq_disable();
+   lockdep_free_key_range(ww_lockdep.acquire_key, 1);
+   lockdep_free_key_range(ww_lockdep.mutex_key, 1);
+
I1(A); I1(B); I1(C); I1(D);
I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
+   I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
lockdep_reset();
I2(A); I2(B); I2(C); I2(D);
init_shared_classes();
+
+   ww_mutex_init(o, ww_lockdep); ww_mutex_init(o2, ww_lockdep);
+   memset(t, 0, sizeof(t)); memset(t2, 0, sizeof(t2));
+   memset(ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
+   memset(ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
local_irq_enable();
 }
 
@@ -938,7 +969,6 @@ static int unexpected_testcase_failures;
 static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 {
unsigned long saved_preempt_count = preempt_count();
-   int expected_failure = 0;
 
WARN_ON(irqs_disabled());
 
@@ -946,26 +976,16 @@ static void dotest(void (*testcase_fn)(void), int 
expected, int lockclass_mask)
/*
 * Filter out expected failures:
 */
+   if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
-   if ((lockclass_mask  LOCKTYPE_SPIN)  debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask  LOCKTYPE_RWLOCK)  debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask  LOCKTYPE_MUTEX)  debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask  LOCKTYPE_RWSEM)  debug_locks != expected)
-   expected_failure = 1;
+   expected_testcase_failures++;
+   printk(failed|);
+#else
+   unexpected_testcase_failures++;
+   printk(FAILED|);
+
+   dump_stack();
 #endif
-   if (debug_locks !=

[PATCH v5 3/7] mutex: w/w mutex slowpath debugging

From: Daniel Vetter daniel.vet...@ffwll.ch

Injects EDEADLK conditions at pseudo-random interval, with exponential
backoff up to UINT_MAX (to ensure that every lock operation still
completes in a reasonable time).

This way we can test the wound slowpath even for ww mutex users where
contention is never expected, and the ww deadlock avoidance algorithm
is only needed for correctness against malicious userspace. An example
would be protecting kernel modesetting properties, which thanks to
single-threaded X isn't really expected to contend, ever.

I've looked into using the CONFIG_FAULT_INJECTION infrastructure, but
decided against it for two reasons:

- EDEADLK handling is mandatory for ww mutex users and should never
  affect the outcome of a syscall. This is in contrast to -ENOMEM
  injection. So fine configurability isn't required.

- The fault injection framework only allows to set a simple
  probability for failure. Now the probability that a ww mutex acquire
  stage with N locks will never complete (due to too many injected
  EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock
  operations for the completely uncontended case would be O(exp(N)).
  The per-acuiqire ctx exponential backoff solution choosen here only
  results in O(log N) overhead due to injection and so O(log N * N)
  lock operations. This way we can fail with high probability (and so
  have good test coverage even for fancy backoff and lock acquisition
  paths) without running into patalogical cases.

Note that EDEADLK will only ever be injected when we managed to
acquire the lock. This prevents any behaviour changes for users which
rely on the EALREADY semantics.

v2: Drop the cargo-culted __sched (I should read docs next time
around) and annotate the non-debug case with inline to prevent gcc
from doing something horrible.

v3: Rebase on top of Maarten's latest patches.

v4: Actually make this stuff compile, I've misplace the hunk in the
wrong #ifdef block.

v5: Simplify ww_mutex_deadlock_injection definition, and fix
lib/locking-selftest.c warnings. Fix lib/Kconfig.debug definition
to work correctly. (mlankhorst)

v6:
Do not inject -EDEADLK when ctx-acquired == 0, because
the _slow paths are merged now. (mlankhorst)

Cc: Steven Rostedt rost...@goodmis.org
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 include/linux/mutex.h |8 
 kernel/mutex.c|   44 +---
 lib/Kconfig.debug |   13 +
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index f3ad181..2ff9178 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -95,6 +95,10 @@ struct ww_acquire_ctx {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   unsigned deadlock_inject_interval;
+   unsigned deadlock_inject_countdown;
+#endif
 };
 
 struct ww_mutex {
@@ -280,6 +284,10 @@ static inline void ww_acquire_init(struct ww_acquire_ctx 
*ctx,
 ww_class-acquire_key, 0);
mutex_acquire(ctx-dep_map, 0, 0, _RET_IP_);
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   ctx-deadlock_inject_interval = 1;
+   ctx-deadlock_inject_countdown = ctx-stamp  0xf;
+#endif
 }
 
 /**
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 75fc7c4..e40004b 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -508,22 +508,60 @@ mutex_lock_interruptible_nested(struct mutex *lock, 
unsigned int subclass)
 
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
 
+static inline int
+ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   unsigned tmp;
+
+   if (ctx-deadlock_inject_countdown-- == 0) {
+   tmp = ctx-deadlock_inject_interval;
+   if (tmp  UINT_MAX/4)
+   tmp = UINT_MAX;
+   else
+   tmp = tmp*2 + tmp + tmp/2;
+
+   ctx-deadlock_inject_interval = tmp;
+   ctx-deadlock_inject_countdown = tmp;
+   ctx-contending_lock = lock;
+
+   ww_mutex_unlock(lock);
+
+   return -EDEADLK;
+   }
+#endif
+
+   return 0;
+}
 
 int __sched
 __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+   int ret;
+
might_sleep();
-   return __mutex_lock_common(lock-base, TASK_UNINTERRUPTIBLE,
+   ret =  __mutex_lock_common(lock-base, TASK_UNINTERRUPTIBLE,
   0, ctx-dep_map, _RET_IP_, ctx);
+   if (!ret  ctx-acquired  0)
+   return ww_mutex_deadlock_injection(lock, ctx);
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(__ww_mutex_lock);
 
 int __sched
 __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx 
*ctx)
 {
+   int ret;
+
might_sleep();
-   return

[PATCH v5 5/7] mutex: add more tests to lib/locking-selftest.c

None of the ww_mutex codepaths should be taken in the 'normal'
mutex calls. The easiest way to verify this is by using the
normal mutex calls, and making sure o.ctx is unmodified.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 lib/locking-selftest.c |   62 
 1 file changed, 62 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 9962262..37faefd 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -1162,6 +1162,67 @@ static void ww_test_fail_acquire(void)
 #endif
 }
 
+static void ww_test_normal(void)
+{
+   int ret;
+
+   WWAI(t);
+
+   /*
+* None of the ww_mutex codepaths should be taken in the 'normal'
+* mutex calls. The easiest way to verify this is by using the
+* normal mutex calls, and making sure o.ctx is unmodified.
+*/
+
+   /* mutex_lock (and indirectly, mutex_lock_nested) */
+   o.ctx = (void *)~0UL;
+   mutex_lock(o.base);
+   mutex_unlock(o.base);
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* mutex_lock_interruptible (and *_nested) */
+   o.ctx = (void *)~0UL;
+   ret = mutex_lock_interruptible(o.base);
+   if (!ret)
+   mutex_unlock(o.base);
+   else
+   WARN_ON(1);
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* mutex_lock_killable (and *_nested) */
+   o.ctx = (void *)~0UL;
+   ret = mutex_lock_killable(o.base);
+   if (!ret)
+   mutex_unlock(o.base);
+   else
+   WARN_ON(1);
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* trylock, succeeding */
+   o.ctx = (void *)~0UL;
+   ret = mutex_trylock(o.base);
+   WARN_ON(!ret);
+   if (ret)
+   mutex_unlock(o.base);
+   else
+   WARN_ON(1);
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* trylock, failing */
+   o.ctx = (void *)~0UL;
+   mutex_lock(o.base);
+   ret = mutex_trylock(o.base);
+   WARN_ON(ret);
+   mutex_unlock(o.base);
+   WARN_ON(o.ctx != (void *)~0UL);
+
+   /* nest_lock */
+   o.ctx = (void *)~0UL;
+   mutex_lock_nest_lock(o.base, t);
+   mutex_unlock(o.base);
+   WARN_ON(o.ctx != (void *)~0UL);
+}
+
 static void ww_test_two_contexts(void)
 {
WWAI(t);
@@ -1415,6 +1476,7 @@ static void ww_tests(void)
 
print_testname(ww api failures);
dotest(ww_test_fail_acquire, SUCCESS, LOCKTYPE_WW);
+   dotest(ww_test_normal, SUCCESS, LOCKTYPE_WW);
dotest(ww_test_unneeded_slow, FAILURE, LOCKTYPE_WW);
printk(\n);
 

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v5 6/7] mutex: add more ww tests to test EDEADLK path handling

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 lib/locking-selftest.c |  264 +++-
 1 file changed, 261 insertions(+), 3 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 37faefd..d554f3f 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -47,7 +47,7 @@ __setup(debug_locks_verbose=, setup_debug_locks_verbose);
 #define LOCKTYPE_WW0x10
 
 static struct ww_acquire_ctx t, t2;
-static struct ww_mutex o, o2;
+static struct ww_mutex o, o2, o3;
 
 /*
  * Normal standalone locks, for the circular and irq-context
@@ -947,12 +947,12 @@ static void reset_locks(void)
 
I1(A); I1(B); I1(C); I1(D);
I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
-   I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
+   I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base); I_WW(o3.base);
lockdep_reset();
I2(A); I2(B); I2(C); I2(D);
init_shared_classes();
 
-   ww_mutex_init(o, ww_lockdep); ww_mutex_init(o2, ww_lockdep);
+   ww_mutex_init(o, ww_lockdep); ww_mutex_init(o2, ww_lockdep); 
ww_mutex_init(o3, ww_lockdep);
memset(t, 0, sizeof(t)); memset(t2, 0, sizeof(t2));
memset(ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
memset(ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
@@ -1292,6 +1292,251 @@ static void ww_test_object_lock_stale_context(void)
WWL(o, t);
 }
 
+static void ww_test_edeadlk_normal(void)
+{
+   int ret;
+
+   mutex_lock(o2.base);
+   o2.ctx = t2;
+   mutex_release(o2.base.dep_map, 1, _THIS_IP_);
+
+   WWAI(t);
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(o, t);
+   WARN_ON(ret);
+
+   ret = WWL(o2, t);
+   WARN_ON(ret != -EDEADLK);
+
+   o2.ctx = NULL;
+   mutex_acquire(o2.base.dep_map, 0, 1, _THIS_IP_);
+   mutex_unlock(o2.base);
+   WWU(o);
+
+   WWL(o2, t);
+}
+
+static void ww_test_edeadlk_normal_slow(void)
+{
+   int ret;
+
+   mutex_lock(o2.base);
+   mutex_release(o2.base.dep_map, 1, _THIS_IP_);
+   o2.ctx = t2;
+
+   WWAI(t);
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(o, t);
+   WARN_ON(ret);
+
+   ret = WWL(o2, t);
+   WARN_ON(ret != -EDEADLK);
+
+   o2.ctx = NULL;
+   mutex_acquire(o2.base.dep_map, 0, 1, _THIS_IP_);
+   mutex_unlock(o2.base);
+   WWU(o);
+
+   ww_mutex_lock_slow(o2, t);
+}
+
+static void ww_test_edeadlk_no_unlock(void)
+{
+   int ret;
+
+   mutex_lock(o2.base);
+   o2.ctx = t2;
+   mutex_release(o2.base.dep_map, 1, _THIS_IP_);
+
+   WWAI(t);
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(o, t);
+   WARN_ON(ret);
+
+   ret = WWL(o2, t);
+   WARN_ON(ret != -EDEADLK);
+
+   o2.ctx = NULL;
+   mutex_acquire(o2.base.dep_map, 0, 1, _THIS_IP_);
+   mutex_unlock(o2.base);
+
+   WWL(o2, t);
+}
+
+static void ww_test_edeadlk_no_unlock_slow(void)
+{
+   int ret;
+
+   mutex_lock(o2.base);
+   mutex_release(o2.base.dep_map, 1, _THIS_IP_);
+   o2.ctx = t2;
+
+   WWAI(t);
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(o, t);
+   WARN_ON(ret);
+
+   ret = WWL(o2, t);
+   WARN_ON(ret != -EDEADLK);
+
+   o2.ctx = NULL;
+   mutex_acquire(o2.base.dep_map, 0, 1, _THIS_IP_);
+   mutex_unlock(o2.base);
+
+   ww_mutex_lock_slow(o2, t);
+}
+
+static void ww_test_edeadlk_acquire_more(void)
+{
+   int ret;
+
+   mutex_lock(o2.base);
+   mutex_release(o2.base.dep_map, 1, _THIS_IP_);
+   o2.ctx = t2;
+
+   WWAI(t);
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(o, t);
+   WARN_ON(ret);
+
+   ret = WWL(o2, t);
+   WARN_ON(ret != -EDEADLK);
+
+   ret = WWL(o3, t);
+}
+
+static void ww_test_edeadlk_acquire_more_slow(void)
+{
+   int ret;
+
+   mutex_lock(o2.base);
+   mutex_release(o2.base.dep_map, 1, _THIS_IP_);
+   o2.ctx = t2;
+
+   WWAI(t);
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(o, t);
+   WARN_ON(ret);
+
+   ret = WWL(o2, t);
+   WARN_ON(ret != -EDEADLK);
+
+   ww_mutex_lock_slow(o3, t);
+}
+
+static void ww_test_edeadlk_acquire_more_edeadlk(void)
+{
+   int ret;
+
+   mutex_lock(o2.base);
+   mutex_release(o2.base.dep_map, 1, _THIS_IP_);
+   o2.ctx = t2;
+
+   mutex_lock(o3.base);
+   mutex_release(o3.base.dep_map, 1, _THIS_IP_);
+   o3.ctx = t2;
+
+   WWAI(t);
+   t2 = t;
+   t2.stamp--;
+
+   ret = WWL(o, t);
+   WARN_ON(ret);
+
+   ret = WWL(o2, t);
+   WARN_ON(ret != -EDEADLK);
+
+   ret = WWL(o3, t);
+   WARN_ON(ret != -EDEADLK);
+}
+
+static void ww_test_edeadlk_acquire_more_edeadlk_slow(void)
+{
+   int ret;
+
+   mutex_lock(o2.base);
+   mutex_release(o2.base.dep_map, 1, _THIS_IP_);
+   o2.ctx = t2;
+
+   mutex_lock(o3.base);
+   mutex_release(o3.base.dep_map, 1, _THIS_IP_);

[PATCH v5 7/7] locking-selftests: handle unexpected failures more strictly

When CONFIG_PROVE_LOCKING is not enabled, more tests are expected to
pass unexpectedly, but there no tests that should start to fail that
pass with CONFIG_PROVE_LOCKING enabled.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 lib/locking-selftest.c |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index d554f3f..aad024d 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -976,16 +976,18 @@ static void dotest(void (*testcase_fn)(void), int 
expected, int lockclass_mask)
/*
 * Filter out expected failures:
 */
-   if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
+   if (expected == FAILURE  debug_locks) {
expected_testcase_failures++;
printk(failed|);
-#else
+   }
+   else
+#endif
+   if (debug_locks != expected) {
unexpected_testcase_failures++;
printk(FAILED|);
 
dump_stack();
-#endif
} else {
testcase_successes++;
printk(  ok  |);

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v5 2/7] mutex: add support for wound/wait style locks, v5

Op 20-06-13 13:55, Ingo Molnar schreef:
 * Maarten Lankhorst maarten.lankho...@canonical.com wrote:

 Changes since RFC patch v1:
  - Updated to use atomic_long instead of atomic, since the reservation_id 
 was a long.
  - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
  - removed mutex_locked_set_reservation_id (or w/e it was called)
 Changes since RFC patch v2:
  - remove use of __mutex_lock_retval_arg, add warnings when using wrong 
 combination of
mutex_(,reserve_)lock/unlock.
 Changes since v1:
  - Add __always_inline to __mutex_lock_common, otherwise reservation paths 
 can be
triggered from normal locks, because __builtin_constant_p might evaluate 
 to false
for the constant 0 in that case. Tests for this have been added in the 
 next patch.
  - Updated documentation slightly.
 Changes since v2:
  - Renamed everything to ww_mutex. (mlankhorst)
  - Added ww_acquire_ctx and ww_class. (mlankhorst)
  - Added a lot of checks for wrong api usage. (mlankhorst)
  - Documentation updates. (danvet)
 Changes since v3:
  - Small documentation fixes (robclark)
  - Memory barrier fix (danvet)
 Changes since v4:
  - Remove ww_mutex_unlock_single and ww_mutex_lock_single.
  - Rename ww_mutex_trylock_single to ww_mutex_trylock.
  - Remove separate implementations of ww_mutex_lock_slow*, normal
functions can be used. Inline versions still exist for extra
debugging.
  - Cleanup unneeded memory barriers, add comment to the remaining
smp_mb().
 That's not a proper changelog. It should be a short description of what it 
 does, possibly referring to the new Documentation/ww-mutex-design.txt file 
 for more details.
Well they've helped me with some of the changes and contributed some code 
and/or fixes, but if acked-by is preferred I'll use that..
 Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
 Signed-off-by: Rob Clark robdcl...@gmail.com
 That's not a valid signoff chain: the last signoff in the chain is the 
 person sending me the patch. The first signoff is the person who wrote the 
 patch. The other two gents should be Acked-by I suspect?

I guess so.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65968] New: Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

  Priority: medium
Bug ID: 65968
  Assignee: dri-devel@lists.freedesktop.org
   Summary: Massive memory corruption in Planetary Annihilation
Alpha
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: andreas.ringlstet...@gmail.com
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Drivers/DRI/r300
   Product: Mesa

Created attachment 81105
  -- https://bugs.freedesktop.org/attachment.cgi?id=81105action=edit
Example of corruption in PA. The skybox texture has been completely
overwritten, partly with textures from other programms, corruption in other
textures is already starting.

Using the R300 driver (git version from 2013-06-19) on a Mobility Radeon X1400
(128MB dedicated ???), I get massive memory corruption which can be seen in the
attached screenshot when running the Planetary Annihilation Alpha.

The game makes use of virtual texturing, thats means a mega texture which won't
possibly fit in the RAM in one piece.

However, it appears like textures which are NOT part of the mega texture have
been mapped into the same address space. I could see other textures, and even
bitmaps from other applications.

In the screenshot, there are large grey stripes for example, however there is
no such texture in the game. The color does match the color of the window
border though. Performing further tests, I even managed to get parts of album
covers from Banshee into PA.


This issue is not only limited to Planetary Annihilation though and the
corruption also works other way around, where applications overwrite the
bitmaps of other applications.

The effects of the corruption are clearly visible in PA due to the large
textures. They are not deterministic, but appear very reliable, most likely due
to the high memory usage.

Using other applications which frequently allocate new textures (like Banshee
with album covers) speeds up the corruption and makes it even visible in other
applications like Firefox, Cinnamon etc., although not reliable.

Attached are:
Screenshot of corruption
Xorg-log
glxinfo output

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

--- Comment #2 from Andreas Ringlstetter andreas.ringlstet...@gmail.com ---
Created attachment 81107
  -- https://bugs.freedesktop.org/attachment.cgi?id=81107action=edit
glxinfo log

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

Andreas Ringlstetter andreas.ringlstet...@gmail.com changed:

   What|Removed |Added

  Attachment #81105|0   |1
is obsolete||

--- Comment #3 from Andreas Ringlstetter andreas.ringlstet...@gmail.com ---
Created attachment 81108
  -- https://bugs.freedesktop.org/attachment.cgi?id=81108action=edit
Example of corruption in PA. The skybox texture has been completely
overwritten, partly with textures from other programms, corruption in other
textures is already starting.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

--- Comment #4 from Andreas Ringlstetter andreas.ringlstet...@gmail.com ---
PS:
I will not be able to test with 9.0 or 9.1 since one of the shaders causes a
segfault while compiling in these version. This has only recently (last 1-2
months) been fixed in git.

This was caused by a faulty implementation of peephole_mul_omod() in
compiler/radeon_optimize.c, the SIGSEGV was thrown in
rc_variable_list_get_writers_one_reader due to writer_list beeing NULL.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65958] GPU Lockup on Trinity 7500G

https://bugs.freedesktop.org/show_bug.cgi?id=65958

--- Comment #2 from Alex Deucher ag...@yahoo.com ---
Does setting the env var RADEON_VA=0 in your /etc/environment fix the issue? 
This looks like a mesa issue.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65963] screen goes blank, Linux hangs - Radeon 7870, Gallium, Glamor

https://bugs.freedesktop.org/show_bug.cgi?id=65963

Alex Deucher ag...@yahoo.com changed:

   What|Removed |Added

   Assignee|xorg-driver-...@lists.x.org |dri-devel@lists.freedesktop
   ||.org
 QA Contact|xorg-t...@lists.x.org   |
Product|xorg|Mesa
Version|git |unspecified
  Component|Driver/Radeon   |Drivers/Gallium/radeonsi

--- Comment #1 from Alex Deucher ag...@yahoo.com ---
Is this a regression?  What version of mesa and llvm are you using?  Also, in
the future please attach your logs directly to the bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65968] Massive memory corruption in Planetary Annihilation Alpha

https://bugs.freedesktop.org/show_bug.cgi?id=65968

Alex Deucher ag...@yahoo.com changed:

   What|Removed |Added

  Attachment #81105|text/plain  |image/jpeg
  mime type||

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65963] screen goes blank, Linux hangs - Radeon 7870, Gallium, Glamor