[Bug 74551] Unable to run linux with radeon.runpm=1

2014-07-17 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=74551

--- Comment #14 from maxis11  ---
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728524] [drm] Internal thermal controller with fan control
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728612] == power state 0 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728615] ui class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728617] internal class: boot
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728618] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728620] uvd vclk: 0 dclk: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728622] power level 0 sclk: 1 mclk: 15000 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728623] power level 1 sclk: 1 mclk: 15000 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728625] power level 2 sclk: 1 mclk: 15000 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728626] status: c r b
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728628] == power state 1 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728629] ui class: performance
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728630] internal class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728631] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728632] uvd vclk: 0 dclk: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728634] power level 0 sclk: 1 mclk: 3 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728635] power level 1 sclk: 4 mclk: 9 vddc: 1000 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728636] power level 2 sclk: 75000 mclk: 9 vddc: 1100 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728637] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728638] == power state 2 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728639] ui class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728640] internal class: uvd
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728641] caps: video
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728643] uvd vclk: 7 dclk: 56000
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728644] power level 0 sclk: 75000 mclk: 9 vddc: 1100 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728645] power level 1 sclk: 75000 mclk: 9 vddc: 1100 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728647] power level 2 sclk: 75000 mclk: 9 vddc: 1100 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728647] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728648] == power state 3 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728649] ui class: battery
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728650] internal class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728652] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728653] uvd vclk: 0 dclk: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728654] power level 0 sclk: 1 mclk: 3 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728655] power level 1 sclk: 1 mclk: 3 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728656] power level 2 sclk: 3 mclk: 3 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728657] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728658] == power state 4 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728659] ui class: battery
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728660] internal class: uvd_hd
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728661] caps: video
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728663] uvd vclk: 4 dclk: 3
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728664] power level 0 sclk: 4 mclk: 65000 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728666] power level 1 sclk: 4 mclk: 65000 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728667] power level 2 sclk: 4 mclk: 65000 vddc: 900 vddci: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728668] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728669] == power state 5 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728670] ui class: battery
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728671] internal class: uvd_sd
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728672] caps: video
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728673] uvd vclk: 1 dclk: 1
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [6.728675] power level 0

[Bug 74551] Unable to run linux with radeon.runpm=1

2014-07-17 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=74551

--- Comment #13 from maxis11  ---
(In reply to Alex Deucher from comment #12)
> (In reply to maxis11 from comment #6)
> > BTW runpm works with 3.12.7 and 3.12.8 (with 3.13 and later runpm=1 freezes
> > laptop during boot)
> 
> runpm support didn't exist until 3.13. If "echo OFF >
> /sys/kernel/debug/vgaswitcheroo/switch" has never worked, then your system
> has never had functional support for turning on/off the dGPU.

kernel 3.12.7:
When loading SUMO2(6480g)
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915421] [drm] Internal thermal controller without fan control
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915457] == power state 0 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915459] ui class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915461] internal class: uvd_mvc
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915463] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915465] uvd vclk: 7 dclk: 55173
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915468] power level 0 sclk: 59260 vddc: 1113
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915469] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915471] == power state 1 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915472] ui class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915473] internal class: uvd_hd
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915475] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915477] uvd vclk: 4 dclk: 30770
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915478] power level 0 sclk: 27587 vddc: 888
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915479] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915481] == power state 2 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915482] ui class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915483] internal class: uvd
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915485] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915487] uvd vclk: 7 dclk: 55173
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915488] power level 0 sclk: 59260 vddc: 1113
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915489] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915491] == power state 3 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915492] ui class: battery
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915493] internal class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915495] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915496] uvd vclk: 0 dclk: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915498] power level 0 sclk: 27587 vddc: 888
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915499] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915500] == power state 4 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915502] ui class: performance
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915503] internal class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915504] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915506] uvd vclk: 0 dclk: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915508] power level 0 sclk: 27587 vddc: 888
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915509] power level 1 sclk: 59260 vddc: 1113
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915510] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915512] == power state 5 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915513] ui class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915514] internal class: boot
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915516] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915517] uvd vclk: 0 dclk: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915519] power level 0 sclk: 2 vddc: 1113
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915520] status: c r b
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915523] == power state 6 ==
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915524] ui class: none
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915525] internal class: thermal
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915527] caps:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915528] uvd vclk: 0 dclk: 0
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915530] power level 0 sclk: 2 vddc: 888
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915531] status:
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.915534] [drm] Found smc ucode version: 0x00011200
Jul 18 00:34:03 maxis11-Aspire-5560 kernel: [1.936330] switching from power

[Bug 74551] Unable to run linux with radeon.runpm=1

2014-07-17 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=74551

--- Comment #12 from Alex Deucher  ---
(In reply to maxis11 from comment #6)
> BTW runpm works with 3.12.7 and 3.12.8 (with 3.13 and later runpm=1 freezes
> laptop during boot)

runpm support didn't exist until 3.13. If "echo OFF >
/sys/kernel/debug/vgaswitcheroo/switch" has never worked, then your system has
never had functional support for turning on/off the dGPU.
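
The switch file named above takes plain-text commands. A minimal sketch in C of what the echo command does, assuming debugfs is mounted at /sys/kernel/debug and the caller is root. switcheroo_write() and VGA_SWITCHEROO_PATH are hypothetical names chosen for illustration, not part of any kernel or library API. Comment #11 in this thread reports that issuing OFF froze the reporter's machine, so treat this as an illustration of the interface only.

```c
#include <assert.h>
#include <stdio.h>

/* Path quoted in the comment above; needs root and a mounted debugfs. */
#define VGA_SWITCHEROO_PATH "/sys/kernel/debug/vgaswitcheroo/switch"

/*
 * Hypothetical helper: write one command line (for example "OFF") to the
 * file at `path`, like `echo CMD > path` would.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int switcheroo_write(const char *path, const char *cmd)
{
    assert(path != NULL && cmd != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fprintf(f, "%s\n", cmd) >= 0;

    /* Buffered data is flushed here, so a rejected write shows up as a
     * failing fclose(). */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A caller would use switcheroo_write(VGA_SWITCHEROO_PATH, "OFF") and check the return value before assuming the dGPU was powered down.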

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[pull] radeon drm-fixes-3.16

2014-07-17 Thread Alex Deucher
On Thu, Jul 17, 2014 at 8:50 PM, Dieter Nützel wrote:
> Am 17.07.2014 22:50, schrieb Alex Deucher:
>
>> Hi Dave,
>>
>> A few more fixes for 3.16.  The pageflipping fixes I dropped last week
>> have finally shaped up so this is mostly fixes for fallout from the
>> pageflipping code changes.  Also fix a memory leak and a black screen
>> when restoring the backlight on console unblanking.
>>
>> The following changes since commit
>> bf38b025d3f58f4c1273714ff1be5bfbf99574a4:
>>
>>   Merge branch 'drm-fixes-3.16' of
>> git://people.freedesktop.org/~agd5f/linux into drm-fixes (2014-07-11
>> 11:24:13 +1000)
>>
>> are available in the git repository at:
>>
>>
>>   git://people.freedesktop.org/~agd5f/linux drm-fixes-3.16
>>
>> for you to fetch changes up to 5f87e090a7368adc2290ae17ffd82a070caadd20:
>>
>>   drm/radeon: Make classic pageflip completion path less racy.
>> (2014-07-17 09:04:03 -0400)
>>
>> 
>> Alex Deucher (2):
>>   drm/radeon: avoid leaking edid data
>>   drm/radeon: set default bl level to something reasonable
>>
>> Mario Kleiner (4):
>>   drm/radeon: Prevent too early kms-pageflips triggered by vblank.
>>   drm/radeon: Remove redundant fence unref in pageflip path.
>>   drm/radeon: Add missing vblank_put in pageflip ioctl error path.
>>   drm/radeon: Make classic pageflip completion path less racy.
>>
>> Michel Dänzer (2):
>>   drm/radeon: Move pinning the BO back to radeon_crtc_page_flip()
>>   drm/radeon: Complete page flip even if waiting on the BO fence fails
>
>
> Hello Alex and Michel,
>
> isn't this needed any longer?
> drm-radeon-Disable-pflip-interrupts.patch

No it's not needed for 3.16.

Alex

>
> [-]
> From 462febcc7b5f6e54648f2fd941b9f90de16e54f1 Mon Sep 17 00:00:00 2001
> From: Michel Dänzer
> Date: Mon, 14 Jul 2014 06:42:06 +
> Subject: drm/radeon: Disable pflip interrupts
>
> With Mario's previous fix, there are three possible scenarios for the
> pflip interrupts:
>
> 1) If a pflip interrupt can occur before the corresponding vblank
>    interrupt, the sequence number of the userspace event will be too
>    small by one.
> 2) If a pflip interrupt can occur after the vblank interrupt and after
>    the next flip is programmed to the hardware, radeon_crtc_handle_flip()
>    will complete the next flip earlier than expected by userspace.
> 3) Otherwise, radeon_crtc_handle_flip() doesn't perform any actual work
>    when called from the pflip interrupt handler, i.e. the pflip interrupt
>    is useless.
>
> In summary, Mario's fix made the pflip interrupts useless in the best
> case and harmful in the worst case, so let's disable them.
>
> Signed-off-by: Michel Dänzer
> Signed-off-by: Alex Deucher 
> [-]
>
> Thanks,
>   Dieter


[Bug 74551] Unable to run linux with radeon.runpm=1

2014-07-17 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=74551

maxis11  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Unable to enable ACPI       |Unable to run linux with
                   |                            |radeon.runpm=1



[Bug 74551] Unable to enable ACPI

2014-07-17 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=74551

--- Comment #11 from maxis11  ---
Executing "echo OFF > /sys/kernel/debug/vgaswitcheroo/switch" freezes the OS on
every kernel (even on those where runpm and dpm work). Additionally, when the
OS freezes, it doesn't write any log (neither kern.log nor the Xorg log) and
doesn't execute any further commands (I tried appending "&& dmesg > info.log"
to the previous command, but the file wasn't created), so I just can't give you
any log. The problem still exists in 3.15.5 and in 3.16.0-4-generic.



[Bug 79980] Random radeonsi crashes

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79980

--- Comment #46 from Lukas Kahnert  ---
I tried to run piglit with all tests, and every time (I tried 3 times) I got a
black screen and the system hung. I don't know how to use piglit, so I can't
say on which test the GPU hangs.
It looks like the same bug that also appears randomly when watching
videos/flash.
I used the latest patch (v3).

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 79980] Random radeonsi crashes

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79980

--- Comment #45 from Aaron B  ---
Also, judging by my Xorg log, many problems seem to be happening along the
way.

http://pastebin.com/q3b8fEid



[PATCH 5/5] r600g,radeonsi: Prefer VRAM for persistent mappings

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer

Signed-off-by: Michel Dänzer
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index c8a0723..6f7fa29 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -125,12 +125,10 @@ bool r600_init_resource(struct r600_common_screen *rscreen,
break;
}

-   /* Use GTT for all persistent mappings, because they are
-* always cached and coherent. */
if (res->b.b.target == PIPE_BUFFER &&
res->b.b.flags & (PIPE_RESOURCE_FLAG_MAP_PERSISTENT |
  PIPE_RESOURCE_FLAG_MAP_COHERENT)) {
-   res->domains = RADEON_DOMAIN_GTT;
+   res->domains = RADEON_DOMAIN_VRAM;
flags = RADEON_FLAG_GTT_WC;
}

-- 
2.0.0
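
Taken together, patches 3/5 to 5/5 change where r600_init_resource() places a resource. A self-contained sketch of the resulting decision, under stated assumptions: all toy_* names and values are hypothetical stand-ins for the PIPE_USAGE_*, RADEON_DOMAIN_* and RADEON_FLAG_* identifiers in the diffs, and the VRAM default for non-staging usages is inferred from the "Not listing GTT here" comment because the assignment itself is outside the quoted hunks. The tiled-texture case is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the numeric values are made up. */
enum toy_usage {
    TOY_USAGE_DEFAULT,
    TOY_USAGE_IMMUTABLE,
    TOY_USAGE_DYNAMIC,
    TOY_USAGE_STREAM,
    TOY_USAGE_STAGING
};
enum toy_domain { TOY_DOMAIN_GTT = 1, TOY_DOMAIN_VRAM = 2 };
enum toy_flag { TOY_FLAG_NONE = 0, TOY_FLAG_GTT_WC = 1 };

struct toy_placement {
    enum toy_domain domain;
    enum toy_flag flags;
};

/* Model of the placement choice after the whole series is applied. */
struct toy_placement toy_choose_placement(enum toy_usage usage,
                                          bool is_buffer,
                                          bool persistent_or_coherent)
{
    /* Assumed default: VRAM only, no special CPU mapping flags. */
    struct toy_placement p = { TOY_DOMAIN_VRAM, TOY_FLAG_NONE };

    assert((int)usage >= TOY_USAGE_DEFAULT && (int)usage <= TOY_USAGE_STAGING);

    /* Patch 3/5: only staging resources remain in GTT; stream and
     * dynamic resources join the default group. */
    if (usage == TOY_USAGE_STAGING)
        p.domain = TOY_DOMAIN_GTT;

    /* Patches 4/5 and 5/5: persistent or coherent buffer mappings get the
     * write-combining flag and are placed in VRAM, overriding the above. */
    if (is_buffer && persistent_or_coherent) {
        p.domain = TOY_DOMAIN_VRAM;
        p.flags = TOY_FLAG_GTT_WC;
    }

    return p;
}
```

For example, toy_choose_placement(TOY_USAGE_STREAM, true, false) yields VRAM without flags, while any persistent buffer yields VRAM with the write-combining flag set.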



[PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer

This is hopefully safe: The kernel makes sure writes to these mappings
finish before the GPU might start reading from them, and the GPU caches
are invalidated at the start of a command stream.

Signed-off-by: Michel Dänzer
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index 40917f0..c8a0723 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -131,7 +131,7 @@ bool r600_init_resource(struct r600_common_screen *rscreen,
res->b.b.flags & (PIPE_RESOURCE_FLAG_MAP_PERSISTENT |
  PIPE_RESOURCE_FLAG_MAP_COHERENT)) {
res->domains = RADEON_DOMAIN_GTT;
-   flags = 0;
+   flags = RADEON_FLAG_GTT_WC;
}

/* Tiled textures are unmappable. Always put them in VRAM. */
-- 
2.0.0



[PATCH 3/5] r600g/radeonsi: Prefer VRAM for CPU -> GPU streaming buffers

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer

Signed-off-by: Michel Dänzer
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index 4e6b897..40917f0 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -110,15 +110,13 @@ bool r600_init_resource(struct r600_common_screen *rscreen,
enum radeon_bo_flag flags = 0;

switch (res->b.b.usage) {
-   case PIPE_USAGE_DYNAMIC:
-   case PIPE_USAGE_STREAM:
-   flags = RADEON_FLAG_GTT_WC;
-   /* fall through */
case PIPE_USAGE_STAGING:
/* Transfers are likely to occur more often with these resources. */
res->domains = RADEON_DOMAIN_GTT;
break;
case PIPE_USAGE_DEFAULT:
+   case PIPE_USAGE_STREAM:
+   case PIPE_USAGE_DYNAMIC:
case PIPE_USAGE_IMMUTABLE:
default:
/* Not listing GTT here improves performance in some apps. */
-- 
2.0.0



[PATCH 2/5] r600g/radeonsi: Use write-combined CPU mappings of some BOs in GTT

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer

Signed-off-by: Michel Dänzer
---
 src/gallium/drivers/r300/r300_query.c |  2 +-
 src/gallium/drivers/r300/r300_render.c|  2 +-
 src/gallium/drivers/r300/r300_screen_buffer.c |  4 ++--
 src/gallium/drivers/r300/r300_texture.c   |  2 +-
 src/gallium/drivers/radeon/r600_buffer_common.c   |  9 ++--
 src/gallium/drivers/radeon/r600_texture.c |  2 ++
 src/gallium/drivers/radeon/radeon_uvd.c   |  8 +---
 src/gallium/drivers/radeon/radeon_vce.c   |  8 
 src/gallium/drivers/radeon/radeon_video.c | 11 ++
 src/gallium/drivers/radeon/radeon_video.h |  4 +++-
 src/gallium/drivers/radeonsi/si_state.c   |  2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 25 +++
 src/gallium/winsys/radeon/drm/radeon_drm_bo.h |  1 +
 src/gallium/winsys/radeon/drm/radeon_drm_cs.c |  2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 12 +++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.h |  2 ++
 src/gallium/winsys/radeon/drm/radeon_winsys.h |  7 ++-
 17 files changed, 77 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/r300/r300_query.c b/src/gallium/drivers/r300/r300_query.c
index 5305ebd..1679433 100644
--- a/src/gallium/drivers/r300/r300_query.c
+++ b/src/gallium/drivers/r300/r300_query.c
@@ -59,7 +59,7 @@ static struct pipe_query *r300_create_query(struct pipe_context *pipe,
 q->num_pipes = r300screen->info.r300_num_gb_pipes;

 q->buf = r300->rws->buffer_create(r300->rws, 4096, 4096, TRUE,
-  RADEON_DOMAIN_GTT);
+  RADEON_DOMAIN_GTT, 0);
 if (!q->buf) {
 FREE(q);
 return NULL;
diff --git a/src/gallium/drivers/r300/r300_render.c b/src/gallium/drivers/r300/r300_render.c
index 175b83a..6e5b381 100644
--- a/src/gallium/drivers/r300/r300_render.c
+++ b/src/gallium/drivers/r300/r300_render.c
@@ -907,7 +907,7 @@ static boolean r300_render_allocate_vertices(struct vbuf_render* render,
 r300->vbo = rws->buffer_create(rws,
MAX2(R300_MAX_DRAW_VBO_SIZE, size),
R300_BUFFER_ALIGNMENT, TRUE,
-   RADEON_DOMAIN_GTT);
+   RADEON_DOMAIN_GTT, 0);
 if (!r300->vbo) {
 return FALSE;
 }
diff --git a/src/gallium/drivers/r300/r300_screen_buffer.c b/src/gallium/drivers/r300/r300_screen_buffer.c
index 86e4478..de557b5 100644
--- a/src/gallium/drivers/r300/r300_screen_buffer.c
+++ b/src/gallium/drivers/r300/r300_screen_buffer.c
@@ -103,7 +103,7 @@ r300_buffer_transfer_map( struct pipe_context *context,
 /* Create a new one in the same pipe_resource. */
 new_buf = r300->rws->buffer_create(r300->rws, rbuf->b.b.width0,
R300_BUFFER_ALIGNMENT, TRUE,
-   rbuf->domain);
+   rbuf->domain, 0);
 if (new_buf) {
 /* Discard the old buffer. */
 pb_reference(>buf, NULL);
@@ -185,7 +185,7 @@ struct pipe_resource *r300_buffer_create(struct pipe_screen *screen,
 rbuf->buf =
 r300screen->rws->buffer_create(r300screen->rws, rbuf->b.b.width0,
R300_BUFFER_ALIGNMENT, TRUE,
-   rbuf->domain);
+   rbuf->domain, 0);
 if (!rbuf->buf) {
 FREE(rbuf);
 return NULL;
diff --git a/src/gallium/drivers/r300/r300_texture.c b/src/gallium/drivers/r300/r300_texture.c
index 4ea69dc..ffe8c00 100644
--- a/src/gallium/drivers/r300/r300_texture.c
+++ b/src/gallium/drivers/r300/r300_texture.c
@@ -1042,7 +1042,7 @@ r300_texture_create_object(struct r300_screen *rscreen,
 /* Create the backing buffer if needed. */
 if (!tex->buf) {
 tex->buf = rws->buffer_create(rws, tex->tex.size_in_bytes, 2048, TRUE,
-  tex->domain);
+  tex->domain, 0);

 if (!tex->buf) {
 goto fail;
diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index 0eaa817..4e6b897 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -107,11 +107,14 @@ bool r600_init_resource(struct r600_common_screen *rscreen,
 {
struct r600_texture *rtex = (struct r600_texture*)res;
struct pb_buffer *old_buf, *new_buf;
+   enum radeon_bo_flag flags = 0;

switch (res->b.b.usage) {
-   case PIPE_USAGE_STAGING:
case PIPE_USAGE_DYNAMIC:
case PIPE_USAGE_STREAM:
+   flags = RADEON_FLAG_GTT_WC;
+   /* fall 

[PATCH 1/5] winsys/radeon: Use separate caching buffer managers for VRAM and GTT

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer

Should reduce overhead because the caching buffer manager doesn't need to
consider buffers of the wrong type.

Signed-off-by: Michel Dänzer
---
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 10 +++---
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 16 +++-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.h |  3 ++-
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index 0ebe196..d06bb34 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -800,10 +800,14 @@ radeon_winsys_bo_create(struct radeon_winsys *rws,
 desc.initial_domains = domain;

 /* Assign a buffer manager. */
-if (use_reusable_pool)
-provider = ws->cman;
-else
+if (use_reusable_pool) {
+if (domain == RADEON_DOMAIN_VRAM)
+provider = ws->cman_vram;
+else
+provider = ws->cman_gtt;
+} else {
 provider = ws->kman;
+}

 buffer = provider->create_buffer(provider, size, );
 if (!buffer)
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 576fea5..0834cbd 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -417,7 +417,8 @@ static void radeon_winsys_destroy(struct radeon_winsys *rws)
 pipe_mutex_destroy(ws->cmask_owner_mutex);
 pipe_mutex_destroy(ws->cs_stack_lock);

-ws->cman->destroy(ws->cman);
+ws->cman_vram->destroy(ws->cman_vram);
+ws->cman_gtt->destroy(ws->cman_gtt);
 ws->kman->destroy(ws->kman);
 if (ws->gen >= DRV_R600) {
 radeon_surface_manager_free(ws->surf_man);
@@ -632,8 +633,11 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t screen_create)
 ws->kman = radeon_bomgr_create(ws);
 if (!ws->kman)
 goto fail;
-ws->cman = pb_cache_manager_create(ws->kman, 100, 2.0f, 0);
-if (!ws->cman)
+ws->cman_vram = pb_cache_manager_create(ws->kman, 100, 2.0f, 0);
+if (!ws->cman_vram)
+goto fail;
+ws->cman_gtt = pb_cache_manager_create(ws->kman, 100, 2.0f, 0);
+if (!ws->cman_gtt)
 goto fail;

 if (ws->gen >= DRV_R600) {
@@ -689,8 +693,10 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t screen_create)

 fail:
 pipe_mutex_unlock(fd_tab_mutex);
-if (ws->cman)
-ws->cman->destroy(ws->cman);
+if (ws->cman_gtt)
+ws->cman_gtt->destroy(ws->cman_gtt);
+if (ws->cman_vram)
+ws->cman_vram->destroy(ws->cman_vram);
 if (ws->kman)
 ws->kman->destroy(ws->kman);
 if (ws->surf_man)
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.h b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.h
index 18fe0ae..fc6f53b 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.h
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.h
@@ -57,7 +57,8 @@ struct radeon_drm_winsys {
 uint32_t va_start;

 struct pb_manager *kman;
-struct pb_manager *cman;
+struct pb_manager *cman_vram;
+struct pb_manager *cman_gtt;
 struct radeon_surface_manager *surf_man;

 uint32_t num_cpus;  /* Number of CPUs. */
-- 
2.0.0
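
The commit message's argument (a cache lookup no longer has to step over buffers of the wrong type) can be illustrated with a toy model. This is only a sketch under assumptions: the cm_* names, the fixed-size array and the probe counter are hypothetical and do not reflect how pb_cache_manager is implemented; only the idea of one cache per domain is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define CM_SLOTS 8

enum cm_domain { CM_DOMAIN_GTT, CM_DOMAIN_VRAM };

/* Toy cache of idle buffers, identified only by their size. */
struct cm_cache {
    size_t sizes[CM_SLOTS];
    size_t count;
    size_t probes;   /* number of entries inspected by lookups so far */
};

/* One cache per domain, mirroring cman_vram and cman_gtt in the patch. */
struct cm_winsys {
    struct cm_cache cman_vram;
    struct cm_cache cman_gtt;
};

static struct cm_cache *cm_cache_for(struct cm_winsys *ws, enum cm_domain d)
{
    return d == CM_DOMAIN_VRAM ? &ws->cman_vram : &ws->cman_gtt;
}

/* Return an idle buffer to the cache of its domain; -1 if the cache is full. */
int cm_put(struct cm_winsys *ws, enum cm_domain d, size_t size)
{
    struct cm_cache *c = cm_cache_for(ws, d);

    if (c->count == CM_SLOTS)
        return -1;
    c->sizes[c->count++] = size;
    return 0;
}

/* Take an idle buffer of at least `size` bytes; 1 if found, 0 otherwise.
 * Only entries of the requested domain are ever inspected. */
int cm_take(struct cm_winsys *ws, enum cm_domain d, size_t size)
{
    struct cm_cache *c = cm_cache_for(ws, d);

    assert(size > 0);
    for (size_t i = 0; i < c->count; i++) {
        c->probes++;
        if (c->sizes[i] >= size) {
            /* Remove the entry by moving the last one into its slot. */
            c->sizes[i] = c->sizes[--c->count];
            return 1;
        }
    }
    return 0;
}
```

With three idle GTT buffers and one idle VRAM buffer cached, a VRAM request finds its match after a single probe; in a shared cache it might have inspected the GTT entries first.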



[PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer

Signed-off-by: Michel Dänzer
---
 drivers/gpu/drm/radeon/cik.c | 3 +++
 drivers/gpu/drm/radeon/cik_sdma.c| 2 ++
 drivers/gpu/drm/radeon/ni.c  | 3 +++
 drivers/gpu/drm/radeon/ni_dma.c  | 2 ++
 drivers/gpu/drm/radeon/radeon_ring.c | 2 +-
 5 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index df39095..8af5c9a 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -3846,6 +3846,9 @@ void cik_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib)
  (ib->gpu_addr & 0xFFFC));
radeon_ring_write(ring, upper_32_bits(ib->gpu_addr) & 0x);
radeon_ring_write(ring, control);
+
+   /* Flush HDP cache */
+   WREG32(HDP_MEM_COHERENCY_FLUSH_CNTL, 0);
 }

 /**
diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c
index 3396b28..2ab873d 100644
--- a/drivers/gpu/drm/radeon/cik_sdma.c
+++ b/drivers/gpu/drm/radeon/cik_sdma.c
@@ -158,6 +158,8 @@ void cik_sdma_ring_ib_execute(struct radeon_device *rdev,
radeon_ring_write(ring, upper_32_bits(ib->gpu_addr));
radeon_ring_write(ring, ib->length_dw);

+   /* Flush HDP cache */
+   WREG32(HDP_MEM_COHERENCY_FLUSH_CNTL, 0);
 }

 /**
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index b589fe7..ea58e5b 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1397,6 +1397,9 @@ void cayman_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib)
radeon_ring_write(ring, 0x);
radeon_ring_write(ring, 0);
radeon_ring_write(ring, ((ib->vm ? ib->vm->id : 0) << 24) | 10); /* poll interval */
+
+   /* Flush HDP cache (for SI) */
+   WREG32(HDP_MEM_COHERENCY_FLUSH_CNTL, 0x1);
 }

 static void cayman_cp_enable(struct radeon_device *rdev, bool enable)
diff --git a/drivers/gpu/drm/radeon/ni_dma.c b/drivers/gpu/drm/radeon/ni_dma.c
index 119fc69..0e575ea 100644
--- a/drivers/gpu/drm/radeon/ni_dma.c
+++ b/drivers/gpu/drm/radeon/ni_dma.c
@@ -148,6 +148,8 @@ void cayman_dma_ring_ib_execute(struct radeon_device *rdev,
radeon_ring_write(ring, (ib->gpu_addr & 0xFFE0));
radeon_ring_write(ring, (ib->length_dw << 12) | (upper_32_bits(ib->gpu_addr) & 0xFF));

+   /* Flush HDP cache (for SI) */
+   WREG32(HDP_MEM_COHERENCY_FLUSH_CNTL, 0x1);
 }

 /**
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 62e9e57..31ac4fd 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -206,7 +206,7 @@ int radeon_ib_pool_init(struct radeon_device *rdev)
r = radeon_sa_bo_manager_init(rdev, >ring_tmp_bo,
  RADEON_IB_POOL_SIZE*64*1024,
  RADEON_GPU_PAGE_SIZE,
- RADEON_GEM_DOMAIN_GTT,
+ RADEON_GEM_DOMAIN_VRAM,
  RADEON_GEM_GTT_WC);
} else {
/* Without GPUVM, it's better to stick to cacheable GTT due
-- 
2.0.0



[PATCH 4/5] drm/radeon: Use write-combined CPU mappings of rings and IBs on >= SI

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer

Signed-off-by: Michel Dänzer
---
 drivers/gpu/drm/radeon/cik.c |  3 +++
 drivers/gpu/drm/radeon/cik_sdma.c|  4 
 drivers/gpu/drm/radeon/ni.c  |  3 +++
 drivers/gpu/drm/radeon/ni_dma.c  |  4 
 drivers/gpu/drm/radeon/radeon_ring.c | 22 +-
 5 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index a9fd3e7..df39095 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4181,6 +4181,9 @@ u32 cik_gfx_get_wptr(struct radeon_device *rdev,
 void cik_gfx_set_wptr(struct radeon_device *rdev,
  struct radeon_ring *ring)
 {
+   /* Make IB/ring buffer writes land before the WPTR register write */
+   wmb();
+
WREG32(CP_RB0_WPTR, ring->wptr);
(void)RREG32(CP_RB0_WPTR);
 }
diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c
index a7f66c8..3396b28 100644
--- a/drivers/gpu/drm/radeon/cik_sdma.c
+++ b/drivers/gpu/drm/radeon/cik_sdma.c
@@ -112,12 +112,16 @@ void cik_sdma_set_wptr(struct radeon_device *rdev,
 {
u32 reg;

+   /* Make IB/ring buffer writes land before the WPTR register write */
+   wmb();
+
if (ring->idx == R600_RING_TYPE_DMA_INDEX)
reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET;
else
reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET;

WREG32(reg, (ring->wptr << 2) & 0x3fffc);
+   (void)RREG32(reg);
 }

 /**
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 327b85f..b589fe7 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1449,6 +1449,9 @@ u32 cayman_gfx_get_wptr(struct radeon_device *rdev,
 void cayman_gfx_set_wptr(struct radeon_device *rdev,
 struct radeon_ring *ring)
 {
+   /* Make IB/ring buffer writes land before the WPTR register write */
+   wmb();
+
if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) {
WREG32(CP_RB0_WPTR, ring->wptr);
(void)RREG32(CP_RB0_WPTR);
diff --git a/drivers/gpu/drm/radeon/ni_dma.c b/drivers/gpu/drm/radeon/ni_dma.c
index 6378e02..119fc69 100644
--- a/drivers/gpu/drm/radeon/ni_dma.c
+++ b/drivers/gpu/drm/radeon/ni_dma.c
@@ -103,12 +103,16 @@ void cayman_dma_set_wptr(struct radeon_device *rdev,
 {
u32 reg;

+   /* Make IB/ring buffer writes land before the WPTR register write */
+   wmb();
+
if (ring->idx == R600_RING_TYPE_DMA_INDEX)
reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET;
else
reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET;

WREG32(reg, (ring->wptr << 2) & 0x3fffc);
+   (void)RREG32(reg);
 }

 /**
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 71439f0..62e9e57 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -201,10 +201,22 @@ int radeon_ib_pool_init(struct radeon_device *rdev)
if (rdev->ib_pool_ready) {
return 0;
}
-   r = radeon_sa_bo_manager_init(rdev, >ring_tmp_bo,
- RADEON_IB_POOL_SIZE*64*1024,
- RADEON_GPU_PAGE_SIZE,
- RADEON_GEM_DOMAIN_GTT, 0);
+
+   if (rdev->family >= CHIP_TAHITI) {
+   r = radeon_sa_bo_manager_init(rdev, >ring_tmp_bo,
+ RADEON_IB_POOL_SIZE*64*1024,
+ RADEON_GPU_PAGE_SIZE,
+ RADEON_GEM_DOMAIN_GTT,
+ RADEON_GEM_GTT_WC);
+   } else {
+   /* Without GPUVM, it's better to stick to cacheable GTT due
+* to the command stream patching
+*/
+   r = radeon_sa_bo_manager_init(rdev, >ring_tmp_bo,
+ RADEON_IB_POOL_SIZE*64*1024,
+ RADEON_GPU_PAGE_SIZE,
+ RADEON_GEM_DOMAIN_GTT, 0);
+   }
if (r) {
return r;
}
@@ -640,7 +652,7 @@ int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, unsig
/* Allocate ring buffer */
if (ring->ring_obj == NULL) {
r = radeon_bo_create(rdev, ring->ring_size, PAGE_SIZE, true,
-RADEON_GEM_DOMAIN_GTT, 0,
+RADEON_GEM_DOMAIN_GTT, RADEON_GEM_GTT_WC,
 NULL, >ring_obj);
if (r) {
dev_err(rdev->dev, "(%d) ring create failed\n", r);
-- 
2.0.0



[PATCH 3/5] drm/radeon: Allow write-combined CPU mappings of BOs in GTT

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer

Signed-off-by: Michel Dänzer
---
 drivers/gpu/drm/radeon/cik.c  |  4 ++--
 drivers/gpu/drm/radeon/cik_sdma.c |  3 ++-
 drivers/gpu/drm/radeon/evergreen.c| 12 
 drivers/gpu/drm/radeon/r600.c |  4 ++--
 drivers/gpu/drm/radeon/radeon.h   |  3 ++-
 drivers/gpu/drm/radeon/radeon_benchmark.c |  4 ++--
 drivers/gpu/drm/radeon/radeon_device.c|  3 ++-
 drivers/gpu/drm/radeon/radeon_fb.c|  2 +-
 drivers/gpu/drm/radeon/radeon_gart.c  |  2 +-
 drivers/gpu/drm/radeon/radeon_gem.c   | 16 ++--
 drivers/gpu/drm/radeon/radeon_object.c| 24 +++-
 drivers/gpu/drm/radeon/radeon_object.h|  5 +++--
 drivers/gpu/drm/radeon/radeon_prime.c |  2 +-
 drivers/gpu/drm/radeon/radeon_ring.c  |  4 ++--
 drivers/gpu/drm/radeon/radeon_sa.c|  4 ++--
 drivers/gpu/drm/radeon/radeon_test.c  |  4 ++--
 drivers/gpu/drm/radeon/radeon_ttm.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_uvd.c   |  6 +++---
 drivers/gpu/drm/radeon/radeon_vce.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_vm.c|  8 ++--
 drivers/gpu/drm/radeon/si_dma.c   |  3 ++-
 21 files changed, 70 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 1b0da66..a9fd3e7 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4374,7 +4374,7 @@ static int cik_mec_init(struct radeon_device *rdev)
r = radeon_bo_create(rdev,
 rdev->mec.num_mec *rdev->mec.num_pipe * 
MEC_HPD_SIZE * 2,
 PAGE_SIZE, true,
-RADEON_GEM_DOMAIN_GTT, NULL,
+RADEON_GEM_DOMAIN_GTT, 0, NULL,
 >mec.hpd_eop_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create HDP EOP bo failed\n", 
r);
@@ -4544,7 +4544,7 @@ static int cik_cp_compute_resume(struct radeon_device 
*rdev)
r = radeon_bo_create(rdev,
 sizeof(struct bonaire_mqd),
 PAGE_SIZE, true,
-RADEON_GEM_DOMAIN_GTT, NULL,
+RADEON_GEM_DOMAIN_GTT, 0, NULL,
 >ring[idx].mqd_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create MQD bo 
failed\n", r);
diff --git a/drivers/gpu/drm/radeon/cik_sdma.c 
b/drivers/gpu/drm/radeon/cik_sdma.c
index 8e9d0f1..a7f66c8 100644
--- a/drivers/gpu/drm/radeon/cik_sdma.c
+++ b/drivers/gpu/drm/radeon/cik_sdma.c
@@ -742,7 +742,8 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev,

trace_radeon_vm_set_page(pe, addr, count, incr, flags);

-   if (flags == R600_PTE_GART) {
+   /* XXX: How to distinguish between GART and other system memory pages? 
*/
+   if (flags & R600_PTE_SYSTEM) {
uint64_t src = rdev->gart.table_addr + (addr >> 12) * 8;
while (count) {
unsigned bytes = count * 8;
diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index 39ada71..902334f 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -4022,7 +4022,8 @@ int sumo_rlc_init(struct radeon_device *rdev)
/* save restore block */
if (rdev->rlc.save_restore_obj == NULL) {
r = radeon_bo_create(rdev, dws * 4, PAGE_SIZE, true,
-RADEON_GEM_DOMAIN_VRAM, NULL, 
>rlc.save_restore_obj);
+RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+>rlc.save_restore_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create RLC sr bo 
failed\n", r);
return r;
@@ -4100,7 +4101,8 @@ int sumo_rlc_init(struct radeon_device *rdev)

if (rdev->rlc.clear_state_obj == NULL) {
r = radeon_bo_create(rdev, dws * 4, PAGE_SIZE, true,
-RADEON_GEM_DOMAIN_VRAM, NULL, 
>rlc.clear_state_obj);
+RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+>rlc.clear_state_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create RLC c bo 
failed\n", r);
sumo_rlc_fini(rdev);
@@ -4174,8 +4176,10 @@ int sumo_rlc_init(struct radeon_device *rdev)

if (rdev->rlc.cp_table_size) {
if (rdev->rlc.cp_table_obj == NULL) {
-   r = radeon_bo_create(rdev, rdev->rlc.cp_table_size, 

[PATCH 2/5] drm/radeon: Pass GART page flags to radeon_gart_set_page() explicitly

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer 

Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/radeon/r100.c|  2 +-
 drivers/gpu/drm/radeon/r300.c| 12 +---
 drivers/gpu/drm/radeon/radeon.h  | 12 +---
 drivers/gpu/drm/radeon/radeon_asic.h |  8 
 drivers/gpu/drm/radeon/radeon_gart.c |  9 ++---
 drivers/gpu/drm/radeon/radeon_ttm.c  |  8 ++--
 drivers/gpu/drm/radeon/rs400.c   | 13 ++---
 drivers/gpu/drm/radeon/rs600.c   | 16 +++-
 include/uapi/drm/radeon_drm.h|  4 +++-
 9 files changed, 59 insertions(+), 25 deletions(-)
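
As an illustration of what the explicit flags argument enables, here is a minimal stand-alone sketch of the flag translation done in the rv370_pcie_gart_set_page() hunk below. The flag and PTE bit values are copied from the radeon.h and r300.c hunks of this patch; the helper name r300_pte_encode() and the 4 KiB alignment check are assumptions made for the example, and nothing here touches hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Generic GART page flags, as defined in the radeon.h hunk. */
#define RADEON_GART_PAGE_DUMMY  0
#define RADEON_GART_PAGE_VALID  (1 << 0)
#define RADEON_GART_PAGE_READ   (1 << 1)
#define RADEON_GART_PAGE_WRITE  (1 << 2)
#define RADEON_GART_PAGE_SNOOP  (1 << 3)

/* R300 PTE bits, as defined in the r300.c hunk. */
#define R300_PTE_UNSNOOPED (1 << 0)
#define R300_PTE_WRITEABLE (1 << 2)
#define R300_PTE_READABLE  (1 << 3)

/* Hypothetical helper: computes the PTE value only. */
static uint32_t r300_pte_encode(uint64_t addr, uint32_t flags)
{
	uint32_t lo = (uint32_t)addr;
	uint32_t hi = (uint32_t)(addr >> 32);
	uint32_t pte;

	/* Assumption: pages are 4 KiB aligned, so the low four bits of
	 * (lo >> 8) are free to carry the flag bits.
	 */
	assert((addr & 0xfff) == 0);

	pte = (lo >> 8) | ((hi & 0xff) << 24);
	if (flags & RADEON_GART_PAGE_READ)
		pte |= R300_PTE_READABLE;
	if (flags & RADEON_GART_PAGE_WRITE)
		pte |= R300_PTE_WRITEABLE;
	if (!(flags & RADEON_GART_PAGE_SNOOP))
		pte |= R300_PTE_UNSNOOPED;
	return pte;
}
```

Note that the hardware bit is inverted relative to the generic flag: a page is marked unsnooped when RADEON_GART_PAGE_SNOOP is absent.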

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index ed1c53e..9241b89 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -682,7 +682,7 @@ void r100_pci_gart_disable(struct radeon_device *rdev)
 }

 void r100_pci_gart_set_page(struct radeon_device *rdev, unsigned i,
-   uint64_t addr)
+   uint64_t addr, uint32_t flags)
 {
u32 *gtt = rdev->gart.ptr;
gtt[i] = cpu_to_le32(lower_32_bits(addr));
diff --git a/drivers/gpu/drm/radeon/r300.c b/drivers/gpu/drm/radeon/r300.c
index 8d14e66..75b3033 100644
--- a/drivers/gpu/drm/radeon/r300.c
+++ b/drivers/gpu/drm/radeon/r300.c
@@ -69,17 +69,23 @@ void rv370_pcie_gart_tlb_flush(struct radeon_device *rdev)
mb();
 }

+#define R300_PTE_UNSNOOPED (1 << 0)
 #define R300_PTE_WRITEABLE (1 << 2)
 #define R300_PTE_READABLE  (1 << 3)

 void rv370_pcie_gart_set_page(struct radeon_device *rdev, unsigned i,
- uint64_t addr)
+ uint64_t addr, uint32_t flags)
 {
void __iomem *ptr = rdev->gart.ptr;

addr = (lower_32_bits(addr) >> 8) |
-  ((upper_32_bits(addr) & 0xff) << 24) |
-  R300_PTE_WRITEABLE | R300_PTE_READABLE;
+   ((upper_32_bits(addr) & 0xff) << 24);
+   if (flags & RADEON_GART_PAGE_READ)
+   addr |= R300_PTE_READABLE;
+   if (flags & RADEON_GART_PAGE_WRITE)
+   addr |= R300_PTE_WRITEABLE;
+   if (!(flags & RADEON_GART_PAGE_SNOOP))
+   addr |= R300_PTE_UNSNOOPED;
/* on x86 we want this to be CPU endian, on powerpc
 * on powerpc without HW swappers, it'll get swapped on way
 * into VRAM - so no need for cpu_to_le32 on VRAM tables */
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index f4869b4..4dd092e 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -589,6 +589,12 @@ struct radeon_mc;
 #define RADEON_GPU_PAGE_SHIFT 12
 #define RADEON_GPU_PAGE_ALIGN(a) (((a) + RADEON_GPU_PAGE_MASK) & 
~RADEON_GPU_PAGE_MASK)

+#define RADEON_GART_PAGE_DUMMY  0
+#define RADEON_GART_PAGE_VALID (1 << 0)
+#define RADEON_GART_PAGE_READ  (1 << 1)
+#define RADEON_GART_PAGE_WRITE (1 << 2)
+#define RADEON_GART_PAGE_SNOOP (1 << 3)
+
 struct radeon_gart {
dma_addr_t  table_addr;
struct radeon_bo*robj;
@@ -613,7 +619,7 @@ void radeon_gart_unbind(struct radeon_device *rdev, 
unsigned offset,
int pages);
 int radeon_gart_bind(struct radeon_device *rdev, unsigned offset,
 int pages, struct page **pagelist,
-dma_addr_t *dma_addr);
+dma_addr_t *dma_addr, uint32_t flags);


 /*
@@ -1775,7 +1781,7 @@ struct radeon_asic {
struct {
void (*tlb_flush)(struct radeon_device *rdev);
void (*set_page)(struct radeon_device *rdev, unsigned i,
-uint64_t addr);
+uint64_t addr, uint32_t flags);
} gart;
struct {
int (*init)(struct radeon_device *rdev);
@@ -2702,7 +2708,7 @@ void radeon_ring_write(struct radeon_ring *ring, uint32_t 
v);
 #define radeon_vga_set_state(rdev, state) (rdev)->asic->vga_set_state((rdev), 
(state))
 #define radeon_asic_reset(rdev) (rdev)->asic->asic_reset((rdev))
 #define radeon_gart_tlb_flush(rdev) (rdev)->asic->gart.tlb_flush((rdev))
-#define radeon_gart_set_page(rdev, i, p) (rdev)->asic->gart.set_page((rdev), 
(i), (p))
+#define radeon_gart_set_page(rdev, i, p, f) 
(rdev)->asic->gart.set_page((rdev), (i), (p), (f))
 #define radeon_asic_vm_init(rdev) (rdev)->asic->vm.init((rdev))
 #define radeon_asic_vm_fini(rdev) (rdev)->asic->vm.fini((rdev))
 #define radeon_asic_vm_set_page(rdev, ib, pe, addr, count, incr, flags) 
((rdev)->asic->vm.set_page((rdev), (ib), (pe), (addr), (count), (incr), 
(flags)))
diff --git a/drivers/gpu/drm/radeon/radeon_asic.h 
b/drivers/gpu/drm/radeon/radeon_asic.h
index 01e7c0a..f632e31 100644
--- a/drivers/gpu/drm/radeon/radeon_asic.h
+++ b/drivers/gpu/drm/radeon/radeon_asic.h
@@ -68,7 +68,7 @@ int r100_asic_reset(struct radeon_device *rdev);
 u32 r100_get_vblank_counter(struct radeon_device *rdev, int crtc);
 void 

[PATCH 1/5] drm/radeon: Remove radeon_gart_restore()

2014-07-17 Thread Michel Dänzer
From: Michel Dänzer 

Doesn't seem necessary, the GART table memory should be persistent.

Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/radeon/cik.c |  1 -
 drivers/gpu/drm/radeon/evergreen.c   |  1 -
 drivers/gpu/drm/radeon/ni.c  |  1 -
 drivers/gpu/drm/radeon/r100.c|  1 -
 drivers/gpu/drm/radeon/r300.c|  1 -
 drivers/gpu/drm/radeon/r600.c|  1 -
 drivers/gpu/drm/radeon/radeon.h  |  1 -
 drivers/gpu/drm/radeon/radeon_gart.c | 27 ---
 drivers/gpu/drm/radeon/rs400.c   |  1 -
 drivers/gpu/drm/radeon/rs600.c   |  1 -
 drivers/gpu/drm/radeon/rv770.c   |  1 -
 drivers/gpu/drm/radeon/si.c  |  1 -
 12 files changed, 38 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 0b24711..1b0da66 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -5401,7 +5401,6 @@ static int cik_pcie_gart_enable(struct radeon_device 
*rdev)
r = radeon_gart_table_vram_pin(rdev);
if (r)
return r;
-   radeon_gart_restore(rdev);
/* Setup TLB control */
WREG32(MC_VM_MX_L1_TLB_CNTL,
   (0xA << 7) |
diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index 250bac3..39ada71 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -2424,7 +2424,6 @@ static int evergreen_pcie_gart_enable(struct 
radeon_device *rdev)
r = radeon_gart_table_vram_pin(rdev);
if (r)
return r;
-   radeon_gart_restore(rdev);
/* Setup L2 cache */
WREG32(VM_L2_CNTL, ENABLE_L2_CACHE | ENABLE_L2_FRAGMENT_PROCESSING |
ENABLE_L2_PTE_CACHE_LRU_UPDATE_BY_WRITE |
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 5a33ca6..327b85f 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1229,7 +1229,6 @@ static int cayman_pcie_gart_enable(struct radeon_device 
*rdev)
r = radeon_gart_table_vram_pin(rdev);
if (r)
return r;
-   radeon_gart_restore(rdev);
/* Setup TLB control */
WREG32(MC_VM_MX_L1_TLB_CNTL,
   (0xA << 7) |
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 1544efc..ed1c53e 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -652,7 +652,6 @@ int r100_pci_gart_enable(struct radeon_device *rdev)
 {
uint32_t tmp;

-   radeon_gart_restore(rdev);
/* discard memory request outside of configured range */
tmp = RREG32(RADEON_AIC_CNTL) | RADEON_DIS_OUT_OF_PCI_GART_ACCESS;
WREG32(RADEON_AIC_CNTL, tmp);
diff --git a/drivers/gpu/drm/radeon/r300.c b/drivers/gpu/drm/radeon/r300.c
index 3c21d77..8d14e66 100644
--- a/drivers/gpu/drm/radeon/r300.c
+++ b/drivers/gpu/drm/radeon/r300.c
@@ -120,7 +120,6 @@ int rv370_pcie_gart_enable(struct radeon_device *rdev)
r = radeon_gart_table_vram_pin(rdev);
if (r)
return r;
-   radeon_gart_restore(rdev);
/* discard memory request outside of configured range */
tmp = RADEON_PCIE_TX_GART_UNMAPPED_ACCESS_DISCARD;
WREG32_PCIE(RADEON_PCIE_TX_GART_CNTL, tmp);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index c66952d..e1be5ce 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -968,7 +968,6 @@ static int r600_pcie_gart_enable(struct radeon_device *rdev)
r = radeon_gart_table_vram_pin(rdev);
if (r)
return r;
-   radeon_gart_restore(rdev);

/* Setup L2 cache */
WREG32(VM_L2_CNTL, ENABLE_L2_CACHE | ENABLE_L2_FRAGMENT_PROCESSING |
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 079eac7..f4869b4 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -614,7 +614,6 @@ void radeon_gart_unbind(struct radeon_device *rdev, 
unsigned offset,
 int radeon_gart_bind(struct radeon_device *rdev, unsigned offset,
 int pages, struct page **pagelist,
 dma_addr_t *dma_addr);
-void radeon_gart_restore(struct radeon_device *rdev);


 /*
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c 
b/drivers/gpu/drm/radeon/radeon_gart.c
index 2e72365..b7d3e84 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -298,33 +298,6 @@ int radeon_gart_bind(struct radeon_device *rdev, unsigned 
offset,
 }

 /**
- * radeon_gart_restore - bind all pages in the gart page table
- *
- * @rdev: radeon_device pointer
- *
- * Binds all pages in the gart page table (all asics).
- * Used to rebuild the gart table on device startup or resume.
- */
-void radeon_gart_restore(struct radeon_device *rdev)
-{
-   int i, j, t;
-   u64 page_base;
-
-   if (!rdev->gart.ptr) {
- 

[PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT

2014-07-17 Thread Michel Dänzer
In order to try and improve X(Shm)PutImage performance with glamor, I
implemented support for write-combined CPU mappings of BOs in GTT.

This did provide a nice speedup, but to my surprise, using VRAM instead
of write-combined GTT turned out to be even faster in general on my
Kaveri machine, both for the internal GPU and for discrete GPUs.

However, I've kept the changes from GTT to VRAM separated, in case this
turns out to be a loss on other setups.

Kernel patches:

[PATCH 1/5] drm/radeon: Remove radeon_gart_restore()
[PATCH 2/5] drm/radeon: Pass GART page flags to
[PATCH 3/5] drm/radeon: Allow write-combined CPU mappings of BOs in
[PATCH 4/5] drm/radeon: Use write-combined CPU mappings of rings and
[PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI

Mesa patches:

[PATCH 1/5] winsys/radeon: Use separate caching buffer managers for
[PATCH 2/5] r600g/radeonsi: Use write-combined CPU mappings of some
[PATCH 3/5] r600g/radeonsi: Prefer VRAM for CPU -> GPU streaming
[PATCH 4/5] r600g,radeonsi: Use write-combined persistent GTT
[PATCH 5/5] r600g,radeonsi: Prefer VRAM for persistent mappings


[Bug 79980] Random radeonsi crashes

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79980

--- Comment #44 from Aaron B  ---
(In reply to comment #42)
> Created attachment 102992 [details] [review]
> Possible fix v3.
> 
> Updated and largely simplified patch.
> 
> I'm running the third piglit test with it now and so far the system seems to
> be stable.

Just had a crash happen, was opening a Yahoo page. Very normal to crash on it
TBH from the old version too, but it shows that this patch may only delay the
problem, not be an actual fix. I don't really know what to say about it, same
old same old. :/

http://pastebin.com/VXAb5k17

-- 
You are receiving this mail because:
You are the assignee for the bug.


[PATCH] drm/radeon: remove visible vram size limit on bo allocation

2014-07-17 Thread Christian König
On 17.07.2014 18:29, Alex Deucher wrote:
> On Thu, Jul 17, 2014 at 10:28 AM, Christian König
> wrote:
>> On 17.07.2014 06:02, Michel Dänzer wrote:
>>
>>> On 17.07.2014 02:26, Alex Deucher wrote:
>>>> Now that fallback to gtt is fixed for cpu access, we can
>>>> remove this limit.
>>>>
>>>> Signed-off-by: Alex Deucher
>>>> ---
>>>>  drivers/gpu/drm/radeon/radeon_gem.c | 7 +--
>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c
>>>> b/drivers/gpu/drm/radeon/radeon_gem.c
>>>> index fdd189b..07a13c9 100644
>>>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
>>>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
>>>> @@ -55,8 +55,11 @@ int radeon_gem_object_create(struct radeon_device
>>>> *rdev, int size,
>>>>   alignment = PAGE_SIZE;
>>>>   }
>>>> -   /* maximun bo size is the minimun btw visible vram and gtt size
>>>> */
>>>> -   max_size = min(rdev->mc.visible_vram_size, rdev->mc.gtt_size);
>>>> +   /* Maximum bo size is the gtt size since we use the gtt to handle
>>>> +* vram to system pool migrations.  We could probably remove this
>>>> +* check altogether with a little additional work.
>>>> +*/
>>>> +   max_size = rdev->mc.gtt_size;
>>>>   if (size > max_size) {
>>>>   DRM_DEBUG("Allocation size %dMb bigger than %ldMb
>>>> limit\n",
>>>> size >> 20, max_size >> 20);
>>> A BO of size rdev->mc.gtt_size can never actually be bound to GTT,
>>> because we have some pinned BOs in there. I think it's a bit
>>> disingenuous to let userspace allocate a BO that can never actually be
>>> used by the GPU. :)
>>>
>>> The hack I attached to
>>> https://bugs.freedesktop.org/show_bug.cgi?id=78717 has a start for
>>> dealing with that. I was running that patch for a while and didn't
>>> notice any bad effects from it.
>>
>> Haven't looked at the patch yet, but can't we just go over all existing
>> allocations on PIN and figure out the largest free area and save that value?
>> I mean pinning of GTT memory happens rarely and mostly on system startup.
>
> How about that attached patches?

LGTM. My thinking was more complicated, but this should be fine as well.

Patches are: Reviewed-by: Christian König 

Christian.

>
> Alex
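
For readers following the thread, the idea floated above (walk the existing pinned allocations and remember the largest free area) can be sketched roughly as follows. This is only an illustration of the arithmetic, not the patches that were attached and reviewed; struct pinned_range and largest_free_area() are hypothetical names, and the sketch assumes the pinned ranges are sorted by start offset and do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one pinned BO inside the GTT aperture. */
struct pinned_range {
	uint64_t start; /* offset into the GTT, in bytes */
	uint64_t size;  /* length in bytes */
};

/*
 * Return the size of the largest gap not covered by a pinned range.
 * Assumes ranges are sorted by start and do not overlap.
 */
static uint64_t largest_free_area(const struct pinned_range *ranges,
				  size_t count, uint64_t gtt_size)
{
	uint64_t best = 0;
	uint64_t cursor = 0; /* end of the previous pinned range */
	size_t i;

	for (i = 0; i < count; i++) {
		assert(ranges[i].start >= cursor);
		if (ranges[i].start - cursor > best)
			best = ranges[i].start - cursor;
		cursor = ranges[i].start + ranges[i].size;
	}

	/* Tail gap between the last pinned range and the end of the GTT. */
	assert(cursor <= gtt_size);
	if (gtt_size - cursor > best)
		best = gtt_size - cursor;

	return best;
}
```

With any pinned BO present, the result is strictly smaller than gtt_size, which is the point made in the quoted review comment.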



[Bug 79980] Random radeonsi crashes

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79980

--- Comment #43 from Aaron B  ---
Built, testing. Played youtube videos, chrome, multiple tabs, all while playing
Portal 2 and not a single hiccup on the output, outside of the casual VBlank
update problems I see you guys working on for 3-17. I did get a crash on the
old patch, as said, but we'll give it more time and I'll post any negative
results. For now, this is much more stable than before, though.



[PATCH v6 14/14] ARM: dts: exynos5420: add dsi node

2014-07-17 Thread YoungJun Cho
This patch adds the common part of the dsi node.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 arch/arm/boot/dts/exynos5420.dtsi | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420.dtsi 
b/arch/arm/boot/dts/exynos5420.dtsi
index 0b9d15d..3a7862b 100644
--- a/arch/arm/boot/dts/exynos5420.dtsi
+++ b/arch/arm/boot/dts/exynos5420.dtsi
@@ -523,6 +523,20 @@
#phy-cells = <1>;
};

+   dsi at 1450 {
+   compatible = "samsung,exynos5410-mipi-dsi";
+   reg = <0x1450 0x1>;
+   interrupts = <0 82 0>;
+   samsung,power-domain = <_pd>;
+   phys = <_phy 1>;
+   phy-names = "dsim";
+   clocks = < CLK_DSIM1>, < CLK_SCLK_MIPI1>;
+   clock-names = "bus_clk", "pll_clk";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "disabled";
+   };
+
fimd: fimd at 1440 {
samsung,power-domain = <_pd>;
clocks = < CLK_SCLK_FIMD1>, < CLK_FIMD1>;
-- 
1.9.0



[PATCH v6 13/14] ARM: dts: exynos5420: add mipi-phy node

2014-07-17 Thread YoungJun Cho
This patch adds mipi-phy node for MIPI DSI device.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 arch/arm/boot/dts/exynos5420.dtsi | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420.dtsi 
b/arch/arm/boot/dts/exynos5420.dtsi
index e385322..0b9d15d 100644
--- a/arch/arm/boot/dts/exynos5420.dtsi
+++ b/arch/arm/boot/dts/exynos5420.dtsi
@@ -517,6 +517,12 @@
phy-names = "dp";
};

+   mipi_phy: video-phy@10040714 {
+   compatible = "samsung,s5pv210-mipi-video-phy";
+   reg = <0x10040714 12>;
+   #phy-cells = <1>;
+   };
+
fimd: fimd at 1440 {
samsung,power-domain = <_pd>;
clocks = < CLK_SCLK_FIMD1>, < CLK_FIMD1>;
-- 
1.9.0



[PATCH v6 12/14] ARM: dts: exynos5: add system register property

2014-07-17 Thread YoungJun Cho
This patch adds sysreg property to fimd device node
which is required to use I80 interface.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 arch/arm/boot/dts/exynos5.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/exynos5.dtsi b/arch/arm/boot/dts/exynos5.dtsi
index 79d0608..fdead12 100644
--- a/arch/arm/boot/dts/exynos5.dtsi
+++ b/arch/arm/boot/dts/exynos5.dtsi
@@ -87,6 +87,7 @@
reg = <0x1440 0x4>;
interrupt-names = "fifo", "vsync", "lcd_sys";
interrupts = <18 4>, <18 5>, <18 6>;
+   samsung,sysreg = <_system_controller>;
status = "disabled";
};

-- 
1.9.0



[PATCH v6 11/14] ARM: dts: exynos4: add system register property

2014-07-17 Thread YoungJun Cho
This patch adds sysreg property to fimd device node
which is required to use I80 interface.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 arch/arm/boot/dts/exynos4.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/exynos4.dtsi b/arch/arm/boot/dts/exynos4.dtsi
index fbaf426..3793881 100644
--- a/arch/arm/boot/dts/exynos4.dtsi
+++ b/arch/arm/boot/dts/exynos4.dtsi
@@ -608,6 +608,7 @@
clocks = < CLK_SCLK_FIMD0>, < CLK_FIMD0>;
clock-names = "sclk_fimd", "fimd";
samsung,power-domain = <_lcd0>;
+   samsung,sysreg = <_reg>;
status = "disabled";
};
 };
-- 
1.9.0



[PATCH v6 10/14] drm/panel: add S6E3FA0 driver

2014-07-17 Thread YoungJun Cho
This patch adds MIPI DSI command mode based
S6E3FA0 AMOLED LCD Panel driver.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 drivers/gpu/drm/panel/Kconfig |   7 +
 drivers/gpu/drm/panel/Makefile|   1 +
 drivers/gpu/drm/panel/panel-s6e3fa0.c | 541 ++
 3 files changed, 549 insertions(+)
 create mode 100644 drivers/gpu/drm/panel/panel-s6e3fa0.c

diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
index 4ec874d..be1392e 100644
--- a/drivers/gpu/drm/panel/Kconfig
+++ b/drivers/gpu/drm/panel/Kconfig
@@ -30,4 +30,11 @@ config DRM_PANEL_S6E8AA0
select DRM_MIPI_DSI
select VIDEOMODE_HELPERS

+config DRM_PANEL_S6E3FA0
+   tristate "S6E3FA0 DSI command mode panel"
+   depends on DRM && DRM_PANEL
+   depends on OF
+   select DRM_MIPI_DSI
+   select VIDEOMODE_HELPERS
+
 endmenu
diff --git a/drivers/gpu/drm/panel/Makefile b/drivers/gpu/drm/panel/Makefile
index 8b92921..85c6738 100644
--- a/drivers/gpu/drm/panel/Makefile
+++ b/drivers/gpu/drm/panel/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_DRM_PANEL_SIMPLE) += panel-simple.o
 obj-$(CONFIG_DRM_PANEL_LD9040) += panel-ld9040.o
 obj-$(CONFIG_DRM_PANEL_S6E8AA0) += panel-s6e8aa0.o
+obj-$(CONFIG_DRM_PANEL_S6E3FA0) += panel-s6e3fa0.o
diff --git a/drivers/gpu/drm/panel/panel-s6e3fa0.c 
b/drivers/gpu/drm/panel/panel-s6e3fa0.c
new file mode 100644
index 000..811ec92
--- /dev/null
+++ b/drivers/gpu/drm/panel/panel-s6e3fa0.c
@@ -0,0 +1,541 @@
+/*
+ * MIPI DSI command mode based s6e3fa0 AMOLED LCD 5.7 inch drm panel driver.
+ *
+ * Copyright (c) 2014 Samsung Electronics Co., Ltd
+ *
+ * YoungJun Cho 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+/* Manufacturer Command Set */
+#define MCS_GLOBAL_PARAMETER   0xb0
+#define MCS_AID0xb2
+#define MCS_ELVSSOPT   0xb6
+#define MCS_TEMPERATURE_SET0xb8
+#define MCS_PENTILE_CTRL   0xc0
+#define MCS_GAMMA_MODE 0xca
+#define MCS_VDDM   0xd7
+#define MCS_ALS0xe3
+#define MCS_ERR_FG 0xed
+#define MCS_KEY_LEV1   0xf0
+#define MCS_GAMMA_UPDATE   0xf7
+#define MCS_KEY_LEV2   0xfc
+#define MCS_RE 0xfe
+#define MCS_TOUT2_HSYNC0xff
+
+/* Content Adaptive Brightness Control */
+#define DCS_WRITE_CABC 0x55
+
+#define MTP_ID_LEN 3
+#define GAMMA_LEVEL_NUM30
+
+#define DEFAULT_VDDM_VAL   0x15
+
+struct s6e3fa0 {
+   struct device   *dev;
+   struct drm_panelpanel;
+
+   struct regulator_bulk_data  supplies[2];
+   struct gpio_desc*reset_gpio;
+   struct videomodevm;
+
+   unsigned intpower_on_delay;
+   unsigned intreset_delay;
+   unsigned intinit_delay;
+   unsigned intwidth_mm;
+   unsigned intheight_mm;
+
+   unsigned char   id;
+   unsigned char   vddm;
+   unsigned intbrightness;
+};
+
+#define panel_to_s6e3fa0(p) container_of(p, struct s6e3fa0, panel)
+
+/* VDD Memory Lookup Table contains pairs of {ReadValue, WriteValue} */
+static const unsigned char s6e3fa0_vddm_lut[][2] = {
+   {0x00, 0x0d}, {0x01, 0x0d}, {0x02, 0x0e}, {0x03, 0x0f}, {0x04, 0x10},
+   {0x05, 0x11}, {0x06, 0x12}, {0x07, 0x13}, {0x08, 0x14}, {0x09, 0x15},
+   {0x0a, 0x16}, {0x0b, 0x17}, {0x0c, 0x18}, {0x0d, 0x19}, {0x0e, 0x1a},
+   {0x0f, 0x1b}, {0x10, 0x1c}, {0x11, 0x1d}, {0x12, 0x1e}, {0x13, 0x1f},
+   {0x14, 0x20}, {0x15, 0x21}, {0x16, 0x22}, {0x17, 0x23}, {0x18, 0x24},
+   {0x19, 0x25}, {0x1a, 0x26}, {0x1b, 0x27}, {0x1c, 0x28}, {0x1d, 0x29},
+   {0x1e, 0x2a}, {0x1f, 0x2b}, {0x20, 0x2c}, {0x21, 0x2d}, {0x22, 0x2e},
+   {0x23, 0x2f}, {0x24, 0x30}, {0x25, 0x31}, {0x26, 0x32}, {0x27, 0x33},
+   {0x28, 0x34}, {0x29, 0x35}, {0x2a, 0x36}, {0x2b, 0x37}, {0x2c, 0x38},
+   {0x2d, 0x39}, {0x2e, 0x3a}, {0x2f, 0x3b}, {0x30, 0x3c}, {0x31, 0x3d},
+   {0x32, 0x3e}, {0x33, 0x3f}, {0x34, 0x3f}, {0x35, 0x3f}, {0x36, 0x3f},
+   {0x37, 0x3f}, {0x38, 0x3f}, {0x39, 0x3f}, {0x3a, 0x3f}, {0x3b, 0x3f},
+   {0x3c, 0x3f}, {0x3d, 0x3f}, {0x3e, 0x3f}, {0x3f, 0x3f}, {0x40, 0x0c},
+   {0x41, 0x0b}, {0x42, 0x0a}, {0x43, 0x09}, {0x44, 0x08}, {0x45, 0x07},
+   {0x46, 0x06}, {0x47, 0x05}, {0x48, 0x04}, {0x49, 0x03}, {0x4a, 0x02},
+   {0x4b, 0x01}, {0x4c, 0x40}, {0x4d, 0x41}, {0x4e, 0x42}, {0x4f, 0x43},
+   {0x50, 0x44}, {0x51, 0x45}, {0x52, 0x46}, {0x53, 0x47}, {0x54, 0x48},
+   {0x55, 0x49}, {0x56, 

[PATCH v6 09/14] ARM: dts: s6e3fa0: add DT bindings

2014-07-17 Thread YoungJun Cho
This patch adds DT bindings for s6e3fa0 panel.
The bindings describe panel resources and display timings.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 .../devicetree/bindings/panel/samsung,s6e3fa0.txt  | 46 ++
 1 file changed, 46 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt

diff --git a/Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt 
b/Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt
new file mode 100644
index 000..2cd32f5
--- /dev/null
+++ b/Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt
@@ -0,0 +1,46 @@
+Samsung S6E3FA0 AMOLED LCD 5.7 inch panel
+
+Required properties:
+  - compatible: "samsung,s6e3fa0"
+  - reg: the virtual channel number of a DSI peripheral
+  - vdd3-supply: core voltage supply
+  - vci-supply: voltage supply for analog circuits
+  - reset-gpios: a GPIO spec for the reset pin
+  - det-gpios: a GPIO spec for the OLED detection pin
+  - te-gpios: a GPIO spec for the TE pin
+  - display-timings: timings for the connected panel as described by [1]
+
+Optional properties:
+
+The device node can contain one 'port' child node with one child 'endpoint'
+node, according to the bindings defined in [2]. This node should describe
+panel's video bus.
+
+[1]: Documentation/devicetree/bindings/video/display-timing.txt
+[2]: Documentation/devicetree/bindings/media/video-interfaces.txt
+
+Example:
+
+   panel@0 {
+   compatible = "samsung,s6e3fa0";
+   reg = <0>;
+   vdd3-supply = <_reg>;
+   vci-supply = <_reg>;
+   reset-gpios = < 4 0>;
+   det-gpios = < 6 0>;
+   te-gpios = < 7 0>;
+
+   display-timings {
+   timings0 {
+   clock-frequency = <0>;
+   hactive = <1080>;
+   vactive = <1920>;
+   hfront-porch = <2>;
+   hback-porch = <2>;
+   hsync-len = <1>;
+   vfront-porch = <1>;
+   vback-porch = <4>;
+   vsync-len = <1>;
+   };
+   };
+   };
-- 
1.9.0



[PATCH v6 08/14] drm/exynos: dsi: add driver data to support Exynos5410/5420/5440 SoCs

2014-07-17 Thread YoungJun Cho
The offset of register DSIM_PLLTMR_REG in Exynos5410 / 5420 / 5440
SoCs is different from the one in Exynos4 SoCs.

In case of Exynos5410 / 5420 / 5440 SoCs, there is no frequency
band bit in DSIM_PLLCTRL_REG, and it uses DSIM_PHYCTRL_REG and
DSIM_PHYTIMING*_REG instead.
So this patch adds driver data to distinguish it.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 157 +++-
 1 file changed, 135 insertions(+), 22 deletions(-)
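
To illustrate the per-SoC driver data described above, here is a minimal stand-alone sketch of a compatible-string lookup. The two compatible strings, the PLL timer offsets (0x50 and 0x58) and the frequency band flag are taken from this patch; struct dsi_variant and dsi_variant_lookup() are hypothetical stand-ins for the kernel's of_device_id matching, which is not used here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-variant data, mirroring exynos_dsi_driver_data. */
struct dsi_variant {
	const char *compatible;
	unsigned int plltmr_reg;     /* offset of DSIM_PLLTMR_REG */
	unsigned int has_freqband:1; /* frequency band bit in DSIM_PLLCTRL_REG */
};

static const struct dsi_variant dsi_variants[] = {
	{ "samsung,exynos4210-mipi-dsi", 0x50, 1 },
	{ "samsung,exynos5410-mipi-dsi", 0x58, 0 },
};

/* Return the variant matching a compatible string, or NULL if unknown. */
static const struct dsi_variant *dsi_variant_lookup(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);
	for (i = 0; i < sizeof(dsi_variants) / sizeof(dsi_variants[0]); i++) {
		if (strcmp(dsi_variants[i].compatible, compatible) == 0)
			return &dsi_variants[i];
	}
	return NULL;
}
```

Code that programs the PLL timer can then use variant->plltmr_reg instead of a fixed offset, and skip the frequency band setup when has_freqband is 0.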

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index 4997bfe..93ad4a2 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -57,9 +58,12 @@

 /* FIFO memory AC characteristic register */
 #define DSIM_PLLCTRL_REG   0x4c/* PLL control register */
-#define DSIM_PLLTMR_REG0x50/* PLL timer register */
 #define DSIM_PHYACCHR_REG  0x54/* D-PHY AC characteristic register */
 #define DSIM_PHYACCHR1_REG 0x58/* D-PHY AC characteristic register1 */
+#define DSIM_PHYCTRL_REG   0x5c
+#define DSIM_PHYTIMING_REG 0x64
+#define DSIM_PHYTIMING1_REG0x68
+#define DSIM_PHYTIMING2_REG0x6c

 /* DSIM_STATUS */
 #define DSIM_STOP_STATE_DAT(x) (((x) & 0xf) << 0)
@@ -203,6 +207,24 @@
 #define DSIM_PLL_M(x)  ((x) << 4)
 #define DSIM_PLL_S(x)  ((x) << 1)

+/* DSIM_PHYCTRL */
+#define DSIM_PHYCTRL_ULPS_EXIT(x)  (((x) & 0x1ff) << 0)
+
+/* DSIM_PHYTIMING */
+#define DSIM_PHYTIMING_LPX(x)  ((x) << 8)
+#define DSIM_PHYTIMING_HS_EXIT(x)  ((x) << 0)
+
+/* DSIM_PHYTIMING1 */
+#define DSIM_PHYTIMING1_CLK_PREPARE(x) ((x) << 24)
+#define DSIM_PHYTIMING1_CLK_ZERO(x)((x) << 16)
+#define DSIM_PHYTIMING1_CLK_POST(x)((x) << 8)
+#define DSIM_PHYTIMING1_CLK_TRAIL(x)   ((x) << 0)
+
+/* DSIM_PHYTIMING2 */
+#define DSIM_PHYTIMING2_HS_PREPARE(x)  ((x) << 16)
+#define DSIM_PHYTIMING2_HS_ZERO(x) ((x) << 8)
+#define DSIM_PHYTIMING2_HS_TRAIL(x)((x) << 0)
+
 #define DSI_MAX_BUS_WIDTH  4
 #define DSI_NUM_VIRTUAL_CHANNELS   4
 #define DSI_TX_FIFO_SIZE   2048
@@ -236,6 +258,12 @@ struct exynos_dsi_transfer {
 #define DSIM_STATE_INITIALIZED BIT(1)
 #define DSIM_STATE_CMD_LPM BIT(2)

+struct exynos_dsi_driver_data {
+   unsigned int plltmr_reg;
+
+   unsigned int has_freqband:1;
+};
+
 struct exynos_dsi {
struct mipi_dsi_host dsi_host;
struct drm_connector connector;
@@ -266,11 +294,39 @@ struct exynos_dsi {

spinlock_t transfer_lock; /* protects transfer_list */
struct list_head transfer_list;
+
+   struct exynos_dsi_driver_data *driver_data;
 };

 #define host_to_dsi(host) container_of(host, struct exynos_dsi, dsi_host)
 #define connector_to_dsi(c) container_of(c, struct exynos_dsi, connector)

+static struct exynos_dsi_driver_data exynos4_dsi_driver_data = {
+   .plltmr_reg = 0x50,
+   .has_freqband = 1,
+};
+
+static struct exynos_dsi_driver_data exynos5_dsi_driver_data = {
+   .plltmr_reg = 0x58,
+};
+
+static struct of_device_id exynos_dsi_of_match[] = {
+   { .compatible = "samsung,exynos4210-mipi-dsi",
+ .data = _dsi_driver_data },
+   { .compatible = "samsung,exynos5410-mipi-dsi",
+ .data = _dsi_driver_data },
+   { }
+};
+
+static inline struct exynos_dsi_driver_data *exynos_dsi_get_driver_data(
+   struct platform_device *pdev)
+{
+   const struct of_device_id *of_id =
+   of_match_device(exynos_dsi_of_match, >dev);
+
+   return (struct exynos_dsi_driver_data *)of_id->data;
+}
+
 static void exynos_dsi_wait_for_reset(struct exynos_dsi *dsi)
 {
if (wait_for_completion_timeout(>completed, msecs_to_jiffies(300)))
@@ -344,14 +400,9 @@ static unsigned long exynos_dsi_pll_find_pms(struct 
exynos_dsi *dsi,
 static unsigned long exynos_dsi_set_pll(struct exynos_dsi *dsi,
unsigned long freq)
 {
-   static const unsigned long freq_bands[] = {
-   100 * MHZ, 120 * MHZ, 160 * MHZ, 200 * MHZ,
-   270 * MHZ, 320 * MHZ, 390 * MHZ, 450 * MHZ,
-   510 * MHZ, 560 * MHZ, 640 * MHZ, 690 * MHZ,
-   770 * MHZ, 870 * MHZ, 950 * MHZ,
-   };
+   struct exynos_dsi_driver_data *driver_data = dsi->driver_data;
unsigned long fin, fout;
-   int timeout, band;
+   int timeout;
u8 p, s;
u16 m;
u32 reg;
@@ -372,18 +423,30 @@ static unsigned long exynos_dsi_set_pll(struct exynos_dsi 
*dsi,
"failed to find PLL PMS for requested frequency\n");
return -EFAULT;
}
+   dev_dbg(dsi->dev, "PLL freq %lu, (p %d, m %d, s %d)\n", fout, p, m, s);

-   for 

[PATCH v6 07/14] ARM: dts: exynos_dsim: add exynos5410 compatible to DT bindings

2014-07-17 Thread YoungJun Cho
This patch adds the exynos5410 compatible string to the DT bindings
to support Exynos5410 / 5420 / 5440 SoCs.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 Documentation/devicetree/bindings/video/exynos_dsim.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/video/exynos_dsim.txt 
b/Documentation/devicetree/bindings/video/exynos_dsim.txt
index 33b5730..31036c6 100644
--- a/Documentation/devicetree/bindings/video/exynos_dsim.txt
+++ b/Documentation/devicetree/bindings/video/exynos_dsim.txt
@@ -1,7 +1,9 @@
 Exynos MIPI DSI Master

 Required properties:
-  - compatible: "samsung,exynos4210-mipi-dsi"
+  - compatible: value should be one of the following
+   "samsung,exynos4210-mipi-dsi" /* for Exynos4 SoCs */
+   "samsung,exynos5410-mipi-dsi" /* for Exynos5410/5420/5440 SoCs */
   - reg: physical base address and length of the registers set for the device
   - interrupts: should contain DSI interrupt
   - clocks: list of clock specifiers, must contain an entry for each required
-- 
1.9.0
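
As a rough illustration of how the two compatible strings in this binding can select per-SoC data, the sketch below models the lookup in plain userspace C. It is not the driver code: `dsi_variant`, `dsi_match` and `dsi_lookup()` are hypothetical names, and the kernel itself goes through `of_match_device()`. The register offsets 0x50 and 0x58 and the freqband flag are taken from the driver-data patch in this series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the per-SoC driver data. */
struct dsi_variant {
	unsigned int plltmr_reg;   /* PLL timer register offset */
	unsigned int has_freqband; /* 1 if the SoC uses the freq band field */
};

/* Hypothetical match entry: compatible string plus its variant data. */
struct dsi_match {
	const char *compatible;
	const struct dsi_variant *data;
};

static const struct dsi_variant exynos4_variant = { 0x50, 1 };
static const struct dsi_variant exynos5_variant = { 0x58, 0 };

/* Table terminated by an entry with a NULL compatible string. */
static const struct dsi_match dsi_matches[] = {
	{ "samsung,exynos4210-mipi-dsi", &exynos4_variant },
	{ "samsung,exynos5410-mipi-dsi", &exynos5_variant },
	{ NULL, NULL },
};

/* Return the variant for a compatible string, or NULL if unknown. */
static const struct dsi_variant *dsi_lookup(const char *compatible)
{
	const struct dsi_match *m;

	for (m = dsi_matches; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}
```

The sketch returns NULL for an unknown string so that a caller can check the result before dereferencing it.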



[PATCH v6 06/14] drm/exynos: fimd: support LCD I80 interface

2014-07-17 Thread YoungJun Cho
To support MIPI command mode based I80 interface panels,
FIMD should do the following:
- Set the LCD I80 interface timing configuration.
- Use "lcd_sys" as an IRQ resource and set the relevant IRQ configuration.
- Set the LCD block configuration for the I80 interface.
- Set the ideal (pixel) clock to 2 times the original one
  to generate the frame done IRQ prior to the next TE signal.
- Implement a trigger feature that transfers image data if there is a page
  flip request, and implement a TE handler to call the trigger function.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/Kconfig   |   1 +
 drivers/gpu/drm/exynos/exynos_drm_fimd.c | 276 ++-
 include/video/samsung_fimd.h |   3 +-
 3 files changed, 235 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig
index 178d2a9..9ba1aae 100644
--- a/drivers/gpu/drm/exynos/Kconfig
+++ b/drivers/gpu/drm/exynos/Kconfig
@@ -28,6 +28,7 @@ config DRM_EXYNOS_FIMD
bool "Exynos DRM FIMD"
depends on DRM_EXYNOS && !FB_S3C
select FB_MODE_HELPERS
+   select MFD_SYSCON
help
  Choose this option if you want to use Exynos FIMD for DRM.

diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c 
b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
index 33161ad..28a3168 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include 
 #include 
@@ -61,6 +63,24 @@
 /* color key value register for hardware window 1 ~ 4. */
 #define WKEYCON1_BASE(x)   ((WKEYCON1 + 0x140) + ((x - 1) * 8))

+/* I80 / RGB trigger control register */
+#define TRIGCON 0x1A4
+#define TRGMODE_I80_RGB_ENABLE_I80 (1 << 0)
+#define SWTRGCMD_I80_RGB_ENABLE (1 << 1)
+
+/* display mode change control register except exynos4 */
+#define VIDOUT_CON 0x000
+#define VIDOUT_CON_F_I80_LDI0  (0x2 << 8)
+
+/* I80 interface control for main LDI register */
+#define I80IFCONFAx(x) (0x1B0 + (x) * 4)
+#define I80IFCONFBx(x) (0x1B8 + (x) * 4)
+#define LCD_CS_SETUP(x) ((x) << 16)
+#define LCD_WR_SETUP(x) ((x) << 12)
+#define LCD_WR_ACTIVE(x)   ((x) << 8)
+#define LCD_WR_HOLD(x) ((x) << 4)
+#define I80IFEN_ENABLE (1 << 0)
+
 /* FIMD has totally five hardware windows. */
 #define WINDOWS_NR 5

@@ -68,10 +88,14 @@

 struct fimd_driver_data {
unsigned int timing_base;
+   unsigned int lcdblk_offset;
+   unsigned int lcdblk_vt_shift;
+   unsigned int lcdblk_bypass_shift;

unsigned int has_shadowcon:1;
unsigned int has_clksel:1;
unsigned int has_limited_fmt:1;
+   unsigned int has_vidoutcon:1;
 };

 static struct fimd_driver_data s3c64xx_fimd_driver_data = {
@@ -82,12 +106,19 @@ static struct fimd_driver_data s3c64xx_fimd_driver_data = {

 static struct fimd_driver_data exynos4_fimd_driver_data = {
.timing_base = 0x0,
+   .lcdblk_offset = 0x210,
+   .lcdblk_vt_shift = 10,
+   .lcdblk_bypass_shift = 1,
.has_shadowcon = 1,
 };

 static struct fimd_driver_data exynos5_fimd_driver_data = {
.timing_base = 0x2,
+   .lcdblk_offset = 0x214,
+   .lcdblk_vt_shift = 24,
+   .lcdblk_bypass_shift = 15,
.has_shadowcon = 1,
+   .has_vidoutcon = 1,
 };

 struct fimd_win_data {
@@ -112,15 +143,22 @@ struct fimd_context {
struct clk *bus_clk;
struct clk *lcd_clk;
void __iomem *regs;
+   struct regmap *sysreg;
struct drm_display_mode mode;
struct fimd_win_data win_data[WINDOWS_NR];
unsigned int default_win;
unsigned long irq_flags;
+   u32 vidcon0;
u32 vidcon1;
+   u32 vidout_con;
+   u32 i80ifcon;
+   bool i80_if;
bool suspended;
int pipe;
wait_queue_head_t wait_vsync_queue;
atomic_t wait_vsync_event;
+   atomic_t win_updated;
+   atomic_t triggering;

struct exynos_drm_panel_info panel;
struct fimd_driver_data *driver_data;
@@ -243,6 +281,14 @@ static u32 fimd_calc_clkdiv(struct fimd_context *ctx,
unsigned long ideal_clk = mode->htotal * mode->vtotal * mode->vrefresh;
u32 clkdiv;

+   if (ctx->i80_if) {
+   /*
+* The frame done interrupt should be 

[PATCH v6 05/14] drm/exynos: dsi: add TE interrupt handler to support LCD I80 interface

2014-07-17 Thread YoungJun Cho
To support the LCD I80 interface, the DSI host should register
a TE interrupt handler for the TE GPIO of the attached panel.
When the panel generates a tearing effect synchronization signal,
the DSI host calls the CRTC device manager to trigger
the transfer of the video image.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 95 -
 1 file changed, 93 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index 58bfb2a..4997bfe 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -16,7 +16,9 @@
 #include 

 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -24,6 +26,7 @@
 #include 
 #include 

+#include "exynos_drm_crtc.h"
 #include "exynos_drm_drv.h"

 /* returns true iff both arguments logically differs */
@@ -247,6 +250,7 @@ struct exynos_dsi {
struct clk *bus_clk;
struct regulator_bulk_data supplies[2];
int irq;
+   int te_gpio;

u32 pll_clk_rate;
u32 burst_clk_rate;
@@ -954,17 +958,89 @@ static irqreturn_t exynos_dsi_irq(int irq, void *dev_id)
return IRQ_HANDLED;
 }

+static irqreturn_t exynos_dsi_te_irq_handler(int irq, void *dev_id)
+{
+   struct exynos_dsi *dsi = (struct exynos_dsi *)dev_id;
+   struct drm_encoder *encoder = dsi->encoder;
+
+   if (dsi->state & DSIM_STATE_ENABLED)
+   exynos_drm_crtc_te_handler(encoder->crtc);
+
+   return IRQ_HANDLED;
+}
+
+static void exynos_dsi_enable_irq(struct exynos_dsi *dsi)
+{
+   enable_irq(dsi->irq);
+
+   if (gpio_is_valid(dsi->te_gpio))
+   enable_irq(gpio_to_irq(dsi->te_gpio));
+}
+
+static void exynos_dsi_disable_irq(struct exynos_dsi *dsi)
+{
+   if (gpio_is_valid(dsi->te_gpio))
+   disable_irq(gpio_to_irq(dsi->te_gpio));
+
+   disable_irq(dsi->irq);
+}
+
 static int exynos_dsi_init(struct exynos_dsi *dsi)
 {
exynos_dsi_enable_clock(dsi);
exynos_dsi_reset(dsi);
-   enable_irq(dsi->irq);
+   exynos_dsi_enable_irq(dsi);
exynos_dsi_wait_for_reset(dsi);
exynos_dsi_init_link(dsi);

return 0;
 }

+static int exynos_dsi_register_te_irq(struct exynos_dsi *dsi)
+{
+   int ret;
+
+   dsi->te_gpio = of_get_named_gpio(dsi->panel_node, "te-gpios", 0);
+   if (!gpio_is_valid(dsi->te_gpio)) {
+   dev_err(dsi->dev, "no te-gpios specified\n");
+   ret = dsi->te_gpio;
+   goto out;
+   }
+
+   ret = gpio_request_one(dsi->te_gpio, GPIOF_IN, "te_gpio");
+   if (ret) {
+   dev_err(dsi->dev, "gpio request failed with %d\n", ret);
+   goto out;
+   }
+
+   /*
+* This TE GPIO IRQ should not be set to IRQ_NOAUTOEN, because panel
+* calls drm_panel_init() first then calls mipi_dsi_attach() in probe().
+* It means that te_gpio is invalid when exynos_dsi_enable_irq() is
+* called by drm_panel_init() before panel is attached.
+*/
+   ret = request_threaded_irq(gpio_to_irq(dsi->te_gpio),
+   exynos_dsi_te_irq_handler, NULL,
+   IRQF_TRIGGER_RISING, "TE", dsi);
+   if (ret) {
+   dev_err(dsi->dev, "request interrupt failed with %d\n", ret);
+   gpio_free(dsi->te_gpio);
+   goto out;
+   }
+
+out:
+   return ret;
+}
+
+static void exynos_dsi_unregister_te_irq(struct exynos_dsi *dsi)
+{
+   if (gpio_is_valid(dsi->te_gpio)) {
+   free_irq(gpio_to_irq(dsi->te_gpio), dsi);
+   gpio_free(dsi->te_gpio);
+   dsi->te_gpio = -ENOENT;
+   }
+}
+
 static int exynos_dsi_host_attach(struct mipi_dsi_host *host,
  struct mipi_dsi_device *device)
 {
@@ -978,6 +1054,16 @@ static int exynos_dsi_host_attach(struct mipi_dsi_host 
*host,
if (dsi->connector.dev)
drm_helper_hpd_irq_event(dsi->connector.dev);

+   /*
+* If attached panel device is for command mode one, dsi should
+* register TE interrupt handler.
+*/
+   if (!(dsi->mode_flags & MIPI_DSI_MODE_VIDEO)) {
+   int ret = exynos_dsi_register_te_irq(dsi);
+   if (ret)
+   return ret;
+   }
+
return 0;
 }

@@ -986,6 +1072,8 @@ static int exynos_dsi_host_detach(struct mipi_dsi_host 
*host,
 {
struct exynos_dsi *dsi = host_to_dsi(host);

+   exynos_dsi_unregister_te_irq(dsi);
+
dsi->panel_node = NULL;

if (dsi->connector.dev)
@@ -1099,7 +1187,7 @@ static void exynos_dsi_poweroff(struct exynos_dsi *dsi)

exynos_dsi_disable_clock(dsi);

-   disable_irq(dsi->irq);
+   exynos_dsi_disable_irq(dsi);
}

dsi->state &= ~DSIM_STATE_CMD_LPM;

[PATCH v6 04/14] drm/exynos: add TE handler to support LCD I80 interface

2014-07-17 Thread YoungJun Cho
To support the LCD I80 interface, the panel should generate a
Tearing Effect synchronization signal between MCU and FB
to display video images.
And the display controller should trigger the transfer of the
video image at this signal.
So when the TE IRQ is received, this handler chain is called
to notify the display controller.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/exynos_drm_crtc.c | 8 
 drivers/gpu/drm/exynos/exynos_drm_crtc.h | 7 +++
 drivers/gpu/drm/exynos/exynos_drm_drv.h  | 3 +++
 3 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_crtc.c 
b/drivers/gpu/drm/exynos/exynos_drm_crtc.c
index 3bf091d..b68e58f 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_crtc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_crtc.c
@@ -511,3 +511,11 @@ int exynos_drm_crtc_get_pipe_from_type(struct drm_device 
*drm_dev,

return -EPERM;
 }
+
+void exynos_drm_crtc_te_handler(struct drm_crtc *crtc)
+{
+   struct exynos_drm_manager *manager = to_exynos_crtc(crtc)->manager;
+
+   if (manager->ops->te_handler)
+   manager->ops->te_handler(manager);
+}
diff --git a/drivers/gpu/drm/exynos/exynos_drm_crtc.h 
b/drivers/gpu/drm/exynos/exynos_drm_crtc.h
index 9f74b10..690dcdd 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_crtc.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_crtc.h
@@ -36,4 +36,11 @@ void exynos_drm_crtc_plane_disable(struct drm_crtc *crtc, 
int zpos);
 int exynos_drm_crtc_get_pipe_from_type(struct drm_device *drm_dev,
unsigned int out_type);

+/*
+ * This function calls the crtc device(manager)'s te_handler() callback
+ * to trigger to transfer video image at the tearing effect synchronization
+ * signal.
+ */
+void exynos_drm_crtc_te_handler(struct drm_crtc *crtc);
+
 #endif
diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h 
b/drivers/gpu/drm/exynos/exynos_drm_drv.h
index 02f3b3d..13be498 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
@@ -186,6 +186,8 @@ struct exynos_drm_display {
  * @win_commit: apply hardware specific overlay data to registers.
  * @win_enable: enable hardware specific overlay.
  * @win_disable: disable hardware specific overlay.
+ * @te_handler: trigger to transfer video image at the tearing effect
+ * synchronization signal if there is a page flip request.
  */
 struct exynos_drm_manager;
 struct exynos_drm_manager_ops {
@@ -204,6 +206,7 @@ struct exynos_drm_manager_ops {
void (*win_commit)(struct exynos_drm_manager *mgr, int zpos);
void (*win_enable)(struct exynos_drm_manager *mgr, int zpos);
void (*win_disable)(struct exynos_drm_manager *mgr, int zpos);
+   void (*te_handler)(struct exynos_drm_manager *mgr);
 };

 /*
-- 
1.9.0
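
The optional-callback pattern used by exynos_drm_crtc_te_handler() can be sketched in userspace C as follows. This is a reduced model under stated assumptions: `manager`, `manager_ops`, `crtc_te_handler()` and `count_trigger()` are hypothetical stand-ins, and the real structures carry many more members.

```c
#include <stddef.h>

struct manager;

/* Hypothetical, reduced ops table: only the optional TE callback. */
struct manager_ops {
	void (*te_handler)(struct manager *mgr);
};

struct manager {
	const struct manager_ops *ops;
	int triggers; /* counts how often the TE callback ran */
};

/* Forward a TE event only when the manager implements the callback. */
static void crtc_te_handler(struct manager *mgr)
{
	if (mgr->ops->te_handler)
		mgr->ops->te_handler(mgr);
}

/* Example callback: record that a transfer was triggered. */
static void count_trigger(struct manager *mgr)
{
	mgr->triggers++;
}
```

A manager that leaves the callback unset simply ignores TE events, which is why display controllers without I80 support need no changes.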



[PATCH v6 03/14] ARM: dts: samsung-fimd: add LCD I80 interface specific properties

2014-07-17 Thread YoungJun Cho
When a MIPI DSI based I80 interface panel is used,
the relevant registers should be set.
So this patch adds the relevant DT bindings.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 .../devicetree/bindings/video/samsung-fimd.txt | 28 ++
 1 file changed, 28 insertions(+)

diff --git a/Documentation/devicetree/bindings/video/samsung-fimd.txt 
b/Documentation/devicetree/bindings/video/samsung-fimd.txt
index 2dad41b..59ff61e 100644
--- a/Documentation/devicetree/bindings/video/samsung-fimd.txt
+++ b/Documentation/devicetree/bindings/video/samsung-fimd.txt
@@ -44,6 +44,34 @@ Optional Properties:
 - display-timings: timing settings for FIMD, as described in document [1].
Can be used in case timings cannot be provided otherwise
or to override timings provided by the panel.
+- samsung,sysreg: handle to syscon used to control the system registers
+- i80-if-timings: timing configuration for lcd i80 interface support.
+  - cs-setup: clock cycles for the active period of address signal is enabled
+  until chip select is enabled.
+  If not specified, the default value(0) will be used.
+  - wr-setup: clock cycles for the active period of CS signal is enabled until
+  write signal is enabled.
+  If not specified, the default value(0) will be used.
+  - wr-active: clock cycles for the active period of CS is enabled.
+   If not specified, the default value(1) will be used.
+  - wr-hold: clock cycles for the active period of CS is disabled until write
+ signal is disabled.
+ If not specified, the default value(0) will be used.
+
+  The parameters are defined as:
+
+[Timing diagram garbled in extraction. Signals shown: VCLK(internal),
+ Address Output, Chip Select, Write Enable, Video Data.
+ Marked intervals: wr-setup+1, wr-hold+1, wr-active+1.]

 The device node can contain 'port' child nodes according to the bindings 
defined
 in [2]. The following are properties specific to those nodes:
-- 
1.9.0
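
To make the four i80-if-timings fields more concrete, here is a minimal sketch of how they might be packed into one register value. The shift positions and the enable bit come from the macros in the FIMD patch of this series; the struct, the helper names and the way the fields are OR-ed together are assumptions for illustration, since the relevant driver hunk is not shown here.

```c
#include <stdint.h>

/* Field positions as defined in the FIMD patch of this series. */
#define LCD_CS_SETUP(x)   ((uint32_t)(x) << 16)
#define LCD_WR_SETUP(x)   ((uint32_t)(x) << 12)
#define LCD_WR_ACTIVE(x)  ((uint32_t)(x) << 8)
#define LCD_WR_HOLD(x)    ((uint32_t)(x) << 4)
#define I80IFEN_ENABLE    ((uint32_t)1 << 0)

/* Hypothetical container for the four DT timing properties. */
struct i80_timings {
	uint32_t cs_setup;
	uint32_t wr_setup;
	uint32_t wr_active;
	uint32_t wr_hold;
};

/* Defaults named in the binding: cs-setup 0, wr-setup 0, wr-active 1, wr-hold 0. */
static struct i80_timings i80_default_timings(void)
{
	struct i80_timings t = { 0, 0, 1, 0 };
	return t;
}

/* Assumed packing: OR the shifted fields together with the enable bit. */
static uint32_t i80_pack(const struct i80_timings *t)
{
	return LCD_CS_SETUP(t->cs_setup) | LCD_WR_SETUP(t->wr_setup) |
	       LCD_WR_ACTIVE(t->wr_active) | LCD_WR_HOLD(t->wr_hold) |
	       I80IFEN_ENABLE;
}
```

With the binding defaults, only the wr-active field and the enable bit are set, so the packed value is 0x101.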



[PATCH v6 02/14] drm/exynos: use wait_event_timeout() for safety usage

2014-07-17 Thread YoungJun Cho
The page flip operation may fail to finish correctly under some abnormal
condition such as a panel reset. So this patch replaces wait_event() with
wait_event_timeout() to avoid waiting for page flip completion
indefinitely.
It also clears exynos_crtc->pending_flip in exynos_drm_crtc_page_flip()
when exynos_drm_crtc_mode_set_commit() fails.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/exynos/exynos_drm_crtc.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_crtc.c 
b/drivers/gpu/drm/exynos/exynos_drm_crtc.c
index 95c9435..3bf091d 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_crtc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_crtc.c
@@ -69,8 +69,10 @@ static void exynos_drm_crtc_dpms(struct drm_crtc *crtc, int 
mode)

if (mode > DRM_MODE_DPMS_ON) {
/* wait for the completion of page flip. */
-   wait_event(exynos_crtc->pending_flip_queue,
-   atomic_read(&exynos_crtc->pending_flip) == 0);
+   if (!wait_event_timeout(exynos_crtc->pending_flip_queue,
+   !atomic_read(&exynos_crtc->pending_flip),
+   HZ/20))
+   atomic_set(&exynos_crtc->pending_flip, 0);
drm_vblank_off(crtc->dev, exynos_crtc->pipe);
}

@@ -259,6 +261,7 @@ static int exynos_drm_crtc_page_flip(struct drm_crtc *crtc,
spin_lock_irq(&dev->event_lock);
drm_vblank_put(dev, exynos_crtc->pipe);
list_del(&event->base.link);
+   atomic_set(&exynos_crtc->pending_flip, 0);
spin_unlock_irq(&dev->event_lock);

goto out;
-- 
1.9.0
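
The HZ/20 timeout in the patch above is expressed in scheduler ticks, so its wall-clock length depends on the configured tick rate. The small sketch below works through that arithmetic; the helper names are hypothetical, and the tick rates in the usage note are just commonly seen configuration values, not something the patch specifies.

```c
/* Ticks passed to the wait: HZ / 20, using integer division. */
static unsigned int flip_timeout_ticks(unsigned int hz)
{
	return hz / 20;
}

/* Convert ticks back to milliseconds for a given tick rate. */
static unsigned int ticks_to_ms(unsigned int ticks, unsigned int hz)
{
	return ticks * 1000 / hz;
}
```

For tick rates of 100, 300 and 1000 this gives 50 ms. For 250 the integer division yields 12 ticks, which is 48 ms, so the bound is roughly but not exactly 50 ms.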



[PATCH v6 01/14] drm/exynos: dsi: move the EoT packets configuration point

2014-07-17 Thread YoungJun Cho
This configuration can also be used in MIPI DSI command mode.
This patch also adds the user manual description of the display configuration.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index 2df3592..58bfb2a 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -468,13 +468,20 @@ static int exynos_dsi_init_link(struct exynos_dsi *dsi)
/* DSI configuration */
reg = 0;

+   /*
+* The first bit of mode_flags specifies display configuration.
+* If this bit is set[= MIPI_DSI_MODE_VIDEO], dsi will support video
+* mode, otherwise it will support command mode.
+*/
if (dsi->mode_flags & MIPI_DSI_MODE_VIDEO) {
reg |= DSIM_VIDEO_MODE;

+   /*
+* The user manual describes that following bits are ignored in
+* command mode.
+*/
if (!(dsi->mode_flags & MIPI_DSI_MODE_VSYNC_FLUSH))
reg |= DSIM_MFLUSH_VS;
-   if (!(dsi->mode_flags & MIPI_DSI_MODE_EOT_PACKET))
-   reg |= DSIM_EOT_DISABLE;
if (dsi->mode_flags & MIPI_DSI_MODE_VIDEO_SYNC_PULSE)
reg |= DSIM_SYNC_INFORM;
if (dsi->mode_flags & MIPI_DSI_MODE_VIDEO_BURST)
@@ -491,6 +498,9 @@ static int exynos_dsi_init_link(struct exynos_dsi *dsi)
reg |= DSIM_HSA_MODE;
}

+   if (!(dsi->mode_flags & MIPI_DSI_MODE_EOT_PACKET))
+   reg |= DSIM_EOT_DISABLE;
+
switch (dsi->format) {
case MIPI_DSI_FMT_RGB888:
reg |= DSIM_MAIN_PIX_FORMAT_RGB888;
-- 
1.9.0



[PATCH v6 00/14] drm/exynos: support LCD I80 interface display

2014-07-17 Thread YoungJun Cho
Hi,

This series adds LCD I80 interface display support for Exynos DRM driver.
The FIMD(display controller) specification describes it as "LCD I80 interface"
and the DSI specification describes it as "Command mode interface".

This is based on exynos-drm-next branch.

The previous patches,
RFC: http://www.spinics.net/lists/dri-devel/msg58898.html
V1: http://www.spinics.net/lists/dri-devel/msg59291.html
V2: http://www.spinics.net/lists/dri-devel/msg59867.html
V3: http://www.spinics.net/lists/dri-devel/msg60708.html
V4: http://www.spinics.net/lists/dri-devel/msg60943.html
V5: http://www.spinics.net/lists/dri-devel/msg62956.html

Changelog v2:
- Fixes typo and removes unnecessary error log. (commented by Andrzej Hajda)
- Adds missed pending_flip flag clear points. (commented by Daniel Kurtz)

Changelog v3:
- Removes generic command mode and command mode display timing interface.
- Moves I80 interface timings from panel DT to the FIMD(display controller) DT.

Changelog v4:
- Removes exynos5 sysreg(syscon) DT bindings and node from dtsi because
  it was already updated by linux-samsung-soc. (commented by Vivek Gautam)

Changelog v5:
- Fixes FIMD vidcon0 register relevant code.
- Fixes panel gamma table, disable sequence.
- Slightly updates for code cleanup.

Changelog v6:
- Removes pass TE host ops in dsi and exynos dsi uses TE irq handler instead,
  and it is related with the TE GPIO of panel. (commented by Thierry Reding)

Patches 1 and 2 fix trivial bugs.

Patches 3, 4, 5 and 6 implement FIMD(display controller) I80 interface.
The MIPI DSI command mode based panel generates Tearing Effect synchronization
signal between MCU and FB to display video image, and FIMD should trigger to
transfer video image at this signal.
So the panel generates it and the dsi should receive the TE IRQ and call TE
handler chains to notify it to the FIMD.

Patches 7 and 8 implement to use Exynos5410 / 5420 / 5440 SoC DSI driver
which is different from previous Exynos4 SoCs for some registers control.

Patches 9 and 10 introduce MIPI DSI command mode based Samsung S6E3FA0 AMOLED
5.7" LCD drm panel driver.

The others add DT property nodes to support MIPI DSI command mode.

I welcome any comments.

Thank you.
Best regards YJ

YoungJun Cho (14):
  drm/exynos: dsi: move the EoT packets configuration point
  drm/exynos: use wait_event_timeout() for safety usage
  ARM: dts: samsung-fimd: add LCD I80 interface specific properties
  drm/exynos: add TE handler to support LCD I80 interface
  drm/exynos: dsi: add TE interrupt handler to support LCD I80 interface
  drm/exynos: fimd: support LCD I80 interface
  ARM: dts: exynos_dsim: add exynos5410 compatible to DT bindings
  drm/exynos: dsi: add driver data to support Exynos5410/5420/5440 SoCs
  ARM: dts: s6e3fa0: add DT bindings
  drm/panel: add S6E3FA0 driver
  ARM: dts: exynos4: add system register property
  ARM: dts: exynos5: add system register property
  ARM: dts: exynos5420: add mipi-phy node
  ARM: dts: exynos5420: add dsi node

 .../devicetree/bindings/panel/samsung,s6e3fa0.txt  |  46 ++
 .../devicetree/bindings/video/exynos_dsim.txt  |   4 +-
 .../devicetree/bindings/video/samsung-fimd.txt |  28 ++
 arch/arm/boot/dts/exynos4.dtsi |   1 +
 arch/arm/boot/dts/exynos5.dtsi |   1 +
 arch/arm/boot/dts/exynos5420.dtsi  |  20 +
 drivers/gpu/drm/exynos/Kconfig |   1 +
 drivers/gpu/drm/exynos/exynos_drm_crtc.c   |  15 +-
 drivers/gpu/drm/exynos/exynos_drm_crtc.h   |   7 +
 drivers/gpu/drm/exynos/exynos_drm_drv.h|   3 +
 drivers/gpu/drm/exynos/exynos_drm_dsi.c| 266 +-
 drivers/gpu/drm/exynos/exynos_drm_fimd.c   | 276 +--
 drivers/gpu/drm/panel/Kconfig  |   7 +
 drivers/gpu/drm/panel/Makefile |   1 +
 drivers/gpu/drm/panel/panel-s6e3fa0.c  | 541 +
 include/video/samsung_fimd.h   |   3 +-
 16 files changed, 1146 insertions(+), 74 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt
 create mode 100644 drivers/gpu/drm/panel/panel-s6e3fa0.c

-- 
1.9.0



[Bug 81279] RadeonSI in Counter Strike Source [Source Engine] has wrong textures.

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81279

Aaron B  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #14 from Aaron B  ---
Fixed, LLVM breakage patched with Mesa commit
d859bdb4b5beee8059d3e5c0f789dd8ae4061c4.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[PATCH v2 00/25] AMDKFD kernel driver

2014-07-17 Thread Oded Gabbay
Forgot to cc mailing list on cover letter. Sorry.

As a continuation to the existing discussion, here is a v2 patch series 
restructured with a cleaner history and no totally-different-early-versions of 
the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them
are modifications to radeon driver and 18 of them include only amdkfd code.
There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under
drm/radeon/amdkfd. This move was done to emphasize the fact that this driver is 
an AMD-only driver at this point. Having said that, we do foresee a generic hsa 
framework being implemented in the future and in that case, we will adjust 
amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to keep
it as a separate driver from radeon. Therefore, the amdkfd code is contained in
its own folder. The amdkfd folder was put under the radeon folder because the
only AMD gfx driver in the Linux kernel at this point is the radeon driver.
Having said that, we will probably need to move it (maybe to be directly under
drm) after we integrate with additional AMD gfx drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay

Original Cover Letter:

This patch set implements a Heterogeneous System Architecture (HSA) driver for 
radeon-family GPUs.
HSA allows different processor types (CPUs, DSPs, GPUs, etc.) to share system
resources more effectively via HW features including shared pageable memory,
userspace-accessible work queues, and platform-level atomics. In addition to
the memory protection mechanisms in GPUVM and IOMMUv2, the Sea Islands family
of GPUs also performs HW-level validation of commands passed in through the
queues (aka rings).

The code in this patch set is intended to serve both as a sample driver for 
other HSA-compatible hardware devices and as a production driver for 
radeon-family processors. The code is architected to support multiple CPUs each 
with connected GPUs, although the current implementation focuses on a single 
Kaveri/Berlin APU, and works alongside the existing radeon kernel graphics 
driver (kgd).
AMD GPUs designed for use with HSA (Sea Islands and up) share some hardware
functionality between HSA compute and regular gfx/compute (memory, interrupts,
registers), while other functionality has been added specifically for HSA
compute (hw scheduler for virtualized compute rings). All shared hardware is
owned by the radeon graphics driver, and an interface between kfd and kgd
allows the kfd to make use of those shared resources, while HSA-specific
functionality is managed directly by kfd by submitting packets into an
HSA-specific command queue (the "HIQ").

During kfd module initialization a char device node (/dev/kfd) is created 
(surviving until module exit), with ioctls for queue creation & management, and 
data structures are initialized for managing HSA device topology.
The rest of the initialization is driven by calls from the radeon kgd at the 
following points :

- radeon_init (kfd_init)
- radeon_exit (kfd_fini)
- radeon_driver_load_kms (kfd_device_probe, kfd_device_init)
- radeon_driver_unload_kms (kfd_device_fini)

During the probe and init processing per-device data structures are established 
which connect to the associated graphics kernel driver. This information is 
exposed to userspace via sysfs, along with a version number allowing userspace 
to determine if a topology change has occurred while it was reading from sysfs.
The interface between kfd and kgd also allows the kfd to request buffer 
management services from kgd, and allows kgd to route interrupt requests to kfd 
code since the interrupt block is shared between regular graphics/compute and 
HSA compute subsystems in the GPU.

The kfd code works with an open source usermode library ("libhsakmt") which is
in the final stages of IP review and should be published in a separate repo
over the next few days.
The code operates in one of three modes, selectable via the sched_policy module 
parameter :

- sched_policy=0 uses a hardware scheduler running in the MEC block within CP,
and allows oversubscription (more queues than HW slots)
- sched_policy=1 also uses HW scheduling but does not allow oversubscription,
so create_queue requests fail when we run out of HW slots
- sched_policy=2 does not use HW scheduling, so the driver manually assigns
queues to HW slots by programming registers
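
The three modes listed above can be summarised in a short sketch. The enum constants and helper functions are hypothetical names chosen for illustration and are not taken from the amdkfd sources. The admission check for sched_policy=2 is an assumption: the text only says the driver assigns queues to HW slots, so the sketch treats it as bounded by the slot count.

```c
#include <stdbool.h>

/* Hypothetical names for the three sched_policy values listed above. */
enum kfd_sched_policy {
	KFD_SCHED_HWS = 0,                  /* HW scheduling, oversubscription allowed */
	KFD_SCHED_HWS_NO_OVERSUBSCRIPTION,  /* HW scheduling, fail when slots run out */
	KFD_SCHED_NO_HWS,                   /* driver programs HW slots directly */
};

/* Policies 0 and 1 rely on the hardware scheduler. */
static bool policy_uses_hw_scheduler(int policy)
{
	return policy == KFD_SCHED_HWS ||
	       policy == KFD_SCHED_HWS_NO_OVERSUBSCRIPTION;
}

/* Only policy 0 accepts more queues than HW slots. */
static bool policy_allows_oversubscription(int policy)
{
	return policy == KFD_SCHED_HWS;
}

/*
 * Whether a create_queue request can be accepted. For policy 2 this is an
 * assumption: manual slot assignment is treated as limited by hw_slots.
 */
static bool can_create_queue(int policy, unsigned int active,
			     unsigned int hw_slots)
{
	if (policy_allows_oversubscription(policy))
		return true;
	return active < hw_slots;
}
```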

The "no HW scheduling" option is for debug & new hardware bringup only, so has 
less test coverage than the other options. Default in the current code is "HW 
scheduling without oversubscription" since that is where we have the most test 
coverage but we expect to change the default to "HW scheduling with 
oversubscription" after 

[pull] radeon drm-fixes-3.16

2014-07-17 Thread Alex Deucher
Hi Dave,

A few more fixes for 3.16.  The pageflipping fixes I dropped last week
have finally shaped up so this is mostly fixes for fallout from the
pageflipping code changes.  Also fix a memory leak and a black screen
when restoring the backlight on console unblanking.

The following changes since commit bf38b025d3f58f4c1273714ff1be5bfbf99574a4:

  Merge branch 'drm-fixes-3.16' of git://people.freedesktop.org/~agd5f/linux 
into drm-fixes (2014-07-11 11:24:13 +1000)

are available in the git repository at:


  git://people.freedesktop.org/~agd5f/linux drm-fixes-3.16

for you to fetch changes up to 5f87e090a7368adc2290ae17ffd82a070caadd20:

  drm/radeon: Make classic pageflip completion path less racy. (2014-07-17 
09:04:03 -0400)


Alex Deucher (2):
  drm/radeon: avoid leaking edid data
  drm/radeon: set default bl level to something reasonable

Mario Kleiner (4):
  drm/radeon: Prevent too early kms-pageflips triggered by vblank.
  drm/radeon: Remove redundant fence unref in pageflip path.
  drm/radeon: Add missing vblank_put in pageflip ioctl error path.
  drm/radeon: Make classic pageflip completion path less racy.

Michel Dänzer (2):
  drm/radeon: Move pinning the BO back to radeon_crtc_page_flip()
  drm/radeon: Complete page flip even if waiting on the BO fence fails

 drivers/gpu/drm/radeon/atombios_crtc.c |   8 +-
 drivers/gpu/drm/radeon/atombios_encoders.c |  10 +-
 drivers/gpu/drm/radeon/evergreen.c |   5 +-
 drivers/gpu/drm/radeon/evergreen_reg.h |   1 -
 drivers/gpu/drm/radeon/radeon.h|   3 +-
 drivers/gpu/drm/radeon/radeon_display.c| 198 +++--
 drivers/gpu/drm/radeon/rv515.c |   5 +-
 7 files changed, 119 insertions(+), 111 deletions(-)


[Bug 79980] Random radeonsi crashes

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79980

Christian König  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #102867|0                           |1
        is obsolete|                            |
 Attachment #102925|0                           |1
        is obsolete|                            |
 Attachment #102966|0                           |1
        is obsolete|                            |

--- Comment #42 from Christian König  ---
Created attachment 102992
  --> https://bugs.freedesktop.org/attachment.cgi?id=102992&action=edit
Possible fix v3.

Updated and largely simplified patch.

I'm running the third piglit test with it now and so far the system seems to be
stable.



[Bug 78453] [HAWAII] Get acceleration working

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=78453

--- Comment #80 from Alex Deucher  ---
(In reply to comment #79)
> Ok, this is weird: just out of curiosity I tried to launch Xorg with
> "-retro", then I do see errors logged (see attached excerpt from dmesg). If
> I run Xorg without parameters, I just end up with a black screen, no logged
> errors and Xorg.0.log looks like everything is fine (except for that
> "config_odev_get_int_attribute" error):

> Shouldn't I be seeing the same errors as with "-retro" as well?

-retro invokes acceleration while the non-retro case does not.



[Bug 79980] Random radeonsi crashes

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79980

--- Comment #41 from Aaron B  ---
This is with only the 3.16-rc5 patch without fix-ups, which was working okay.
But when I clicked on the top-right of facebook to open up an event, it went
out just like old times. But if you see from the time, it had a good run this
time for sure. Youtube/Video players in general never crashed it once. I have
the fixed kernel building now so soon I'll jump on the fixed one, it looks like
code related to this has changed (Error message output a little different.) so
I'll try it out.

http://pastebin.com/zntHnrxu



[PATCH v2 25/25] amdkfd: Implement the PMC Acquire/Release IOCTLs

2014-07-17 Thread Oded Gabbay
From: Evgeny Pinchuk 

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 46 +++--
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c  |  2 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|  5 
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c |  6 
 4 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index 1e19504..be90ab9 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -391,12 +391,54 @@ static int kfd_ioctl_get_process_apertures(struct file 
*filp, struct kfd_process

 static long kfd_ioctl_pmc_acquire_access(struct file *filp, struct kfd_process 
*p, void __user *arg)
 {
-   return -ENODEV;
+	struct kfd_ioctl_pmc_acquire_access_args args;
+	struct kfd_dev *dev;
+	int err = -EBUSY;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	dev = kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	spin_lock(&dev->pmc_access_lock);
+	if (dev->pmc_locking_process == NULL) {
+		dev->pmc_locking_process = p;
+		dev->pmc_locking_trace = args.trace_id;
+		err = 0;
+	} else if (dev->pmc_locking_process == p && dev->pmc_locking_trace == args.trace_id) {
+		/* Same trace already has an access. Returning success */
+		err = 0;
+	}
+
+	spin_unlock(&dev->pmc_access_lock);
+
+	return err;
 }

 static long kfd_ioctl_pmc_release_access(struct file *filp, struct kfd_process 
*p, void __user *arg)
 {
-   return -ENODEV;
+	struct kfd_ioctl_pmc_release_access_args args;
+	struct kfd_dev *dev;
+	int err = -EINVAL;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	dev = kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	spin_lock(&dev->pmc_access_lock);
+	if (dev->pmc_locking_process == p && dev->pmc_locking_trace == args.trace_id) {
+		dev->pmc_locking_process = NULL;
+		dev->pmc_locking_trace = 0;
+		err = 0;
+	}
+	spin_unlock(&dev->pmc_access_lock);
+
+	return err;
 }

 static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
index 6a7a8b2..f1cbc46 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
@@ -184,6 +184,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
return false;
}

+	spin_lock_init(&kfd->pmc_access_lock);
+
kfd->init_complete = true;
dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
 kfd->pdev->device);
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 1db1ede..a5356d1 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -131,6 +131,11 @@ struct kfd_dev {

/* QCM Device instance */
struct device_queue_manager *dqm;
+
+   /* Performance counters exclusivity lock */
+   spinlock_t pmc_access_lock;
+   struct kfd_process *pmc_locking_process;
+   uint64_t pmc_locking_trace;
 };

 /* KGD2KFD callbacks */
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_process.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
index bcc004f..a67c239 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
@@ -98,6 +98,12 @@ static void free_process(struct kfd_process *p)
BUG_ON(p == NULL);

list_for_each_entry_safe(pdd, temp, &p->per_device_data, per_device_list) {
+		spin_lock(&pdd->dev->pmc_access_lock);
+		if (pdd->dev->pmc_locking_process == p) {
+			pdd->dev->pmc_locking_process = NULL;
+			pdd->dev->pmc_locking_trace = 0;
+		}
+		spin_unlock(&pdd->dev->pmc_access_lock);
amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
list_del(&pdd->per_device_list);
kfree(pdd);
-- 
1.9.1
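
The ownership rule that the acquire/release IOCTLs in this patch implement can be modelled in user space. The following is only a minimal sketch, assuming a pthread mutex in place of the kernel spinlock; `struct pmc_owner`, `pmc_owner_init`, `pmc_acquire` and `pmc_release` are hypothetical names for illustration and are not part of amdkfd.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the three fields added to struct kfd_dev. */
struct pmc_owner {
	pthread_mutex_t lock;	/* stands in for spinlock_t pmc_access_lock */
	const void *process;	/* stands in for pmc_locking_process */
	uint64_t trace;		/* stands in for pmc_locking_trace */
};

static void pmc_owner_init(struct pmc_owner *o)
{
	assert(o != NULL);
	pthread_mutex_init(&o->lock, NULL);
	o->process = NULL;
	o->trace = 0;
}

/* Returns 0 if the caller holds access afterwards, -EBUSY if someone else does. */
static int pmc_acquire(struct pmc_owner *o, const void *process, uint64_t trace)
{
	int err = -EBUSY;

	assert(o != NULL);
	pthread_mutex_lock(&o->lock);
	if (o->process == NULL) {
		/* Nobody owns the counters: take ownership. */
		o->process = process;
		o->trace = trace;
		err = 0;
	} else if (o->process == process && o->trace == trace) {
		/* Same process and trace already own them: report success again. */
		err = 0;
	}
	pthread_mutex_unlock(&o->lock);
	return err;
}

/* Returns 0 if the caller was the owner and has now released, -EINVAL otherwise. */
static int pmc_release(struct pmc_owner *o, const void *process, uint64_t trace)
{
	int err = -EINVAL;

	assert(o != NULL);
	pthread_mutex_lock(&o->lock);
	if (o->process == process && o->trace == trace) {
		o->process = NULL;
		o->trace = 0;
		err = 0;
	}
	pthread_mutex_unlock(&o->lock);
	return err;
}
```

In this model a second acquire by the same (process, trace) pair succeeds without changing state, while a different process or a different trace id gets -EBUSY until the owner releases.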



[PATCH v2 24/25] amdkfd: Implement the Get Process Aperture IOCTL

2014-07-17 Thread Oded Gabbay
From: Alexey Skidanov 

Signed-off-by: Alexey Skidanov 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 40 -
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|  5 
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index 72d8e79..1e19504 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -348,7 +348,45 @@ static long kfd_ioctl_get_clock_counters(struct file 
*filep, struct kfd_process

 static int kfd_ioctl_get_process_apertures(struct file *filp, struct 
kfd_process *p, void __user *arg)
 {
-   return -ENODEV;
+	struct kfd_ioctl_get_process_apertures_args args;
+	struct kfd_process_device *pdd;
+
+	dev_dbg(kfd_device, "get apertures for PASID %d", p->pasid);
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	args.num_of_nodes = 0;
+
+	mutex_lock(&p->mutex);
+
+	/*if the process-device list isn't empty*/
+	if (kfd_has_process_device_data(p)) {
+		/* Run over all pdd of the process */
+		pdd = kfd_get_first_process_device_data(p);
+		do {
+
+			args.process_apertures[args.num_of_nodes].gpu_id = pdd->dev->id;
+			args.process_apertures[args.num_of_nodes].lds_base = pdd->lds_base;
+			args.process_apertures[args.num_of_nodes].lds_limit = pdd->lds_limit;
+			args.process_apertures[args.num_of_nodes].gpuvm_base = pdd->gpuvm_base;
+			args.process_apertures[args.num_of_nodes].gpuvm_limit = pdd->gpuvm_limit;
+			args.process_apertures[args.num_of_nodes].scratch_base = pdd->scratch_base;
+			args.process_apertures[args.num_of_nodes].scratch_limit = pdd->scratch_limit;
+
+			dev_dbg(kfd_device, "node id %u, gpu id %u, lds_base %llX lds_limit %llX gpuvm_base %llX gpuvm_limit %llX scratch_base %llX scratch_limit %llX",
+				args.num_of_nodes, pdd->dev->id, pdd->lds_base, pdd->lds_limit, pdd->gpuvm_base, pdd->gpuvm_limit, pdd->scratch_base, pdd->scratch_limit);
+			args.num_of_nodes++;
+		} while ((pdd = kfd_get_next_process_device_data(p, pdd)) != NULL &&
+				(args.num_of_nodes < NUM_OF_SUPPORTED_GPUS));
+	}
+
+	mutex_unlock(&p->mutex);
+
+	if (copy_to_user(arg, &args, sizeof(args)))
+		return -EFAULT;
+
+	return 0;
 }

 static long kfd_ioctl_pmc_acquire_access(struct file *filp, struct kfd_process 
*p, void __user *arg)
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 7ea0e81..1db1ede 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -346,6 +346,11 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, 
pasid_t pasid);
 struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
struct kfd_process *p);

+/* Process device data iterator */
+struct kfd_process_device *kfd_get_first_process_device_data(struct 
kfd_process *p);
+struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process 
*p, struct kfd_process_device *pdd);
+bool kfd_has_process_device_data(struct kfd_process *p);
+
 /* PASIDs */
 int kfd_pasid_init(void);
 void kfd_pasid_exit(void);
-- 
1.9.1
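
The capped copy loop in kfd_ioctl_get_process_apertures() can be illustrated outside the kernel. This is a rough sketch only: the singly linked `struct device_node`, the reduced `struct aperture`, the helper `fill_apertures` and the value chosen for NUM_OF_SUPPORTED_GPUS are all assumptions made for the example, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_OF_SUPPORTED_GPUS 4	/* hypothetical value for this sketch */

/* Hypothetical per-device record, standing in for struct kfd_process_device. */
struct device_node {
	uint32_t gpu_id;
	uint64_t lds_base;
	uint64_t lds_limit;
	struct device_node *next;
};

/* Reduced aperture record; the real one also carries gpuvm and scratch ranges. */
struct aperture {
	uint32_t gpu_id;
	uint64_t lds_base;
	uint64_t lds_limit;
};

struct apertures_args {
	struct aperture process_apertures[NUM_OF_SUPPORTED_GPUS];
	uint32_t num_of_nodes;
};

/* Copy at most NUM_OF_SUPPORTED_GPUS entries; extra devices are silently skipped. */
static void fill_apertures(const struct device_node *head, struct apertures_args *args)
{
	const struct device_node *n;

	assert(args != NULL);
	args->num_of_nodes = 0;

	for (n = head; n != NULL && args->num_of_nodes < NUM_OF_SUPPORTED_GPUS; n = n->next) {
		struct aperture *a = &args->process_apertures[args->num_of_nodes];

		a->gpu_id = n->gpu_id;
		a->lds_base = n->lds_base;
		a->lds_limit = n->lds_limit;
		args->num_of_nodes++;
	}
}
```

A for loop is used here instead of the patch's "if list not empty, then do/while" form; both stop at the end of the list or at the array capacity, whichever comes first.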



[PATCH v2 23/25] amdkfd: Implement the Get Clock Counters IOCTL

2014-07-17 Thread Oded Gabbay
From: Evgeny Pinchuk 

Signed-off-by: Evgeny Pinchuk 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index 085bd91..72d8e79 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -315,7 +315,34 @@ out:

 static long kfd_ioctl_get_clock_counters(struct file *filep, struct 
kfd_process *p, void __user *arg)
 {
-   return -ENODEV;
+	struct kfd_ioctl_get_clock_counters_args args;
+	struct kfd_dev *dev;
+	struct timespec time;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	dev = kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	/* Reading GPU clock counter from KGD */
+	args.gpu_clock_counter = kfd2kgd->get_gpu_clock_counter(dev->kgd);
+
+	/* No access to rdtsc. Using raw monotonic time */
+	getrawmonotonic(&time);
+	args.cpu_clock_counter = (uint64_t)timespec_to_ns(&time);
+
+	get_monotonic_boottime(&time);
+	args.system_clock_counter = (uint64_t)timespec_to_ns(&time);
+
+	/* Since the counter is in nano-seconds we use 1GHz frequency */
+	args.system_clock_freq = 1000000000;
+
+	if (copy_to_user(arg, &args, sizeof(args)))
+		return -EFAULT;
+
+	return 0;
 }


-- 
1.9.1
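
For readers who want to see what the CPU-side counters in this IOCTL correspond to, here is a user-space approximation. It is a sketch under assumptions: it uses the Linux clock ids CLOCK_MONOTONIC_RAW and CLOCK_BOOTTIME as the closest user-space counterparts of getrawmonotonic() and get_monotonic_boottime(), it omits the GPU counter entirely, and `struct clock_sample`, `ts_to_ns` and `sample_clocks` are made-up names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical result record mirroring the CPU-side fields of the IOCTL args. */
struct clock_sample {
	uint64_t cpu_clock_counter;	/* raw monotonic time, in ns */
	uint64_t system_clock_counter;	/* monotonic time including suspend, in ns */
	uint64_t system_clock_freq;	/* ticks per second of the counters above */
};

/* Convert a timespec to nanoseconds, like the kernel's timespec_to_ns(). */
static uint64_t ts_to_ns(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Returns 0 on success, -1 if either clock cannot be read. */
static int sample_clocks(struct clock_sample *out)
{
	struct timespec ts;

	assert(out != NULL);

	if (clock_gettime(CLOCK_MONOTONIC_RAW, &ts) != 0)
		return -1;
	out->cpu_clock_counter = ts_to_ns(&ts);

	if (clock_gettime(CLOCK_BOOTTIME, &ts) != 0)
		return -1;
	out->system_clock_counter = ts_to_ns(&ts);

	/* The counters are in nanoseconds, so the nominal frequency is 1 GHz. */
	out->system_clock_freq = 1000000000ULL;
	return 0;
}
```

Because both counters are reported in nanoseconds, a consumer can turn a counter delta into seconds by dividing by system_clock_freq.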



[PATCH v2 22/25] amdkfd: Implement the Set Memory Policy IOCTL

2014-07-17 Thread Oded Gabbay
From: Andrew Lewycky 

Signed-off-by: Andrew Lewycky 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 51 -
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index a74693a..085bd91 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include "kfd_priv.h"
+#include "kfd_device_queue_manager.h"

 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -261,7 +262,55 @@ static int kfd_ioctl_update_queue(struct file *filp, 
struct kfd_process *p, void

 static long kfd_ioctl_set_memory_policy(struct file *filep, struct kfd_process 
*p, void __user *arg)
 {
-   return -ENODEV;
+	struct kfd_ioctl_set_memory_policy_args args;
+	struct kfd_dev *dev;
+	int err = 0;
+	struct kfd_process_device *pdd;
+	enum cache_policy default_policy, alternate_policy;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	if (args.default_policy != KFD_IOC_CACHE_POLICY_COHERENT
+	    && args.default_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+		return -EINVAL;
+	}
+
+	if (args.alternate_policy != KFD_IOC_CACHE_POLICY_COHERENT
+	    && args.alternate_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+		return -EINVAL;
+	}
+
+	dev = kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	mutex_lock(&p->mutex);
+
+	pdd = kfd_bind_process_to_device(dev, p);
+	if (IS_ERR(pdd) < 0) {
+		err = PTR_ERR(pdd);
+		goto out;
+	}
+
+	default_policy = (args.default_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+			 ? cache_policy_coherent : cache_policy_noncoherent;
+
+	alternate_policy = (args.alternate_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+			   ? cache_policy_coherent : cache_policy_noncoherent;
+
+	if (!dev->dqm->set_cache_memory_policy(dev->dqm,
+					       &pdd->qpd,
+					       default_policy,
+					       alternate_policy,
+					       (void __user *)args.alternate_aperture_base,
+					       args.alternate_aperture_size))
+		err = -EINVAL;
+
+out:
+	mutex_unlock(&p->mutex);
+
+	return err;
 }

 static long kfd_ioctl_get_clock_counters(struct file *filep, struct 
kfd_process *p, void __user *arg)
-- 
1.9.1



[PATCH v2 21/25] amdkfd: Implement the create/destroy/update queue IOCTLs

2014-07-17 Thread Oded Gabbay
From: Ben Goz 

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 133 +++-
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|   8 ++
 2 files changed, 138 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index d6580a6..a74693a 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -119,17 +119,144 @@ static int kfd_open(struct inode *inode, struct file 
*filep)

 static long kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, 
void __user *arg)
 {
-   return -ENODEV;
+	struct kfd_ioctl_create_queue_args args;
+	struct kfd_dev *dev;
+	int err = 0;
+	unsigned int queue_id;
+	struct kfd_process_device *pdd;
+	struct queue_properties q_properties;
+
+	memset(&q_properties, 0, sizeof(struct queue_properties));
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	if (!access_ok(VERIFY_WRITE, args.read_pointer_address, sizeof(qptr_t))) {
+		pr_err("kfd: can't access read pointer");
+		return -EFAULT;
+	}
+
+	if (!access_ok(VERIFY_WRITE, args.write_pointer_address, sizeof(qptr_t))) {
+		pr_err("kfd: can't access write pointer");
+		return -EFAULT;
+	}
+
+	q_properties.is_interop = false;
+	q_properties.queue_percent = args.queue_percentage;
+	q_properties.priority = args.queue_priority;
+	q_properties.queue_address = args.ring_base_address;
+	q_properties.queue_size = args.ring_size;
+	q_properties.read_ptr = (qptr_t *) args.read_pointer_address;
+	q_properties.write_ptr = (qptr_t *) args.write_pointer_address;
+
+
+	pr_debug("%s Arguments: Queue Percentage (%d, %d)\n"
+			"Queue Priority (%d, %d)\n"
+			"Queue Address (0x%llX, 0x%llX)\n"
+			"Queue Size (0x%llX, %u)\n"
+			"Queue r/w Pointers (0x%llX, 0x%llX)\n",
+			__func__,
+			q_properties.queue_percent, args.queue_percentage,
+			q_properties.priority, args.queue_priority,
+			q_properties.queue_address, args.ring_base_address,
+			q_properties.queue_size, args.ring_size,
+			(uint64_t) q_properties.read_ptr,
+			(uint64_t) q_properties.write_ptr);
+
+	dev = kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	mutex_lock(&p->mutex);
+
+	pdd = kfd_bind_process_to_device(dev, p);
+	if (IS_ERR(pdd) < 0) {
+		err = PTR_ERR(pdd);
+		goto err_bind_process;
+	}
+
+	pr_debug("kfd: creating queue for PASID %d on GPU 0x%x\n",
+			p->pasid,
+			dev->id);
+
+	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, 0, KFD_QUEUE_TYPE_COMPUTE, &queue_id);
+	if (err != 0)
+		goto err_create_queue;
+
+	args.queue_id = queue_id;
+	args.doorbell_address = (uint64_t)q_properties.doorbell_ptr;
+
+	if (copy_to_user(arg, &args, sizeof(args))) {
+		err = -EFAULT;
+		goto err_copy_args_out;
+	}
+
+	mutex_unlock(&p->mutex);
+
+	pr_debug("kfd: queue id %d was created successfully.\n"
+		 " ring buffer address == 0x%016llX\n"
+		 " read ptr address== 0x%016llX\n"
+		 " write ptr address   == 0x%016llX\n"
+		 " doorbell address== 0x%016llX\n",
+			args.queue_id,
+			args.ring_base_address,
+			args.read_pointer_address,
+			args.write_pointer_address,
+			args.doorbell_address);
+
+	return 0;
+
+err_copy_args_out:
+	pqm_destroy_queue(&p->pqm, queue_id);
+err_create_queue:
+err_bind_process:
+	mutex_unlock(&p->mutex);
+	return err;
 }

 static int kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p, 
void __user *arg)
 {
-   return -ENODEV;
+	int retval;
+	struct kfd_ioctl_destroy_queue_args args;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	pr_debug("kfd: destroying queue id %d for PASID %d\n",
+			args.queue_id,
+			p->pasid);
+
+	mutex_lock(&p->mutex);
+
+	retval = pqm_destroy_queue(&p->pqm, args.queue_id);
+
+	mutex_unlock(&p->mutex);
+	return retval;
 }

 static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p, 
void __user *arg)
 {
-   return -ENODEV;
+   int retval;
+   struct kfd_ioctl_update_queue_args args;
+   struct queue_properties 

[PATCH v2 20/25] amdkfd: Add interrupt handling module

2014-07-17 Thread Oded Gabbay
From: Andrew Lewycky 

This patch adds the interrupt handling module, in kfd_interrupt.c, and its 
related members in different data structures to the amdkfd driver.

The amdkfd interrupt module maintains an internal interrupt ring per amdkfd 
device. The internal interrupt ring contains interrupts that need further 
handling. The extra handling is deferred to a later time through a workqueue.

There's no acknowledgment for the interrupts we use. The hardware simply queues 
a new interrupt each time without waiting.

The fixed-size internal queue means that it's possible for us to lose 
interrupts because we have no back-pressure to the hardware.
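
The consequence of a fixed-size ring without back-pressure can be seen in a small user-space model. This is a sketch, not the code in kfd_interrupt.c: the capacity IH_RING_ENTRIES, the `dropped` counter and the names `struct ih_ring`, `ih_ring_enqueue` and `ih_ring_dequeue` are assumptions for illustration; only the entry size of four 32-bit words is taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define IH_ENTRY_DWORDS 4	/* ih_ring_entry_size = 4 * sizeof(uint32_t) in the patch */
#define IH_RING_ENTRIES 8	/* hypothetical capacity; one slot always stays empty */

struct ih_ring {
	uint32_t entries[IH_RING_ENTRIES][IH_ENTRY_DWORDS];
	unsigned int rptr;	/* next entry to read */
	unsigned int wptr;	/* next slot to write */
	unsigned long dropped;	/* entries lost because the ring was full */
};

/* Producer side (the ISR in the driver). Returns false when the entry is dropped. */
static bool ih_ring_enqueue(struct ih_ring *r, const uint32_t *entry)
{
	unsigned int next;

	assert(r != NULL && entry != NULL);
	next = (r->wptr + 1) % IH_RING_ENTRIES;
	if (next == r->rptr) {
		/* Ring full: the producer cannot be stalled, so the entry is lost. */
		r->dropped++;
		return false;
	}
	memcpy(r->entries[r->wptr], entry, sizeof(r->entries[0]));
	r->wptr = next;
	return true;
}

/* Consumer side (the deferred work). Returns false when the ring is empty. */
static bool ih_ring_dequeue(struct ih_ring *r, uint32_t *entry)
{
	assert(r != NULL && entry != NULL);
	if (r->rptr == r->wptr)
		return false;
	memcpy(entry, r->entries[r->rptr], sizeof(r->entries[0]));
	r->rptr = (r->rptr + 1) % IH_RING_ENTRIES;
	return true;
}
```

If the consumer falls behind by more than IH_RING_ENTRIES - 1 entries, every further enqueue fails until a dequeue frees a slot, which is the loss scenario described above.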

Signed-off-by: Andrew Lewycky 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile|   3 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c|  16 ++-
 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c | 161 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h  |  18 ++-
 4 files changed, 193 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 44639f2..e634681 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -8,6 +8,7 @@ amdkfd-y:= kfd_module.o kfd_device.o kfd_chardev.o 
kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_vidmem.o kfd_aperture.o \
kfd_process.o kfd_queue.o kfd_mqd_manager.o \
kfd_kernel_queue.o kfd_packet_manager.o \
-   kfd_process_queue_manager.o kfd_device_queue_manager.o
+   kfd_process_queue_manager.o kfd_device_queue_manager.o \
+   kfd_interrupt.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
index f5e9f39..6a7a8b2 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
@@ -29,6 +29,7 @@

 static const struct kfd_device_info kaveri_device_info = {
.max_pasid_bits = 16,
+   .ih_ring_entry_size = 4 * sizeof(uint32_t)
 };

 struct kfd_deviceid {
@@ -156,6 +157,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,

kfd_doorbell_init(kfd);

+   if (kfd_interrupt_init(kfd))
+   return false;
+
if (!device_iommu_pasid_init(kfd))
return false;

@@ -195,6 +199,8 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)

BUG_ON(err != 0);

+   kfd_interrupt_exit(kfd);
+
if (kfd->init_complete) {
device_queue_manager_uninit(kfd->dqm);
amd_iommu_free_device(kfd->pdev);
@@ -233,6 +239,14 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
return 0;
 }

-void kgd2kfd_interrupt(struct kfd_dev *dev, const void *ih_ring_entry)
+/* This is called directly from KGD at ISR. */
+void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 {
+	spin_lock(&kfd->interrupt_lock);
+
+	if (kfd->interrupts_active
+	    && enqueue_ih_ring_entry(kfd, ih_ring_entry))
+		schedule_work(&kfd->interrupt_work);
+
+	spin_unlock(&kfd->interrupt_lock);
 }
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c
new file mode 100644
index 0000000..eed43a7
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c
@@ -0,0 +1,161 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/*
+ * KFD Interrupts.
+ *
+ * AMD GPUs deliver interrupts by pushing an interrupt description onto the
+ * interrupt ring and then sending an interrupt. KGD receives the interrupt
+ * in ISR and sends us a pointer to each new entry on the interrupt ring.
+ *
+ * We generally can't process interrupt-signaled events from ISR, so we call
+ * 

[PATCH v2 19/25] amdkfd: Add device queue manager module

2014-07-17 Thread Oded Gabbay
From: Ben Goz 

The queue scheduler is divided into two sections: one is bound to the process 
and the other is bound to the device.
The device-bound section is handled by this module.
The DQM module handles queue setup, update and tear-down from the device side.
It also supports suspend/resume operations.

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile |   2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c |  26 +-
 .../drm/radeon/amdkfd/kfd_device_queue_manager.c   | 985 +
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  13 +
 4 files changed, 1023 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index eacef85..44639f2 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -8,6 +8,6 @@ amdkfd-y:= kfd_module.o kfd_device.o kfd_chardev.o 
kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_vidmem.o kfd_aperture.o \
kfd_process.o kfd_queue.o kfd_mqd_manager.o \
kfd_kernel_queue.o kfd_packet_manager.o \
-   kfd_process_queue_manager.o
+   kfd_process_queue_manager.o kfd_device_queue_manager.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
index 7c4c836..f5e9f39 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include "kfd_priv.h"
+#include "kfd_device_queue_manager.h"

 static const struct kfd_device_info kaveri_device_info = {
.max_pasid_bits = 16,
@@ -165,10 +166,26 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,

amd_iommu_set_invalidate_ctx_cb(kfd->pdev, 
iommu_pasid_shutdown_callback);

+   kfd->dqm = device_queue_manager_init(kfd);
+   if (!kfd->dqm) {
+   kfd_topology_remove_device(kfd);
+   amd_iommu_free_device(kfd->pdev);
+   return false;
+   }
+
+   if (kfd->dqm->start(kfd->dqm) != 0) {
+   device_queue_manager_uninit(kfd->dqm);
+   kfd_topology_remove_device(kfd);
+   amd_iommu_free_device(kfd->pdev);
+   return false;
+   }
+
kfd->init_complete = true;
dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
 kfd->pdev->device);

+   pr_debug("kfd: Starting kfd with the following scheduling policy %d\n", 
sched_policy);
+
return true;
 }

@@ -178,8 +195,10 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)

BUG_ON(err != 0);

-   if (kfd->init_complete)
+   if (kfd->init_complete) {
+   device_queue_manager_uninit(kfd->dqm);
amd_iommu_free_device(kfd->pdev);
+   }

kfree(kfd);
 }
@@ -188,8 +207,10 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
 {
BUG_ON(kfd == NULL);

-   if (kfd->init_complete)
+   if (kfd->init_complete) {
+   kfd->dqm->stop(kfd->dqm);
amd_iommu_free_device(kfd->pdev);
+   }
 }

 int kgd2kfd_resume(struct kfd_dev *kfd)
@@ -206,6 +227,7 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
if (err < 0)
return -ENXIO;
amd_iommu_set_invalidate_ctx_cb(kfd->pdev, 
iommu_pasid_shutdown_callback);
+   kfd->dqm->start(kfd->dqm);
}

return 0;
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c
new file mode 100644
index 0000000..d875d00
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c
@@ -0,0 +1,985 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION 

[PATCH v2 18/25] amdkfd: Add process queue manager module

2014-07-17 Thread Oded Gabbay
From: Ben Goz 

The queue scheduler is divided into two sections: one is bound to the process 
and the other is bound to the device.
The process-bound section is handled by this module. The PQM handles usermode 
queue setup, updates and tear-down.

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile |   3 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  17 +
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c|  13 +
 .../drm/radeon/amdkfd/kfd_process_queue_manager.c  | 343 +
 4 files changed, 375 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 4083f28..eacef85 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -7,6 +7,7 @@ ccflags-y := -Iinclude/drm
 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_vidmem.o kfd_aperture.o \
kfd_process.o kfd_queue.o kfd_mqd_manager.o \
-   kfd_kernel_queue.o kfd_packet_manager.o
+   kfd_kernel_queue.o kfd_packet_manager.o \
+   kfd_process_queue_manager.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 63e492a..c444b38 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -277,6 +277,9 @@ struct kfd_process_device {
/* The user-mode address of the doorbell mapping for this device. */
doorbell_t __user *doorbell_mapping;

+   /* per-process-per device QCM data structure */
+   struct qcm_process_device qpd;
+
/* Is this process/pasid bound to this device? (amd_iommu_bind_pasid) */
bool bound;

@@ -312,6 +315,8 @@ struct kfd_process {
 */
struct list_head per_device_data;

+   struct process_queue_manager pqm;
+
/* The process's queues. */
size_t queue_array_size;

@@ -382,11 +387,23 @@ inline uint32_t upper_32(uint64_t x);

 int init_queue(struct queue **q, struct queue_properties properties);
 void uninit_queue(struct queue *q);
+void print_queue_properties(struct queue_properties *q);
 void print_queue(struct queue *q);

 struct kernel_queue *kernel_queue_init(struct kfd_dev *dev, enum 
kfd_queue_type type);
 void kernel_queue_uninit(struct kernel_queue *kq);

+/* Process Queue Manager */
+struct process_queue_node {
+   struct queue *q;
+   struct kernel_queue *kq;
+   struct list_head process_queue_list;
+};
+
+int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p);
+void pqm_uninit(struct process_queue_manager *pqm);
+int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid);
+
 /* Packet Manager */

 #define KFD_HIQ_TIMEOUT (500)
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_process.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
index 908b3b7..bcc004f 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
@@ -163,8 +163,16 @@ static struct kfd_process *create_process(const struct 
task_struct *thread)

INIT_LIST_HEAD(&process->per_device_data);

+	err = pqm_init(&process->pqm, process);
+   if (err != 0)
+   goto err_process_pqm_init;
+
return process;

+err_process_pqm_init:
+   kfd_pasid_free(process->pasid);
+	list_del(&process->processes_list);
+   thread->mm->kfd_process = NULL;
 err_alloc_pasid:
kfree(process->queues);
 err_alloc_queues:
@@ -185,6 +193,9 @@ struct kfd_process_device 
*kfd_get_process_device_data(struct kfd_dev *dev,
pdd = kzalloc(sizeof(*pdd), GFP_KERNEL);
if (pdd != NULL) {
pdd->dev = dev;
+		INIT_LIST_HEAD(&pdd->qpd.queues_list);
+		INIT_LIST_HEAD(&pdd->qpd.priv_queue_list);
+		pdd->qpd.dqm = dev->dqm;
list_add(&pdd->per_device_list, &p->per_device_data);
}

@@ -246,6 +257,8 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, 
pasid_t pasid)

mutex_lock(&p->mutex);

+	pqm_uninit(&p->pqm);
+
/*
 * Just mark pdd as unbound, because we still need it to call
 * amd_iommu_unbind_pasid() in when the process exits.
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c
new file mode 100644
index 0000000..f54df3c
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c
@@ -0,0 +1,343 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including 

[PATCH v2 17/25] amdkfd: Add packet manager module

2014-07-17 Thread Oded Gabbay
From: Ben Goz 

The packet manager module builds PM4 packets for the sole use of the CP 
scheduler. Those packets are used by the HIQ to submit runlists to the CP.
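
As an illustration of what building a PM4 packet header involves, the sketch below packs a type-3 header with explicit shifts. The bit positions (opcode in bits 8-15, count in bits 16-29, type in bits 30-31) are an assumption about union PM4_TYPE_3_HEADER, whose definition is not part of this mail; only the count formula and the PM4_TYPE_3 name follow build_pm4_header() in this patch, and the value 3 for PM4_TYPE_3 is likewise assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PM4_TYPE_3 3u	/* assumed value of the type-3 packet identifier */

/*
 * Pack a type-3 header for a packet of packet_size bytes.
 * Assumed layout: bits 8-15 opcode, bits 16-29 count, bits 30-31 type.
 */
static uint32_t build_pm4_header(unsigned int opcode, size_t packet_size)
{
	uint32_t count;

	/* The count formula below needs at least two dwords to avoid underflow. */
	assert(packet_size >= 2 * sizeof(uint32_t));
	count = (uint32_t)(packet_size / sizeof(uint32_t)) - 2;

	return (PM4_TYPE_3 << 30) |
	       ((count & 0x3FFFu) << 16) |
	       (((uint32_t)opcode & 0xFFu) << 8);
}
```

Shifts are used here instead of C bit-fields because bit-field ordering is implementation-defined, which makes the assumed layout explicit rather than compiler dependent.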

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile |   2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c | 488 +
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  62 +++
 3 files changed, 551 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index bead1be..4083f28 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -7,6 +7,6 @@ ccflags-y := -Iinclude/drm
 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_vidmem.o kfd_aperture.o \
kfd_process.o kfd_queue.o kfd_mqd_manager.o \
-   kfd_kernel_queue.o
+   kfd_kernel_queue.o kfd_packet_manager.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c
new file mode 100644
index 0000000..394fbd9
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c
@@ -0,0 +1,488 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include 
+#include "kfd_device_queue_manager.h"
+#include "kfd_kernel_queue.h"
+#include "kfd_priv.h"
+#include "kfd_pm4_headers.h"
+#include "kfd_pm4_opcodes.h"
+#include "cik_mqds.h"
+
+static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes, unsigned int buffer_size_bytes)
+{
+   unsigned int temp = *wptr + increment_bytes / sizeof(uint32_t);
+
+   BUG_ON((temp * sizeof(uint32_t)) > buffer_size_bytes);
+   *wptr = temp;
+}
+
+static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size)
+{
+   union PM4_TYPE_3_HEADER header;
+
+   header.u32all = 0;
+   header.opcode = opcode;
+   header.count = packet_size/sizeof(uint32_t) - 2;
+   header.type = PM4_TYPE_3;
+
+   return header.u32all;
+}
+
+static void pm_calc_rlib_size(struct packet_manager *pm, unsigned int *rlib_size, bool *over_subscription)
+{
+   unsigned int process_count, queue_count;
+
+   BUG_ON(!pm || !rlib_size || !over_subscription);
+
+   process_count = pm->dqm->processes_count;
+   queue_count = pm->dqm->queue_count;
+
+   /* check if there is over subscription */
+   *over_subscription = false;
+   if ((process_count >= VMID_PER_DEVICE) ||
+   queue_count > PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE) {
+   *over_subscription = true;
+   pr_debug("kfd: over subscribed runlist\n");
+   }
+
+   /* calculate run list ib allocation size */
+   *rlib_size = process_count * sizeof(struct pm4_map_process) +
+queue_count * sizeof(struct pm4_map_queues);
+
+   /* reserve extra space for a chained run list when over-subscribed */
+   if (*over_subscription)
+   *rlib_size += sizeof(struct pm4_runlist);
+
+   pr_debug("kfd: runlist ib size %u\n", *rlib_size);
+}
+
+static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int **rl_buffer, uint64_t *rl_gpu_buffer,
+   unsigned int *rl_buffer_size, bool *is_over_subscription)
+{
+   int retval;
+
+   BUG_ON(!pm);
+   BUG_ON(pm->allocated == true);
+   BUG_ON(is_over_subscription == NULL);
+
+   pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
+
+   retval = kfd_vidmem_alloc_map(pm->dqm->dev, &pm->ib_buffer_obj, (void **)rl_buffer,
+
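
The sizing rule in pm_calc_rlib_size() above can be tried outside the kernel with a small sketch. The three limits are copied from kfd_device_queue_manager.h (patch 15/25). The packet sizes are passed in as parameters because the layouts of pm4_map_process, pm4_map_queues and pm4_runlist are not shown in this excerpt; struct rlib_sizes and calc_rlib_size() are names made up for illustration and are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Limits as defined in kfd_device_queue_manager.h (patch 15/25). */
#define QUEUES_PER_PIPE            8
#define PIPE_PER_ME_CP_SCHEDULING  3
#define VMID_PER_DEVICE            8

/* Hypothetical stand-in for sizeof() of the three PM4 packet structs. */
struct rlib_sizes {
	size_t map_process;
	size_t map_queues;
	size_t runlist;
};

/*
 * Illustrative user-space version of the calculation: one map_process
 * packet per process, one map_queues packet per queue, plus one runlist
 * packet for chaining when the run list is over-subscribed.
 */
static size_t calc_rlib_size(unsigned int process_count,
			     unsigned int queue_count,
			     const struct rlib_sizes *sz,
			     bool *over_subscription)
{
	size_t size;

	*over_subscription =
		process_count >= VMID_PER_DEVICE ||
		queue_count > PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE;

	size = process_count * sz->map_process + queue_count * sz->map_queues;
	if (*over_subscription)
		size += sz->runlist;

	return size;
}
```

With these limits, a run list counts as over-subscribed once it holds 8 or more processes or more than 24 queues, and only then is room for the extra runlist packet added.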

[PATCH v2 16/25] amdkfd: Add module parameter of scheduling policy

2014-07-17 Thread Oded Gabbay
From: Ben Goz 

This patch adds a new parameter to the amdkfd driver. This parameter enables 
the user to select the scheduling policy of the CP. The choices are:

* CP Scheduling with support for over-subscription
* CP Scheduling without support for over-subscription
* Without CP Scheduling
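
As a rough illustration of how the three choices line up with the values of the new parameter, the sketch below repeats the enum from kfd_priv.h and maps each value to a description. The mapping assumes the list above is in the same order as the enum; sched_policy_desc() and sched_policy_is_valid() are hypothetical helpers and are not part of this patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same values as the enum added to kfd_priv.h by this patch. */
enum kfd_sched_policy {
	KFD_SCHED_POLICY_HWS = 0,
	KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION,
	KFD_SCHED_POLICY_NO_HWS
};

/* Hypothetical helper: true if the value names one of the three policies. */
static bool sched_policy_is_valid(int policy)
{
	return policy >= KFD_SCHED_POLICY_HWS &&
	       policy <= KFD_SCHED_POLICY_NO_HWS;
}

/* Hypothetical helper: description of a policy, or NULL if out of range. */
static const char *sched_policy_desc(int policy)
{
	switch (policy) {
	case KFD_SCHED_POLICY_HWS:
		return "CP scheduling with over-subscription";
	case KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION:
		return "CP scheduling without over-subscription";
	case KFD_SCHED_POLICY_NO_HWS:
		return "no CP scheduling";
	default:
		return NULL;
	}
}
```

Note that the patch itself does not range-check the parameter; a check like the one sketched here would have to be added separately if out-of-range values should be rejected.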

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c | 4 
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   | 9 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_module.c b/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
index dc08f51..fe5e39d 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
@@ -46,6 +46,10 @@ static const struct kgd2kfd_calls kgd2kfd = {
.resume = kgd2kfd_resume,
 };

+int sched_policy = KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION;
+module_param(sched_policy, int, S_IRUSR | S_IWUSR);
+MODULE_PARM_DESC(sched_policy, "Kernel cmdline parameter that defines the kfd scheduling policy");
+
 bool kgd2kfd_init(unsigned interface_version,
  const struct kfd2kgd_calls *f2g,
  const struct kgd2kfd_calls **g2f)
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 25f23c5..8be07a1 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -64,6 +64,15 @@
 /* Macro for allocating structures */
 #define kfd_alloc_struct(ptr_to_struct) ((typeof(ptr_to_struct)) kzalloc(sizeof(*ptr_to_struct), GFP_KERNEL))

+/* Kernel module parameter to specify the scheduling policy */
+extern int sched_policy;
+
+enum kfd_sched_policy {
+   KFD_SCHED_POLICY_HWS = 0,
+   KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION,
+   KFD_SCHED_POLICY_NO_HWS
+};
+
 /*
  * Large enough to hold the maximum usable pasid + 1.
  * It must also be able to store the number of doorbells
-- 
1.9.1



[PATCH v2 15/25] amdkfd: Add kernel queue module

2014-07-17 Thread Oded Gabbay
From: Ben Goz 

The kernel queue module enables amdkfd to establish kernel queues that are not
exposed to user space.

The kernel queues are used for HIQ (HSA Interface Queue) and DIQ (Debug
Interface Queue) operations.

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile |   3 +-
 .../drm/radeon/amdkfd/kfd_device_queue_manager.h   | 101 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c   | 305 +
 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h   |  66 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h| 682 +
 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h| 107 
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  32 +
 7 files changed, 1295 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index b5201f4..bead1be 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -6,6 +6,7 @@ ccflags-y := -Iinclude/drm

 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_vidmem.o kfd_aperture.o \
-   kfd_process.o kfd_queue.o kfd_mqd_manager.o
+   kfd_process.o kfd_queue.o kfd_mqd_manager.o \
+   kfd_kernel_queue.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h
new file mode 100644
index 000..037eaf8
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h
@@ -0,0 +1,101 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef KFD_DEVICE_QUEUE_MANAGER_H_
+#define KFD_DEVICE_QUEUE_MANAGER_H_
+
+#include 
+#include 
+#include "kfd_priv.h"
+#include "kfd_mqd_manager.h"
+
+#define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS   (500)
+#define QUEUES_PER_PIPE    (8)
+#define PIPE_PER_ME_CP_SCHEDULING  (3)
+#define CIK_VMID_NUM   (8)
+#define KFD_VMID_START_OFFSET  (8)
+#define VMID_PER_DEVICE    CIK_VMID_NUM
+#define KFD_DQM_FIRST_PIPE (0)
+
+struct device_process_node {
+   struct qcm_process_device *qpd;
+   struct list_head list;
+};
+
+struct device_queue_manager {
+   int (*create_queue)(struct device_queue_manager *dqm,
+   struct queue *q,
+   struct qcm_process_device *qpd,
+   int *allocate_vmid);
+   int (*destroy_queue)(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd,
+   struct queue *q);
+   int (*update_queue)(struct device_queue_manager *dqm,
+   struct queue *q);
+   int (*destroy_queues)(struct device_queue_manager *dqm);
+   struct mqd_manager * (*get_mqd_manager)(struct device_queue_manager *dqm,
+   enum KFD_MQD_TYPE type);
+   int (*execute_queues)(struct device_queue_manager *dqm);
+   int (*register_process)(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd);
+   int (*unregister_process)(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd);
+   int

[PATCH v2 14/25] amdkfd: Add mqd_manager module

2014-07-17 Thread Oded Gabbay
From: Ben Goz 

The mqd_manager module handles MQD data structures. MQD stands for Memory Queue 
Descriptor, which is used by the H/W to keep the usermode queue state in memory.

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile  |   2 +-
 drivers/gpu/drm/radeon/amdkfd/cik_mqds.h| 185 +++
 drivers/gpu/drm/radeon/amdkfd/cik_regs.h| 220 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c | 291 
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h |  54 +
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|   8 +
 6 files changed, 759 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/cik_mqds.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/cik_regs.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index dbff147..b5201f4 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -6,6 +6,6 @@ ccflags-y := -Iinclude/drm

 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_vidmem.o kfd_aperture.o \
-   kfd_process.o kfd_queue.o
+   kfd_process.o kfd_queue.o kfd_mqd_manager.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/cik_mqds.h b/drivers/gpu/drm/radeon/amdkfd/cik_mqds.h
new file mode 100644
index 000..ce75604
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/cik_mqds.h
@@ -0,0 +1,185 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef CIK_MQDS_H_
+#define CIK_MQDS_H_
+
+#pragma pack(push, 4)
+
+struct cik_hpd_registers {
+   u32 cp_hpd_roq_offsets;
+   u32 cp_hpd_eop_base_addr;
+   u32 cp_hpd_eop_base_addr_hi;
+   u32 cp_hpd_eop_vmid;
+   u32 cp_hpd_eop_control;
+};
+
+/* This structure represents mqd used for cp scheduling queue
+ * taken from Gfx72_cp_program_spec.pdf
+ */
+struct cik_compute_mqd {
+   u32 header;
+   u32 compute_dispatch_initiator;
+   u32 compute_dim_x;
+   u32 compute_dim_y;
+   u32 compute_dim_z;
+   u32 compute_start_x;
+   u32 compute_start_y;
+   u32 compute_start_z;
+   u32 compute_num_thread_x;
+   u32 compute_num_thread_y;
+   u32 compute_num_thread_z;
+   u32 compute_pipelinestat_enable;
+   u32 compute_perfcount_enable;
+   u32 compute_pgm_lo;
+   u32 compute_pgm_hi;
+   u32 compute_tba_lo;
+   u32 compute_tba_hi;
+   u32 compute_tma_lo;
+   u32 compute_tma_hi;
+   u32 compute_pgm_rsrc1;
+   u32 compute_pgm_rsrc2;
+   u32 compute_vmid;
+   u32 compute_resource_limits;
+   u32 compute_static_thread_mgmt_se0;
+   u32 compute_static_thread_mgmt_se1;
+   u32 compute_tmpring_size;
+   u32 compute_static_thread_mgmt_se2;
+   u32 compute_static_thread_mgmt_se3;
+   u32 compute_restart_x;
+   u32 compute_restart_y;
+   u32 compute_restart_z;
+   u32 compute_thread_trace_enable;
+   u32 compute_misc_reserved;
+   u32 compute_user_data[16];
+   u32 vgt_csinvoc_count_lo;
+   u32 vgt_csinvoc_count_hi;
+   u32 cp_mqd_base_addr51;
+   u32 cp_mqd_base_addr_hi;
+   u32 cp_hqd_active;
+   u32 cp_hqd_vmid;
+   u32 cp_hqd_persistent_state;
+   u32 cp_hqd_pipe_priority;
+   u32 cp_hqd_queue_priority;
+   u32 cp_hqd_quantum;
+   u32 cp_hqd_pq_base;
+   u32 cp_hqd_pq_base_hi;
+   u32 cp_hqd_pq_rptr;
+   u32 cp_hqd_pq_rptr_report_addr;
+   u32 cp_hqd_pq_rptr_report_addr_hi;
+   u32 cp_hqd_pq_wptr_poll_addr;
+   u32 cp_hqd_pq_wptr_poll_addr_hi;
+   u32 

[PATCH v2 13/25] amdkfd: Add queue module

2014-07-17 Thread Oded Gabbay
From: Ben Goz 

The queue module enables allocating and initializing queues uniformly.

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile|   2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h  |  48 +
 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c | 109 ++
 3 files changed, 158 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index daf75a8..dbff147 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -6,6 +6,6 @@ ccflags-y := -Iinclude/drm

 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_vidmem.o kfd_aperture.o \
-   kfd_process.o
+   kfd_process.o kfd_queue.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 604c317..94ff1c3 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -65,6 +65,9 @@ typedef unsigned int pasid_t;
 /* Type that represents a HW doorbell slot. */
 typedef u32 doorbell_t;

+/* Type that represents queue pointer */
+typedef u32 qptr_t;
+
 struct kfd_device_info {
const struct kfd_scheduler_class *scheduler_class;
unsigned int max_pasid_bits;
@@ -125,12 +128,57 @@ void kfd_vidmem_unkmap(struct kfd_dev *kfd, kfd_mem_obj mem_obj);
 int kfd_vidmem_alloc_map(struct kfd_dev *kfd, kfd_mem_obj *mem_obj, void **ptr,
uint64_t *vmid0_address, size_t size);
 void kfd_vidmem_free_unmap(struct kfd_dev *kfd, kfd_mem_obj mem_obj);
+
 /* Character device interface */
 int kfd_chardev_init(void);
 void kfd_chardev_exit(void);
 struct device *kfd_chardev(void);


+enum kfd_queue_type  {
+   KFD_QUEUE_TYPE_COMPUTE,
+   KFD_QUEUE_TYPE_SDMA,
+   KFD_QUEUE_TYPE_HIQ,
+   KFD_QUEUE_TYPE_DIQ
+};
+
+struct queue_properties {
+   enum kfd_queue_type type;
+   unsigned int queue_id;
+   uint64_t queue_address;
+   uint64_t  queue_size;
+   uint32_t priority;
+   uint32_t queue_percent;
+   qptr_t *read_ptr;
+   qptr_t *write_ptr;
+   qptr_t *doorbell_ptr;
+   qptr_t doorbell_off;
+   bool is_interop;
+   bool is_active;
+   /* Not relevant for user mode queues in cp scheduling */
+   unsigned int vmid;
+};
+
+struct queue {
+   struct list_head list;
+   void *mqd;
+   /* kfd_mem_obj contains the mqd */
+   kfd_mem_obj mqd_mem_obj;
+   uint64_t gart_mqd_addr; /* needed for cp scheduling */
+   struct queue_properties properties;
+
+   /*
+* Used by the queue device manager to track the hqd slot per queue
+* when using no cp scheduling
+*/
+   uint32_t mec;
+   uint32_t pipe;
+   uint32_t queue;
+
+   struct kfd_process  *process;
+   struct kfd_dev  *device;
+};
+
 /* Data that is per-process-per device. */
 struct kfd_process_device {
/*
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_queue.c b/drivers/gpu/drm/radeon/amdkfd/kfd_queue.c
new file mode 100644
index 000..646b6d1
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_queue.c
@@ -0,0 +1,109 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include "kfd_priv.h"
+
+void print_queue_properties(struct queue_properties *q)
+{
+   if (!q)
+   return;
+
+   pr_debug("Printing queue properties\n"
+   "Queue Type: %u\n"
+   "Queue Size: %llu\n"
+   "Queue percent: %u\n"
+   "Queue Address: 0x%llX\n"
+   "Queue 

[PATCH v2 12/25] amdkfd: Add binding/unbinding calls to amd_iommu driver

2014-07-17 Thread Oded Gabbay
This patch adds the functions that bind a pasid to a device, and unbind it from
that device, through the amd_iommu driver.

The unbind function is called when the mm_struct of the process is released.

The bind function is not called here because it is called only in the IOCTLs 
which are not yet implemented at this stage of the patchset.
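
Before the IOMMU device is initialized, device_iommu_pasid_init() in the diff below clamps the pasid limit in two steps. The following user-space sketch mirrors that arithmetic so it can be checked with concrete numbers. compute_pasid_limit() and min_pasid() are hypothetical names; the sketch assumes max_pasid_bits is below 32 and doorbell_process_limit is non-zero.

```c
/* Same underlying type as the pasid_t typedef in kfd_priv.h. */
typedef unsigned int pasid_t;

/* Plain stand-in for the kernel's min_t(pasid_t, a, b). */
static pasid_t min_pasid(pasid_t a, pasid_t b)
{
	return a < b ? a : b;
}

/*
 * Illustrative version of the clamping done in device_iommu_pasid_init():
 * 1. take the smaller of what the device and the IOMMU support;
 * 2. keep the last pasid free, since it is used for kernel queue doorbells.
 */
static pasid_t compute_pasid_limit(unsigned int max_pasid_bits,
				   pasid_t iommu_max_pasids,
				   pasid_t doorbell_process_limit)
{
	pasid_t limit = min_pasid((pasid_t)1 << max_pasid_bits,
				  iommu_max_pasids);

	return min_pasid(limit, doorbell_process_limit - 1);
}
```

For example, with 16 pasid bits, 65536 IOMMU pasids and a doorbell process limit of 1024, the resulting limit is 1023.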

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c  | 80 -
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|  1 +
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c | 12 +
 3 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
index f6a7cf7..7c4c836 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
@@ -95,6 +95,59 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev)
return kfd;
 }

+static bool device_iommu_pasid_init(struct kfd_dev *kfd)
+{
+   const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP | AMD_IOMMU_DEVICE_FLAG_PRI_SUP
+   | AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
+
+   struct amd_iommu_device_info iommu_info;
+   pasid_t pasid_limit;
+   int err;
+
+   err = amd_iommu_device_info(kfd->pdev, &iommu_info);
+   if (err < 0) {
+   dev_err(kfd_device, "error getting iommu info. is the iommu enabled?\n");
+   return false;
+   }
+
+   if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
+   dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
+  (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
+  (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
+  (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0);
+   return false;
+   }
+
+   pasid_limit = min_t(pasid_t, (pasid_t)1 << kfd->device_info->max_pasid_bits, iommu_info.max_pasids);
+   /*
+* last pasid is used for kernel queues doorbells
+* in the future the last pasid might be used for a kernel thread.
+*/
+   pasid_limit = min_t(pasid_t, pasid_limit, kfd->doorbell_process_limit - 1);
+
+   err = amd_iommu_init_device(kfd->pdev, pasid_limit);
+   if (err < 0) {
+   dev_err(kfd_device, "error initializing iommu device\n");
+   return false;
+   }
+
+   if (!kfd_set_pasid_limit(pasid_limit)) {
+   dev_err(kfd_device, "error setting pasid limit\n");
+   amd_iommu_free_device(kfd->pdev);
+   return false;
+   }
+
+   return true;
+}
+
+static void iommu_pasid_shutdown_callback(struct pci_dev *pdev, int pasid)
+{
+   struct kfd_dev *dev = kfd_device_by_pci_dev(pdev);
+
+   if (dev)
+   kfd_unbind_process_from_device(dev, pasid);
+}
+
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
 const struct kgd2kfd_shared_resources *gpu_resources)
 {
@@ -102,8 +155,15 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,

kfd_doorbell_init(kfd);

-   if (kfd_topology_add_device(kfd) != 0)
+   if (!device_iommu_pasid_init(kfd))
+   return false;
+
+   if (kfd_topology_add_device(kfd) != 0) {
+   amd_iommu_free_device(kfd->pdev);
return false;
+   }
+
+   amd_iommu_set_invalidate_ctx_cb(kfd->pdev, iommu_pasid_shutdown_callback);

kfd->init_complete = true;
dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
@@ -118,18 +178,36 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)

BUG_ON(err != 0);

+   if (kfd->init_complete)
+   amd_iommu_free_device(kfd->pdev);
+
kfree(kfd);
 }

 void kgd2kfd_suspend(struct kfd_dev *kfd)
 {
BUG_ON(kfd == NULL);
+
+   if (kfd->init_complete)
+   amd_iommu_free_device(kfd->pdev);
 }

 int kgd2kfd_resume(struct kfd_dev *kfd)
 {
+   pasid_t pasid_limit;
+   int err;
+
BUG_ON(kfd == NULL);

+   pasid_limit = kfd_get_pasid_limit();
+
+   if (kfd->init_complete) {
+   err = amd_iommu_init_device(kfd->pdev, pasid_limit);
+   if (err < 0)
+   return -ENXIO;
+   amd_iommu_set_invalidate_ctx_cb(kfd->pdev, iommu_pasid_shutdown_callback);
+   }
+
return 0;
 }

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index af5a5e4..604c317 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -195,6 +195,7 @@ struct kfd_process {
 struct kfd_process *kfd_create_process(const struct task_struct *);
 struct kfd_process *kfd_get_process(const struct task_struct *);

+void kfd_unbind_process_from_device(struct kfd_dev *dev, pasid_t pasid);
 struct kfd_process_device 

[PATCH v2 11/25] amdkfd: Add basic modules to amdkfd

2014-07-17 Thread Oded Gabbay
From: Andrew Lewycky 

This patch adds the process module and 4 helper modules:

- kfd_process, which handles processes that open /dev/kfd
- kfd_doorbell, which provides helper functions for doorbell allocation, 
release and mapping to userspace
- kfd_pasid, which provides helper functions for pasid allocation and release
- kfd_vidmem, which provides helper functions for allocation and release of 
memory from the gfx driver
- kfd_aperture, which provides helper functions for managing the LDS, Local GPU 
memory and Scratch memory apertures of the process

This patch only contains the basic kfd_process module, which doesn't contain 
the reference to the queue scheduler. This was done to allow easier code review.

Also, this patch doesn't contain the calls to the IOMMU driver for binding the
pasid to the device. Again, this was done to allow easier code review.

The kfd_process object is created when a process opens /dev/kfd and is closed
when the mm_struct of that process is torn down.

Signed-off-by: Andrew Lewycky 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile   |   4 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c | 123 +
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c  |  36 ++-
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c   |   2 +
 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c | 264 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c   |  22 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c|  97 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 148 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c  | 374 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_vidmem.c   |  96 +++
 10 files changed, 1163 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_process.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_vidmem.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 08ecfcd..daf75a8 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -4,6 +4,8 @@

 ccflags-y := -Iinclude/drm

-amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o
+amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
+   kfd_pasid.o kfd_doorbell.o kfd_vidmem.o kfd_aperture.o \
+   kfd_process.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c b/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c
new file mode 100644
index 000..0468114
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c
@@ -0,0 +1,123 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "kfd_priv.h"
+#include 
+#include 
+#include 
+
+
+#define MAKE_GPUVM_APP_BASE(gpu_num) (((uint64_t)(gpu_num) << 61) + 
0x1)
+#define MAKE_GPUVM_APP_LIMIT(base) (((uint64_t)(base) & 0xFF00) | 
0xFF)
+#define MAKE_SCRATCH_APP_BASE(gpu_num) (((uint64_t)(gpu_num) << 61) + 
0x1)
+#define MAKE_SCRATCH_APP_LIMIT(base) (((uint64_t)base & 0x) | 
0x)
+#define MAKE_LDS_APP_BASE(gpu_num) (((uint64_t)(gpu_num) << 61) + 0x0)
+#define MAKE_LDS_APP_LIMIT(base) (((uint64_t)(base) & 0x) | 
0x)
+
+#define HSA_32BIT_LDS_APP_SIZE 0x1
+#define HSA_32BIT_LDS_APP_ALIGNMENT 0x1
+
+static unsigned long kfd_reserve_aperture(struct kfd_process *process, unsigned long len, unsigned long alignment)
+{
+
+   unsigned long addr = 0;
+   unsigned long 

[PATCH v2 10/25] amdkfd: Add topology module to amdkfd

2014-07-17 Thread Oded Gabbay
From: Evgeny Pinchuk 

This patch adds the topology module to the driver. The topology is exposed to
userspace through sysfs.

The calls to add and remove a device to/from topology are done by the radeon
driver.

Signed-off-by: Evgeny Pinchuk 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile   |2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_crat.h |  294 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c   |7 +
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c   |7 +
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h |   17 +
 drivers/gpu/drm/radeon/amdkfd/kfd_topology.c | 1207 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_topology.h |  168 
 7 files changed, 1701 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_crat.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_topology.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_topology.h

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 9564e75..08ecfcd 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -4,6 +4,6 @@

 ccflags-y := -Iinclude/drm

-amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o
+amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h b/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h
new file mode 100644
index 000..a374fa3
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h
@@ -0,0 +1,294 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef KFD_CRAT_H_INCLUDED
+#define KFD_CRAT_H_INCLUDED
+
+#include 
+
+#pragma pack(1)
+
+/*
+ * 4CC signature values for the CRAT and CDIT ACPI tables
+ */
+
+#define CRAT_SIGNATURE "CRAT"
+#define CDIT_SIGNATURE "CDIT"
+
+/*
+ * Component Resource Association Table (CRAT)
+ */
+
+#define CRAT_OEMID_LENGTH  6
+#define CRAT_OEMTABLEID_LENGTH 8
+#define CRAT_RESERVED_LENGTH   6
+
+#define CRAT_OEMID_64BIT_MASK ((1ULL << (CRAT_OEMID_LENGTH * 8)) - 1)
+
+struct crat_header {
+   uint32_t signature;
+   uint32_t length;
+   uint8_t revision;
+   uint8_t checksum;
+   uint8_t oem_id[CRAT_OEMID_LENGTH];
+   uint8_t oem_table_id[CRAT_OEMTABLEID_LENGTH];
+   uint32_t oem_revision;
+   uint32_t creator_id;
+   uint32_t creator_revision;
+   uint32_t total_entries;
+   uint16_t num_domains;
+   uint8_t reserved[CRAT_RESERVED_LENGTH];
+};
+
+/*
+ * The header structure is immediately followed by total_entries of the
+ * data definitions
+ */
+
+/*
+ * The currently defined subtype entries in the CRAT
+ */
+#define CRAT_SUBTYPE_COMPUTEUNIT_AFFINITY  0
+#define CRAT_SUBTYPE_MEMORY_AFFINITY   1
+#define CRAT_SUBTYPE_CACHE_AFFINITY    2
+#define CRAT_SUBTYPE_TLB_AFFINITY  3
+#define CRAT_SUBTYPE_CCOMPUTE_AFFINITY 4
+#define CRAT_SUBTYPE_IOLINK_AFFINITY   5
+#define CRAT_SUBTYPE_MAX   6
+
+#define CRAT_SIBLINGMAP_SIZE   32
+
+/*
+ * ComputeUnit Affinity structure and definitions
+ */
+#define CRAT_CU_FLAGS_ENABLED  0x0001
+#define CRAT_CU_FLAGS_HOT_PLUGGABLE    0x0002
+#define CRAT_CU_FLAGS_CPU_PRESENT  0x0004
+#define CRAT_CU_FLAGS_GPU_PRESENT  0x0008
+#define CRAT_CU_FLAGS_IOMMU_PRESENT    0x0010
+#define CRAT_CU_FLAGS_RESERVED 0xffe0
+
+#define CRAT_COMPUTEUNIT_RESERVED_LENGTH 4
+
+struct crat_subtype_computeunit {
+   uint8_t type;
+   uint8_t length;
+   uint16_t reserved;
+   uint32_t flags;
+   uint32_t proximity_domain;
+   uint32_t processor_id_low;
+   

[PATCH v2 09/25] amdkfd: Add amdkfd skeleton driver

2014-07-17 Thread Oded Gabbay
This patch adds the amdkfd skeleton driver. The driver does nothing except
define a /dev/kfd device.

It returns -ENODEV on all amdkfd IOCTLs.

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/Kconfig  |   2 +
 drivers/gpu/drm/radeon/Makefile |   2 +
 drivers/gpu/drm/radeon/amdkfd/Kconfig   |  10 ++
 drivers/gpu/drm/radeon/amdkfd/Makefile  |   9 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 203 
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c  | 129 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c  |  98 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|  81 +++
 8 files changed, 534 insertions(+)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/Kconfig
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/Makefile
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_module.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h

diff --git a/drivers/gpu/drm/radeon/Kconfig b/drivers/gpu/drm/radeon/Kconfig
index 970f8e9..b697321 100644
--- a/drivers/gpu/drm/radeon/Kconfig
+++ b/drivers/gpu/drm/radeon/Kconfig
@@ -6,3 +6,5 @@ config DRM_RADEON_UMS

  Userspace modesetting is deprecated for quite some time now, so
  enable this only if you have ancient versions of the DDX drivers.
+
+source "drivers/gpu/drm/radeon/amdkfd/Kconfig"
diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index a1c913d..50823a1 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -112,4 +112,6 @@ radeon-$(CONFIG_ACPI) += radeon_acpi.o

 obj-$(CONFIG_DRM_RADEON)+= radeon.o

+obj-$(CONFIG_HSA_RADEON)+= amdkfd/
+
 CFLAGS_radeon_trace_points.o := -I$(src)
diff --git a/drivers/gpu/drm/radeon/amdkfd/Kconfig 
b/drivers/gpu/drm/radeon/amdkfd/Kconfig
new file mode 100644
index 0000000..900bb34
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/Kconfig
@@ -0,0 +1,10 @@
+#
+# Heterogeneous system architecture configuration
+#
+
+config HSA_RADEON
+   tristate "HSA kernel driver for AMD Radeon devices"
+   depends on DRM_RADEON && AMD_IOMMU_V2 && X86_64
+   default m
+   help
+ Enable this if you want to use HSA features on AMD radeon devices.
diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
new file mode 100644
index 0000000..9564e75
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -0,0 +1,9 @@
+#
+# Makefile for Heterogeneous System Architecture support for AMD radeon devices
+#
+
+ccflags-y := -Iinclude/drm
+
+amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o
+
+obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
new file mode 100644
index 0000000..b98bcb7
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -0,0 +1,203 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "kfd_priv.h"
+
+static long kfd_ioctl(struct file *, unsigned int, unsigned long);
+static int kfd_open(struct inode *, struct file *);
+
+static const char kfd_dev_name[] = "kfd";
+
+static const struct file_operations kfd_fops = {
+   .owner = THIS_MODULE,
+   .unlocked_ioctl = kfd_ioctl,
+   .compat_ioctl = kfd_ioctl,
+   .open = kfd_open,
+};
+
+static int kfd_char_dev_major = -1;
+static struct class *kfd_class;
+struct device *kfd_device;
+
+int kfd_chardev_init(void)
+{
+   int err = 0;
+
+   kfd_char_dev_major = register_chrdev(0, kfd_dev_name, &kfd_fops);
+   err = kfd_char_dev_major;
+   if (err < 

[PATCH v2 06/25] drm/radeon: Add radeon <--> amdkfd interface

2014-07-17 Thread Oded Gabbay
This patch adds the interface between the radeon driver and the amdkfd driver.
The interface implementation is contained in radeon_kfd.c and radeon_kfd.h.

The interface itself is represented by a pointer to struct
kfd_dev. The pointer is located inside radeon_device structure.

All the register accesses that amdkfd needs are done using this interface. This
allows us to avoid direct register accesses in amdkfd proper, while also
avoiding locking between amdkfd and radeon.

The single exception is the doorbells that are used in both of the drivers.
However, because they are located in separate pci bar pages, the danger of
sharing registers between the drivers is minimal.

Having said that, we are planning to move the doorbells as well to radeon.

The loading of the amdkfd module is done via symbol lookup. According to the 
code review discussions, this may change in v3 of the patch set.

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/Makefile |   1 +
 drivers/gpu/drm/radeon/cik.c|   9 +
 drivers/gpu/drm/radeon/cik_reg.h|  65 +
 drivers/gpu/drm/radeon/cikd.h   |  51 +++-
 drivers/gpu/drm/radeon/radeon.h |   3 +
 drivers/gpu/drm/radeon/radeon_drv.c |   5 +
 drivers/gpu/drm/radeon/radeon_kfd.c | 566 
 drivers/gpu/drm/radeon/radeon_kfd.h | 119 
 drivers/gpu/drm/radeon/radeon_kms.c |   7 +
 9 files changed, 825 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.c
 create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.h

diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 1b04002..a1c913d 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -104,6 +104,7 @@ radeon-y += \
radeon_vce.o \
vce_v1_0.o \
vce_v2_0.o \
+   radeon_kfd.o

 radeon-$(CONFIG_COMPAT) += radeon_ioc32.o
 radeon-$(CONFIG_VGA_SWITCHEROO) += radeon_atpx_handler.o
diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index b4bbc22..6f71095 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -32,6 +32,7 @@
 #include "cik_blit_shaders.h"
 #include "radeon_ucode.h"
 #include "clearstate_ci.h"
+#include "radeon_kfd.h"

 MODULE_FIRMWARE("radeon/BONAIRE_pfp.bin");
 MODULE_FIRMWARE("radeon/BONAIRE_me.bin");
@@ -7727,6 +7728,9 @@ restart_ih:
while (rptr != wptr) {
/* wptr/rptr are in bytes! */
ring_index = rptr / 4;
+
+   radeon_kfd_interrupt(rdev, (const void *) &rdev->ih.ring[ring_index]);
+
src_id =  le32_to_cpu(rdev->ih.ring[ring_index]) & 0xff;
src_data = le32_to_cpu(rdev->ih.ring[ring_index + 1]) & 0xfffffff;
ring_id = le32_to_cpu(rdev->ih.ring[ring_index + 2]) & 0xff;
@@ -8386,6 +8390,10 @@ static int cik_startup(struct radeon_device *rdev)
if (r)
return r;

+   r = radeon_kfd_resume(rdev);
+   if (r)
+   return r;
+
return 0;
 }

@@ -8434,6 +8442,7 @@ int cik_resume(struct radeon_device *rdev)
  */
 int cik_suspend(struct radeon_device *rdev)
 {
+   radeon_kfd_suspend(rdev);
radeon_pm_suspend(rdev);
dce6_audio_fini(rdev);
radeon_vm_manager_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/cik_reg.h b/drivers/gpu/drm/radeon/cik_reg.h
index ca1bb61..1ab3dbc 100644
--- a/drivers/gpu/drm/radeon/cik_reg.h
+++ b/drivers/gpu/drm/radeon/cik_reg.h
@@ -147,4 +147,69 @@

 #define CIK_LB_DESKTOP_HEIGHT 0x6b0c

+struct cik_hqd_registers {
+   u32 cp_mqd_base_addr;
+   u32 cp_mqd_base_addr_hi;
+   u32 cp_hqd_active;
+   u32 cp_hqd_vmid;
+   u32 cp_hqd_persistent_state;
+   u32 cp_hqd_pipe_priority;
+   u32 cp_hqd_queue_priority;
+   u32 cp_hqd_quantum;
+   u32 cp_hqd_pq_base;
+   u32 cp_hqd_pq_base_hi;
+   u32 cp_hqd_pq_rptr;
+   u32 cp_hqd_pq_rptr_report_addr;
+   u32 cp_hqd_pq_rptr_report_addr_hi;
+   u32 cp_hqd_pq_wptr_poll_addr;
+   u32 cp_hqd_pq_wptr_poll_addr_hi;
+   u32 cp_hqd_pq_doorbell_control;
+   u32 cp_hqd_pq_wptr;
+   u32 cp_hqd_pq_control;
+   u32 cp_hqd_ib_base_addr;
+   u32 cp_hqd_ib_base_addr_hi;
+   u32 cp_hqd_ib_rptr;
+   u32 cp_hqd_ib_control;
+   u32 cp_hqd_iq_timer;
+   u32 cp_hqd_iq_rptr;
+   u32 cp_hqd_dequeue_request;
+   u32 cp_hqd_dma_offload;
+   u32 cp_hqd_sema_cmd;
+   u32 cp_hqd_msg_type;
+   u32 cp_hqd_atomic0_preop_lo;
+   u32 cp_hqd_atomic0_preop_hi;
+   u32 cp_hqd_atomic1_preop_lo;
+   u32 cp_hqd_atomic1_preop_hi;
+   u32 cp_hqd_hq_scheduler0;
+   u32 cp_hqd_hq_scheduler1;
+   u32 cp_mqd_control;
+};
+
+struct cik_mqd {
+   u32 header;
+   u32 dispatch_initiator;
+   u32 dimensions[3];
+   u32 start_idx[3];
+   u32 num_threads[3];
+   u32 pipeline_stat_enable;
+   u32 perf_counter_enable;
+   u32 pgm[2];
+   u32 

[PATCH v2 05/25] drm/radeon: adding synchronization for GRBM GFX

2014-07-17 Thread Oded Gabbay
Implementing a lock for selecting and accessing shader engines and arrays.
This lock will make sure that radeon and amdkfd are not colliding when
accessing shader engines and arrays with GRBM_GFX_INDEX register.
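
The locking pattern described above can be sketched in user space as follows. This is only an illustration under stated assumptions: struct fake_gpu, select_se_sh(), write_indexed() and program_each_sh() are hypothetical stand-ins (not the real struct radeon_device or cik_select_se_sh()), and a pthread mutex plays the role of the kernel mutex. The point it shows is that selecting an engine/array, touching the indexed registers, and restoring broadcast mode all happen under one lock.

```c
#include <pthread.h>
#include <stdint.h>

#define SE_SH_BROADCAST 0xffffffffu
#define NUM_SE 2
#define NUM_SH 2

/* Hypothetical stand-in for the device, not the real struct radeon_device. */
struct fake_gpu {
	pthread_mutex_t grbm_idx_mutex;	/* plays the role of the new lock */
	uint32_t se_index;		/* currently selected shader engine */
	uint32_t sh_index;		/* currently selected shader array */
	uint32_t reg[NUM_SE][NUM_SH];	/* one indexed register per SE/SH */
};

/* Models a write to the index register: picks which instance is addressed. */
static void select_se_sh(struct fake_gpu *gpu, uint32_t se, uint32_t sh)
{
	gpu->se_index = se;
	gpu->sh_index = sh;
}

/* Models a write to an indexed register: lands on every selected instance. */
static void write_indexed(struct fake_gpu *gpu, uint32_t value)
{
	for (uint32_t se = 0; se < NUM_SE; se++)
		for (uint32_t sh = 0; sh < NUM_SH; sh++)
			if ((gpu->se_index == SE_SH_BROADCAST || gpu->se_index == se) &&
			    (gpu->sh_index == SE_SH_BROADCAST || gpu->sh_index == sh))
				gpu->reg[se][sh] = value;
}

/* Select, access, and restore broadcast, all inside one critical section. */
static void program_each_sh(struct fake_gpu *gpu)
{
	pthread_mutex_lock(&gpu->grbm_idx_mutex);
	for (uint32_t se = 0; se < NUM_SE; se++) {
		for (uint32_t sh = 0; sh < NUM_SH; sh++) {
			select_se_sh(gpu, se, sh);
			write_indexed(gpu, se * 16 + sh);
		}
	}
	select_se_sh(gpu, SE_SH_BROADCAST, SE_SH_BROADCAST);
	pthread_mutex_unlock(&gpu->grbm_idx_mutex);
}
```

Without the lock, a second user could change the selection between select_se_sh() and write_indexed(), and the write would land on the wrong instance.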

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/cik.c   | 26 ++
 drivers/gpu/drm/radeon/radeon.h|  2 ++
 drivers/gpu/drm/radeon/radeon_device.c |  1 +
 3 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 1d7dd3b..b4bbc22 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -1563,6 +1563,8 @@ static const u32 godavari_golden_registers[] =

 static void cik_init_golden_registers(struct radeon_device *rdev)
 {
+   /* Some of the registers might be dependent on GRBM_GFX_INDEX */
+   mutex_lock(&rdev->grbm_idx_mutex);
switch (rdev->family) {
case CHIP_BONAIRE:
radeon_program_register_sequence(rdev,
@@ -1637,6 +1639,7 @@ static void cik_init_golden_registers(struct 
radeon_device *rdev)
default:
break;
}
+   mutex_unlock(&rdev->grbm_idx_mutex);
 }

 /**
@@ -3418,6 +3421,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
u32 disabled_rbs = 0;
u32 enabled_rbs = 0;

+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < se_num; i++) {
for (j = 0; j < sh_per_se; j++) {
cik_select_se_sh(rdev, i, j);
@@ -3429,6 +3433,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
}
}
cik_select_se_sh(rdev, 0xffffffff, 0xffffffff);
+   mutex_unlock(&rdev->grbm_idx_mutex);

mask = 1;
for (i = 0; i < max_rb_num_per_se * se_num; i++) {
@@ -3439,6 +3444,7 @@ static void cik_setup_rb(struct radeon_device *rdev,

rdev->config.cik.backend_enable_mask = enabled_rbs;

+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < se_num; i++) {
cik_select_se_sh(rdev, i, 0xffffffff);
data = 0;
@@ -3466,6 +3472,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
WREG32(PA_SC_RASTER_CONFIG, data);
}
cik_select_se_sh(rdev, 0xffffffff, 0xffffffff);
+   mutex_unlock(&rdev->grbm_idx_mutex);
 }

 /**
@@ -3683,6 +3690,12 @@ static void cik_gpu_init(struct radeon_device *rdev)
/* set HW defaults for 3D engine */
WREG32(CP_MEQ_THRESHOLDS, MEQ1_START(0x30) | MEQ2_START(0x60));

+   mutex_lock(&rdev->grbm_idx_mutex);
+   /*
+    * make sure that the following register writes are broadcast
+    * to all the shaders
+    */
+   cik_select_se_sh(rdev, 0xffffffff, 0xffffffff);
WREG32(SX_DEBUG_1, 0x20);

WREG32(TA_CNTL_AUX, 0x00010000);
@@ -3738,6 +3751,7 @@ static void cik_gpu_init(struct radeon_device *rdev)

WREG32(PA_CL_ENHANCE, CLIP_VTX_REORDER_ENA | NUM_CLIP_SEQ(3));
WREG32(PA_SC_ENHANCE, ENABLE_PA_SC_OUT_OF_ORDER);
+   mutex_unlock(&rdev->grbm_idx_mutex);

udelay(50);
 }
@@ -6037,6 +6051,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device 
*rdev)
u32 i, j, k;
u32 mask;

+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < rdev->config.cik.max_shader_engines; i++) {
for (j = 0; j < rdev->config.cik.max_sh_per_se; j++) {
cik_select_se_sh(rdev, i, j);
@@ -6048,6 +6063,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device 
*rdev)
}
}
cik_select_se_sh(rdev, 0xffffffff, 0xffffffff);
+   mutex_unlock(&rdev->grbm_idx_mutex);

mask = SE_MASTER_BUSY_MASK | GC_MASTER_BUSY | TC0_MASTER_BUSY | 
TC1_MASTER_BUSY;
for (k = 0; k < rdev->usec_timeout; k++) {
@@ -6182,10 +6198,12 @@ static int cik_rlc_resume(struct radeon_device *rdev)
WREG32(RLC_LB_CNTR_INIT, 0);
WREG32(RLC_LB_CNTR_MAX, 0x00008000);

+   mutex_lock(&rdev->grbm_idx_mutex);
cik_select_se_sh(rdev, 0xffffffff, 0xffffffff);
WREG32(RLC_LB_INIT_CU_MASK, 0xffffffff);
WREG32(RLC_LB_PARAMS, 0x00600408);
WREG32(RLC_LB_CNTL, 0x80000004);
+   mutex_unlock(&rdev->grbm_idx_mutex);

WREG32(RLC_MC_CNTL, 0);
WREG32(RLC_UCODE_CNTL, 0);
@@ -6252,11 +6270,13 @@ static void cik_enable_cgcg(struct radeon_device *rdev, 
bool enable)

tmp = cik_halt_rlc(rdev);

+   mutex_lock(&rdev->grbm_idx_mutex);
cik_select_se_sh(rdev, 0xffffffff, 0xffffffff);
WREG32(RLC_SERDES_WR_CU_MASTER_MASK, 0xffffffff);
WREG32(RLC_SERDES_WR_NONCU_MASTER_MASK, 0xffffffff);
tmp2 = BPM_ADDR_MASK | CGCG_OVERRIDE_0 | CGLS_ENABLE;
WREG32(RLC_SERDES_WR_CTRL, tmp2);
+   mutex_unlock(&rdev->grbm_idx_mutex);

cik_update_rlc(rdev, tmp);

@@ -6298,11 +6318,13 @@ static void cik_enable_mgcg(struct radeon_device *rdev, 
bool enable)

tmp = cik_halt_rlc(rdev);

+ 

[PATCH v2 04/25] drm/radeon: Report doorbell configuration to amdkfd

2014-07-17 Thread Oded Gabbay
radeon and amdkfd share the doorbell aperture.
radeon sets it up, takes the doorbells required for its own rings
and reports the setup to amdkfd.
radeon reserved doorbells are at the start of the doorbell aperture.
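
As a rough illustration of the split described above, the following stand-alone sketch mirrors the arithmetic of the reporting function in this patch. The types struct doorbell_aperture and struct kfd_doorbell_info, and the function doorbell_get_kfd_info(), are hypothetical names chosen for the example; only the rule itself (radeon's doorbells come first, each one is a 32-bit slot, and amdkfd gets the remainder if any is left) is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the aperture as radeon set it up. */
struct doorbell_aperture {
	uint64_t base;		/* physical base address of the aperture */
	size_t size;		/* aperture size in bytes */
	uint32_t num_doorbells;	/* doorbells radeon keeps for its own rings */
};

/* Hypothetical result handed to amdkfd. */
struct kfd_doorbell_info {
	uint64_t aperture_base;
	size_t aperture_size;
	size_t start_offset;	/* bytes at the start reserved for radeon */
};

static struct kfd_doorbell_info
doorbell_get_kfd_info(const struct doorbell_aperture *db)
{
	struct kfd_doorbell_info info = { 0, 0, 0 };
	size_t reserved = (size_t)db->num_doorbells * sizeof(uint32_t);

	/* amdkfd only gets an aperture if something is left after radeon. */
	if (db->size > reserved) {
		info.aperture_base = db->base;
		info.aperture_size = db->size;
		info.start_offset = reserved;
	}
	return info;
}
```

For example, with an 8192-byte aperture and 1024 radeon doorbells, the first 4096 bytes stay with radeon and amdkfd starts at offset 4096. If radeon's doorbells fill the whole aperture, all three outputs are zero.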

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/radeon.h|  4 
 drivers/gpu/drm/radeon/radeon_device.c | 31 +++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 7cda75d..4e7e41f 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -676,6 +676,10 @@ struct radeon_doorbell {

 int radeon_doorbell_get(struct radeon_device *rdev, u32 *page);
 void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell);
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+ phys_addr_t *aperture_base,
+ size_t *aperture_size,
+ size_t *start_offset);

 /*
  * IRQS.
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index 03686fa..98538d2 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -328,6 +328,37 @@ void radeon_doorbell_free(struct radeon_device *rdev, u32 
doorbell)
__clear_bit(doorbell, rdev->doorbell.used);
 }

+/**
+ * radeon_doorbell_get_kfd_info - Report doorbell configuration required to
+ *setup KFD
+ *
+ * @rdev: radeon_device pointer
+ * @aperture_base: output returning doorbell aperture base physical address
+ * @aperture_size: output returning doorbell aperture size in bytes
+ * @start_offset: output returning # of doorbell bytes reserved for radeon.
+ *
+ * Radeon and the KFD share the doorbell aperture. Radeon sets it up,
+ * takes doorbells required for its own rings and reports the setup to KFD.
+ * Radeon reserved doorbells are at the start of the doorbell aperture.
+ */
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+ phys_addr_t *aperture_base,
+ size_t *aperture_size,
+ size_t *start_offset)
+{
+   /* The first num_doorbells are used by radeon.
+* KFD takes whatever's left in the aperture. */
+   if (rdev->doorbell.size > rdev->doorbell.num_doorbells * sizeof(u32)) {
+   *aperture_base = rdev->doorbell.base;
+   *aperture_size = rdev->doorbell.size;
+   *start_offset = rdev->doorbell.num_doorbells * sizeof(u32);
+   } else {
+   *aperture_base = 0;
+   *aperture_size = 0;
+   *start_offset = 0;
+   }
+}
+
 /*
  * radeon_wb_*()
  * Writeback is the method by which the GPU updates special pages
-- 
1.9.1



[PATCH v2 03/25] drm/radeon/cik: Don't touch int of pipes 1-7

2014-07-17 Thread Oded Gabbay
amdkfd should set interrupts for pipes 1-7.

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/cik.c | 71 +---
 1 file changed, 1 insertion(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 0b53633..1d7dd3b 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -7270,8 +7270,7 @@ static int cik_irq_init(struct radeon_device *rdev)
 int cik_irq_set(struct radeon_device *rdev)
 {
u32 cp_int_cntl;
-   u32 cp_m1p0, cp_m1p1, cp_m1p2, cp_m1p3;
-   u32 cp_m2p0, cp_m2p1, cp_m2p2, cp_m2p3;
+   u32 cp_m1p0;
u32 crtc1 = 0, crtc2 = 0, crtc3 = 0, crtc4 = 0, crtc5 = 0, crtc6 = 0;
u32 hpd1, hpd2, hpd3, hpd4, hpd5, hpd6;
u32 grbm_int_cntl = 0;
@@ -7305,13 +7304,6 @@ int cik_irq_set(struct radeon_device *rdev)
dma_cntl1 = RREG32(SDMA0_CNTL + SDMA1_REGISTER_OFFSET) & ~TRAP_ENABLE;

cp_m1p0 = RREG32(CP_ME1_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p1 = RREG32(CP_ME1_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p2 = RREG32(CP_ME1_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p3 = RREG32(CP_ME1_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p0 = RREG32(CP_ME2_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p1 = RREG32(CP_ME2_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p2 = RREG32(CP_ME2_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p3 = RREG32(CP_ME2_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;

if (rdev->flags & RADEON_IS_IGP)
thermal_int = RREG32_SMC(CG_THERMAL_INT_CTRL) &
@@ -7333,33 +7325,6 @@ int cik_irq_set(struct radeon_device *rdev)
case 0:
cp_m1p0 |= TIME_STAMP_INT_ENABLE;
break;
-   case 1:
-   cp_m1p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   default:
-   DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe 
%d\n", ring->pipe);
-   break;
-   }
-   } else if (ring->me == 2) {
-   switch (ring->pipe) {
-   case 0:
-   cp_m2p0 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 1:
-   cp_m2p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
default:
DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe 
%d\n", ring->pipe);
break;
@@ -7376,33 +7341,6 @@ int cik_irq_set(struct radeon_device *rdev)
case 0:
cp_m1p0 |= TIME_STAMP_INT_ENABLE;
break;
-   case 1:
-   cp_m1p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   default:
-   DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe 
%d\n", ring->pipe);
-   break;
-   }
-   } else if (ring->me == 2) {
-   switch (ring->pipe) {
-   case 0:
-   cp_m2p0 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 1:
-   cp_m2p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
default:
DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe 
%d\n", ring->pipe);
break;
@@ -7485,13 +7423,6 @@ int cik_irq_set(struct radeon_device *rdev)
WREG32(SDMA0_CNTL + 

[PATCH v2 02/25] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-17 Thread Oded Gabbay
To support HSA on KV, we need to limit the number of vmids and pipes
that are available for radeon's use with KV.

This patch reserves VMIDs 8-15 for amdkfd (so radeon can only use VMIDs
0-7) and also makes radeon think that KV has only a single MEC with a single
pipe in it.
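
The resulting partition can be written down as a small stand-alone sketch. The enum and both helper functions below are hypothetical names used only for illustration. The ranges (VMID 0 for the system, 1-7 for radeon, 8-15 for amdkfd) and the figure of 8 queues per pipe come from the comments in this patch.

```c
/* Hypothetical owner classification for the 16 hardware VMIDs. */
enum vmid_owner {
	VMID_OWNER_SYSTEM,	/* VMID 0 */
	VMID_OWNER_RADEON,	/* VMIDs 1-7 */
	VMID_OWNER_AMDKFD,	/* VMIDs 8-15 */
	VMID_OWNER_INVALID	/* anything above 15 */
};

static enum vmid_owner vmid_owner(unsigned int vmid)
{
	if (vmid == 0)
		return VMID_OWNER_SYSTEM;
	if (vmid < 8)
		return VMID_OWNER_RADEON;
	if (vmid < 16)
		return VMID_OWNER_AMDKFD;
	return VMID_OWNER_INVALID;
}

/* Compute queues visible to radeon: 8 queues per pipe. */
static unsigned int radeon_compute_queues(unsigned int num_mec,
					  unsigned int num_pipe)
{
	return num_mec * num_pipe * 8;
}
```

With one MEC and one pipe, radeon_compute_queues(1, 1) gives 8 queues, compared with 64 for the full KV configuration of 2 MECs with 4 pipes each.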

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/cik.c | 48 ++--
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 4bfc2c0..0b53633 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4662,12 +4662,11 @@ static int cik_mec_init(struct radeon_device *rdev)
/*
 * KV:2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
 * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
+* Nonetheless, we assign only 1 pipe because all other pipes will
+* be handled by KFD
 */
-   if (rdev->family == CHIP_KAVERI)
-   rdev->mec.num_mec = 2;
-   else
-   rdev->mec.num_mec = 1;
-   rdev->mec.num_pipe = 4;
+   rdev->mec.num_mec = 1;
+   rdev->mec.num_pipe = 1;
rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;

if (rdev->mec.hpd_eop_obj == NULL) {
@@ -4809,28 +4808,24 @@ static int cik_cp_compute_resume(struct radeon_device 
*rdev)

/* init the pipes */
mutex_lock(&rdev->srbm_mutex);
-   for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
-   int me = (i < 4) ? 1 : 2;
-   int pipe = (i < 4) ? i : (i - 4);

-   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i * MEC_HPD_SIZE * 
2);
+   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;

-   cik_srbm_select(rdev, me, pipe, 0, 0);
+   cik_srbm_select(rdev, 0, 0, 0, 0);

-   /* write the EOP addr */
-   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
-   WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 
8);
+   /* write the EOP addr */
+   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
+   WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);

-   /* set the VMID assigned */
-   WREG32(CP_HPD_EOP_VMID, 0);
+   /* set the VMID assigned */
+   WREG32(CP_HPD_EOP_VMID, 0);
+
+   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
+   tmp = RREG32(CP_HPD_EOP_CONTROL);
+   tmp &= ~EOP_SIZE_MASK;
+   tmp |= order_base_2(MEC_HPD_SIZE / 8);
+   WREG32(CP_HPD_EOP_CONTROL, tmp);

-   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
-   tmp = RREG32(CP_HPD_EOP_CONTROL);
-   tmp &= ~EOP_SIZE_MASK;
-   tmp |= order_base_2(MEC_HPD_SIZE / 8);
-   WREG32(CP_HPD_EOP_CONTROL, tmp);
-   }
-   cik_srbm_select(rdev, 0, 0, 0, 0);
mutex_unlock(&rdev->srbm_mutex);

/* init the queues.  Just two for now. */
@@ -5876,8 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct 
radeon_ib *ib)
  */
 int cik_vm_init(struct radeon_device *rdev)
 {
-   /* number of VMs */
-   rdev->vm_manager.nvm = 16;
+   /*
+* number of VMs
+* VMID 0 is reserved for System
+* radeon graphics/compute will use VMIDs 1-7
+* amdkfd will use VMIDs 8-15
+*/
+   rdev->vm_manager.nvm = 8;
/* base offset of vram pages */
if (rdev->flags & RADEON_IS_IGP) {
u64 tmp = RREG32(MC_VM_FB_OFFSET);
-- 
1.9.1



[PATCH] drm/radeon: remove visible vram size limit on bo allocation

2014-07-17 Thread Christian König
On 17.07.2014 06:02, Michel Dänzer wrote:
> On 17.07.2014 02:26, Alex Deucher wrote:
>> Now that fallback to gtt is fixed for cpu access, we can
>> remove this limit.
>>
>> Signed-off-by: Alex Deucher 
>> ---
>>   drivers/gpu/drm/radeon/radeon_gem.c | 7 +--
>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
>> b/drivers/gpu/drm/radeon/radeon_gem.c
>> index fdd189b..07a13c9 100644
>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
>> @@ -55,8 +55,11 @@ int radeon_gem_object_create(struct radeon_device *rdev, 
>> int size,
>>  alignment = PAGE_SIZE;
>>  }
>>   
>> -/* maximun bo size is the minimun btw visible vram and gtt size */
>> -max_size = min(rdev->mc.visible_vram_size, rdev->mc.gtt_size);
>> +/* Maximum bo size is the gtt size since we use the gtt to handle
>> + * vram to system pool migrations.  We could probably remove this
>> + * check altogether with a little additional work.
>> + */
>> +max_size = rdev->mc.gtt_size;
>>  if (size > max_size) {
>>  DRM_DEBUG("Allocation size %dMb bigger than %ldMb limit\n",
>>size >> 20, max_size >> 20);
> A BO of size rdev->mc.gtt_size can never actually be bound to GTT,
> because we have some pinned BOs in there. I think it's a bit
> disingenuous to let userspace allocate a BO that can never actually be
> used by the GPU. :)
>
> The hack I attached to
> https://bugs.freedesktop.org/show_bug.cgi?id=78717 has a start for
> dealing with that. I was running that patch for a while and didn't
> notice any bad effects from it.

Haven't looked at the patch yet, but can't we just go over all existing 
allocations on PIN and figure out the largest free area and save that 
value? I mean pinning of GTT memory happens rarely and mostly on system 
startup.

Christian.
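
A minimal sketch of the idea suggested above, assuming the pinned allocations are available as a list of non-overlapping ranges sorted by start offset. The type struct pinned_range and the function largest_free_area() are hypothetical and do not correspond to any radeon or TTM interface. The sketch walks the pinned ranges once and records the largest gap, which could then serve as the upper bound for a single BO.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one pinned allocation inside the GTT. */
struct pinned_range {
	uint64_t start;	/* offset of the pinned BO in the GTT */
	uint64_t size;	/* size of the pinned BO in bytes */
};

/*
 * Returns the largest contiguous free span in [0, gtt_size).
 * Assumes pins[] is sorted by start and the ranges lie inside the GTT.
 */
static uint64_t largest_free_area(const struct pinned_range *pins, size_t n,
				  uint64_t gtt_size)
{
	uint64_t best = 0;
	uint64_t cursor = 0;	/* end of the last pinned range seen so far */

	for (size_t i = 0; i < n; i++) {
		/* Gap between the previous pinned range and this one. */
		if (pins[i].start > cursor && pins[i].start - cursor > best)
			best = pins[i].start - cursor;

		uint64_t end = pins[i].start + pins[i].size;
		if (end > cursor)
			cursor = end;
	}

	/* Gap between the last pinned range and the end of the GTT. */
	if (gtt_size > cursor && gtt_size - cursor > best)
		best = gtt_size - cursor;

	return best;
}
```

Because pinning is rare, recomputing this value on every pin or unpin would probably be cheap enough; whether the real data structures make such a walk convenient is not something this sketch addresses.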




[PATCH 08/83] drm/radeon: Add calls to initialize and finalize kfd from radeon

2014-07-17 Thread Daniel Vetter
On Thu, Jul 17, 2014 at 02:45:09PM +0200, Christian König wrote:
> On 17.07.2014 14:30, Oded Gabbay wrote:
> >On 17/07/14 15:29, Christian König wrote:
> >>On 17.07.2014 13:57, Oded Gabbay wrote:
> >>>On 11/07/14 19:36, Jerome Glisse wrote:
> >>>>On Fri, Jul 11, 2014 at 12:50:08AM +0300, Oded Gabbay wrote:
> >>>>>The KFD driver should be loaded when the radeon driver is loaded and
> >>>>>should be finalized when the radeon driver is removed.
> >>>>>
> >>>>>This patch adds a function call to initialize kfd from radeon_init
> >>>>>and a function call to finalize kfd from radeon_exit.
> >>>>>
> >>>>>If the KFD driver is not present in the system, the initialize call
> >>>>>fails and the radeon driver continues normally.
> >>>>>
> >>>>>This patch also adds calls to probe, initialize and finalize a kfd
> >>>>>device
> >>>>>per radeon device using the kgd-->kfd interface.
> >>>>>
> >>>>>Signed-off-by: Oded Gabbay
> >>>>
> >>>>It might be nice to allow to build radeon without HSA so i think an
> >>>>CONFIG_HSA should be added and have other thing depends on it.
> >>>>Otherwise this one is.
> >>>>
> >>>>Reviewed-by: Jérôme Glisse
> >>>>
> >>>We do allow it :)
> >>>There is no problem building radeon without the kfd. In that case,
> >>>when radeon
> >>>finds out that kfd is not available, it simply moves on with its
> >>>initialization procedure.
> >>
> >>At least off hand I don't see how this should work. Radeon directly
> >>calls
> >>radeon_kfd_(probe|init|fini) and so has a direct dependency on it.
> >>
> >>Christian.
> >But radeon_kfd.c is now a permanent part of the radeon driver. I talked
> >with Alex about it and we both agreed on that. So radeon_kfd_* functions
> >are *always* there when you build radeon.
> 
> Ah, I see. So radeon_kfd_init then tries to load the other module through
> symbol_request(). Long story short that's a bad idea for a couple of
> reasons.
> 
> First of all it only works when you build everything as module and second by
> doing so the radeon<->kfd interface must be handled as internal stable
> interface.
> 
> Only a very few drivers/subsystem do use symbol_request() and to see how to
> use it correctly please take a look at (for example)
> sound/pci/hda/hda_codec.c.

We do this in i915 to coordinate a bunch of things with the snd_hda
driver. And it's a major pain. Imo the proper way to do this is for one
driver to expose a platform driver with a bunch of specific interfaces and
for the other driver to register as a platform driver against that device.

Then all the usual linux hotplug infrastructure will make sure that this
all works and there's a clear runtime dependency. For i915 that's what I've
requested the audio guys to look into, and also what I'll require for
other such sub-driver stuff (e.g. we have a non-intel video codec on vlv
gfx). Well for audio it will be a bit fancier since we also want some
standardized stuff to allow userspace to see the association between the
gfx output and the audio side. Atm you can only guess if you have more
than one screen connected.

This approach gives you full flexibility and you can e.g. blacklist the
subdriver for debugging, without a kernel recompile.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[PATCH 08/83] drm/radeon: Add calls to initialize and finalize kfd from radeon

2014-07-17 Thread Oded Gabbay
On 17/07/14 15:29, Christian König wrote:
> On 17.07.2014 13:57, Oded Gabbay wrote:
>> On 11/07/14 19:36, Jerome Glisse wrote:
>>> On Fri, Jul 11, 2014 at 12:50:08AM +0300, Oded Gabbay wrote:
>>>> The KFD driver should be loaded when the radeon driver is loaded and
>>>> should be finalized when the radeon driver is removed.
>>>>
>>>> This patch adds a function call to initialize kfd from radeon_init
>>>> and a function call to finalize kfd from radeon_exit.
>>>>
>>>> If the KFD driver is not present in the system, the initialize call
>>>> fails and the radeon driver continues normally.
>>>>
>>>> This patch also adds calls to probe, initialize and finalize a kfd device
>>>> per radeon device using the kgd-->kfd interface.
>>>>
>>>> Signed-off-by: Oded Gabbay
>>>
>>> It might be nice to allow to build radeon without HSA so i think an
>>> CONFIG_HSA should be added and have other thing depends on it.
>>> Otherwise this one is.
>>>
>>> Reviewed-by: Jérôme Glisse
>>>
>> We do allow it :)
>> There is no problem building radeon without the kfd. In that case, when 
>> radeon
>> finds out that kfd is not available, it simply moves on with its
>> initialization procedure.
>
> At least off hand I don't see how this should work. Radeon directly calls
> radeon_kfd_(probe|init|fini) and so has a direct dependency on it.
>
> Christian.
But radeon_kfd.c is now a permanent part of the radeon driver. I talked with 
Alex about it and we both agreed on that. So radeon_kfd_* functions are 
*always* 
there when you build radeon.
Oded
>
>>
>> Oded
>>>
>>>> ---
>>>>   drivers/gpu/drm/radeon/radeon_drv.c | 6 ++++++
>>>>   drivers/gpu/drm/radeon/radeon_kms.c | 9 +++++++++
>>>>   2 files changed, 15 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c
>>>> b/drivers/gpu/drm/radeon/radeon_drv.c
>>>> index cb14213..88a45a0 100644
>>>> --- a/drivers/gpu/drm/radeon/radeon_drv.c
>>>> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
>>>> @@ -151,6 +151,9 @@ static inline void radeon_register_atpx_handler(void)
>>>> {}
>>>>   static inline void radeon_unregister_atpx_handler(void) {}
>>>>   #endif
>>>>
>>>> +extern bool radeon_kfd_init(void);
>>>> +extern void radeon_kfd_fini(void);
>>>> +
>>>>   int radeon_no_wb;
>>>>   int radeon_modeset = -1;
>>>>   int radeon_dynclks = -1;
>>>> @@ -630,12 +633,15 @@ static int __init radeon_init(void)
>>>>   #endif
>>>>   }
>>>>
>>>> +radeon_kfd_init();
>>>> +
>>>>   /* let modprobe override vga console setting */
>>>>   return drm_pci_init(driver, pdriver);
>>>>   }
>>>>
>>>>   static void __exit radeon_exit(void)
>>>>   {
>>>> +radeon_kfd_fini();
>>>>   drm_pci_exit(driver, pdriver);
>>>>   radeon_unregister_atpx_handler();
>>>>   }
>>>> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c
>>>> b/drivers/gpu/drm/radeon/radeon_kms.c
>>>> index 35d9318..0748284 100644
>>>> --- a/drivers/gpu/drm/radeon/radeon_kms.c
>>>> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
>>>> @@ -34,6 +34,10 @@
>>>>   #include 
>>>>   #include 
>>>>
>>>> +extern void radeon_kfd_device_probe(struct radeon_device *rdev);
>>>> +extern void radeon_kfd_device_init(struct radeon_device *rdev);
>>>> +extern void radeon_kfd_device_fini(struct radeon_device *rdev);
>>>> +
>>>>   #if defined(CONFIG_VGA_SWITCHEROO)
>>>>   bool radeon_has_atpx(void);
>>>>   #else
>>>> @@ -63,6 +67,8 @@ int radeon_driver_unload_kms(struct drm_device *dev)
>>>>
>>>>   pm_runtime_get_sync(dev->dev);
>>>>
>>>> +radeon_kfd_device_fini(rdev);
>>>> +
>>>>   radeon_acpi_fini(rdev);
>>>>
>>>>   radeon_modeset_fini(rdev);
>>>> @@ -142,6 +148,9 @@ int radeon_driver_load_kms(struct drm_device *dev,
>>>> unsigned long flags)
>>>>   "Error during ACPI methods call\n");
>>>>   }
>>>>
>>>> +radeon_kfd_device_probe(rdev);
>>>> +radeon_kfd_device_init(rdev);
>>>> +
>>>>   if (radeon_is_px(dev)) {
>>>>   pm_runtime_use_autosuspend(dev->dev);
>>>>   pm_runtime_set_autosuspend_delay(dev->dev, 5000);
>>>> --
>>>> 1.9.1

>>
>



[Bug 73457] mpeg4 through vdpau randomly either correct or garbled (on same file!)

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=73457

--- Comment #15 from Fabrice Bellet  ---
Increasing the dpb_size value in src/gallium/drivers/radeon/radeon_uvd.c seems
to be a workaround. I have been testing with dpb_size = dpb_size * 2 for a
couple of hours, and I cannot reproduce the problem. max_references is set to 2
in my case; this is the default value of the vaapi-vdpau-driver for mpeg4,
defined in the function get_num_ref_frames().



[PATCH 11/83] hsa/radeon: Add scheduler code

2014-07-17 Thread Oded Gabbay
On 11/07/14 21:25, Jerome Glisse wrote:
> On Fri, Jul 11, 2014 at 12:50:11AM +0300, Oded Gabbay wrote:
>> This patch adds the code base of the scheduler, which handles queue
>> creation, deletion and scheduling on the CP of the GPU.
>>
>> Signed-off-by: Oded Gabbay 
>
> I would rather see all this squashed; this gives the feeling that the driver
> can access registers, which is later removed. I know juggling with
> patch squashing can be daunting, but it really makes reviewing hard
> here because I have to jump back and forth to see if the thing I am looking
> at really matters in the final version.
>
> Cheers,
> Jérôme
Squashed and restructured in v2 of the patchset.
Oded
>
>> ---
>>   drivers/gpu/hsa/radeon/Makefile   |   3 +-
>>   drivers/gpu/hsa/radeon/cik_regs.h | 213 +++
>>   drivers/gpu/hsa/radeon/kfd_device.c   |   1 +
>>   drivers/gpu/hsa/radeon/kfd_registers.c|  50 ++
>>   drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 800 
>> ++
>>   drivers/gpu/hsa/radeon/kfd_vidmem.c   |  61 ++
>>   6 files changed, 1127 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/gpu/hsa/radeon/cik_regs.h
>>   create mode 100644 drivers/gpu/hsa/radeon/kfd_registers.c
>>   create mode 100644 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
>>   create mode 100644 drivers/gpu/hsa/radeon/kfd_vidmem.c
>>
>> diff --git a/drivers/gpu/hsa/radeon/Makefile 
>> b/drivers/gpu/hsa/radeon/Makefile
>> index 989518a..28da10c 100644
>> --- a/drivers/gpu/hsa/radeon/Makefile
>> +++ b/drivers/gpu/hsa/radeon/Makefile
>> @@ -4,6 +4,7 @@
>>
>>   radeon_kfd-y   := kfd_module.o kfd_device.o kfd_chardev.o \
>>  kfd_pasid.o kfd_topology.o kfd_process.o \
>> -kfd_doorbell.o
>> +kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
>> +kfd_vidmem.o
>>
>>   obj-$(CONFIG_HSA_RADEON)   += radeon_kfd.o
>> diff --git a/drivers/gpu/hsa/radeon/cik_regs.h 
>> b/drivers/gpu/hsa/radeon/cik_regs.h
>> new file mode 100644
>> index 000..d0cdc57
>> --- /dev/null
>> +++ b/drivers/gpu/hsa/radeon/cik_regs.h
>> @@ -0,0 +1,213 @@
>> +/*
>> + * Copyright 2014 Advanced Micro Devices, Inc.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included 
>> in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>> + * OTHER DEALINGS IN THE SOFTWARE.
>> + */
>> +
>> +#ifndef CIK_REGS_H
>> +#define CIK_REGS_H
>> +
>> +#define BIF_DOORBELL_CNTL   0x530Cu
>> +
>> +#define SRBM_GFX_CNTL   0xE44
>> +#define PIPEID(x)   ((x) << 0)
>> +#define MEID(x) ((x) << 2)
>> +#define VMID(x) ((x) << 4)
>> +#define QUEUEID(x)  ((x) << 8)
>> +
>> +#define SQ_CONFIG   0x8C00
>> +
>> +#define SH_MEM_BASES   0x8C28
>> +/* if PTR32, these are the bases for scratch and lds */
>> +#define PRIVATE_BASE(x) ((x) << 0) /* scratch */
>> +#define SHARED_BASE(x)  ((x) << 16) /* LDS */
>> +#define SH_MEM_APE1_BASE   0x8C2C
>> +/* if PTR32, this is the base location of GPUVM */
>> +#define SH_MEM_APE1_LIMIT   0x8C30
>> +/* if PTR32, this is the upper limit of GPUVM */
>> +#define SH_MEM_CONFIG   0x8C34
>> +#define PTR32   (1 << 0)
>> +#define ALIGNMENT_MODE(x)   ((x) << 2)
>> +#define SH_MEM_ALIGNMENT_MODE_DWORD 0
>> +#define SH_MEM_ALIGNMENT_MODE_DWORD_STRICT  1
>> +#define SH_MEM_ALIGNMENT_MODE_STRICT   2
>> +#define SH_MEM_ALIGNMENT_MODE_UNALIGNED 3
>> 

[PATCH 08/83] drm/radeon: Add calls to initialize and finalize kfd from radeon

2014-07-17 Thread Oded Gabbay
On 11/07/14 19:36, Jerome Glisse wrote:
> On Fri, Jul 11, 2014 at 12:50:08AM +0300, Oded Gabbay wrote:
>> The KFD driver should be loaded when the radeon driver is loaded and
>> should be finalized when the radeon driver is removed.
>>
>> This patch adds a function call to initialize kfd from radeon_init
>> and a function call to finalize kfd from radeon_exit.
>>
>> If the KFD driver is not present in the system, the initialize call
>> fails and the radeon driver continues normally.
>>
>> This patch also adds calls to probe, initialize and finalize a kfd device
>> per radeon device using the kgd-->kfd interface.
>>
>> Signed-off-by: Oded Gabbay 
>
> It might be nice to allow building radeon without HSA, so I think a
> CONFIG_HSA option should be added, with the other things depending on it.
> Otherwise this one is:
>
> Reviewed-by: Jérôme Glisse
>
We do allow it :)
There is no problem building radeon without the kfd. In that case, when radeon 
finds out that kfd is not available, it simply moves on with its initialization 
procedure.
Oded
>
>> ---
>>   drivers/gpu/drm/radeon/radeon_drv.c | 6 ++
>>   drivers/gpu/drm/radeon/radeon_kms.c | 9 +
>>   2 files changed, 15 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
>> b/drivers/gpu/drm/radeon/radeon_drv.c
>> index cb14213..88a45a0 100644
>> --- a/drivers/gpu/drm/radeon/radeon_drv.c
>> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
>> @@ -151,6 +151,9 @@ static inline void radeon_register_atpx_handler(void) {}
>>   static inline void radeon_unregister_atpx_handler(void) {}
>>   #endif
>>
>> +extern bool radeon_kfd_init(void);
>> +extern void radeon_kfd_fini(void);
>> +
>>   int radeon_no_wb;
>>   int radeon_modeset = -1;
>>   int radeon_dynclks = -1;
>> @@ -630,12 +633,15 @@ static int __init radeon_init(void)
>>   #endif
>>  }
>>
>> +radeon_kfd_init();
>> +
>>  /* let modprobe override vga console setting */
>>  return drm_pci_init(driver, pdriver);
>>   }
>>
>>   static void __exit radeon_exit(void)
>>   {
>> +radeon_kfd_fini();
>>  drm_pci_exit(driver, pdriver);
>>  radeon_unregister_atpx_handler();
>>   }
>> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
>> b/drivers/gpu/drm/radeon/radeon_kms.c
>> index 35d9318..0748284 100644
>> --- a/drivers/gpu/drm/radeon/radeon_kms.c
>> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
>> @@ -34,6 +34,10 @@
>>   #include 
>>   #include 
>>
>> +extern void radeon_kfd_device_probe(struct radeon_device *rdev);
>> +extern void radeon_kfd_device_init(struct radeon_device *rdev);
>> +extern void radeon_kfd_device_fini(struct radeon_device *rdev);
>> +
>>   #if defined(CONFIG_VGA_SWITCHEROO)
>>   bool radeon_has_atpx(void);
>>   #else
>> @@ -63,6 +67,8 @@ int radeon_driver_unload_kms(struct drm_device *dev)
>>
>>  pm_runtime_get_sync(dev->dev);
>>
>> +radeon_kfd_device_fini(rdev);
>> +
>>  radeon_acpi_fini(rdev);
>>  
>>  radeon_modeset_fini(rdev);
>> @@ -142,6 +148,9 @@ int radeon_driver_load_kms(struct drm_device *dev, 
>> unsigned long flags)
>>  "Error during ACPI methods call\n");
>>  }
>>
>> +radeon_kfd_device_probe(rdev);
>> +radeon_kfd_device_init(rdev);
>> +
>>  if (radeon_is_px(dev)) {
>>  pm_runtime_use_autosuspend(dev->dev);
>>  pm_runtime_set_autosuspend_delay(dev->dev, 5000);
>> --
>> 1.9.1
>>



[PATCH 04/83] drm/radeon: Add radeon <--> kfd interface

2014-07-17 Thread Oded Gabbay
On 11/07/14 19:24, Jerome Glisse wrote:
> On Thu, Jul 10, 2014 at 03:38:33PM -0700, Joe Perches wrote:
>> On Fri, 2014-07-11 at 00:50 +0300, Oded Gabbay wrote:
>>> This patch adds the interface between the radeon driver and the kfd
>>> driver. The interface implementation is contained in
>>> radeon_kfd.c and radeon_kfd.h.
>> []
>>>   include/linux/radeon_kfd.h  | 67 ++
>>
>> Is there a good reason to put this file in include/linux?
>>
>
> Agreed, we do not want to clutter include/linux/ with driver-specific
> includes. I think it's one of the rules, even though there are some hw headers
> already in there.
>
> I would rather see either a new dir include/hsa or inside include/drm.
>
> Cheers,
> Jérôme
>

Moved to drm/radeon in v2 of the patchset
Oded


[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-17 Thread Oded Gabbay
On 11/07/14 20:28, Joe Perches wrote:
> On Fri, 2014-07-11 at 13:04 -0400, Jerome Glisse wrote:
>> On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
> []
>>> +static long kfd_ioctl(struct file *, unsigned int, unsigned long);
>>
>> Nitpick, avoid unsigned int just use unsigned.
>
> I suggest unsigned int is much more common (and better)
> than just unsigned.
>
> $ git grep -P '\bunsigned\s+(?!long|int|short|char)' -- "*.[ch]" | wc -l
> 20778
>
> $ git grep -P "\bunsigned\s+int\b" -- "*.[ch]" | wc -l
> 98068
>
So I left it as unsigned int in v2 of the patchset.

>>> +static int kfd_open(struct inode *, struct file *);
>
> It's also generally better to use types and names to
> improve how a human reads and understands the code.
>
>
Fixed in v2 of the patchset.

Oded



[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-17 Thread Oded Gabbay
On 11/07/14 22:22, Jerome Glisse wrote:
> On Fri, Jul 11, 2014 at 06:56:12PM +, Bridgman, John wrote:
>>> From: Jerome Glisse [mailto:j.glisse at gmail.com]
>>> Sent: Friday, July 11, 2014 2:52 PM
>>> To: Bridgman, John
>>> Cc: Oded Gabbay; David Airlie; Deucher, Alexander; linux-
>>> kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Lewycky, 
>>> Andrew;
>>> Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki; Kishon
>>> Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas Pandruvada;
>>> Santosh Shilimkar; Andreas Noever; Lucas Stach; Philipp Zabel
>>> Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver for
>>> AMD's GPUs
>>>
>>> On Fri, Jul 11, 2014 at 06:46:30PM +, Bridgman, John wrote:
> From: Jerome Glisse [mailto:j.glisse at gmail.com]
> Sent: Friday, July 11, 2014 2:11 PM
> To: Bridgman, John
> Cc: Oded Gabbay; David Airlie; Deucher, Alexander; linux-
> kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Lewycky,
> Andrew; Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J.
> Wysocki; Kishon Vijay Abraham I; Sandeep Nair; Kenneth Heitke;
> Srinivas Pandruvada; Santosh Shilimkar; Andreas Noever; Lucas Stach;
> Philipp Zabel
> Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver
> for AMD's GPUs
>
> On Fri, Jul 11, 2014 at 06:02:39PM +, Bridgman, John wrote:
>>> From: Jerome Glisse [mailto:j.glisse at gmail.com]
>>> Sent: Friday, July 11, 2014 1:04 PM
>>> To: Oded Gabbay
>>> Cc: David Airlie; Deucher, Alexander;
>>> linux-kernel at vger.kernel.org;
>>> dri- devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew;
>>> Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki;
>>> Kishon Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas
>>> Pandruvada; Santosh Shilimkar; Andreas Noever; Lucas Stach;
>>> Philipp Zabel
>>> Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver
>>> for AMD's GPUs
>>>
>>> On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
 This patch adds the code base of the hsa driver for AMD's GPUs.

 This driver is called kfd.

 This initial version supports the first HSA chip, Kaveri.

 This driver is located in a new directory structure under drivers/gpu.

 Signed-off-by: Oded Gabbay 
>>>
>>> There are too many coding style issues. While we have been lax on
>>> enforcing the scripts/checkpatch.pl rules, I think there is a limit
>>> to that. I am not strict on 80 chars per line, but other things
>>> need fixing
> so we stay in line.
>>>
>>> Also i am a bit worried about the license, given top comment in
>>> each of the files i am not sure this is GPL2 compatible. I would
>>> need to ask lawyer to review that.
>>>
>>
>> Hi Jerome,
>>
>> Which line in the license are you concerned about ? In theory we're
>> using
> the same license as the initial code pushes for radeon, and I just
> did a side-by side compare with the license header on cik.c in the
> radeon tree and confirmed that the two licenses are identical.
>>
>> The cik.c header has an additional "Authors:" line which the kfd
>> files do
> not, but AFAIK that is not part of the license text proper.
>>
>
> You can not claim GPL if you want to use this license. radeon is a
> weird beast for historical reasons, as we wanted to share code with BSD;
> thus it is dual licensed and this is reflected with:
> MODULE_LICENSE("GPL and additional rights");
>
> inside radeon_drv.c
>
> So if you want to have MODULE_LICENSE(GPL) then you should have
> headers that use the GPL license wording and no wording from a BSD-like
> license.
> Otherwise change the MODULE_LICENSE, and it would also be good to say
> dual licensed at the top of each file (or at least next to each license) so
> that it is clear this is a BSD & GPL license.

 Got it. Missed that we had a different MODULE_LICENSE.

 Since the goal is license compatibility with radeon so we can update the
>>> interface and move code between the drivers in future I guess my
>>> preference would be to update MODULE_LICENSE in the kfd code to "GPL and
>>> additional rights", do you think that would be OK ?
>>>
>>> I am not a lawyer and nothing that I said should be considered legal
>>> advice (on the contrary ;)). I think you need to be clearer, with each
>>> license clearly saying GPLv2 or BSD, i.e. dual licensed, but the dual
>>> license is a beast you would definitely want to talk to a lawyer about.
>>
>> Yeah, dual license seems horrid in its implications for developers so we've 
>> always tried to avoid it. GPL hurts us for porting to other OSes so the X11 
>> / "GPL with additional rights" combo seemed like the ideal solution and we 
>> 

[PATCH 1/1] Revert "drm/i915: drop i915_ prefix from enable_rc6, enable_fbc, enable_ppgtt parameters"

2014-07-17 Thread Amit Shah
On (Thu) 17 Jul 2014 [11:11:15], Daniel Vetter wrote:
> On Thu, Jul 17, 2014 at 02:32:41PM +0530, Amit Shah wrote:
> > On (Thu) 17 Jul 2014 [09:35:20], Daniel Vetter wrote:
> > > On Wed, Jul 16, 2014 at 9:54 PM, Linus Torvalds
> > >  wrote:
> > > > Sorry for the top post, I'm on the road..
> > > >
> > > > I'm wondering if we couldn't just keep both the old and the new names and
> > > > have them both point at the same variable? Remove the description for the old
> > > > name, but keep it working?
> > > 
> > > I'm really surprised here ... We have rc6 enabled by default
> > > everywhere, and all the additional rc6 levels that users try to enable
> > > are known to hard-hang machines.
> > 
> > I haven't had this problem on my hardware (ThinkPad T420s, lspci
> > below) for a few kernel versions.  I think I added the enable_rc6=
> > setting back from the time the deeper states were enabled and then
> > reverted for SandyBridge.
> > 
> > Nevertheless, with the current state, RC6p and RC6pp states are not
> > used.
> 
> Yeah, on snb they cause crashes and instability and also don't provide
> measurable power benefits (afaik). So I recommend you drop that one.

Not for me -- there have been no crashes / hangs / lockups as I
mentioned.

> > > I actually have plans to taint the
> > > kernel if you set any of them since I'm fed up with the random crash
> > > reports. Same for fbc, even more so for the ppgtt knob. My stance is
> > > that if you know about these knobs you _really_ should know the driver
> > > to its depths and so also be able to follow module parameter
> > > renamings.
> > 
> > I also remember there being bugzillas about power consumption, and
> > using this setting was recommended (for Fedora, I think).  I know
> > a few people are using this setting.
> 
> I know, google is littered with such entries. Unfortunately by the time
> google thinks something is important (which usually takes a few months)
> it's already badly outdated: i915 graphics development is charging ahead
> at a really brisk pace - we merge a few hundred patches per release for
> i915 alone.

But for SNB, there's really no "improvement" for the RC6 states, is
there?

> > > > On Jul 16, 2014 8:34 AM, "Amit Shah"  wrote:
> > > >>
> > > >> This reverts commit 3adee7a7976012a20f1d3b5a529a3c105e29fef1.
> > > >>
> > > >> After upgrading to v3.15, my laptop's battery started draining quite
> > > >> fast.  Powertop pointed to the deep RC6 states not being used.  The
> > > >> kernel param I had put to enable them had stopped working the way it
> > > >> used to; so I disagree with the 'not maintaining ABI' part of the param
> > > >> name change.
> > > >>
> > > >> However weird the names may be, they're in active use and changing them
> > > >> only causes pain for users.  This also isn't advertised (marked
> > > >> deprecated, big warning shown, etc.), so just reverting now.
> > > >>
> > > >> CC: Daniel Vetter 
> > > >> CC: Jani Nikula 
> > > >> CC: David Airlie 
> > > >> CC:  # v3.15+
> > > >> Signed-off-by: Amit Shah 
> > > 
> > > Anyway we need to figure out what went wrong here. Please share your
> > > exact kernelcmdline and lspci -nn. Also stats for before/after from
> > > powertop when idle please.
> > 
> > Powertop stats for idle are a little difficult -- since this is my
> > primary laptop.
> 
> Now I'm a bit confused: How have you measured that the lack of rc6p/pp is
> the reason for your power consumption regression?
> -Daniel

What I meant was rebooting in the middle of something is a pain
(usually a week or two between trying these things); and also for a
fair comparison, the workloads have to be similar for both the
powertop ratings.

In any case, my daily work doesn't change, and I noticed this
immediately upon booting into 3.15.  The laptop heats up a bit more,
that's the first clue; and the battery doesn't provide as much backup
as it used to.

Amit
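
For illustration, the aliasing idea raised earlier in this thread (keep both the old and the new parameter name working and have them point at the same variable, with only the new one documented) can be sketched in plain userspace C. This is not kernel code and not part of the revert patch: sk_params, sk_set_param and the table layout are hypothetical, and only the two parameter names are taken from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The single backing variable that both names refer to. */
static int enable_rc6 = -1;

struct sk_param {
	const char *name;	/* name accepted on the command line */
	int *value;		/* storage the name maps to */
	const char *desc;	/* NULL means "accepted but not documented" */
};

/* Hypothetical table: new name is documented, old name is a silent alias. */
static const struct sk_param sk_params[] = {
	{ "enable_rc6",      &enable_rc6, "documented name" },
	{ "i915_enable_rc6", &enable_rc6, NULL },
};

/* Returns 0 if the name is known (either spelling), -1 otherwise. */
static int sk_set_param(const char *name, int value)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(sk_params) / sizeof(sk_params[0]); i++) {
		if (strcmp(sk_params[i].name, name) == 0) {
			*sk_params[i].value = value;
			return 0;
		}
	}
	return -1;
}
```

With a table like this, a command line that still uses the old spelling keeps taking effect, while the help text only advertises the new one.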


[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-17 Thread Oded Gabbay
On 14/07/14 10:58, Christian König wrote:
> Am 14.07.2014 09:38, schrieb Michel Dänzer:
>> On 11.07.2014 06:50, Oded Gabbay wrote:
>>> @@ -5876,8 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct
>>> radeon_ib *ib)
>>>*/
>>>   int cik_vm_init(struct radeon_device *rdev)
>>>   {
>>> -/* number of VMs */
>>> -rdev->vm_manager.nvm = 16;
>>> +/*
>>> + * number of VMs
>>> + * VMID 0 is reserved for Graphics
>>> + * radeon compute will use VMIDs 1-7
>>> + * KFD will use VMIDs 8-15
>>> + */
>>> +rdev->vm_manager.nvm = 8;
>> This comment is inaccurate: Graphics can use VMIDs 1-7 as well.
>
> Actually, VMID 0 is reserved for system use and graphics operations only use
> VMIDs 1-7.
>
> Christian.
Will be fixed in v2 of the patchset

Oded
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
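
As a sketch of the split described in this exchange (VMID 0 for system use, VMIDs 1-7 for radeon graphics and compute, VMIDs 8-15 for KFD), the ownership rule could be written as below. The enum, the macro names and vmid_owner() are hypothetical and only restate the comment and the corrections from the thread; the value 8 is what the patch assigns to rdev->vm_manager.nvm.

```c
#include <assert.h>

enum vmid_owner { VMID_SYSTEM, VMID_RADEON, VMID_KFD };

#define SKETCH_NUM_VMIDS   16	/* VMIDs 0-15, per the patch comment */
#define SKETCH_RADEON_NVM   8	/* value the patch assigns to vm_manager.nvm */

/* Map a VMID to the component that owns it under the proposed split. */
static enum vmid_owner vmid_owner(unsigned int vmid)
{
	assert(vmid < SKETCH_NUM_VMIDS);
	if (vmid == 0)
		return VMID_SYSTEM;	/* reserved for system use */
	return vmid < SKETCH_RADEON_NVM ? VMID_RADEON : VMID_KFD;
}
```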


[PATCH 08/83] drm/radeon: Add calls to initialize and finalize kfd from radeon

2014-07-17 Thread Christian König
>> +
>>>>>   int radeon_no_wb;
>>>>>   int radeon_modeset = -1;
>>>>>   int radeon_dynclks = -1;
>>>>> @@ -630,12 +633,15 @@ static int __init radeon_init(void)
>>>>>   #endif
>>>>>   }
>>>>>
>>>>> +radeon_kfd_init();
>>>>> +
>>>>>   /* let modprobe override vga console setting */
>>>>>   return drm_pci_init(driver, pdriver);
>>>>>   }
>>>>>
>>>>>   static void __exit radeon_exit(void)
>>>>>   {
>>>>> +radeon_kfd_fini();
>>>>>   drm_pci_exit(driver, pdriver);
>>>>>   radeon_unregister_atpx_handler();
>>>>>   }
>>>>> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c
>>>>> b/drivers/gpu/drm/radeon/radeon_kms.c
>>>>> index 35d9318..0748284 100644
>>>>> --- a/drivers/gpu/drm/radeon/radeon_kms.c
>>>>> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
>>>>> @@ -34,6 +34,10 @@
>>>>>   #include 
>>>>>   #include 
>>>>>
>>>>> +extern void radeon_kfd_device_probe(struct radeon_device *rdev);
>>>>> +extern void radeon_kfd_device_init(struct radeon_device *rdev);
>>>>> +extern void radeon_kfd_device_fini(struct radeon_device *rdev);
>>>>> +
>>>>>   #if defined(CONFIG_VGA_SWITCHEROO)
>>>>>   bool radeon_has_atpx(void);
>>>>>   #else
>>>>> @@ -63,6 +67,8 @@ int radeon_driver_unload_kms(struct drm_device 
>>>>> *dev)
>>>>>
>>>>>   pm_runtime_get_sync(dev->dev);
>>>>>
>>>>> +radeon_kfd_device_fini(rdev);
>>>>> +
>>>>>   radeon_acpi_fini(rdev);
>>>>>
>>>>>   radeon_modeset_fini(rdev);
>>>>> @@ -142,6 +148,9 @@ int radeon_driver_load_kms(struct drm_device 
>>>>> *dev,
>>>>> unsigned long flags)
>>>>>   "Error during ACPI methods call\n");
>>>>>   }
>>>>>
>>>>> +radeon_kfd_device_probe(rdev);
>>>>> +radeon_kfd_device_init(rdev);
>>>>> +
>>>>>   if (radeon_is_px(dev)) {
>>>>>   pm_runtime_use_autosuspend(dev->dev);
>>>>>   pm_runtime_set_autosuspend_delay(dev->dev, 5000);
>>>>> -- 
>>>>> 1.9.1
>>>>>
>>>
>>
>



[PATCH 1/1] Revert "drm/i915: drop i915_ prefix from enable_rc6, enable_fbc, enable_ppgtt parameters"

2014-07-17 Thread Amit Shah
On (Thu) 17 Jul 2014 [09:35:20], Daniel Vetter wrote:
> On Wed, Jul 16, 2014 at 9:54 PM, Linus Torvalds
>  wrote:
> > Sorry for the top post, I'm on the road..
> >
> > I'm wondering if we couldn't just keep both the old and the new names and have
> > them both point at the same variable? Remove the description for the old
> > name, but keep it working?
> 
> I'm really surprised here ... We have rc6 enabled by default
> everywhere, and all the additional rc6 levels that users try to enable
> are known to hard-hang machines.

I haven't had this problem on my hardware (ThinkPad T420s, lspci
below) for a few kernel versions.  I think I added the enable_rc6=
setting back from the time the deeper states were enabled and then
reverted for SandyBridge.

Nevertheless, with the current state, RC6p and RC6pp states are not
used.

> I actually have plans to taint the
> kernel if you set any of them since I'm fed up with the random crash
> reports. Same for fbc, even more so for the ppgtt knob. My stance is
> that if you know about these knobs you _really_ should know the driver
> to its depths and so also be able to follow module parameter
> renamings.

I also remember there being bugzillas about power consumption, and
using this setting was recommended (for Fedora, I think).  I know
a few people are using this setting.

> > On Jul 16, 2014 8:34 AM, "Amit Shah"  wrote:
> >>
> >> This reverts commit 3adee7a7976012a20f1d3b5a529a3c105e29fef1.
> >>
> >> After upgrading to v3.15, my laptop's battery started draining quite
> >> fast.  Powertop pointed to the deep RC6 states not being used.  The
> >> kernel param I had put to enable them had stopped working the way it
> >> used to; so I disagree with the 'not maintaining ABI' part of the param
> >> name change.
> >>
> >> However weird the names may be, they're in active use and changing them
> >> only causes pain for users.  This also isn't advertised (marked
> >> deprecated, big warning shown, etc.), so just reverting now.
> >>
> >> CC: Daniel Vetter 
> >> CC: Jani Nikula 
> >> CC: David Airlie 
> >> CC:  # v3.15+
> >> Signed-off-by: Amit Shah 
> 
> Anyway we need to figure out what went wrong here. Please share your
> exact kernelcmdline and lspci -nn. Also stats for before/after from
> powertop when idle please.

Powertop stats for idle are a little difficult -- since this is my
primary laptop.

BOOT_IMAGE=/vmlinuz-3.15.4-200.fc20.x86_64 
root=/dev/mapper/luks-3aff2acf-737d-4002-b644-15f599d09a18 ro 
rd.lvm.lv=fedora_grmbl/00 rd.lvm.lv=fedora_grmbl/01 
vconsole.font=latarcyrheb-sun16 
rd.luks.uuid=luks-0934d354-5b07-4e91-a699-9bfc57e76fdc 
rd.luks.uuid=luks-3aff2acf-737d-4002-b644-15f599d09a18 rhgb quiet slub_debug=- 
i915.i915_enable_rc6=7 LANG=en_IN.UTF-8

00:00.0 Host bridge [0600]: Intel Corporation 2nd Generation Core Processor 
Family DRAM Controller [8086:0104] (rev 09)
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core 
Processor Family Integrated Graphics Controller [8086:0126] (rev 09)
00:16.0 Communication controller [0780]: Intel Corporation 6 Series/C200 Series 
Chipset Family MEI Controller #1 [8086:1c3a] (rev 04)
00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network 
Connection [8086:1502] (rev 04)
00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset 
Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 04)
00:1b.0 Audio device [0403]: Intel Corporation 6 Series/C200 Series Chipset 
Family High Definition Audio Controller [8086:1c20] (rev 04)
00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset 
Family PCI Express Root Port 1 [8086:1c10] (rev b4)
00:1c.1 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset 
Family PCI Express Root Port 2 [8086:1c12] (rev b4)
00:1c.3 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset 
Family PCI Express Root Port 4 [8086:1c16] (rev b4)
00:1c.4 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset 
Family PCI Express Root Port 5 [8086:1c18] (rev b4)
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset 
Family USB Enhanced Host Controller #1 [8086:1c26] (rev 04)
00:1f.0 ISA bridge [0601]: Intel Corporation QM67 Express Chipset Family LPC 
Controller [8086:1c4f] (rev 04)
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset 
Family 6 port SATA AHCI Controller [8086:1c03] (rev 04)
00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family 
SMBus Controller [8086:1c22] (rev 04)
03:00.0 Network controller [0280]: Intel Corporation Centrino Advanced-N 6205 
[Taylor Peak] [8086:0085] (rev 34)
05:00.0 SD Host controller [0805]: Ricoh Co Ltd MMC/SD Host Controller 
[1180:e822] (rev 07)
0d:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host 
Controller [1033:0194] (rev 04)


Thanks,
Amit


[PATCH 08/83] drm/radeon: Add calls to initialize and finalize kfd from radeon

2014-07-17 Thread Christian König
Am 17.07.2014 13:57, schrieb Oded Gabbay:
> On 11/07/14 19:36, Jerome Glisse wrote:
>> On Fri, Jul 11, 2014 at 12:50:08AM +0300, Oded Gabbay wrote:
>>> The KFD driver should be loaded when the radeon driver is loaded and
>>> should be finalized when the radeon driver is removed.
>>>
>>> This patch adds a function call to initialize kfd from radeon_init
>>> and a function call to finalize kfd from radeon_exit.
>>>
>>> If the KFD driver is not present in the system, the initialize call
>>> fails and the radeon driver continues normally.
>>>
>>> This patch also adds calls to probe, initialize and finalize a kfd 
>>> device
>>> per radeon device using the kgd-->kfd interface.
>>>
>>> Signed-off-by: Oded Gabbay 
>>
>> It might be nice to allow building radeon without HSA, so I think a
>> CONFIG_HSA option should be added, with the other things depending on it.
>> Otherwise this one is:
>>
>> Reviewed-by: Jérôme Glisse
>>
> We do allow it :)
> There is no problem building radeon without the kfd. In that case, 
> when radeon finds out that kfd is not available, it simply moves on 
> with its initialization procedure.

At least off hand I don't see how this should work. Radeon directly 
calls radeon_kfd_(probe|init|fini) and so has a direct dependency on it.

Christian.
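
To illustrate the concern: a direct call only links if the callee exists. One common way around that, and the pattern the quoted radeon_drv.c context already shows for radeon_register_atpx_handler(), is to supply empty static inline stubs when the feature is compiled out. The userspace sketch below assumes a hypothetical SKETCH_HAVE_KFD switch and hypothetical sketch_* names; whether the patchset resolves the dependency this way is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* SKETCH_HAVE_KFD is a hypothetical switch standing in for a Kconfig option. */
#ifdef SKETCH_HAVE_KFD
bool sketch_kfd_init(void);	/* would be provided by the optional driver */
void sketch_kfd_fini(void);
#else
/* Empty stubs: the callers below still compile and link without the driver. */
static inline bool sketch_kfd_init(void) { return false; }
static inline void sketch_kfd_fini(void) {}
#endif

static bool sketch_loaded;

/* Same shape as radeon_init() in the patch: try kfd, then carry on regardless. */
static int sketch_driver_init(void)
{
	if (!sketch_kfd_init())
		printf("kfd not available, continuing without it\n");
	sketch_loaded = true;
	return 0;
}

static void sketch_driver_exit(void)
{
	assert(sketch_loaded);	/* exit without a prior init is a caller bug */
	sketch_kfd_fini();
	sketch_loaded = false;
}
```

Built without SKETCH_HAVE_KFD, the init path takes the stub, reports that kfd is missing and still returns success, which matches the behaviour described in the reply being quoted.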

>
> Oded
>>
>>> ---
>>>   drivers/gpu/drm/radeon/radeon_drv.c | 6 ++
>>>   drivers/gpu/drm/radeon/radeon_kms.c | 9 +
>>>   2 files changed, 15 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
>>> b/drivers/gpu/drm/radeon/radeon_drv.c
>>> index cb14213..88a45a0 100644
>>> --- a/drivers/gpu/drm/radeon/radeon_drv.c
>>> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
>>> @@ -151,6 +151,9 @@ static inline void 
>>> radeon_register_atpx_handler(void) {}
>>>   static inline void radeon_unregister_atpx_handler(void) {}
>>>   #endif
>>>
>>> +extern bool radeon_kfd_init(void);
>>> +extern void radeon_kfd_fini(void);
>>> +
>>>   int radeon_no_wb;
>>>   int radeon_modeset = -1;
>>>   int radeon_dynclks = -1;
>>> @@ -630,12 +633,15 @@ static int __init radeon_init(void)
>>>   #endif
>>>   }
>>>
>>> +radeon_kfd_init();
>>> +
>>>   /* let modprobe override vga console setting */
>>>   return drm_pci_init(driver, pdriver);
>>>   }
>>>
>>>   static void __exit radeon_exit(void)
>>>   {
>>> +radeon_kfd_fini();
>>>   drm_pci_exit(driver, pdriver);
>>>   radeon_unregister_atpx_handler();
>>>   }
>>> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
>>> b/drivers/gpu/drm/radeon/radeon_kms.c
>>> index 35d9318..0748284 100644
>>> --- a/drivers/gpu/drm/radeon/radeon_kms.c
>>> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
>>> @@ -34,6 +34,10 @@
>>>   #include 
>>>   #include 
>>>
>>> +extern void radeon_kfd_device_probe(struct radeon_device *rdev);
>>> +extern void radeon_kfd_device_init(struct radeon_device *rdev);
>>> +extern void radeon_kfd_device_fini(struct radeon_device *rdev);
>>> +
>>>   #if defined(CONFIG_VGA_SWITCHEROO)
>>>   bool radeon_has_atpx(void);
>>>   #else
>>> @@ -63,6 +67,8 @@ int radeon_driver_unload_kms(struct drm_device *dev)
>>>
>>>   pm_runtime_get_sync(dev->dev);
>>>
>>> +radeon_kfd_device_fini(rdev);
>>> +
>>>   radeon_acpi_fini(rdev);
>>>
>>>   radeon_modeset_fini(rdev);
>>> @@ -142,6 +148,9 @@ int radeon_driver_load_kms(struct drm_device 
>>> *dev, unsigned long flags)
>>>   "Error during ACPI methods call\n");
>>>   }
>>>
>>> +radeon_kfd_device_probe(rdev);
>>> +radeon_kfd_device_init(rdev);
>>> +
>>>   if (radeon_is_px(dev)) {
>>>   pm_runtime_use_autosuspend(dev->dev);
>>>   pm_runtime_set_autosuspend_delay(dev->dev, 5000);
>>> -- 
>>> 1.9.1
>>>
>



[PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT

2014-07-17 Thread Marek Olšák
On Thu, Jul 17, 2014 at 12:01 PM, Michel Dänzer wrote:
> Mesa patches:
>
> [PATCH 1/5] winsys/radeon: Use separate caching buffer managers for
> [PATCH 2/5] r600g/radeonsi: Use write-combined CPU mappings of some
> [PATCH 3/5] r600g/radeonsi: Prefer VRAM for CPU -> GPU streaming

For these 3 patches:

Reviewed-by: Marek Olšák

Marek


[Mesa-dev] [PATCH 5/5] r600g, radeonsi: Prefer VRAM for persistent mappings

2014-07-17 Thread Marek Olšák
Like I said at patch 4, this would be okay if the COHERENT flag wasn't set.

If you removed the PERSISTENT flag from the conditional, the placement
of persistent non-coherent buffers would be driven by the "usage",
meaning that you would be able to get any kind of placement you want.

Marek

On Thu, Jul 17, 2014 at 12:01 PM, Michel Dänzer wrote:
> From: Michel Dänzer
>
> Signed-off-by: Michel Dänzer
> ---
>  src/gallium/drivers/radeon/r600_buffer_common.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
> b/src/gallium/drivers/radeon/r600_buffer_common.c
> index c8a0723..6f7fa29 100644
> --- a/src/gallium/drivers/radeon/r600_buffer_common.c
> +++ b/src/gallium/drivers/radeon/r600_buffer_common.c
> @@ -125,12 +125,10 @@ bool r600_init_resource(struct r600_common_screen 
> *rscreen,
> break;
> }
>
> -   /* Use GTT for all persistent mappings, because they are
> -* always cached and coherent. */
> if (res->b.b.target == PIPE_BUFFER &&
> res->b.b.flags & (PIPE_RESOURCE_FLAG_MAP_PERSISTENT |
>   PIPE_RESOURCE_FLAG_MAP_COHERENT)) {
> -   res->domains = RADEON_DOMAIN_GTT;
> +   res->domains = RADEON_DOMAIN_VRAM;
> flags = RADEON_FLAG_GTT_WC;
> }
>
> --
> 2.0.0
>
> ___
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings

2014-07-17 Thread Marek Olšák
The resource flags actually tell you what you can do. If the COHERENT
flag is set, the mapping must be cached. If it's unset, it's up to
you.

If write-combining is faster for vertex uploads, then Glamor shouldn't
set the coherent flag.
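
A minimal sketch of the rule described above. The flag and policy names (`RES_MAP_PERSISTENT`, `RES_MAP_COHERENT`, `MAP_CACHED`, `MAP_WRITE_COMBINED`) are hypothetical stand-ins, not the real Gallium or winsys identifiers, and `prefer_wc` is an assumed caller hint:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the resource flags discussed above. */
enum res_flags {
    RES_MAP_PERSISTENT = 1u << 0,
    RES_MAP_COHERENT   = 1u << 1,
};

/* Hypothetical CPU mapping policies a driver could pick from. */
enum map_policy {
    MAP_CACHED,
    MAP_WRITE_COMBINED,
};

/*
 * A coherent mapping must be cached. Without the coherent flag the
 * driver is free to choose; this sketch picks write-combining only
 * when the caller asks for it.
 */
static enum map_policy choose_map_policy(unsigned flags, bool prefer_wc)
{
    if (flags & RES_MAP_COHERENT)
        return MAP_CACHED;
    return prefer_wc ? MAP_WRITE_COMBINED : MAP_CACHED;
}
```

Under this reading, a persistent buffer without the coherent flag is the only persistent case where write-combining is an option.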

Marek

On Thu, Jul 17, 2014 at 12:01 PM, Michel Dänzer wrote:
> From: Michel Dänzer
>
> This is hopefully safe: The kernel makes sure writes to these mappings
> finish before the GPU might start reading from them, and the GPU caches
> are invalidated at the start of a command stream.
>
> Signed-off-by: Michel Dänzer
> ---
>  src/gallium/drivers/radeon/r600_buffer_common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
> b/src/gallium/drivers/radeon/r600_buffer_common.c
> index 40917f0..c8a0723 100644
> --- a/src/gallium/drivers/radeon/r600_buffer_common.c
> +++ b/src/gallium/drivers/radeon/r600_buffer_common.c
> @@ -131,7 +131,7 @@ bool r600_init_resource(struct r600_common_screen 
> *rscreen,
> res->b.b.flags & (PIPE_RESOURCE_FLAG_MAP_PERSISTENT |
>   PIPE_RESOURCE_FLAG_MAP_COHERENT)) {
> res->domains = RADEON_DOMAIN_GTT;
> -   flags = 0;
> +   flags = RADEON_FLAG_GTT_WC;
> }
>
> /* Tiled textures are unmappable. Always put them in VRAM. */
> --
> 2.0.0
>


[Bug 78453] [HAWAII] Get acceleration working

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=78453

--- Comment #79 from Kai  ---
Created attachment 102983
  --> https://bugs.freedesktop.org/attachment.cgi?id=102983&action=edit
dmesg: lock-up with Xorg -retro

Ok, this is weird: just out of curiosity I tried to launch Xorg with "-retro",
then I do see errors logged (see attached excerpt from dmesg). If I run Xorg
without parameters, I just end up with a black screen, no logged errors and
Xorg.0.log looks like everything is fine (except for that
"config_odev_get_int_attribute" error):
> [  1724.926] (--) RADEON(0): Chipset: "HAWAII" (ChipID = 0x67b1)
> [  1724.926] (EE) Error config_odev_get_int_attribute called for non integer 
> attrib 4
> [  1724.926] (II) Loading sub module "dri2"
> [  1724.926] (II) LoadModule: "dri2"
> [  1724.926] (II) Module "dri2" already built-in
> [  1724.926] (II) Loading sub module "glamoregl"
> [  1724.926] (II) LoadModule: "glamoregl"
> [  1724.926] (II) Loading /usr/lib/xorg/modules/libglamoregl.so
> [  1724.931] (II) Module glamoregl: vendor="X.Org Foundation"
> [  1724.931]  compiled for 1.15.99.904, module version = 1.0.0
> [  1724.931]  ABI class: X.Org ANSI C Emulation, version 0.4
> [  1724.931] (II) glamor: OpenGL accelerated X.org driver based.
> [  1724.958] (II) glamor: EGL version 1.4 (DRI2):
> [  1724.975] (II) RADEON(0): glamor detected, initialising EGL layer.
> [  1724.975] (II) RADEON(0): KMS Color Tiling: disabled
> [  1724.975] (II) RADEON(0): KMS Color Tiling 2D: disabled
> [  1724.975] (II) RADEON(0): KMS Pageflipping: enabled
> [  1724.975] (II) RADEON(0): SwapBuffers wait for vsync: enabled
> [...]

Shouldn't I be seeing the same errors as with "-retro" as well?



[RFCv2] drm/msm: DT support for 8960/8064

2014-07-17 Thread divya ojha
Hi Rob,

On Tue, Jul 8, 2014 at 9:30 PM, Rob Clark  wrote:
> Now that we (almost) have enough dependencies in place (MMCC, RPM, etc),
> add necessary DT support so that we can use drm/msm on upstream kernel.
>
> Signed-off-by: Rob Clark 
> ---
> I thought I sent this already, but looks like I've forgot.
..
> diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
> index 7f7aade..041c2fc 100644
> --- a/drivers/gpu/drm/msm/hdmi/hdmi.c
> +++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
> @@ -123,7 +123,8 @@ struct hdmi *hdmi_init(struct drm_device *dev, struct 
> drm_encoder *encoder)
> for (i = 0; i < config->hpd_reg_cnt; i++) {
> struct regulator *reg;
>
> -   reg = devm_regulator_get(>dev, 
> config->hpd_reg_names[i]);
> +   reg = devm_regulator_get_exclusive(>dev,
> +   config->hpd_reg_names[i]);
> if (IS_ERR(reg)) {
> ret = PTR_ERR(reg);
> dev_err(dev->dev, "failed to get hpd regulator: %s 
> (%d)\n",
> @@ -138,7 +139,8 @@ struct hdmi *hdmi_init(struct drm_device *dev, struct 
> drm_encoder *encoder)
> for (i = 0; i < config->pwr_reg_cnt; i++) {
> struct regulator *reg;
>
> -   reg = devm_regulator_get(>dev, 
> config->pwr_reg_names[i]);
> +   reg = devm_regulator_get_exclusive(>dev,
> +   config->pwr_reg_names[i]);
> if (IS_ERR(reg)) {
> ret = PTR_ERR(reg);
> dev_err(dev->dev, "failed to get pwr regulator: %s 
> (%d)\n",

Don't we need an if (regulator_enabled) check after the
devm_regulator_get() call?
I see a similar test after the camera regulator_get() call.

-Regards
Divya


[Bug 78453] [HAWAII] Get acceleration working

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=78453

--- Comment #78 from Kai  ---
Ahrg, comment #77 was only half of what I wanted to add... (screwed the C up)

In Xorg.0.log I'm seeing
> (EE) Error config_odev_get_int_attribute called for non integer attrib 4
no matter whether I force acceleration on or not. Not sure this is relevant,
though.


If I force HAWAII acceleration on (same settings as Luzipher in comment #9),
I end up with a black screen and the system cannot be accessed directly. Over
SSH I found that no error is logged to either Xorg.0.log or dmesg (I checked
journalctl as well, just to be sure, but found nothing there either). Issuing
a reboot command over SSH didn't work either. Would I need to set some kernel
variable to get info about a locked-up GPU?


The stack I used was (base is Debian Testing):
GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1)
Linux: 3.15.5
libdrm: 2.4.54-1
LLVM: SVN:trunk/r213236
libclc: Git:master/0ec7437d9c
Mesa: Git:master/48deb4dbf2
DDX: 1:7.4.0-2
X: 2:1.15.99.904-1 (1.16.0 RC 4)



[PATCH] drm/radeon: remove visible vram size limit on bo allocation

2014-07-17 Thread Michel Dänzer
On 17.07.2014 02:26, Alex Deucher wrote:
> Now that fallback to gtt is fixed for cpu access, we can
> remove this limit.
> 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/radeon/radeon_gem.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> b/drivers/gpu/drm/radeon/radeon_gem.c
> index fdd189b..07a13c9 100644
> --- a/drivers/gpu/drm/radeon/radeon_gem.c
> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> @@ -55,8 +55,11 @@ int radeon_gem_object_create(struct radeon_device *rdev, 
> int size,
>   alignment = PAGE_SIZE;
>   }
>  
> - /* maximun bo size is the minimun btw visible vram and gtt size */
> - max_size = min(rdev->mc.visible_vram_size, rdev->mc.gtt_size);
> + /* Maximum bo size is the gtt size since we use the gtt to handle
> +  * vram to system pool migrations.  We could probably remove this
> +  * check altogether with a little additional work.
> +  */
> + max_size = rdev->mc.gtt_size;
>   if (size > max_size) {
>   DRM_DEBUG("Allocation size %dMb bigger than %ldMb limit\n",
> size >> 20, max_size >> 20);

A BO of size rdev->mc.gtt_size can never actually be bound to GTT,
because we have some pinned BOs in there. I think it's a bit
disingenuous to let userspace allocate a BO that can never actually be
used by the GPU. :)

The hack I attached to
https://bugs.freedesktop.org/show_bug.cgi?id=78717 has a start for
dealing with that. I was running that patch for a while and didn't
notice any bad effects from it.
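
A minimal sketch of the concern above, not taken from the patch attached to the bug report. It assumes the driver tracks how many bytes of GTT are held by pinned BOs; the struct and field names are hypothetical:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; the field names are illustrative only. */
struct gtt_info {
    uint64_t gtt_size;     /* total GTT size in bytes */
    uint64_t pinned_size;  /* bytes held by pinned BOs */
};

/* Upper bound on a BO that could still be bound next to the pinned BOs. */
static uint64_t max_bo_size(const struct gtt_info *info)
{
    if (info->pinned_size >= info->gtt_size)
        return 0;
    return info->gtt_size - info->pinned_size;
}

/* Reject allocations that could never be bound to GTT. */
static bool bo_size_allowed(const struct gtt_info *info, uint64_t size)
{
    return size <= max_bo_size(info);
}
```

This bound ignores fragmentation: pinned BOs scattered through the GTT can leave a largest contiguous hole that is smaller than the total unpinned size.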


-- 
Earthling Michel Dänzer    |  http://www.amd.com
Libre software enthusiast  |  Mesa and X developer


[PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings

2014-07-17 Thread Grigori Goronzy
On 17.07.2014 12:01, Michel Dänzer wrote:
> From: Michel Dänzer
> 
> This is hopefully safe: The kernel makes sure writes to these mappings
> finish before the GPU might start reading from them, and the GPU caches
> are invalidated at the start of a command stream.
>

Aren't CPU reads from write-combined GTT memory extraordinarily slow,
because they're uncached? And don't you need the right access patterns
to make write combining perform well?

Grigori

> Signed-off-by: Michel Dänzer
> ---
>  src/gallium/drivers/radeon/r600_buffer_common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
> b/src/gallium/drivers/radeon/r600_buffer_common.c
> index 40917f0..c8a0723 100644
> --- a/src/gallium/drivers/radeon/r600_buffer_common.c
> +++ b/src/gallium/drivers/radeon/r600_buffer_common.c
> @@ -131,7 +131,7 @@ bool r600_init_resource(struct r600_common_screen 
> *rscreen,
>   res->b.b.flags & (PIPE_RESOURCE_FLAG_MAP_PERSISTENT |
> PIPE_RESOURCE_FLAG_MAP_COHERENT)) {
>   res->domains = RADEON_DOMAIN_GTT;
> - flags = 0;
> + flags = RADEON_FLAG_GTT_WC;
>   }
>  
>   /* Tiled textures are unmappable. Always put them in VRAM. */
> 




[Bug 78453] [HAWAII] Get acceleration working

2014-07-17 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=78453

Kai  changed:

   What|Removed |Added

 CC||kai at dev.carbon-project.org

--- Comment #77 from Kai  ---
Just an additional note: with 3.15.5 I'm seeing eight (one for each ring?)
instances of:
> [drm:radeon_atom_get_leakage_vddc_based_on_leakage_params] *ERROR* Unknown 
> table version 3, 1

Probably not important to the larger issue (no acceleration; fallback to
llvmpipe) here though.



[PATCH v6 09/14] ARM: dts: s6e3fa0: add DT bindings

2014-07-17 Thread Thierry Reding
On Thu, Jul 17, 2014 at 06:01:24PM +0900, YoungJun Cho wrote:
> This patch adds DT bindings for s6e3fa0 panel.
> The bindings describes panel resources and display timings.

The commit message here should preferably say which platform this is
used on.

> Signed-off-by: YoungJun Cho 
> Acked-by: Inki Dae 
> Acked-by: Kyungmin Park 
> ---
>  .../devicetree/bindings/panel/samsung,s6e3fa0.txt  | 46 
> ++
>  1 file changed, 46 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt
> 
> diff --git a/Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt 
> b/Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt
> new file mode 100644
> index 000..2cd32f5
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/panel/samsung,s6e3fa0.txt
> @@ -0,0 +1,46 @@
> +Samsung S6E3FA0 AMOLED LCD 5.7 inch panel
> +
> +Required properties:
> +  - compatible: "samsung,s6e3fa0"
> +  - reg: the virtual channel number of a DSI peripheral
> +  - vdd3-supply: core voltage supply
> +  - vci-supply: voltage supply for analog circuits
> +  - reset-gpios: a GPIO spec for the reset pin
> +  - det-gpios: a GPIO spec for the OLED detection pin
> +  - te-gpios: a GPIO spec for the TE pin
> +  - display-timings: timings for the connected panel as described by [1]

display-timings should be optional. The panel driver should provide a
default mode. And only if you really need to override the default mode
you should provide the option of getting an alternative set of values
from DT.
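
A minimal sketch of that fallback, assuming a simplified, hypothetical `struct panel_mode`. The default values are placeholders and not real S6E3FA0 timings:

```c
#include <stddef.h>

/* Hypothetical, simplified mode description. */
struct panel_mode {
    int hdisplay;
    int vdisplay;
    int refresh_hz;
};

/* Built-in default; placeholder values, not real panel timings. */
static const struct panel_mode default_mode = { 1080, 1920, 60 };

/*
 * Return the override when one was provided (for example parsed from
 * an optional DT property), otherwise the driver's default mode.
 */
static const struct panel_mode *panel_get_mode(const struct panel_mode *override)
{
    return override ? override : &default_mode;
}
```

With this shape, a device tree without any timings still yields a usable mode.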

Thierry


[PATCH v6 10/14] drm/panel: add S6E3FA0 driver

2014-07-17 Thread Thierry Reding
ce *dsi = to_mipi_dsi_device(ctx->dev);
> +
> + mipi_dsi_dcs_write(dsi, dsi->channel, data, len);
> +}

Both mipi_dsi_dcs_read() and mipi_dsi_dcs_write() return error codes on
failure. Why are you silently ignoring them?

> +#define s6e3fa0_dcs_write_seq(ctx, seq...)   \
> +do { \
> + const unsigned char d[] = { seq };  \
> + BUILD_BUG_ON_MSG(ARRAY_SIZE(d) > 64, "too big seq for stack");  \
> + s6e3fa0_dcs_write(ctx, d, ARRAY_SIZE(d));   \
> +} while (0)
> +
> +#define s6e3fa0_dcs_write_seq_static(ctx, seq...)\
> +do { \
> + static const unsigned char d[] = { seq };   \
> + s6e3fa0_dcs_write(ctx, d, ARRAY_SIZE(d));   \
> +} while (0)

I've had this discussion with Andrzej before and I'm still not convinced
that this is a useful macro.

At least they should propagate the error code, though.
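
One way the error code could be propagated, as a sketch only. `demo_dcs_write()` is a hypothetical stand-in for the driver's write helper, and the macro takes an extra lvalue argument to carry the result; none of these names come from the patch:

```c
#include <stddef.h>

/* Hypothetical stand-in for the real write path; fails on empty input. */
static int demo_dcs_write(const unsigned char *data, size_t len)
{
    if (!data || len == 0)
        return -1;
    return 0;
}

/*
 * Variant of the sequence macro that hands the result back to the
 * caller through an lvalue argument instead of dropping it.
 */
#define demo_dcs_write_seq(ret, ...)                 \
do {                                                 \
    const unsigned char d[] = { __VA_ARGS__ };       \
    (ret) = demo_dcs_write(d, sizeof(d));            \
} while (0)

/* Each step checks the result and stops at the first failure. */
static int demo_panel_on(void)
{
    int ret;

    demo_dcs_write_seq(ret, 0x11);   /* DCS exit_sleep_mode */
    if (ret < 0)
        return ret;

    demo_dcs_write_seq(ret, 0x29);   /* DCS set_display_on */
    return ret;
}
```

Plain functions returning `int` would achieve the same without a macro.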

> +static void s6e3fa0_set_maximum_return_packet_size(struct s6e3fa0 *ctx,
> + unsigned int size)
> +{
> + struct mipi_dsi_device *dsi = to_mipi_dsi_device(ctx->dev);
> + const struct mipi_dsi_host_ops *ops = dsi->host->ops;
> +
> + if (ops && ops->transfer) {
> + unsigned char buf[] = {size, 0};
> + struct mipi_dsi_msg msg = {
> + .channel = dsi->channel,
> + .type = MIPI_DSI_SET_MAXIMUM_RETURN_PACKET_SIZE,
> + .tx_len = sizeof(buf),
> + .tx_buf = buf
> + };
> +
> + ops->transfer(dsi->host, &msg);
> + }
> +}

The Set Maximum Return Packet Size command is a standard command, so
please turn that into a generic function exposed by the DSI core.

> +static void s6e3fa0_read_mtp_id(struct s6e3fa0 *ctx)
> +{
> + unsigned char id[MTP_ID_LEN];
> + int ret;
> +
> + s6e3fa0_set_maximum_return_packet_size(ctx, MTP_ID_LEN);
> + ret = s6e3fa0_dcs_read(ctx, MIPI_DCS_GET_DISPLAY_ID, id, MTP_ID_LEN);

This also looks like a standard DCS command. I can't find it in either
v1.01 or v1.02 of the specification, though. Do you know where it's
specified?

> +static void s6e3fa0_set_te_on(struct s6e3fa0 *ctx)
> +{
> + s6e3fa0_dcs_write_seq_static(ctx, MIPI_DCS_SET_TEAR_ON, 0x00);
> +}

This is also a standard DCS command.

> +static int s6e3fa0_power_off(struct s6e3fa0 *ctx)
> +{
> + gpiod_set_value(ctx->reset_gpio, 0);

Setting the reset GPIO to 0 for power off? Shouldn't this be 1 and the
polarity be specified in the GPIO specifier?
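
A minimal sketch of the logical-versus-physical distinction behind this question. `struct demo_gpio` and `demo_gpio_set_value()` are hypothetical and only model a descriptor that carries an active-low flag:

```c
#include <stdbool.h>

/* Hypothetical descriptor: only the polarity flag matters here. */
struct demo_gpio {
    bool active_low;   /* would come from the GPIO specifier */
    int  raw_level;    /* physical line level, 0 or 1 */
};

/* Set the logical value; the polarity decides the physical level. */
static void demo_gpio_set_value(struct demo_gpio *gpio, int logical)
{
    bool asserted = logical != 0;

    gpio->raw_level = (asserted != gpio->active_low) ? 1 : 0;
}
```

With an active-low reset line, asserting reset is written as logical 1 in the driver and the line is driven low.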

> +static void s6e3fa0_set_sequence(struct s6e3fa0 *ctx)
> +{
> + s6e3fa0_apply_level_1_key(ctx);
> + s6e3fa0_dcs_write_seq_static(ctx, MIPI_DCS_EXIT_SLEEP_MODE);
> + msleep(20);
> +
> + s6e3fa0_read_mtp_id(ctx);
> + s6e3fa0_read_vddm(ctx);
> +
> + s6e3fa0_panel_init(ctx);
> +
> + s6e3fa0_dcs_write_seq_static(ctx, MIPI_DCS_SET_DISPLAY_ON);
> +}
> +
> +static int s6e3fa0_disable(struct drm_panel *panel)
> +{
> + struct s6e3fa0 *ctx = panel_to_s6e3fa0(panel);
> +
> + s6e3fa0_dcs_write_seq_static(ctx, MIPI_DCS_SET_DISPLAY_OFF);
> + msleep(35);
> + s6e3fa0_dcs_write_seq_static(ctx, MIPI_DCS_ENTER_SLEEP_MODE);
> + msleep(120);
> +
> + return s6e3fa0_power_off(ctx);
> +}

The SET_DISPLAY_{ON,OFF} and {ENTER,EXIT}_SLEEP_MODE are standard
commands, too.

> +static int s6e3fa0_probe(struct mipi_dsi_device *dsi)
> +{
> + struct device *dev = &dsi->dev;
> + struct s6e3fa0 *ctx;
> + struct gpio_desc *det_gpio;
> + int ret, te_gpio;
> +
> + ctx = devm_kzalloc(dev, sizeof(struct s6e3fa0), GFP_KERNEL);

sizeof(*ctx)

> + det_gpio = devm_gpiod_get(dev, "det");
> + if (IS_ERR(det_gpio)) {
> + dev_err(dev, "failed to get det gpio: %ld\n",
> + PTR_ERR(det_gpio));
> + return PTR_ERR(det_gpio);
> + }
> + ret = gpiod_direction_input(det_gpio);
> + if (ret < 0) {
> + dev_err(dev, "failed to configure det gpio: %d\n", ret);
> + return ret;
> + }
> + ret = devm_request_irq(dev, gpiod_to_irq(det_gpio),
> + s6e3fa0_det_interrupt, IRQF_TRIGGER_FALLING,
> + "oled-det", ctx);
> + if (ret) {
> + dev_err(dev, "failed to request det irq: %d\n", ret);
> + return ret;
> + }
> +
> + te_gpio = of_get_named_gpio(dev->of_node, "te-gpios", 0);

Why doesn't this use the gpiod_* API like the other GPIOs?

> +static struct of_device_id s6e3fa0_of_match[] = {

Should be static const.

Thierry


[PATCH] drm/radeon: remove visible vram size limit on bo allocation

2014-07-17 Thread Alex Deucher
On Thu, Jul 17, 2014 at 10:28 AM, Christian König wrote:
> On 17.07.2014 06:02, Michel Dänzer wrote:
>
>> On 17.07.2014 02:26, Alex Deucher wrote:
>>>
>>> Now that fallback to gtt is fixed for cpu access, we can
>>> remove this limit.
>>>
>>> Signed-off-by: Alex Deucher 
>>> ---
>>>   drivers/gpu/drm/radeon/radeon_gem.c | 7 +++++--
>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c
>>> b/drivers/gpu/drm/radeon/radeon_gem.c
>>> index fdd189b..07a13c9 100644
>>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
>>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
>>> @@ -55,8 +55,11 @@ int radeon_gem_object_create(struct radeon_device
>>> *rdev, int size,
>>> alignment = PAGE_SIZE;
>>> }
>>>   - /* maximun bo size is the minimun btw visible vram and gtt size
>>> */
>>> -   max_size = min(rdev->mc.visible_vram_size, rdev->mc.gtt_size);
>>> +   /* Maximum bo size is the gtt size since we use the gtt to handle
>>> +* vram to system pool migrations.  We could probably remove this
>>> +* check altogether with a little additional work.
>>> +*/
>>> +   max_size = rdev->mc.gtt_size;
>>> if (size > max_size) {
>>> DRM_DEBUG("Allocation size %dMb bigger than %ldMb
>>> limit\n",
>>>   size >> 20, max_size >> 20);
>>
>> A BO of size rdev->mc.gtt_size can never actually be bound to GTT,
>> because we have some pinned BOs in there. I think it's a bit
>> disingenuous to let userspace allocate a BO that can never actually be
>> used by the GPU. :)
>>
>> The hack I attached to
>> https://bugs.freedesktop.org/show_bug.cgi?id=78717 has a start for
>> dealing with that. I was running that patch for a while and didn't
>> notice any bad effects from it.
>
>
> Haven't looked at the patch yet, but can't we just go over all existing
> allocations on PIN and figure out the largest free area and save that value?
> I mean pinning of GTT memory happens rarely and mostly on system startup.


How about the attached patches?

Alex
-- next part --
A non-text attachment was scrubbed...
Name: 0004-drm-radeon-remove-visible-vram-size-limit-on-bo-allo.patch
Type: text/x-diff
Size: 2757 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140717/2058b475/attachment.patch>
-- next part --
A non-text attachment was scrubbed...
Name: 0003-drm-radeon-use-vram-gart-pinned-size-in-radeon_do_te.patch
Type: text/x-diff
Size: 1255 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140717/2058b475/attachment-0001.patch>
-- next part --
A non-text attachment was scrubbed...
Name: 0002-drm-radeon-use-vram-gart-pinned-size-in-radeon_gem_i.patch
Type: text/x-diff
Size: 1506 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140717/2058b475/attachment-0002.patch>
-- next part --
A non-text attachment was scrubbed...
Name: 0001-drm-radeon-tracked-pinned-memory.patch
Type: text/x-diff
Size: 2040 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140717/2058b475/attachment-0003.patch>


[PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT

2014-07-17 Thread Christian König
On 17.07.2014 12:01, Michel Dänzer wrote:
> In order to try and improve X(Shm)PutImage performance with glamor, I
> implemented support for write-combined CPU mappings of BOs in GTT.
>
> This did provide a nice speedup, but to my surprise, using VRAM instead
> of write-combined GTT turned out to be even faster in general on my
> Kaveri machine, both for the internal GPU and for discrete GPUs.
>
> However, I've kept the changes from GTT to VRAM separated, in case this
> turns out to be a loss on other setups.
>
> Kernel patches:
>
> [PATCH 1/5] drm/radeon: Remove radeon_gart_restore()
> [PATCH 2/5] drm/radeon: Pass GART page flags to
> [PATCH 3/5] drm/radeon: Allow write-combined CPU mappings of BOs in
> [PATCH 4/5] drm/radeon: Use write-combined CPU mappings of rings and

Those four are Reviewed-by: Christian König

> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI

I'm still not very keen on this change, since I still don't understand
why it's faster than with GTT. It definitely needs more testing
on a wider range of systems. Maybe limit it to APUs for now?

Regards,
Christian.

>
> Mesa patches:
>
> [PATCH 1/5] winsys/radeon: Use separate caching buffer managers for
> [PATCH 2/5] r600g/radeonsi: Use write-combined CPU mappings of some
> [PATCH 3/5] r600g/radeonsi: Prefer VRAM for CPU -> GPU streaming
> [PATCH 4/5] r600g,radeonsi: Use write-combined persistent GTT
> [PATCH 5/5] r600g,radeonsi: Prefer VRAM for persistent mappings



[PATCH] drm/radeon: remove visible vram size limit on bo allocation

2014-07-17 Thread Alex Deucher
On Thu, Jul 17, 2014 at 10:28 AM, Christian König wrote:
> On 17.07.2014 06:02, Michel Dänzer wrote:
>
>> On 17.07.2014 02:26, Alex Deucher wrote:
>>>
>>> Now that fallback to gtt is fixed for cpu access, we can
>>> remove this limit.
>>>
>>> Signed-off-by: Alex Deucher 
>>> ---
>>>   drivers/gpu/drm/radeon/radeon_gem.c | 7 +++++--
>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c
>>> b/drivers/gpu/drm/radeon/radeon_gem.c
>>> index fdd189b..07a13c9 100644
>>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
>>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
>>> @@ -55,8 +55,11 @@ int radeon_gem_object_create(struct radeon_device
>>> *rdev, int size,
>>> alignment = PAGE_SIZE;
>>> }
>>>   - /* maximun bo size is the minimun btw visible vram and gtt size
>>> */
>>> -   max_size = min(rdev->mc.visible_vram_size, rdev->mc.gtt_size);
>>> +   /* Maximum bo size is the gtt size since we use the gtt to handle
>>> +* vram to system pool migrations.  We could probably remove this
>>> +* check altogether with a little additional work.
>>> +*/
>>> +   max_size = rdev->mc.gtt_size;
>>> if (size > max_size) {
>>> DRM_DEBUG("Allocation size %dMb bigger than %ldMb
>>> limit\n",
>>>   size >> 20, max_size >> 20);
>>
>> A BO of size rdev->mc.gtt_size can never actually be bound to GTT,
>> because we have some pinned BOs in there. I think it's a bit
>> disingenuous to let userspace allocate a BO that can never actually be
>> used by the GPU. :)
>>
>> The hack I attached to
>> https://bugs.freedesktop.org/show_bug.cgi?id=78717 has a start for
>> dealing with that. I was running that patch for a while and didn't
>> notice any bad effects from it.
>
>
> Haven't looked at the patch yet, but can't we just go over all existing
> allocations on PIN and figure out the largest free area and save that value?
> I mean pinning of GTT memory happens rarely and mostly on system startup.

yeah, I had the same thought.
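
A minimal sketch of the "largest free area" idea, assuming the pinned ranges are available as a sorted, non-overlapping array. `struct pinned_range` and `largest_free_area()` are hypothetical names, not radeon code:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical pinned range inside the GTT: [start, start + size). */
struct pinned_range {
    uint64_t start;
    uint64_t size;
};

/*
 * Largest gap left between pinned ranges. Assumes the ranges are
 * sorted by start address, do not overlap, and lie inside
 * [0, gtt_size).
 */
static uint64_t largest_free_area(const struct pinned_range *r, size_t n,
                                  uint64_t gtt_size)
{
    uint64_t best = 0, cursor = 0;

    for (size_t i = 0; i < n; i++) {
        if (r[i].start - cursor > best)
            best = r[i].start - cursor;
        cursor = r[i].start + r[i].size;
    }
    if (gtt_size - cursor > best)
        best = gtt_size - cursor;
    return best;
}
```

Since pinning is described as rare, recomputing this value on each pin and caching it would keep the allocation-time check cheap.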

Alex


[RFCv2] drm/msm: DT support for 8960/8064

2014-07-17 Thread Rob Clark
On Thu, Jul 17, 2014 at 4:10 AM, divya ojha  wrote:
> Hi Rob,
>
> On Tue, Jul 8, 2014 at 9:30 PM, Rob Clark  wrote:
>> Now that we (almost) have enough dependencies in place (MMCC, RPM, etc),
>> add necessary DT support so that we can use drm/msm on upstream kernel.
>>
>> Signed-off-by: Rob Clark 
>> ---
>> I thought I sent this already, but looks like I've forgot.
> ..
>> diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c 
>> b/drivers/gpu/drm/msm/hdmi/hdmi.c
>> index 7f7aade..041c2fc 100644
>> --- a/drivers/gpu/drm/msm/hdmi/hdmi.c
>> +++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
>> @@ -123,7 +123,8 @@ struct hdmi *hdmi_init(struct drm_device *dev, struct 
>> drm_encoder *encoder)
>> for (i = 0; i < config->hpd_reg_cnt; i++) {
>> struct regulator *reg;
>>
>> -   reg = devm_regulator_get(>dev, 
>> config->hpd_reg_names[i]);
>> +   reg = devm_regulator_get_exclusive(>dev,
>> +   config->hpd_reg_names[i]);
>> if (IS_ERR(reg)) {
>> ret = PTR_ERR(reg);
>> dev_err(dev->dev, "failed to get hpd regulator: %s 
>> (%d)\n",
>> @@ -138,7 +139,8 @@ struct hdmi *hdmi_init(struct drm_device *dev, struct 
>> drm_encoder *encoder)
>> for (i = 0; i < config->pwr_reg_cnt; i++) {
>> struct regulator *reg;
>>
>> -   reg = devm_regulator_get(>dev, 
>> config->pwr_reg_names[i]);
>> +   reg = devm_regulator_get_exclusive(>dev,
>> +   config->pwr_reg_names[i]);
>> if (IS_ERR(reg)) {
>> ret = PTR_ERR(reg);
>> dev_err(dev->dev, "failed to get pwr regulator: %s 
>> (%d)\n",
>
> Don't we need to have a if(regulator_enabled) check after
> devm_regulator_get function ?
> I see a similar test after camera regulator_get function call.

tbh, I'm not 100% sure.

Normally I would say that any driver using some resource (regulator,
etc) should take its own reference irrespective of whether some other
driver already has the regulator/etc. enabled.  Otherwise the
reference counting doesn't work out.
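
A minimal sketch of why each user should take its own reference. `struct demo_regulator` is a hypothetical model of an enable-counted supply and does not reproduce the kernel regulator API:

```c
/* Hypothetical shared supply with an enable count. */
struct demo_regulator {
    int use_count;   /* number of outstanding enables */
    int hw_on;       /* 1 while the supply is physically on */
};

/* The first enable switches the supply on; later ones only count. */
static void demo_regulator_enable(struct demo_regulator *reg)
{
    if (reg->use_count++ == 0)
        reg->hw_on = 1;
}

/* The supply is switched off only when the last user drops out. */
static void demo_regulator_disable(struct demo_regulator *reg)
{
    if (reg->use_count > 0 && --reg->use_count == 0)
        reg->hw_on = 0;
}
```

If a driver skipped its enable because the supply already looked on, the other user's disable would switch the supply off underneath it.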

The thing I'm not 100% sure about is interaction w/ bootloader, in
cases where bootloader does initial modeset.  I've had some problems
in this area in the past when I was trying to get DSI working on my
nexus4.  I doubt the correct solution is for the driver to check if
regulator is already enabled by bootloader before enabling it, but I'm
not sure if something else somewhere needs to drop references to
resources enabled by bootloader.

BR,
-R

> -Regards
> Divya


[PATCH] drm: Check for connection_mutex in drm_select_eld

2014-07-17 Thread Sean Paul
drm_select_eld should check for mode_config.connection_mutex as
well as mode_config.mutex

Signed-off-by: Sean Paul 
---
 drivers/gpu/drm/drm_edid.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index dfa9769..087d608 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3305,6 +3305,7 @@ struct drm_connector *drm_select_eld(struct drm_encoder 
*encoder,
struct drm_device *dev = encoder->dev;

WARN_ON(!mutex_is_locked(&dev->mode_config.mutex));
+   WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));

list_for_each_entry(connector, &dev->mode_config.connector_list, head)
if (connector->encoder == encoder && connector->eld[0])
-- 
2.0.0



[PATCH v5 05/14] drm/exynos: dsi: add pass TE host ops to support LCD I80 interface

2014-07-17 Thread Inki Dae
On 2014-07-16 19:12, YoungJun Cho wrote:
> Hi Thierry,
> 
> On 07/16/2014 04:54 PM, Thierry Reding wrote:
>> On Wed, Jul 16, 2014 at 11:23:09AM +0900, YoungJun Cho wrote:
>>> Hi Inki,
>>>
>>> On 07/15/2014 11:34 AM, Inki Dae wrote:
 On 2014-07-14 20:03, Thierry Reding wrote:
> On Mon, Jul 14, 2014 at 07:45:28PM +0900, YoungJun Cho wrote:
>> On 07/14/2014 06:41 PM, Thierry Reding wrote:
> [...]
>>> That said, I've been doing some research and it seems like we have a
>>> somewhat similar feature on Tegra. What happens there is that
>>> there are
>>> three GPIO pins that can be repurposed for TE signalling. But as
>>> opposed
>>> to using them as interrupts the display controller can be
>>> configured to
>>> use them, upon which it will automatically handle the TE signal by
>>> sending the next frame.
>>
>> Could you explain more detail how the Tegra display controller
>> could be
>> configured with this GPIO pins?
>> I have no idea except that the display controller registers this
>> GPIO as an
>> IRQ.
>
> On Tegra the display controller has a special register that can be
> programmed to use one of the three GPIOs as TE signal. Then the
> display
> controller can be configured in one-shot (non-continuous) mode, which
> means that software needs to explicitly set a trigger bit to tell the
> display controller to send a new frame. If TE signalling is enabled,
> then the display controller will not immediately send a new frame when
> triggered but wait for signalling of this GPIO.
>
>>> So we have at least two very different implementations of this on
>>> two
>>> different SoCs. Further the specification explicitly recommends
>>> using
>>> the BTA sequence and DSI protocol to wait for TE. So I still
>>> think that
>>> controllers that provide an additional, non-spec compliant method to
>>> signal TE should handle it separately rather than within DSI.
>>> Otherwise
>>> we essentially need to make the DSI "core" aware of all these
>>> quirks,
>>> and I'd rather avoid that.
>>
>> You mean, the DSI specification guides to use BTA, so it's better
>> to use
>> display controller rather than DSIM, right?
>
> What I'm saying is that there's nothing about a side-band TE wire
> in the
> DSI spec. In fact the spec explicitly says that this mechanism of an
> external TE wire from older protocols (DBI) was replaced by the BTA
> sequence over the protocol.
>
> Now, my understanding is that using the BTA sequence over the DSI
> protocol would introduce some latency and that forces some panel
> vendors
> to still provide a side-band TE wire even in DSI compliant panels. But
> since this is not part of the specification there is no standard
> way to
> do this (as evidenced by Tegra and Exynos). Therefore putting such
> functionality into the core DSI code is bad.
>
> But that doesn't mean that you have to put this functionality into the
> display controller driver on Exynos. What I'm saying is that it should
> be handled by the SoC driver rather than the core. Where exactly
> probably depends on the particular case.
>
 As Inki commented before, I'll try to use remote-endpoint
 property of DSI
 device node in exynos DSIM driver and call FIMD notifier.
>>>
>>> Sounds like it matches what I said above. I'm not a huge fan of
>>> notifiers, but if it works for you I suppose that's fine. The
>>> alternative would be to directly call a FIMD function, which is
>>> somewhat more explicit than a notifier.
>>
>> Yes, I also like explicit call, so I want to use dsi_host_ops and
>> calls it
>> in panel. But there is an objection to use dsi_host_ops, we think
>> notifier
>> in exynos dsim for fimd(display controller).
>
> There are other ways to explicitly call into the display
> controller. You
> could for example get access to the CRTC that DSIM is currently
> connected to (via exynos_dsi.encoder->crtc) and then cast that to a
> struct exynos_drm_crtc and call a function to trigger a new frame
> to be
> sent (for example exynos_drm_crtc_send_frame()). This assumes that you
> can safely cast struct drm_crtc * to struct exynos_drm_crtc *, but
> that
> shouldn't be a problem.
>
> With the above, you could make the DSIM handle the TE signal
> interrupts
> and trigger the DC using the new exynos_drm_crtc_send_frame()
> function.
>

 It seems better than the use of notifier. Actually, original patch used
 this way except TE event.
 Mr. Cho, let's use remote-endpoint property and this way instead of
 notifier.

>>>
>>> The struct exynos_dsi has panel_node, which is valid by
>>> exynos_dsi_host_attach() is called from 

[PATCH 1/1] Revert "drm/i915: drop i915_ prefix from enable_rc6, enable_fbc, enable_ppgtt parameters"

2014-07-17 Thread Daniel Vetter
On Thu, Jul 17, 2014 at 02:32:41PM +0530, Amit Shah wrote:
> On (Thu) 17 Jul 2014 [09:35:20], Daniel Vetter wrote:
> > On Wed, Jul 16, 2014 at 9:54 PM, Linus Torvalds
> >  wrote:
> > > Sorry for the top post, I'm on the road..
> > >
> > > I'm wondering if we couldn't just keep both the old and the new names
> > > and have them both point at the same variable? Remove the description
> > > for the old name, but keep it working?
> > 
> > I'm really surprised here ... We have rc6 enabled by default
> > everywhere, and all the additional rc6 levels that users try to enable
> > are known to hard-hang machines.
> 
> I haven't had this problem on my hardware (ThinkPad T420s, lspci
> below) for a few kernel versions.  I think I added the enable_rc6=
> setting back from the time the deeper states were enabled and then
> reverted for SandyBridge.
> 
> Nevertheless, with the current state, RC6p and RC6pp states are not
> used.

Yeah, on snb they cause crashes and instability and also don't provide
measurable power benefits (afaik). So I recommend you drop that one.

> > I actually have plans to taint the
> > kernel if you set any of them since I'm fed up with the random crash
> > > reports. Same for fbc, even more so for the ppgtt knob. My stance is
> > that if you know about these knobs you _really_ should know the driver
> > to its depths and so also be able to follow module parameter
> > renamings.
> 
> I also remember there being bugzillas about power consumption, and
> using this setting was recommended (for Fedora, I think).  I know
> a few people are using this setting.

I know, google is littered with such entries. Unfortunately by the time
google thinks something is important (which usually takes a few months)
it's already badly outdated: i915 graphics development is charging ahead
at a really brisk pace - we merge a few hundred patches per release for
i915 alone.

> > > On Jul 16, 2014 8:34 AM, "Amit Shah"  wrote:
> > >>
> > >> This reverts commit 3adee7a7976012a20f1d3b5a529a3c105e29fef1.
> > >>
> > >> After upgrading to v3.15, my laptop's battery started draining quite
> > >> fast.  Powertop pointed to the deep RC6 states not being used.  The
> > >> kernel param I had put to enable them had stopped working the way it
> > > >> used to; so I disagree with the 'not maintaining ABI' part of the param
> > >> name change.
> > >>
> > >> However weird the names may be, they're in active use and changing them
> > >> only causes pain for users.  This also isn't advertised (marked
> > >> deprecated, big warning shown, etc.), so just reverting now.
> > >>
> > >> CC: Daniel Vetter 
> > >> CC: Jani Nikula 
> > >> CC: David Airlie 
> > >> CC:  # v3.15+
> > >> Signed-off-by: Amit Shah 
> > 
> > Anyway we need to figure out what went wrong here. Please share your
> > exact kernelcmdline and lspci -nn. Also stats for before/after from
> > powertop when idle please.
> 
> Powertop stats for idle are a little difficult -- since this is my
> primary laptop.

Now I'm a bit confused: How have you measured that the lack of rc6p/pp is
the reason for your power consumption regression?
-Daniel

> 
> BOOT_IMAGE=/vmlinuz-3.15.4-200.fc20.x86_64 
> root=/dev/mapper/luks-3aff2acf-737d-4002-b644-15f599d09a18 ro 
> rd.lvm.lv=fedora_grmbl/00 rd.lvm.lv=fedora_grmbl/01 
> vconsole.font=latarcyrheb-sun16 
> rd.luks.uuid=luks-0934d354-5b07-4e91-a699-9bfc57e76fdc 
> rd.luks.uuid=luks-3aff2acf-737d-4002-b644-15f599d09a18 rhgb quiet 
> slub_debug=- i915.i915_enable_rc6=7 LANG=en_IN.UTF-8
> 
> 00:00.0 Host bridge [0600]: Intel Corporation 2nd Generation Core Processor 
> Family DRAM Controller [8086:0104] (rev 09)
> 00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation 
> Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09)
> 00:16.0 Communication controller [0780]: Intel Corporation 6 Series/C200 
> Series Chipset Family MEI Controller #1 [8086:1c3a] (rev 04)
> 00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network 
> Connection [8086:1502] (rev 04)
> 00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset 
> Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 04)
> 00:1b.0 Audio device [0403]: Intel Corporation 6 Series/C200 Series Chipset 
> Family High Definition Audio Controller [8086:1c20] (rev 04)
> 00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset 
> Family PCI Express Root Port 1 [8086:1c10] (rev b4)
> 00:1c.1 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset 
> Family PCI Express Root Port 2 [8086:1c12] (rev b4)
> 00:1c.3 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset 
> Family PCI Express Root Port 4 [8086:1c16] (rev b4)
> 00:1c.4 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset 
> Family PCI Express Root Port 5 [8086:1c18] (rev b4)
> 00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset 
> Family USB Enhanced Host Controller #1 [8086:1c26] (rev 04)
> 

[PATCH 2/2] drm/radeon: use a fetch function to get the edid

2014-07-17 Thread Daniel Vetter
On Wed, Jul 16, 2014 at 03:51:37PM -0400, Alex Deucher wrote:
> On Tue, Jul 15, 2014 at 12:51 PM, Daniel Vetter  wrote:
> > On Tue, Jul 15, 2014 at 5:44 PM, Alex Deucher  wrote:
> >> On Tue, Jul 15, 2014 at 11:18 AM, Daniel Vetter  wrote:
> >>> On Tue, Jul 15, 2014 at 11:08:11AM -0400, Alex Deucher wrote:
> >>>> We keep a cached version of the edid in radeon_connector which
> >>>> we use for determining connectedness and when to enable certain
> >>>> features like hdmi audio, etc.  When the user uses the firmware
> >>>> interface to override the driver with some other edid the driver's
> >>>> copy is never updated.  The fetch function will check if there
> >>>> is a user supplied edid and update the driver's copy if there
> >>>> is.
> >>>>
> >>>> bug:
> >>>> https://bugs.freedesktop.org/show_bug.cgi?id=80691
> >>>>
> >>>> Signed-off-by: Alex Deucher 
> >>>
> >>> [snip]
> >>>
> >>>> +struct edid *radeon_connector_edid(struct drm_connector *connector)
> >>>> +{
> >>>> +       struct radeon_connector *radeon_connector = to_radeon_connector(connector);
> >>>> +       struct drm_property_blob *edid_blob = connector->edid_blob_ptr;
> >>>> +
> >>>> +       if (radeon_connector->edid) {
> >>>> +               return radeon_connector->edid;
> >>>> +       } else if (edid_blob) {
> >>>> +               struct edid *edid = kmemdup(edid_blob->data, edid_blob->length, GFP_KERNEL);
> >>>> +               if (edid)
> >>>> +                       radeon_connector->edid = edid;
> >>>> +       }
> >>>> +       return radeon_connector->edid;
> >>>> +}
> >>>
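
For readers without a kernel tree at hand, the fetch logic quoted above can be modelled in userspace C. This is a rough sketch, not the radeon code: struct connector and struct edid_blob are simplified stand-ins, and malloc() plus memcpy() take the place of kmemdup().

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins; the real DRM structures carry far more state. */
struct edid_blob {
    size_t length;
    const unsigned char *data;
};

struct connector {
    unsigned char *cached_edid;       /* driver's private copy */
    size_t cached_len;
    const struct edid_blob *override; /* user-supplied EDID, may be NULL */
};

/*
 * Return the driver's cached EDID. If there is none yet but the user
 * supplied one, duplicate it into the cache first. Returns NULL when
 * neither exists or the allocation fails.
 */
static const unsigned char *connector_edid(struct connector *c)
{
    if (c->cached_edid)
        return c->cached_edid;

    if (c->override) {
        unsigned char *copy = malloc(c->override->length);

        if (copy) {
            memcpy(copy, c->override->data, c->override->length);
            c->cached_edid = copy;
            c->cached_len = c->override->length;
        }
    }
    return c->cached_edid;
}
```

As in the quoted patch, an already cached copy wins over the override, so the user-supplied EDID is only picked up when the cache is empty.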
> >>> We have similar issues on intel now that we use the debugfs interface to
> >>> force certain edids (for validating e.g. 4k or 3d) - our code doesn't see
> >>> the forced edid. Should we have a helper somewhere or just change
> >>> drm_get_edid to dtrt here?
> >>>
> >>> Adding Thomas who's working on this.
> >>
> >> I think the best solution would be to make drm_load_edid_firmware()
> >> just return the raw user supplied edid, then let the drivers handle it
> >> internally.  That way the drivers could decide how they want to handle
> >> it in their detect() or get_modes() callbacks.  The problem is that if
> >> drm_load_edid_firmware() succeeds, the driver's get_modes() callback
> >> never gets called.  A less invasive alternative would be to add a
> >> get_modes_firmware() callback, e.g.,
> >>
> >> diff --git a/drivers/gpu/drm/drm_probe_helper.c
> >> b/drivers/gpu/drm/drm_probe_helper.c
> >> index d22676b..ceb246f 100644
> >> --- a/drivers/gpu/drm/drm_probe_helper.c
> >> +++ b/drivers/gpu/drm/drm_probe_helper.c
> >> @@ -127,7 +127,10 @@ static int
> >> drm_helper_probe_single_connector_modes_merge_bits(struct drm_connect
> >> }
> >>
> >>  #ifdef CONFIG_DRM_LOAD_EDID_FIRMWARE
> >> -   count = drm_load_edid_firmware(connector);
> >> +   if (connector_funcs->get_modes_firmware)
> >> +   count = (*connector_funcs->get_modes_firmware)(connector);
> >> +   else
> >> +   count = drm_load_edid_firmware(connector);
> >> if (count == 0)
> >>  #endif
> >> count = (*connector_funcs->get_modes)(connector);
> >>
> >> and the driver implementation could mostly just wrap
> >> drm_load_edid_firmware, e.g.,
> >>
> >> +static int radeon_get_modes_firmware(struct drm_connector *connector)
> >> +{
> >> +       struct radeon_connector *radeon_connector = to_radeon_connector(connector);
> >> +       struct drm_property_blob *edid_blob;
> >> +       int ret;
> >> +
> >> +       ret = drm_load_edid_firmware(connector);
> >> +       edid_blob = connector->edid_blob_ptr;
> >> +       /* update the driver's copy of the edid */
> >> +       if (edid_blob) {
> >> +               struct edid *edid = kmemdup(edid_blob->data, edid_blob->length, GFP_KERNEL);
> >> +               if (edid)
> >> +                       radeon_connector->edid = edid;
> >> +       }
> >> +
> >> +       return ret;
> >> +}
> >>
> >> The problem is that wouldn't give the driver access to the user
> >> provided edid at detect() time.
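
The probe order proposed in the diff above (driver firmware hook if present, otherwise the core loader, and only then the regular get_modes) can be tried out in a small userspace model. All names below are hypothetical stand-ins; load_edid_firmware() merely imitates the "returns a mode count, 0 if no firmware EDID" behaviour that the diff relies on.

```c
#include <stddef.h>

struct connector;

/* Hypothetical helper table with an optional firmware hook. */
struct connector_helper_funcs {
    int (*get_modes)(struct connector *c);
    int (*get_modes_firmware)(struct connector *c); /* may be NULL */
};

struct connector {
    const struct connector_helper_funcs *funcs;
    int firmware_modes; /* modes a firmware EDID would provide */
    int probed_modes;   /* modes a real probe would provide */
    int hook_calls;     /* how often the driver hook ran */
};

/* Stand-in for the core firmware loader: returns the mode count. */
static int load_edid_firmware(struct connector *c)
{
    return c->firmware_modes;
}

/* Example driver callbacks. */
static int example_get_modes(struct connector *c)
{
    return c->probed_modes;
}

static int example_get_modes_firmware(struct connector *c)
{
    c->hook_calls++; /* a driver would refresh its cached EDID here */
    return load_edid_firmware(c);
}

/* Driver hook if set, else core loader; fall back to get_modes on 0. */
static int probe_modes(struct connector *c)
{
    int count;

    if (c->funcs->get_modes_firmware)
        count = c->funcs->get_modes_firmware(c);
    else
        count = load_edid_firmware(c);

    if (count == 0)
        count = c->funcs->get_modes(c);

    return count;
}
```

The model also shows the limitation raised in the mail: the hook only runs on the get_modes path, so nothing here helps a driver that needs the override at detect() time.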
> >
> > Yeah, we also do a bunch of things in ->detect, so ->detect not
> > getting called for forced edids is a bit annoying. The other thing is
> > that edid overriding through debugfs at runtime is done differently
> > again, and for those the driver's ->detect actually gets called. Well,
> > if the connector state doesn't get forced.
> >
> > One idea I've had which is a bit of work is to move all these
> > detection stuff outside of ->detect and into encoder->mode_fixup
> > functions (compute_config in i915). If we then add a function to grab
> > the cached/firmware/overridden edid and use it there it should all
> > work. And at least on i915 you need to do a full modeset to e.g.
> > update the audio status anyway.
> >
> 
> Same here.  My initial version of the patch just moved the edid
> assignment into the mode_valid() callback since that was the next time
> the common code called into the driver, but mode_fixup 

[PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT

2014-07-17 Thread Alex Deucher
On Thu, Jul 17, 2014 at 6:01 AM, Michel Dänzer  wrote:
> In order to try and improve X(Shm)PutImage performance with glamor, I
> implemented support for write-combined CPU mappings of BOs in GTT.
>
> This did provide a nice speedup, but to my surprise, using VRAM instead
> of write-combined GTT turned out to be even faster in general on my
> Kaveri machine, both for the internal GPU and for discrete GPUs.
>
> However, I've kept the changes from GTT to VRAM separated, in case this
> turns out to be a loss on other setups.
>
> Kernel patches:
>
> [PATCH 1/5] drm/radeon: Remove radeon_gart_restore()
> [PATCH 2/5] drm/radeon: Pass GART page flags to
> [PATCH 3/5] drm/radeon: Allow write-combined CPU mappings of BOs in
> [PATCH 4/5] drm/radeon: Use write-combined CPU mappings of rings and
> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI

Applied 1-4 to my 3.17 tree.  thanks!

Alex

>
> Mesa patches:
>
> [PATCH 1/5] winsys/radeon: Use separate caching buffer managers for
> [PATCH 2/5] r600g/radeonsi: Use write-combined CPU mappings of some
> [PATCH 3/5] r600g/radeonsi: Prefer VRAM for CPU -> GPU streaming
> [PATCH 4/5] r600g,radeonsi: Use write-combined persistent GTT
> [PATCH 5/5] r600g,radeonsi: Prefer VRAM for persistent mappings
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 1/1] Revert "drm/i915: drop i915_ prefix from enable_rc6, enable_fbc, enable_ppgtt parameters"

2014-07-17 Thread Daniel Vetter
On Wed, Jul 16, 2014 at 9:54 PM, Linus Torvalds
 wrote:
> Sorry for the top post, I'm on the road..
>
> I'm wondering if we couldn't just keep both the old and the new names and
> have them both point at the same variable? Remove the description for the
> old name, but keep it working?
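
The "two names, one variable" idea can be illustrated with a small userspace model. This is not the kernel's module parameter machinery; the params table and set_param() are made up for the example. In the driver itself the equivalent would presumably be a second module_param_named() line for the old name that points at the same variable and has no MODULE_PARM_DESC.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int enable_rc6 = -1; /* single backing variable */

struct param {
    const char *name;
    int *value;
    const char *desc; /* NULL: still accepted, but undocumented */
};

static const struct param params[] = {
    { "enable_rc6",      &enable_rc6, "Enable power-saving render C-state 6" },
    { "i915_enable_rc6", &enable_rc6, NULL }, /* legacy alias */
};

/* Parse "name=value"; return 0 on success, -1 on unknown or malformed input. */
static int set_param(const char *arg)
{
    const char *eq = strchr(arg, '=');
    size_t i, len;

    if (!eq)
        return -1;
    len = (size_t)(eq - arg);

    for (i = 0; i < sizeof(params) / sizeof(params[0]); i++) {
        if (strlen(params[i].name) == len &&
            strncmp(params[i].name, arg, len) == 0) {
            *params[i].value = atoi(eq + 1);
            return 0;
        }
    }
    return -1;
}
```

With this table, both "enable_rc6=1" and "i915_enable_rc6=7" update the same integer, while only the new name carries a description.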

I'm really surprised here ... We have rc6 enabled by default
everywhere, and all the additional rc6 levels that users try to enable
are known to hard-hang machines. I actually have plans to taint the
kernel if you set any of them since I'm fed up with the random crash
reports. Same for fbc, even more so for the ppgtt knob. My stance is
that if you know about these knobs you _really_ should know the driver
to its depths and so also be able to follow module parameter
renamings.

> On Jul 16, 2014 8:34 AM, "Amit Shah"  wrote:
>>
>> This reverts commit 3adee7a7976012a20f1d3b5a529a3c105e29fef1.
>>
>> After upgrading to v3.15, my laptop's battery started draining quite
>> fast.  Powertop pointed to the deep RC6 states not being used.  The
>> kernel param I had put to enable them had stopped working the way it
>> used to; so I disagree with the 'not maintaining ABI' part of the param
>> name change.
>>
>> However weird the names may be, they're in active use and changing them
>> only causes pain for users.  This also isn't advertised (marked
>> deprecated, big warning shown, etc.), so just reverting now.
>>
>> CC: Daniel Vetter 
>> CC: Jani Nikula 
>> CC: David Airlie 
>> CC:  # v3.15+
>> Signed-off-by: Amit Shah 

Anyway we need to figure out what went wrong here. Please share your
exact kernelcmdline and lspci -nn. Also stats for before/after from
powertop when idle please.

Thanks, Daniel

>> ---
>>  drivers/gpu/drm/i915/i915_params.c | 12 ++--
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_params.c
>> b/drivers/gpu/drm/i915/i915_params.c
>> index d05a2af..053981d 100644
>> --- a/drivers/gpu/drm/i915/i915_params.c
>> +++ b/drivers/gpu/drm/i915/i915_params.c
>> @@ -69,16 +69,16 @@ MODULE_PARM_DESC(semaphores,
>> "Use semaphores for inter-ring sync "
>> "(default: -1 (use per-chip defaults))");
>>
>> -module_param_named(enable_rc6, i915.enable_rc6, int, 0400);
>> -MODULE_PARM_DESC(enable_rc6,
>> +module_param_named(i915_enable_rc6, i915.enable_rc6, int, 0400);
>> +MODULE_PARM_DESC(i915_enable_rc6,
>> "Enable power-saving render C-state 6. "
>> "Different stages can be selected via bitmask values "
>> "(0 = disable; 1 = enable rc6; 2 = enable deep rc6; 4 = enable
>> deepest rc6). "
>> "For example, 3 would enable rc6 and deep rc6, and 7 would enable
>> everything. "
>> "default: -1 (use per-chip default)");
>>
>> -module_param_named(enable_fbc, i915.enable_fbc, int, 0600);
>> -MODULE_PARM_DESC(enable_fbc,
>> +module_param_named(i915_enable_fbc, i915.enable_fbc, int, 0600);
>> +MODULE_PARM_DESC(i915_enable_fbc,
>> "Enable frame buffer compression for power savings "
>> "(default: -1 (use per-chip default))");
>>
>> @@ -111,8 +111,8 @@ MODULE_PARM_DESC(enable_hangcheck,
>> "WARNING: Disabling this can cause system wide hangs. "
>> "(default: true)");
>>
>> -module_param_named(enable_ppgtt, i915.enable_ppgtt, int, 0400);
>> -MODULE_PARM_DESC(enable_ppgtt,
>> +module_param_named(i915_enable_ppgtt, i915.enable_ppgtt, int, 0400);
>> +MODULE_PARM_DESC(i915_enable_ppgtt,
>> "Override PPGTT usage. "
>> "(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
>>
>> --
>> 1.9.3
>>
>
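
The bitmask described in the enable_rc6 parameter text of the patch above (1 = rc6, 2 = deep rc6, 4 = deepest rc6, -1 = per-chip default) can be decoded as follows. The macro names and describe_rc6() are hypothetical, chosen for this sketch only; the bit values are the ones from the parameter description.

```c
#include <stdio.h>

/* Bit values as listed in the parameter description; names are made up. */
#define RC6_BIT         1 /* rc6 */
#define RC6_DEEP_BIT    2 /* deep rc6 (RC6p) */
#define RC6_DEEPEST_BIT 4 /* deepest rc6 (RC6pp) */

/* Write a human-readable summary of an enable_rc6 value into buf. */
static void describe_rc6(int value, char *buf, size_t len)
{
    if (value < 0) {
        snprintf(buf, len, "per-chip default");
        return;
    }
    if (value == 0) {
        snprintf(buf, len, "disabled");
        return;
    }
    snprintf(buf, len, "%s%s%s",
             (value & RC6_BIT) ? "rc6 " : "",
             (value & RC6_DEEP_BIT) ? "deep-rc6 " : "",
             (value & RC6_DEEPEST_BIT) ? "deepest-rc6 " : "");
}
```

A value of 7, as in the i915.i915_enable_rc6=7 command line discussed in this thread, requests all three states; 3 requests rc6 and deep rc6 only.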



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

