date:20140805

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #22 from Dave Airlie ---
Do you have radeon.dpm=0 set in some file under /etc/modprobe.d, or somewhere
like that?
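
One way to answer this is to search the modprobe configuration and the kernel
command line for a radeon dpm setting. The following is a minimal sketch, not
part of the original comment: the function name find_radeon_dpm is
hypothetical, and the default search locations are an assumption about a
typical distribution layout.

```shell
# Hypothetical helper: list every place where a radeon dpm option is set.
# Arguments are files or directories to search; the defaults below are an
# assumption about a typical layout and may need adjusting.
find_radeon_dpm() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/modprobe.d /proc/cmdline
    fi
    for loc in "$@"; do
        # Skip locations that do not exist on this system.
        [ -e "$loc" ] || continue
        # Matches "options radeon dpm=0" (modprobe.d syntax) as well as
        # "radeon.dpm=0" (kernel command line syntax). Commented-out lines
        # match too, so read the output rather than just counting hits.
        grep -rHnE 'radeon.*dpm=[0-9-]+' "$loc" 2>/dev/null
    done
    return 0
}
```

Calling find_radeon_dpm without arguments searches the defaults. If the radeon
module is loaded, the value actually in effect should also be readable from
/sys/module/radeon/parameters/dpm.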

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140805/5f28454a/attachment.html>

[sbc_gxx] kernel BUG at include/linux/mtd/map.h:148!

2014-08-05 Thread Fengguang Wu

ered
[4.237859] emc: device handler registered
[4.238463] osst :I: Tape driver with OnStream support version 0.99.4
[4.238463] osst :I: $Id: osst.c,v 1.73 2005/01/01 21:13:34 wriede Exp $
[4.243834] osd: LOADED open-osd 0.2.1
[4.260658] Rounding down aligned max_sectors from 4294967295 to 8388600
[4.280969] mtdoops: mtd device (mtddev=name/number) must be supplied
[4.282117] device id = 2440
[4.282605] device id = 2480
[4.283097] device id = 24c0
[4.283583] device id = 24d0
[4.284134] device id = 25a1
[4.284620] device id = 2670
[4.286157] SBC-GXx flash: IO:0x258-0x259 MEM:0xdc000-0xd
[4.287060] [ cut here ]
[4.287722] kernel BUG at include/linux/mtd/map.h:148!
[4.288048] invalid opcode:  [#1] PREEMPT SMP 
[4.288048] CPU 1 
[4.288048] Pid: 1, comm: swapper/0 Not tainted 3.5.0-rc4-00162-g49099c4 #17 
Bochs Bochs
[4.288048] RIP: 0010:[]  [] 
mtd_do_chip_probe+0x1d/0x1f
[4.288048] RSP: 0018:880011049e20  EFLAGS: 00010246
[4.288048] RAX:  RBX: 82a23550 RCX: 
[4.288048] RDX: 880011049e20 RSI: 82a23580 RDI: 880011049e80
[4.288048] RBP: 880011049e80 R08: 0003 R09: 810d6c93
[4.288048] R10:  R11: 0001 R12: 82a23eb0
[4.288048] R13: 828790ce R14:  R15: 
[4.288048] FS:  () GS:88001260() 
knlGS:
[4.288048] CS:  0010 DS:  ES:  CR0: 8005003b
[4.288048] CR2:  CR3: 0298c000 CR4: 000406e0
[4.288048] DR0:  DR1:  DR2: 
[4.288048] DR3:  DR6: 0ff0 DR7: 0400
[4.288048] Process swapper/0 (pid: 1, threadinfo 880011048000, task 
88001104)
[4.288048] Stack:
[4.288048]     

[4.288048]     

[4.288048]     

[4.288048] Call Trace:
[4.288048]  [] cfi_probe+0x15/0x17
[4.288048]  [] do_map_probe+0xa0/0xac
[4.288048]  [] ? physmap_init+0x12/0x12
[4.288048]  [] init_sbc_gxx+0x104/0x15b
[4.288048]  [] do_one_initcall+0x86/0x208
[4.288048]  [] kernel_init+0x10d/0x1c2
[4.288048]  [] ? do_early_param+0xc3/0xc3
[4.288048]  [] kernel_thread_helper+0x4/0x10
[4.288048]  [] ? retint_restore_args+0x13/0x13
[4.288048]  [] ? do_one_initcall+0x208/0x208
[4.288048]  [] ? gs_change+0x13/0x13
[4.288048] Code: 83 c4 58 5b 41 5c 41 5d 41 5e 41 5f 5d c3 55 48 89 e5 48 
83 ec 60 66 66 66 66 90 31 c0 b9 18 00 00 00 48 8d 55 a0 48 89 d7 f3 ab <0f> 0b 
55 48 89 e5 66 66 66 66 90 48 c7 c6 a0 39 a2 82 e8 cc ff 
[4.288048] RIP  [] mtd_do_chip_probe+0x1d/0x1f
[4.288048]  RSP 
[4.321423] ---[ end trace 169195d5d1f9be6e ]---
[4.322118] swapper/0 (1) used greatest stack depth: 3768 bytes left
[4.323045] Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x000b
[4.323045] 

Elapsed time: 10
qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap -kernel 
/kernel/x86_64-randconfig-s0-08051229/49099c4991da3c94773f888aea2e9d27b8a7c6d1/vmlinuz-3.5.0-rc4-00162-g49099c4
 -append 'hung_task_panic=1 earlyprintk=ttyS0,115200 debug apic=debug 
sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=10 
softlockup_panic=1 nmi_watchdog=panic  prompt_ramdisk=0 console=ttyS0,115200 
console=tty0 vga=normal  root=/dev/ram0 rw 
link=/kbuild-tests/run-queue/kvm/x86_64-randconfig-s0-08051229/linux-devel:devel-hourly-2014080511:49099c4991da3c94773f888aea2e9d27b8a7c6d1:bisect-linux1/.vmlinuz-49099c4991da3c94773f888aea2e9d27b8a7c6d1-20140805164127-2-kbuild
 branch=linux-devel/devel-hourly-2014080511 
BOOT_IMAGE=/kernel/x86_64-randconfig-s0-08051229/49099c4991da3c94773f888aea2e9d27b8a7c6d1/vmlinuz-3.5.0-rc4-00162-g49099c4
 drbd.minor_count=8'  -initrd /kernel-tests/initrd/quantal-core-x86_64.cgz -m 
320 -smp 2 -net nic,vlan=1,model=e1000 -net user,vlan=1 -boot order=nc 
-no-reboot -watchdog i6300esb -rtc base=localtime -pidfile 
/dev/shm/kboot/pid-quantal-kbuild-15 -serial 
file:/dev/shm/kboot/serial-quantal-kbuild-15 -daemonize -display none -monitor 
null 
-- next part --
A non-text attachment was scrubbed...
Name: 
x86_64-randconfig-s0-08051229-7d5b32398354b2cb45d711c021557d8da09ae30b-kernel-BUG-at-128910.log
Type: application/octet-stream
Size: 139708 bytes
Desc: not available
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140805/f56dcf51/attachment-0001.obj>
-- next part --
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 3.5.0-rc4 Kernel Configuration
#
CONFIG_64BIT=y
# CONFIG_X86_32 is n

[Bug 81680] [r600g] Firefox crashes with hardware acceleration turned on

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=81680

--- Comment #9 from Eugene  ---
(In reply to comment #8)
> (In reply to comment #7)
> > Yes, I'm using Kubuntu. And Libgl1-mesa-dri-dbg recently installed. Last
> > report is here:
> 
> It still doesn't resolve the symbols, please use addr2line.

Which frame address should I use it with, and how do I determine it?
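
For a shared library, addr2line expects an offset into the library rather than
the absolute address printed in a backtrace frame. A minimal sketch, assuming
the frame address is taken from the crash backtrace and the load base from the
library's first mapping in /proc/PID/maps of the crashed process; both helper
names are hypothetical.

```shell
# Hypothetical helper: turn an absolute frame address into an offset
# relative to the load base of the shared object that contains it.
frame_offset() {
    # $1 = frame address from the backtrace
    # $2 = start address of the library's first mapping in /proc/PID/maps
    printf '0x%x\n' $(( $1 - $2 ))
}

# Hypothetical wrapper: resolve one frame to a function name and file:line.
# -f prints the function name, -C demangles C++ names, -e names the object.
resolve_frame() {
    # $1 = path to the library, $2 = frame address, $3 = load base
    addr2line -f -C -e "$1" "$(frame_offset "$2" "$3")"
}
```

For example, frame_offset 0x7f0000001234 0x7f0000000000 prints 0x1234, which is
the value to hand to addr2line together with the path of the library that the
frame belongs to.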


[PATCH 07/15] drm/exynos: dsi: Add support for panel prepare and unprepare routines

2014-08-05 Thread Ajay kumar

Hi Andrzej,


On Tue, Aug 5, 2014 at 3:33 PM, Andrzej Hajda  wrote:
> Hi Ajay,
>
>
> On 07/31/2014 07:42 PM, Ajay Kumar wrote:
>> Modify exynos_dsi driver to support the new panel calls:
>> prepare and unprepare.
>>
>> Signed-off-by: Ajay Kumar 
>> ---
>>  drivers/gpu/drm/exynos/exynos_drm_dsi.c |   12 ++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
>> b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
>> index dc7c80b..4834932 100644
>> --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
>> +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
>> @@ -1351,7 +1351,7 @@ static int exynos_dsi_enable(struct exynos_dsi *dsi)
>>   if (ret < 0)
>>   return ret;
>>
>> - ret = drm_panel_enable(dsi->panel);
>> + ret = drm_panel_prepare(dsi->panel);
>>   if (ret < 0) {
>>   exynos_dsi_poweroff(dsi);
>>   return ret;
>> @@ -1360,6 +1360,13 @@ static int exynos_dsi_enable(struct exynos_dsi *dsi)
>>   exynos_dsi_set_display_mode(dsi);
>>   exynos_dsi_set_display_enable(dsi, true);
>>
>> + ret = drm_panel_enable(dsi->panel);
>> + if (ret < 0) {
>> + exynos_dsi_set_display_enable(dsi, false);
>
> I guess drm_panel_unprepare(dsi->panel) should be here.
Thanks for pointing that out. I am not sure whether Thierry has already
picked this patch up, since Inki has given his Acked-by.
If he has, you can send the change separately as a fix :)

Ajay
>> + exynos_dsi_poweroff(dsi);
>> + return ret;
>> + }
>> +
>>   dsi->state |= DSIM_STATE_ENABLED;
>>
>>   return 0;
>> @@ -1370,8 +1377,9 @@ static void exynos_dsi_disable(struct exynos_dsi *dsi)
>>   if (!(dsi->state & DSIM_STATE_ENABLED))
>>   return;
>>
>> - exynos_dsi_set_display_enable(dsi, false);
>>   drm_panel_disable(dsi->panel);
>> + exynos_dsi_set_display_enable(dsi, false);
>> + drm_panel_unprepare(dsi->panel);
>>   exynos_dsi_poweroff(dsi);
>>
>>   dsi->state &= ~DSIM_STATE_ENABLED;
>>

[PATCH v3 00/23] AMDKFD Kernel Driver

2014-08-05 Thread Oded Gabbay

On 05/08/14 20:11, David Herrmann wrote:
> Hi
> 
> On Tue, Aug 5, 2014 at 5:30 PM, Oded Gabbay  wrote:
>> Hi,
>> Here is the v3 patch set of amdkfd.
>>
>> This version contains changes and fixes to code, as agreed on during the
>> review of the v2 patch set.
>>
>> The major changes are:
>>
>> - There are two new module parameters: # of processes and # of queues per
>>   process. The defaults, as agreed on in the v2 review, are 32 and 128
>>   respectively. This sets the default amount of GART address space that
>>   amdkfd requires to 3.5MB (3MB for userspace queues mqds and 0.5MB for
>>   other stuff, such as mqd for kernel queue, hpd for pipelines, etc.)
>>
>> - All the GART address space usage of amdkfd is done inside a single
>>   contiguous buffer that is allocated from system memory, and pinned to
>>   the start of the
>>   GART during the startup of amdkfd (which is just after the startup of
>>   radeon). The management of this buffer is done by the radeon sa manager.
>>   This buffer is not evict-able.
>>
>> - Mapping of doorbells is initiated by the userspace lib (by mmap syscall),
>>   instead of initiating it from inside an ioctl (using vm_mmap).
>>
>> - Removed ioctls for exclusive access to performance counters
>>
>> - Added documentation about the QCM (Queue Control Management), apertures and
>>   interfaces between amdkfd and radeon.
>>
>> Two important notes:
>>
>> - The topology patch has not been changed. Look at
>>   http://lists.freedesktop.org/archives/dri-devel/2014-July/065042.html
>>   for my response. I also put my answer as an explanation in the commit msg
>>   of the patch.
> 
> This patchset adds 10.000 lines and contains nearly 0 comments *why*
> stuff is added. Seriously, it is almost impossible to understand what
> you're doing. Can you please include a high-level introduction in the
> [0/X] cover-letter and include it in every series you send? A
> blog-post or something would also be fine. And yes, it's totally ok if
> this is 10k lines of plain-text.

My bad. I forgot to attach the cover letter of v2 and especially v1,
which includes a lengthy explanation of the driver.

So here it is and I will respond to your other comments later.

Oded

v2 cover letter:
---

As a continuation to the existing discussion, here is a v2 patch series
restructured with a cleaner history and no totally-different-early-versions
of the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them
are modifications to radeon driver and 18 of them include only amdkfd code.
There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under
drm/radeon/amdkfd. This move was done to emphasize the fact that this driver
is an AMD-only driver at this point. Having said that, we do foresee
a generic hsa framework being implemented in the future and in that case,
we will adjust amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to
keep it as a separate driver from radeon. Therefore, the amdkfd code is
contained in its own folder. The amdkfd folder was put under the radeon
folder because the only AMD gfx driver in the Linux kernel at this point
is the radeon driver. Having said that, we will probably need to move
it (maybe to be directly under drm) after we integrate with additional AMD
gfx drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay

-
Original Cover Letter:
-

This patch set implements a Heterogeneous System Architecture (HSA) driver
for radeon-family GPUs.

HSA allows different processor types (CPUs, DSPs, GPUs, etc.) to share
system resources more effectively via HW features including shared pageable
memory, userspace-accessible work queues, and platform-level atomics. In
addition to the memory protection mechanisms in GPUVM and IOMMUv2, the Sea
Islands family of GPUs also performs HW-level validation of commands passed
in through the queues (aka rings).

The code in this patch set is intended to serve both as a sample driver for
other HSA-compatible hardware devices and as a production driver for
radeon-family processors. The code is architected to support multiple CPUs
each with connected GPUs, although the current implementation focuses on a
single Kaveri/Berlin APU, and works alongside the existing radeon kernel
graphics driver (kgd).

AMD GPUs designed for use with HSA (Sea Islands and up) share some hardware
functionality between HSA compute and regular gfx/compute (memory,
interrupts, registers), while other functionality has been added
specifically for HSA compute (hw scheduler for virtualized compute rings).
All shared hardware is owned by the radeon graphics driver, and an interface
between kfd and kgd allows the kfd to

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #21 from Alex Deucher  ---
(In reply to comment #20)
> (In reply to comment #19)
> > Maybe we'll get more useful feedback once more people start testing hawaii.
> 
> That sounds like I failed to provide something? If you have any request,
> what I should check, just let me know. Ie. trying a different compiler?

I didn't mean to imply that.  I can't think of anything else to provide.  I'm
just thinking maybe someone will notice some small detail that I missed or
something like that.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #20 from Kai  ---
(In reply to comment #19)
> Maybe we'll get more useful feedback once more people start testing hawaii.

That sounds like I failed to provide something? If you have any request, what I
should check, just let me know. Ie. trying a different compiler?


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #19 from Alex Deucher  ---
(In reply to comment #18)
> (In reply to comment #16)
> > I don't have any other ideas off hand.  That patch represents the only
> > difference that explicitly setting that parameter makes.
> 
> Ok, no problem; I just keep the radeon.dpm=1 around and I'm going to be
> happy, I hope. But I guess we should keep this bug open, until we find the
> cause? Maybe we should change the title to something like "reclocking only
> with radeon.dpm=1 set"? But that's all your call.

Yeah, let's keep it open for now.  Maybe we'll get more useful feedback once
more people start testing hawaii.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #18 from Kai  ---
(In reply to comment #16)
> I don't have any other ideas off hand.  That patch represents the only
> difference that explicitly setting that parameter makes.

Ok, no problem; I just keep the radeon.dpm=1 around and I'm going to be happy,
I hope. But I guess we should keep this bug open, until we find the cause?
Maybe we should change the title to something like "reclocking only with
radeon.dpm=1 set"? But that's all your call.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #17 from Kai  ---
(In reply to comment #15)
> I booted each configuration represented by attachment 104103 and attachment
> 104104 two times.

Just to clarify: the boot and testing order was:

rebooting into configuration 104103 -> starting Portal 2 with GALLIUM_HUD=fps ->
verifying FPS in level as low -> powering off

booting configuration 104104 -> starting Portal 2 with GALLIUM_HUD=fps ->
verifying FPS in level as high -> powering off

booting configuration 104103 -> starting Portal 2 with GALLIUM_HUD=fps ->
verifying FPS in level as low -> rebooting into configuration 104104 -> starting
Portal 2 with GALLIUM_HUD=fps -> verifying FPS in level as high


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #16 from Alex Deucher  ---
I don't have any other ideas off hand.  That patch represents the only
difference that explicitly setting that parameter makes.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #15 from Kai  ---
(In reply to comment #11)
> Created attachment 104101 [details] [review]
> enable dpm=1 debugging even when dpm is not forced
> 
> This patch enables the additional dpm debugging output even when it is not
> explicitly set on the command line.  Does it help?  The only thing I can
> figure is that the debugging output adds a small delay that may have a
> positive impact.

You're not going to like this. But setting radeon.dpm=1 must have some other
side effect. I booted each configuration represented by attachment 104103 and
attachment 104104 two times. The first (104103) is the stack from comment #0
plus the patch from attachment 104101 applied to the kernel, then booted
without radeon.dpm=1 (see the dmesg output for the kernel command line). When I
start Portal 2 I stay at the numbers reported in comment #0 (ie. at low FPS).

If I boot the stack from comment #0 with the patch from attachment 104101
applied to the kernel and DO set radeon.dpm=1 on the kernel command line (see
second dmesg output; 104104), then I get 60 FPS in Portal 2.


[Bug 68799] [APITRACE] Hyper-Z lockup with Falcon BMS 4.32u6 on CAYMAN

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=68799

--- Comment #3 from Stanisław Halik ---
Available at last address again.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #14 from Alex Deucher  ---
Did it help?  With the patch applied, the behavior of the driver is identical
whether or not you append radeon.dpm=1 to your kernel command line.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #13 from Kai  ---
Created attachment 104104
  --> https://bugs.freedesktop.org/attachment.cgi?id=104104&action=edit
dmesg output with attachment 104101 and "radeon.dpm=1" set


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #12 from Kai  ---
Created attachment 104103
  --> https://bugs.freedesktop.org/attachment.cgi?id=104103&action=edit
dmesg output with attachment 104101 and no "radeon.dpm=1" set


screen goes blank when loading gma500_gfx (atom D2500)

2014-08-05 Thread Michael Tokarev

05.08.2014 20:11, Michael Tokarev wrote:
> Hello again.
> 
> It's been 4 more months since last message in this thread (which was mine).
> Now kernel 3.16 has been released, and I decided to give it a try.  And it
> behaves just like all previous kernels, -- once gma500_gfx module is loaded,
> screen goes blank, monitor turns off ("no signal detected") and nothing to
> be seen until reboot.
> 
> Can we try to debug this somehow, after more than half a year?... :)

Current debugging output (from 3.16), after:

 modprobe drm debug=6
 modprobe gma500_gfx

on a freshly booted system:

[   46.463381] Linux agpgart interface v0.103
[   46.491487] [drm] Initialized drm 1.1.0 20060810
[   56.585520] [drm:psb_intel_opregion_setup] Public ACPI methods supported
[   56.585528] [drm:psb_intel_opregion_setup] ASLE supported
[   56.585563] gma500 :00:02.0: irq 50 for MSI/MSI-X
[   56.585591] [drm:psb_intel_init_bios] Using VBT from OpRegion: $VBT 
CEDARVIEW  d
[   56.585604] [drm:drm_mode_debug_printmodeline] Modeline 0:"1920x1080" 0 
144000 1920 2016 2080 2176 1080 1088 1092 1100 0x8 0xa
[   56.585609] [drm:parse_sdvo_device_mapping] No SDVO device info is found in 
VBT
[   56.585617] [drm:parse_edp] EDP timing in vbt t1_t3 2000 t8 10 t9 2000 t10 
500 t11_t12 5000
[   56.585621] [drm:parse_edp] VBT reports EDP: Lane_count 1, Lane_rate 6, Bpp 
24
[   56.585624] [drm:parse_edp] VBT reports EDP: VSwing  0, Preemph 0
[   56.598203] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[   56.598902] acpi device:28: registered as cooling_device2
[   56.599109] input: Video Bus as 
/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input11
[   56.599326] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   56.599366] [drm] No driver support for vblank timestamp query.
[   56.650918] [drm:drm_do_probe_ddc_edid] drm: skipping non-existent adapter 
intel drm LVDSDDC_C
[   56.651842] [drm:cdv_intel_dp_i2c_init] i2c_init DPDDC-B
[   56.652352] [drm:cdv_intel_dp_aux_ch] dp_aux_ch timeout status 0x51440064
[   56.652356] [drm:cdv_intel_dp_i2c_aux_ch] aux_ch failed -110
[   56.652863] [drm:cdv_intel_dp_aux_ch] dp_aux_ch timeout status 0x51440064
[   56.652866] [drm:cdv_intel_dp_i2c_aux_ch] aux_ch failed -110
[   56.653706] [drm:cdv_intel_dp_i2c_init] i2c_init DPDDC-C
[   56.654014] [drm:cdv_intel_dp_i2c_aux_ch] aux_i2c nack
[   56.654223] [drm:cdv_intel_dp_i2c_aux_ch] aux_i2c nack
[   56.714765] gma500 :00:02.0: trying to get vblank count for disabled 
pipe 1
[   56.714812] gma500 :00:02.0: trying to get vblank count for disabled 
pipe 1
[   56.775220] [drm:drm_helper_probe_single_connector_modes_merge_bits] 
[CONNECTOR:10:VGA-1]
[   56.900606] [drm:drm_helper_probe_single_connector_modes_merge_bits] 
[CONNECTOR:10:VGA-1] probed modes :
[   56.900617] [drm:drm_mode_debug_printmodeline] Modeline 26:"1280x1024" 60 
108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48 0x5
[   56.900624] [drm:drm_mode_debug_printmodeline] Modeline 36:"1280x1024" 75 
135000 1280 1296 1440 1688 1024 1025 1028 1066 0x40 0x5
[   56.900630] [drm:drm_mode_debug_printmodeline] Modeline 29:"1280x1024" 72 
132840 1280 1368 1504 1728 1024 1025 1028 1067 0x0 0x6
[   56.900637] [drm:drm_mode_debug_printmodeline] Modeline 28:"1152x864" 75 
108000 1152 1216 1344 1600 864 865 868 900 0x40 0x5
[   56.900643] [drm:drm_mode_debug_printmodeline] Modeline 37:"1024x768" 75 
78800 1024 1040 1136 1312 768 769 772 800 0x40 0x5
[   56.900649] [drm:drm_mode_debug_printmodeline] Modeline 38:"1024x768" 70 
75000 1024 1048 1184 1328 768 771 777 806 0x40 0xa
[   56.900656] [drm:drm_mode_debug_printmodeline] Modeline 39:"1024x768" 60 
65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa
[   56.900662] [drm:drm_mode_debug_printmodeline] Modeline 40:"832x624" 75 
57284 832 864 928 1152 624 625 628 667 0x40 0xa
[   56.900669] [drm:drm_mode_debug_printmodeline] Modeline 41:"800x600" 75 
49500 800 816 896 1056 600 601 604 625 0x40 0x5
[   56.900675] [drm:drm_mode_debug_printmodeline] Modeline 42:"800x600" 72 
5 800 856 976 1040 600 637 643 666 0x40 0x5
[   56.900681] [drm:drm_mode_debug_printmodeline] Modeline 30:"800x600" 60 
4 800 840 968 1056 600 601 605 628 0x40 0x5
[   56.900687] [drm:drm_mode_debug_printmodeline] Modeline 31:"640x480" 75 
31500 640 656 720 840 480 481 484 500 0x40 0xa
[   56.900694] [drm:drm_mode_debug_printmodeline] Modeline 32:"640x480" 73 
31500 640 664 704 832 480 489 491 520 0x40 0xa
[   56.900700] [drm:drm_mode_debug_printmodeline] Modeline 33:"640x480" 67 
30240 640 704 768 864 480 483 486 525 0x40 0xa
[   56.900706] [drm:drm_mode_debug_printmodeline] Modeline 34:"640x480" 60 
25200 640 656 752 800 480 490 492 525 0x40 0xa
[   56.900713] [drm:drm_mode_debug_printmodeline] Modeline 35:"720x400" 70 
28320 720 738 846 900 400 412 414 449 0x40 0x6
[   56.900719] [drm:drm_mode_debug_printmodeline] Modeline 27:"640x350" 70 
25170 640 656 752 800 350 387 389 449 0x40 0x9
[   56.900724]
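
The debug=6 value passed to the drm module in the report above is a bit mask.
A minimal sketch of decoding it, assuming the category bits used by kernels of
that era (0x01 core, 0x02 driver, 0x04 KMS, 0x08 PRIME). The function name is
hypothetical, and later kernels define further bits that this sketch ignores.

```shell
# Hypothetical helper: print the names of the categories enabled by a
# drm debug mask. The bit values are assumed from kernels around 3.16.
drm_debug_flags() {
    mask=$(( $1 ))
    out=""
    if [ $(( mask & 1 )) -ne 0 ]; then out="$out CORE"; fi
    if [ $(( mask & 2 )) -ne 0 ]; then out="$out DRIVER"; fi
    if [ $(( mask & 4 )) -ne 0 ]; then out="$out KMS"; fi
    if [ $(( mask & 8 )) -ne 0 ]; then out="$out PRIME"; fi
    # Strip the leading space left by the concatenation above.
    echo "${out# }"
}
```

drm_debug_flags 6 prints "DRIVER KMS", so under these assumptions the log
contains driver and modesetting messages but no core DRM messages.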

screen goes blank when loading gma500_gfx (atom D2500)

2014-08-05 Thread Michael Tokarev

Hello again.

It's been 4 more months since last message in this thread (which was mine).
Now kernel 3.16 has been released, and I decided to give it a try.  And it
behaves just like all previous kernels, -- once gma500_gfx module is loaded,
screen goes blank, monitor turns off ("no signal detected") and nothing to
be seen until reboot.

Can we try to debug this somehow, after more than half a year?... :)

Thank you,

/mjt

05.04.2014 12:15, Michael Tokarev wrote:
> Hello again
> 
> It's been about 2 months since I sent the original debugging output.  Today I
> tried out 3.14 kernel.  And this one behaves quite similarly, screen goes blank
> right when loading gma500_gfx module.  Here's the dmesg from a freshly booted
> system after doing
> 
>   modprobe drm debug=6
>   modprobe gma500_gfx
> 
> with a monitor connected to VGA port (before loading gma500_gfx, it displays
> the regular text console):
> 
> [   39.863330] Linux agpgart interface v0.103
> [   39.900511] [drm] Initialized drm 1.1.0 20060810
> [   45.012300] [drm:psb_intel_opregion_setup], Public ACPI methods supported
> [   45.012308] [drm:psb_intel_opregion_setup], ASLE supported
> [   45.012345] gma500 :00:02.0: irq 50 for MSI/MSI-X
> [   45.012371] [drm:psb_intel_init_bios], Using VBT from OpRegion: $VBT 
> CEDARVIEW  d
> [   45.012384] [drm:drm_mode_debug_printmodeline], Modeline 0:"1920x1080" 0 
> 144000 1920 2016 2080 2176 1080 1088 1092 1100 0x8 0xa
> [   45.012389] [drm:parse_sdvo_device_mapping], No SDVO device info is found 
> in VBT
> [   45.012397] [drm:parse_edp], EDP timing in vbt t1_t3 2000 t8 10 t9 2000 
> t10 500 t11_t12 5000
> [   45.012401] [drm:parse_edp], VBT reports EDP: Lane_count 1, Lane_rate 6, 
> Bpp 24
> [   45.012405] [drm:parse_edp], VBT reports EDP: VSwing  0, Preemph 0
> [   45.012478] gma500 :00:02.0: GPU: power management timed out.
> [   45.026195] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
> [   45.026891] acpi device:29: registered as cooling_device2
> [   45.027104] input: Video Bus as 
> /devices/LNXSYSTM:00/device:00/PNP0A08:00/LNXVIDEO:00/input/input11
> [   45.027681] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [   45.027726] [drm] No driver support for vblank timestamp query.
> [   45.078928] [drm:drm_do_probe_ddc_edid], drm: skipping non-existent 
> adapter intel drm LVDSDDC_C
> [   45.079839] [drm:cdv_intel_dp_i2c_init], i2c_init DPDDC-B
> [   45.080383] [drm:cdv_intel_dp_aux_ch], dp_aux_ch timeout status 0x51440064
> [   45.080388] [drm:cdv_intel_dp_i2c_aux_ch], aux_ch failed -110
> [   45.080896] [drm:cdv_intel_dp_aux_ch], dp_aux_ch timeout status 0x51440064
> [   45.080899] [drm:cdv_intel_dp_i2c_aux_ch], aux_ch failed -110
> [   45.081754] [drm:cdv_intel_dp_i2c_init], i2c_init DPDDC-C
> [   45.082062] [drm:cdv_intel_dp_i2c_aux_ch], aux_i2c nack
> [   45.082272] [drm:cdv_intel_dp_i2c_aux_ch], aux_i2c nack
> [   45.122742] [drm:cdv_intel_single_pipe_active], pipe enabled 0
> [   45.142780] gma500 :00:02.0: trying to get vblank count for disabled 
> pipe 1
> [   45.142826] gma500 :00:02.0: trying to get vblank count for disabled 
> pipe 1
> [   45.183207] [drm:cdv_intel_single_pipe_active], pipe enabled 0
> [   45.203249] [drm:drm_helper_probe_single_connector_modes], 
> [CONNECTOR:7:VGA-1]
> [   45.332286] [drm:drm_helper_probe_single_connector_modes], 
> [CONNECTOR:7:VGA-1] probed modes :
> [   45.332297] [drm:drm_mode_debug_printmodeline], Modeline 23:"1280x1024" 60 
> 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48 0x5
> [   45.332304] [drm:drm_mode_debug_printmodeline], Modeline 33:"1280x1024" 75 
> 135000 1280 1296 1440 1688 1024 1025 1028 1066 0x40 0x5
> [   45.332311] [drm:drm_mode_debug_printmodeline], Modeline 26:"1280x1024" 72 
> 132840 1280 1368 1504 1728 1024 1025 1028 1067 0x0 0x6
> [   45.332318] [drm:drm_mode_debug_printmodeline], Modeline 25:"1152x864" 75 
> 108000 1152 1216 1344 1600 864 865 868 900 0x40 0x5
> [   45.332325] [drm:drm_mode_debug_printmodeline], Modeline 34:"1024x768" 75 
> 78800 1024 1040 1136 1312 768 769 772 800 0x40 0x5
> [   45.332332] [drm:drm_mode_debug_printmodeline], Modeline 35:"1024x768" 70 
> 75000 1024 1048 1184 1328 768 771 777 806 0x40 0xa
> [   45.332338] [drm:drm_mode_debug_printmodeline], Modeline 36:"1024x768" 60 
> 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa
> [   45.332345] [drm:drm_mode_debug_printmodeline], Modeline 37:"832x624" 75 
> 57284 832 864 928 1152 624 625 628 667 0x40 0xa
> [   45.332352] [drm:drm_mode_debug_printmodeline], Modeline 38:"800x600" 75 
> 49500 800 816 896 1056 600 601 604 625 0x40 0x5
> [   45.332359] [drm:drm_mode_debug_printmodeline], Modeline 39:"800x600" 72 
> 5 800 856 976 1040 600 637 643 666 0x40 0x5
> [   45.332365] [drm:drm_mode_debug_printmodeline], Modeline 27:"800x600" 60 
> 4 800 840 968 1056 600 601 605 628 0x40 0x5
> [   45.332372] [drm:drm_mode_debug_printmodeline], Modeline 28:"640x480" 75 
> 31500 640 656 720 840 480 481

[sbc_gxx] kernel BUG at include/linux/mtd/map.h:148!

2014-08-05 Thread Nick Krause

On Tue, Aug 5, 2014 at 9:59 AM, Fengguang Wu  wrote:
> Hello,
>
> This is an old BUG that still lives in linux-next.
>
> [4.284620] device id = 2670
> [4.286157] SBC-GXx flash: IO:0x258-0x259 MEM:0xdc000-0xd
> [4.287060] [ cut here ]
> [4.287722] kernel BUG at include/linux/mtd/map.h:148!
> [4.288048] invalid opcode:  [#1] PREEMPT SMP
> [4.288048] CPU 1
> [4.288048] Pid: 1, comm: swapper/0 Not tainted 3.5.0-rc4-00162-g49099c4 
> #17 Bochs Bochs
> [4.288048] RIP: 0010:[]  [] 
> mtd_do_chip_probe+0x1d/0x1f
> [4.288048] RSP: 0018:880011049e20  EFLAGS: 00010246
> [4.288048] RAX:  RBX: 82a23550 RCX: 
> 
> [4.288048] RDX: 880011049e20 RSI: 82a23580 RDI: 
> 880011049e80
> [4.288048] RBP: 880011049e80 R08: 0003 R09: 
> 810d6c93
> [4.288048] R10:  R11: 0001 R12: 
> 82a23eb0
> [4.288048] R13: 828790ce R14:  R15: 
> 
> [4.288048] FS:  () GS:88001260() 
> knlGS:
> [4.288048] CS:  0010 DS:  ES:  CR0: 8005003b
> [4.288048] CR2:  CR3: 0298c000 CR4: 
> 000406e0
> [4.288048] DR0:  DR1:  DR2: 
> 
> [4.288048] DR3:  DR6: 0ff0 DR7: 
> 0400
> [4.288048] Process swapper/0 (pid: 1, threadinfo 880011048000, task 
> 88001104)
> [4.288048] Stack:
> [4.288048]     
> 
> [4.288048]     
> 
> [4.288048]     
> 
> [4.288048] Call Trace:
> [4.288048]  [] cfi_probe+0x15/0x17
> [4.288048]  [] do_map_probe+0xa0/0xac
> [4.288048]  [] ? physmap_init+0x12/0x12
> [4.288048]  [] init_sbc_gxx+0x104/0x15b
> [4.288048]  [] do_one_initcall+0x86/0x208
> [4.288048]  [] kernel_init+0x10d/0x1c2
> [4.288048]  [] ? do_early_param+0xc3/0xc3
> [4.288048]  [] kernel_thread_helper+0x4/0x10
> [4.288048]  [] ? retint_restore_args+0x13/0x13
> [4.288048]  [] ? do_one_initcall+0x208/0x208
> [4.288048]  [] ? gs_change+0x13/0x13
> [4.288048] Code: 83 c4 58 5b 41 5c 41 5d 41 5e 41 5f 5d c3 55 48 89 e5 48 
> 83 ec 60 66 66 66 66 90 31 c0 b9 18 00 00 00 48 8d 55 a0 48 89 d7 f3 ab <0f> 
> 0b 55 48 89 e5 66 66 66 66 90 48 c7 c6 a0 39 a2 82 e8 cc ff
> [4.288048] RIP  [] mtd_do_chip_probe+0x1d/0x1f
> [4.288048]  RSP 
> [4.321423] ---[ end trace 169195d5d1f9be6e ]---
> [4.322118] swapper/0 (1) used greatest stack depth: 3768 bytes left
>
> This script may reproduce the error.
>
> 
> #!/bin/bash
>
> kernel=$1
> initrd=quantal-core-x86_64.cgz
>
> wget --no-clobber 
> https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
>
> kvm=(
> qemu-system-x86_64
> -enable-kvm
> -cpu Haswell,+smep,+smap
> -kernel $kernel
> -initrd $initrd
> -m 320
> -smp 2
> -net nic,vlan=1,model=e1000
> -net user,vlan=1
> -boot order=nc
> -no-reboot
> -watchdog i6300esb
> -rtc base=localtime
> -serial stdio
> -display none
> -monitor null
> )
>
> append=(
> hung_task_panic=1
> earlyprintk=ttyS0,115200
> debug
> apic=debug
> sysrq_always_enabled
> rcupdate.rcu_cpu_stall_timeout=100
> panic=10
> softlockup_panic=1
> nmi_watchdog=panic
> prompt_ramdisk=0
> console=ttyS0,115200
> console=tty0
> vga=normal
> root=/dev/ram0
> rw
> drbd.minor_count=8
> )
>
> "${kvm[@]}" --append "${append[*]}"
> 
>
> Thanks,
> Fengguang
>
> ___
> LKP mailing list
> LKP at linux.intel.com
>
I am new here and will try to trace your issue on Linus's tree,
unless there is a major difference between Linus's tree and
linux-next.
If there is, please let me know before I start tracing this.
Best Regards,
Nick

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #11 from Alex Deucher  ---
Created attachment 104101
  --> https://bugs.freedesktop.org/attachment.cgi?id=104101=edit
enable dpm=1 debugging even when dpm is not forced

This patch enables the additional dpm debugging output even when it is not
explicitly set on the command line.  Does it help?  The only thing I can figure
is that the debugging output adds a small delay that may have a positive
impact.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #10 from Kai  ---
(In reply to comment #8)
> dpm is enabled by default for hawaii asics.  You shouldn't need to force it
> on the command line.  forcing it just enabled additional debugging output.

I can only state, that by setting radeon.dpm=1 I get 60 FPS in e.g. Portal 2
and without I'm at 15 FPS max. As written in comment #0, I've built your
drm-next-3.17-rebased-on-fixes branch, my top commit is
commit fa783807977da98da35590fd1d5efdfd4f33fd59
Author: Christian König
Date:   Mon Jul 28 13:30:12 2014 +0200

drm/radeon: allow userptr write access under certain conditions

It needs to be anonymous memory (no file mappings)
and we are required to install an MMU notifier.

Signed-off-by: Christian König
Signed-off-by: Alex Deucher


I even went through several reboots, switching between "with radeon.dpm=1" and
without. All showed the same result. Let me know, if there is something else, I
can do to assist in debugging this.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #9 from Kai  ---
(In reply to comment #7)
> (In reply to comment #6)
> > Now for your glxgears test: reclocking works (in Portal 2 as well, where I
> > get 58-60 FPS now). The only difference is the radeon.dpm=1 on the kernel
> > command line.
> 
> Are you absolutely sure you need radeon.dpm=1 ?

Yes.

> Reclocking works here (R9
> 290X) without it. I just rechecked and I don't have it on my kernel command
> line (new "drm-next-3.17" branch). Nor do I have it anywhere in /etc.

If unsure with what you've booted, look at dmesg, one of the first lines looks
like:
> Command line: BOOT_IMAGE=/vmlinuz-3.16.0-rc6-citadel 
> root=/dev/mapper/citadel--vg-vol--root ro quiet radeon.dpm=1
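
A minimal sketch of the check described above, assuming a POSIX shell. The
helper name cmdline_param is hypothetical; /proc/cmdline and
/sys/module/radeon/parameters/dpm are standard procfs/sysfs locations, though
the latter only exists while the radeon module is loaded.

```sh
#!/bin/sh
# Print the value of a kernel command-line parameter.
#   $1: parameter name, e.g. radeon.dpm
#   $2: the command-line string to search
# Prints nothing and returns 1 if the parameter is absent.
cmdline_param() {
    name=$1
    cmdline=$2
    for word in $cmdline; do
        case $word in
            "$name"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Demonstration on the command line quoted in this comment; prints "1".
cmdline_param radeon.dpm "BOOT_IMAGE=/vmlinuz-3.16.0-rc6-citadel root=/dev/mapper/citadel--vg-vol--root ro quiet radeon.dpm=1"

# On a live system (not run here):
#   cmdline_param radeon.dpm "$(cat /proc/cmdline)"
#   cat /sys/module/radeon/parameters/dpm    # value the driver was loaded with
#   grep -rs radeon /etc/modprobe.d/         # options set outside the command line
```

Checking /etc/modprobe.d as well matters because a module option set there
does not show up in the dmesg "Command line:" entry.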


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #8 from Alex Deucher  ---
dpm is enabled by default for hawaii asics.  You shouldn't need to force it on
the command line.  forcing it just enabled additional debugging output.


[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory v2

2014-08-05 Thread Christian König

Am 05.08.2014 um 19:39 schrieb Jerome Glisse:
> On Tue, Aug 05, 2014 at 06:05:29PM +0200, Christian König wrote:
>> From: Christian König
>>
>> Avoid problems with writeback by limiting userptr to anonymous memory.
>>
>> v2: add commit and code comments
> I guess, i have not expressed myself clearly. This is bogus, you pretend
> you want to avoid writeback issue but you still allow userspace to map
> file backed pages (which by the way might be a regular bo object from
> another device for instance and that would be fun).
>
> So this patch is a no go and i would rather see that this userptr to
> be restricted to anon vma only no matter what. No flags here.

Mapping of non-anonymous memory (e.g. everything get_user_pages won't 
fail with) is restricted to read-only access by the GPU.

I'm fine with making it a hard requirement for all mappings if you say 
it's a must have.

Christian.

>
> Cheers,
> Jérôme
>
>> Signed-off-by: Christian König
>> ---
>>   drivers/gpu/drm/radeon/radeon_gem.c |  3 ++-
>>   drivers/gpu/drm/radeon/radeon_ttm.c | 10 ++
>>   include/uapi/drm/radeon_drm.h   |  1 +
>>   3 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
>> b/drivers/gpu/drm/radeon/radeon_gem.c
>> index 993ab22..032736b 100644
>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
>> @@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, 
>> void *data,
>>  return -EACCES;
>>   
>>  /* reject unknown flag values */
>> -if (args->flags & ~RADEON_GEM_USERPTR_READONLY)
>> +if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
>> +RADEON_GEM_USERPTR_ANONONLY))
>>  return -EINVAL;
>>   
>>  /* readonly pages not tested on older hardware */
>> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
>> b/drivers/gpu/drm/radeon/radeon_ttm.c
>> index 0109090..54eb7bc 100644
>> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
>> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
>> @@ -542,6 +542,16 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
>> ttm->num_pages * PAGE_SIZE))
>>  return -EFAULT;
>>   
>> +if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
>> +/* check that we only pin down anonymous memory
>> +   to prevent problems with writeback */
>> +unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
>> +struct vm_area_struct *vma;
>> +vma = find_vma(gtt->usermm, gtt->userptr);
>> +if (!vma || vma->vm_file || vma->vm_end < end)
>> +return -EPERM;
>> +}
>> +
>>  do {
>>  unsigned num_pages = ttm->num_pages - pinned;
>>  uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE;
>> diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
>> index 3a9f209..9720e1a 100644
>> --- a/include/uapi/drm/radeon_drm.h
>> +++ b/include/uapi/drm/radeon_drm.h
>> @@ -816,6 +816,7 @@ struct drm_radeon_gem_create {
>>* perform any operation.
>>*/
>>   #define RADEON_GEM_USERPTR_READONLY(1 << 0)
>> +#define RADEON_GEM_USERPTR_ANONONLY (1 << 1)
>>   
>>   struct drm_radeon_gem_userptr {
>>  uint64_taddr;
>> -- 
>> 1.9.1
>>
>> ___
>> dri-devel mailing list
>> dri-devel at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
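
The "reject unknown flag values" test in the quoted radeon_gem_userptr_ioctl
hunk can be modelled outside the kernel. This is only a sketch in POSIX shell
arithmetic, not the driver code: the two flag values are taken from the quoted
radeon_drm.h hunk, and the function name userptr_flags_ok is hypothetical.

```sh
#!/bin/sh
# Flag bits as defined in the quoted include/uapi/drm/radeon_drm.h hunk.
RADEON_GEM_USERPTR_READONLY=$(( 1 << 0 ))
RADEON_GEM_USERPTR_ANONONLY=$(( 1 << 1 ))

# Return 0 if $1 contains only known flag bits, 1 otherwise.
# The kernel hunk returns -EINVAL in the "otherwise" case.
userptr_flags_ok() {
    known=$(( RADEON_GEM_USERPTR_READONLY | RADEON_GEM_USERPTR_ANONONLY ))
    [ $(( $1 & ~known )) -eq 0 ]
}

# Try a few flag words: 0, 1 and 3 use only known bits, 4 does not.
for flags in 0 1 3 4; do
    if userptr_flags_ok "$flags"; then
        echo "$flags: accepted"
    else
        echo "$flags: rejected (-EINVAL)"
    fi
done
```

Masking with the complement of all known bits means that any flag added later
has to be listed explicitly before userspace can pass it.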

[pull] radeon drm-next-3.17

2014-08-05 Thread Christian König

Am 05.08.2014 um 19:22 schrieb Daniel Vetter:
> On Tue, Aug 5, 2014 at 7:15 PM, Deucher, Alexander
>  wrote:
>>> -Original Message-
>>> From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel
>>> Vetter
>>> Sent: Tuesday, August 05, 2014 1:09 PM
>>> To: Alex Deucher
>>> Cc: dri-devel at lists.freedesktop.org; airlied at gmail.com; Deucher, 
>>> Alexander
>>> Subject: Re: [pull] radeon drm-next-3.17
>>>
>>> On Tue, Aug 05, 2014 at 12:22:06PM -0400, Alex Deucher wrote:
>>>> Hi Dave,
>>>>
>>>> This is the radeon pull request for 3.17.  Highlights:
>>>> - Additional Hawaii fixes
>>>> - Support for using the display scaler on non-fixed mode displays
>>>> - Support for new firmware format that makes it easier to update
>>>> - Enable dpm by default on additional asics
>>>> - GPUVM improvements
>>>> - Support for uncached and write combined gtt buffers
>>>> - Userptr support
>>> Aside: Where's the libdrm/mesa/whatever patches for this? I didn't see
>>> them fly by anywhere, so I guess I've missed them on some m-l I don't
>>> subscribe to.
>> Christian wrote some patches to validate the interfaces, but I'm not sure he 
>> ever sent them out.  We haven't yet done a full implementation in the 
>> usermode drivers to take advantage of this yet.
> Well right now I've consistently rejected all patches that don't yet
> come with the full thing (libdrm, usermode drivers and tests for it
> all as not ready). And I do that at least once per week since we have
> blob userspace separate from mesa, too. So if we toss that rule
> overboard (and my understanding is that Dave's been fairly strict
> here) I'll look rather bad. As in really, really bad.
>
> I strongly prefer that userptr gets postponed until it's ready. Dave?

That's just my fault. Wanted to wait with the mesa patches till we have 
the kernel interface accepted.

I've just sent them out,
Christian.

> -Daniel

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #7 from Luzipher  ---
(In reply to comment #6)
> Now for your glxgears test: reclocking works (in Portal 2 as well, where I
> get 58-60 FPS now). The only difference is the radeon.dpm=1 on the kernel
> command line.

Are you absolutely sure you need radeon.dpm=1 ? Reclocking works here (R9 290X)
without it. I just rechecked and I don't have it on my kernel command line (new
"drm-next-3.17" branch). Nor do I have it anywhere in /etc.


[pull] radeon drm-next-3.17

2014-08-05 Thread Daniel Vetter

On Tue, Aug 5, 2014 at 7:15 PM, Deucher, Alexander
 wrote:
>> -Original Message-
>> From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel
>> Vetter
>> Sent: Tuesday, August 05, 2014 1:09 PM
>> To: Alex Deucher
>> Cc: dri-devel at lists.freedesktop.org; airlied at gmail.com; Deucher, 
>> Alexander
>> Subject: Re: [pull] radeon drm-next-3.17
>>
>> On Tue, Aug 05, 2014 at 12:22:06PM -0400, Alex Deucher wrote:
>> > Hi Dave,
>> >
>> > This is the radeon pull request for 3.17.  Highlights:
>> > - Additional Hawaii fixes
>> > - Support for using the display scaler on non-fixed mode displays
>> > - Support for new firmware format that makes it easier to update
>> > - Enable dpm by default on additional asics
>> > - GPUVM improvements
>> > - Support for uncached and write combined gtt buffers
>> > - Userptr support
>>
>> Aside: Where's the libdrm/mesa/whatever patches for this? I didn't see
>> them fly by anywhere, so I guess I've missed them on some m-l I don't
>> subscribe to.
>
> Christian wrote some patches to validate the interfaces, but I'm not sure he 
> ever sent them out.  We haven't yet done a full implementation in the 
> usermode drivers to take advantage of this yet.

Well right now I've consistently rejected all patches that don't yet
come with the full thing (libdrm, usermode drivers and tests for it
all as not ready). And I do that at least once per week since we have
blob userspace separate from mesa, too. So if we toss that rule
overboard (and my understanding is that Dave's been fairly strict
here) I'll look rather bad. As in really, really bad.

I strongly prefer that userptr gets postponed until it's ready. Dave?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[PATCH v3 00/23] AMDKFD Kernel Driver

2014-08-05 Thread David Herrmann

Hi

On Tue, Aug 5, 2014 at 5:30 PM, Oded Gabbay  wrote:
> Hi,
> Here is the v3 patch set of amdkfd.
>
> This version contains changes and fixes to code, as agreed on during the 
> review
> of the v2 patch set.
>
> The major changes are:
>
> - There are two new module parameters: # of processes and # of queues per
>   process. The defaults, as agreed on in the v2 review, are 32 and 128
>   respectively. This sets the default amount of GART address space that amdkfd
>   requires to 3.5MB (3MB for userspace queues mqds and 0.5MB for other stuff,
>   such as mqd for kernel queue, hpd for pipelines, etc.)
>
> - All the GART address space usage of amdkfd is done inside a single 
> contiguous
>   buffer that is allocated from system memory, and pinned to the start of the
>   GART during the startup of amdkfd (which is just after the startup of
>   radeon). The management of this buffer is done by the radeon sa manager.
>   This buffer is not evict-able.
>
> - Mapping of doorbells is initiated by the userspace lib (by mmap syscall),
>   instead of initiating it from inside an ioctl (using vm_mmap).
>
> - Removed ioctls for exclusive access to performance counters
>
> - Added documentation about the QCM (Queue Control Management), apertures and
>   interfaces between amdkfd and radeon.
>
> Two important notes:
>
> - The topology patch has not been changed. Look at
>   http://lists.freedesktop.org/archives/dri-devel/2014-July/065042.html
>   for my response. I also put my answer as an explanation in the commit msg
>   of the patch.
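
The GART budget quoted above can be checked with a little arithmetic. This is
a back-of-the-envelope sketch, assuming 1MB means 2^20 bytes and that the 3MB
for userspace queue mqds is split evenly over all queues; the function name
mqd_bytes_per_queue is hypothetical.

```sh
#!/bin/sh
# Bytes of GART space available per userspace queue.
#   $1: number of processes
#   $2: number of queues per process
#   $3: mqd budget in MB (taken as 2^20 bytes each)
mqd_bytes_per_queue() {
    processes=$1
    queues_per_process=$2
    budget_mb=$3
    echo $(( budget_mb * 1024 * 1024 / (processes * queues_per_process) ))
}

# Defaults from the cover letter: 32 processes, 128 queues each, 3MB of mqds.
mqd_bytes_per_queue 32 128 3    # prints 768
```

With the defaults that is 32 * 128 = 4096 queues, so the stated 3MB leaves
768 bytes of GART space per queue under these assumptions.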

This patchset adds 10.000 lines and contains nearly 0 comments *why*
stuff is added. Seriously, it is almost impossible to understand what
you're doing. Can you please include a high-level introduction in the
[0/X] cover-letter and include it in every series you send? A
blog-post or something would also be fine. And yes, it's totally ok if
this is 10k lines of plain-text.

Let's start with the basics:

1) Why do you use kobject directly to expose the topology? Almost no
other driver does that, why do you use it in amdkfd instead of "struct
bus" and "struct device"? You totally lack uevent handling, sysfs
hierarchy integration and more. If you'd use existing infrastructure
instead of kobject directly, everything would work just fine.

2) What kind of topology is exposed? Is it nested? How deep? How many
items are usually expected? What does the sysfs tree (`tree
/sys//topology`) look like on your machine? For people without the
hardware it's nearly impossible to understand what this will look like.

3) How is the interface supposed to be used? I can see one global
char-dev where you can queue jobs by providing a GPU-ID. Why don't you
create one char-dev *per* available GPU just like all other interfaces
do? Why is this a separate thing instead of a drm_minor object that
can be added per device as a separate interface to KMS and
render-nodes? Where is the underlying "struct device" for those GPUs?

4) Why is the topology static? FWIW, you allow runtime modifications,
but I cannot see any notification mechanism for user-space? Again,
using existing driver-core would provide all that for free.

I really appreciate that you provided code instead of just ideas, but
please describe why you do things the way they are. And please provide
examples for people who do not have the hardware.

Thanks
David

> - There are still some minor code style issues I need to fix. I didn't want
>   to delay v3 any further but I will publish either v4 with those fixes,
>   or just relevant patches if the whole patch set will be merged.
>
> For people who like to review using git, the v3 patch set is located at:
> http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v3
>
> In addition, I would like to announce that we have uploaded the userspace lib
> that accompanies amdkfd. That lib is called "libhsakmt" and you can view it 
> at:
> http://cgit.freedesktop.org/~gabbayo/libhsakmt
>
> Alexey Skidanov (1):
>   amdkfd: Implement the Get Process Aperture IOCTL
>
> Andrew Lewycky (3):
>   amdkfd: Add basic modules to amdkfd
>   amdkfd: Add interrupt handling module
>   amdkfd: Implement the Set Memory Policy IOCTL
>
> Ben Goz (8):
>   amdkfd: Add queue module
>   amdkfd: Add mqd_manager module
>   amdkfd: Add kernel queue module
>   amdkfd: Add module parameter of scheduling policy
>   amdkfd: Add packet manager module
>   amdkfd: Add process queue manager module
>   amdkfd: Add device queue manager module
>   amdkfd: Implement the create/destroy/update queue IOCTLs
>
> Evgeny Pinchuk (2):
>   amdkfd: Add topology module to amdkfd
>   amdkfd: Implement the Get Clock Counters IOCTL
>
> Oded Gabbay (9):
>   drm/radeon: reduce number of free VMIDs and pipes in KV
>   drm/radeon/cik: Don't touch int of pipes 1-7
>   drm/radeon: Report doorbell configuration to amdkfd
>   drm/radeon: adding synchronization for GRBM GFX
>   drm/radeon: Add radeon <--> amdkfd interface
>   Update MAINTAINERS and CREDITS files with

[pull] radeon drm-next-3.17

2014-08-05 Thread Daniel Vetter

On Tue, Aug 05, 2014 at 12:22:06PM -0400, Alex Deucher wrote:
> Hi Dave,
> 
> This is the radeon pull request for 3.17.  Highlights:
> - Additional Hawaii fixes
> - Support for using the display scaler on non-fixed mode displays
> - Support for new firmware format that makes it easier to update
> - Enable dpm by default on additional asics
> - GPUVM improvements
> - Support for uncached and write combined gtt buffers
> - Userptr support

Aside: Where's the libdrm/mesa/whatever patches for this? I didn't see
them fly by anywhere, so I guess I've missed them on some m-l I don't
subscribe to.
-Daniel

> - Allow allocation of BOs larger than visible vram
> - Various other small fixes and improvements
> 
> The following changes since commit a91576d7916f6cce76d30303e60e1ac47cf4a76d:
> 
>   drm/ttm: Pass GFP flags in order to avoid deadlock. (2014-08-05 10:54:19 
> +1000)
> 
> are available in the git repository at:
> 
>   git://people.freedesktop.org/~agd5f/linux drm-next-3.17
> 
> for you to fetch changes up to ffd7d3a9d535933c7edfbaaac161f11628270716:
> 
>   drm/radeon: allow userptr write access under certain conditions (2014-08-05 
> 12:10:42 -0400)
> 
> 
> Alex Deucher (25):
>   drm/radeon/dpm: add support for SVI2 voltage for SI
>   drm/radeon: disable gfx cgcg on cik
>   drm/radeon: add new firmware header definitions (v3)
>   drm/radeon/si: Add support for new ucode format (v3)
>   drm/radeon/cik: Add support for new ucode format (v5)
>   drm/radeon: enable display scaling on all connectors (v2)
>   drm/radeon: consolidate vga and dvi get_modes functions (v2)
>   drm/radeon: restructure edid fetching
>   drm/radeon: use a fetch function to get the edid
>   drm/radeon: track pinned memory (v2)
>   drm/radeon: use vram/gart pinned size in radeon_gem_info_ioctl
>   drm/radeon: use vram/gart pinned size in radeon_do_test_moves
>   drm/radeon: remove visible vram size limit on bo allocation (v4)
>   drm/radeon: add a PX quirk list
>   drm/radeon: make radeon_connector_encoder_is_hbr2 static
>   drm/radeon: load the lm63 driver for an lm64 thermal chip.
>   drm/radeon: fix reversed logic in evergreen_mc_resume
>   drm/radeon/atom: add new voltage fetch function for hawaii
>   drm/radeon/dpm: handle voltage info fetching on hawaii
>   drm/radeon: re-enable dpm by default on cayman
>   drm/radeon: re-enable dpm by default on BTC
>   drm/radeon: use an intervall tree to manage the VMA v2
>   drm/radeon: use packet2 for nop on hawaii with old firmware
>   drm/radeon: tweak ACCEL_WORKING2 query for hawaii
>   drm/radeon: use packet3 for nop on hawaii with new firmware
> 
> Andreas Boll (1):
>   drm/radeon: tweak ACCEL_WORKING2 query for the new firmware for hawaii
> 
> Christian König (20):
>   drm/radeon: remove discardable flag from radeon_gem_object_create
>   drm/radeon: fix R600_PTE_GART handling
>   drm/radeon: add trace_radeon_vm_flush
>   drm/radeon: set VM base addr using the PFP v2
>   drm/radeon: separate ring and IB handling
>   drm/radeon: invalidate moved BOs in the VM (v2)
>   drm/radeon: remove radeon_bo_clear_va
>   drm/radeon: try to enable VM flushing once more
>   drm/radeon: adjust default radeon_vm_block_size v2
>   drm/radeon: remove taking mclk_lock from radeon_bo_unref
>   drm/radeon: add radeon_bo_ref function
>   drm/radeon: take a BO reference on VM cleanup
>   drm/radeon: add VM GART copy optimization to NI as well
>   drm/radeon: split PT setup in more functions
>   drm/radeon: update IB size estimation for VM
>   drm/radeon: add userptr support v7
>   drm/radeon: add userptr flag to limit it to anonymous memory v2
>   drm/radeon: add userptr flag to directly validate the BO to GTT
>   drm/radeon: add userptr flag to register MMU notifier v3
>   drm/radeon: allow userptr write access under certain conditions
> 
> Fabian Frederick (1):
>   drm/radeon: remove null test before kfree
> 
> Lauri Kasanen (1):
>   drm/radeon: Inline r100_mm_rreg, -wreg, v3
> 
> Mario Kleiner (2):
>   drm/radeon: Use pflip irqs for pageflip completion if possible. (v2)
>   drm/radeon: Prevent hdmi deep color if max_tmds_clock is undefined.
> 
> Michel Dänzer (10):
>   drm/radeon: Demote 'BO allocation size too large' message to debug only
>   drm/radeon: Remove radeon_gart_restore()
>   drm/radeon: Pass GART page flags to radeon_gart_set_page() explicitly
>   drm/radeon: Allow write-combined CPU mappings of BOs in GTT (v2)
>   drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe
>   drm/radeon: Use write-combined CPU mappings of IBs on >= CIK
>   drm/radeon/cik: Read back SDMA WPTR register after writing it
>   drm/radeon: s/ioctl_wait_idle/mmio_hpd_flush/
>   drm/radeon: Always flush the HDP cache

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #6 from Kai  ---
(In reply to comment #5)
> Are you checking radeon_pm_info while the app is running?  E.g., via ssh or
> via another X terminal?  If you switch to another VT or something like that
> there will not be any activity.  Can you try it with something simple like
> glxgears?  E.g., run `vblank_mode=0 glxgears -fullscreen` and then check
> radeon_pm_info via ssh while gears is running.

I've always checked through SSH from a second machine.

Now for your glxgears test: reclocking works (in Portal 2 as well, where I get
58-60 FPS now). The only difference is the radeon.dpm=1 on the kernel command
line. Was that expected? I thought DPM was activated automatically with your
3.17 branch (it says so during boot as well, see e.g. attachment 103996) or at
least I interpreted the "[drm] radeon: dpm initialized" line that way.

As far as I'm concerned this can be closed, though the radeon man page should
probably get a line like "setting radeon.dpm=1 is mandatory for reclocking on
the following ASICs". I'll let you decide whether this is something that should
have happened automatically (my preference) or that requires the kernel
parameter, and close/keep the report accordingly.


[Bug 82154] [HAWAII] gpu-reset when closing gwenview, fails to resume (atombios stuck executing), then flickery noise

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82154

--- Comment #4 from Luzipher  ---
(In reply to comment #3)
> Does this also happen with the drm-next-3.17-rebased-on-fixes kernel branch?
> drm-next-3.17-wip is missing a few stability fixes compared to that.

Unfortunately I couldn't test with drm-next-3.17-rebased-on-fixes, because it
has a bug somewhere in ata (null pointer dereference or some such thing) that
prevents me from booting.
Also I never reproduced the bug, it actually happened after I recompiled stuff
(especially xf86-video-ati with the v3-patch for enabling hawaii accel
available here:
http://lists.x.org/archives/xorg-driver-ati/2014-August/026534.html ). I then
couldn't get the setup where the crash happened back easily (acceleration
stopped working). But I suspect it's a "random" bug that isn't really related
to closing gwenview.

I'm now on the brand-new current "drm-next-3.17" branch, which is based on
3.16.0 final and boots fine - and should also have all the fixes and patches
(?).

I'll monitor the situation for a while. Probably this bug can be disregarded
unless I get more similar crashes with all the fixes applied and more
information how to cause it.

Sorry 'bout the noise, I thought those dmesg messages with exact atombios
commands getting stuck would reveal a possibly easy-to-fix issue, but
according to agd5f on irc they are only symptoms of the gpu not being able to
resume correctly.


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #5 from Alex Deucher  ---
Are you checking radeon_pm_info while the app is running?  E.g., via ssh or via
another X terminal?  If you switch to another VT or something like that there
will not be any activity.  Can you try it with something simple like glxgears?
E.g., run `vblank_mode=0 glxgears -fullscreen` and then check radeon_pm_info
via ssh while gears is running.
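
A sketch of how that check might be scripted, assuming a POSIX shell. The
helper name sclk_from_line and the host name "testbox" are hypothetical, the
sample line format is an assumption about what radeon_pm_info prints with dpm
active, and the debugfs card index (0 here) may differ per system.

```sh
#!/bin/sh
# Extract the engine clock (sclk) value from one line of radeon_pm_info
# output. Prints nothing if the line carries no "sclk:" field.
sclk_from_line() {
    printf '%s\n' "$1" | sed -n 's/.*sclk: *\([0-9][0-9]*\).*/\1/p'
}

# Demonstration on an assumed sample line; prints "30000".
sclk_from_line "power level avg    sclk: 30000 mclk: 15000"

# On the machines under test (not run here):
#   vblank_mode=0 glxgears -fullscreen                              # local session
#   ssh testbox 'sudo cat /sys/kernel/debug/dri/0/radeon_pm_info'   # second machine
```

Comparing the extracted value while idle and while glxgears is running shows
whether the GPU actually changes power levels under load.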


[PATCH v3 00/23] AMDKFD Kernel Driver

2014-08-05 Thread Bridgman, John

> To be clear, when we ask for open source userspace that shows how things are 
> supposed to be used we are thinking something like mesa but in this case most 
> likely something like an open source opencl implementation on top of that 
> kernel api.

Yep, understood. We're working on that too. Next should be the HSA API runtime, 
which is essentially the user mode driver for HSA that language toolchains run 
over.

I think Sumatra (Java) will probably be the first open source language runtime 
rather than OpenCL -- it's working today albeit via an older version of the HSA 
API.

Thanks,
JB



- Original Message -
From: Jerome Glisse [mailto:j.gli...@gmail.com]
Sent: Tuesday, August 05, 2014 01:51 PM Eastern Standard Time
To: Gabbay, Oded
Cc: Lewycky, Andrew; Daenzer, Michel; linux-kernel at vger.kernel.org 
; dri-devel at lists.freedesktop.org 
; Andrew Morton 
Subject: Re: [PATCH v3 00/23] AMDKFD Kernel Driver

On Tue, Aug 05, 2014 at 06:30:28PM +0300, Oded Gabbay wrote:
> Hi,
> Here is the v3 patch set of amdkfd.
> 
> This version contains changes and fixes to code, as agreed on during the 
> review
> of the v2 patch set.
> 
> The major changes are:
> 
> - There are two new module parameters: # of processes and # of queues per 
>   process. The defaults, as agreed on in the v2 review, are 32 and 128 
>   respectively. This sets the default amount of GART address space that amdkfd
>   requires to 3.5MB (3MB for userspace queues mqds and 0.5MB for other stuff,
>   such as mqd for kernel queue, hpd for pipelines, etc.)
>   
> - All the GART address space usage of amdkfd is done inside a single 
> contiguous
>   buffer that is allocated from system memory, and pinned to the start of the 
>   GART during the startup of amdkfd (which is just after the startup of 
>   radeon). The management of this buffer is done by the radeon sa manager. 
>   This buffer is not evict-able.
>   
> - Mapping of doorbells is initiated by the userspace lib (by mmap syscall), 
>   instead of initiating it from inside an ioctl (using vm_mmap).
>   
> - Removed ioctls for exclusive access to performance counters
>   
> - Added documentation about the QCM (Queue Control Management), apertures and
>   interfaces between amdkfd and radeon.
> 
> Two important notes:
> 
> - The topology patch has not been changed. Look at 
>   http://lists.freedesktop.org/archives/dri-devel/2014-July/065042.html
>   for my response. I also put my answer as an explanation in the commit msg
>   of the patch.
>   
> - There are still some minor code style issues I need to fix. I didn't want
>   to delay v3 any further but I will publish either v4 with those fixes,
>   or just relevant patches if the whole patch set will be merged.
> 
> For people who like to review using git, the v3 patch set is located at:
> http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v3
> 
> In addition, I would like to announce that we have uploaded the userspace lib
> that accompanies amdkfd. That lib is called "libhsakmt" and you can view it 
> at:
> http://cgit.freedesktop.org/~gabbayo/libhsakmt

Not commenting on the patchset yet, will try to find some time in my non-work
hours to do that. But the userspace you released is just a libdrm-like thing
and this is not what we mean when we say we need userspace that shows how the
kernel api is used.

So this library is nothing but a wrapper and has almost no value for any
serious review of the kernel api.

To be clear, when we ask for open source userspace that shows how things are
supposed to be used we are thinking something like mesa but in this case most
likely something like an open source opencl implementation on top of that
kernel api.


Btw this library code reminds me of VHDL ... though code style for a userspace
library is anybody's choice.

Cheers,
Jérôme

> 
> Alexey Skidanov (1):
>   amdkfd: Implement the Get Process Aperture IOCTL
> 
> Andrew Lewycky (3):
>   amdkfd: Add basic modules to amdkfd
>   amdkfd: Add interrupt handling module
>   amdkfd: Implement the Set Memory Policy IOCTL
> 
> Ben Goz (8):
>   amdkfd: Add queue module
>   amdkfd: Add mqd_manager module
>   amdkfd: Add kernel queue module
>   amdkfd: Add module parameter of scheduling policy
>   amdkfd: Add packet manager module
>   amdkfd: Add process queue manager module
>   amdkfd: Add device queue manager module
>   amdkfd: Implement the create/destroy/update queue IOCTLs
> 
> Evgeny Pinchuk (2):
>   amdkfd: Add topology module to amdkfd
>   amdkfd: Implement the Get Clock Counters IOCTL
> 
> Oded Gabbay (9):
>   drm/radeon: reduce number of free VMIDs and pipes in KV
>   drm/radeon/cik: Don't touch int of pipes 1-7
>   drm/radeon: Report doorbell configuration to amdkfd
>   drm/radeon: adding synchronization for GRBM GFX
>   drm/radeon: Add radeon <--> amdkfd interface
>   Update MAINTAINERS and CREDITS files with amdkfd info
>   amdkfd: Add IOCTL set definitions of amdkfd
>   amdkfd: Add amdkfd skeleton driver
>   amdkfd: Add

[PATCH v3 23/23] amdkfd: Implement the Get Process Aperture IOCTL

2014-08-05 Thread Oded Gabbay

From: Alexey Skidanov 

v3: fix debug msg

Signed-off-by: Alexey Skidanov 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 47 -
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|  5 +++
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index eba5b5d..5ee0cda 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -397,7 +397,52 @@ static long kfd_ioctl_get_clock_counters(struct file 
*filep, struct kfd_process

 static int kfd_ioctl_get_process_apertures(struct file *filp, struct 
kfd_process *p, void __user *arg)
 {
-   return -ENODEV;
+   struct kfd_ioctl_get_process_apertures_args args;
+   struct kfd_process_device *pdd;
+
+   dev_dbg(kfd_device, "get apertures for PASID %d", p->pasid);
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+   return -EFAULT;
+
+   args.num_of_nodes = 0;
+
+   mutex_lock(&p->mutex);
+
+   /*if the process-device list isn't empty*/
+   if (kfd_has_process_device_data(p)) {
+   /* Run over all pdd of the process */
+   pdd = kfd_get_first_process_device_data(p);
+   do {
+
+   args.process_apertures[args.num_of_nodes].gpu_id = 
pdd->dev->id;
+   args.process_apertures[args.num_of_nodes].lds_base = 
pdd->lds_base;
+   args.process_apertures[args.num_of_nodes].lds_limit = 
pdd->lds_limit;
+   args.process_apertures[args.num_of_nodes].gpuvm_base = 
pdd->gpuvm_base;
+   args.process_apertures[args.num_of_nodes].gpuvm_limit = 
pdd->gpuvm_limit;
+   args.process_apertures[args.num_of_nodes].scratch_base 
= pdd->scratch_base;
+   args.process_apertures[args.num_of_nodes].scratch_limit 
= pdd->scratch_limit;
+
+   dev_dbg(kfd_device, "node id %u\n", args.num_of_nodes);
+   dev_dbg(kfd_device, "gpu id %u\n", pdd->dev->id);
+   dev_dbg(kfd_device, "lds_base %llX\n", pdd->lds_base);
+   dev_dbg(kfd_device, "lds_limit %llX\n", pdd->lds_limit);
+   dev_dbg(kfd_device, "gpuvm_base %llX\n", 
pdd->gpuvm_base);
+   dev_dbg(kfd_device, "gpuvm_limit %llX\n", 
pdd->gpuvm_limit);
+   dev_dbg(kfd_device, "scratch_base %llX\n", 
pdd->scratch_base);
+   dev_dbg(kfd_device, "scratch_limit %llX\n", 
pdd->scratch_limit);
+
+   args.num_of_nodes++;
+   } while ((pdd = kfd_get_next_process_device_data(p, pdd)) != 
NULL &&
+   (args.num_of_nodes < NUM_OF_SUPPORTED_GPUS));
+   }
+
+   mutex_unlock(&p->mutex);
+
+   if (copy_to_user(arg, &args, sizeof(args)))
+   return -EFAULT;
+
+   return 0;
 }

 static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 0e3e18f..9f49f11 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -445,6 +445,11 @@ struct kfd_process_device 
*kfd_get_process_device_data(struct kfd_dev *dev,
struct kfd_process *p,
int create_pdd);

+/* Process device data iterator */
+struct kfd_process_device *kfd_get_first_process_device_data(struct 
kfd_process *p);
+struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process 
*p, struct kfd_process_device *pdd);
+bool kfd_has_process_device_data(struct kfd_process *p);
+
 /* PASIDs */
 int kfd_pasid_init(void);
 void kfd_pasid_exit(void);
-- 
1.9.1

[PATCH v3 22/23] amdkfd: Implement the Get Clock Counters IOCTL

2014-08-05 Thread Oded Gabbay

From: Evgeny Pinchuk 

Signed-off-by: Evgeny Pinchuk 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index cc7ac28..eba5b5d 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -364,7 +364,34 @@ out:

 static long kfd_ioctl_get_clock_counters(struct file *filep, struct 
kfd_process *p, void __user *arg)
 {
-   return -ENODEV;
+   struct kfd_ioctl_get_clock_counters_args args;
+   struct kfd_dev *dev;
+   struct timespec time;
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+   return -EFAULT;
+
+   dev = kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+   return -EINVAL;
+
+   /* Reading GPU clock counter from KGD */
+   args.gpu_clock_counter = kfd2kgd->get_gpu_clock_counter(dev->kgd);
+
+   /* No access to rdtsc. Using raw monotonic time */
+   getrawmonotonic(&time);
+   args.cpu_clock_counter = (uint64_t)timespec_to_ns(&time);
+
+   get_monotonic_boottime(&time);
+   args.system_clock_counter = (uint64_t)timespec_to_ns(&time);
+
+   /* Since the counter is in nano-seconds we use 1GHz frequency */
+   args.system_clock_freq = 1000000000;
+
+   if (copy_to_user(arg, &args, sizeof(args)))
+   return -EFAULT;
+
+   return 0;
 }


-- 
1.9.1
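
For readers who want to see what the two CPU-side counters in the patch above amount to, here is a minimal userspace sketch. It is not part of the patch. The helper names timespec_to_ns_u64() and read_clock_ns() are made up for illustration; only clock_gettime() is a real POSIX call. The sketch converts a timespec to nanoseconds, which is why a frequency of 1 GHz (1000000000 ticks per second) describes the counter.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Nanoseconds per second: a counter in ns ticks at 1 GHz. */
#define NS_PER_SEC 1000000000ull

/* Hypothetical helper: flatten a timespec into a 64-bit ns count. */
static uint64_t timespec_to_ns_u64(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * NS_PER_SEC + (uint64_t)ts->tv_nsec;
}

/*
 * Hypothetical helper: read the given clock and return it in ns.
 * Returns 0 on success and -1 if clock_gettime() fails.
 */
static int read_clock_ns(clockid_t id, uint64_t *out)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;

	*out = timespec_to_ns_u64(&ts);
	return 0;
}
```

On Linux, passing CLOCK_MONOTONIC_RAW or CLOCK_BOOTTIME to read_clock_ns() would be the closest userspace analogue of the raw monotonic and boot-time reads in the handler; CLOCK_MONOTONIC works on any POSIX system.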

[PATCH v3 21/23] amdkfd: Implement the Set Memory Policy IOCTL

2014-08-05 Thread Oded Gabbay

From: Andrew Lewycky 

Signed-off-by: Andrew Lewycky 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 51 -
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index 17725f6..cc7ac28 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include "kfd_priv.h"
+#include "kfd_device_queue_manager.h"

 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -310,7 +311,55 @@ static int kfd_ioctl_update_queue(struct file *filp, 
struct kfd_process *p, void

 static long kfd_ioctl_set_memory_policy(struct file *filep, struct kfd_process 
*p, void __user *arg)
 {
-   return -ENODEV;
+   struct kfd_ioctl_set_memory_policy_args args;
+   struct kfd_dev *dev;
+   int err = 0;
+   struct kfd_process_device *pdd;
+   enum cache_policy default_policy, alternate_policy;
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+   return -EFAULT;
+
+   if (args.default_policy != KFD_IOC_CACHE_POLICY_COHERENT
+   && args.default_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+   return -EINVAL;
+   }
+
+   if (args.alternate_policy != KFD_IOC_CACHE_POLICY_COHERENT
+   && args.alternate_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+   return -EINVAL;
+   }
+
+   dev = kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+   return -EINVAL;
+
+   mutex_lock(&p->mutex);
+
+   pdd = kfd_bind_process_to_device(dev, p);
+   if (IS_ERR(pdd)) {
+   err = PTR_ERR(pdd);
+   goto out;
+   }
+
+   default_policy = (args.default_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+? cache_policy_coherent : cache_policy_noncoherent;
+
+   alternate_policy = (args.alternate_policy == 
KFD_IOC_CACHE_POLICY_COHERENT)
+  ? cache_policy_coherent : cache_policy_noncoherent;
+
+   if (!dev->dqm->set_cache_memory_policy(dev->dqm,
+&pdd->qpd,
+default_policy,
+alternate_policy,
+(void __user 
*)args.alternate_aperture_base,
+args.alternate_aperture_size))
+   err = -EINVAL;
+
+out:
+   mutex_unlock(&p->mutex);
+
+   return err;
 }

 static long kfd_ioctl_get_clock_counters(struct file *filep, struct 
kfd_process *p, void __user *arg)
-- 
1.9.1

[PATCH v3 20/23] amdkfd: Implement the create/destroy/update queue IOCTLs

2014-08-05 Thread Oded Gabbay

From: Ben Goz 

v3: remove use of internal typedefs
v3: fix debug prints
v3: add checks for parameters
v3: use doorbell address from user
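
The parameter checks mentioned in the v3 notes can be sketched in plain C as below. This is only an illustration, not code from the patch. The limits MAX_QUEUE_PERCENTAGE and MAX_QUEUE_PRIORITY are assumed values standing in for KFD_MAX_QUEUE_PERCENTAGE and KFD_MAX_QUEUE_PRIORITY, whose definitions are not shown in this thread, and check_queue_args() is a hypothetical name.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits, for illustration only. */
#define MAX_QUEUE_PERCENTAGE 100u
#define MAX_QUEUE_PRIORITY   15u

/* True only for 1, 2, 4, 8, ...; zero is not a power of two. */
static bool is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical validation helper: returns 0 if the arguments are
 * acceptable and -EINVAL otherwise, mirroring the order of the
 * range checks in the create-queue handler.
 */
static int check_queue_args(uint32_t percentage, uint32_t priority,
			    uint32_t ring_size)
{
	if (percentage > MAX_QUEUE_PERCENTAGE)
		return -EINVAL;

	if (priority > MAX_QUEUE_PRIORITY)
		return -EINVAL;

	if (!is_pow2(ring_size))
		return -EINVAL;

	return 0;
}
```

The pointer checks in the handler (access_ok on the ring base and on the read/write pointers) have no userspace equivalent and are left out of the sketch.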

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c| 182 -
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |   8 +
 .../drm/radeon/amdkfd/kfd_process_queue_manager.c  |   5 +-
 include/uapi/linux/kfd_ioctl.h |   7 +-
 4 files changed, 196 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index c42f53b..17725f6 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -119,17 +119,193 @@ static int kfd_open(struct inode *inode, struct file 
*filep)

 static long kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, 
void __user *arg)
 {
-   return -ENODEV;
+   struct kfd_ioctl_create_queue_args args;
+   struct kfd_dev *dev;
+   int err = 0;
+   unsigned int queue_id;
+   struct kfd_process_device *pdd;
+   struct queue_properties q_properties;
+
+   memset(&q_properties, 0, sizeof(struct queue_properties));
+
+   if (copy_from_user(&args, arg, sizeof(args)))
+   return -EFAULT;
+
+   if (args.queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
+   pr_err("kfd: queue percentage must be between 0 to 
KFD_MAX_QUEUE_PERCENTAGE\n");
+   return -EINVAL;
+   }
+
+   if (args.queue_priority > KFD_MAX_QUEUE_PRIORITY) {
+   pr_err("kfd: queue priority must be between 0 to 
KFD_MAX_QUEUE_PRIORITY\n");
+   return -EINVAL;
+   }
+
+   if ((args.ring_base_address) &&
+   (!access_ok(VERIFY_WRITE, args.ring_base_address, sizeof(uint64_t)))) {
+   pr_err("kfd: can't access ring base address\n");
+   return -EFAULT;
+   }
+
+   if (!is_power_of_2(args.ring_size)) {
+   pr_err("kfd: ring size must be a power of 2\n");
+   return -EINVAL;
+   }
+
+   if (!access_ok(VERIFY_WRITE, args.read_pointer_address, 
sizeof(uint32_t))) {
+   pr_err("kfd: can't access read pointer\n");
+   return -EFAULT;
+   }
+
+   if (!access_ok(VERIFY_WRITE, args.write_pointer_address, 
sizeof(uint32_t))) {
+   pr_err("kfd: can't access write pointer\n");
+   return -EFAULT;
+   }
+
+   q_properties.is_interop = false;
+   q_properties.queue_percent = args.queue_percentage;
+   q_properties.priority = args.queue_priority;
+   q_properties.queue_address = args.ring_base_address;
+   q_properties.queue_size = args.ring_size;
+   q_properties.read_ptr = (uint32_t *) args.read_pointer_address;
+   q_properties.write_ptr = (uint32_t *) args.write_pointer_address;
+
+
+   pr_debug("kfd: creating queue ioctl\n");
+
+   pr_debug("Queue Percentage (%d, %d)\n",
+   q_properties.queue_percent, args.queue_percentage);
+
+   pr_debug("Queue Priority (%d, %d)\n",
+   q_properties.priority, args.queue_priority);
+
+   pr_debug("Queue Address (0x%llX, 0x%llX)\n",
+   q_properties.queue_address, args.ring_base_address);
+
+   pr_debug("Queue Size (0x%llX, %u)\n",
+   q_properties.queue_size, args.ring_size);
+
+   pr_debug("Queue r/w Pointers (0x%llX, 0x%llX)\n",
+   (uint64_t) q_properties.read_ptr,
+   (uint64_t) q_properties.write_ptr);
+
+   dev = kfd_device_by_id(args.gpu_id);
+   if (dev == NULL)
+   return -EINVAL;
+
+   mutex_lock(&p->mutex);
+
+   pdd = kfd_bind_process_to_device(dev, p);
+   if (IS_ERR(pdd)) {
+   err = PTR_ERR(pdd);
+   goto err_bind_process;
+   }
+
+   pr_debug("kfd: creating queue for PASID %d on GPU 0x%x\n",
+   p->pasid,
+   dev->id);
+
+   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, 0, KFD_QUEUE_TYPE_COMPUTE, &queue_id);
+   if (err != 0)
+   goto err_create_queue;
+
+   args.queue_id = queue_id;
+
+   /* Return gpu_id as doorbell offset for mmap usage */
+   args.doorbell_offset = args.gpu_id << PAGE_SHIFT;
+
+   if (copy_to_user(arg, &args, sizeof(args))) {
+   err = -EFAULT;
+   goto err_copy_args_out;
+   }
+
+   mutex_unlock(&p->mutex);
+
+   pr_debug("kfd: queue id %d was created successfully\n", args.queue_id);
+
+   pr_debug("ring buffer address == 0x%016llX\n",
+   args.ring_base_address);
+
+   pr_debug("read ptr address== 0x%016llX\n",
+   args.read_pointer_address);
+
+   pr_debug("write ptr address   == 0x%016llX\n",
+

[PATCH v3 19/23] amdkfd: Add interrupt handling module

2014-08-05 Thread Oded Gabbay

From: Andrew Lewycky 

This patch adds the interrupt handling module, in kfd_interrupt.c, and its
related members in different data structures to the amdkfd driver.

The amdkfd interrupt module maintains an internal interrupt ring per amdkfd
device. The internal interrupt ring contains interrupts that need further
handling. The extra handling is deferred to a later time through a workqueue.

There's no acknowledgment for the interrupts we use. The hardware simply queues
a new interrupt each time without waiting.

The fixed-size internal queue means that it's possible for us to lose
interrupts because we have no back-pressure to the hardware.
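
To make the "fixed-size ring, no back-pressure" behaviour concrete, here is a small standalone sketch of such a ring. It is not the code in kfd_interrupt.c. The ring depth IH_RING_ENTRIES, the struct layout and the function names are assumptions made for illustration; only the entry size (4 * sizeof(uint32_t)) comes from the patch. When the ring is full the new entry is simply dropped and counted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Entry size as in .ih_ring_entry_size; the depth is an assumed value. */
#define IH_ENTRY_SIZE   (4 * sizeof(uint32_t))
#define IH_RING_ENTRIES 8u /* must be a power of two */

struct ih_ring {
	uint8_t data[IH_RING_ENTRIES * IH_ENTRY_SIZE];
	unsigned int rptr;    /* consumer index, free-running */
	unsigned int wptr;    /* producer index, free-running */
	unsigned int dropped; /* entries lost because the ring was full */
};

/* Producer side: returns false (and drops the entry) when full. */
static bool ih_ring_enqueue(struct ih_ring *r, const void *entry)
{
	if (r->wptr - r->rptr == IH_RING_ENTRIES) {
		r->dropped++;
		return false;
	}

	memcpy(&r->data[(r->wptr % IH_RING_ENTRIES) * IH_ENTRY_SIZE],
	       entry, IH_ENTRY_SIZE);
	r->wptr++;
	return true;
}

/* Consumer side: returns false when there is nothing to read. */
static bool ih_ring_dequeue(struct ih_ring *r, void *entry)
{
	if (r->rptr == r->wptr)
		return false;

	memcpy(entry,
	       &r->data[(r->rptr % IH_RING_ENTRIES) * IH_ENTRY_SIZE],
	       IH_ENTRY_SIZE);
	r->rptr++;
	return true;
}
```

The sketch is single-threaded. In the driver the producer runs in the ISR under a spinlock and the consumer runs from a workqueue, so real code needs locking or memory barriers around the two indices.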

v3: change device init
v3: make sure spin lock is taken only if init is complete
v3: move bool to end of struct

Signed-off-by: Andrew Lewycky 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile|   3 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c|  23 +++-
 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c | 161 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h  |  18 ++-
 4 files changed, 200 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index e3099c8..91d5015 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -8,6 +8,7 @@ amdkfd-y:= kfd_module.o kfd_device.o kfd_chardev.o 
kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
kfd_process.o kfd_queue.o kfd_mqd_manager.o \
kfd_kernel_queue.o kfd_packet_manager.o \
-   kfd_process_queue_manager.o kfd_device_queue_manager.o
+   kfd_process_queue_manager.o kfd_device_queue_manager.o \
+   kfd_interrupt.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
index 74575de..a364c1c 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
@@ -31,6 +31,7 @@

 static const struct kfd_device_info kaveri_device_info = {
.max_pasid_bits = 16,
+   .ih_ring_entry_size = 4 * sizeof(uint32_t)
 };

 struct kfd_deviceid {
@@ -187,6 +188,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
goto kfd_topology_add_device_error;
}

+   if (kfd_interrupt_init(kfd)) {
+   dev_err(kfd_device,
+   "Error initializing interrupts for device (%x:%x)\n",
+   kfd->pdev->vendor, kfd->pdev->device);
+   goto kfd_interrupt_error;
+   }
+
if (!device_iommu_pasid_init(kfd)) {
dev_err(kfd_device,
"Error initializing iommuv2 for device (%x:%x)\n",
@@ -223,6 +231,8 @@ dqm_start_error:
 device_queue_manager_error:
amd_iommu_free_device(kfd->pdev);
 device_iommu_pasid_error:
+   kfd_interrupt_exit(kfd);
+kfd_interrupt_error:
kfd_topology_remove_device(kfd);
 kfd_topology_add_device_error:
kfd2kgd->fini_sa_manager(kfd->kgd);
@@ -238,6 +248,7 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
if (kfd->init_complete) {
device_queue_manager_uninit(kfd->dqm);
amd_iommu_free_device(kfd->pdev);
+   kfd_interrupt_exit(kfd);
kfd_topology_remove_device(kfd);
}

@@ -274,6 +285,16 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
return 0;
 }

-void kgd2kfd_interrupt(struct kfd_dev *dev, const void *ih_ring_entry)
+/* This is called directly from KGD at ISR. */
+void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 {
+   if (kfd->init_complete) {
+   spin_lock(&kfd->interrupt_lock);
+
+   if (kfd->interrupts_active
+   && enqueue_ih_ring_entry(kfd, ih_ring_entry))
+   schedule_work(&kfd->interrupt_work);
+
+   spin_unlock(&kfd->interrupt_lock);
+   }
 }
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c
new file mode 100644
index 0000000..eed43a7
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c
@@ -0,0 +1,161 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ *

[PATCH v3 18/23] amdkfd: Add device queue manager module

2014-08-05 Thread Oded Gabbay

From: Ben Goz 

The queue scheduler is divided into two sections: one section is process-bound
and the other is device-bound.
The device-bound section is handled by this module.
The DQM module handles queue setup, update and tear-down from the device side.
It also supports suspend/resume operation.

v3: change device_init
v3: use new gart allocation functions
v3: Add documentation

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile |   2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c |  28 +-
 .../drm/radeon/amdkfd/kfd_device_queue_manager.c   | 989 +
 .../drm/radeon/amdkfd/kfd_device_queue_manager.h   |  43 +
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  13 +
 5 files changed, 1073 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index b18a2b5..e3099c8 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -8,6 +8,6 @@ amdkfd-y:= kfd_module.o kfd_device.o kfd_chardev.o 
kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
kfd_process.o kfd_queue.o kfd_mqd_manager.o \
kfd_kernel_queue.o kfd_packet_manager.o \
-   kfd_process_queue_manager.o
+   kfd_process_queue_manager.o kfd_device_queue_manager.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
index ce90592..74575de 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include "kfd_priv.h"
+#include "kfd_device_queue_manager.h"

 #define MQD_SIZE_ALIGNED 768

@@ -194,12 +195,33 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
}
amd_iommu_set_invalidate_ctx_cb(kfd->pdev, 
iommu_pasid_shutdown_callback);

+   kfd->dqm = device_queue_manager_init(kfd);
+   if (!kfd->dqm) {
+   dev_err(kfd_device,
+   "Error initializing queue manager for device (%x:%x)\n",
+   kfd->pdev->vendor, kfd->pdev->device);
+   goto device_queue_manager_error;
+   }
+
+   if (kfd->dqm->start(kfd->dqm) != 0) {
+   dev_err(kfd_device,
+   "Error starting queue manager for device (%x:%x)\n",
+   kfd->pdev->vendor, kfd->pdev->device);
+   goto dqm_start_error;
+   }
+
kfd->init_complete = true;
dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
 kfd->pdev->device);

+   pr_debug("kfd: Starting kfd with the following scheduling policy %d\n", 
sched_policy);
+
goto out;

+dqm_start_error:
+   device_queue_manager_uninit(kfd->dqm);
+device_queue_manager_error:
+   amd_iommu_free_device(kfd->pdev);
 device_iommu_pasid_error:
kfd_topology_remove_device(kfd);
 kfd_topology_add_device_error:
@@ -214,6 +236,7 @@ out:
 void kgd2kfd_device_exit(struct kfd_dev *kfd)
 {
if (kfd->init_complete) {
+   device_queue_manager_uninit(kfd->dqm);
amd_iommu_free_device(kfd->pdev);
kfd_topology_remove_device(kfd);
}
@@ -225,8 +248,10 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
 {
BUG_ON(kfd == NULL);

-   if (kfd->init_complete)
+   if (kfd->init_complete) {
+   kfd->dqm->stop(kfd->dqm);
amd_iommu_free_device(kfd->pdev);
+   }
 }

 int kgd2kfd_resume(struct kfd_dev *kfd)
@@ -243,6 +268,7 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
if (err < 0)
return -ENXIO;
amd_iommu_set_invalidate_ctx_cb(kfd->pdev, 
iommu_pasid_shutdown_callback);
+   kfd->dqm->start(kfd->dqm);
}

return 0;
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c
new file mode 100644
index 0000000..2c3abd2
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c
@@ -0,0 +1,989 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE

[PATCH v3 17/23] amdkfd: Add process queue manager module

2014-08-05 Thread Oded Gabbay

From: Ben Goz 

The queue scheduler is divided into two sections: one section is process-bound
and the other is device-bound.
The process-bound section is handled by this module. The PQM handles usermode
queue setup, updates and tear-down.

v3: use kernel param to limit queues per process instead of define
v3: use doorbell address from user

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile |   3 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  17 +
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c|  16 +
 .../drm/radeon/amdkfd/kfd_process_queue_manager.c  | 343 +
 4 files changed, 378 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index b88e637..b18a2b5 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -7,6 +7,7 @@ ccflags-y := -Iinclude/drm
 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
kfd_process.o kfd_queue.o kfd_mqd_manager.o \
-   kfd_kernel_queue.o kfd_packet_manager.o
+   kfd_kernel_queue.o kfd_packet_manager.o \
+   kfd_process_queue_manager.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 34028f8..600d671 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -365,6 +365,9 @@ struct kfd_process_device {
struct kfd_dev *dev;


+   /* per-process-per device QCM data structure */
+   struct qcm_process_device qpd;
+
/*Apertures*/
uint64_t lds_base;
uint64_t lds_limit;
@@ -410,6 +413,8 @@ struct kfd_process {
 */
struct list_head per_device_data;

+   struct process_queue_manager pqm;
+
/* The process's queues. */
size_t queue_array_size;

@@ -477,11 +482,23 @@ inline uint32_t upper_32(uint64_t x);

 int init_queue(struct queue **q, struct queue_properties properties);
 void uninit_queue(struct queue *q);
+void print_queue_properties(struct queue_properties *q);
 void print_queue(struct queue *q);

 struct kernel_queue *kernel_queue_init(struct kfd_dev *dev, enum 
kfd_queue_type type);
 void kernel_queue_uninit(struct kernel_queue *kq);

+/* Process Queue Manager */
+struct process_queue_node {
+   struct queue *q;
+   struct kernel_queue *kq;
+   struct list_head process_queue_list;
+};
+
+int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p);
+void pqm_uninit(struct process_queue_manager *pqm);
+int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid);
+
 /* Packet Manager */

 #define KFD_HIQ_TIMEOUT (500)
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_process.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
index 98eba8e..b8bf15d 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_process.c
@@ -150,6 +150,9 @@ static void kfd_process_notifier_release(struct 
mmu_notifier *mn,

mutex_lock(&p->mutex);

+   /* In case our notifier is called before IOMMU notifier */
+   pqm_uninit(&p->pqm);
+
list_for_each_entry_safe(pdd, temp, &p->per_device_data, per_device_list) {
amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
list_del(&pdd->per_device_list);
@@ -214,8 +217,16 @@ static struct kfd_process *create_process(const struct 
task_struct *thread)

INIT_LIST_HEAD(&process->per_device_data);

+   err = pqm_init(&process->pqm, process);
+   if (err != 0)
+   goto err_process_pqm_init;
+
return process;

+err_process_pqm_init:
+   hash_del_rcu(&process->kfd_processes);
+   synchronize_rcu();
+   mmu_notifier_unregister_no_release(&process->mmu_notifier, process->mm);
 err_mmu_notifier:
kfd_pasid_free(process->pasid);
 err_alloc_pasid:
@@ -240,6 +251,9 @@ struct kfd_process_device 
*kfd_get_process_device_data(struct kfd_dev *dev,
pdd = kzalloc(sizeof(*pdd), GFP_KERNEL);
if (pdd != NULL) {
pdd->dev = dev;
+   INIT_LIST_HEAD(&pdd->qpd.queues_list);
+   INIT_LIST_HEAD(&pdd->qpd.priv_queue_list);
+   pdd->qpd.dqm = dev->dqm;
list_add(&pdd->per_device_list, &p->per_device_data);
}
}
@@ -299,6 +313,8 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, 
unsigned int pasid)

mutex_lock(&p->mutex);

+   pqm_uninit(&p->pqm);
+
pdd = kfd_get_process_device_data(dev, p, 0);

/*
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c
new

[PATCH v3 16/23] amdkfd: Add packet manager module

2014-08-05 Thread Oded Gabbay

From: Ben Goz 

The packet manager module builds PM4 packets for the sole use of the CP
scheduler. Those packets are used by the HIQ to submit runlists to the CP.
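
As a rough illustration of what "building a PM4 packet" starts with, the sketch below packs a type-3 header with shifts and masks. It is not the patch's build_pm4_header(), which fills a union PM4_TYPE_3_HEADER. The bit positions used here (opcode in bits 8-15, count in bits 16-29, type in bits 30-31) are an assumption based on the usual type-3 layout, and pm4_type3_header() is a made-up name. The count formula (size in dwords minus 2) is the one used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define PM4_TYPE_3 3u

/*
 * Pack a type-3 PM4 header for a packet of packet_size_bytes bytes
 * (header dword included). Assumed layout, for illustration:
 *   bits  8..15  opcode
 *   bits 16..29  count = number of dwords in the packet - 2
 *   bits 30..31  packet type (3)
 */
static uint32_t pm4_type3_header(unsigned int opcode, size_t packet_size_bytes)
{
	uint32_t count = (uint32_t)(packet_size_bytes / sizeof(uint32_t)) - 2;

	return (PM4_TYPE_3 << 30) |
	       ((count & 0x3FFFu) << 16) |
	       ((opcode & 0xFFu) << 8);
}
```

For example, a four-dword packet (16 bytes) with opcode 0x10 gives a count of 2, so the header works out to 0xC0021000 under the assumed layout.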

v3: remove include of cik_mqds.h
v3: Change lower_32/upper_32 calls to use linux macros
v3: use new gart allocation functions
v3: add documentation

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile |   2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c | 495 +
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  72 +++
 3 files changed, 568 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 020d6c7..b88e637 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -7,6 +7,6 @@ ccflags-y := -Iinclude/drm
 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
kfd_process.o kfd_queue.o kfd_mqd_manager.o \
-   kfd_kernel_queue.o
+   kfd_kernel_queue.o kfd_packet_manager.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c
new file mode 100644
index 0000000..aabc17e
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c
@@ -0,0 +1,495 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include 
+#include "kfd_device_queue_manager.h"
+#include "kfd_kernel_queue.h"
+#include "kfd_priv.h"
+#include "kfd_pm4_headers.h"
+#include "kfd_pm4_opcodes.h"
+
+static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes, 
unsigned int buffer_size_bytes)
+{
+   unsigned int temp = *wptr + increment_bytes / sizeof(uint32_t);
+
+   BUG_ON((temp * sizeof(uint32_t)) > buffer_size_bytes);
+   *wptr = temp;
+}
+
+static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size)
+{
+   union PM4_TYPE_3_HEADER header;
+
+   header.u32all = 0;
+   header.opcode = opcode;
+   header.count = packet_size/sizeof(uint32_t) - 2;
+   header.type = PM4_TYPE_3;
+
+   return header.u32all;
+}
+
+static void pm_calc_rlib_size(struct packet_manager *pm, unsigned int 
*rlib_size, bool *over_subscription)
+{
+   unsigned int process_count, queue_count;
+
+   BUG_ON(!pm || !rlib_size || !over_subscription);
+
+   process_count = pm->dqm->processes_count;
+   queue_count = pm->dqm->queue_count;
+
+   /* check if there is over subscription*/
+   *over_subscription = false;
+   if ((process_count >= VMID_PER_DEVICE) ||
+   queue_count > PIPE_PER_ME_CP_SCHEDULING * 
QUEUES_PER_PIPE) {
+   *over_subscription = true;
+   pr_debug("kfd: over subscribed runlist\n");
+   }
+
+   /* calculate run list ib allocation size */
+   *rlib_size = process_count * sizeof(struct pm4_map_process) +
+queue_count * sizeof(struct pm4_map_queues);
+
+   /* increase the allocation size in case we need a chained run list when 
over subscription */
+   if (*over_subscription)
+   *rlib_size += sizeof(struct pm4_runlist);
+
+   pr_debug("kfd: runlist ib size %d\n", *rlib_size);
+}
+
+static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int 
**rl_buffer, uint64_t *rl_gpu_buffer,
+   unsigned int *rl_buffer_size, bool *is_over_subscription)
+{
+   int retval;
+
+   BUG_ON(!pm);
+   BUG_ON(pm->allocated == true);
+   BUG_ON(is_over_subscription == NULL);
+
+   pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
+
+   retval =

[PATCH v3 15/23] amdkfd: Add module parameter of scheduling policy

2014-08-05 Thread Oded Gabbay

From: Ben Goz 

This patch adds a new parameter to the amdkfd driver. This parameter enables
the user to select the scheduling policy of the CP. The choices are:

* CP Scheduling with support for over-subscription
* CP Scheduling without support for over-subscription
* Without CP Scheduling

Note that the third option (Without CP scheduling) is only for debug purposes
and bringup of new H/W. As such, it is _not_ guaranteed to work at all times on
all H/W versions.
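
A minimal sketch of the range check this patch adds to kfd_module_init(), written as plain userspace C so it can be compiled on its own. The enum values mirror the ones added to kfd_priv.h in the diff below; the helper names kfd_sched_policy_is_valid() and kfd_check_sched_policy() are hypothetical, and returning -EINVAL is an assumption of this sketch (the patch itself returns -1).

```c
#include <errno.h>
#include <stdbool.h>

/* Mirrors enum kfd_sched_policy from kfd_priv.h in this patch. */
enum kfd_sched_policy {
	KFD_SCHED_POLICY_HWS = 0,
	KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION,
	KFD_SCHED_POLICY_NO_HWS
};

/* Hypothetical helper: true if value names one of the three policies. */
static bool kfd_sched_policy_is_valid(int value)
{
	return value >= KFD_SCHED_POLICY_HWS &&
	       value <= KFD_SCHED_POLICY_NO_HWS;
}

/*
 * Hypothetical helper: 0 for a valid policy, -EINVAL otherwise.
 * Assumption: the patch returns -1 here; -EINVAL is used only in this sketch.
 */
static int kfd_check_sched_policy(int value)
{
	return kfd_sched_policy_is_valid(value) ? 0 : -EINVAL;
}
```

Because the parameter is registered with module_param() and mode 0444, it can be set at load time (for example "modprobe amdkfd sched_policy=1") and read back through sysfs, but not changed there.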

v3: fix description
v3: change permissions to read_only
v3: verify value
v3: add documentation

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c | 12 
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   | 29 +
 2 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_module.c b/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
index a31bf03..5c58031 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
@@ -45,6 +45,11 @@ static const struct kgd2kfd_calls kgd2kfd = {
.resume = kgd2kfd_resume,
 };

+int sched_policy = KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION;
+module_param(sched_policy, int, 0444);
+MODULE_PARM_DESC(sched_policy,
+   "Kernel cmdline parameter that defines the amdkfd scheduling policy");
+
 int max_num_of_processes = KFD_MAX_NUM_OF_PROCESSES_DEFAULT;
 module_param(max_num_of_processes, int, 0444);
 MODULE_PARM_DESC(max_num_of_processes,
@@ -79,6 +84,13 @@ static int __init kfd_module_init(void)
int err;

/* Verify module parameters */
+   if ((sched_policy < KFD_SCHED_POLICY_HWS) ||
+   (sched_policy > KFD_SCHED_POLICY_NO_HWS)) {
+   pr_err("kfd: sched_policy has invalid value\n");
+   return -1;
+   }
+
+   /* Verify module parameters */
if ((max_num_of_processes < 0) ||
(max_num_of_processes > KFD_MAX_NUM_OF_PROCESSES)) {
pr_err("kfd: max_num_of_processes must be between 0 to KFD_MAX_NUM_OF_PROCESSES\n");
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 66980df..5835d07 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -65,6 +65,35 @@ extern int max_num_of_queues_per_process;
 #define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS_DEFAULT 128
 #define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS 1024

+/* Kernel module parameter to specify the scheduling policy */
+extern int sched_policy;
+
+/**
+ * enum kfd_sched_policy
+ *
+ * @KFD_SCHED_POLICY_HWS: H/W scheduling policy, known as command processor (cp)
+ * scheduling. In this scheduling mode the firmware code is used to schedule
+ * the user mode queues and kernel queues such as HIQ and DIQ.
+ * The HIQ queue is a special queue that dispatches to the cp the configuration
+ * and the list of user mode queues that are currently running.
+ * The DIQ queue is a debugging queue that dispatches debugging commands to the
+ * firmware.
+ * In this scheduling mode the user mode queues over subscription feature is enabled.
+ *
+ * @KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION: The same as above but with the over
+ * subscription feature disabled.
+ *
+ * @KFD_SCHED_POLICY_NO_HWS: No H/W scheduling policy. This mode directly
+ * sets the command processor registers and sets up the queues "manually". This mode
+ * is used *ONLY* for debugging purposes.
+ *
+ */
+enum kfd_sched_policy {
+   KFD_SCHED_POLICY_HWS = 0,
+   KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION,
+   KFD_SCHED_POLICY_NO_HWS
+};
+
 enum cache_policy {
cache_policy_coherent,
cache_policy_noncoherent
-- 
1.9.1

[PATCH v3 14/23] amdkfd: Add kernel queue module

2014-08-05 Thread Oded Gabbay

From: Ben Goz 

The kernel queue module enables the amdkfd to establish kernel queues, not
exposed to user space.

The kernel queues are used for HIQ (HSA Interface Queue) and DIQ (Debug
Interface Queue) operations.

v3: remove use of internal typedefs
v3: use new gart allocation functions

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile |   3 +-
 .../drm/radeon/amdkfd/kfd_device_queue_manager.h   | 101 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c   | 330 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h   |  66 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h| 682 +
 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h| 107 
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  33 +-
 7 files changed, 1320 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 9f8de8d..020d6c7 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -6,6 +6,7 @@ ccflags-y := -Iinclude/drm

 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
-   kfd_process.o kfd_queue.o kfd_mqd_manager.o
+   kfd_process.o kfd_queue.o kfd_mqd_manager.o \
+   kfd_kernel_queue.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h
new file mode 100644
index 000..e3a56ec
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h
@@ -0,0 +1,101 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef KFD_DEVICE_QUEUE_MANAGER_H_
+#define KFD_DEVICE_QUEUE_MANAGER_H_
+
+#include 
+#include 
+#include "kfd_priv.h"
+#include "kfd_mqd_manager.h"
+
+#define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS   (500)
+#define QUEUES_PER_PIPE                    (8)
+#define PIPE_PER_ME_CP_SCHEDULING          (3)
+#define CIK_VMID_NUM                       (8)
+#define KFD_VMID_START_OFFSET              (8)
+#define VMID_PER_DEVICE                    CIK_VMID_NUM
+#define KFD_DQM_FIRST_PIPE                 (0)
+
+struct device_process_node {
+   struct qcm_process_device *qpd;
+   struct list_head list;
+};
+
+struct device_queue_manager {
+   int (*create_queue)(struct device_queue_manager *dqm,
+   struct queue *q,
+   struct qcm_process_device *qpd,
+   int *allocate_vmid);
+   int (*destroy_queue)(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd,
+   struct queue *q);
+   int (*update_queue)(struct device_queue_manager *dqm,
+   struct queue *q);
+   int (*destroy_queues)(struct device_queue_manager *dqm);
+   struct mqd_manager * (*get_mqd_manager)(struct device_queue_manager *dqm,
+   enum KFD_MQD_TYPE type);
+   int (*execute_queues)(struct device_queue_manager *dqm);
+   int (*register_process)(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd);
+   int (*unregister_process)(struct device_queue_manager *dqm,
+

[PATCH v3 13/23] amdkfd: Add mqd_manager module

2014-08-05 Thread Oded Gabbay

From: Ben Goz 

The mqd_manager module handles MQD data structures.
MQD stands for Memory Queue Descriptor, which is used by the H/W to
keep the usermode queue state in memory.

v3: remove new typedefs
v3: remove pragma pack 4
v3: remove cik_mqds.h
v3: Change lower_32/upper_32 calls to use linux macros
v3: use new gart allocation functions
v3: Add documentation

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile  |   2 +-
 drivers/gpu/drm/radeon/amdkfd/cik_regs.h| 220 +
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c | 305 
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h |  88 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|  11 +
 5 files changed, 625 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/cik_regs.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 392728a..9f8de8d 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -6,6 +6,6 @@ ccflags-y := -Iinclude/drm

 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
-   kfd_process.o kfd_queue.o
+   kfd_process.o kfd_queue.o kfd_mqd_manager.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/cik_regs.h b/drivers/gpu/drm/radeon/amdkfd/cik_regs.h
new file mode 100644
index 000..a6404e3
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/cik_regs.h
@@ -0,0 +1,220 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef CIK_REGS_H
+#define CIK_REGS_H
+
+#define IH_VMID_0_LUT  0x3D40u
+
+#define BIF_DOORBELL_CNTL  0x530Cu
+
+#define SRBM_GFX_CNTL                        0xE44
+#define PIPEID(x)                            ((x) << 0)
+#define MEID(x)                              ((x) << 2)
+#define VMID(x)                              ((x) << 4)
+#define QUEUEID(x)                           ((x) << 8)
+
+#define SQ_CONFIG                            0x8C00
+
+#define SH_MEM_BASES                         0x8C28
+/* if PTR32, these are the bases for scratch and lds */
+#define PRIVATE_BASE(x)                      ((x) << 0) /* scratch */
+#define SHARED_BASE(x)                       ((x) << 16) /* LDS */
+#define SH_MEM_APE1_BASE                     0x8C2C
+/* if PTR32, this is the base location of GPUVM */
+#define SH_MEM_APE1_LIMIT                    0x8C30
+/* if PTR32, this is the upper limit of GPUVM */
+#define SH_MEM_CONFIG                        0x8C34
+#define PTR32                                (1 << 0)
+#define PRIVATE_ATC                          (1 << 1)
+#define ALIGNMENT_MODE(x)                    ((x) << 2)
+#define SH_MEM_ALIGNMENT_MODE_DWORD          0
+#define SH_MEM_ALIGNMENT_MODE_DWORD_STRICT   1
+#define SH_MEM_ALIGNMENT_MODE_STRICT         2
+#define SH_MEM_ALIGNMENT_MODE_UNALIGNED      3
+#define DEFAULT_MTYPE(x)                     ((x) << 4)
+#define APE1_MTYPE(x)                        ((x) << 7)
+
+/* valid for both DEFAULT_MTYPE and APE1_MTYPE */
+#define MTYPE_CACHED                         0
+#define MTYPE_NONCACHED                      3
+
+

[PATCH v3 12/23] amdkfd: Add queue module

2014-08-05 Thread Oded Gabbay

From: Ben Goz 

The queue module enables allocating and initializing queues uniformly.

v3: remove typedef
v3: break pr_debug to one line
v3: remove memset
v3: add documentation

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile|   2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h  | 123 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c |  85 +
 3 files changed, 208 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 6d6746e..392728a 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -6,6 +6,6 @@ ccflags-y := -Iinclude/drm

 amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
-   kfd_process.o
+   kfd_process.o kfd_queue.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index b55b1cb..cf6d40d 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -56,7 +56,6 @@ extern int max_num_of_queues_per_process;
 #define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS_DEFAULT 128
 #define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS 1024

-
 struct kfd_device_info {
const struct kfd_scheduler_class *scheduler_class;
unsigned int max_pasid_bits;
@@ -116,6 +115,128 @@ void kfd_chardev_exit(void);
 struct device *kfd_chardev(void);


+/**
+ * enum kfd_queue_type
+ *
+ * @KFD_QUEUE_TYPE_COMPUTE: Regular user mode queue type.
+ *
+ * @KFD_QUEUE_TYPE_SDMA: Sdma user mode queue type.
+ *
+ * @KFD_QUEUE_TYPE_HIQ: HIQ queue type.
+ *
+ * @KFD_QUEUE_TYPE_DIQ: DIQ queue type.
+ */
+enum kfd_queue_type  {
+   KFD_QUEUE_TYPE_COMPUTE,
+   KFD_QUEUE_TYPE_SDMA,
+   KFD_QUEUE_TYPE_HIQ,
+   KFD_QUEUE_TYPE_DIQ
+};
+
+/**
+ * struct queue_properties
+ *
+ * @type: The queue type.
+ *
+ * @queue_id: Queue identifier.
+ *
+ * @queue_address: Queue ring buffer address.
+ *
+ * @queue_size: Queue ring buffer size.
+ *
+ * @priority: Defines the queue priority relative to other queues in the process.
+ * This is just an indication and HW scheduling may override the priority as
+ * necessary while keeping the relative prioritization.
+ * The priority granularity is from 0 to f, where f is the highest priority.
+ * Currently all queues are initialized with the highest priority.
+ *
+ * @queue_percent: This field is partially implemented and currently a zero in
+ * this field defines that the queue is non active.
+ *
+ * @read_ptr: User space address which points to the number of dwords the
+ * cp read from the ring buffer. This field is updated automatically by the H/W.
+ *
+ * @write_ptr: Defines the number of dwords written to the ring buffer.
+ *
+ * @doorbell_ptr: This field notifies the H/W of a new packet written to
+ * the queue ring buffer. This field should be similar to write_ptr and the user
+ * should update this field after updating the write_ptr.
+ *
+ * @doorbell_off: The doorbell offset in the doorbell pci-bar.
+ *
+ * @is_interop: Defines if this is an interop queue. Interop queue means that the
+ * queue can access both graphics and compute resources.
+ *
+ * @is_active: Defines if the queue is active or not.
+ *
+ * @vmid: If the scheduling mode is no cp scheduling, this field defines the vmid
+ * of the queue.
+ *
+ * This structure represents the queue properties for each queue no matter if
+ * it's user mode or kernel mode queue.
+ *
+ */
+struct queue_properties {
+   enum kfd_queue_type type;
+   unsigned int queue_id;
+   uint64_t queue_address;
+   uint64_t  queue_size;
+   uint32_t priority;
+   uint32_t queue_percent;
+   uint32_t *read_ptr;
+   uint32_t *write_ptr;
+   uint32_t *doorbell_ptr;
+   uint32_t doorbell_off;
+   bool is_interop;
+   bool is_active;
+   /* Not relevant for user mode queues in cp scheduling */
+   unsigned int vmid;
+};
+
+/**
+ * struct queue
+ *
+ * @list: Queue linked list.
+ *
+ * @mqd: The queue MQD.
+ *
+ * @mqd_mem_obj: The MQD local gpu memory object.
+ *
+ * @gart_mqd_addr: The MQD gart mc address.
+ *
+ * @properties: The queue properties.
+ *
+ * @mec: Used only in no cp scheduling mode and identifies the micro engine id
+ * that the queue should be executed on.
+ *
+ * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe id.
+ *
+ * @queue: Used only in no cp scheduling mode and identifies the queue's slot.
+ *
+ * @process: The kfd process that created this queue.
+ *
+ * @device: The kfd device that created this queue.
+ *
+ * This structure represents user mode compute queues.
+ * It contains all the necessary data to handle such queues.
+ *
+ */
+

[PATCH v3 11/23] amdkfd: Add binding/unbinding calls to amd_iommu driver

2014-08-05 Thread Oded Gabbay

This patch adds the functions to bind and unbind pasid
from a device through the amd_iommu driver.

The unbind function is called when the mm_struct of the
process is released.

The bind function is not called here because it is called
only in the IOCTLs which are not yet implemented at this
stage of the patchset.
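
The PASID limit that device_iommu_pasid_init() computes in the diff below is the smallest of three bounds: what the device supports, what the IOMMU reports, and the doorbell process limit minus one (the last PASID is used for kernel queue doorbells). A standalone sketch of that arithmetic follows. The function name kfd_clamp_pasid_limit() and its plain-integer parameters are hypothetical stand-ins for the kfd_dev and amd_iommu_device_info fields, and the sketch assumes max_pasid_bits is below 32 and doorbell_process_limit is at least 1.

```c
/* Hypothetical helper: smaller of two unsigned values. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical standalone version of the clamp in device_iommu_pasid_init().
 * Assumes max_pasid_bits < 32 and doorbell_process_limit >= 1.
 */
static unsigned int kfd_clamp_pasid_limit(unsigned int max_pasid_bits,
					  unsigned int iommu_max_pasids,
					  unsigned int doorbell_process_limit)
{
	unsigned int limit;

	/* Bound by what the device and the IOMMU both support. */
	limit = min_uint(1u << max_pasid_bits, iommu_max_pasids);

	/* Keep the last doorbell slot free for kernel queues. */
	return min_uint(limit, doorbell_process_limit - 1);
}
```

The same value is what the resume path later passes back to amd_iommu_init_device() via kfd_get_pasid_limit().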

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c  | 86 -
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|  1 +
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c | 12 
 3 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
index 8e2b075..ce90592 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
@@ -98,6 +98,63 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev)
return kfd;
 }

+static bool device_iommu_pasid_init(struct kfd_dev *kfd)
+{
+   const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP | AMD_IOMMU_DEVICE_FLAG_PRI_SUP
+   | AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
+
+   struct amd_iommu_device_info iommu_info;
+   unsigned int pasid_limit;
+   int err;
+
+   err = amd_iommu_device_info(kfd->pdev, &iommu_info);
+   if (err < 0) {
+   dev_err(kfd_device, "error getting iommu info. is the iommu enabled?\n");
+   return false;
+   }
+
+   if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
+   dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
+  (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
+  (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
+  (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0);
+   return false;
+   }
+
+   pasid_limit = min_t(unsigned int,
+   (unsigned int)1 << kfd->device_info->max_pasid_bits,
+   iommu_info.max_pasids);
+   /*
+* last pasid is used for kernel queues doorbells
+* in the future the last pasid might be used for a kernel thread.
+*/
+   pasid_limit = min_t(unsigned int,
+   pasid_limit,
+   kfd->doorbell_process_limit - 1);
+
+   err = amd_iommu_init_device(kfd->pdev, pasid_limit);
+   if (err < 0) {
+   dev_err(kfd_device, "error initializing iommu device\n");
+   return false;
+   }
+
+   if (!kfd_set_pasid_limit(pasid_limit)) {
+   dev_err(kfd_device, "error setting pasid limit\n");
+   amd_iommu_free_device(kfd->pdev);
+   return false;
+   }
+
+   return true;
+}
+
+static void iommu_pasid_shutdown_callback(struct pci_dev *pdev, int pasid)
+{
+   struct kfd_dev *dev = kfd_device_by_pci_dev(pdev);
+
+   if (dev)
+   kfd_unbind_process_from_device(dev, pasid);
+}
+
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
 const struct kgd2kfd_shared_resources *gpu_resources)
 {
@@ -129,6 +186,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
goto kfd_topology_add_device_error;
}

+   if (!device_iommu_pasid_init(kfd)) {
+   dev_err(kfd_device,
+   "Error initializing iommuv2 for device (%x:%x)\n",
+   kfd->pdev->vendor, kfd->pdev->device);
+   goto device_iommu_pasid_error;
+   }
+   amd_iommu_set_invalidate_ctx_cb(kfd->pdev, iommu_pasid_shutdown_callback);

kfd->init_complete = true;
dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
@@ -136,6 +200,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,

goto out;

+device_iommu_pasid_error:
+   kfd_topology_remove_device(kfd);
 kfd_topology_add_device_error:
kfd2kgd->fini_sa_manager(kfd->kgd);
dev_err(kfd_device,
@@ -147,7 +213,10 @@ out:

 void kgd2kfd_device_exit(struct kfd_dev *kfd)
 {
-   kfd_topology_remove_device(kfd);
+   if (kfd->init_complete) {
+   amd_iommu_free_device(kfd->pdev);
+   kfd_topology_remove_device(kfd);
+   }

kfree(kfd);
 }
@@ -155,12 +224,27 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 void kgd2kfd_suspend(struct kfd_dev *kfd)
 {
BUG_ON(kfd == NULL);
+
+   if (kfd->init_complete)
+   amd_iommu_free_device(kfd->pdev);
 }

 int kgd2kfd_resume(struct kfd_dev *kfd)
 {
+   unsigned int pasid_limit;
+   int err;
+
BUG_ON(kfd == NULL);

+   pasid_limit = kfd_get_pasid_limit();
+
+   if (kfd->init_complete) {
+   err = amd_iommu_init_device(kfd->pdev, pasid_limit);
+   if (err < 0)
+   return -ENXIO;
+   amd_iommu_set_invalidate_ctx_cb(kfd->pdev,

[PATCH v3 10/23] amdkfd: Add basic modules to amdkfd

2014-08-05 Thread Oded Gabbay

From: Andrew Lewycky 

This patch adds the process module and three helper modules:

- kfd_process, which handles processes that open /dev/kfd

- kfd_doorbell, which provides helper functions for doorbell allocation,
  release and mapping to userspace

- kfd_pasid, which provides helper functions for pasid allocation and release

- kfd_aperture, which provides helper functions for managing the LDS, Local GPU
  memory and Scratch memory apertures of the process

This patch only contains the basic kfd_process module, which doesn't contain
the reference to the queue scheduler. This was done to allow easier code review.

Also, this patch doesn't contain the calls to the IOMMU driver for binding the
pasid to the device. Again, this was done to allow easier code review.

The kfd_process object is created when a process opens /dev/kfd and is closed
when the mm_struct of that process is torn down.

v3: Remove kfd_vidmem file
v3: Replace direct mmput call to mmu_notifier release
v3: remove typedefs
v3: move bool to end of struct
v3: Add new kernel params for gart usage limitation
v3: init sa manager
v3: fix debug msgs
v3: remove support for LDS in 32 bit
v3: Change code to support mmap of doorbell pages from userspace
v3: Add documentation for apertures

Signed-off-by: Andrew Lewycky 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile   |   4 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c | 350 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c  |  31 ++-
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c   |  45 +++-
 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c | 236 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c   |  32 ++-
 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c|  95 
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 141 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c  | 319 
 drivers/gpu/drm/radeon/radeon_kfd.c  |  21 +-
 10 files changed, 1247 insertions(+), 27 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_process.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 08ecfcd..6d6746e 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -4,6 +4,8 @@

 ccflags-y := -Iinclude/drm

-amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o
+amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
+   kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
+   kfd_process.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c b/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c
new file mode 100644
index 000..8cfb720
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c
@@ -0,0 +1,350 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "kfd_priv.h"
+#include 
+#include 
+#include 
+
+/*
+ * The primary memory I/O features being added for revisions of gfxip
+ * beyond 7.0 (Kaveri) are:
+ *
+ * Access to ATC/IOMMU mapped memory w/ associated extension of VA to 48b
+ *
+ * "Flat" shader memory access - These are new shader vector memory operations
+ * that do not reference a T#/V# so a "pointer" is what is sourced from the
+ * vector gprs for direct access to memory.  This pointer space has the
+ * Shared(LDS) and Private(Scratch) memory mapped into this pointer space as
+ * apertures.  The hardware then determines how to direct the memory request
+ * based on what

[PATCH v3 09/23] amdkfd: Add topology module to amdkfd

2014-08-05 Thread Oded Gabbay

From: Evgeny Pinchuk 

This patch adds the topology module to the driver. The topology is exposed to
userspace through the sysfs.

The calls to add and remove a device to/from topology are done by the radeon
driver.

The CPU information provided in the topology section of the amdkfd driver is
extracted from the CRAT table, unlike the CPU information located in
/sys/devices/system/cpu/cpu*, which is extracted from the SRAT table.

While the CPU information provided by the CRAT and the SRAT tables might be
identical, the node topology might be different. The SRAT table contains the
topology of CPU nodes only. The CRAT table contains the topology of CPU and GPU
nodes together (and they can be interleaved). For example, CPU node 1 in SRAT
can be CPU node 3 in CRAT. It is also worth mentioning that the CRAT table
contains only HSA compatible nodes (nodes which are compliant with the HSA
spec).

To recap, amdkfd exposes a different kind of topology than the one exposed by
/sys/devices/system/cpu/cpu*, even though it may contain similar information.
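
To illustrate how the same CPU node can carry different numbers in the two tables, here is a toy model in which CRAT-style nodes are numbered in one sequence that interleaves CPUs and GPUs, while the SRAT-style index counts CPU nodes only. The node layout, enum node_kind and crat_index_of_cpu() are all made up for illustration; real node IDs come from the ACPI tables themselves and need not follow this positional rule.

```c
#include <stddef.h>

/* Hypothetical node kinds for a toy CRAT-style node list. */
enum node_kind { NODE_CPU, NODE_GPU };

/* Made-up layout: CPU, GPU, GPU, CPU (mixed-list positions 0..3). */
static const enum node_kind example_nodes[] = {
	NODE_CPU, NODE_GPU, NODE_GPU, NODE_CPU
};

/*
 * Returns the position in the mixed (CRAT-style) list of the CPU node whose
 * CPU-only (SRAT-style) index is cpu_index, or -1 if there is no such node.
 */
static int crat_index_of_cpu(const enum node_kind *nodes, size_t count,
			     size_t cpu_index)
{
	size_t seen = 0;

	for (size_t i = 0; i < count; i++) {
		if (nodes[i] != NODE_CPU)
			continue;
		if (seen == cpu_index)
			return (int)i;
		seen++;
	}
	return -1;
}
```

With this layout, crat_index_of_cpu(example_nodes, 4, 1) yields 3, which is the "CPU node 1 in SRAT can be CPU node 3 in CRAT" case described above.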

Signed-off-by: Evgeny Pinchuk 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/amdkfd/Makefile   |2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_crat.h |  294 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c   |7 +
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c   |7 +
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h |   17 +
 drivers/gpu/drm/radeon/amdkfd/kfd_topology.c | 1207 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_topology.h |  168 
 7 files changed, 1701 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_crat.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_topology.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_topology.h

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
index 9564e75..08ecfcd 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -4,6 +4,6 @@

 ccflags-y := -Iinclude/drm

-amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o
+amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o

 obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h b/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h
new file mode 100644
index 000..a374fa3
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h
@@ -0,0 +1,294 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef KFD_CRAT_H_INCLUDED
+#define KFD_CRAT_H_INCLUDED
+
+#include 
+
+#pragma pack(1)
+
+/*
+ * 4CC signature values for the CRAT and CDIT ACPI tables
+ */
+
+#define CRAT_SIGNATURE "CRAT"
+#define CDIT_SIGNATURE "CDIT"
+
+/*
+ * Component Resource Association Table (CRAT)
+ */
+
+#define CRAT_OEMID_LENGTH  6
+#define CRAT_OEMTABLEID_LENGTH 8
+#define CRAT_RESERVED_LENGTH   6
+
+#define CRAT_OEMID_64BIT_MASK ((1ULL << (CRAT_OEMID_LENGTH * 8)) - 1)
+
+struct crat_header {
+   uint32_t signature;
+   uint32_t length;
+   uint8_t revision;
+   uint8_t checksum;
+   uint8_t oem_id[CRAT_OEMID_LENGTH];
+   uint8_t oem_table_id[CRAT_OEMTABLEID_LENGTH];
+   uint32_t oem_revision;
+   uint32_t creator_id;
+   uint32_t creator_revision;
+   uint32_t total_entries;
+   uint16_t num_domains;
+   uint8_t reserved[CRAT_RESERVED_LENGTH];
+};
+
+/*
+ * The header structure is immediately followed by total_entries of the
+ * data definitions
+ */
+
+/*
+ * The currently defined subtype entries in the CRAT
+ */
+#define CRAT_SUBTYPE_COMPUTEUNIT_AFFINITY  0
+#define CRAT_SUBTYPE_MEMORY_AFFINITY   1
+#define CRAT_SUBTYPE_CACHE_AFFINITY2
+#define CRAT_SUBTYPE_TLB_AFFINITY  3
+#define

[PATCH v3 08/23] amdkfd: Add amdkfd skeleton driver

2014-08-05 Thread Oded Gabbay

This patch adds the amdkfd skeleton driver. The driver does nothing except
define a /dev/kfd device.

It returns -ENODEV on all amdkfd IOCTLs.

(v3) move bool to end of struct

(v3) remove pmc ioctls

(v3) add meaningful error message for ioctl error

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/Kconfig  |   2 +
 drivers/gpu/drm/radeon/Makefile |   2 +
 drivers/gpu/drm/radeon/amdkfd/Kconfig   |  10 ++
 drivers/gpu/drm/radeon/amdkfd/Makefile  |   9 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 187 
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c  | 129 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c  |  98 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h|  82 
 8 files changed, 519 insertions(+)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/Kconfig
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/Makefile
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_module.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h

diff --git a/drivers/gpu/drm/radeon/Kconfig b/drivers/gpu/drm/radeon/Kconfig
index 970f8e9..b697321 100644
--- a/drivers/gpu/drm/radeon/Kconfig
+++ b/drivers/gpu/drm/radeon/Kconfig
@@ -6,3 +6,5 @@ config DRM_RADEON_UMS

  Userspace modesetting is deprecated for quite some time now, so
  enable this only if you have ancient versions of the DDX drivers.
+
+source "drivers/gpu/drm/radeon/amdkfd/Kconfig"
diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 1476103..8ab6b6e 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -112,4 +112,6 @@ radeon-$(CONFIG_ACPI) += radeon_acpi.o

 obj-$(CONFIG_DRM_RADEON) += radeon.o

+obj-$(CONFIG_HSA_RADEON) += amdkfd/
+
 CFLAGS_radeon_trace_points.o := -I$(src)
diff --git a/drivers/gpu/drm/radeon/amdkfd/Kconfig 
b/drivers/gpu/drm/radeon/amdkfd/Kconfig
new file mode 100644
index 000..900bb34
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/Kconfig
@@ -0,0 +1,10 @@
+#
+# Heterogeneous system architecture configuration
+#
+
+config HSA_RADEON
+   tristate "HSA kernel driver for AMD Radeon devices"
+   depends on DRM_RADEON && AMD_IOMMU_V2 && X86_64
+   default m
+   help
+ Enable this if you want to use HSA features on AMD radeon devices.
diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile 
b/drivers/gpu/drm/radeon/amdkfd/Makefile
new file mode 100644
index 000..9564e75
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -0,0 +1,9 @@
+#
+# Makefile for Heterogeneous System Architecture support for AMD radeon devices
+#
+
+ccflags-y := -Iinclude/drm
+
+amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o
+
+obj-$(CONFIG_HSA_RADEON)   += amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
new file mode 100644
index 000..f198e5a
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -0,0 +1,187 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "kfd_priv.h"
+
+static long kfd_ioctl(struct file *, unsigned int, unsigned long);
+static int kfd_open(struct inode *, struct file *);
+
+static const char kfd_dev_name[] = "kfd";
+
+static const struct file_operations kfd_fops = {
+   .owner = THIS_MODULE,
+   .unlocked_ioctl = kfd_ioctl,
+   .compat_ioctl = kfd_ioctl,
+   .open = kfd_open,
+};
+
+static int kfd_char_dev_major = -1;
+static struct class *kfd_class;
+struct device *kfd_device;
+
+int kfd_chardev_init(void)
+{
+   int err = 0;
+
+

[PATCH v3 07/23] amdkfd: Add IOCTL set definitions of amdkfd

2014-08-05 Thread Oded Gabbay

- KFD_IOC_GET_VERSION:
Retrieves the interface version of amdkfd

- KFD_IOC_CREATE_QUEUE:
Creates a usermode queue that runs on a specific GPU device

- KFD_IOC_DESTROY_QUEUE:
Destroys an existing usermode queue

- KFD_IOC_SET_MEMORY_POLICY:
Sets the memory policy of the default and alternate aperture of the
calling process

- KFD_IOC_GET_CLOCK_COUNTERS:
Retrieves counters (timestamps) of CPU and GPU

- KFD_IOC_GET_PROCESS_APERTURES:
Retrieves information about process apertures that were initialized
during the open() call of the amdkfd device

- KFD_IOC_UPDATE_QUEUE:
Updates configuration of an existing usermode queue

(v3) remove pragma pack

(v3) remove pmc ioctls

(v3) add parameter for doorbell offset

(v3) add comment on counters

Signed-off-by: Oded Gabbay 
---
 include/uapi/linux/kfd_ioctl.h | 123 +
 1 file changed, 123 insertions(+)
 create mode 100644 include/uapi/linux/kfd_ioctl.h

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
new file mode 100644
index 000..a06e021
--- /dev/null
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -0,0 +1,123 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef KFD_IOCTL_H_INCLUDED
+#define KFD_IOCTL_H_INCLUDED
+
+#include 
+#include 
+
+#define KFD_IOCTL_CURRENT_VERSION 1
+
+struct kfd_ioctl_get_version_args {
+   uint32_t min_supported_version; /* from KFD */
+   uint32_t max_supported_version; /* from KFD */
+};
+
+/* For kfd_ioctl_create_queue_args.queue_type. */
+#define KFD_IOC_QUEUE_TYPE_COMPUTE   0
+#define KFD_IOC_QUEUE_TYPE_SDMA  1
+
+struct kfd_ioctl_create_queue_args {
+   uint64_t ring_base_address; /* to KFD */
+   uint64_t write_pointer_address; /* from KFD */
+   uint64_t read_pointer_address;  /* from KFD */
+   uint64_t doorbell_offset;   /* from KFD */
+
+   uint32_t ring_size; /* to KFD */
+   uint32_t gpu_id;            /* to KFD */
+   uint32_t queue_type;        /* to KFD */
+   uint32_t queue_percentage;  /* to KFD */
+   uint32_t queue_priority;    /* to KFD */
+   uint32_t queue_id;  /* from KFD */
+};
+
+struct kfd_ioctl_destroy_queue_args {
+   uint32_t queue_id;  /* to KFD */
+};
+
+struct kfd_ioctl_update_queue_args {
+   uint64_t ring_base_address; /* to KFD */
+
+   uint32_t queue_id;  /* to KFD */
+   uint32_t ring_size; /* to KFD */
+   uint32_t queue_percentage;  /* to KFD */
+   uint32_t queue_priority;    /* to KFD */
+};
+
+/* For kfd_ioctl_set_memory_policy_args.default_policy and alternate_policy */
+#define KFD_IOC_CACHE_POLICY_COHERENT 0
+#define KFD_IOC_CACHE_POLICY_NONCOHERENT 1
+
+struct kfd_ioctl_set_memory_policy_args {
+   uint64_t alternate_aperture_base;   /* to KFD */
+   uint64_t alternate_aperture_size;   /* to KFD */
+
+   uint32_t gpu_id;            /* to KFD */
+   uint32_t default_policy;    /* to KFD */
+   uint32_t alternate_policy;  /* to KFD */
+};
+
+/*
+ * All counters are monotonic. They are used for profiling of compute jobs.
+ * The profiling is done by userspace.
+ *
+ * In case of GPU reset, the counter should not be affected.
+ */
+
+struct kfd_ioctl_get_clock_counters_args {
+   uint64_t gpu_clock_counter; /* from KFD */
+   uint64_t cpu_clock_counter; /* from KFD */
+   uint64_t system_clock_counter;  /* from KFD */
+   uint64_t system_clock_freq; /* from KFD */
+
+   uint32_t gpu_id;            /* to KFD */
+};
+
+#define NUM_OF_SUPPORTED_GPUS 7
+
+struct kfd_process_device_apertures {
+   uint64_t lds_base;  /* from KFD */
+   uint64_t lds_limit; /* from KFD */
+

[PATCH v3 06/23] Update MAINTAINERS and CREDITS files with amdkfd info

2014-08-05 Thread Oded Gabbay

Signed-off-by: Oded Gabbay 
---
 CREDITS |  7 +++
 MAINTAINERS | 10 ++
 2 files changed, 17 insertions(+)

diff --git a/CREDITS b/CREDITS
index 28ee151..e9628d5 100644
--- a/CREDITS
+++ b/CREDITS
@@ -1197,6 +1197,13 @@ S: R. Tocantins, 89 - Cristo Rei
 S: 80050-430 - Curitiba - Paraná
 S: Brazil

+N: Oded Gabbay
+E: oded.gabbay at gmail.com
+D: AMD KFD maintainer
+S: 12 Shraga Raphaeli
+S: Petah-Tikva, 4906418
+S: Israel
+
 N: Kumar Gala
 E: galak at kernel.crashing.org
 D: Embedded PowerPC 6xx/7xx/74xx/82xx/83xx/85xx support
diff --git a/MAINTAINERS b/MAINTAINERS
index d76e077..da3aecb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -589,6 +589,16 @@ F: drivers/crypto/geode*
 F: drivers/video/geode/
 F: arch/x86/include/asm/geode.h

+AMD KFD (radeon extension)
+M: Oded Gabbay 
+L: dri-devel at lists.freedesktop.org
+T: git git://people.freedesktop.org/~gabbayo/linux.git
+S: Supported
+F: drivers/gpu/drm/radeon/amdkfd/*
+F: drivers/gpu/drm/radeon/radeon_kfd.c
+F: drivers/gpu/drm/radeon/radeon_kfd.h
+F: include/uapi/linux/kfd_ioctl.h
+
 AMD IOMMU (AMD-VI)
 M: Joerg Roedel 
 L: iommu at lists.linux-foundation.org
-- 
1.9.1

[PATCH v3 05/23] drm/radeon: Add radeon <--> amdkfd interface

2014-08-05 Thread Oded Gabbay

This patch adds the interface between the radeon driver and the amdkfd driver.
The interface implementation is contained in radeon_kfd.c and radeon_kfd.h.

The interface itself is represented by a pointer to struct
kfd_dev. The pointer is located inside radeon_device structure.

All the register accesses that amdkfd needs are done using this interface. This
allows us to avoid direct register accesses in amdkfd proper, while also
avoiding locking between amdkfd and radeon.

The single exception is the doorbells that are used in both of the drivers.
However, because they are located in separate pci bar pages, the danger of
sharing registers between the drivers is minimal.

Having said that, we are planning to move the doorbells as well to radeon.

(v3)
Add interface for sa manager init and fini. The init function allocates a
buffer in system memory and pins it to the GART address space via the radeon sa
manager.

All mappings of buffers to GART address space are done via the radeon sa
manager. The memory allocation interface uses the radeon sa manager to
sub-allocate from the single buffer that was allocated by the init function.

(v3) Change lower_32/upper_32 calls to use linux macros

(v3) Add documentation for the interface

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/Makefile |   1 +
 drivers/gpu/drm/radeon/cik.c|   9 +
 drivers/gpu/drm/radeon/cik_reg.h|  65 +
 drivers/gpu/drm/radeon/cikd.h   |  51 +++-
 drivers/gpu/drm/radeon/radeon.h |   4 +
 drivers/gpu/drm/radeon/radeon_drv.c |   5 +
 drivers/gpu/drm/radeon/radeon_kfd.c | 538 
 drivers/gpu/drm/radeon/radeon_kfd.h | 177 
 drivers/gpu/drm/radeon/radeon_kms.c |   7 +
 9 files changed, 856 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.c
 create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.h

diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index c7fa1ae..1476103 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -104,6 +104,7 @@ radeon-y += \
radeon_vce.o \
vce_v1_0.o \
vce_v2_0.o \
+   radeon_kfd.o

 radeon-$(CONFIG_COMPAT) += radeon_ioc32.o
 radeon-$(CONFIG_VGA_SWITCHEROO) += radeon_atpx_handler.o
diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 0096538..f4a65de 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -32,6 +32,7 @@
 #include "cik_blit_shaders.h"
 #include "radeon_ucode.h"
 #include "clearstate_ci.h"
+#include "radeon_kfd.h"

 MODULE_FIRMWARE("radeon/BONAIRE_pfp.bin");
 MODULE_FIRMWARE("radeon/BONAIRE_me.bin");
@@ -7766,6 +7767,9 @@ restart_ih:
while (rptr != wptr) {
/* wptr/rptr are in bytes! */
ring_index = rptr / 4;
+
+   radeon_kfd_interrupt(rdev, (const void *) &rdev->ih.ring[ring_index]);
+
src_id =  le32_to_cpu(rdev->ih.ring[ring_index]) & 0xff;
src_data = le32_to_cpu(rdev->ih.ring[ring_index + 1]) & 
0xfff;
ring_id = le32_to_cpu(rdev->ih.ring[ring_index + 2]) & 0xff;
@@ -8453,6 +8457,10 @@ static int cik_startup(struct radeon_device *rdev)
if (r)
return r;

+   r = radeon_kfd_resume(rdev);
+   if (r)
+   return r;
+
return 0;
 }

@@ -8501,6 +8509,7 @@ int cik_resume(struct radeon_device *rdev)
  */
 int cik_suspend(struct radeon_device *rdev)
 {
+   radeon_kfd_suspend(rdev);
radeon_pm_suspend(rdev);
dce6_audio_fini(rdev);
radeon_vm_manager_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/cik_reg.h b/drivers/gpu/drm/radeon/cik_reg.h
index ca1bb61..1ab3dbc 100644
--- a/drivers/gpu/drm/radeon/cik_reg.h
+++ b/drivers/gpu/drm/radeon/cik_reg.h
@@ -147,4 +147,69 @@

 #define CIK_LB_DESKTOP_HEIGHT 0x6b0c

+struct cik_hqd_registers {
+   u32 cp_mqd_base_addr;
+   u32 cp_mqd_base_addr_hi;
+   u32 cp_hqd_active;
+   u32 cp_hqd_vmid;
+   u32 cp_hqd_persistent_state;
+   u32 cp_hqd_pipe_priority;
+   u32 cp_hqd_queue_priority;
+   u32 cp_hqd_quantum;
+   u32 cp_hqd_pq_base;
+   u32 cp_hqd_pq_base_hi;
+   u32 cp_hqd_pq_rptr;
+   u32 cp_hqd_pq_rptr_report_addr;
+   u32 cp_hqd_pq_rptr_report_addr_hi;
+   u32 cp_hqd_pq_wptr_poll_addr;
+   u32 cp_hqd_pq_wptr_poll_addr_hi;
+   u32 cp_hqd_pq_doorbell_control;
+   u32 cp_hqd_pq_wptr;
+   u32 cp_hqd_pq_control;
+   u32 cp_hqd_ib_base_addr;
+   u32 cp_hqd_ib_base_addr_hi;
+   u32 cp_hqd_ib_rptr;
+   u32 cp_hqd_ib_control;
+   u32 cp_hqd_iq_timer;
+   u32 cp_hqd_iq_rptr;
+   u32 cp_hqd_dequeue_request;
+   u32 cp_hqd_dma_offload;
+   u32 cp_hqd_sema_cmd;
+   u32 cp_hqd_msg_type;
+   u32 cp_hqd_atomic0_preop_lo;
+   u32 cp_hqd_atomic0_preop_hi;
+   u32 cp_hqd_atomic1_preop_lo;
+   u32

[PATCH v3 04/23] drm/radeon: adding synchronization for GRBM GFX

2014-08-05 Thread Oded Gabbay

Implement a lock for selecting and accessing shader engines and arrays.
This lock ensures that radeon and amdkfd do not collide when accessing
shader engines and arrays through the GRBM_GFX_INDEX register.

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/cik.c   | 26 ++
 drivers/gpu/drm/radeon/radeon.h|  2 ++
 drivers/gpu/drm/radeon/radeon_device.c |  1 +
 3 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index d54d3d7..0096538 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -1563,6 +1563,8 @@ static const u32 godavari_golden_registers[] =

 static void cik_init_golden_registers(struct radeon_device *rdev)
 {
+   /* Some of the registers might be dependent on GRBM_GFX_INDEX */
+   mutex_lock(&rdev->grbm_idx_mutex);
switch (rdev->family) {
case CHIP_BONAIRE:
radeon_program_register_sequence(rdev,
@@ -1637,6 +1639,7 @@ static void cik_init_golden_registers(struct 
radeon_device *rdev)
default:
break;
}
+   mutex_unlock(&rdev->grbm_idx_mutex);
 }

 /**
@@ -3419,6 +3422,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
u32 disabled_rbs = 0;
u32 enabled_rbs = 0;

+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < se_num; i++) {
for (j = 0; j < sh_per_se; j++) {
cik_select_se_sh(rdev, i, j);
@@ -3430,6 +3434,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
}
}
cik_select_se_sh(rdev, 0x, 0x);
+   mutex_unlock(&rdev->grbm_idx_mutex);

mask = 1;
for (i = 0; i < max_rb_num_per_se * se_num; i++) {
@@ -3440,6 +3445,7 @@ static void cik_setup_rb(struct radeon_device *rdev,

rdev->config.cik.backend_enable_mask = enabled_rbs;

+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < se_num; i++) {
cik_select_se_sh(rdev, i, 0x);
data = 0;
@@ -3467,6 +3473,7 @@ static void cik_setup_rb(struct radeon_device *rdev,
WREG32(PA_SC_RASTER_CONFIG, data);
}
cik_select_se_sh(rdev, 0x, 0x);
+   mutex_unlock(&rdev->grbm_idx_mutex);
 }

 /**
@@ -3684,6 +3691,12 @@ static void cik_gpu_init(struct radeon_device *rdev)
/* set HW defaults for 3D engine */
WREG32(CP_MEQ_THRESHOLDS, MEQ1_START(0x30) | MEQ2_START(0x60));

+   mutex_lock(&rdev->grbm_idx_mutex);
+   /*
+* making sure that the following register writes will be broadcast
+* to all the shaders
+*/
+   cik_select_se_sh(rdev, 0x, 0x);
WREG32(SX_DEBUG_1, 0x20);

WREG32(TA_CNTL_AUX, 0x0001);
@@ -3739,6 +3752,7 @@ static void cik_gpu_init(struct radeon_device *rdev)

WREG32(PA_CL_ENHANCE, CLIP_VTX_REORDER_ENA | NUM_CLIP_SEQ(3));
WREG32(PA_SC_ENHANCE, ENABLE_PA_SC_OUT_OF_ORDER);
+   mutex_unlock(&rdev->grbm_idx_mutex);

udelay(50);
 }
@@ -6036,6 +6050,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device 
*rdev)
u32 i, j, k;
u32 mask;

+   mutex_lock(&rdev->grbm_idx_mutex);
for (i = 0; i < rdev->config.cik.max_shader_engines; i++) {
for (j = 0; j < rdev->config.cik.max_sh_per_se; j++) {
cik_select_se_sh(rdev, i, j);
@@ -6047,6 +6062,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device 
*rdev)
}
}
cik_select_se_sh(rdev, 0x, 0x);
+   mutex_unlock(&rdev->grbm_idx_mutex);

mask = SE_MASTER_BUSY_MASK | GC_MASTER_BUSY | TC0_MASTER_BUSY | 
TC1_MASTER_BUSY;
for (k = 0; k < rdev->usec_timeout; k++) {
@@ -6181,10 +6197,12 @@ static int cik_rlc_resume(struct radeon_device *rdev)
WREG32(RLC_LB_CNTR_INIT, 0);
WREG32(RLC_LB_CNTR_MAX, 0x8000);

+   mutex_lock(&rdev->grbm_idx_mutex);
cik_select_se_sh(rdev, 0x, 0x);
WREG32(RLC_LB_INIT_CU_MASK, 0x);
WREG32(RLC_LB_PARAMS, 0x00600408);
WREG32(RLC_LB_CNTL, 0x8004);
+   mutex_unlock(&rdev->grbm_idx_mutex);

WREG32(RLC_MC_CNTL, 0);
WREG32(RLC_UCODE_CNTL, 0);
@@ -6251,11 +6269,13 @@ static void cik_enable_cgcg(struct radeon_device *rdev, 
bool enable)

tmp = cik_halt_rlc(rdev);

+   mutex_lock(&rdev->grbm_idx_mutex);
cik_select_se_sh(rdev, 0x, 0x);
WREG32(RLC_SERDES_WR_CU_MASTER_MASK, 0x);
WREG32(RLC_SERDES_WR_NONCU_MASTER_MASK, 0x);
tmp2 = BPM_ADDR_MASK | CGCG_OVERRIDE_0 | CGLS_ENABLE;
WREG32(RLC_SERDES_WR_CTRL, tmp2);
+   mutex_unlock(&rdev->grbm_idx_mutex);

cik_update_rlc(rdev, tmp);

@@ -6297,11 +6317,13 @@ static void cik_enable_mgcg(struct radeon_device *rdev, 
bool enable)

tmp = cik_halt_rlc(rdev);

+

[PATCH v3 03/23] drm/radeon: Report doorbell configuration to amdkfd

2014-08-05 Thread Oded Gabbay

radeon and amdkfd share the doorbell aperture.
radeon sets it up, takes the doorbells required for its own rings
and reports the setup to amdkfd.
radeon reserved doorbells are at the start of the doorbell aperture.

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/radeon.h|  4 
 drivers/gpu/drm/radeon/radeon_device.c | 31 +++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 511191f..75bcc04 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -691,6 +691,10 @@ struct radeon_doorbell {

 int radeon_doorbell_get(struct radeon_device *rdev, u32 *page);
 void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell);
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+ phys_addr_t *aperture_base,
+ size_t *aperture_size,
+ size_t *start_offset);

 /*
  * IRQS.
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index c58f84f..827bcd1 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -373,6 +373,37 @@ void radeon_doorbell_free(struct radeon_device *rdev, u32 
doorbell)
__clear_bit(doorbell, rdev->doorbell.used);
 }

+/**
+ * radeon_doorbell_get_kfd_info - Report doorbell configuration required to
+ *setup KFD
+ *
+ * @rdev: radeon_device pointer
+ * @aperture_base: output returning doorbell aperture base physical address
+ * @aperture_size: output returning doorbell aperture size in bytes
+ * @start_offset: output returning # of doorbell bytes reserved for radeon.
+ *
+ * Radeon and the KFD share the doorbell aperture. Radeon sets it up,
+ * takes doorbells required for its own rings and reports the setup to KFD.
+ * Radeon reserved doorbells are at the start of the doorbell aperture.
+ */
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+ phys_addr_t *aperture_base,
+ size_t *aperture_size,
+ size_t *start_offset)
+{
+   /* The first num_doorbells are used by radeon.
+* KFD takes whatever's left in the aperture. */
+   if (rdev->doorbell.size > rdev->doorbell.num_doorbells * sizeof(u32)) {
+   *aperture_base = rdev->doorbell.base;
+   *aperture_size = rdev->doorbell.size;
+   *start_offset = rdev->doorbell.num_doorbells * sizeof(u32);
+   } else {
+   *aperture_base = 0;
+   *aperture_size = 0;
+   *start_offset = 0;
+   }
+}
+
 /*
  * radeon_wb_*()
  * Writeback is the the method by which the the GPU updates special pages
-- 
1.9.1

[PATCH v3 02/23] drm/radeon/cik: Don't touch int of pipes 1-7

2014-08-05 Thread Oded Gabbay

amdkfd should set interrupts for pipes 1-7.

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/cik.c | 71 +---
 1 file changed, 1 insertion(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 9571be8..d54d3d7 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -7265,8 +7265,7 @@ static int cik_irq_init(struct radeon_device *rdev)
 int cik_irq_set(struct radeon_device *rdev)
 {
u32 cp_int_cntl;
-   u32 cp_m1p0, cp_m1p1, cp_m1p2, cp_m1p3;
-   u32 cp_m2p0, cp_m2p1, cp_m2p2, cp_m2p3;
+   u32 cp_m1p0;
u32 crtc1 = 0, crtc2 = 0, crtc3 = 0, crtc4 = 0, crtc5 = 0, crtc6 = 0;
u32 hpd1, hpd2, hpd3, hpd4, hpd5, hpd6;
u32 grbm_int_cntl = 0;
@@ -7300,13 +7299,6 @@ int cik_irq_set(struct radeon_device *rdev)
dma_cntl1 = RREG32(SDMA0_CNTL + SDMA1_REGISTER_OFFSET) & ~TRAP_ENABLE;

cp_m1p0 = RREG32(CP_ME1_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p1 = RREG32(CP_ME1_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p2 = RREG32(CP_ME1_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m1p3 = RREG32(CP_ME1_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p0 = RREG32(CP_ME2_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p1 = RREG32(CP_ME2_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p2 = RREG32(CP_ME2_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;
-   cp_m2p3 = RREG32(CP_ME2_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE;

if (rdev->flags & RADEON_IS_IGP)
thermal_int = RREG32_SMC(CG_THERMAL_INT_CTRL) &
@@ -7328,33 +7320,6 @@ int cik_irq_set(struct radeon_device *rdev)
case 0:
cp_m1p0 |= TIME_STAMP_INT_ENABLE;
break;
-   case 1:
-   cp_m1p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   default:
-   DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe 
%d\n", ring->pipe);
-   break;
-   }
-   } else if (ring->me == 2) {
-   switch (ring->pipe) {
-   case 0:
-   cp_m2p0 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 1:
-   cp_m2p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
default:
DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe 
%d\n", ring->pipe);
break;
@@ -7371,33 +7336,6 @@ int cik_irq_set(struct radeon_device *rdev)
case 0:
cp_m1p0 |= TIME_STAMP_INT_ENABLE;
break;
-   case 1:
-   cp_m1p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m1p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   default:
-   DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe 
%d\n", ring->pipe);
-   break;
-   }
-   } else if (ring->me == 2) {
-   switch (ring->pipe) {
-   case 0:
-   cp_m2p0 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 1:
-   cp_m2p1 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 2:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
-   case 3:
-   cp_m2p2 |= TIME_STAMP_INT_ENABLE;
-   break;
default:
DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe 
%d\n", ring->pipe);
break;
@@ -7486,13 +7424,6 @@ int cik_irq_set(struct radeon_device *rdev)
WREG32(SDMA0_CNTL +

[PATCH v3 01/23] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-08-05 Thread Oded Gabbay

To support HSA on KV, we need to limit the number of vmids and pipes
that are available for radeon's use with KV.

This patch reserves VMIDs 8-15 for amdkfd (so radeon can only use VMIDs
0-7) and also makes radeon think that KV has only a single MEC with a single
pipe in it.

(v3) Use define for static vmid allocation in radeon

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/cik.c  | 48 +--
 drivers/gpu/drm/radeon/cikd.h |  2 ++
 2 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index b625646..9571be8 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4660,12 +4660,11 @@ static int cik_mec_init(struct radeon_device *rdev)
/*
 * KV:2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
 * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
+* Nonetheless, we assign only 1 pipe because all other pipes will
+* be handled by KFD
 */
-   if (rdev->family == CHIP_KAVERI)
-   rdev->mec.num_mec = 2;
-   else
-   rdev->mec.num_mec = 1;
-   rdev->mec.num_pipe = 4;
+   rdev->mec.num_mec = 1;
+   rdev->mec.num_pipe = 1;
rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;

if (rdev->mec.hpd_eop_obj == NULL) {
@@ -4807,28 +4806,24 @@ static int cik_cp_compute_resume(struct radeon_device 
*rdev)

/* init the pipes */
mutex_lock(&rdev->srbm_mutex);
-   for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
-   int me = (i < 4) ? 1 : 2;
-   int pipe = (i < 4) ? i : (i - 4);

-   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i * MEC_HPD_SIZE * 
2);
+   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;

-   cik_srbm_select(rdev, me, pipe, 0, 0);
+   cik_srbm_select(rdev, 0, 0, 0, 0);

-   /* write the EOP addr */
-   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
-   WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 
8);
+   /* write the EOP addr */
+   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
+   WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);

-   /* set the VMID assigned */
-   WREG32(CP_HPD_EOP_VMID, 0);
+   /* set the VMID assigned */
+   WREG32(CP_HPD_EOP_VMID, 0);
+
+   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
+   tmp = RREG32(CP_HPD_EOP_CONTROL);
+   tmp &= ~EOP_SIZE_MASK;
+   tmp |= order_base_2(MEC_HPD_SIZE / 8);
+   WREG32(CP_HPD_EOP_CONTROL, tmp);

-   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
-   tmp = RREG32(CP_HPD_EOP_CONTROL);
-   tmp &= ~EOP_SIZE_MASK;
-   tmp |= order_base_2(MEC_HPD_SIZE / 8);
-   WREG32(CP_HPD_EOP_CONTROL, tmp);
-   }
-   cik_srbm_select(rdev, 0, 0, 0, 0);
mutex_unlock(&rdev->srbm_mutex);

/* init the queues.  Just two for now. */
@@ -5874,8 +5869,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct 
radeon_ib *ib)
  */
 int cik_vm_init(struct radeon_device *rdev)
 {
-   /* number of VMs */
-   rdev->vm_manager.nvm = 16;
+   /*
+* number of VMs
+* VMID 0 is reserved for System
+* radeon graphics/compute will use VMIDs 1-7
+* amdkfd will use VMIDs 8-15
+*/
+   rdev->vm_manager.nvm = RADEON_NUM_OF_VMIDS;
/* base offset of vram pages */
if (rdev->flags & RADEON_IS_IGP) {
u64 tmp = RREG32(MC_VM_FB_OFFSET);
diff --git a/drivers/gpu/drm/radeon/cikd.h b/drivers/gpu/drm/radeon/cikd.h
index 0c6e1b5..fae4d0c 100644
--- a/drivers/gpu/drm/radeon/cikd.h
+++ b/drivers/gpu/drm/radeon/cikd.h
@@ -30,6 +30,8 @@
 #define CIK_RB_BITMAP_WIDTH_PER_SH 2
 #define HAWAII_RB_BITMAP_WIDTH_PER_SH  4

+#define RADEON_NUM_OF_VMIDS 8
+
 /* DIDT IND registers */
 #define DIDT_SQ_CTRL0 0x0
 #   define DIDT_CTRL_EN   (1 << 0)
-- 
1.9.1

[PATCH v3 00/23] AMDKFD Kernel Driver

2014-08-05 Thread Oded Gabbay

Hi,
Here is the v3 patch set of amdkfd.

This version contains changes and fixes to code, as agreed on during the review
of the v2 patch set.

The major changes are:

- There are two new module parameters: # of processes and # of queues per 
  process. The defaults, as agreed on in the v2 review, are 32 and 128 
  respectively. This sets the default amount of GART address space that amdkfd
  requires to 3.5MB (3MB for userspace queues mqds and 0.5MB for other stuff,
  such as mqd for kernel queue, hpd for pipelines, etc.)

- All the GART address space usage of amdkfd is done inside a single contiguous
  buffer that is allocated from system memory, and pinned to the start of the 
  GART during the startup of amdkfd (which is just after the startup of 
  radeon). The management of this buffer is done by the radeon sa manager. 
  This buffer is not evictable.

- Mapping of doorbells is initiated by the userspace lib (by mmap syscall), 
  instead of initiating it from inside an ioctl (using vm_mmap).

- Removed ioctls for exclusive access to performance counters

- Added documentation about the QCM (Queue Control Management), apertures and
  interfaces between amdkfd and radeon.

Two important notes:

- The topology patch has not been changed. Look at 
  http://lists.freedesktop.org/archives/dri-devel/2014-July/065042.html
  for my response. I also put my answer as an explanation in the commit msg
  of the patch.

- There are still some minor code style issues I need to fix. I didn't want
  to delay v3 any further but I will publish either v4 with those fixes,
  or just the relevant patches if the whole patch set is merged.

For people who like to review using git, the v3 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v3

In addition, I would like to announce that we have uploaded the userspace lib
that accompanies amdkfd. That lib is called "libhsakmt" and you can view it at:
http://cgit.freedesktop.org/~gabbayo/libhsakmt

Alexey Skidanov (1):
  amdkfd: Implement the Get Process Aperture IOCTL

Andrew Lewycky (3):
  amdkfd: Add basic modules to amdkfd
  amdkfd: Add interrupt handling module
  amdkfd: Implement the Set Memory Policy IOCTL

Ben Goz (8):
  amdkfd: Add queue module
  amdkfd: Add mqd_manager module
  amdkfd: Add kernel queue module
  amdkfd: Add module parameter of scheduling policy
  amdkfd: Add packet manager module
  amdkfd: Add process queue manager module
  amdkfd: Add device queue manager module
  amdkfd: Implement the create/destroy/update queue IOCTLs

Evgeny Pinchuk (2):
  amdkfd: Add topology module to amdkfd
  amdkfd: Implement the Get Clock Counters IOCTL

Oded Gabbay (9):
  drm/radeon: reduce number of free VMIDs and pipes in KV
  drm/radeon/cik: Don't touch int of pipes 1-7
  drm/radeon: Report doorbell configuration to amdkfd
  drm/radeon: adding synchronization for GRBM GFX
  drm/radeon: Add radeon <--> amdkfd interface
  Update MAINTAINERS and CREDITS files with amdkfd info
  amdkfd: Add IOCTL set definitions of amdkfd
  amdkfd: Add amdkfd skeleton driver
  amdkfd: Add binding/unbinding calls to amd_iommu driver

 CREDITS|7 +
 MAINTAINERS|   10 +
 drivers/gpu/drm/radeon/Kconfig |2 +
 drivers/gpu/drm/radeon/Makefile|3 +
 drivers/gpu/drm/radeon/amdkfd/Kconfig  |   10 +
 drivers/gpu/drm/radeon/amdkfd/Makefile |   14 +
 drivers/gpu/drm/radeon/amdkfd/cik_regs.h   |  220 
 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c   |  350 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c|  511 +
 drivers/gpu/drm/radeon/amdkfd/kfd_crat.h   |  294 +
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c |  300 +
 .../drm/radeon/amdkfd/kfd_device_queue_manager.c   |  989 
 .../drm/radeon/amdkfd/kfd_device_queue_manager.h   |  144 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c   |  236 
 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c  |  161 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c   |  330 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h   |   66 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c |  147 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c|  305 +
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h|   88 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c |  495 
 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c  |   95 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h|  682 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h|  107 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   |  560 +
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c|  347 ++
 .../drm/radeon/amdkfd/kfd_process_queue_manager.c  |  346 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c  |   85 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_topology.c   |

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #4 from Kai  ---
Created attachment 104094
  --> https://bugs.freedesktop.org/attachment.cgi?id=104094&action=edit
dmesg with radeon.dpm=1 set

Here you go. The last power state entry in dmesg is:
> switching from power state:
>  ui class: performance
>  internal class: none
>  caps: 
>  uvdvclk: 0 dclk: 0
>  power level 0    sclk: 3 mclk: 15000 pcie gen: 3 pcie lanes: 16
>  power level 1    sclk: 98000 mclk: 125000 pcie gen: 3 pcie lanes: 16
>  status: c r 
> switching to power state:
>  ui class: performance
>  internal class: none
>  caps: 
>  uvdvclk: 0 dclk: 0
>  power level 0    sclk: 3 mclk: 15000 pcie gen: 3 pcie lanes: 16
>  power level 1    sclk: 98000 mclk: 125000 pcie gen: 3 pcie lanes: 16
>  status: c r

[pull] radeon drm-next-3.17

2014-08-05 Thread Alex Deucher

Hi Dave,

This is the radeon pull request for 3.17.  Highlights:
- Additional Hawaii fixes
- Support for using the display scaler on non-fixed mode displays
- Support for new firmware format that makes it easier to update
- Enable dpm by default on additional asics
- GPUVM improvements
- Support for uncached and write combined gtt buffers
- Allow allocation of BOs larger than visible vram
- Various other small fixes and improvements

Drop the userptr stuff for now pending further discussion.

The following changes since commit a91576d7916f6cce76d30303e60e1ac47cf4a76d:

  drm/ttm: Pass GFP flags in order to avoid deadlock. (2014-08-05 10:54:19 +1000)

are available in the git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-next-3.17

for you to fetch changes up to 9f51e2e04f74608adec9957df97684a37a4cd375:

  drm/radeon: Prevent hdmi deep color if max_tmds_clock is undefined. (2014-08-05 11:22:54 -0400)


Alex Deucher (25):
  drm/radeon/dpm: add support for SVI2 voltage for SI
  drm/radeon: disable gfx cgcg on cik
  drm/radeon: add new firmware header definitions (v3)
  drm/radeon/si: Add support for new ucode format (v3)
  drm/radeon/cik: Add support for new ucode format (v5)
  drm/radeon: enable display scaling on all connectors (v2)
  drm/radeon: consolidate vga and dvi get_modes functions (v2)
  drm/radeon: restructure edid fetching
  drm/radeon: use a fetch function to get the edid
  drm/radeon: track pinned memory (v2)
  drm/radeon: use vram/gart pinned size in radeon_gem_info_ioctl
  drm/radeon: use vram/gart pinned size in radeon_do_test_moves
  drm/radeon: remove visible vram size limit on bo allocation (v4)
  drm/radeon: add a PX quirk list
  drm/radeon: make radeon_connector_encoder_is_hbr2 static
  drm/radeon: load the lm63 driver for an lm64 thermal chip.
  drm/radeon: fix reversed logic in evergreen_mc_resume
  drm/radeon/atom: add new voltage fetch function for hawaii
  drm/radeon/dpm: handle voltage info fetching on hawaii
  drm/radeon: re-enable dpm by default on cayman
  drm/radeon: re-enable dpm by default on BTC
  drm/radeon: use an intervall tree to manage the VMA v2
  drm/radeon: use packet2 for nop on hawaii with old firmware
  drm/radeon: tweak ACCEL_WORKING2 query for hawaii
  drm/radeon: use packet3 for nop on hawaii with new firmware

Andreas Boll (1):
  drm/radeon: tweak ACCEL_WORKING2 query for the new firmware for hawaii

Christian König (15):
  drm/radeon: remove discardable flag from radeon_gem_object_create
  drm/radeon: fix R600_PTE_GART handling
  drm/radeon: add trace_radeon_vm_flush
  drm/radeon: set VM base addr using the PFP v2
  drm/radeon: separate ring and IB handling
  drm/radeon: invalidate moved BOs in the VM (v2)
  drm/radeon: remove radeon_bo_clear_va
  drm/radeon: try to enable VM flushing once more
  drm/radeon: adjust default radeon_vm_block_size v2
  drm/radeon: remove taking mclk_lock from radeon_bo_unref
  drm/radeon: add radeon_bo_ref function
  drm/radeon: take a BO reference on VM cleanup
  drm/radeon: add VM GART copy optimization to NI as well
  drm/radeon: split PT setup in more functions
  drm/radeon: update IB size estimation for VM

Fabian Frederick (1):
  drm/radeon: remove null test before kfree

Lauri Kasanen (1):
  drm/radeon: Inline r100_mm_rreg, -wreg, v3

Mario Kleiner (2):
  drm/radeon: Use pflip irqs for pageflip completion if possible. (v2)
  drm/radeon: Prevent hdmi deep color if max_tmds_clock is undefined.

Michel Dänzer (10):
  drm/radeon: Demote 'BO allocation size too large' message to debug only
  drm/radeon: Remove radeon_gart_restore()
  drm/radeon: Pass GART page flags to radeon_gart_set_page() explicitly
  drm/radeon: Allow write-combined CPU mappings of BOs in GTT (v2)
  drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe
  drm/radeon: Use write-combined CPU mappings of IBs on >= CIK
  drm/radeon/cik: Read back SDMA WPTR register after writing it
  drm/radeon: s/ioctl_wait_idle/mmio_hpd_flush/
  drm/radeon: Always flush the HDP cache before submitting a CS to the GPU
  drm/radeon: Only flush HDP cache from idle ioctl if BO is in VRAM

Stefan Brüns (2):
  drm/radeon: Use correct value for unknown audio/video latency
  drm/radeon/audio: break out of loops once we match connector

 drivers/gpu/drm/Kconfig|   1 +
 drivers/gpu/drm/radeon/Makefile|   2 +-
 drivers/gpu/drm/radeon/atombios_encoders.c |  16 +-
 drivers/gpu/drm/radeon/ci_dpm.c|  13 +-
 drivers/gpu/drm/radeon/ci_smc.c|  39 +-
 drivers/gpu/drm/radeon/cik.c   | 722 ++---
 drivers/gpu/drm/radeon/cik_sdma.c  | 247 ++

[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #8 from sarnex  ---
(In reply to comment #7)
> (In reply to comment #6)
> > I tried running a 3D game using the terminal to monitor stdout, and this was
> > constantly spammed as well. Not sure if it gives any additional information.
> > 
> > radeon: Failed to allocate a buffer:
> > radeon:size  : 0 bytes
> > radeon:alignment : 4096 bytes
> > radeon:domains   : 4
> > radeon:flags : 4
> 
> Yeah, pretty much the same message from userspace.
> 
> Looks like a bug in the userspace driver somewhere. Simplest thing to find
> it would be to attach a debugger and get a backtrace when the message is
> printed.

Hi, thanks for the response. I don't really know what I'm doing with GDB so
I'll explain what I did to get this output. I installed all of the dbg mesa
packages from the PPA. Then, I ran the command "LIBGL_DEBUG=verbose gdb
glxgears", and when the error printed (immediately), I pressed Ctrl+C and then
typed bt full. If there's another way to do this please let me know.

http://pastebin.com/v1nA7JWY

[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory v2

2014-08-05 Thread Jerome Glisse

On Tue, Aug 05, 2014 at 07:45:21PM +0200, Christian König wrote:
> On 05.08.2014 at 19:39, Jerome Glisse wrote:
> >On Tue, Aug 05, 2014 at 06:05:29PM +0200, Christian König wrote:
> >>From: Christian König
> >>
> >>Avoid problems with writeback by limiting userptr to anonymous memory.
> >>
> >>v2: add commit and code comments
> >I guess, i have not expressed myself clearly. This is bogus, you pretend
> >you want to avoid writeback issue but you still allow userspace to map
> >file backed pages (which by the way might be a regular bo object from
> >another device for instance and that would be fun).
> >
> >So this patch is a no go and i would rather see that this userptr to
> >be restricted to anon vma only no matter what. No flags here.
> 
> Mapping of non anonymous memory (e.g. everything get_user_pages won't fail
> with) is restricted to read only access by the GPU.
> 
> I'm fine with making it a hard requirement for all mappings if you say it's
> a must have.
> 

Well, for the time being you should force read only. The way you implement
write is broken. Here is how it can be abused to allow writes to a file
backed mmap.

mmap(fixedaddress, fixedsize, NOFD)
userptr_ioctl(fixedaddress, RADEON_GEM_USERPTR_ANONONLY)
// bo is created successfully because fixedaddress is part of an anon vma
munmap(fixedaddress, fixedsize)
// radeon gets the mmu_notifier range_start callback and unbinds the pages
// from the bo, but radeon does not know there was an unmap.
mmap(fixedaddress, fixedsize, fd_to_this_read_only_file_i_want_to_write_to)
radeon_ioctl_use_my_userptrbo
// bo is bound again by radeon and, because all flags are set at creation,
// it is mapped with write permission, allowing someone to write to a file
// that might be read only for the user.
//
// Script kiddies it's time to learn about gpu ...
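
The sequence above can be modelled in a few lines of user-space C. This is
only a sketch under stated assumptions: struct vma_model, struct userptr_bo,
bo_create(), bo_invalidate(), bind_cached() and bind_rechecked() are
hypothetical names, not radeon code. It contrasts deciding the write
permission once at creation with re-checking the backing on every bind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of what currently backs the user address range. */
struct vma_model {
	bool mapped;      /* false after munmap() */
	bool file_backed; /* true for a file mapping, false for anonymous memory */
};

/* Hypothetical model of a userptr BO. */
struct userptr_bo {
	bool anon_at_creation; /* result of the check done at creation time */
	bool bound;
	bool gpu_can_write;
};

/* Creation succeeds only for anonymous memory, as with ANONONLY. */
static bool bo_create(struct userptr_bo *bo, const struct vma_model *vma)
{
	assert(bo != NULL && vma != NULL);
	if (!vma->mapped || vma->file_backed)
		return false;
	bo->anon_at_creation = true;
	bo->bound = false;
	bo->gpu_can_write = false;
	return true;
}

/* Models the notifier callback: pages are dropped, creation flags stay. */
static void bo_invalidate(struct userptr_bo *bo)
{
	bo->bound = false;
	bo->gpu_can_write = false;
}

/* Broken variant: trusts the decision made at creation time. */
static bool bind_cached(struct userptr_bo *bo, const struct vma_model *vma)
{
	if (!vma->mapped)
		return false;
	bo->bound = true;
	bo->gpu_can_write = bo->anon_at_creation;
	return true;
}

/* Safer variant: re-checks what backs the range on every bind. */
static bool bind_rechecked(struct userptr_bo *bo, const struct vma_model *vma)
{
	if (!vma->mapped)
		return false;
	bo->bound = true;
	bo->gpu_can_write = !vma->file_backed;
	return true;
}
```

Replaying the steps (create on anonymous memory, invalidate, remap the same
range from a file, bind again) leaves gpu_can_write set with bind_cached()
and cleared with bind_rechecked().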

Of course if you had this patch (kind of selling my own junk here):

http://www.spinics.net/lists/linux-mm/msg75878.html

then you could know inside the range_start that you should remove the
write permission and that it should be rechecked on next bind.

Note that i have not read much of your code so maybe you handle this
case somehow.

Cheers,
Jérôme

> Christian.
> 
> >
> >Cheers,
> >Jérôme
> >
> >>Signed-off-by: Christian König
> >>---
> >>  drivers/gpu/drm/radeon/radeon_gem.c |  3 ++-
> >>  drivers/gpu/drm/radeon/radeon_ttm.c | 10 ++
> >>  include/uapi/drm/radeon_drm.h   |  1 +
> >>  3 files changed, 13 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> >>b/drivers/gpu/drm/radeon/radeon_gem.c
> >>index 993ab22..032736b 100644
> >>--- a/drivers/gpu/drm/radeon/radeon_gem.c
> >>+++ b/drivers/gpu/drm/radeon/radeon_gem.c
> >>@@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, 
> >>void *data,
> >>return -EACCES;
> >>/* reject unknown flag values */
> >>-   if (args->flags & ~RADEON_GEM_USERPTR_READONLY)
> >>+   if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
> >>+   RADEON_GEM_USERPTR_ANONONLY))
> >>return -EINVAL;
> >>/* readonly pages not tested on older hardware */
> >>diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
> >>b/drivers/gpu/drm/radeon/radeon_ttm.c
> >>index 0109090..54eb7bc 100644
> >>--- a/drivers/gpu/drm/radeon/radeon_ttm.c
> >>+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> >>@@ -542,6 +542,16 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt 
> >>*ttm)
> >>   ttm->num_pages * PAGE_SIZE))
> >>return -EFAULT;
> >>+   if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
> >>+   /* check that we only pin down anonymous memory
> >>+  to prevent problems with writeback */
> >>+   unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
> >>+   struct vm_area_struct *vma;
> >>+   vma = find_vma(gtt->usermm, gtt->userptr);
> >>+   if (!vma || vma->vm_file || vma->vm_end < end)
> >>+   return -EPERM;
> >>+   }
> >>+
> >>do {
> >>unsigned num_pages = ttm->num_pages - pinned;
> >>uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE;
> >>diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
> >>index 3a9f209..9720e1a 100644
> >>--- a/include/uapi/drm/radeon_drm.h
> >>+++ b/include/uapi/drm/radeon_drm.h
> >>@@ -816,6 +816,7 @@ struct drm_radeon_gem_create {
> >>   * perform any operation.
> >>   */
> >>  #define RADEON_GEM_USERPTR_READONLY   (1 << 0)
> >>+#define RADEON_GEM_USERPTR_ANONONLY(1 << 1)
> >>  struct drm_radeon_gem_userptr {
> >>uint64_taddr;
> >>-- 
> >>1.9.1

[PATCH 5/5] drm/radeon: allow userptr write access under certain conditions

2014-08-05 Thread Christian König

From: Christian König

It needs to be anonymous memory (no file mappings)
and we are required to install an MMU notifier.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 2a6fbf1..01b5894 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -285,19 +285,24 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
if (offset_in_page(args->addr | args->size))
return -EINVAL;

-   /* we only support read only mappings for now */
-   if (!(args->flags & RADEON_GEM_USERPTR_READONLY))
-   return -EACCES;
-
/* reject unknown flag values */
if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE |
RADEON_GEM_USERPTR_REGISTER))
return -EINVAL;

-   /* readonly pages not tested on older hardware */
-   if (rdev->family < CHIP_R600)
-   return -EINVAL;
+   if (args->flags & RADEON_GEM_USERPTR_READONLY) {
+   /* readonly pages not tested on older hardware */
+   if (rdev->family < CHIP_R600)
+   return -EINVAL;
+
+   } else if (!(args->flags & RADEON_GEM_USERPTR_ANONONLY) ||
+  !(args->flags & RADEON_GEM_USERPTR_REGISTER)) {
+
+   /* if we want to write to it we must require anonymous
+  memory and install a MMU notifier */
+   return -EACCES;
+   }

down_read(&rdev->exclusive_lock);

-- 
1.9.1
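
For readers following the flag logic, the check after this patch can be
sketched as a standalone function. This is a user-space model, not the
driver code: userptr_check_flags() and the pre_r600 parameter are
hypothetical, the first three flag values come from the uapi hunks in this
series, and the value used for the REGISTER flag is an assumption because
that hunk is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define USERPTR_READONLY (1u << 0) /* from the uapi hunk in patch 1/5 context */
#define USERPTR_ANONONLY (1u << 1) /* from patch 2/5 */
#define USERPTR_VALIDATE (1u << 2) /* from patch 3/5 */
#define USERPTR_REGISTER (1u << 3) /* assumption: value not shown in this excerpt */

static_assert((USERPTR_READONLY & USERPTR_ANONONLY) == 0,
	      "flag bits must not overlap");

/* Returns 0 if the flag combination is accepted, a negative errno otherwise. */
static int userptr_check_flags(uint32_t flags, bool pre_r600)
{
	const uint32_t known = USERPTR_READONLY | USERPTR_ANONONLY |
			       USERPTR_VALIDATE | USERPTR_REGISTER;

	/* reject unknown flag values */
	if (flags & ~known)
		return -EINVAL;

	if (flags & USERPTR_READONLY) {
		/* readonly pages not tested on older hardware */
		if (pre_r600)
			return -EINVAL;
	} else if (!(flags & USERPTR_ANONONLY) ||
		   !(flags & USERPTR_REGISTER)) {
		/* write access needs anonymous memory and an MMU notifier */
		return -EACCES;
	}
	return 0;
}
```

In this model a writable mapping is accepted only when both ANONONLY and
REGISTER are set; a read-only mapping needs neither.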

[PATCH 4/5] drm/radeon: add userptr flag to register MMU notifier v3

2014-08-05 Thread Christian König

From: Christian König

Whenever a userspace mapping related to our userptr changes
we wait for it to become idle and unmap it from GTT.

v2: rebased, fix mutex unlock in error path
v3: improve commit message

Signed-off-by: Christian König
---
 drivers/gpu/drm/Kconfig|   1 +
 drivers/gpu/drm/radeon/Makefile|   2 +-
 drivers/gpu/drm/radeon/radeon.h|  12 ++
 drivers/gpu/drm/radeon/radeon_device.c |   2 +
 drivers/gpu/drm/radeon/radeon_gem.c|   9 +-
 drivers/gpu/drm/radeon/radeon_mn.c | 272 +
 drivers/gpu/drm/radeon/radeon_object.c |   1 +
 include/uapi/drm/radeon_drm.h  |   1 +
 8 files changed, 298 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/radeon_mn.c

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 9b2eedc..2745284 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -115,6 +115,7 @@ config DRM_RADEON
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select MMU_NOTIFIER
help
  Choose this option if you have an ATI Radeon graphics card.  There
  are both PCI and AGP versions.  You don't need to choose this to
diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 0013ad0..c7fa1ae 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -80,7 +80,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o rv740_dpm.o \
rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o trinity_dpm.o \
trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \
-   ci_dpm.o dce6_afmt.o radeon_vm.o radeon_ucode.o radeon_ib.o
+   ci_dpm.o dce6_afmt.o radeon_vm.o radeon_ucode.o radeon_ib.o radeon_mn.o

 # add async DMA block
 radeon-y += \
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 3c6999e..511191f 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -487,6 +488,9 @@ struct radeon_bo {

struct ttm_bo_kmap_obj  dma_buf_vmap;
pid_t   pid;
+
+   struct radeon_mn*mn;
+   struct interval_tree_node   mn_it;
 };
 #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, gem_base)

@@ -1725,6 +1729,11 @@ void radeon_test_ring_sync(struct radeon_device *rdev,
   struct radeon_ring *cpB);
 void radeon_test_syncing(struct radeon_device *rdev);

+/*
+ * MMU Notifier
+ */
+int radeon_mn_register(struct radeon_bo *bo, unsigned long addr);
+void radeon_mn_unregister(struct radeon_bo *bo);

 /*
  * Debugfs
@@ -2372,6 +2381,9 @@ struct radeon_device {
/* tracking pinned memory */
u64 vram_pin_size;
u64 gart_pin_size;
+
+   struct mutexmn_lock;
+   DECLARE_HASHTABLE(mn_hash, 7);
 };

 bool radeon_is_px(struct drm_device *dev);
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index c8ea050..c58f84f 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1270,6 +1270,8 @@ int radeon_device_init(struct radeon_device *rdev,
init_rwsem(&rdev->pm.mclk_lock);
init_rwsem(&rdev->exclusive_lock);
init_waitqueue_head(&rdev->irq.vblank_queue);
+   mutex_init(&rdev->mn_lock);
+   hash_init(rdev->mn_hash);
r = radeon_gem_init(rdev);
if (r)
return r;
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 4506560..2a6fbf1 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -291,7 +291,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,

/* reject unknown flag values */
if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
-   RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE))
+   RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE |
+   RADEON_GEM_USERPTR_REGISTER))
return -EINVAL;

/* readonly pages not tested on older hardware */
@@ -312,6 +313,12 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
if (r)
goto release_object;

+   if (args->flags & RADEON_GEM_USERPTR_REGISTER) {
+   r = radeon_mn_register(bo, args->addr);
+   if (r)
+   goto release_object;
+   }
+
if (args->flags & RADEON_GEM_USERPTR_VALIDATE) {
down_read(&current->mm->mmap_sem);
r = radeon_bo_reserve(bo, true);
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
new file mode 100644
index 000..0157bc2
--- /dev/null
+++

[PATCH 3/5] drm/radeon: add userptr flag to directly validate the BO to GTT

2014-08-05 Thread Christian König

From: Christian König

This way we test userptr availability at BO creation time instead of first use.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c | 18 +-
 include/uapi/drm/radeon_drm.h   |  1 +
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 032736b..4506560 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -291,7 +291,7 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,

/* reject unknown flag values */
if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
-   RADEON_GEM_USERPTR_ANONONLY))
+   RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE))
return -EINVAL;

/* readonly pages not tested on older hardware */
@@ -312,6 +312,22 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
if (r)
goto release_object;

+   if (args->flags & RADEON_GEM_USERPTR_VALIDATE) {
+   down_read(&current->mm->mmap_sem);
+   r = radeon_bo_reserve(bo, true);
+   if (r) {
+   up_read(&current->mm->mmap_sem);
+   goto release_object;
+   }
+
+   radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT);
+   r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
+   radeon_bo_unreserve(bo);
+   up_read(&current->mm->mmap_sem);
+   if (r)
+   goto release_object;
+   }
+
r = drm_gem_handle_create(filp, gobj, &handle);
/* drop reference from allocate - handle holds it now */
drm_gem_object_unreference_unlocked(gobj);
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index 9720e1a..5dc61c2 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -817,6 +817,7 @@ struct drm_radeon_gem_create {
  */
 #define RADEON_GEM_USERPTR_READONLY(1 << 0)
 #define RADEON_GEM_USERPTR_ANONONLY(1 << 1)
+#define RADEON_GEM_USERPTR_VALIDATE(1 << 2)

 struct drm_radeon_gem_userptr {
uint64_taddr;
-- 
1.9.1

[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory v2

2014-08-05 Thread Christian König

From: Christian König

Avoid problems with writeback by limiting userptr to anonymous memory.

v2: add commit and code comments

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c |  3 ++-
 drivers/gpu/drm/radeon/radeon_ttm.c | 10 ++
 include/uapi/drm/radeon_drm.h   |  1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 993ab22..032736b 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
return -EACCES;

/* reject unknown flag values */
-   if (args->flags & ~RADEON_GEM_USERPTR_READONLY)
+   if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
+   RADEON_GEM_USERPTR_ANONONLY))
return -EINVAL;

/* readonly pages not tested on older hardware */
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 0109090..54eb7bc 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -542,6 +542,16 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
   ttm->num_pages * PAGE_SIZE))
return -EFAULT;

+   if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
+   /* check that we only pin down anonymous memory
+  to prevent problems with writeback */
+   unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
+   struct vm_area_struct *vma;
+   vma = find_vma(gtt->usermm, gtt->userptr);
+   if (!vma || vma->vm_file || vma->vm_end < end)
+   return -EPERM;
+   }
+
do {
unsigned num_pages = ttm->num_pages - pinned;
uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE;
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index 3a9f209..9720e1a 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -816,6 +816,7 @@ struct drm_radeon_gem_create {
  * perform any operation.
  */
 #define RADEON_GEM_USERPTR_READONLY(1 << 0)
+#define RADEON_GEM_USERPTR_ANONONLY(1 << 1)

 struct drm_radeon_gem_userptr {
uint64_taddr;
-- 
1.9.1
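
The range check added to radeon_ttm_tt_pin_userptr() can be tried out in
user space with a small model. Everything here is a sketch: struct
vma_model, model_find_vma() and check_anon_only() are hypothetical, and the
lookup only imitates find_vma() as I understand it (first area whose end
lies above the address). The final start check is an extra step that the
hunk above does not contain.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VMA: [start, end), anonymous or file backed. */
struct vma_model {
	uint64_t start;
	uint64_t end;
	bool file_backed;
};

/*
 * Imitates find_vma(): returns the first area with addr < end, or NULL.
 * The array must be sorted by address and non-overlapping.
 */
static const struct vma_model *model_find_vma(const struct vma_model *v,
					      size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < v[i].end)
			return &v[i];
	return NULL;
}

/* Returns 0 if [addr, addr + size) lies in one anonymous area, else -EPERM. */
static int check_anon_only(const struct vma_model *v, size_t n,
			   uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;
	const struct vma_model *vma;

	assert(v != NULL || n == 0);

	/* same three conditions as the patch */
	vma = model_find_vma(v, n, addr);
	if (!vma || vma->file_backed || vma->end < end)
		return -EPERM;

	/* extra step, not in the patch: the range must not start in a hole */
	if (vma->start > addr)
		return -EPERM;

	return 0;
}
```

A range that spans past the end of the anonymous area, or that lands on a
file mapping, is rejected with -EPERM just as in the patch.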

[PATCH 1/5] drm/radeon: add userptr support v7

2014-08-05 Thread Christian König

From: Christian König

This patch adds an IOCTL for turning a pointer supplied by
userspace into a buffer object.

It imposes several restrictions upon the memory being mapped:

1. It must be page aligned (both start/end addresses, i.e. ptr and size).

2. It must be normal system memory, not a pointer into another map of IO
space (e.g. it must not be a GTT mmapping of another object).

3. The BO is mapped into GTT, so the maximum amount of memory mapped at
all times is still the GTT limit.

4. The BO is only mapped readonly for now, so no write support.

5. List of backing pages is only acquired once, so they represent a
snapshot of the first use.

Exporting and sharing as well as mapping of buffer objects created by
this function is forbidden and results in an -EPERM.
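
Restriction 1 boils down to a single mask test, which the ioctl does with
offset_in_page(addr | size). A user-space sketch of the same idea follows;
MODEL_PAGE_SIZE and userptr_range_is_page_aligned() are hypothetical names,
and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* True if both the start address and the size are multiples of the page size. */
static bool userptr_range_is_page_aligned(uint64_t addr, uint64_t size)
{
	static_assert((MODEL_PAGE_SIZE & (MODEL_PAGE_SIZE - 1)) == 0,
		      "page size must be a power of two");

	/* OR-ing the two values lets one mask test cover both of them */
	return ((addr | size) & (MODEL_PAGE_SIZE - 1)) == 0;
}
```

If start and size are both aligned, the end address is aligned as well, so
no separate test for it is needed.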

v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
pin/unpin pages on bind/unbind instead of populate/unpopulate
v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
flags, better handle READONLY flag, improve permission check
v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin
v7: add warning about its availability in the API definition

Signed-off-by: Christian König
Reviewed-by: Alex Deucher (v4)
Reviewed-by: Jérôme Glisse (v4)
---
 drivers/gpu/drm/radeon/radeon.h|   5 ++
 drivers/gpu/drm/radeon/radeon_cs.c |  25 +-
 drivers/gpu/drm/radeon/radeon_drv.c|   5 +-
 drivers/gpu/drm/radeon/radeon_gem.c|  68 
 drivers/gpu/drm/radeon/radeon_kms.c|   1 +
 drivers/gpu/drm/radeon/radeon_object.c |   3 +
 drivers/gpu/drm/radeon/radeon_prime.c  |  10 +++
 drivers/gpu/drm/radeon/radeon_ttm.c| 139 +
 drivers/gpu/drm/radeon/radeon_vm.c |   3 +
 include/uapi/drm/radeon_drm.h  |  16 
 10 files changed, 272 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 9e1732e..3c6999e 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2138,6 +2138,8 @@ int radeon_gem_info_ioctl(struct drm_device *dev, void *data,
  struct drm_file *filp);
 int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp);
+int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
+struct drm_file *filp);
 int radeon_gem_pin_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int radeon_gem_unpin_ioctl(struct drm_device *dev, void *data,
@@ -2871,6 +2873,9 @@ extern void radeon_legacy_set_clock_gating(struct radeon_device *rdev, int enabl
 extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int enable);
 extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain);
 extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
+extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t addr,
+uint32_t flags);
+extern bool radeon_ttm_tt_has_userptr(struct ttm_tt *ttm);
 extern void radeon_vram_location(struct radeon_device *rdev, struct radeon_mc *mc, u64 base);
 extern void radeon_gtt_location(struct radeon_device *rdev, struct radeon_mc *mc);
 extern int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index ee712c1..1321491 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -78,7 +78,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
struct radeon_cs_chunk *chunk;
struct radeon_cs_buckets buckets;
unsigned i, j;
-   bool duplicate;
+   bool duplicate, need_mmap_lock = false;
+   int r;

if (p->chunk_relocs_idx == -1) {
return 0;
@@ -164,6 +165,19 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
p->relocs[i].allowed_domains = domain;
}

+   if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
+   uint32_t domain = p->relocs[i].prefered_domains;
+   if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
+   DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
+ "allowed for userptr BOs\n");
+   return -EINVAL;
+   }
+   need_mmap_lock = true;
+   domain = RADEON_GEM_DOMAIN_GTT;
+   p->relocs[i].prefered_domains = domain;
+   p->relocs[i].allowed_domains = domain;
+   }
+
p->relocs[i].tv.bo = &p->relocs[i].robj->tbo;

[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #7 from Christian König ---
(In reply to comment #6)
> I tried running a 3D game using the terminal to monitor stdout, and this was
> constantly spammed as well. Not sure if it gives any additional information.
> 
> radeon: Failed to allocate a buffer:
> radeon:size  : 0 bytes
> radeon:alignment : 4096 bytes
> radeon:domains   : 4
> radeon:flags : 4

Yeah, pretty much the same message from userspace.

Looks like a bug in the userspace driver somewhere. Simplest thing to find it
would be to attach a debugger and get a backtrace when the message is printed.

[PATCH] drm/radeon: Only flush HDP cache from idle ioctl if BO is in VRAM

2014-08-05 Thread Michel Dänzer

On 05.08.2014 07:01, Marek Olšák wrote:
> I'm afraid this won't always work and it can be a source of bugs.
> 
> Userspace doesn't have to call GEM_WAIT_IDLE before a CPU access to a
> VRAM buffer. For example, consider a wait-idle request with a non-zero
> timeout, which is implemented as a loop which calls GEM_BUSY. Also,
> userspace can use fences (alright they are backed by 1-page-sized VRAM
> buffers at the moment) and it may use real fences in the future which
> are not tied to a buffer object.
> 
> If the HDP flush isn't allowed in userspace IBs, I think we will have
> to expose it as an ioctl and call it explicitly from userspace.

I understand your concerns, but my patch doesn't change anything wrt
them, does it?


-- 
Earthling Michel Dänzer    |    http://www.amd.com
Libre software enthusiast  |    Mesa and X developer

[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #6 from sarnex  ---
I tried running a 3D game using the terminal to monitor stdout, and this was
constantly spammed as well. Not sure if it gives any additional information.

radeon: Failed to allocate a buffer:
radeon:size  : 0 bytes
radeon:alignment : 4096 bytes
radeon:domains   : 4
radeon:flags : 4

Dual-channel DSI

2014-08-05 Thread Thierry Reding

Hi everyone,

I've been working on adding support for a panel that uses what's
commonly known as dual-channel DSI. Sometimes this is referred to as
ganged-mode as well.

What is it, you ask? It's essentially a hack to work around the
bandwidth restrictions of DSI, albeit one that's been commonly implemented
by several SoC vendors.

This typically works by equipping a peripheral with two DSI interfaces,
each of which drives one half of the screen (symmetric left-right mode)
or every other line (symmetric odd-even mode). Apparently there can be
asymmetric modes in addition to those two, but they seem to be the
common ones. Often both of the DSI interfaces need to be configured
using DCS commands and vendor specific registers.

A single display controller is typically used for video data transmission.
This is necessary to provide synchronization and avoid tearing and all
kinds of other ugliness. For this to work both DSI controllers need to
be made aware of which chunk of the video data stream is addressing
them.
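
To make the two symmetric modes concrete, here is a rough sketch of which
link a given pixel would travel over. The names (enum ganged_mode,
ganged_link_for_pixel()) are made up for illustration, and which link gets
the left half or the even lines is an assumption; real hardware may assign
them the other way round.

```c
#include <assert.h>
#include <stdint.h>

/* The two symmetric modes described above. */
enum ganged_mode {
	GANGED_LEFT_RIGHT, /* each link drives one half of every line */
	GANGED_ODD_EVEN,   /* each link drives every other line */
};

/*
 * Returns 0 or 1: the index of the (hypothetical) DSI link that carries
 * pixel (x, y) of a frame that is `width` pixels wide.
 */
static int ganged_link_for_pixel(enum ganged_mode mode, uint32_t x,
				 uint32_t y, uint32_t width)
{
	assert(width > 0 && x < width);

	switch (mode) {
	case GANGED_LEFT_RIGHT:
		/* assumption: link 0 takes the left half */
		return x < width / 2 ? 0 : 1;
	case GANGED_ODD_EVEN:
		/* assumption: link 0 takes the even lines */
		return (int)(y & 1);
	}
	return -1; /* unknown mode */
}
```

For a 2560 pixel wide panel in left-right mode, pixels 0 to 1279 of each
line would go to link 0 and pixels 1280 to 2559 to link 1.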

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

Kai  changed:

               What|Removed                     |Added
----------------------------------------------------------------------------
 Attachment #104081|VBIOS from                  |VBIOS from XFX R9-290A-EDBD
        description|                            |

[PULL] topic/core-stuff

2014-08-05 Thread Daniel Vetter

Hi Dave,

Flushing out my drm core stuff branch, just 2 stragglers.

Cheers, Daniel


The following changes since commit a91576d7916f6cce76d30303e60e1ac47cf4a76d:

  drm/ttm: Pass GFP flags in order to avoid deadlock. (2014-08-05 10:54:19 +1000)

are available in the git repository at:

  git://anongit.freedesktop.org/drm-intel tags/topic/core-stuff-2014-08-05

for you to fetch changes up to 82a1f64963fa58749b28b39b7ad64140dc2df8cb:

  drm: Fix race when checking for fb in the generic kms obj lookup (2014-08-05 15:54:13 +0200)


Chris Wilson (1):
  drm: Unlink dead file_priv from list of active files first

Daniel Vetter (1):
  drm: Fix race when checking for fb in the generic kms obj lookup

 drivers/gpu/drm/drm_crtc.c | 11 ++-
 drivers/gpu/drm/drm_fops.c |  8 
 2 files changed, 10 insertions(+), 9 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #7 from Andy Furniss  ---
Created attachment 104083
  --> https://bugs.freedesktop.org/attachment.cgi?id=104083&action=edit
bad


[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #6 from Andy Furniss  ---
Created attachment 104082
  --> https://bugs.freedesktop.org/attachment.cgi?id=104082&action=edit
good


[PATCH 1/5] drm/radeon: add userptr support v6

2014-08-05 Thread Christian König

Am 05.08.2014 um 16:30 schrieb Jerome Glisse:
> On Tue, Aug 05, 2014 at 04:11:03PM +0200, Christian König wrote:
>> From: Christian König 
>>
>> This patch adds an IOCTL for turning a pointer supplied by
>> userspace into a buffer object.
>>
>> It imposes several restrictions upon the memory being mapped:
>>
>> 1. It must be page aligned (both start/end addresses, i.e. ptr and size).
>>
>> 2. It must be normal system memory, not a pointer into another map of IO
>> space (e.g. it must not be a GTT mmapping of another object).
>>
>> 3. The BO is mapped into GTT, so the maximum amount of memory mapped at
>> all times is still the GTT limit.
>>
>> 4. The BO is only mapped readonly for now, so no write support.
>>
>> 5. List of backing pages is only acquired once, so they represent a
>> snapshot of the first use.
>>
>> Exporting and sharing as well as mapping of buffer objects created by
>> this function is forbidden and results in an -EPERM.
>>
>> v2: squash all previous changes into first public version
>> v3: fix tabs, map readonly, don't use MM callback any more
>> v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
>>  pin/unpin pages on bind/unbind instead of populate/unpopulate
>> v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
>>  flags, better handle READONLY flag, improve permission check
>> v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin
>>
>> Signed-off-by: Christian König 
>> Reviewed-by: Alex Deucher  (v4)
>> Reviewed-by: Jérôme Glisse  (v4)
>> ---
>>   drivers/gpu/drm/radeon/radeon.h|   5 ++
>>   drivers/gpu/drm/radeon/radeon_cs.c |  25 +-
>>   drivers/gpu/drm/radeon/radeon_drv.c|   5 +-
>>   drivers/gpu/drm/radeon/radeon_gem.c|  68 
>>   drivers/gpu/drm/radeon/radeon_kms.c|   1 +
>>   drivers/gpu/drm/radeon/radeon_object.c |   3 +
>>   drivers/gpu/drm/radeon/radeon_prime.c  |  10 +++
>>   drivers/gpu/drm/radeon/radeon_ttm.c| 139 
>> +
>>   drivers/gpu/drm/radeon/radeon_vm.c |   3 +
>>   include/uapi/drm/radeon_drm.h  |  11 +++
>>   10 files changed, 267 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon.h 
>> b/drivers/gpu/drm/radeon/radeon.h
>> index 9e1732e..3c6999e 100644
>> --- a/drivers/gpu/drm/radeon/radeon.h
>> +++ b/drivers/gpu/drm/radeon/radeon.h
>> @@ -2138,6 +2138,8 @@ int radeon_gem_info_ioctl(struct drm_device *dev, void 
>> *data,
>>struct drm_file *filp);
>>   int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
>>  struct drm_file *filp);
>> +int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
>> + struct drm_file *filp);
>>   int radeon_gem_pin_ioctl(struct drm_device *dev, void *data,
>>   struct drm_file *file_priv);
>>   int radeon_gem_unpin_ioctl(struct drm_device *dev, void *data,
>> @@ -2871,6 +2873,9 @@ extern void radeon_legacy_set_clock_gating(struct 
>> radeon_device *rdev, int enabl
>>   extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int 
>> enable);
>>   extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 
>> domain);
>>   extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
>> +extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t addr,
>> + uint32_t flags);
>> +extern bool radeon_ttm_tt_has_userptr(struct ttm_tt *ttm);
>>   extern void radeon_vram_location(struct radeon_device *rdev, struct 
>> radeon_mc *mc, u64 base);
>>   extern void radeon_gtt_location(struct radeon_device *rdev, struct 
>> radeon_mc *mc);
>>   extern int radeon_resume_kms(struct drm_device *dev, bool resume, bool 
>> fbcon);
>> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
>> b/drivers/gpu/drm/radeon/radeon_cs.c
>> index ee712c1..1321491 100644
>> --- a/drivers/gpu/drm/radeon/radeon_cs.c
>> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
>> @@ -78,7 +78,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser 
>> *p)
>>  struct radeon_cs_chunk *chunk;
>>  struct radeon_cs_buckets buckets;
>>  unsigned i, j;
>> -bool duplicate;
>> +bool duplicate, need_mmap_lock = false;
>> +int r;
>>   
>>  if (p->chunk_relocs_idx == -1) {
>>  return 0;
>> @@ -164,6 +165,19 @@ static int radeon_cs_parser_relocs(struct 
>> radeon_cs_parser *p)
>>  p->relocs[i].allowed_domains = domain;
>>  }
>>   
>> +if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
>> +uint32_t domain = p->relocs[i].prefered_domains;
>> +if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
>> +DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
>> +  "allowed for userptr BOs\n");
>> +return -EINVAL;
>> +}
>> +
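
Restriction 1 from the quoted commit message (both the start address
and the size must be page aligned) amounts to a simple mask test. A
minimal userspace sketch, assuming a 4 KiB page size; the helper name
is made up for illustration and is not the function the patch adds:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel would use PAGE_SIZE instead. */
#define EXAMPLE_PAGE_SIZE 4096ull

/* The mask trick below only works for power-of-two page sizes. */
static_assert((EXAMPLE_PAGE_SIZE & (EXAMPLE_PAGE_SIZE - 1)) == 0,
	      "page size must be a power of two");

/* Hypothetical helper: true if both addr and size are page aligned. */
static bool userptr_range_is_page_aligned(uint64_t addr, uint64_t size)
{
	/* Any low bit set in either value means start or end is misaligned. */
	return ((addr | size) & (EXAMPLE_PAGE_SIZE - 1)) == 0;
}
```

Checking addr and size together is enough because an aligned start plus
an aligned size always gives an aligned end address.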

[pull] radeon drm-next-3.17

2014-08-05 Thread Deucher, Alexander

> -Original Message-
> From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Tuesday, August 05, 2014 1:09 PM
> To: Alex Deucher
> Cc: dri-devel at lists.freedesktop.org; airlied at gmail.com; Deucher, 
> Alexander
> Subject: Re: [pull] radeon drm-next-3.17
> 
> On Tue, Aug 05, 2014 at 12:22:06PM -0400, Alex Deucher wrote:
> > Hi Dave,
> >
> > This is the radeon pull request for 3.17.  Highlights:
> > - Additional Hawaii fixes
> > - Support for using the display scaler on non-fixed mode displays
> > - Support for new firmware format that makes it easier to update
> > - Enable dpm by default on additional asics
> > - GPUVM improvements
> > - Support for uncached and write combined gtt buffers
> > - Userptr support
> 
> Aside: Where's the libdrm/mesa/whatever patches for this? I didn't see
> them fly by anywhere, so I guess I've missed them on some m-l I don't
> subscribe to.

Christian wrote some patches to validate the interfaces, but I'm not sure he 
ever sent them out.  We haven't yet done a full implementation in the usermode 
drivers to take advantage of this.

Alex

> -Daniel
> 
> > - Allow allocation of BOs larger than visible vram
> > - Various other small fixes and improvements
> >
> > The following changes since commit
> a91576d7916f6cce76d30303e60e1ac47cf4a76d:
> >
> >   drm/ttm: Pass GFP flags in order to avoid deadlock. (2014-08-05 10:54:19
> +1000)
> >
> > are available in the git repository at:
> >
> >   git://people.freedesktop.org/~agd5f/linux drm-next-3.17
> >
> > for you to fetch changes up to
> ffd7d3a9d535933c7edfbaaac161f11628270716:
> >
> >   drm/radeon: allow userptr write access under certain conditions (2014-08-
> 05 12:10:42 -0400)
> >
> > 
> > Alex Deucher (25):
> >   drm/radeon/dpm: add support for SVI2 voltage for SI
> >   drm/radeon: disable gfx cgcg on cik
> >   drm/radeon: add new firmware header definitions (v3)
> >   drm/radeon/si: Add support for new ucode format (v3)
> >   drm/radeon/cik: Add support for new ucode format (v5)
> >   drm/radeon: enable display scaling on all connectors (v2)
> >   drm/radeon: consolidate vga and dvi get_modes functions (v2)
> >   drm/radeon: restructure edid fetching
> >   drm/radeon: use a fetch function to get the edid
> >   drm/radeon: track pinned memory (v2)
> >   drm/radeon: use vram/gart pinned size in radeon_gem_info_ioctl
> >   drm/radeon: use vram/gart pinned size in radeon_do_test_moves
> >   drm/radeon: remove visible vram size limit on bo allocation (v4)
> >   drm/radeon: add a PX quirk list
> >   drm/radeon: make radeon_connector_encoder_is_hbr2 static
> >   drm/radeon: load the lm63 driver for an lm64 thermal chip.
> >   drm/radeon: fix reversed logic in evergreen_mc_resume
> >   drm/radeon/atom: add new voltage fetch function for hawaii
> >   drm/radeon/dpm: handle voltage info fetching on hawaii
> >   drm/radeon: re-enable dpm by default on cayman
> >   drm/radeon: re-enable dpm by default on BTC
> >   drm/radeon: use an intervall tree to manage the VMA v2
> >   drm/radeon: use packet2 for nop on hawaii with old firmware
> >   drm/radeon: tweak ACCEL_WORKING2 query for hawaii
> >   drm/radeon: use packet3 for nop on hawaii with new firmware
> >
> > Andreas Boll (1):
> >   drm/radeon: tweak ACCEL_WORKING2 query for the new firmware for
> hawaii
> >
> > Christian König (20):
> >   drm/radeon: remove discardable flag from radeon_gem_object_create
> >   drm/radeon: fix R600_PTE_GART handling
> >   drm/radeon: add trace_radeon_vm_flush
> >   drm/radeon: set VM base addr using the PFP v2
> >   drm/radeon: separate ring and IB handling
> >   drm/radeon: invalidate moved BOs in the VM (v2)
> >   drm/radeon: remove radeon_bo_clear_va
> >   drm/radeon: try to enable VM flushing once more
> >   drm/radeon: adjust default radeon_vm_block_size v2
> >   drm/radeon: remove taking mclk_lock from radeon_bo_unref
> >   drm/radeon: add radeon_bo_ref function
> >   drm/radeon: take a BO reference on VM cleanup
> >   drm/radeon: add VM GART copy optimization to NI as well
> >   drm/radeon: split PT setup in more functions
> >   drm/radeon: update IB size estimation for VM
> >   drm/radeon: add userptr support v7
> >   drm/radeon: add userptr flag to limit it to anonymous memory v2
> >   drm/radeon: add userptr flag to directly validate the BO to GTT
> >   drm/radeon: add userptr flag to register MMU notifier v3
> >   drm/radeon: allow userptr write access under certain conditions
> >
> > Fabian Frederick (1):
> >   drm/radeon: remove null test before kfree
> >
> > Lauri Kasanen (1):
> >   drm/radeon: Inline r100_mm_rreg, -wreg, v3
> >
> > Mario Kleiner (2):
> >   drm/radeon: Use pflip irqs for pageflip completion if possible. (v2)
> >

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #3 from Kai  ---
Created attachment 104081
  --> https://bugs.freedesktop.org/attachment.cgi?id=104081&action=edit
VBIOS from

(In reply to comment #2)
> Please attach your dmesg output with radeon.dpm=1 set on the kernel command
> line in grub.  That dumps some additional debugging output.

I'll reboot later and attach that dmesg, I'm currently bisecting X for bug
82055.

>  Also please attach a copy of your vbios.

Here you go. Below you find the lspci output, maybe you can reach out to XFX
directly, if that should help:
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
> Hawaii PRO [Radeon R9 290] (prog-if 00 [VGA controller])
> Subsystem: XFX Pine Group Inc. Device 9295
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
> Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> SERR-  Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 45
> Region 0: Memory at e000 (64-bit, prefetchable) [size=256M]
> Region 2: Memory at f000 (64-bit, prefetchable) [size=8M]
> Region 4: I/O ports at e000 [size=256]
> Region 5: Memory at f7e0 (32-bit, non-prefetchable) [size=256K]
> Expansion ROM at f7e4 [disabled] [size=128K]
> Capabilities: [48] Vendor Specific Information: Len=08 
> Capabilities: [50] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
> PME(D0-,D1+,D2+,D3hot+,D3cold-)
> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
> DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, 
> L1 unlimited
> ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> DevCtl: Report errors: Correctable- Non-Fatal- Fatal- 
> Unsupported-
> RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
> MaxPayload 256 bytes, MaxReadReq 512 bytes
> DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- 
> TransPend-
> LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit 
> Latency L0s <64ns, L1 <1us
> ClockPM- Surprise- LLActRep- BwNot-
> LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ 
> DLActive- BWMgmt- ABWMgmt-
> DevCap2: Completion Timeout: Not Supported, TimeoutDis-, 
> LTR-, OBFF Not Supported
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, 
> OBFF Disabled
> LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
>  Transmit Margin: Normal Operating Range, 
> EnterModifiedCompliance- ComplianceSOS-
>  Compliance De-emphasis: -6dB
> LnkSta2: Current De-emphasis Level: -3.5dB, 
> EqualizationComplete+, EqualizationPhase1+
>  EqualizationPhase2+, EqualizationPhase3+, 
> LinkEqualizationRequest-
> Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> Address: fee00358  Data: 
> Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 
> Len=010 
> Capabilities: [150 v2] Advanced Error Reporting
> UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- 
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- 
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- 
> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
> NonFatalErr+
> CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
> NonFatalErr+
> AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ 
> ChkEn-
> Capabilities: [270 v1] #19
> Capabilities: [2b0 v1] Address Translation Service (ATS)
> ATSCap: Invalidate Queue Depth: 00
> ATSCtl: Enable-, Smallest Translation Unit: 00
> Capabilities: [2c0 v1] #13
> Capabilities: [2d0 v1] #1b
> Kernel driver in use: radeon


[Intel-gfx] [PATCH 1/6] drm: Renaming DP training vswing/pre-emph defines

2014-08-05 Thread Jindal, Sonika



On 8/5/2014 4:45 PM, Daniel Vetter wrote:
> On Tue, Aug 05, 2014 at 04:38:17PM +0530, sonika.jindal at intel.com wrote:
>> From: Sonika Jindal 
>>
>> Renaming defines to have levels instead of nominal values.
>>
>> Signed-off-by: Sonika Jindal 
>
> You can't split up patches like this since this will break compilation.
> For larger stuff (and imo this is right above the cutoff) you first need
> to add the new functions/defines, then convert everyone over. And only
> when all the drivers are converted can we apply the patch to remove the
> old functions/defines.
> -Daniel
>
Understood. I will repost the first patch keeping both sets of defines, 
and add a final patch that removes the old defines.
>> ---
>>   include/drm/drm_dp_helper.h |   16 
>>   1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
>> index a21568b..70f362b 100644
>> --- a/include/drm/drm_dp_helper.h
>> +++ b/include/drm/drm_dp_helper.h
>> @@ -190,16 +190,16 @@
>>   # define DP_TRAIN_VOLTAGE_SWING_MASK   0x3
>>   # define DP_TRAIN_VOLTAGE_SWING_SHIFT  0
>>   # define DP_TRAIN_MAX_SWING_REACHED(1 << 2)
>> -# define DP_TRAIN_VOLTAGE_SWING_400 (0 << 0)
>> -# define DP_TRAIN_VOLTAGE_SWING_600 (1 << 0)
>> -# define DP_TRAIN_VOLTAGE_SWING_800 (2 << 0)
>> -# define DP_TRAIN_VOLTAGE_SWING_1200(3 << 0)
>> +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_0 (0 << 0)
>> +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_1 (1 << 0)
>> +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_2 (2 << 0)
>> +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_3 (3 << 0)
>>
>>   # define DP_TRAIN_PRE_EMPHASIS_MASK(3 << 3)
>> -# define DP_TRAIN_PRE_EMPHASIS_0(0 << 3)
>> -# define DP_TRAIN_PRE_EMPHASIS_3_5  (1 << 3)
>> -# define DP_TRAIN_PRE_EMPHASIS_6(2 << 3)
>> -# define DP_TRAIN_PRE_EMPHASIS_9_5  (3 << 3)
>> +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_0  (0 << 3)
>> +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_1  (1 << 3)
>> +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_2  (2 << 3)
>> +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_3  (3 << 3)
>>
>>   # define DP_TRAIN_PRE_EMPHASIS_SHIFT   3
>>   # define DP_TRAIN_MAX_PRE_EMPHASIS_REACHED  (1 << 5)
>> --
>> 1.7.10.4
>>
>> ___
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
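
The ordering Daniel asks for means the header briefly carries both
spellings. A sketch of that intermediate state, using the names and
values from the diff quoted above; expressing the old names as aliases
of the new ones is an assumption made here, not necessarily what the
reposted series does:

```c
#include <assert.h>

/* New level-based names, values as in the quoted diff. */
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_0	(0 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_1	(1 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_2	(2 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_3	(3 << 0)

#define DP_TRAIN_PRE_EMPHASIS_LEVEL_0	(0 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_1	(1 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_2	(2 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_3	(3 << 3)

/* Old names kept as aliases until every driver has been converted. */
#define DP_TRAIN_VOLTAGE_SWING_400	DP_TRAIN_VOLTAGE_SWING_LEVEL_0
#define DP_TRAIN_VOLTAGE_SWING_600	DP_TRAIN_VOLTAGE_SWING_LEVEL_1
#define DP_TRAIN_VOLTAGE_SWING_800	DP_TRAIN_VOLTAGE_SWING_LEVEL_2
#define DP_TRAIN_VOLTAGE_SWING_1200	DP_TRAIN_VOLTAGE_SWING_LEVEL_3

#define DP_TRAIN_PRE_EMPHASIS_0		DP_TRAIN_PRE_EMPHASIS_LEVEL_0
#define DP_TRAIN_PRE_EMPHASIS_3_5	DP_TRAIN_PRE_EMPHASIS_LEVEL_1
#define DP_TRAIN_PRE_EMPHASIS_6		DP_TRAIN_PRE_EMPHASIS_LEVEL_2
#define DP_TRAIN_PRE_EMPHASIS_9_5	DP_TRAIN_PRE_EMPHASIS_LEVEL_3

/* Compile-time check that an alias did not change the encoded value. */
static_assert(DP_TRAIN_VOLTAGE_SWING_1200 == (3 << 0),
	      "old name must keep its value");
```

With the aliases in place, each driver patch compiles on its own, and
the final patch in the series only has to delete the alias block.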

[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #2 from Alex Deucher  ---
Please attach your dmesg output with radeon.dpm=1 set on the kernel command
line in grub.  That dumps some additional debugging output.  Also please attach
a copy of your vbios.

(as root)
(use lspci to get the bus id)
cd /sys/bus/pci/devices/
echo 1 > rom
cat rom > /tmp/vbios.rom
echo 0 > rom


[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #1 from Kai  ---
Since the image was a bit too large, I can't attach it here. You can find the
screenshot at http://imgur.com/vFBfQpQ


[Bug 82201] New: [HAWAII] GPU doesn't reclock, poor 3D performance

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82201

  Priority: medium
Bug ID: 82201
  Assignee: dri-devel at lists.freedesktop.org
   Summary: [HAWAII] GPU doesn't reclock, poor 3D performance
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: kai at dev.carbon-project.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Drivers/Gallium/radeonsi
   Product: Mesa

No matter what program I run, the clock of the GPU stays:
# cat /sys/kernel/debug/dri/*/radeon_pm_info
power level avg sclk: 3 mclk: 15000
power level avg sclk: 3 mclk: 15000

The attached screenshot shows Portal 2 with a GALLIUM_HUD=fps overlay. The ~30
FPS are in the menu, the 8-15 FPS are in the level.

My stack is (base: Debian Testing):
GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1)
Linux: Git:~agd5f/linux:drm-next-3.17-rebased-on-fixes:fa78380797 (calls itself
3.16-rc6)
Firmware: <http://people.freedesktop.org/~agd5f/radeon_ucode/ucode.tar.gz>
> 9e05820da42549ce9c89d147cf1f8e19  
> /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_ce.bin
> c8bab593090fc54f239c8d7596c8d846  
> /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_mc.bin
> 3618dbb955d8a84970e262bb2e6d2a16  
> /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_me.bin
> c000b0fc9ff6582145f66504b0ec9597  
> /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_mec.bin
> 0643ad24b3beff2214cce533e094c1b7  
> /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_pfp.bin
> ba6054b7d78184a74602fd81607e1386  
> /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_rlc.bin
> 11288f635737331b69de9ee82fe04898  
> /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_sdma.bin
> 284429675a5560e0fad42aa982965fc2  
> /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_smc.bin
libdrm: Git:master/libdrm-2.4.56
LLVM: SVN:trunk/r214546 (3.6 snapshot)
libclc: Git:master/5b48f170c8
Mesa: Git:master/e41cc45361
DDX: Git:master/fbf575cb01 + Patch from
http://lists.x.org/archives/xorg-driver-ati/2014-August/026534.html
X: 2:1.16.0-1 (1.16.0)

Let me know, if you need further information (current Xorg.0.log (attachment
103995), dmesg (attachment 103996) and glxinfo (attachment 103997) can be found
attached to bug 82055).


[PATCH 6/6] drm/tegra: Renaming DP training vswing/pre-emph defines

2014-08-05 Thread sonika.jin...@intel.com

From: Sonika Jindal 

Signed-off-by: Sonika Jindal 
---
 drivers/gpu/drm/tegra/dpaux.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dpaux.c b/drivers/gpu/drm/tegra/dpaux.c
index 3f132e3..34f3c1d 100644
--- a/drivers/gpu/drm/tegra/dpaux.c
+++ b/drivers/gpu/drm/tegra/dpaux.c
@@ -532,9 +532,9 @@ int tegra_dpaux_train(struct tegra_dpaux *dpaux, struct 
drm_dp_link *link,

for (i = 0; i < link->num_lanes; i++)
values[i] = DP_TRAIN_MAX_PRE_EMPHASIS_REACHED |
-   DP_TRAIN_PRE_EMPHASIS_0 |
+   DP_TRAIN_PRE_EMPHASIS_LEVEL_0 |
DP_TRAIN_MAX_SWING_REACHED |
-   DP_TRAIN_VOLTAGE_SWING_400;
+   DP_TRAIN_VOLTAGE_SWING_LEVEL_0;

err = drm_dp_dpcd_write(&dpaux->aux, DP_TRAINING_LANE0_SET, values,
link->num_lanes);
-- 
1.7.10.4

[PATCH 5/6] drm/gma500: Renaming DP training vswing/pre-emph defines

2014-08-05 Thread sonika.jin...@intel.com

From: Sonika Jindal 

Signed-off-by: Sonika Jindal 
---
 drivers/gpu/drm/gma500/cdv_intel_dp.c |   20 ++--
 drivers/gpu/drm/gma500/intel_bios.c   |   16 
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/gma500/cdv_intel_dp.c 
b/drivers/gpu/drm/gma500/cdv_intel_dp.c
index a4cc0e6..a9ef65d 100644
--- a/drivers/gpu/drm/gma500/cdv_intel_dp.c
+++ b/drivers/gpu/drm/gma500/cdv_intel_dp.c
@@ -1089,21 +1089,21 @@ static char *link_train_names[] = {
 };
 #endif

-#define CDV_DP_VOLTAGE_MAX DP_TRAIN_VOLTAGE_SWING_1200
+#define CDV_DP_VOLTAGE_MAX DP_TRAIN_VOLTAGE_SWING_LEVEL_3
 /*
 static uint8_t
 cdv_intel_dp_pre_emphasis_max(uint8_t voltage_swing)
 {
switch (voltage_swing & DP_TRAIN_VOLTAGE_SWING_MASK) {
-   case DP_TRAIN_VOLTAGE_SWING_400:
-   return DP_TRAIN_PRE_EMPHASIS_6;
-   case DP_TRAIN_VOLTAGE_SWING_600:
-   return DP_TRAIN_PRE_EMPHASIS_6;
-   case DP_TRAIN_VOLTAGE_SWING_800:
-   return DP_TRAIN_PRE_EMPHASIS_3_5;
-   case DP_TRAIN_VOLTAGE_SWING_1200:
+   case DP_TRAIN_VOLTAGE_SWING_LEVEL_0:
+   return DP_TRAIN_PRE_EMPHASIS_LEVEL_2;
+   case DP_TRAIN_VOLTAGE_SWING_LEVEL_1:
+   return DP_TRAIN_PRE_EMPHASIS_LEVEL_2;
+   case DP_TRAIN_VOLTAGE_SWING_LEVEL_2:
+   return DP_TRAIN_PRE_EMPHASIS_LEVEL_1;
+   case DP_TRAIN_VOLTAGE_SWING_LEVEL_3:
default:
-   return DP_TRAIN_PRE_EMPHASIS_0;
+   return DP_TRAIN_PRE_EMPHASIS_LEVEL_0;
}
 }
 */
@@ -1276,7 +1276,7 @@ cdv_intel_dp_set_vswing_premph(struct gma_encoder 
*encoder, uint8_t signal_level
cdv_sb_write(dev, ddi_reg->VSwing2, 
dp_vswing_premph_table[index]);

/* ;gfx_dpio_set_reg(0x814c, 0x40802040) */
-   if ((vswing + premph) == DP_TRAIN_VOLTAGE_SWING_1200)
+   if ((vswing + premph) == DP_TRAIN_VOLTAGE_SWING_LEVEL_3)
cdv_sb_write(dev, ddi_reg->VSwing3, 0x70802040);
else
cdv_sb_write(dev, ddi_reg->VSwing3, 0x40802040);
diff --git a/drivers/gpu/drm/gma500/intel_bios.c 
b/drivers/gpu/drm/gma500/intel_bios.c
index d349734..9573283 100644
--- a/drivers/gpu/drm/gma500/intel_bios.c
+++ b/drivers/gpu/drm/gma500/intel_bios.c
@@ -116,30 +116,30 @@ parse_edp(struct drm_psb_private *dev_priv, struct 
bdb_header *bdb)

switch (edp_link_params->preemphasis) {
case 0:
-   dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_0;
+   dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_0;
break;
case 1:
-   dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_3_5;
+   dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_1;
break;
case 2:
-   dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_6;
+   dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_2;
break;
case 3:
-   dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_9_5;
+   dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_3;
break;
}
switch (edp_link_params->vswing) {
case 0:
-   dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_400;
+   dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_0;
break;
case 1:
-   dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_600;
+   dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_1;
break;
case 2:
-   dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_800;
+   dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_2;
break;
case 3:
-   dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_1200;
+   dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_3;
break;
}
DRM_DEBUG_KMS("VBT reports EDP: VSwing  %d, Preemph %d\n",
-- 
1.7.10.4

[PATCH 4/6] drm/radeon: Renaming DP training vswing/pre-emph defines

2014-08-05 Thread sonika.jin...@intel.com

From: Sonika Jindal 

Signed-off-by: Sonika Jindal 
---
 drivers/gpu/drm/radeon/atombios_dp.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atombios_dp.c 
b/drivers/gpu/drm/radeon/atombios_dp.c
index b1e11f8..ef32b16 100644
--- a/drivers/gpu/drm/radeon/atombios_dp.c
+++ b/drivers/gpu/drm/radeon/atombios_dp.c
@@ -232,8 +232,8 @@ void radeon_dp_aux_init(struct radeon_connector 
*radeon_connector)

 /* general DP utility functions */

-#define DP_VOLTAGE_MAX DP_TRAIN_VOLTAGE_SWING_1200
-#define DP_PRE_EMPHASIS_MAXDP_TRAIN_PRE_EMPHASIS_9_5
+#define DP_VOLTAGE_MAX DP_TRAIN_VOLTAGE_SWING_LEVEL_3
+#define DP_PRE_EMPHASIS_MAXDP_TRAIN_PRE_EMPHASIS_LEVEL_3

 static void dp_get_adjust_train(u8 link_status[DP_LINK_STATUS_SIZE],
int lane_count,
-- 
1.7.10.4

[PATCH 3/6] drm/exynos: Renaming DP training vswing/pre-emph defines

2014-08-05 Thread sonika.jin...@intel.com

From: Sonika Jindal 

Signed-off-by: Sonika Jindal 
---
 drivers/gpu/drm/exynos/exynos_dp_core.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_dp_core.c 
b/drivers/gpu/drm/exynos/exynos_dp_core.c
index 31c3de9..e520943 100644
--- a/drivers/gpu/drm/exynos/exynos_dp_core.c
+++ b/drivers/gpu/drm/exynos/exynos_dp_core.c
@@ -331,8 +331,8 @@ static int exynos_dp_link_start(struct exynos_dp_device *dp)
return retval;

for (lane = 0; lane < lane_count; lane++)
-   buf[lane] = DP_TRAIN_PRE_EMPHASIS_0 |
-   DP_TRAIN_VOLTAGE_SWING_400;
+   buf[lane] = DP_TRAIN_PRE_EMPHASIS_LEVEL_0 |
+   DP_TRAIN_VOLTAGE_SWING_LEVEL_0;

retval = exynos_dp_write_bytes_to_dpcd(dp, DP_TRAINING_LANE0_SET,
lane_count, buf);
-- 
1.7.10.4

[PATCH 2/6] drm/i915: Renaming DP training vswing/pre-emph defines

2014-08-05 Thread sonika.jin...@intel.com

From: Sonika Jindal 

Changing the DP training vswing/pre-emph defines in i915.

Signed-off-by: Sonika Jindal 
---
 drivers/gpu/drm/i915/intel_bios.c |   16 +--
 drivers/gpu/drm/i915/intel_dp.c   |  194 ++---
 2 files changed, 105 insertions(+), 105 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_bios.c 
b/drivers/gpu/drm/i915/intel_bios.c
index 031c565..ef11274 100644
--- a/drivers/gpu/drm/i915/intel_bios.c
+++ b/drivers/gpu/drm/i915/intel_bios.c
@@ -627,16 +627,16 @@ parse_edp(struct drm_i915_private *dev_priv, struct 
bdb_header *bdb)

switch (edp_link_params->preemphasis) {
case EDP_PREEMPHASIS_NONE:
-   dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_0;
+   dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_0;
break;
case EDP_PREEMPHASIS_3_5dB:
-   dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_3_5;
+   dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_1;
break;
case EDP_PREEMPHASIS_6dB:
-   dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_6;
+   dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_2;
break;
case EDP_PREEMPHASIS_9_5dB:
-   dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_9_5;
+   dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_3;
break;
default:
DRM_DEBUG_KMS("VBT has unknown eDP pre-emphasis value %u\n",
@@ -646,16 +646,16 @@ parse_edp(struct drm_i915_private *dev_priv, struct 
bdb_header *bdb)

switch (edp_link_params->vswing) {
case EDP_VSWING_0_4V:
-   dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_400;
+   dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_0;
break;
case EDP_VSWING_0_6V:
-   dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_600;
+   dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_1;
break;
case EDP_VSWING_0_8V:
-   dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_800;
+   dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_2;
break;
case EDP_VSWING_1_2V:
-   dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_1200;
+   dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_3;
break;
default:
DRM_DEBUG_KMS("VBT has unknown eDP voltage swing value %u\n",
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index ce890f0..c2b3075 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -2393,13 +2393,13 @@ intel_dp_voltage_max(struct intel_dp *intel_dp)
enum port port = dp_to_dig_port(intel_dp)->port;

if (IS_VALLEYVIEW(dev))
-   return DP_TRAIN_VOLTAGE_SWING_1200;
+   return DP_TRAIN_VOLTAGE_SWING_LEVEL_3;
else if (IS_GEN7(dev) && port == PORT_A)
-   return DP_TRAIN_VOLTAGE_SWING_800;
+   return DP_TRAIN_VOLTAGE_SWING_LEVEL_2;
else if (HAS_PCH_CPT(dev) && port != PORT_A)
-   return DP_TRAIN_VOLTAGE_SWING_1200;
+   return DP_TRAIN_VOLTAGE_SWING_LEVEL_3;
else
-   return DP_TRAIN_VOLTAGE_SWING_800;
+   return DP_TRAIN_VOLTAGE_SWING_LEVEL_2;
 }

 static uint8_t
@@ -2410,49 +2410,49 @@ intel_dp_pre_emphasis_max(struct intel_dp *intel_dp, 
uint8_t voltage_swing)

if (IS_HASWELL(dev) || IS_BROADWELL(dev)) {
switch (voltage_swing & DP_TRAIN_VOLTAGE_SWING_MASK) {
-   case DP_TRAIN_VOLTAGE_SWING_400:
-   return DP_TRAIN_PRE_EMPHASIS_9_5;
-   case DP_TRAIN_VOLTAGE_SWING_600:
-   return DP_TRAIN_PRE_EMPHASIS_6;
-   case DP_TRAIN_VOLTAGE_SWING_800:
-   return DP_TRAIN_PRE_EMPHASIS_3_5;
-   case DP_TRAIN_VOLTAGE_SWING_1200:
+   case DP_TRAIN_VOLTAGE_SWING_LEVEL_0:
+   return DP_TRAIN_PRE_EMPHASIS_LEVEL_3;
+   case DP_TRAIN_VOLTAGE_SWING_LEVEL_1:
+   return DP_TRAIN_PRE_EMPHASIS_LEVEL_2;
+   case DP_TRAIN_VOLTAGE_SWING_LEVEL_2:
+   return DP_TRAIN_PRE_EMPHASIS_LEVEL_1;
+   case DP_TRAIN_VOLTAGE_SWING_LEVEL_3:
default:
-   return DP_TRAIN_PRE_EMPHASIS_0;
+   return DP_TRAIN_PRE_EMPHASIS_LEVEL_0;
}
} else if (IS_VALLEYVIEW(dev)) {
switch (voltage_swing & DP_TRAIN_VOLTAGE_SWING_MASK) {
-   case DP_TRAIN_VOLTAGE_SWING_400:
-   return DP_TRAIN_PRE_EMPHASIS_9_5;
-   case DP_TRAIN_VOLTAGE_SWING_600:
-   return
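
With level-based names, the pattern in the Haswell/Broadwell branch
above is easy to see: swing level 0 allows pre-emphasis level 3, level
1 allows 2, level 2 allows 1, and level 3 allows 0. A compact sketch of
that mapping, using the mask and shift from drm_dp_helper.h; the
function name is hypothetical and this is not a proposed replacement
for the switch statement in the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in include/drm/drm_dp_helper.h. */
#define DP_TRAIN_VOLTAGE_SWING_MASK	0x3
#define DP_TRAIN_PRE_EMPHASIS_SHIFT	3

static_assert(DP_TRAIN_VOLTAGE_SWING_MASK == 0x3,
	      "two-bit voltage swing field assumed");

/*
 * Hypothetical helper: maximum pre-emphasis (in its encoded register
 * form) for a given encoded voltage swing, following the
 * Haswell/Broadwell table where swing level + pre-emphasis level <= 3.
 */
static uint8_t example_pre_emphasis_max(uint8_t voltage_swing)
{
	unsigned int level = voltage_swing & DP_TRAIN_VOLTAGE_SWING_MASK;

	return (uint8_t)((3 - level) << DP_TRAIN_PRE_EMPHASIS_SHIFT);
}
```

Other platforms in the same function use different tables, so the
arithmetic shortcut only matches this one branch.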

[PATCH 1/6] drm: Renaming DP training vswing/pre-emph defines

2014-08-05 Thread sonika.jin...@intel.com

From: Sonika Jindal 

Renaming defines to have levels instead of nominal values.

Signed-off-by: Sonika Jindal 
---
 include/drm/drm_dp_helper.h |   16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index a21568b..70f362b 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -190,16 +190,16 @@
 # define DP_TRAIN_VOLTAGE_SWING_MASK   0x3
 # define DP_TRAIN_VOLTAGE_SWING_SHIFT  0
 # define DP_TRAIN_MAX_SWING_REACHED    (1 << 2)
-# define DP_TRAIN_VOLTAGE_SWING_400    (0 << 0)
-# define DP_TRAIN_VOLTAGE_SWING_600    (1 << 0)
-# define DP_TRAIN_VOLTAGE_SWING_800    (2 << 0)
-# define DP_TRAIN_VOLTAGE_SWING_1200   (3 << 0)
+# define DP_TRAIN_VOLTAGE_SWING_LEVEL_0 (0 << 0)
+# define DP_TRAIN_VOLTAGE_SWING_LEVEL_1 (1 << 0)
+# define DP_TRAIN_VOLTAGE_SWING_LEVEL_2 (2 << 0)
+# define DP_TRAIN_VOLTAGE_SWING_LEVEL_3 (3 << 0)

 # define DP_TRAIN_PRE_EMPHASIS_MASK    (3 << 3)
-# define DP_TRAIN_PRE_EMPHASIS_0       (0 << 3)
-# define DP_TRAIN_PRE_EMPHASIS_3_5     (1 << 3)
-# define DP_TRAIN_PRE_EMPHASIS_6       (2 << 3)
-# define DP_TRAIN_PRE_EMPHASIS_9_5     (3 << 3)
+# define DP_TRAIN_PRE_EMPHASIS_LEVEL_0  (0 << 3)
+# define DP_TRAIN_PRE_EMPHASIS_LEVEL_1  (1 << 3)
+# define DP_TRAIN_PRE_EMPHASIS_LEVEL_2  (2 << 3)
+# define DP_TRAIN_PRE_EMPHASIS_LEVEL_3  (3 << 3)

 # define DP_TRAIN_PRE_EMPHASIS_SHIFT   3
 # define DP_TRAIN_MAX_PRE_EMPHASIS_REACHED  (1 << 5)
-- 
1.7.10.4

[PATCH 0/6] Rename DP training vswing/pre-emph defines

2014-08-05 Thread sonika.jin...@intel.com

From: Sonika Jindal 

Rename the defines to use levels instead of nominal values for vswing and
pre-emphasis, because the actual values can differ in other scenarios, such as
the low vswing mode of eDP 1.4.
All the drivers are updated as well.
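
To illustrate what a level-based define encodes, here is a minimal sketch of
packing and unpacking a training lane value. The mask, shift and level values
are copied from include/drm/drm_dp_helper.h as renamed in patch 1/6; the
helper names vswing_level() and pre_emph_level() are hypothetical and not
part of this series.

```c
#include <stdint.h>

/* Values as in include/drm/drm_dp_helper.h after the rename. */
#define DP_TRAIN_VOLTAGE_SWING_MASK     0x3
#define DP_TRAIN_VOLTAGE_SWING_SHIFT    0
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_0  (0 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_1  (1 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_2  (2 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_3  (3 << 0)

#define DP_TRAIN_PRE_EMPHASIS_MASK      (3 << 3)
#define DP_TRAIN_PRE_EMPHASIS_SHIFT     3
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_0   (0 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_1   (1 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_2   (2 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_3   (3 << 3)

/* Hypothetical helper: extract the voltage swing level (0..3). */
static unsigned int vswing_level(uint8_t lane_set)
{
	return (lane_set & DP_TRAIN_VOLTAGE_SWING_MASK) >>
	       DP_TRAIN_VOLTAGE_SWING_SHIFT;
}

/* Hypothetical helper: extract the pre-emphasis level (0..3). */
static unsigned int pre_emph_level(uint8_t lane_set)
{
	return (lane_set & DP_TRAIN_PRE_EMPHASIS_MASK) >>
	       DP_TRAIN_PRE_EMPHASIS_SHIFT;
}
```

For example, DP_TRAIN_VOLTAGE_SWING_LEVEL_2 | DP_TRAIN_PRE_EMPHASIS_LEVEL_1
decodes to swing level 2 and pre-emphasis level 1. Which voltage or dB figure
a level stands for is then left to the platform.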

Sonika Jindal (6):
  drm: Renaming DP training vswing/pre-emph defines
  drm/i915: Renaming DP training vswing/pre-emph defines
  drm/exynos: Renaming DP training vswing/pre-emph defines
  drm/radeon: Renaming DP training vswing/pre-emph defines
  drm/gma500: Renaming DP training vswing/pre-emph defines
  drm/tegra: Renaming DP training vswing/pre-emph defines

 drivers/gpu/drm/exynos/exynos_dp_core.c |4 +-
 drivers/gpu/drm/gma500/cdv_intel_dp.c   |   20 ++--
 drivers/gpu/drm/gma500/intel_bios.c |   16 +--
 drivers/gpu/drm/i915/intel_bios.c   |   16 +--
 drivers/gpu/drm/i915/intel_dp.c |  194 +++
 drivers/gpu/drm/radeon/atombios_dp.c|4 +-
 drivers/gpu/drm/tegra/dpaux.c   |4 +-
 include/drm/drm_dp_helper.h |   16 +--
 8 files changed, 137 insertions(+), 137 deletions(-)

-- 
1.7.10.4

[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory

2014-08-05 Thread Christian König

Am 05.08.2014 um 16:24 schrieb Jerome Glisse:
> On Tue, Aug 05, 2014 at 04:11:04PM +0200, Christian König wrote:
>> From: Christian König
> Why do you want that ?
To avoid any problems with writeback (which as far as I understand 
should only happen on mmaped files).

> NACK until proper explanation and motive.
Going to update the commit message and add a code comment.

Christian.

>
>> Signed-off-by: Christian König
>> ---
>>   drivers/gpu/drm/radeon/radeon_gem.c | 3 ++-
>>   drivers/gpu/drm/radeon/radeon_ttm.c | 8 
>>   include/uapi/drm/radeon_drm.h   | 3 ++-
>>   3 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
>> b/drivers/gpu/drm/radeon/radeon_gem.c
>> index 993ab22..032736b 100644
>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
>> @@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, 
>> void *data,
>>  return -EACCES;
>>   
>>  /* reject unknown flag values */
>> -if (args->flags & ~RADEON_GEM_USERPTR_READONLY)
>> +if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
>> +RADEON_GEM_USERPTR_ANONONLY))
>>  return -EINVAL;
>>   
>>  /* readonly pages not tested on older hardware */
>> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
>> b/drivers/gpu/drm/radeon/radeon_ttm.c
>> index 0109090..d63e698 100644
>> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
>> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
>> @@ -542,6 +542,14 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
>> ttm->num_pages * PAGE_SIZE))
>>  return -EFAULT;
>>   
>> +if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
>> +unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
>> +struct vm_area_struct *vma;
>> +vma = find_vma(gtt->usermm, gtt->userptr);
>> +if (!vma || vma->vm_file || vma->vm_end < end)
>> +return -EPERM;
>> +}
>> +
>>  do {
>>  unsigned num_pages = ttm->num_pages - pinned;
>>  uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE;
>> diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
>> index a18ec54..4080ad3 100644
>> --- a/include/uapi/drm/radeon_drm.h
>> +++ b/include/uapi/drm/radeon_drm.h
>> @@ -810,7 +810,8 @@ struct drm_radeon_gem_create {
>>  uint32_t flags;
>>   };
>>   
>> -#define RADEON_GEM_USERPTR_READONLY 0x1
>> +#define RADEON_GEM_USERPTR_READONLY (1 << 0)
>> +#define RADEON_GEM_USERPTR_ANONONLY (1 << 1)
>>   
>>   struct drm_radeon_gem_userptr {
>>  uint64_t addr;
>> -- 
>> 1.9.1
>>
>> ___
>> dri-devel mailing list
>> dri-devel at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #5 from Tom Stellard  ---
Can you post the output of R600_DEBUG=cs from both the "good" and "bad"
commits?


[PATCH 5/5] drm/radeon: allow userptr write access under certain conditions

2014-08-05 Thread Christian König

From: Christian König

It needs to be anonymous memory (no file mappings)
and we are required to install an MMU notifier.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 2a6fbf1..01b5894 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -285,19 +285,24 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void 
*data,
if (offset_in_page(args->addr | args->size))
return -EINVAL;

-   /* we only support read only mappings for now */
-   if (!(args->flags & RADEON_GEM_USERPTR_READONLY))
-   return -EACCES;
-
/* reject unknown flag values */
if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE |
RADEON_GEM_USERPTR_REGISTER))
return -EINVAL;

-   /* readonly pages not tested on older hardware */
-   if (rdev->family < CHIP_R600)
-   return -EINVAL;
+   if (args->flags & RADEON_GEM_USERPTR_READONLY) {
+   /* readonly pages not tested on older hardware */
+   if (rdev->family < CHIP_R600)
+   return -EINVAL;
+
+   } else if (!(args->flags & RADEON_GEM_USERPTR_ANONONLY) ||
+  !(args->flags & RADEON_GEM_USERPTR_REGISTER)) {
+
+   /* if we want to write to it we must require anonymous
+  memory and install a MMU notifier */
+   return -EACCES;
+   }

down_read(&rdev->exclusive_lock);

-- 
1.9.1
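
The resulting flag handling can be modelled outside the kernel as follows.
This is a minimal sketch, not the driver code: userptr_check_flags() and its
r600_or_newer parameter are hypothetical, and the value of
RADEON_GEM_USERPTR_REGISTER is assumed, since its define is not visible in
this part of the thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 0..2 match the uapi defines shown in this series. */
#define RADEON_GEM_USERPTR_READONLY (1 << 0)
#define RADEON_GEM_USERPTR_ANONONLY (1 << 1)
#define RADEON_GEM_USERPTR_VALIDATE (1 << 2)
/* Assumed value, the define itself is not shown here. */
#define RADEON_GEM_USERPTR_REGISTER (1 << 3)

/*
 * Hypothetical stand-alone model of the flag checks in
 * radeon_gem_userptr_ioctl() after this patch.
 * r600_or_newer stands in for !(rdev->family < CHIP_R600).
 */
static int userptr_check_flags(uint32_t flags, bool r600_or_newer)
{
	const uint32_t known = RADEON_GEM_USERPTR_READONLY |
			       RADEON_GEM_USERPTR_ANONONLY |
			       RADEON_GEM_USERPTR_VALIDATE |
			       RADEON_GEM_USERPTR_REGISTER;

	/* reject unknown flag values */
	if (flags & ~known)
		return -EINVAL;

	if (flags & RADEON_GEM_USERPTR_READONLY) {
		/* readonly pages not tested on older hardware */
		if (!r600_or_newer)
			return -EINVAL;
	} else if (!(flags & RADEON_GEM_USERPTR_ANONONLY) ||
		   !(flags & RADEON_GEM_USERPTR_REGISTER)) {
		/* write access needs anonymous memory and an MMU notifier */
		return -EACCES;
	}

	return 0;
}
```

In this model a writable mapping is only accepted when both ANONONLY and
REGISTER are set, while a read-only mapping needs neither but is refused on
hardware older than R600.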

[PATCH 4/5] drm/radeon: add userptr flag to register MMU notifier v2

2014-08-05 Thread Christian König

From: Christian König

v2: rebased, fix mutex unlock in error path

Signed-off-by: Christian König
---
 drivers/gpu/drm/Kconfig|   1 +
 drivers/gpu/drm/radeon/Makefile|   2 +-
 drivers/gpu/drm/radeon/radeon.h|  12 ++
 drivers/gpu/drm/radeon/radeon_device.c |   2 +
 drivers/gpu/drm/radeon/radeon_gem.c|   9 +-
 drivers/gpu/drm/radeon/radeon_mn.c | 272 +
 drivers/gpu/drm/radeon/radeon_object.c |   1 +
 include/uapi/drm/radeon_drm.h  |   1 +
 8 files changed, 298 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/radeon_mn.c

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 9b2eedc..2745284 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -115,6 +115,7 @@ config DRM_RADEON
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select MMU_NOTIFIER
help
  Choose this option if you have an ATI Radeon graphics card.  There
  are both PCI and AGP versions.  You don't need to choose this to
diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 0013ad0..c7fa1ae 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -80,7 +80,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o rv740_dpm.o \
rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o trinity_dpm.o 
\
trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \
-   ci_dpm.o dce6_afmt.o radeon_vm.o radeon_ucode.o radeon_ib.o
+   ci_dpm.o dce6_afmt.o radeon_vm.o radeon_ucode.o radeon_ib.o radeon_mn.o

 # add async DMA block
 radeon-y += \
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 3c6999e..511191f 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -487,6 +488,9 @@ struct radeon_bo {

struct ttm_bo_kmap_obj  dma_buf_vmap;
pid_t   pid;
+
+   struct radeon_mn            *mn;
+   struct interval_tree_node   mn_it;
 };
 #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, gem_base)

@@ -1725,6 +1729,11 @@ void radeon_test_ring_sync(struct radeon_device *rdev,
   struct radeon_ring *cpB);
 void radeon_test_syncing(struct radeon_device *rdev);

+/*
+ * MMU Notifier
+ */
+int radeon_mn_register(struct radeon_bo *bo, unsigned long addr);
+void radeon_mn_unregister(struct radeon_bo *bo);

 /*
  * Debugfs
@@ -2372,6 +2381,9 @@ struct radeon_device {
/* tracking pinned memory */
u64 vram_pin_size;
u64 gart_pin_size;
+
+   struct mutex    mn_lock;
+   DECLARE_HASHTABLE(mn_hash, 7);
 };

 bool radeon_is_px(struct drm_device *dev);
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index c8ea050..c58f84f 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1270,6 +1270,8 @@ int radeon_device_init(struct radeon_device *rdev,
init_rwsem(&rdev->pm.mclk_lock);
init_rwsem(&rdev->exclusive_lock);
init_waitqueue_head(&rdev->irq.vblank_queue);
+   mutex_init(&rdev->mn_lock);
+   hash_init(rdev->mn_hash);
r = radeon_gem_init(rdev);
if (r)
return r;
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 4506560..2a6fbf1 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -291,7 +291,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void 
*data,

/* reject unknown flag values */
if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
-   RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE))
+   RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE |
+   RADEON_GEM_USERPTR_REGISTER))
return -EINVAL;

/* readonly pages not tested on older hardware */
@@ -312,6 +313,12 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void 
*data,
if (r)
goto release_object;

+   if (args->flags & RADEON_GEM_USERPTR_REGISTER) {
+   r = radeon_mn_register(bo, args->addr);
+   if (r)
+   goto release_object;
+   }
+
if (args->flags & RADEON_GEM_USERPTR_VALIDATE) {
down_read(&current->mm->mmap_sem);
r = radeon_bo_reserve(bo, true);
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c 
b/drivers/gpu/drm/radeon/radeon_mn.c
new file mode 100644
index 000..0157bc2
--- /dev/null
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -0,0 +1,272 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ *

[PATCH 3/5] drm/radeon: add userptr flag to directly validate the BO to GTT

2014-08-05 Thread Christian König

From: Christian König

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c | 18 +-
 include/uapi/drm/radeon_drm.h   |  1 +
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 032736b..4506560 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -291,7 +291,7 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void 
*data,

/* reject unknown flag values */
if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
-   RADEON_GEM_USERPTR_ANONONLY))
+   RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE))
return -EINVAL;

/* readonly pages not tested on older hardware */
@@ -312,6 +312,22 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void 
*data,
if (r)
goto release_object;

+   if (args->flags & RADEON_GEM_USERPTR_VALIDATE) {
+   down_read(&current->mm->mmap_sem);
+   r = radeon_bo_reserve(bo, true);
+   if (r) {
+   up_read(&current->mm->mmap_sem);
+   goto release_object;
+   }
+
+   radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT);
+   r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
+   radeon_bo_unreserve(bo);
+   up_read(&current->mm->mmap_sem);
+   if (r)
+   goto release_object;
+   }
+
r = drm_gem_handle_create(filp, gobj, &handle);
/* drop reference from allocate - handle holds it now */
drm_gem_object_unreference_unlocked(gobj);
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index 4080ad3..026111b 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -812,6 +812,7 @@ struct drm_radeon_gem_create {

 #define RADEON_GEM_USERPTR_READONLY    (1 << 0)
 #define RADEON_GEM_USERPTR_ANONONLY    (1 << 1)
+#define RADEON_GEM_USERPTR_VALIDATE    (1 << 2)

 struct drm_radeon_gem_userptr {
uint64_t    addr;
-- 
1.9.1

[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory

2014-08-05 Thread Christian König

From: Christian König

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c | 3 ++-
 drivers/gpu/drm/radeon/radeon_ttm.c | 8 
 include/uapi/drm/radeon_drm.h   | 3 ++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 993ab22..032736b 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void 
*data,
return -EACCES;

/* reject unknown flag values */
-   if (args->flags & ~RADEON_GEM_USERPTR_READONLY)
+   if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
+   RADEON_GEM_USERPTR_ANONONLY))
return -EINVAL;

/* readonly pages not tested on older hardware */
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
b/drivers/gpu/drm/radeon/radeon_ttm.c
index 0109090..d63e698 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -542,6 +542,14 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
   ttm->num_pages * PAGE_SIZE))
return -EFAULT;

+   if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
+   unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
+   struct vm_area_struct *vma;
+   vma = find_vma(gtt->usermm, gtt->userptr);
+   if (!vma || vma->vm_file || vma->vm_end < end)
+   return -EPERM;
+   }
+
do {
unsigned num_pages = ttm->num_pages - pinned;
uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE;
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index a18ec54..4080ad3 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -810,7 +810,8 @@ struct drm_radeon_gem_create {
uint32_t    flags;
 };

-#define RADEON_GEM_USERPTR_READONLY    0x1
+#define RADEON_GEM_USERPTR_READONLY    (1 << 0)
+#define RADEON_GEM_USERPTR_ANONONLY    (1 << 1)

 struct drm_radeon_gem_userptr {
uint64_t    addr;
-- 
1.9.1
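
The ANONONLY check added to radeon_ttm_tt_pin_userptr() can be modelled
outside the kernel like this. It is a minimal sketch under stated
assumptions: struct model_vma, model_find_vma() and check_anon_only() are
hypothetical stand-ins for struct vm_area_struct, find_vma() and the new
condition, and the region array is assumed to be sorted by address.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct vm_area_struct. */
struct model_vma {
	uint64_t start;    /* first address covered */
	uint64_t end;      /* one past the last address covered */
	bool file_backed;  /* models vma->vm_file != NULL */
};

/*
 * Models find_vma(): return the first region whose end lies above addr,
 * or NULL if there is none. Regions must be sorted by address.
 */
static const struct model_vma *
model_find_vma(const struct model_vma *vmas, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < vmas[i].end)
			return &vmas[i];
	return NULL;
}

/* Same condition as in the patch: one anonymous region must reach 'end'. */
static int check_anon_only(const struct model_vma *vmas, size_t n,
			   uint64_t userptr, uint64_t size)
{
	uint64_t end = userptr + size;
	const struct model_vma *vma = model_find_vma(vmas, n, userptr);

	if (!vma || vma->file_backed || vma->end < end)
		return -EPERM;
	return 0;
}
```

The model copies the posted condition as-is: a range is accepted only if the
region found for its start address is not file backed and extends at least to
the end of the range, so a buffer spanning two regions is refused.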

[PATCH 1/5] drm/radeon: add userptr support v6

2014-08-05 Thread Christian König

From: Christian König

This patch adds an IOCTL for turning a pointer supplied by
userspace into a buffer object.

It imposes several restrictions upon the memory being mapped:

1. It must be page aligned (both start/end addresses, i.e. ptr and size).

2. It must be normal system memory, not a pointer into another map of IO
space (e.g. it must not be a GTT mmapping of another object).

3. The BO is mapped into GTT, so the maximum amount of memory mapped at
all times is still the GTT limit.

4. The BO is only mapped readonly for now, so no write support.

5. List of backing pages is only acquired once, so they represent a
snapshot of the first use.

Exporting and sharing as well as mapping of buffer objects created by
this function is forbidden and results in an -EPERM.
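
Restriction 1 can be checked from userspace before calling the IOCTL. A
minimal sketch, assuming a power-of-two page size (for example the value
returned by sysconf(_SC_PAGESIZE)); the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if both the start address and the size are
 * multiples of page_size. page_size must be a power of two.
 * OR-ing addr and size lets a single mask test cover both values.
 */
static bool userptr_range_is_page_aligned(uint64_t addr, uint64_t size,
					  uint64_t page_size)
{
	return ((addr | size) & (page_size - 1)) == 0;
}
```

With 4096 byte pages, an address of 0x10000 and a size of 0x2000 pass, while
an address of 0x10001 or a size of 0x1800 do not.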

v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
pin/unpin pages on bind/unbind instead of populate/unpopulate
v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
flags, better handle READONLY flag, improve permission check
v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin

Signed-off-by: Christian König
Reviewed-by: Alex Deucher (v4)
Reviewed-by: Jérôme Glisse (v4)
---
 drivers/gpu/drm/radeon/radeon.h|   5 ++
 drivers/gpu/drm/radeon/radeon_cs.c |  25 +-
 drivers/gpu/drm/radeon/radeon_drv.c|   5 +-
 drivers/gpu/drm/radeon/radeon_gem.c|  68 
 drivers/gpu/drm/radeon/radeon_kms.c|   1 +
 drivers/gpu/drm/radeon/radeon_object.c |   3 +
 drivers/gpu/drm/radeon/radeon_prime.c  |  10 +++
 drivers/gpu/drm/radeon/radeon_ttm.c| 139 +
 drivers/gpu/drm/radeon/radeon_vm.c |   3 +
 include/uapi/drm/radeon_drm.h  |  11 +++
 10 files changed, 267 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 9e1732e..3c6999e 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2138,6 +2138,8 @@ int radeon_gem_info_ioctl(struct drm_device *dev, void 
*data,
  struct drm_file *filp);
 int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp);
+int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
+struct drm_file *filp);
 int radeon_gem_pin_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int radeon_gem_unpin_ioctl(struct drm_device *dev, void *data,
@@ -2871,6 +2873,9 @@ extern void radeon_legacy_set_clock_gating(struct 
radeon_device *rdev, int enabl
 extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int 
enable);
 extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 
domain);
 extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
+extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t addr,
+uint32_t flags);
+extern bool radeon_ttm_tt_has_userptr(struct ttm_tt *ttm);
 extern void radeon_vram_location(struct radeon_device *rdev, struct radeon_mc 
*mc, u64 base);
 extern void radeon_gtt_location(struct radeon_device *rdev, struct radeon_mc 
*mc);
 extern int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index ee712c1..1321491 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -78,7 +78,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
struct radeon_cs_chunk *chunk;
struct radeon_cs_buckets buckets;
unsigned i, j;
-   bool duplicate;
+   bool duplicate, need_mmap_lock = false;
+   int r;

if (p->chunk_relocs_idx == -1) {
return 0;
@@ -164,6 +165,19 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser 
*p)
p->relocs[i].allowed_domains = domain;
}

+   if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
+   uint32_t domain = p->relocs[i].prefered_domains;
+   if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
+   DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
+ "allowed for userptr BOs\n");
+   return -EINVAL;
+   }
+   need_mmap_lock = true;
+   domain = RADEON_GEM_DOMAIN_GTT;
+   p->relocs[i].prefered_domains = domain;
+   p->relocs[i].allowed_domains = domain;
+   }
+
p->relocs[i].tv.bo = >relocs[i].robj->tbo;
p->relocs[i].handle = r->handle;

@@ -176,8

[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #4 from Andy Furniss  ---
I bisected LLVM and it came up with -

ph4[llvm]$ git bisect good
ee17bf3fd4189d1981a6e908b4519e600ec7b002 is the first bad commit
commit ee17bf3fd4189d1981a6e908b4519e600ec7b002
Author: Matt Arsenault 
Date:   Fri Jul 25 23:02:42 2014 +

R600/SI: Allow partial unrolling and increase thresholds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213985 91177308-0d34-0410-b5e6-96231b3b80d8

I don't know when I'll get to do kernel yet.


[Bug 41762] radeon default power_profile "default" makes laptop overheat (Mobility Radeon HD 3650)

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=41762

--- Comment #11 from renich at woralelandia.com  ---
I am suffering from the same thing on Fedora 20. Even during install.


[Bug 41762] radeon default power_profile "default" makes laptop overheat (Mobility Radeon HD 3650)

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=41762

--- Comment #10 from renich at woralelandia.com  ---
Created attachment 104077
  --> https://bugs.freedesktop.org/attachment.cgi?id=104077&action=edit
journalctl -b

output of journalctl -b


[Bug 81644] Random crashes on RadeonSI with Chromium.

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=81644

--- Comment #41 from jackdachef at gmail.com ---
Created attachment 104076
  --> https://bugs.freedesktop.org/attachment.cgi?id=104076&action=edit
dmesg-output after 25 minute hardware-accelerated html5 video crash, no
hardlock this time (Magic SYSRQ works), screen corruption (screen subdivided
horizontally into ~18 parts)

kernel running with drm-next-3.17-rebased-on-fixes applied on top of 3.16-rc6

latest commit:
author     Christian König 2014-07-28 11:30:12 (GMT)
committer  Alex Deucher 2014-08-04 21:45:53 (GMT)
commit     fa783807977da98da35590fd1d5efdfd4f33fd59 (patch)
tree       0f1573ae770843228930a0f278a82eb5d482a4c5
parent     5fc6854683aad9ae8b711cbe0d824c11b4aad66c (diff)
drm/radeon: allow userptr write access under certain conditions



several hours of pushing and trying to get X/system lockup with firefox
(hardware acceleration enabled) and watching & opening up large jpg images -
showed that at least that issue was resolved (Bug #81612 )


Then now proceeded to re-test HTML5 video with hardware acceleration (hardware
acceleration disabled was seemingly stable so far)

the funny thing: each of the last 3 test attempts after pretty much exactly 25
minutes it tends to lock up X


reproducer: chromium 38.0.2107.3 (previous versions should also work), but this
one has more options disabled which should rule out other crash/instability
triggers,

youtube.com ,
keywords: movie trailers 2014

watching random movie trailers, preferably in 1080p (some are only available in
720p)


result: screen content locks up, mouse still movable for a short time & sound
continuing, the screen turning black - (box locking up/hardlock - this time
*not*) - this time: (in total 2) attempts to salvage via Magic SYSRQ + k

screen flickers, another Magic SYSRQ + k

screen turns on again, mentioned screen corruption (screen subdivided
horizontally into ~18 parts) with mostly white and green color in the shape of
tiles

took a photo, if needed


so we got a *clear* improvement: the box does *not* hardlock anymore, Magic
SYSRQ key works again and screen attempts to recover with Magic SYSRQ + k,

but it's not successful yet


hope the information of dmesg helps with further adding some ideas on how to
solve this


added the following patchset (patches 2-7) on top of that kernel
https://lkml.org/lkml/2014/8/3/120 ([PATCH 0/7] locking/rwsem: enable reader
opt-spinning & writer respin ), not sure if that might increase stability


Cheers


[Bug 81644] Random crashes on RadeonSI with Chromium.

2014-08-05 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=81644

--- Comment #40 from Alex Deucher  ---
(In reply to comment #39)
> 
> are there other ways to temporarily disable LLVM for debugging in radeonsi ?

llvm is required for radeonsi.

