[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #22 from Dave Airlie ---
Do you have radeon.dpm=0 in some file under /etc/modprobe.d or somewhere like that?

--
You are receiving this mail because:
You are the assignee for the bug.
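One way to answer the question above is to search both the modprobe configuration and the kernel command line for a dpm setting. The following is only a sketch: the function name find_dpm_overrides is made up for illustration, and the default paths are the usual locations on a typical distribution, which is an assumption about the reporter's system.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): list the places where a radeon
# dpm setting could be coming from.
#   $1  modprobe configuration directory (assumed default: /etc/modprobe.d)
#   $2  file holding the kernel command line (assumed default: /proc/cmdline)
find_dpm_overrides() {
    conf_dir="${1:-/etc/modprobe.d}"
    cmdline_file="${2:-/proc/cmdline}"

    # Lines of the form "options radeon ... dpm=N" in modprobe configuration.
    grep -rnE '^[[:space:]]*options[[:space:]]+radeon[[:space:]].*dpm=' \
        "$conf_dir" 2>/dev/null || true

    # A "radeon.dpm=N" token on the kernel command line.
    if [ -r "$cmdline_file" ]; then
        tr ' ' '\n' < "$cmdline_file" | grep -E '^radeon\.dpm=' || true
    fi
    return 0
}

find_dpm_overrides "$@"
```

No output means neither location sets the parameter, in which case the driver default applies.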
[sbc_gxx] kernel BUG at include/linux/mtd/map.h:148!
[4.237859] emc: device handler registered
[4.238463] osst :I: Tape driver with OnStream support version 0.99.4
[4.238463] osst :I: $Id: osst.c,v 1.73 2005/01/01 21:13:34 wriede Exp $
[4.243834] osd: LOADED open-osd 0.2.1
[4.260658] Rounding down aligned max_sectors from 4294967295 to 8388600
[4.280969] mtdoops: mtd device (mtddev=name/number) must be supplied
[4.282117] device id = 2440
[4.282605] device id = 2480
[4.283097] device id = 24c0
[4.283583] device id = 24d0
[4.284134] device id = 25a1
[4.284620] device id = 2670
[4.286157] SBC-GXx flash: IO:0x258-0x259 MEM:0xdc000-0xd
[4.287060] [ cut here ]
[4.287722] kernel BUG at include/linux/mtd/map.h:148!
[4.288048] invalid opcode: [#1] PREEMPT SMP
[4.288048] CPU 1
[4.288048] Pid: 1, comm: swapper/0 Not tainted 3.5.0-rc4-00162-g49099c4 #17 Bochs Bochs
[4.288048] RIP: 0010:[] [] mtd_do_chip_probe+0x1d/0x1f
[4.288048] RSP: 0018:880011049e20 EFLAGS: 00010246
[4.288048] RAX: RBX: 82a23550 RCX:
[4.288048] RDX: 880011049e20 RSI: 82a23580 RDI: 880011049e80
[4.288048] RBP: 880011049e80 R08: 0003 R09: 810d6c93
[4.288048] R10: R11: 0001 R12: 82a23eb0
[4.288048] R13: 828790ce R14: R15:
[4.288048] FS: () GS:88001260() knlGS:
[4.288048] CS: 0010 DS: ES: CR0: 8005003b
[4.288048] CR2: CR3: 0298c000 CR4: 000406e0
[4.288048] DR0: DR1: DR2:
[4.288048] DR3: DR6: 0ff0 DR7: 0400
[4.288048] Process swapper/0 (pid: 1, threadinfo 880011048000, task 88001104)
[4.288048] Stack:
[4.288048]
[4.288048]
[4.288048]
[4.288048] Call Trace:
[4.288048] [] cfi_probe+0x15/0x17
[4.288048] [] do_map_probe+0xa0/0xac
[4.288048] [] ? physmap_init+0x12/0x12
[4.288048] [] init_sbc_gxx+0x104/0x15b
[4.288048] [] do_one_initcall+0x86/0x208
[4.288048] [] kernel_init+0x10d/0x1c2
[4.288048] [] ? do_early_param+0xc3/0xc3
[4.288048] [] kernel_thread_helper+0x4/0x10
[4.288048] [] ? retint_restore_args+0x13/0x13
[4.288048] [] ? do_one_initcall+0x208/0x208
[4.288048] [] ? gs_change+0x13/0x13
[4.288048] Code: 83 c4 58 5b 41 5c 41 5d 41 5e 41 5f 5d c3 55 48 89 e5 48 83 ec 60 66 66 66 66 90 31 c0 b9 18 00 00 00 48 8d 55 a0 48 89 d7 f3 ab <0f> 0b 55 48 89 e5 66 66 66 66 90 48 c7 c6 a0 39 a2 82 e8 cc ff
[4.288048] RIP [] mtd_do_chip_probe+0x1d/0x1f
[4.288048] RSP
[4.321423] ---[ end trace 169195d5d1f9be6e ]---
[4.322118] swapper/0 (1) used greatest stack depth: 3768 bytes left
[4.323045] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[4.323045]
Elapsed time: 10
qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap -kernel /kernel/x86_64-randconfig-s0-08051229/49099c4991da3c94773f888aea2e9d27b8a7c6d1/vmlinuz-3.5.0-rc4-00162-g49099c4 -append 'hung_task_panic=1 earlyprintk=ttyS0,115200 debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=10 softlockup_panic=1 nmi_watchdog=panic prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal root=/dev/ram0 rw link=/kbuild-tests/run-queue/kvm/x86_64-randconfig-s0-08051229/linux-devel:devel-hourly-2014080511:49099c4991da3c94773f888aea2e9d27b8a7c6d1:bisect-linux1/.vmlinuz-49099c4991da3c94773f888aea2e9d27b8a7c6d1-20140805164127-2-kbuild branch=linux-devel/devel-hourly-2014080511 BOOT_IMAGE=/kernel/x86_64-randconfig-s0-08051229/49099c4991da3c94773f888aea2e9d27b8a7c6d1/vmlinuz-3.5.0-rc4-00162-g49099c4 drbd.minor_count=8' -initrd /kernel-tests/initrd/quantal-core-x86_64.cgz -m 320 -smp 2 -net nic,vlan=1,model=e1000 -net user,vlan=1 -boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -pidfile /dev/shm/kboot/pid-quantal-kbuild-15 -serial file:/dev/shm/kboot/serial-quantal-kbuild-15 -daemonize -display none -monitor null

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 3.5.0-rc4 Kernel Configuration
#
CONFIG_64BIT=y
# CONFIG_X86_32 is n
[Bug 81680] [r600g] Firefox crashes with hardware acceleration turned on
https://bugs.freedesktop.org/show_bug.cgi?id=81680

--- Comment #9 from Eugene ---
(In reply to comment #8)
> (In reply to comment #7)
> > Yes, I'm using Kubuntu. And Libgl1-mesa-dri-dbg recently installed. Last
> > report is here:
>
> It still doesn't resolve the symbols, please use addr2line.

Which frame address should I use it with, and how do I determine it?
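As a rough illustration of how a frame address can be fed to addr2line: glibc-style backtraces usually print frames inside a shared library as "path(+0xOFFSET) [0xABSOLUTE]", and the value to pass to addr2line is the offset in parentheses, not the absolute runtime address. The sketch below assumes that frame format; the helper name frame_to_addr2line and the library path in the example are hypothetical, and the crash report in this bug may use a different layout.

```shell
#!/bin/sh
# Hypothetical helper: turn one backtrace frame of the (assumed) form
#   /path/to/lib.so(+0x1a2b3c) [0x7f12345a2b3c]
# into the addr2line command that resolves it.
#   -e  selects the object file, -f prints the function name,
#   -C  demangles C++ symbols.
frame_to_addr2line() {
    printf '%s\n' "$1" |
        sed -nE 's#^([^(]+)\(\+(0x[0-9a-fA-F]+)\).*$#addr2line -f -C -e \1 \2#p'
}

# Example frame with a made-up path and offset:
frame_to_addr2line '/usr/lib/dri/r600_dri.so(+0x1a2b3c) [0x7f12345a2b3c]'
```

If only absolute addresses are available, the offset can be obtained by subtracting the library's load address (the start of its mapping in /proc/<pid>/maps) from the frame address.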
[PATCH 07/15] drm/exynos: dsi: Add support for panel prepare and unprepare routines
Hi Andrzej,

On Tue, Aug 5, 2014 at 3:33 PM, Andrzej Hajda wrote:
> Hi Ajay,
>
> On 07/31/2014 07:42 PM, Ajay Kumar wrote:
>> Modify exynos_dsi driver to support the new panel calls:
>> prepare and unprepare.
>>
>> Signed-off-by: Ajay Kumar
>> ---
>> drivers/gpu/drm/exynos/exynos_drm_dsi.c | 12 ++--
>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
>> index dc7c80b..4834932 100644
>> --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
>> +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
>> @@ -1351,7 +1351,7 @@ static int exynos_dsi_enable(struct exynos_dsi *dsi)
>>          if (ret < 0)
>>                  return ret;
>>
>> -        ret = drm_panel_enable(dsi->panel);
>> +        ret = drm_panel_prepare(dsi->panel);
>>          if (ret < 0) {
>>                  exynos_dsi_poweroff(dsi);
>>                  return ret;
>> @@ -1360,6 +1360,13 @@ static int exynos_dsi_enable(struct exynos_dsi *dsi)
>>          exynos_dsi_set_display_mode(dsi);
>>          exynos_dsi_set_display_enable(dsi, true);
>>
>> +        ret = drm_panel_enable(dsi->panel);
>> +        if (ret < 0) {
>> +                exynos_dsi_set_display_enable(dsi, false);
>
> I guess drm_panel_unprepare(dsi->panel) should be here.

Thanks for pointing it out.
I am not sure whether Thierry has already picked this up, since Inki has given his Acked-by.
In that case, you can send it as a separate fix :)

Ajay

>> +                exynos_dsi_poweroff(dsi);
>> +                return ret;
>> +        }
>> +
>>          dsi->state |= DSIM_STATE_ENABLED;
>>
>>          return 0;
>> @@ -1370,8 +1377,9 @@ static void exynos_dsi_disable(struct exynos_dsi *dsi)
>>          if (!(dsi->state & DSIM_STATE_ENABLED))
>>                  return;
>>
>> -        exynos_dsi_set_display_enable(dsi, false);
>>          drm_panel_disable(dsi->panel);
>> +        exynos_dsi_set_display_enable(dsi, false);
>> +        drm_panel_unprepare(dsi->panel);
>>          exynos_dsi_poweroff(dsi);
>>
>>          dsi->state &= ~DSIM_STATE_ENABLED;
>>
[PATCH v3 00/23] AMDKFD Kernel Driver
On 05/08/14 20:11, David Herrmann wrote:
> Hi
>
> On Tue, Aug 5, 2014 at 5:30 PM, Oded Gabbay wrote:
>> Hi,
>> Here is the v3 patch set of amdkfd.
>>
>> This version contains changes and fixes to code, as agreed on during the
>> review of the v2 patch set.
>>
>> The major changes are:
>>
>> - There are two new module parameters: # of processes and # of queues per
>>   process. The defaults, as agreed on in the v2 review, are 32 and 128
>>   respectively. This sets the default amount of GART address space that
>>   amdkfd requires to 3.5MB (3MB for userspace queues mqds and 0.5MB for
>>   other stuff, such as mqd for kernel queue, hpd for pipelines, etc.)
>>
>> - All the GART address space usage of amdkfd is done inside a single
>>   contiguous buffer that is allocated from system memory, and pinned to the
>>   start of the GART during the startup of amdkfd (which is just after the
>>   startup of radeon). The management of this buffer is done by the radeon
>>   sa manager. This buffer is not evict-able.
>>
>> - Mapping of doorbells is initiated by the userspace lib (by mmap syscall),
>>   instead of initiating it from inside an ioctl (using vm_mmap).
>>
>> - Removed ioctls for exclusive access to performance counters
>>
>> - Added documentation about the QCM (Queue Control Management), apertures
>>   and interfaces between amdkfd and radeon.
>>
>> Two important notes:
>>
>> - The topology patch has not been changed. Look at
>>   http://lists.freedesktop.org/archives/dri-devel/2014-July/065042.html
>>   for my response. I also put my answer as an explanation in the commit msg
>>   of the patch.
>
> This patchset adds 10.000 lines and contains nearly 0 comments *why*
> stuff is added. Seriously, it is almost impossible to understand what
> you're doing. Can you please include a high-level introduction in the
> [0/X] cover-letter and include it in every series you send? A
> blog-post or something would also be fine. And yes, it's totally ok if
> this is 10k lines of plain-text.
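The GART figures in the quoted cover letter can be sanity-checked with simple arithmetic. This is only a back-of-the-envelope sketch, and it rests on two assumptions that the thread does not state: that "3MB" means 3 * 1024 * 1024 bytes, and that the region is divided evenly among all userspace queues.

```shell
#!/bin/sh
# Back-of-the-envelope check of the GART numbers from the cover letter.
# Assumptions (not stated in the thread): 1MB = 1024*1024 bytes, and the 3MB
# region is shared evenly by every userspace queue.
processes=32             # default number of processes (from the cover letter)
queues_per_process=128   # default queues per process (from the cover letter)
mqd_region=$((3 * 1024 * 1024))

total_queues=$((processes * queues_per_process))
bytes_per_queue=$((mqd_region / total_queues))

echo "total queues:    $total_queues"
echo "bytes per queue: $bytes_per_queue"
```

Under those assumptions the defaults give 4096 queues and 768 bytes of GART space per userspace queue; the real per-queue layout may differ.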
My bad. I forgot to attach the cover letters of v2 and especially v1, which include a lengthy explanation of the driver. So here they are, and I will respond to your other comments later.

Oded

v2 cover letter:
---
As a continuation to the existing discussion, here is a v2 patch series restructured with a cleaner history and no totally-different-early-versions of the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them are modifications to radeon driver and 18 of them include only amdkfd code. There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under drm/radeon/amdkfd. This move was done to emphasize the fact that this driver is an AMD-only driver at this point. Having said that, we do foresee a generic hsa framework being implemented in the future and in that case, we will adjust amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to keep it as a separate driver from radeon. Therefore, the amdkfd code is contained in its own folder. The amdkfd folder was put under the radeon folder because the only AMD gfx driver in the Linux kernel at this point is the radeon driver. Having said that, we will probably need to move it (maybe to be directly under drm) after we integrate with additional AMD gfx drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay

Original Cover Letter:

This patch set implements a Heterogeneous System Architecture (HSA) driver for radeon-family GPUs. HSA allows different processor types (CPUs, DSPs, GPUs, etc..) to share system resources more effectively via HW features including shared pageable memory, userspace-accessible work queues, and platform-level atomics.
In addition to the memory protection mechanisms in GPUVM and IOMMUv2, the Sea Islands family of GPUs also performs HW-level validation of commands passed in through the queues (aka rings). The code in this patch set is intended to serve both as a sample driver for other HSA-compatible hardware devices and as a production driver for radeon-family processors. The code is architected to support multiple CPUs each with connected GPUs, although the current implementation focuses on a single Kaveri/Berlin APU, and works alongside the existing radeon kernel graphics driver (kgd). AMD GPUs designed for use with HSA (Sea Islands and up) share some hardware functionality between HSA compute and regular gfx/compute (memory, interrupts, registers), while other functionality has been added specifically for HSA compute (hw scheduler for virtualized compute rings). All shared hardware is owned by the radeon graphics driver, and an interface between kfd and kgd allows the kfd to
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #21 from Alex Deucher ---
(In reply to comment #20)
> (In reply to comment #19)
> > Maybe we'll get more useful feedback once more people start testing hawaii.
>
> That sounds like I failed to provide something? If you have any request,
> what I should check, just let me know. Ie. trying a different compiler?

I didn't mean to imply that. I can't think of anything else to provide. I'm just thinking maybe someone will notice some small detail that I missed or something like that.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #20 from Kai ---
(In reply to comment #19)
> Maybe we'll get more useful feedback once more people start testing hawaii.

That sounds like I failed to provide something? If there is anything you would like me to check, just let me know. E.g. trying a different compiler?
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #19 from Alex Deucher ---
(In reply to comment #18)
> (In reply to comment #16)
> > I don't have any other ideas off hand. That patch represents the only
> > difference that explicitly setting that parameter makes.
>
> Ok, no problem; I just keep the radeon.dpm=1 around and I'm going to be
> happy, I hope. But I guess we should keep this bug open, until we find the
> cause? Maybe we should change the title to something like "reclocking only
> with radeon.dpm=1 set"? But that's all your call.

Yeah, let's keep it open for now. Maybe we'll get more useful feedback once more people start testing hawaii.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #18 from Kai ---
(In reply to comment #16)
> I don't have any other ideas off hand. That patch represents the only
> difference that explicitly setting that parameter makes.

Ok, no problem; I'll just keep radeon.dpm=1 around and I'm going to be happy, I hope. But I guess we should keep this bug open until we find the cause? Maybe we should change the title to something like "reclocking only with radeon.dpm=1 set"? But that's all your call.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #17 from Kai ---
(In reply to comment #15)
> I booted each configuration represented by attachment 104103 and attachment
> 104104 two times.

Just to clarify: the boot and testing order was:
rebooting into configuration 104103 → starting Portal 2 with GALLIUM_HUD=fps → verifying FPS in level as low → powering off
booting configuration 104104 → starting Portal 2 with GALLIUM_HUD=fps → verifying FPS in level as high → powering off
booting configuration 104103 → starting Portal 2 with GALLIUM_HUD=fps → verifying FPS in level as low →
rebooting into configuration 104104 → starting Portal 2 with GALLIUM_HUD=fps → verifying FPS in level as high
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #16 from Alex Deucher ---
I don't have any other ideas off hand. That patch represents the only difference that explicitly setting that parameter makes.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #15 from Kai ---
(In reply to comment #11)
> Created attachment 104101 [details] [review]
> enable dpm=1 debugging even when dpm is not forced
>
> This patch enables the additional dpm debugging output even when it is not
> explicitly set on the command line. Does it help? The only thing I can
> figure is that the debugging output adds a small delay that may have a
> positive impact.

You're not going to like this, but setting radeon.dpm=1 must have some other side effect. I booted each configuration represented by attachment 104103 and attachment 104104 two times.

The first (104103) is the stack from comment #0 plus the patch from attachment 104101 applied to the kernel, then booted without radeon.dpm=1 (see the dmesg output for the kernel command line). When I start Portal 2 I stay at the numbers reported in comment #0 (i.e. at low FPS).

If I boot the stack from comment #0 with the patch from attachment 104101 applied to the kernel and DO set radeon.dpm=1 on the kernel command line (see second dmesg output; 104104), then I get 60 FPS in Portal 2.
[Bug 68799] [APITRACE] Hyper-Z lockup with Falcon BMS 4.32u6 on CAYMAN
https://bugs.freedesktop.org/show_bug.cgi?id=68799

--- Comment #3 from Stanisław Halik ---
Available at last address again.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #14 from Alex Deucher ---
Did it help? With the patch applied, the behavior of the driver is identical whether or not you append radeon.dpm=1 to your kernel command line.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #13 from Kai ---
Created attachment 104104
  --> https://bugs.freedesktop.org/attachment.cgi?id=104104&action=edit
dmesg output with attachment 104101 and "radeon.dpm=1" set
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #12 from Kai ---
Created attachment 104103
  --> https://bugs.freedesktop.org/attachment.cgi?id=104103&action=edit
dmesg output with attachment 104101 and no "radeon.dpm=1" set
screen goes blank when loading gma500_gfx (atom D2500)
05.08.2014 20:11, Michael Tokarev wrote:
> Hello again.
>
> It's been 4 more months since last message in this thread (which was mine).
> Now kernel 3.16 has been released, and I decided to give it a try. And it
> behaves just like all previous kernels, -- once gma500_gfx module is loaded,
> screen goes blank, monitor turns off ("no signal detected") and nothing to
> be seen until reboot.
>
> Can we try to debug this somehow, after more than half a year?... :)

Current debugging (by 3.16), after:

  modprobe drm debug=6
  modprobe gma500_gfx

on a freshly booted system:

[ 46.463381] Linux agpgart interface v0.103
[ 46.491487] [drm] Initialized drm 1.1.0 20060810
[ 56.585520] [drm:psb_intel_opregion_setup] Public ACPI methods supported
[ 56.585528] [drm:psb_intel_opregion_setup] ASLE supported
[ 56.585563] gma500 :00:02.0: irq 50 for MSI/MSI-X
[ 56.585591] [drm:psb_intel_init_bios] Using VBT from OpRegion: $VBT CEDARVIEW d
[ 56.585604] [drm:drm_mode_debug_printmodeline] Modeline 0:"1920x1080" 0 144000 1920 2016 2080 2176 1080 1088 1092 1100 0x8 0xa
[ 56.585609] [drm:parse_sdvo_device_mapping] No SDVO device info is found in VBT
[ 56.585617] [drm:parse_edp] EDP timing in vbt t1_t3 2000 t8 10 t9 2000 t10 500 t11_t12 5000
[ 56.585621] [drm:parse_edp] VBT reports EDP: Lane_count 1, Lane_rate 6, Bpp 24
[ 56.585624] [drm:parse_edp] VBT reports EDP: VSwing 0, Preemph 0
[ 56.598203] ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no)
[ 56.598902] acpi device:28: registered as cooling_device2
[ 56.599109] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input11
[ 56.599326] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 56.599366] [drm] No driver support for vblank timestamp query.
[ 56.650918] [drm:drm_do_probe_ddc_edid] drm: skipping non-existent adapter intel drm LVDSDDC_C
[ 56.651842] [drm:cdv_intel_dp_i2c_init] i2c_init DPDDC-B
[ 56.652352] [drm:cdv_intel_dp_aux_ch] dp_aux_ch timeout status 0x51440064
[ 56.652356] [drm:cdv_intel_dp_i2c_aux_ch] aux_ch failed -110
[ 56.652863] [drm:cdv_intel_dp_aux_ch] dp_aux_ch timeout status 0x51440064
[ 56.652866] [drm:cdv_intel_dp_i2c_aux_ch] aux_ch failed -110
[ 56.653706] [drm:cdv_intel_dp_i2c_init] i2c_init DPDDC-C
[ 56.654014] [drm:cdv_intel_dp_i2c_aux_ch] aux_i2c nack
[ 56.654223] [drm:cdv_intel_dp_i2c_aux_ch] aux_i2c nack
[ 56.714765] gma500 :00:02.0: trying to get vblank count for disabled pipe 1
[ 56.714812] gma500 :00:02.0: trying to get vblank count for disabled pipe 1
[ 56.775220] [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:10:VGA-1]
[ 56.900606] [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:10:VGA-1] probed modes :
[ 56.900617] [drm:drm_mode_debug_printmodeline] Modeline 26:"1280x1024" 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48 0x5
[ 56.900624] [drm:drm_mode_debug_printmodeline] Modeline 36:"1280x1024" 75 135000 1280 1296 1440 1688 1024 1025 1028 1066 0x40 0x5
[ 56.900630] [drm:drm_mode_debug_printmodeline] Modeline 29:"1280x1024" 72 132840 1280 1368 1504 1728 1024 1025 1028 1067 0x0 0x6
[ 56.900637] [drm:drm_mode_debug_printmodeline] Modeline 28:"1152x864" 75 108000 1152 1216 1344 1600 864 865 868 900 0x40 0x5
[ 56.900643] [drm:drm_mode_debug_printmodeline] Modeline 37:"1024x768" 75 78800 1024 1040 1136 1312 768 769 772 800 0x40 0x5
[ 56.900649] [drm:drm_mode_debug_printmodeline] Modeline 38:"1024x768" 70 75000 1024 1048 1184 1328 768 771 777 806 0x40 0xa
[ 56.900656] [drm:drm_mode_debug_printmodeline] Modeline 39:"1024x768" 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa
[ 56.900662] [drm:drm_mode_debug_printmodeline] Modeline 40:"832x624" 75 57284 832 864 928 1152 624 625 628 667 0x40 0xa
[ 56.900669] [drm:drm_mode_debug_printmodeline] Modeline 41:"800x600" 75 49500 800 816 896 1056 600 601 604 625 0x40 0x5
[ 56.900675] [drm:drm_mode_debug_printmodeline] Modeline 42:"800x600" 72 5 800 856 976 1040 600 637 643 666 0x40 0x5
[ 56.900681] [drm:drm_mode_debug_printmodeline] Modeline 30:"800x600" 60 4 800 840 968 1056 600 601 605 628 0x40 0x5
[ 56.900687] [drm:drm_mode_debug_printmodeline] Modeline 31:"640x480" 75 31500 640 656 720 840 480 481 484 500 0x40 0xa
[ 56.900694] [drm:drm_mode_debug_printmodeline] Modeline 32:"640x480" 73 31500 640 664 704 832 480 489 491 520 0x40 0xa
[ 56.900700] [drm:drm_mode_debug_printmodeline] Modeline 33:"640x480" 67 30240 640 704 768 864 480 483 486 525 0x40 0xa
[ 56.900706] [drm:drm_mode_debug_printmodeline] Modeline 34:"640x480" 60 25200 640 656 752 800 480 490 492 525 0x40 0xa
[ 56.900713] [drm:drm_mode_debug_printmodeline] Modeline 35:"720x400" 70 28320 720 738 846 900 400 412 414 449 0x40 0x6
[ 56.900719] [drm:drm_mode_debug_printmodeline] Modeline 27:"640x350" 70 25170 640 656 752 800 350 387 389 449 0x40 0x9
[ 56.900724]
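For anyone reading the Modeline entries in the log above: as far as I recall, drm_mode_debug_printmodeline prints, after the mode name, the nominal refresh rate, the pixel clock in kHz, four horizontal timings (ending with htotal) and four vertical timings (ending with vtotal), followed by type and flags. Assuming that field order, the refresh rate can be cross-checked as clock / (htotal * vtotal). The helper name below is made up for illustration, and the arithmetic assumes a shell with 64-bit integers.

```shell
#!/bin/sh
# Hypothetical helper: modeline_refresh_mhz CLOCK_KHZ HTOTAL VTOTAL
# Prints the refresh rate in millihertz, using integer arithmetic only.
# Assumes the shell evaluates $(( )) with 64-bit integers.
modeline_refresh_mhz() {
    echo $(( $1 * 1000000 / ($2 * $3) ))
}

# Values taken from the first probed 1280x1024 entry in the log:
# clock 108000 kHz, htotal 1688, vtotal 1066.
modeline_refresh_mhz 108000 1688 1066
```

For that entry the result is 60019 mHz, which agrees with the nominal 60 Hz printed in the same log line.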
screen goes blank when loading gma500_gfx (atom D2500)
Hello again.

It's been 4 more months since the last message in this thread (which was mine). Now kernel 3.16 has been released, and I decided to give it a try. It behaves just like all previous kernels: once the gma500_gfx module is loaded, the screen goes blank, the monitor turns off ("no signal detected") and nothing can be seen until reboot.

Can we try to debug this somehow, after more than half a year?... :)

Thank you,

/mjt

05.04.2014 12:15, Michael Tokarev wrote:
> Hello again
>
> It's been about 2 months since I sent the original debugging output. Today I tried
> out 3.14 kernel. And this one behaves quite similarly, screen goes blank right
> when loading gma500_gfx module. Here's the dmesg from a freshly booted system
> after doing
>
>   modprobe drm debug=6
>   modprobe gma500_gfx
>
> with a monitor connected to VGA port (before loading gma500_gfx, it displays the
> regular text console):
>
> [ 39.863330] Linux agpgart interface v0.103
> [ 39.900511] [drm] Initialized drm 1.1.0 20060810
> [ 45.012300] [drm:psb_intel_opregion_setup], Public ACPI methods supported
> [ 45.012308] [drm:psb_intel_opregion_setup], ASLE supported
> [ 45.012345] gma500 :00:02.0: irq 50 for MSI/MSI-X
> [ 45.012371] [drm:psb_intel_init_bios], Using VBT from OpRegion: $VBT CEDARVIEW d
> [ 45.012384] [drm:drm_mode_debug_printmodeline], Modeline 0:"1920x1080" 0 144000 1920 2016 2080 2176 1080 1088 1092 1100 0x8 0xa
> [ 45.012389] [drm:parse_sdvo_device_mapping], No SDVO device info is found in VBT
> [ 45.012397] [drm:parse_edp], EDP timing in vbt t1_t3 2000 t8 10 t9 2000 t10 500 t11_t12 5000
> [ 45.012401] [drm:parse_edp], VBT reports EDP: Lane_count 1, Lane_rate 6, Bpp 24
> [ 45.012405] [drm:parse_edp], VBT reports EDP: VSwing 0, Preemph 0
> [ 45.012478] gma500 :00:02.0: GPU: power management timed out.
> [ 45.026195] ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no)
> [ 45.026891] acpi device:29: registered as cooling_device2
> [ 45.027104] input: Video Bus as /devices/LNXSYSTM:00/device:00/PNP0A08:00/LNXVIDEO:00/input/input11
> [ 45.027681] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [ 45.027726] [drm] No driver support for vblank timestamp query.
> [ 45.078928] [drm:drm_do_probe_ddc_edid], drm: skipping non-existent adapter intel drm LVDSDDC_C
> [ 45.079839] [drm:cdv_intel_dp_i2c_init], i2c_init DPDDC-B
> [ 45.080383] [drm:cdv_intel_dp_aux_ch], dp_aux_ch timeout status 0x51440064
> [ 45.080388] [drm:cdv_intel_dp_i2c_aux_ch], aux_ch failed -110
> [ 45.080896] [drm:cdv_intel_dp_aux_ch], dp_aux_ch timeout status 0x51440064
> [ 45.080899] [drm:cdv_intel_dp_i2c_aux_ch], aux_ch failed -110
> [ 45.081754] [drm:cdv_intel_dp_i2c_init], i2c_init DPDDC-C
> [ 45.082062] [drm:cdv_intel_dp_i2c_aux_ch], aux_i2c nack
> [ 45.082272] [drm:cdv_intel_dp_i2c_aux_ch], aux_i2c nack
> [ 45.122742] [drm:cdv_intel_single_pipe_active], pipe enabled 0
> [ 45.142780] gma500 :00:02.0: trying to get vblank count for disabled pipe 1
> [ 45.142826] gma500 :00:02.0: trying to get vblank count for disabled pipe 1
> [ 45.183207] [drm:cdv_intel_single_pipe_active], pipe enabled 0
> [ 45.203249] [drm:drm_helper_probe_single_connector_modes], [CONNECTOR:7:VGA-1]
> [ 45.332286] [drm:drm_helper_probe_single_connector_modes], [CONNECTOR:7:VGA-1] probed modes :
> [ 45.332297] [drm:drm_mode_debug_printmodeline], Modeline 23:"1280x1024" 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48 0x5
> [ 45.332304] [drm:drm_mode_debug_printmodeline], Modeline 33:"1280x1024" 75 135000 1280 1296 1440 1688 1024 1025 1028 1066 0x40 0x5
> [ 45.332311] [drm:drm_mode_debug_printmodeline], Modeline 26:"1280x1024" 72 132840 1280 1368 1504 1728 1024 1025 1028 1067 0x0 0x6
> [ 45.332318] [drm:drm_mode_debug_printmodeline], Modeline 25:"1152x864" 75 108000 1152 1216 1344 1600 864 865 868 900 0x40 0x5
> [ 45.332325] [drm:drm_mode_debug_printmodeline], Modeline 34:"1024x768" 75 78800 1024 1040 1136 1312 768 769 772 800 0x40 0x5
> [ 45.332332] [drm:drm_mode_debug_printmodeline], Modeline 35:"1024x768" 70 75000 1024 1048 1184 1328 768 771 777 806 0x40 0xa
> [ 45.332338] [drm:drm_mode_debug_printmodeline], Modeline 36:"1024x768" 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa
> [ 45.332345] [drm:drm_mode_debug_printmodeline], Modeline 37:"832x624" 75 57284 832 864 928 1152 624 625 628 667 0x40 0xa
> [ 45.332352] [drm:drm_mode_debug_printmodeline], Modeline 38:"800x600" 75 49500 800 816 896 1056 600 601 604 625 0x40 0x5
> [ 45.332359] [drm:drm_mode_debug_printmodeline], Modeline 39:"800x600" 72 5 800 856 976 1040 600 637 643 666 0x40 0x5
> [ 45.332365] [drm:drm_mode_debug_printmodeline], Modeline 27:"800x600" 60 4 800 840 968 1056 600 601 605 628 0x40 0x5
> [ 45.332372] [drm:drm_mode_debug_printmodeline], Modeline 28:"640x480" 75 31500 640 656 720 840 480 481
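A note on the "modprobe drm debug=6" step used for these logs: the drm debug parameter is a bitmask. To the best of my recollection, kernels of this era define bit 0 (0x1) as CORE, bit 1 (0x2) as DRIVER, bit 2 (0x4) as KMS and bit 3 (0x8) as PRIME; treat that mapping as an assumption and check the documentation of the kernel actually in use. The helper below is a hypothetical sketch that decodes a mask under that mapping.

```shell
#!/bin/sh
# Hypothetical helper: decode_drm_debug MASK
# Prints the drm debug categories enabled by MASK, assuming the bit layout
# CORE=0x1, DRIVER=0x2, KMS=0x4, PRIME=0x8 (an assumption, see lead-in).
decode_drm_debug() {
    mask=$1
    out=""
    if [ $((mask & 1)) -ne 0 ]; then out="$out CORE"; fi
    if [ $((mask & 2)) -ne 0 ]; then out="$out DRIVER"; fi
    if [ $((mask & 4)) -ne 0 ]; then out="$out KMS"; fi
    if [ $((mask & 8)) -ne 0 ]; then out="$out PRIME"; fi
    # Strip the leading space left by the first appended name.
    echo "${out# }"
}

decode_drm_debug 6
```

Under that mapping, debug=6 selects driver and KMS messages, which matches the kind of output captured in the logs; adding bit 0 (debug=7) would also include core messages.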
[sbc_gxx] kernel BUG at include/linux/mtd/map.h:148!
On Tue, Aug 5, 2014 at 9:59 AM, Fengguang Wu wrote:
> Hello,
>
> This is an old BUG that still lives in linux-next.
>
> [4.284620] device id = 2670
> [4.286157] SBC-GXx flash: IO:0x258-0x259 MEM:0xdc000-0xd
> [4.287060] [ cut here ]
> [4.287722] kernel BUG at include/linux/mtd/map.h:148!
> [4.288048] invalid opcode: [#1] PREEMPT SMP
> [4.288048] CPU 1
> [4.288048] Pid: 1, comm: swapper/0 Not tainted 3.5.0-rc4-00162-g49099c4 #17 Bochs Bochs
> [4.288048] RIP: 0010:[] [] mtd_do_chip_probe+0x1d/0x1f
> [4.288048] RSP: 0018:880011049e20 EFLAGS: 00010246
> [4.288048] RAX: RBX: 82a23550 RCX:
> [4.288048] RDX: 880011049e20 RSI: 82a23580 RDI: 880011049e80
> [4.288048] RBP: 880011049e80 R08: 0003 R09: 810d6c93
> [4.288048] R10: R11: 0001 R12: 82a23eb0
> [4.288048] R13: 828790ce R14: R15:
> [4.288048] FS: () GS:88001260() knlGS:
> [4.288048] CS: 0010 DS: ES: CR0: 8005003b
> [4.288048] CR2: CR3: 0298c000 CR4: 000406e0
> [4.288048] DR0: DR1: DR2:
> [4.288048] DR3: DR6: 0ff0 DR7: 0400
> [4.288048] Process swapper/0 (pid: 1, threadinfo 880011048000, task 88001104)
> [4.288048] Stack:
> [4.288048]
> [4.288048]
> [4.288048]
> [4.288048] Call Trace:
> [4.288048] [] cfi_probe+0x15/0x17
> [4.288048] [] do_map_probe+0xa0/0xac
> [4.288048] [] ? physmap_init+0x12/0x12
> [4.288048] [] init_sbc_gxx+0x104/0x15b
> [4.288048] [] do_one_initcall+0x86/0x208
> [4.288048] [] kernel_init+0x10d/0x1c2
> [4.288048] [] ? do_early_param+0xc3/0xc3
> [4.288048] [] kernel_thread_helper+0x4/0x10
> [4.288048] [] ? retint_restore_args+0x13/0x13
> [4.288048] [] ? do_one_initcall+0x208/0x208
> [4.288048] [] ? gs_change+0x13/0x13
> [4.288048] Code: 83 c4 58 5b 41 5c 41 5d 41 5e 41 5f 5d c3 55 48 89 e5 48 83 ec 60 66 66 66 66 90 31 c0 b9 18 00 00 00 48 8d 55 a0 48 89 d7 f3 ab <0f> 0b 55 48 89 e5 66 66 66 66 90 48 c7 c6 a0 39 a2 82 e8 cc ff
> [4.288048] RIP [] mtd_do_chip_probe+0x1d/0x1f
> [4.288048] RSP
> [4.321423] ---[ end trace 169195d5d1f9be6e ]---
> [4.322118] swapper/0 (1) used greatest stack depth: 3768 bytes left
>
> This script may reproduce the error.
>
> #!/bin/bash
>
> kernel=$1
> initrd=quantal-core-x86_64.cgz
>
> wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
>
> kvm=(
>         qemu-system-x86_64
>         -enable-kvm
>         -cpu Haswell,+smep,+smap
>         -kernel $kernel
>         -initrd $initrd
>         -m 320
>         -smp 2
>         -net nic,vlan=1,model=e1000
>         -net user,vlan=1
>         -boot order=nc
>         -no-reboot
>         -watchdog i6300esb
>         -rtc base=localtime
>         -serial stdio
>         -display none
>         -monitor null
> )
>
> append=(
>         hung_task_panic=1
>         earlyprintk=ttyS0,115200
>         debug
>         apic=debug
>         sysrq_always_enabled
>         rcupdate.rcu_cpu_stall_timeout=100
>         panic=10
>         softlockup_panic=1
>         nmi_watchdog=panic
>         prompt_ramdisk=0
>         console=ttyS0,115200
>         console=tty0
>         vga=normal
>         root=/dev/ram0
>         rw
>         drbd.minor_count=8
> )
>
> "${kvm[@]}" --append "${append[*]}"
>
> Thanks,
> Fengguang

I am new here and will try to trace your issue on Linus's tree, unless there is a major difference between Linus's tree and linux-next. If there is, please let me know before I start tracing this.

Best Regards,
Nick
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #11 from Alex Deucher --- Created attachment 104101 --> https://bugs.freedesktop.org/attachment.cgi?id=104101=edit enable dpm=1 debugging even when dpm is not forced This patch enables the additional dpm debugging output even when it is not explicitly set on the command line. Does it help? The only thing I can figure is that the debugging output adds a small delay that may have a positive impact.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #10 from Kai --- (In reply to comment #8) > dpm is enabled by default for hawaii asics. You shouldn't need to force it > on the command line. forcing it just enabled additional debugging output. I can only state that by setting radeon.dpm=1 I get 60 FPS in e.g. Portal 2, and without it I'm at 15 FPS max. As written in comment #0, I've built your drm-next-3.17-rebased-on-fixes branch; my top commit is: commit fa783807977da98da35590fd1d5efdfd4f33fd59 Author: Christian König Date: Mon Jul 28 13:30:12 2014 +0200 drm/radeon: allow userptr write access under certain conditions It needs to be anonymous memory (no file mappings) and we are required to install an MMU notifier. Signed-off-by: Christian König Signed-off-by: Alex Deucher I even went through several reboots, switching between "with radeon.dpm=1" and without. All showed the same result. Let me know if there is something else I can do to assist in debugging this.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #9 from Kai --- (In reply to comment #7) > (In reply to comment #6) > > Now for your glxgears test: reclocking works (in Portal 2 as well, where I > > get 58-60 FPS now). The only difference is the radeon.dpm=1 on the kernel > > command line. > > Are you absolutely sure you need radeon.dpm=1? Yes. > Reclocking works here (R9 290X) without it. I just rechecked and I don't have it on my kernel command > line (new "drm-next-3.17" branch). Nor do I have it anywhere in /etc. If you are unsure what you've booted with, look at dmesg; one of the first lines looks like: > Command line: BOOT_IMAGE=/vmlinuz-3.16.0-rc6-citadel > root=/dev/mapper/citadel--vg-vol--root ro quiet radeon.dpm=1
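For anyone who wants to script the check described above, here is a minimal sketch in Python. It assumes the command line is taken from /proc/cmdline (or pasted from the dmesg "Command line:" entry) and that, if the parameter is repeated, the last occurrence is the one worth reporting. The helper name radeon_dpm_setting is made up for illustration and is not part of any tool mentioned in this thread.

```python
from typing import Optional


def radeon_dpm_setting(cmdline: str) -> Optional[str]:
    """Return the value given for radeon.dpm on a kernel command line.

    Hypothetical helper: it splits on whitespace only, so quoted
    parameters are not handled. Returns None when radeon.dpm is absent.
    When the parameter appears several times the last value is returned,
    which is an assumption about which one takes effect.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "radeon.dpm" and sep:
            value = val
    return value
```

To check the running kernel, pass it the contents of /proc/cmdline. A result of None only means the parameter is not on the command line; it may still be set through a modprobe configuration file.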
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #8 from Alex Deucher --- dpm is enabled by default for hawaii asics. You shouldn't need to force it on the command line. Forcing it just enables additional debugging output.
[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory v2
On 05.08.2014 at 19:39, Jerome Glisse wrote: > On Tue, Aug 05, 2014 at 06:05:29PM +0200, Christian König wrote: >> From: Christian König >> >> Avoid problems with writeback by limiting userptr to anonymous memory. >> >> v2: add commit and code comments > I guess, i have not expressed myself clearly. This is bogus, you pretend > you want to avoid writeback issue but you still allow userspace to map > file backed pages (which by the way might be a regular bo object from > another device for instance and that would be fun). > > So this patch is a no go and i would rather see that this userptr to > be restricted to anon vma only no matter what. No flags here. Mapping of non anonymous memory (e.g. everything get_user_pages won't fail with) is restricted to read only access by the GPU. I'm fine with making it a hard requirement for all mappings if you say it's a must have. Christian. > > Cheers, > Jérôme > >> Signed-off-by: Christian König >> --- >> drivers/gpu/drm/radeon/radeon_gem.c | 3 ++- >> drivers/gpu/drm/radeon/radeon_ttm.c | 10 ++ >> include/uapi/drm/radeon_drm.h | 1 + >> 3 files changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c >> b/drivers/gpu/drm/radeon/radeon_gem.c >> index 993ab22..032736b 100644 >> --- a/drivers/gpu/drm/radeon/radeon_gem.c >> +++ b/drivers/gpu/drm/radeon/radeon_gem.c >> @@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, >> void *data, >> return -EACCES; >> >> /* reject unknown flag values */ >> -if (args->flags & ~RADEON_GEM_USERPTR_READONLY) >> +if (args->flags & ~(RADEON_GEM_USERPTR_READONLY | >> +RADEON_GEM_USERPTR_ANONONLY)) >> return -EINVAL; >> >> /* readonly pages not tested on older hardware */ >> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c >> b/drivers/gpu/drm/radeon/radeon_ttm.c >> index 0109090..54eb7bc 100644 >> --- a/drivers/gpu/drm/radeon/radeon_ttm.c >> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c >> @@ -542,6 +542,16 @@ static int 
radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) >> ttm->num_pages * PAGE_SIZE)) >> return -EFAULT; >> >> +if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) { >> +/* check that we only pin down anonymous memory >> + to prevent problems with writeback */ >> +unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE; >> +struct vm_area_struct *vma; >> +vma = find_vma(gtt->usermm, gtt->userptr); >> +if (!vma || vma->vm_file || vma->vm_end < end) >> +return -EPERM; >> +} >> + >> do { >> unsigned num_pages = ttm->num_pages - pinned; >> uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE; >> diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h >> index 3a9f209..9720e1a 100644 >> --- a/include/uapi/drm/radeon_drm.h >> +++ b/include/uapi/drm/radeon_drm.h >> @@ -816,6 +816,7 @@ struct drm_radeon_gem_create { >>* perform any operation. >>*/ >> #define RADEON_GEM_USERPTR_READONLY(1 << 0) >> +#define RADEON_GEM_USERPTR_ANONONLY (1 << 1) >> >> struct drm_radeon_gem_userptr { >> uint64_taddr; >> -- >> 1.9.1 >> >> ___ >> dri-devel mailing list >> dri-devel at lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/dri-devel
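The two checks in the quoted patch (rejecting unknown flag bits, then refusing file-backed or too-short mappings when ANONONLY is set) can be modelled outside the kernel. The following is only a sketch in Python under stated assumptions: the flag values are the ones from the quoted radeon_drm.h hunk, while the Vma class and both function names are invented stand-ins for illustration and do not exist in the driver.

```python
import errno
from dataclasses import dataclass
from typing import Optional

# Flag values as defined in the quoted include/uapi/drm/radeon_drm.h hunk.
RADEON_GEM_USERPTR_READONLY = 1 << 0
RADEON_GEM_USERPTR_ANONONLY = 1 << 1
KNOWN_FLAGS = RADEON_GEM_USERPTR_READONLY | RADEON_GEM_USERPTR_ANONONLY


@dataclass
class Vma:
    """Toy stand-in for struct vm_area_struct (illustration only)."""
    vm_end: int
    vm_file: Optional[str]  # None models an anonymous mapping


def check_flags(flags: int) -> int:
    """Model of the 'reject unknown flag values' test: 0 or -EINVAL."""
    if flags & ~KNOWN_FLAGS:
        return -errno.EINVAL
    return 0


def check_anon_only(flags: int, vma: Optional[Vma], end: int) -> int:
    """Model of the ANONONLY test from the patch: 0 or -EPERM.

    'vma' plays the role of the find_vma() result and 'end' is
    userptr + num_pages * PAGE_SIZE, as in the quoted code.
    """
    if flags & RADEON_GEM_USERPTR_ANONONLY:
        if vma is None or vma.vm_file is not None or vma.vm_end < end:
            return -errno.EPERM
    return 0
```

Note that in this model, as in the patch, the anonymous-memory test only runs when userspace sets the flag, which is exactly the point Jérôme objects to.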
[pull] radeon drm-next-3.17
On 05.08.2014 at 19:22, Daniel Vetter wrote: > On Tue, Aug 5, 2014 at 7:15 PM, Deucher, Alexander > wrote: >>> -Original Message- >>> From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel >>> Vetter >>> Sent: Tuesday, August 05, 2014 1:09 PM >>> To: Alex Deucher >>> Cc: dri-devel at lists.freedesktop.org; airlied at gmail.com; Deucher, >>> Alexander >>> Subject: Re: [pull] radeon drm-next-3.17 >>> >>> On Tue, Aug 05, 2014 at 12:22:06PM -0400, Alex Deucher wrote: Hi Dave, This is the radeon pull request for 3.17. Highlights: - Additional Hawaii fixes - Support for using the display scaler on non-fixed mode displays - Support for new firmware format that makes it easier to update - Enable dpm by default on additional asics - GPUVM improvements - Support for uncached and write combined gtt buffers - Userptr support >>> Aside: Where's the libdrm/mesa/whatever patches for this? I didn't see >>> them fly by anywhere, so I guess I've missed them on some m-l I don't >>> subscribe to. >> Christian wrote some patches to validate the interfaces, but I'm not sure he >> ever sent them out. We haven't yet done a full implementation in the >> usermode drivers to take advantage of this yet. > Well right now I've consistently rejected all patches that don't yet > come with the full thing (libdrm, usermode drivers and tests for it > all as not ready). And I do that at least once per week since we have > blob userspace separate from mesa, too. So if we toss that rule > overboard (and my understanding is that Dave's been fairly strict > here) I'll look rather bad. As in really, really bad. > > I strongly prefer that userptr gets postponed until it's ready. Dave? That's just my fault. I wanted to hold back the mesa patches until the kernel interface was accepted. I've just sent them out. Christian. > -Daniel
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #7 from Luzipher --- (In reply to comment #6) > Now for your glxgears test: reclocking works (in Portal 2 as well, where I > get 58-60 FPS now). The only difference is the radeon.dpm=1 on the kernel > command line. Are you absolutely sure you need radeon.dpm=1? Reclocking works here (R9 290X) without it. I just rechecked and I don't have it on my kernel command line (new "drm-next-3.17" branch). Nor do I have it anywhere in /etc.
[pull] radeon drm-next-3.17
On Tue, Aug 5, 2014 at 7:15 PM, Deucher, Alexander wrote: >> -Original Message- >> From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel >> Vetter >> Sent: Tuesday, August 05, 2014 1:09 PM >> To: Alex Deucher >> Cc: dri-devel at lists.freedesktop.org; airlied at gmail.com; Deucher, >> Alexander >> Subject: Re: [pull] radeon drm-next-3.17 >> >> On Tue, Aug 05, 2014 at 12:22:06PM -0400, Alex Deucher wrote: >> > Hi Dave, >> > >> > This is the radeon pull request for 3.17. Highlights: >> > - Additional Hawaii fixes >> > - Support for using the display scaler on non-fixed mode displays >> > - Support for new firmware format that makes it easier to update >> > - Enable dpm by default on additional asics >> > - GPUVM improvements >> > - Support for uncached and write combined gtt buffers >> > - Userptr support >> >> Aside: Where's the libdrm/mesa/whatever patches for this? I didn't see >> them fly by anywhere, so I guess I've missed them on some m-l I don't >> subscribe to. > > Christian wrote some patches to validate the interfaces, but I'm not sure he > ever sent them out. We haven't yet done a full implementation in the > usermode drivers to take advantage of this yet. Well right now I've consistently rejected all patches that don't yet come with the full thing (libdrm, usermode drivers and tests for it all as not ready). And I do that at least once per week since we have blob userspace separate from mesa, too. So if we toss that rule overboard (and my understanding is that Dave's been fairly strict here) I'll look rather bad. As in really, really bad. I strongly prefer that userptr gets postponed until it's ready. Dave? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
[PATCH v3 00/23] AMDKFD Kernel Driver
Hi On Tue, Aug 5, 2014 at 5:30 PM, Oded Gabbay wrote: > Hi, > Here is the v3 patch set of amdkfd. > > This version contains changes and fixes to code, as agreed on during the > review > of the v2 patch set. > > The major changes are: > > - There are two new module parameters: # of processes and # of queues per > process. The defaults, as agreed on in the v2 review, are 32 and 128 > respectively. This sets the default amount of GART address space that amdkfd > requires to 3.5MB (3MB for userspace queues mqds and 0.5MB for other stuff, > such as mqd for kernel queue, hpd for pipelines, etc.) > > - All the GART address space usage of amdkfd is done inside a single > contiguous > buffer that is allocated from system memory, and pinned to the start of the > GART during the startup of amdkfd (which is just after the startup of > radeon). The management of this buffer is done by the radeon sa manager. > This buffer is not evict-able. > > - Mapping of doorbells is initiated by the userspace lib (by mmap syscall), > instead of initiating it from inside an ioctl (using vm_mmap). > > - Removed ioctls for exclusive access to performance counters > > - Added documentation about the QCM (Queue Control Management), apertures and > interfaces between amdkfd and radeon. > > Two important notes: > > - The topology patch has not been changed. Look at > http://lists.freedesktop.org/archives/dri-devel/2014-July/065042.html > for my response. I also put my answer as an explanation in the commit msg > of the patch. This patchset adds 10.000 lines and contains nearly 0 comments *why* stuff is added. Seriously, it is almost impossible to understand what you're doing. Can you please include a high-level introduction in the [0/X] cover-letter and include it in every series you send? A blog-post or something would also be fine. And yes, it's totally ok if this is 10k lines of plain-text. Lets start with the basics: 1) Why do you use kobject directly to expose the topology? 
Almost no other driver does that, why do you use it in amdkfd instead of "struct bus" and "struct device"? You totally lack uevent handling, sysfs hierarchy integration and more. If you'd use existing infrastructure instead of kobject directly, everything would work just fine. 2) What kind of topology is exposed? Is it nested? How deep? How many items are usually expected? What does the sysfs tree (`tree /sys//topology`) look like on your machine? For people without the hardware it's nearly impossible to understand what this will look like. 3) How is the interface supposed to be used? I can see one global char-dev where you can queue jobs by providing a GPU-ID. Why don't you create one char-dev *per* available GPU just like all other interfaces do? Why is this a separate thing instead of a drm_minor object that can be added per device as a separate interface to KMS and render-nodes? Where is the underlying "struct device" for those GPUs? 4) Why is the topology static? FWIW, you allow runtime modifications, but I cannot see any notification mechanism for user-space? Again, using existing driver-core would provide all that for free. I really appreciate that you provided code instead of just ideas, but please describe why you do things the way they are. And please provide examples for people who do not have the hardware. Thanks David > - There are still some minor code style issues I need to fix. I didn't want > to delay v3 any further but I will publish either v4 with those fixes, > or just relevant patches if the whole patch set will be merged. > > For people who like to review using git, the v3 patch set is located at: > http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v3 > > In addition, I would like to announce that we have uploaded the userspace lib > that accompanies amdkfd. 
That lib is called "libhsakmt" and you can view it > at: > http://cgit.freedesktop.org/~gabbayo/libhsakmt > > Alexey Skidanov (1): > amdkfd: Implement the Get Process Aperture IOCTL > > Andrew Lewycky (3): > amdkfd: Add basic modules to amdkfd > amdkfd: Add interrupt handling module > amdkfd: Implement the Set Memory Policy IOCTL > > Ben Goz (8): > amdkfd: Add queue module > amdkfd: Add mqd_manager module > amdkfd: Add kernel queue module > amdkfd: Add module parameter of scheduling policy > amdkfd: Add packet manager module > amdkfd: Add process queue manager module > amdkfd: Add device queue manager module > amdkfd: Implement the create/destroy/update queue IOCTLs > > Evgeny Pinchuk (2): > amdkfd: Add topology module to amdkfd > amdkfd: Implement the Get Clock Counters IOCTL > > Oded Gabbay (9): > drm/radeon: reduce number of free VMIDs and pipes in KV > drm/radeon/cik: Don't touch int of pipes 1-7 > drm/radeon: Report doorbell configuration to amdkfd > drm/radeon: adding synchronization for GRBM GFX > drm/radeon: Add radeon <--> amdkfd interface > Update MAINTAINERS and CREDITS files with
[pull] radeon drm-next-3.17
On Tue, Aug 05, 2014 at 12:22:06PM -0400, Alex Deucher wrote: > Hi Dave, > > This is the radeon pull request for 3.17. Highlights: > - Additional Hawaii fixes > - Support for using the display scaler on non-fixed mode displays > - Support for new firmware format that makes it easier to update > - Enable dpm by default on additional asics > - GPUVM improvements > - Support for uncached and write combined gtt buffers > - Userptr support Aside: Where's the libdrm/mesa/whatever patches for this? I didn't see them fly by anywhere, so I guess I've missed them on some m-l I don't subscribe to. -Daniel > - Allow allocation of BOs larger than visible vram > - Various other small fixes and improvements > > The following changes since commit a91576d7916f6cce76d30303e60e1ac47cf4a76d: > > drm/ttm: Pass GFP flags in order to avoid deadlock. (2014-08-05 10:54:19 > +1000) > > are available in the git repository at: > > git://people.freedesktop.org/~agd5f/linux drm-next-3.17 > > for you to fetch changes up to ffd7d3a9d535933c7edfbaaac161f11628270716: > > drm/radeon: allow userptr write access under certain conditions (2014-08-05 > 12:10:42 -0400) > > > Alex Deucher (25): > drm/radeon/dpm: add support for SVI2 voltage for SI > drm/radeon: disable gfx cgcg on cik > drm/radeon: add new firmware header definitions (v3) > drm/radeon/si: Add support for new ucode format (v3) > drm/radeon/cik: Add support for new ucode format (v5) > drm/radeon: enable display scaling on all connectors (v2) > drm/radeon: consolidate vga and dvi get_modes functions (v2) > drm/radeon: restructure edid fetching > drm/radeon: use a fetch function to get the edid > drm/radeon: track pinned memory (v2) > drm/radeon: use vram/gart pinned size in radeon_gem_info_ioctl > drm/radeon: use vram/gart pinned size in radeon_do_test_moves > drm/radeon: remove visible vram size limit on bo allocation (v4) > drm/radeon: add a PX quirk list > drm/radeon: make radeon_connector_encoder_is_hbr2 static > drm/radeon: load the 
lm63 driver for an lm64 thermal chip. > drm/radeon: fix reversed logic in evergreen_mc_resume > drm/radeon/atom: add new voltage fetch function for hawaii > drm/radeon/dpm: handle voltage info fetching on hawaii > drm/radeon: re-enable dpm by default on cayman > drm/radeon: re-enable dpm by default on BTC > drm/radeon: use an intervall tree to manage the VMA v2 > drm/radeon: use packet2 for nop on hawaii with old firmware > drm/radeon: tweak ACCEL_WORKING2 query for hawaii > drm/radeon: use packet3 for nop on hawaii with new firmware > > Andreas Boll (1): > drm/radeon: tweak ACCEL_WORKING2 query for the new firmware for hawaii > > Christian König (20): > drm/radeon: remove discardable flag from radeon_gem_object_create > drm/radeon: fix R600_PTE_GART handling > drm/radeon: add trace_radeon_vm_flush > drm/radeon: set VM base addr using the PFP v2 > drm/radeon: separate ring and IB handling > drm/radeon: invalidate moved BOs in the VM (v2) > drm/radeon: remove radeon_bo_clear_va > drm/radeon: try to enable VM flushing once more > drm/radeon: adjust default radeon_vm_block_size v2 > drm/radeon: remove taking mclk_lock from radeon_bo_unref > drm/radeon: add radeon_bo_ref function > drm/radeon: take a BO reference on VM cleanup > drm/radeon: add VM GART copy optimization to NI as well > drm/radeon: split PT setup in more functions > drm/radeon: update IB size estimation for VM > drm/radeon: add userptr support v7 > drm/radeon: add userptr flag to limit it to anonymous memory v2 > drm/radeon: add userptr flag to directly validate the BO to GTT > drm/radeon: add userptr flag to register MMU notifier v3 > drm/radeon: allow userptr write access under certain conditions > > Fabian Frederick (1): > drm/radeon: remove null test before kfree > > Lauri Kasanen (1): > drm/radeon: Inline r100_mm_rreg, -wreg, v3 > > Mario Kleiner (2): > drm/radeon: Use pflip irqs for pageflip completion if possible. (v2) > drm/radeon: Prevent hdmi deep color if max_tmds_clock is undefined. 
> > Michel Dänzer (10): > drm/radeon: Demote 'BO allocation size too large' message to debug only > drm/radeon: Remove radeon_gart_restore() > drm/radeon: Pass GART page flags to radeon_gart_set_page() explicitly > drm/radeon: Allow write-combined CPU mappings of BOs in GTT (v2) > drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe > drm/radeon: Use write-combined CPU mappings of IBs on >= CIK > drm/radeon/cik: Read back SDMA WPTR register after writing it > drm/radeon: s/ioctl_wait_idle/mmio_hpd_flush/ > drm/radeon: Always flush the HDP cache
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #6 from Kai --- (In reply to comment #5) > Are you checking radeon_pm_info while the app is running? E.g., via ssh or > via another X terminal? If you switch to another VT or something like that > there will not be any activity. Can you try it with something simple like > glxgears? E.g., run `vblank_mode=0 glxgears -fullscreen` and then check > radeon_pm_info via ssh while gears is running. I've always checked through SSH from a second machine. Now for your glxgears test: reclocking works (in Portal 2 as well, where I get 58-60 FPS now). The only difference is the radeon.dpm=1 on the kernel command line. Was that expected? I thought DPM was activated automatically with your 3.17 branch (it says so during boot as well, see e.g. attachment 103996), or at least I interpreted the "[drm] radeon: dpm initialized" line that way. As far as I'm concerned this can be closed, though the radeon man page should probably get a line like "setting radeon.dpm=1 is mandatory for reclocking on the following ASICs". I'll let you decide whether this is something that should have happened automatically (my preference) or whether it requires the kernel parameter, and close/keep the report accordingly.
[Bug 82154] [HAWAII] gpu-reset when closing gwenview, fails to resume (atombios stuck executing), then flickery noise
https://bugs.freedesktop.org/show_bug.cgi?id=82154 --- Comment #4 from Luzipher --- (In reply to comment #3) > Does this also happen with the drm-next-3.17-rebased-on-fixes kernel branch? > drm-next-3.17-wip is missing a few stability fixes compared to that. Unfortunately I couldn't test with drm-next-3.17-rebased-on-fixes, because it has a bug somewhere in ata (null pointer dereference or some such thing) that prevents me from booting. Also, I never reproduced the bug; it actually happened after I recompiled stuff (especially xf86-video-ati with the v3-patch for enabling hawaii accel available here: http://lists.x.org/archives/xorg-driver-ati/2014-August/026534.html ). I then couldn't easily get back the setup where the crash happened (acceleration stopped working). But I suspect it's a "random" bug that isn't really related to closing gwenview. I'm now on the brand-new current "drm-next-3.17" branch, which is based on 3.16.0 final and boots fine, and should also have all the fixes and patches (?). I'll monitor the situation for a while. This bug can probably be disregarded unless I get more similar crashes with all the fixes applied and more information on how to cause it. Sorry 'bout the noise, I thought those dmesg messages with exact atombios commands getting stuck would reveal a possibly easy-to-fix issue, but according to agd5f on irc they are only symptoms of the gpu not being able to resume correctly.
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #5 from Alex Deucher --- Are you checking radeon_pm_info while the app is running? E.g., via ssh or via another X terminal? If you switch to another VT or something like that there will not be any activity. Can you try it with something simple like glxgears? E.g., run `vblank_mode=0 glxgears -fullscreen` and then check radeon_pm_info via ssh while gears is running.
[PATCH v3 00/23] AMDKFD Kernel Driver
))To be clear, when we ask for open source userspace that shows how things are suppose to be use we are thinking something like mesa but in this case most likely something like an open source opencl implementation on top of that kernel api. Yep, understood. We're working on that too. Next should be the HSA API runtime, which is essentially the user mode driver for HSA that language toolchains run over. I think Sumatra (Java) will probably be the first open source language runtime rather than OpenCL -- it's working today albeit via an older version of the HSA API. Thanks, JB - Original Message - From: Jerome Glisse [mailto:j.gli...@gmail.com] Sent: Tuesday, August 05, 2014 01:51 PM Eastern Standard Time To: Gabbay, Oded Cc: Lewycky, Andrew; Daenzer, Michel; linux-kernel at vger.kernel.org ; dri-devel at lists.freedesktop.org ; Andrew Morton Subject: Re: [PATCH v3 00/23] AMDKFD Kernel Driver On Tue, Aug 05, 2014 at 06:30:28PM +0300, Oded Gabbay wrote: > Hi, > Here is the v3 patch set of amdkfd. > > This version contains changes and fixes to code, as agreed on during the > review > of the v2 patch set. > > The major changes are: > > - There are two new module parameters: # of processes and # of queues per > process. The defaults, as agreed on in the v2 review, are 32 and 128 > respectively. This sets the default amount of GART address space that amdkfd > requires to 3.5MB (3MB for userspace queues mqds and 0.5MB for other stuff, > such as mqd for kernel queue, hpd for pipelines, etc.) > > - All the GART address space usage of amdkfd is done inside a single > contiguous > buffer that is allocated from system memory, and pinned to the start of the > GART during the startup of amdkfd (which is just after the startup of > radeon). The management of this buffer is done by the radeon sa manager. > This buffer is not evict-able. > > - Mapping of doorbells is initiated by the userspace lib (by mmap syscall), > instead of initiating it from inside an ioctl (using vm_mmap). 
> > - Removed ioctls for exclusive access to performance counters > > - Added documentation about the QCM (Queue Control Management), apertures and > interfaces between amdkfd and radeon. > > Two important notes: > > - The topology patch has not been changed. Look at > http://lists.freedesktop.org/archives/dri-devel/2014-July/065042.html > for my response. I also put my answer as an explanation in the commit msg > of the patch. > > - There are still some minor code style issues I need to fix. I didn't want > to delay v3 any further but I will publish either v4 with those fixes, > or just relevant patches if the whole patch set will be merged. > > For people who like to review using git, the v3 patch set is located at: > http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v3 > > In addition, I would like to announce that we have uploaded the userspace lib > that accompanies amdkfd. That lib is called "libhsakmt" and you can view it > at: > http://cgit.freedesktop.org/~gabbayo/libhsakmt Not commenting on the patchset yet, will try to find sometime in my non work hour to do that. But the userspace you released are just a libdrm like thing and this is not what we mean by we need to have userspace that shows how the kernel api is use. So this library is nothing but a wrapper and have allmost no value for any serious review of the kernel api. To be clear, when we ask for open source userspace that shows how things are suppose to be use we are thinking something like mesa but in this case most likely something like an open source opencl implementation on top of that kernel api. Btw this library code remind me of VHDL ... thought code style for userspace library is anybody choice. 
Cheers, Jérôme > > Alexey Skidanov (1): > amdkfd: Implement the Get Process Aperture IOCTL > > Andrew Lewycky (3): > amdkfd: Add basic modules to amdkfd > amdkfd: Add interrupt handling module > amdkfd: Implement the Set Memory Policy IOCTL > > Ben Goz (8): > amdkfd: Add queue module > amdkfd: Add mqd_manager module > amdkfd: Add kernel queue module > amdkfd: Add module parameter of scheduling policy > amdkfd: Add packet manager module > amdkfd: Add process queue manager module > amdkfd: Add device queue manager module > amdkfd: Implement the create/destroy/update queue IOCTLs > > Evgeny Pinchuk (2): > amdkfd: Add topology module to amdkfd > amdkfd: Implement the Get Clock Counters IOCTL > > Oded Gabbay (9): > drm/radeon: reduce number of free VMIDs and pipes in KV > drm/radeon/cik: Don't touch int of pipes 1-7 > drm/radeon: Report doorbell configuration to amdkfd > drm/radeon: adding synchronization for GRBM GFX > drm/radeon: Add radeon <--> amdkfd interface > Update MAINTAINERS and CREDITS files with amdkfd info > amdkfd: Add IOCTL set definitions of amdkfd > amdkfd: Add amdkfd skeleton driver > amdkfd: Add
[PATCH v3 23/23] amdkfd: Implement the Get Process Aperture IOCTL
From: Alexey Skidanovv3: fix debug msg Signed-off-by: Alexey Skidanov Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 47 - drivers/gpu/drm/radeon/amdkfd/kfd_priv.h| 5 +++ 2 files changed, 51 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c index eba5b5d..5ee0cda 100644 --- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c @@ -397,7 +397,52 @@ static long kfd_ioctl_get_clock_counters(struct file *filep, struct kfd_process static int kfd_ioctl_get_process_apertures(struct file *filp, struct kfd_process *p, void __user *arg) { - return -ENODEV; + struct kfd_ioctl_get_process_apertures_args args; + struct kfd_process_device *pdd; + + dev_dbg(kfd_device, "get apertures for PASID %d", p->pasid); + + if (copy_from_user(, arg, sizeof(args))) + return -EFAULT; + + args.num_of_nodes = 0; + + mutex_lock(>mutex); + + /*if the process-device list isn't empty*/ + if (kfd_has_process_device_data(p)) { + /* Run over all pdd of the process */ + pdd = kfd_get_first_process_device_data(p); + do { + + args.process_apertures[args.num_of_nodes].gpu_id = pdd->dev->id; + args.process_apertures[args.num_of_nodes].lds_base = pdd->lds_base; + args.process_apertures[args.num_of_nodes].lds_limit = pdd->lds_limit; + args.process_apertures[args.num_of_nodes].gpuvm_base = pdd->gpuvm_base; + args.process_apertures[args.num_of_nodes].gpuvm_limit = pdd->gpuvm_limit; + args.process_apertures[args.num_of_nodes].scratch_base = pdd->scratch_base; + args.process_apertures[args.num_of_nodes].scratch_limit = pdd->scratch_limit; + + dev_dbg(kfd_device, "node id %u\n", args.num_of_nodes); + dev_dbg(kfd_device, "gpu id %u\n", pdd->dev->id); + dev_dbg(kfd_device, "lds_base %llX\n", pdd->lds_base); + dev_dbg(kfd_device, "lds_limit %llX\n", pdd->lds_limit); + dev_dbg(kfd_device, "gpuvm_base %llX\n", pdd->gpuvm_base); + dev_dbg(kfd_device, "gpuvm_limit 
%llX\n", pdd->gpuvm_limit); + dev_dbg(kfd_device, "scratch_base %llX\n", pdd->scratch_base); + dev_dbg(kfd_device, "scratch_limit %llX\n", pdd->scratch_limit); + + args.num_of_nodes++; + } while ((pdd = kfd_get_next_process_device_data(p, pdd)) != NULL && + (args.num_of_nodes < NUM_OF_SUPPORTED_GPUS)); + } + + mutex_unlock(>mutex); + + if (copy_to_user(arg, , sizeof(args))) + return -EFAULT; + + return 0; } static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h index 0e3e18f..9f49f11 100644 --- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h @@ -445,6 +445,11 @@ struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev, struct kfd_process *p, int create_pdd); +/* Process device data iterator */ +struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p); +struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p, struct kfd_process_device *pdd); +bool kfd_has_process_device_data(struct kfd_process *p); + /* PASIDs */ int kfd_pasid_init(void); void kfd_pasid_exit(void); -- 1.9.1
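The loop in the patch above walks the per-process device list, copies one aperture record per device, and stops once the reply array is full. Here is a rough model of that control flow in Python, offered as a sketch only: max_nodes stands in for NUM_OF_SUPPORTED_GPUS (whose value is not shown in this patch), each device is represented as a plain dict, and the function name is invented for illustration.

```python
from typing import Dict, Iterable, List

# Field names copied from the patch; the dict layout itself is illustrative.
# In the kernel code gpu_id comes from pdd->dev->id rather than from pdd.
APERTURE_FIELDS = (
    "gpu_id", "lds_base", "lds_limit",
    "gpuvm_base", "gpuvm_limit",
    "scratch_base", "scratch_limit",
)


def collect_apertures(pdds: Iterable[Dict[str, int]],
                      max_nodes: int) -> List[Dict[str, int]]:
    """Copy aperture fields for at most max_nodes process devices."""
    apertures: List[Dict[str, int]] = []
    for pdd in pdds:
        if len(apertures) >= max_nodes:
            break  # reply array is full, ignore any remaining devices
        apertures.append({name: pdd[name] for name in APERTURE_FIELDS})
    return apertures
```

The length of the returned list corresponds to args.num_of_nodes. Unlike the do/while in the patch, this version checks the limit before copying, so the two only behave the same when the limit is at least 1.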
[PATCH v3 22/23] amdkfd: Implement the Get Clock Counters IOCTL
From: Evgeny Pinchuk

Signed-off-by: Evgeny Pinchuk
Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index cc7ac28..eba5b5d 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -364,7 +364,34 @@ out:

 static long kfd_ioctl_get_clock_counters(struct file *filep, struct kfd_process *p, void __user *arg)
 {
-	return -ENODEV;
+	struct kfd_ioctl_get_clock_counters_args args;
+	struct kfd_dev *dev;
+	struct timespec time;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	dev = kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	/* Reading GPU clock counter from KGD */
+	args.gpu_clock_counter = kfd2kgd->get_gpu_clock_counter(dev->kgd);
+
+	/* No access to rdtsc. Using raw monotonic time */
+	getrawmonotonic(&time);
+	args.cpu_clock_counter = (uint64_t)timespec_to_ns(&time);
+
+	get_monotonic_boottime(&time);
+	args.system_clock_counter = (uint64_t)timespec_to_ns(&time);
+
+	/* Since the counter is in nano-seconds we use 1GHz frequency */
+	args.system_clock_freq = 1000000000;
+
+	if (copy_to_user(arg, &args, sizeof(args)))
+		return -EFAULT;
+
+	return 0;
 }
--
1.9.1
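The CPU and system counters in this ioctl are nanosecond counts, which is why the reported frequency is 1 GHz. The following is a minimal user-space sketch of that arithmetic, not code from the patch: `ts_to_ns` mimics what the kernel's `timespec_to_ns()` computes, and `ticks_to_seconds` is a hypothetical helper showing how a consumer might use the reported frequency.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL /* 1 GHz: one tick per nanosecond */

/* Flatten a timespec into a nanosecond count (hypothetical helper). */
uint64_t ts_to_ns(struct timespec ts)
{
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < (long)SKETCH_NSEC_PER_SEC);
	return (uint64_t)ts.tv_sec * SKETCH_NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/* Convert a counter delta to whole seconds using a reported frequency. */
uint64_t ticks_to_seconds(uint64_t ticks, uint64_t freq_hz)
{
	assert(freq_hz != 0);
	return ticks / freq_hz;
}
```

With a 1 GHz frequency, a delta between two system_clock_counter readings divided by system_clock_freq gives elapsed seconds directly.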
[PATCH v3 21/23] amdkfd: Implement the Set Memory Policy IOCTL
From: Andrew Lewycky

Signed-off-by: Andrew Lewycky
Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 51 -
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index 17725f6..cc7ac28 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -35,6 +35,7 @@
 #include
 #include
 #include "kfd_priv.h"
+#include "kfd_device_queue_manager.h"

 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -310,7 +311,55 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p, void

 static long kfd_ioctl_set_memory_policy(struct file *filep, struct kfd_process *p, void __user *arg)
 {
-	return -ENODEV;
+	struct kfd_ioctl_set_memory_policy_args args;
+	struct kfd_dev *dev;
+	int err = 0;
+	struct kfd_process_device *pdd;
+	enum cache_policy default_policy, alternate_policy;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	if (args.default_policy != KFD_IOC_CACHE_POLICY_COHERENT
+	    && args.default_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+		return -EINVAL;
+	}
+
+	if (args.alternate_policy != KFD_IOC_CACHE_POLICY_COHERENT
+	    && args.alternate_policy != KFD_IOC_CACHE_POLICY_NONCOHERENT) {
+		return -EINVAL;
+	}
+
+	dev = kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	mutex_lock(&p->mutex);
+
+	pdd = kfd_bind_process_to_device(dev, p);
+	if (IS_ERR(pdd)) {
+		err = PTR_ERR(pdd);
+		goto out;
+	}
+
+	default_policy = (args.default_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+			 ? cache_policy_coherent : cache_policy_noncoherent;
+
+	alternate_policy = (args.alternate_policy == KFD_IOC_CACHE_POLICY_COHERENT)
+			   ? cache_policy_coherent : cache_policy_noncoherent;
+
+	if (!dev->dqm->set_cache_memory_policy(dev->dqm,
+				&pdd->qpd,
+				default_policy,
+				alternate_policy,
+				(void __user *)args.alternate_aperture_base,
+				args.alternate_aperture_size))
+		err = -EINVAL;
+
+out:
+	mutex_unlock(&p->mutex);
+
+	return err;
 }

 static long kfd_ioctl_get_clock_counters(struct file *filep, struct kfd_process *p, void __user *arg)
--
1.9.1
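The error handling after kfd_bind_process_to_device() relies on the kernel's error-pointer convention, where a negative errno is encoded in the pointer value itself. Below is a user-space sketch of that convention, under the assumption that the top 4095 addresses are reserved for error codes as in the kernel. The `sketch_*` names are hypothetical stand-ins, not the kernel's ERR_PTR/IS_ERR/PTR_ERR. One consequence worth noting: the predicate yields a boolean, so a test written as `IS_ERR(ptr) < 0` can never be true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095 /* assumption: same bound as the kernel's MAX_ERRNO */

/* Encode a negative errno value in a pointer (stand-in for ERR_PTR). */
void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True when the pointer lies in the reserved error range (stand-in for IS_ERR). */
bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an error pointer (stand-in for PTR_ERR). */
long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

A caller checks the predicate directly and only then extracts the error code, as in `if (sketch_is_err(p)) return sketch_ptr_err(p);`.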
[PATCH v3 20/23] amdkfd: Implement the create/destroy/update queue IOCTLs
From: Ben Goz

v3: remove use of internal typedefs
v3: fix debug prints
v3: add checks for parameters
v3: use doorbell address from user

Signed-off-by: Ben Goz
Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c        | 182 -
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h           |   8 +
 .../drm/radeon/amdkfd/kfd_process_queue_manager.c  |   5 +-
 include/uapi/linux/kfd_ioctl.h                     |   7 +-
 4 files changed, 196 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
index c42f53b..17725f6 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c
@@ -119,17 +119,193 @@ static int kfd_open(struct inode *inode, struct file *filep)

 static long kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void __user *arg)
 {
-	return -ENODEV;
+	struct kfd_ioctl_create_queue_args args;
+	struct kfd_dev *dev;
+	int err = 0;
+	unsigned int queue_id;
+	struct kfd_process_device *pdd;
+	struct queue_properties q_properties;
+
+	memset(&q_properties, 0, sizeof(struct queue_properties));
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	if (args.queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
+		pr_err("kfd: queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
+		return -EINVAL;
+	}
+
+	if (args.queue_priority > KFD_MAX_QUEUE_PRIORITY) {
+		pr_err("kfd: queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
+		return -EINVAL;
+	}
+
+	if ((args.ring_base_address) &&
+		(!access_ok(VERIFY_WRITE, args.ring_base_address, sizeof(uint64_t)))) {
+		pr_err("kfd: can't access ring base address\n");
+		return -EFAULT;
+	}
+
+	if (!is_power_of_2(args.ring_size)) {
+		pr_err("kfd: ring size must be a power of 2\n");
+		return -EINVAL;
+	}
+
+	if (!access_ok(VERIFY_WRITE, args.read_pointer_address, sizeof(uint32_t))) {
+		pr_err("kfd: can't access read pointer\n");
+		return -EFAULT;
+	}
+
+	if (!access_ok(VERIFY_WRITE, args.write_pointer_address, sizeof(uint32_t))) {
+		pr_err("kfd: can't access write pointer\n");
+		return -EFAULT;
+	}
+
+	q_properties.is_interop = false;
+	q_properties.queue_percent = args.queue_percentage;
+	q_properties.priority = args.queue_priority;
+	q_properties.queue_address = args.ring_base_address;
+	q_properties.queue_size = args.ring_size;
+	q_properties.read_ptr = (uint32_t *) args.read_pointer_address;
+	q_properties.write_ptr = (uint32_t *) args.write_pointer_address;
+
+
+	pr_debug("kfd: creating queue ioctl\n");
+
+	pr_debug("Queue Percentage (%d, %d)\n",
+			q_properties.queue_percent, args.queue_percentage);
+
+	pr_debug("Queue Priority (%d, %d)\n",
+			q_properties.priority, args.queue_priority);
+
+	pr_debug("Queue Address (0x%llX, 0x%llX)\n",
+			q_properties.queue_address, args.ring_base_address);
+
+	pr_debug("Queue Size (0x%llX, %u)\n",
+			q_properties.queue_size, args.ring_size);
+
+	pr_debug("Queue r/w Pointers (0x%llX, 0x%llX)\n",
+			(uint64_t) q_properties.read_ptr,
+			(uint64_t) q_properties.write_ptr);
+
+	dev = kfd_device_by_id(args.gpu_id);
+	if (dev == NULL)
+		return -EINVAL;
+
+	mutex_lock(&p->mutex);
+
+	pdd = kfd_bind_process_to_device(dev, p);
+	if (IS_ERR(pdd)) {
+		err = PTR_ERR(pdd);
+		goto err_bind_process;
+	}
+
+	pr_debug("kfd: creating queue for PASID %d on GPU 0x%x\n",
+			p->pasid,
+			dev->id);
+
+	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, 0, KFD_QUEUE_TYPE_COMPUTE, &queue_id);
+	if (err != 0)
+		goto err_create_queue;
+
+	args.queue_id = queue_id;
+
+	/* Return gpu_id as doorbell offset for mmap usage */
+	args.doorbell_offset = args.gpu_id << PAGE_SHIFT;
+
+	if (copy_to_user(arg, &args, sizeof(args))) {
+		err = -EFAULT;
+		goto err_copy_args_out;
+	}
+
+	mutex_unlock(&p->mutex);
+
+	pr_debug("kfd: queue id %d was created successfully\n", args.queue_id);
+
+	pr_debug("ring buffer address == 0x%016llX\n",
+			args.ring_base_address);
+
+	pr_debug("read ptr address    == 0x%016llX\n",
+			args.read_pointer_address);
+
+	pr_debug("write ptr address   == 0x%016llX\n",
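Two of the checks in the create-queue path are pure arithmetic and can be tried out in user space. The sketch below is illustrative only: `sketch_is_power_of_2` reproduces the usual single-bit test behind a power-of-two check, and `sketch_doorbell_offset` mirrors the `gpu_id << PAGE_SHIFT` expression from the patch under the assumption of 4 KiB pages. Both names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* A value is a power of two when exactly one bit is set; zero is rejected. */
bool sketch_is_power_of_2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Page-aligned mmap offset derived from the GPU id, as in the patch. */
uint64_t sketch_doorbell_offset(uint32_t gpu_id)
{
	return (uint64_t)gpu_id << SKETCH_PAGE_SHIFT;
}
```

Widening to 64 bits before the shift is a choice made in this sketch so that large ids cannot overflow a 32-bit intermediate.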
[PATCH v3 19/23] amdkfd: Add interrupt handling module
From: Andrew Lewycky

This patch adds the interrupt handling module, in kfd_interrupt.c,
and its related members in different data structures to the amdkfd
driver.

The amdkfd interrupt module maintains an internal interrupt ring per
amdkfd device. The internal interrupt ring contains interrupts that
need further handling. The extra handling is deferred to a later time
through a workqueue.

There's no acknowledgment for the interrupts we use. The hardware
simply queues a new interrupt each time without waiting.

The fixed-size internal queue means that it's possible for us to lose
interrupts because we have no back-pressure to the hardware.

v3: change device init
v3: make sure spin lock is taken only if init is complete
v3: move bool to end of struct

Signed-off-by: Andrew Lewycky
Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/Makefile        |   3 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c    |  23 +++-
 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c | 161 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h      |  18 ++-
 4 files changed, 200 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile
index e3099c8..91d5015 100644
--- a/drivers/gpu/drm/radeon/amdkfd/Makefile
+++ b/drivers/gpu/drm/radeon/amdkfd/Makefile
@@ -8,6 +8,7 @@ amdkfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
 		kfd_pasid.o kfd_doorbell.o kfd_aperture.o \
 		kfd_process.o kfd_queue.o kfd_mqd_manager.o \
 		kfd_kernel_queue.o kfd_packet_manager.o \
-		kfd_process_queue_manager.o kfd_device_queue_manager.o
+		kfd_process_queue_manager.o kfd_device_queue_manager.o \
+		kfd_interrupt.o

 obj-$(CONFIG_HSA_RADEON)	+= amdkfd.o
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
index 74575de..a364c1c 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c
@@ -31,6 +31,7 @@

 static const struct kfd_device_info kaveri_device_info = {
 	.max_pasid_bits = 16,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t)
 };

 struct kfd_deviceid {
@@ -187,6 +188,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 		goto kfd_topology_add_device_error;
 	}

+	if (kfd_interrupt_init(kfd)) {
+		dev_err(kfd_device,
+			"Error initializing interrupts for device (%x:%x)\n",
+			kfd->pdev->vendor, kfd->pdev->device);
+		goto kfd_interrupt_error;
+	}
+
 	if (!device_iommu_pasid_init(kfd)) {
 		dev_err(kfd_device,
 			"Error initializing iommuv2 for device (%x:%x)\n",
@@ -223,6 +231,8 @@ dqm_start_error:
 device_queue_manager_error:
 	amd_iommu_free_device(kfd->pdev);
 device_iommu_pasid_error:
+	kfd_interrupt_exit(kfd);
+kfd_interrupt_error:
 	kfd_topology_remove_device(kfd);
 kfd_topology_add_device_error:
 	kfd2kgd->fini_sa_manager(kfd->kgd);
@@ -238,6 +248,7 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 	if (kfd->init_complete) {
 		device_queue_manager_uninit(kfd->dqm);
 		amd_iommu_free_device(kfd->pdev);
+		kfd_interrupt_exit(kfd);
 		kfd_topology_remove_device(kfd);
 	}

@@ -274,6 +285,16 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
 	return 0;
 }

-void kgd2kfd_interrupt(struct kfd_dev *dev, const void *ih_ring_entry)
+/* This is called directly from KGD at ISR. */
+void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 {
+	if (kfd->init_complete) {
+		spin_lock(&kfd->interrupt_lock);
+
+		if (kfd->interrupts_active
+		    && enqueue_ih_ring_entry(kfd, ih_ring_entry))
+			schedule_work(&kfd->interrupt_work);
+
+		spin_unlock(&kfd->interrupt_lock);
+	}
 }
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c
new file mode 100644
index 0000000..eed43a7
--- /dev/null
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c
@@ -0,0 +1,161 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ *
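The commit message describes a fixed-size internal ring with no back-pressure, so an interrupt that arrives while the ring is full is dropped. The body of kfd_interrupt.c is cut off in this archive, so the following is only a generic user-space sketch of such a lossy ring, not the patch's implementation. The entry size of four dwords follows `ih_ring_entry_size` from the patch; the capacity and all `sketch_*` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ENTRY_DWORDS 4 /* matches ih_ring_entry_size = 4 * sizeof(uint32_t) */
#define SKETCH_RING_ENTRIES 8 /* assumption: capacity chosen only for illustration */

struct sketch_ih_ring {
	uint32_t entries[SKETCH_RING_ENTRIES][SKETCH_ENTRY_DWORDS];
	unsigned int rptr; /* next slot to read */
	unsigned int wptr; /* next slot to write */
};

/* Copy one entry into the ring; return false (entry dropped) when it is full. */
bool sketch_enqueue(struct sketch_ih_ring *ring, const uint32_t *entry)
{
	unsigned int next = (ring->wptr + 1) % SKETCH_RING_ENTRIES;

	if (next == ring->rptr)
		return false; /* no back-pressure: the entry is simply lost */

	memcpy(ring->entries[ring->wptr], entry, sizeof(ring->entries[0]));
	ring->wptr = next;
	return true;
}

/* Copy the oldest entry out; return false when the ring is empty. */
bool sketch_dequeue(struct sketch_ih_ring *ring, uint32_t *entry)
{
	if (ring->rptr == ring->wptr)
		return false;

	memcpy(entry, ring->entries[ring->rptr], sizeof(ring->entries[0]));
	ring->rptr = (ring->rptr + 1) % SKETCH_RING_ENTRIES;
	return true;
}
```

This sketch keeps one slot unused to tell a full ring from an empty one, so it holds at most SKETCH_RING_ENTRIES - 1 entries. In the driver the producer side runs under a spin lock in the ISR path and the consumer runs from a workqueue; the sketch leaves locking out.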
[PATCH v3 18/23] amdkfd: Add device queue manager module
From: Ben GozThe queue scheduler divides into two sections, one section is process bounded and the other section is device bounded. The device bounded section is handled by this module. The DQM module handles queue setup, update and tear-down from the device side. It also supports suspend/resume operation. v3: change device_init v3: use new gart allocation functions v3: Add documentation Signed-off-by: Ben Goz Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/amdkfd/Makefile | 2 +- drivers/gpu/drm/radeon/amdkfd/kfd_device.c | 28 +- .../drm/radeon/amdkfd/kfd_device_queue_manager.c | 989 + .../drm/radeon/amdkfd/kfd_device_queue_manager.h | 43 + drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 13 + 5 files changed, 1073 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile index b18a2b5..e3099c8 100644 --- a/drivers/gpu/drm/radeon/amdkfd/Makefile +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -8,6 +8,6 @@ amdkfd-y:= kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \ kfd_pasid.o kfd_doorbell.o kfd_aperture.o \ kfd_process.o kfd_queue.o kfd_mqd_manager.o \ kfd_kernel_queue.o kfd_packet_manager.o \ - kfd_process_queue_manager.o + kfd_process_queue_manager.o kfd_device_queue_manager.o obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c index ce90592..74575de 100644 --- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c @@ -25,6 +25,7 @@ #include #include #include "kfd_priv.h" +#include "kfd_device_queue_manager.h" #define MQD_SIZE_ALIGNED 768 @@ -194,12 +195,33 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, } amd_iommu_set_invalidate_ctx_cb(kfd->pdev, iommu_pasid_shutdown_callback); + kfd->dqm = device_queue_manager_init(kfd); + if (!kfd->dqm) { + dev_err(kfd_device, + "Error initializing queue 
manager for device (%x:%x)\n", + kfd->pdev->vendor, kfd->pdev->device); + goto device_queue_manager_error; + } + + if (kfd->dqm->start(kfd->dqm) != 0) { + dev_err(kfd_device, + "Error starting queuen manager for device (%x:%x)\n", + kfd->pdev->vendor, kfd->pdev->device); + goto dqm_start_error; + } + kfd->init_complete = true; dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor, kfd->pdev->device); + pr_debug("kfd: Starting kfd with the following scheduling policy %d\n", sched_policy); + goto out; +dqm_start_error: + device_queue_manager_uninit(kfd->dqm); +device_queue_manager_error: + amd_iommu_free_device(kfd->pdev); device_iommu_pasid_error: kfd_topology_remove_device(kfd); kfd_topology_add_device_error: @@ -214,6 +236,7 @@ out: void kgd2kfd_device_exit(struct kfd_dev *kfd) { if (kfd->init_complete) { + device_queue_manager_uninit(kfd->dqm); amd_iommu_free_device(kfd->pdev); kfd_topology_remove_device(kfd); } @@ -225,8 +248,10 @@ void kgd2kfd_suspend(struct kfd_dev *kfd) { BUG_ON(kfd == NULL); - if (kfd->init_complete) + if (kfd->init_complete) { + kfd->dqm->stop(kfd->dqm); amd_iommu_free_device(kfd->pdev); + } } int kgd2kfd_resume(struct kfd_dev *kfd) @@ -243,6 +268,7 @@ int kgd2kfd_resume(struct kfd_dev *kfd) if (err < 0) return -ENXIO; amd_iommu_set_invalidate_ctx_cb(kfd->pdev, iommu_pasid_shutdown_callback); + kfd->dqm->start(kfd->dqm); } return 0; diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c new file mode 100644 index 000..2c3abd2 --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c @@ -0,0 +1,989 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE
[PATCH v3 17/23] amdkfd: Add process queue manager module
From: Ben GozThe queue scheduler divides into two sections, one section is process bounded and the other section is device bounded. The process bounded section is handled by this module. The PQM handles usermode queue setup, updates and tear-down. v3: use kernel param to limit queues per process instead of define v3: use doorbell address from user Signed-off-by: Ben Goz Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/amdkfd/Makefile | 3 +- drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 17 + drivers/gpu/drm/radeon/amdkfd/kfd_process.c| 16 + .../drm/radeon/amdkfd/kfd_process_queue_manager.c | 343 + 4 files changed, 378 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile index b88e637..b18a2b5 100644 --- a/drivers/gpu/drm/radeon/amdkfd/Makefile +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -7,6 +7,7 @@ ccflags-y := -Iinclude/drm amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \ kfd_pasid.o kfd_doorbell.o kfd_aperture.o \ kfd_process.o kfd_queue.o kfd_mqd_manager.o \ - kfd_kernel_queue.o kfd_packet_manager.o + kfd_kernel_queue.o kfd_packet_manager.o \ + kfd_process_queue_manager.o obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h index 34028f8..600d671 100644 --- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h @@ -365,6 +365,9 @@ struct kfd_process_device { struct kfd_dev *dev; + /* per-process-per device QCM data structure */ + struct qcm_process_device qpd; + /*Apertures*/ uint64_t lds_base; uint64_t lds_limit; @@ -410,6 +413,8 @@ struct kfd_process { */ struct list_head per_device_data; + struct process_queue_manager pqm; + /* The process's queues. 
*/ size_t queue_array_size; @@ -477,11 +482,23 @@ inline uint32_t upper_32(uint64_t x); int init_queue(struct queue **q, struct queue_properties properties); void uninit_queue(struct queue *q); +void print_queue_properties(struct queue_properties *q); void print_queue(struct queue *q); struct kernel_queue *kernel_queue_init(struct kfd_dev *dev, enum kfd_queue_type type); void kernel_queue_uninit(struct kernel_queue *kq); +/* Process Queue Manager */ +struct process_queue_node { + struct queue *q; + struct kernel_queue *kq; + struct list_head process_queue_list; +}; + +int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p); +void pqm_uninit(struct process_queue_manager *pqm); +int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid); + /* Packet Manager */ #define KFD_HIQ_TIMEOUT (500) diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_process.c b/drivers/gpu/drm/radeon/amdkfd/kfd_process.c index 98eba8e..b8bf15d 100644 --- a/drivers/gpu/drm/radeon/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_process.c @@ -150,6 +150,9 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn, mutex_lock(>mutex); + /* In case our notifier is called before IOMMU notifier */ + pqm_uninit(>pqm); + list_for_each_entry_safe(pdd, temp, >per_device_data, per_device_list) { amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid); list_del(>per_device_list); @@ -214,8 +217,16 @@ static struct kfd_process *create_process(const struct task_struct *thread) INIT_LIST_HEAD(>per_device_data); + err = pqm_init(>pqm, process); + if (err != 0) + goto err_process_pqm_init; + return process; +err_process_pqm_init: + hash_del_rcu(>kfd_processes); + synchronize_rcu(); + mmu_notifier_unregister_no_release(>mmu_notifier, process->mm); err_mmu_notifier: kfd_pasid_free(process->pasid); err_alloc_pasid: @@ -240,6 +251,9 @@ struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev, pdd = kzalloc(sizeof(*pdd), GFP_KERNEL); if (pdd != NULL) { 
pdd->dev = dev; + INIT_LIST_HEAD(>qpd.queues_list); + INIT_LIST_HEAD(>qpd.priv_queue_list); + pdd->qpd.dqm = dev->dqm; list_add(>per_device_list, >per_device_data); } } @@ -299,6 +313,8 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid) mutex_lock(>mutex); + pqm_uninit(>pqm); + pdd = kfd_get_process_device_data(dev, p, 0); /* diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c new
[PATCH v3 16/23] amdkfd: Add packet manager module
From: Ben GozThe packet manager module builds PM4 packets for the sole use of the CP scheduler. Those packets are used by the HIQ to submit runlists to the CP. v3: remove include of cik_mqds.h v3: Change lower_32/upper_32 calls to use linux macros v3: use new gart allocation functions v3: add documentation Signed-off-by: Ben Goz Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/amdkfd/Makefile | 2 +- drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c | 495 + drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 72 +++ 3 files changed, 568 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile index 020d6c7..b88e637 100644 --- a/drivers/gpu/drm/radeon/amdkfd/Makefile +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -7,6 +7,6 @@ ccflags-y := -Iinclude/drm amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \ kfd_pasid.o kfd_doorbell.o kfd_aperture.o \ kfd_process.o kfd_queue.o kfd_mqd_manager.o \ - kfd_kernel_queue.o + kfd_kernel_queue.o kfd_packet_manager.o obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c new file mode 100644 index 000..aabc17e --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c @@ -0,0 +1,495 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ + +#include +#include +#include "kfd_device_queue_manager.h" +#include "kfd_kernel_queue.h" +#include "kfd_priv.h" +#include "kfd_pm4_headers.h" +#include "kfd_pm4_opcodes.h" + +static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes, unsigned int buffer_size_bytes) +{ + unsigned int temp = *wptr + increment_bytes / sizeof(uint32_t); + + BUG_ON((temp * sizeof(uint32_t)) > buffer_size_bytes); + *wptr = temp; +} + +static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size) +{ + union PM4_TYPE_3_HEADER header; + + header.u32all = 0; + header.opcode = opcode; + header.count = packet_size/sizeof(uint32_t) - 2; + header.type = PM4_TYPE_3; + + return header.u32all; +} + +static void pm_calc_rlib_size(struct packet_manager *pm, unsigned int *rlib_size, bool *over_subscription) +{ + unsigned int process_count, queue_count; + + BUG_ON(!pm || !rlib_size || !over_subscription); + + process_count = pm->dqm->processes_count; + queue_count = pm->dqm->queue_count; + + /* check if there is over subscription*/ + *over_subscription = false; + if ((process_count >= VMID_PER_DEVICE) || + queue_count > PIPE_PER_ME_CP_SCHEDULING * QUEUES_PER_PIPE) { + *over_subscription = true; + pr_debug("kfd: over subscribed runlist\n"); + } + + /* calculate run list ib allocation size */ + *rlib_size = process_count * sizeof(struct pm4_map_process) + +queue_count * sizeof(struct pm4_map_queues); + + /* increase the allocation size in case we need a chained run list when over subscription */ + if (*over_subscription) + *rlib_size += sizeof(struct pm4_runlist); + + pr_debug("kfd: runlist ib size %d\n", *rlib_size); +} + +static int pm_allocate_runlist_ib(struct packet_manager *pm, unsigned int **rl_buffer, uint64_t *rl_gpu_buffer, + unsigned int *rl_buffer_size, bool *is_over_subscription) +{ + int retval; + + BUG_ON(!pm); + BUG_ON(pm->allocated == true); + BUG_ON(is_over_subscription == NULL); + + pm_calc_rlib_size(pm, rl_buffer_size, 
is_over_subscription); + + retval =
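The sizing rule in pm_calc_rlib_size() can be exercised on its own. The sketch below restates it in user space under stated assumptions: the three limits are the values defined in kfd_device_queue_manager.h elsewhere in this series, while the PM4 packet sizes are passed in as parameters because the packet structs are not shown in this excerpt. All `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_VMID_PER_DEVICE 8 /* CIK_VMID_NUM */
#define SKETCH_PIPES 3           /* PIPE_PER_ME_CP_SCHEDULING */
#define SKETCH_QUEUES_PER_PIPE 8 /* QUEUES_PER_PIPE */

/* Packet sizes in bytes; supplied by the caller since the structs are not shown. */
struct sketch_pkt_sizes {
	size_t map_process;
	size_t map_queues;
	size_t runlist;
};

/* Return the runlist IB size and report whether the device is over-subscribed. */
size_t sketch_rlib_size(unsigned int process_count, unsigned int queue_count,
			const struct sketch_pkt_sizes *sz, bool *over_subscription)
{
	size_t size;

	assert(sz != NULL && over_subscription != NULL);

	/* Same comparisons as the patch: >= for processes, > for queues. */
	*over_subscription = process_count >= SKETCH_VMID_PER_DEVICE ||
		queue_count > SKETCH_PIPES * SKETCH_QUEUES_PER_PIPE;

	size = process_count * sz->map_process + queue_count * sz->map_queues;

	/* Leave room for one extra runlist packet to chain the list. */
	if (*over_subscription)
		size += sz->runlist;

	return size;
}
```

For example, with hypothetical packet sizes of 10, 20 and 5 bytes, two processes with three queues need 80 bytes, while eight processes with one queue are over-subscribed and need 105.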
[PATCH v3 15/23] amdkfd: Add module parameter of scheduling policy
From: Ben Goz

This patch adds a new parameter to the amdkfd driver. This parameter
enables the user to select the scheduling policy of the CP. The choices
are:

* CP Scheduling with support for over-subscription
* CP Scheduling without support for over-subscription
* Without CP Scheduling

Note that the third option (Without CP scheduling) is only for debug
purposes and bring-up of new H/W. As such, it is _not_ guaranteed to
work at all times on all H/W versions.

v3: fix description
v3: change permissions to read_only
v3: verify value
v3: add documentation

Signed-off-by: Ben Goz
Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c | 12
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h   | 29 +
 2 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_module.c b/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
index a31bf03..5c58031 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_module.c
@@ -45,6 +45,11 @@ static const struct kgd2kfd_calls kgd2kfd = {
 	.resume		= kgd2kfd_resume,
 };

+int sched_policy = KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION;
+module_param(sched_policy, int, 0444);
+MODULE_PARM_DESC(sched_policy,
+	"Kernel cmdline parameter that defines the amdkfd scheduling policy");
+
 int max_num_of_processes = KFD_MAX_NUM_OF_PROCESSES_DEFAULT;
 module_param(max_num_of_processes, int, 0444);
 MODULE_PARM_DESC(max_num_of_processes,
@@ -79,6 +84,13 @@ static int __init kfd_module_init(void)
 	int err;

 	/* Verify module parameters */
+	if ((sched_policy < KFD_SCHED_POLICY_HWS) ||
+		(sched_policy > KFD_SCHED_POLICY_NO_HWS)) {
+		pr_err("kfd: sched_policy has invalid value\n");
+		return -1;
+	}
+
+	/* Verify module parameters */
 	if ((max_num_of_processes < 0) ||
 		(max_num_of_processes > KFD_MAX_NUM_OF_PROCESSES)) {
 		pr_err("kfd: max_num_of_processes must be between 0 to KFD_MAX_NUM_OF_PROCESSES\n");
diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
index 66980df..5835d07 100644
--- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h
@@ -65,6 +65,35 @@ extern int max_num_of_queues_per_process;
 #define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS_DEFAULT 128
 #define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS 1024

+/* Kernel module parameter to specify the scheduling policy */
+extern int sched_policy;
+
+/**
+ * enum kfd_sched_policy
+ *
+ * @KFD_SCHED_POLICY_HWS: H/W scheduling policy known as command processor (cp)
+ * scheduling. In this scheduling mode we're using the firmware code to schedule
+ * the user mode queues and kernel queues such as HIQ and DIQ.
+ * the HIQ queue is used as a special queue that dispatches the configuration to
+ * the cp and the user mode queues list that are currently running.
+ * the DIQ queue is a debugging queue that dispatches debugging commands to the
+ * firmware.
+ * in this scheduling mode user mode queues over subscription feature is enabled.
+ *
+ * @KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION: The same as above but with the
+ * over subscription feature disabled.
+ *
+ * @KFD_SCHED_POLICY_NO_HWS: no H/W scheduling policy is a mode which directly
+ * sets the command processor registers and sets the queues "manually". This mode
+ * is used *ONLY* for debugging purposes.
+ *
+ */
+enum kfd_sched_policy {
+	KFD_SCHED_POLICY_HWS = 0,
+	KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION,
+	KFD_SCHED_POLICY_NO_HWS
+};
+
 enum cache_policy {
 	cache_policy_coherent,
 	cache_policy_noncoherent
--
1.9.1
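The range check on sched_policy maps an integer onto the three policies listed in the commit message. The following user-space sketch restates that mapping so it can be tested outside the kernel. The enum values follow the order of enum kfd_sched_policy in the patch, while the `sketch_*` names and the description strings are assumptions added for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same ordering as enum kfd_sched_policy in the patch. */
enum sketch_sched_policy {
	SKETCH_POLICY_HWS = 0,
	SKETCH_POLICY_HWS_NO_OVERSUBSCRIPTION,
	SKETCH_POLICY_NO_HWS
};

/* Accept only values inside the enum's range, like the module init check. */
bool sketch_sched_policy_valid(int value)
{
	return value >= SKETCH_POLICY_HWS && value <= SKETCH_POLICY_NO_HWS;
}

/* Human-readable description of each policy (strings are illustrative). */
const char *sketch_sched_policy_name(int value)
{
	switch (value) {
	case SKETCH_POLICY_HWS:
		return "CP scheduling with over-subscription";
	case SKETCH_POLICY_HWS_NO_OVERSUBSCRIPTION:
		return "CP scheduling without over-subscription";
	case SKETCH_POLICY_NO_HWS:
		return "no CP scheduling (debug only)";
	default:
		return "invalid";
	}
}
```

Because the parameter is registered with 0444 permissions, it is read-only after load, so validating it once at module init is sufficient.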
[PATCH v3 14/23] amdkfd: Add kernel queue module
From: Ben GozThe kernel queue module enables the amdkfd to establish kernel queues, not exposed to user space. The kernel queues are used for HIQ (HSA Interface Queue) and DIQ (Debug Interface Queue) operations v3: remove use of internal typedefs v3: use new gart allocation functions Signed-off-by: Ben Goz Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/amdkfd/Makefile | 3 +- .../drm/radeon/amdkfd/kfd_device_queue_manager.h | 101 +++ drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c | 330 ++ drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h | 66 ++ drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h| 682 + drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h| 107 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 33 +- 7 files changed, 1320 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile index 9f8de8d..020d6c7 100644 --- a/drivers/gpu/drm/radeon/amdkfd/Makefile +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -6,6 +6,7 @@ ccflags-y := -Iinclude/drm amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \ kfd_pasid.o kfd_doorbell.o kfd_aperture.o \ - kfd_process.o kfd_queue.o kfd_mqd_manager.o + kfd_process.o kfd_queue.o kfd_mqd_manager.o \ + kfd_kernel_queue.o obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h new file mode 100644 index 000..e3a56ec --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h @@ -0,0 +1,101 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ + +#ifndef KFD_DEVICE_QUEUE_MANAGER_H_ +#define KFD_DEVICE_QUEUE_MANAGER_H_ + +#include +#include +#include "kfd_priv.h" +#include "kfd_mqd_manager.h" + +#define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS (500) +#define QUEUES_PER_PIPE (8) +#define PIPE_PER_ME_CP_SCHEDULING (3) +#define CIK_VMID_NUM (8) +#define KFD_VMID_START_OFFSET (8) +#define VMID_PER_DEVICE CIK_VMID_NUM +#define KFD_DQM_FIRST_PIPE (0) + +struct device_process_node { + struct qcm_process_device *qpd; + struct list_head list; +}; + +struct device_queue_manager { + int (*create_queue)(struct device_queue_manager *dqm, + struct queue *q, + struct qcm_process_device *qpd, + int *allocate_vmid); + int (*destroy_queue)(struct device_queue_manager *dqm, + struct qcm_process_device *qpd, + struct queue *q); + int (*update_queue)(struct device_queue_manager *dqm, + struct queue *q); + int (*destroy_queues)(struct device_queue_manager *dqm); + struct mqd_manager * (*get_mqd_manager)(struct device_queue_manager *dqm, + enum KFD_MQD_TYPE type); + int (*execute_queues)(struct device_queue_manager *dqm); + int (*register_process)(struct device_queue_manager *dqm, + struct qcm_process_device *qpd); + int (*unregister_process)(struct device_queue_manager *dqm, +
[PATCH v3 13/23] amdkfd: Add mqd_manager module
From: Ben Goz

The mqd_manager module handles MQD data structures. MQD stands for Memory Queue Descriptor, which is used by the H/W to keep the usermode queue state in memory.

v3: remove new typedefs
v3: remove pragma pack 4
v3: remove cik_mqds.h
v3: Change lower_32/upper_32 calls to use linux macros
v3: use new gart allocation functions
v3: Add documentation

Signed-off-by: Ben Goz
Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/Makefile | 2 +-
 drivers/gpu/drm/radeon/amdkfd/cik_regs.h | 220 +
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c | 305
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h | 88 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 11 +
 5 files changed, 625 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/cik_regs.h
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile index 392728a..9f8de8d 100644 --- a/drivers/gpu/drm/radeon/amdkfd/Makefile +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -6,6 +6,6 @@ ccflags-y := -Iinclude/drm amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \ kfd_pasid.o kfd_doorbell.o kfd_aperture.o \ - kfd_process.o kfd_queue.o + kfd_process.o kfd_queue.o kfd_mqd_manager.o obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/cik_regs.h b/drivers/gpu/drm/radeon/amdkfd/cik_regs.h new file mode 100644 index 000..a6404e3 --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/cik_regs.h @@ -0,0 +1,220 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ + +#ifndef CIK_REGS_H +#define CIK_REGS_H + +#define IH_VMID_0_LUT 0x3D40u + +#define BIF_DOORBELL_CNTL 0x530Cu + +#define SRBM_GFX_CNTL 0xE44 +#define PIPEID(x) ((x) << 0) +#define MEID(x) ((x) << 2) +#define VMID(x) ((x) << 4) +#define QUEUEID(x) ((x) << 8) + +#define SQ_CONFIG 0x8C00 + +#define SH_MEM_BASES 0x8C28 +/* if PTR32, these are the bases for scratch and lds */ +#define PRIVATE_BASE(x) ((x) << 0) /* scratch */ +#define SHARED_BASE(x) ((x) << 16) /* LDS */ +#define SH_MEM_APE1_BASE 0x8C2C +/* if PTR32, this is the base location of GPUVM */ +#define SH_MEM_APE1_LIMIT 0x8C30 +/* if PTR32, this is the upper limit of GPUVM */ +#define SH_MEM_CONFIG 0x8C34 +#define PTR32 (1 << 0) +#define PRIVATE_ATC (1 << 1) +#define ALIGNMENT_MODE(x) ((x) << 2) +#define SH_MEM_ALIGNMENT_MODE_DWORD 0 +#define SH_MEM_ALIGNMENT_MODE_DWORD_STRICT 1 +#define SH_MEM_ALIGNMENT_MODE_STRICT 2 +#define SH_MEM_ALIGNMENT_MODE_UNALIGNED 3 +#define DEFAULT_MTYPE(x) ((x) << 4) +#define APE1_MTYPE(x) ((x) << 7) + +/* valid for both DEFAULT_MTYPE and APE1_MTYPE */ +#define MTYPE_CACHED 0 +#define MTYPE_NONCACHED 3 + +
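As an illustration of how the SRBM_GFX_CNTL field macros above compose, here is a minimal sketch. The four macros are copied from the cik_regs.h hunk; the helper srbm_gfx_cntl_value() is hypothetical and not part of the patch, and the field widths are only what the shift amounts suggest.

```c
#include <stdint.h>

/* Field macros as they appear in the cik_regs.h hunk. */
#define PIPEID(x)  ((x) << 0)
#define MEID(x)    ((x) << 2)
#define VMID(x)    ((x) << 4)
#define QUEUEID(x) ((x) << 8)

/*
 * Hypothetical helper (not in the patch): OR the four fields into one
 * register value. Callers are assumed to pass values that fit the field,
 * since the macros shift but do not mask.
 */
static inline uint32_t srbm_gfx_cntl_value(uint32_t pipe, uint32_t me,
					   uint32_t vmid, uint32_t queue)
{
	return PIPEID(pipe) | MEID(me) | VMID(vmid) | QUEUEID(queue);
}
```

For example, pipe 1, ME 1, VMID 8 and queue 3 give 0x001 | 0x004 | 0x080 | 0x300 = 0x385.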
[PATCH v3 12/23] amdkfd: Add queue module
From: Ben Goz

The queue module enables allocating and initializing queues uniformly.

v3: remove typedef
v3: break pr_debug to one line
v3: remove memset
v3: add documentation

Signed-off-by: Ben Goz
Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/Makefile | 2 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 123 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c | 85 +
 3 files changed, 208 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c

diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile index 6d6746e..392728a 100644 --- a/drivers/gpu/drm/radeon/amdkfd/Makefile +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -6,6 +6,6 @@ ccflags-y := -Iinclude/drm amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \ kfd_pasid.o kfd_doorbell.o kfd_aperture.o \ - kfd_process.o + kfd_process.o kfd_queue.o obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h index b55b1cb..cf6d40d 100644 --- a/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_priv.h @@ -56,7 +56,6 @@ extern int max_num_of_queues_per_process; #define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS_DEFAULT 128 #define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS 1024 - struct kfd_device_info { const struct kfd_scheduler_class *scheduler_class; unsigned int max_pasid_bits; @@ -116,6 +115,128 @@ void kfd_chardev_exit(void); struct device *kfd_chardev(void); +/** + * enum kfd_queue_type + * + * @KFD_QUEUE_TYPE_COMPUTE: Regular user mode queue type. + * + * @KFD_QUEUE_TYPE_SDMA: Sdma user mode queue type. + * + * @KFD_QUEUE_TYPE_HIQ: HIQ queue type. + * + * @KFD_QUEUE_TYPE_DIQ: DIQ queue type. + */ +enum kfd_queue_type { + KFD_QUEUE_TYPE_COMPUTE, + KFD_QUEUE_TYPE_SDMA, + KFD_QUEUE_TYPE_HIQ, + KFD_QUEUE_TYPE_DIQ +}; + +/** + * struct queue_properties + * + * @type: The queue type. + * + * @queue_id: Queue identifier. 
+ * + * @queue_address: Queue ring buffer address. + * + * @queue_size: Queue ring buffer size. + * + * @priority: Defines the queue priority relative to other queues in the process. + * This is just an indication and HW scheduling may override the priority as + * necessary while keeping the relative prioritization. + * the priority granularity is from 0 to f which f is the highest priority. + * currently all queues are initialized with the highest priority. + * + * @queue_percent: This field is partially implemented and currently a zero in + * this field defines that the queue is non active. + * + * @read_ptr: User space address which points to the number of dwords the + * cp read from the ring buffer. This field updates automatically by the H/W. + * + * @write_ptr: Defines the number of dwords written to the ring buffer. + * + * @doorbell_ptr: This field aim is to notify the H/W of new packet written to + * the queue ring buffer. This field should be similar to write_ptr and the user + * should update this field after he updated the write_ptr. + * + * @doorbell_off: The doorbell offset in the doorbell pci-bar. + * + * @is_interop: Defines if this is a interop queue. Interop queue means that the + * queue can access both graphics and compute resources. + * + * @is_active: Defines if the queue is active or not. + * + * @vmid: If the scheduling mode is no cp scheduling the field defines the vmid + * of the queue. + * + * This structure represents the queue properties for each queue no matter if + * it's user mode or kernel mode queue. 
+ * + */ +struct queue_properties { + enum kfd_queue_type type; + unsigned int queue_id; + uint64_t queue_address; + uint64_t queue_size; + uint32_t priority; + uint32_t queue_percent; + uint32_t *read_ptr; + uint32_t *write_ptr; + uint32_t *doorbell_ptr; + uint32_t doorbell_off; + bool is_interop; + bool is_active; + /* Not relevant for user mode queues in cp scheduling */ + unsigned int vmid; +}; + +/** + * struct queue + * + * @list: Queue linked list. + * + * @mqd: The queue MQD. + * + * @mqd_mem_obj: The MQD local gpu memory object. + * + * @gart_mqd_addr: The MQD gart mc address. + * + * @properties: The queue properties. + * + * @mec: Used only in no cp scheduling mode and identifies the micro engine id + * that the queue should be executed on. + * + * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe id. + * + * @queue: Used only in no cp scheduling mode and identifies the queue's slot. + * + * @process: The kfd process that created this queue. + * + * @device: The kfd device that created this queue. + * + * This structure represents user mode compute queues. + * It contains all the necessary data to handle such queues. + * + */ +
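The read_ptr / write_ptr / doorbell_ptr documentation above describes a producer protocol: write packets into the ring, advance the write pointer, then ring the doorbell with the same value. A rough sketch of that ordering follows. Everything here is an assumption for illustration: struct ring_sketch and ring_submit() are hypothetical names, the pointers are treated as free-running dword counters, the ring size is assumed to be a power of two, and the memory barriers a real driver or runtime would need are omitted.

```c
#include <stdint.h>

/* Hypothetical stand-in for the pointer fields of struct queue_properties. */
struct ring_sketch {
	uint32_t *ring;         /* ring buffer storage, size_dw dwords */
	uint32_t size_dw;       /* ring size in dwords, assumed power of two */
	uint32_t *read_ptr;     /* advanced by the consumer (the H/W in amdkfd) */
	uint32_t *write_ptr;    /* advanced by the producer */
	uint32_t *doorbell_ptr; /* written last, to notify the consumer */
};

/* Copy n_dw dwords into the ring; return 0 on success, -1 if they do not fit. */
static int ring_submit(struct ring_sketch *q, const uint32_t *pkt, uint32_t n_dw)
{
	uint32_t wptr = *q->write_ptr;
	uint32_t used = wptr - *q->read_ptr; /* unsigned wrap keeps this correct */
	uint32_t i;

	if (n_dw > q->size_dw - used)
		return -1;

	for (i = 0; i < n_dw; i++)
		q->ring[(wptr + i) & (q->size_dw - 1)] = pkt[i];

	/* Update write_ptr first, then the doorbell, as the kernel-doc describes. */
	*q->write_ptr = wptr + n_dw;
	*q->doorbell_ptr = wptr + n_dw;
	return 0;
}
```

The point of the sketch is only the ordering (payload, write pointer, doorbell); the actual pointer encoding used by the hardware is not stated in this patch.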
[PATCH v3 11/23] amdkfd: Add binding/unbinding calls to amd_iommu driver
This patch adds the functions to bind a pasid to a device and unbind it through the amd_iommu driver. The unbind function is called when the mm_struct of the process is released. The bind function is not called here because it is called only from the IOCTLs, which are not yet implemented at this stage of the patchset.

Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c | 86 -
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 1 +
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c | 12
 3 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c index 8e2b075..ce90592 100644 --- a/drivers/gpu/drm/radeon/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_device.c @@ -98,6 +98,63 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev) return kfd; } +static bool device_iommu_pasid_init(struct kfd_dev *kfd) +{ + const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP | AMD_IOMMU_DEVICE_FLAG_PRI_SUP + | AMD_IOMMU_DEVICE_FLAG_PASID_SUP; + + struct amd_iommu_device_info iommu_info; + unsigned int pasid_limit; + int err; + + err = amd_iommu_device_info(kfd->pdev, &iommu_info); + if (err < 0) { + dev_err(kfd_device, "error getting iommu info. is the iommu enabled?\n"); + return false; + } + + if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) { + dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n", + (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0, + (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0, + (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0); + return false; + } + + pasid_limit = min_t(unsigned int, + (unsigned int)1 << kfd->device_info->max_pasid_bits, + iommu_info.max_pasids); + /* +* last pasid is used for kernel queues doorbells +* in the future the last pasid might be used for a kernel thread. 
+*/ + pasid_limit = min_t(unsigned int, + pasid_limit, + kfd->doorbell_process_limit - 1); + + err = amd_iommu_init_device(kfd->pdev, pasid_limit); + if (err < 0) { + dev_err(kfd_device, "error initializing iommu device\n"); + return false; + } + + if (!kfd_set_pasid_limit(pasid_limit)) { + dev_err(kfd_device, "error setting pasid limit\n"); + amd_iommu_free_device(kfd->pdev); + return false; + } + + return true; +} + +static void iommu_pasid_shutdown_callback(struct pci_dev *pdev, int pasid) +{ + struct kfd_dev *dev = kfd_device_by_pci_dev(pdev); + + if (dev) + kfd_unbind_process_from_device(dev, pasid); +} + bool kgd2kfd_device_init(struct kfd_dev *kfd, const struct kgd2kfd_shared_resources *gpu_resources) { @@ -129,6 +186,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, goto kfd_topology_add_device_error; } + if (!device_iommu_pasid_init(kfd)) { + dev_err(kfd_device, + "Error initializing iommuv2 for device (%x:%x)\n", + kfd->pdev->vendor, kfd->pdev->device); + goto device_iommu_pasid_error; + } + amd_iommu_set_invalidate_ctx_cb(kfd->pdev, iommu_pasid_shutdown_callback); kfd->init_complete = true; dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor, @@ -136,6 +200,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, goto out; +device_iommu_pasid_error: + kfd_topology_remove_device(kfd); kfd_topology_add_device_error: kfd2kgd->fini_sa_manager(kfd->kgd); dev_err(kfd_device, @@ -147,7 +213,10 @@ out: void kgd2kfd_device_exit(struct kfd_dev *kfd) { - kfd_topology_remove_device(kfd); + if (kfd->init_complete) { + amd_iommu_free_device(kfd->pdev); + kfd_topology_remove_device(kfd); + } kfree(kfd); } @@ -155,12 +224,27 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd) void kgd2kfd_suspend(struct kfd_dev *kfd) { BUG_ON(kfd == NULL); + + if (kfd->init_complete) + amd_iommu_free_device(kfd->pdev); } int kgd2kfd_resume(struct kfd_dev *kfd) { + unsigned int pasid_limit; + int err; + BUG_ON(kfd == NULL); + pasid_limit = kfd_get_pasid_limit(); + + if 
(kfd->init_complete) { + err = amd_iommu_init_device(kfd->pdev, pasid_limit); + if (err < 0) + return -ENXIO; + amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
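The pasid limit computed in device_iommu_pasid_init() above is the minimum of three quantities: what the device supports (1 << max_pasid_bits), what the IOMMU reports (max_pasids), and the doorbell process limit minus one (the last pasid being reserved for kernel queue doorbells). A standalone sketch of that arithmetic, outside the kernel, might look like the following. compute_pasid_limit() and min_uint() are hypothetical names standing in for the kernel's min_t() usage; max_pasid_bits is assumed to be below 32 and doorbell_process_limit at least 1.

```c
/* Userspace stand-in for the kernel's min_t(unsigned int, a, b). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper mirroring the two min_t() steps in the patch:
 * first clamp to the device and IOMMU capabilities, then reserve the
 * last pasid by clamping to doorbell_process_limit - 1.
 */
static unsigned int compute_pasid_limit(unsigned int max_pasid_bits,
					unsigned int iommu_max_pasids,
					unsigned int doorbell_process_limit)
{
	unsigned int limit = min_uint(1u << max_pasid_bits, iommu_max_pasids);

	return min_uint(limit, doorbell_process_limit - 1);
}
```

With 16 pasid bits, an IOMMU reporting 65536 pasids and a doorbell process limit of 1024, the result is 1023.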
[PATCH v3 10/23] amdkfd: Add basic modules to amdkfd
From: Andrew Lewycky

This patch adds the process module and three helper modules:

- kfd_process, which handles processes that open /dev/kfd
- kfd_doorbell, which provides helper functions for doorbell allocation, release and mapping to userspace
- kfd_pasid, which provides helper functions for pasid allocation and release
- kfd_aperture, which provides helper functions for managing the LDS, Local GPU memory and Scratch memory apertures of the process

This patch only contains the basic kfd_process module, which doesn't contain the reference to the queue scheduler. This was done to allow easier code review. Also, this patch doesn't contain the calls to the IOMMU driver for binding the pasid to the device. Again, this was done to allow easier code review.

The kfd_process object is created when a process opens /dev/kfd and is destroyed when the mm_struct of that process is torn down.

v3: Remove kfd_vidmem file
v3: Replace direct mmput call to mmu_notifier release
v3: remove typedefs
v3: move bool to end of struct
v3: Add new kernel params for gart usage limitation
v3: init sa manager
v3: fix debug msgs
v3: remove support for LDS in 32 bit
v3: Change code to support mmap of doorbell pages from userspace
v3: Add documentation for apertures

Signed-off-by: Andrew Lewycky
Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/amdkfd/Makefile | 4 +-
 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c | 350 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 31 ++-
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c | 45 +++-
 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c | 236 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c | 32 ++-
 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c | 95
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 141 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c | 319
 drivers/gpu/drm/radeon/radeon_kfd.c | 21 +-
 10 files changed, 1247 insertions(+), 27 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c
 create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c
create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_process.c diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile index 08ecfcd..6d6746e 100644 --- a/drivers/gpu/drm/radeon/amdkfd/Makefile +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -4,6 +4,8 @@ ccflags-y := -Iinclude/drm -amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o +amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \ + kfd_pasid.o kfd_doorbell.o kfd_aperture.o \ + kfd_process.o obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c b/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c new file mode 100644 index 000..8cfb720 --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c @@ -0,0 +1,350 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "kfd_priv.h" +#include +#include +#include + +/* + * The primary memory I/O features being added for revisions of gfxip + * beyond 7.0 (Kaveri) are: + * + * Access to ATC/IOMMU mapped memory w/ associated extension of VA to 48b + * + * ?Flat? shader memory access ? These are new shader vector memory operations + * that do not reference a T#/V# so a ?pointer? is what is sourced from the + * vector gprs for direct access to memory. This pointer space has the + * Shared(LDS) and Private(Scratch) memory mapped into this pointer space as + * apertures. The hardware then determines how to direct the memory request + * based on what
[PATCH v3 09/23] amdkfd: Add topology module to amdkfd
From: Evgeny Pinchuk

This patch adds the topology module to the driver. The topology is exposed to userspace through sysfs. The calls to add a device to the topology and to remove it are made by the radeon driver.

The CPU information provided in the topology section of the amdkfd driver is extracted from the CRAT table, unlike the CPU information located in /sys/devices/system/cpu/cpu*, which is extracted from the SRAT table. While the CPU information provided by the CRAT and the SRAT tables might be identical, the node topology might differ. The SRAT table contains the topology of CPU nodes only. The CRAT table contains the topology of CPU and GPU nodes together (and they can be interleaved). For example, CPU node 1 in SRAT can be CPU node 3 in CRAT. Furthermore, it is worth mentioning that the CRAT table contains only HSA compatible nodes (nodes that comply with the HSA spec).

To recap, amdkfd exposes a different kind of topology than the one exposed by /sys/devices/system/cpu/cpu, even though it may contain similar information.

Signed-off-by: Evgeny Pinchuk Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/amdkfd/Makefile |2 +- drivers/gpu/drm/radeon/amdkfd/kfd_crat.h | 294 +++ drivers/gpu/drm/radeon/amdkfd/kfd_device.c |7 + drivers/gpu/drm/radeon/amdkfd/kfd_module.c |7 + drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 17 + drivers/gpu/drm/radeon/amdkfd/kfd_topology.c | 1207 ++ drivers/gpu/drm/radeon/amdkfd/kfd_topology.h | 168 7 files changed, 1701 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_crat.h create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_topology.c create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_topology.h diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile index 9564e75..08ecfcd 100644 --- a/drivers/gpu/drm/radeon/amdkfd/Makefile +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -4,6 +4,6 @@ ccflags-y := -Iinclude/drm -amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o +amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h b/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h new file mode 100644 index 000..a374fa3 --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_crat.h @@ -0,0 +1,294 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#ifndef KFD_CRAT_H_INCLUDED +#define KFD_CRAT_H_INCLUDED + +#include + +#pragma pack(1) + +/* + * 4CC signature values for the CRAT and CDIT ACPI tables + */ + +#define CRAT_SIGNATURE "CRAT" +#define CDIT_SIGNATURE "CDIT" + +/* + * Component Resource Association Table (CRAT) + */ + +#define CRAT_OEMID_LENGTH 6 +#define CRAT_OEMTABLEID_LENGTH 8 +#define CRAT_RESERVED_LENGTH 6 + +#define CRAT_OEMID_64BIT_MASK ((1ULL << (CRAT_OEMID_LENGTH * 8)) - 1) + +struct crat_header { + uint32_t signature; + uint32_t length; + uint8_t revision; + uint8_t checksum; + uint8_t oem_id[CRAT_OEMID_LENGTH]; + uint8_t oem_table_id[CRAT_OEMTABLEID_LENGTH]; + uint32_t oem_revision; + uint32_t creator_id; + uint32_t creator_revision; + uint32_t total_entries; + uint16_t num_domains; + uint8_t reserved[CRAT_RESERVED_LENGTH]; +}; + +/* + * The header structure is immediately followed by total_entries of the + * data definitions + */ + +/* + * The currently defined subtype entries in the CRAT + */ +#define CRAT_SUBTYPE_COMPUTEUNIT_AFFINITY 0 +#define CRAT_SUBTYPE_MEMORY_AFFINITY 1 +#define CRAT_SUBTYPE_CACHE_AFFINITY 2 +#define CRAT_SUBTYPE_TLB_AFFINITY 3 +#define
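To make the crat_header layout and CRAT_OEMID_64BIT_MASK above a little more concrete, here is a small self-contained sketch. The struct is a copy of the header from the hunk under a different name (crat_header_sketch); crat_oem_id_as_u64() is a hypothetical helper, since this excerpt does not show how the driver actually uses the mask. The size and offset checks simply follow from summing the field sizes under #pragma pack(1).

```c
#include <stddef.h>
#include <stdint.h>

#define CRAT_OEMID_LENGTH 6
#define CRAT_OEMTABLEID_LENGTH 8
#define CRAT_RESERVED_LENGTH 6

/* Six bytes of OEM ID are 48 bits, so the mask is 0x0000FFFFFFFFFFFF. */
#define CRAT_OEMID_64BIT_MASK ((1ULL << (CRAT_OEMID_LENGTH * 8)) - 1)

#pragma pack(1)
struct crat_header_sketch {
	uint32_t signature;
	uint32_t length;
	uint8_t revision;
	uint8_t checksum;
	uint8_t oem_id[CRAT_OEMID_LENGTH];
	uint8_t oem_table_id[CRAT_OEMTABLEID_LENGTH];
	uint32_t oem_revision;
	uint32_t creator_id;
	uint32_t creator_revision;
	uint32_t total_entries;
	uint16_t num_domains;
	uint8_t reserved[CRAT_RESERVED_LENGTH];
};
#pragma pack()

/* 4+4+1+1+6+8+4+4+4+4+2+6 = 48 bytes; oem_id starts after 10 bytes. */
_Static_assert(sizeof(struct crat_header_sketch) == 48, "unexpected header size");
_Static_assert(offsetof(struct crat_header_sketch, oem_id) == 10, "unexpected oem_id offset");

/*
 * Hypothetical helper: fold the 6 OEM ID bytes into the low 48 bits of a
 * 64-bit value (byte 0 in the least significant position), so two OEM IDs
 * can be compared with a single integer comparison.
 */
static uint64_t crat_oem_id_as_u64(const uint8_t oem_id[CRAT_OEMID_LENGTH])
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < CRAT_OEMID_LENGTH; i++)
		v |= (uint64_t)oem_id[i] << (8 * i);

	return v & CRAT_OEMID_64BIT_MASK;
}
```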
[PATCH v3 08/23] amdkfd: Add amdkfd skeleton driver
This patch adds the amdkfd skeleton driver. The driver does nothing except define a /dev/kfd device. It returns -ENODEV on all amdkfd IOCTLs. (v3) move bool to end of struct (v3) remove pmc ioctls (v3) add meaningful error message for ioctl error Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/Kconfig | 2 + drivers/gpu/drm/radeon/Makefile | 2 + drivers/gpu/drm/radeon/amdkfd/Kconfig | 10 ++ drivers/gpu/drm/radeon/amdkfd/Makefile | 9 ++ drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 187 drivers/gpu/drm/radeon/amdkfd/kfd_device.c | 129 +++ drivers/gpu/drm/radeon/amdkfd/kfd_module.c | 98 +++ drivers/gpu/drm/radeon/amdkfd/kfd_priv.h| 82 8 files changed, 519 insertions(+) create mode 100644 drivers/gpu/drm/radeon/amdkfd/Kconfig create mode 100644 drivers/gpu/drm/radeon/amdkfd/Makefile create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device.c create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_module.c create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h diff --git a/drivers/gpu/drm/radeon/Kconfig b/drivers/gpu/drm/radeon/Kconfig index 970f8e9..b697321 100644 --- a/drivers/gpu/drm/radeon/Kconfig +++ b/drivers/gpu/drm/radeon/Kconfig @@ -6,3 +6,5 @@ config DRM_RADEON_UMS Userspace modesetting is deprecated for quite some time now, so enable this only if you have ancient versions of the DDX drivers. 
+ +source "drivers/gpu/drm/radeon/amdkfd/Kconfig" diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile index 1476103..8ab6b6e 100644 --- a/drivers/gpu/drm/radeon/Makefile +++ b/drivers/gpu/drm/radeon/Makefile @@ -112,4 +112,6 @@ radeon-$(CONFIG_ACPI) += radeon_acpi.o obj-$(CONFIG_DRM_RADEON)+= radeon.o +obj-$(CONFIG_HSA_RADEON)+= amdkfd/ + CFLAGS_radeon_trace_points.o := -I$(src) diff --git a/drivers/gpu/drm/radeon/amdkfd/Kconfig b/drivers/gpu/drm/radeon/amdkfd/Kconfig new file mode 100644 index 000..900bb34 --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/Kconfig @@ -0,0 +1,10 @@ +# +# Heterogenous system architecture configuration +# + +config HSA_RADEON + tristate "HSA kernel driver for AMD Radeon devices" + depends on DRM_RADEON && AMD_IOMMU_V2 && X86_64 + default m + help + Enable this if you want to use HSA features on AMD radeon devices. diff --git a/drivers/gpu/drm/radeon/amdkfd/Makefile b/drivers/gpu/drm/radeon/amdkfd/Makefile new file mode 100644 index 000..9564e75 --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/Makefile @@ -0,0 +1,9 @@ +# +# Makefile for Heterogenous System Architecture support for AMD radeon devices +# + +ccflags-y := -Iinclude/drm + +amdkfd-y := kfd_module.o kfd_device.o kfd_chardev.o + +obj-$(CONFIG_HSA_RADEON) += amdkfd.o diff --git a/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c new file mode 100644 index 000..f198e5a --- /dev/null +++ b/drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c @@ -0,0 +1,187 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "kfd_priv.h" + +static long kfd_ioctl(struct file *, unsigned int, unsigned long); +static int kfd_open(struct inode *, struct file *); + +static const char kfd_dev_name[] = "kfd"; + +static const struct file_operations kfd_fops = { + .owner = THIS_MODULE, + .unlocked_ioctl = kfd_ioctl, + .compat_ioctl = kfd_ioctl, + .open = kfd_open, +}; + +static int kfd_char_dev_major = -1; +static struct class *kfd_class; +struct device *kfd_device; + +int kfd_chardev_init(void) +{ + int err = 0; + +
[PATCH v3 07/23] amdkfd: Add IOCTL set definitions of amdkfd
- KFD_IOC_GET_VERSION: Retrieves the interface version of amdkfd
- KFD_IOC_CREATE_QUEUE: Creates a usermode queue that runs on a specific GPU device
- KFD_IOC_DESTROY_QUEUE: Destroys an existing usermode queue
- KFD_IOC_SET_MEMORY_POLICY: Sets the memory policy of the default and alternate aperture of the calling process
- KFD_IOC_GET_CLOCK_COUNTERS: Retrieves counters (timestamps) of CPU and GPU
- KFD_IOC_GET_PROCESS_APERTURES: Retrieves information about process apertures that were initialized during the open() call of the amdkfd device
- KFD_IOC_UPDATE_QUEUE: Updates configuration of an existing usermode queue

(v3) remove pragma pack
(v3) remove pmc ioctls
(v3) add parameter for doorbell offset
(v3) add comment on counters

Signed-off-by: Oded Gabbay
---
 include/uapi/linux/kfd_ioctl.h | 123 +
 1 file changed, 123 insertions(+)
 create mode 100644 include/uapi/linux/kfd_ioctl.h

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
new file mode 100644
index 000..a06e021
--- /dev/null
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -0,0 +1,123 @@
+/*
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef KFD_IOCTL_H_INCLUDED
+#define KFD_IOCTL_H_INCLUDED
+
+#include
+#include
+
+#define KFD_IOCTL_CURRENT_VERSION 1
+
+struct kfd_ioctl_get_version_args {
+	uint32_t min_supported_version;	/* from KFD */
+	uint32_t max_supported_version;	/* from KFD */
+};
+
+/* For kfd_ioctl_create_queue_args.queue_type. */
+#define KFD_IOC_QUEUE_TYPE_COMPUTE	0
+#define KFD_IOC_QUEUE_TYPE_SDMA		1
+
+struct kfd_ioctl_create_queue_args {
+	uint64_t ring_base_address;	/* to KFD */
+	uint64_t write_pointer_address;	/* from KFD */
+	uint64_t read_pointer_address;	/* from KFD */
+	uint64_t doorbell_offset;	/* from KFD */
+
+	uint32_t ring_size;		/* to KFD */
+	uint32_t gpu_id;		/* to KFD */
+	uint32_t queue_type;		/* to KFD */
+	uint32_t queue_percentage;	/* to KFD */
+	uint32_t queue_priority;	/* to KFD */
+	uint32_t queue_id;		/* from KFD */
+};
+
+struct kfd_ioctl_destroy_queue_args {
+	uint32_t queue_id;		/* to KFD */
+};
+
+struct kfd_ioctl_update_queue_args {
+	uint64_t ring_base_address;	/* to KFD */
+
+	uint32_t queue_id;		/* to KFD */
+	uint32_t ring_size;		/* to KFD */
+	uint32_t queue_percentage;	/* to KFD */
+	uint32_t queue_priority;	/* to KFD */
+};
+
+/* For kfd_ioctl_set_memory_policy_args.default_policy and alternate_policy */
+#define KFD_IOC_CACHE_POLICY_COHERENT 0
+#define KFD_IOC_CACHE_POLICY_NONCOHERENT 1
+
+struct kfd_ioctl_set_memory_policy_args {
+	uint64_t alternate_aperture_base;	/* to KFD */
+	uint64_t alternate_aperture_size;	/* to KFD */
+
+	uint32_t gpu_id;			/* to KFD */
+	uint32_t default_policy;		/* to KFD */
+	uint32_t alternate_policy;		/* to KFD */
+};
+
+/*
+ * All counters are monotonic. They are used for profiling of compute jobs.
+ * The profiling is done by userspace.
+ *
+ * In case of GPU reset, the counter should not be affected.
+ */
+
+struct kfd_ioctl_get_clock_counters_args {
+	uint64_t gpu_clock_counter;	/* from KFD */
+	uint64_t cpu_clock_counter;	/* from KFD */
+	uint64_t system_clock_counter;	/* from KFD */
+	uint64_t system_clock_freq;	/* from KFD */
+
+	uint32_t gpu_id;		/* to KFD */
+};
+
+#define NUM_OF_SUPPORTED_GPUS 7
+
+struct kfd_process_device_apertures {
+	uint64_t lds_base;		/* from KFD */
+	uint64_t lds_limit;		/* from KFD */
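The version struct above suggests a simple range check on the userspace side. Below is a minimal sketch of such a check, assuming a client only wants to know whether the interface version it was built against falls inside the range reported by KFD. The struct is mirrored locally so the snippet compiles on its own; `kfd_version_is_compatible` is a hypothetical helper and is not part of the patch, and no real ioctl call is made here because the ioctl number macros are not visible in this excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the struct from the patch, so the sketch is standalone. */
struct kfd_ioctl_get_version_args {
	uint32_t min_supported_version;	/* from KFD */
	uint32_t max_supported_version;	/* from KFD */
};

#define KFD_IOCTL_CURRENT_VERSION 1

/*
 * Hypothetical helper (not in the patch): returns true when the version
 * the client wants lies inside the [min, max] range reported by KFD.
 */
static bool kfd_version_is_compatible(const struct kfd_ioctl_get_version_args *v,
				      uint32_t wanted)
{
	/* A reversed range means the reply is not usable at all. */
	if (v->min_supported_version > v->max_supported_version)
		return false;

	return wanted >= v->min_supported_version &&
	       wanted <= v->max_supported_version;
}
```

A caller would fill the struct from the KFD_IOC_GET_VERSION reply and pass KFD_IOCTL_CURRENT_VERSION as `wanted`.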
[PATCH v3 06/23] Update MAINTAINERS and CREDITS files with amdkfd info
Signed-off-by: Oded Gabbay
---
 CREDITS     |  7 +++
 MAINTAINERS | 10 ++
 2 files changed, 17 insertions(+)

diff --git a/CREDITS b/CREDITS
index 28ee151..e9628d5 100644
--- a/CREDITS
+++ b/CREDITS
@@ -1197,6 +1197,13 @@ S: R. Tocantins, 89 - Cristo Rei
 S: 80050-430 - Curitiba - Paraná
 S: Brazil
 
+N: Oded Gabbay
+E: oded.gabbay at gmail.com
+D: AMD KFD maintainer
+S: 12 Shraga Raphaeli
+S: Petah-Tikva, 4906418
+S: Israel
+
 N: Kumar Gala
 E: galak at kernel.crashing.org
 D: Embedded PowerPC 6xx/7xx/74xx/82xx/83xx/85xx support
diff --git a/MAINTAINERS b/MAINTAINERS
index d76e077..da3aecb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -589,6 +589,16 @@ F: drivers/crypto/geode*
 F: drivers/video/geode/
 F: arch/x86/include/asm/geode.h
 
+AMD KFD (radeon extension)
+M: Oded Gabbay
+L: dri-devel at lists.freedesktop.org
+T: git git://people.freedesktop.org/~gabbayo/linux.git
+S: Supported
+F: drivers/gpu/drm/radeon/amdkfd/*
+F: drivers/gpu/drm/radeon/radeon_kfd.c
+F: drivers/gpu/drm/radeon/radeon_kfd.h
+F: include/linux/uapi/linux/kfd_ioctl.h
+
 AMD IOMMU (AMD-VI)
 M: Joerg Roedel
 L: iommu at lists.linux-foundation.org
-- 
1.9.1
[PATCH v3 05/23] drm/radeon: Add radeon <--> amdkfd interface
This patch adds the interface between the radeon driver and the amdkfd driver. The interface implementation is contained in radeon_kfd.c and radeon_kfd.h.

The interface itself is represented by a pointer to struct kfd_dev. The pointer is located inside the radeon_device structure.

All the register accesses that amdkfd needs are done using this interface. This allows us to avoid direct register accesses in amdkfd proper, while also avoiding locking between amdkfd and radeon.

The single exception is the doorbells, which are used by both drivers. However, because they are located in separate PCI BAR pages, the danger of sharing registers between the drivers is minimal.

Having said that, we are planning to move the doorbells to radeon as well.

(v3) Add interface for sa manager init and fini. The init function will allocate a buffer in system memory and pin it to the GART address space via the radeon sa manager.

All mappings of buffers to the GART address space are done via the radeon sa manager. The memory allocation interface uses the radeon sa manager to sub-allocate from the single buffer that was allocated by the init function.
(v3) Change lower_32/upper_32 calls to use linux macros (v3) Add documentation for the interface Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/Makefile | 1 + drivers/gpu/drm/radeon/cik.c| 9 + drivers/gpu/drm/radeon/cik_reg.h| 65 + drivers/gpu/drm/radeon/cikd.h | 51 +++- drivers/gpu/drm/radeon/radeon.h | 4 + drivers/gpu/drm/radeon/radeon_drv.c | 5 + drivers/gpu/drm/radeon/radeon_kfd.c | 538 drivers/gpu/drm/radeon/radeon_kfd.h | 177 drivers/gpu/drm/radeon/radeon_kms.c | 7 + 9 files changed, 856 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.c create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.h diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile index c7fa1ae..1476103 100644 --- a/drivers/gpu/drm/radeon/Makefile +++ b/drivers/gpu/drm/radeon/Makefile @@ -104,6 +104,7 @@ radeon-y += \ radeon_vce.o \ vce_v1_0.o \ vce_v2_0.o \ + radeon_kfd.o radeon-$(CONFIG_COMPAT) += radeon_ioc32.o radeon-$(CONFIG_VGA_SWITCHEROO) += radeon_atpx_handler.o diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index 0096538..f4a65de 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -32,6 +32,7 @@ #include "cik_blit_shaders.h" #include "radeon_ucode.h" #include "clearstate_ci.h" +#include "radeon_kfd.h" MODULE_FIRMWARE("radeon/BONAIRE_pfp.bin"); MODULE_FIRMWARE("radeon/BONAIRE_me.bin"); @@ -7766,6 +7767,9 @@ restart_ih: while (rptr != wptr) { /* wptr/rptr are in bytes! 
*/ ring_index = rptr / 4; + + radeon_kfd_interrupt(rdev, (const void *) >ih.ring[ring_index]); + src_id = le32_to_cpu(rdev->ih.ring[ring_index]) & 0xff; src_data = le32_to_cpu(rdev->ih.ring[ring_index + 1]) & 0xfff; ring_id = le32_to_cpu(rdev->ih.ring[ring_index + 2]) & 0xff; @@ -8453,6 +8457,10 @@ static int cik_startup(struct radeon_device *rdev) if (r) return r; + r = radeon_kfd_resume(rdev); + if (r) + return r; + return 0; } @@ -8501,6 +8509,7 @@ int cik_resume(struct radeon_device *rdev) */ int cik_suspend(struct radeon_device *rdev) { + radeon_kfd_suspend(rdev); radeon_pm_suspend(rdev); dce6_audio_fini(rdev); radeon_vm_manager_fini(rdev); diff --git a/drivers/gpu/drm/radeon/cik_reg.h b/drivers/gpu/drm/radeon/cik_reg.h index ca1bb61..1ab3dbc 100644 --- a/drivers/gpu/drm/radeon/cik_reg.h +++ b/drivers/gpu/drm/radeon/cik_reg.h @@ -147,4 +147,69 @@ #define CIK_LB_DESKTOP_HEIGHT 0x6b0c +struct cik_hqd_registers { + u32 cp_mqd_base_addr; + u32 cp_mqd_base_addr_hi; + u32 cp_hqd_active; + u32 cp_hqd_vmid; + u32 cp_hqd_persistent_state; + u32 cp_hqd_pipe_priority; + u32 cp_hqd_queue_priority; + u32 cp_hqd_quantum; + u32 cp_hqd_pq_base; + u32 cp_hqd_pq_base_hi; + u32 cp_hqd_pq_rptr; + u32 cp_hqd_pq_rptr_report_addr; + u32 cp_hqd_pq_rptr_report_addr_hi; + u32 cp_hqd_pq_wptr_poll_addr; + u32 cp_hqd_pq_wptr_poll_addr_hi; + u32 cp_hqd_pq_doorbell_control; + u32 cp_hqd_pq_wptr; + u32 cp_hqd_pq_control; + u32 cp_hqd_ib_base_addr; + u32 cp_hqd_ib_base_addr_hi; + u32 cp_hqd_ib_rptr; + u32 cp_hqd_ib_control; + u32 cp_hqd_iq_timer; + u32 cp_hqd_iq_rptr; + u32 cp_hqd_dequeue_request; + u32 cp_hqd_dma_offload; + u32 cp_hqd_sema_cmd; + u32 cp_hqd_msg_type; + u32 cp_hqd_atomic0_preop_lo; + u32 cp_hqd_atomic0_preop_hi; + u32 cp_hqd_atomic1_preop_lo; + u32
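The commit message describes sub-allocating amdkfd's GART memory from one buffer that is pinned once at init time. The sketch below illustrates that idea only. It is a plain bump allocator, which is an assumption made for brevity; it is not the radeon sa manager and does not mirror its API. All names (`sub_pool`, `sub_pool_init`, `sub_pool_alloc`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one pinned buffer that is carved into pieces. */
struct sub_pool {
	uint64_t gpu_base;	/* GPU (GART) address of the pinned buffer */
	size_t size;		/* total size of the buffer in bytes */
	size_t used;		/* bytes handed out so far */
};

static void sub_pool_init(struct sub_pool *p, uint64_t gpu_base, size_t size)
{
	p->gpu_base = gpu_base;
	p->size = size;
	p->used = 0;
}

/*
 * Hand out `size` bytes aligned to `align` (must be a power of two).
 * Returns 0 and stores the GPU address on success, -1 when the request
 * is invalid or the pool is exhausted. Nothing is ever freed here, which
 * is the main simplification compared to a real sub-allocator.
 */
static int sub_pool_alloc(struct sub_pool *p, size_t size, size_t align,
			  uint64_t *gpu_addr)
{
	size_t start;

	if (align == 0 || (align & (align - 1)) != 0)
		return -1;

	/* Round the current offset up to the requested alignment. */
	start = (p->used + align - 1) & ~(align - 1);
	if (start > p->size || size > p->size - start)
		return -1;

	*gpu_addr = p->gpu_base + start;
	p->used = start + size;
	return 0;
}
```

The point of the single-buffer design is that only one object has to be pinned in the GART; every later allocation is just an offset into it.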
[PATCH v3 04/23] drm/radeon: adding synchronization for GRBM GFX
Implementing a lock for selecting and accessing shader engines and arrays. This lock will make sure that radeon and amdkfd are not colliding when accessing shader engines and arrays with GRBM_GFX_INDEX register. Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/cik.c | 26 ++ drivers/gpu/drm/radeon/radeon.h| 2 ++ drivers/gpu/drm/radeon/radeon_device.c | 1 + 3 files changed, 29 insertions(+) diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index d54d3d7..0096538 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -1563,6 +1563,8 @@ static const u32 godavari_golden_registers[] = static void cik_init_golden_registers(struct radeon_device *rdev) { + /* Some of the registers might be dependant on GRBM_GFX_INDEX */ + mutex_lock(>grbm_idx_mutex); switch (rdev->family) { case CHIP_BONAIRE: radeon_program_register_sequence(rdev, @@ -1637,6 +1639,7 @@ static void cik_init_golden_registers(struct radeon_device *rdev) default: break; } + mutex_unlock(>grbm_idx_mutex); } /** @@ -3419,6 +3422,7 @@ static void cik_setup_rb(struct radeon_device *rdev, u32 disabled_rbs = 0; u32 enabled_rbs = 0; + mutex_lock(>grbm_idx_mutex); for (i = 0; i < se_num; i++) { for (j = 0; j < sh_per_se; j++) { cik_select_se_sh(rdev, i, j); @@ -3430,6 +3434,7 @@ static void cik_setup_rb(struct radeon_device *rdev, } } cik_select_se_sh(rdev, 0x, 0x); + mutex_unlock(>grbm_idx_mutex); mask = 1; for (i = 0; i < max_rb_num_per_se * se_num; i++) { @@ -3440,6 +3445,7 @@ static void cik_setup_rb(struct radeon_device *rdev, rdev->config.cik.backend_enable_mask = enabled_rbs; + mutex_lock(>grbm_idx_mutex); for (i = 0; i < se_num; i++) { cik_select_se_sh(rdev, i, 0x); data = 0; @@ -3467,6 +3473,7 @@ static void cik_setup_rb(struct radeon_device *rdev, WREG32(PA_SC_RASTER_CONFIG, data); } cik_select_se_sh(rdev, 0x, 0x); + mutex_unlock(>grbm_idx_mutex); } /** @@ -3684,6 +3691,12 @@ static void cik_gpu_init(struct radeon_device *rdev) /* set HW defaults for 3D 
engine */ WREG32(CP_MEQ_THRESHOLDS, MEQ1_START(0x30) | MEQ2_START(0x60)); + mutex_lock(>grbm_idx_mutex); + /* +* making sure that the following register writes will be broadcasted +* to all the shaders +*/ + cik_select_se_sh(rdev, 0x, 0x); WREG32(SX_DEBUG_1, 0x20); WREG32(TA_CNTL_AUX, 0x0001); @@ -3739,6 +3752,7 @@ static void cik_gpu_init(struct radeon_device *rdev) WREG32(PA_CL_ENHANCE, CLIP_VTX_REORDER_ENA | NUM_CLIP_SEQ(3)); WREG32(PA_SC_ENHANCE, ENABLE_PA_SC_OUT_OF_ORDER); + mutex_unlock(>grbm_idx_mutex); udelay(50); } @@ -6036,6 +6050,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device *rdev) u32 i, j, k; u32 mask; + mutex_lock(>grbm_idx_mutex); for (i = 0; i < rdev->config.cik.max_shader_engines; i++) { for (j = 0; j < rdev->config.cik.max_sh_per_se; j++) { cik_select_se_sh(rdev, i, j); @@ -6047,6 +6062,7 @@ static void cik_wait_for_rlc_serdes(struct radeon_device *rdev) } } cik_select_se_sh(rdev, 0x, 0x); + mutex_unlock(>grbm_idx_mutex); mask = SE_MASTER_BUSY_MASK | GC_MASTER_BUSY | TC0_MASTER_BUSY | TC1_MASTER_BUSY; for (k = 0; k < rdev->usec_timeout; k++) { @@ -6181,10 +6197,12 @@ static int cik_rlc_resume(struct radeon_device *rdev) WREG32(RLC_LB_CNTR_INIT, 0); WREG32(RLC_LB_CNTR_MAX, 0x8000); + mutex_lock(>grbm_idx_mutex); cik_select_se_sh(rdev, 0x, 0x); WREG32(RLC_LB_INIT_CU_MASK, 0x); WREG32(RLC_LB_PARAMS, 0x00600408); WREG32(RLC_LB_CNTL, 0x8004); + mutex_unlock(>grbm_idx_mutex); WREG32(RLC_MC_CNTL, 0); WREG32(RLC_UCODE_CNTL, 0); @@ -6251,11 +6269,13 @@ static void cik_enable_cgcg(struct radeon_device *rdev, bool enable) tmp = cik_halt_rlc(rdev); + mutex_lock(>grbm_idx_mutex); cik_select_se_sh(rdev, 0x, 0x); WREG32(RLC_SERDES_WR_CU_MASTER_MASK, 0x); WREG32(RLC_SERDES_WR_NONCU_MASTER_MASK, 0x); tmp2 = BPM_ADDR_MASK | CGCG_OVERRIDE_0 | CGLS_ENABLE; WREG32(RLC_SERDES_WR_CTRL, tmp2); + mutex_unlock(>grbm_idx_mutex); cik_update_rlc(rdev, tmp); @@ -6297,11 +6317,13 @@ static void cik_enable_mgcg(struct radeon_device *rdev, bool enable) tmp = 
cik_halt_rlc(rdev); +
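The locking pattern in this patch is always the same: take the mutex, select a shader engine/array, touch the indexed registers, restore the broadcast selection, drop the mutex. Here is a userspace sketch of that pattern, assuming a mock device with a pthread mutex in place of the kernel mutex. Every name in it (`mock_gpu`, `mock_select_se_sh`, `MOCK_BROADCAST`, and so on) is hypothetical and only models the behaviour described above.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_NUM_SE	2
#define MOCK_NUM_SH	2
#define MOCK_BROADCAST	0xffffffffu	/* "all engines / all arrays" */

/* Hypothetical stand-in for the device: one selector, banked registers. */
struct mock_gpu {
	pthread_mutex_t grbm_idx_mutex;	/* protects sel_se / sel_sh */
	uint32_t sel_se;
	uint32_t sel_sh;
	uint32_t per_sh_reg[MOCK_NUM_SE][MOCK_NUM_SH];
};

static void mock_gpu_init(struct mock_gpu *gpu)
{
	uint32_t se, sh;

	pthread_mutex_init(&gpu->grbm_idx_mutex, NULL);
	gpu->sel_se = MOCK_BROADCAST;
	gpu->sel_sh = MOCK_BROADCAST;
	for (se = 0; se < MOCK_NUM_SE; se++)
		for (sh = 0; sh < MOCK_NUM_SH; sh++)
			gpu->per_sh_reg[se][sh] = se * MOCK_NUM_SH + sh + 1;
}

/* Caller must hold grbm_idx_mutex. */
static void mock_select_se_sh(struct mock_gpu *gpu, uint32_t se, uint32_t sh)
{
	gpu->sel_se = se;
	gpu->sel_sh = sh;
}

/* Caller must hold grbm_idx_mutex; reads the currently selected bank. */
static uint32_t mock_read_selected(const struct mock_gpu *gpu)
{
	if (gpu->sel_se >= MOCK_NUM_SE || gpu->sel_sh >= MOCK_NUM_SH)
		return 0;	/* broadcast selection has no single value */
	return gpu->per_sh_reg[gpu->sel_se][gpu->sel_sh];
}

/* Walk every engine/array under the lock, then restore broadcast. */
static uint32_t mock_sum_all_sh(struct mock_gpu *gpu)
{
	uint32_t se, sh, sum = 0;

	pthread_mutex_lock(&gpu->grbm_idx_mutex);
	for (se = 0; se < MOCK_NUM_SE; se++) {
		for (sh = 0; sh < MOCK_NUM_SH; sh++) {
			mock_select_se_sh(gpu, se, sh);
			sum += mock_read_selected(gpu);
		}
	}
	mock_select_se_sh(gpu, MOCK_BROADCAST, MOCK_BROADCAST);
	pthread_mutex_unlock(&gpu->grbm_idx_mutex);

	return sum;
}
```

Restoring the broadcast selection before unlocking matters as much as the lock itself: the next holder can then rely on a known selector state.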
[PATCH v3 03/23] drm/radeon: Report doorbell configuration to amdkfd
radeon and amdkfd share the doorbell aperture. radeon sets it up, takes the doorbells required for its own rings and reports the setup to amdkfd. radeon reserved doorbells are at the start of the doorbell aperture.

Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/radeon.h        |  4 
 drivers/gpu/drm/radeon/radeon_device.c | 31 +++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 511191f..75bcc04 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -691,6 +691,10 @@ struct radeon_doorbell {
 
 int radeon_doorbell_get(struct radeon_device *rdev, u32 *page);
 void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell);
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+				  phys_addr_t *aperture_base,
+				  size_t *aperture_size,
+				  size_t *start_offset);
 
 /*
  * IRQS.
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index c58f84f..827bcd1 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -373,6 +373,37 @@ void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell)
 	__clear_bit(doorbell, rdev->doorbell.used);
 }
 
+/**
+ * radeon_doorbell_get_kfd_info - Report doorbell configuration required to
+ *                                setup KFD
+ *
+ * @rdev: radeon_device pointer
+ * @aperture_base: output returning doorbell aperture base physical address
+ * @aperture_size: output returning doorbell aperture size in bytes
+ * @start_offset: output returning # of doorbell bytes reserved for radeon.
+ *
+ * Radeon and the KFD share the doorbell aperture. Radeon sets it up,
+ * takes doorbells required for its own rings and reports the setup to KFD.
+ * Radeon reserved doorbells are at the start of the doorbell aperture.
+ */
+void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
+				  phys_addr_t *aperture_base,
+				  size_t *aperture_size,
+				  size_t *start_offset)
+{
+	/* The first num_doorbells are used by radeon.
+	 * KFD takes whatever's left in the aperture. */
+	if (rdev->doorbell.size > rdev->doorbell.num_doorbells * sizeof(u32)) {
+		*aperture_base = rdev->doorbell.base;
+		*aperture_size = rdev->doorbell.size;
+		*start_offset = rdev->doorbell.num_doorbells * sizeof(u32);
+	} else {
+		*aperture_base = 0;
+		*aperture_size = 0;
+		*start_offset = 0;
+	}
+}
+
 /*
  * radeon_wb_*()
  * Writeback is the the method by which the the GPU updates special pages
-- 
1.9.1
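To make the aperture split concrete, here is a standalone model of the same decision, assuming a mock structure in place of `rdev->doorbell` and `uint64_t` in place of `phys_addr_t`. The names `mock_doorbell` and `mock_doorbell_get_kfd_info` are hypothetical; the logic follows the function in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the doorbell fields of radeon_device. */
struct mock_doorbell {
	uint64_t base;		/* physical base of the doorbell aperture */
	size_t size;		/* aperture size in bytes */
	uint32_t num_doorbells;	/* 32-bit doorbells reserved for radeon */
};

/*
 * Same rule as in the patch: radeon owns the first num_doorbells
 * dwords, KFD gets the rest. If nothing is left, report an empty
 * aperture (all outputs zero).
 */
static void mock_doorbell_get_kfd_info(const struct mock_doorbell *db,
				       uint64_t *aperture_base,
				       size_t *aperture_size,
				       size_t *start_offset)
{
	size_t radeon_bytes = (size_t)db->num_doorbells * sizeof(uint32_t);

	if (db->size > radeon_bytes) {
		*aperture_base = db->base;
		*aperture_size = db->size;
		*start_offset = radeon_bytes;
	} else {
		*aperture_base = 0;
		*aperture_size = 0;
		*start_offset = 0;
	}
}
```

With an 8192-byte aperture and 1024 radeon doorbells, KFD would start at byte offset 4096; with a 4096-byte aperture there is nothing left and all three outputs are zero.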
[PATCH v3 02/23] drm/radeon/cik: Don't touch int of pipes 1-7
amdkfd should set interrupts for pipes 1-7. Signed-off-by: Oded Gabbay --- drivers/gpu/drm/radeon/cik.c | 71 +--- 1 file changed, 1 insertion(+), 70 deletions(-) diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index 9571be8..d54d3d7 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -7265,8 +7265,7 @@ static int cik_irq_init(struct radeon_device *rdev) int cik_irq_set(struct radeon_device *rdev) { u32 cp_int_cntl; - u32 cp_m1p0, cp_m1p1, cp_m1p2, cp_m1p3; - u32 cp_m2p0, cp_m2p1, cp_m2p2, cp_m2p3; + u32 cp_m1p0; u32 crtc1 = 0, crtc2 = 0, crtc3 = 0, crtc4 = 0, crtc5 = 0, crtc6 = 0; u32 hpd1, hpd2, hpd3, hpd4, hpd5, hpd6; u32 grbm_int_cntl = 0; @@ -7300,13 +7299,6 @@ int cik_irq_set(struct radeon_device *rdev) dma_cntl1 = RREG32(SDMA0_CNTL + SDMA1_REGISTER_OFFSET) & ~TRAP_ENABLE; cp_m1p0 = RREG32(CP_ME1_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE; - cp_m1p1 = RREG32(CP_ME1_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE; - cp_m1p2 = RREG32(CP_ME1_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE; - cp_m1p3 = RREG32(CP_ME1_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE; - cp_m2p0 = RREG32(CP_ME2_PIPE0_INT_CNTL) & ~TIME_STAMP_INT_ENABLE; - cp_m2p1 = RREG32(CP_ME2_PIPE1_INT_CNTL) & ~TIME_STAMP_INT_ENABLE; - cp_m2p2 = RREG32(CP_ME2_PIPE2_INT_CNTL) & ~TIME_STAMP_INT_ENABLE; - cp_m2p3 = RREG32(CP_ME2_PIPE3_INT_CNTL) & ~TIME_STAMP_INT_ENABLE; if (rdev->flags & RADEON_IS_IGP) thermal_int = RREG32_SMC(CG_THERMAL_INT_CTRL) & @@ -7328,33 +7320,6 @@ int cik_irq_set(struct radeon_device *rdev) case 0: cp_m1p0 |= TIME_STAMP_INT_ENABLE; break; - case 1: - cp_m1p1 |= TIME_STAMP_INT_ENABLE; - break; - case 2: - cp_m1p2 |= TIME_STAMP_INT_ENABLE; - break; - case 3: - cp_m1p2 |= TIME_STAMP_INT_ENABLE; - break; - default: - DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe %d\n", ring->pipe); - break; - } - } else if (ring->me == 2) { - switch (ring->pipe) { - case 0: - cp_m2p0 |= TIME_STAMP_INT_ENABLE; - break; - case 1: - cp_m2p1 |= TIME_STAMP_INT_ENABLE; - 
break; - case 2: - cp_m2p2 |= TIME_STAMP_INT_ENABLE; - break; - case 3: - cp_m2p2 |= TIME_STAMP_INT_ENABLE; - break; default: DRM_DEBUG("si_irq_set: sw int cp1 invalid pipe %d\n", ring->pipe); break; @@ -7371,33 +7336,6 @@ int cik_irq_set(struct radeon_device *rdev) case 0: cp_m1p0 |= TIME_STAMP_INT_ENABLE; break; - case 1: - cp_m1p1 |= TIME_STAMP_INT_ENABLE; - break; - case 2: - cp_m1p2 |= TIME_STAMP_INT_ENABLE; - break; - case 3: - cp_m1p2 |= TIME_STAMP_INT_ENABLE; - break; - default: - DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe %d\n", ring->pipe); - break; - } - } else if (ring->me == 2) { - switch (ring->pipe) { - case 0: - cp_m2p0 |= TIME_STAMP_INT_ENABLE; - break; - case 1: - cp_m2p1 |= TIME_STAMP_INT_ENABLE; - break; - case 2: - cp_m2p2 |= TIME_STAMP_INT_ENABLE; - break; - case 3: - cp_m2p2 |= TIME_STAMP_INT_ENABLE; - break; default: DRM_DEBUG("si_irq_set: sw int cp2 invalid pipe %d\n", ring->pipe); break; @@ -7486,13 +7424,6 @@ int cik_irq_set(struct radeon_device *rdev) WREG32(SDMA0_CNTL +
[PATCH v3 01/23] drm/radeon: reduce number of free VMIDs and pipes in KV
To support HSA on KV, we need to limit the number of vmids and pipes that are available for radeon's use with KV.

This patch reserves VMIDs 8-15 for amdkfd (so radeon can only use VMIDs 0-7) and also makes radeon think that KV has only a single MEC with a single pipe in it.

(v3) Use define for static vmid allocation in radeon

Signed-off-by: Oded Gabbay
---
 drivers/gpu/drm/radeon/cik.c  | 48 +--
 drivers/gpu/drm/radeon/cikd.h |  2 ++
 2 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index b625646..9571be8 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4660,12 +4660,11 @@ static int cik_mec_init(struct radeon_device *rdev)
 	/*
 	 * KV:    2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
 	 * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
+	 * Nonetheless, we assign only 1 pipe because all other pipes will
+	 * be handled by KFD
 	 */
-	if (rdev->family == CHIP_KAVERI)
-		rdev->mec.num_mec = 2;
-	else
-		rdev->mec.num_mec = 1;
-	rdev->mec.num_pipe = 4;
+	rdev->mec.num_mec = 1;
+	rdev->mec.num_pipe = 1;
 	rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;
 
 	if (rdev->mec.hpd_eop_obj == NULL) {
@@ -4807,28 +4806,24 @@ static int cik_cp_compute_resume(struct radeon_device *rdev)
 
 	/* init the pipes */
 	mutex_lock(&rdev->srbm_mutex);
-	for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
-		int me = (i < 4) ? 1 : 2;
-		int pipe = (i < 4) ? i : (i - 4);
 
-		eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i * MEC_HPD_SIZE * 2);
+	eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;
 
-		cik_srbm_select(rdev, me, pipe, 0, 0);
+	cik_srbm_select(rdev, 0, 0, 0, 0);
 
-		/* write the EOP addr */
-		WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
-		WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);
+	/* write the EOP addr */
+	WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
+	WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);
 
-		/* set the VMID assigned */
-		WREG32(CP_HPD_EOP_VMID, 0);
+	/* set the VMID assigned */
+	WREG32(CP_HPD_EOP_VMID, 0);
+
+	/* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
+	tmp = RREG32(CP_HPD_EOP_CONTROL);
+	tmp &= ~EOP_SIZE_MASK;
+	tmp |= order_base_2(MEC_HPD_SIZE / 8);
+	WREG32(CP_HPD_EOP_CONTROL, tmp);
 
-		/* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
-		tmp = RREG32(CP_HPD_EOP_CONTROL);
-		tmp &= ~EOP_SIZE_MASK;
-		tmp |= order_base_2(MEC_HPD_SIZE / 8);
-		WREG32(CP_HPD_EOP_CONTROL, tmp);
-	}
-	cik_srbm_select(rdev, 0, 0, 0, 0);
 	mutex_unlock(&rdev->srbm_mutex);
 
 	/* init the queues.  Just two for now. */
@@ -5874,8 +5869,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib)
  */
 int cik_vm_init(struct radeon_device *rdev)
 {
-	/* number of VMs */
-	rdev->vm_manager.nvm = 16;
+	/*
+	 * number of VMs
+	 * VMID 0 is reserved for System
+	 * radeon graphics/compute will use VMIDs 1-7
+	 * amdkfd will use VMIDs 8-15
+	 */
+	rdev->vm_manager.nvm = RADEON_NUM_OF_VMIDS;
 	/* base offset of vram pages */
 	if (rdev->flags & RADEON_IS_IGP) {
 		u64 tmp = RREG32(MC_VM_FB_OFFSET);
diff --git a/drivers/gpu/drm/radeon/cikd.h b/drivers/gpu/drm/radeon/cikd.h
index 0c6e1b5..fae4d0c 100644
--- a/drivers/gpu/drm/radeon/cikd.h
+++ b/drivers/gpu/drm/radeon/cikd.h
@@ -30,6 +30,8 @@
 #define CIK_RB_BITMAP_WIDTH_PER_SH 2
 #define HAWAII_RB_BITMAP_WIDTH_PER_SH 4
 
+#define RADEON_NUM_OF_VMIDS	8
+
 /* DIDT IND registers */
 #define DIDT_SQ_CTRL0 0x0
 # define DIDT_CTRL_EN (1 << 0)
-- 
1.9.1
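The effect of this patch on radeon's share of the hardware can be checked with a little arithmetic. The sketch below assumes the numbers given in the patch comments (8 queues per pipe, 16 VMIDs in total, VMID 0 for the system, 1-7 for radeon, 8-15 for amdkfd). The helper names and the `TOTAL_VMIDS` and `QUEUES_PER_PIPE` macros are hypothetical; only `RADEON_NUM_OF_VMIDS` comes from the patch.

```c
#include <stdint.h>

#define QUEUES_PER_PIPE		8	/* from the comment in cik_mec_init() */
#define RADEON_NUM_OF_VMIDS	8	/* define added by this patch */
#define TOTAL_VMIDS		16	/* assumed from the old nvm = 16 */

enum vmid_owner {
	VMID_OWNER_SYSTEM,
	VMID_OWNER_RADEON,
	VMID_OWNER_AMDKFD,
	VMID_OWNER_INVALID
};

/* Same formula as rdev->mec.num_queue in the patch. */
static unsigned int mec_num_queues(unsigned int num_mec, unsigned int num_pipe)
{
	return num_mec * num_pipe * QUEUES_PER_PIPE;
}

/* Hypothetical helper: who may use a given VMID after this patch. */
static enum vmid_owner vmid_owner_of(unsigned int vmid)
{
	if (vmid == 0)
		return VMID_OWNER_SYSTEM;
	if (vmid < RADEON_NUM_OF_VMIDS)
		return VMID_OWNER_RADEON;
	if (vmid < TOTAL_VMIDS)
		return VMID_OWNER_AMDKFD;
	return VMID_OWNER_INVALID;
}
```

Before the patch radeon saw `mec_num_queues(2, 4)`, that is 64 queues, on KV; afterwards it sees `mec_num_queues(1, 1)`, that is 8, and the remaining pipes are left to amdkfd.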
[PATCH v3 00/23] AMDKFD Kernel Driver
Hi,

Here is the v3 patch set of amdkfd.

This version contains changes and fixes to code, as agreed on during the review of the v2 patch set.

The major changes are:

- There are two new module parameters: # of processes and # of queues per
  process. The defaults, as agreed on in the v2 review, are 32 and 128
  respectively. This sets the default amount of GART address space that amdkfd
  requires to 3.5MB (3MB for userspace queues mqds and 0.5MB for other stuff,
  such as mqd for kernel queue, hpd for pipelines, etc.)

- All the GART address space usage of amdkfd is done inside a single
  contiguous buffer that is allocated from system memory, and pinned to the
  start of the GART during the startup of amdkfd (which is just after the
  startup of radeon). The management of this buffer is done by the radeon sa
  manager. This buffer is not evict-able.

- Mapping of doorbells is initiated by the userspace lib (by mmap syscall),
  instead of initiating it from inside an ioctl (using vm_mmap).

- Removed ioctls for exclusive access to performance counters

- Added documentation about the QCM (Queue Control Management), apertures and
  interfaces between amdkfd and radeon.

Two important notes:

- The topology patch has not been changed. Look at
  http://lists.freedesktop.org/archives/dri-devel/2014-July/065042.html
  for my response. I also put my answer as an explanation in the commit msg
  of the patch.

- There are still some minor code style issues I need to fix. I didn't want to
  delay v3 any further but I will publish either v4 with those fixes, or just
  relevant patches if the whole patch set will be merged.

For people who like to review using git, the v3 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v3

In addition, I would like to announce that we have uploaded the userspace lib that accompanies amdkfd. That lib is called "libhsakmt" and you can view it at:
http://cgit.freedesktop.org/~gabbayo/libhsakmt

Alexey Skidanov (1):
  amdkfd: Implement the Get Process Aperture IOCTL

Andrew Lewycky (3):
  amdkfd: Add basic modules to amdkfd
  amdkfd: Add interrupt handling module
  amdkfd: Implement the Set Memory Policy IOCTL

Ben Goz (8):
  amdkfd: Add queue module
  amdkfd: Add mqd_manager module
  amdkfd: Add kernel queue module
  amdkfd: Add module parameter of scheduling policy
  amdkfd: Add packet manager module
  amdkfd: Add process queue manager module
  amdkfd: Add device queue manager module
  amdkfd: Implement the create/destroy/update queue IOCTLs

Evgeny Pinchuk (2):
  amdkfd: Add topology module to amdkfd
  amdkfd: Implement the Get Clock Counters IOCTL

Oded Gabbay (9):
  drm/radeon: reduce number of free VMIDs and pipes in KV
  drm/radeon/cik: Don't touch int of pipes 1-7
  drm/radeon: Report doorbell configuration to amdkfd
  drm/radeon: adding synchronization for GRBM GFX
  drm/radeon: Add radeon <--> amdkfd interface
  Update MAINTAINERS and CREDITS files with amdkfd info
  amdkfd: Add IOCTL set definitions of amdkfd
  amdkfd: Add amdkfd skeleton driver
  amdkfd: Add binding/unbinding calls to amd_iommu driver

 CREDITS | 7 +
 MAINTAINERS | 10 +
 drivers/gpu/drm/radeon/Kconfig | 2 +
 drivers/gpu/drm/radeon/Makefile | 3 +
 drivers/gpu/drm/radeon/amdkfd/Kconfig | 10 +
 drivers/gpu/drm/radeon/amdkfd/Makefile | 14 +
 drivers/gpu/drm/radeon/amdkfd/cik_regs.h | 220
 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c | 350 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 511 +
 drivers/gpu/drm/radeon/amdkfd/kfd_crat.h | 294 +
 drivers/gpu/drm/radeon/amdkfd/kfd_device.c | 300 +
 .../drm/radeon/amdkfd/kfd_device_queue_manager.c | 989
 .../drm/radeon/amdkfd/kfd_device_queue_manager.h | 144 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c | 236
 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c | 161 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c | 330 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h | 66 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_module.c | 147 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c | 305 +
 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h | 88 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c | 495
 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c | 95 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h | 682 +++
 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h | 107 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 560 +
 drivers/gpu/drm/radeon/amdkfd/kfd_process.c | 347 ++
 .../drm/radeon/amdkfd/kfd_process_queue_manager.c | 346 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c | 85 ++
 drivers/gpu/drm/radeon/amdkfd/kfd_topology.c |
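The GART budget quoted in the cover letter (32 processes, 128 queues per process, 3MB for userspace queue mqds plus 0.5MB for the rest) can be reproduced with a short calculation. This is a sketch under two assumptions that the cover letter does not state: that "MB" means MiB, and that each userspace queue gets an equal slice of the 3MB, which works out to 768 bytes per queue. Both helper names are hypothetical.

```c
#include <stddef.h>

#define KIB ((size_t)1024)
#define MIB (1024 * KIB)

/* Maximum number of userspace queues for the given module parameters. */
static size_t total_user_queues(size_t processes, size_t queues_per_process)
{
	return processes * queues_per_process;
}

/*
 * GART bytes needed: one MQD slot per possible userspace queue plus a
 * fixed amount for kernel queue MQDs, HPDs and similar bookkeeping.
 * bytes_per_mqd is an inferred figure, not a number from the post.
 */
static size_t gart_budget_bytes(size_t processes, size_t queues_per_process,
				size_t bytes_per_mqd, size_t fixed_overhead)
{
	return total_user_queues(processes, queues_per_process) * bytes_per_mqd +
	       fixed_overhead;
}
```

With the defaults, `total_user_queues(32, 128)` is 4096, and `gart_budget_bytes(32, 128, 768, 512 * KIB)` gives 3 MiB + 0.5 MiB, matching the 3.5MB in the cover letter.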
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

--- Comment #4 from Kai ---
Created attachment 104094
  --> https://bugs.freedesktop.org/attachment.cgi?id=104094=edit
dmesg with radeon.dpm=1 set

Here you go. The last power state entry in dmesg is:
> switching from power state:
>     ui class: performance
>     internal class: none
>     caps:
>     uvd    vclk: 0 dclk: 0
>         power level 0    sclk: 3 mclk: 15000 pcie gen: 3 pcie lanes: 16
>         power level 1    sclk: 98000 mclk: 125000 pcie gen: 3 pcie lanes: 16
>     status: c r
> switching to power state:
>     ui class: performance
>     internal class: none
>     caps:
>     uvd    vclk: 0 dclk: 0
>         power level 0    sclk: 3 mclk: 15000 pcie gen: 3 pcie lanes: 16
>         power level 1    sclk: 98000 mclk: 125000 pcie gen: 3 pcie lanes: 16
>     status: c r
[pull] radeon drm-next-3.17
Hi Dave,

This is the radeon pull request for 3.17. Highlights:
- Additional Hawaii fixes
- Support for using the display scaler on non-fixed mode displays
- Support for new firmware format that makes it easier to update
- Enable dpm by default on additional asics
- GPUVM improvements
- Support for uncached and write combined gtt buffers
- Allow allocation of BOs larger than visible vram
- Various other small fixes and improvements

Drop the userptr stuff for now pending further discussion.

The following changes since commit a91576d7916f6cce76d30303e60e1ac47cf4a76d:

  drm/ttm: Pass GFP flags in order to avoid deadlock. (2014-08-05 10:54:19 +1000)

are available in the git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-next-3.17

for you to fetch changes up to 9f51e2e04f74608adec9957df97684a37a4cd375:

  drm/radeon: Prevent hdmi deep color if max_tmds_clock is undefined. (2014-08-05 11:22:54 -0400)

Alex Deucher (25):
  drm/radeon/dpm: add support for SVI2 voltage for SI
  drm/radeon: disable gfx cgcg on cik
  drm/radeon: add new firmware header definitions (v3)
  drm/radeon/si: Add support for new ucode format (v3)
  drm/radeon/cik: Add support for new ucode format (v5)
  drm/radeon: enable display scaling on all connectors (v2)
  drm/radeon: consolidate vga and dvi get_modes functions (v2)
  drm/radeon: restructure edid fetching
  drm/radeon: use a fetch function to get the edid
  drm/radeon: track pinned memory (v2)
  drm/radeon: use vram/gart pinned size in radeon_gem_info_ioctl
  drm/radeon: use vram/gart pinned size in radeon_do_test_moves
  drm/radeon: remove visible vram size limit on bo allocation (v4)
  drm/radeon: add a PX quirk list
  drm/radeon: make radeon_connector_encoder_is_hbr2 static
  drm/radeon: load the lm63 driver for an lm64 thermal chip.
  drm/radeon: fix reversed logic in evergreen_mc_resume
  drm/radeon/atom: add new voltage fetch function for hawaii
  drm/radeon/dpm: handle voltage info fetching on hawaii
  drm/radeon: re-enable dpm by default on cayman
  drm/radeon: re-enable dpm by default on BTC
  drm/radeon: use an intervall tree to manage the VMA v2
  drm/radeon: use packet2 for nop on hawaii with old firmware
  drm/radeon: tweak ACCEL_WORKING2 query for hawaii
  drm/radeon: use packet3 for nop on hawaii with new firmware

Andreas Boll (1):
  drm/radeon: tweak ACCEL_WORKING2 query for the new firmware for hawaii

Christian König (15):
  drm/radeon: remove discardable flag from radeon_gem_object_create
  drm/radeon: fix R600_PTE_GART handling
  drm/radeon: add trace_radeon_vm_flush
  drm/radeon: set VM base addr using the PFP v2
  drm/radeon: separate ring and IB handling
  drm/radeon: invalidate moved BOs in the VM (v2)
  drm/radeon: remove radeon_bo_clear_va
  drm/radeon: try to enable VM flushing once more
  drm/radeon: adjust default radeon_vm_block_size v2
  drm/radeon: remove taking mclk_lock from radeon_bo_unref
  drm/radeon: add radeon_bo_ref function
  drm/radeon: take a BO reference on VM cleanup
  drm/radeon: add VM GART copy optimization to NI as well
  drm/radeon: split PT setup in more functions
  drm/radeon: update IB size estimation for VM

Fabian Frederick (1):
  drm/radeon: remove null test before kfree

Lauri Kasanen (1):
  drm/radeon: Inline r100_mm_rreg, -wreg, v3

Mario Kleiner (2):
  drm/radeon: Use pflip irqs for pageflip completion if possible. (v2)
  drm/radeon: Prevent hdmi deep color if max_tmds_clock is undefined.

Michel Dänzer (10):
  drm/radeon: Demote 'BO allocation size too large' message to debug only
  drm/radeon: Remove radeon_gart_restore()
  drm/radeon: Pass GART page flags to radeon_gart_set_page() explicitly
  drm/radeon: Allow write-combined CPU mappings of BOs in GTT (v2)
  drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe
  drm/radeon: Use write-combined CPU mappings of IBs on >= CIK
  drm/radeon/cik: Read back SDMA WPTR register after writing it
  drm/radeon: s/ioctl_wait_idle/mmio_hpd_flush/
  drm/radeon: Always flush the HDP cache before submitting a CS to the GPU
  drm/radeon: Only flush HDP cache from idle ioctl if BO is in VRAM

Stefan Brüns (2):
  drm/radeon: Use correct value for unknown audio/video latency
  drm/radeon/audio: break out of loops once we match connector

 drivers/gpu/drm/Kconfig | 1 +
 drivers/gpu/drm/radeon/Makefile | 2 +-
 drivers/gpu/drm/radeon/atombios_encoders.c | 16 +-
 drivers/gpu/drm/radeon/ci_dpm.c | 13 +-
 drivers/gpu/drm/radeon/ci_smc.c | 39 +-
 drivers/gpu/drm/radeon/cik.c | 722 ++---
 drivers/gpu/drm/radeon/cik_sdma.c | 247 ++
[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors
https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #8 from sarnex ---
(In reply to comment #7)
> (In reply to comment #6)
> > I tried running a 3D game using the terminal to monitor stdout, and this was
> > constantly spammed as well. Not sure if it gives any additional information.
> >
> > radeon: Failed to allocate a buffer:
> > radeon:    size      : 0 bytes
> > radeon:    alignment : 4096 bytes
> > radeon:    domains   : 4
> > radeon:    flags     : 4
>
> Yeah, pretty much the same message from userspace.
>
> Looks like a bug in the userspace driver somewhere. Simplest thing to find
> it would be to attach a debugger and get a backtrace when the message is
> printed.

Hi, thanks for the response. I don't really know what I'm doing with GDB so I'll explain what I did to get this output.

I installed all of the dbg mesa packages from the PPA. Then I ran the command "LIBGL_DEBUG=verbose gdb glxgears", and when the error was printed (immediately), I pressed Ctrl+C and then typed "bt full". If there's another way to do this please let me know.

http://pastebin.com/v1nA7JWY
[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory v2
On Tue, Aug 05, 2014 at 07:45:21PM +0200, Christian K?nig wrote: > Am 05.08.2014 um 19:39 schrieb Jerome Glisse: > >On Tue, Aug 05, 2014 at 06:05:29PM +0200, Christian K?nig wrote: > >>From: Christian K?nig > >> > >>Avoid problems with writeback by limiting userptr to anonymous memory. > >> > >>v2: add commit and code comments > >I guess, i have not expressed myself clearly. This is bogus, you pretend > >you want to avoid writeback issue but you still allow userspace to map > >file backed pages (which by the way might be a regular bo object from > >another device for instance and that would be fun). > > > >So this patch is a no go and i would rather see that this userptr to > >be restricted to anon vma only no matter what. No flags here. > > Mapping of non anonymous memory (e.g. everything get_user_pages won't fail > with) is restricted to read only access by the GPU. > > I'm fine with making it a hard requirement for all mappings if you say it's > a must have. > Well for time being you should force read only. The way you implement write is broken. Here is how it can abuse to allow write to a file backed mmap. mmap(fixaddress,fixedsize,NOFD) userptr_ioctl(fixedaddress, RADEON_GEM_USERPTR_ANONONLY) // bo is created successfully because fixedaddress is part of anonvma munmap(fixedaddress,fixedsize) // radeon get mmu_notifier_range_start callback and unbind page from the // bo but radeon does not know there was an unmap. mmap(fixaddress,fixedsize,fd_to_this_read_only_file_i_want_to_write_to) radeon_ioctl_use_my_userptrbo // bo is bind again by radeon and because all flag are set at creation // it is map with write permission allowing someone to write to a file // that might be read only for the user. // // Script kiddies it's time to learn about gpu ... 
Of course if you use this patch (kind of selling my own junk here):

http://www.spinics.net/lists/linux-mm/msg75878.html

then you could know inside the range_start that you should remove the write permission and that it should be rechecked on next bind.

Note that i have not read much of your code so maybe you handle this case somehow.

Cheers,
Jérôme

> Christian.
>
> >
> >Cheers,
> >Jérôme
> >
> >>Signed-off-by: Christian König
> >>---
> >> drivers/gpu/drm/radeon/radeon_gem.c | 3 ++-
> >> drivers/gpu/drm/radeon/radeon_ttm.c | 10 ++
> >> include/uapi/drm/radeon_drm.h | 1 +
> >> 3 files changed, 13 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
> >>index 993ab22..032736b 100644
> >>--- a/drivers/gpu/drm/radeon/radeon_gem.c
> >>+++ b/drivers/gpu/drm/radeon/radeon_gem.c
> >>@@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
> >> 		return -EACCES;
> >> 	/* reject unknown flag values */
> >>-	if (args->flags & ~RADEON_GEM_USERPTR_READONLY)
> >>+	if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
> >>+	    RADEON_GEM_USERPTR_ANONONLY))
> >> 		return -EINVAL;
> >> 	/* readonly pages not tested on older hardware */
> >>diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
> >>index 0109090..54eb7bc 100644
> >>--- a/drivers/gpu/drm/radeon/radeon_ttm.c
> >>+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> >>@@ -542,6 +542,16 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
> >> 		       ttm->num_pages * PAGE_SIZE))
> >> 		return -EFAULT;
> >>+	if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
> >>+		/* check that we only pin down anonymous memory
> >>+		   to prevent problems with writeback */
> >>+		unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
> >>+		struct vm_area_struct *vma;
> >>+		vma = find_vma(gtt->usermm, gtt->userptr);
> >>+		if (!vma || vma->vm_file || vma->vm_end < end)
> >>+			return -EPERM;
> >>+	}
> >>+
> >> 	do {
> >> 		unsigned num_pages = ttm->num_pages - pinned;
> >> 		uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE;
> >>diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
> >>index 3a9f209..9720e1a 100644
> >>--- a/include/uapi/drm/radeon_drm.h
> >>+++ b/include/uapi/drm/radeon_drm.h
> >>@@ -816,6 +816,7 @@ struct drm_radeon_gem_create {
> >>  * perform any operation.
> >>  */
> >> #define RADEON_GEM_USERPTR_READONLY	(1 << 0)
> >>+#define RADEON_GEM_USERPTR_ANONONLY	(1 << 1)
> >> struct drm_radeon_gem_userptr {
> >> 	uint64_t		addr;
> >>--
> >>1.9.1
> >>
> >>___
> >>dri-devel mailing list
> >>dri-devel at lists.freedesktop.org
> >>http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
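The remap sequence described in this reply can be sketched as a small userspace toy model. Nothing below is radeon code: the `toy_*` types and functions are hypothetical and only contrast a bind that trusts the decision made at creation time with a bind that re-validates the backing. Whether the actual patch series already rechecks on bind is left open in the thread, so this illustrates the argument and does not settle it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: none of these types or functions exist in radeon. */
enum backing { BACKING_NONE, BACKING_ANON, BACKING_FILE };

struct toy_mm {
	enum backing backing;	/* what is currently mapped at the fixed address */
};

struct toy_bo {
	bool writable;		/* decided once, at creation */
	bool bound;		/* currently mapped for the GPU */
};

/* Creation: write access is granted only if the range is anonymous right now. */
static int toy_bo_create(const struct toy_mm *mm, struct toy_bo *bo)
{
	if (mm->backing != BACKING_ANON)
		return -EPERM;
	bo->writable = true;
	bo->bound = false;
	return 0;
}

/* What a range_start notifier callback would do: drop the GPU mapping. */
static void toy_bo_invalidate(struct toy_bo *bo)
{
	bo->bound = false;
}

/* Bind that trusts the creation-time decision and never looks at the backing. */
static int toy_bo_bind_trusting(const struct toy_mm *mm, struct toy_bo *bo)
{
	assert(mm->backing != BACKING_NONE);	/* something must be mapped */
	bo->bound = true;
	return 0;
}

/* Bind that re-validates the backing before handing out write access. */
static int toy_bo_bind_rechecking(const struct toy_mm *mm, struct toy_bo *bo)
{
	assert(mm->backing != BACKING_NONE);	/* something must be mapped */
	if (bo->writable && mm->backing != BACKING_ANON)
		return -EPERM;
	bo->bound = true;
	return 0;
}
```

In this model, creating the BO over an anonymous mapping, invalidating it, switching the backing to `BACKING_FILE` and then calling `toy_bo_bind_trusting()` leaves a writable BO bound over file-backed memory, while `toy_bo_bind_rechecking()` refuses with `-EPERM`.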
[PATCH 5/5] drm/radeon: allow userptr write access under certain conditions
From: Christian König

It needs to be anonymous memory (no file mappings) and we are required to install an MMU notifier.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 2a6fbf1..01b5894 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -285,19 +285,24 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 	if (offset_in_page(args->addr | args->size))
 		return -EINVAL;
 
-	/* we only support read only mappings for now */
-	if (!(args->flags & RADEON_GEM_USERPTR_READONLY))
-		return -EACCES;
-
 	/* reject unknown flag values */
 	if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
 	    RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE |
 	    RADEON_GEM_USERPTR_REGISTER))
 		return -EINVAL;
 
-	/* readonly pages not tested on older hardware */
-	if (rdev->family < CHIP_R600)
-		return -EINVAL;
+	if (args->flags & RADEON_GEM_USERPTR_READONLY) {
+		/* readonly pages not tested on older hardware */
+		if (rdev->family < CHIP_R600)
+			return -EINVAL;
+
+	} else if (!(args->flags & RADEON_GEM_USERPTR_ANONONLY) ||
+		   !(args->flags & RADEON_GEM_USERPTR_REGISTER)) {
+
+		/* if we want to write to it we must require anonymous
+		   memory and install a MMU notifier */
+		return -EACCES;
+	}
 
 	down_read(&rdev->exclusive_lock);
 
--
1.9.1
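A minimal sketch of the flag check this patch ends up with, pulled out into a standalone userspace function so it can be compiled outside the kernel. The `USERPTR_*` names are shortened stand-ins for the `RADEON_GEM_USERPTR_*` macros, `pre_r600` stands in for the `rdev->family < CHIP_R600` test, and the value chosen for the REGISTER bit is an assumption, since the header hunk defining it is not part of this excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define USERPTR_READONLY (1u << 0)
#define USERPTR_ANONONLY (1u << 1)
#define USERPTR_VALIDATE (1u << 2)
#define USERPTR_REGISTER (1u << 3)	/* assumed value, not shown in the excerpt */

/*
 * Mirrors the order of the checks in the patch:
 *  1. unknown flag bits are rejected with -EINVAL,
 *  2. read-only mappings are rejected on pre-R600 hardware with -EINVAL,
 *  3. writable mappings need both ANONONLY and REGISTER, else -EACCES.
 */
static int check_userptr_flags(uint32_t flags, bool pre_r600)
{
	const uint32_t known = USERPTR_READONLY | USERPTR_ANONONLY |
			       USERPTR_VALIDATE | USERPTR_REGISTER;

	if (flags & ~known)
		return -EINVAL;

	if (flags & USERPTR_READONLY) {
		if (pre_r600)
			return -EINVAL;
	} else if (!(flags & USERPTR_ANONONLY) ||
		   !(flags & USERPTR_REGISTER)) {
		return -EACCES;
	}

	return 0;
}
```

With this layout, `check_userptr_flags(USERPTR_ANONONLY, false)` yields `-EACCES` because the notifier flag is missing, while `USERPTR_ANONONLY | USERPTR_REGISTER` is accepted as a writable mapping.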
[PATCH 4/5] drm/radeon: add userptr flag to register MMU notifier v3
From: Christian König

Whenever a userspace mapping related to our userptr changes, we wait for it to become idle and unmap it from GTT.

v2: rebased, fix mutex unlock in error path
v3: improve commit message

Signed-off-by: Christian König
---
 drivers/gpu/drm/Kconfig                |   1 +
 drivers/gpu/drm/radeon/Makefile        |   2 +-
 drivers/gpu/drm/radeon/radeon.h        |  12 ++
 drivers/gpu/drm/radeon/radeon_device.c |   2 +
 drivers/gpu/drm/radeon/radeon_gem.c    |   9 +-
 drivers/gpu/drm/radeon/radeon_mn.c     | 272 +
 drivers/gpu/drm/radeon/radeon_object.c |   1 +
 include/uapi/drm/radeon_drm.h          |   1 +
 8 files changed, 298 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/radeon_mn.c

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 9b2eedc..2745284 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -115,6 +115,7 @@ config DRM_RADEON
 	select HWMON
 	select BACKLIGHT_CLASS_DEVICE
 	select INTERVAL_TREE
+	select MMU_NOTIFIER
 	help
 	  Choose this option if you have an ATI Radeon graphics card. There
 	  are both PCI and AGP versions. You don't need to choose this to
diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 0013ad0..c7fa1ae 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -80,7 +80,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
 	r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o rv740_dpm.o \
 	rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o trinity_dpm.o \
 	trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \
-	ci_dpm.o dce6_afmt.o radeon_vm.o radeon_ucode.o radeon_ib.o
+	ci_dpm.o dce6_afmt.o radeon_vm.o radeon_ucode.o radeon_ib.o radeon_mn.o
 
 # add async DMA block
 radeon-y += \
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 3c6999e..511191f 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -65,6 +65,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -487,6 +488,9 @@ struct radeon_bo {
 	struct ttm_bo_kmap_obj		dma_buf_vmap;
 	pid_t				pid;
+
+	struct radeon_mn		*mn;
+	struct interval_tree_node	mn_it;
 };
 #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, gem_base)
 
@@ -1725,6 +1729,11 @@ void radeon_test_ring_sync(struct radeon_device *rdev,
 			   struct radeon_ring *cpB);
 void radeon_test_syncing(struct radeon_device *rdev);
 
+/*
+ * MMU Notifier
+ */
+int radeon_mn_register(struct radeon_bo *bo, unsigned long addr);
+void radeon_mn_unregister(struct radeon_bo *bo);
 
 /*
  * Debugfs
@@ -2372,6 +2381,9 @@ struct radeon_device {
 	/* tracking pinned memory */
 	u64 vram_pin_size;
 	u64 gart_pin_size;
+
+	struct mutex		mn_lock;
+	DECLARE_HASHTABLE(mn_hash, 7);
 };
 
 bool radeon_is_px(struct drm_device *dev);
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index c8ea050..c58f84f 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1270,6 +1270,8 @@ int radeon_device_init(struct radeon_device *rdev,
 	init_rwsem(&rdev->pm.mclk_lock);
 	init_rwsem(&rdev->exclusive_lock);
 	init_waitqueue_head(&rdev->irq.vblank_queue);
+	mutex_init(&rdev->mn_lock);
+	hash_init(rdev->mn_hash);
 	r = radeon_gem_init(rdev);
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 4506560..2a6fbf1 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -291,7 +291,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 
 	/* reject unknown flag values */
 	if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
-	    RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE))
+	    RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE |
+	    RADEON_GEM_USERPTR_REGISTER))
 		return -EINVAL;
 
 	/* readonly pages not tested on older hardware */
@@ -312,6 +313,12 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 	if (r)
 		goto release_object;
 
+	if (args->flags & RADEON_GEM_USERPTR_REGISTER) {
+		r = radeon_mn_register(bo, args->addr);
+		if (r)
+			goto release_object;
+	}
+
 	if (args->flags & RADEON_GEM_USERPTR_VALIDATE) {
 		down_read(&current->mm->mmap_sem);
 		r = radeon_bo_reserve(bo, true);
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
new file mode 100644
index 000..0157bc2
--- /dev/null
+++
[PATCH 3/5] drm/radeon: add userptr flag to directly validate the BO to GTT
From: Christian König

This way we test userptr availability at BO creation time instead of first use.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c | 18 +-
 include/uapi/drm/radeon_drm.h       |  1 +
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 032736b..4506560 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -291,7 +291,7 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 
 	/* reject unknown flag values */
 	if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
-	    RADEON_GEM_USERPTR_ANONONLY))
+	    RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE))
 		return -EINVAL;
 
 	/* readonly pages not tested on older hardware */
@@ -312,6 +312,22 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 	if (r)
 		goto release_object;
 
+	if (args->flags & RADEON_GEM_USERPTR_VALIDATE) {
+		down_read(&current->mm->mmap_sem);
+		r = radeon_bo_reserve(bo, true);
+		if (r) {
+			up_read(&current->mm->mmap_sem);
+			goto release_object;
+		}
+
+		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT);
+		r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
+		radeon_bo_unreserve(bo);
+		up_read(&current->mm->mmap_sem);
+		if (r)
+			goto release_object;
+	}
+
 	r = drm_gem_handle_create(filp, gobj, &handle);
 	/* drop reference from allocate - handle holds it now */
 	drm_gem_object_unreference_unlocked(gobj);
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index 9720e1a..5dc61c2 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -817,6 +817,7 @@ struct drm_radeon_gem_create {
  */
 #define RADEON_GEM_USERPTR_READONLY	(1 << 0)
 #define RADEON_GEM_USERPTR_ANONONLY	(1 << 1)
+#define RADEON_GEM_USERPTR_VALIDATE	(1 << 2)
 
 struct drm_radeon_gem_userptr {
 	uint64_t		addr;
--
1.9.1
[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory v2
From: Christian König

Avoid problems with writeback by limiting userptr to anonymous memory.

v2: add commit and code comments

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c |  3 ++-
 drivers/gpu/drm/radeon/radeon_ttm.c | 10 ++
 include/uapi/drm/radeon_drm.h       |  1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 993ab22..032736b 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 		return -EACCES;
 
 	/* reject unknown flag values */
-	if (args->flags & ~RADEON_GEM_USERPTR_READONLY)
+	if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
+	    RADEON_GEM_USERPTR_ANONONLY))
 		return -EINVAL;
 
 	/* readonly pages not tested on older hardware */
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 0109090..54eb7bc 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -542,6 +542,16 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 		       ttm->num_pages * PAGE_SIZE))
 		return -EFAULT;
 
+	if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
+		/* check that we only pin down anonymous memory
+		   to prevent problems with writeback */
+		unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
+		struct vm_area_struct *vma;
+		vma = find_vma(gtt->usermm, gtt->userptr);
+		if (!vma || vma->vm_file || vma->vm_end < end)
+			return -EPERM;
+	}
+
 	do {
 		unsigned num_pages = ttm->num_pages - pinned;
 		uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE;
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index 3a9f209..9720e1a 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -816,6 +816,7 @@ struct drm_radeon_gem_create {
  * perform any operation.
  */
 #define RADEON_GEM_USERPTR_READONLY	(1 << 0)
+#define RADEON_GEM_USERPTR_ANONONLY	(1 << 1)
 
 struct drm_radeon_gem_userptr {
 	uint64_t		addr;
--
1.9.1
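The three conditions of the ANONONLY check above can be tried in isolation with a small userspace sketch. `struct mock_vma` and `anon_only_check()` are hypothetical stand-ins for `struct vm_area_struct` and the inline check in `radeon_ttm_tt_pin_userptr()`; the VMA lookup itself (`find_vma()`) is not modelled, the caller simply passes in whatever the lookup would have returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two vm_area_struct fields the check reads. */
struct mock_vma {
	unsigned long vm_end;	/* first address past the end of the mapping */
	bool file_backed;	/* models vma->vm_file != NULL */
};

/*
 * Same three conditions as the patch: no VMA found, a file-backed VMA,
 * or a VMA that ends before the userptr range does all yield -EPERM.
 */
static int anon_only_check(const struct mock_vma *vma,
			   unsigned long userptr, unsigned long size)
{
	unsigned long end = userptr + size;

	assert(size > 0);	/* the ioctl only creates non-empty ranges */

	if (!vma || vma->file_backed || vma->vm_end < end)
		return -EPERM;

	return 0;
}
```

For a mapping ending at `0x5000`, a range starting at `0x1000` with size `0x2000` passes, while the same size starting at `0x4000` runs past the end of the mapping and is refused.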
[PATCH 1/5] drm/radeon: add userptr support v7
From: Christian König

This patch adds an IOCTL for turning a pointer supplied by userspace into a buffer object.

It imposes several restrictions upon the memory being mapped:

1. It must be page aligned (both start/end addresses, i.e. ptr and size).

2. It must be normal system memory, not a pointer into another map of IO
   space (e.g. it must not be a GTT mmapping of another object).

3. The BO is mapped into GTT, so the maximum amount of memory mapped at
   all times is still the GTT limit.

4. The BO is only mapped readonly for now, so no write support.

5. List of backing pages is only acquired once, so they represent a
   snapshot of the first use.

Exporting and sharing as well as mapping of buffer objects created by this function is forbidden and results in an -EPERM.

v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
    pin/unpin pages on bind/unbind instead of populate/unpopulate
v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
    flags, better handle READONLY flag, improve permission check
v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin
v7: add warning about its availability in the API definition

Signed-off-by: Christian König
Reviewed-by: Alex Deucher (v4)
Reviewed-by: Jérôme Glisse (v4)
---
 drivers/gpu/drm/radeon/radeon.h        |   5 ++
 drivers/gpu/drm/radeon/radeon_cs.c     |  25 +-
 drivers/gpu/drm/radeon/radeon_drv.c    |   5 +-
 drivers/gpu/drm/radeon/radeon_gem.c    |  68
 drivers/gpu/drm/radeon/radeon_kms.c    |   1 +
 drivers/gpu/drm/radeon/radeon_object.c |   3 +
 drivers/gpu/drm/radeon/radeon_prime.c  |  10 +++
 drivers/gpu/drm/radeon/radeon_ttm.c    | 139 +
 drivers/gpu/drm/radeon/radeon_vm.c     |   3 +
 include/uapi/drm/radeon_drm.h          |  16
 10 files changed, 272 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 9e1732e..3c6999e 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2138,6 +2138,8 @@ int radeon_gem_info_ioctl(struct drm_device *dev, void *data,
 			  struct drm_file *filp);
 int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *filp);
+int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *filp);
 int radeon_gem_pin_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file_priv);
 int radeon_gem_unpin_ioctl(struct drm_device *dev, void *data,
@@ -2871,6 +2873,9 @@ extern void radeon_legacy_set_clock_gating(struct radeon_device *rdev, int enabl
 extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int enable);
 extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain);
 extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
+extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t addr,
+				     uint32_t flags);
+extern bool radeon_ttm_tt_has_userptr(struct ttm_tt *ttm);
 extern void radeon_vram_location(struct radeon_device *rdev, struct radeon_mc *mc, u64 base);
 extern void radeon_gtt_location(struct radeon_device *rdev, struct radeon_mc *mc);
 extern int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index ee712c1..1321491 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -78,7 +78,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 	struct radeon_cs_chunk *chunk;
 	struct radeon_cs_buckets buckets;
 	unsigned i, j;
-	bool duplicate;
+	bool duplicate, need_mmap_lock = false;
+	int r;
 
 	if (p->chunk_relocs_idx == -1) {
 		return 0;
@@ -164,6 +165,19 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 			p->relocs[i].allowed_domains = domain;
 		}
 
+		if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
+			uint32_t domain = p->relocs[i].prefered_domains;
+			if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
+				DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
+					  "allowed for userptr BOs\n");
+				return -EINVAL;
+			}
+			need_mmap_lock = true;
+			domain = RADEON_GEM_DOMAIN_GTT;
+			p->relocs[i].prefered_domains = domain;
+			p->relocs[i].allowed_domains = domain;
+		}
+
 		p->relocs[i].tv.bo = &p->relocs[i].robj->tbo;
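Restriction 1 in the commit message (both the pointer and the size must be page aligned) is enforced in this series with a single test, `offset_in_page(args->addr | args->size)`. A minimal userspace sketch of why OR-ing the two values is enough: any low bit set in either operand survives the OR, so one mask covers both. The 4096-byte page size and the `SKETCH_PAGE_SIZE` and `userptr_range_aligned` names are assumptions made for illustration; the kernel uses the architecture's `PAGE_SIZE`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for the sketch; must be a power of two. */
#define SKETCH_PAGE_SIZE 4096ull

/*
 * True when both addr and size are multiples of the page size.
 * (addr | size) has a low bit set if and only if addr or size has it set,
 * so masking the OR checks both values at once.
 */
static bool userptr_range_aligned(uint64_t addr, uint64_t size)
{
	return ((addr | size) & (SKETCH_PAGE_SIZE - 1)) == 0;
}
```

For example, `userptr_range_aligned(0x10000, 0x2000)` holds, while an address of `0x10010` or a size of `0x2100` fails the test.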
[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors
https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #7 from Christian König ---
(In reply to comment #6)
> I tried running a 3D game using the terminal to monitor stdout, and this was
> constantly spammed as well. Not sure if it gives any additional information.
>
> radeon: Failed to allocate a buffer:
> radeon:size : 0 bytes
> radeon:alignment : 4096 bytes
> radeon:domains : 4
> radeon:flags : 4

Yeah, pretty much the same message from userspace.

Looks like a bug in the userspace driver somewhere. Simplest thing to find it would be to attach a debugger and get a backtrace when the message is printed.
[PATCH] drm/radeon: Only flush HDP cache from idle ioctl if BO is in VRAM
On 05.08.2014 07:01, Marek Olšák wrote:
> I'm afraid this won't always work and it can be a source of bugs.
>
> Userspace doesn't have to call GEM_WAIT_IDLE before a CPU access to a
> VRAM buffer. For example, consider a wait-idle request with a non-zero
> timeout, which is implemented as a loop which calls GEM_BUSY. Also,
> userspace can use fences (alright they are backed by 1-page-sized VRAM
> buffers at the moment) and it may use real fences in the future which
> are not tied to a buffer object.
>
> If the HDP flush isn't allowed in userspace IBs, I think we will have
> to expose it as an ioctl and call it explicitly from userspace.

I understand your concerns, but my patch doesn't change anything wrt them, does it?

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors
https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #6 from sarnex ---
I tried running a 3D game using the terminal to monitor stdout, and this was constantly spammed as well. Not sure if it gives any additional information.

radeon: Failed to allocate a buffer:
radeon:size : 0 bytes
radeon:alignment : 4096 bytes
radeon:domains : 4
radeon:flags : 4
Dual-channel DSI
Hi everyone,

I've been working on adding support for a panel that uses what's commonly known as dual-channel DSI. Sometimes this is referred to as ganged-mode as well.

What is it, you ask? It's essentially a hack to work around the bandwidth restrictions of DSI, albeit one that's been commonly implemented by several SoC vendors.

This typically works by equipping a peripheral with two DSI interfaces, each of which drives one half of the screen (symmetric left-right mode) or every other line (symmetric odd-even mode). Apparently there can be asymmetric modes in addition to those two, but they seem to be the common ones. Often both of the DSI interfaces need to be configured using DCS commands and vendor specific registers.

A single display controller is typically used for video data transmission. This is necessary to provide synchronization and avoid tearing and all kinds of other ugliness. For this to work both DSI controllers need to be made aware of which chunk of the video data stream is addressing them.
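To make the two symmetric modes concrete, here is a minimal sketch of how a pixel position could be assigned to one of the two DSI links. The enum and function names are hypothetical and not taken from any driver, and the choices that link 0 carries the left half and the even-numbered lines are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical names for the two symmetric modes described above. */
enum dsi_split {
	DSI_SPLIT_LEFT_RIGHT,	/* each link drives one half of every line */
	DSI_SPLIT_ODD_EVEN,	/* each link drives every other line */
};

/*
 * Return 0 or 1: the link that carries pixel (x, y) of a panel that is
 * `width` pixels wide. Assumes link 0 gets the left half or the
 * even-numbered lines (counting from line 0).
 */
static int dsi_link_for_pixel(enum dsi_split mode, unsigned int x,
			      unsigned int y, unsigned int width)
{
	assert(width % 2 == 0);	/* a symmetric split needs an even width */
	assert(x < width);

	if (mode == DSI_SPLIT_LEFT_RIGHT)
		return x < width / 2 ? 0 : 1;

	return (int)(y % 2);
}
```

For a 1920-pixel-wide panel in left-right mode, column 959 lands on link 0 and column 960 on link 1; in odd-even mode the column is irrelevant and only the line number decides.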
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201

Kai changed:

           What    |Removed    |Added
----------------------------------------------------------------
Attachment #104081 |VBIOS from |VBIOS from XFX R9-290A-EDBD
       description |           |
[PULL] topic/core-stuff
Hi Dave,

Flushing out my drm core stuff branch, just 2 stragglers.

Cheers, Daniel

The following changes since commit a91576d7916f6cce76d30303e60e1ac47cf4a76d:

  drm/ttm: Pass GFP flags in order to avoid deadlock. (2014-08-05 10:54:19 +1000)

are available in the git repository at:

  git://anongit.freedesktop.org/drm-intel tags/topic/core-stuff-2014-08-05

for you to fetch changes up to 82a1f64963fa58749b28b39b7ad64140dc2df8cb:

  drm: Fix race when checking for fb in the generic kms obj lookup (2014-08-05 15:54:13 +0200)

Chris Wilson (1):
      drm: Unlink dead file_priv from list of active files first

Daniel Vetter (1):
      drm: Fix race when checking for fb in the generic kms obj lookup

 drivers/gpu/drm/drm_crtc.c | 11 ++-
 drivers/gpu/drm/drm_fops.c | 8
 2 files changed, 10 insertions(+), 9 deletions(-)

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm
https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #7 from Andy Furniss ---
Created attachment 104083
  --> https://bugs.freedesktop.org/attachment.cgi?id=104083&action=edit
bad
[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm
https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #6 from Andy Furniss ---
Created attachment 104082
  --> https://bugs.freedesktop.org/attachment.cgi?id=104082&action=edit
good
[PATCH 1/5] drm/radeon: add userptr support v6
On 05.08.2014 at 16:30, Jerome Glisse wrote:
> On Tue, Aug 05, 2014 at 04:11:03PM +0200, Christian König wrote:
>> From: Christian König
>>
>> This patch adds an IOCTL for turning a pointer supplied by
>> userspace into a buffer object.
>>
>> It imposes several restrictions upon the memory being mapped:
>>
>> 1. It must be page aligned (both start/end addresses, i.e ptr and size).
>>
>> 2. It must be normal system memory, not a pointer into another map of IO
>> space (e.g. it must not be a GTT mmapping of another object).
>>
>> 3. The BO is mapped into GTT, so the maximum amount of memory mapped at
>> all times is still the GTT limit.
>>
>> 4. The BO is only mapped readonly for now, so no write support.
>>
>> 5. List of backing pages is only acquired once, so they represent a
>> snapshot of the first use.
>>
>> Exporting and sharing as well as mapping of buffer objects created by
>> this function is forbidden and results in an -EPERM.
>>
>> v2: squash all previous changes into first public version
>> v3: fix tabs, map readonly, don't use MM callback any more
>> v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
>> pin/unpin pages on bind/unbind instead of populate/unpopulate
>> v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
>> flags, better handle READONLY flag, improve permission check
>> v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin
>>
>> Signed-off-by: Christian König
>> Reviewed-by: Alex Deucher (v4)
>> Reviewed-by: Jérôme Glisse (v4)
>> ---
>> drivers/gpu/drm/radeon/radeon.h        |   5 ++
>> drivers/gpu/drm/radeon/radeon_cs.c     |  25 +-
>> drivers/gpu/drm/radeon/radeon_drv.c    |   5 +-
>> drivers/gpu/drm/radeon/radeon_gem.c    |  68
>> drivers/gpu/drm/radeon/radeon_kms.c    |   1 +
>> drivers/gpu/drm/radeon/radeon_object.c |   3 +
>> drivers/gpu/drm/radeon/radeon_prime.c  |  10 +++
>> drivers/gpu/drm/radeon/radeon_ttm.c    | 139 +
>> drivers/gpu/drm/radeon/radeon_vm.c     |   3 +
>> include/uapi/drm/radeon_drm.h          |  11 +++
>> 10 files changed, 267 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
>> index 9e1732e..3c6999e 100644
>> --- a/drivers/gpu/drm/radeon/radeon.h
>> +++ b/drivers/gpu/drm/radeon/radeon.h
>> @@ -2138,6 +2138,8 @@ int radeon_gem_info_ioctl(struct drm_device *dev, void *data,
>> 			  struct drm_file *filp);
>> int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
>> 			    struct drm_file *filp);
>> +int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
>> +			     struct drm_file *filp);
>> int radeon_gem_pin_ioctl(struct drm_device *dev, void *data,
>> 			 struct drm_file *file_priv);
>> int radeon_gem_unpin_ioctl(struct drm_device *dev, void *data,
>> @@ -2871,6 +2873,9 @@ extern void radeon_legacy_set_clock_gating(struct radeon_device *rdev, int enabl
>> extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int enable);
>> extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain);
>> extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
>> +extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t addr,
>> +				     uint32_t flags);
>> +extern bool radeon_ttm_tt_has_userptr(struct ttm_tt *ttm);
>> extern void radeon_vram_location(struct radeon_device *rdev, struct radeon_mc *mc, u64 base);
>> extern void radeon_gtt_location(struct radeon_device *rdev, struct radeon_mc *mc);
>> extern int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon);
>> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
>> index ee712c1..1321491 100644
>> --- a/drivers/gpu/drm/radeon/radeon_cs.c
>> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
>> @@ -78,7 +78,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
>> 	struct radeon_cs_chunk *chunk;
>> 	struct radeon_cs_buckets buckets;
>> 	unsigned i, j;
>> -	bool duplicate;
>> +	bool duplicate, need_mmap_lock = false;
>> +	int r;
>>
>> 	if (p->chunk_relocs_idx == -1) {
>> 		return 0;
>> @@ -164,6 +165,19 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
>> 			p->relocs[i].allowed_domains = domain;
>> 		}
>>
>> +		if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
>> +			uint32_t domain = p->relocs[i].prefered_domains;
>> +			if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
>> +				DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
>> +					  "allowed for userptr BOs\n");
>> +				return -EINVAL;
>> +			}
>> +
[pull] radeon drm-next-3.17
> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Tuesday, August 05, 2014 1:09 PM
> To: Alex Deucher
> Cc: dri-devel at lists.freedesktop.org; airlied at gmail.com; Deucher, Alexander
> Subject: Re: [pull] radeon drm-next-3.17
>
> On Tue, Aug 05, 2014 at 12:22:06PM -0400, Alex Deucher wrote:
> > Hi Dave,
> >
> > This is the radeon pull request for 3.17. Highlights:
> > - Additional Hawaii fixes
> > - Support for using the display scaler on non-fixed mode displays
> > - Support for new firmware format that makes it easier to update
> > - Enable dpm by default on additional asics
> > - GPUVM improvements
> > - Support for uncached and write combined gtt buffers
> > - Userptr support
>
> Aside: Where's the libdrm/mesa/whatever patches for this? I didn't see
> them fly by anywhere, so I guess I've missed them on some m-l I don't
> subscribe to.

Christian wrote some patches to validate the interfaces, but I'm not sure he ever sent them out. We haven't yet done a full implementation in the usermode drivers to take advantage of this.

Alex

> -Daniel
>
> > - Allow allocation of BOs larger than visible vram
> > - Various other small fixes and improvements
> >
> > The following changes since commit a91576d7916f6cce76d30303e60e1ac47cf4a76d:
> >
> > drm/ttm: Pass GFP flags in order to avoid deadlock. (2014-08-05 10:54:19 +1000)
> >
> > are available in the git repository at:
> >
> > git://people.freedesktop.org/~agd5f/linux drm-next-3.17
> >
> > for you to fetch changes up to ffd7d3a9d535933c7edfbaaac161f11628270716:
> >
> > drm/radeon: allow userptr write access under certain conditions (2014-08-05 12:10:42 -0400)
> >
> >
> > Alex Deucher (25):
> > drm/radeon/dpm: add support for SVI2 voltage for SI
> > drm/radeon: disable gfx cgcg on cik
> > drm/radeon: add new firmware header definitions (v3)
> > drm/radeon/si: Add support for new ucode format (v3)
> > drm/radeon/cik: Add support for new ucode format (v5)
> > drm/radeon: enable display scaling on all connectors (v2)
> > drm/radeon: consolidate vga and dvi get_modes functions (v2)
> > drm/radeon: restructure edid fetching
> > drm/radeon: use a fetch function to get the edid
> > drm/radeon: track pinned memory (v2)
> > drm/radeon: use vram/gart pinned size in radeon_gem_info_ioctl
> > drm/radeon: use vram/gart pinned size in radeon_do_test_moves
> > drm/radeon: remove visible vram size limit on bo allocation (v4)
> > drm/radeon: add a PX quirk list
> > drm/radeon: make radeon_connector_encoder_is_hbr2 static
> > drm/radeon: load the lm63 driver for an lm64 thermal chip.
> > drm/radeon: fix reversed logic in evergreen_mc_resume
> > drm/radeon/atom: add new voltage fetch function for hawaii
> > drm/radeon/dpm: handle voltage info fetching on hawaii
> > drm/radeon: re-enable dpm by default on cayman
> > drm/radeon: re-enable dpm by default on BTC
> > drm/radeon: use an intervall tree to manage the VMA v2
> > drm/radeon: use packet2 for nop on hawaii with old firmware
> > drm/radeon: tweak ACCEL_WORKING2 query for hawaii
> > drm/radeon: use packet3 for nop on hawaii with new firmware
> >
> > Andreas Boll (1):
> > drm/radeon: tweak ACCEL_WORKING2 query for the new firmware for hawaii
> >
> > Christian König (20):
> > drm/radeon: remove discardable flag from radeon_gem_object_create
> > drm/radeon: fix R600_PTE_GART handling
> > drm/radeon: add trace_radeon_vm_flush
> > drm/radeon: set VM base addr using the PFP v2
> > drm/radeon: separate ring and IB handling
> > drm/radeon: invalidate moved BOs in the VM (v2)
> > drm/radeon: remove radeon_bo_clear_va
> > drm/radeon: try to enable VM flushing once more
> > drm/radeon: adjust default radeon_vm_block_size v2
> > drm/radeon: remove taking mclk_lock from radeon_bo_unref
> > drm/radeon: add radeon_bo_ref function
> > drm/radeon: take a BO reference on VM cleanup
> > drm/radeon: add VM GART copy optimization to NI as well
> > drm/radeon: split PT setup in more functions
> > drm/radeon: update IB size estimation for VM
> > drm/radeon: add userptr support v7
> > drm/radeon: add userptr flag to limit it to anonymous memory v2
> > drm/radeon: add userptr flag to directly validate the BO to GTT
> > drm/radeon: add userptr flag to register MMU notifier v3
> > drm/radeon: allow userptr write access under certain conditions
> >
> > Fabian Frederick (1):
> > drm/radeon: remove null test before kfree
> >
> > Lauri Kasanen (1):
> > drm/radeon: Inline r100_mm_rreg, -wreg, v3
> >
> > Mario Kleiner (2):
> > drm/radeon: Use pflip irqs for pageflip completion if possible. (v2)
> >
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #3 from Kai --- Created attachment 104081 --> https://bugs.freedesktop.org/attachment.cgi?id=104081=edit VBIOS from (In reply to comment #2) > Please attach your dmesg output with radeon.dpm=1 set on the kernel command > line in grub. That dumps some additional debugging output. I'll reboot later and attach that dmesg, I'm currently bisecting X for bug 82055. > Also please attach a copy of your vbios. Here you go. Below you find the lspci output, maybe you can reach out to XFX directly, if that should help: > 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] > Hawaii PRO [Radeon R9 290] (prog-if 00 [VGA controller]) > Subsystem: XFX Pine Group Inc. Device 9295 > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- > Stepping- SERR- FastB2B- DisINTx+ > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- > SERR- Latency: 0, Cache Line Size: 64 bytes > Interrupt: pin A routed to IRQ 45 > Region 0: Memory at e000 (64-bit, prefetchable) [size=256M] > Region 2: Memory at f000 (64-bit, prefetchable) [size=8M] > Region 4: I/O ports at e000 [size=256] > Region 5: Memory at f7e0 (32-bit, non-prefetchable) [size=256K] > Expansion ROM at f7e4 [disabled] [size=128K] > Capabilities: [48] Vendor Specific Information: Len=08 > Capabilities: [50] Power Management version 3 > Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA > PME(D0-,D1+,D2+,D3hot+,D3cold-) > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- > Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00 > DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, > L1 unlimited > ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- > DevCtl: Report errors: Correctable- Non-Fatal- Fatal- > Unsupported- > RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ > MaxPayload 256 bytes, MaxReadReq 512 bytes > DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- > TransPend- > LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s 
L1, Exit > Latency L0s <64ns, L1 <1us > ClockPM- Surprise- LLActRep- BwNot- > LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ > ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- > LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ > DLActive- BWMgmt- ABWMgmt- > DevCap2: Completion Timeout: Not Supported, TimeoutDis-, > LTR-, OBFF Not Supported > DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, > OBFF Disabled > LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis- > Transmit Margin: Normal Operating Range, > EnterModifiedCompliance- ComplianceSOS- > Compliance De-emphasis: -6dB > LnkSta2: Current De-emphasis Level: -3.5dB, > EqualizationComplete+, EqualizationPhase1+ > EqualizationPhase2+, EqualizationPhase3+, > LinkEqualizationRequest- > Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+ > Address: fee00358 Data: > Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 > Len=010 > Capabilities: [150 v2] Advanced Error Reporting > UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- > RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- > UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- > RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- > UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- > RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- > CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- > NonFatalErr+ > CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- > NonFatalErr+ > AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ > ChkEn- > Capabilities: [270 v1] #19 > Capabilities: [2b0 v1] Address Translation Service (ATS) > ATSCap: Invalidate Queue Depth: 00 > ATSCtl: Enable-, Smallest Translation Unit: 00 > Capabilities: [2c0 v1] #13 > Capabilities: [2d0 v1] #1b > Kernel driver in use: radeon -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20140805/95243971/attachment.html>
[Intel-gfx] [PATCH 1/6] drm: Renaming DP training vswing/pre-emph defines
On 8/5/2014 4:45 PM, Daniel Vetter wrote: > On Tue, Aug 05, 2014 at 04:38:17PM +0530, sonika.jindal at intel.com wrote: >> From: Sonika Jindal >> >> Renaming defines to have levels instead of nominal values. >> >> Signed-off-by: Sonika Jindal > > You can't split up patches like this since this will break compilation. > For larger stuff (and imo this is right above the cutoff) you first need > to add the new functions/defines, then convert everyone over. And only > when all the drivers are converted can we apply the patch to remove the > old functions/defines. > -Daniel > Got your concern. So, I will repost the first patch keeping both the defines and an additional last patch for removing the extra defines. >> --- >> include/drm/drm_dp_helper.h | 16 >> 1 file changed, 8 insertions(+), 8 deletions(-) >> >> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h >> index a21568b..70f362b 100644 >> --- a/include/drm/drm_dp_helper.h >> +++ b/include/drm/drm_dp_helper.h >> @@ -190,16 +190,16 @@ >> # define DP_TRAIN_VOLTAGE_SWING_MASK 0x3 >> # define DP_TRAIN_VOLTAGE_SWING_SHIFT 0 >> # define DP_TRAIN_MAX_SWING_REACHED(1 << 2) >> -# define DP_TRAIN_VOLTAGE_SWING_400 (0 << 0) >> -# define DP_TRAIN_VOLTAGE_SWING_600 (1 << 0) >> -# define DP_TRAIN_VOLTAGE_SWING_800 (2 << 0) >> -# define DP_TRAIN_VOLTAGE_SWING_1200(3 << 0) >> +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_0 (0 << 0) >> +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_1 (1 << 0) >> +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_2 (2 << 0) >> +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_3 (3 << 0) >> >> # define DP_TRAIN_PRE_EMPHASIS_MASK(3 << 3) >> -# define DP_TRAIN_PRE_EMPHASIS_0(0 << 3) >> -# define DP_TRAIN_PRE_EMPHASIS_3_5 (1 << 3) >> -# define DP_TRAIN_PRE_EMPHASIS_6(2 << 3) >> -# define DP_TRAIN_PRE_EMPHASIS_9_5 (3 << 3) >> +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_0 (0 << 3) >> +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_1 (1 << 3) >> +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_2 (2 << 3) >> +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_3 (3 
<< 3) >> >> # define DP_TRAIN_PRE_EMPHASIS_SHIFT 3 >> # define DP_TRAIN_MAX_PRE_EMPHASIS_REACHED (1 << 5) >> -- >> 1.7.10.4 >> >> ___ >> Intel-gfx mailing list >> Intel-gfx at lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx >
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #2 from Alex Deucher --- Please attach your dmesg output with radeon.dpm=1 set on the kernel command line in grub. That dumps some additional debugging output. Also please attach a copy of your vbios:
(as root)
(use lspci to get the bus id)
cd /sys/bus/pci/devices/
echo 1 > rom
cat rom > /tmp/vbios.rom
echo 0 > rom
[Bug 82201] [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 --- Comment #1 from Kai --- Since the image was a bit too large, I can't attach it here. You can find the screenshot at http://imgur.com/vFBfQpQ
[Bug 82201] New: [HAWAII] GPU doesn't reclock, poor 3D performance
https://bugs.freedesktop.org/show_bug.cgi?id=82201 Priority: medium Bug ID: 82201 Assignee: dri-devel at lists.freedesktop.org Summary: [HAWAII] GPU doesn't reclock, poor 3D performance Severity: normal Classification: Unclassified OS: Linux (All) Reporter: kai at dev.carbon-project.org Hardware: x86-64 (AMD64) Status: NEW Version: git Component: Drivers/Gallium/radeonsi Product: Mesa No matter what program I run, the clock of the GPU stays: # cat /sys/kernel/debug/dri/*/radeon_pm_info power level avg sclk: 3 mclk: 15000 power level avg sclk: 3 mclk: 15000 The attached screenshot shows Portal 2 with a GALLIUM_HUD=fps overlay. The ~30 FPS are in the menu, the 8-15 FPS are in the level. My stack is (base: Debian Testing): GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1) Linux: Git:~agdf5/linux:drm-next-3.17-rebased-on-fixes:fa78380797 (calls itself 3.16-rc6) Firmware: <http://people.freedesktop.org/~agd5f/radeon_ucode/ucode.tar.gz> > 9e05820da42549ce9c89d147cf1f8e19 > /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_ce.bin > c8bab593090fc54f239c8d7596c8d846 > /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_mc.bin > 3618dbb955d8a84970e262bb2e6d2a16 > /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_me.bin > c000b0fc9ff6582145f66504b0ec9597 > /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_mec.bin > 0643ad24b3beff2214cce533e094c1b7 > /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_pfp.bin > ba6054b7d78184a74602fd81607e1386 > /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_rlc.bin > 11288f635737331b69de9ee82fe04898 > /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_sdma.bin > 284429675a5560e0fad42aa982965fc2 > /lib/firmware/updates/3.16.0-rc6-citadel/radeon/hawaii_smc.bin libdrm: Git:master/libdrm-2.4.56 LLVM: SVN:trunk/r214546 (3.6 snapshot) libclc: Git:master/5b48f170c8 Mesa: Git:master/e41cc45361 DDX: Git:master/fbf575cb01 + Patch from http://lists.x.org/archives/xorg-driver-ati/2014-August/026534.html X: 2:1.16.0-1 
(1.16.0) Let me know if you need further information (current Xorg.0.log (attachment 103995), dmesg (attachment 103996) and glxinfo (attachment 103997) can be found attached to bug 82055).
[PATCH 6/6] drm/tegra: Renaming DP training vswing/pre-emph defines
From: Sonika Jindal Signed-off-by: Sonika Jindal --- drivers/gpu/drm/tegra/dpaux.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/tegra/dpaux.c b/drivers/gpu/drm/tegra/dpaux.c index 3f132e3..34f3c1d 100644 --- a/drivers/gpu/drm/tegra/dpaux.c +++ b/drivers/gpu/drm/tegra/dpaux.c @@ -532,9 +532,9 @@ int tegra_dpaux_train(struct tegra_dpaux *dpaux, struct drm_dp_link *link, for (i = 0; i < link->num_lanes; i++) values[i] = DP_TRAIN_MAX_PRE_EMPHASIS_REACHED | - DP_TRAIN_PRE_EMPHASIS_0 | + DP_TRAIN_PRE_EMPHASIS_LEVEL_0 | DP_TRAIN_MAX_SWING_REACHED | - DP_TRAIN_VOLTAGE_SWING_400; + DP_TRAIN_VOLTAGE_SWING_LEVEL_0; err = drm_dp_dpcd_write(&dpaux->aux, DP_TRAINING_LANE0_SET, values, link->num_lanes); -- 1.7.10.4
[PATCH 5/6] drm/gma500: Renaming DP training vswing/pre-emph defines
From: Sonika Jindal Signed-off-by: Sonika Jindal --- drivers/gpu/drm/gma500/cdv_intel_dp.c | 20 ++-- drivers/gpu/drm/gma500/intel_bios.c | 16 2 files changed, 18 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/gma500/cdv_intel_dp.c b/drivers/gpu/drm/gma500/cdv_intel_dp.c index a4cc0e6..a9ef65d 100644 --- a/drivers/gpu/drm/gma500/cdv_intel_dp.c +++ b/drivers/gpu/drm/gma500/cdv_intel_dp.c @@ -1089,21 +1089,21 @@ static char *link_train_names[] = { }; #endif -#define CDV_DP_VOLTAGE_MAX DP_TRAIN_VOLTAGE_SWING_1200 +#define CDV_DP_VOLTAGE_MAX DP_TRAIN_VOLTAGE_SWING_LEVEL_3 /* static uint8_t cdv_intel_dp_pre_emphasis_max(uint8_t voltage_swing) { switch (voltage_swing & DP_TRAIN_VOLTAGE_SWING_MASK) { - case DP_TRAIN_VOLTAGE_SWING_400: - return DP_TRAIN_PRE_EMPHASIS_6; - case DP_TRAIN_VOLTAGE_SWING_600: - return DP_TRAIN_PRE_EMPHASIS_6; - case DP_TRAIN_VOLTAGE_SWING_800: - return DP_TRAIN_PRE_EMPHASIS_3_5; - case DP_TRAIN_VOLTAGE_SWING_1200: + case DP_TRAIN_VOLTAGE_SWING_LEVEL_0: + return DP_TRAIN_PRE_EMPHASIS_LEVEL_2; + case DP_TRAIN_VOLTAGE_SWING_LEVEL_1: + return DP_TRAIN_PRE_EMPHASIS_LEVEL_2; + case DP_TRAIN_VOLTAGE_SWING_LEVEL_2: + return DP_TRAIN_PRE_EMPHASIS_LEVEL_1; + case DP_TRAIN_VOLTAGE_SWING_LEVEL_3: default: - return DP_TRAIN_PRE_EMPHASIS_0; + return DP_TRAIN_PRE_EMPHASIS_LEVEL_0; } } */ @@ -1276,7 +1276,7 @@ cdv_intel_dp_set_vswing_premph(struct gma_encoder *encoder, uint8_t signal_level cdv_sb_write(dev, ddi_reg->VSwing2, dp_vswing_premph_table[index]); /* ;gfx_dpio_set_reg(0x814c, 0x40802040) */ - if ((vswing + premph) == DP_TRAIN_VOLTAGE_SWING_1200) + if ((vswing + premph) == DP_TRAIN_VOLTAGE_SWING_LEVEL_3) cdv_sb_write(dev, ddi_reg->VSwing3, 0x70802040); else cdv_sb_write(dev, ddi_reg->VSwing3, 0x40802040); diff --git a/drivers/gpu/drm/gma500/intel_bios.c b/drivers/gpu/drm/gma500/intel_bios.c index d349734..9573283 100644 --- a/drivers/gpu/drm/gma500/intel_bios.c +++ b/drivers/gpu/drm/gma500/intel_bios.c @@ -116,30 +116,30 @@ parse_edp(struct 
drm_psb_private *dev_priv, struct bdb_header *bdb) switch (edp_link_params->preemphasis) { case 0: - dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_0; + dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_0; break; case 1: - dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_3_5; + dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_1; break; case 2: - dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_6; + dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_2; break; case 3: - dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_9_5; + dev_priv->edp.preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_3; break; } switch (edp_link_params->vswing) { case 0: - dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_400; + dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_0; break; case 1: - dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_600; + dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_1; break; case 2: - dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_800; + dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_2; break; case 3: - dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_1200; + dev_priv->edp.vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_3; break; } DRM_DEBUG_KMS("VBT reports EDP: VSwing %d, Preemph %d\n", -- 1.7.10.4
[PATCH 4/6] drm/radeon: Renaming DP training vswing/pre-emph defines
From: Sonika Jindal Signed-off-by: Sonika Jindal --- drivers/gpu/drm/radeon/atombios_dp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/atombios_dp.c b/drivers/gpu/drm/radeon/atombios_dp.c index b1e11f8..ef32b16 100644 --- a/drivers/gpu/drm/radeon/atombios_dp.c +++ b/drivers/gpu/drm/radeon/atombios_dp.c @@ -232,8 +232,8 @@ void radeon_dp_aux_init(struct radeon_connector *radeon_connector) /* general DP utility functions */ -#define DP_VOLTAGE_MAX DP_TRAIN_VOLTAGE_SWING_1200 -#define DP_PRE_EMPHASIS_MAX DP_TRAIN_PRE_EMPHASIS_9_5 +#define DP_VOLTAGE_MAX DP_TRAIN_VOLTAGE_SWING_LEVEL_3 +#define DP_PRE_EMPHASIS_MAX DP_TRAIN_PRE_EMPHASIS_LEVEL_3 static void dp_get_adjust_train(u8 link_status[DP_LINK_STATUS_SIZE], int lane_count, -- 1.7.10.4
[PATCH 3/6] drm/exynos: Renaming DP training vswing/pre-emph defines
From: Sonika Jindal Signed-off-by: Sonika Jindal --- drivers/gpu/drm/exynos/exynos_dp_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_dp_core.c b/drivers/gpu/drm/exynos/exynos_dp_core.c index 31c3de9..e520943 100644 --- a/drivers/gpu/drm/exynos/exynos_dp_core.c +++ b/drivers/gpu/drm/exynos/exynos_dp_core.c @@ -331,8 +331,8 @@ static int exynos_dp_link_start(struct exynos_dp_device *dp) return retval; for (lane = 0; lane < lane_count; lane++) - buf[lane] = DP_TRAIN_PRE_EMPHASIS_0 | - DP_TRAIN_VOLTAGE_SWING_400; + buf[lane] = DP_TRAIN_PRE_EMPHASIS_LEVEL_0 | + DP_TRAIN_VOLTAGE_SWING_LEVEL_0; retval = exynos_dp_write_bytes_to_dpcd(dp, DP_TRAINING_LANE0_SET, lane_count, buf); -- 1.7.10.4
[PATCH 2/6] drm/i915: Renaming DP training vswing/pre-emph defines
From: Sonika Jindal Changing the DP training vswing/pre-emph defines in i915. Signed-off-by: Sonika Jindal --- drivers/gpu/drm/i915/intel_bios.c | 16 +-- drivers/gpu/drm/i915/intel_dp.c | 194 ++--- 2 files changed, 105 insertions(+), 105 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_bios.c b/drivers/gpu/drm/i915/intel_bios.c index 031c565..ef11274 100644 --- a/drivers/gpu/drm/i915/intel_bios.c +++ b/drivers/gpu/drm/i915/intel_bios.c @@ -627,16 +627,16 @@ parse_edp(struct drm_i915_private *dev_priv, struct bdb_header *bdb) switch (edp_link_params->preemphasis) { case EDP_PREEMPHASIS_NONE: - dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_0; + dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_0; break; case EDP_PREEMPHASIS_3_5dB: - dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_3_5; + dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_1; break; case EDP_PREEMPHASIS_6dB: - dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_6; + dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_2; break; case EDP_PREEMPHASIS_9_5dB: - dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_9_5; + dev_priv->vbt.edp_preemphasis = DP_TRAIN_PRE_EMPHASIS_LEVEL_3; break; default: DRM_DEBUG_KMS("VBT has unknown eDP pre-emphasis value %u\n", @@ -646,16 +646,16 @@ parse_edp(struct drm_i915_private *dev_priv, struct bdb_header *bdb) switch (edp_link_params->vswing) { case EDP_VSWING_0_4V: - dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_400; + dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_0; break; case EDP_VSWING_0_6V: - dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_600; + dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_1; break; case EDP_VSWING_0_8V: - dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_800; + dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_2; break; case EDP_VSWING_1_2V: - dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_1200; + dev_priv->vbt.edp_vswing = DP_TRAIN_VOLTAGE_SWING_LEVEL_3; break; 
default: DRM_DEBUG_KMS("VBT has unknown eDP voltage swing value %u\n", diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index ce890f0..c2b3075 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -2393,13 +2393,13 @@ intel_dp_voltage_max(struct intel_dp *intel_dp) enum port port = dp_to_dig_port(intel_dp)->port; if (IS_VALLEYVIEW(dev)) - return DP_TRAIN_VOLTAGE_SWING_1200; + return DP_TRAIN_VOLTAGE_SWING_LEVEL_3; else if (IS_GEN7(dev) && port == PORT_A) - return DP_TRAIN_VOLTAGE_SWING_800; + return DP_TRAIN_VOLTAGE_SWING_LEVEL_2; else if (HAS_PCH_CPT(dev) && port != PORT_A) - return DP_TRAIN_VOLTAGE_SWING_1200; + return DP_TRAIN_VOLTAGE_SWING_LEVEL_3; else - return DP_TRAIN_VOLTAGE_SWING_800; + return DP_TRAIN_VOLTAGE_SWING_LEVEL_2; } static uint8_t @@ -2410,49 +2410,49 @@ intel_dp_pre_emphasis_max(struct intel_dp *intel_dp, uint8_t voltage_swing) if (IS_HASWELL(dev) || IS_BROADWELL(dev)) { switch (voltage_swing & DP_TRAIN_VOLTAGE_SWING_MASK) { - case DP_TRAIN_VOLTAGE_SWING_400: - return DP_TRAIN_PRE_EMPHASIS_9_5; - case DP_TRAIN_VOLTAGE_SWING_600: - return DP_TRAIN_PRE_EMPHASIS_6; - case DP_TRAIN_VOLTAGE_SWING_800: - return DP_TRAIN_PRE_EMPHASIS_3_5; - case DP_TRAIN_VOLTAGE_SWING_1200: + case DP_TRAIN_VOLTAGE_SWING_LEVEL_0: + return DP_TRAIN_PRE_EMPHASIS_LEVEL_3; + case DP_TRAIN_VOLTAGE_SWING_LEVEL_1: + return DP_TRAIN_PRE_EMPHASIS_LEVEL_2; + case DP_TRAIN_VOLTAGE_SWING_LEVEL_2: + return DP_TRAIN_PRE_EMPHASIS_LEVEL_1; + case DP_TRAIN_VOLTAGE_SWING_LEVEL_3: default: - return DP_TRAIN_PRE_EMPHASIS_0; + return DP_TRAIN_PRE_EMPHASIS_LEVEL_0; } } else if (IS_VALLEYVIEW(dev)) { switch (voltage_swing & DP_TRAIN_VOLTAGE_SWING_MASK) { - case DP_TRAIN_VOLTAGE_SWING_400: - return DP_TRAIN_PRE_EMPHASIS_9_5; - case DP_TRAIN_VOLTAGE_SWING_600: - return
[PATCH 1/6] drm: Renaming DP training vswing/pre-emph defines
From: Sonika Jindal Renaming defines to have levels instead of nominal values. Signed-off-by: Sonika Jindal --- include/drm/drm_dp_helper.h | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h index a21568b..70f362b 100644 --- a/include/drm/drm_dp_helper.h +++ b/include/drm/drm_dp_helper.h @@ -190,16 +190,16 @@ # define DP_TRAIN_VOLTAGE_SWING_MASK 0x3 # define DP_TRAIN_VOLTAGE_SWING_SHIFT 0 # define DP_TRAIN_MAX_SWING_REACHED (1 << 2) -# define DP_TRAIN_VOLTAGE_SWING_400 (0 << 0) -# define DP_TRAIN_VOLTAGE_SWING_600 (1 << 0) -# define DP_TRAIN_VOLTAGE_SWING_800 (2 << 0) -# define DP_TRAIN_VOLTAGE_SWING_1200 (3 << 0) +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_0 (0 << 0) +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_1 (1 << 0) +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_2 (2 << 0) +# define DP_TRAIN_VOLTAGE_SWING_LEVEL_3 (3 << 0) # define DP_TRAIN_PRE_EMPHASIS_MASK (3 << 3) -# define DP_TRAIN_PRE_EMPHASIS_0 (0 << 3) -# define DP_TRAIN_PRE_EMPHASIS_3_5 (1 << 3) -# define DP_TRAIN_PRE_EMPHASIS_6 (2 << 3) -# define DP_TRAIN_PRE_EMPHASIS_9_5 (3 << 3) +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_0 (0 << 3) +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_1 (1 << 3) +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_2 (2 << 3) +# define DP_TRAIN_PRE_EMPHASIS_LEVEL_3 (3 << 3) # define DP_TRAIN_PRE_EMPHASIS_SHIFT 3 # define DP_TRAIN_MAX_PRE_EMPHASIS_REACHED (1 << 5) -- 1.7.10.4
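Review feedback on this patch elsewhere in the thread points out that replacing the old names in the header in one step breaks compilation until every driver is converted. A minimal sketch of one way to stage it, assuming the old names are kept as plain aliases until a final cleanup patch (the two `*_lane_set()` helpers are hypothetical stand-ins for driver code, not functions from the series):

```c
#include <assert.h>
#include <stdint.h>

/* New level-based names; values copied from the diff in this patch. */
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_0	(0 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_1	(1 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_2	(2 << 0)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_3	(3 << 0)

#define DP_TRAIN_PRE_EMPHASIS_LEVEL_0	(0 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_1	(1 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_2	(2 << 3)
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_3	(3 << 3)

/*
 * Transitional aliases (assumption: one possible form of "keeping both
 * the defines"). A last patch in the series would delete this block
 * once no driver uses the old names any more.
 */
#define DP_TRAIN_VOLTAGE_SWING_400	DP_TRAIN_VOLTAGE_SWING_LEVEL_0
#define DP_TRAIN_VOLTAGE_SWING_600	DP_TRAIN_VOLTAGE_SWING_LEVEL_1
#define DP_TRAIN_VOLTAGE_SWING_800	DP_TRAIN_VOLTAGE_SWING_LEVEL_2
#define DP_TRAIN_VOLTAGE_SWING_1200	DP_TRAIN_VOLTAGE_SWING_LEVEL_3

#define DP_TRAIN_PRE_EMPHASIS_0		DP_TRAIN_PRE_EMPHASIS_LEVEL_0
#define DP_TRAIN_PRE_EMPHASIS_3_5	DP_TRAIN_PRE_EMPHASIS_LEVEL_1
#define DP_TRAIN_PRE_EMPHASIS_6		DP_TRAIN_PRE_EMPHASIS_LEVEL_2
#define DP_TRAIN_PRE_EMPHASIS_9_5	DP_TRAIN_PRE_EMPHASIS_LEVEL_3

/* Compile-time check that the aliases keep the old bit patterns. */
static_assert(DP_TRAIN_VOLTAGE_SWING_1200 == DP_TRAIN_VOLTAGE_SWING_LEVEL_3,
	      "alias must keep the old bit pattern");
static_assert(DP_TRAIN_PRE_EMPHASIS_9_5 == DP_TRAIN_PRE_EMPHASIS_LEVEL_3,
	      "alias must keep the old bit pattern");

/* Hypothetical driver code that has not been converted yet. */
static inline uint8_t old_style_lane_set(void)
{
	return DP_TRAIN_PRE_EMPHASIS_0 | DP_TRAIN_VOLTAGE_SWING_400;
}

/* The same code after its driver patch has been applied. */
static inline uint8_t new_style_lane_set(void)
{
	return DP_TRAIN_PRE_EMPHASIS_LEVEL_0 | DP_TRAIN_VOLTAGE_SWING_LEVEL_0;
}
```

With a header like this, each per-driver patch compiles on its own and bisecting across the series does not hit a broken build.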
[PATCH 0/6] Rename DP training vswing/pre-emph defines
From: Sonika Jindal Rename the defines to use levels instead of values for vswing and pre-emph, as the nominal values may differ in other scenarios, such as the low vswing mode of eDP 1.4. All the drivers are updated as well. Sonika Jindal (6): drm: Renaming DP training vswing/pre-emph defines drm/i915: Renaming DP training vswing/pre-emph defines drm/exynos: Renaming DP training vswing/pre-emph defines drm/radeon: Renaming DP training vswing/pre-emph defines drm/gma500: Renaming DP training vswing/pre-emph defines drm/tegra: Renaming DP training vswing/pre-emph defines drivers/gpu/drm/exynos/exynos_dp_core.c | 4 +- drivers/gpu/drm/gma500/cdv_intel_dp.c | 20 ++-- drivers/gpu/drm/gma500/intel_bios.c | 16 +-- drivers/gpu/drm/i915/intel_bios.c | 16 +-- drivers/gpu/drm/i915/intel_dp.c | 194 +++ drivers/gpu/drm/radeon/atombios_dp.c | 4 +- drivers/gpu/drm/tegra/dpaux.c | 4 +- include/drm/drm_dp_helper.h | 16 +-- 8 files changed, 137 insertions(+), 137 deletions(-) -- 1.7.10.4
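To illustrate the motivation: the training byte only carries a 2-bit level per field, and the voltage is looked up separately. A rough sketch, assuming the mask/shift values from patch 1/6 and the nominal millivolt figures implied by the old macro names (400, 600, 800, 1200); the helper names and the lookup table are hypothetical, and no low-vswing eDP table is shown because its values are not given in this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the drm_dp_helper.h hunk in patch 1/6. */
#define DP_TRAIN_VOLTAGE_SWING_MASK	0x3
#define DP_TRAIN_VOLTAGE_SWING_SHIFT	0
#define DP_TRAIN_MAX_SWING_REACHED	(1 << 2)
#define DP_TRAIN_VOLTAGE_SWING_LEVEL_2	(2 << 0)
#define DP_TRAIN_PRE_EMPHASIS_MASK	(3 << 3)
#define DP_TRAIN_PRE_EMPHASIS_SHIFT	3
#define DP_TRAIN_PRE_EMPHASIS_LEVEL_1	(1 << 3)

/* Hypothetical helper: extract the voltage swing level (0..3). */
static inline unsigned int dp_train_vswing_level(uint8_t lane_set)
{
	return (lane_set & DP_TRAIN_VOLTAGE_SWING_MASK) >>
	       DP_TRAIN_VOLTAGE_SWING_SHIFT;
}

/* Hypothetical helper: extract the pre-emphasis level (0..3). */
static inline unsigned int dp_train_pre_emph_level(uint8_t lane_set)
{
	return (lane_set & DP_TRAIN_PRE_EMPHASIS_MASK) >>
	       DP_TRAIN_PRE_EMPHASIS_SHIFT;
}

/*
 * Nominal swing per level for the standard case, taken from the old
 * macro names. Another link mode would use a different table while the
 * level encoding in the training byte stays the same.
 */
static const unsigned int dp_std_vswing_mv[4] = { 400, 600, 800, 1200 };

static inline unsigned int dp_std_vswing_to_mv(uint8_t lane_set)
{
	unsigned int level = dp_train_vswing_level(lane_set);

	assert(level < 4);	/* guaranteed by the 2-bit mask */
	return dp_std_vswing_mv[level];
}
```

The point of the rename is visible here: only the table is specific to a nominal voltage set, so names such as `DP_TRAIN_VOLTAGE_SWING_LEVEL_2` stay accurate when a different table applies.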
[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory
On 05.08.2014 at 16:24, Jerome Glisse wrote: > On Tue, Aug 05, 2014 at 04:11:04PM +0200, Christian König wrote: >> From: Christian König > Why do you want that? To avoid any problems with writeback (which as far as I understand should only happen on mmapped files). > NACK until proper explanation and motive. Going to update the commit message and add a code comment. Christian. > >> Signed-off-by: Christian König >> --- >> drivers/gpu/drm/radeon/radeon_gem.c | 3 ++- >> drivers/gpu/drm/radeon/radeon_ttm.c | 8 >> include/uapi/drm/radeon_drm.h | 3 ++- >> 3 files changed, 12 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c >> b/drivers/gpu/drm/radeon/radeon_gem.c >> index 993ab22..032736b 100644 >> --- a/drivers/gpu/drm/radeon/radeon_gem.c >> +++ b/drivers/gpu/drm/radeon/radeon_gem.c >> @@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, >> void *data, >> return -EACCES; >> >> /* reject unknown flag values */ >> -if (args->flags & ~RADEON_GEM_USERPTR_READONLY) >> +if (args->flags & ~(RADEON_GEM_USERPTR_READONLY | >> +RADEON_GEM_USERPTR_ANONONLY)) >> return -EINVAL; >> >> /* readonly pages not tested on older hardware */ >> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c >> b/drivers/gpu/drm/radeon/radeon_ttm.c >> index 0109090..d63e698 100644 >> --- a/drivers/gpu/drm/radeon/radeon_ttm.c >> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c >> @@ -542,6 +542,14 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) >> ttm->num_pages * PAGE_SIZE)) >> return -EFAULT; >> >> +if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) { >> +unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE; >> +struct vm_area_struct *vma; >> +vma = find_vma(gtt->usermm, gtt->userptr); >> +if (!vma || vma->vm_file || vma->vm_end < end) >> +return -EPERM; >> +} >> + >> do { >> unsigned num_pages = ttm->num_pages - pinned; >> uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE; >> diff --git a/include/uapi/drm/radeon_drm.h 
b/include/uapi/drm/radeon_drm.h >> index a18ec54..4080ad3 100644 >> --- a/include/uapi/drm/radeon_drm.h >> +++ b/include/uapi/drm/radeon_drm.h >> @@ -810,7 +810,8 @@ struct drm_radeon_gem_create { >> uint32_tflags; >> }; >> >> -#define RADEON_GEM_USERPTR_READONLY 0x1 >> +#define RADEON_GEM_USERPTR_READONLY (1 << 0) >> +#define RADEON_GEM_USERPTR_ANONONLY (1 << 1) >> >> struct drm_radeon_gem_userptr { >> uint64_taddr; >> -- >> 1.9.1 >> >> ___ >> dri-devel mailing list >> dri-devel at lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm
https://bugs.freedesktop.org/show_bug.cgi?id=82050 --- Comment #5 from Tom Stellard --- Can you post the output of R600_DEBUG=cs from both the "good" and "bad" commits?
[PATCH 5/5] drm/radeon: allow userptr write access under certain conditions
From: Christian König It needs to be anonymous memory (no file mappings) and we are required to install an MMU notifier. Signed-off-by: Christian König --- drivers/gpu/drm/radeon/radeon_gem.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 2a6fbf1..01b5894 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -285,19 +285,24 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data, if (offset_in_page(args->addr | args->size)) return -EINVAL; - /* we only support read only mappings for now */ - if (!(args->flags & RADEON_GEM_USERPTR_READONLY)) - return -EACCES; - /* reject unknown flag values */ if (args->flags & ~(RADEON_GEM_USERPTR_READONLY | RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE | RADEON_GEM_USERPTR_REGISTER)) return -EINVAL; - /* readonly pages not tested on older hardware */ - if (rdev->family < CHIP_R600) - return -EINVAL; + if (args->flags & RADEON_GEM_USERPTR_READONLY) { + /* readonly pages not tested on older hardware */ + if (rdev->family < CHIP_R600) + return -EINVAL; + + } else if (!(args->flags & RADEON_GEM_USERPTR_ANONONLY) || + !(args->flags & RADEON_GEM_USERPTR_REGISTER)) { + + /* if we want to write to it we must require anonymous + memory and install a MMU notifier */ + return -EACCES; + } down_read(&rdev->exclusive_lock); -- 1.9.1
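The resulting flag policy can be modelled outside the kernel. The sketch below is a stand-alone approximation of the checks in this hunk, not the driver code itself: `userptr_check_flags()` and its `pre_r600` parameter are hypothetical (the latter stands in for `rdev->family < CHIP_R600`), the page-alignment check is left out, and the bit value of RADEON_GEM_USERPTR_REGISTER is an assumption because the uapi hunk that defines it is not visible in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define RADEON_GEM_USERPTR_READONLY	(1u << 0)
#define RADEON_GEM_USERPTR_ANONONLY	(1u << 1)
#define RADEON_GEM_USERPTR_VALIDATE	(1u << 2)
/* Assumption: next free bit; the defining hunk is not shown here. */
#define RADEON_GEM_USERPTR_REGISTER	(1u << 3)

#define USERPTR_KNOWN_FLAGS (RADEON_GEM_USERPTR_READONLY | \
			     RADEON_GEM_USERPTR_ANONONLY | \
			     RADEON_GEM_USERPTR_VALIDATE | \
			     RADEON_GEM_USERPTR_REGISTER)

static_assert(USERPTR_KNOWN_FLAGS == 0xfu, "flags must be distinct bits");

/*
 * Hypothetical model of the flag checks:
 *  - unknown bits are rejected with -EINVAL;
 *  - read-only mappings are refused on pre-R600 parts;
 *  - writable mappings need both ANONONLY and REGISTER, else -EACCES.
 */
static int userptr_check_flags(uint32_t flags, bool pre_r600)
{
	if (flags & ~USERPTR_KNOWN_FLAGS)
		return -EINVAL;

	if (flags & RADEON_GEM_USERPTR_READONLY) {
		if (pre_r600)
			return -EINVAL;
	} else if (!(flags & RADEON_GEM_USERPTR_ANONONLY) ||
		   !(flags & RADEON_GEM_USERPTR_REGISTER)) {
		return -EACCES;
	}

	return 0;
}
```

One thing the model makes easy to see: as the hunk is written, the hardware-family check only sits in the read-only branch, so a writable request with both ANONONLY and REGISTER passes these checks regardless of `pre_r600`.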
[PATCH 4/5] drm/radeon: add userptr flag to register MMU notifier v2
From: Christian K?nigv2: rebased, fix mutex unlock in error path Signed-off-by: Christian K?nig --- drivers/gpu/drm/Kconfig| 1 + drivers/gpu/drm/radeon/Makefile| 2 +- drivers/gpu/drm/radeon/radeon.h| 12 ++ drivers/gpu/drm/radeon/radeon_device.c | 2 + drivers/gpu/drm/radeon/radeon_gem.c| 9 +- drivers/gpu/drm/radeon/radeon_mn.c | 272 + drivers/gpu/drm/radeon/radeon_object.c | 1 + include/uapi/drm/radeon_drm.h | 1 + 8 files changed, 298 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/radeon/radeon_mn.c diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 9b2eedc..2745284 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -115,6 +115,7 @@ config DRM_RADEON select HWMON select BACKLIGHT_CLASS_DEVICE select INTERVAL_TREE + select MMU_NOTIFIER help Choose this option if you have an ATI Radeon graphics card. There are both PCI and AGP versions. You don't need to choose this to diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile index 0013ad0..c7fa1ae 100644 --- a/drivers/gpu/drm/radeon/Makefile +++ b/drivers/gpu/drm/radeon/Makefile @@ -80,7 +80,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \ r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o rv740_dpm.o \ rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o trinity_dpm.o \ trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \ - ci_dpm.o dce6_afmt.o radeon_vm.o radeon_ucode.o radeon_ib.o + ci_dpm.o dce6_afmt.o radeon_vm.o radeon_ucode.o radeon_ib.o radeon_mn.o # add async DMA block radeon-y += \ diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 3c6999e..511191f 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -65,6 +65,7 @@ #include #include #include +#include #include #include @@ -487,6 +488,9 @@ struct radeon_bo { struct ttm_bo_kmap_obj dma_buf_vmap; pid_t pid; + + struct radeon_mn*mn; + struct interval_tree_node mn_it; }; #define 
gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, gem_base) @@ -1725,6 +1729,11 @@ void radeon_test_ring_sync(struct radeon_device *rdev, struct radeon_ring *cpB); void radeon_test_syncing(struct radeon_device *rdev); +/* + * MMU Notifier + */ +int radeon_mn_register(struct radeon_bo *bo, unsigned long addr); +void radeon_mn_unregister(struct radeon_bo *bo); /* * Debugfs @@ -2372,6 +2381,9 @@ struct radeon_device { /* tracking pinned memory */ u64 vram_pin_size; u64 gart_pin_size; + + struct mutexmn_lock; + DECLARE_HASHTABLE(mn_hash, 7); }; bool radeon_is_px(struct drm_device *dev); diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index c8ea050..c58f84f 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1270,6 +1270,8 @@ int radeon_device_init(struct radeon_device *rdev, init_rwsem(>pm.mclk_lock); init_rwsem(>exclusive_lock); init_waitqueue_head(>irq.vblank_queue); + mutex_init(>mn_lock); + hash_init(rdev->mn_hash); r = radeon_gem_init(rdev); if (r) return r; diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 4506560..2a6fbf1 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -291,7 +291,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data, /* reject unknown flag values */ if (args->flags & ~(RADEON_GEM_USERPTR_READONLY | - RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE)) + RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE | + RADEON_GEM_USERPTR_REGISTER)) return -EINVAL; /* readonly pages not tested on older hardware */ @@ -312,6 +313,12 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data, if (r) goto release_object; + if (args->flags & RADEON_GEM_USERPTR_REGISTER) { + r = radeon_mn_register(bo, args->addr); + if (r) + goto release_object; + } + if (args->flags & RADEON_GEM_USERPTR_VALIDATE) { down_read(>mm->mmap_sem); r = 
radeon_bo_reserve(bo, true); diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c new file mode 100644 index 000..0157bc2 --- /dev/null +++ b/drivers/gpu/drm/radeon/radeon_mn.c @@ -0,0 +1,272 @@ +/* + * Copyright 2014 Advanced Micro Devices, Inc. + * All Rights Reserved. + * + *
[PATCH 3/5] drm/radeon: add userptr flag to directly validate the BO to GTT
From: Christian König Signed-off-by: Christian König --- drivers/gpu/drm/radeon/radeon_gem.c | 18 +- include/uapi/drm/radeon_drm.h | 1 + 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 032736b..4506560 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -291,7 +291,7 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data, /* reject unknown flag values */ if (args->flags & ~(RADEON_GEM_USERPTR_READONLY | - RADEON_GEM_USERPTR_ANONONLY)) + RADEON_GEM_USERPTR_ANONONLY | RADEON_GEM_USERPTR_VALIDATE)) return -EINVAL; /* readonly pages not tested on older hardware */ @@ -312,6 +312,22 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data, if (r) goto release_object; + if (args->flags & RADEON_GEM_USERPTR_VALIDATE) { + down_read(&current->mm->mmap_sem); + r = radeon_bo_reserve(bo, true); + if (r) { + up_read(&current->mm->mmap_sem); + goto release_object; + } + + radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT); + r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false); + radeon_bo_unreserve(bo); + up_read(&current->mm->mmap_sem); + if (r) + goto release_object; + } + r = drm_gem_handle_create(filp, gobj, &handle); /* drop reference from allocate - handle holds it now */ drm_gem_object_unreference_unlocked(gobj); diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h index 4080ad3..026111b 100644 --- a/include/uapi/drm/radeon_drm.h +++ b/include/uapi/drm/radeon_drm.h @@ -812,6 +812,7 @@ struct drm_radeon_gem_create { #define RADEON_GEM_USERPTR_READONLY (1 << 0) #define RADEON_GEM_USERPTR_ANONONLY (1 << 1) +#define RADEON_GEM_USERPTR_VALIDATE (1 << 2) struct drm_radeon_gem_userptr { uint64_t addr; -- 1.9.1
[PATCH 2/5] drm/radeon: add userptr flag to limit it to anonymous memory
From: Christian König

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_gem.c | 3 ++-
 drivers/gpu/drm/radeon/radeon_ttm.c | 8
 include/uapi/drm/radeon_drm.h       | 3 ++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 993ab22..032736b 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -290,7 +290,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 		return -EACCES;

 	/* reject unknown flag values */
-	if (args->flags & ~RADEON_GEM_USERPTR_READONLY)
+	if (args->flags & ~(RADEON_GEM_USERPTR_READONLY |
+	    RADEON_GEM_USERPTR_ANONONLY))
 		return -EINVAL;

 	/* readonly pages not tested on older hardware */
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 0109090..d63e698 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -542,6 +542,14 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 			       ttm->num_pages * PAGE_SIZE))
 		return -EFAULT;

+	if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
+		unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
+		struct vm_area_struct *vma;
+		vma = find_vma(gtt->usermm, gtt->userptr);
+		if (!vma || vma->vm_file || vma->vm_end < end)
+			return -EPERM;
+	}
+
 	do {
 		unsigned num_pages = ttm->num_pages - pinned;
 		uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE;
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index a18ec54..4080ad3 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -810,7 +810,8 @@ struct drm_radeon_gem_create {
 	uint32_t		flags;
 };

-#define RADEON_GEM_USERPTR_READONLY	0x1
+#define RADEON_GEM_USERPTR_READONLY	(1 << 0)
+#define RADEON_GEM_USERPTR_ANONONLY	(1 << 1)

 struct drm_radeon_gem_userptr {
 	uint64_t		addr;
--
1.9.1
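The ANONONLY test added to radeon_ttm_tt_pin_userptr() boils down to three conditions on the VMA returned by find_vma(). The sketch below models only that condition in plain userspace C. struct model_vma and anononly_check are hypothetical stand-ins introduced here for illustration; they are not kernel types or driver functions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two vm_area_struct fields the check reads. */
struct model_vma {
	uint64_t vm_end;	/* first address past the end of the mapping */
	bool has_file;		/* models vma->vm_file != NULL */
};

/*
 * Hypothetical model of the check in the patch. vma may be NULL, which
 * models find_vma() not finding any mapping for the address.
 */
static int anononly_check(const struct model_vma *vma,
			  uint64_t userptr, uint64_t size)
{
	uint64_t end = userptr + size;

	/* no mapping, file-backed mapping, or range runs past the mapping */
	if (!vma || vma->has_file || vma->vm_end < end)
		return -EPERM;
	return 0;
}
```

In this model a range is accepted only when it lies inside a single anonymous mapping up to its end; anything spilling past vm_end is refused with -EPERM, as in the patch.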
[PATCH 1/5] drm/radeon: add userptr support v6
From: Christian König

This patch adds an IOCTL for turning a pointer supplied by
userspace into a buffer object.

It imposes several restrictions upon the memory being mapped:

1. It must be page aligned (both start/end addresses, i.e. ptr and size).

2. It must be normal system memory, not a pointer into another map of IO
   space (e.g. it must not be a GTT mmapping of another object).

3. The BO is mapped into GTT, so the maximum amount of memory mapped at
   all times is still the GTT limit.

4. The BO is only mapped readonly for now, so no write support.

5. List of backing pages is only acquired once, so they represent a
   snapshot of the first use.

Exporting and sharing as well as mapping of buffer objects created by
this function is forbidden and results in an -EPERM.

v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
    pin/unpin pages on bind/unbind instead of populate/unpopulate
v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
    flags, better handle READONLY flag, improve permission check
v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin

Signed-off-by: Christian König
Reviewed-by: Alex Deucher (v4)
Reviewed-by: Jérôme Glisse (v4)
---
 drivers/gpu/drm/radeon/radeon.h        |   5 ++
 drivers/gpu/drm/radeon/radeon_cs.c     |  25 +-
 drivers/gpu/drm/radeon/radeon_drv.c    |   5 +-
 drivers/gpu/drm/radeon/radeon_gem.c    |  68
 drivers/gpu/drm/radeon/radeon_kms.c    |   1 +
 drivers/gpu/drm/radeon/radeon_object.c |   3 +
 drivers/gpu/drm/radeon/radeon_prime.c  |  10 +++
 drivers/gpu/drm/radeon/radeon_ttm.c    | 139 +
 drivers/gpu/drm/radeon/radeon_vm.c     |   3 +
 include/uapi/drm/radeon_drm.h          |  11 +++
 10 files changed, 267 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 9e1732e..3c6999e 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2138,6 +2138,8 @@ int radeon_gem_info_ioctl(struct drm_device *dev, void *data,
 			  struct drm_file *filp);
 int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *filp);
+int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *filp);
 int radeon_gem_pin_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file_priv);
 int radeon_gem_unpin_ioctl(struct drm_device *dev, void *data,
@@ -2871,6 +2873,9 @@ extern void radeon_legacy_set_clock_gating(struct radeon_device *rdev, int enabl
 extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int enable);
 extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain);
 extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
+extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t addr,
+				     uint32_t flags);
+extern bool radeon_ttm_tt_has_userptr(struct ttm_tt *ttm);
 extern void radeon_vram_location(struct radeon_device *rdev, struct radeon_mc *mc, u64 base);
 extern void radeon_gtt_location(struct radeon_device *rdev, struct radeon_mc *mc);
 extern int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index ee712c1..1321491 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -78,7 +78,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 	struct radeon_cs_chunk *chunk;
 	struct radeon_cs_buckets buckets;
 	unsigned i, j;
-	bool duplicate;
+	bool duplicate, need_mmap_lock = false;
+	int r;

 	if (p->chunk_relocs_idx == -1) {
 		return 0;
@@ -164,6 +165,19 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 			p->relocs[i].allowed_domains = domain;
 		}

+		if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
+			uint32_t domain = p->relocs[i].prefered_domains;
+			if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
+				DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
+					  "allowed for userptr BOs\n");
+				return -EINVAL;
+			}
+			need_mmap_lock = true;
+			domain = RADEON_GEM_DOMAIN_GTT;
+			p->relocs[i].prefered_domains = domain;
+			p->relocs[i].allowed_domains = domain;
+		}
+
 		p->relocs[i].tv.bo = &p->relocs[i].robj->tbo;
 		p->relocs[i].handle = r->handle;

@@ -176,8
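Restriction 1 from the commit message (both ptr and size page aligned) is easy to check in userspace before calling the ioctl. This is a minimal sketch under stated assumptions: userptr_range_aligned is a hypothetical helper name, and page_size is assumed to be a power of two, for example the value returned by sysconf(_SC_PAGESIZE).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): returns true when both
 * the start address and the size are multiples of page_size, so the
 * start and end of the range fall on page boundaries.
 * page_size is assumed to be a power of two.
 */
static bool userptr_range_aligned(uint64_t addr, uint64_t size,
				  uint64_t page_size)
{
	uint64_t mask = page_size - 1;

	/* OR-ing lets a single mask test cover both values */
	return ((addr | size) & mask) == 0;
}
```

A caller would typically obtain the buffer from an allocator that returns page-aligned memory and round the size up to a whole number of pages, so that this check passes before the request is submitted.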
[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm
https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #4 from Andy Furniss ---
I bisected LLVM and it came up with -

ph4[llvm]$ git bisect good
ee17bf3fd4189d1981a6e908b4519e600ec7b002 is the first bad commit
commit ee17bf3fd4189d1981a6e908b4519e600ec7b002
Author: Matt Arsenault
Date:   Fri Jul 25 23:02:42 2014 +0000

    R600/SI: Allow partial unrolling and increase thresholds.

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213985 91177308-0d34-0410-b5e6-96231b3b80d8

I don't know when I'll get to do kernel yet.
[Bug 41762] radeon default power_profile "default" makes laptop overheat (Mobility Radeon HD 3650)
https://bugs.freedesktop.org/show_bug.cgi?id=41762

--- Comment #11 from renich at woralelandia.com ---
I am suffering from the same thing on Fedora 20, even during install.
[Bug 41762] radeon default power_profile "default" makes laptop overheat (Mobility Radeon HD 3650)
https://bugs.freedesktop.org/show_bug.cgi?id=41762

--- Comment #10 from renich at woralelandia.com ---
Created attachment 104077
  --> https://bugs.freedesktop.org/attachment.cgi?id=104077&action=edit
journalctl -b

output of journalctl -b
[Bug 81644] Random crashes on RadeonSI with Chromium.
https://bugs.freedesktop.org/show_bug.cgi?id=81644

--- Comment #41 from jackdachef at gmail.com ---
Created attachment 104076
  --> https://bugs.freedesktop.org/attachment.cgi?id=104076&action=edit
dmesg-output after 25 minute hardware-accelerated html5 video crash, no hardlock this time (Magic SYSRQ works), screen corruption (screen subdivided horizontally into ~18 parts)

Kernel running with drm-next-3.17-rebased-on-fixes applied on top of 3.16-rc6.

Latest commit:

author     Christian König  2014-07-28 11:30:12 (GMT)
committer  Alex Deucher     2014-08-04 21:45:53 (GMT)
commit     fa783807977da98da35590fd1d5efdfd4f33fd59 (patch)
tree       0f1573ae770843228930a0f278a82eb5d482a4c5
parent     5fc6854683aad9ae8b711cbe0d824c11b4aad66c (diff)

drm/radeon: allow userptr write access under certain conditions

Several hours of trying to provoke an X/system lockup with Firefox (hardware acceleration enabled) while opening and viewing large JPG images showed that at least that issue is resolved (Bug #81612).

I then re-tested HTML5 video with hardware acceleration (with hardware acceleration disabled it has seemed stable so far). The funny thing: in each of the last 3 test attempts, X tended to lock up after pretty much exactly 25 minutes.

Reproducer:
- Chromium 38.0.2107.3 (previous versions should also work, but this one has more options disabled, which should rule out other crash/instability triggers)
- youtube.com, keywords: movie trailers 2014
- watch random movie trailers, preferably in 1080p (some are only available in 720p)

Result: the screen content locks up, the mouse is still movable for a short time and sound continues, then the screen turns black. The box did *not* hardlock this time.

This time I made 2 attempts in total to salvage the session via Magic SYSRQ + k: the screen flickered, and after another Magic SYSRQ + k the screen turned on again with the mentioned screen corruption (screen subdivided horizontally into ~18 parts), mostly white and green in the shape of tiles. I took a photo, if needed.

So we got a *clear* improvement: the box does *not* hardlock anymore, the Magic SYSRQ key works again and the screen attempts to recover with Magic SYSRQ + k, but it's not successful yet.

I hope the dmesg information helps with ideas on how to solve this.

I also added the following patchset (patches 2-7) on top of that kernel: https://lkml.org/lkml/2014/8/3/120 ([PATCH 0/7] locking/rwsem: enable reader opt-spinning & writer respin); not sure if that might increase stability.

Cheers
[Bug 81644] Random crashes on RadeonSI with Chromium.
https://bugs.freedesktop.org/show_bug.cgi?id=81644

--- Comment #40 from Alex Deucher ---
(In reply to comment #39)
>
> are the other ways to temporarily disable LLVM for debugging in radeonsi ?

llvm is required for radeonsi.