Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Marek Olšák
On Fri, Jun 23, 2017 at 3:45 PM, axie wrote: > Hi Marek, > > I understand you spent time on your original logic too. I really don't > understand why you talked about pain if somebody can improve it. > > To reduce the pain, now I am seriously considering dropping this patch. But >

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Marek Olšák
On Fri, Jun 23, 2017 at 1:55 PM, Zhou, David(ChunMing) wrote: > > > From: Marek Olšák [mar...@gmail.com] > Sent: Friday, June 23, 2017 6:49 PM > To: Christian König > Cc: Zhou, David(ChunMing); Xie, AlexBin;

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Marek Olšák
On Fri, Jun 23, 2017 at 3:01 PM, Christian König wrote: > The key point here is while optimizing this is nice the much bigger pile is > the locking done for each BO. > > In other words even when we optimize all the other locks involved into > atomics or RCU, the BO

Re: [PATCH 0/9] Visible VRAM Management Improvements

2017-06-23 Thread John Brooks
On Fri, Jun 23, 2017 at 05:02:58PM -0400, Felix Kuehling wrote: > Hi John, > > I haven't read your patches. Just a question based on the cover letter. > > I understand that visible VRAM is the biggest pain point. But could the > same reasoning make sense for invisible VRAM? That is, doing all

RE: [PATCH] drm/amdgpu: Support passing amdgpu critical error to host via GPU Mailbox.

2017-06-23 Thread Deucher, Alexander
> -Original Message- > From: Gavin Wan [mailto:gavin@amd.com] > Sent: Friday, June 23, 2017 5:33 PM > To: dl.gcr.gpu-virtual; brahma_sw_dev; amd-gfx@lists.freedesktop.org > Cc: Wan, Gavin > Subject: [PATCH] drm/amdgpu: Support passing amdgpu critical error to host > via GPU Mailbox. >

Re: [alsa-devel] [PATCH 8/8] ASoC: AMD: Add machine driver for cz rt5650

2017-06-23 Thread Pierre-Louis Bossart
+static const struct acpi_device_id cz_audio_acpi_match[] = { + { "I2SC1002", 0 }, This one goes on my list of _HID that don't follow ACPI/PCI vendorID/PartID conventions. AMD should use the "AMDI" ACPI ID or the 0x1002 PCI ID for the first 4 characters, if everyone does what they feel

Re: [PATCH 0/9] Visible VRAM Management Improvements

2017-06-23 Thread Felix Kuehling
Hi John, I haven't read your patches. Just a question based on the cover letter. I understand that visible VRAM is the biggest pain point. But could the same reasoning make sense for invisible VRAM? That is, doing all the migrations to VRAM in a workqueue? Regards, Felix On 17-06-23 01:39

RE: [PATCH] drm/amdgpu/psp: upper_32_bits/lower_32_bits for address setup

2017-06-23 Thread Li, Samuel
Reviewed-by: Samuel Li Sam > -Original Message- > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf > Of Alex Deucher > Sent: Thursday, June 22, 2017 6:29 PM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander

[PATCH umr v2] Add PCI device based GPU selection with --pci

2017-06-23 Thread Jean-Francois Thibert
This allows selecting the GPU by its PCI device both with and without kernel mode support. The instance is populated automatically so that the proper corresponding debugfs files are used if present. Signed-off-by: Jean-Francois Thibert --- doc/umr.1 | 4 +++

Re: [PATCH 1/2] drm/amdgpu: vm_update_ptes remove code duplication

2017-06-23 Thread Felix Kuehling
Sorry for the delay. The series is Reviewed-by: Felix Kuehling. Regards, Felix On 17-06-13 02:24 PM, Christian König wrote: > On 13.06.2017 at 19:07, Alex Deucher wrote: >> On Fri, Jun 9, 2017 at 5:47 PM, Harish Kasiviswanathan >>

[PATCH 1/9] drm/amdgpu: Separate placements and busy placements

2017-06-23 Thread John Brooks
This allows a BO to have busy placements that are not part of its normal placements. Users that want the busy placements to be the same can change the placement.busy_placement pointer and corresponding count to be the same as the regular placements. Signed-off-by: John Brooks
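
The aliasing described in this commit message can be sketched in plain C. This is only an illustration under stated assumptions: `struct place`, `struct placement` and `use_same_busy_placements()` are hypothetical stand-ins modelled on the `busy_placement` pointer and count named above, not the actual TTM or amdgpu definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one placement entry (memory domain flags). */
struct place {
    unsigned int flags;
};

/* Hypothetical stand-in for a placement descriptor: a normal list plus a
 * separate "busy" list, each with its own count. */
struct placement {
    unsigned int num_placement;
    const struct place *placement;
    unsigned int num_busy_placement;
    const struct place *busy_placement;
};

/* Make the busy placements identical to the regular ones by aliasing the
 * pointer and copying the count, as the commit message suggests. */
void use_same_busy_placements(struct placement *p)
{
    assert(p != NULL);
    p->busy_placement = p->placement;
    p->num_busy_placement = p->num_placement;
}
```

A caller that wants distinct busy placements would simply leave the two pointers referring to different arrays.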

[PATCH 5/9] drm/amdgpu: Track time of last page fault and last CS move in struct amdgpu_bo

2017-06-23 Thread John Brooks
Signed-off-by: John Brooks --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 ++ 3 files changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h

[PATCH 8/9] drm/amdgpu: Asynchronously move BOs to visible VRAM

2017-06-23 Thread John Brooks
Moving CPU-accessible BOs from GTT to visible VRAM reduces latency on the GPU and improves average framerate. However, it's an expensive operation. When visible VRAM is full and evictions are necessary, it can easily take tens of milliseconds. On the CS path, that directly increases the frame time

[PATCH 2/9] drm/amdgpu: Add vis_vramlimit module parameter

2017-06-23 Thread John Brooks
Allow specifying a limit on visible VRAM via a module parameter. This is helpful for testing performance under visible VRAM pressure. Signed-off-by: John Brooks --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4

[PATCH 6/9] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS

2017-06-23 Thread John Brooks
When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace, it should only be treated as a hint to initially place a BO somewhere CPU accessible, rather than having a permanent effect on BO placement. Instead of the flag being set in stone at BO creation, set the flag when a page

[PATCH 4/9] drm/amdgpu: Don't force BOs into visible VRAM if they can go to GTT instead

2017-06-23 Thread John Brooks
amdgpu_ttm_placement_init() callers that are using both VRAM and GTT as domains usually don't want visible VRAM as a busy placement. Signed-off-by: John Brooks --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-)

[PATCH 7/9] drm/amdgpu: Throttle visible VRAM moves separately

2017-06-23 Thread John Brooks
The BO move throttling code is designed to allow VRAM to fill quickly if it is relatively empty. However, this does not take into account situations where the visible VRAM is smaller than total VRAM, and total VRAM may not be close to full but the visible VRAM segment is under pressure. In such

[PATCH 0/9] Visible VRAM Management Improvements

2017-06-23 Thread John Brooks
This patch series is intended to improve performance when limited CPU-visible VRAM is under pressure. Moving BOs into visible VRAM is essentially a housekeeping task. It's faster to access them in VRAM than GTT, but it isn't a hard requirement for them to be in VRAM. As such, it is unnecessary to

[PATCH 3/9] drm/amdgpu: Don't force BOs into visible VRAM for page faults

2017-06-23 Thread John Brooks
There is no need for page faults to force BOs into visible VRAM if it's full, and the time it takes to do so is great enough to cause noticeable stuttering. Add GTT as a possible placement so that if visible VRAM is full, page faults move BOs to GTT instead of evicting other BOs from VRAM.
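
The fallback described here might look roughly like the following C sketch. The enum, the function name and the simple "does it fit" test are all assumptions made for illustration; the real driver expresses this through placement lists rather than an explicit size check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical domain identifiers, for this sketch only. */
enum fault_domain {
    FAULT_DOMAIN_VISIBLE_VRAM,
    FAULT_DOMAIN_GTT
};

/* Choose where a faulting BO should go: prefer visible VRAM when the BO
 * fits in the free space, otherwise fall back to GTT instead of evicting
 * other BOs from VRAM. */
enum fault_domain pick_fault_domain(size_t visible_vram_free, size_t bo_size)
{
    assert(bo_size > 0); /* a zero-sized BO is not meaningful here */
    if (bo_size <= visible_vram_free)
        return FAULT_DOMAIN_VISIBLE_VRAM;
    return FAULT_DOMAIN_GTT;
}
```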

Re: [PATCH 1/8] drm/amd/amdgpu: Added asic_type as ACP DMA driver platform data

2017-06-23 Thread Alex Deucher
On Fri, Jun 23, 2017 at 12:43 PM, Christian König wrote: > On 23.06.2017 at 18:34, Alex Deucher wrote: >> >> From: Vijendar Mukunda >> >> asic_type information is passed to ACP DMA Driver as platform data. >> We need this to determine whether

Re: [PATCH 1/8] drm/amd/amdgpu: Added asic_type as ACP DMA driver platform data

2017-06-23 Thread Christian König
On 23.06.2017 at 18:34, Alex Deucher wrote: From: Vijendar Mukunda asic_type information is passed to ACP DMA Driver as platform data. We need this to determine whether the asic is Carrizo (CZ) or Stoney (ST) in the acp sound driver. Reviewed-by: Alex Deucher

[PATCH 6/8] ASoC: AMD: Buffer related changes for Stoney

2017-06-23 Thread Alex Deucher
From: Vijendar Mukunda Stoney uses 16kb SRAM memory for playback and 16Kb for capture. Modified Max buffer size to have the correct mapping between System Memory and SRAM. Added snd_pcm_hardware structures for playback and capture for Stoney. Reviewed-by: Alex Deucher

[PATCH 4/8] ASoC: AMD: added condition checks for CZ specific code

2017-06-23 Thread Alex Deucher
From: Vijendar Mukunda Added condition checks for CZ specific code based on asic_type. Stoney specific code will be added in a future commit. Reviewed-by: Alex Deucher Signed-off-by: Vijendar Mukunda Signed-off-by:

[PATCH 8/8] ASoC: AMD: Add machine driver for cz rt5650

2017-06-23 Thread Alex Deucher
From: Akshu Agrawal The driver is used for AMD board using rt5650 codec. Reviewed-by: Alex Deucher Signed-off-by: Akshu Agrawal Signed-off-by: Alex Deucher --- sound/soc/amd/Kconfig |

[PATCH 7/8] drm/amd/amdgpu: Disable ACP Power Gating for Stoney platform

2017-06-23 Thread Alex Deucher
From: Vijendar Mukunda Power Gating is disabled in Stoney platform. Reviewed-by: Alex Deucher Signed-off-by: Vijendar Mukunda Signed-off-by: Alex Deucher ---

[PATCH 3/8] drm/amd/amdgpu: Added a dwc quirk for Stoney platform

2017-06-23 Thread Alex Deucher
From: Vijendar Mukunda Added DW_I2S_QUIRK_16BIT_IDX_OVERRIDE quirk for Stoney. The supported format and bus width for the I2S controller are read from the I2S Component Parameter registers. These are read-only registers. For Stoney, I2S Component Parameter registers are programmed

[PATCH 5/8] ASoC: AMD: DMA driver changes for Stoney Platform

2017-06-23 Thread Alex Deucher
From: Vijendar Mukunda Added DMA driver changes for Stoney platform. Below are the key differences between Stoney and CZ: - Memory Gating is disabled - SRAM Banks won't be turned off - No Of SRAM Banks reduced to 6 - DAGB Garlic Interface used - 16 bit resolution is

[PATCH 0/8] Add ASoC support for AMD Stoney APUs

2017-06-23 Thread Alex Deucher
This patch set updates the AMD GPU and Audio CoProcessor (ACP) audio drivers and the designware i2s driver for Stoney (ST). ST is an APU similar to Carrizo (CZ) which already has ACP audio support. The i2s controller and ACP audio DMA engine are part of the GPU and both need updating so I would

[PATCH 2/8] ASoC: dwc: Added a quirk DW_I2S_QUIRK_16BIT_IDX_OVERRIDE to dwc driver

2017-06-23 Thread Alex Deucher
From: Vijendar Mukunda Added quirk DW_I2S_QUIRK_16BIT_IDX_OVERRIDE to Designware driver. This quirk will set idx value to 1. By setting this quirk, it will override supported format as 16 bit resolution and bus width as 2 Bytes. Reviewed-by: Alex Deucher
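
The override behaviour can be sketched in a few lines of C. The quirk name comes from the patch subject; the bit position, the `struct dw_i2s_caps` type and the `dw_apply_idx_quirk()` helper are assumptions invented for this sketch and are not taken from the Designware driver.

```c
#include <assert.h>
#include <stddef.h>

/* Quirk name from the patch subject; the bit position is an assumption. */
#define DW_I2S_QUIRK_16BIT_IDX_OVERRIDE (1u << 2)

/* Hypothetical holder for the index decoded from the component parameter
 * register; the index selects the supported format and bus width. */
struct dw_i2s_caps {
    unsigned int idx;
};

/* With the quirk set, ignore the hardware-reported index and force it to
 * 1, which the commit message maps to 16-bit resolution and a 2-byte bus
 * width. Without the quirk, the decoded index is left untouched. */
void dw_apply_idx_quirk(struct dw_i2s_caps *caps, unsigned int quirks)
{
    assert(caps != NULL);
    if (quirks & DW_I2S_QUIRK_16BIT_IDX_OVERRIDE)
        caps->idx = 1;
}
```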

[PATCH 1/8] drm/amd/amdgpu: Added asic_type as ACP DMA driver platform data

2017-06-23 Thread Alex Deucher
From: Vijendar Mukunda asic_type information is passed to ACP DMA Driver as platform data. We need this to determine whether the asic is Carrizo (CZ) or Stoney (ST) in the acp sound driver. Reviewed-by: Alex Deucher Signed-off-by: Vijendar

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Christian König
On the other hand, after you optimize the BO reservation lock, other locks still need optimization, right? In theory yes, in practice no. There are just so many other things we should tackle before taking care of removing any locks that we will probably never get to that point even with more

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread axie
Hi Christian, I agree with you. On the other hand, after you optimize the BO reservation lock, other locks still need optimization, right? 1. Locking itself is not cheap. 2. Waiting in lock is even more expensive. Thanks, Alex Bin Xie On 2017-06-23 09:01 AM, Christian König wrote: The

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread axie
Hi Marek, I understand you spent time on your original logic too. I really don't understand why you talked about pain if somebody can improve it. To reduce the pain, now I am seriously considering dropping this patch. But please read on before you conclude. Let us treat open source software

Re: [PATCH v2 13/14] drm: stm: remove dead code and pointless local lut storage

2017-06-23 Thread Peter Rosin
On 2017-06-22 13:49, Philippe CORNU wrote: > On 06/22/2017 08:06 AM, Peter Rosin wrote: >> The redundant fb helper .load_lut is no longer used, and can not >> work right without also providing the fb helpers .gamma_set and >> .gamma_get thus rendering the code in this driver suspect. >> > > Hi

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Christian König
The key point here is that, while optimizing this is nice, the much bigger pile is the locking done for each BO. In other words, even when we optimize all the other locks involved into atomics or RCU, the BO reservation lock will still dominate everything. One possible solution to this would be per

RE: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Zhou, David(ChunMing)
From: Marek Olšák [mar...@gmail.com] Sent: Friday, June 23, 2017 6:49 PM To: Christian König Cc: Zhou, David(ChunMing); Xie, AlexBin; amd-gfx@lists.freedesktop.org; Xie, AlexBin Subject: Re: [PATCH 1/3] drm/amdgpu: fix a typo On Fri, Jun 23, 2017 at

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Marek Olšák
I agree with you about the spinlock. You seem to be good at this. It's always good to do measurements to validate that a code change improves something, especially when the code size and code complexity have to be increased. A CPU profiler such as sysprof can show you improvements on the order of

Re: [PATCH] Add PCI device based GPU selection with --pci

2017-06-23 Thread Tom St Denis
Some style/flow issues inline below. On 22/06/17 04:36 PM, Jean-Francois Thibert wrote: This allows selecting the GPU by its PCI device both with and without kernel mode support. The instance is populated automatically so that the proper corresponding debugfs files are used if present.

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Marek Olšák
On Fri, Jun 23, 2017 at 11:27 AM, Christian König wrote: > On 23.06.2017 at 11:08, zhoucm1 wrote: >> >> >> >> On 2017-06-23 17:01, zhoucm1 wrote: >>> >>> >>> >>> On 2017-06-23 16:25, Christian König wrote: On 23.06.2017 at 09:09, zhoucm1 wrote: > >

Re: [Intel-gfx] [PATCH v2 13/14] drm: stm: remove dead code and pointless local lut storage

2017-06-23 Thread Daniel Vetter
On Thu, Jun 22, 2017 at 11:49:34AM +, Philippe CORNU wrote: > > > On 06/22/2017 08:06 AM, Peter Rosin wrote: > > The redundant fb helper .load_lut is no longer used, and can not > > work right without also providing the fb helpers .gamma_set and > > .gamma_get thus rendering the code in this

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Christian König
On 23.06.2017 at 11:08, zhoucm1 wrote: On 2017-06-23 17:01, zhoucm1 wrote: On 2017-06-23 16:25, Christian König wrote: On 23.06.2017 at 09:09, zhoucm1 wrote: On 2017-06-23 14:57, Christian König wrote: But giving the CS IOCTL an option for directly specifying the BOs instead of a

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread zhoucm1
On 2017-06-23 17:01, zhoucm1 wrote: On 2017-06-23 16:25, Christian König wrote: On 23.06.2017 at 09:09, zhoucm1 wrote: On 2017-06-23 14:57, Christian König wrote: But giving the CS IOCTL an option for directly specifying the BOs instead of a BO list like Marek suggested would indeed

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread zhoucm1
On 2017-06-23 16:25, Christian König wrote: On 23.06.2017 at 09:09, zhoucm1 wrote: On 2017-06-23 14:57, Christian König wrote: But giving the CS IOCTL an option for directly specifying the BOs instead of a BO list like Marek suggested would indeed save us some time here. interesting,

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread zhoucm1
On 2017-06-23 16:25, Christian König wrote: On 23.06.2017 at 09:09, zhoucm1 wrote: On 2017-06-23 14:57, Christian König wrote: But giving the CS IOCTL an option for directly specifying the BOs instead of a BO list like Marek suggested would indeed save us some time here. interesting,

Re: [PATCH 01/11] drm/fb-helper: do a generic fb_setcmap helper in terms of crtc .gamma_set

2017-06-23 Thread Daniel Vetter
On Thu, Jun 22, 2017 at 10:48:10AM +0200, Peter Rosin wrote: > On 2017-06-22 08:36, Daniel Vetter wrote: > > On Wed, Jun 21, 2017 at 11:40:52AM +0200, Peter Rosin wrote: > >> On 2017-06-21 09:38, Daniel Vetter wrote: > >>> On Tue, Jun 20, 2017 at 09:25:25PM +0200, Peter Rosin wrote: > This

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Christian König
On 23.06.2017 at 09:09, zhoucm1 wrote: On 2017-06-23 14:57, Christian König wrote: But giving the CS IOCTL an option for directly specifying the BOs instead of a BO list like Marek suggested would indeed save us some time here. interesting, I always follow how to improve our cs ioctl,

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread zhoucm1
On 2017-06-23 14:57, Christian König wrote: But giving the CS IOCTL an option for directly specifying the BOs instead of a BO list like Marek suggested would indeed save us some time here. interesting, I always follow how to improve our cs ioctl, since UMD guys often complain our command

Re: [PATCH 1/3] drm/amdgpu: fix a typo

2017-06-23 Thread Christian König
Hi Alex, actually Marek is right, command submission is actually not much of a bottleneck to us because it is handled from a separate userspace thread. So those micro optimizations you do here on CPU cycles are actually rather superfluous. But giving the CS IOCTL an option for directly