[PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers

2018-04-11 Thread Sinan Kaya
The code expects to observe the same number of buffers returned from
dma_map_sg() as it passed to sg_alloc_table_from_pages(). This
doesn't hold true universally, especially for systems with an IOMMU.

An IOMMU driver combines buffers into a single DMA address as much
as it can. The right fix is to tell the DMA layer how much combining
the IOMMU can do.

Signed-off-by: Sinan Kaya 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 1 +
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 1 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 1 +
 4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 8e28270..1b031eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -851,7 +851,7 @@ static int gmc_v6_0_sw_init(void *handle)
pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32));
dev_warn(adev->dev, "amdgpu: No coherent DMA available.\n");
}
-
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
r = gmc_v6_0_init_microcode(adev);
if (r) {
dev_err(adev->dev, "Failed to load mc firmware!\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 86e9d682..0a4b2cc1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -999,6 +999,7 @@ static int gmc_v7_0_sw_init(void *handle)
pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32));
pr_warn("amdgpu: No coherent DMA available\n");
}
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
 
r = gmc_v7_0_init_microcode(adev);
if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 9a813d8..b171529 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1096,6 +1096,7 @@ static int gmc_v8_0_sw_init(void *handle)
pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32));
pr_warn("amdgpu: No coherent DMA available\n");
}
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
 
r = gmc_v8_0_init_microcode(adev);
if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 3b7e7af..36e658ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -855,6 +855,7 @@ static int gmc_v9_0_sw_init(void *handle)
pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32));
printk(KERN_WARNING "amdgpu: No coherent DMA available.\n");
}
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
 
r = gmc_v9_0_mc_init(adev);
if (r)
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH xf86-video-amdgpu 1/5] Add functions for changing CRTC color management properties

2018-04-11 Thread Michel Dänzer
On 2018-04-10 08:02 PM, Leo Li wrote:
> On 2018-04-09 11:03 AM, Michel Dänzer wrote:
>> On 2018-03-26 10:00 PM, sunpeng...@amd.com wrote:
>>> From: "Leo (Sunpeng) Li" 
>>>
>>> This change adds a few functions in preparation for enabling CRTC color
>>> management via the RandR interface.
>>>
>>> The driver-private CRTC object now contains a list of properties,
>>> mirroring the driver-private output object. The lifecycle of the CRTC
>>> properties will also mirror the output.
>>>
>>> Since color management properties are all DRM blobs, we'll expose the
>>> ability to change the blob ID. The user can create blobs via libdrm
>>> (which can be done without ownership of DRM master), then set the ID via
>>> xrandr. The user will then have to ensure proper cleanup by subsequently
>>> releasing the blob.
>>
>> That sounds a bit clunky. :)
>>
>> When changing a blob ID, the change only takes effect on the next atomic
>> commit, doesn't it? How does the client trigger the atomic commit?
> 
> From the perspective of a client that wishes to change a property, the
> process between regular properties and blob properties should be
> essentially the same. Both will trigger an atomic commit when the DRM
> set property ioctl is called from our DDX driver.

That doesn't sound right — if every set property ioctl call implicitly
triggered an atomic commit, the KMS atomic API wouldn't be "atomic" at all.

Do you mean that the DDX driver triggers an atomic commit on each
property change? How is that done?


> The only difference is that DRM property blobs can be arbitrary in size,
> and need to be passed by reference through their DRM-defined blob IDs.
> Because of this, the client has to create the blob, save its ID, call
> libXrandr to change it, then destroy the blob after it's been committed.
> 
> The client has to call libXrandr due to DRM permissions. IIRC, there can
> be only one DRM master. And since xserver is DRM master, an external
> application cannot set DRM properties unless it goes through X. However,
> creating and destroying DRM property blobs can be done by anyone.
> 
> Was this the source of the clunkiness? I've thought about having DDX
> create and destroy the blob instead, but that needs an interface for the
> client to get arbitrarily sized data to DDX. I'm not aware of any good
> ways to do so. Don't think the kernel can do this for us either. It does
> create the blob for legacy gamma, but that's because there's a dedicated
> ioctl for it.

Maybe there are better approaches than exposing these properties to
clients directly. See also my latest follow-up on patch 3.


>>> @@ -1604,6 +1623,18 @@ static void drmmode_output_dpms(xf86OutputPtr
>>> output, int mode)
>>>   }
>>>   }
>>>   +static Bool drmmode_crtc_property_ignore(drmModePropertyPtr prop)
>>> +{
>>> +    if (!prop)
>>> +    return TRUE;
>>> +    /* Ignore CRTC gamma lut sizes */
>>> +    if (!strcmp(prop->name, "GAMMA_LUT_SIZE") ||
>>> +    !strcmp(prop->name, "DEGAMMA_LUT_SIZE"))
>>> +    return TRUE;
>>
>> Without these properties, how can a client know the LUT sizes?
> 
> Good point, I originally thought the sizes were fixed and did not need
> exposing. But who knows if they may change, or even differ per ASIC.

AFAIK at least Intel hardware already has different sizes in different
hardware generations. We should avoid creating an API which couldn't
also work for the modesetting driver and generic clients.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH xf86-video-amdgpu 3/5] Keep CRTC properties consistent

2018-04-11 Thread Michel Dänzer
On 2018-04-10 08:02 PM, Leo Li wrote:
> On 2018-04-09 11:03 AM, Michel Dänzer wrote:
>> On 2018-03-26 10:00 PM, sunpeng...@amd.com wrote:
>>> From: "Leo (Sunpeng) Li" 
>>>
>>> In cases where CRTC properties are updated without going through
>>> RRChangeOutputProperty, we don't update the properties in user land.
>>>
>>> Consider setting legacy gamma. It doesn't go through
>>> RRChangeOutputProperty, but modifies the CRTC's color management
>>> properties. Unless they are updated, the user properties will remain
>>> stale.
>>
>> Can you describe a bit more how the legacy gamma and the new properties
>> interact?
>>
> 
> Sure thing, I'll include this in the message for v2:
> 
> In kernel, the legacy set gamma interface is essentially an adapter to
> the non-legacy set properties interface. In the end, they both set the
> same property to a DRM property blob, which contains the gamma lookup
> table. The key difference between them is how this blob is created.
> 
> For legacy gamma, the kernel takes 3 arrays from user-land and creates
> the blob using them. Note that a blob is identified by its blob_id.
> 
> For non-legacy gamma, the kernel takes a blob_id from user-land that
> references the blob. This means user-land is responsible for creating
> the blob.
> 
> From the perspective of RandR, this presents some problems. Since both
> paths modify the same property, RandR must keep the reported property
> value up-to-date with whichever path is used:
> 
> 1. Legacy gamma via
> xrandr --output  --gamma x:x:x
> 2. Non-legacy color properties via
> xrandr --output  --set GAMMA_LUT 
> 
> Keeping the value up-to-date isn't a problem for 2, since RandR updates
> it for us as part of changing output properties.
> 
> But if 1 is used, the property blob is created within kernel, and RandR
> is unaware of the new blob_id. To update it, we need to ask kernel about
> it.
> 
> --- continue with rest of message ---
>>
>>> Therefore, add a function to update user CRTC properties by querying
>>> DRM,
>>> and call it whenever legacy gamma is changed.
>>
>> Note that drmmode_crtc_gamma_do_set is called from
>> drmmode_set_mode_major, i.e. on every modeset or under some
>> circumstances when a DRI3 client stops page flipping.
>>
> 
> The property will have to be updated each time the legacy set gamma
> ioctl is called, since a new blob (with a new blob_id) is created each
> time.
> 
> Not sure if this is a good idea, but perhaps we can have a flag that
> explicitly enable one or the other, depending on user preference? A
> user-only property with something like:
> 
> 0: Use legacy gamma, calls to change non-legacy properties are ignored.
> 1: Use non-legacy, calls to legacy gamma will be ignored.
> 
> On 0, we can remove/disable all non-legacy properties from the property
> list, and avoid having to update them. On 1, we'll enable the
> properties, and won't have to update them either since legacy gamma is
> "disabled". It has the added benefit of avoiding unexpected legacy gamma
> sets when using non-legacy, and vice versa.

Hmm. So either legacy or non-legacy clients won't work at all, or
they'll step on each other's toes, clobbering the HW gamma LUT from each
other.

I'm afraid neither of those alternatives sound like a good user
experience to me.

Consider on the one hand something like Night Light / redshift, using
legacy APIs to adjust colour temperature to the time of day. On the
other hand, another client using the non-legacy API for say fine-tuning
of a display's advanced gamma capabilities.

Ideally, in this case the gamma LUTs from the legacy and non-legacy APIs
should be combined, such that the hardware LUT reflects both the colour
temperature set by Night Light / refshift and the fine-tuning set by the
non-legacy client. Is that feasible? If not, can you explain a little why?




Re: [PATCH xf86-video-amdgpu 2/5] Hook up CRTC color management functions

2018-04-11 Thread Michel Dänzer
On 2018-04-10 08:02 PM, Leo Li wrote:
> On 2018-04-09 11:03 AM, Michel Dänzer wrote:
>> On 2018-03-26 10:00 PM, sunpeng...@amd.com wrote:
>>> From: "Leo (Sunpeng) Li" 
>>>
>>> The functions insert into the output resource creation, and property
>>> change functions. CRTC destroy is also hooked-up for proper cleanup of
>>> the CRTC property list.
>>>
>>> Signed-off-by: Leo (Sunpeng) Li 
>>
>> [...]
>>
>>> @@ -1933,6 +1933,9 @@ static void
>>> drmmode_output_create_resources(xf86OutputPtr output)
>>>   }
>>>   }
>>>   }
>>> +
>>> +    if (output->crtc)
>>> +    drmmode_crtc_create_resources(output->crtc, output);
>>
>> output->crtc is only non-NULL here for outputs which are enabled at Xorg
>> startup; other outputs won't have the new properties.
> 
> Is it necessary to have the CRTC properties on a output if the CRTC is
> disabled for that output?

The set of properties exposed by an output should be constant throughout
its lifetime.


> I've tested hot-plugging with this, and the properties do initialize on
> hot-plug.

I suspect you were testing something like DP MST, where the output is
created dynamically on hot-plug. For "static" outputs such as
DVI/HDMI/VGA/LVDS (and "normal" (e)DP), drmmode_output_create_resources
is only called once during Xorg initialization, and output->crtc is NULL
at that point unless the output is enabled from the beginning (which is
only the case if either the output is connected at that point, or is
explicitly configured in xorg.conf to be on).




Re: [PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers

2018-04-11 Thread Christian König

On 2018-04-11 08:26, Huang Rui wrote:

On Tue, Apr 10, 2018 at 04:59:55PM -0400, Sinan Kaya wrote:

The code expects to observe the same number of buffers returned from
dma_map_sg() as it passed to sg_alloc_table_from_pages(). This
doesn't hold true universally, especially for systems with an IOMMU.

An IOMMU driver combines buffers into a single DMA address as much
as it can. The right fix is to tell the DMA layer how much combining
the IOMMU can do.

Signed-off-by: Sinan Kaya 

Sinan, I guess Christian's suggestion is to add it in amdgpu_device_init,
like here:


Yes, exactly. Looks like Sinan just tried to place the call next to 
pci_set_dma_mask()/pci_set_consistent_dma_mask().


Not necessarily a bad idea, but in this case not optimal.

Sinan, please refine your patch as Rui suggested and resend.

Thanks,
Christian.



8<---
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 0e798b3..9b96771 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2339,6 +2339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 /* init the mode config */
 drm_mode_config_init(adev->ddev);

+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
+
 r = amdgpu_device_ip_init(adev);
 if (r) {
 /* failed in exclusive mode due to timeout */
8<---

After that, we don't need to do it for each generation.

Thanks,
Ray
___
dri-devel mailing list
dri-de...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel




Re: [PATCH 3/3] drm/amdgpu: set preferred_domain independent of fallback handling

2018-04-11 Thread Christian König

On 2018-04-11 04:38, zhoucm1 wrote:



On 2018-04-10 21:42, Christian König wrote:

When GEM needs to fall back to GTT for VRAM BOs, we still want the
preferred domain to be untouched so that the BO has a chance to move back
to VRAM in the future.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 14 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 13 ++---
  2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c

index 46b9ea4e6103..9dc0a190413c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -47,6 +47,7 @@ int amdgpu_gem_object_create(struct amdgpu_device 
*adev, unsigned long size,

   struct reservation_object *resv,
   struct drm_gem_object **obj)
  {
+    uint32_t domain = initial_domain;
  struct amdgpu_bo *bo;
  int r;
  @@ -57,7 +58,7 @@ int amdgpu_gem_object_create(struct amdgpu_device 
*adev, unsigned long size,

  }
    retry:
-    r = amdgpu_bo_create(adev, size, alignment, initial_domain,
+    r = amdgpu_bo_create(adev, size, alignment, domain,
   flags, type, resv, &bo);
  if (r) {
  if (r != -ERESTARTSYS) {
@@ -66,8 +67,8 @@ int amdgpu_gem_object_create(struct amdgpu_device 
*adev, unsigned long size,

  goto retry;
  }
  -    if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
-    initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
+    if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
+    domain |= AMDGPU_GEM_DOMAIN_GTT;
  goto retry;
  }
  DRM_DEBUG("Failed to allocate GEM object (%ld, %d, %u, 
%d)\n",
@@ -75,6 +76,13 @@ int amdgpu_gem_object_create(struct amdgpu_device 
*adev, unsigned long size,

  }
  return r;
  }
+
+    bo->preferred_domains = initial_domain;
+    bo->allowed_domains = initial_domain;
+    if (type != ttm_bo_type_kernel &&
+    bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
+    bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
It's an ugly change back after BO creation! Do you think it's better
than the previous approach?


Well it's better than moving the fallback handling into the core 
functions and then adding a flag to disable it again because we don't 
want it in the core function.




Alternative way, you can add one parameter to amdgpu_bo_create, 
directly pass preferred_domains and allowed_domains to amdgpu_bo_create.


Then we would need three parameters: one for where to create the BO now,
and two for the preferred/allowed domains.


I think that in this case overwriting the placement from the initial 
value looks cleaner to me.


Regards,
Christian.



Regards,
David Zhou

+
  *obj = &bo->gem_base;
    return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index 6d08cde8443c..854d18d5cff4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -375,17 +375,8 @@ static int amdgpu_bo_do_create(struct 
amdgpu_device *adev, unsigned long size,

  drm_gem_private_object_init(adev->ddev, &bo->gem_base, size);
  INIT_LIST_HEAD(&bo->shadow_list);
  INIT_LIST_HEAD(&bo->va);
-    bo->preferred_domains = domain & (AMDGPU_GEM_DOMAIN_VRAM |
- AMDGPU_GEM_DOMAIN_GTT |
- AMDGPU_GEM_DOMAIN_CPU |
- AMDGPU_GEM_DOMAIN_GDS |
- AMDGPU_GEM_DOMAIN_GWS |
- AMDGPU_GEM_DOMAIN_OA);
-    bo->allowed_domains = bo->preferred_domains;
-    if (type != ttm_bo_type_kernel &&
-    bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
-    bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
-
+    bo->preferred_domains = domain;
+    bo->allowed_domains = domain;
  bo->flags = flags;
    #ifdef CONFIG_X86_32






Re: [PATCHv2] drm/amdkfd: Remove vla

2018-04-11 Thread Christian König

On 2018-04-11 03:02, Laura Abbott wrote:

There's an ongoing effort to remove VLAs[1] from the kernel to eventually
turn on -Wvla. Switch to a constant value that covers all hardware.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Laura Abbott 


It would be nicer to have a define for that somewhere.

Anyway the patch is Acked-by: Christian König  
for now.


Regards,
Christian.


---
v2: Switch to a larger size to account for other hardware
---
  drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
index 035c351f47c5..c3a5a80e31ae 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
@@ -139,10 +139,12 @@ static void interrupt_wq(struct work_struct *work)
  {
struct kfd_dev *dev = container_of(work, struct kfd_dev,
interrupt_work);
+   uint32_t ih_ring_entry[8];
  
-	uint32_t ih_ring_entry[DIV_ROUND_UP(

-   dev->device_info->ih_ring_entry_size,
-   sizeof(uint32_t))];
+   if (dev->device_info->ih_ring_entry_size > (8 * sizeof(uint32_t))) {
+   dev_err(kfd_chardev(), "Ring entry too small\n");
+   return;
+   }
  
  	while (dequeue_ih_ring_entry(dev, ih_ring_entry))

dev->device_info->event_interrupt_class->interrupt_wq(dev,




Re: RFC for a render API to support adaptive sync and VRR

2018-04-11 Thread Michel Dänzer
On 2018-04-11 08:57 AM, Nicolai Hähnle wrote:
> On 10.04.2018 23:45, Cyr, Aric wrote:
> For video games we have a similar situation where a frame is
> rendered
> for a certain world time and in the ideal case we would actually
> display the frame at this world time.

 That seems like it would be a poorly written game that flips like
 that, unless they are explicitly trying to throttle the
 framerate for
 some reason.  When a game presents a completed frame, they’d like
 that to happen as soon as possible.
>>>
>>> What you're describing is what most games have been doing
>>> traditionally.
>>> Croteam's research shows that this results in micro-stuttering,
>>> because
>>> frames may be presented too early. To avoid that, they want to
>>> explicitly time each presentation as described by Christian.
>>
>> Yes, I agree completely.  However that's only truly relevant for
>> fixed
>> refreshed rate displays.
>
> No, it also affects variable refresh; possibly even more in some
> cases,
> because the presentation time is less predictable.

 Yes, and that's why you don't want to do it when you have variable
 refresh.  The hardware in the monitor and GPU will do it for you,
>>> so why bother?
>>>
>>> I think Michel's point is that the monitor and GPU hardware *cannot*
>>> really do this, because there's synchronization with audio to take into
>>> account, which the GPU or monitor don't know about.
>>
>> How does it work fine today, given that all the kernel seems to know is
>> 'current' or 'current+1' vsyncs?
>> Presumably the applications somehow schedule all this just fine.
>> If this works without variable refresh for 60Hz, will it not work for
>> a fixed-rate "48Hz" monitor (assuming a 24Hz video)?
> 
> You're right. I guess a better way to state the point is that it
> *doesn't* really work today with fixed refresh, but if we're going to
> introduce a new API, then why not do so in a way that can fix these
> additional problems as well?

Exactly. With a fixed frame duration, we'll still have fundamentally the
same issues as we currently do without variable refresh, not making use
of the full potential of variable refresh.


> Say you have a multi-GPU system, and each GPU has multiple displays
> attached, and a single application is driving them all. The application
> queues flips for all displays with the same target_present_time_ns
> attribute. Starting at some time T, the application simply asks for the
> same present time T + i * 1667 (or whatever) for frame i from all
> displays.

BTW, this is an interesting side point I've wanted to make: Any
applications / use cases which really do want a fixed refresh rate can
trivially do it with time-based presentation like this.


> Of course it's to be expected that some (or all) of the displays will
> not be able to hit the target time on the first bunch of flips due to
> hardware limitations, but as long as the range of supported frame times
> is wide enough, I'd expect all of them to drift towards presenting at
> the correct time eventually, even across multiple GPUs, with this simple
> scheme.
> 
> Why would that not work to sync up all displays almost perfectly?

Seconded.


> How about what I wrote in an earlier mail of having attributes:
> 
> - target_present_time_ns
> - hint_frame_time_ns (optional)
> 
> ... and if a video player set both, the driver could still do the
> optimizations you've explained?

FWIW, I don't think a property would be a good mechanism for the target
presentation time.

At least with VDPAU, video players are already explicitly specifying the
target presentation time, so no changes should be required at that
level. Don't know about other video APIs.

The X11 Present extension protocol is also prepared for specifying the
target presentation time already, the support for it just needs to be
implemented.




Re: RFC for a render API to support adaptive sync and VRR

2018-04-11 Thread Michel Dänzer
On 2018-04-10 07:25 PM, Cyr, Aric wrote:
>> From: Michel Dänzer [mailto:mic...@daenzer.net]
>> On 2018-04-10 07:13 PM, Cyr, Aric wrote:
 From: Michel Dänzer [mailto:mic...@daenzer.net]
 On 2018-04-10 06:26 PM, Cyr, Aric wrote:
> From: Koenig, Christian Sent: Tuesday, April 10, 2018 11:43
>
>> For video games we have a similar situation where a frame is rendered
>> for a certain world time and in the ideal case we would actually
>> display the frame at this world time.
>
> That seems like it would be a poorly written game that flips like
> that, unless they are explicitly trying to throttle the framerate for
> some reason.  When a game presents a completed frame, they’d like
> that to happen as soon as possible.

 What you're describing is what most games have been doing traditionally.
 Croteam's research shows that this results in micro-stuttering, because
 frames may be presented too early. To avoid that, they want to
 explicitly time each presentation as described by Christian.
>>>
>>> Yes, I agree completely.  However that's only truly relevant for fixed
>>> refreshed rate displays.
>>
>> No, it also affects variable refresh; possibly even more in some cases,
>> because the presentation time is less predictable.
> 
> Yes, and that's why you don't want to do it when you have variable refresh.  
> The hardware in the monitor and GPU will do it for you, so why bother?
> The input to their algorithms will be noisy, causing worse estimations.  If 
> you just present as fast as you can, it'll just work (within reason).

If a frame is presented earlier than the time corresponding to the state
of the world as displayed in the frame, it results in stutter, just as
when it's presented too late.


> The majority of gamers want maximum FPS for their games, and there's quite 
> frequently outrage at a particular game when they are limited to something 
> lower that what their monitor could otherwise support (i.e. I don't want my 
> game limited to 30Hz if I have a shiny 144Hz gaming display I paid good money 
> for).

That doesn't (have to) happen.


See
https://www.gdcvault.com/play/1025407/Advanced-Graphics-Techniques-Tutorial-The
for Croteam's talk about this at this year's GDC. It says the best API
available so far is the Vulkan extension VK_GOOGLE_display_timing, which
(among other things) allows specifying the earliest desired presentation
time via VkPresentTimeGOOGLE::desiredPresentTime . (The talk also
mentions that they previously experimented with VDPAU, because it allows
specifying the target presentation time)




[PATCH] drm/amd/pp: Clear smu response register before send smu message

2018-04-11 Thread Rex Zhu
The SMU firmware does not update the response register immediately when
handling some delayed tasks, so we may read out a stale value left over
from the previous message.

So clear the register first.

Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c   | 4 +---
 drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 1 +
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c
index 2a93f3a..2d4ec8a 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c
@@ -208,9 +208,7 @@ static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, 
uint16_t msg)
 {
int ret;
 
-   if (!ci_is_smc_ram_running(hwmgr))
-   return -EINVAL;
-
+   cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0);
cgs_write_register(hwmgr->device, mmSMC_MESSAGE_0, msg);
 
PHM_WAIT_FIELD_UNEQUAL(hwmgr, SMC_RESP_0, SMC_RESP, 0);
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index 10a1123..64d33b7 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -176,6 +176,7 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
msg)
else if (ret != 1)
pr_info("\n last message was failed ret is %d\n", ret);
 
+   cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0);
cgs_write_register(hwmgr->device, mmSMC_MESSAGE_0, msg);
 
PHM_WAIT_FIELD_UNEQUAL(hwmgr, SMC_RESP_0, SMC_RESP, 0);
-- 
1.9.1



Re: [PATCH] drm/amd/amdgpu: Fix amdgpu_debugfs_gpr_read debugfs entry

2018-04-11 Thread Tom St Denis

Ping?

On 04/09/2018 08:16 AM, Tom St Denis wrote:

We don't need to check the alignment of the offset, and there was
potentially a buffer overflow as well.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index c98e59721444..b1ea38e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -507,6 +507,9 @@ static ssize_t amdgpu_debugfs_wave_read(struct file *f, 
char __user *buf,
return result;
  }
  
+// read at most 1024 words

+#define AMDGPU_DEBUGFS_MAX_SGPR_READ 1024
+
  static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
  {
@@ -515,7 +518,7 @@ static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char 
__user *buf,
ssize_t result = 0;
uint32_t offset, se, sh, cu, wave, simd, thread, bank, *data;
  
-	if (size & 3 || *pos & 3)

+   if (size & 3 || size > (4 * AMDGPU_DEBUGFS_MAX_SGPR_READ))
return -EINVAL;
  
  	/* decode offset */

@@ -528,7 +531,8 @@ static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char 
__user *buf,
thread = (*pos & GENMASK_ULL(59, 52)) >> 52;
bank = (*pos & GENMASK_ULL(61, 60)) >> 60;
  
-	data = kmalloc_array(1024, sizeof(*data), GFP_KERNEL);

+   data = kmalloc_array(AMDGPU_DEBUGFS_MAX_SGPR_READ, sizeof(*data),
+GFP_KERNEL);
if (!data)
return -ENOMEM;
  




Re: [PATCH] drm/amd/amdgpu: Fix amdgpu_debugfs_gpr_read debugfs entry

2018-04-11 Thread Christian König


  -    if (size & 3 || *pos & 3)
+    if (size & 3 || size > (4 * AMDGPU_DEBUGFS_MAX_SGPR_READ)) 


I think checking the position alignment here is still necessary, since
we can't read from non-dword boundaries, can we?


Christian.

On 2018-04-11 13:55, Tom St Denis wrote:

Ping?

On 04/09/2018 08:16 AM, Tom St Denis wrote:

We don't need to check the alignment of the offset, and there was
potentially a buffer overflow as well.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c

index c98e59721444..b1ea38e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -507,6 +507,9 @@ static ssize_t amdgpu_debugfs_wave_read(struct 
file *f, char __user *buf,

  return result;
  }
  +// read at most 1024 words
+#define AMDGPU_DEBUGFS_MAX_SGPR_READ 1024
+
  static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char __user 
*buf,

  size_t size, loff_t *pos)
  {
@@ -515,7 +518,7 @@ static ssize_t amdgpu_debugfs_gpr_read(struct 
file *f, char __user *buf,

  ssize_t result = 0;
  uint32_t offset, se, sh, cu, wave, simd, thread, bank, *data;
  -    if (size & 3 || *pos & 3)
+    if (size & 3 || size > (4 * AMDGPU_DEBUGFS_MAX_SGPR_READ))
  return -EINVAL;
    /* decode offset */
@@ -528,7 +531,8 @@ static ssize_t amdgpu_debugfs_gpr_read(struct 
file *f, char __user *buf,

  thread = (*pos & GENMASK_ULL(59, 52)) >> 52;
  bank = (*pos & GENMASK_ULL(61, 60)) >> 60;
  -    data = kmalloc_array(1024, sizeof(*data), GFP_KERNEL);
+    data = kmalloc_array(AMDGPU_DEBUGFS_MAX_SGPR_READ, sizeof(*data),
+ GFP_KERNEL);
  if (!data)
  return -ENOMEM;




Re: [PATCH] drm/amd/amdgpu: Fix amdgpu_debugfs_gpr_read debugfs entry

2018-04-11 Thread Tom St Denis



On 04/11/2018 07:58 AM, Christian König wrote:


  -    if (size & 3 || *pos & 3)
+    if (size & 3 || size > (4 * AMDGPU_DEBUGFS_MAX_SGPR_READ)) 


I think checking the position alignment here is still necessary, because 
we can't read from non-dword boundaries, can we?


The index is a dword index as fed into SQ_IND_INDEX (offset => start => 
regno as you trace from the debugfs entry to gfx_v8_0_read_wave_sgprs to 
wave_read_regs (or analogues...)).


SQ_IND_INDEX doesn't take a byte offset but a dword offset, for instance:

include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP0   0x026c
include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP1   0x026d
include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP2   0x026e



The current way for instance would prohibit reading (directly) 
SQ_WAVE_TTMP1.


I agree it's not really how a typical file device works but it's not a 
typical file device :-).  It's assumed every read would be preceded by a 
seek to set the higher order bits anyways.


Cheers,
Tom




Christian.

On 11.04.2018 at 13:55, Tom St Denis wrote:

Ping?

On 04/09/2018 08:16 AM, Tom St Denis wrote:

We don't need to check the alignment of the offset and there was
potentially a buffer overflow as well.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c

index c98e59721444..b1ea38e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -507,6 +507,9 @@ static ssize_t amdgpu_debugfs_wave_read(struct 
file *f, char __user *buf,

  return result;
  }
  +// read at most 1024 words
+#define AMDGPU_DEBUGFS_MAX_SGPR_READ 1024
+
  static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char __user 
*buf,

  size_t size, loff_t *pos)
  {
@@ -515,7 +518,7 @@ static ssize_t amdgpu_debugfs_gpr_read(struct 
file *f, char __user *buf,

  ssize_t result = 0;
  uint32_t offset, se, sh, cu, wave, simd, thread, bank, *data;
  -    if (size & 3 || *pos & 3)
+    if (size & 3 || size > (4 * AMDGPU_DEBUGFS_MAX_SGPR_READ))
  return -EINVAL;
    /* decode offset */
@@ -528,7 +531,8 @@ static ssize_t amdgpu_debugfs_gpr_read(struct 
file *f, char __user *buf,

  thread = (*pos & GENMASK_ULL(59, 52)) >> 52;
  bank = (*pos & GENMASK_ULL(61, 60)) >> 60;
  -    data = kmalloc_array(1024, sizeof(*data), GFP_KERNEL);
+    data = kmalloc_array(AMDGPU_DEBUGFS_MAX_SGPR_READ, sizeof(*data),
+ GFP_KERNEL);
  if (!data)
  return -ENOMEM;




Re: [PATCH] drm/amd/amdgpu: Fix amdgpu_debugfs_gpr_read debugfs entry

2018-04-11 Thread Christian König

On 11.04.2018 at 14:03, Tom St Denis wrote:



On 04/11/2018 07:58 AM, Christian König wrote:


  -    if (size & 3 || *pos & 3)
+    if (size & 3 || size > (4 * AMDGPU_DEBUGFS_MAX_SGPR_READ)) 


I think checking the position alignment here is still necessary, 
because we can't read from non-dword boundaries, can we?


The index is a dword index as fed into SQ_IND_INDEX (offset => start 
=> regno as you trace from the debugfs entry to 
gfx_v8_0_read_wave_sgprs to wave_read_regs (or analogues...)).


SQ_IND_INDEX doesn't take a byte offset but a dword offset, for instance:

include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP0 0x026c
include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP1 0x026d
include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP2 0x026e


The current way for instance would prohibit reading (directly) 
SQ_WAVE_TTMP1.


I agree it's not really how a typical file device works but it's not a 
typical file device :-).  It's assumed every read would be preceded by 
a seek to set the higher order bits anyways.


So pos=0 is one dw and pos=1 is another one? If that's the case then 
that would be a bug since pos is by definition a byte offset.


I suggest to keep the limitation and instead fix how pos is interpreted 
instead.


Regards,
Christian.



Cheers,
Tom




Christian.

On 11.04.2018 at 13:55, Tom St Denis wrote:

Ping?

On 04/09/2018 08:16 AM, Tom St Denis wrote:

We don't need to check the alignment of the offset and there was
potentially a buffer overflow as well.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c

index c98e59721444..b1ea38e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -507,6 +507,9 @@ static ssize_t amdgpu_debugfs_wave_read(struct 
file *f, char __user *buf,

  return result;
  }
  +// read at most 1024 words
+#define AMDGPU_DEBUGFS_MAX_SGPR_READ 1024
+
  static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char 
__user *buf,

  size_t size, loff_t *pos)
  {
@@ -515,7 +518,7 @@ static ssize_t amdgpu_debugfs_gpr_read(struct 
file *f, char __user *buf,

  ssize_t result = 0;
  uint32_t offset, se, sh, cu, wave, simd, thread, bank, *data;
  -    if (size & 3 || *pos & 3)
+    if (size & 3 || size > (4 * AMDGPU_DEBUGFS_MAX_SGPR_READ))
  return -EINVAL;
    /* decode offset */
@@ -528,7 +531,8 @@ static ssize_t amdgpu_debugfs_gpr_read(struct 
file *f, char __user *buf,

  thread = (*pos & GENMASK_ULL(59, 52)) >> 52;
  bank = (*pos & GENMASK_ULL(61, 60)) >> 60;
  -    data = kmalloc_array(1024, sizeof(*data), GFP_KERNEL);
+    data = kmalloc_array(AMDGPU_DEBUGFS_MAX_SGPR_READ, sizeof(*data),
+ GFP_KERNEL);
  if (!data)
  return -ENOMEM;




[PATCH] drm/amd/pp: Fix logic error in smu7_check_dpm_table_updated

2018-04-11 Thread Rex Zhu
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 68b9e0b..31d271f 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -4684,7 +4684,7 @@ static void smu7_check_dpm_table_updated(struct pp_hwmgr 
*hwmgr)
for (i=0; i < dep_table->count; i++) {
if (dep_table->entries[i].vddc != 
odn_dep_table->entries[i].vddc) {
data->need_update_smu7_dpm_table |= 
DPMTABLE_OD_UPDATE_VDDC;
-   break;
+   return;
}
}
if (i == dep_table->count)
@@ -4695,7 +4695,7 @@ static void smu7_check_dpm_table_updated(struct pp_hwmgr 
*hwmgr)
for (i=0; i < dep_table->count; i++) {
if (dep_table->entries[i].vddc != 
odn_dep_table->entries[i].vddc) {
data->need_update_smu7_dpm_table |= 
DPMTABLE_OD_UPDATE_VDDC;
-   break;
+   return;
}
}
if (i == dep_table->count)
-- 
1.9.1



[PATCH] drm/amdgpu: print DMA-buf status in debugfs

2018-04-11 Thread Christian König
Just note if a BO was imported/exported.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 28c2706e48d7..93d3f333444b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -765,6 +765,8 @@ static int amdgpu_debugfs_gem_bo_info(int id, void *ptr, 
void *data)
struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj);
struct seq_file *m = data;
 
+   struct dma_buf_attachment *attachment;
+   struct dma_buf *dma_buf;
unsigned domain;
const char *placement;
unsigned pin_count;
@@ -793,6 +795,15 @@ static int amdgpu_debugfs_gem_bo_info(int id, void *ptr, 
void *data)
pin_count = READ_ONCE(bo->pin_count);
if (pin_count)
seq_printf(m, " pin count %d", pin_count);
+
+   dma_buf = READ_ONCE(bo->gem_base.dma_buf);
+   attachment = READ_ONCE(bo->gem_base.import_attach);
+
+   if (attachment)
+   seq_printf(m, " imported from %p", dma_buf);
+   else if (dma_buf)
+   seq_printf(m, " exported as %p", dma_buf);
+
seq_printf(m, "\n");
 
return 0;
-- 
2.14.1



[PATCH 4/5] drm/amdgpu: add independent DMA-buf export v2

2018-04-11 Thread Christian König
Instead of relying on the DRM functions just implement our own export
functions. This adds support for taking care of unpinned DMA-buf.

v2: fix unintended recursion, remove debugging leftovers

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|   1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   5 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c  | 108 ++---
 4 files changed, 72 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 2babfad1fd7f..3e1727251edc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -371,7 +371,6 @@ int amdgpu_gem_object_open(struct drm_gem_object *obj,
 void amdgpu_gem_object_close(struct drm_gem_object *obj,
struct drm_file *file_priv);
 unsigned long amdgpu_gem_timeout(uint64_t timeout_ns);
-struct sg_table *amdgpu_gem_prime_get_sg_table(struct drm_gem_object *obj);
 struct drm_gem_object *
 amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
 struct dma_buf_attachment *attach,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 0b19482b36b8..6107b845d8a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -890,7 +890,6 @@ static struct drm_driver kms_driver = {
.gem_prime_export = amdgpu_gem_prime_export,
.gem_prime_import = amdgpu_gem_prime_import,
.gem_prime_res_obj = amdgpu_gem_prime_res_obj,
-   .gem_prime_get_sg_table = amdgpu_gem_prime_get_sg_table,
.gem_prime_import_sg_table = amdgpu_gem_prime_import_sg_table,
.gem_prime_vmap = amdgpu_gem_prime_vmap,
.gem_prime_vunmap = amdgpu_gem_prime_vunmap,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 9e23d6f6f3f3..15bb48cebbc4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -31,6 +31,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -953,6 +954,10 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 
amdgpu_bo_kunmap(abo);
 
+   if (abo->gem_base.dma_buf && !abo->gem_base.import_attach &&
+   bo->mem.mem_type != TTM_PL_SYSTEM)
+   dma_buf_invalidate_mappings(abo->gem_base.dma_buf);
+
/* remember the eviction */
if (evict)
atomic64_inc(&adev->num_evictions);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index 4b584cb75bf4..7fef95f0fed1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -32,14 +32,6 @@
 
 static const struct dma_buf_ops amdgpu_dmabuf_ops;
 
-struct sg_table *amdgpu_gem_prime_get_sg_table(struct drm_gem_object *obj)
-{
-   struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
-   int npages = bo->tbo.num_pages;
-
-   return drm_prime_pages_to_sg(bo->tbo.ttm->pages, npages);
-}
-
 void *amdgpu_gem_prime_vmap(struct drm_gem_object *obj)
 {
struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
@@ -126,23 +118,22 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
return ERR_PTR(ret);
 }
 
-static int amdgpu_gem_map_attach(struct dma_buf *dma_buf,
-struct device *target_dev,
-struct dma_buf_attachment *attach)
+static struct sg_table *
+amdgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
+  enum dma_data_direction dir)
 {
+   struct dma_buf *dma_buf = attach->dmabuf;
struct drm_gem_object *obj = dma_buf->priv;
struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+   struct sg_table *sgt;
long r;
 
-   r = drm_gem_map_attach(dma_buf, target_dev, attach);
-   if (r)
-   return r;
-
-   r = amdgpu_bo_reserve(bo, false);
-   if (unlikely(r != 0))
-   goto error_detach;
-
+   if (!attach->invalidate) {
+   r = amdgpu_bo_reserve(bo, false);
+   if (unlikely(r != 0))
+   return ERR_PTR(r);
+   }
 
if (attach->dev->driver != adev->dev->driver) {
/*
@@ -158,42 +149,77 @@ static int amdgpu_gem_map_attach(struct dma_buf *dma_buf,
}
}
 
-   /* pin buffer into GTT */
-   r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT, NULL);
-   if (r)
+   if (!attach->invalidate) {
+   /* pin buffer into GTT */
+   r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT, NULL);
+   if (r)
+   goto error_unreserve;
+
+   } else {
+   /* move buffer into GTT */
+   struct

[PATCH 5/5] drm/amdgpu: add independent DMA-buf import v3

2018-04-11 Thread Christian König
Instead of relying on the DRM functions just implement our own import
functions. This adds support for taking care of unpinned DMA-buf.

v2: enable for all exporters, not just amdgpu, fix invalidation
handling, lock reservation object while setting callback
v3: change to new dma_buf attach interface

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 72 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 32 +++---
 2 files changed, 89 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index 7fef95f0fed1..58dcfba0225a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -86,28 +86,24 @@ int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, 
struct vm_area_struct *vma
return ret;
 }
 
-struct drm_gem_object *
-amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
-struct dma_buf_attachment *attach,
-struct sg_table *sg)
+static struct drm_gem_object *
+amdgpu_gem_prime_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
 {
-   struct reservation_object *resv = attach->dmabuf->resv;
+   struct reservation_object *resv = dma_buf->resv;
struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_bo *bo;
int ret;
 
ww_mutex_lock(&resv->lock, NULL);
-   ret = amdgpu_bo_create(adev, attach->dmabuf->size, PAGE_SIZE,
+   ret = amdgpu_bo_create(adev, dma_buf->size, PAGE_SIZE,
   AMDGPU_GEM_DOMAIN_CPU, 0, ttm_bo_type_sg,
   resv, &bo);
if (ret)
goto error;
 
-   bo->tbo.sg = sg;
-   bo->tbo.ttm->sg = sg;
bo->allowed_domains = AMDGPU_GEM_DOMAIN_GTT;
bo->preferred_domains = AMDGPU_GEM_DOMAIN_GTT;
-   if (attach->dmabuf->ops != &amdgpu_dmabuf_ops)
+   if (dma_buf->ops != &amdgpu_dmabuf_ops)
bo->prime_shared_count = 1;
 
ww_mutex_unlock(&resv->lock);
@@ -118,6 +114,26 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
return ERR_PTR(ret);
 }
 
+struct drm_gem_object *
+amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
+struct dma_buf_attachment *attach,
+struct sg_table *sg)
+{
+   struct drm_gem_object *obj;
+   struct amdgpu_bo *bo;
+
+   obj = amdgpu_gem_prime_create_obj(dev, attach->dmabuf);
+   if (IS_ERR(obj))
+   return obj;
+
+   bo = gem_to_amdgpu_bo(obj);
+   amdgpu_bo_reserve(bo, true);
+   bo->tbo.sg = sg;
+   bo->tbo.ttm->sg = sg;
+   amdgpu_bo_unreserve(bo);
+   return obj;
+}
+
 static struct sg_table *
 amdgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
   enum dma_data_direction dir)
@@ -293,9 +309,29 @@ struct dma_buf *amdgpu_gem_prime_export(struct drm_device 
*dev,
return buf;
 }
 
+static void
+amdgpu_gem_prime_invalidate_mappings(struct dma_buf_attachment *attach)
+{
+   struct ttm_operation_ctx ctx = { false, false };
+   struct drm_gem_object *obj = attach->priv;
+   struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+   struct ttm_placement placement = {};
+   int r;
+
+   r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
+   if (r)
+   DRM_ERROR("Failed to unmap DMA-buf import (%d)\n", r);
+}
+
 struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
struct dma_buf *dma_buf)
 {
+   struct dma_buf_attach_info attach_info = {
+   .dev = dev->dev,
+   .dmabuf = dma_buf,
+   .invalidate = amdgpu_gem_prime_invalidate_mappings
+   };
+   struct dma_buf_attachment *attach;
struct drm_gem_object *obj;
 
if (dma_buf->ops == &amdgpu_dmabuf_ops) {
@@ -310,5 +346,21 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct 
drm_device *dev,
}
}
 
-   return drm_gem_prime_import(dev, dma_buf);
+   if (!dma_buf->invalidation_supported)
+   return drm_gem_prime_import(dev, dma_buf);
+
+   obj = amdgpu_gem_prime_create_obj(dev, dma_buf);
+   if (IS_ERR(obj))
+   return obj;
+
+   attach_info.importer_priv = obj;
+   attach = dma_buf_attach(&attach_info);
+   if (IS_ERR(attach)) {
+   drm_gem_object_put(obj);
+   return ERR_CAST(attach);
+   }
+
+   get_dma_buf(dma_buf);
+   obj->import_attach = attach;
+   return obj;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index ab73300e6c7f..2a8f328918cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 #inclu

[PATCH 2/5] dma-buf: add optional invalidate_mappings callback v3

2018-04-11 Thread Christian König
Each importer can now provide an invalidate_mappings callback.

This allows the exporter to provide the mappings without the need to pin
the backing store.

v2: don't try to invalidate mappings when the callback is NULL,
lock the reservation obj while using the attachments,
add helper to set the callback
v3: move flag for invalidation support into the DMA-buf,
use new attach_info structure to set the callback
v4: use importer_priv field instead of mangling exporter priv.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-buf.c | 44 
 include/linux/dma-buf.h   | 36 ++--
 2 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 4b46982c6d9c..ffdaab10e2f2 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -565,6 +565,8 @@ struct dma_buf_attachment *dma_buf_attach(const struct 
dma_buf_attach_info *info
 
attach->dev = info->dev;
attach->dmabuf = dmabuf;
+   attach->importer_priv = info->importer_priv;
+   attach->invalidate = info->invalidate;
 
mutex_lock(&dmabuf->lock);
 
@@ -573,7 +575,9 @@ struct dma_buf_attachment *dma_buf_attach(const struct 
dma_buf_attach_info *info
if (ret)
goto err_attach;
}
+   reservation_object_lock(dmabuf->resv, NULL);
list_add(&attach->node, &dmabuf->attachments);
+   reservation_object_unlock(dmabuf->resv);
 
mutex_unlock(&dmabuf->lock);
return attach;
@@ -599,7 +603,9 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct 
dma_buf_attachment *attach)
return;
 
mutex_lock(&dmabuf->lock);
+   reservation_object_lock(dmabuf->resv, NULL);
list_del(&attach->node);
+   reservation_object_unlock(dmabuf->resv);
if (dmabuf->ops->detach)
dmabuf->ops->detach(dmabuf, attach);
 
@@ -633,10 +639,23 @@ struct sg_table *dma_buf_map_attachment(struct 
dma_buf_attachment *attach,
if (WARN_ON(!attach || !attach->dmabuf))
return ERR_PTR(-EINVAL);
 
+   /*
+* Mapping a DMA-buf can trigger its invalidation, prevent sending this
+* event to the caller by temporarily removing this attachment from the
+* list.
+*/
+   if (attach->invalidate) {
+   reservation_object_assert_held(attach->dmabuf->resv);
+   list_del(&attach->node);
+   }
+
sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
if (!sg_table)
sg_table = ERR_PTR(-ENOMEM);
 
+   if (attach->invalidate)
+   list_add(&attach->node, &attach->dmabuf->attachments);
+
return sg_table;
 }
 EXPORT_SYMBOL_GPL(dma_buf_map_attachment);
@@ -657,6 +676,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment 
*attach,
 {
might_sleep();
 
+   if (attach->invalidate)
+   reservation_object_assert_held(attach->dmabuf->resv);
+
if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
return;
 
@@ -665,6 +687,26 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment 
*attach,
 }
 EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
 
+/**
+ * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf
+ *
+ * @dmabuf:[in]buffer which mappings should be invalidated
+ *
+ * Informs all attachments that they need to destroy and recreate all their
+ * mappings.
+ */
+void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
+{
+   struct dma_buf_attachment *attach;
+
+   reservation_object_assert_held(dmabuf->resv);
+
+   list_for_each_entry(attach, &dmabuf->attachments, node)
+   if (attach->invalidate)
+   attach->invalidate(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings);
+
 /**
  * DOC: cpu access
  *
@@ -1122,10 +1164,12 @@ static int dma_buf_debug_show(struct seq_file *s, void 
*unused)
seq_puts(s, "\tAttached Devices:\n");
attach_count = 0;
 
+   reservation_object_lock(buf_obj->resv, NULL);
list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
attach_count++;
}
+   reservation_object_unlock(buf_obj->resv);
 
seq_printf(s, "Total %d devices attached\n\n",
attach_count);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 414b4dde5eb7..566503dd2b4f 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -270,6 +270,8 @@ struct dma_buf_ops {
  * @poll: for userspace poll support
  * @cb_excl: for userspace poll support
  * @cb_shared: for userspace poll support
+ * @invalidation_supported: True when the exporter supports unpinned operation
+ * 

[PATCH 1/5] dma-buf: use parameter structure for dma_buf_attach

2018-04-11 Thread Christian König
Move the parameters into a structure to make it simpler to extend it in
follow up patches.

This also adds the importer private as parameter so that we can directly
work with a completely filled in attachment structure.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-buf.c | 15 ---
 drivers/gpu/drm/armada/armada_gem.c   |  6 +-
 drivers/gpu/drm/drm_prime.c   |  6 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c|  6 +-
 drivers/gpu/drm/tegra/gem.c   |  6 +-
 drivers/gpu/drm/udl/udl_dmabuf.c  |  6 +-
 drivers/media/common/videobuf2/videobuf2-dma-contig.c |  6 +-
 drivers/media/common/videobuf2/videobuf2-dma-sg.c |  6 +-
 drivers/staging/media/tegra-vde/tegra-vde.c   |  6 +-
 include/linux/dma-buf.h   | 17 +++--
 10 files changed, 63 insertions(+), 17 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d78d5fc173dc..4b46982c6d9c 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -534,8 +534,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
 /**
  * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
  * calls attach() of dma_buf_ops to allow device-specific attach functionality
- * @dmabuf:[in]buffer to attach device to.
- * @dev:   [in]device to be attached.
+ * @info:  [in]holds all the attach related information provided
+ * by the importer. see &struct dma_buf_attach_info
+ * for further details.
  *
  * Returns struct dma_buf_attachment pointer for this attachment. Attachments
  * must be cleaned up by calling dma_buf_detach().
@@ -549,26 +550,26 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
  * accessible to @dev, and cannot be moved to a more suitable place. This is
  * indicated with the error code -EBUSY.
  */
-struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
- struct device *dev)
+struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info 
*info)
 {
+   struct dma_buf *dmabuf = info->dmabuf;
struct dma_buf_attachment *attach;
int ret;
 
-   if (WARN_ON(!dmabuf || !dev))
+   if (WARN_ON(!dmabuf || !info->dev))
return ERR_PTR(-EINVAL);
 
attach = kzalloc(sizeof(*attach), GFP_KERNEL);
if (!attach)
return ERR_PTR(-ENOMEM);
 
-   attach->dev = dev;
+   attach->dev = info->dev;
attach->dmabuf = dmabuf;
 
mutex_lock(&dmabuf->lock);
 
if (dmabuf->ops->attach) {
-   ret = dmabuf->ops->attach(dmabuf, dev, attach);
+   ret = dmabuf->ops->attach(dmabuf, info->dev, attach);
if (ret)
goto err_attach;
}
diff --git a/drivers/gpu/drm/armada/armada_gem.c 
b/drivers/gpu/drm/armada/armada_gem.c
index a97f509743a5..f4d1c11f57ea 100644
--- a/drivers/gpu/drm/armada/armada_gem.c
+++ b/drivers/gpu/drm/armada/armada_gem.c
@@ -514,6 +514,10 @@ armada_gem_prime_export(struct drm_device *dev, struct 
drm_gem_object *obj,
 struct drm_gem_object *
 armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf)
 {
+   struct dma_buf_attach_info attach_info = {
+   .dev = dev->dev,
+   .dmabuf = buf
+   };
struct dma_buf_attachment *attach;
struct armada_gem_object *dobj;
 
@@ -529,7 +533,7 @@ armada_gem_prime_import(struct drm_device *dev, struct 
dma_buf *buf)
}
}
 
-   attach = dma_buf_attach(buf, dev->dev);
+   attach = dma_buf_attach(&attach_info);
if (IS_ERR(attach))
return ERR_CAST(attach);
 
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7856a9b3f8a8..4da242de51c2 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -707,6 +707,10 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct 
drm_device *dev,
struct dma_buf *dma_buf,
struct device *attach_dev)
 {
+   struct dma_buf_attach_info attach_info = {
+   .dev = attach_dev,
+   .dmabuf = dma_buf
+   };
struct dma_buf_attachment *attach;
struct sg_table *sgt;
struct drm_gem_object *obj;
@@ -727,7 +731,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct 
drm_device *dev,
if (!dev->driver->gem_prime_import_sg_table)
return ERR_PTR(-EINVAL);
 
-   attach = dma_buf_attach(dma_buf, attach_dev);
+   attach = dma_buf_attach(&attach_info);
if (IS_ERR(attach))
return ERR_CAST(attach);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 69a7aec49e84..7b737a883106 100644
--- a

[PATCH 3/5] drm/ttm: remove the backing store if no placement is given

2018-04-11 Thread Christian König
Pipeline removal of the BO's backing store when no placement is given
during validation.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 98e06f8bf23b..17e821f01d0a 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1078,6 +1078,18 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
uint32_t new_flags;
 
reservation_object_assert_held(bo->resv);
+
+   /*
+* Remove the backing store if no placement is given.
+*/
+   if (!placement->num_placement && !placement->num_busy_placement) {
+   ret = ttm_bo_pipeline_gutting(bo);
+   if (ret)
+   return ret;
+
+   return ttm_tt_create(bo, false);
+   }
+
/*
 * Check whether we need to move buffer.
 */
-- 
2.14.1



Re: [PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers

2018-04-11 Thread Sinan Kaya
On 4/11/2018 8:03 AM, Robin Murphy wrote:
> On 10/04/18 21:59, Sinan Kaya wrote:
>> Code is expecting to observe the same number of buffers returned from
>> dma_map_sg() function compared to sg_alloc_table_from_pages(). This
>> doesn't hold true universally especially for systems with IOMMU.
> 
> So why not fix said code? It's clearly not a real hardware limitation, and 
> the map_sg() APIs have potentially returned fewer than nents since forever, 
> so there's really no excuse.

Sure, I'll take a better fix if there is one.

> 
>> IOMMU driver tries to combine buffers into a single DMA address as much
>> as it can. The right thing is to tell the DMA layer how much combining
>> IOMMU can do.
> 
> Disagree; this is a dodgy hack, since you'll now end up passing scatterlists 
> into dma_map_sg() which already violate max_seg_size to begin with, and I 
> think a conscientious DMA API implementation would be within its rights to fail the 
> mapping for that reason (I know arm64 happens not to, but that was a 
> deliberate design decision to make my life easier at the time).
> 
> As a short-term fix, at least do something like what i915 does and constrain 
> the table allocation to the desired segment size as well, so things remain 
> self-consistent. But still never claim that faking a hardware constraint as a 
> workaround for a driver shortcoming is "the right thing to do" ;)

You are asking for something like this from here, right?

https://elixir.bootlin.com/linux/v4.16.1/source/drivers/gpu/drm/i915/i915_gem_dmabuf.c#L58


ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL);
if (ret)
goto err_free;

src = obj->mm.pages->sgl;
dst = st->sgl;
for (i = 0; i < obj->mm.pages->nents; i++) {
sg_set_page(dst, sg_page(src), src->length, 0);
dst = sg_next(dst);
src = sg_next(src);
}

This seems to allocate the scatter gather list and fill it in manually before 
passing it
to dma_map_sg(). I'll give it a try. 

Just double checking.

> 
> Robin.
> 
>> Signed-off-by: Sinan Kaya 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: Lenovo ThinkVision X1 27" USB-C occasionally hangs the kernel (usbhid) and isn't setting a proper resolution when USB 3.0 (FHD) switched to USB 2.0 (UHD)

2018-04-11 Thread Heikki Krogerus
On Wed, Apr 11, 2018 at 02:09:29PM +0300, Heikki Krogerus wrote:
> On Wed, Apr 11, 2018 at 08:28:44AM +, andrey.ara...@nixaid.com wrote:
> > Thank you for the insights, Heikki!
> > 
> > Please find the acpi.dump.tgz file is a attached.
> > 
> > I do not have the USBC* and INT3515* devices,
> 
> OK. That means we don't have any way to interface with the USB Type-C
> ports directly unfortunately. The ports are quite simply not visible
> to us. We can't do anything from USB side.
> 
> The problem is in any case a graphics problem, so maybe it's better to
> let those guys take a look at this. I think this MacBook Pro has AMD
> GPU, so adding AMD driver maintainers and lists.

Sorry Andrey, the previous mail did not get to you. I managed to
change your email address somehow. :-)



> > so I have attached the following file as well:
> > 
> > # ls -1 /sys/bus/acpi/devices/*/status | while read dev; do echo $dev: 
> > $(cat $dev); done | gzip -c > /tmp/all-device-status.log.gz
> > 
> > 
> > KR,
> > Andrey Arapov
> > 
> > April 10, 2018 4:35 PM, "Heikki Krogerus"  
> > wrote:
> > 
> > > On Tue, Apr 10, 2018 at 09:05:07AM +, andrey.ara...@nixaid.com wrote:
> > > 
> > >> Dear Linux Kernel Devs,
> > >> 
> > >> I have recently got a Lenovo ThinkVision X1 27" monitor, it is connected 
> > >> to my
> > >> laptop over a USB-C cable (DisplayPort).
> > >> 
> > >> This monitor has a built-in USB hub with a toggle button, when pressed it
> > >> shows two options on the screen:
> > >> 
> > >> Press USB Switch to toggle between:
> > >> A) 3840x2160 UHD USB 2.0
> > >> B) 1920x1080 FHD USB 3.0
> > >> 
> > >> By default it is always set to FHD, which is why Linux sees the 
> > >> 1920x1080 as a
> > >> maximum possible resolution.
> > >> 
> > >> Whenever I am changing it to the UHD, Linux is not changing the 
> > >> resolution to
> > >> 3840x2160, instead it sets to somewhere around 2560x.
> > >> 
> > >> To work this around, I am running the following commands manually:
> > >> 
> > >> xrandr --newmode "3840x2160_60.00" 533.25 3840 3888 3920 4000 2160 2163 
> > >> 2168  +hsync -vsync
> > >> xrandr --addmode DisplayPort-2 3840x2160_60.00
> > >> xrandr --output DisplayPort-2 --mode 3840x2160_60.00
> > >> 
> > >> Then, when I was trying to switch it back to FHD and again back to UHD,
> > >> sometimes it causes a kernel panic. The panic also happened when I was
> > >> plugging the cable in/out and back again. What I was able to spot is 
> > >> that the
> > >> last unloaded kernel module was usbhid.
> > >> 
> > >> Please advise, what can I try to help investigating this issue further?
> > >> 
> > >> I have attached the output from "dmesg -T" command as "4.16.1-dmesg.txt" 
> > >> file.
> > >> The logs are related to when the monitor was connected via a USB-C cable 
> > >> to a
> > >> DisplayPort-2, the default resolution picked was FHD (USB 3.0) and then 
> > >> I have
> > >> pressed the toggle button to use UHD (USB 2.0), then have applied the 
> > >> "xrandr"
> > >> commands and have closed the lid of my laptop so I would be using my 
> > >> monitor
> > >> as a primary display.
> > > 
> > > The board seems to support Thunderbolt, however, in this case my guess is 
> > > that
> > > the monitor is actually using just the DP alternate mode (Thunderbolt can 
> > > pipe
> > > DisplayPort protocol).
> > > 
> > > I'm guessing the problem is related to the DisplayPort lane counts. With 
> > > the
> > > highest resolutions you need all four lanes, but if you want to use USB3 
> > > with
> > > USB Type-C connectors at the same time, two have to be allocated for USB3
> > > leaving only two for DisplayPort allowing lower resolution. The problem 
> > > is that
> > > the GPU drivers need to know how many lanes the DisplayPort has in use. 
> > > That
> > > information will in the normal case come from USB Type-C drivers. 
> > > Unfortunately we
> > > do not have support for that in Linux kernel yet, but I have made a 
> > > proposal for
> > > a solution.
> > > 
> > > Let's start by checking details of your board. Can you send us acpidump 
> > > output?
> > > 
> > > % acpidump -o 
> > > 
> > > Also, please check the status of these acpi devices:
> > > 
> > > % cat /sys/bus/acpi/devices/USBC*/status
> > > % cat /sys/bus/acpi/devices/INT3515*/status


Cheers,

-- 
heikki


Re: [PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers

2018-04-11 Thread Robin Murphy

On 10/04/18 21:59, Sinan Kaya wrote:

Code is expecting to observe the same number of buffers returned from
the dma_map_sg() function as were allocated by sg_alloc_table_from_pages().
This doesn't hold true universally, especially for systems with an IOMMU.


So why not fix said code? It's clearly not a real hardware limitation, 
and the map_sg() APIs have potentially returned fewer than nents since 
forever, so there's really no excuse.



The IOMMU driver tries to combine buffers into a single DMA address as much
as it can. The right thing is to tell the DMA layer how much combining the
IOMMU can do.


Disagree; this is a dodgy hack, since you'll now end up passing 
scatterlists into dma_map_sg() which already violate max_seg_size to 
begin with, and I think a conscientious DMA API implementation would be 
within its rights to fail the mapping for that reason (I know arm64 happens not 
to, but that was a deliberate design decision to make my life easier at 
the time).


As a short-term fix, at least do something like what i915 does and 
constrain the table allocation to the desired segment size as well, so 
things remain self-consistent. But still never claim that faking a 
hardware constraint as a workaround for a driver shortcoming is "the 
right thing to do" ;)


Robin.


Signed-off-by: Sinan Kaya 
---
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 2 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 1 +
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 1 +
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 1 +
  4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 8e28270..1b031eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -851,7 +851,7 @@ static int gmc_v6_0_sw_init(void *handle)
pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32));
dev_warn(adev->dev, "amdgpu: No coherent DMA available.\n");
}
-
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
r = gmc_v6_0_init_microcode(adev);
if (r) {
dev_err(adev->dev, "Failed to load mc firmware!\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 86e9d682..0a4b2cc1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -999,6 +999,7 @@ static int gmc_v7_0_sw_init(void *handle)
pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32));
pr_warn("amdgpu: No coherent DMA available\n");
}
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
  
  	r = gmc_v7_0_init_microcode(adev);

if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 9a813d8..b171529 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1096,6 +1096,7 @@ static int gmc_v8_0_sw_init(void *handle)
pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32));
pr_warn("amdgpu: No coherent DMA available\n");
}
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
  
  	r = gmc_v8_0_init_microcode(adev);

if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 3b7e7af..36e658ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -855,6 +855,7 @@ static int gmc_v9_0_sw_init(void *handle)
pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32));
printk(KERN_WARNING "amdgpu: No coherent DMA available.\n");
}
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
  
  	r = gmc_v9_0_mc_init(adev);

if (r)




Re: [PATCH 2/2] drm/amdgpu: Add APU support in vi_set_vce_clocks

2018-04-11 Thread Alex Deucher
On Wed, Apr 11, 2018 at 2:08 AM, Rex Zhu  wrote:
> 1. fix set vce clocks failed on Cz/St
>which lead 1s delay when boot up.
> 2. remove the workaround in vce_v3_0.c
>
> Reviewed-by: Alex Deucher 
> Reviewed-by: Huang Rui 
> Acked-by: Christian König 
> Acked-by: Shirish S 
> Signed-off-by: Rex Zhu 

Series is:
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c |  4 ++--
>  drivers/gpu/drm/amd/amdgpu/vi.c   | 31 +--
>  2 files changed, 27 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
> b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> index 428d192..ac96172 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> @@ -467,8 +467,8 @@ static int vce_v3_0_hw_init(void *handle)
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> vce_v3_0_override_vce_clock_gating(adev, true);
> -   if (!(adev->flags & AMD_IS_APU))
> -   amdgpu_asic_set_vce_clocks(adev, 1, 1);
> +
> +   amdgpu_asic_set_vce_clocks(adev, 1, 1);
>
> for (i = 0; i < adev->vce.num_rings; i++)
> adev->vce.ring[i].ready = false;
> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
> index 51acd7c..4034a28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> @@ -757,6 +757,8 @@ static int vi_set_uvd_clock(struct amdgpu_device *adev, 
> u32 clock,
>  #define ixGNB_CLK1_STATUS   0xD822010C
>  #define ixGNB_CLK2_DFS_CNTL 0xD8220110
>  #define ixGNB_CLK2_STATUS   0xD822012C
> +#define ixGNB_CLK3_DFS_CNTL 0xD8220130
> +#define ixGNB_CLK3_STATUS   0xD822014C
>
>  static int vi_set_uvd_clocks(struct amdgpu_device *adev, u32 vclk, u32 dclk)
>  {
> @@ -788,6 +790,22 @@ static int vi_set_vce_clocks(struct amdgpu_device *adev, 
> u32 evclk, u32 ecclk)
> int r, i;
> struct atom_clock_dividers dividers;
> u32 tmp;
> +   u32 reg_ctrl;
> +   u32 reg_status;
> +   u32 status_mask;
> +   u32 reg_mask;
> +
> +   if (adev->flags & AMD_IS_APU) {
> +   reg_ctrl = ixGNB_CLK3_DFS_CNTL;
> +   reg_status = ixGNB_CLK3_STATUS;
> +   status_mask = 0x0001;
> +   reg_mask = CG_ECLK_CNTL__ECLK_DIVIDER_MASK;
> +   } else {
> +   reg_ctrl = ixCG_ECLK_CNTL;
> +   reg_status = ixCG_ECLK_STATUS;
> +   status_mask = CG_ECLK_STATUS__ECLK_STATUS_MASK;
> +   reg_mask = CG_ECLK_CNTL__ECLK_DIR_CNTL_EN_MASK | 
> CG_ECLK_CNTL__ECLK_DIVIDER_MASK;
> +   }
>
> r = amdgpu_atombios_get_clock_dividers(adev,
>
> COMPUTE_GPUCLK_INPUT_FLAG_DEFAULT_GPUCLK,
> @@ -796,24 +814,25 @@ static int vi_set_vce_clocks(struct amdgpu_device 
> *adev, u32 evclk, u32 ecclk)
> return r;
>
> for (i = 0; i < 100; i++) {
> -   if (RREG32_SMC(ixCG_ECLK_STATUS) & 
> CG_ECLK_STATUS__ECLK_STATUS_MASK)
> +   if (RREG32_SMC(reg_status) & status_mask)
> break;
> mdelay(10);
> }
> +
> if (i == 100)
> return -ETIMEDOUT;
>
> -   tmp = RREG32_SMC(ixCG_ECLK_CNTL);
> -   tmp &= ~(CG_ECLK_CNTL__ECLK_DIR_CNTL_EN_MASK |
> -   CG_ECLK_CNTL__ECLK_DIVIDER_MASK);
> +   tmp = RREG32_SMC(reg_ctrl);
> +   tmp &= ~reg_mask;
> tmp |= dividers.post_divider;
> -   WREG32_SMC(ixCG_ECLK_CNTL, tmp);
> +   WREG32_SMC(reg_ctrl, tmp);
>
> for (i = 0; i < 100; i++) {
> -   if (RREG32_SMC(ixCG_ECLK_STATUS) & 
> CG_ECLK_STATUS__ECLK_STATUS_MASK)
> +   if (RREG32_SMC(reg_status) & status_mask)
> break;
> mdelay(10);
> }
> +
> if (i == 100)
> return -ETIMEDOUT;
>
> --
> 1.9.1
>


Re: [PATCH] drm/amd/amdgpu: Fix amdgpu_debugfs_gpr_read debugfs entry

2018-04-11 Thread Tom St Denis



On 04/11/2018 08:17 AM, Christian König wrote:

Am 11.04.2018 um 14:03 schrieb Tom St Denis:



On 04/11/2018 07:58 AM, Christian König wrote:


  -    if (size & 3 || *pos & 3)
+    if (size & 3 || size > (4 * AMDGPU_DEBUGFS_MAX_SGPR_READ)) 


I think checking the position alignment here is still necessary, 
because we can't read from non-dword boundaries, can we?


The index is a dword index as fed into SQ_IND_INDEX (offset => start 
=> regno as you trace from the debugfs entry to 
gfx_v8_0_read_wave_sgprs to wave_read_regs (or analogues...)).


SQ_IND_INDEX doesn't take a byte offset but a dword offset, for instance:

include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP0 0x026c
include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP1 0x026d
include/asic_reg/gc/gc_9_0_offset.h:#define ixSQ_WAVE_TTMP2 0x026e


The current way for instance would prohibit reading (directly) 
SQ_WAVE_TTMP1.


I agree it's not really how a typical file device works but it's not a 
typical file device :-).  It's assumed every read would be preceded by 
a seek to set the higher order bits anyways.


So pos=0 is one dw and pos=1 is another one? If that's the case then 
that would be a bug since pos is by definition a byte offset.


I suggest to keep the limitation and instead fix how pos is interpreted 
instead.


It would break copies of umr out there already (in ways that won't be 
obvious and therefore result in lost time/etc).   Since umr is probably 
the only user of this it's not really a big deal.  Ideally, everyone 
uses the latest umr each day :-) but that's not realistically the case.


I agree that had I written this today I would keep the file offset 
consistent with the byte offset into the register file (shift down by 4 
before calling the callback).


Since you need to seek to a very specific address to use this device 
it's not the sort of thing that someone would cat and then expect 
sensible output anyways.


Tom


Re: [PATCHv2] drm/amdkfd: Remove vla

2018-04-11 Thread Felix Kuehling
On 2018-04-10 09:02 PM, Laura Abbott wrote:
> There's an ongoing effort to remove VLAs[1] from the kernel to eventually
> turn on -Wvla. Switch to a constant value that covers all hardware.
>
> [1] https://lkml.org/lkml/2018/3/7/621
>
> Signed-off-by: Laura Abbott 
> ---
> v2: Switch to a larger size to account for other hardware
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
> index 035c351f47c5..c3a5a80e31ae 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
> @@ -139,10 +139,12 @@ static void interrupt_wq(struct work_struct *work)
>  {
>   struct kfd_dev *dev = container_of(work, struct kfd_dev,
>   interrupt_work);
> + uint32_t ih_ring_entry[8];
>  
> - uint32_t ih_ring_entry[DIV_ROUND_UP(
> - dev->device_info->ih_ring_entry_size,
> - sizeof(uint32_t))];
> + if (dev->device_info->ih_ring_entry_size > (8 * sizeof(uint32_t))) {
> + dev_err(kfd_chardev(), "Ring entry too small\n");

When the ring entry size really is too small, this will potentially
flood the kernel log. Maybe a WARN_ONCE would be more helpful. With that
fixed, this patch is Reviewed-by: Felix Kuehling 

Regards,
  Felix

> + return;
> + }
>  
>   while (dequeue_ih_ring_entry(dev, ih_ring_entry))
>   dev->device_info->event_interrupt_class->interrupt_wq(dev,



Re: [PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers

2018-04-11 Thread Robin Murphy

On 11/04/18 15:33, Sinan Kaya wrote:

On 4/11/2018 8:03 AM, Robin Murphy wrote:

On 10/04/18 21:59, Sinan Kaya wrote:

Code is expecting to observe the same number of buffers returned
from the dma_map_sg() function as were allocated by
sg_alloc_table_from_pages(). This doesn't hold true universally,
especially for systems with an IOMMU.


So why not fix said code? It's clearly not a real hardware
limitation, and the map_sg() APIs have potentially returned fewer
than nents since forever, so there's really no excuse.


Sure, I'll take a better fix if there is one.




The IOMMU driver tries to combine buffers into a single DMA address
as much as it can. The right thing is to tell the DMA layer how
much combining the IOMMU can do.


Disagree; this is a dodgy hack, since you'll now end up passing
scatterlists into dma_map_sg() which already violate max_seg_size
to begin with, and I think a conscientious DMA API implementation
would be within its rights to fail the mapping for that reason (I know
arm64 happens not to, but that was a deliberate design decision to
make my life easier at the time).

As a short-term fix, at least do something like what i915 does and
constrain the table allocation to the desired segment size as well,
so things remain self-consistent. But still never claim that faking
a hardware constraint as a workaround for a driver shortcoming is
"the right thing to do" ;)


You are asking for something like this from here, right?

https://elixir.bootlin.com/linux/v4.16.1/source/drivers/gpu/drm/i915/i915_gem_dmabuf.c#L58


I just looked for callers of __sg_alloc_table_from_pages() with a limit 
other than SCATTERLIST_MAX_SEGMENT, and found 
__i915_gem_userptr_alloc_pages(), which at first glance appears to be 
doing something vaguely similar to amdgpu_ttm_tt_pin_userptr().


Robin.


Re: RFC for a render API to support adaptive sync and VRR

2018-04-11 Thread Michel Dänzer
On 2018-04-10 06:26 PM, Cyr, Aric wrote:
> My guess is they prefer to “do nothing” and let driver/HW manage it,
> otherwise you exempt all existing games from supporting adaptive sync
> without a rewrite or update.
Nobody is saying adaptive sync should only work with explicit target
presentation times provided by the application. We're just arguing that
target presentation time as a mechanism is superior to target refresh
rate, both for video and game use cases. It also trivially allows
emulating "as early as possible" (target presentation time = 0) and
"fixed refresh rate" (target presentation time = start + i * target
frame duration) behaviour, even transparently for the application.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 1/3] Revert "drm/amd/powerply: fix power reading on Fiji"

2018-04-11 Thread Alex Deucher
On Wed, Apr 11, 2018 at 2:31 AM, Rex Zhu  wrote:
> we don't have a limit of [50ms, 4sec] on the sampling period.
> smu calculates the average gpu power in real time.
> we can read the average gpu power through an smu message or
> a special register.
>
> This reverts commit 462d8dcc9fec0d89f1ff6a1f93f1d4f670878c71.
>
> Signed-off-by: Rex Zhu 

If Eric is ok with this,

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> index 21c021b..388184e 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> @@ -3364,8 +3364,7 @@ static int smu7_get_gpu_power(struct pp_hwmgr *hwmgr,
> "Failed to start pm status log!",
> return -1);
>
> -   /* Sampling period from 50ms to 4sec */
> -   msleep_interruptible(200);
> +   msleep_interruptible(20);
>
> PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> PPSMC_MSG_PmStatusLogSample),
> --
> 1.9.1
>


Re: [PATCH 2/3] drm/amd/pp: Refine get_gpu_power for VI

2018-04-11 Thread Alex Deucher
On Wed, Apr 11, 2018 at 2:31 AM, Rex Zhu  wrote:
> pkgpwr is the average gpu power over 100ms. It is calculated by
> the firmware in real time.
>
> 1. we can send smu message PPSMC_MSG_GetCurrPkgPwr to read currentpkgpwr 
> directly.
>
> 2. On Fiji/tonga/bonaire/hawaii, without PPSMC_MSG_GetCurrPkgPwr support,
>send PPSMC_MSG_PmStatusLogStart/Sample to let the smu write currentpkgpwr
>to ixSMU_PM_STATUS_94. The driver can read pkgpwr from ixSMU_PM_STATUS_94.
>
> Signed-off-by: Rex Zhu 

Assuming Eric is ok with removing the other power readings,

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 51 
> --
>  drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 10 +++--
>  2 files changed, 34 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> index 388184e..20f5a6f 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> @@ -3359,30 +3359,33 @@ static int smu7_get_pp_table_entry(struct pp_hwmgr 
> *hwmgr,
>  static int smu7_get_gpu_power(struct pp_hwmgr *hwmgr,
> struct pp_gpu_power *query)
>  {
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_PmStatusLogStart),
> -   "Failed to start pm status log!",
> -   return -1);
> -
> -   msleep_interruptible(20);
> -
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_PmStatusLogSample),
> -   "Failed to sample pm status log!",
> -   return -1);
> -
> -   query->vddc_power = cgs_read_ind_register(hwmgr->device,
> -   CGS_IND_REG__SMC,
> -   ixSMU_PM_STATUS_40);
> -   query->vddci_power = cgs_read_ind_register(hwmgr->device,
> -   CGS_IND_REG__SMC,
> -   ixSMU_PM_STATUS_49);
> -   query->max_gpu_power = cgs_read_ind_register(hwmgr->device,
> -   CGS_IND_REG__SMC,
> -   ixSMU_PM_STATUS_94);
> -   query->average_gpu_power = cgs_read_ind_register(hwmgr->device,
> -   CGS_IND_REG__SMC,
> -   ixSMU_PM_STATUS_95);
> +   int i;
> +
> +   if (!query)
> +   return -EINVAL;
> +
> +
> +   memset(query, 0, sizeof *query);
> +
> +   smum_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_GetCurrPkgPwr, 
> 0);
> +   query->average_gpu_power = cgs_read_register(hwmgr->device, 
> mmSMC_MSG_ARG_0);
> +
> +   if (query->average_gpu_power != 0)
> +   return 0;
> +
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogStart);
> +   cgs_write_ind_register(hwmgr->device, CGS_IND_REG__SMC,
> +   ixSMU_PM_STATUS_94, 
> 0);
> +
> +   for (i = 0; i < 20; i++) {
> +   mdelay(1);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogSample);
> +   query->average_gpu_power = 
> cgs_read_ind_register(hwmgr->device,
> +   CGS_IND_REG__SMC,
> +   ixSMU_PM_STATUS_94);
> +   if (query->average_gpu_power != 0)
> +   break;
> +   }
>
> return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> index fb32a3f..10a1123 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> @@ -171,8 +171,10 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, 
> uint16_t msg)
>
> ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
>
> -   if (ret != 1)
> -   pr_info("\n failed to send pre message %x ret is %d \n",  
> msg, ret);
> +   if (ret == 0xFE)
> +   pr_debug("last message was not supported\n");
> +   else if (ret != 1)
> +   pr_info("\n last message was failed ret is %d\n", ret);
>
> cgs_write_register(hwmgr->device, mmSMC_MESSAGE_0, msg);
>
> @@ -180,7 +182,9 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
> msg)
>
> ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
>
> -   if (ret != 1)
> +   if (ret == 0xFE)
> +   pr_debug("message %x was not supported\n", msg);
> +   else if (ret != 1)
> pr_info("\n failed to send message %x ret is %d \n",  msg, 
> ret);
>
> return 0;
> --
> 1.9.1
>

Re: [PATCH 3/3] drm/amd/pp: Remove struct pp_gpu_power

2018-04-11 Thread Alex Deucher
On Wed, Apr 11, 2018 at 2:31 AM, Rex Zhu  wrote:
> This struct can't be used for all asics.
>
> Currently smu only calculate average gpu power in real time.
>
> for vddc/vddci/max power,
> we can read them per 1ms through some registers. but
>
> 1. the type of return values is not unified.
> For Vi, return type is uint. For vega, return type is float.
>
> 2. need to assign the unit time to calculate the average power.
>
> so remove this struct.
>
> if users need to know the power on vddc/vddci,
> we can export it with a new common interface/struct.
>

If Tom and Eric are ok with this:
Acked-by: Alex Deucher 

> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  7 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 22 +++--
>  drivers/gpu/drm/amd/include/kgd_pp_interface.h |  7 ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 23 
> +-
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 17 +++-
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 13 
>  6 files changed, 29 insertions(+), 60 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 487d39e..1e59818 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -701,9 +701,6 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
> *data, struct drm_file
> }
> }
> case AMDGPU_INFO_SENSOR: {
> -   struct pp_gpu_power query = {0};
> -   int query_size = sizeof(query);
> -
> if (!adev->pm.dpm_enabled)
> return -ENOENT;
>
> @@ -746,10 +743,10 @@ static int amdgpu_info_ioctl(struct drm_device *dev, 
> void *data, struct drm_file
> /* get average GPU power */
> if (amdgpu_dpm_read_sensor(adev,
>AMDGPU_PP_SENSOR_GPU_POWER,
> -  (void *)&query, 
> &query_size)) {
> +  (void *)&ui32, 
> &ui32_size)) {
> return -EINVAL;
> }
> -   ui32 = query.average_gpu_power >> 8;
> +   ui32 >>= 8;
> break;
> case AMDGPU_INFO_SENSOR_VDDNB:
> /* get VDDNB in millivolts */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index e5f60fc..744f105 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -1020,8 +1020,8 @@ static ssize_t amdgpu_hwmon_show_power_avg(struct 
> device *dev,
>  {
> struct amdgpu_device *adev = dev_get_drvdata(dev);
> struct drm_device *ddev = adev->ddev;
> -   struct pp_gpu_power query = {0};
> -   int r, size = sizeof(query);
> +   u32 query = 0;
> +   int r, size = sizeof(u32);
> unsigned uw;
>
> /* Can't get power when the card is off */
> @@ -1041,7 +1041,7 @@ static ssize_t amdgpu_hwmon_show_power_avg(struct 
> device *dev,
> return r;
>
> /* convert to microwatts */
> -   uw = (query.average_gpu_power >> 8) * 100;
> +   uw = (query >> 8) * 100 + (query & 0xff) * 1000;
>
> return snprintf(buf, PAGE_SIZE, "%u\n", uw);
>  }
> @@ -1752,7 +1752,7 @@ void amdgpu_pm_compute_clocks(struct amdgpu_device 
> *adev)
>  static int amdgpu_debugfs_pm_info_pp(struct seq_file *m, struct 
> amdgpu_device *adev)
>  {
> uint32_t value;
> -   struct pp_gpu_power query = {0};
> +   uint32_t query = 0;
> int size;
>
> /* sanity check PP is enabled */
> @@ -1775,17 +1775,9 @@ static int amdgpu_debugfs_pm_info_pp(struct seq_file 
> *m, struct amdgpu_device *a
> seq_printf(m, "\t%u mV (VDDGFX)\n", value);
> if (!amdgpu_dpm_read_sensor(adev, AMDGPU_PP_SENSOR_VDDNB, (void 
> *)&value, &size))
> seq_printf(m, "\t%u mV (VDDNB)\n", value);
> -   size = sizeof(query);
> -   if (!amdgpu_dpm_read_sensor(adev, AMDGPU_PP_SENSOR_GPU_POWER, (void 
> *)&query, &size)) {
> -   seq_printf(m, "\t%u.%u W (VDDC)\n", query.vddc_power >> 8,
> -   query.vddc_power & 0xff);
> -   seq_printf(m, "\t%u.%u W (VDDCI)\n", query.vddci_power >> 8,
> -   query.vddci_power & 0xff);
> -   seq_printf(m, "\t%u.%u W (max GPU)\n", query.max_gpu_power >> 
> 8,
> -   query.max_gpu_power & 0xff);
> -   seq_printf(m, "\t%u.%u W (average GPU)\n", 
> query.average_gpu_power >> 8,
> -   query.average_gpu_power & 0xff);
> -   }
> +   size = sizeof(uint32_t);
> +   if (!amdgpu_dp

Re: [PATCH 3/3] drm/amd/pp: Remove struct pp_gpu_power

2018-04-11 Thread Tom St Denis



On 04/11/2018 01:23 PM, Alex Deucher wrote:

On Wed, Apr 11, 2018 at 2:31 AM, Rex Zhu  wrote:

This struct can't be used for all asics.

Currently smu only calculate average gpu power in real time.

for vddc/vddci/max power,
we can read them per 1ms through some registers. but

1. the type of return values is not unified.
 For Vi, return type is uint. For vega, return type is float.

2. need to assign the unit time to calculate the average power.

so remove this struct.

if users need to know the power on vddc/vddci,
we can export it with a new common interface/struct.



If Tom and Eric are ok with this:
Acked-by: Alex Deucher 


It'll break the sensor reading in umr but that's fairly minor and easy 
to fix.  I don't really consider the --top output of umr as mission 
critical (even though it can be handy).  I'm ok with the change.


Once this lands on drm-next I'll upgrade umr to support it.

Cheers,
Tom





Signed-off-by: Rex Zhu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  7 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 22 +++--
  drivers/gpu/drm/amd/include/kgd_pp_interface.h |  7 ---
  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 23 +-
  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 17 +++-
  drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 13 
  6 files changed, 29 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 487d39e..1e59818 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -701,9 +701,6 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
*data, struct drm_file
 }
 }
 case AMDGPU_INFO_SENSOR: {
-   struct pp_gpu_power query = {0};
-   int query_size = sizeof(query);
-
 if (!adev->pm.dpm_enabled)
 return -ENOENT;

@@ -746,10 +743,10 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
*data, struct drm_file
 /* get average GPU power */
 if (amdgpu_dpm_read_sensor(adev,
AMDGPU_PP_SENSOR_GPU_POWER,
-  (void *)&query, 
&query_size)) {
+  (void *)&ui32, &ui32_size)) {
 return -EINVAL;
 }
-   ui32 = query.average_gpu_power >> 8;
+   ui32 >>= 8;
 break;
 case AMDGPU_INFO_SENSOR_VDDNB:
 /* get VDDNB in millivolts */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index e5f60fc..744f105 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -1020,8 +1020,8 @@ static ssize_t amdgpu_hwmon_show_power_avg(struct device 
*dev,
  {
 struct amdgpu_device *adev = dev_get_drvdata(dev);
 struct drm_device *ddev = adev->ddev;
-   struct pp_gpu_power query = {0};
-   int r, size = sizeof(query);
+   u32 query = 0;
+   int r, size = sizeof(u32);
 unsigned uw;

 /* Can't get power when the card is off */
@@ -1041,7 +1041,7 @@ static ssize_t amdgpu_hwmon_show_power_avg(struct device 
*dev,
 return r;

 /* convert to microwatts */
-   uw = (query.average_gpu_power >> 8) * 100;
+   uw = (query >> 8) * 100 + (query & 0xff) * 1000;

 return snprintf(buf, PAGE_SIZE, "%u\n", uw);
  }
@@ -1752,7 +1752,7 @@ void amdgpu_pm_compute_clocks(struct amdgpu_device *adev)
  static int amdgpu_debugfs_pm_info_pp(struct seq_file *m, struct amdgpu_device 
*adev)
  {
 uint32_t value;
-   struct pp_gpu_power query = {0};
+   uint32_t query = 0;
 int size;

 /* sanity check PP is enabled */
@@ -1775,17 +1775,9 @@ static int amdgpu_debugfs_pm_info_pp(struct seq_file *m, 
struct amdgpu_device *a
 seq_printf(m, "\t%u mV (VDDGFX)\n", value);
 if (!amdgpu_dpm_read_sensor(adev, AMDGPU_PP_SENSOR_VDDNB, (void *)&value, 
&size))
 seq_printf(m, "\t%u mV (VDDNB)\n", value);
-   size = sizeof(query);
-   if (!amdgpu_dpm_read_sensor(adev, AMDGPU_PP_SENSOR_GPU_POWER, (void 
*)&query, &size)) {
-   seq_printf(m, "\t%u.%u W (VDDC)\n", query.vddc_power >> 8,
-   query.vddc_power & 0xff);
-   seq_printf(m, "\t%u.%u W (VDDCI)\n", query.vddci_power >> 8,
-   query.vddci_power & 0xff);
-   seq_printf(m, "\t%u.%u W (max GPU)\n", query.max_gpu_power >> 8,
-   query.max_gpu_power & 0xff);
-   seq_printf(m, "\t%u.%u W (average GPU)\n", query.average_gpu

Re: [PATCH] drm/amd/pp: Fix logic error in smu7_check_dpm_table_updated

2018-04-11 Thread Alex Deucher
On Wed, Apr 11, 2018 at 8:00 AM, Rex Zhu  wrote:
> Signed-off-by: Rex Zhu 

Is this really worth it?  It just saves a few cycles, but I'm worried
if we ever extend or rewrite the function we may miss later updates.
At the very least, I'd add a comment saying something like "return as
soon as we know the VDDC has changed, no need to check for additional
VDDC changes"

Alex

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> index 68b9e0b..31d271f 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> @@ -4684,7 +4684,7 @@ static void smu7_check_dpm_table_updated(struct 
> pp_hwmgr *hwmgr)
> for (i=0; i < dep_table->count; i++) {
> if (dep_table->entries[i].vddc != 
> odn_dep_table->entries[i].vddc) {
> data->need_update_smu7_dpm_table |= 
> DPMTABLE_OD_UPDATE_VDDC;
> -   break;
> +   return;
> }
> }
> if (i == dep_table->count)
> @@ -4695,7 +4695,7 @@ static void smu7_check_dpm_table_updated(struct 
> pp_hwmgr *hwmgr)
> for (i=0; i < dep_table->count; i++) {
> if (dep_table->entries[i].vddc != 
> odn_dep_table->entries[i].vddc) {
> data->need_update_smu7_dpm_table |= 
> DPMTABLE_OD_UPDATE_VDDC;
> -   break;
> +   return;
> }
> }
> if (i == dep_table->count)
> --
> 1.9.1
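The pattern under discussion — scan a dependency table and flag an update on the first mismatch — can be sketched in plain, userspace-testable C. The struct layout and names below are simplified stand-ins, not the real smu7 types:

```c
#include <assert.h>
#include <stddef.h>

#define DPMTABLE_OD_UPDATE_VDDC 0x1u

/* Simplified stand-ins for the smu7 dependency tables. */
struct dep_entry { unsigned vddc; };
struct dep_table { size_t count; struct dep_entry entries[8]; };

/*
 * Return as soon as we know the VDDC has changed: once the flag would
 * be set, scanning further entries cannot change the result.
 */
static unsigned check_vddc_updated(const struct dep_table *dep,
                                   const struct dep_table *odn)
{
    for (size_t i = 0; i < dep->count; i++) {
        if (dep->entries[i].vddc != odn->entries[i].vddc)
            return DPMTABLE_OD_UPDATE_VDDC; /* early return, as in the patch */
    }
    return 0;
}
```

This is only a model of the control flow; the real function also checks the other dependency tables and accumulates flags into need_update_smu7_dpm_table.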
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/pp: Clear smu response register before send smu message

2018-04-11 Thread Alex Deucher
On Wed, Apr 11, 2018 at 6:20 AM, Rex Zhu  wrote:
> The SMU firmware does not update the response register immediately
> for some delayed tasks, so we may read back the stale value from the
> previous message.
>
> So clear the response register before sending a new message.
>
> Signed-off-by: Rex Zhu 

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c   | 4 +---
>  drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 1 +
>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c
> index 2a93f3a..2d4ec8a 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c
> @@ -208,9 +208,7 @@ static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, 
> uint16_t msg)
>  {
> int ret;
>
> -   if (!ci_is_smc_ram_running(hwmgr))
> -   return -EINVAL;
> -
> +   cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0);
> cgs_write_register(hwmgr->device, mmSMC_MESSAGE_0, msg);
>
> PHM_WAIT_FIELD_UNEQUAL(hwmgr, SMC_RESP_0, SMC_RESP, 0);
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> index 10a1123..64d33b7 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> @@ -176,6 +176,7 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
> msg)
> else if (ret != 1)
> pr_info("\n last message was failed ret is %d\n", ret);
>
> +   cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0);
> cgs_write_register(hwmgr->device, mmSMC_MESSAGE_0, msg);
>
> PHM_WAIT_FIELD_UNEQUAL(hwmgr, SMC_RESP_0, SMC_RESP, 0);
> --
> 1.9.1
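The handshake this patch fixes can be modelled in a few lines of userspace C. The register file and firmware model below are toys; only the clear-before-send ordering mirrors the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file standing in for cgs_read/write_register(). */
enum { REG_SMC_RESP_0, REG_SMC_MESSAGE_0, REG_COUNT };
static uint32_t regs[REG_COUNT];

/* Firmware side of the model: completion writes a nonzero response. */
static void fake_smu_complete(uint32_t resp)
{
    regs[REG_SMC_RESP_0] = resp;
}

/*
 * Fixed send path: clear the response register first, so a poll for
 * the new message cannot mistake a leftover response for a fresh one.
 */
static void send_msg(uint32_t msg)
{
    regs[REG_SMC_RESP_0] = 0;          /* the clear added by the patch */
    regs[REG_SMC_MESSAGE_0] = msg;
}

/* The poll loop waits while this is true (SMC_RESP still zero). */
static int response_pending(void)
{
    return regs[REG_SMC_RESP_0] == 0;
}
```

Without the clear in send_msg(), a stale nonzero SMC_RESP_0 would make the poll return immediately with the previous message's result.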
>


[PATCH 2/2] drm/amdgpu: Allow dma_map_sg() coalescing

2018-04-11 Thread Robin Murphy
Now that drm_prime_sg_to_page_addr_arrays() understands the case where
dma_map_sg() has coalesced segments and returns 0 < count < nents, we
can relax the check to only consider genuine failure.

Signed-off-by: Robin Murphy 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 205da3ff9cd0..f81e96a4242f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -813,7 +813,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 
r = -ENOMEM;
nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-   if (nents != ttm->sg->nents)
+   if (nents == 0)
goto release_sg;
 
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
-- 
2.16.1.dirty



[PATCH 1/2] drm/prime: Iterate SG DMA addresses separately

2018-04-11 Thread Robin Murphy
For dma_map_sg(), DMA API implementations are free to merge consecutive
segments into a single DMA mapping if conditions are suitable, thus the
resulting DMA addresses may be packed into fewer entries than
ttm->sg->nents implies.

drm_prime_sg_to_page_addr_arrays() does not account for this, meaning
its callers either have to reject the 0 < count < nents case or risk
getting bogus addresses back later. Fortunately this is relatively easy
to deal with without having to rejig structures to also store the mapped count,
since the total DMA length should still be equal to the total buffer
length. All we need is a separate scatterlist cursor to iterate the DMA
addresses separately from the CPU addresses.

Signed-off-by: Robin Murphy 
---

Off the back of Sinan's proposal for a workaround, I took a closer look
and this jumped out - I have no hardware to test it, nor do I really
know my way around this code, so I'm probably missing something, but at
face value this seems like the only obvious problem, and worth fixing
either way.

These patches are based on drm-next, and compile-tested (for arm64) only.

Robin.

 drivers/gpu/drm/drm_prime.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7856a9b3f8a8..db3dc8489afc 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -933,16 +933,18 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table 
*sgt, struct page **pages,
 dma_addr_t *addrs, int max_entries)
 {
unsigned count;
-   struct scatterlist *sg;
+   struct scatterlist *sg, *dma_sg;
struct page *page;
-   u32 len, index;
+   u32 len, dma_len, index;
dma_addr_t addr;
 
index = 0;
+   dma_sg = sgt->sgl;
+   dma_len = sg_dma_len(dma_sg);
+   addr = sg_dma_address(dma_sg);
for_each_sg(sgt->sgl, sg, sgt->nents, count) {
len = sg->length;
page = sg_page(sg);
-   addr = sg_dma_address(sg);
 
while (len > 0) {
if (WARN_ON(index >= max_entries))
@@ -957,6 +959,12 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, 
struct page **pages,
len -= PAGE_SIZE;
index++;
}
+
+   if (dma_len == 0) {
+   dma_sg = sg_next(dma_sg);
+   dma_len = sg_dma_len(dma_sg);
+   addr = sg_dma_address(dma_sg);
+   }
}
return 0;
 }
-- 
2.16.1.dirty
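The dual-cursor walk this patch introduces can be exercised in isolation with a userspace sketch (with the `dma_len -= PAGE_SIZE` decrement the loop relies on in place). The `struct seg` below is a stand-in, not the kernel's scatterlist:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

/* Stand-in for a mapped scatterlist entry, not the kernel struct. */
struct seg {
    unsigned long dma_addr;
    unsigned int dma_len;
};

/*
 * Walk the CPU pages and the DMA segments with separate cursors: after
 * coalescing, the DMA list can have fewer entries than there are
 * pages, but its total length still covers every page exactly once.
 */
static size_t expand_dma(const struct seg *dma, size_t dma_nents,
                         size_t npages, unsigned long *addrs)
{
    size_t di = 0, index = 0;
    unsigned long addr = dma[0].dma_addr;
    unsigned int dma_len = dma[0].dma_len;

    while (index < npages) {
        addrs[index++] = addr;
        addr += PAGE_SIZE;
        dma_len -= PAGE_SIZE;           /* consume the current segment */
        if (dma_len == 0 && di + 1 < dma_nents) {
            di++;                       /* advance the DMA cursor */
            addr = dma[di].dma_addr;
            dma_len = dma[di].dma_len;
        }
    }
    return index;
}
```

With one coalesced 8K segment followed by a 4K one, three per-page addresses come out, the second being the first segment's base plus PAGE_SIZE.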



Re: [PATCH 1/3] Revert "drm/amd/powerply: fix power reading on Fiji"

2018-04-11 Thread Eric Huang

If it is verified by SMU team, I am OK with it.

Regards,
Eric

On 2018-04-11 01:19 PM, Alex Deucher wrote:

On Wed, Apr 11, 2018 at 2:31 AM, Rex Zhu  wrote:

We don't have the [50ms, 4sec] sampling-period limit: the SMU
calculates the average GPU power in real time, and we can read it
either through an SMU message or from a special register.

This reverts commit 462d8dcc9fec0d89f1ff6a1f93f1d4f670878c71.

Signed-off-by: Rex Zhu 

If Eric is ok with this,

Acked-by: Alex Deucher 


---
  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 21c021b..388184e 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -3364,8 +3364,7 @@ static int smu7_get_gpu_power(struct pp_hwmgr *hwmgr,
 "Failed to start pm status log!",
 return -1);

-   /* Sampling period from 50ms to 4sec */
-   msleep_interruptible(200);
+   msleep_interruptible(20);

 PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
 PPSMC_MSG_PmStatusLogSample),
--
1.9.1



Re: [PATCH 2/3] drm/amd/pp: Refine get_gpu_power for VI

2018-04-11 Thread Eric Huang
This patch changes the power register reading from average to maximum.
If the SMU team verifies it, I am OK with it.


Regards,
Eric

On 2018-04-11 01:21 PM, Alex Deucher wrote:

On Wed, Apr 11, 2018 at 2:31 AM, Rex Zhu  wrote:

pkgpwr is the average GPU power over 100ms, calculated by the
firmware in real time.

1. We can send the SMU message PPSMC_MSG_GetCurrPkgPwr to read
currentpkgpwr directly.

2. Fiji/Tonga/Bonaire/Hawaii lack PPSMC_MSG_GetCurrPkgPwr support.
There, send PPSMC_MSG_PmStatusLogStart/Sample to have the SMU write
currentpkgpwr to ixSMU_PM_STATUS_94, which the driver can then read.

Signed-off-by: Rex Zhu 

Assuming Eric is ok with removing the other power readings,

Acked-by: Alex Deucher 


---
  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 51 --
  drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 10 +++--
  2 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 388184e..20f5a6f 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -3359,30 +3359,33 @@ static int smu7_get_pp_table_entry(struct pp_hwmgr 
*hwmgr,
  static int smu7_get_gpu_power(struct pp_hwmgr *hwmgr,
 struct pp_gpu_power *query)
  {
-   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
-   PPSMC_MSG_PmStatusLogStart),
-   "Failed to start pm status log!",
-   return -1);
-
-   msleep_interruptible(20);
-
-   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
-   PPSMC_MSG_PmStatusLogSample),
-   "Failed to sample pm status log!",
-   return -1);
-
-   query->vddc_power = cgs_read_ind_register(hwmgr->device,
-   CGS_IND_REG__SMC,
-   ixSMU_PM_STATUS_40);
-   query->vddci_power = cgs_read_ind_register(hwmgr->device,
-   CGS_IND_REG__SMC,
-   ixSMU_PM_STATUS_49);
-   query->max_gpu_power = cgs_read_ind_register(hwmgr->device,
-   CGS_IND_REG__SMC,
-   ixSMU_PM_STATUS_94);
-   query->average_gpu_power = cgs_read_ind_register(hwmgr->device,
-   CGS_IND_REG__SMC,
-   ixSMU_PM_STATUS_95);
+   int i;
+
+   if (!query)
+   return -EINVAL;
+
+
+   memset(query, 0, sizeof *query);
+
+   smum_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_GetCurrPkgPwr, 0);
+   query->average_gpu_power = cgs_read_register(hwmgr->device, 
mmSMC_MSG_ARG_0);
+
+   if (query->average_gpu_power != 0)
+   return 0;
+
+   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogStart);
+   cgs_write_ind_register(hwmgr->device, CGS_IND_REG__SMC,
+   ixSMU_PM_STATUS_94, 0);
+
+   for (i = 0; i < 20; i++) {
+   mdelay(1);
+   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogSample);
+   query->average_gpu_power = cgs_read_ind_register(hwmgr->device,
+   CGS_IND_REG__SMC,
+   ixSMU_PM_STATUS_94);
+   if (query->average_gpu_power != 0)
+   break;
+   }

 return 0;
  }
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index fb32a3f..10a1123 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -171,8 +171,10 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
msg)

 ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);

-   if (ret != 1)
-   pr_info("\n failed to send pre message %x ret is %d \n",  msg, 
ret);
+   if (ret == 0xFE)
+   pr_debug("last message was not supported\n");
+   else if (ret != 1)
+   pr_info("\n last message was failed ret is %d\n", ret);

 cgs_write_register(hwmgr->device, mmSMC_MESSAGE_0, msg);

@@ -180,7 +182,9 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
msg)

 ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);

-   if (ret != 1)
+   if (ret == 0xFE)
+   pr_debug("message %x was not supported\n", msg);
+   else if (ret != 1)
 pr_info("\n failed to send message %x ret is %d \n",  msg, 
ret);

 return 0;
--
1.9.1
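The retry loop added in smu7_get_gpu_power() is an instance of a common pattern: poll a status source until it reports ready or a try budget runs out. A userspace sketch under that reading (the sampler callback is hypothetical, standing in for the mdelay(1) plus PmStatusLogSample plus register read):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sampler callback: returns 0 until a sample is ready. */
typedef uint32_t (*read_sample_fn)(void *ctx);

/* Bounded poll mirroring the patch: up to 20 tries, stop on nonzero. */
static uint32_t poll_gpu_power(read_sample_fn read_sample, void *ctx)
{
    uint32_t power = 0;

    for (int i = 0; i < 20 && power == 0; i++)
        power = read_sample(ctx);
    return power;  /* 0 means every try came back empty (timeout) */
}

/* Test sampler that becomes ready on the third call. */
static uint32_t ready_after_three(void *ctx)
{
    int *calls = ctx;
    return ++(*calls) >= 3 ? 0x1234u : 0u;
}

/* Test sampler that never becomes ready. */
static uint32_t never_ready(void *ctx)
{
    (void)ctx;
    return 0u;
}
```

Note the patch's loop, like this sketch, silently returns 0 power on timeout rather than an error — callers must treat 0 as "no sample" if they care.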




Re: [PATCH 3/3] drm/amd/pp: Remove struct pp_gpu_power

2018-04-11 Thread Eric Huang

I am OK with this change.

Regards,
Eric

On 2018-04-11 01:25 PM, Tom St Denis wrote:



On 04/11/2018 01:23 PM, Alex Deucher wrote:

On Wed, Apr 11, 2018 at 2:31 AM, Rex Zhu  wrote:

This struct can't be used for all ASICs.

Currently the SMU only calculates the average GPU power in real time.

For vddc/vddci/max power, we can read them every 1ms through some
registers, but:

1. The return types are not unified: for VI the return type is uint,
for Vega it is float.

2. We would need to choose a unit time over which to calculate the
average power.

So remove this struct. If users need to know the vddc/vddci power, we
can export it later with a new common interface/struct.



If Tom and Eric are ok with this:
Acked-by: Alex Deucher 


It'll break the sensor reading in umr but that's fairly minor and easy 
to fix.  I don't really consider the --top output of umr as mission 
critical (even though it can be handy).  I'm ok with the change.


Once this lands on drm-next I'll upgrade umr to support it.

Cheers,
Tom





Signed-off-by: Rex Zhu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |  7 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 22 
+++--

  drivers/gpu/drm/amd/include/kgd_pp_interface.h |  7 ---
  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 23 
+-
  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 17 
+++-

  drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 13 
  6 files changed, 29 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c

index 487d39e..1e59818 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -701,9 +701,6 @@ static int amdgpu_info_ioctl(struct drm_device 
*dev, void *data, struct drm_file

 }
 }
 case AMDGPU_INFO_SENSOR: {
-   struct pp_gpu_power query = {0};
-   int query_size = sizeof(query);
-
 if (!adev->pm.dpm_enabled)
 return -ENOENT;

@@ -746,10 +743,10 @@ static int amdgpu_info_ioctl(struct drm_device 
*dev, void *data, struct drm_file

 /* get average GPU power */
 if (amdgpu_dpm_read_sensor(adev,
AMDGPU_PP_SENSOR_GPU_POWER,
-  (void *)&query, 
&query_size)) {
+  (void *)&ui32, 
&ui32_size)) {

 return -EINVAL;
 }
-   ui32 = query.average_gpu_power >> 8;
+   ui32 >>= 8;
 break;
 case AMDGPU_INFO_SENSOR_VDDNB:
 /* get VDDNB in millivolts */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c

index e5f60fc..744f105 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -1020,8 +1020,8 @@ static ssize_t 
amdgpu_hwmon_show_power_avg(struct device *dev,

  {
 struct amdgpu_device *adev = dev_get_drvdata(dev);
 struct drm_device *ddev = adev->ddev;
-   struct pp_gpu_power query = {0};
-   int r, size = sizeof(query);
+   u32 query = 0;
+   int r, size = sizeof(u32);
 unsigned uw;

 /* Can't get power when the card is off */
@@ -1041,7 +1041,7 @@ static ssize_t 
amdgpu_hwmon_show_power_avg(struct device *dev,

 return r;

 /* convert to microwatts */
-   uw = (query.average_gpu_power >> 8) * 100;
+   uw = (query >> 8) * 100 + (query & 0xff) * 1000;

 return snprintf(buf, PAGE_SIZE, "%u\n", uw);
  }
@@ -1752,7 +1752,7 @@ void amdgpu_pm_compute_clocks(struct 
amdgpu_device *adev)
  static int amdgpu_debugfs_pm_info_pp(struct seq_file *m, struct 
amdgpu_device *adev)

  {
 uint32_t value;
-   struct pp_gpu_power query = {0};
+   uint32_t query = 0;
 int size;

 /* sanity check PP is enabled */
@@ -1775,17 +1775,9 @@ static int amdgpu_debugfs_pm_info_pp(struct 
seq_file *m, struct amdgpu_device *a

 seq_printf(m, "\t%u mV (VDDGFX)\n", value);
 if (!amdgpu_dpm_read_sensor(adev, AMDGPU_PP_SENSOR_VDDNB, 
(void *)&value, &size))

 seq_printf(m, "\t%u mV (VDDNB)\n", value);
-   size = sizeof(query);
-   if (!amdgpu_dpm_read_sensor(adev, 
AMDGPU_PP_SENSOR_GPU_POWER, (void *)&query, &size)) {
-   seq_printf(m, "\t%u.%u W (VDDC)\n", query.vddc_power 
>> 8,

-   query.vddc_power & 0xff);
-   seq_printf(m, "\t%u.%u W (VDDCI)\n", 
query.vddci_power >> 8,

-   query.vddci_power & 0xff);
-   seq_printf(m, "\t%u.%u W (max GPU)\n", 
query.max_gpu_power >> 8,

-   query.max_gpu_power & 0xff);
-   se
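The `>> 8` / `& 0xff` handling in these hunks suggests the sensor packs watts as an 8.8 fixed-point value — integer watts in the high bits, 1/256ths of a watt in the low byte. Under that assumption (the thread does not spell the format out), a rounding conversion to milliwatts would look like:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed format: 8.8 fixed-point watts (integer watts in the high
 * bits, 1/256ths of a watt in the low byte). Convert to milliwatts,
 * rounding the fractional part to the nearest 1/256 W.
 */
static uint32_t fp8_8_to_milliwatts(uint32_t q)
{
    return (q >> 8) * 1000u + ((q & 0xffu) * 1000u + 128u) / 256u;
}
```

Printing the low byte directly with "%u.%u", as the removed seq_printf() calls did, only approximates the fraction; a conversion like the above gives a proper decimal value.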

Re: [1/2] drm/prime: Iterate SG DMA addresses separately

2018-04-11 Thread Robin Murphy

On 11/04/18 18:11, Robin Murphy wrote:

For dma_map_sg(), DMA API implementations are free to merge consecutive
segments into a single DMA mapping if conditions are suitable, thus the
resulting DMA addresses may be packed into fewer entries than
ttm->sg->nents implies.

drm_prime_sg_to_page_addr_arrays() does not account for this, meaning
its callers either have to reject the 0 < count < nents case or risk
getting bogus addresses back later. Fortunately this is relatively easy
to deal with without having to rejig structures to also store the mapped count,
since the total DMA length should still be equal to the total buffer
length. All we need is a separate scatterlist cursor to iterate the DMA
addresses separately from the CPU addresses.

Signed-off-by: Robin Murphy 
---

Off the back of Sinan's proposal for a workaround, I took a closer look
and this jumped out - I have no hardware to test it, nor do I really
know my way around this code, so I'm probably missing something, but at
face value this seems like the only obvious problem, and worth fixing
either way.

These patches are based on drm-next, and compile-tested (for arm64) only.

Robin.

  drivers/gpu/drm/drm_prime.c | 14 +++---
  1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7856a9b3f8a8..db3dc8489afc 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -933,16 +933,18 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table 
*sgt, struct page **pages,
 dma_addr_t *addrs, int max_entries)
  {
unsigned count;
-   struct scatterlist *sg;
+   struct scatterlist *sg, *dma_sg;
struct page *page;
-   u32 len, index;
+   u32 len, dma_len, index;
dma_addr_t addr;
  
  	index = 0;

+   dma_sg = sgt->sgl;
+   dma_len = sg_dma_len(dma_sg);
+   addr = sg_dma_address(dma_sg);
for_each_sg(sgt->sgl, sg, sgt->nents, count) {
len = sg->length;
page = sg_page(sg);
-   addr = sg_dma_address(sg);
  
  		while (len > 0) {

if (WARN_ON(index >= max_entries))
@@ -957,6 +959,12 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, 
struct page **pages,
len -= PAGE_SIZE;


+   dma_len -= PAGE_SIZE;

Ugh, somehow that bit got lost. Told you I'd have missed something, 
although I was rather assuming it would be something less obvious...


Robin.


index++;
}
+
+   if (dma_len == 0) {
+   dma_sg = sg_next(dma_sg);
+   dma_len = sg_dma_len(dma_sg);
+   addr = sg_dma_address(dma_sg);
+   }
}
return 0;
  }




Re: [PATCH 1/2] drm/prime: Iterate SG DMA addresses separately

2018-04-11 Thread Christian König

Am 11.04.2018 um 19:11 schrieb Robin Murphy:

For dma_map_sg(), DMA API implementations are free to merge consecutive
segments into a single DMA mapping if conditions are suitable, thus the
resulting DMA addresses may be packed into fewer entries than
ttm->sg->nents implies.

drm_prime_sg_to_page_addr_arrays() does not account for this, meaning
its callers either have to reject the 0 < count < nents case or risk
getting bogus addresses back later. Fortunately this is relatively easy
to deal with without having to rejig structures to also store the mapped count,
since the total DMA length should still be equal to the total buffer
length. All we need is a separate scatterlist cursor to iterate the DMA
addresses separately from the CPU addresses.


Mhm, I think I like Sinan's approach better.

The hardware actually needs the dma_address on a page-by-page basis,
so joining multiple consecutive pages into one entry is just
additional overhead which we don't need.


Regards,
Christian.



Signed-off-by: Robin Murphy 
---

Off the back of Sinan's proposal for a workaround, I took a closer look
and this jumped out - I have no hardware to test it, nor do I really
know my way around this code, so I'm probably missing something, but at
face value this seems like the only obvious problem, and worth fixing
either way.

These patches are based on drm-next, and compile-tested (for arm64) only.

Robin.

  drivers/gpu/drm/drm_prime.c | 14 +++---
  1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7856a9b3f8a8..db3dc8489afc 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -933,16 +933,18 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table 
*sgt, struct page **pages,
 dma_addr_t *addrs, int max_entries)
  {
unsigned count;
-   struct scatterlist *sg;
+   struct scatterlist *sg, *dma_sg;
struct page *page;
-   u32 len, index;
+   u32 len, dma_len, index;
dma_addr_t addr;
  
  	index = 0;

+   dma_sg = sgt->sgl;
+   dma_len = sg_dma_len(dma_sg);
+   addr = sg_dma_address(dma_sg);
for_each_sg(sgt->sgl, sg, sgt->nents, count) {
len = sg->length;
page = sg_page(sg);
-   addr = sg_dma_address(sg);
  
  		while (len > 0) {

if (WARN_ON(index >= max_entries))
@@ -957,6 +959,12 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, 
struct page **pages,
len -= PAGE_SIZE;
index++;
}
+
+   if (dma_len == 0) {
+   dma_sg = sg_next(dma_sg);
+   dma_len = sg_dma_len(dma_sg);
+   addr = sg_dma_address(dma_sg);
+   }
}
return 0;
  }




Re: [PATCH 2/2] drm/amdgpu: Allow dma_map_sg() coalescing

2018-04-11 Thread Christian König

Am 11.04.2018 um 19:11 schrieb Robin Murphy:

Now that drm_prime_sg_to_page_addr_arrays() understands the case where
dma_map_sg() has coalesced segments and returns 0 < count < nents, we
can relax the check to only consider genuine failure.


That pattern is repeated in pretty much all drivers using TTM.

So you would need to fix all of them, but as I said I don't think that 
this approach is a good idea.


We essentially wanted to get rid of the dma_address array in the mid
term, and this change goes in exactly the opposite direction.


Regards,
Christian.



Signed-off-by: Robin Murphy 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 205da3ff9cd0..f81e96a4242f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -813,7 +813,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
  
  	r = -ENOMEM;

nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-   if (nents != ttm->sg->nents)
+   if (nents == 0)
goto release_sg;
  
  	drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,




[PATCH v6 00/10] drm/i915: Implement proper fallback training for MST

2018-04-11 Thread Lyude Paul
Latest version of PW series 39642, hopefully this should also actually
come up on intel-gfx and go through CI.

No changes other than rebasing to the current drm-intel-next-queued

Lyude Paul (10):
  drm/atomic: Print debug message on atomic check failure
  drm/i915: Move DP modeset retry work into intel_dp
  drm/dp_mst: Fix naming on drm_atomic_get_mst_topology_state()
  drm/dp_mst: Remove all evil duplicate state pointers
  drm/dp_mst: Make drm_dp_mst_topology_state subclassable
  drm/dp_mst: Add reset_state callback to topology mgr
  drm/i915: Only use one link bw config for MST topologies
  drm/i915: Make intel_dp_mst_atomic_check() idempotent
  drm/dp_mst: Add MST fallback retraining helpers
  drm/i915: Implement proper fallback training for MST

 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |  14 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c|  35 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h|   4 +-
 drivers/gpu/drm/drm_atomic.c   |   5 +-
 drivers/gpu/drm/drm_dp_mst_topology.c  | 548 -
 drivers/gpu/drm/i915/intel_ddi.c   |  10 +-
 drivers/gpu/drm/i915/intel_display.c   |  14 +-
 drivers/gpu/drm/i915/intel_dp.c| 361 ++
 drivers/gpu/drm/i915/intel_dp_link_training.c  |   6 +-
 drivers/gpu/drm/i915/intel_dp_mst.c| 177 +--
 drivers/gpu/drm/i915/intel_drv.h   |  20 +-
 drivers/gpu/drm/nouveau/nv50_display.c |  15 +-
 drivers/gpu/drm/radeon/radeon_dp_mst.c |  13 +-
 include/drm/drm_dp_mst_helper.h|  42 +-
 14 files changed, 1086 insertions(+), 178 deletions(-)

-- 
2.14.3



[PATCH v6 05/10] drm/dp_mst: Make drm_dp_mst_topology_state subclassable

2018-04-11 Thread Lyude Paul
This is useful for drivers (which will probably be all of them soon)
which need to track state that is exclusive to the topology, and not a
specific connector on said topology. This includes things such as the
link rate and lane count that are shared by all of the connectors on the
topology.

Signed-off-by: Lyude Paul 
Cc: Manasi Navare 
Cc: Ville Syrjälä 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 14 +++-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c| 35 +++-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h|  4 +-
 drivers/gpu/drm/drm_dp_mst_topology.c  | 94 +-
 drivers/gpu/drm/i915/intel_dp_mst.c| 13 ++-
 drivers/gpu/drm/nouveau/nv50_display.c | 15 +++-
 drivers/gpu/drm/radeon/radeon_dp_mst.c | 13 ++-
 include/drm/drm_dp_mst_helper.h|  8 ++
 8 files changed, 165 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e42a28e3adc5..2c3660c36732 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3626,9 +3626,17 @@ static int amdgpu_dm_connector_init(struct 
amdgpu_display_manager *dm,
 
drm_connector_register(&aconnector->base);
 
-   if (connector_type == DRM_MODE_CONNECTOR_DisplayPort
-   || connector_type == DRM_MODE_CONNECTOR_eDP)
-   amdgpu_dm_initialize_dp_connector(dm, aconnector);
+   if (connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
+   connector_type == DRM_MODE_CONNECTOR_eDP) {
+   res = amdgpu_dm_initialize_dp_connector(dm, aconnector);
+   if (res) {
+   drm_connector_unregister(&aconnector->base);
+   drm_connector_cleanup(&aconnector->base);
+   aconnector->connector_id = -1;
+
+   goto out_free;
+   }
+   }
 
 #if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) ||\
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 8291d74f26bc..a3a43b1b30df 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -475,22 +475,49 @@ static const struct drm_dp_mst_topology_cbs dm_mst_cbs = {
.register_connector = dm_dp_mst_register_connector
 };
 
-void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
-  struct amdgpu_dm_connector *aconnector)
+static const struct drm_private_state_funcs dm_mst_state_funcs = {
+   .atomic_duplicate_state = drm_atomic_dp_mst_duplicate_topology_state,
+   .atomic_destroy_state = drm_atomic_dp_mst_destroy_topology_state,
+};
+
+int amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
+ struct amdgpu_dm_connector *aconnector)
 {
+   struct drm_dp_mst_topology_state *state =
+   kzalloc(sizeof(*state), GFP_KERNEL);
+   int ret = 0;
+
+   if (!state)
+   return -ENOMEM;
+
aconnector->dm_dp_aux.aux.name = "dmdc";
aconnector->dm_dp_aux.aux.dev = dm->adev->dev;
aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
aconnector->dm_dp_aux.ddc_service = aconnector->dc_link->ddc;
 
-   drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
+   ret = drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
+   if (ret)
+   goto err_aux;
+
aconnector->mst_mgr.cbs = &dm_mst_cbs;
-   drm_dp_mst_topology_mgr_init(
+   aconnector->mst_mgr.funcs = &dm_mst_state_funcs;
+   ret = drm_dp_mst_topology_mgr_init(
&aconnector->mst_mgr,
+   state,
dm->adev->ddev,
&aconnector->dm_dp_aux.aux,
16,
4,
aconnector->connector_id);
+   if (ret)
+   goto err_mst;
+
+   return 0;
+
+err_mst:
+   drm_dp_aux_unregister(&aconnector->dm_dp_aux.aux);
+err_aux:
+   kfree(state);
+   return ret;
 }
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
index 8cf51da26657..d28fb456d2d5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
@@ -29,8 +29,8 @@
 struct amdgpu_display_manager;
 struct amdgpu_dm_connector;
 
-void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
-  struct amdgpu_dm_connector *aconnector);
+int amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
+ struct amdgpu_dm_connector *aconnector);
 voi
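The error path added to amdgpu_dm_initialize_dp_connector() follows the usual kernel unwinding idiom: each failure jumps to a label that undoes exactly the steps completed so far, in reverse order. A self-contained toy version (the resources and names are illustrative only, not the real DRM interfaces):

```c
#include <assert.h>
#include <stdlib.h>

/* Toy resource standing in for the registered AUX channel. */
static int aux_registered;

static int register_aux(int fail)
{
    if (fail)
        return -1;
    aux_registered = 1;
    return 0;
}

static void unregister_aux(void)
{
    aux_registered = 0;
}

/*
 * Each failure jumps to a label that undoes exactly the steps already
 * completed, in reverse order -- the same shape as err_mst/err_aux in
 * the patch.
 */
static int init_connector(int fail_aux, int fail_mgr)
{
    void *state = malloc(16);   /* stands in for the topology state */
    int ret;

    if (!state)
        return -1;

    ret = register_aux(fail_aux);
    if (ret)
        goto err_aux;

    if (fail_mgr) {             /* stands in for topology_mgr_init() */
        ret = -1;
        goto err_mst;
    }

    free(state);                /* toy cleanup; the real mgr keeps it */
    return 0;

err_mst:
    unregister_aux();
err_aux:
    free(state);
    return ret;
}
```

The ordering matters: a failure after AUX registration must unregister the AUX channel *and* free the state, while a failure before it must only free the state — which is exactly what the label placement encodes.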

[PATCH] drm/amdgpu: Always call kfd post reset after reset

2018-04-11 Thread Shaoyun Liu
Even if the reset failed, kfd post reset needs to be called to keep the
lock balanced on the kfd side.

Change-Id: I8b6ef29d7527915611be0b96a9cd039bc75bb0a9
Signed-off-by: Shaoyun Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 78b7d39..90a37ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3231,12 +3231,11 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
*adev,
/* bad news, how to tell it to userspace ? */
dev_info(adev->dev, "GPU reset(%d) failed\n", 
atomic_read(&adev->gpu_reset_counter));
amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_GPU_RESET_FAIL, 0, r);
-   } else {
+   } else
dev_info(adev->dev, "GPU reset(%d) 
successed!\n",atomic_read(&adev->gpu_reset_counter));
-   /*unlock kfd after a successfully recovery*/
-   amdgpu_amdkfd_post_reset(adev);
-   }
 
+   /*unlock kfd */
+   amdgpu_amdkfd_post_reset(adev);
amdgpu_vf_error_trans_all(adev);
adev->in_gpu_reset = 0;
mutex_unlock(&adev->lock_reset);
-- 
1.9.1
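The effect of this change — the pre/post KFD calls staying balanced on every path — can be captured with a tiny model (the lock counter and function names are illustrative, not the real KFD interfaces):

```c
#include <assert.h>

/* Counter modelling the lock taken by pre_reset, dropped by post_reset. */
static int kfd_lock_depth;

static void kfd_pre_reset(void)  { kfd_lock_depth++; }
static void kfd_post_reset(void) { kfd_lock_depth--; }

/*
 * Model of the fixed flow: post_reset runs whether or not the reset
 * succeeded, so the lock count returns to zero on every path.
 */
static int gpu_recover(int reset_ok)
{
    kfd_pre_reset();
    /* ... the reset attempt itself happens here ... */
    kfd_post_reset();           /* unconditional, as in the patch */
    return reset_ok ? 0 : -1;
}
```

Before the patch, the post call only ran on the success branch, leaving the count at 1 after a failed reset — the imbalance the commit message describes.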



Re: [PATCH] drm/amdgpu: Always call kfd post reset after reset

2018-04-11 Thread Alex Deucher
On Wed, Apr 11, 2018 at 3:47 PM, Shaoyun Liu  wrote:
> Even if the reset failed, kfd post reset needs to be called to keep the
> lock balanced on the kfd side.
>
> Change-Id: I8b6ef29d7527915611be0b96a9cd039bc75bb0a9
> Signed-off-by: Shaoyun Liu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 78b7d39..90a37ed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3231,12 +3231,11 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
> /* bad news, how to tell it to userspace ? */
> dev_info(adev->dev, "GPU reset(%d) failed\n", 
> atomic_read(&adev->gpu_reset_counter));
> amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_GPU_RESET_FAIL, 0, 
> r);
> -   } else {
> +   } else
> dev_info(adev->dev, "GPU reset(%d) 
> successed!\n",atomic_read(&adev->gpu_reset_counter));
> -   /*unlock kfd after a successfully recovery*/
> -   amdgpu_amdkfd_post_reset(adev);
> -   }
>
> +   /*unlock kfd */
> +   amdgpu_amdkfd_post_reset(adev);
> amdgpu_vf_error_trans_all(adev);
> adev->in_gpu_reset = 0;
> mutex_unlock(&adev->lock_reset);
> --
> 1.9.1
>


Re: [PATCH] drm/amdgpu: Always call kfd post reset after reset

2018-04-11 Thread Felix Kuehling
On 2018-04-11 03:47 PM, Shaoyun Liu wrote:
> Even if the reset failed, kfd post reset needs to be called to keep the lock
> balanced on the kfd side
>
> Change-Id: I8b6ef29d7527915611be0b96a9cd039bc75bb0a9
> Signed-off-by: Shaoyun Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 78b7d39..90a37ed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3231,12 +3231,11 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
>   /* bad news, how to tell it to userspace ? */
>   dev_info(adev->dev, "GPU reset(%d) failed\n", 
> atomic_read(&adev->gpu_reset_counter));
>   amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_GPU_RESET_FAIL, 0, r);
> - } else {
> + } else
>   dev_info(adev->dev, "GPU reset(%d) 
> successed!\n",atomic_read(&adev->gpu_reset_counter));
> - /*unlock kfd after a successfully recovery*/
> - amdgpu_amdkfd_post_reset(adev);
> - }

Please leave the braces {...}. It's better style to make all branches of
the same if-else-if-...-else use the same braces (or no-braces). With
that fixed, this change is Reviewed-by: Felix Kuehling


>  
> + /*unlock kfd */
> + amdgpu_amdkfd_post_reset(adev);
>   amdgpu_vf_error_trans_all(adev);
>   adev->in_gpu_reset = 0;
>   mutex_unlock(&adev->lock_reset);



Re: [PATCH xf86-video-amdgpu 2/5] Hook up CRTC color management functions

2018-04-11 Thread Leo Li



On 2018-04-11 04:48 AM, Michel Dänzer wrote:

On 2018-04-10 08:02 PM, Leo Li wrote:

On 2018-04-09 11:03 AM, Michel Dänzer wrote:

On 2018-03-26 10:00 PM, sunpeng...@amd.com wrote:

From: "Leo (Sunpeng) Li" 

The functions hook into the output resource creation and property
change functions. CRTC destroy is also hooked up for proper cleanup of
the CRTC property list.

Signed-off-by: Leo (Sunpeng) Li 


[...]


@@ -1933,6 +1933,9 @@ static void
drmmode_output_create_resources(xf86OutputPtr output)
   }
   }
   }
+
+    if (output->crtc)
+    drmmode_crtc_create_resources(output->crtc, output);


output->crtc is only non-NULL here for outputs which are enabled at Xorg
startup; other outputs won't have the new properties.


Is it necessary to have the CRTC properties on an output if the CRTC is
disabled for that output?


The set of properties exposed by an output should be constant throughout
its lifetime.



I see.

I'm not sure what the standard is for listing 'disabled' properties like
this. What do clients expect? Should the properties be configured as 0
and immutable until a valid CRTC is attached?

This would mean that DDX has to know the list of available CRTC
properties before a CRTC is available. I guess we can hard-code a list
of expected properties from kernel DRM, if that's not of concern. And if
they don't exist because of version differences, we can just leave them
disabled.




I've tested hot-plugging with this, and the properties do initialize on
hot-plug.


I suspect you were testing something like DP MST, where the output is
created dynamically on hot-plug. For "static" outputs such as
DVI/HDMI/VGA/LVDS (and "normal" (e)DP), drmmode_output_create_resources
is only called once during Xorg initialization, and output->crtc is NULL
at that point unless the output is enabled from the beginning (which is
only the case if either the output is connected at that point, or is
explicitly configured in xorg.conf to be on).



Yep, I was using a DP... This definitely should be fixed.

Leo






Re: [PATCH xf86-video-amdgpu 3/5] Keep CRTC properties consistent

2018-04-11 Thread Leo Li



On 2018-04-11 04:39 AM, Michel Dänzer wrote:

On 2018-04-10 08:02 PM, Leo Li wrote:

On 2018-04-09 11:03 AM, Michel Dänzer wrote:

On 2018-03-26 10:00 PM, sunpeng...@amd.com wrote:

From: "Leo (Sunpeng) Li" 

In cases where CRTC properties are updated without going through
RRChangeOutputProperty, we don't update the properties in user land.

Consider setting legacy gamma. It doesn't go through
RRChangeOutputProperty, but modifies the CRTC's color management
properties. Unless they are updated, the user properties will remain
stale.


Can you describe a bit more how the legacy gamma and the new properties
interact?



Sure thing, I'll include this in the message for v2:

In kernel, the legacy set gamma interface is essentially an adapter to
the non-legacy set properties interface. In the end, they both set the
same property to a DRM property blob, which contains the gamma lookup
table. The key difference between them is how this blob is created.

For legacy gamma, the kernel takes 3 arrays from user-land, and creates
the blob using them. Note that a blob is identified by its blob_id.

For non-legacy gamma, the kernel takes a blob_id from user-land that
references the blob. This means user-land is responsible for creating
the blob.

From the perspective of RandR, this presents some problems. Since both
paths modify the same property, RandR must keep the reported property
value up-to-date with which ever path is used:

1. Legacy gamma via
xrandr --output  --gamma x:x:x
2. Non-legacy color properties via
xrandr --output  --set GAMMA_LUT 

Keeping the value up-to-date isn't a problem for 2, since RandR updates
it for us as part of changing output properties.

But if 1 is used, the property blob is created within kernel, and RandR
is unaware of the new blob_id. To update it, we need to ask kernel about
it.

--- continue with rest of message ---



Therefore, add a function to update user CRTC properties by querying
DRM,
and call it whenever legacy gamma is changed.


Note that drmmode_crtc_gamma_do_set is called from
drmmode_set_mode_major, i.e. on every modeset or under some
circumstances when a DRI3 client stops page flipping.



The property will have to be updated each time the legacy set gamma
ioctl is called, since a new blob (with a new blob_id) is created each
time.

Not sure if this is a good idea, but perhaps we can have a flag that
explicitly enable one or the other, depending on user preference? A
user-only property with something like:

0: Use legacy gamma, calls to change non-legacy properties are ignored.
1: Use non-legacy, calls to legacy gamma will be ignored.

On 0, we can remove/disable all non-legacy properties from the property
list, and avoid having to update them. On 1, we'll enable the
properties, and won't have to update them either since legacy gamma is
"disabled". It has the added benefit of avoiding unexpected legacy gamma
sets when using non-legacy, and vice versa.


Hmm. So either legacy or non-legacy clients won't work at all, or
they'll step on each other's toes, clobbering the HW gamma LUT from each
other.

I'm afraid neither of those alternatives sound like a good user
experience to me.

Consider on the one hand something like Night Light / redshift, using
legacy APIs to adjust colour temperature to the time of day. On the
other hand, another client using the non-legacy API for say fine-tuning
of a display's advanced gamma capabilities.

Ideally, in this case the gamma LUTs from the legacy and non-legacy APIs
should be combined, such that the hardware LUT reflects both the colour
temperature set by Night Light / refshift and the fine-tuning set by the
non-legacy client. Is that feasible? If not, can you explain a little why?



Hmm, combining LUTs could be feasible, but I don't think that's the
right approach.

LUTs can be combined through composition h(x) = f(g(x)), with some
interpolation involved. Once combined, it can be set using the
non-legacy API, since that'll likely have a larger LUT size.

But I think what you've mentioned raises a bigger issue of color
management conflicts in general. It doesn't have to be redshift vs
monitor correction: what if there are more than two applications contending to
set gamma? xrandr's brightness already conflicts with redshift, and so
do some apps running on WINE. Ultimately, all (legacy or not) gamma
set requests are unified into one CRTC property in DRM, and they will
all conflict if not managed. I don't think combining two LUTs will help
here.

For the small example at hand, the ideal solution is to have redshift
use the color transformation matrix (CTM), which will be exposed as part
of the non-legacy color API. It'll need modification of redshift, but at
least it won't conflict with anything gamma related.



Jumping back on some patch 1 topics:

Are there ways to get arbitrarily (no more than 4096 elements) sized
arrays from a client, to the DDX driver? Otherwise, I'm not sure if
there's another way to expose these properties,

Re: [PATCH xf86-video-amdgpu 1/5] Add functions for changing CRTC color management properties

2018-04-11 Thread Leo Li



On 2018-04-11 04:39 AM, Michel Dänzer wrote:

On 2018-04-10 08:02 PM, Leo Li wrote:

On 2018-04-09 11:03 AM, Michel Dänzer wrote:

On 2018-03-26 10:00 PM, sunpeng...@amd.com wrote:

From: "Leo (Sunpeng) Li" 

This change adds a few functions in preparation for enabling CRTC color
management via the RandR interface.

The driver-private CRTC object now contains a list of properties,
mirroring the driver-private output object. The lifecycle of the CRTC
properties will also mirror the output.

Since color management properties are all DRM blobs, we'll expose the
ability to change the blob ID. The user can create blobs via libdrm
(which can be done without ownership of DRM master), then set the ID via
xrandr. The user will then have to ensure proper cleanup by subsequently
releasing the blob.


That sounds a bit clunky. :)

When changing a blob ID, the change only takes effect on the next atomic
commit, doesn't it? How does the client trigger the atomic commit?


From the perspective of a client that wishes to change a property, the
process between regular properties and blob properties should be
essentially the same. Both will trigger an atomic commit when the DRM
set property ioctl is called from our DDX driver.


That doesn't sound right — if every set property ioctl call implicitly
triggered an atomic commit, the KMS atomic API wouldn't be "atomic" at all.

Do you mean that the DDX driver triggers an atomic commit on each
property change? How is that done?



Yes,

In drmmode_display.c: drmmode_output_set_property(), a call is made to
drmModeConnectorSetProperty() [libdrm].

And if drmmode_crtc_set_property() is considered, then a call can be
made to drmModeObjectSetProperty() [libdrm].

Both of these functions trigger an ioctl that eventually calls
drm_mode_obj_set_property_ioctl() within the kernel. It will trigger an
atomic commit, if the kernel driver supports it.

To be "truly atomic", the set of properties that DDX wish to change must
be compiled together using drmModeAtomicAddProperty(), then committed
all at once via drmModeAtomicCommit(). Intel's IGT test suite has a good
example of this, within igt_atomic_commit() here:
https://cgit.freedesktop.org/drm/igt-gpu-tools/tree/lib/igt_kms.c#n2997
The for-loops will add the properties, which are the committed at the end.




The only difference is that DRM property blobs can be arbitrary in size,
and needs to be passed by reference through its DRM-defined blob ID.
Because of this, the client has to create the blob, save its ID, call
libXrandr to change it, then destroy the blob after it's been committed.

The client has to call libXrandr due to DRM permissions. IIRC, there can
be only one DRM master. And since xserver is DRM master, an external
application cannot set DRM properties unless it goes through X. However,
creating and destroying DRM property blobs can be done by anyone.

Was this the source of the clunkiness? I've thought about having DDX
create and destroy the blob instead, but that needs an interface for the
client to get arbitrarily sized data to DDX. I'm not aware of any good
ways to do so. Don't think the kernel can do this for us either. It does
create the blob for legacy gamma, but that's because there's a dedicated
ioctl for it.


Maybe there are better approaches than exposing these properties to
clients directly. See also my latest follow-up on patch 3.



I'll reply to this in patch 3 as well.

Leo




@@ -1604,6 +1623,18 @@ static void drmmode_output_dpms(xf86OutputPtr
output, int mode)
   }
   }
   +static Bool drmmode_crtc_property_ignore(drmModePropertyPtr prop)
+{
+    if (!prop)
+    return TRUE;
+    /* Ignore CRTC gamma lut sizes */
+    if (!strcmp(prop->name, "GAMMA_LUT_SIZE") ||
+    !strcmp(prop->name, "DEGAMMA_LUT_SIZE"))
+    return TRUE;


Without these properties, how can a client know the LUT sizes?


Good point. I originally thought the sizes were fixed and did not need
exposing. But they may change, or even differ per ASIC.


AFAIK at least Intel hardware already has different sizes in different
hardware generations. We should avoid creating an API which couldn't
also work for the modesetting driver and generic clients.





[PATCH V3] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers

2018-04-11 Thread Sinan Kaya
Code is expecting to observe the same number of buffers returned from
the dma_map_sg() function as were allocated by sg_alloc_table_from_pages().
This doesn't hold true universally, especially on systems with an IOMMU.

The hardware actually needs the dma_address on a page-by-page basis.
Joining multiple consecutive pages into one entry is just additional
overhead which we don't need.

Signed-off-by: Sinan Kaya 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index af1b879..91abe1a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1948,6 +1948,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* init the mode config */
drm_mode_config_init(adev->ddev);
 
+   /* limit scatter-gather buffer size to PAGE_SIZE */
+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
+
r = amdgpu_device_ip_init(adev);
if (r) {
/* failed in exclusive mode due to timeout */
-- 
2.7.4



[PATCH v7 00/10] drm/i915: Implement proper fallback training for MST

2018-04-11 Thread Lyude Paul
Next version of https://patchwork.freedesktop.org/series/41576/

All changes in this patch series are just to make checkpatch a little
happier, no functional changes.

Lyude Paul (10):
  drm/atomic: Print debug message on atomic check failure
  drm/i915: Move DP modeset retry work into intel_dp
  drm/dp_mst: Fix naming on drm_atomic_get_mst_topology_state()
  drm/dp_mst: Remove all evil duplicate state pointers
  drm/dp_mst: Make drm_dp_mst_topology_state subclassable
  drm/dp_mst: Add reset_state callback to topology mgr
  drm/i915: Only use one link bw config for MST topologies
  drm/i915: Make intel_dp_mst_atomic_check() idempotent
  drm/dp_mst: Add MST fallback retraining helpers
  drm/i915: Implement proper fallback training for MST

 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |  14 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c|  46 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h|   4 +-
 drivers/gpu/drm/drm_atomic.c   |   5 +-
 drivers/gpu/drm/drm_dp_mst_topology.c  | 550 -
 drivers/gpu/drm/i915/intel_ddi.c   |  10 +-
 drivers/gpu/drm/i915/intel_display.c   |  14 +-
 drivers/gpu/drm/i915/intel_dp.c| 376 ++
 drivers/gpu/drm/i915/intel_dp_link_training.c  |   6 +-
 drivers/gpu/drm/i915/intel_dp_mst.c| 186 +--
 drivers/gpu/drm/i915/intel_drv.h   |  20 +-
 drivers/gpu/drm/nouveau/nv50_display.c |  17 +-
 drivers/gpu/drm/radeon/radeon_dp_mst.c |  13 +-
 include/drm/drm_dp_mst_helper.h|  43 +-
 14 files changed, 1120 insertions(+), 184 deletions(-)

-- 
2.14.3



[PATCH v7 05/10] drm/dp_mst: Make drm_dp_mst_topology_state subclassable

2018-04-11 Thread Lyude Paul
This is useful for drivers (which will probably be all of them soon)
which need to track state that is exclusive to the topology, and not a
specific connector on said topology. This includes things such as the
link rate and lane count that are shared by all of the connectors on the
topology.

Signed-off-by: Lyude Paul 
Cc: Manasi Navare 
Cc: Ville Syrjälä 

V7:
 - Fix CHECKPATCH errors
Signed-off-by: Lyude Paul 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 14 +++-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c| 46 ---
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h|  4 +-
 drivers/gpu/drm/drm_dp_mst_topology.c  | 95 +-
 drivers/gpu/drm/i915/intel_dp_mst.c| 13 ++-
 drivers/gpu/drm/nouveau/nv50_display.c | 17 +++-
 drivers/gpu/drm/radeon/radeon_dp_mst.c | 13 ++-
 include/drm/drm_dp_mst_helper.h|  8 ++
 8 files changed, 173 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e42a28e3adc5..2c3660c36732 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3626,9 +3626,17 @@ static int amdgpu_dm_connector_init(struct 
amdgpu_display_manager *dm,
 
drm_connector_register(&aconnector->base);
 
-   if (connector_type == DRM_MODE_CONNECTOR_DisplayPort
-   || connector_type == DRM_MODE_CONNECTOR_eDP)
-   amdgpu_dm_initialize_dp_connector(dm, aconnector);
+   if (connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
+   connector_type == DRM_MODE_CONNECTOR_eDP) {
+   res = amdgpu_dm_initialize_dp_connector(dm, aconnector);
+   if (res) {
+   drm_connector_unregister(&aconnector->base);
+   drm_connector_cleanup(&aconnector->base);
+   aconnector->connector_id = -1;
+
+   goto out_free;
+   }
+   }
 
 #if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) ||\
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 8291d74f26bc..dcaa92d12cbc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -475,22 +475,48 @@ static const struct drm_dp_mst_topology_cbs dm_mst_cbs = {
.register_connector = dm_dp_mst_register_connector
 };
 
-void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
-  struct amdgpu_dm_connector *aconnector)
+static const struct drm_private_state_funcs dm_mst_state_funcs = {
+   .atomic_duplicate_state = drm_atomic_dp_mst_duplicate_topology_state,
+   .atomic_destroy_state = drm_atomic_dp_mst_destroy_topology_state,
+};
+
+int amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
+ struct amdgpu_dm_connector *aconnector)
 {
+   struct drm_dp_mst_topology_state *state =
+   kzalloc(sizeof(*state), GFP_KERNEL);
+   int ret = 0;
+
+   if (!state)
+   return -ENOMEM;
+
aconnector->dm_dp_aux.aux.name = "dmdc";
aconnector->dm_dp_aux.aux.dev = dm->adev->dev;
aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
aconnector->dm_dp_aux.ddc_service = aconnector->dc_link->ddc;
 
-   drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
+   ret = drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
+   if (ret)
+   goto err_aux;
+
aconnector->mst_mgr.cbs = &dm_mst_cbs;
-   drm_dp_mst_topology_mgr_init(
-   &aconnector->mst_mgr,
-   dm->adev->ddev,
-   &aconnector->dm_dp_aux.aux,
-   16,
-   4,
-   aconnector->connector_id);
+   aconnector->mst_mgr.funcs = &dm_mst_state_funcs;
+   ret = drm_dp_mst_topology_mgr_init(&aconnector->mst_mgr,
+  state,
+  dm->adev->ddev,
+  &aconnector->dm_dp_aux.aux,
+  16,
+  4,
+  aconnector->connector_id);
+   if (ret)
+   goto err_mst;
+
+   return 0;
+
+err_mst:
+   drm_dp_aux_unregister(&aconnector->dm_dp_aux.aux);
+err_aux:
+   kfree(state);
+   return ret;
 }
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
index 8cf51da26657..d28fb456d2d5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_

[PATCH] drm/amdgpu/powerplay: fix smu7_get_memory_type for fiji

2018-04-11 Thread Alex Deucher
Fiji uses a different register than other smu7 asics, but
we already have this info in the base driver so just
use that.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 21c021ba0f49..97b7c2333f19 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -4150,13 +4150,9 @@ static int smu7_read_clock_registers(struct pp_hwmgr 
*hwmgr)
 static int smu7_get_memory_type(struct pp_hwmgr *hwmgr)
 {
struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
-   uint32_t temp;
-
-   temp = cgs_read_register(hwmgr->device, mmMC_SEQ_MISC0);
+   struct amdgpu_device *adev = hwmgr->adev;
 
-   data->is_memory_gddr5 = (MC_SEQ_MISC0_GDDR5_VALUE ==
-   ((temp & MC_SEQ_MISC0_GDDR5_MASK) >>
-MC_SEQ_MISC0_GDDR5_SHIFT));
+   data->is_memory_gddr5 = (adev->gmc.vram_type == AMDGPU_VRAM_TYPE_GDDR5);
 
return 0;
 }
-- 
2.13.6



[PATCH] drm/amdgpu/powerplay: rename smu7_upload_mc_firmware

2018-04-11 Thread Alex Deucher
It doesn't actually upload any firmware; it just
checks the version.  The actual upload happens in
the GMC modules.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 97b7c2333f19..ed43dd39b5d6 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -4071,7 +4071,7 @@ static int smu7_check_states_equal(struct pp_hwmgr *hwmgr,
return 0;
 }
 
-static int smu7_upload_mc_firmware(struct pp_hwmgr *hwmgr)
+static int smu7_check_mc_firmware(struct pp_hwmgr *hwmgr)
 {
struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
 
@@ -4200,7 +4200,7 @@ static int smu7_setup_asic_task(struct pp_hwmgr *hwmgr)
 {
int tmp_result, result = 0;
 
-   smu7_upload_mc_firmware(hwmgr);
+   smu7_check_mc_firmware(hwmgr);
 
tmp_result = smu7_read_clock_registers(hwmgr);
PP_ASSERT_WITH_CODE((0 == tmp_result),
-- 
2.13.6



RE: RFC for a render API to support adaptive sync and VRR

2018-04-11 Thread Cyr, Aric
> From: Michel Dänzer [mailto:mic...@daenzer.net]
> Sent: Wednesday, April 11, 2018 05:50
> On 2018-04-11 08:57 AM, Nicolai Hähnle wrote:
> > On 10.04.2018 23:45, Cyr, Aric wrote:
> >> How does it work fine today given that all kernel seems to know is
> >> 'current' or 'current+1' vsyncs.
> >> Presumably the applications somehow schedule all this just fine.
> >> If this works without variable refresh for 60Hz, will it not work for
> >> a fixed-rate "48Hz" monitor (assuming a 24Hz video)?
> >
> > You're right. I guess a better way to state the point is that it
> > *doesn't* really work today with fixed refresh, but if we're going to
> > introduce a new API, then why not do so in a way that can fix these
> > additional problems as well?
> 
> Exactly. With a fixed frame duration, we'll still have fundamentally the
> same issues as we currently do without variable refresh, not making use
> of the full potential of variable refresh.

I see.  Well then, that makes this sort of orthogonal to the discussion.
If you say that there are no media players on Linux today that can maintain 
audio/video sync with a 60Hz display, then that problem is much larger than the 
one we're trying to solve here.  
By the way, I don't believe that is a true statement :)

> > Say you have a multi-GPU system, and each GPU has multiple displays
> > attached, and a single application is driving them all. The application
> > queues flips for all displays with the same target_present_time_ns
> > attribute. Starting at some time T, the application simply asks for the
> > same present time T + i * 1667 (or whatever) for frame i from all
> > displays.
[snip]
> > Why would that not work to sync up all displays almost perfectly?
> 
> Seconded.

It doesn't work that way unfortunately.  In theory it sounds great, but if you
ask anyone who's worked with framelock/genlock, it is a complicated problem.
The easiest explanation is that you need to be able to atomically program
registers across multiple GPUs at the same time, this is not possible without 
hardware assist (see AMD's S400 module for example).
We have enough to discuss without this, so let's leave mGPU for another day 
since we can't solve it here anyways.
 
> > Okay, that's interesting. Does this mean that the display driver still
> > programs a refresh rate to some hardware register?

Yes, driver can, in some cases, update the minimum and maximum vertical total 
each flip.
In fixed rated example, you would set them equal to achieve your desired 
refresh rate.
We don't program refresh rate, just the vertical total min/max (which 
translates to affecting refresh rate, given constant pixel clock and horizontal 
total).
Thus, our refresh rate granularity is one line-time, on the order of microsec 
accuracy.

> > How about what I wrote in an earlier mail of having attributes:
> >
> > - target_present_time_ns
> > - hint_frame_time_ns (optional)
> >
> > ... and if a video player set both, the driver could still do the
> > optimizations you've explained?
> 
> FWIW, I don't think a property would be a good mechanism for the target
> presentation time.
> 
> At least with VDPAU, video players are already explicitly specifying the
> target presentation time, so no changes should be required at that
> level. Don't know about other video APIs.
> 
> The X11 Present extension protocol is also prepared for specifying the
> target presentation time already, the support for it just needs to be
> implemented.

I'm perfectly OK with presentation time-based *API*.  I get it from a user 
mode/app perspective, and that's fine.  We need that feedback and would like 
help defining that portions of the stack.
However, I think it doesn't make as much sense as a *DDI* because it doesn't 
correspond to any hardware real or logical (i.e. no one would implement it in 
HW this way) and the industry specs aren't defined that way.
You can have libdrm or some other usermode component translate your 
presentation time into a frame duration and schedule it.  What's the advantage 
of having this in kernel besides the fact we lose the intent of the application 
and could prevent features and optimizations?  When it gets to the kernel, I think 
it is much more elegant for the flip structure to contain a simple duration 
that says "hey, show this frame on the screen for this long".  Then we don't 
need any clocks or timers just some simple math and program the hardware.

In short, 
 1) We can simplify media players' lives by helping them get really, really 
close to their content rate, so they wouldn't need any frame rate conversion.  
 They'll still need A/V syncing though, and variable refresh cannot solve 
this and thus is way out of scope of what we're proposing.  

 2) For gaming, don't even try to guess a frame duration, the driver/hardware 
will do a better job every time, just specify duration=0 and flip as fast as 
you can.

Regards,
  Aric

P.S. Thanks for the Croteam link.  Interesting, but basically nullifie

[PATCH v8 00/10] drm/i915: Implement proper fallback training for MST

2018-04-11 Thread Lyude Paul
Next version of https://patchwork.freedesktop.org/series/41576/

Only changes are removing duplicate SoBs that git send-email annoyingly
added. Sorry about that :(

Lyude Paul (10):
  drm/atomic: Print debug message on atomic check failure
  drm/i915: Move DP modeset retry work into intel_dp
  drm/dp_mst: Fix naming on drm_atomic_get_mst_topology_state()
  drm/dp_mst: Remove all evil duplicate state pointers
  drm/dp_mst: Make drm_dp_mst_topology_state subclassable
  drm/dp_mst: Add reset_state callback to topology mgr
  drm/i915: Only use one link bw config for MST topologies
  drm/i915: Make intel_dp_mst_atomic_check() idempotent
  drm/dp_mst: Add MST fallback retraining helpers
  drm/i915: Implement proper fallback training for MST

 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |  14 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c|  46 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h|   4 +-
 drivers/gpu/drm/drm_atomic.c   |   5 +-
 drivers/gpu/drm/drm_dp_mst_topology.c  | 550 -
 drivers/gpu/drm/i915/intel_ddi.c   |  10 +-
 drivers/gpu/drm/i915/intel_display.c   |  14 +-
 drivers/gpu/drm/i915/intel_dp.c| 376 ++
 drivers/gpu/drm/i915/intel_dp_link_training.c  |   6 +-
 drivers/gpu/drm/i915/intel_dp_mst.c| 186 +--
 drivers/gpu/drm/i915/intel_drv.h   |  20 +-
 drivers/gpu/drm/nouveau/nv50_display.c |  17 +-
 drivers/gpu/drm/radeon/radeon_dp_mst.c |  13 +-
 include/drm/drm_dp_mst_helper.h|  43 +-
 14 files changed, 1120 insertions(+), 184 deletions(-)

-- 
2.14.3



[PATCH v8 05/10] drm/dp_mst: Make drm_dp_mst_topology_state subclassable

2018-04-11 Thread Lyude Paul
This is useful for drivers (which will probably be all of them soon)
which need to track state that is exclusive to the topology, and not a
specific connector on said topology. This includes things such as the
link rate and lane count that are shared by all of the connectors on the
topology.

Signed-off-by: Lyude Paul 
Cc: Manasi Navare 
Cc: Ville Syrjälä 

V7:
 - Fix CHECKPATCH errors
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 14 +++-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c| 46 ---
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h|  4 +-
 drivers/gpu/drm/drm_dp_mst_topology.c  | 95 +-
 drivers/gpu/drm/i915/intel_dp_mst.c| 13 ++-
 drivers/gpu/drm/nouveau/nv50_display.c | 17 +++-
 drivers/gpu/drm/radeon/radeon_dp_mst.c | 13 ++-
 include/drm/drm_dp_mst_helper.h|  8 ++
 8 files changed, 173 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e42a28e3adc5..2c3660c36732 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3626,9 +3626,17 @@ static int amdgpu_dm_connector_init(struct 
amdgpu_display_manager *dm,
 
drm_connector_register(&aconnector->base);
 
-   if (connector_type == DRM_MODE_CONNECTOR_DisplayPort
-   || connector_type == DRM_MODE_CONNECTOR_eDP)
-   amdgpu_dm_initialize_dp_connector(dm, aconnector);
+   if (connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
+   connector_type == DRM_MODE_CONNECTOR_eDP) {
+   res = amdgpu_dm_initialize_dp_connector(dm, aconnector);
+   if (res) {
+   drm_connector_unregister(&aconnector->base);
+   drm_connector_cleanup(&aconnector->base);
+   aconnector->connector_id = -1;
+
+   goto out_free;
+   }
+   }
 
 #if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) ||\
defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 8291d74f26bc..dcaa92d12cbc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -475,22 +475,48 @@ static const struct drm_dp_mst_topology_cbs dm_mst_cbs = {
.register_connector = dm_dp_mst_register_connector
 };
 
-void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
-  struct amdgpu_dm_connector *aconnector)
+static const struct drm_private_state_funcs dm_mst_state_funcs = {
+   .atomic_duplicate_state = drm_atomic_dp_mst_duplicate_topology_state,
+   .atomic_destroy_state = drm_atomic_dp_mst_destroy_topology_state,
+};
+
+int amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
+ struct amdgpu_dm_connector *aconnector)
 {
+   struct drm_dp_mst_topology_state *state =
+   kzalloc(sizeof(*state), GFP_KERNEL);
+   int ret = 0;
+
+   if (!state)
+   return -ENOMEM;
+
aconnector->dm_dp_aux.aux.name = "dmdc";
aconnector->dm_dp_aux.aux.dev = dm->adev->dev;
aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
aconnector->dm_dp_aux.ddc_service = aconnector->dc_link->ddc;
 
-   drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
+   ret = drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
+   if (ret)
+   goto err_aux;
+
aconnector->mst_mgr.cbs = &dm_mst_cbs;
-   drm_dp_mst_topology_mgr_init(
-   &aconnector->mst_mgr,
-   dm->adev->ddev,
-   &aconnector->dm_dp_aux.aux,
-   16,
-   4,
-   aconnector->connector_id);
+   aconnector->mst_mgr.funcs = &dm_mst_state_funcs;
+   ret = drm_dp_mst_topology_mgr_init(&aconnector->mst_mgr,
+  state,
+  dm->adev->ddev,
+  &aconnector->dm_dp_aux.aux,
+  16,
+  4,
+  aconnector->connector_id);
+   if (ret)
+   goto err_mst;
+
+   return 0;
+
+err_mst:
+   drm_dp_aux_unregister(&aconnector->dm_dp_aux.aux);
+err_aux:
+   kfree(state);
+   return ret;
 }
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
index 8cf51da26657..d28fb456d2d5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
@@ -29,8 +29,8 @@
 

Re: [PATCH] drm/amdgpu/powerplay: fix smu7_get_memory_type for fiji

2018-04-11 Thread Zhu, Rex
Reviewed-by: Rex Zhu 



Best Regards

Rex



From: amd-gfx  on behalf of Alex Deucher 

Sent: Thursday, April 12, 2018 6:59 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: [PATCH] drm/amdgpu/powerplay: fix smu7_get_memory_type for fiji

Fiji uses a different register than other smu7 asics, but
we already have this info in the base driver, so just
use that.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 21c021ba0f49..97b7c2333f19 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -4150,13 +4150,9 @@ static int smu7_read_clock_registers(struct pp_hwmgr 
*hwmgr)
 static int smu7_get_memory_type(struct pp_hwmgr *hwmgr)
 {
 struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
-   uint32_t temp;
-
-   temp = cgs_read_register(hwmgr->device, mmMC_SEQ_MISC0);
+   struct amdgpu_device *adev = hwmgr->adev;

-   data->is_memory_gddr5 = (MC_SEQ_MISC0_GDDR5_VALUE ==
-   ((temp & MC_SEQ_MISC0_GDDR5_MASK) >>
-MC_SEQ_MISC0_GDDR5_SHIFT));
+   data->is_memory_gddr5 = (adev->gmc.vram_type == AMDGPU_VRAM_TYPE_GDDR5);

 return 0;
 }
--
2.13.6



Re: [PATCH] drm/amdgpu/powerplay: rename smu7_upload_mc_firmware

2018-04-11 Thread Zhu, Rex
Reviewed-by: Rex Zhu 



Best Regards

Rex



From: amd-gfx  on behalf of Alex Deucher 

Sent: Thursday, April 12, 2018 7:11 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: [PATCH] drm/amdgpu/powerplay: rename smu7_upload_mc_firmware

It doesn't actually upload any firmware; it just
checks the version.  The actual upload happens in
the gmc modules.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 97b7c2333f19..ed43dd39b5d6 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -4071,7 +4071,7 @@ static int smu7_check_states_equal(struct pp_hwmgr *hwmgr,
 return 0;
 }

-static int smu7_upload_mc_firmware(struct pp_hwmgr *hwmgr)
+static int smu7_check_mc_firmware(struct pp_hwmgr *hwmgr)
 {
 struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);

@@ -4200,7 +4200,7 @@ static int smu7_setup_asic_task(struct pp_hwmgr *hwmgr)
 {
 int tmp_result, result = 0;

-   smu7_upload_mc_firmware(hwmgr);
+   smu7_check_mc_firmware(hwmgr);

 tmp_result = smu7_read_clock_registers(hwmgr);
 PP_ASSERT_WITH_CODE((0 == tmp_result),
--
2.13.6



[PATCH v4 1/2] drm/amdgpu/gmc: steal the appropriate amount of vram for fw hand-over

2018-04-11 Thread Andrey Grodzovsky
From: Alex Deucher 

Steal 9 MB for vga emulation and fb if vga is enabled, otherwise,
steal enough to cover the current display size as set by the vbios.

If no memory is used (e.g., secondary or headless card), skip
stolen memory reserve.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 17 +++--
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   | 17 -
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   | 17 -
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   | 17 -
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 42 -
 5 files changed, 99 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index ab73300e..0555821 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1441,12 +1441,14 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
return r;
}
 
-   r = amdgpu_bo_create_kernel(adev, adev->gmc.stolen_size, PAGE_SIZE,
-   AMDGPU_GEM_DOMAIN_VRAM,
-   &adev->stolen_vga_memory,
-   NULL, NULL);
-   if (r)
-   return r;
+   if (adev->gmc.stolen_size) {
+   r = amdgpu_bo_create_kernel(adev, adev->gmc.stolen_size, 
PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_VRAM,
+   &adev->stolen_vga_memory,
+   NULL, NULL);
+   if (r)
+   return r;
+   }
DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
 (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
 
@@ -1521,7 +1523,8 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev)
return;
 
amdgpu_ttm_debugfs_fini(adev);
-   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
+   if (adev->gmc.stolen_size)
+   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
amdgpu_ttm_fw_reserve_vram_fini(adev);
if (adev->mman.aper_base_kaddr)
iounmap(adev->mman.aper_base_kaddr);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 5617cf6..63f0b65 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -825,6 +825,21 @@ static int gmc_v6_0_late_init(void *handle)
return 0;
 }
 
+static unsigned gmc_v6_0_get_vbios_fb_size(struct amdgpu_device *adev)
+{
+   u32 d1vga_control = RREG32(mmD1VGA_CONTROL);
+
+   if (REG_GET_FIELD(d1vga_control, D1VGA_CONTROL, D1VGA_MODE_ENABLE)) {
+   return 9 * 1024 * 1024; /* reserve 8MB for vga emulator and 1 
MB for FB */
+   } else {
+   u32 viewport = RREG32(mmVIEWPORT_SIZE);
+   unsigned size = (REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
VIEWPORT_HEIGHT) *
+REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
VIEWPORT_WIDTH) *
+4);
+   return size;
+   }
+}
+
 static int gmc_v6_0_sw_init(void *handle)
 {
int r;
@@ -851,7 +866,7 @@ static int gmc_v6_0_sw_init(void *handle)
 
adev->gmc.mc_mask = 0xffULL;
 
-   adev->gmc.stolen_size = 256 * 1024;
+   adev->gmc.stolen_size = gmc_v6_0_get_vbios_fb_size(adev);
 
adev->need_dma32 = false;
dma_bits = adev->need_dma32 ? 32 : 40;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 80054f3..2deb5c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -964,6 +964,21 @@ static int gmc_v7_0_late_init(void *handle)
return 0;
 }
 
+static unsigned gmc_v7_0_get_vbios_fb_size(struct amdgpu_device *adev)
+{
+   u32 d1vga_control = RREG32(mmD1VGA_CONTROL);
+
+   if (REG_GET_FIELD(d1vga_control, D1VGA_CONTROL, D1VGA_MODE_ENABLE)) {
+   return 9 * 1024 * 1024; /* reserve 8MB for vga emulator and 1 
MB for FB */
+   } else {
+   u32 viewport = RREG32(mmVIEWPORT_SIZE);
+   unsigned size = (REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
VIEWPORT_HEIGHT) *
+REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
VIEWPORT_WIDTH) *
+4);
+   return size;
+   }
+}
+
 static int gmc_v7_0_sw_init(void *handle)
 {
int r;
@@ -998,7 +1013,7 @@ static int gmc_v7_0_sw_init(void *handle)
 */
adev->gmc.mc_mask = 0xffULL; /* 40 bit MC */
 
-   adev->gmc.stolen_size = 256 * 1024;
+   adev->gmc.stolen_size = gmc_v7_0_get_vbios_fb_size(adev);
 
/* set DMA mask + need_dma32 flags.
 * PCIE - can handle 40-bits.
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index d71d4cb..04b00df 1006

[PATCH v4 2/2] drm/amdgpu: Free VGA stolen memory as soon as possible.

2018-04-11 Thread Andrey Grodzovsky
Reserved VRAM is used to avoid overriding pre OS FB.
Once our display stack takes over we don't need the reserved
VRAM anymore.

v2:
Remove comment, we know actually why we need to reserve the stolen VRAM.
Fix return type for amdgpu_ttm_late_init.
v3:
Return 0 in amdgpu_bo_late_init, rebase on changes to previous patch
v4:
Don't release stolen memory for GMC9 ASICs until GART corruption
on S3 resume is resolved.

Signed-off-by: Andrey Grodzovsky 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  8 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|  1 +
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  |  2 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  2 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  2 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 14 ++
 8 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 9e23d6f..a160ef0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -852,6 +852,13 @@ int amdgpu_bo_init(struct amdgpu_device *adev)
return amdgpu_ttm_init(adev);
 }
 
+int amdgpu_bo_late_init(struct amdgpu_device *adev)
+{
+   amdgpu_ttm_late_init(adev);
+
+   return 0;
+}
+
 void amdgpu_bo_fini(struct amdgpu_device *adev)
 {
amdgpu_ttm_fini(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 3bee133..1e9fe85 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -251,6 +251,7 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 
domain,
 int amdgpu_bo_unpin(struct amdgpu_bo *bo);
 int amdgpu_bo_evict_vram(struct amdgpu_device *adev);
 int amdgpu_bo_init(struct amdgpu_device *adev);
+int amdgpu_bo_late_init(struct amdgpu_device *adev);
 void amdgpu_bo_fini(struct amdgpu_device *adev);
 int amdgpu_bo_fbdev_mmap(struct amdgpu_bo *bo,
struct vm_area_struct *vma);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 0555821..7a608cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1517,14 +1517,18 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
return 0;
 }
 
+void amdgpu_ttm_late_init(struct amdgpu_device *adev)
+{
+   if (adev->gmc.stolen_size)
+   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
+}
+
 void amdgpu_ttm_fini(struct amdgpu_device *adev)
 {
if (!adev->mman.initialized)
return;
 
amdgpu_ttm_debugfs_fini(adev);
-   if (adev->gmc.stolen_size)
-   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
amdgpu_ttm_fw_reserve_vram_fini(adev);
if (adev->mman.aper_base_kaddr)
iounmap(adev->mman.aper_base_kaddr);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 6ea7de8..e969c87 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -77,6 +77,7 @@ uint64_t amdgpu_vram_mgr_usage(struct ttm_mem_type_manager 
*man);
 uint64_t amdgpu_vram_mgr_vis_usage(struct ttm_mem_type_manager *man);
 
 int amdgpu_ttm_init(struct amdgpu_device *adev);
+void amdgpu_ttm_late_init(struct amdgpu_device *adev);
 void amdgpu_ttm_fini(struct amdgpu_device *adev);
 void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev,
bool enable);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 63f0b65..4a8f9bd 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -819,6 +819,8 @@ static int gmc_v6_0_late_init(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   amdgpu_bo_late_init(adev);
+
if (amdgpu_vm_fault_stop != AMDGPU_VM_FAULT_STOP_ALWAYS)
return amdgpu_irq_get(adev, &adev->gmc.vm_fault, 0);
else
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 2deb5c9..189fdf9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -958,6 +958,8 @@ static int gmc_v7_0_late_init(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   amdgpu_bo_late_init(adev);
+
if (amdgpu_vm_fault_stop != AMDGPU_VM_FAULT_STOP_ALWAYS)
return amdgpu_irq_get(adev, &adev->gmc.vm_fault, 0);
else
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 04b00df..19e153f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/am

Re: [PATCH v4 1/2] drm/amdgpu/gmc: steal the appropriate amount of vram for fw hand-over

2018-04-11 Thread Alex Deucher
On Thu, Apr 12, 2018 at 12:08 AM, Andrey Grodzovsky
 wrote:
> From: Alex Deucher 
>
> Steal 9 MB for vga emulation and fb if vga is enabled, otherwise,
> steal enough to cover the current display size as set by the vbios.
>
> If no memory is used (e.g., secondary or headless card), skip
> stolen memory reserve.
>
> Signed-off-by: Alex Deucher 

Looks like you used v1 rather than v2 of this patch:
https://patchwork.freedesktop.org/patch/215566/

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 17 +++--
>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   | 17 -
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   | 17 -
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   | 17 -
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 42 
> -
>  5 files changed, 99 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index ab73300e..0555821 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1441,12 +1441,14 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> return r;
> }
>
> -   r = amdgpu_bo_create_kernel(adev, adev->gmc.stolen_size, PAGE_SIZE,
> -   AMDGPU_GEM_DOMAIN_VRAM,
> -   &adev->stolen_vga_memory,
> -   NULL, NULL);
> -   if (r)
> -   return r;
> +   if (adev->gmc.stolen_size) {
> +   r = amdgpu_bo_create_kernel(adev, adev->gmc.stolen_size, 
> PAGE_SIZE,
> +   AMDGPU_GEM_DOMAIN_VRAM,
> +   &adev->stolen_vga_memory,
> +   NULL, NULL);
> +   if (r)
> +   return r;
> +   }
> DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
>  (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
>
> @@ -1521,7 +1523,8 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev)
> return;
>
> amdgpu_ttm_debugfs_fini(adev);
> -   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
> +   if (adev->gmc.stolen_size)
> +   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
> amdgpu_ttm_fw_reserve_vram_fini(adev);
> if (adev->mman.aper_base_kaddr)
> iounmap(adev->mman.aper_base_kaddr);
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index 5617cf6..63f0b65 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -825,6 +825,21 @@ static int gmc_v6_0_late_init(void *handle)
> return 0;
>  }
>
> +static unsigned gmc_v6_0_get_vbios_fb_size(struct amdgpu_device *adev)
> +{
> +   u32 d1vga_control = RREG32(mmD1VGA_CONTROL);
> +
> +   if (REG_GET_FIELD(d1vga_control, D1VGA_CONTROL, D1VGA_MODE_ENABLE)) {
> +   return 9 * 1024 * 1024; /* reserve 8MB for vga emulator and 1 
> MB for FB */
> +   } else {
> +   u32 viewport = RREG32(mmVIEWPORT_SIZE);
> +   unsigned size = (REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
> VIEWPORT_HEIGHT) *
> +REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
> VIEWPORT_WIDTH) *
> +4);
> +   return size;
> +   }
> +}
> +
>  static int gmc_v6_0_sw_init(void *handle)
>  {
> int r;
> @@ -851,7 +866,7 @@ static int gmc_v6_0_sw_init(void *handle)
>
> adev->gmc.mc_mask = 0xffULL;
>
> -   adev->gmc.stolen_size = 256 * 1024;
> +   adev->gmc.stolen_size = gmc_v6_0_get_vbios_fb_size(adev);
>
> adev->need_dma32 = false;
> dma_bits = adev->need_dma32 ? 32 : 40;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> index 80054f3..2deb5c9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -964,6 +964,21 @@ static int gmc_v7_0_late_init(void *handle)
> return 0;
>  }
>
> +static unsigned gmc_v7_0_get_vbios_fb_size(struct amdgpu_device *adev)
> +{
> +   u32 d1vga_control = RREG32(mmD1VGA_CONTROL);
> +
> +   if (REG_GET_FIELD(d1vga_control, D1VGA_CONTROL, D1VGA_MODE_ENABLE)) {
> +   return 9 * 1024 * 1024; /* reserve 8MB for vga emulator and 1 
> MB for FB */
> +   } else {
> +   u32 viewport = RREG32(mmVIEWPORT_SIZE);
> +   unsigned size = (REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
> VIEWPORT_HEIGHT) *
> +REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
> VIEWPORT_WIDTH) *
> +4);
> +   return size;
> +   }
> +}
> +
>  static int gmc_v7_0_sw_init(void *handle)
>  {
> int r;
> @@ -998,7 +1013,7 @@ static int gmc_v7_0_sw_init(vo

Re: [PATCH v4 1/2] drm/amdgpu/gmc: steal the appropriate amount of vram for fw hand-over

2018-04-11 Thread Andrey Grodzovsky



On 04/12/2018 12:16 AM, Alex Deucher wrote:

On Thu, Apr 12, 2018 at 12:08 AM, Andrey Grodzovsky
 wrote:

From: Alex Deucher 

Steal 9 MB for vga emulation and fb if vga is enabled, otherwise,
steal enough to cover the current display size as set by the vbios.

If no memory is used (e.g., secondary or headless card), skip
stolen memory reserve.

Signed-off-by: Alex Deucher 

Looks like you used v1 rather than v2 of this patch:
https://patchwork.freedesktop.org/patch/215566/

Alex


Too many patches circling around, NP, will respin tomorrow.

Thanks,
Andrey




---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 17 +++--
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   | 17 -
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   | 17 -
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   | 17 -
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 42 -
  5 files changed, 99 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index ab73300e..0555821 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1441,12 +1441,14 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 return r;
 }

-   r = amdgpu_bo_create_kernel(adev, adev->gmc.stolen_size, PAGE_SIZE,
-   AMDGPU_GEM_DOMAIN_VRAM,
-   &adev->stolen_vga_memory,
-   NULL, NULL);
-   if (r)
-   return r;
+   if (adev->gmc.stolen_size) {
+   r = amdgpu_bo_create_kernel(adev, adev->gmc.stolen_size, 
PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_VRAM,
+   &adev->stolen_vga_memory,
+   NULL, NULL);
+   if (r)
+   return r;
+   }
 DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
  (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));

@@ -1521,7 +1523,8 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev)
 return;

 amdgpu_ttm_debugfs_fini(adev);
-   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
+   if (adev->gmc.stolen_size)
+   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
 amdgpu_ttm_fw_reserve_vram_fini(adev);
 if (adev->mman.aper_base_kaddr)
 iounmap(adev->mman.aper_base_kaddr);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 5617cf6..63f0b65 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -825,6 +825,21 @@ static int gmc_v6_0_late_init(void *handle)
 return 0;
  }

+static unsigned gmc_v6_0_get_vbios_fb_size(struct amdgpu_device *adev)
+{
+   u32 d1vga_control = RREG32(mmD1VGA_CONTROL);
+
+   if (REG_GET_FIELD(d1vga_control, D1VGA_CONTROL, D1VGA_MODE_ENABLE)) {
+   return 9 * 1024 * 1024; /* reserve 8MB for vga emulator and 1 
MB for FB */
+   } else {
+   u32 viewport = RREG32(mmVIEWPORT_SIZE);
+   unsigned size = (REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
VIEWPORT_HEIGHT) *
+REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
VIEWPORT_WIDTH) *
+4);
+   return size;
+   }
+}
+
  static int gmc_v6_0_sw_init(void *handle)
  {
 int r;
@@ -851,7 +866,7 @@ static int gmc_v6_0_sw_init(void *handle)

 adev->gmc.mc_mask = 0xffULL;

-   adev->gmc.stolen_size = 256 * 1024;
+   adev->gmc.stolen_size = gmc_v6_0_get_vbios_fb_size(adev);

 adev->need_dma32 = false;
 dma_bits = adev->need_dma32 ? 32 : 40;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 80054f3..2deb5c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -964,6 +964,21 @@ static int gmc_v7_0_late_init(void *handle)
 return 0;
  }

+static unsigned gmc_v7_0_get_vbios_fb_size(struct amdgpu_device *adev)
+{
+   u32 d1vga_control = RREG32(mmD1VGA_CONTROL);
+
+   if (REG_GET_FIELD(d1vga_control, D1VGA_CONTROL, D1VGA_MODE_ENABLE)) {
+   return 9 * 1024 * 1024; /* reserve 8MB for vga emulator and 1 
MB for FB */
+   } else {
+   u32 viewport = RREG32(mmVIEWPORT_SIZE);
+   unsigned size = (REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
VIEWPORT_HEIGHT) *
+REG_GET_FIELD(viewport, VIEWPORT_SIZE, 
VIEWPORT_WIDTH) *
+4);
+   return size;
+   }
+}
+
  static int gmc_v7_0_sw_init(void *handle)
  {
 int r;
@@ -998,7 +1013,7 @@ static int gmc_v7_0_sw_init(void *handle)
  */
 adev->gmc.mc_mask = 0xffULL; /* 40 bit M

Re: [PATCH v4 2/2] drm/amdgpu: Free VGA stolen memory as soon as possible.

2018-04-11 Thread Alex Deucher
On Thu, Apr 12, 2018 at 12:08 AM, Andrey Grodzovsky
 wrote:
> Reserved VRAM is used to avoid overriding pre OS FB.
> Once our display stack takes over we don't need the reserved
> VRAM anymore.
>
> v2:
> Remove comment, we know actually why we need to reserve the stolen VRAM.
> Fix return type for amdgpu_ttm_late_init.
> v3:
> Return 0 in amdgpu_bo_late_init, rebase on changes to previous patch
> v4:
> Don't release stolen memory for GMC9 ASICs until GART corruption
> on S3 resume is resolved.
>
> Signed-off-by: Andrey Grodzovsky 
> Signed-off-by: Alex Deucher 

Looks like you used the original version of this patch as well.
Updated version here:
https://patchwork.freedesktop.org/patch/215567/
more comments below.

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  8 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  |  2 ++
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  2 ++
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  2 ++
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 14 ++
>  8 files changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 9e23d6f..a160ef0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -852,6 +852,13 @@ int amdgpu_bo_init(struct amdgpu_device *adev)
> return amdgpu_ttm_init(adev);
>  }
>
> +int amdgpu_bo_late_init(struct amdgpu_device *adev)
> +{
> +   amdgpu_ttm_late_init(adev);
> +
> +   return 0;
> +}
> +
>  void amdgpu_bo_fini(struct amdgpu_device *adev)
>  {
> amdgpu_ttm_fini(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 3bee133..1e9fe85 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -251,6 +251,7 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 
> domain,
>  int amdgpu_bo_unpin(struct amdgpu_bo *bo);
>  int amdgpu_bo_evict_vram(struct amdgpu_device *adev);
>  int amdgpu_bo_init(struct amdgpu_device *adev);
> +int amdgpu_bo_late_init(struct amdgpu_device *adev);
>  void amdgpu_bo_fini(struct amdgpu_device *adev);
>  int amdgpu_bo_fbdev_mmap(struct amdgpu_bo *bo,
> struct vm_area_struct *vma);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 0555821..7a608cf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1517,14 +1517,18 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> return 0;
>  }
>
> +void amdgpu_ttm_late_init(struct amdgpu_device *adev)
> +{
> +   if (adev->gmc.stolen_size)

no need to check for NULL here.  amdgpu_bo_free_kernel() will do the
right thing even if stolen_vga_memory is NULL.

> +   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
> +}
> +
>  void amdgpu_ttm_fini(struct amdgpu_device *adev)
>  {
> if (!adev->mman.initialized)
> return;
>
> amdgpu_ttm_debugfs_fini(adev);
> -   if (adev->gmc.stolen_size)
> -   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
> amdgpu_ttm_fw_reserve_vram_fini(adev);
> if (adev->mman.aper_base_kaddr)
> iounmap(adev->mman.aper_base_kaddr);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
> index 6ea7de8..e969c87 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
> @@ -77,6 +77,7 @@ uint64_t amdgpu_vram_mgr_usage(struct ttm_mem_type_manager 
> *man);
>  uint64_t amdgpu_vram_mgr_vis_usage(struct ttm_mem_type_manager *man);
>
>  int amdgpu_ttm_init(struct amdgpu_device *adev);
> +void amdgpu_ttm_late_init(struct amdgpu_device *adev);
>  void amdgpu_ttm_fini(struct amdgpu_device *adev);
>  void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev,
> bool enable);
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index 63f0b65..4a8f9bd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -819,6 +819,8 @@ static int gmc_v6_0_late_init(void *handle)
>  {
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> +   amdgpu_bo_late_init(adev);
> +
> if (amdgpu_vm_fault_stop != AMDGPU_VM_FAULT_STOP_ALWAYS)
> return amdgpu_irq_get(adev, &adev->gmc.vm_fault, 0);
> else
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> index 2deb5c9..189fdf9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> ++

[AMDGPU][DC] Bonaire - Unable to enable DC on Mobile CIK - Invalid Connector ObjectID

2018-04-11 Thread Shawn Starr

When trying an early 4.17-rc0 RPM from Fedora,

I get:

[2.794243] [drm:construct [amdgpu]] *ERROR* construct: Invalid 
Connector ObjectID from Adapter Service for connector index:1! type 0 
expected 3


We've had issues with connector types on this laptop before and found 
some workarounds; it looks like those got lost with the DC code.


my boot options: radeon.cik_support=0 amdgpu.cik_support=1 
drm_kms_helper.poll=0 amdgpu.dc=1


What information would you like me to provide?

Thanks,
Shawn.


Re: [AMDGPU][DC] Bonaire - Unable to enable DC on Mobile CIK - Invalid Connector ObjectID

2018-04-11 Thread Alex Deucher
On Thu, Apr 12, 2018 at 12:48 AM, Shawn Starr  wrote:
> When trying an early 4.17-rc0 RPM from Fedora,
>
> I get:
>
> [2.794243] [drm:construct [amdgpu]] *ERROR* construct: Invalid Connector
> ObjectID from Adapter Service for connector index:1! type 0 expected 3
>
> We've had issues with connector types on this laptop before and found some
> workarounds, looks like those got lost with the DC code.
>
> my boot options: radeon.cik_support=0 amdgpu.cik_support=1
> drm_kms_helper.poll=0 amdgpu.dc=1
>
> What information would you like me to provide?

Other than the error message, is there an actual problem?  Please file
a bug and include your dmesg output (with amdgpu.dc_log=1) and xorg
log (if using X), and describe what display connectors are expected.
Also, if this is a hybrid laptop with multiple GPUs, what mode is it in,
and is there a difference if you switch to a different mode (e.g., hybrid
vs dGPU only)?

Alex


Re: [AMDGPU][DC] Bonaire - Unable to enable DC on Mobile CIK - Invalid Connector ObjectID

2018-04-11 Thread Shawn Starr

On 04/12/2018 12:54 AM, Alex Deucher wrote:

On Thu, Apr 12, 2018 at 12:48 AM, Shawn Starr  wrote:

When trying an early 4.17-rc0 RPM from Fedora,

I get:

[2.794243] [drm:construct [amdgpu]] *ERROR* construct: Invalid Connector
ObjectID from Adapter Service for connector index:1! type 0 expected 3

We've had issues with connector types on this laptop before and found some
workarounds, looks like those got lost with the DC code.

my boot options: radeon.cik_support=0 amdgpu.cik_support=1
drm_kms_helper.poll=0 amdgpu.dc=1

What information would you like me to provide?


Other than the error message, is there an actual problem?  Please file
a bug and include your dmesg output (with amdgpu.dc_log=1) and xorg
log (if using X), and describe what display connectors are expected.
Also, if this is a hybrid laptop with multiple GPUs, what mode is it in,
and is there a difference if you switch to a different mode (e.g., hybrid
vs dGPU only)?

Alex


Hi Alex,

No display(s) turn on; I will provide all the debug output and logs.

Thanks,
Shawn.

