Re: [PATCH libdrm v4 1/1] amdgpu: move asic id table to a separate file

2017-05-30 Thread Michel Dänzer
On 31/05/17 07:31 AM, Li, Samuel wrote:
> From: Michel Dänzer [mailto:mic...@daenzer.net] 
>> On 30/05/17 06:16 AM, Samuel Li wrote:
>> 
>>> diff --git a/amdgpu/amdgpu_asic_id.c b/amdgpu/amdgpu_asic_id.c new 
>>> file mode 100644 index 000..a43ca33
>>> --- /dev/null
>>> +++ b/amdgpu/amdgpu_asic_id.c
>> 
>> [...]
>> 
>>> +#include "xf86drm.h"
>>> +#include "amdgpu_drm.h"
>> 
>> Should be
>> 
>> #include 
>> #include 
>> 
>> since these header files are not located in the same directory as
>> amdgpu_asic_id.c.
> 
>  [Sam] Actually, "" is used to include programmer-defined header files,
> and <>  is used for files pre-designated by the compiler/IDE.

The only difference between the two is that #include "" first looks for
the header file in the same directory where the file containing the
#include directive (not necessarily the same as the original *.c file
passed to the compiler/preprocessor) is located, after that it looks in
the same paths in the same order as <>. So "" only really makes sense
when the header file is in the same directory as the file including it.
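For illustration (a sketch; amdgpu_internal.h is just an example of a header
that does live next to amdgpu_asic_id.c):

    /* amdgpu/amdgpu_asic_id.c */
    #include "amdgpu_internal.h" /* "" looks in amdgpu/ first, next to this file */
    #include <xf86drm.h>         /* <> goes straight to the -I / system include paths */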


>>> @@ -267,6 +274,11 @@ int amdgpu_device_initialize(int fd,
>>> amdgpu_vamgr_init(&dev->vamgr_32, start, max,
>>>   dev->dev_info.virtual_address_alignment);
>>>  
>>> +   r = amdgpu_parse_asic_ids(&dev->asic_ids);
>>> +   if (r)
>>> +   fprintf(stderr, "%s: Can not parse asic ids, 0x%x.",
>>> +   __func__, r);
>> 
>> "Cannot parse ASIC IDs"
>> 
>> Also, there should be curly braces around a multi-line statement.
> 
> [Sam] Can be done. However, it is still a single statement. Does it matter?

It might not be strictly required, but I think it does make the code
clearer in this case.
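Concretely, the braced hunk, with the message also fixed as suggested, would
read:

	r = amdgpu_parse_asic_ids(&dev->asic_ids);
	if (r) {
		fprintf(stderr, "%s: Cannot parse ASIC IDs, 0x%x.",
			__func__, r);
	}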


>>> diff --git a/include/drm/amdgpu.ids b/include/drm/amdgpu.ids
>>> new file mode 100644
>>> index 000..6d6b944
>>> --- /dev/null
>>> +++ b/include/drm/amdgpu.ids
>> 
>> I think the path of this file in the repository should be
>> amdgpu/amdgpu.ids rather than include/drm/amdgpu.ids.
> 
> [Sam] The file is going to be shared with radeon.

We can cross that bridge when we get there. Meanwhile, it's not a header
file and not installed under $prefix/include/, so it doesn't belong in
include/.


>>> @@ -0,0 +1,170 @@
>>> +# List of AMDGPU ID's
>> 
>> This should say "IDs" instead of "ID's".
>> 
>> 
>>> +67FF,  CF, 67FF:CF
>>> +67FF,  EF, 67FF:EF
>> 
>> There should be no such dummy entries in the file. If it's useful,
>> amdgpu_get_marketing_name can return a dummy string based on the PCI ID
>> and revision when there's no matching entry in the file.
> 
> [Sam] I forwarded another thread to you.

Please make your argument explicitly, for the benefit of non-AMD readers
of the amd-gfx list.

Anyway, I don't think that invalidates what I wrote, and Alex seems to
agree. "67FF:CF" isn't a marketing name, so there should be no such
entries in this file. It's not necessary anyway; assuming it's useful
for amdgpu_get_marketing_name to return such "names", it can generate
them on the fly when there is no matching entry in the file.
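A fallback along these lines (a sketch only, not part of the patch) would
make the dummy entries unnecessary:

	/* No matching entry: synthesize a "DID:RID" pseudo-name.
	 * Static buffer: not thread-safe, illustration only. */
	static char fallback[16];

	snprintf(fallback, sizeof(fallback), "%04X:%02X",
		 dev->info.asic_id, dev->info.pci_rev_id);
	return fallback;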

Ideally the issues above should be fixed in the original file we get
from marketing (?), but meanwhile / failing that we should fix them up
(which we can easily do with Git).


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 3/3] drm/amdgpu: clean up code to make it easier to understand

2017-05-30 Thread zhoucm1



On 2017-05-31 05:47, Alex Xie wrote:

  The original code converts a pointer from amdgpu_crtc to
  drm_crtc, then later converts the pointer back.
  It is difficult to understand why it was written
  that way.

Signed-off-by: Alex Xie 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 4dd83a3..7317fc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -258,8 +258,7 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
*data, struct drm_file
for (i = 0, found = 0; i < adev->mode_info.num_crtc; i++) {
crtc = (struct drm_crtc *)minfo->crtcs[i];
if (crtc && crtc->base.id == info->mode_crtc.id) {
-   struct amdgpu_crtc *amdgpu_crtc = 
to_amdgpu_crtc(crtc);
-   ui32 = amdgpu_crtc->crtc_id;
+   ui32 = minfo->crtcs[i]->crtc_id;
I feel the previous code is clearer: it differentiates drm_crtc and
amdgpu_crtc, which makes accesses to their members easier to follow.
If we don't convert at all, we end up with the code below, and when
reading the condition line we need to check struct amdgpu_mode_info,
struct amdgpu_crtc and struct drm_crtc:
if (minfo->crtcs[i] && minfo->crtcs[i]->base.base.id ==
info->mode_crtc.id) {

ui32 = minfo->crtcs[i]->crtc_id;


Regards,
David Zhou

found = 1;
break;
}




Re: [PATCH 2/3] drm/amdgpu: Remove two ! operations in an if condition

2017-05-30 Thread Michel Dänzer
On 31/05/17 06:47 AM, Alex Xie wrote:
>  Make the code easier to understand.
> 
> Signed-off-by: Alex Xie 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 7a323f9..9fdeb82 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -680,9 +680,11 @@ bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring 
> *ring,
>  
>   if (amdgpu_vm_had_gpu_reset(adev, id))
>   return true;
> - if (!vm_flush_needed && !gds_switch_needed)
> +
> + if (vm_flush_needed || gds_switch_needed)
> + return true;
> + else
>   return false;

This can be further simplified to

return vm_flush_needed || gds_switch_needed;

With that,

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 2/3] drm/amdgpu: Remove two ! operations in an if condition

2017-05-30 Thread zhoucm1



On 2017-05-31 05:47, Alex Xie wrote:

  Make the code easier to understand.

Signed-off-by: Alex Xie 

ok to me, Reviewed-by: Chunming Zhou 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 7a323f9..9fdeb82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -680,9 +680,11 @@ bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring,
  
  	if (amdgpu_vm_had_gpu_reset(adev, id))

return true;
-   if (!vm_flush_needed && !gds_switch_needed)
+
+   if (vm_flush_needed || gds_switch_needed)
+   return true;
+   else
return false;
-   return true;
  }
  
  /**




Re: [PATCH 1/3] drm/amdgpu: Optimize a function called by every IB scheduling

2017-05-30 Thread zhoucm1



On 2017-05-31 05:47, Alex Xie wrote:

   Move several if statements and a loop statement from
   run time to initialization time.

Signed-off-by: Alex Xie 

nice, Reviewed-by: Chunming Zhou 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 33 
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  6 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c   | 28 +--
  3 files changed, 40 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 6a85db0..7d95435 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -153,6 +153,36 @@ void amdgpu_ring_undo(struct amdgpu_ring *ring)
  }
  
  /**

+ * amdgpu_ring_check_compute_vm_bug - check whether this ring has compute vm 
bug
+ *
+ * @adev: amdgpu_device pointer
+ * @ring: amdgpu_ring structure holding ring information
+ */
+static void amdgpu_ring_check_compute_vm_bug(struct amdgpu_device *adev,
+   struct amdgpu_ring *ring)
+{
+   const struct amdgpu_ip_block *ip_block;
+
+   ring->has_compute_vm_bug = false;
+
+   if (ring->funcs->type != AMDGPU_RING_TYPE_COMPUTE)
+   /* only compute rings */
+   return;
+
+   ip_block = amdgpu_get_ip_block(adev, AMD_IP_BLOCK_TYPE_GFX);
+   if (!ip_block)
+   return;
+
+   /* Compute rings have a VM bug for GFX version <= 7,
+   and for GFX 8 when the MEC firmware version is < 673. */
+   if (ip_block->version->major <= 7) {
+   ring->has_compute_vm_bug = true;
+   } else if (ip_block->version->major == 8)
+   if (adev->gfx.mec_fw_version < 673)
+   ring->has_compute_vm_bug = true;
+}
+
+/**
   * amdgpu_ring_init - init driver ring struct.
   *
   * @adev: amdgpu_device pointer
@@ -257,6 +287,9 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
if (amdgpu_debugfs_ring_init(adev, ring)) {
DRM_ERROR("Failed to register debugfs file for rings !\n");
}
+
+   amdgpu_ring_check_compute_vm_bug(adev, ring);
+
return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h

index a9223a8..334307e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -185,6 +185,7 @@ struct amdgpu_ring {
u64 cond_exe_gpu_addr;
volatile u32*cond_exe_cpu_addr;
unsignedvm_inv_eng;
+   boolhas_compute_vm_bug;
  #if defined(CONFIG_DEBUG_FS)
struct dentry *ent;
  #endif
@@ -207,4 +208,9 @@ static inline void amdgpu_ring_clear_ring(struct 
amdgpu_ring *ring)
  
  }
  
+static inline bool amdgpu_ring_has_compute_vm_bug(struct amdgpu_ring *ring)

+{
+   return ring->has_compute_vm_bug;
+}
+
  #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index b2384b8..7a323f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -656,32 +656,6 @@ static int amdgpu_vm_alloc_reserved_vmid(struct 
amdgpu_device *adev,
return r;
  }
  
-static bool amdgpu_vm_ring_has_compute_vm_bug(struct amdgpu_ring *ring)

-{
-   struct amdgpu_device *adev = ring->adev;
-   const struct amdgpu_ip_block *ip_block;
-
-   if (ring->funcs->type != AMDGPU_RING_TYPE_COMPUTE)
-   /* only compute rings */
-   return false;
-
-   ip_block = amdgpu_get_ip_block(adev, AMD_IP_BLOCK_TYPE_GFX);
-   if (!ip_block)
-   return false;
-
-   if (ip_block->version->major <= 7) {
-   /* gfx7 has no workaround */
-   return true;
-   } else if (ip_block->version->major == 8) {
-   if (adev->gfx.mec_fw_version >= 673)
-   /* gfx8 is fixed in MEC firmware 673 */
-   return false;
-   else
-   return true;
-   }
-   return false;
-}
-
  bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring,
  struct amdgpu_job *job)
  {
@@ -691,7 +665,7 @@ bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring,
struct amdgpu_vm_id *id;
bool gds_switch_needed;
bool vm_flush_needed = job->vm_needs_flush ||
-   amdgpu_vm_ring_has_compute_vm_bug(ring);
+   amdgpu_ring_has_compute_vm_bug(ring);
  
  	if (job->vm_id == 0)

return false;




RE: [PATCH libdrm v4 1/1] amdgpu: move asic id table to a separate file

2017-05-30 Thread Li, Samuel
Please see comments inline.

-Original Message-
From: Michel Dänzer [mailto:mic...@daenzer.net] 
Sent: Monday, May 29, 2017 9:26 PM
To: Li, Samuel 
Cc: amd-gfx@lists.freedesktop.org; Yuan, Xiaojie 
Subject: Re: [PATCH libdrm v4 1/1] amdgpu: move asic id table to a separate file

On 30/05/17 06:16 AM, Samuel Li wrote:
> From: Xiaojie Yuan 

I took a closer look and noticed some details (and some non-details about the 
amdgpu.ids file at the end).


> diff --git a/amdgpu/amdgpu_asic_id.c b/amdgpu/amdgpu_asic_id.c new 
> file mode 100644 index 000..a43ca33
> --- /dev/null
> +++ b/amdgpu/amdgpu_asic_id.c

[...]

> +#include "xf86drm.h"
> +#include "amdgpu_drm.h"

Should be

#include 
#include 

since these header files are not located in the same directory as
amdgpu_asic_id.c.

 [Sam] Actually, "" is used to include programmer-defined header files, and <>  
is used for files pre-designated by the compiler/IDE.

> +static int parse_one_line(const char *line, struct amdgpu_asic_id *id)
> +{
> + char *buf, *saveptr;
> + char *s_did;
> + char *s_rid;
> + char *s_name;
> + char *endptr;
> + int r = 0;

This function could be simplified slightly by initializing r = -EINVAL
here and only setting it to 0 just before the out label.

[Sam] This is likely a personal preference. I am fine with the current
implementation, which is clear.
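For reference, the suggested shape would be something like (a sketch, with
the actual parsing elided):

	static int parse_one_line(const char *line, struct amdgpu_asic_id *id)
	{
		char *buf = NULL;
		int r = -EINVAL;	/* assume failure until the line parses fully */

		/* ... tokenize/validate the did, rid and name fields;
		 * on any malformed field, simply goto out ... */

		r = 0;	/* reached only when every field parsed cleanly */
	out:
		free(buf);
		return r;
	}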

> +int amdgpu_parse_asic_ids(struct amdgpu_asic_id **p_asic_id_table)
> +{

[...]

> + /* 1st valid line is file version */
> + while ((n = getline(&line, &len, fp)) != -1) {
> + /* trim trailing newline */
> + if (line[n - 1] == '\n')
> + line[n - 1] = '\0';

Should probably increment line_num here, otherwise the line number in
the error message below might be confusing.

[Sam] That is a good catch.
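I.e. something like (sketch):

	while ((n = getline(&line, &len, fp)) != -1) {
		line_num++;	/* count every line read, so error messages match */
		/* trim trailing newline */
		if (line[n - 1] == '\n')
			line[n - 1] = '\0';
		/* ... */
	}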

> + fprintf(stderr, "Invalid format: %s: line %d: %s\n",
> + AMDGPU_ASIC_ID_TABLE, line_num, line);

The second line should be indented to align with the opening parenthesis.

 [Sam] Can be done.

> + if (table_size >= table_max_size) {
> + /* double table size */
> + table_max_size *= 2;
> + asic_id_table = realloc(asic_id_table, table_max_size *
> + sizeof(struct amdgpu_asic_id));

Ditto.

[Sam] Can be done.

> + /* end of table */
> + id = asic_id_table + table_size;
> + memset(id, 0, sizeof(struct amdgpu_asic_id));

Might also want to realloc asic_id_table according to the final table
size, to avoid wasting memory.

[Sam] Good one.
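E.g. (a sketch, reusing the id pointer for the realloc result):

	/* trim the allocation down to what was actually used,
	 * plus the zeroed end-of-table sentinel */
	id = realloc(asic_id_table,
		     (table_size + 1) * sizeof(struct amdgpu_asic_id));
	if (id)
		asic_id_table = id;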

> + if (r && asic_id_table) {
> + while (table_size--) {
> + id = asic_id_table + table_size;
> + if (id->marketing_name !=  NULL)
> + free(id->marketing_name);

free(NULL) works fine (and parse_one_line returns an error for
id->marketing_name == NULL anyway), so this can be simplified to

free(id->marketing_name);


[Sam] Can be done.

> @@ -140,6 +140,13 @@ static void 
> amdgpu_device_free_internal(amdgpu_device_handle dev)
>   close(dev->fd);
>   if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
>   close(dev->flink_fd);
> + if (dev->asic_ids) {
> + for (id = dev->asic_ids; id->did; id++) {
> + if (id->marketing_name !=  NULL)
> + free(id->marketing_name);
> + }

Ditto, this can be simplified to

[Sam] Can be done.

for (id = dev->asic_ids; id->did; id++)
free(id->marketing_name);


> @@ -267,6 +274,11 @@ int amdgpu_device_initialize(int fd,
>   amdgpu_vamgr_init(&dev->vamgr_32, start, max,
> dev->dev_info.virtual_address_alignment);
>  
> + r = amdgpu_parse_asic_ids(&dev->asic_ids);
> + if (r)
> + fprintf(stderr, "%s: Can not parse asic ids, 0x%x.",
> + __func__, r);

"Cannot parse ASIC IDs"

Also, there should be curly braces around a multi-line statement.

[Sam] Can be done. However, it is still a single statement. Does it matter?

> @@ -297,13 +309,15 @@ int amdgpu_device_deinitialize(amdgpu_device_handle dev)
>  
>  const char *amdgpu_get_marketing_name(amdgpu_device_handle dev)
>  {
> - const struct amdgpu_asic_id_table_t *t = amdgpu_asic_id_table;
> + const struct amdgpu_asic_id *id;
> +
> + if (!dev->asic_ids)
> + return NULL;
>  
> - while (t->did) {
> - if ((t->did == dev->info.asic_id) &&
> - (t->rid == dev->info.pci_rev_id))
> - return t->marketing_name;
> - t++;
> + for (id = dev->asic_ids; id->did; id++) {
> + if ((id->did == dev->info.asic_id) &&
> + (id->rid == dev->info.pci_rev_id))

The last line is 

[PATCH 3/3] drm/amdgpu: clean up code to make it easier to understand

2017-05-30 Thread Alex Xie
 The original code converts a pointer from amdgpu_crtc to
 drm_crtc, then later converts the pointer back.
 It is difficult to understand why it was written
 that way.

Signed-off-by: Alex Xie 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 4dd83a3..7317fc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -258,8 +258,7 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
*data, struct drm_file
for (i = 0, found = 0; i < adev->mode_info.num_crtc; i++) {
crtc = (struct drm_crtc *)minfo->crtcs[i];
if (crtc && crtc->base.id == info->mode_crtc.id) {
-   struct amdgpu_crtc *amdgpu_crtc = 
to_amdgpu_crtc(crtc);
-   ui32 = amdgpu_crtc->crtc_id;
+   ui32 = minfo->crtcs[i]->crtc_id;
found = 1;
break;
}
-- 
2.7.4



[PATCH 1/3] drm/amdgpu: Optimize a function called by every IB scheduling

2017-05-30 Thread Alex Xie
  Move several if statements and a loop statement from
  run time to initialization time.

Signed-off-by: Alex Xie 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 33 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c   | 28 +--
 3 files changed, 40 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 6a85db0..7d95435 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -153,6 +153,36 @@ void amdgpu_ring_undo(struct amdgpu_ring *ring)
 }
 
 /**
+ * amdgpu_ring_check_compute_vm_bug - check whether this ring has compute vm 
bug
+ *
+ * @adev: amdgpu_device pointer
+ * @ring: amdgpu_ring structure holding ring information
+ */
+static void amdgpu_ring_check_compute_vm_bug(struct amdgpu_device *adev,
+   struct amdgpu_ring *ring)
+{
+   const struct amdgpu_ip_block *ip_block;
+
+   ring->has_compute_vm_bug = false;
+
+   if (ring->funcs->type != AMDGPU_RING_TYPE_COMPUTE)
+   /* only compute rings */
+   return;
+
+   ip_block = amdgpu_get_ip_block(adev, AMD_IP_BLOCK_TYPE_GFX);
+   if (!ip_block)
+   return;
+
+   /* Compute rings have a VM bug for GFX version <= 7,
+   and for GFX 8 when the MEC firmware version is < 673. */
+   if (ip_block->version->major <= 7) {
+   ring->has_compute_vm_bug = true;
+   } else if (ip_block->version->major == 8)
+   if (adev->gfx.mec_fw_version < 673)
+   ring->has_compute_vm_bug = true;
+}
+
+/**
  * amdgpu_ring_init - init driver ring struct.
  *
  * @adev: amdgpu_device pointer
@@ -257,6 +287,9 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
if (amdgpu_debugfs_ring_init(adev, ring)) {
DRM_ERROR("Failed to register debugfs file for rings !\n");
}
+
+   amdgpu_ring_check_compute_vm_bug(adev, ring);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index a9223a8..334307e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -185,6 +185,7 @@ struct amdgpu_ring {
u64 cond_exe_gpu_addr;
volatile u32*cond_exe_cpu_addr;
unsignedvm_inv_eng;
+   boolhas_compute_vm_bug;
 #if defined(CONFIG_DEBUG_FS)
struct dentry *ent;
 #endif
@@ -207,4 +208,9 @@ static inline void amdgpu_ring_clear_ring(struct 
amdgpu_ring *ring)
 
 }
 
+static inline bool amdgpu_ring_has_compute_vm_bug(struct amdgpu_ring *ring)
+{
+   return ring->has_compute_vm_bug;
+}
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index b2384b8..7a323f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -656,32 +656,6 @@ static int amdgpu_vm_alloc_reserved_vmid(struct 
amdgpu_device *adev,
return r;
 }
 
-static bool amdgpu_vm_ring_has_compute_vm_bug(struct amdgpu_ring *ring)
-{
-   struct amdgpu_device *adev = ring->adev;
-   const struct amdgpu_ip_block *ip_block;
-
-   if (ring->funcs->type != AMDGPU_RING_TYPE_COMPUTE)
-   /* only compute rings */
-   return false;
-
-   ip_block = amdgpu_get_ip_block(adev, AMD_IP_BLOCK_TYPE_GFX);
-   if (!ip_block)
-   return false;
-
-   if (ip_block->version->major <= 7) {
-   /* gfx7 has no workaround */
-   return true;
-   } else if (ip_block->version->major == 8) {
-   if (adev->gfx.mec_fw_version >= 673)
-   /* gfx8 is fixed in MEC firmware 673 */
-   return false;
-   else
-   return true;
-   }
-   return false;
-}
-
 bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring,
  struct amdgpu_job *job)
 {
@@ -691,7 +665,7 @@ bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring,
struct amdgpu_vm_id *id;
bool gds_switch_needed;
bool vm_flush_needed = job->vm_needs_flush ||
-   amdgpu_vm_ring_has_compute_vm_bug(ring);
+   amdgpu_ring_has_compute_vm_bug(ring);
 
if (job->vm_id == 0)
return false;
-- 
2.7.4



RE: [PATCH libdrm v3 1/1] amdgpu: move asic id table to a separate file

2017-05-30 Thread Li, Samuel
>  - Marketing can make mistakes or have IT glitches.
>    The inconsistent use of "(TM)" and using a 67C2:00 is something one
>    wants to double-check with them.
Marketing names are there for a lot of reasons. The code here is to pass the 
names only.
If you are interested in a vendor's marketing names, please reach out to the
vendor, e.g. through your contacts at the vendor who share your interest.

>- Having a separate file so that clients can update/edit it does not help 
>much. 
Tell that to pci.ids/usb.ids :)

> Adding ~200 loc for ~170 device entries sounds like a step in the wrong 
> direction.
Check the vendor's entries in pci.ids, and you might get a better idea.

Sam

-Original Message-
From: Emil Velikov [mailto:emil.l.veli...@gmail.com] 
Sent: Tuesday, May 30, 2017 7:01 AM
To: Li, Samuel 
Cc: Alex Deucher ; amd-gfx list 
; Yuan, Xiaojie 
Subject: Re: [PATCH libdrm v3 1/1] amdgpu: move asic id table to a separate file

Hi all,

Pardon for dropping in uninvited. Just some food for thought.

On 29 May 2017 at 22:01, Li, Samuel  wrote:
> Understood your point. However as discussed internally before, marketing 
> names are there for a lot of reasons; my understanding of the policy is we do 
> not need to touch them as long as there is no error in the names and they are 
> allowed to be public.
>
Samuel,

It seems that most comments people have put forward are falling on deaf ears.

While there may be valid arguments behind doing so, do consider the following:
 - Review is always encouraged
Regardless if the information is within or outside of the source code.
 - Marketing can make mistakes or have IT glitches.
   The inconsistent use of "(TM)" and using a 67C2:00 is something one
   wants to double-check with them.
 - Having a separate file so that clients can update/edit it does not help much.
You want to ship the whole driver, in order to have a predictable and 
consistent user experience.
 - Adding ~200 loc for ~170 device entries sounds like a step in the wrong
direction.

In either case, not my call. I might follow up with some issues in the code
itself ;-)

HTH
Emil


[PATCH 2/3] drm/amdgpu: Remove two ! operations in an if condition

2017-05-30 Thread Alex Xie
 Make the code easier to understand.

Signed-off-by: Alex Xie 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 7a323f9..9fdeb82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -680,9 +680,11 @@ bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring,
 
if (amdgpu_vm_had_gpu_reset(adev, id))
return true;
-   if (!vm_flush_needed && !gds_switch_needed)
+
+   if (vm_flush_needed || gds_switch_needed)
+   return true;
+   else
return false;
-   return true;
 }
 
 /**
-- 
2.7.4



Re: [RFC] Exclusive gpu access for SteamVR usecases

2017-05-30 Thread Andres Rodriguez



On 2017-05-30 11:19 AM, Christian König wrote:

Looks like a good start, but a few notes in general:

1. Split the patches into two sets.

One for implementing changing the priorities and one for limiting the 
priorities.




No problem.

2. How are the priorities from processes supposed to interact with the 
per context priority?


Do you mean process niceness?

There isn't any relationship between niceness and gpu priority.

Let me know if you meant something different here.



3. Thinking more about it we can't limit the minimum priority in the 
scheduler.


For example a low priority job might block resources the high priority 
job needs to run. E.g. VRAM memory.




We avoid deadlocks by making sure that all dependencies of an exclusive 
task are also elevated to the same priority as said task. Usermode (the 
DRM_MASTER) is responsible for maintaining this guarantee. The kernel does 
provide an ioctl that makes this task simple, 
amdgpu_sched_process_priority_set().


Lets take a look at this issue through three different scenarios.

(i) Job dependencies are all process internal, i.e. multiple contexts in 
one process.


This is the trivial case. A call to amdgpu_sched_process_priority_set() 
will change the priority of all contexts belonging to a process in lockstep.


Once amdgpu_sched_process_priority_set() returns, it is safe to raise 
the minimum priority using amdgpu_sched_min_priority_get(). At this 
point we have a guarantee that all contexts belonging to the process 
will be in a runnable state, or all the contexts will be in a 
not-runnable state. There won't be a mix of runnable and non-runnable 
contexts.


Getting into that mixed state is what could cause a deadlock: a runnable 
context depending on a non-runnable one.


Note: the current patchset needs a fix to provide this guarantee in 
multi-gpu systems.
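A sketch of that sequence from the DRM_MASTER's point of view (the wrapper
names are the ones proposed in this series; the exact signatures and the
priority enum are assumptions for illustration):

	/* Elevate every context of the soon-to-be-exclusive process first. */
	r = amdgpu_sched_process_priority_set(fd, game_pid,
					      AMDGPU_CTX_PRIORITY_HIGH);
	if (r)
		return r;

	/* Only then raise the bar, so no runnable context can depend on
	 * a context that is now below the minimum priority. */
	r = amdgpu_sched_min_priority_get(fd, AMDGPU_CTX_PRIORITY_HIGH);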


(ii) Job dependencies between two processes.

This case is mentioned separately as it is probably the most common use 
case we will encounter for this feature. Most graphics applications 
enter a producer/consumer relationship with the compositor process (window 
swapchain).


In this case the compositor should already have all the information 
required to avoid a deadlock. It knows:

  - Itself (as a process)
  - The application process
  - The dependencies between both processes

At this stage it is simple for the compositor to understand that if it 
wishes to perform an exclusive mode transition, all dependencies (which 
are known) should also be part of the exclusive group.


We should be able to implement this feature without modifying a 
game/application.


(iii) Job dependencies between multiple (3+) processes.

This scenario is very uncommon for games; an example would be a game or
application split into multiple processes, where Process A interacts with
the compositor and Process B does some physics/compute calculations and
sends the results to Process A.


To support this use case, we would require an interface for the 
application to communicate to the compositor its dependencies. I.e. 
Process A would say, "Also keep Process B's priority in sync with mine". 
This should be a simple bit of plumbing to allow Process A to share an 
fd from Process B with the compositor.


B --[pipe_send(fdB)]--> A --[compositor_ext_priority_group_add(fdB)]--> 
Compositor


Once the compositor is aware of all of A's dependencies, this can be 
handled in the same fashion as (ii).


A special extension would be required for compositor protocols to 
communicate the dependencies fd. Applications would also need to be 
updated to use this extension.


I think this case would be very uncommon, but it is something that we 
would be able to handle if the need arises.


> We need something like blocking the submitter instead (bad) or detection
> of dependencies in the scheduler (good, but tricky to implement).
>

I definitely agree that detecting dependencies is tricky. Which is why I 
prefer an approach where usermode defines the dependencies. It is simple 
for both the kernel and usermode to implement.


> Otherwise we can easily run into a deadlock situation with that approach.
>

The current API does allow you to deadlock yourself pretty easily if 
misused. But so do many other APIs, like having a thread trying to grab 
the same lock twice :)


Thanks for the comments,
Andres


Regards,
Christian.

Am 25.05.2017 um 02:00 schrieb Andres Rodriguez:

When multiple environments are running simultaneously on a system, e.g.
an X desktop + a SteamVR game session, it may be useful to sacrifice
performance in one environment in order to boost it on the other.

This series provides a mechanism for a DRM_MASTER to provide exclusive
gpu access to a group of processes.

Note: This series is built on the assumption that the drm lease patch
series will extend DRM_MASTER status to lessees.

The libdrm API we intend to provide is as follows:

/**
  * Set the priority of all contexts in a process
  *
  * This function will 

[PATCH i-g-t v2] tests/kms_setmode: increase MAX_CRTCS to 6

2017-05-30 Thread Harry Wentland
AMD GPUs can have 6 CRTCs.

This requires us to allocate the combinations on the heap.

v2:
  Of course we should free the new allocation. Thanks, Alex.

Signed-off-by: Harry Wentland 
---
 tests/kms_setmode.c | 28 ++--
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/tests/kms_setmode.c b/tests/kms_setmode.c
index 430568a1c24e..62905ec27e80 100644
--- a/tests/kms_setmode.c
+++ b/tests/kms_setmode.c
@@ -35,11 +35,13 @@
 #include "intel_bufmgr.h"
 
 #define MAX_CONNECTORS  10
-#define MAX_CRTCS   3
+#define MAX_CRTCS   6
 
 /* max combinations with repetitions */
+/* MAX_CONNECTORS ^ MAX_CRTCS */
+/* TODO should really be MAX_CONNECTORS ^ MAX_CONNECTORS ??? */
 #define MAX_COMBINATION_COUNT   \
-   (MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS)
+   (MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS * 
MAX_CONNECTORS * MAX_CONNECTORS)
 #define MAX_COMBINATION_ELEMS   MAX_CRTCS
 
 static int drm_fd;
@@ -743,22 +745,25 @@ static void get_combinations(int n, int k, bool 
allow_repetitions,
 static void test_combinations(const struct test_config *tconf,
  int connector_count)
 {
-   struct combination_set connector_combs;
-   struct combination_set crtc_combs;
+   struct combination_set *connector_combs;
+   struct combination_set *crtc_combs;
struct connector_config *cconfs;
int i;
 
if (connector_count > 2 && (tconf->flags & TEST_STEALING))
return;
 
+   connector_combs = malloc(sizeof(*connector_combs));
+   crtc_combs = malloc(sizeof(*crtc_combs));
+
get_combinations(tconf->resources->count_connectors, connector_count,
-false, &connector_combs);
+false, connector_combs);
get_combinations(tconf->resources->count_crtcs, connector_count,
-true, &crtc_combs);
+true, crtc_combs);
 
igt_info("Testing: %s %d connector combinations\n", tconf->name,
 connector_count);
-   for (i = 0; i < connector_combs.count; i++) {
+   for (i = 0; i < connector_combs->count; i++) {
int *connector_idxs;
int ret;
int j;
@@ -766,14 +771,14 @@ static void test_combinations(const struct test_config 
*tconf,
cconfs = malloc(sizeof(*cconfs) * connector_count);
igt_assert(cconfs);
 
-   connector_idxs = &connector_combs.items[i].elems[0];
+   connector_idxs = &connector_combs->items[i].elems[0];
ret = get_connectors(tconf->resources, connector_idxs,
 connector_count, cconfs);
if (ret < 0)
goto free_cconfs;
 
-   for (j = 0; j < crtc_combs.count; j++) {
-   int *crtc_idxs = &crtc_combs.items[j].elems[0];
+   for (j = 0; j < crtc_combs->count; j++) {
+   int *crtc_idxs = &crtc_combs->items[j].elems[0];
ret = assign_crtc_to_connectors(tconf, crtc_idxs,
connector_count,
cconfs);
@@ -787,6 +792,9 @@ static void test_combinations(const struct test_config 
*tconf,
 free_cconfs:
free(cconfs);
}
+
+   free(connector_combs);
+   free(crtc_combs);
 }
 
 static void run_test(const struct test_config *tconf)
-- 
2.11.0



Re: [PATCH i-g-t] tests/kms_setmode: increase MAX_CRTCS to 6

2017-05-30 Thread Alex Deucher
On Tue, May 30, 2017 at 4:01 PM, Harry Wentland  wrote:
> AMD GPUs can have 6 CRTCs.
>
> This requires us to allocate the combinations on the heap.
>
> Signed-off-by: Harry Wentland 
> ---
>  tests/kms_setmode.c | 25 +++--
>  1 file changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/tests/kms_setmode.c b/tests/kms_setmode.c
> index 430568a1c24e..847ad650d27f 100644
> --- a/tests/kms_setmode.c
> +++ b/tests/kms_setmode.c
> @@ -35,11 +35,13 @@
>  #include "intel_bufmgr.h"
>
>  #define MAX_CONNECTORS  10
> -#define MAX_CRTCS   3
> +#define MAX_CRTCS   6
>
>  /* max combinations with repetitions */
> +/* MAX_CONNECTORS ^ MAX_CRTCS */
> +/* TODO should really be MAX_CONNECTORS ^ MAX_CONNECTORS ??? */
>  #define MAX_COMBINATION_COUNT   \
> -   (MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS)
> +   (MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS * 
> MAX_CONNECTORS * MAX_CONNECTORS)
>  #define MAX_COMBINATION_ELEMS   MAX_CRTCS
>
>  static int drm_fd;
> @@ -743,22 +745,25 @@ static void get_combinations(int n, int k, bool 
> allow_repetitions,
>  static void test_combinations(const struct test_config *tconf,
>   int connector_count)
>  {
> -   struct combination_set connector_combs;
> -   struct combination_set crtc_combs;
> +   struct combination_set *connector_combs;
> +   struct combination_set *crtc_combs;
> struct connector_config *cconfs;
> int i;
>
> if (connector_count > 2 && (tconf->flags & TEST_STEALING))
> return;
>
> +   connector_combs = malloc(sizeof(*connector_combs));
> +   crtc_combs = malloc(sizeof(*crtc_combs));

Shouldn't this get freed somewhere?


> +
> get_combinations(tconf->resources->count_connectors, connector_count,
> -false, &connector_combs);
> +false, connector_combs);
> get_combinations(tconf->resources->count_crtcs, connector_count,
> -true, &crtc_combs);
> +true, crtc_combs);
>
> igt_info("Testing: %s %d connector combinations\n", tconf->name,
>  connector_count);
> -   for (i = 0; i < connector_combs.count; i++) {
> +   for (i = 0; i < connector_combs->count; i++) {
> int *connector_idxs;
> int ret;
> int j;
> @@ -766,14 +771,14 @@ static void test_combinations(const struct test_config 
> *tconf,
> cconfs = malloc(sizeof(*cconfs) * connector_count);
> igt_assert(cconfs);
>
> -   connector_idxs = &connector_combs.items[i].elems[0];
> +   connector_idxs = &connector_combs->items[i].elems[0];
> ret = get_connectors(tconf->resources, connector_idxs,
>  connector_count, cconfs);
> if (ret < 0)
> goto free_cconfs;
>
> -   for (j = 0; j < crtc_combs.count; j++) {
> -   int *crtc_idxs = &crtc_combs.items[j].elems[0];
> +   for (j = 0; j < crtc_combs->count; j++) {
> +   int *crtc_idxs = &crtc_combs->items[j].elems[0];
> ret = assign_crtc_to_connectors(tconf, crtc_idxs,
> connector_count,
> cconfs);
> --
> 2.11.0
>


[PATCH i-g-t] tests/kms_setmode: increase MAX_CRTCS to 6

2017-05-30 Thread Harry Wentland
AMD GPUs can have 6 CRTCs.

This requires us to allocate the combinations on the heap.

Signed-off-by: Harry Wentland 
---
 tests/kms_setmode.c | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/tests/kms_setmode.c b/tests/kms_setmode.c
index 430568a1c24e..847ad650d27f 100644
--- a/tests/kms_setmode.c
+++ b/tests/kms_setmode.c
@@ -35,11 +35,13 @@
 #include "intel_bufmgr.h"
 
 #define MAX_CONNECTORS  10
-#define MAX_CRTCS   3
+#define MAX_CRTCS   6
 
 /* max combinations with repetitions */
+/* MAX_CONNECTORS ^ MAX_CRTCS */
+/* TODO should really be MAX_CONNECTORS ^ MAX_CONNECTORS ??? */
 #define MAX_COMBINATION_COUNT   \
-   (MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS)
+   (MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS * MAX_CONNECTORS * 
MAX_CONNECTORS * MAX_CONNECTORS)
 #define MAX_COMBINATION_ELEMS   MAX_CRTCS
 
 static int drm_fd;
@@ -743,22 +745,25 @@ static void get_combinations(int n, int k, bool 
allow_repetitions,
 static void test_combinations(const struct test_config *tconf,
  int connector_count)
 {
-   struct combination_set connector_combs;
-   struct combination_set crtc_combs;
+   struct combination_set *connector_combs;
+   struct combination_set *crtc_combs;
struct connector_config *cconfs;
int i;
 
if (connector_count > 2 && (tconf->flags & TEST_STEALING))
return;
 
+   connector_combs = malloc(sizeof(*connector_combs));
+   crtc_combs = malloc(sizeof(*crtc_combs));
+
get_combinations(tconf->resources->count_connectors, connector_count,
-false, &connector_combs);
+false, connector_combs);
get_combinations(tconf->resources->count_crtcs, connector_count,
-true, &crtc_combs);
+true, crtc_combs);
 
igt_info("Testing: %s %d connector combinations\n", tconf->name,
 connector_count);
-   for (i = 0; i < connector_combs.count; i++) {
+   for (i = 0; i < connector_combs->count; i++) {
int *connector_idxs;
int ret;
int j;
@@ -766,14 +771,14 @@ static void test_combinations(const struct test_config 
*tconf,
cconfs = malloc(sizeof(*cconfs) * connector_count);
igt_assert(cconfs);
 
-   connector_idxs = &connector_combs.items[i].elems[0];
+   connector_idxs = &connector_combs->items[i].elems[0];
ret = get_connectors(tconf->resources, connector_idxs,
 connector_count, cconfs);
if (ret < 0)
goto free_cconfs;
 
-   for (j = 0; j < crtc_combs.count; j++) {
-   int *crtc_idxs = &crtc_combs.items[j].elems[0];
+   for (j = 0; j < crtc_combs->count; j++) {
+   int *crtc_idxs = &crtc_combs->items[j].elems[0];
ret = assign_crtc_to_connectors(tconf, crtc_idxs,
connector_count,
cconfs);
-- 
2.11.0



Re: [PATCH] drm/amdgpu: Program ring for vce instance 1 at its register space

2017-05-30 Thread Leo Liu



On 05/30/2017 12:15 PM, Christian König wrote:

Am 30.05.2017 um 17:56 schrieb Leo Liu:

We need to program the ring buffer in the instance 1 register space
domain only when instance 1 alone is available; with two instances, or
instance 0 only, we need to program only the instance 0 register space
domain for the ring.

Signed-off-by: Leo Liu 
Reviewed-by: Alex Deucher 
Cc: sta...@vger.kernel.org
---
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 95 
+--

  1 file changed, 68 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c

index fb08193..7e39e42 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -77,13 +77,26 @@ static int vce_v3_0_set_clockgating_state(void 
*handle,

  static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
  {
  struct amdgpu_device *adev = ring->adev;
+u32 v;
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+mutex_lock(&adev->grbm_idx_mutex);
+WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+}


Did you miss my comment? I think we should always grab the mutex and 
program mmGRBM_GFX_INDEX.


No I haven't , I did move the programming the ring regs from "vce_start" 
for Instance 0, to under mutex and  mmGRBM_GFX_INDEX.


IMO, I don't think we need this here, because "mutex lock + 
mmGRBM_GFX_INDEX" and "mutex unlock+mmGRBM_GFX_DEFAULT" are always 
paired, So it should be always in default space when instance 0, and we 
do take care the case for instance 1 only.
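For comparison, the unconditional variant being suggested would look roughly
like this (a sketch using the helpers from this patch; the GET_VCE_INSTANCE(0)
form for the default case is an assumption):

	mutex_lock(&adev->grbm_idx_mutex);
	if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0)
		WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
	else
		WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(0));

	/* ... ring register reads/writes ... */

	WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
	mutex_unlock(&adev->grbm_idx_mutex);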


I can follow your comment here and in the other places in my patch for
the wptr/rptr get/set functions.


Thanks,
Leo



Otherwise we silently rely on it being set correctly when the 
function is called, and have different code paths for different 
harvested configs.


Apart from that the patch looks good to me.

Christian.


if (ring == &adev->vce.ring[0])
-return RREG32(mmVCE_RB_RPTR);
+v = RREG32(mmVCE_RB_RPTR);
  else if (ring == &adev->vce.ring[1])
-return RREG32(mmVCE_RB_RPTR2);
+v = RREG32(mmVCE_RB_RPTR2);
  else
-return RREG32(mmVCE_RB_RPTR3);
+v = RREG32(mmVCE_RB_RPTR3);
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+mutex_unlock(&adev->grbm_idx_mutex);
+}
+
+return v;
  }
/**
@@ -96,13 +109,26 @@ static uint64_t vce_v3_0_ring_get_rptr(struct 
amdgpu_ring *ring)

  static uint64_t vce_v3_0_ring_get_wptr(struct amdgpu_ring *ring)
  {
  struct amdgpu_device *adev = ring->adev;
+u32 v;
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+mutex_lock(&adev->grbm_idx_mutex);
+WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+}
if (ring == &adev->vce.ring[0])
-return RREG32(mmVCE_RB_WPTR);
+v = RREG32(mmVCE_RB_WPTR);
  else if (ring == &adev->vce.ring[1])
-return RREG32(mmVCE_RB_WPTR2);
+v = RREG32(mmVCE_RB_WPTR2);
  else
-return RREG32(mmVCE_RB_WPTR3);
+v = RREG32(mmVCE_RB_WPTR3);
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+mutex_unlock(&adev->grbm_idx_mutex);
+}
+
+return v;
  }
/**
@@ -116,12 +142,22 @@ static void vce_v3_0_ring_set_wptr(struct 
amdgpu_ring *ring)

  {
  struct amdgpu_device *adev = ring->adev;
  +if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+mutex_lock(&adev->grbm_idx_mutex);
+WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+}
+
  if (ring == &adev->vce.ring[0])
  WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
  else if (ring == &adev->vce.ring[1])
  WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
  else
  WREG32(mmVCE_RB_WPTR3, lower_32_bits(ring->wptr));
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+mutex_unlock(&adev->grbm_idx_mutex);
+}
  }
static void vce_v3_0_override_vce_clock_gating(struct 
amdgpu_device *adev, bool override)
@@ -231,33 +267,38 @@ static int vce_v3_0_start(struct amdgpu_device 
*adev)

  struct amdgpu_ring *ring;
  int idx, r;
  -ring = &adev->vce.ring[0];
-WREG32(mmVCE_RB_RPTR, lower_32_bits(ring->wptr));
-WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
-WREG32(mmVCE_RB_BASE_LO, ring->gpu_addr);
-WREG32(mmVCE_RB_BASE_HI, upper_32_bits(ring->gpu_addr));
-WREG32(mmVCE_RB_SIZE, ring->ring_size / 4);
-
-ring = &adev->vce.ring[1];
-WREG32(mmVCE_RB_RPTR2, lower_32_bits(ring->wptr));
-WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
-WREG32(mmVCE_RB_BASE_LO2, ring->gpu_addr);
-WREG32(mmVCE_RB_BASE_HI2, upper_32_bits(ring->gpu_addr));
-WREG32(mmVCE_RB_SIZE2, ring->ring_size / 4);
-
-ring = &adev->vce.ring[2];
-WREG32(mmVCE_RB_RPTR3, lower_32_bits(ring->wptr));
-

Re: [PATCH] drm/amdgpu: Program ring for vce instance 1 at its register space

2017-05-30 Thread Christian König

Am 30.05.2017 um 17:56 schrieb Leo Liu:

We need to program the ring buffer in the instance 1 register space
domain only when instance 1 alone is available; with two instances, or
instance 0 only, we need to program only the instance 0 register space
domain for the ring.

Signed-off-by: Leo Liu 
Reviewed-by: Alex Deucher 
Cc: sta...@vger.kernel.org
---
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 95 +--
  1 file changed, 68 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index fb08193..7e39e42 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -77,13 +77,26 @@ static int vce_v3_0_set_clockgating_state(void *handle,
  static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
  {
struct amdgpu_device *adev = ring->adev;
+   u32 v;
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }


Did you miss my comment? I think we should always grab the mutex and 
program mmGRBM_GFX_INDEX.


Otherwise we silently rely on it being set correctly when the function 
is called, and have different code paths for different harvested configs.


Apart from that the patch looks good to me.

Christian.

  
  	if (ring == &adev->vce.ring[0])

-   return RREG32(mmVCE_RB_RPTR);
+   v = RREG32(mmVCE_RB_RPTR);
else if (ring == &adev->vce.ring[1])
-   return RREG32(mmVCE_RB_RPTR2);
+   v = RREG32(mmVCE_RB_RPTR2);
else
-   return RREG32(mmVCE_RB_RPTR3);
+   v = RREG32(mmVCE_RB_RPTR3);
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
+
+   return v;
  }
  
  /**

@@ -96,13 +109,26 @@ static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring 
*ring)
  static uint64_t vce_v3_0_ring_get_wptr(struct amdgpu_ring *ring)
  {
struct amdgpu_device *adev = ring->adev;
+   u32 v;
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }
  
  	if (ring == &adev->vce.ring[0])

-   return RREG32(mmVCE_RB_WPTR);
+   v = RREG32(mmVCE_RB_WPTR);
else if (ring == &adev->vce.ring[1])
-   return RREG32(mmVCE_RB_WPTR2);
+   v = RREG32(mmVCE_RB_WPTR2);
else
-   return RREG32(mmVCE_RB_WPTR3);
+   v = RREG32(mmVCE_RB_WPTR3);
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
+
+   return v;
  }
  
  /**

@@ -116,12 +142,22 @@ static void vce_v3_0_ring_set_wptr(struct amdgpu_ring 
*ring)
  {
struct amdgpu_device *adev = ring->adev;
  
+	if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {

+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }
+
if (ring == &adev->vce.ring[0])
WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
else if (ring == &adev->vce.ring[1])
WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
else
WREG32(mmVCE_RB_WPTR3, lower_32_bits(ring->wptr));
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
  }
  
  static void vce_v3_0_override_vce_clock_gating(struct amdgpu_device *adev, bool override)

@@ -231,33 +267,38 @@ static int vce_v3_0_start(struct amdgpu_device *adev)
struct amdgpu_ring *ring;
int idx, r;
  
-	ring = &adev->vce.ring[0];

-   WREG32(mmVCE_RB_RPTR, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_BASE_LO, ring->gpu_addr);
-   WREG32(mmVCE_RB_BASE_HI, upper_32_bits(ring->gpu_addr));
-   WREG32(mmVCE_RB_SIZE, ring->ring_size / 4);
-
-   ring = &adev->vce.ring[1];
-   WREG32(mmVCE_RB_RPTR2, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_BASE_LO2, ring->gpu_addr);
-   WREG32(mmVCE_RB_BASE_HI2, upper_32_bits(ring->gpu_addr));
-   WREG32(mmVCE_RB_SIZE2, ring->ring_size / 4);
-
-   ring = &adev->vce.ring[2];
-   WREG32(mmVCE_RB_RPTR3, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_WPTR3, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_BASE_LO3, ring->gpu_addr);
-   WREG32(mmVCE_RB_BASE_HI3, upper_32_bits(ring->gpu_addr));
-   WREG32(mmVCE_RB_SIZE3, ring->ring_size / 4);
-

Re: [PATCH] drm/amdgpu: Program ring for vce instance 1 at its own register space

2017-05-30 Thread Leo Liu



On 05/30/2017 11:13 AM, Christian König wrote:

Am 30.05.2017 um 16:57 schrieb Deucher, Alexander:

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Leo Liu
Sent: Monday, May 29, 2017 2:22 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Leo
Subject: [PATCH] drm/amdgpu: Program ring for vce instance 1 at its own
register space

when the harvested part has only instance 1 available

Signed-off-by: Leo Liu 
---
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 61
+++
  1 file changed, 55 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index fb08193..77af395 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -77,13 +77,26 @@ static int vce_v3_0_set_clockgating_state(void
*handle,
  static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
  {
  struct amdgpu_device *adev = ring->adev;
+u32 v;
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+mutex_lock(&adev->grbm_idx_mutex);
+WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+}

  if (ring == &adev->vce.ring[0])
-return RREG32(mmVCE_RB_RPTR);
+v = RREG32(mmVCE_RB_RPTR);
  else if (ring == &adev->vce.ring[1])
-return RREG32(mmVCE_RB_RPTR2);
+v = RREG32(mmVCE_RB_RPTR2);
  else
-return RREG32(mmVCE_RB_RPTR3);
+v = RREG32(mmVCE_RB_RPTR3);
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+WREG32(mmGRBM_GFX_INDEX,
mmGRBM_GFX_INDEX_DEFAULT);
+mutex_unlock(&adev->grbm_idx_mutex);
+}
+
+return v;
  }

  /**
@@ -96,13 +109,26 @@ static uint64_t vce_v3_0_ring_get_rptr(struct
amdgpu_ring *ring)
  static uint64_t vce_v3_0_ring_get_wptr(struct amdgpu_ring *ring)
  {
  struct amdgpu_device *adev = ring->adev;
+u32 v;
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+mutex_lock(&adev->grbm_idx_mutex);
+WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+}

  if (ring == &adev->vce.ring[0])
-return RREG32(mmVCE_RB_WPTR);
+v = RREG32(mmVCE_RB_WPTR);
  else if (ring == &adev->vce.ring[1])
-return RREG32(mmVCE_RB_WPTR2);
+v = RREG32(mmVCE_RB_WPTR2);
  else
-return RREG32(mmVCE_RB_WPTR3);
+v = RREG32(mmVCE_RB_WPTR3);
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+WREG32(mmGRBM_GFX_INDEX,
mmGRBM_GFX_INDEX_DEFAULT);
+mutex_unlock(&adev->grbm_idx_mutex);
+}
+
+return v;
  }

  /**
@@ -116,12 +142,22 @@ static void vce_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
  {
  struct amdgpu_device *adev = ring->adev;

+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+mutex_lock(&adev->grbm_idx_mutex);
+WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+}
+
  if (ring == &adev->vce.ring[0])
  WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
  else if (ring == &adev->vce.ring[1])
  WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
  else
  WREG32(mmVCE_RB_WPTR3, lower_32_bits(ring->wptr));
+
+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+WREG32(mmGRBM_GFX_INDEX,
mmGRBM_GFX_INDEX_DEFAULT);
+mutex_unlock(&adev->grbm_idx_mutex);
+}
  }

  static void vce_v3_0_override_vce_clock_gating(struct amdgpu_device
*adev, bool override)
@@ -231,6 +267,14 @@ static int vce_v3_0_start(struct amdgpu_device
*adev)
  struct amdgpu_ring *ring;
  int idx, r;

+/* we need to program the ring buffer in the instance 1 register space
+domain only when instance 1 alone is available; with two instances or
+instance 0 only, we need to program only the instance 0 register space
+domain for the ring */

Please add this comment to the commit message as well.  With that fixed:
Reviewed-by: Alex Deucher 
Also, add CC: sta...@vger.kernel.org


+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+mutex_lock(&adev->grbm_idx_mutex);
+WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+}


I would prefer if we always grab the mutex and program 
mmGRBM_GFX_INDEX with the non-harvested instance.


This way we don't run into bad surprises any more when we only test on 
boards where the second VCE instance is harvested.


Okay, the v2 patch will be sent shortly.

Leo




Christian.


+
  ring = &adev->vce.ring[0];
  WREG32(mmVCE_RB_RPTR, lower_32_bits(ring->wptr));
  WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
@@ -252,6 +296,11 @@ static int vce_v3_0_start(struct amdgpu_device
*adev)
  WREG32(mmVCE_RB_BASE_HI3, upper_32_bits(ring->gpu_addr));
  WREG32(mmVCE_RB_SIZE3, ring->ring_size / 4);

+if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+WREG32(mmGRBM_GFX_INDEX,
mmGRBM_GFX_INDEX_DEFAULT);
+mutex_unlock(&adev->grbm_idx_mutex);
+}
+
  mutex_lock(&adev->grbm_idx_mutex);
  for (idx = 0; idx < 2; ++idx) {
  if 

RE: [PATCH xf86-video-amdgpu] Use reference counting for tracking KMS framebuffer lifetimes

2017-05-30 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Michel Dänzer
> Sent: Monday, May 29, 2017 4:08 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH xf86-video-amdgpu] Use reference counting for tracking
> KMS framebuffer lifetimes
> 
> From: Michel Dänzer 
> 
> References are held by the pixmaps corresponding to the FBs (so
> the same KMS FB can be reused as long as the pixmap exists) and by the
> CRTCs scanning out from them (so a KMS FB is only destroyed once it's
> not being scanned out anymore, preventing intermittent black screens and
> worse issues due to a CRTC turning off when it should be on).
> 
> v2:
> * Only increase reference count in drmmode_fb_reference if it was sane
>   before
> * Make drmmode_fb_reference's indentation match the rest of
>   drmmode_display.h
> 
> (Ported from radeon commit 55e513b978b2afc52b7cafc5bfcb0d1dc78d75f6)
> 
> Signed-off-by: Michel Dänzer 

Reviewed-by: Alex Deucher 

> ---
>  src/amdgpu_kms.c  |  70 +++---
>  src/amdgpu_pixmap.h   |  58 +++
>  src/amdgpu_present.c  |   9 ---
>  src/drmmode_display.c | 157 -
> -
>  src/drmmode_display.h |  42 --
>  5 files changed, 219 insertions(+), 117 deletions(-)
> 
> diff --git a/src/amdgpu_kms.c b/src/amdgpu_kms.c
> index a418cf9d3..69d61943d 100644
> --- a/src/amdgpu_kms.c
> +++ b/src/amdgpu_kms.c
> @@ -675,6 +675,18 @@ amdgpu_prime_scanout_flip_abort(xf86CrtcPtr crtc,
> void *event_data)
>  }
> 
>  static void
> +amdgpu_prime_scanout_flip_handler(xf86CrtcPtr crtc, uint32_t msc,
> uint64_t usec,
> +   void *event_data)
> +{
> + AMDGPUEntPtr pAMDGPUEnt = AMDGPUEntPriv(crtc->scrn);
> + drmmode_crtc_private_ptr drmmode_crtc = event_data;
> +
> + drmmode_fb_reference(pAMDGPUEnt->fd, &drmmode_crtc->fb,
> +  drmmode_crtc->flip_pending);
> + amdgpu_prime_scanout_flip_abort(crtc, event_data);
> +}
> +
> +static void
>  amdgpu_prime_scanout_flip(PixmapDirtyUpdatePtr ent)
>  {
>   ScreenPtr screen = ent->slave_dst->drawable.pScreen;
> @@ -701,7 +713,8 @@ amdgpu_prime_scanout_flip(PixmapDirtyUpdatePtr
> ent)
>   drm_queue_seq = amdgpu_drm_queue_alloc(crtc,
> 
> AMDGPU_DRM_QUEUE_CLIENT_DEFAULT,
> 
> AMDGPU_DRM_QUEUE_ID_DEFAULT,
> -drmmode_crtc, NULL,
> +drmmode_crtc,
> +
> amdgpu_prime_scanout_flip_handler,
> 
> amdgpu_prime_scanout_flip_abort);
>   if (drm_queue_seq == AMDGPU_DRM_QUEUE_ERROR) {
>   xf86DrvMsg(scrn->scrnIndex, X_WARNING,
> @@ -709,8 +722,17 @@ amdgpu_prime_scanout_flip(PixmapDirtyUpdatePtr
> ent)
>   return;
>   }
> 
> + drmmode_crtc->flip_pending =
> + amdgpu_pixmap_get_fb(drmmode_crtc->scanout[scanout_id].pixmap);
> + if (!drmmode_crtc->flip_pending) {
> + xf86DrvMsg(scrn->scrnIndex, X_WARNING,
> +"Failed to get FB for PRIME flip.\n");
> + amdgpu_drm_abort_entry(drm_queue_seq);
> + return;
> + }
> +
>   if (drmmode_page_flip_target_relative(pAMDGPUEnt,
> drmmode_crtc,
> -   drmmode_crtc-
> >scanout[scanout_id].fb_id,
> +   drmmode_crtc->flip_pending-
> >handle,
> 0, drm_queue_seq, 0) != 0) {
>   xf86DrvMsg(scrn->scrnIndex, X_WARNING, "flip queue
> failed in %s: %s\n",
>  __func__, strerror(errno));
> @@ -720,7 +742,6 @@ amdgpu_prime_scanout_flip(PixmapDirtyUpdatePtr
> ent)
> 
>   drmmode_crtc->scanout_id = scanout_id;
>   drmmode_crtc->scanout_update_pending = TRUE;
> - drmmode_crtc->flip_pending = TRUE;
>  }
> 
>  static void
> @@ -950,10 +971,14 @@ amdgpu_scanout_update(xf86CrtcPtr xf86_crtc)
>  static void
>  amdgpu_scanout_flip_abort(xf86CrtcPtr crtc, void *event_data)
>  {
> - drmmode_crtc_private_ptr drmmode_crtc = event_data;
> + amdgpu_prime_scanout_flip_abort(crtc, event_data);
> +}
> 
> - drmmode_crtc->scanout_update_pending = FALSE;
> - drmmode_clear_pending_flip(crtc);
> +static void
> +amdgpu_scanout_flip_handler(xf86CrtcPtr crtc, uint32_t msc, uint64_t
> usec,
> + void *event_data)
> +{
> + amdgpu_prime_scanout_flip_handler(crtc, msc, usec, event_data);
>  }
> 
>  static void
> @@ -977,7 +1002,8 @@ amdgpu_scanout_flip(ScreenPtr pScreen,
> AMDGPUInfoPtr info,
>   drm_queue_seq = amdgpu_drm_queue_alloc(xf86_crtc,
> 
> AMDGPU_DRM_QUEUE_CLIENT_DEFAULT,
> 
> AMDGPU_DRM_QUEUE_ID_DEFAULT,
> -drmmode_crtc, NULL,
> +drmmode_crtc,
> +amdgpu_scanout_flip_handler,

[PATCH] drm/amdgpu: Program ring for vce instance 1 at its register space

2017-05-30 Thread Leo Liu
We need to program the ring buffer in the instance 1 register space
domain only when instance 1 alone is available; with two instances, or
instance 0 only, we need to program only the instance 0 register space
domain for the ring.

Signed-off-by: Leo Liu 
Reviewed-by: Alex Deucher 
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 95 +--
 1 file changed, 68 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index fb08193..7e39e42 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -77,13 +77,26 @@ static int vce_v3_0_set_clockgating_state(void *handle,
 static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
 {
struct amdgpu_device *adev = ring->adev;
+   u32 v;
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }
 
	if (ring == &adev->vce.ring[0])
-   return RREG32(mmVCE_RB_RPTR);
+   v = RREG32(mmVCE_RB_RPTR);
	else if (ring == &adev->vce.ring[1])
-   return RREG32(mmVCE_RB_RPTR2);
+   v = RREG32(mmVCE_RB_RPTR2);
else
-   return RREG32(mmVCE_RB_RPTR3);
+   v = RREG32(mmVCE_RB_RPTR3);
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
+
+   return v;
 }
 
 /**
@@ -96,13 +109,26 @@ static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring 
*ring)
 static uint64_t vce_v3_0_ring_get_wptr(struct amdgpu_ring *ring)
 {
struct amdgpu_device *adev = ring->adev;
+   u32 v;
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }
 
	if (ring == &adev->vce.ring[0])
-   return RREG32(mmVCE_RB_WPTR);
+   v = RREG32(mmVCE_RB_WPTR);
	else if (ring == &adev->vce.ring[1])
-   return RREG32(mmVCE_RB_WPTR2);
+   v = RREG32(mmVCE_RB_WPTR2);
else
-   return RREG32(mmVCE_RB_WPTR3);
+   v = RREG32(mmVCE_RB_WPTR3);
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
+
+   return v;
 }
 
 /**
@@ -116,12 +142,22 @@ static void vce_v3_0_ring_set_wptr(struct amdgpu_ring 
*ring)
 {
struct amdgpu_device *adev = ring->adev;
 
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }
+
	if (ring == &adev->vce.ring[0])
WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
	else if (ring == &adev->vce.ring[1])
WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
else
WREG32(mmVCE_RB_WPTR3, lower_32_bits(ring->wptr));
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
 }
 
 static void vce_v3_0_override_vce_clock_gating(struct amdgpu_device *adev, 
bool override)
@@ -231,33 +267,38 @@ static int vce_v3_0_start(struct amdgpu_device *adev)
struct amdgpu_ring *ring;
int idx, r;
 
-   ring = &adev->vce.ring[0];
-   WREG32(mmVCE_RB_RPTR, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_BASE_LO, ring->gpu_addr);
-   WREG32(mmVCE_RB_BASE_HI, upper_32_bits(ring->gpu_addr));
-   WREG32(mmVCE_RB_SIZE, ring->ring_size / 4);
-
-   ring = &adev->vce.ring[1];
-   WREG32(mmVCE_RB_RPTR2, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_BASE_LO2, ring->gpu_addr);
-   WREG32(mmVCE_RB_BASE_HI2, upper_32_bits(ring->gpu_addr));
-   WREG32(mmVCE_RB_SIZE2, ring->ring_size / 4);
-
-   ring = &adev->vce.ring[2];
-   WREG32(mmVCE_RB_RPTR3, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_WPTR3, lower_32_bits(ring->wptr));
-   WREG32(mmVCE_RB_BASE_LO3, ring->gpu_addr);
-   WREG32(mmVCE_RB_BASE_HI3, upper_32_bits(ring->gpu_addr));
-   WREG32(mmVCE_RB_SIZE3, ring->ring_size / 4);
-
	mutex_lock(&adev->grbm_idx_mutex);
for (idx = 0; idx < 2; ++idx) {
if (adev->vce.harvest_config & (1 << idx))
continue;
 
WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(idx));
+
+   /* Program instance 0 reg space for two instances or instance 0 case,
+   program instance 1 reg space for only instance 1 available case */

Re: iMac 10,1 with Ubuntu 16.04: black screen after suspend

2017-05-30 Thread Florian Echtler
On 26.05.2017 23:03, Lukas Wunner wrote:
> On Fri, May 26, 2017 at 02:13:29PM +0200, Florian Echtler wrote:
>> I'm running Ubuntu 16.04.2 on a 27" unibody iMac 10,1 from 2009. However, 
>> even
>> with the most recent HWE stack (kernel 4.8), the display stays black after
>> suspend. I can ssh into the machine, so wakeup is OK, but the display is
>> entirely black (backlight stays off).
>>
> The ATI card has MXM_PNL_PWR_EN and MXM_PNL_BL_EN pins.  Those must be
> enabled for the panel to light up.  Perhaps radeon needs to be extended
> to do that.  One theory is that this is done via ACPI, but perhaps
> only if a certain operating system is running.  Dump the ACPI tables
> and search DSDT and SSDT* tables for methods relating to the ATI card
> and/or backlight.  If you find checks for OSDW there (e.g. in _PS0,
> i.e. on resume), it means the AML code is only executed on Darwin.
I've already had a look at the ACPI tables; there seem to be no references to
the "official" backlight/display control functions (no _BL? or _DO? methods).

> Conversely !OSDW means the code is only executed on Windows (Boot Camp).
> Linux masquerades as Darwin, but this can be disabled with
> acpi_osi=!Darwin on the command line.  It's worth trying if that changes
> the behaviour.  If it does, then macOS likely sets those pins on resume
> and we need to do the same in radeon.
There are OSDW checks present, however, booting with acpi_osi=!Darwin doesn't
make a difference to the backlight behaviour.

I'd be grateful for other suggestions...

Best regards, Florian
-- 
SENT FROM MY DEC VT50 TERMINAL




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [RFC] Exclusive gpu access for SteamVR usecases

2017-05-30 Thread Christian König

Looks like a good start, but a few notes in general:

1. Split the patches into two sets.

One for implementing the priority changes and one for limiting the 
priorities.


2. How are the priorities from processes supposed to interact with the 
per-context priority?


3. Thinking more about it, we can't limit the minimum priority in the 
scheduler.


For example, a low-priority job might block resources (e.g. VRAM memory) 
that the high-priority job needs to run.


We need something like blocking the submitter instead (bad) or detection 
of dependencies in the scheduler (good, but tricky to implement).


Otherwise we can easily run into a deadlock situation with that approach.

Regards,
Christian.

On 25.05.2017 at 02:00, Andres Rodriguez wrote:

When multiple environments are running simultaneously on a system, e.g.
an X desktop + a SteamVR game session, it may be useful to sacrifice
performance in one environment in order to boost it on the other.

This series provides a mechanism for a DRM_MASTER to provide exclusive
gpu access to a group of processes.

Note: This series is built on the assumption that the drm lease patch series
will extend DRM_MASTER status to lessees.

The libdrm API we intend to provide is as follows:

/**
  * Set the priority of all contexts in a process
  *
  * This function will change the priority of all contexts owned by
  * the process identified by fd.
  *
  * \param dev - \c [in] device handle
  * \param fd  - \c [in] fd from target process
  * \param priority - \c [in] target priority AMDGPU_CTX_PRIORITY_*
  *
  * \return  0 on success\n
  * <0 - Negative POSIX error code
  *
  * \notes @fd can be *any* file descriptor from the target process.
  * \notes this function requires DRM_MASTER
  */
int amdgpu_sched_process_priority_set(amdgpu_device_handle dev,
  int fd, int32_t priority);

/**
  * Request to raise the minimum required priority to schedule a gpu job
  *
  * Submit a request to increase the minimum required priority to schedule
  * a gpu job. Once this function returns, the gpu scheduler will no longer
  * consider jobs from contexts with priority lower than @priority.
  *
  * The minimum priority considered by the scheduler will be the highest from
  * all currently active requests.
  *
  * Requests are refcounted, and must be balanced using
  * amdgpu_sched_min_priority_put()
  *
  * \param dev - \c [in] device handle
  * \param priority - \c [in] target priority AMDGPU_CTX_PRIORITY_*
  *
  * \return  0 on success\n
  * <0 - Negative POSIX error code
  *
  * \notes this function requires DRM_MASTER
  */
int amdgpu_sched_min_priority_get(amdgpu_device_handle dev,
  int32_t priority);

/**
  * Drop a request to raise the minimum required scheduler priority
  *
  * This call balances amdgpu_sched_min_priority_get()
  *
  * If no other active requests exist for @priority, the minimum required
  * priority will decay to a lower level until one is reached with an active
  * request or the lowest priority is reached.
  *
  * \param dev - \c [in] device handle
  * \param priority - \c [in] target priority AMDGPU_CTX_PRIORITY_*
  *
  * \return  0 on success\n
  * <0 - Negative POSIX error code
  *
  * \notes this function requires DRM_MASTER
  */
int amdgpu_sched_min_priority_put(amdgpu_device_handle dev,
  int32_t priority);

Using this API, VRComposer can raise the priority of the VRapp and itself. Then
it can restrict the minimum scheduler priority in order to become exclusive gpu
clients.
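
For illustration, a minimal sketch of that flow built on the proposed calls
above (AMDGPU_CTX_PRIORITY_HIGH is assumed to be one of the
AMDGPU_CTX_PRIORITY_* values; fd handling and error paths are simplified):

	/* Compositor-side sketch: raise both processes to HIGH, then
	 * fence off everything below HIGH. Requires DRM_MASTER. */
	static int vr_become_exclusive(amdgpu_device_handle dev,
				       int compositor_fd, int vrapp_fd)
	{
		int r;

		r = amdgpu_sched_process_priority_set(dev, compositor_fd,
						      AMDGPU_CTX_PRIORITY_HIGH);
		if (r < 0)
			return r;

		r = amdgpu_sched_process_priority_set(dev, vrapp_fd,
						      AMDGPU_CTX_PRIORITY_HIGH);
		if (r < 0)
			return r;

		/* From here on the scheduler ignores contexts below HIGH
		 * until the balancing _put() call. */
		return amdgpu_sched_min_priority_get(dev,
						     AMDGPU_CTX_PRIORITY_HIGH);
	}

	static void vr_release_exclusive(amdgpu_device_handle dev)
	{
		/* Balance the earlier _get(); the priority floor decays. */
		amdgpu_sched_min_priority_put(dev, AMDGPU_CTX_PRIORITY_HIGH);
	}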

One of the areas I'd like feedback on is the following scenario. If a VRapp opens
a new fd and creates a new context after a call to set_priority, this specific
context will be lower priority than the rest. If the minimum required priority
is then raised, it is possible that this new context will be starved and
deadlock the VRapp.

One solution I had in mind to address this situation is to make set_priority
also raise the priority of future contexts created by the VRapp. However, that
would require keeping track of the requested priority on a per-process data
structure. The current design appears to steer clear of keeping any process-
specific data; everything is instead stored on a per-file basis, which is
why I did not pursue this approach. But if this is something you'd like me to
implement let me know.

One could also argue that preventing an application deadlock should be handled
between the VRComposer and the VRApp. It is not the kernel's responsibility to
babysit userspace applications and prevent them from shooting themselves
in the foot. The same could be achieved by improper usage of shared fences
between processes.

Thoughts/feedback/comments on this issue, or others, are appreciated.

Regards,
Andres
  



___
amd-gfx mailing list

Re: [PATCH] drm/amdgpu: Program ring for vce instance 1 at its own register space

2017-05-30 Thread Christian König

On 30.05.2017 at 16:57, Deucher, Alexander wrote:

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Leo Liu
Sent: Monday, May 29, 2017 2:22 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Leo
Subject: [PATCH] drm/amdgpu: Program ring for vce instance 1 at its own
register space

when the harvested part has only instance 1 available

Signed-off-by: Leo Liu 
---
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 61
+++
  1 file changed, 55 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index fb08193..77af395 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -77,13 +77,26 @@ static int vce_v3_0_set_clockgating_state(void
*handle,
  static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
  {
struct amdgpu_device *adev = ring->adev;
+   u32 v;
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }

	if (ring == &adev->vce.ring[0])
-   return RREG32(mmVCE_RB_RPTR);
+   v = RREG32(mmVCE_RB_RPTR);
	else if (ring == &adev->vce.ring[1])
-   return RREG32(mmVCE_RB_RPTR2);
+   v = RREG32(mmVCE_RB_RPTR2);
else
-   return RREG32(mmVCE_RB_RPTR3);
+   v = RREG32(mmVCE_RB_RPTR3);
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
+
+   return v;
  }

  /**
@@ -96,13 +109,26 @@ static uint64_t vce_v3_0_ring_get_rptr(struct
amdgpu_ring *ring)
  static uint64_t vce_v3_0_ring_get_wptr(struct amdgpu_ring *ring)
  {
struct amdgpu_device *adev = ring->adev;
+   u32 v;
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }

	if (ring == &adev->vce.ring[0])
-   return RREG32(mmVCE_RB_WPTR);
+   v = RREG32(mmVCE_RB_WPTR);
	else if (ring == &adev->vce.ring[1])
-   return RREG32(mmVCE_RB_WPTR2);
+   v = RREG32(mmVCE_RB_WPTR2);
else
-   return RREG32(mmVCE_RB_WPTR3);
+   v = RREG32(mmVCE_RB_WPTR3);
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
+
+   return v;
  }

  /**
@@ -116,12 +142,22 @@ static void vce_v3_0_ring_set_wptr(struct
amdgpu_ring *ring)
  {
struct amdgpu_device *adev = ring->adev;

+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }
+
	if (ring == &adev->vce.ring[0])
WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
	else if (ring == &adev->vce.ring[1])
WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
else
WREG32(mmVCE_RB_WPTR3, lower_32_bits(ring->wptr));
+
+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
+   mutex_unlock(&adev->grbm_idx_mutex);
+   }
  }

  static void vce_v3_0_override_vce_clock_gating(struct amdgpu_device
*adev, bool override)
@@ -231,6 +267,14 @@ static int vce_v3_0_start(struct amdgpu_device
*adev)
struct amdgpu_ring *ring;
int idx, r;

+   /* We need to program the ring buffer on the instance 1 register space
+   domain when only instance 1 is available; with two instances or just
+   instance 0, we only need to program the instance 0 register space domain */

Please add this comment to the commit message as well.  With that fixed:
Reviewed-by: Alex Deucher 
Also, add CC: sta...@vger.kernel.org


+   if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
+   mutex_lock(&adev->grbm_idx_mutex);
+   WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
+   }


I would prefer if we always grab the mutex and program mmGRBM_GFX_INDEX 
with the not harvested instance.


This way we don't run into bad surprises any more when we only test on 
boards where the second VCE instance is harvested.


Christian.
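
A rough sketch of that suggestion, for illustration (vce_v3_0_get_phys_instance()
is a made-up helper name, not the committed code):

	static int vce_v3_0_get_phys_instance(struct amdgpu_device *adev)
	{
		/* If instance 0 is harvested, only instance 1 remains usable. */
		return (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) ? 1 : 0;
	}

	static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
	{
		struct amdgpu_device *adev = ring->adev;
		u32 v;

		/* Always take the mutex and select the surviving instance. */
		mutex_lock(&adev->grbm_idx_mutex);
		WREG32(mmGRBM_GFX_INDEX,
		       GET_VCE_INSTANCE(vce_v3_0_get_phys_instance(adev)));

		if (ring == &adev->vce.ring[0])
			v = RREG32(mmVCE_RB_RPTR);
		else if (ring == &adev->vce.ring[1])
			v = RREG32(mmVCE_RB_RPTR2);
		else
			v = RREG32(mmVCE_RB_RPTR3);

		WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
		mutex_unlock(&adev->grbm_idx_mutex);

		return v;
	}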


+
	ring = &adev->vce.ring[0];
WREG32(mmVCE_RB_RPTR, lower_32_bits(ring->wptr));
WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
@@ -252,6 +296,11 @@ static int vce_v3_0_start(struct amdgpu_device
*adev)
WREG32(mmVCE_RB_BASE_HI3, upper_32_bits(ring->gpu_addr));
WREG32(mmVCE_RB_SIZE3, ring->ring_size / 4);

+   if (adev->vce.harvest_config == 

RE: [PATCH 3/3] drm/amdgpu/radeon: Use radeon by default for CIK GPUs

2017-05-30 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Michel Dänzer
> Sent: Monday, May 29, 2017 5:20 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH 3/3] drm/amdgpu/radeon: Use radeon by default for CIK
> GPUs
> 
> From: Michel Dänzer 
> 
> Even if CONFIG_DRM_AMDGPU_CIK is enabled.
> 
> There is no feature parity yet for CIK, in particular amdgpu doesn't
> support HDMI/DisplayPort without DC.

"HDMI/DisplayPort audio"

With that fixed, the series is:
Reviewed-by: Alex Deucher 

> 
> Signed-off-by: Michel Dänzer 
> ---
>  drivers/gpu/drm/amd/amdgpu/Kconfig  | 8 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 7 +--
>  drivers/gpu/drm/radeon/radeon_drv.c | 4 ++--
>  drivers/gpu/drm/radeon/radeon_kms.c | 5 +
>  5 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig
> b/drivers/gpu/drm/amd/amdgpu/Kconfig
> index 8d36087fc186..e0121f8b436e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Kconfig
> +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
> @@ -17,11 +17,11 @@ config DRM_AMDGPU_CIK
>   help
> Choose this option if you want to enable support for CIK asics.
> 
> -   CIK is already supported in radeon. If you enable this option,
> -   support for CIK will be provided by amdgpu and disabled in
> -   radeon by default. Use module options to override this:
> +   CIK is already supported in radeon. Support for SI in amdgpu
> +   will be disabled by default and is still provided by radeon.
> +   Use module options to override this:
> 
> -   radeon.cik_support=1 amdgpu.cik_support=0
> +   radeon.cik_support=0 amdgpu.cik_support=1
> 
>  config DRM_AMDGPU_USERPTR
>   bool "Always enable userptr write support"
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 76dea5fe620b..27599db7d630 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -240,8 +240,8 @@ module_param_named(si_support,
> amdgpu_si_support, int, 0444);
>  #endif
> 
>  #ifdef CONFIG_DRM_AMDGPU_CIK
> -int amdgpu_cik_support = 1;
> -MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = disabled)");
> +int amdgpu_cik_support = 0;
> +MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled (default))");
>  module_param_named(cik_support, amdgpu_cik_support, int, 0444);
>  #endif
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index f4f77b99afeb..31901d2886d9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -98,7 +98,7 @@ int amdgpu_driver_load_kms(struct drm_device *dev,
> unsigned long flags)
>   dev_info(dev->dev,
>"SI support provided by radeon.\n");
>   dev_info(dev->dev,
> - "Use radeon.si_support=0 amdgpu.si_support=1 to override.\n"
> +  "Use radeon.si_support=0 amdgpu.si_support=1 to override.\n"
>   );
>   return -ENODEV;
>   }
> @@ -113,7 +113,10 @@ int amdgpu_driver_load_kms(struct drm_device
> *dev, unsigned long flags)
>   case CHIP_KABINI:
>   case CHIP_MULLINS:
>   dev_info(dev->dev,
> -  "CIK support disabled by module param\n");
> +  "CIK support provided by radeon.\n");
> + dev_info(dev->dev,
> +  "Use radeon.cik_support=0 amdgpu.cik_support=1 to override.\n"
> + );
>   return -ENODEV;
>   }
>   }
> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c
> b/drivers/gpu/drm/radeon/radeon_drv.c
> index 1cefadfe4a41..853ef118a735 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -307,8 +307,8 @@ int radeon_si_support = 1;
>  MODULE_PARM_DESC(si_support, "SI support (1 = enabled (default), 0 = disabled)");
>  module_param_named(si_support, radeon_si_support, int, 0444);
> 
> -int radeon_cik_support = 0;
> -MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled (default))");
> +int radeon_cik_support = 1;
> +MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = disabled)");
>  module_param_named(cik_support, radeon_cik_support, int, 0444);
> 
>  static struct pci_device_id pciidlist[] = {
> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c
> b/drivers/gpu/drm/radeon/radeon_kms.c
> index baaddf989f6d..8f2579772c34 100644
> --- a/drivers/gpu/drm/radeon/radeon_kms.c
> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
> @@ -118,10 +118,7 @@ int 

RE: [PATCH] drm/amdgpu: Program ring for vce instance 1 at its own register space

2017-05-30 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Leo Liu
> Sent: Monday, May 29, 2017 2:22 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Liu, Leo
> Subject: [PATCH] drm/amdgpu: Program ring for vce instance 1 at its own
> register space
> 
> when the harvested part has only instance 1 available
> 
> Signed-off-by: Leo Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 61
> +++
>  1 file changed, 55 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> index fb08193..77af395 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> @@ -77,13 +77,26 @@ static int vce_v3_0_set_clockgating_state(void
> *handle,
>  static uint64_t vce_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
>  {
>   struct amdgpu_device *adev = ring->adev;
> + u32 v;
> +
> + if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
> + mutex_lock(&adev->grbm_idx_mutex);
> + WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
> + }
> 
>   if (ring == &adev->vce.ring[0])
> - return RREG32(mmVCE_RB_RPTR);
> + v = RREG32(mmVCE_RB_RPTR);
>   else if (ring == &adev->vce.ring[1])
> - return RREG32(mmVCE_RB_RPTR2);
> + v = RREG32(mmVCE_RB_RPTR2);
>   else
> - return RREG32(mmVCE_RB_RPTR3);
> + v = RREG32(mmVCE_RB_RPTR3);
> +
> + if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
> + WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
> + mutex_unlock(&adev->grbm_idx_mutex);
> + }
> +
> + return v;
>  }
> 
>  /**
> @@ -96,13 +109,26 @@ static uint64_t vce_v3_0_ring_get_rptr(struct
> amdgpu_ring *ring)
>  static uint64_t vce_v3_0_ring_get_wptr(struct amdgpu_ring *ring)
>  {
>   struct amdgpu_device *adev = ring->adev;
> + u32 v;
> +
> + if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
> + mutex_lock(&adev->grbm_idx_mutex);
> + WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
> + }
> 
>   if (ring == &adev->vce.ring[0])
> - return RREG32(mmVCE_RB_WPTR);
> + v = RREG32(mmVCE_RB_WPTR);
>   else if (ring == &adev->vce.ring[1])
> - return RREG32(mmVCE_RB_WPTR2);
> + v = RREG32(mmVCE_RB_WPTR2);
>   else
> - return RREG32(mmVCE_RB_WPTR3);
> + v = RREG32(mmVCE_RB_WPTR3);
> +
> + if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
> + WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
> + mutex_unlock(&adev->grbm_idx_mutex);
> + }
> +
> + return v;
>  }
> 
>  /**
> @@ -116,12 +142,22 @@ static void vce_v3_0_ring_set_wptr(struct
> amdgpu_ring *ring)
>  {
>   struct amdgpu_device *adev = ring->adev;
> 
> + if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
> + mutex_lock(&adev->grbm_idx_mutex);
> + WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
> + }
> +
>   if (ring == &adev->vce.ring[0])
>   WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
>   else if (ring == &adev->vce.ring[1])
>   WREG32(mmVCE_RB_WPTR2, lower_32_bits(ring->wptr));
>   else
>   WREG32(mmVCE_RB_WPTR3, lower_32_bits(ring->wptr));
> +
> + if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
> + WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
> + mutex_unlock(&adev->grbm_idx_mutex);
> + }
>  }
> 
>  static void vce_v3_0_override_vce_clock_gating(struct amdgpu_device
> *adev, bool override)
> @@ -231,6 +267,14 @@ static int vce_v3_0_start(struct amdgpu_device
> *adev)
>   struct amdgpu_ring *ring;
>   int idx, r;
> 
> + /* We need to program the ring buffer on the instance 1 register space
> + domain when only instance 1 is available; with two instances or just
> + instance 0, we only need to program the instance 0 register space domain */

Please add this comment to the commit message as well.  With that fixed:
Reviewed-by: Alex Deucher 
Also, add CC: sta...@vger.kernel.org

> + if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
> + mutex_lock(&adev->grbm_idx_mutex);
> + WREG32(mmGRBM_GFX_INDEX, GET_VCE_INSTANCE(1));
> + }
> +
>   ring = &adev->vce.ring[0];
>   WREG32(mmVCE_RB_RPTR, lower_32_bits(ring->wptr));
>   WREG32(mmVCE_RB_WPTR, lower_32_bits(ring->wptr));
> @@ -252,6 +296,11 @@ static int vce_v3_0_start(struct amdgpu_device
> *adev)
>   WREG32(mmVCE_RB_BASE_HI3, upper_32_bits(ring->gpu_addr));
>   WREG32(mmVCE_RB_SIZE3, ring->ring_size / 4);
> 
> + if (adev->vce.harvest_config == AMDGPU_VCE_HARVEST_VCE0) {
> + WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
> + mutex_unlock(&adev->grbm_idx_mutex);
> + }
> +
>   mutex_lock(>grbm_idx_mutex);
>   for 

Re: SPAM //Re: [PATCH 7/7] drm/amdgpu: enable huge page handling in the VM v2

2017-05-30 Thread Christian König
Sorry for the delayed reply, just had to fish this mail out of my spam 
folder.



Do you need something in amdgpu_vm_update_level to stop it from
overwriting huge page PTEs?


Yes, this is exactly what this chunk added to amdgpu_vm_update_level is 
doing:

-   if (parent->entries[pt_idx].addr == pt)
+   if (parent->entries[pt_idx].addr == pt ||
+   parent->entries[pt_idx].huge_page)
continue;


Regards,
Christian.
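
For reference, the patch below only turns a PDE into a 2MB huge-page PTE when
the update covers a whole page table through the direct write path. A condensed
sketch of that eligibility test, factored out of the condition in
amdgpu_vm_handle_huge_pages (a simplification, not a committed helper):

	/* True when a whole 2MB range can be mapped by a single PDE: the
	 * update must cover every PTE of the page table, go through the
	 * direct amdgpu_vm_do_set_ptes path and write valid mappings. */
	static bool amdgpu_vm_can_use_huge_page(struct amdgpu_pte_update_params *p,
						unsigned nptes, uint64_t flags)
	{
		return p->adev->asic_type >= CHIP_VEGA10 &&
		       nptes == AMDGPU_VM_PTE_COUNT(p->adev) &&
		       p->func == amdgpu_vm_do_set_ptes &&
		       (flags & AMDGPU_PTE_VALID);
	}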

On 23.05.2017 at 21:29, Felix Kuehling wrote:

Do you need something in amdgpu_vm_update_level to stop it from
overwriting huge page PTEs?

Regards,
   Felix


On 17-05-23 12:52 PM, Christian König wrote:

From: Christian König 

The hardware can use huge pages to map 2MB of address space with only one PDE.

v2: few cleanups and rebased

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 95 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  4 ++
  2 files changed, 75 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index f7afdfa..f07c9b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -325,6 +325,7 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device 
*adev,
  
  			entry->bo = pt;

entry->addr = 0;
+   entry->huge_page = false;
}
  
  		if (level < adev->vm_manager.num_level) {

@@ -1013,7 +1014,8 @@ static int amdgpu_vm_update_level(struct amdgpu_device 
*adev,
  
  		pt = amdgpu_bo_gpu_offset(bo);

pt = amdgpu_gart_get_vm_pde(adev, pt);
-   if (parent->entries[pt_idx].addr == pt)
+   if (parent->entries[pt_idx].addr == pt ||
+   parent->entries[pt_idx].huge_page)
continue;
  
  		parent->entries[pt_idx].addr = pt;

@@ -1145,29 +1147,69 @@ int amdgpu_vm_update_directories(struct amdgpu_device 
*adev,
  }
  
  /**

- * amdgpu_vm_find_pt - find the page table for an address
+ * amdgpu_vm_handle_huge_pages - handle updating the PD with huge pages
   *
   * @p: see amdgpu_pte_update_params definition
   * @addr: virtual address in question
+ * @nptes: number of PTEs updated with this operation
+ * @dst: destination address where the PTEs should point to
+ * @flags: access flags for the PTEs
+ * @bo: resulting page tables BO
   *
- * Find the page table BO for a virtual address, return NULL when none found.
+ * Check if we can update the PD with a huge page. Also finds the page table
+ * BO for a virtual address, returns -ENOENT when nothing found.
   */
-static struct amdgpu_bo *amdgpu_vm_get_pt(struct amdgpu_pte_update_params *p,
- uint64_t addr)
+static int amdgpu_vm_handle_huge_pages(struct amdgpu_pte_update_params *p,
+  uint64_t addr, unsigned nptes,
+  uint64_t dst, uint64_t flags,
+  struct amdgpu_bo **bo)
  {
-   struct amdgpu_vm_pt *entry = &p->vm->root;
-   unsigned idx, level = p->adev->vm_manager.num_level;
+   unsigned pt_idx, level = p->adev->vm_manager.num_level;
+   struct amdgpu_vm_pt *entry = &p->vm->root, *parent;
+   uint64_t pd_addr, pde;
  
-	while (entry->entries) {

-   idx = addr >> (p->adev->vm_manager.block_size * level--);
-   idx %= amdgpu_bo_size(entry->bo) / 8;
-   entry = &entry->entries[idx];
-   }
+   do {
+   pt_idx = addr >> (p->adev->vm_manager.block_size * level--);
+   pt_idx %= amdgpu_bo_size(entry->bo) / 8;
+   parent = entry;
+   entry = &parent->entries[pt_idx];
+   } while (entry->entries);
  
  	if (level)

-   return NULL;
+   return -ENOENT;
+
+   *bo = entry->bo;
+
+   /* In the case of a mixed PT the PDE must point to it */
+   if (p->adev->asic_type < CHIP_VEGA10 ||
+   nptes != AMDGPU_VM_PTE_COUNT(p->adev) ||
+   p->func != amdgpu_vm_do_set_ptes ||
+   !(flags & AMDGPU_PTE_VALID)) {
+
+   dst = amdgpu_bo_gpu_offset(*bo);
+   dst = amdgpu_gart_get_vm_pde(p->adev, dst);
+   flags = AMDGPU_PTE_VALID;
+   } else {
+   flags |= AMDGPU_PDE_PTE;
+   }
  
-	return entry->bo;

+   if (entry->addr == dst &&
+   entry->huge_page == !!(flags & AMDGPU_PDE_PTE))
+   return 0;
+
+   entry->addr = dst;
+   entry->huge_page = !!(flags & AMDGPU_PDE_PTE);
+
+   if (parent->bo->shadow) {
+   pd_addr = amdgpu_bo_gpu_offset(parent->bo->shadow);
+   pde = pd_addr + pt_idx * 8;
+   amdgpu_vm_do_set_ptes(p, pde, dst, 1, 0, flags);
+   }
+
+   pd_addr = amdgpu_bo_gpu_offset(parent->bo);
+   pde = 

Re: [PATCH libdrm v3 1/1] amdgpu: move asic id table to a separate file

2017-05-30 Thread Emil Velikov
Hi all,

Pardon for dropping in uninvited. Just some food for thought.

On 29 May 2017 at 22:01, Li, Samuel  wrote:
> Understood your point. However as discussed internally before, marketing 
> names are there for a lot of reasons; my understanding of the policy is we do 
> not need to touch them as long as there is no error in the names and they are 
> allowed to be public.
>
Samuel,

It seems that most comments put forward by people are falling on deaf ears.

While there may be valid arguments behind doing so, do consider the following:
 - Review is always encouraged
Regardless if the information is within or outside of the source code.
 - Marketing can make mistakes or have IT glitches
The inconsistent use of "(TM)" and the 67C2:00 entry are things one
wants to double-check with them.
 - Having a separate file so that clients can update/edit it does not help much.
You want to ship the whole driver, in order to have a predictable and
consistent user experience.
 - Adding ~200 loc for ~170 device entries sounds like a step in the
wrong direction.

In either case, not my call. I might follow up with some issues in the
code itself ;-)

HTH
Emil
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: iMac 10,1 with Ubuntu 16.04: black screen after suspend

2017-05-30 Thread Lukas Wunner
On Tue, May 30, 2017 at 11:34:17AM +0200, Florian Echtler wrote:
> On 26.05.2017 23:03, Lukas Wunner wrote:
> > On Fri, May 26, 2017 at 02:13:29PM +0200, Florian Echtler wrote:
> >> I'm running Ubuntu 16.04.2 on a 27" unibody iMac 10,1 from 2009. However, 
> >> even
> >> with the most recent HWE stack (kernel 4.8), the display stays black after
> >> suspend. I can ssh into the machine, so wakeup is OK, but the display is
> >> entirely black (backlight stays off).
> >>
> > The ATI card has MXM_PNL_PWR_EN and MXM_PNL_BL_EN pins.  Those must be
> > enabled for the panel to light up.  Perhaps radeon needs to be extended
> > to do that.  One theory is that this is done via ACPI, but perhaps
> > only if a certain operating system is running.  Dump the ACPI tables
> > and search DSDT and SSDT* tables for methods relating to the ATI card
> > and/or backlight.  If you find checks for OSDW there (e.g. in _PS0,
> > i.e. on resume), it means the AML code is only executed on Darwin.
> I've already had a look at the ACPI tables; there seem to be no references to
> the "official" backlight/display control functions (no _BL? or _DO? methods).
> 
> > Conversely !OSDW means the code is only executed on Windows (Boot Camp).
> > Linux masquerades as Darwin, but this can be disabled with
> > acpi_osi=!Darwin on the command line.  It's worth trying if that changes
> > the behaviour.  If it does, then macOS likely sets those pins on resume
> > and we need to do the same in radeon.
> There are OSDW checks present, however, booting with acpi_osi=!Darwin doesn't
> make a difference to the backlight behaviour.

So, just to confirm, if you never plug in an external DP source and the iMac
comes out of sleep, the display stays black?

If you log in via ssh and look at dmesg, was radeon able to train the DP
link again?  No error messages, nothing?

Is the panel off or is it on but merely with 0% brightness?

There's a block of pins on the GPU for "system management" which contains
a bunch of GPIO, OEM and reserved pins.  Three of these are used to control
the panel when it's switched to the radeon card:
pin 23 PNL_PWR_EN
pin 25 PNL_BL_EN
pin 27 PNL_BL_PWM

Basically you'd need to know what these are set to before and after sleep.
I don't know how to access them, presumably via a BAR register.  A quick
look at the RV730-specific registers in drivers/gpu/drm/radeon/*.h did not
yield anything useful, but someone at AMD may be able to find this out.

Maybe these registers are supposed to be modified by the OS when coming
out of sleep, or they're modified by video BIOS but the OS has to trigger
this somehow.

HTH,

Lukas
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 3/3] drm/amdgpu/radeon: Use radeon by default for CIK GPUs

2017-05-30 Thread Michel Dänzer
On 30/05/17 06:17 PM, Grazvydas Ignotas wrote:
> On Tue, May 30, 2017 at 6:30 AM, Michel Dänzer  wrote:
>> On 29/05/17 06:20 PM, Michel Dänzer wrote:
>>> From: Michel Dänzer 
>>>
>>> Even if CONFIG_DRM_AMDGPU_CIK is enabled.
>>>
>>> There is no feature parity yet for CIK, in particular amdgpu doesn't
>>> support HDMI/DisplayPort without DC.
>>>
>>> Signed-off-by: Michel Dänzer 
>>
>> [...]
>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
>>> b/drivers/gpu/drm/amd/amdgpu/Kconfig
>>> index 8d36087fc186..e0121f8b436e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/Kconfig
>>> +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
>>> @@ -17,11 +17,11 @@ config DRM_AMDGPU_CIK
>>>   help
>>> Choose this option if you want to enable support for CIK asics.
>>>
>>> -   CIK is already supported in radeon. If you enable this option,
>>> -   support for CIK will be provided by amdgpu and disabled in
>>> -   radeon by default. Use module options to override this:
>>> +   CIK is already supported in radeon. Support for SI in amdgpu
>>
>> Consider this "SI" typo fixed in v2.
> 
> While you are here, what about adding the full codenames here? You
> can't expect every user configuring the kernel to know what SI/CIK
> means, it's not even documented in
> https://www.x.org/wiki/RadeonFeature/ or wikipedia, while the long
> codenames are.

That's out of scope for this series, and I definitely won't get around
to doing that before next week. Anybody feel free to beat me to it.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 3/3] drm/amdgpu/radeon: Use radeon by default for CIK GPUs

2017-05-30 Thread Grazvydas Ignotas
On Tue, May 30, 2017 at 6:30 AM, Michel Dänzer  wrote:
> On 29/05/17 06:20 PM, Michel Dänzer wrote:
>> From: Michel Dänzer 
>>
>> Even if CONFIG_DRM_AMDGPU_CIK is enabled.
>>
>> There is no feature parity yet for CIK, in particular amdgpu doesn't
>> support HDMI/DisplayPort without DC.
>>
>> Signed-off-by: Michel Dänzer 
>
> [...]
>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
>> b/drivers/gpu/drm/amd/amdgpu/Kconfig
>> index 8d36087fc186..e0121f8b436e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/Kconfig
>> +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
>> @@ -17,11 +17,11 @@ config DRM_AMDGPU_CIK
>>   help
>> Choose this option if you want to enable support for CIK asics.
>>
>> -   CIK is already supported in radeon. If you enable this option,
>> -   support for CIK will be provided by amdgpu and disabled in
>> -   radeon by default. Use module options to override this:
>> +   CIK is already supported in radeon. Support for SI in amdgpu
>
> Consider this "SI" typo fixed in v2.

While you are here, what about adding the full codenames here? You
can't expect every user configuring the kernel to know what SI/CIK
means, it's not even documented in
https://www.x.org/wiki/RadeonFeature/ or wikipedia, while the long
codenames are.

Gražvydas
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/3] drm/amdgpu: Really leave SI support disabled by default

2017-05-30 Thread Christian König

On 30.05.2017 at 05:26, Michel Dänzer wrote:

On 30/05/17 02:18 AM, Christian König wrote:

On 29.05.2017 at 11:20, Michel Dänzer wrote:

From: Michel Dänzer 

The default option value didn't match the help text and intention.

Signed-off-by: Michel Dänzer 

I'm still unsure about the last one. The feature parity is a good
argument but on the other hand we want people to use amdgpu for CIK
these days, don't we?

We want to make it easy for people to test amdgpu on CIK, which is what
the options added by Felix are for. IMO we should not flip the default
(upstream) before there is feature parity.



Anyway Reviewed-by: Christian König .

Thanks, I assume that applies to the whole series?


With the SI typo and the HDMI/DP audio thing pointed out by Kai fixed, 
then yes, that rb applies to the whole series.


Christian.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)

2017-05-30 Thread Nath, Arindam
>-Original Message-
>From: Joerg Roedel [mailto:j...@8bytes.org]
>Sent: Monday, May 29, 2017 8:09 PM
>To: Nath, Arindam ; Lendacky, Thomas
>
>Cc: io...@lists.linux-foundation.org; amd-gfx@lists.freedesktop.org;
>Deucher, Alexander ; Bridgman, John
>; dr...@endlessm.com; Suthikulpanit, Suravee
>; li...@endlessm.com; Craig Stein
>; mic...@daenzer.net; Kuehling, Felix
>; sta...@vger.kernel.org
>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)
>
>Hi Arindam,
>
>I met Tom Lendacky last week in Nuremberg last week and he told me he is
>working on the same area of the code that this patch is for. His reason
>for touching this code was to solve some locking problems. Maybe you two
>can work together on a joint approach to improve this?

Sure Joerg, I will work with Tom.

Thanks.

>
>On Mon, May 22, 2017 at 01:18:01PM +0530, arindam.n...@amd.com wrote:
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> index 63cacf5..1edeebec 100644
>> --- a/drivers/iommu/amd_iommu.c
>> +++ b/drivers/iommu/amd_iommu.c
>> @@ -2227,15 +2227,26 @@ static struct iommu_group
>*amd_iommu_device_group(struct device *dev)
>>
>>  static void __queue_flush(struct flush_queue *queue)
>>  {
>> -struct protection_domain *domain;
>> -unsigned long flags;
>>  int idx;
>>
>> -/* First flush TLB of all known domains */
>> -spin_lock_irqsave(&amd_iommu_pd_lock, flags);
>> -list_for_each_entry(domain, &amd_iommu_pd_list, list)
>> -domain_flush_tlb(domain);
>> -spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
>> +/* First flush TLB of all domains which were added to flush queue */
>> +for (idx = 0; idx < queue->next; ++idx) {
>> +struct flush_queue_entry *entry;
>> +
>> +entry = queue->entries + idx;
>> +
>> +/*
>> + * There might be cases where multiple IOVA entries for the
>> + * same domain are queued in the flush queue. To avoid
>> + * flushing the same domain again, we check whether the
>> + * flag is set or not. This improves the efficiency of
>> + * flush operation.
>> + */
>> +if (!entry->dma_dom->domain.already_flushed) {
>> +entry->dma_dom->domain.already_flushed = true;
>> +domain_flush_tlb(&entry->dma_dom->domain);
>> +}
>> +}
>
>There is also a race condition here I have to look into. It is not
>introduced by your patch, but needs fixing anyway. I'll look into this
>too.
>
>
>Regards,
>
>   Joerg

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx