RE: RE: [PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB

2017-01-18 Thread Liu, Monk
The current code misses the 128 DW after SWITCH_BUFFER: even without a vm
flush, we still need those 128 DW between the previous frame's
SWITCH_BUFFER and the current frame's CE IB.

This patch fixes that issue as well; otherwise in the SRIOV case (Unigine
Heaven) CE will hit a VM fault, and in the Steam OS case (DOTA 2) we hit
a VM fault as well.


So the conclusion is: if we have a vm flush, we make sure there are 128 DW
between the vm flush and the CE IB; if we don't insert a vm flush, we
still make sure there are 128 DW between SWITCH_BUFFER and the CE IB.


If you reject this patch, please give me a solution for the VM fault above.


BR Monk


From: amd-gfx  on behalf of Christian König 

Sent: January 18, 2017 20:52:14
To: Liu, Monk; amd-gfx@lists.freedesktop.org
Subject: Re: RE: [PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB

> suppose hw_submit_count in gpu_scheduler is 2 (by default); without
> keeping the frame size under 256, we may sometimes submit 256 dw and
> sometimes 512 dw, and that could lead to the CPU overwriting ring
> buffer content that is still being processed by the GPU, e.g.:
>
The max_dw parameter for amdgpu_ring_init() is multiplied by
amdgpu_sched_hw_submission. See amdgpu_ring_init():

 ring->ring_size = roundup_pow_of_two(max_dw * 4 *
amdgpu_sched_hw_submission);

Since we use 1024 for max_dw for the GFX, Compute and SDMA rings, using
512 dw is perfectly OK.

Even if we ever try to submit more than 1024 dw including padding, we will
get a nice warning in the logs. See amdgpu_ring_alloc():

 if (WARN_ON_ONCE(ndw > ring->max_dw))
 return -ENOMEM;

> another reason we keep each DMA frame to 256 dw is for better
> performance.
>
Well, considering what else we have in those command buffers, I would
strongly argue that this is negligible.

So NAK on that patch as long as we don't have a good reason for it.

Regards,
Christian.

On 18.01.2017 at 11:11, Liu, Monk wrote:
>
> >First question: why do we want to limit the dw per submit to 256? Some
> >limitation for SRIOV?
>
>
> [ML]
>
> suppose hw_submit_count in gpu_scheduler is 2 (by default); without
> keeping the frame size under 256, we may sometimes submit 256 dw and
> sometimes 512 dw, and that could lead to the CPU overwriting ring
> buffer content that is still being processed by the GPU, e.g.:
>
>
> we assume ring buffer size = 512 dw (so the ring buffer can hold 2 hw
> submits) for easy understanding.
>
>
> hw_count =2;
>
> submit job-1 (take 256 dw)
>
> hw_count = 1;
>
> submit job-2 (take 256 dw) //now ring buffer is full
>
> hw_count =0;
>
> gpu_scheduler waiting
>
> ...
>
> job-1 signals, then hw_count => 1
>
> submit job-3, which is placed at the head of the ring buffer (wrapped)
>
> but job-3 takes 512 dw (e.g. it actually takes 259 dw, but we align it to 512)
>
>
> job-2 is still being processed, but the packets belonging to job-2 are
> overwritten by job-3: disaster ...
>
>
> another reason we keep each DMA frame to 256 dw is for better
> performance.
>
>
> BR Monk
>
> 
> *From:* Christian König 
> *Sent:* January 18, 2017 17:28:15
> *To:* Liu, Monk; amd-gfx@lists.freedesktop.org
> *Subject:* Re: [PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB
> On 18.01.2017 at 06:55, Monk Liu wrote:
> > previously we always inserted 128 NOPs behind vm_flush, which
> > may lead to a DMA frame size above 256 dw that gets automatically
> > aligned to 512 dw.
> >
> > now we calculate how many DWs are already inserted after vm_flush
> > and make up the rest to pad up to 128 dws before emit_ib.
> >
> > that way we only take 256 dw per submit.
>
> First question: why do we want to limit the dw per submit to 256? Some
> limitation for SRIOV?
>
> Then for the implementation, please don't add all that counting to the
> different functions. Instead save the current position in the ring in
> emit_vm_flush and then calculate in emit_cntxcntl() how many dw we still
> need to add to have at least 128. E.g. call the variable something like
> last_vm_flush_pos.
>
> That makes the code way more robust towards any changes.
>
> Regards,
> Christian.
>
> >
> > Change-Id: Iac198e16f35b071476ba7bd48ab338223f6fe650
> > Signed-off-by: Monk Liu 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
> >   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c    | 24 ++--
> >   2 files changed, 23 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > index c813cbe..1dbe600 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > @@ -173,6 +173,7 @@ struct amdgpu_ring {
> >   #if defined(CONFIG_DEBUG_FS)
> >struct dentry *ent;
> >   #endif
> > + u32 dws_between_vm_ib;
> >   };
> >
> >   int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> > index 5f37313..76b1315 100644
> > --- a/dr

Re: [PATCH 3/4] drm/amd/display: Switch to using atomic_helper for flip.

2017-01-18 Thread Michel Dänzer
On 19/01/17 07:18 AM, Grodzovsky, Andrey wrote:
>> From: Michel Dänzer [mailto:mic...@daenzer.net]
>> On 17/01/17 07:16 AM, Laurent Pinchart wrote:
>>> On Monday 16 Jan 2017 10:44:57 Andrey Grodzovsky wrote:
 Change-Id: Iad3e0b9b3546e4e4dc79be9233daf4fe4dba83e0
 Signed-off-by: Andrey Grodzovsky 
 ---
  .../drm/amd/display/amdgpu_dm/amdgpu_dm_types.c| 92 ++-
>> ---
  1 file changed, 6 insertions(+), 86 deletions(-)

 diff --git
>> a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
 b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
>> index
 a443b70..d4664bf 100644
 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
 +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
 @@ -1060,83 +1060,6 @@ static int dm_crtc_funcs_atomic_set_property(
return 0;
  }

 -
 -static int amdgpu_atomic_helper_page_flip(struct drm_crtc *crtc,
 -  struct drm_framebuffer *fb,
 -  struct drm_pending_vblank_event *event,
 -  uint32_t flags)
 -{
 -  struct drm_plane *plane = crtc->primary;
 -  struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
 -  struct drm_atomic_state *state;
 -  struct drm_plane_state *plane_state;
 -  struct drm_crtc_state *crtc_state;
 -  int ret = 0;
 -
 -  state = drm_atomic_state_alloc(plane->dev);
 -  if (!state)
 -  return -ENOMEM;
 -
 -  ret = drm_crtc_vblank_get(crtc);
>>>
>>> The DRM core's atomic page flip helper doesn't get/put vblank. Have
>>> you double-checked that removing them isn't a problem?
>>
>> This patch makes the amdgpu DM code use the page_flip_target hook.
>> drm_mode_page_flip_ioctl calls drm_crtc_vblank_get before the
>> page_flip_target hook.
>>
>> You're right though that the fact that drm_atomic_helper_page_flip doesn't
>> call drm_crtc_vblank_get is a bit alarming. From the
>> DRM_IOCTL_MODE_PAGE_FLIP POV, drm_crtc_vblank_get must be called
>> when userspace calls the ioctl (either by drm_mode_page_flip_ioctl or the
>> page_flip hook implementation), and drm_crtc_vblank_put must be called
>> when the flip completes and the event is sent to userspace. How is this
>> supposed to be handled with atomic?
> 
> First let me ask you: why do we have to enable vblank as part of pflip?
> I understand why it has to be enabled in the WAIT_FOR_VBLANK IOCTL,
> but I'm not sure about the PFLIP IOCTL.

It's required for the vblank sequence counter to be accurate in the page
flip completion event sent to userspace.


> IMHO in atomic, as before in legacy page_flip, it's up to the driver
> implementation in the atomic_commit hook to manage the vblank ISR on/off.

When using the page_flip hook, it's all up to the implementation of that
hook. When using the page_flip_target hook, drm_mode_page_flip_ioctl
calls vblank_get, so the hook implementation must make sure that a
matching vblank_put call is made when the flip completes.


>> Andrey, did you check via code audit and/or testing that the vblank
>> reference count is still balanced after this change?
>>
> With page_flip_target, yes. If I switch back to the page_flip hook, the
> vblank ISR never gets enabled on the FLIP IOCTL, and then I see a warning
> in DAL's pflip-done IRQ handler from vblank_put, which is obvious,

For drivers using the atomic_helpers for flip, maybe there should be (or
might be already) a corresponding helper which deals with things like
this when the flip completes, so drivers don't have to call vblank_put
and such.

Anyway, it sounds like the vblank_get/puts are balanced with this patch
for now.


> which BTW didn't impact the flips; I was still able to run glxgears
> in vsync mode.

I don't think glxgears relies on the vblank sequence numbers being accurate.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/1] drm/amdgpu: fix unload driver issue for virtual display

2017-01-18 Thread Xiangliang Yu
Virtual display doesn't allocate an amdgpu_encoder when initializing,
so we will get an invalid pointer if we try to free the amdgpu_encoder
when unloading the driver.

Signed-off-by: Xiangliang Yu 
---
 drivers/gpu/drm/amd/amdgpu/dce_virtual.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c 
b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
index 64dd266..fc12cdf 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
@@ -634,11 +634,8 @@ static const struct drm_encoder_helper_funcs 
dce_virtual_encoder_helper_funcs =
 
 static void dce_virtual_encoder_destroy(struct drm_encoder *encoder)
 {
-   struct amdgpu_encoder *amdgpu_encoder = to_amdgpu_encoder(encoder);
-
-   kfree(amdgpu_encoder->enc_priv);
drm_encoder_cleanup(encoder);
-   kfree(amdgpu_encoder);
+   kfree(encoder);
 }
 
 static const struct drm_encoder_funcs dce_virtual_encoder_funcs = {
-- 
2.7.4



RE: [PATCH 3/4] drm/amd/display: Switch to using atomic_helper for flip.

2017-01-18 Thread Grodzovsky, Andrey
> -Original Message-
> From: Michel Dänzer [mailto:mic...@daenzer.net]
> Sent: Tuesday, January 17, 2017 8:50 PM
> To: Laurent Pinchart
> Cc: dri-de...@lists.freedesktop.org; Grodzovsky, Andrey;
> daniel.vet...@intel.com; amd-gfx@lists.freedesktop.org;
> nouv...@lists.freedesktop.org
> Subject: Re: [PATCH 3/4] drm/amd/display: Switch to using atomic_helper for
> flip.
> 
> On 17/01/17 07:16 AM, Laurent Pinchart wrote:
> > On Monday 16 Jan 2017 10:44:57 Andrey Grodzovsky wrote:
> >> Change-Id: Iad3e0b9b3546e4e4dc79be9233daf4fe4dba83e0
> >> Signed-off-by: Andrey Grodzovsky 
> >> ---
> >>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_types.c| 92 ++-
> ---
> >>  1 file changed, 6 insertions(+), 86 deletions(-)
> >>
> >> diff --git
> a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
> >> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
> index
> >> a443b70..d4664bf 100644
> >> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
> >> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
> >> @@ -1060,83 +1060,6 @@ static int dm_crtc_funcs_atomic_set_property(
> >>return 0;
> >>  }
> >>
> >> -
> >> -static int amdgpu_atomic_helper_page_flip(struct drm_crtc *crtc,
> >> -  struct drm_framebuffer *fb,
> >> -  struct drm_pending_vblank_event *event,
> >> -  uint32_t flags)
> >> -{
> >> -  struct drm_plane *plane = crtc->primary;
> >> -  struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
> >> -  struct drm_atomic_state *state;
> >> -  struct drm_plane_state *plane_state;
> >> -  struct drm_crtc_state *crtc_state;
> >> -  int ret = 0;
> >> -
> >> -  state = drm_atomic_state_alloc(plane->dev);
> >> -  if (!state)
> >> -  return -ENOMEM;
> >> -
> >> -  ret = drm_crtc_vblank_get(crtc);
> >
> > The DRM core's atomic page flip helper doesn't get/put vblank. Have
> > you double-checked that removing them isn't a problem?
> 
> This patch makes the amdgpu DM code use the page_flip_target hook.
> drm_mode_page_flip_ioctl calls drm_crtc_vblank_get before the
> page_flip_target hook.
> 
> You're right though that the fact that drm_atomic_helper_page_flip doesn't
> call drm_crtc_vblank_get is a bit alarming. From the
> DRM_IOCTL_MODE_PAGE_FLIP POV, drm_crtc_vblank_get must be called
> when userspace calls the ioctl (either by drm_mode_page_flip_ioctl or the
> page_flip hook implementation), and drm_crtc_vblank_put must be called
> when the flip completes and the event is sent to userspace. How is this
> supposed to be handled with atomic?

First let me ask you: why do we have to enable vblank as part of pflip?
I understand why it has to be enabled in the WAIT_FOR_VBLANK IOCTL,
but I'm not sure about the PFLIP IOCTL.

IMHO in atomic, as before in legacy page_flip, it's up to the driver
implementation in the atomic_commit hook to manage the vblank ISR on/off.
> 
> Andrey, did you check via code audit and/or testing that the vblank
> reference count is still balanced after this change?
> 
With page_flip_target, yes. If I switch back to the page_flip hook, the
vblank ISR never gets enabled on the FLIP IOCTL, and then I see a warning
in DAL's pflip-done IRQ handler from vblank_put, which is obvious; BTW it
didn't impact the flips, I was still able to run glxgears in vsync mode.

> 
> >> @@ -3143,8 +3064,7 @@ int amdgpu_dm_atomic_check(struct
> drm_device *dev,
> >> * 1. This commit is not a page flip.
> >> * 2. This commit is a page flip, and targets are
> > created.
> >> */
> >> -  if (!page_flip_needed(plane_state, old_plane_state,
> >> -true) ||
> >> +  if (!page_flip_needed(plane_state, old_plane_state,
> > true) ||
> >
> > This seems to be an unrelated change.
> 
> Yeah, such whitespace-only changes should be dropped.

Will remove this.
> 
> 
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer


Re: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs

2017-01-18 Thread StDenis, Tom
Cool, pushed to staging.


Tom



From: Deucher, Alexander
Sent: Wednesday, January 18, 2017 13:47
To: StDenis, Tom; Alex Deucher
Cc: amd-gfx list
Subject: RE: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs


Whoops, sorry, I glanced quickly and thought you were grabbing the vendor and 
device ids, not the revision.  Carry on :)



From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
StDenis, Tom
Sent: Wednesday, January 18, 2017 1:45 PM
To: Alex Deucher
Cc: amd-gfx list
Subject: Re: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs



I am grabbing revision.  Unless there's some other PCI revision value I'm 
missing :-)



Tom





From: Alex Deucher mailto:alexdeuc...@gmail.com>>
Sent: Wednesday, January 18, 2017 13:40
To: Tom St Denis
Cc: amd-gfx list; StDenis, Tom
Subject: Re: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs



On Wed, Jan 18, 2017 at 1:25 PM, Tom St Denis 
mailto:tstdeni...@gmail.com>> wrote:
> So we can determine which device the entry is before connecting
> a display.
>
> Signed-off-by: Tom St Denis mailto:tom.stde...@amd.com>>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 28681286d57c..8640f9216d93 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2967,7 +2967,7 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
> file *f, char __user *buf,
> return -ENOMEM;
>
> /* version, increment each time something is added */
> -   config[no_regs++] = 2;
> +   config[no_regs++] = 3;
> config[no_regs++] = adev->gfx.config.max_shader_engines;
> config[no_regs++] = adev->gfx.config.max_tile_pipes;
> config[no_regs++] = adev->gfx.config.max_cu_per_sh;
> @@ -3001,6 +3001,12 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
> file *f, char __user *buf,
> config[no_regs++] = adev->family;
> config[no_regs++] = adev->external_rev_id;
>
> +   /* rev==3 */
> +   config[no_regs++] = adev->pdev->device;
> +   config[no_regs++] = adev->pdev->revision;
> +   config[no_regs++] = adev->pdev->subsystem_device;
> +   config[no_regs++] = adev->pdev->subsystem_vendor;

For completeness, please add pdev->revision as well.  With that:
Reviewed-by: Alex Deucher 
mailto:alexander.deuc...@amd.com>>

> +
> while (size && (*pos < no_regs * 4)) {
> uint32_t value;
>
> --
> 2.11.0
>






RE: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs

2017-01-18 Thread Deucher, Alexander
Whoops, sorry, I glanced quickly and thought you were grabbing the vendor and 
device ids, not the revision.  Carry on :)

From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
StDenis, Tom
Sent: Wednesday, January 18, 2017 1:45 PM
To: Alex Deucher
Cc: amd-gfx list
Subject: Re: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs


I am grabbing revision.  Unless there's some other PCI revision value I'm 
missing :-)



Tom


From: Alex Deucher mailto:alexdeuc...@gmail.com>>
Sent: Wednesday, January 18, 2017 13:40
To: Tom St Denis
Cc: amd-gfx list; StDenis, Tom
Subject: Re: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs

On Wed, Jan 18, 2017 at 1:25 PM, Tom St Denis 
mailto:tstdeni...@gmail.com>> wrote:
> So we can determine which device the entry is before connecting
> a display.
>
> Signed-off-by: Tom St Denis mailto:tom.stde...@amd.com>>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 28681286d57c..8640f9216d93 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2967,7 +2967,7 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
> file *f, char __user *buf,
> return -ENOMEM;
>
> /* version, increment each time something is added */
> -   config[no_regs++] = 2;
> +   config[no_regs++] = 3;
> config[no_regs++] = adev->gfx.config.max_shader_engines;
> config[no_regs++] = adev->gfx.config.max_tile_pipes;
> config[no_regs++] = adev->gfx.config.max_cu_per_sh;
> @@ -3001,6 +3001,12 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
> file *f, char __user *buf,
> config[no_regs++] = adev->family;
> config[no_regs++] = adev->external_rev_id;
>
> +   /* rev==3 */
> +   config[no_regs++] = adev->pdev->device;
> +   config[no_regs++] = adev->pdev->revision;
> +   config[no_regs++] = adev->pdev->subsystem_device;
> +   config[no_regs++] = adev->pdev->subsystem_vendor;

For completeness, please add pdev->revision as well.  With that:
Reviewed-by: Alex Deucher 
mailto:alexander.deuc...@amd.com>>

> +
> while (size && (*pos < no_regs * 4)) {
> uint32_t value;
>
> --
> 2.11.0
>




Re: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs

2017-01-18 Thread StDenis, Tom
I am grabbing revision.  Unless there's some other PCI revision value I'm 
missing :-)


Tom



From: Alex Deucher 
Sent: Wednesday, January 18, 2017 13:40
To: Tom St Denis
Cc: amd-gfx list; StDenis, Tom
Subject: Re: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs

On Wed, Jan 18, 2017 at 1:25 PM, Tom St Denis  wrote:
> So we can determine which device the entry is before connecting
> a display.
>
> Signed-off-by: Tom St Denis 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 28681286d57c..8640f9216d93 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2967,7 +2967,7 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
> file *f, char __user *buf,
> return -ENOMEM;
>
> /* version, increment each time something is added */
> -   config[no_regs++] = 2;
> +   config[no_regs++] = 3;
> config[no_regs++] = adev->gfx.config.max_shader_engines;
> config[no_regs++] = adev->gfx.config.max_tile_pipes;
> config[no_regs++] = adev->gfx.config.max_cu_per_sh;
> @@ -3001,6 +3001,12 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
> file *f, char __user *buf,
> config[no_regs++] = adev->family;
> config[no_regs++] = adev->external_rev_id;
>
> +   /* rev==3 */
> +   config[no_regs++] = adev->pdev->device;
> +   config[no_regs++] = adev->pdev->revision;
> +   config[no_regs++] = adev->pdev->subsystem_device;
> +   config[no_regs++] = adev->pdev->subsystem_vendor;

For completeness, please add pdev->revision as well.  With that:
Reviewed-by: Alex Deucher 

> +
> while (size && (*pos < no_regs * 4)) {
> uint32_t value;
>
> --
> 2.11.0
>





Re: [PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs

2017-01-18 Thread Alex Deucher
On Wed, Jan 18, 2017 at 1:25 PM, Tom St Denis  wrote:
> So we can determine which device the entry is before connecting
> a display.
>
> Signed-off-by: Tom St Denis 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 28681286d57c..8640f9216d93 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2967,7 +2967,7 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
> file *f, char __user *buf,
> return -ENOMEM;
>
> /* version, increment each time something is added */
> -   config[no_regs++] = 2;
> +   config[no_regs++] = 3;
> config[no_regs++] = adev->gfx.config.max_shader_engines;
> config[no_regs++] = adev->gfx.config.max_tile_pipes;
> config[no_regs++] = adev->gfx.config.max_cu_per_sh;
> @@ -3001,6 +3001,12 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
> file *f, char __user *buf,
> config[no_regs++] = adev->family;
> config[no_regs++] = adev->external_rev_id;
>
> +   /* rev==3 */
> +   config[no_regs++] = adev->pdev->device;
> +   config[no_regs++] = adev->pdev->revision;
> +   config[no_regs++] = adev->pdev->subsystem_device;
> +   config[no_regs++] = adev->pdev->subsystem_vendor;

For completeness, please add pdev->revision as well.  With that:
Reviewed-by: Alex Deucher 

> +
> while (size && (*pos < no_regs * 4)) {
> uint32_t value;
>
> --
> 2.11.0
>


[PATCH] drm/amd/amdgpu: Add PCI info to gca_config debugfs

2017-01-18 Thread Tom St Denis
So we can determine which device the entry is before connecting
a display.

Signed-off-by: Tom St Denis 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 28681286d57c..8640f9216d93 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2967,7 +2967,7 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct file 
*f, char __user *buf,
return -ENOMEM;
 
/* version, increment each time something is added */
-   config[no_regs++] = 2;
+   config[no_regs++] = 3;
config[no_regs++] = adev->gfx.config.max_shader_engines;
config[no_regs++] = adev->gfx.config.max_tile_pipes;
config[no_regs++] = adev->gfx.config.max_cu_per_sh;
@@ -3001,6 +3001,12 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct 
file *f, char __user *buf,
config[no_regs++] = adev->family;
config[no_regs++] = adev->external_rev_id;
 
+   /* rev==3 */
+   config[no_regs++] = adev->pdev->device;
+   config[no_regs++] = adev->pdev->revision;
+   config[no_regs++] = adev->pdev->subsystem_device;
+   config[no_regs++] = adev->pdev->subsystem_vendor;
+
while (size && (*pos < no_regs * 4)) {
uint32_t value;
 
-- 
2.11.0



Re: [PATCH] drm/amdgpu: check ring being ready before using

2017-01-18 Thread Christian König

On 18.01.2017 at 15:45, Pixel Ding wrote:

From: Ding Pixel 

Return success when the ring is properly initialized, otherwise return
failure.

Tonga SRIOV VF doesn't have UVD and VCE engines, so the initialization of
these IPs is bypassed. The system crashes if an application submits an IB
to their rings, which are not ready to use. This could be a common issue
whenever an IP with a ring buffer is disabled for some reason on a
specific ASIC, so we should check that the ring is ready to use.

Bug: amdgpu_test crashes system on Tonga VF.


Not a big issue, but when you send out a patch for the second time it is 
good practice to add "v2" to the subject and something like "v2: changed 
xyz" to the commit message.




Signed-off-by: Ding Pixel 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 08dd97b..6e948e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -83,6 +83,13 @@ int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 
ip_type,
}
break;
}
+
+   if (!(*out_ring && (*out_ring)->adev)) {
+   DRM_ERROR("Ring %d is not initialized on IP %d\n",
+ ring, ip_type);
+   return -EINVAL;
+   }
+
return 0;
  }
  





Re: [PATCH xf86-video-ati] ati: Support loading the amdgpu driver from the ati wrapper

2017-01-18 Thread Alex Deucher
On Wed, Jan 18, 2017 at 5:08 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> If .../share/X11/xorg.conf.d/10-amdgpu.conf doesn't exist, but the ati
> wrapper is loaded, it will otherwise try to use the radeon driver even
> for GPUs driven by the amdgpu kernel driver. This can only fail,
> potentially in bad ways.
>
> Signed-off-by: Michel Dänzer 

Reviewed-by: Alex Deucher 

> ---
>  src/ati.c | 40 
>  1 file changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/src/ati.c b/src/ati.c
> index 227665e73..d5294df09 100644
> --- a/src/ati.c
> +++ b/src/ati.c
> @@ -58,6 +58,7 @@
>  #endif
>
>  #include 
> +#include 
>  #include "atipcirename.h"
>
>  #include "ati.h"
> @@ -68,6 +69,7 @@
>  #define MACH64_DRIVER_NAME  "mach64"
>  #define R128_DRIVER_NAME"r128"
>  #define RADEON_DRIVER_NAME  "radeon"
> +#define AMDGPU_DRIVER_NAME  "amdgpu"
>
>  enum
>  {
> @@ -140,9 +142,9 @@ ati_device_get_indexed(int index)
>  void
>  ati_gdev_subdriver(pointer options)
>  {
> -int  nATIGDev, nMach64GDev, nR128GDev, nRadeonGDev;
> +int  nATIGDev, nMach64GDev, nR128GDev, nRadeonGDev, nAmdgpuGDev;
>  GDevPtr *ATIGDevs;
> -Bool load_mach64 = FALSE, load_r128 = FALSE, load_radeon = FALSE;
> +Bool load_mach64 = FALSE, load_r128 = FALSE, load_radeon = FALSE, 
> load_amdgpu = FALSE;
>  int  i;
>
>  /* let the subdrivers configure for themselves */
> @@ -154,6 +156,7 @@ ati_gdev_subdriver(pointer options)
>  nMach64GDev = xf86MatchDevice(MACH64_DRIVER_NAME, NULL);
>  nR128GDev = xf86MatchDevice(R128_DRIVER_NAME, NULL);
>  nRadeonGDev = xf86MatchDevice(RADEON_DRIVER_NAME, NULL);
> +nAmdgpuGDev = xf86MatchDevice(AMDGPU_DRIVER_NAME, NULL);
>
>  for (i = 0; i < nATIGDev; i++) {
>  GDevPtr ati_gdev = ATIGDevs[i];
> @@ -200,8 +203,34 @@ ati_gdev_subdriver(pointer options)
>  }
>
>  if (chip_family == ATI_CHIP_FAMILY_Radeon) {
> -ati_gdev->driver = RADEON_DRIVER_NAME;
> -load_radeon = TRUE;
> +   char *busid;
> +
> +   XNFasprintf(&busid, "pci:%04x:%02x:%02x.%d",
> +   device->domain, device->bus, device->dev,
> +   device->func);
> +
> +   if (busid) {
> +   int fd = drmOpen(NULL, busid);
> +
> +   if (fd >= 0) {
> +   drmVersionPtr version = drmGetVersion(fd);
> +
> +   if (version->version_major == 3) {
> +   ati_gdev->driver = AMDGPU_DRIVER_NAME;
> +   load_amdgpu = TRUE;
> +   }
> +
> +   free(version);
> +   drmClose(fd);
> +   }
> +
> +   free(busid);
> +   }
> +
> +   if (strcmp(ati_gdev->driver, AMDGPU_DRIVER_NAME) != 0) {
> +   ati_gdev->driver = RADEON_DRIVER_NAME;
> +   load_radeon = TRUE;
> +   }
>  }
>  }
>
> @@ -219,6 +248,9 @@ ati_gdev_subdriver(pointer options)
>
>  if (load_radeon && (nRadeonGDev == 0))
>   xf86LoadOneModule(RADEON_DRIVER_NAME, options);
> +
> +if (load_amdgpu && (nAmdgpuGDev == 0))
> + xf86LoadOneModule(AMDGPU_DRIVER_NAME, options);
>  }
>
>  /*
> --
> 2.11.0
>


[PATCH] drm/amdgpu: check ring being ready before using

2017-01-18 Thread Pixel Ding
From: Ding Pixel 

Return success when the ring is properly initialized, otherwise return
failure.

Tonga SRIOV VF doesn't have UVD and VCE engines, so the initialization of
these IPs is bypassed. The system crashes if an application submits an IB
to their rings, which are not ready to use. This could be a common issue
whenever an IP with a ring buffer is disabled for some reason on a
specific ASIC, so we should check that the ring is ready to use.

Bug: amdgpu_test crashes system on Tonga VF.

Signed-off-by: Ding Pixel 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 08dd97b..6e948e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -83,6 +83,13 @@ int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 
ip_type,
}
break;
}
+
+   if (!(*out_ring && (*out_ring)->adev)) {
+   DRM_ERROR("Ring %d is not initialized on IP %d\n",
+ ring, ip_type);
+   return -EINVAL;
+   }
+
return 0;
 }
 
-- 
2.7.4



Re: [PATCH] drm/amdgpu: check ring being ready before using

2017-01-18 Thread Ding, Pixel
Christian,

Thank you for the quick comments. The revision is coming soon.

-- 
Sincerely Yours,
Pixel

On 18/01/2017, 9:26 PM, "Christian König"  wrote:

>On 18.01.2017 at 11:37, Pixel Ding wrote:
>> From: Ding Pixel 
>>
>> Return success when the ring is properly initialized, otherwise return
>> failure.
>>
>> Tonga SRIOV VF doesn't have UVD and VCE engines, so the initialization of
>> these IPs is bypassed. The system crashes if an application submits an IB
>> to one of their rings, which are not ready to use. This could be a common
>> issue whenever an IP block with a ring buffer is disabled for some reason
>> on a specific ASIC, so we should check that the ring is ready before using it.
>>
>> Bug: amdgpu_test crashes system on Tonga VF.
>
>Good catch, we probably have a problem using the second VCE ring as well.
>
>>
>> Signed-off-by: Ding Pixel 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 6 ++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index 08dd97b..0a235b9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -83,6 +83,12 @@ int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 
>> ip_type,
>>  }
>>  break;
>>  }
>> +
>> +if (!(*out_ring && (*out_ring)->ready)) {
>
>Don't check ring->ready here, that member is used during GPU reset to 
>test if we have successfully restarted the ring.
>
>Instead check if ring->adev is set, that is used in ring_init() to check 
>if a ring is initialized or not.
>
>> +DRM_ERROR("invalid ip type: %d\n", ip_type);
>
>Please adjust the error message to something like "ring %d not
>initialized on IP %d".
>
>Regards,
>Christian.
>
>> +return -EINVAL;
>> +}
>> +
>>  return 0;
>>   }
>>   
>
>


Re: [PATCH] drm/amdgpu: check ring being ready before using

2017-01-18 Thread Christian König

Am 18.01.2017 um 11:37 schrieb Pixel Ding:

From: Ding Pixel 

Return success when the ring is properly initialized, otherwise return
failure.

Tonga SRIOV VF doesn't have UVD and VCE engines, so the initialization of
these IPs is bypassed. The system crashes if an application submits an IB
to one of their rings, which are not ready to use. This could be a common
issue whenever an IP block with a ring buffer is disabled for some reason
on a specific ASIC, so we should check that the ring is ready before using it.

Bug: amdgpu_test crashes system on Tonga VF.


Good catch, we probably have a problem using the second VCE ring as well.



Signed-off-by: Ding Pixel 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 08dd97b..0a235b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -83,6 +83,12 @@ int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 
ip_type,
}
break;
}
+
+   if (!(*out_ring && (*out_ring)->ready)) {


Don't check ring->ready here, that member is used during GPU reset to 
test if we have successfully restarted the ring.


Instead check if ring->adev is set, that is used in ring_init() to check 
if a ring is initialized or not.
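The distinction Christian draws here can be sketched standalone. This is a toy model with an assumed struct layout, not the actual amdgpu structures: ring->ready is toggled across GPU resets, while ring->adev is set exactly once by ring_init(), which makes it the stable "was this ring ever initialized" indicator.

```c
#include <assert.h>
#include <stddef.h>

struct adev { int unused; };     /* stands in for struct amdgpu_device */

struct ring {
    struct adev *adev;           /* set by ring_init(), NULL if never initialized */
    int ready;                   /* toggled during GPU reset, unreliable here */
};

/* The check Christian recommends: test adev, not ready. */
static int ring_initialized(const struct ring *r)
{
    return r != NULL && r->adev != NULL;
}
```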



+   DRM_ERROR("invalid ip type: %d\n", ip_type);


Please adjust the error message to something like "ring %d not
initialized on IP %d".


Regards,
Christian.


+   return -EINVAL;
+   }
+
return 0;
  }
  





[PATCH] drm/amdgpu: check ring being ready before using

2017-01-18 Thread Pixel Ding
From: Ding Pixel 

Return success when the ring is properly initialized, otherwise return
failure.

Tonga SRIOV VF doesn't have UVD and VCE engines, so the initialization of
these IPs is bypassed. The system crashes if an application submits an IB
to one of their rings, which are not ready to use. This could be a common
issue whenever an IP block with a ring buffer is disabled for some reason
on a specific ASIC, so we should check that the ring is ready before using it.

Bug: amdgpu_test crashes system on Tonga VF.

Signed-off-by: Ding Pixel 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 08dd97b..0a235b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -83,6 +83,12 @@ int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 
ip_type,
}
break;
}
+
+   if (!(*out_ring && (*out_ring)->ready)) {
+   DRM_ERROR("invalid ip type: %d\n", ip_type);
+   return -EINVAL;
+   }
+
return 0;
 }
 
-- 
2.7.4



Re: 答复: [PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB

2017-01-18 Thread Christian König
suppose hw_submit_count in gpu_scheduler is 2 (by default); without 
suppressing the frame size to under 256, we may sometimes submit 256 dw 
and sometimes 512 dw, and that could lead to the CPU overriding ring 
buffer content which is still being processed by the GPU, e.g.:


The max_dw parameter for amdgpu_ring_init() is multiplied by 
amdgpu_sched_hw_submission. See amdgpu_ring_init():


ring->ring_size = roundup_pow_of_two(max_dw * 4 *
amdgpu_sched_hw_submission);

Since we use 1024 as max_dw for the GFX, Compute and SDMA rings, using 
512 dw is perfectly OK.


Even if we ever try to submit more than 1024 dw, including padding, we will 
get a nice warning in the logs. See amdgpu_ring_alloc():


if (WARN_ON_ONCE(ndw > ring->max_dw))
return -ENOMEM;
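The two points Christian quotes (the ring is sized for hw_submission frames of max_dw each, and an oversized allocation trips a warning) can be sketched standalone. The names mirror the kernel code, but this is a simplified model, not the driver implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two >= n (models the kernel's roundup_pow_of_two). */
static uint32_t roundup_pow_of_two_u32(uint32_t n)
{
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Ring sizing as in amdgpu_ring_init(): max_dw dwords per submission,
 * 4 bytes per dword, times the number of in-flight submissions. */
static uint32_t ring_size_bytes(uint32_t max_dw, uint32_t hw_submission)
{
    return roundup_pow_of_two_u32(max_dw * 4 * hw_submission);
}

/* Models the WARN_ON_ONCE guard in amdgpu_ring_alloc():
 * a single allocation must not exceed max_dw. */
static int ring_alloc_ok(uint32_t ndw, uint32_t max_dw)
{
    return ndw <= max_dw;
}
```

So with max_dw = 1024 and two hw submissions, the ring holds 8 KiB, and a 512-dw frame fits comfortably inside one submission's budget.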

another reason is that we keep each DMAframe to 256 dw for better 
performance.


Well, considering what else we have in those command buffers, I would 
strongly say that this is negligible.


So NAK on that patch as long as we don't have a good reason for it.

Regards,
Christian.

Am 18.01.2017 um 11:11 schrieb Liu, Monk:


>First question why do we want to limit the dw per submit to 256? Some
>limitation for SRIOV?


[ML]

suppose hw_submit_count in gpu_scheduler is 2 (by default); without 
suppressing the frame size to under 256, we may sometimes submit 256 dw 
and sometimes 512 dw, and that could lead to the CPU overriding ring 
buffer content which is still being processed by the GPU, e.g.:



we assume a ring buffer size of 512 dw (so the ring buffer can hold 2 hw 
submits) for easy understanding.



hw_count =2;

submit job-1 (take 256 dw)

hw_count = 1;

submit job-2 (take 256 dw) //now ring buffer is full

hw_count =0;

gpu_scheduler waiting

...

job-1 signaled, then hw_count => 1

submit job-3, which is placed at the head of the RB (wrapped)

but job-3 takes 512 dw (e.g. it actually needs 259 dw, but we align it to 512)


job-2 is still being processed, but the packets belonging to job-2 are 
overridden by job-3; disaster ...



another reason is that we keep each DMAframe to 256 dw for better 
performance.



BR Monk


*发件人:* Christian König 
*发送时间:* 2017年1月18日 17:28:15
*收件人:* Liu, Monk; amd-gfx@lists.freedesktop.org
*主题:* Re: [PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB
Am 18.01.2017 um 06:55 schrieb Monk Liu:
> previously we always inserted 128 NOPs behind vm_flush, which
> may lead to a DMAframe size above 256 dw, automatically aligned
> to 512 dw.
>
> now we calculate how many DWs are already inserted after vm_flush
> and make up the rest to pad up to 128 dws before emit_ib.
>
> that way we only take 256 dw per submit.

First question: why do we want to limit the dw per submit to 256? Some
limitation for SRIOV?

Then for the implementation, please don't add all that counting to the
different functions. Instead save the current position in the ring in
emit_vm_flush and then calculate in emit_cntxcntl() how many dw we still
need to add to have at least 128. E.g. call the variable something like
last_vm_flush_pos.

That makes the code way more robust towards any changes.

Regards,
Christian.

>
> Change-Id: Iac198e16f35b071476ba7bd48ab338223f6fe650
> Signed-off-by: Monk Liu 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 24 ++--
>   2 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h

> index c813cbe..1dbe600 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -173,6 +173,7 @@ struct amdgpu_ring {
>   #if defined(CONFIG_DEBUG_FS)
>struct dentry *ent;
>   #endif
> + u32 dws_between_vm_ib;
>   };
>
>   int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c

> index 5f37313..76b1315 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> @@ -5670,6 +5670,8 @@ static void 
gfx_v8_0_ring_emit_gds_switch(struct amdgpu_ring *ring,

>amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].oa);
>amdgpu_ring_write(ring, 0);
>amdgpu_ring_write(ring, (1 << (oa_size + oa_base)) - (1 << 
oa_base));

> +
> + ring->dws_between_vm_ib += 20;
>   }
>
>   static uint32_t wave_read_ind(struct amdgpu_device *adev, uint32_t 
simd, uint32_t wave, uint32_t address)
> @@ -6489,6 +6491,8 @@ static void 
gfx_v8_0_ring_emit_hdp_flush(struct amdgpu_ring *ring)

>amdgpu_ring_write(ring, ref_and_mask);
>amdgpu_ring_write(ring, ref_and_mask);
>amdgpu_ring_write(ring, 0x20); /* poll interval */
> +
> + ring->dws_between_vm_ib += 7;
>   }
>
>   static void gfx_v8_0_ring_emit_vgt_flush(struct amdgpu_ring *ring)
> @@ -6500,6 +6504,8 @@ static void 
gfx_v8_0_ring_emit_vgt_flush(struct amdgpu_ring

Re: [PATCH 3/4] drm/amd/display: Switch to using atomic_helper for flip.

2017-01-18 Thread Laurent Pinchart
Hi Michel,

On Wednesday 18 Jan 2017 10:50:01 Michel Dänzer wrote:
> On 17/01/17 07:16 AM, Laurent Pinchart wrote:
> > On Monday 16 Jan 2017 10:44:57 Andrey Grodzovsky wrote:
> >> Change-Id: Iad3e0b9b3546e4e4dc79be9233daf4fe4dba83e0
> >> Signed-off-by: Andrey Grodzovsky 
> >> ---
> >> 
> >>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_types.c| 92
> >>  ++
> >>  1 file changed, 6 insertions(+), 86 deletions(-)
> >> 
> >> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
> >> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c index
> >> a443b70..d4664bf 100644
> >> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
> >> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_types.c
> >> @@ -1060,83 +1060,6 @@ static int dm_crtc_funcs_atomic_set_property(
> >> 
> >>return 0;
> >>  
> >>  }
> >> 
> >> -
> >> -static int amdgpu_atomic_helper_page_flip(struct drm_crtc *crtc,
> >> -  struct drm_framebuffer *fb,
> >> -  struct drm_pending_vblank_event *event,
> >> -  uint32_t flags)
> >> -{
> >> -  struct drm_plane *plane = crtc->primary;
> >> -  struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
> >> -  struct drm_atomic_state *state;
> >> -  struct drm_plane_state *plane_state;
> >> -  struct drm_crtc_state *crtc_state;
> >> -  int ret = 0;
> >> -
> >> -  state = drm_atomic_state_alloc(plane->dev);
> >> -  if (!state)
> >> -  return -ENOMEM;
> >> -
> >> -  ret = drm_crtc_vblank_get(crtc);
> > 
> > The DRM core's atomic page flip helper doesn't get/put vblank. Have you
> > double-checked that removing them isn't a problem ?
> 
> This patch makes the amdgpu DM code use the page_flip_target hook.
> drm_mode_page_flip_ioctl calls drm_crtc_vblank_get before the
> page_flip_target hook.
> 
> You're right though that the fact that drm_atomic_helper_page_flip
> doesn't call drm_crtc_vblank_get is a bit alarming. From the
> DRM_IOCTL_MODE_PAGE_FLIP POV, drm_crtc_vblank_get must be called when
> userspace calls the ioctl (either by drm_mode_page_flip_ioctl or the
> page_flip hook implementation), and drm_crtc_vblank_put must be called
> when the flip completes and the event is sent to userspace. How is this
> supposed to be handled with atomic?

I'll let Daniel comment on that.

> Andrey, did you check via code audit and/or testing that the vblank
> reference count is still balanced after this change?
>
> >> @@ -3143,8 +3064,7 @@ int amdgpu_dm_atomic_check(struct drm_device *dev,
> >> 
> >> * 1. This commit is not a page flip.
> >> * 2. This commit is a page flip, and targets are
> > 
> > created.
> > 
> >> */
> >> 
> >> -  if (!page_flip_needed(plane_state, old_plane_state,
> >> -true) ||
> >> +  if (!page_flip_needed(plane_state, old_plane_state,
> > 
> > true) ||
> > 
> > This seems to be an unrelated change.
> 
> Yeah, such whitespace-only changes should be dropped.

-- 
Regards,

Laurent Pinchart



[PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB(v3)

2017-01-18 Thread Monk Liu
previously we always inserted 128 NOPs behind vm_flush, which
may lead to a DMAframe size above 256 dw, automatically aligned
to 512 dw.

now we calculate how many DWs are already inserted after vm_flush
and make up the rest to pad up to 128 dws before emit_ib.
that way we only take 256 dw per submit.

v2:
drop insert_nops in vm_flush
re-calculate the estimate frame size for gfx8 ring
v3:
calculate the gap between vm_flush and IB in cntx_cntl
based on a member field of the ring structure
v4:
set last_vm_flush_pos even in the case of no vm flush.

Change-Id: I0a1845de07c0b86ac1b77ac1a007e893144144fd
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c   |  7 +++
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 16 
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index c813cbe..332969f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -173,6 +173,7 @@ struct amdgpu_ring {
 #if defined(CONFIG_DEBUG_FS)
struct dentry *ent;
 #endif
+   u32 last_vm_flush_pos;
 };
 
 int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index d05546e..53fc5e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -396,6 +396,13 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job)
amdgpu_vm_ring_has_compute_vm_bug(ring)))
amdgpu_ring_emit_pipeline_sync(ring);
 
+   /* when there is no vm-flush this frame, we still need to record
+* the position of the tail of hdp_flush, because we still
+* need to make sure there are 128 DWs between the last SWITCH_BUFFER and
+* the emit_ib of this frame; otherwise a VM fault can still occur on the
+* constant engine.
+*/
+   ring->last_vm_flush_pos = ring->wptr;
if (ring->funcs->emit_vm_flush && (job->vm_needs_flush ||
amdgpu_vm_is_gpu_reset(adev, id))) {
struct fence *fence;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 5f37313..dde4177 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -6572,7 +6572,6 @@ static void gfx_v8_0_ring_emit_fence_gfx(struct 
amdgpu_ring *ring, u64 addr,
  DATA_SEL(write64bit ? 2 : 1) | INT_SEL(int_sel ? 2 : 
0));
amdgpu_ring_write(ring, lower_32_bits(seq));
amdgpu_ring_write(ring, upper_32_bits(seq));
-
 }
 
 static void gfx_v8_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)
@@ -6636,9 +6635,8 @@ static void gfx_v8_0_ring_emit_vm_flush(struct 
amdgpu_ring *ring,
/* sync PFP to ME, otherwise we might get invalid PFP reads */
amdgpu_ring_write(ring, PACKET3(PACKET3_PFP_SYNC_ME, 0));
amdgpu_ring_write(ring, 0x0);
-   /* GFX8 emits 128 dw nop to prevent CE access VM before 
vm_flush finish */
-   amdgpu_ring_insert_nop(ring, 128);
}
+   ring->last_vm_flush_pos = ring->wptr;
 }
 
 static u32 gfx_v8_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
@@ -6743,6 +6741,15 @@ static void gfx_v8_ring_emit_cntxcntl(struct amdgpu_ring 
*ring, uint32_t flags)
if (amdgpu_sriov_vf(ring->adev))
gfx_v8_0_ring_emit_de_meta_init(ring,
(flags & AMDGPU_VM_DOMAIN) ? AMDGPU_CSA_VADDR : 
ring->adev->virt.csa_vmid0_addr);
+
+   /* We need to pad some NOPs before emit_ib to prevent the CE from running
+* ahead of vm_flush, which may trigger a VM fault. */
+   if (ring->wptr > ring->last_vm_flush_pos) /* no wptr wrapping to RB 
head */
+   amdgpu_ring_insert_nop(ring, 128 - (ring->wptr - 
ring->last_vm_flush_pos));
+   else
+   if (ring->ptr_mask + 1 - ring->last_vm_flush_pos + ring->wptr < 
128)
+   amdgpu_ring_insert_nop(ring,
+   128 - (ring->ptr_mask + 1 - 
ring->last_vm_flush_pos + ring->wptr));
 }
 
 static void gfx_v8_0_ring_emit_rreg(struct amdgpu_ring *ring, uint32_t reg)
@@ -7018,7 +7025,8 @@ static const struct amdgpu_ring_funcs 
gfx_v8_0_ring_funcs_gfx = {
7 + /* gfx_v8_0_ring_emit_pipeline_sync */
128 + 19 + /* gfx_v8_0_ring_emit_vm_flush */
2 + /* gfx_v8_ring_emit_sb */
-   3 + 4 + 29, /* gfx_v8_ring_emit_cntxcntl including vgt 
flush/meta-data */
+   3 + 4 + 29 - /* gfx_v8_ring_emit_cntxcntl including vgt 
flush/meta-data */
+   20 - 7 - 6 - 3 - 4 - 29, /* no need to count 
gds/hdp_flush/vm_flush fence/cntx_cntl/vgt_flush/meta-data anymore */
.emit_ib_size = 4, /* gfx_v8_0_ring_emit_ib_gfx */
.emit_ib = gfx_v8_0_ring_emit_ib_gfx,
.emit_fe

[PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB(v3)

2017-01-18 Thread Monk Liu
previously we always inserted 128 NOPs behind vm_flush, which
may lead to a DMAframe size above 256 dw, automatically aligned
to 512 dw.

now we calculate how many DWs are already inserted after vm_flush
and make up the rest to pad up to 128 dws before emit_ib.
that way we only take 256 dw per submit.

v2:
drop insert_nops in vm_flush
re-calculate the estimate frame size for gfx8 ring
v3:
calculate the gap between vm_flush and IB in cntx_cntl
based on a member field of the ring structure

Change-Id: I0a1845de07c0b86ac1b77ac1a007e893144144fd
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 16 
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index c813cbe..332969f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -173,6 +173,7 @@ struct amdgpu_ring {
 #if defined(CONFIG_DEBUG_FS)
struct dentry *ent;
 #endif
+   u32 last_vm_flush_pos;
 };
 
 int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 5f37313..dde4177 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -6572,7 +6572,6 @@ static void gfx_v8_0_ring_emit_fence_gfx(struct 
amdgpu_ring *ring, u64 addr,
  DATA_SEL(write64bit ? 2 : 1) | INT_SEL(int_sel ? 2 : 
0));
amdgpu_ring_write(ring, lower_32_bits(seq));
amdgpu_ring_write(ring, upper_32_bits(seq));
-
 }
 
 static void gfx_v8_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)
@@ -6636,9 +6635,8 @@ static void gfx_v8_0_ring_emit_vm_flush(struct 
amdgpu_ring *ring,
/* sync PFP to ME, otherwise we might get invalid PFP reads */
amdgpu_ring_write(ring, PACKET3(PACKET3_PFP_SYNC_ME, 0));
amdgpu_ring_write(ring, 0x0);
-   /* GFX8 emits 128 dw nop to prevent CE access VM before 
vm_flush finish */
-   amdgpu_ring_insert_nop(ring, 128);
}
+   ring->last_vm_flush_pos = ring->wptr;
 }
 
 static u32 gfx_v8_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
@@ -6743,6 +6741,15 @@ static void gfx_v8_ring_emit_cntxcntl(struct amdgpu_ring 
*ring, uint32_t flags)
if (amdgpu_sriov_vf(ring->adev))
gfx_v8_0_ring_emit_de_meta_init(ring,
(flags & AMDGPU_VM_DOMAIN) ? AMDGPU_CSA_VADDR : 
ring->adev->virt.csa_vmid0_addr);
+
+   /* We need to pad some NOPs before emit_ib to prevent the CE from running
+* ahead of vm_flush, which may trigger a VM fault. */
+   if (ring->wptr > ring->last_vm_flush_pos) /* no wptr wrapping to RB 
head */
+   amdgpu_ring_insert_nop(ring, 128 - (ring->wptr - 
ring->last_vm_flush_pos));
+   else
+   if (ring->ptr_mask + 1 - ring->last_vm_flush_pos + ring->wptr < 
128)
+   amdgpu_ring_insert_nop(ring,
+   128 - (ring->ptr_mask + 1 - 
ring->last_vm_flush_pos + ring->wptr));
 }
 
 static void gfx_v8_0_ring_emit_rreg(struct amdgpu_ring *ring, uint32_t reg)
@@ -7018,7 +7025,8 @@ static const struct amdgpu_ring_funcs 
gfx_v8_0_ring_funcs_gfx = {
7 + /* gfx_v8_0_ring_emit_pipeline_sync */
128 + 19 + /* gfx_v8_0_ring_emit_vm_flush */
2 + /* gfx_v8_ring_emit_sb */
-   3 + 4 + 29, /* gfx_v8_ring_emit_cntxcntl including vgt 
flush/meta-data */
+   3 + 4 + 29 - /* gfx_v8_ring_emit_cntxcntl including vgt 
flush/meta-data */
+   20 - 7 - 6 - 3 - 4 - 29, /* no need to count 
gds/hdp_flush/vm_flush fence/cntx_cntl/vgt_flush/meta-data anymore */
.emit_ib_size = 4, /* gfx_v8_0_ring_emit_ib_gfx */
.emit_ib = gfx_v8_0_ring_emit_ib_gfx,
.emit_fence = gfx_v8_0_ring_emit_fence_gfx,
-- 
2.7.4
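The padding arithmetic in gfx_v8_ring_emit_cntxcntl above can be lifted into a standalone helper for illustration. This is a sketch under assumed semantics, and it adds a guard for the case where the gap is already at or above 128 dw, which the posted patch does not have:

```c
#include <assert.h>
#include <stdint.h>

/* How many NOP dwords are still needed so that at least 128 dw separate
 * the position recorded at vm_flush from the IB, handling write-pointer
 * wrap-around. ptr_mask + 1 is the ring size in dwords. */
static uint32_t nops_needed(uint32_t wptr, uint32_t last_vm_flush_pos,
                            uint32_t ptr_mask)
{
    uint32_t used;

    if (wptr > last_vm_flush_pos)                    /* no wrap since the flush */
        used = wptr - last_vm_flush_pos;
    else                                             /* wptr wrapped past the end */
        used = ptr_mask + 1 - last_vm_flush_pos + wptr;

    return used >= 128 ? 0 : 128 - used;
}
```

For a 1024-dw ring (ptr_mask 0x3ff), a flush at 1000 followed by a wptr of 40 has consumed 64 dw across the wrap, so 64 NOPs are still required.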



答复: [PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB

2017-01-18 Thread Liu, Monk
>First question why do we want to limit the dw per submit to 256? Some
>limitation for SRIOV?


[ML]

suppose hw_submit_count in gpu_scheduler is 2 (by default); without suppressing 
the frame size to under 256, we may sometimes submit 256 dw and sometimes 512 dw, 
and that could lead to the CPU overriding ring buffer content which is still 
being processed by the GPU, e.g.:


we assume a ring buffer size of 512 dw (so the ring buffer can hold 2 hw submits) 
for easy understanding.


hw_count =2;

submit job-1 (take 256 dw)

hw_count = 1;

submit job-2 (take 256 dw) //now ring buffer is full

hw_count =0;

gpu_scheduler waiting

...

job-1 signaled, then hw_count => 1

submit job-3, which is placed at the head of the RB (wrapped)

but job-3 takes 512 dw (e.g. it actually needs 259 dw, but we align it to 512)


job-2 is still being processed, but the packets belonging to job-2 are 
overridden by job-3; disaster ...
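The overwrite scenario walked through above can be modelled with a toy ring. The sizes (a 512-dw ring holding two in-flight submissions) follow the example; everything else here is illustrative, not driver code:

```c
#include <assert.h>
#include <stdint.h>

#define RING_DW 512u   /* ring size from the example: two 256-dw submissions */

/* Returns 1 if writing 'len' dwords starting at 'wptr' (mod ring size)
 * touches any dword of the still-executing range
 * [live_start, live_start + live_len). */
static int overwrites_live(uint32_t wptr, uint32_t len,
                           uint32_t live_start, uint32_t live_len)
{
    for (uint32_t i = 0; i < len; i++) {
        uint32_t pos = (wptr + i) % RING_DW;
        if (pos >= live_start && pos < live_start + live_len)
            return 1;
    }
    return 0;
}
```

With job-2 live in [256, 512), a 256-dw job-3 written at offset 0 is harmless, but the same job padded out to 512 dw wraps into job-2's range, which is exactly the corruption described above.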


another reason is that we keep each DMAframe to 256 dw for better performance.


BR Monk


发件人: Christian König 
发送时间: 2017年1月18日 17:28:15
收件人: Liu, Monk; amd-gfx@lists.freedesktop.org
主题: Re: [PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB

Am 18.01.2017 um 06:55 schrieb Monk Liu:
> previously we always inserted 128 NOPs behind vm_flush, which
> may lead to a DMAframe size above 256 dw, automatically aligned
> to 512 dw.
>
> now we calculate how many DWs are already inserted after vm_flush
> and make up the rest to pad up to 128 dws before emit_ib.
>
> that way we only take 256 dw per submit.

First question: why do we want to limit the dw per submit to 256? Some
limitation for SRIOV?

Then for the implementation, please don't add all that counting to the
different functions. Instead save the current position in the ring in
emit_vm_flush and then calculate in emit_cntxcntl() how many dw we still
need to add to have at least 128. E.g. call the variable something like
last_vm_flush_pos.

That makes the code way more robust towards any changes.

Regards,
Christian.

>
> Change-Id: Iac198e16f35b071476ba7bd48ab338223f6fe650
> Signed-off-by: Monk Liu 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 24 ++--
>   2 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index c813cbe..1dbe600 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -173,6 +173,7 @@ struct amdgpu_ring {
>   #if defined(CONFIG_DEBUG_FS)
>struct dentry *ent;
>   #endif
> + u32 dws_between_vm_ib;
>   };
>
>   int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> index 5f37313..76b1315 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> @@ -5670,6 +5670,8 @@ static void gfx_v8_0_ring_emit_gds_switch(struct 
> amdgpu_ring *ring,
>amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].oa);
>amdgpu_ring_write(ring, 0);
>amdgpu_ring_write(ring, (1 << (oa_size + oa_base)) - (1 << oa_base));
> +
> + ring->dws_between_vm_ib += 20;
>   }
>
>   static uint32_t wave_read_ind(struct amdgpu_device *adev, uint32_t simd, 
> uint32_t wave, uint32_t address)
> @@ -6489,6 +6491,8 @@ static void gfx_v8_0_ring_emit_hdp_flush(struct 
> amdgpu_ring *ring)
>amdgpu_ring_write(ring, ref_and_mask);
>amdgpu_ring_write(ring, ref_and_mask);
>amdgpu_ring_write(ring, 0x20); /* poll interval */
> +
> + ring->dws_between_vm_ib += 7;
>   }
>
>   static void gfx_v8_0_ring_emit_vgt_flush(struct amdgpu_ring *ring)
> @@ -6500,6 +6504,8 @@ static void gfx_v8_0_ring_emit_vgt_flush(struct 
> amdgpu_ring *ring)
>amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0));
>amdgpu_ring_write(ring, EVENT_TYPE(VGT_FLUSH) |
>EVENT_INDEX(0));
> +
> + ring->dws_between_vm_ib += 4;
>   }
>
>
> @@ -6573,6 +6579,7 @@ static void gfx_v8_0_ring_emit_fence_gfx(struct 
> amdgpu_ring *ring, u64 addr,
>amdgpu_ring_write(ring, lower_32_bits(seq));
>amdgpu_ring_write(ring, upper_32_bits(seq));
>
> + ring->dws_between_vm_ib += 6;
>   }
>
>   static void gfx_v8_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)
> @@ -6639,6 +6646,8 @@ static void gfx_v8_0_ring_emit_vm_flush(struct 
> amdgpu_ring *ring,
>/* GFX8 emits 128 dw nop to prevent CE access VM before 
> vm_flush finish */
>amdgpu_ring_insert_nop(ring, 128);
>}
> +
> + ring->dws_between_vm_ib = 0; /* clear before recalculate */
>   }
>
>   static u32 gfx_v8_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
> @@ -6711,9 +6720,11 @@ static void gfx_v8_ring_emit_cntxcntl(struct 
> amdgpu_ring *ring, uint32_t flags)
>   {
>uint32_t dw2 = 0;
>
> - if (amdgpu_sriov_vf(ring->adev))
> +

[PATCH xf86-video-ati] ati: Support loading the amdgpu driver from the ati wrapper

2017-01-18 Thread Michel Dänzer
From: Michel Dänzer 

If .../share/X11/xorg.conf.d/10-amdgpu.conf doesn't exist, but the ati
wrapper is loaded, it will otherwise try to use the radeon driver even
for GPUs driven by the amdgpu kernel driver. This can only fail,
potentially in bad ways.

Signed-off-by: Michel Dänzer 
---
 src/ati.c | 40 
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/src/ati.c b/src/ati.c
index 227665e73..d5294df09 100644
--- a/src/ati.c
+++ b/src/ati.c
@@ -58,6 +58,7 @@
 #endif
 
 #include 
+#include 
 #include "atipcirename.h"
 
 #include "ati.h"
@@ -68,6 +69,7 @@
 #define MACH64_DRIVER_NAME  "mach64"
 #define R128_DRIVER_NAME"r128"
 #define RADEON_DRIVER_NAME  "radeon"
+#define AMDGPU_DRIVER_NAME  "amdgpu"
 
 enum
 {
@@ -140,9 +142,9 @@ ati_device_get_indexed(int index)
 void
 ati_gdev_subdriver(pointer options)
 {
-int  nATIGDev, nMach64GDev, nR128GDev, nRadeonGDev;
+int  nATIGDev, nMach64GDev, nR128GDev, nRadeonGDev, nAmdgpuGDev;
 GDevPtr *ATIGDevs;
-Bool load_mach64 = FALSE, load_r128 = FALSE, load_radeon = FALSE;
+Bool load_mach64 = FALSE, load_r128 = FALSE, load_radeon = FALSE, 
load_amdgpu = FALSE;
 int  i;
 
 /* let the subdrivers configure for themselves */
@@ -154,6 +156,7 @@ ati_gdev_subdriver(pointer options)
 nMach64GDev = xf86MatchDevice(MACH64_DRIVER_NAME, NULL);
 nR128GDev = xf86MatchDevice(R128_DRIVER_NAME, NULL);
 nRadeonGDev = xf86MatchDevice(RADEON_DRIVER_NAME, NULL);
+nAmdgpuGDev = xf86MatchDevice(AMDGPU_DRIVER_NAME, NULL);
 
 for (i = 0; i < nATIGDev; i++) {
 GDevPtr ati_gdev = ATIGDevs[i];
@@ -200,8 +203,34 @@ ati_gdev_subdriver(pointer options)
 }
 
 if (chip_family == ATI_CHIP_FAMILY_Radeon) {
-ati_gdev->driver = RADEON_DRIVER_NAME;
-load_radeon = TRUE;
+   char *busid;
+
+   XNFasprintf(&busid, "pci:%04x:%02x:%02x.%d",
+   device->domain, device->bus, device->dev,
+   device->func);
+
+   if (busid) {
+   int fd = drmOpen(NULL, busid);
+
+   if (fd >= 0) {
+   drmVersionPtr version = drmGetVersion(fd);
+
+   if (version->version_major == 3) {
+   ati_gdev->driver = AMDGPU_DRIVER_NAME;
+   load_amdgpu = TRUE;
+   }
+
+   free(version);
+   drmClose(fd);
+   }
+
+   free(busid);
+   }
+
+   if (strcmp(ati_gdev->driver, AMDGPU_DRIVER_NAME) != 0) {
+   ati_gdev->driver = RADEON_DRIVER_NAME;
+   load_radeon = TRUE;
+   }
 }
 }
 
@@ -219,6 +248,9 @@ ati_gdev_subdriver(pointer options)
 
 if (load_radeon && (nRadeonGDev == 0))
  xf86LoadOneModule(RADEON_DRIVER_NAME, options);
+
+if (load_amdgpu && (nAmdgpuGDev == 0))
+ xf86LoadOneModule(AMDGPU_DRIVER_NAME, options);
 }
 
 /*
-- 
2.11.0
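The decision the patch makes via drmGetVersion() reduces to a one-line rule: the amdgpu kernel driver reports DRM major version 3, while radeon reports major version 2. The actual probe (drmOpen/drmGetVersion against the PCI bus ID) is stubbed out in this sketch, which models only the selection itself:

```c
#include <assert.h>
#include <string.h>

/* Pick the DDX driver name from the DRM major version reported by the
 * kernel driver bound to the device. Mirrors the check in
 * ati_gdev_subdriver(): major 3 means amdgpu, anything else falls back
 * to radeon. */
static const char *pick_ddx_driver(int drm_version_major)
{
    return drm_version_major == 3 ? "amdgpu" : "radeon";
}
```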



Re: [PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB

2017-01-18 Thread Christian König

Am 18.01.2017 um 06:55 schrieb Monk Liu:

previously we always inserted 128 NOPs behind vm_flush, which
may lead to a DMAframe size above 256 dw, automatically aligned
to 512 dw.

now we calculate how many DWs are already inserted after vm_flush
and make up the rest to pad up to 128 dws before emit_ib.

that way we only take 256 dw per submit.


First question: why do we want to limit the dw per submit to 256? Some 
limitation for SRIOV?


Then for the implementation, please don't add all that counting to the 
different functions. Instead save the current position in the ring in 
emit_vm_flush and then calculate in emit_cntxcntl() how many dw we still 
need to add to have at least 128. E.g. call the variable something like 
last_vm_flush_pos.


That makes the code way more robust towards any changes.

Regards,
Christian.



Change-Id: Iac198e16f35b071476ba7bd48ab338223f6fe650
Signed-off-by: Monk Liu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 24 ++--
  2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index c813cbe..1dbe600 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -173,6 +173,7 @@ struct amdgpu_ring {
  #if defined(CONFIG_DEBUG_FS)
struct dentry *ent;
  #endif
+   u32 dws_between_vm_ib;
  };
  
  int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 5f37313..76b1315 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5670,6 +5670,8 @@ static void gfx_v8_0_ring_emit_gds_switch(struct 
amdgpu_ring *ring,
amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].oa);
amdgpu_ring_write(ring, 0);
amdgpu_ring_write(ring, (1 << (oa_size + oa_base)) - (1 << oa_base));
+
+   ring->dws_between_vm_ib += 20;
  }
  
  static uint32_t wave_read_ind(struct amdgpu_device *adev, uint32_t simd, uint32_t wave, uint32_t address)

@@ -6489,6 +6491,8 @@ static void gfx_v8_0_ring_emit_hdp_flush(struct 
amdgpu_ring *ring)
amdgpu_ring_write(ring, ref_and_mask);
amdgpu_ring_write(ring, ref_and_mask);
amdgpu_ring_write(ring, 0x20); /* poll interval */
+
+   ring->dws_between_vm_ib += 7;
  }
  
  static void gfx_v8_0_ring_emit_vgt_flush(struct amdgpu_ring *ring)

@@ -6500,6 +6504,8 @@ static void gfx_v8_0_ring_emit_vgt_flush(struct 
amdgpu_ring *ring)
amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0));
amdgpu_ring_write(ring, EVENT_TYPE(VGT_FLUSH) |
EVENT_INDEX(0));
+
+   ring->dws_between_vm_ib += 4;
  }
  
  
@@ -6573,6 +6579,7 @@ static void gfx_v8_0_ring_emit_fence_gfx(struct amdgpu_ring *ring, u64 addr,

amdgpu_ring_write(ring, lower_32_bits(seq));
amdgpu_ring_write(ring, upper_32_bits(seq));
  
+	ring->dws_between_vm_ib += 6;

  }
  
  static void gfx_v8_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)

@@ -6639,6 +6646,8 @@ static void gfx_v8_0_ring_emit_vm_flush(struct 
amdgpu_ring *ring,
/* GFX8 emits 128 dw nop to prevent CE access VM before 
vm_flush finish */
amdgpu_ring_insert_nop(ring, 128);
}
+
+   ring->dws_between_vm_ib = 0; /* clear before recalculate */
  }
  
  static u32 gfx_v8_0_ring_get_wptr_compute(struct amdgpu_ring *ring)

@@ -6711,9 +6720,11 @@ static void gfx_v8_ring_emit_cntxcntl(struct amdgpu_ring *ring, uint32_t flags)
  {
uint32_t dw2 = 0;
  
-	if (amdgpu_sriov_vf(ring->adev))
+   if (amdgpu_sriov_vf(ring->adev)) {
gfx_v8_0_ring_emit_ce_meta_init(ring,
(flags & AMDGPU_VM_DOMAIN) ? AMDGPU_CSA_VADDR : 
ring->adev->virt.csa_vmid0_addr);
+   ring->dws_between_vm_ib += 8;
+   }
  
  	dw2 |= 0x8000; /* set load_enable otherwise this package is just NOPs */

if (flags & AMDGPU_HAVE_CTX_SWITCH) {
@@ -6739,10 +6750,19 @@ static void gfx_v8_ring_emit_cntxcntl(struct amdgpu_ring *ring, uint32_t flags)
amdgpu_ring_write(ring, PACKET3(PACKET3_CONTEXT_CONTROL, 1));
amdgpu_ring_write(ring, dw2);
amdgpu_ring_write(ring, 0);
+   ring->dws_between_vm_ib += 3;
  
-	if (amdgpu_sriov_vf(ring->adev))
+   if (amdgpu_sriov_vf(ring->adev)) {
gfx_v8_0_ring_emit_de_meta_init(ring,
(flags & AMDGPU_VM_DOMAIN) ? AMDGPU_CSA_VADDR : 
ring->adev->virt.csa_vmid0_addr);
+   ring->dws_between_vm_ib += 21;
+   }
+
+   /* emit_de_meta_init should be the last package right before emit_ib,
+    * and we need to pad some NOPs before emit_ib to prevent CE running ahead of
+    * vm_flush, which may trigger a VM fault.
+    */
+   amdgpu_ring_insert_nop(ring, 128 - ring->dws_between_vm_ib);
  }
  
  static void gfx_

Re: [PATCH 1/2] drm/amdgpu: fix reboot failure issue for virtualization

2017-01-18 Thread Christian König

Am 18.01.2017 um 06:00 schrieb Xiangliang Yu:

The reboot process calls the HW fini functions of the IP blocks. For
virtualization we need to send event three before hw fini and event
four after hw fini.

Signed-off-by: Xiangliang Yu 


Reviewed-by: Christian König  for both patches.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2868128..4b82081 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1566,6 +1566,9 @@ int amdgpu_suspend(struct amdgpu_device *adev)
  {
int i, r;
  
+	if (amdgpu_sriov_vf(adev))
+   amdgpu_virt_request_full_gpu(adev, false);
+
/* ungate SMC block first */
r = amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_SMC,
 AMD_CG_STATE_UNGATE);
@@ -1594,6 +1597,9 @@ int amdgpu_suspend(struct amdgpu_device *adev)
}
}
  
+	if (amdgpu_sriov_vf(adev))
+   amdgpu_virt_release_full_gpu(adev, false);
+
return 0;
  }
  



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 0/3] some cleanups and fixes

2017-01-18 Thread Christian König

Reviewed-by: Christian König  for the series.

Regards,
Christian.

Am 18.01.2017 um 03:40 schrieb Monk Liu:

Monk Liu (3):
   drm/amdgpu:set cond_exec polling value to 1 in ring_init
   drm/amdgpu:Preamble is forbid to be ignored in SRIOV
   drm/amdgpu:insert switch buffer only for VM submit

  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c   | 8 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 4 +++-
  2 files changed, 6 insertions(+), 6 deletions(-)





[PATCH] drm/amdgpu:guarantee 128dws between vm flush and IB(v2)

2017-01-18 Thread Monk Liu
previously we always inserted 128 NOPs behind vm_flush, which
may push the DMAframe size above 256 dw and get it automatically
aligned to 512 dw.

now we calculate how many DWs are already inserted after vm_flush
and pad NOPs for the rest, up to 128 dw before emit_ib.

that way we only take 256 dw per submit.

v2:
drop the 128-NOP insertion in gfx_v8_vm_flush,
and the estimated frame size should subtract the DWs between
vm_flush and emit_ib, since we already consider vm_flush
will take 128 + 19 DWs.

Change-Id: Iac198e16f35b071476ba7bd48ab338223f6fe650
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c   |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 25 -
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 9129b8c..e91f227 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -165,6 +165,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
patch_offset = amdgpu_ring_init_cond_exec(ring);
 
need_ctx_switch = ring->current_ctx != fence_ctx;
+   ring->dws_between_vm_ib = 0; /* clear before recalculate */
if (vm) {
r = amdgpu_vm_flush(ring, job);
if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index c813cbe..1dbe600 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -173,6 +173,7 @@ struct amdgpu_ring {
 #if defined(CONFIG_DEBUG_FS)
struct dentry *ent;
 #endif
+   u32 dws_between_vm_ib;
 };
 
 int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 5f37313..5e8e4eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5670,6 +5670,8 @@ static void gfx_v8_0_ring_emit_gds_switch(struct amdgpu_ring *ring,
amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].oa);
amdgpu_ring_write(ring, 0);
amdgpu_ring_write(ring, (1 << (oa_size + oa_base)) - (1 << oa_base));
+
+   ring->dws_between_vm_ib += 20;
 }
 
 static uint32_t wave_read_ind(struct amdgpu_device *adev, uint32_t simd, uint32_t wave, uint32_t address)
@@ -6489,6 +6491,8 @@ static void gfx_v8_0_ring_emit_hdp_flush(struct amdgpu_ring *ring)
amdgpu_ring_write(ring, ref_and_mask);
amdgpu_ring_write(ring, ref_and_mask);
amdgpu_ring_write(ring, 0x20); /* poll interval */
+
+   ring->dws_between_vm_ib += 7;
 }
 
 static void gfx_v8_0_ring_emit_vgt_flush(struct amdgpu_ring *ring)
@@ -6500,6 +6504,8 @@ static void gfx_v8_0_ring_emit_vgt_flush(struct amdgpu_ring *ring)
amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0));
amdgpu_ring_write(ring, EVENT_TYPE(VGT_FLUSH) |
EVENT_INDEX(0));
+
+   ring->dws_between_vm_ib += 4;
 }
 
 
@@ -6573,6 +6579,7 @@ static void gfx_v8_0_ring_emit_fence_gfx(struct amdgpu_ring *ring, u64 addr,
amdgpu_ring_write(ring, lower_32_bits(seq));
amdgpu_ring_write(ring, upper_32_bits(seq));
 
+   ring->dws_between_vm_ib += 6;
 }
 
 static void gfx_v8_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)
@@ -6636,8 +6643,6 @@ static void gfx_v8_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
/* sync PFP to ME, otherwise we might get invalid PFP reads */
amdgpu_ring_write(ring, PACKET3(PACKET3_PFP_SYNC_ME, 0));
amdgpu_ring_write(ring, 0x0);
-   /* GFX8 emits 128 dw nop to prevent CE access VM before vm_flush finish */
-   amdgpu_ring_insert_nop(ring, 128);
}
 }
 
@@ -6711,9 +6716,11 @@ static void gfx_v8_ring_emit_cntxcntl(struct amdgpu_ring *ring, uint32_t flags)
 {
uint32_t dw2 = 0;
 
-   if (amdgpu_sriov_vf(ring->adev))
+   if (amdgpu_sriov_vf(ring->adev)) {
gfx_v8_0_ring_emit_ce_meta_init(ring,
(flags & AMDGPU_VM_DOMAIN) ? AMDGPU_CSA_VADDR : 
ring->adev->virt.csa_vmid0_addr);
+   ring->dws_between_vm_ib += 8;
+   }
 
	dw2 |= 0x8000; /* set load_enable otherwise this package is just NOPs */
if (flags & AMDGPU_HAVE_CTX_SWITCH) {
@@ -6739,10 +6746,17 @@ static void gfx_v8_ring_emit_cntxcntl(struct amdgpu_ring *ring, uint32_t flags)
amdgpu_ring_write(ring, PACKET3(PACKET3_CONTEXT_CONTROL, 1));
amdgpu_ring_write(ring, dw2);
amdgpu_ring_write(ring, 0);
+   ring->dws_between_vm_ib += 3;
 
-   if (amdgpu_sriov_vf(ring->adev))
+   if (amdgpu_sriov_vf(ring->adev)) {
gfx_v8_0_ring_emit_de_meta_init(ring,
(flags & AMDGPU_VM_DOMAIN) ? AMDGPU_CSA_VADDR : 
ring->adev->virt.csa_vmid0_addr);
+   ring->dws_between_