Re: [PATCH 1/1] drm/amdgpu: Use list_del_init in amdgpu_mn_unregister

2017-08-01 Thread Felix Kuehling
Hi Christian,

Please apply this patch if you think this is the correct solution. This
is currently blocking the amd-kfd-staging merge from amd-staging-4.11.
Kent will try another merge tomorrow morning our time.

I'm quite sure this also breaks userptrs in the graphics driver.

Thanks,
  Felix


On 17-08-01 10:34 PM, Felix Kuehling wrote:
> Otherwise bo->shadow_list (which is aliased by bo->mn_list) will not
> appear empty in amdgpu_ttm_bo_destroy and cause an oops when freeing
> former userptr BOs.
>
> Signed-off-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> index 38f739f..6558a3e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> @@ -359,7 +359,7 @@ void amdgpu_mn_unregister(struct amdgpu_bo *bo)
>   head = bo->mn_list.next;
>  
>   bo->mn = NULL;
> - list_del(>mn_list);
> + list_del_init(>mn_list);
>  
>   if (list_empty(head)) {
>   struct amdgpu_mn_node *node;

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/1] drm/amdgpu: Use list_del_init in amdgpu_mn_unregister

2017-08-01 Thread Felix Kuehling
Otherwise bo->shadow_list (which is aliased by bo->mn_list) will not
appear empty in amdgpu_ttm_bo_destroy and cause an oops when freeing
former userptr BOs.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 38f739f..6558a3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -359,7 +359,7 @@ void amdgpu_mn_unregister(struct amdgpu_bo *bo)
head = bo->mn_list.next;
 
bo->mn = NULL;
-   list_del(>mn_list);
+   list_del_init(>mn_list);
 
if (list_empty(head)) {
struct amdgpu_mn_node *node;
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: vega firmware failing to load

2017-08-01 Thread Huang Rui
On Wed, Aug 02, 2017 at 10:07:01AM +1000, Dave Airlie wrote:
> Running amd-staging-4.11 at 0ae168080db88fa3a2ec84741a7293a25e4ecb42
> on
> VEGA10 0x1002:0x687F pciid
> 
> with firmware from
> https://people.freedesktop.org/~agd5f/radeon_ucode/vega10/
> 
> PSP firmware loading fails.
> 
> Dave.
> 
> [1.690928] [drm] initializing kernel modesetting (VEGA10
> 0x1002:0x687F 0x1002:0x0B36 0xC3).
> [1.690934] [drm] register mmio base: 0xEFB0
> [1.690935] [drm] register mmio size: 524288
> [1.690955] [drm] probing gen 2 caps for device 1022:1471 = 700d03/e
> [1.690956] [drm] probing mlw for device 1022:1471 = 700d03
> [1.690960] [drm] UVD is enabled in VM mode
> [1.690961] [drm] UVD ENC is enabled in VM mode
> [1.690961] [drm] VCE enabled in VM mode
> [1.690967] amdgpu :03:00.0: Invalid PCI ROM header signature:
> expecting 0xaa55, got 0x
> [1.690992] ATOM BIOS: 113-D0650101-004

Dave, your VBIOS is too old, in other words, the PSP boot loader version
within VBIOS is too old, so it's unable to match latest sos firmware.

Alex, do we have a public VBIOS website that Dave can upgrade his vbios?

Thanks,
Rui
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] Add drm ioctl DRM_IOCTL_MODE_GETFB2 & associated helpers.

2017-08-01 Thread Joe Kniss
On Tue, Aug 1, 2017 at 1:46 PM, Laurent Pinchart <
laurent.pinch...@ideasonboard.com> wrote:

> Hi Joe,
>
> On Tuesday 01 Aug 2017 10:24:25 Joe Kniss wrote:
> > On Tue, Aug 1, 2017 at 5:09 AM, Laurent Pinchart wrote:
> > > On Monday 31 Jul 2017 11:29:13 Joe Kniss wrote:
> > >> New getfb2 functionality uses drm_mode_fb_cmd2 struct to be symmetric
> > >> with addfb2.
> > >
> > > What's the use case for this ? We haven't needed such an ioctl for so
> long
> > > that it seemed to me that userspace doesn't really need it, but I
> could be
> > > wrong.
> >
> > Sorry, I failed to reference the original email.  Here it is:
> >
> > 
> > I am a recent addition to Google's ChromeOS gfx team.   I am currently
> > working on display testing and reporting.   An important part of this is
> > our screen capture tool, which works by querying drm for crtcs, planes,
> and
> > fbs.  Unfortunately, there is only limited information available via
> > drmModeGetFB(), which often wrong information when drmModeAddFB2() was
> used
> > to create the fbs.   For example, if the pixel format is NV12 or YUV420,
> > the fb returned knows nothing about the additional buffer planes required
> > by these formats.   Ideally, we would like a function (e.g.
> drmModeGetFB2)
> > to return information symmetric with drmModeAddFB2 including the pixel
> > format id, buffer plane information etc.
> > 
> >
> > ChromeOS has needed this functionality from the start, for both testing
> and
> > error reporting.  We got away with guessing the buffer's format (32bit
> > xrgb) until now.  We are now enabling overlays and more formats including
> > multi-planar (e.g. NV12).  Current getfb() reports neither the pixel
> format
> > nor  planar information.  Without this information, going forward, our
> gfx
> > testing is going to break.  It would be great if we had access to higher
> > level buffer structs (like gbm), but we generally don't since they would
> be
> > created by other apps (chrome browser, android apps, etc...).
>
> How is screen capture implemented ? Do you enumerate the framebuffers being
> scanned out, dump their contents and compose them manually based on the
> active
> plane configuration ? If so, isn't this very race-prone ?
>
>
Yes, this is basically it.  We identify the crtc to capture, query the
planes associated with it and their properties.  For each plane, we get the
fb, then a FD that we use to import a GBM buffer, which we map and
composite.  It's not ideal, but it's the only thing that will work across
all of our platforms and configs.   We either wait or sleep for consistent
testing captures.  I'm not sure what we can do about the race condition
without a much more substantial change...  We are currently looking into
some platform specific methods, but the current approach won't be going
away anytime soon.


> > >> Also modifies *_fb_create_handle() calls to accept a
> > >> format_plane_index so that handles for each plane can be generated.
> > >> Previously, many *_fb_create_handle() calls simply defaulted to plane
> 0
> > >> only.
> > >
> > > And with this patch the amd/amdgpu, armada, gma500, i915, mediatek,
> msm,
> > > nouveau and radeon drivers still do. Do none of them support
> multi-planar
> > > formats ?
> >
> > I would imagine that some of these do support multi-planar formats, but
> > they don't appear to have them implemented yet (except perhaps as offsets
> > into a single buffer).  I will certainly be looking into this soon, but
> any
> > changes will come in future patches.
> >
> > >> Signed-off-by: Joe Kniss 
> > >> ---
> > >>
> > >>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  5 +-
> > >>  drivers/gpu/drm/armada/armada_fb.c  |  1 +
> > >>  drivers/gpu/drm/drm_crtc_internal.h |  2 +
> > >>  drivers/gpu/drm/drm_fb_cma_helper.c | 11 ++--
> > >>  drivers/gpu/drm/drm_framebuffer.c   | 79
> +++-
> > >>  drivers/gpu/drm/drm_ioctl.c |  1 +
> > >>  drivers/gpu/drm/exynos/exynos_drm_fb.c  |  7 ++-
> > >>  drivers/gpu/drm/gma500/framebuffer.c|  2 +
> > >>  drivers/gpu/drm/i915/intel_display.c|  1 +
> > >>  drivers/gpu/drm/mediatek/mtk_drm_fb.c   |  1 +
> > >>  drivers/gpu/drm/msm/msm_fb.c|  5 +-
> > >>  drivers/gpu/drm/nouveau/nouveau_display.c   |  1 +
> > >>  drivers/gpu/drm/omapdrm/omap_fb.c   |  5 +-
> > >>  drivers/gpu/drm/radeon/radeon_display.c |  5 +-
> > >>  drivers/gpu/drm/rockchip/rockchip_drm_fb.c  |  6 ++-
> > >>  drivers/gpu/drm/tegra/fb.c  |  9 +++-
> > >>  include/drm/drm_framebuffer.h   |  1 +
> > >>  include/uapi/drm/drm.h  |  2 +
> > >>  18 files changed, 127 insertions(+), 17 deletions(-)
> > >
> > > [snip]
> > >
> > >> diff --git a/drivers/gpu/drm/drm_framebuffer.c
> > >> b/drivers/gpu/drm/drm_framebuffer.c index 28a0108a1ab8..67b3be1bedbc
> > >> 100644
> > >> --- a/drivers/gpu/drm/drm_framebuffer.c
> > >> +++ 

vega firmware failing to load

2017-08-01 Thread Dave Airlie
Running amd-staging-4.11 at 0ae168080db88fa3a2ec84741a7293a25e4ecb42
on
VEGA10 0x1002:0x687F pciid

with firmware from
https://people.freedesktop.org/~agd5f/radeon_ucode/vega10/

PSP firmware loading fails.

Dave.

[1.690928] [drm] initializing kernel modesetting (VEGA10
0x1002:0x687F 0x1002:0x0B36 0xC3).
[1.690934] [drm] register mmio base: 0xEFB0
[1.690935] [drm] register mmio size: 524288
[1.690955] [drm] probing gen 2 caps for device 1022:1471 = 700d03/e
[1.690956] [drm] probing mlw for device 1022:1471 = 700d03
[1.690960] [drm] UVD is enabled in VM mode
[1.690961] [drm] UVD ENC is enabled in VM mode
[1.690961] [drm] VCE enabled in VM mode
[1.690967] amdgpu :03:00.0: Invalid PCI ROM header signature:
expecting 0xaa55, got 0x
[1.690992] ATOM BIOS: 113-D0650101-004
[1.690998] [drm] GPU post is not needed
[1.691005] [drm] vm size is 262144 GB, block size is 9-bit
[1.691008] amdgpu :03:00.0: VRAM: 8176M 0x00F4 -
0x00F5FEFF (8176M used)
[1.691009] amdgpu :03:00.0: GTT: 256M 0x00F5FF00 -
0x00F60EFF
[1.691011] [drm] Detected VRAM RAM=8176M, BAR=256M
[1.691011] [drm] RAM width 2048bits HBM
[1.691101] [TTM] Zone  kernel: Available graphics memory: 2002554 kiB
[1.691101] [TTM] Initializing pool allocator
[1.691103] [TTM] Initializing DMA pool allocator
[1.691114] [drm] amdgpu: 8176M of VRAM memory ready
[1.691115] [drm] amdgpu: 8176M of GTT memory ready.
[1.691123] [drm] GART: num cpu pages 65536, num gpu pages 65536
[1.691190] [drm] PCIE GART of 256M enabled (table at 0x00F40080).
[1.691239] amdgpu :03:00.0: amdgpu: using MSI.
[1.691292] [drm] amdgpu: irq initialized.

[1.712298] amdgpu: [powerplay] amdgpu: powerplay sw initialized
[1.712596] amdgpu :03:00.0: fence driver on ring 0 use gpu
addr 0x00f5ff48, cpu addr 0x9a6ceaeb3008
[1.712679] amdgpu :03:00.0: fence driver on ring 1 use gpu
addr 0x00f5ff400010, cpu addr 0x9a6ceaeb3010
[1.712762] amdgpu :03:00.0: fence driver on ring 2 use gpu
addr 0x00f5ff400018, cpu addr 0x9a6ceaeb3018
[1.712835] amdgpu :03:00.0: fence driver on ring 3 use gpu
addr 0x00f5ff400028, cpu addr 0x9a6ceaeb3028
[1.712917] amdgpu :03:00.0: fence driver on ring 4 use gpu
addr 0x00f5ff400030, cpu addr 0x9a6ceaeb3030
[1.712990] amdgpu :03:00.0: fence driver on ring 5 use gpu
addr 0x00f5ff400038, cpu addr 0x9a6ceaeb3038
[1.713044] amdgpu :03:00.0: fence driver on ring 6 use gpu
addr 0x00f5ff400048, cpu addr 0x9a6ceaeb3048
[1.713094] amdgpu :03:00.0: fence driver on ring 7 use gpu
addr 0x00f5ff400050, cpu addr 0x9a6ceaeb3050
[1.713136] amdgpu :03:00.0: fence driver on ring 8 use gpu
addr 0x00f5ff400058, cpu addr 0x9a6ceaeb3058
[1.713152] amdgpu :03:00.0: fence driver on ring 9 use gpu
addr 0x00f5ff40006c, cpu addr 0x9a6ceaeb306c
[1.713211] [drm] use_doorbell being set to: [true]
[1.713244] amdgpu :03:00.0: fence driver on ring 10 use gpu
addr 0x00f5ff400074, cpu addr 0x9a6ceaeb3074
[1.713252] [drm] use_doorbell being set to: [true]
[1.713286] amdgpu :03:00.0: fence driver on ring 11 use gpu
addr 0x00f5ff40007c, cpu addr 0x9a6ceaeb307c
[1.713367] [drm] Found UVD firmware Version: 1.62 Family ID: 17
[1.713374] [drm] PSP loading UVD firmware
[1.713530] amdgpu :03:00.0: fence driver on ring 12 use gpu
addr 0x00f4009118e0, cpu addr 0xb4f0c135a8e0
[1.713570] amdgpu :03:00.0: fence driver on ring 13 use gpu
addr 0x00f5ff4000ac, cpu addr 0x9a6ceaeb30ac
[1.713613] amdgpu :03:00.0: fence driver on ring 14 use gpu
addr 0x00f5ff4000bc, cpu addr 0x9a6ceaeb30bc
[1.713658] [drm] Found VCE firmware Version: 53.29 Binary ID: 4
[1.713665] [drm] PSP loading VCE firmware
[1.713697] amdgpu :03:00.0: fence driver on ring 15 use gpu
addr 0x00f5ff4000d4, cpu addr 0x9a6ceaeb30d4
[1.713739] amdgpu :03:00.0: fence driver on ring 16 use gpu
addr 0x00f5ff4000ec, cpu addr 0x9a6ceaeb30ec
[1.713781] amdgpu :03:00.0: fence driver on ring 17 use gpu
addr 0x00f5ff4000fc, cpu addr 0x9a6ceaeb30fc
[1.875199] [drm:psp_hw_init [amdgpu]] *ERROR* PSP firmware loading failed
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] Add drm ioctl DRM_IOCTL_MODE_GETFB2 & associated helpers.

2017-08-01 Thread Laurent Pinchart
Hi Joe,

On Tuesday 01 Aug 2017 10:24:25 Joe Kniss wrote:
> On Tue, Aug 1, 2017 at 5:09 AM, Laurent Pinchart wrote:
> > On Monday 31 Jul 2017 11:29:13 Joe Kniss wrote:
> >> New getfb2 functionality uses drm_mode_fb_cmd2 struct to be symmetric
> >> with addfb2.
> > 
> > What's the use case for this ? We haven't needed such an ioctl for so long
> > that it seemed to me that userspace doesn't really need it, but I could be
> > wrong.
> 
> Sorry, I failed to reference the original email.  Here it is:
> 
> 
> I am a recent addition to Google's ChromeOS gfx team.   I am currently
> working on display testing and reporting.   An important part of this is
> our screen capture tool, which works by querying drm for crtcs, planes, and
> fbs.  Unfortunately, there is only limited information available via
> drmModeGetFB(), which often wrong information when drmModeAddFB2() was used
> to create the fbs.   For example, if the pixel format is NV12 or YUV420,
> the fb returned knows nothing about the additional buffer planes required
> by these formats.   Ideally, we would like a function (e.g. drmModeGetFB2)
> to return information symmetric with drmModeAddFB2 including the pixel
> format id, buffer plane information etc.
> 
> 
> ChromeOS has needed this functionality from the start, for both testing and
> error reporting.  We got away with guessing the buffer's format (32bit
> xrgb) until now.  We are now enabling overlays and more formats including
> multi-planar (e.g. NV12).  Current getfb() reports neither the pixel format
> nor  planar information.  Without this information, going forward, our gfx
> testing is going to break.  It would be great if we had access to higher
> level buffer structs (like gbm), but we generally don't since they would be
> created by other apps (chrome browser, android apps, etc...).

How is screen capture implemented ? Do you enumerate the framebuffers being 
scanned out, dump their contents and compose them manually based on the active 
plane configuration ? If so, isn't this very race-prone ?

> >> Also modifies *_fb_create_handle() calls to accept a
> >> format_plane_index so that handles for each plane can be generated.
> >> Previously, many *_fb_create_handle() calls simply defaulted to plane 0
> >> only.
> > 
> > And with this patch the amd/amdgpu, armada, gma500, i915, mediatek, msm,
> > nouveau and radeon drivers still do. Do none of them support multi-planar
> > formats ?
>  
> I would imagine that some of these do support multi-planar formats, but
> they don't appear to have them implemented yet (except perhaps as offsets
> into a single buffer).  I will certainly be looking into this soon, but any
> changes will come in future patches.
> 
> >> Signed-off-by: Joe Kniss 
> >> ---
> >> 
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  5 +-
> >>  drivers/gpu/drm/armada/armada_fb.c  |  1 +
> >>  drivers/gpu/drm/drm_crtc_internal.h |  2 +
> >>  drivers/gpu/drm/drm_fb_cma_helper.c | 11 ++--
> >>  drivers/gpu/drm/drm_framebuffer.c   | 79 +++-
> >>  drivers/gpu/drm/drm_ioctl.c |  1 +
> >>  drivers/gpu/drm/exynos/exynos_drm_fb.c  |  7 ++-
> >>  drivers/gpu/drm/gma500/framebuffer.c|  2 +
> >>  drivers/gpu/drm/i915/intel_display.c|  1 +
> >>  drivers/gpu/drm/mediatek/mtk_drm_fb.c   |  1 +
> >>  drivers/gpu/drm/msm/msm_fb.c|  5 +-
> >>  drivers/gpu/drm/nouveau/nouveau_display.c   |  1 +
> >>  drivers/gpu/drm/omapdrm/omap_fb.c   |  5 +-
> >>  drivers/gpu/drm/radeon/radeon_display.c |  5 +-
> >>  drivers/gpu/drm/rockchip/rockchip_drm_fb.c  |  6 ++-
> >>  drivers/gpu/drm/tegra/fb.c  |  9 +++-
> >>  include/drm/drm_framebuffer.h   |  1 +
> >>  include/uapi/drm/drm.h  |  2 +
> >>  18 files changed, 127 insertions(+), 17 deletions(-)
> > 
> > [snip]
> > 
> >> diff --git a/drivers/gpu/drm/drm_framebuffer.c
> >> b/drivers/gpu/drm/drm_framebuffer.c index 28a0108a1ab8..67b3be1bedbc
> >> 100644
> >> --- a/drivers/gpu/drm/drm_framebuffer.c
> >> +++ b/drivers/gpu/drm/drm_framebuffer.c
> >> @@ -24,6 +24,7 @@

[snip]

> >> +/**
> >> + * drm_mode_getfb2 - get FB info
> >> + * @dev: drm device for the ioctl
> >> + * @data: data pointer for the ioctl
> >> + * @file_priv: drm file for the ioctl call
> >> + *
> >> + * Lookup the FB given its ID and return info about it.
> >> + *
> >> + * Called by the user via ioctl.
> >> + *
> >> + * Returns:
> >> + * Zero on success, negative errno on failure.
> >> + */
> >> +int drm_mode_getfb2(struct drm_device *dev,
> >> +void *data, struct drm_file *file_priv)
> >> +{
> >> + struct drm_mode_fb_cmd2 *r = data;
> >> + struct drm_framebuffer *fb;
> >> + int ret, i;
> >> +
> >> + if (!drm_core_check_feature(dev, DRIVER_MODESET))
> >> + return -EINVAL;
> >> +
> >> + fb = drm_framebuffer_lookup(dev, r->fb_id);
> >> + if (!fb)
> >> +

Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Christian König

Am 01.08.2017 um 19:00 schrieb Tom St Denis:

On 01/08/17 11:41 AM, Christian König wrote:

We can turn of the trace subsystem during compile time already.

Additional to that the trace points use self modifying code to make 
sure that they don't have any overhead as long as they are turned 
off. But I don't think this works with the "if (trace_*_enabled()" as 
well.


Anyway, not a performance critical code path so I don't really bother.


Well aside from that this will only really be useful in upstream 
kernels where we don't control the configuration.


Eventually if we come up with a better solution (tm) then we can 
revert this cleanly.


Actually even if we come up with some other approach to access the GPU 
VM of a process I would rather like to keep that trace point because it 
is rather useful for some userptr debugging as well.


Christian.



Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] Add drm ioctl DRM_IOCTL_MODE_GETFB2 & associated helpers.

2017-08-01 Thread Joe Kniss
Thanks for the quick review!

On Tue, Aug 1, 2017 at 5:09 AM, Laurent Pinchart  wrote:

> Hi Joe,
>
> Thank you for the patch.
>
> On Monday 31 Jul 2017 11:29:13 Joe Kniss wrote:
> > New getfb2 functionality uses drm_mode_fb_cmd2 struct to be symmetric
> > with addfb2.
>
> What's the use case for this ? We haven't needed such an ioctl for so long
> that it seemed to me that userspace doesn't really need it, but I could be
> wrong.
>
> Sorry, I failed to reference the original email.  Here it is:

 I am a recent addition to Google's ChromeOS gfx team.   I am currently
working on display testing and reporting.   An important part of this is
our screen capture tool, which works by querying drm for crtcs, planes, and
fbs.  Unfortunately, there is only limited information available via
drmModeGetFB(), which often wrong information when drmModeAddFB2() was used
to create the fbs.   For example, if the pixel format is NV12 or YUV420,
the fb returned knows nothing about the additional buffer planes required
by these formats.   Ideally, we would like a function (e.g. drmModeGetFB2)
to return information symmetric with drmModeAddFB2 including the pixel
format id, buffer plane information etc.

ChromeOS has needed this functionality from the start, for both testing and
error reporting.  We got away with guessing the buffer's format (32bit
xrgb) until now.  We are now enabling overlays and more formats including
multi-planar (e.g. NV12).  Current getfb() reports neither the pixel format
nor  planar information.  Without this information, going forward, our gfx
testing is going to break.  It would be great if we had access to higher
level buffer structs (like gbm), but we generally don't since they would be
created by other apps (chrome browser, android apps, etc...).


> > Also modifies *_fb_create_handle() calls to accept a
> > format_plane_index so that handles for each plane can be generated.
> > Previously, many *_fb_create_handle() calls simply defaulted to plane 0
> > only.
>
> And with this patch the amd/amdgpu, armada, gma500, i915, mediatek, msm,
> nouveau and radeon drivers still do. Do none of them support multi-planar
> formats ?
>
> I would imagine that some of these do support multi-planar formats, but
they don't appear to have them implemented yet (except perhaps as offsets
into a single buffer).  I will certainly be looking into this soon, but any
changes will come in future patches.


> > Signed-off-by: Joe Kniss 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  5 +-
> >  drivers/gpu/drm/armada/armada_fb.c  |  1 +
> >  drivers/gpu/drm/drm_crtc_internal.h |  2 +
> >  drivers/gpu/drm/drm_fb_cma_helper.c | 11 ++--
> >  drivers/gpu/drm/drm_framebuffer.c   | 79
> +-
> >  drivers/gpu/drm/drm_ioctl.c |  1 +
> >  drivers/gpu/drm/exynos/exynos_drm_fb.c  |  7 ++-
> >  drivers/gpu/drm/gma500/framebuffer.c|  2 +
> >  drivers/gpu/drm/i915/intel_display.c|  1 +
> >  drivers/gpu/drm/mediatek/mtk_drm_fb.c   |  1 +
> >  drivers/gpu/drm/msm/msm_fb.c|  5 +-
> >  drivers/gpu/drm/nouveau/nouveau_display.c   |  1 +
> >  drivers/gpu/drm/omapdrm/omap_fb.c   |  5 +-
> >  drivers/gpu/drm/radeon/radeon_display.c |  5 +-
> >  drivers/gpu/drm/rockchip/rockchip_drm_fb.c  |  6 ++-
> >  drivers/gpu/drm/tegra/fb.c  |  9 +++-
> >  include/drm/drm_framebuffer.h   |  1 +
> >  include/uapi/drm/drm.h  |  2 +
> >  18 files changed, 127 insertions(+), 17 deletions(-)
>
> [snip]
>
> > diff --git a/drivers/gpu/drm/drm_framebuffer.c
> > b/drivers/gpu/drm/drm_framebuffer.c index 28a0108a1ab8..67b3be1bedbc
> 100644
> > --- a/drivers/gpu/drm/drm_framebuffer.c
> > +++ b/drivers/gpu/drm/drm_framebuffer.c
> > @@ -24,6 +24,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include "drm_crtc_internal.h"
>
> [snip]
>
> > +/**
> > + * drm_mode_getfb2 - get FB info
> > + * @dev: drm device for the ioctl
> > + * @data: data pointer for the ioctl
> > + * @file_priv: drm file for the ioctl call
> > + *
> > + * Lookup the FB given its ID and return info about it.
> > + *
> > + * Called by the user via ioctl.
> > + *
> > + * Returns:
> > + * Zero on success, negative errno on failure.
> > + */
> > +int drm_mode_getfb2(struct drm_device *dev,
> > +void *data, struct drm_file *file_priv)
> > +{
> > + struct drm_mode_fb_cmd2 *r = data;
> > + struct drm_framebuffer *fb;
> > + int ret, i;
> > +
> > + if (!drm_core_check_feature(dev, DRIVER_MODESET))
> > + return -EINVAL;
> > +
> > + fb = drm_framebuffer_lookup(dev, r->fb_id);
> > + if (!fb)
> > + return -ENOENT;
> > +
> > + r->height = fb->height;
> > + r->width = fb->width;
> > + r->pixel_format = fb->format->format;
> > + for (i = 0; i < 4; ++i) {
> > + 

Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Tom St Denis

On 01/08/17 11:41 AM, Christian König wrote:

We can turn of the trace subsystem during compile time already.

Additional to that the trace points use self modifying code to make sure 
that they don't have any overhead as long as they are turned off. But I 
don't think this works with the "if (trace_*_enabled()" as well.


Anyway, not a performance critical code path so I don't really bother.


Well aside from that this will only really be useful in upstream kernels 
where we don't control the configuration.


Eventually if we come up with a better solution (tm) then we can revert 
this cleanly.


Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Tom St Denis
In v2 of the patch I used the _enabled() function around the blocks so 
the for loop is only reached if the trace is enabled.


Tom

On 01/08/17 11:56 AM, Xie, AlexBin wrote:
I don't know if compiler is smart enough to optimize the following for 
statement out...



+   for (i = 0; i < ttm->num_pages; i++) {
+   trace_amdgpu_ttm_tt_populate(
+   adev,
+   gtt->ttm.dma_address[i],
+   page_to_phys(ttm->pages[i]));
+   }


-Alex Bin



*From:* Christian König 
*Sent:* Tuesday, August 1, 2017 11:41 AM
*To:* Xie, AlexBin; amd-gfx@lists.freedesktop.org
*Subject:* Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping
We can turn of the trace subsystem during compile time already.

Additional to that the trace points use self modifying code to make sure
that they don't have any overhead as long as they are turned off. But I
don't think this works with the "if (trace_*_enabled()" as well.

Anyway, not a performance critical code path so I don't really bother.

Christian.

Am 01.08.2017 um 17:00 schrieb axie:
Because this is used by a debug tool, can we use a macro to 
conditionally compile this feature?



On 2017-08-01 10:54 AM, Christian König wrote:

Am 01.08.2017 um 16:26 schrieb Tom St Denis:

On 01/08/17 10:10 AM, Christian König wrote:


You need to cover multiple code path here:
1. The one you currently implemented which uses ttm_dma_populate() 
and pci_map_page().

2. The one using ttm_dma_populate().


I'll have to look into this one though it's my understanding that 
code path is only followed if there's no (hardware) IOMMU right?


Wrong, that one is used when "anything" in the system used the 
SWIOTLB path once. So the detection doesn't always work reliable.


This one seems rather straight forward but the only catch is I don't 
have access to "adev" inside that drm function.


When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to 
get the adev.



  Would it be taboo to do something like

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 376078334f54..cd97ef2144c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt 
*ttm)

drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
gtt->ttm.dma_address, ttm->num_pages);
ttm->state = tt_unbound;
+   for (i = 0; i < ttm->num_pages; i++) {
+   trace_amdgpu_ttm_tt_populate(
+   adev,
+   gtt->ttm.dma_address[i],
+ page_to_phys(ttm->pages[i]));
+   }
return 0;
}

This would normally result in a for loop around a nop which 
shouldn't be a huge performance hit but is one none the less.


Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those 
functions are generated by the trace subsystem automatically when you 
define a trace point.


It just doesn't use those nice tricks to modify the compiled binary 
on the fly.


Christian.



4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for 
userptrs.


Basically all just take gtt->ttm.ttm.pages and fill 
gtt->ttm.dma_address.


Same comment as #3.  I can tackle this with a for loop around the 
trace if that is ok.


I suggest to add a helper you can call to trace all 
pages->dma_address mappings inside a ttm.


One thing at a time :-).  That would probably be a bit better since 
the trace log gets filled with remappings (which is why my 
proof-of-concept umr patch only grabs the latest mapping as it reads 
the trace).


Is there a handy place we can grab a list of ttm_tt's bound to our 
device?  If so then in theory I could write a debugfs interface 
instead.


Thanks for your patience as I'm definitely less familiar with the VM 
side of things :-)


Cheers,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
amd-gfx Info Page - freedesktop.org 


lists.freedesktop.org
Subscribing to amd-gfx: Subscribe to amd-gfx by filling out the 
following form. Use of all freedesktop.org lists is subject to our Code 
of ...






___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
amd-gfx Info Page - freedesktop.org 


lists.freedesktop.org
Subscribing to amd-gfx: Subscribe to amd-gfx by filling out the 
following form. Use of all freedesktop.org lists is subject to our Code 
of ...






Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Christian König

We can turn of the trace subsystem during compile time already.

Additional to that the trace points use self modifying code to make sure 
that they don't have any overhead as long as they are turned off. But I 
don't think this works with the "if (trace_*_enabled()" as well.


Anyway, not a performance critical code path so I don't really bother.

Christian.

Am 01.08.2017 um 17:00 schrieb axie:
Because this is used by a debug tool, can we use a macro to 
conditionally compile this feature?



On 2017-08-01 10:54 AM, Christian König wrote:

Am 01.08.2017 um 16:26 schrieb Tom St Denis:

On 01/08/17 10:10 AM, Christian König wrote:


You need to cover multiple code path here:
1. The one you currently implemented which uses ttm_dma_populate() 
and pci_map_page().

2. The one using ttm_dma_populate().


I'll have to look into this one though it's my understanding that 
code path is only followed if there's no (hardware) IOMMU right?


Wrong, that one is used when "anything" in the system used the 
SWIOTLB path once. So the detection doesn't always work reliable.


This one seems rather straight forward but the only catch is I don't 
have access to "adev" inside that drm function.


When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to 
get the adev.



  Would it be taboo to do something like

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 376078334f54..cd97ef2144c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt 
*ttm)

drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
gtt->ttm.dma_address, ttm->num_pages);
ttm->state = tt_unbound;
+   for (i = 0; i < ttm->num_pages; i++) {
+   trace_amdgpu_ttm_tt_populate(
+   adev,
+   gtt->ttm.dma_address[i],
+ page_to_phys(ttm->pages[i]));
+   }
return 0;
}

This would normally result in a for loop around a nop which 
shouldn't be a huge performance hit but is one none the less.


Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those 
functions are generated by the trace subsystem automatically when you 
define a trace point.


It just doesn't use those nice tricks to modify the compiled binary 
on the fly.


Christian.



4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for 
userptrs.


Basically all just take gtt->ttm.ttm.pages and fill 
gtt->ttm.dma_address.


Same comment as #3.  I can tackle this with a for loop around the 
trace if that is ok.


I suggest to add a helper you can call to trace all 
pages->dma_address mappings inside a ttm.


One thing at a time :-).  That would probably be a bit better since 
the trace log gets filled with remappings (which is why my 
proof-of-concept umr patch only grabs the latest mapping as it reads 
the trace).


Is there a handy place we can grab a list of ttm_tt's bound to our 
device?  If so then in theory I could write a debugfs interface 
instead.


Thanks for your patience as I'm definitely less familiar with the VM 
side of things :-)


Cheers,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread axie
Because this is used by a debug tool, can we use a macro to 
conditionally compile this feature?



On 2017-08-01 10:54 AM, Christian König wrote:

Am 01.08.2017 um 16:26 schrieb Tom St Denis:

On 01/08/17 10:10 AM, Christian König wrote:


You need to cover multiple code path here:
1. The one you currently implemented which uses ttm_dma_populate() 
and pci_map_page().

2. The one using ttm_dma_populate().


I'll have to look into this one though it's my understanding that 
code path is only followed if there's no (hardware) IOMMU right?


Wrong, that one is used when "anything" in the system used the SWIOTLB 
path once. So the detection doesn't always work reliable.


This one seems rather straight forward but the only catch is I don't 
have access to "adev" inside that drm function.


When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to 
get the adev.



  Would it be taboo to do something like

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 376078334f54..cd97ef2144c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt 
*ttm)

drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
gtt->ttm.dma_address, ttm->num_pages);
ttm->state = tt_unbound;
+   for (i = 0; i < ttm->num_pages; i++) {
+   trace_amdgpu_ttm_tt_populate(
+   adev,
+   gtt->ttm.dma_address[i],
+   page_to_phys(ttm->pages[i]));
+   }
return 0;
}

This would normally result in a for loop around a nop which shouldn't 
be a huge performance hit but is one none the less.


Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those 
functions are generated by the trace subsystem automatically when you 
define a trace point.


It just doesn't use those nice tricks to modify the compiled binary on 
the fly.


Christian.



4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for 
userptrs.


Basically all just take gtt->ttm.ttm.pages and fill 
gtt->ttm.dma_address.


Same comment as #3.  I can tackle this with a for loop around the 
trace if that is ok.


I suggest to add a helper you can call to trace all 
pages->dma_address mappings inside a ttm.


One thing at a time :-).  That would probably be a bit better since 
the trace log gets filled with remappings (which is why my 
proof-of-concept umr patch only grabs the latest mapping as it reads 
the trace).


Is there a handy place we can grab a list of ttm_tt's bound to our 
device?  If so then in theory I could write a debugfs interface instead.


Thanks for your patience as I'm definitely less familiar with the VM 
side of things :-)


Cheers,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping (v2)

2017-08-01 Thread Tom St Denis
This helps map DMA addresses back to physical addresses.

Signed-off-by: Tom St Denis 

(v2):  Added tracepoints for USERPTR, SG mappings, and
 SWIOTBL mappings.  Reformatted trace call perform
 PCI decoding internal to the trace.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 28 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 35 ---
 2 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 509f7a63d40c..3e0f1885a379 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -14,6 +14,34 @@
 #define AMDGPU_JOB_GET_TIMELINE_NAME(job) \
 
job->base.s_fence->finished.ops->get_timeline_name(>base.s_fence->finished)
 
+TRACE_EVENT(amdgpu_ttm_tt_populate,
+   TP_PROTO(struct amdgpu_device *adev, uint64_t dma_address, uint64_t 
phys_address),
+   TP_ARGS(adev, dma_address, phys_address),
+   TP_STRUCT__entry(
+   __field(uint16_t, domain)
+   __field(uint8_t, bus)
+   __field(uint8_t, slot)
+   __field(uint8_t, func)
+   __field(uint64_t, dma)
+   __field(uint64_t, phys)
+   ),
+   TP_fast_assign(
+  __entry->domain = pci_domain_nr(adev->pdev->bus);
+  __entry->bus = adev->pdev->bus->number;
+  __entry->slot = PCI_SLOT(adev->pdev->devfn);
+  __entry->func = PCI_FUNC(adev->pdev->devfn);
+  __entry->dma = dma_address;
+  __entry->phys = phys_address;
+  ),
+   TP_printk("%04x:%02x:%02x.%x: 0x%llx => 0x%llx",
+ (unsigned)__entry->domain,
+ (unsigned)__entry->bus,
+ (unsigned)__entry->slot,
+ (unsigned)__entry->func,
+ (unsigned long long)__entry->dma,
+ (unsigned long long)__entry->phys)
+);
+
 TRACE_EVENT(amdgpu_mm_rreg,
TP_PROTO(unsigned did, uint32_t reg, uint32_t value),
TP_ARGS(did, reg, value),
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8da59d212b3b..7857fc581342 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include "amdgpu.h"
+#include "amdgpu_trace.h"
 #include "bif/bif_4_1_d.h"
 
 #define DRM_FILE_PAGE_OFFSET (0x1ULL >> PAGE_SHIFT)
@@ -667,7 +668,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
struct amdgpu_ttm_tt *gtt = (void *)ttm;
-   unsigned nents;
+   unsigned i, nents;
int r;
 
int write = !(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY);
@@ -688,6 +689,15 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
 gtt->ttm.dma_address, ttm->num_pages);
 
+   if (unlikely(trace_amdgpu_ttm_tt_populate_enabled())) {
+   for (i = 0; i < ttm->num_pages; i++) {
+   trace_amdgpu_ttm_tt_populate(
+   adev,
+   gtt->ttm.dma_address[i],
+   page_to_phys(ttm->pages[i]));
+   }
+   }
+
return 0;
 
 release_sg:
@@ -892,7 +902,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct 
ttm_bo_device *bdev,
 
 static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
 {
-   struct amdgpu_device *adev;
+   struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
struct amdgpu_ttm_tt *gtt = (void *)ttm;
unsigned i;
int r;
@@ -915,14 +925,14 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
 gtt->ttm.dma_address, 
ttm->num_pages);
ttm->state = tt_unbound;
-   return 0;
+   r = 0;
+   goto trace_mappings;
}
 
-   adev = amdgpu_ttm_adev(ttm->bdev);
-
 #ifdef CONFIG_SWIOTLB
if (swiotlb_nr_tbl()) {
-   return ttm_dma_populate(>ttm, adev->dev);
+   r = ttm_dma_populate(>ttm, adev->dev);
+   goto trace_mappings;
}
 #endif
 
@@ -945,7 +955,18 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
return -EFAULT;
}
}
-   return 0;
+
+   r = 0;
+trace_mappings:
+   if (!r && unlikely(trace_amdgpu_ttm_tt_populate_enabled())) {

Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Christian König

Am 01.08.2017 um 16:26 schrieb Tom St Denis:

On 01/08/17 10:10 AM, Christian König wrote:


You need to cover multiple code path here:
1. The one you currently implemented which uses ttm_dma_populate() 
and pci_map_page().

2. The one using ttm_dma_populate().


I'll have to look into this one though it's my understanding that code 
path is only followed if there's no (hardware) IOMMU right?


Wrong, that one is used when "anything" in the system used the SWIOTLB 
path once. So the detection doesn't always work reliable.


This one seems rather straight forward but the only catch is I don't 
have access to "adev" inside that drm function.


When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to 
get the adev.



  Would it be taboo to do something like

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 376078334f54..cd97ef2144c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt 
*ttm)

drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
gtt->ttm.dma_address, ttm->num_pages);
ttm->state = tt_unbound;
+   for (i = 0; i < ttm->num_pages; i++) {
+   trace_amdgpu_ttm_tt_populate(
+   adev,
+   gtt->ttm.dma_address[i],
+   page_to_phys(ttm->pages[i]));
+   }
return 0;
}

This would normally result in a for loop around a nop which shouldn't 
be a huge performance hit but is one none the less.


Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those functions 
are generated by the trace subsystem automatically when you define a 
trace point.


It just doesn't use those nice tricks to modify the compiled binary on 
the fly.


Christian.



4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for 
userptrs.


Basically all just take gtt->ttm.ttm.pages and fill 
gtt->ttm.dma_address.


Same comment as #3.  I can tackle this with a for loop around the 
trace if that is ok.


I suggest to add a helper you can call to trace all 
pages->dma_address mappings inside a ttm.


One thing at a time :-).  That would probably be a bit better since 
the trace log gets filled with remappings (which is why my 
proof-of-concept umr patch only grabs the latest mapping as it reads 
the trace).


Is there a handy place we can grab a list of ttm_tt's bound to our 
device?  If so then in theory I could write a debugfs interface instead.


Thanks for your patience as I'm definitely less familiar with the VM 
side of things :-)


Cheers,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Tom St Denis

On 01/08/17 10:10 AM, Christian König wrote:


You need to cover multiple code path here:
1. The one you currently implemented which uses ttm_dma_populate() and 
pci_map_page().

2. The one using ttm_dma_populate().


I'll have to look into this one though it's my understanding that code 
path is only followed if there's no (hardware) IOMMU right?  Which while 
it'd have to be covered isn't a high priority right now (our devel 
platforms have hardware IOMMU).


None the less I'll look into it once I figure out #3 and #4 per your 
comments.


3. The one using drm_prime_sg_to_page_addr_arrays() a bit above for 
imported BOs.


This one seems rather straight forward but the only catch is I don't 
have access to "adev" inside that drm function.  Would it be taboo to do 
something like


diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 376078334f54..cd97ef2144c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
 gtt->ttm.dma_address, 
ttm->num_pages);

ttm->state = tt_unbound;
+   for (i = 0; i < ttm->num_pages; i++) {
+   trace_amdgpu_ttm_tt_populate(
+   adev,
+   gtt->ttm.dma_address[i],
+   page_to_phys(ttm->pages[i]));
+   }
return 0;
}

This would normally result in a for loop around a nop which shouldn't be 
a huge performance hit but is one none the less.




4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for userptrs.

Basically all just take gtt->ttm.ttm.pages and fill gtt->ttm.dma_address.


Same comment as #3.  I can tackle this with a for loop around the trace 
if that is ok.


I suggest to add a helper you can call to trace all pages->dma_address 
mappings inside a ttm.


One thing at a time :-).  That would probably be a bit better since the 
trace log gets filled with remappings (which is why my proof-of-concept 
umr patch only grabs the latest mapping as it reads the trace).


Is there a handy place we can grab a list of ttm_tt's bound to our 
device?  If so then in theory I could write a debugfs interface instead.


Thanks for your patience as I'm definitely less familiar with the VM 
side of things :-)


Cheers,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH xf86-video-ati v2 1/3] Use root window (pixmap) instead of screen pixmap for scanout updates

2017-08-01 Thread Alex Deucher
On Tue, Aug 1, 2017 at 5:19 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> Preparation for following changes, no functional change intended yet.
>
> Reviewed-by: Alex Deucher  # v1
> Signed-off-by: Michel Dänzer 
> ---
>
> v2: Add drmmode_screen_damage_destroy to prevent use-after-free on
> server shutdown

Reviewed-by: Alex Deucher 

>
>  src/drmmode_display.c | 21 ++---
>  src/radeon_kms.c  |  2 +-
>  2 files changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/src/drmmode_display.c b/src/drmmode_display.c
> index 245a92fb0..309ccbd6d 100644
> --- a/src/drmmode_display.c
> +++ b/src/drmmode_display.c
> @@ -529,11 +529,8 @@ drmmode_crtc_scanout_free(drmmode_crtc_private_ptr 
> drmmode_crtc)
>  _crtc->scanout[1]);
> }
>
> -   if (drmmode_crtc->scanout_damage) {
> +   if (drmmode_crtc->scanout_damage)
> DamageDestroy(drmmode_crtc->scanout_damage);
> -   drmmode_crtc->scanout_damage = NULL;
> -   RegionUninit(_crtc->scanout_last_region);
> -   }
>  }
>
>  void
> @@ -605,6 +602,15 @@ radeon_screen_damage_report(DamagePtr damage, RegionPtr 
> region, void *closure)
> damage->damage.data = NULL;
>  }
>
> +static void
> +drmmode_screen_damage_destroy(DamagePtr damage, void *closure)
> +{
> +   drmmode_crtc_private_ptr drmmode_crtc = closure;
> +
> +   drmmode_crtc->scanout_damage = NULL;
> +   RegionUninit(_crtc->scanout_last_region);
> +}
> +
>  static Bool
>  drmmode_can_use_hw_cursor(xf86CrtcPtr crtc)
>  {
> @@ -793,9 +799,10 @@ drmmode_crtc_scanout_update(xf86CrtcPtr crtc, 
> DisplayModePtr mode,
> if (!drmmode_crtc->scanout_damage) {
> drmmode_crtc->scanout_damage =
> DamageCreate(radeon_screen_damage_report,
> -NULL, DamageReportRawRegion,
> -TRUE, screen, NULL);
> -   
> DamageRegister(>GetScreenPixmap(screen)->drawable,
> +drmmode_screen_damage_destroy,
> +DamageReportRawRegion,
> +TRUE, screen, drmmode_crtc);
> +   DamageRegister(>root->drawable,
>drmmode_crtc->scanout_damage);
> }
>
> diff --git a/src/radeon_kms.c b/src/radeon_kms.c
> index b22c98406..f76d76a91 100644
> --- a/src/radeon_kms.c
> +++ b/src/radeon_kms.c
> @@ -981,7 +981,7 @@ radeon_scanout_do_update(xf86CrtcPtr xf86_crtc, int 
> scanout_id)
> GCPtr gc = GetScratchGC(pDraw->depth, pScreen);
>
> ValidateGC(pDraw, gc);
> -   (*gc->ops->CopyArea)(>GetScreenPixmap(pScreen)->drawable,
> +   
> (*gc->ops->CopyArea)(>GetWindowPixmap(pScreen->root)->drawable,
>  pDraw, gc,
>  xf86_crtc->x + extents.x1, xf86_crtc->y + 
> extents.y1,
>  extents.x2 - extents.x1, extents.y2 - extents.y1,
> --
> 2.13.3
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH umr] Fix ib_addr_lo in BUFFER_CIK/BUFFER_CONST PM4 packets

2017-08-01 Thread Huang Rui
On Tue, Aug 01, 2017 at 06:00:42AM -0400, Tom St Denis wrote:
> We correctly grabbed the bits but did not shift them
> back to their correct place.
> 
> Signed-off-by: Tom St Denis 

Tested-by: Huang Rui 

> ---
>  src/lib/ring_decode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/lib/ring_decode.c b/src/lib/ring_decode.c
> index 772ea49dda6f..c05fa632a233 100644
> --- a/src/lib/ring_decode.c
> +++ b/src/lib/ring_decode.c
> @@ -407,7 +407,7 @@ static void print_decode_pm4_pkt3(struct umr_asic *asic, 
> struct umr_ring_decoder
>   case 0x33: // INDIRECT_BUFFER_CONST
>   switch (decoder->pm4.cur_word) {
>   case 0: printf("IB_BASE_LO: 0x%08lx, SWAP:%lu", 
> BITS(ib, 2, 32), BITS(ib, 0, 2));
> - decoder->pm4.next_ib_state.ib_addr_lo = 
> BITS(ib, 2, 32);
> + decoder->pm4.next_ib_state.ib_addr_lo = 
> BITS(ib, 2, 32) << 2;
>   break;
>   case 1: printf("IB_BASE_HI: 0x%08lx", BITS(ib, 
> 0, 16));
>   decoder->pm4.next_ib_state.ib_addr_hi = 
> BITS(ib, 0, 16);
> -- 
> 2.12.0
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Tom St Denis

On 01/08/17 07:55 AM, Christian König wrote:

Am 01.08.2017 um 13:51 schrieb Tom St Denis:

This helps map DMA addresses back to physical addresses.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 28 


  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  8 
  2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h

index 509f7a63d40c..5b2bb28da504 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -14,6 +14,34 @@
  #define AMDGPU_JOB_GET_TIMELINE_NAME(job) \
   
job->base.s_fence->finished.ops->get_timeline_name(>base.s_fence->finished) 


+TRACE_EVENT(amdgpu_ttm_tt_populate,
+TP_PROTO(uint16_t domain, uint8_t bus, uint8_t slot, uint8_t 
func, uint64_t dma_address, uint64_t phys_address),

+TP_ARGS(domain, bus, slot, func, dma_address, phys_address),
+TP_STRUCT__entry(
+__field(uint16_t, domain)
+__field(uint8_t, bus)
+__field(uint8_t, slot)
+__field(uint8_t, func)
+__field(uint64_t, dma)
+__field(uint64_t, phys)
+),


Better just give adev here and extract the values during the fast assign.


Easy enough, I've done this now.




+TP_fast_assign(
+   __entry->domain = domain;
+   __entry->bus = bus;
+   __entry->slot = slot;
+   __entry->func = func;
+   __entry->dma = dma_address;
+   __entry->phys = phys_address;
+   ),
+TP_printk("%04x:%02x:%02x.%x: 0x%llx => 0x%llx",
+  (unsigned)__entry->domain,
+  (unsigned)__entry->bus,
+  (unsigned)__entry->slot,
+  (unsigned)__entry->func,
+  (unsigned long long)__entry->dma,
+  (unsigned long long)__entry->phys)
+);
+
  TRACE_EVENT(amdgpu_mm_rreg,
  TP_PROTO(unsigned did, uint32_t reg, uint32_t value),
  TP_ARGS(did, reg, value),
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 8da59d212b3b..1cf274603476 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
  #include 
  #include 
  #include "amdgpu.h"
+#include "amdgpu_trace.h"
  #include "bif/bif_4_1_d.h"
  #define DRM_FILE_PAGE_OFFSET (0x1ULL >> PAGE_SHIFT)
@@ -944,6 +945,13 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt 
*ttm)

  ttm_pool_unpopulate(ttm);
  return -EFAULT;
  }
+trace_amdgpu_ttm_tt_populate(
+pci_domain_nr(adev->pdev->bus),
+adev->pdev->bus->number,
+PCI_SLOT(adev->pdev->devfn),
+PCI_FUNC(adev->pdev->devfn),
+gtt->ttm.dma_address[i],
+page_to_phys(ttm->pages[i]));


Please add that tracing for the dma pool path as well.

With that fixed the change looks good to me,
Christian.


Unsure what you mean here.  The ttm_pool_populate() seems to be 
preparing the page list to back the request.


Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] Add drm ioctl DRM_IOCTL_MODE_GETFB2 & associated helpers.

2017-08-01 Thread Laurent Pinchart
Hi Joe,

Thank you for the patch.

On Monday 31 Jul 2017 11:29:13 Joe Kniss wrote:
> New getfb2 functionality uses drm_mode_fb_cmd2 struct to be symmetric
> with addfb2.

What's the use case for this ? We haven't needed such an ioctl for so long 
that it seemed to me that userspace doesn't really need it, but I could be 
wrong.

> Also modifies *_fb_create_handle() calls to accept a
> format_plane_index so that handles for each plane can be generated.
> Previously, many *_fb_create_handle() calls simply defaulted to plane 0
> only.

And with this patch the amd/amdgpu, armada, gma500, i915, mediatek, msm, 
nouveau and radeon drivers still do. Do none of them support multi-planar 
formats ?

> Signed-off-by: Joe Kniss 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  5 +-
>  drivers/gpu/drm/armada/armada_fb.c  |  1 +
>  drivers/gpu/drm/drm_crtc_internal.h |  2 +
>  drivers/gpu/drm/drm_fb_cma_helper.c | 11 ++--
>  drivers/gpu/drm/drm_framebuffer.c   | 79 +-
>  drivers/gpu/drm/drm_ioctl.c |  1 +
>  drivers/gpu/drm/exynos/exynos_drm_fb.c  |  7 ++-
>  drivers/gpu/drm/gma500/framebuffer.c|  2 +
>  drivers/gpu/drm/i915/intel_display.c|  1 +
>  drivers/gpu/drm/mediatek/mtk_drm_fb.c   |  1 +
>  drivers/gpu/drm/msm/msm_fb.c|  5 +-
>  drivers/gpu/drm/nouveau/nouveau_display.c   |  1 +
>  drivers/gpu/drm/omapdrm/omap_fb.c   |  5 +-
>  drivers/gpu/drm/radeon/radeon_display.c |  5 +-
>  drivers/gpu/drm/rockchip/rockchip_drm_fb.c  |  6 ++-
>  drivers/gpu/drm/tegra/fb.c  |  9 +++-
>  include/drm/drm_framebuffer.h   |  1 +
>  include/uapi/drm/drm.h  |  2 +
>  18 files changed, 127 insertions(+), 17 deletions(-)

[snip]

> diff --git a/drivers/gpu/drm/drm_framebuffer.c
> b/drivers/gpu/drm/drm_framebuffer.c index 28a0108a1ab8..67b3be1bedbc 100644
> --- a/drivers/gpu/drm/drm_framebuffer.c
> +++ b/drivers/gpu/drm/drm_framebuffer.c
> @@ -24,6 +24,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "drm_crtc_internal.h"

[snip]

> +/**
> + * drm_mode_getfb2 - get FB info
> + * @dev: drm device for the ioctl
> + * @data: data pointer for the ioctl
> + * @file_priv: drm file for the ioctl call
> + *
> + * Lookup the FB given its ID and return info about it.
> + *
> + * Called by the user via ioctl.
> + *
> + * Returns:
> + * Zero on success, negative errno on failure.
> + */
> +int drm_mode_getfb2(struct drm_device *dev,
> +void *data, struct drm_file *file_priv)
> +{
> + struct drm_mode_fb_cmd2 *r = data;
> + struct drm_framebuffer *fb;
> + int ret, i;
> +
> + if (!drm_core_check_feature(dev, DRIVER_MODESET))
> + return -EINVAL;
> +
> + fb = drm_framebuffer_lookup(dev, r->fb_id);
> + if (!fb)
> + return -ENOENT;
> +
> + r->height = fb->height;
> + r->width = fb->width;
> + r->pixel_format = fb->format->format;
> + for (i = 0; i < 4; ++i) {
> + r->pitches[i] = fb->pitches[i];
> + r->offsets[i] = fb->offsets[i];
> + r->modifier[i] = fb->modifier;
> + r->handles[i] = 0;
> + }
> +
> + for (i = 0; i < fb->format->num_planes; ++i) {
> + if (fb->funcs->create_handle) {
> + if (drm_is_current_master(file_priv) ||
> + capable(CAP_SYS_ADMIN) ||
> + drm_is_control_client(file_priv)) {
> + ret = fb->funcs->create_handle(fb, i,
> file_priv,
> +
> >handles[i]);
> + if (ret)
> + break;
> + } else {
> + /* GET_FB() is an unprivileged ioctl so we
> must
> +  * not return a buffer-handle to non-master
> +  * processes! For backwards-compatibility
> +  * reasons, we cannot make GET_FB()
> privileged,
> +  * so just return an invalid handle for
> +  * non-masters. */

There's no backward compatibility to handle here, just make it privileged if 
it has to be.

> + r->handles[i] = 0;
> + ret = 0;
> + }
> + } else {
> + ret = -ENODEV;
> + break;
> + }
> + }
> +
> + /* If handle creation failed, delete/dereference any that were made. 
*/
> + if (ret) {
> + for (i = 0; i < 4; ++i) {
> + if (r->handles[i])
> + drm_gem_handle_delete(file_priv, r-
>handles[i]);

My initial reaction to this was to reply that not all drivers use GEM, but 
after some investigation it seems they actually do. If 

Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Christian König

Am 01.08.2017 um 13:51 schrieb Tom St Denis:

This helps map DMA addresses back to physical addresses.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 28 
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  8 
  2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 509f7a63d40c..5b2bb28da504 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -14,6 +14,34 @@
  #define AMDGPU_JOB_GET_TIMELINE_NAME(job) \
 
job->base.s_fence->finished.ops->get_timeline_name(>base.s_fence->finished)
  
+TRACE_EVENT(amdgpu_ttm_tt_populate,

+   TP_PROTO(uint16_t domain, uint8_t bus, uint8_t slot, uint8_t func, 
uint64_t dma_address, uint64_t phys_address),
+   TP_ARGS(domain, bus, slot, func, dma_address, phys_address),
+   TP_STRUCT__entry(
+   __field(uint16_t, domain)
+   __field(uint8_t, bus)
+   __field(uint8_t, slot)
+   __field(uint8_t, func)
+   __field(uint64_t, dma)
+   __field(uint64_t, phys)
+   ),


Better just give adev here and extract the values during the fast assign.


+   TP_fast_assign(
+  __entry->domain = domain;
+  __entry->bus = bus;
+  __entry->slot = slot;
+  __entry->func = func;
+  __entry->dma = dma_address;
+  __entry->phys = phys_address;
+  ),
+   TP_printk("%04x:%02x:%02x.%x: 0x%llx => 0x%llx",
+ (unsigned)__entry->domain,
+ (unsigned)__entry->bus,
+ (unsigned)__entry->slot,
+ (unsigned)__entry->func,
+ (unsigned long long)__entry->dma,
+ (unsigned long long)__entry->phys)
+);
+
  TRACE_EVENT(amdgpu_mm_rreg,
TP_PROTO(unsigned did, uint32_t reg, uint32_t value),
TP_ARGS(did, reg, value),
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8da59d212b3b..1cf274603476 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
  #include 
  #include 
  #include "amdgpu.h"
+#include "amdgpu_trace.h"
  #include "bif/bif_4_1_d.h"
  
  #define DRM_FILE_PAGE_OFFSET (0x1ULL >> PAGE_SHIFT)

@@ -944,6 +945,13 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
ttm_pool_unpopulate(ttm);
return -EFAULT;
}
+   trace_amdgpu_ttm_tt_populate(
+   pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number,
+   PCI_SLOT(adev->pdev->devfn),
+   PCI_FUNC(adev->pdev->devfn),
+   gtt->ttm.dma_address[i],
+   page_to_phys(ttm->pages[i]));


Please add that tracing for the dma pool path as well.

With that fixed the change looks good to me,
Christian.


}
return 0;
  }



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

2017-08-01 Thread Tom St Denis
This helps map DMA addresses back to physical addresses.

Signed-off-by: Tom St Denis 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 28 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  8 
 2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 509f7a63d40c..5b2bb28da504 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -14,6 +14,34 @@
 #define AMDGPU_JOB_GET_TIMELINE_NAME(job) \
 
job->base.s_fence->finished.ops->get_timeline_name(>base.s_fence->finished)
 
+TRACE_EVENT(amdgpu_ttm_tt_populate,
+   TP_PROTO(uint16_t domain, uint8_t bus, uint8_t slot, uint8_t func, 
uint64_t dma_address, uint64_t phys_address),
+   TP_ARGS(domain, bus, slot, func, dma_address, phys_address),
+   TP_STRUCT__entry(
+   __field(uint16_t, domain)
+   __field(uint8_t, bus)
+   __field(uint8_t, slot)
+   __field(uint8_t, func)
+   __field(uint64_t, dma)
+   __field(uint64_t, phys)
+   ),
+   TP_fast_assign(
+  __entry->domain = domain;
+  __entry->bus = bus;
+  __entry->slot = slot;
+  __entry->func = func;
+  __entry->dma = dma_address;
+  __entry->phys = phys_address;
+  ),
+   TP_printk("%04x:%02x:%02x.%x: 0x%llx => 0x%llx",
+ (unsigned)__entry->domain,
+ (unsigned)__entry->bus,
+ (unsigned)__entry->slot,
+ (unsigned)__entry->func,
+ (unsigned long long)__entry->dma,
+ (unsigned long long)__entry->phys)
+);
+
 TRACE_EVENT(amdgpu_mm_rreg,
TP_PROTO(unsigned did, uint32_t reg, uint32_t value),
TP_ARGS(did, reg, value),
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8da59d212b3b..1cf274603476 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include "amdgpu.h"
+#include "amdgpu_trace.h"
 #include "bif/bif_4_1_d.h"
 
 #define DRM_FILE_PAGE_OFFSET (0x1ULL >> PAGE_SHIFT)
@@ -944,6 +945,13 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
ttm_pool_unpopulate(ttm);
return -EFAULT;
}
+   trace_amdgpu_ttm_tt_populate(
+   pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number,
+   PCI_SLOT(adev->pdev->devfn),
+   PCI_FUNC(adev->pdev->devfn),
+   gtt->ttm.dma_address[i],
+   page_to_phys(ttm->pages[i]));
}
return 0;
 }
-- 
2.12.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH umr] Fix ib_addr_lo in BUFFER_CIK/BUFFER_CONST PM4 packets

2017-08-01 Thread Tom St Denis
We correctly grabbed the bits but did not shift them
back to their correct place.

Signed-off-by: Tom St Denis 
---
 src/lib/ring_decode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/lib/ring_decode.c b/src/lib/ring_decode.c
index 772ea49dda6f..c05fa632a233 100644
--- a/src/lib/ring_decode.c
+++ b/src/lib/ring_decode.c
@@ -407,7 +407,7 @@ static void print_decode_pm4_pkt3(struct umr_asic *asic, 
struct umr_ring_decoder
case 0x33: // INDIRECT_BUFFER_CONST
switch (decoder->pm4.cur_word) {
case 0: printf("IB_BASE_LO: 0x%08lx, SWAP:%lu", 
BITS(ib, 2, 32), BITS(ib, 0, 2));
-   decoder->pm4.next_ib_state.ib_addr_lo = 
BITS(ib, 2, 32);
+   decoder->pm4.next_ib_state.ib_addr_lo = 
BITS(ib, 2, 32) << 2;
break;
case 1: printf("IB_BASE_HI: 0x%08lx", BITS(ib, 
0, 16));
decoder->pm4.next_ib_state.ib_addr_hi = 
BITS(ib, 0, 16);
-- 
2.12.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH xf86-video-ati v2 1/3] Use root window (pixmap) instead of screen pixmap for scanout updates

2017-08-01 Thread Michel Dänzer
From: Michel Dänzer 

Preparation for following changes, no functional change intended yet.

Reviewed-by: Alex Deucher  # v1
Signed-off-by: Michel Dänzer 
---

v2: Add drmmode_screen_damage_destroy to prevent use-after-free on
server shutdown

 src/drmmode_display.c | 21 ++---
 src/radeon_kms.c  |  2 +-
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index 245a92fb0..309ccbd6d 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -529,11 +529,8 @@ drmmode_crtc_scanout_free(drmmode_crtc_private_ptr 
drmmode_crtc)
 _crtc->scanout[1]);
}
 
-   if (drmmode_crtc->scanout_damage) {
+   if (drmmode_crtc->scanout_damage)
DamageDestroy(drmmode_crtc->scanout_damage);
-   drmmode_crtc->scanout_damage = NULL;
-   RegionUninit(_crtc->scanout_last_region);
-   }
 }
 
 void
@@ -605,6 +602,15 @@ radeon_screen_damage_report(DamagePtr damage, RegionPtr 
region, void *closure)
damage->damage.data = NULL;
 }
 
+static void
+drmmode_screen_damage_destroy(DamagePtr damage, void *closure)
+{
+   drmmode_crtc_private_ptr drmmode_crtc = closure;
+
+   drmmode_crtc->scanout_damage = NULL;
+   RegionUninit(_crtc->scanout_last_region);
+}
+
 static Bool
 drmmode_can_use_hw_cursor(xf86CrtcPtr crtc)
 {
@@ -793,9 +799,10 @@ drmmode_crtc_scanout_update(xf86CrtcPtr crtc, 
DisplayModePtr mode,
if (!drmmode_crtc->scanout_damage) {
drmmode_crtc->scanout_damage =
DamageCreate(radeon_screen_damage_report,
-NULL, DamageReportRawRegion,
-TRUE, screen, NULL);
-   
DamageRegister(>GetScreenPixmap(screen)->drawable,
+drmmode_screen_damage_destroy,
+DamageReportRawRegion,
+TRUE, screen, drmmode_crtc);
+   DamageRegister(>root->drawable,
   drmmode_crtc->scanout_damage);
}
 
diff --git a/src/radeon_kms.c b/src/radeon_kms.c
index b22c98406..f76d76a91 100644
--- a/src/radeon_kms.c
+++ b/src/radeon_kms.c
@@ -981,7 +981,7 @@ radeon_scanout_do_update(xf86CrtcPtr xf86_crtc, int 
scanout_id)
GCPtr gc = GetScratchGC(pDraw->depth, pScreen);
 
ValidateGC(pDraw, gc);
-   (*gc->ops->CopyArea)(>GetScreenPixmap(pScreen)->drawable,
+   (*gc->ops->CopyArea)(>GetWindowPixmap(pScreen->root)->drawable,
 pDraw, gc,
 xf86_crtc->x + extents.x1, xf86_crtc->y + 
extents.y1,
 extents.x2 - extents.x1, extents.y2 - extents.y1,
-- 
2.13.3

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH xf86-video-ati v2 3/3] Allow DRI page flipping when some CRTCs use separate scanout buffers

2017-08-01 Thread Michel Dänzer
From: Michel Dänzer 

As long as the CRTC we're synchronizing to doesn't.

Reviewed-by: Alex Deucher 
Signed-off-by: Michel Dänzer 
---

v2: Remove redundant checks from can_exchange which still prevented DRI2
page flipping

 man/radeon.man|  2 +-
 src/drmmode_display.c |  2 +-
 src/drmmode_display.h |  4 +++-
 src/radeon_dri2.c | 30 +++---
 src/radeon_present.c  | 12 
 5 files changed, 16 insertions(+), 34 deletions(-)

diff --git a/man/radeon.man b/man/radeon.man
index 3e1723f21..1e9a7bebf 100644
--- a/man/radeon.man
+++ b/man/radeon.man
@@ -285,7 +285,7 @@ Set the default value of the per-output 'TearFree' 
property, which controls
 tearing prevention using the hardware page flipping mechanism. TearFree is
 on for any CRTC associated with one or more outputs with TearFree on. Two
 separate scanout buffers need to be allocated for each CRTC with TearFree
-on. While TearFree is on for any CRTC, it currently prevents clients from using
+on. While TearFree is on for any CRTC, it may prevent clients from using
 DRI page flipping. If this option is set, the default value of the property
 is 'on' or 'off' accordingly. If this option isn't set, the default value of 
the
 property is
diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index 6be6513af..d8031875b 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -3007,7 +3007,7 @@ Bool radeon_do_pageflip(ScrnInfoPtr scrn, ClientPtr 
client,
for (i = 0; i < config->num_crtc; i++) {
crtc = config->crtc[i];
 
-   if (!crtc->enabled)
+   if (!drmmode_crtc_can_flip(crtc))
continue;
 
flipdata->flip_count++;
diff --git a/src/drmmode_display.h b/src/drmmode_display.h
index 4378be86f..f859377c2 100644
--- a/src/drmmode_display.h
+++ b/src/drmmode_display.h
@@ -146,7 +146,9 @@ drmmode_crtc_can_flip(xf86CrtcPtr crtc)
 drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private;
 
 return crtc->enabled &&
-   drmmode_crtc->pending_dpms_mode == DPMSModeOn;
+   drmmode_crtc->pending_dpms_mode == DPMSModeOn &&
+   !drmmode_crtc->rotate.bo &&
+   !drmmode_crtc->scanout[drmmode_crtc->scanout_id].bo;
 }
 
 
diff --git a/src/radeon_dri2.c b/src/radeon_dri2.c
index 70751b0bf..4b059897e 100644
--- a/src/radeon_dri2.c
+++ b/src/radeon_dri2.c
@@ -720,18 +720,6 @@ can_exchange(ScrnInfoPtr pScrn, DrawablePtr draw,
 struct dri2_buffer_priv *back_priv = back->driverPrivate;
 PixmapPtr front_pixmap;
 PixmapPtr back_pixmap = back_priv->pixmap;
-xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(pScrn);
-int i;
-
-for (i = 0; i < xf86_config->num_crtc; i++) {
-   xf86CrtcPtr crtc = xf86_config->crtc[i];
-   drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private;
-
-   if (crtc->enabled &&
-   (crtc->rotatedData ||
-drmmode_crtc->scanout[drmmode_crtc->scanout_id].bo))
-   return FALSE;
-}
 
 if (!update_front(draw, front))
return FALSE;
@@ -754,9 +742,10 @@ can_exchange(ScrnInfoPtr pScrn, DrawablePtr draw,
 }
 
 static Bool
-can_flip(ScrnInfoPtr pScrn, DrawablePtr draw,
+can_flip(xf86CrtcPtr crtc, DrawablePtr draw,
 DRI2BufferPtr front, DRI2BufferPtr back)
 {
+ScrnInfoPtr pScrn = crtc->scrn;
 RADEONInfoPtr info = RADEONPTR(pScrn);
 xf86CrtcConfigPtr config = XF86_CRTC_CONFIG_PTR(pScrn);
 int num_crtcs_on;
@@ -771,15 +760,10 @@ can_flip(ScrnInfoPtr pScrn, DrawablePtr draw,
return FALSE;
 
 for (i = 0, num_crtcs_on = 0; i < config->num_crtc; i++) {
-   xf86CrtcPtr crtc = config->crtc[i];
-   drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private;
-
-   if (!drmmode_crtc || drmmode_crtc->rotate.bo ||
-   drmmode_crtc->scanout[drmmode_crtc->scanout_id].bo)
-   return FALSE;
-
-   if (drmmode_crtc_can_flip(crtc))
+   if (drmmode_crtc_can_flip(config->crtc[i]))
num_crtcs_on++;
+   else if (config->crtc[i] == crtc)
+   return FALSE;
 }
 
 return num_crtcs_on > 0 && can_exchange(pScrn, draw, front, back);
@@ -859,7 +843,7 @@ static void radeon_dri2_frame_event_handler(xf86CrtcPtr 
crtc, uint32_t seq,
 
 switch (event->type) {
 case DRI2_FLIP:
-   if (can_flip(scrn, drawable, event->front, event->back) &&
+   if (can_flip(crtc, drawable, event->front, event->back) &&
radeon_dri2_schedule_flip(crtc,
  event->client,
  drawable,
@@ -1352,7 +1336,7 @@ static int radeon_dri2_schedule_swap(ClientPtr client, 
DrawablePtr draw,
 current_msc &= 0x;
 
 /* Flips need to be submitted one frame before */
-if (can_flip(scrn, draw, front, back)) {
+if (can_flip(crtc, draw, front, back)) {
swap_info->type = DRI2_FLIP;
flip =