date:20190506

[PATCH AUTOSEL 4.19 28/81] drm/amd/display: If one stream full updates, full update all planes

2019-05-06 Thread Sasha Levin

From: David Francis 

[ Upstream commit c238bfe0be9ef7420f7669a69e27c8c8f4d8a568 ]

[Why]
On some compositors, with two monitors attached, VT terminal
switch can cause a graphical issue by the following means:

There are two streams, one for each monitor. Each stream has one
plane

current state:
M1:S1->P1
M2:S2->P2

The user calls for a terminal switch and a commit is made to
change both planes to linear swizzle mode. In atomic check,
a new dc_state is constructed with new planes on each stream

new state:
M1:S1->P3
M2:S2->P4

In commit tail, each stream is committed, one at a time. The first
stream (S1) updates properly, triggering a full update and replacing
the state

current state:
M1:S1->P3
M2:S2->P4

The update for S2 comes in, but dc detects that there is no difference
between the stream and plane in the new and current states, and so
triggers a fast update. The fast update does not program swizzle,
so the second monitor is corrupted

[How]
Add a flag to dc_plane_state that forces full updates

When a stream undergoes a full update, set this flag on all changed
planes, then clear it on the current stream

Subsequent streams will get full updates as a result
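
The flag handling described under [How] can be sketched in isolation. This is a minimal userspace model, not the driver code: the struct plane and struct pipe types, PIPE_COUNT, and the detect_update() and commit_stream() helpers are hypothetical stand-ins for dc_plane_state, pipe_ctx and dc_commit_updates_for_stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the dc structures; not the real driver types. */
enum update_type { UPDATE_FAST, UPDATE_FULL };

struct plane {
    bool force_full_update;
};

struct pipe {
    int stream_id;          /* which stream this pipe belongs to */
    struct plane *plane;    /* plane currently attached, may be NULL */
};

#define PIPE_COUNT 2

/* Decide the update type for one stream by comparing new and current pipes. */
static enum update_type detect_update(const struct pipe *cur,
                                      const struct pipe *new_pipes,
                                      int stream_id)
{
    for (size_t i = 0; i < PIPE_COUNT; i++) {
        if (new_pipes[i].stream_id != stream_id || !new_pipes[i].plane)
            continue;
        if (new_pipes[i].plane->force_full_update)
            return UPDATE_FULL;     /* the flag overrides the comparison */
        if (new_pipes[i].plane != cur[i].plane)
            return UPDATE_FULL;
    }
    return UPDATE_FAST;
}

/* Commit one stream; on a full update the whole new state replaces the current one. */
static enum update_type commit_stream(struct pipe *cur,
                                      const struct pipe *new_pipes,
                                      int stream_id)
{
    enum update_type type = detect_update(cur, new_pipes, stream_id);

    if (type == UPDATE_FULL) {
        /* Mark every plane that differs from the current state... */
        for (size_t i = 0; i < PIPE_COUNT; i++)
            if (new_pipes[i].plane && new_pipes[i].plane != cur[i].plane)
                new_pipes[i].plane->force_full_update = true;
        /* ...then swap in the new state, as the full update does. */
        for (size_t i = 0; i < PIPE_COUNT; i++)
            cur[i] = new_pipes[i];
        /* Planes on the stream just committed no longer need forcing. */
        for (size_t i = 0; i < PIPE_COUNT; i++)
            if (cur[i].plane && cur[i].stream_id == stream_id)
                cur[i].plane->force_full_update = false;
    }
    return type;
}
```

With current state S1->P1, S2->P2 and new state S1->P3, S2->P4, committing S1 and then S2 yields a full update both times in this model, because P4 still carries the flag when S2 is committed.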

Signed-off-by: David Francis 
Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Roman Li 
Acked-by: Bhawanpreet Lakha 
Acked-by: Nicholas Kazlauskas 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 19 +++
 drivers/gpu/drm/amd/display/dc/dc.h  |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index bb0cda727605..e3f5e5d6f0c1 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1213,6 +1213,11 @@ static enum surface_update_type det_surface_update(const 
struct dc *dc,
return UPDATE_TYPE_FULL;
}
 
+   if (u->surface->force_full_update) {
+   update_flags->bits.full_update = 1;
+   return UPDATE_TYPE_FULL;
+   }
+
type = get_plane_info_update_type(u);
elevate_update_type(&overall_type, type);
 
@@ -1467,6 +1472,14 @@ void dc_commit_updates_for_stream(struct dc *dc,
}
 
dc_resource_state_copy_construct(state, context);
+
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   struct pipe_ctx *new_pipe = &context->res_ctx.pipe_ctx[i];
+   struct pipe_ctx *old_pipe = &dc->current_state->res_ctx.pipe_ctx[i];
+
+   if (new_pipe->plane_state && new_pipe->plane_state != old_pipe->plane_state)
+   new_pipe->plane_state->force_full_update = true;
+   }
}
 
 
@@ -1510,6 +1523,12 @@ void dc_commit_updates_for_stream(struct dc *dc,
dc->current_state = context;
dc_release_state(old);
 
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
+
+   if (pipe_ctx->plane_state && pipe_ctx->stream == stream)
+   pipe_ctx->plane_state->force_full_update = false;
+   }
}
/*let's use current_state to update watermark etc*/
if (update_type >= UPDATE_TYPE_FULL)
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 6c9990bef267..4094b4f50111 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -505,6 +505,9 @@ struct dc_plane_state {
struct dc_plane_status status;
struct dc_context *ctx;
 
+   /* HACK: Workaround for forcing full reprogramming under some conditions */
+   bool force_full_update;
+
/* private to dc_surface.c */
enum dc_irq_source irq_source;
struct kref refcount;
-- 
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH AUTOSEL 4.19 20/81] drm/amd/display: extending AUX SW Timeout

2019-05-06 Thread Sasha Levin

From: Martin Leung 

[ Upstream commit f4bbebf8e7eb4d294b040ab2d2ba71e70e69b930 ]

[Why]
AUX takes longer to reply when using an active DP-DVI dongle on some ASICs,
resulting in EDID reads of 2000+ us (timeout).

[How]
1. Adjust AUX poll to match spec
2. Extend the SW timeout. This does not affect normal
operation since we exit the loop as soon as AUX acks.
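
Why the longer timeout does not slow down normal operation can be sketched with a small polling model. This is only an illustration under stated assumptions: the 400 usec period is inferred from the patch comment (4 * period = 1600 usec), while struct fake_aux, POLL_INTERVAL_US and aux_wait_for_ack() are hypothetical names that do not exist in the driver.

```c
#include <assert.h>

/* Values taken from the patch's comment: 6 * 400 usec = 2400 usec. */
enum {
    AUX_TIMEOUT_PERIOD_US = 400,            /* implied by 4 * period = 1600 usec */
    SW_AUX_TIMEOUT_PERIOD_MULTIPLIER = 6,
    POLL_INTERVAL_US = 10                   /* hypothetical poll step for this sketch */
};

/* Hypothetical stand-in for the sink: it acks once `ack_at_us` has elapsed. */
struct fake_aux {
    int ack_at_us;
};

/*
 * Poll until the sink acks or the software timeout expires.
 * Returns the elapsed time in usec, or -1 on timeout.
 */
static int aux_wait_for_ack(const struct fake_aux *aux)
{
    const int timeout_us = SW_AUX_TIMEOUT_PERIOD_MULTIPLIER * AUX_TIMEOUT_PERIOD_US;

    for (int elapsed = 0; elapsed <= timeout_us; elapsed += POLL_INTERVAL_US) {
        if (elapsed >= aux->ack_at_us)
            return elapsed;     /* leave the loop as soon as the ack arrives */
    }
    return -1;
}
```

In this model a sink that acks after 240 usec still returns after 240 usec, while a slow converter that needs 2000 usec now succeeds instead of hitting the old 1600 usec limit.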

Signed-off-by: Martin Leung 
Reviewed-by: Jun Lei 
Acked-by: Joshua Aberback 
Acked-by: Leo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c | 9 ++---
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.h | 6 +++---
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
index 3f5b2e6f7553..df936edac5c7 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
@@ -189,6 +189,12 @@ static void submit_channel_request(
1,
0);
}
+
+   REG_UPDATE(AUX_INTERRUPT_CONTROL, AUX_SW_DONE_ACK, 1);
+
+   REG_WAIT(AUX_SW_STATUS, AUX_SW_DONE, 0,
+   10, aux110->timeout_period/10);
+
/* set the delay and the number of bytes to write */
 
/* The length include
@@ -241,9 +247,6 @@ static void submit_channel_request(
}
}
 
-   REG_UPDATE(AUX_INTERRUPT_CONTROL, AUX_SW_DONE_ACK, 1);
-   REG_WAIT(AUX_SW_STATUS, AUX_SW_DONE, 0,
-   10, aux110->timeout_period/10);
REG_UPDATE(AUX_SW_CONTROL, AUX_SW_GO, 1);
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h 
b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
index f7caab85dc80..2c6f50b4245a 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
@@ -69,11 +69,11 @@ enum {  /* This is the timeout as defined in DP 1.2a,
 * at most within ~240usec. That means,
 * increasing this timeout will not affect normal operation,
 * and we'll timeout after
-* SW_AUX_TIMEOUT_PERIOD_MULTIPLIER * AUX_TIMEOUT_PERIOD = 1600usec.
+* SW_AUX_TIMEOUT_PERIOD_MULTIPLIER * AUX_TIMEOUT_PERIOD = 2400usec.
 * This timeout is especially important for
-* resume from S3 and CTS.
+* converters, resume from S3, and CTS.
 */
-   SW_AUX_TIMEOUT_PERIOD_MULTIPLIER = 4
+   SW_AUX_TIMEOUT_PERIOD_MULTIPLIER = 6
 };
 struct aux_engine_dce110 {
struct aux_engine base;
-- 
2.20.1

[PATCH AUTOSEL 5.0 23/99] drm/amd/display: extending AUX SW Timeout

2019-05-06 Thread Sasha Levin

From: Martin Leung 

[ Upstream commit f4bbebf8e7eb4d294b040ab2d2ba71e70e69b930 ]

[Why]
AUX takes longer to reply when using an active DP-DVI dongle on some ASICs,
resulting in EDID reads of 2000+ us (timeout).

[How]
1. Adjust AUX poll to match spec
2. Extend the SW timeout. This does not affect normal
operation since we exit the loop as soon as AUX acks.

Signed-off-by: Martin Leung 
Reviewed-by: Jun Lei 
Acked-by: Joshua Aberback 
Acked-by: Leo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c | 9 ++---
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.h | 6 +++---
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
index aaeb7faac0c4..e0fff5744b5f 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
@@ -189,6 +189,12 @@ static void submit_channel_request(
1,
0);
}
+
+   REG_UPDATE(AUX_INTERRUPT_CONTROL, AUX_SW_DONE_ACK, 1);
+
+   REG_WAIT(AUX_SW_STATUS, AUX_SW_DONE, 0,
+   10, aux110->timeout_period/10);
+
/* set the delay and the number of bytes to write */
 
/* The length include
@@ -241,9 +247,6 @@ static void submit_channel_request(
}
}
 
-   REG_UPDATE(AUX_INTERRUPT_CONTROL, AUX_SW_DONE_ACK, 1);
-   REG_WAIT(AUX_SW_STATUS, AUX_SW_DONE, 0,
-   10, aux110->timeout_period/10);
REG_UPDATE(AUX_SW_CONTROL, AUX_SW_GO, 1);
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h 
b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
index f7caab85dc80..2c6f50b4245a 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
@@ -69,11 +69,11 @@ enum {  /* This is the timeout as defined in DP 1.2a,
 * at most within ~240usec. That means,
 * increasing this timeout will not affect normal operation,
 * and we'll timeout after
-* SW_AUX_TIMEOUT_PERIOD_MULTIPLIER * AUX_TIMEOUT_PERIOD = 1600usec.
+* SW_AUX_TIMEOUT_PERIOD_MULTIPLIER * AUX_TIMEOUT_PERIOD = 2400usec.
 * This timeout is especially important for
-* resume from S3 and CTS.
+* converters, resume from S3, and CTS.
 */
-   SW_AUX_TIMEOUT_PERIOD_MULTIPLIER = 4
+   SW_AUX_TIMEOUT_PERIOD_MULTIPLIER = 6
 };
 struct aux_engine_dce110 {
struct aux_engine base;
-- 
2.20.1

[PATCH AUTOSEL 5.0 27/99] drm/amdgpu: shadow in shadow_list without tbo.mem.start cause page fault in sriov TDR

2019-05-06 Thread Sasha Levin

From: wentalou 

[ Upstream commit b575f10dbd6f84c2c8744ff1f486bfae1e4f6f38 ]

The shadow is added to shadow_list by amdgpu_bo_create_shadow while
shadow->tbo.mem is not yet fully configured. tbo.mem is only fully
configured by amdgpu_vm_sdma_map_table, which does not run until
amdgpu_vm_clear_bo is called.
If an SR-IOV TDR occurs between amdgpu_bo_create_shadow and
amdgpu_vm_sdma_map_table, amdgpu_device_recover_vram handles a shadow
without a valid tbo.mem.start, which causes a page fault.
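
The effect of the added check can be sketched as a filter over the shadow list. This is a simplified model under assumptions: struct shadow_bo, enum mem_type, INVALID_OFFSET and count_recoverable() are hypothetical names for illustration and are not the driver's definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical placement types and sentinel; the real driver defines its own. */
enum mem_type { MEM_SYSTEM, MEM_TT, MEM_VRAM };
#define INVALID_OFFSET LONG_MAX

struct shadow_bo {
    enum mem_type mem_type;         /* where the shadow currently lives */
    long start;                     /* INVALID_OFFSET until it is mapped */
    enum mem_type parent_mem_type;  /* where the parent BO lives */
};

/* Count the shadows that are safe to restore, skipping unmapped ones. */
static size_t count_recoverable(const struct shadow_bo *list, size_t n)
{
    size_t recoverable = 0;

    for (size_t i = 0; i < n; i++) {
        if (list[i].mem_type != MEM_TT ||
            list[i].start == INVALID_OFFSET ||
            list[i].parent_mem_type != MEM_VRAM)
            continue;   /* evicted or not yet fully set up */
        recoverable++;
    }
    return recoverable;
}
```

A shadow that sits in the list but has no start offset yet is simply skipped, so recovery never touches an address that was never assigned.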

Signed-off-by: Wentao Lou 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7ff3a28fc903..5336b2c9b615 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3150,6 +3150,7 @@ static int amdgpu_device_recover_vram(struct 
amdgpu_device *adev)
 
/* No need to recover an evicted BO */
if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
+   shadow->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET ||
shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
continue;
 
-- 
2.20.1

[PATCH AUTOSEL 5.0 38/99] drm/amd/display: If one stream full updates, full update all planes

2019-05-06 Thread Sasha Levin

From: David Francis 

[ Upstream commit c238bfe0be9ef7420f7669a69e27c8c8f4d8a568 ]

[Why]
On some compositors, with two monitors attached, VT terminal
switch can cause a graphical issue by the following means:

There are two streams, one for each monitor. Each stream has one
plane

current state:
M1:S1->P1
M2:S2->P2

The user calls for a terminal switch and a commit is made to
change both planes to linear swizzle mode. In atomic check,
a new dc_state is constructed with new planes on each stream

new state:
M1:S1->P3
M2:S2->P4

In commit tail, each stream is committed, one at a time. The first
stream (S1) updates properly, triggering a full update and replacing
the state

current state:
M1:S1->P3
M2:S2->P4

The update for S2 comes in, but dc detects that there is no difference
between the stream and plane in the new and current states, and so
triggers a fast update. The fast update does not program swizzle,
so the second monitor is corrupted

[How]
Add a flag to dc_plane_state that forces full updates

When a stream undergoes a full update, set this flag on all changed
planes, then clear it on the current stream

Subsequent streams will get full updates as a result

Signed-off-by: David Francis 
Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Roman Li 
Acked-by: Bhawanpreet Lakha 
Acked-by: Nicholas Kazlauskas 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 19 +++
 drivers/gpu/drm/amd/display/dc/dc.h  |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 1f92e7e8e3d3..5af2ea1f201d 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1308,6 +1308,11 @@ static enum surface_update_type det_surface_update(const 
struct dc *dc,
return UPDATE_TYPE_FULL;
}
 
+   if (u->surface->force_full_update) {
+   update_flags->bits.full_update = 1;
+   return UPDATE_TYPE_FULL;
+   }
+
type = get_plane_info_update_type(u);
elevate_update_type(&overall_type, type);
 
@@ -1637,6 +1642,14 @@ void dc_commit_updates_for_stream(struct dc *dc,
}
 
dc_resource_state_copy_construct(state, context);
+
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   struct pipe_ctx *new_pipe = &context->res_ctx.pipe_ctx[i];
+   struct pipe_ctx *old_pipe = &dc->current_state->res_ctx.pipe_ctx[i];
+
+   if (new_pipe->plane_state && new_pipe->plane_state != old_pipe->plane_state)
+   new_pipe->plane_state->force_full_update = true;
+   }
}
 
 
@@ -1680,6 +1693,12 @@ void dc_commit_updates_for_stream(struct dc *dc,
dc->current_state = context;
dc_release_state(old);
 
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
+
+   if (pipe_ctx->plane_state && pipe_ctx->stream == stream)
+   pipe_ctx->plane_state->force_full_update = false;
+   }
}
/*let's use current_state to update watermark etc*/
if (update_type >= UPDATE_TYPE_FULL)
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 4b5bbb13ce7f..7d5656d7e460 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -496,6 +496,9 @@ struct dc_plane_state {
struct dc_plane_status status;
struct dc_context *ctx;
 
+   /* HACK: Workaround for forcing full reprogramming under some conditions */
+   bool force_full_update;
+
/* private to dc_surface.c */
enum dc_irq_source irq_source;
struct kref refcount;
-- 
2.20.1

[PATCH 2/2] drm/amdgpu: Skip IH reroute in Vega20 SR-IOV VF

2019-05-06 Thread Trigger Huang

IH reroute commands are not supported on Vega20 VF

Signed-off-by: Trigger Huang 
---
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index b91df7b..4bdd70a 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -232,6 +232,10 @@ static void psp_v11_0_reroute_ih(struct psp_context *psp)
struct amdgpu_device *adev = psp->adev;
uint32_t tmp;
 
+   /* reroute_ih is not supported on SR_IOV VF */
+   if (amdgpu_sriov_vf(adev))
+   return;
+
/* Change IH ring for VMC */
tmp = REG_SET_FIELD(0, IH_CLIENT_CFG_DATA, CREDIT_RETURN_ADDR, 0x1244b);
tmp = REG_SET_FIELD(tmp, IH_CLIENT_CFG_DATA, CLIENT_TYPE, 1);
-- 
2.7.4

[PATCH 0/2] Skip IH re-route on Vega SR-IOV

2019-05-06 Thread Trigger Huang

IH re-route is not supported on Vega SR-IOV and needs to be skipped.

Trigger Huang (2):
  drm/amdgpu: Skip IH reroute in Vega10 SR-IOV VF
  drm/amdgpu: Skip IH reroute in Vega20 SR-IOV VF

 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 4 
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c  | 4 
 2 files changed, 8 insertions(+)

-- 
2.7.4

[PATCH 1/2] drm/amdgpu: Skip IH reroute in Vega10 SR-IOV VF

2019-05-06 Thread Trigger Huang

IH reroute commands are not supported on Vega10 VF

Signed-off-by: Trigger Huang 
---
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c 
b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
index 143f0fa..9d6e603 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
@@ -260,6 +260,10 @@ static void psp_v3_1_reroute_ih(struct psp_context *psp)
struct amdgpu_device *adev = psp->adev;
uint32_t tmp;
 
+   /* reroute_ih is not supported on SR_IOV VF */
+   if (amdgpu_sriov_vf(adev))
+   return;
+
/* Change IH ring for VMC */
tmp = REG_SET_FIELD(0, IH_CLIENT_CFG_DATA, CREDIT_RETURN_ADDR, 0x1244b);
tmp = REG_SET_FIELD(tmp, IH_CLIENT_CFG_DATA, CLIENT_TYPE, 1);
-- 
2.7.4

RE: [PATCH] drm/amdgpu: Use FW addr returned by PSP for VF MM

2019-05-06 Thread Huang, Trigger

Thanks for the comments, Alex.

Yes, they are only in the SR-IOV HW initialization path of UVD/VCE.
Thanks & Best Wishes,
Trigger Huang

From: Deucher, Alexander 
Sent: Monday, May 06, 2019 10:52 PM
To: Huang, Trigger ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Use FW addr returned by PSP for VF MM

As long as this doesn't break bare metal, I'm ok with it.
Acked-by: Alex Deucher

From: amd-gfx on behalf of Trigger Huang
Sent: Thursday, May 2, 2019 8:56 AM
To: amd-gfx@lists.freedesktop.org
Cc: Huang, Trigger
Subject: [PATCH] drm/amdgpu: Use FW addr returned by PSP for VF MM

On a Vega10 SR-IOV VF, the FW address returned by PSP should be
set in the init table instead of the original BO MC address.
Otherwise, the UVD and VCE IB tests fail under Vega10 SR-IOV.

reference:
commit bfcea5204287 ("drm/amdgpu:change VEGA booting with firmware 
loaded by PSP")
commit aa5873dca463 ("drm/amdgpu: Change VCE booting with firmware 
loaded by PSP")

Signed-off-by: Trigger Huang
---
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 17 +++--
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index dc461df..2191d3d 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -787,10 +787,13 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device 
*adev)
   0x, 
0x0004);
/* mc resume*/
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
-   lower_32_bits(adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].mc_addr));
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
-   upper_32_bits(adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].mc_addr));
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i,
+   mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
+   adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].tmr_mc_addr_lo);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i,
+   mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
+   adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].tmr_mc_addr_hi);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0), 0);
offset = 0;
} else {

MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
@@ -798,10 +801,11 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device 
*adev)

MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),

upper_32_bits(adev->uvd.inst[i].gpu_addr));
offset = size;
+   
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0),
+   
AMDGPU_UVD_FIRMWARE_OFFSET >> 3);
+
}

-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_VCPU_CACHE_OFFSET0),
-   AMDGPU_UVD_FIRMWARE_OFFSET 
>> 3);
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_VCPU_CACHE_SIZE0), size);

MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW),
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index f3f5938..c0ec279 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -244,13 +244,18 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
*adev)
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_SWAP_CNTL1), 0);
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VM_CTRL), 0);

+   offset = AMDGPU_VCE_FIRMWARE_OFFSET;
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+   uint32_t

RE: [PATCH] drm/amdgpu: treat negative lockup timeout as 'infinite timeout'

2019-05-06 Thread Quan, Evan

Thanks! Just sent out a V2 version with this addressed.

> -Original Message-
> From: Christian König 
> Sent: May 6, 2019 19:26
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian 
> Subject: Re: [PATCH] drm/amdgpu: treat negative lockup timeout as 'infinite
> timeout'
> 
> Am 05.05.19 um 16:23 schrieb Evan Quan:
> > Negative lockup timeout is valid and will be treated as 'infinite
> > timeout'.
> >
> > Change-Id: I0d8387956a9c744073c0281ef2e1a547d4f16dec
> > Signed-off-by: Evan Quan 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 ++
> >   1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index 5b03e17e6e06..4d6dff6855f8 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -233,13 +233,14 @@ module_param_named(msi, amdgpu_msi, int,
> 0444);
> >* Set GPU scheduler timeout value in ms.
> >*
> >* The format can be [Non-Compute] or [GFX,Compute,SDMA,Video].
> That
> > is there can be one or
> > - * multiple values specified. 0 and negative values are invalidated.
> > They will be adjusted
> > - * to default timeout.
> > + * multiple values specified.
> >*  - With one value specified, the setting will apply to all non-compute
> jobs.
> >*  - With multiple values specified, the first one will be for GFX. The
> second one is for Compute.
> >*And the third and fourth ones are for SDMA and Video.
> >* By default(with no lockup_timeout settings), the timeout for all non-
> compute(GFX, SDMA and Video)
> >* jobs is 1. And there is no timeout enforced on compute jobs.
> > + * Value 0 is invalidated, will be adjusted to default timeout settings.
> > + * Negative values mean 'infinite timeout' (MAX_JIFFY_OFFSET).
> >*/
> >   MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms
> (default: 1 for non-compute jobs and no timeout for compute jobs), "
> >   "format is [Non-Compute] or [GFX,Compute,SDMA,Video]");
> > @@ -1248,11 +1249,16 @@ int
> amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
> >   if (ret)
> >   return ret;
> >
> > - /* Invalidate 0 and negative values */
> > - if (timeout <= 0) {
> > + /*
> > +  * Value 0 will be adjusted to default timeout 
> > settings.
> > +  * Negative values mean 'infinite timeout' 
> > (MAX_JIFFY_OFFSET).
> > +  */
> > + if (!timeout) {
> >   index++;
> >   continue;
> >   }
> > + if (timeout < 0)
> > + timeout = MAX_JIFFY_OFFSET;
> 
> This is superfluous and maybe even harmful, msecs_to_jiffies() should take
> care of this conversion.
> 
> Maybe even convert the values directly here.
> 
> Christian.
> 
> >
> >   switch (index++) {
> >   case 0:

[PATCH] drm/amdgpu: treat negative lockup timeout as 'infinite timeout' V2

2019-05-06 Thread Evan Quan

Negative lockup timeout is valid and will be treated as
'infinite timeout'.

- V2: use msecs_to_jiffies for negative values
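
The V2 behaviour can be modelled in userspace. This is a sketch under assumptions, not the kernel implementation: SKETCH_HZ (250) is an arbitrary tick rate chosen for illustration, and sketch_msecs_to_jiffies() and apply_timeout() are hypothetical helpers that only mimic how a negative value maps to the maximum offset and how 0 keeps the default.

```c
#include <assert.h>
#include <limits.h>

/* Assumptions for this sketch: a 250 Hz tick and the usual "half of LONG_MAX" cap. */
#define SKETCH_HZ 250
#define SKETCH_MAX_JIFFY_OFFSET ((LONG_MAX >> 1) - 1)

/* Userspace model of the conversion: negative input means "wait forever". */
static long sketch_msecs_to_jiffies(int msecs)
{
    const int ms_per_tick = 1000 / SKETCH_HZ;

    if (msecs < 0)
        return SKETCH_MAX_JIFFY_OFFSET;
    /* Round up so a nonzero timeout never becomes zero ticks. */
    return ((long)msecs + ms_per_tick - 1) / ms_per_tick;
}

/* Apply one parsed value: 0 keeps the default, anything else is converted. */
static long apply_timeout(long current_default, int msecs)
{
    if (msecs == 0)
        return current_default;
    return sketch_msecs_to_jiffies(msecs);
}
```

Because the conversion already handles negative input, the caller needs no separate "timeout < 0" branch, which is the point raised in review of the first version.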

Change-Id: I0d8387956a9c744073c0281ef2e1a547d4f16dec
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index c5fba79c3660..bcd59ba07bb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -237,13 +237,14 @@ module_param_named(msi, amdgpu_msi, int, 0444);
  * Set GPU scheduler timeout value in ms.
  *
  * The format can be [Non-Compute] or [GFX,Compute,SDMA,Video]. That is there 
can be one or
- * multiple values specified. 0 and negative values are invalidated. They will 
be adjusted
- * to default timeout.
+ * multiple values specified.
  *  - With one value specified, the setting will apply to all non-compute jobs.
  *  - With multiple values specified, the first one will be for GFX. The 
second one is for Compute.
  *And the third and fourth ones are for SDMA and Video.
  * By default(with no lockup_timeout settings), the timeout for all 
non-compute(GFX, SDMA and Video)
  * jobs is 1. And there is no timeout enforced on compute jobs.
+ * Value 0 is invalidated, will be adjusted to default timeout settings.
+ * Negative values mean 'infinite timeout' (MAX_JIFFY_OFFSET).
  */
 MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: 1 for 
non-compute jobs and no timeout for compute jobs), "
"format is [Non-Compute] or [GFX,Compute,SDMA,Video]");
@@ -1339,24 +1340,27 @@ int amdgpu_device_get_job_timeout_settings(struct 
amdgpu_device *adev)
if (ret)
return ret;
 
-   /* Invalidate 0 and negative values */
-   if (timeout <= 0) {
+   /*
+* Value 0 will be adjusted to default timeout settings.
+* Negative values mean 'infinite timeout' 
(MAX_JIFFY_OFFSET).
+*/
+   if (!timeout) {
index++;
continue;
}
 
switch (index++) {
case 0:
-   adev->gfx_timeout = timeout;
+   adev->gfx_timeout = msecs_to_jiffies(timeout);
break;
case 1:
-   adev->compute_timeout = timeout;
+   adev->compute_timeout = 
msecs_to_jiffies(timeout);
break;
case 2:
-   adev->sdma_timeout = timeout;
+   adev->sdma_timeout = msecs_to_jiffies(timeout);
break;
case 3:
-   adev->video_timeout = timeout;
+   adev->video_timeout = msecs_to_jiffies(timeout);
break;
default:
break;
-- 
2.21.0

Re: [PATCH v15 13/17] IB, arm64: untag user pointers in ib_uverbs_(re)reg_mr()

2019-05-06 Thread Jason Gunthorpe

On Mon, May 06, 2019 at 06:30:59PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> ib_uverbs_(re)reg_mr() use provided user pointers for vma lookups (through
> e.g. mlx4_get_umem_mr()), which can only be done with untagged pointers.
> 
> Untag user pointers in these functions.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  drivers/infiniband/core/uverbs_cmd.c | 4 
>  1 file changed, 4 insertions(+)

I think this is OK.. We should really get it tested though.. Leon?

Jason
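
For readers unfamiliar with the untagging discussed in the quoted patch, here is a minimal sketch. It assumes the tag lives in the top byte and that untagging copies bit 55 into that byte (sign extension from bit 55); sketch_untag() is a hypothetical helper, not the kernel's macro.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of untagging: copy bit 55 into the top byte, so a tagged user
 * address (bit 55 clear) ends up with a top byte of 0x00.
 */
static uint64_t sketch_untag(uint64_t addr)
{
    const uint64_t top_byte = 0xffULL << 56;

    if (addr & (1ULL << 55))
        return addr | top_byte;     /* kernel-half address: top byte all ones */
    return addr & ~top_byte;        /* user-half address: top byte cleared */
}
```

A lookup keyed on the plain address (such as a vma lookup) would then be given sketch_untag(ptr) rather than the tagged value.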

Re: [PATCH] drm/amdgpu: rename amdgpu_prime.[ch] into amdgpu_dma_buf.[ch]

2019-05-06 Thread Kuehling, Felix

On 2019-05-06 7:24 a.m., Christian König wrote:
> We are getting a dma-buf implementation completely separate from drm prime,
> so rename the files now and cleanup the code a bit.
>
> No functional change.
>
> Signed-off-by: Christian König 

Acked-by: Felix Kuehling 


> ---
>   drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|   1 +
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   1 +
>   .../{amdgpu_prime.c => amdgpu_dma_buf.c}  | 131 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h   |  46 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h   |  16 ---
>   7 files changed, 116 insertions(+), 83 deletions(-)
>   rename drivers/gpu/drm/amd/amdgpu/{amdgpu_prime.c => amdgpu_dma_buf.c} (93%)
>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 7d539ba6400d..11a651ff7f0d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -49,7 +49,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>  amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \
>  amdgpu_pm.o atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \
>  atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
> -   amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
> +   amdgpu_dma_buf.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
>  amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
>  amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o 
> amdgpu_atomfirmware.o \
>  amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index aeead072fa79..e829c53accf5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -25,6 +25,7 @@
>   #include 
>   #include "amdgpu.h"
>   #include "amdgpu_gfx.h"
> +#include "amdgpu_dma_buf.h"
>   #include 
>   #include 
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 047bba8c62d6..2bc80942e5d5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -30,6 +30,7 @@
>   #include "amdgpu_object.h"
>   #include "amdgpu_vm.h"
>   #include "amdgpu_amdkfd.h"
> +#include "amdgpu_dma_buf.h"
>
>   /* Special VM and GART address alignment needed for VI pre-Fiji due to
>* a HW bug.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> similarity index 93%
> rename from drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> rename to drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index a38e0fb4a6fe..4711cf1b5bd2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -1,5 +1,5 @@
>   /*
> - * Copyright 2012 Advanced Micro Devices, Inc.
> + * Copyright 2019 Advanced Micro Devices, Inc.
>*
>* Permission is hereby granted, free of charge, to any person obtaining a
>* copy of this software and associated documentation files (the 
> "Software"),
> @@ -103,7 +103,8 @@ void amdgpu_gem_prime_vunmap(struct drm_gem_object *obj, 
> void *vaddr)
>* Returns:
>* 0 on success or a negative error code on failure.
>*/
> -int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct 
> *vma)
> +int amdgpu_gem_prime_mmap(struct drm_gem_object *obj,
> + struct vm_area_struct *vma)
>   {
>  struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
>  struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> @@ -137,57 +138,6 @@ int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, 
> struct vm_area_struct *vma
>  return ret;
>   }
>
> -/**
> - * amdgpu_gem_prime_import_sg_table - &drm_driver.gem_prime_import_sg_table
> - * implementation
> - * @dev: DRM device
> - * @attach: DMA-buf attachment
> - * @sg: Scatter/gather table
> - *
> - * Imports shared DMA buffer memory exported by another device.
> - *
> - * Returns:
> - * A new GEM BO of the given DRM device, representing the memory
> - * described by the given DMA-buf attachment and scatter/gather table.
> - */
> -struct drm_gem_object *
> -amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
> -struct dma_buf_attachment *attach,
> -struct sg_table *sg)
> -{
> -   struct reservation_object *resv = attach->dmabuf->resv;
> -   struct amdgpu_device *adev = dev->dev_private;
> -   struct amdgpu_bo *bo;
> -   struct amdgpu_bo_param bp;
> -   int ret;
> -
> -   memset(&bp, 0, sizeof(bp));
> -   bp.size = attach->dmabuf->size;
> -

[PATCH 1/1] drm/amdgpu: Reserve shared fence for eviction fence

2019-05-06 Thread Kuehling, Felix

Need to reserve space for the shared eviction fence when initializing
a KFD VM.
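
The reserve-then-add pattern behind this fix can be sketched with a toy container. Everything here is hypothetical (struct fence_list, fence_list_reserve(), fence_list_add()); it only illustrates why the slot has to be reserved, with error handling, before the fence is attached.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fence container: slots must be reserved before they are filled. */
struct fence_list {
    int *fences;
    size_t count;       /* slots in use */
    size_t capacity;    /* slots allocated */
};

/* Make room for `extra` more fences; this is the only step that can fail. */
static int fence_list_reserve(struct fence_list *fl, size_t extra)
{
    if (fl->count + extra <= fl->capacity)
        return 0;

    int *grown = realloc(fl->fences, (fl->count + extra) * sizeof(*grown));
    if (!grown)
        return -1;
    fl->fences = grown;
    fl->capacity = fl->count + extra;
    return 0;
}

/* Add a fence into a slot that the caller reserved earlier; cannot fail. */
static void fence_list_add(struct fence_list *fl, int fence)
{
    assert(fl->count < fl->capacity);   /* caller forgot to reserve */
    fl->fences[fl->count++] = fence;
}
```

Splitting the operation this way keeps the fallible allocation in a place where the caller can still unwind, which mirrors the new reserve_shared_fail label in the patch.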

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 20cf8e1e7445..e1cae4a37113 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -875,6 +875,9 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
  AMDGPU_FENCE_OWNER_KFD, false);
if (ret)
goto wait_pd_fail;
+   ret = reservation_object_reserve_shared(vm->root.base.bo->tbo.resv, 1);
+   if (ret)
+   goto reserve_shared_fail;
amdgpu_bo_fence(vm->root.base.bo,
&vm->process_info->eviction_fence->base, true);
amdgpu_bo_unreserve(vm->root.base.bo);
@@ -888,6 +891,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
 
return 0;
 
+reserve_shared_fail:
 wait_pd_fail:
 validate_pd_fail:
amdgpu_bo_unreserve(vm->root.base.bo);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/radeon: prefer lower reference dividers

2019-05-06 Thread Christian König

Instead of the closest reference divider prefer the lowest,
this fixes flickering issues on HP Compaq nx9420.

Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=108514
Suggested-by:  Paul Dufresne 
Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 9d3ac8b981da..d8e2d7b3b836 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -921,12 +921,12 @@ static void avivo_get_fb_ref_div(unsigned nom, unsigned 
den, unsigned post_div,
ref_div_max = max(min(100 / post_div, ref_div_max), 1u);
 
/* get matching reference and feedback divider */
-   *ref_div = min(max(DIV_ROUND_CLOSEST(den, post_div), 1u), ref_div_max);
+   *ref_div = min(max(den/post_div, 1u), ref_div_max);
*fb_div = DIV_ROUND_CLOSEST(nom * *ref_div * post_div, den);
 
/* limit fb divider to its maximum */
if (*fb_div > fb_div_max) {
-   *ref_div = DIV_ROUND_CLOSEST(*ref_div * fb_div_max, *fb_div);
+   *ref_div = (*ref_div * fb_div_max)/(*fb_div);
*fb_div = fb_div_max;
}
 }
-- 
2.17.1
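
The difference between the two divider choices in the patch above can be sketched in plain C. This is only an illustration under assumptions: `div_round_closest()`, `min_u()`, `max_u()`, `ref_div_closest()` and `ref_div_lowest()` are hypothetical userspace stand-ins, not the radeon functions, and `div_round_closest()` mirrors only the unsigned case of the kernel's DIV_ROUND_CLOSEST().

```c
#include <assert.h>

/* Round-to-nearest unsigned division; same arithmetic as the unsigned
 * case of the kernel's DIV_ROUND_CLOSEST(). Hypothetical helper. */
static unsigned div_round_closest(unsigned x, unsigned d)
{
	assert(d != 0);
	return (x + d / 2) / d;
}

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }
static unsigned max_u(unsigned a, unsigned b) { return a > b ? a : b; }

/* Old choice: the closest reference divider, clamped to [1, ref_div_max]. */
static unsigned ref_div_closest(unsigned den, unsigned post_div,
				unsigned ref_div_max)
{
	return min_u(max_u(div_round_closest(den, post_div), 1u), ref_div_max);
}

/* New choice: truncating division, so the result never rounds up. */
static unsigned ref_div_lowest(unsigned den, unsigned post_div,
			       unsigned ref_div_max)
{
	assert(post_div != 0);
	return min_u(max_u(den / post_div, 1u), ref_div_max);
}
```

With den = 7 and post_div = 2 the first helper yields 4 and the second yields 3; when the division is exact both agree.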


Re: [PATCH] drm/radeon: prefer lower reference dividers

2019-05-06 Thread Deucher, Alexander

Acked-by: Alex Deucher 

From: amd-gfx  on behalf of Christian 
König 
Sent: Monday, May 6, 2019 2:01 PM
To: amd-gfx@lists.freedesktop.org; dufres...@gmail.com; 
werner.luec...@googlemail.com
Subject: [PATCH] drm/radeon: prefer lower reference dividers


[PATCH v15 16/17] vfio/type1, arm64: untag user pointers in vaddr_get_pfn

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

vaddr_get_pfn() uses provided user pointers for vma lookups, which can
only be done with untagged pointers.

Untag user pointers in this function.

Signed-off-by: Andrey Konovalov 
---
 drivers/vfio/vfio_iommu_type1.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d0f731c9920a..5daa966d799e 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -382,6 +382,8 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned 
long vaddr,
 
down_read(&mm->mmap_sem);
 
+   vaddr = untagged_addr(vaddr);
+
vma = find_vma_intersection(mm, vaddr, vaddr + 1);
 
if (vma && vma->vm_flags & VM_PFNMAP) {
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 15/17] tee, arm64: untag user pointers in tee_shm_register

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

tee_shm_register()->optee_shm_unregister()->check_mem_type() uses provided
user pointers for vma lookups (via __check_mem_type()), which can only be
done with untagged pointers.

Untag user pointers in this function.

Signed-off-by: Andrey Konovalov 
---
 drivers/tee/tee_shm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
index 0b9ab1d0dd45..8e7b52ab6c63 100644
--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -263,6 +263,7 @@ struct tee_shm *tee_shm_register(struct tee_context *ctx, 
unsigned long addr,
shm->teedev = teedev;
shm->ctx = ctx;
shm->id = -1;
+   addr = untagged_addr(addr);
start = rounddown(addr, PAGE_SIZE);
shm->offset = addr - start;
shm->size = length;
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 01/17] uaccess: add untagged_addr definition for other arches

2019-05-06 Thread Andrey Konovalov

To allow arm64 syscalls to accept tagged pointers from userspace, we must
untag them when they are passed to the kernel. Since untagging is done in
generic parts of the kernel, the untagged_addr macro needs to be defined
for all architectures.

Define it as a noop for architectures other than arm64.

Acked-by: Catalin Marinas 
Signed-off-by: Andrey Konovalov 
---
 include/linux/mm.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6b10c21630f5..44041df804a6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -99,6 +99,10 @@ extern int mmap_rnd_compat_bits __read_mostly;
 #include 
 #include 
 
+#ifndef untagged_addr
+#define untagged_addr(addr) (addr)
+#endif
+
 #ifndef __pa_symbol
 #define __pa_symbol(x)  __pa(RELOC_HIDE((unsigned long)(x), 0))
 #endif
-- 
2.21.0.1020.gf2820cf01a-goog
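
The fallback pattern in this patch can be tried outside the kernel. The sketch below is an assumption-laden illustration: `page_offset_of()` is a hypothetical helper, and only the `#ifndef` guard itself is taken from the patch. An architecture header that defines untagged_addr() earlier wins; otherwise the macro expands to its argument unchanged.

```c
#include <assert.h>

/* Generic fallback: a no-op unless an earlier (architecture) header
 * already provided its own untagged_addr(). */
#ifndef untagged_addr
#define untagged_addr(addr) (addr)
#endif

/* Hypothetical caller: generic code can untag unconditionally and
 * pays nothing on architectures without pointer tags. */
static unsigned long page_offset_of(unsigned long addr,
				    unsigned long page_size)
{
	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return untagged_addr(addr) & (page_size - 1);
}
```

Because the fallback is an identity macro, callers written this way compile to the same code as before on every architecture that does not override it.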


[PATCH v15 11/17] drm/amdgpu, arm64: untag user pointers

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

In amdgpu_gem_userptr_ioctl() and amdgpu_amdkfd_gpuvm.c/init_user_pages()
an MMU notifier is set up with a (tagged) userspace pointer. The untagged
address should be used so that MMU notifiers for the untagged address get
correctly matched up with the right BO. This patch untags user pointers in
amdgpu_gem_userptr_ioctl() for the GEM case and in
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu() for the KFD case. This also
makes sure that an untagged pointer is passed to
amdgpu_ttm_tt_get_user_pages(), which uses it for vma lookups.

Suggested-by: Kuehling, Felix 
Signed-off-by: Andrey Konovalov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 1921dec3df7a..20cac44ed449 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1121,7 +1121,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
alloc_flags = 0;
if (!offset || !*offset)
return -EINVAL;
-   user_addr = *offset;
+   user_addr = untagged_addr(*offset);
} else if (flags & ALLOC_MEM_FLAGS_DOORBELL) {
domain = AMDGPU_GEM_DOMAIN_GTT;
alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d21dd2f369da..985cb82b2aa6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -286,6 +286,8 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void 
*data,
uint32_t handle;
int r;
 
+   args->addr = untagged_addr(args->addr);
+
if (offset_in_page(args->addr | args->size))
return -EINVAL;
 
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 08/17] mm, arm64: untag user pointers in get_vaddr_frames

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

get_vaddr_frames() uses provided user pointers for vma lookups, which can
only be done with untagged pointers. Instead of locating and changing
all callers of this function, perform untagging in it.

Signed-off-by: Andrey Konovalov 
---
 mm/frame_vector.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/frame_vector.c b/mm/frame_vector.c
index c64dca6e27c2..c431ca81dad5 100644
--- a/mm/frame_vector.c
+++ b/mm/frame_vector.c
@@ -46,6 +46,8 @@ int get_vaddr_frames(unsigned long start, unsigned int 
nr_frames,
if (WARN_ON_ONCE(nr_frames > vec->nr_allocated))
nr_frames = vec->nr_allocated;
 
+   start = untagged_addr(start);
+
down_read(&mm->mmap_sem);
locked = 1;
vma = find_vma_intersection(mm, start, start + 1);
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 14/17] media/v4l2-core, arm64: untag user pointers in videobuf_dma_contig_user_get

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

videobuf_dma_contig_user_get() uses provided user pointers for vma
lookups, which can only be done with untagged pointers.

Untag the pointers in this function.

Signed-off-by: Andrey Konovalov 
---
 drivers/media/v4l2-core/videobuf-dma-contig.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c 
b/drivers/media/v4l2-core/videobuf-dma-contig.c
index e1bf50df4c70..8a1ddd146b17 100644
--- a/drivers/media/v4l2-core/videobuf-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
@@ -160,6 +160,7 @@ static void videobuf_dma_contig_user_put(struct 
videobuf_dma_contig_memory *mem)
 static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
struct videobuf_buffer *vb)
 {
+   unsigned long untagged_baddr = untagged_addr(vb->baddr);
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
unsigned long prev_pfn, this_pfn;
@@ -167,22 +168,22 @@ static int videobuf_dma_contig_user_get(struct 
videobuf_dma_contig_memory *mem,
unsigned int offset;
int ret;
 
-   offset = vb->baddr & ~PAGE_MASK;
+   offset = untagged_baddr & ~PAGE_MASK;
mem->size = PAGE_ALIGN(vb->size + offset);
ret = -EINVAL;
 
down_read(&mm->mmap_sem);
 
-   vma = find_vma(mm, vb->baddr);
+   vma = find_vma(mm, untagged_baddr);
if (!vma)
goto out_up;
 
-   if ((vb->baddr + mem->size) > vma->vm_end)
+   if ((untagged_baddr + mem->size) > vma->vm_end)
goto out_up;
 
pages_done = 0;
prev_pfn = 0; /* kill warning */
-   user_address = vb->baddr;
+   user_address = untagged_baddr;
 
while (pages_done < (mem->size >> PAGE_SHIFT)) {
ret = follow_pfn(vma, user_address, &this_pfn);
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 10/17] fs, arm64: untag user pointers in fs/userfaultfd.c

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

The userfaultfd code uses provided user pointers for vma lookups, which
can only be done with untagged pointers.

Untag user pointers in validate_range().

Signed-off-by: Andrey Konovalov 
---
 fs/userfaultfd.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f5de1e726356..aa47ed0969dd 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1261,21 +1261,23 @@ static __always_inline void wake_userfault(struct 
userfaultfd_ctx *ctx,
 }
 
 static __always_inline int validate_range(struct mm_struct *mm,
- __u64 start, __u64 len)
+ __u64 *start, __u64 len)
 {
__u64 task_size = mm->task_size;
 
-   if (start & ~PAGE_MASK)
+   *start = untagged_addr(*start);
+
+   if (*start & ~PAGE_MASK)
return -EINVAL;
if (len & ~PAGE_MASK)
return -EINVAL;
if (!len)
return -EINVAL;
-   if (start < mmap_min_addr)
+   if (*start < mmap_min_addr)
return -EINVAL;
-   if (start >= task_size)
+   if (*start >= task_size)
return -EINVAL;
-   if (len > task_size - start)
+   if (len > task_size - *start)
return -EINVAL;
return 0;
 }
@@ -1325,7 +1327,7 @@ static int userfaultfd_register(struct userfaultfd_ctx 
*ctx,
goto out;
}
 
-   ret = validate_range(mm, uffdio_register.range.start,
+   ret = validate_range(mm, &uffdio_register.range.start,
 uffdio_register.range.len);
if (ret)
goto out;
@@ -1514,7 +1516,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx 
*ctx,
if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister)))
goto out;
 
-   ret = validate_range(mm, uffdio_unregister.start,
+   ret = validate_range(mm, &uffdio_unregister.start,
 uffdio_unregister.len);
if (ret)
goto out;
@@ -1665,7 +1667,7 @@ static int userfaultfd_wake(struct userfaultfd_ctx *ctx,
if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake)))
goto out;
 
-   ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len);
+   ret = validate_range(ctx->mm, &uffdio_wake.start, uffdio_wake.len);
if (ret)
goto out;
 
@@ -1705,7 +1707,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
   sizeof(uffdio_copy)-sizeof(__s64)))
goto out;
 
-   ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len);
+   ret = validate_range(ctx->mm, &uffdio_copy.dst, uffdio_copy.len);
if (ret)
goto out;
/*
@@ -1761,7 +1763,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx 
*ctx,
   sizeof(uffdio_zeropage)-sizeof(__s64)))
goto out;
 
-   ret = validate_range(ctx->mm, uffdio_zeropage.range.start,
+   ret = validate_range(ctx->mm, &uffdio_zeropage.range.start,
 uffdio_zeropage.range.len);
if (ret)
goto out;
-- 
2.21.0.1020.gf2820cf01a-goog
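
The reason validate_range() now takes a pointer to the start address can be shown with a small userspace sketch. Everything here is hypothetical and simplified: the `sketch_` names, the fixed 4 KiB page size, and the top-byte mask used for untagging are assumptions for illustration, not the kernel code. The point is that the caller's copy of the address is rewritten, so later code sees the untagged value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Simplified untagging for user addresses: clear the top byte. */
static uint64_t sketch_untag(uint64_t addr)
{
	return addr & 0x00ffffffffffffffULL;
}

/* Validate [*start, *start + len) and store the untagged start back
 * through the pointer, mirroring the in/out parameter in the patch. */
static int sketch_validate_range(uint64_t task_size, uint64_t min_addr,
				 uint64_t *start, uint64_t len)
{
	assert(start != NULL);
	*start = sketch_untag(*start);

	if (*start & ~SKETCH_PAGE_MASK)
		return -EINVAL;
	if (len & ~SKETCH_PAGE_MASK)
		return -EINVAL;
	if (!len)
		return -EINVAL;
	if (*start < min_addr)
		return -EINVAL;
	if (*start >= task_size)
		return -EINVAL;
	if (len > task_size - *start)
		return -EINVAL;
	return 0;
}
```

Passing the address by value would validate an untagged copy while the caller kept using the tagged original for its vma lookups, which is the situation the patch avoids.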


[PATCH v15 12/17] drm/radeon, arm64: untag user pointers in radeon_gem_userptr_ioctl

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

In radeon_gem_userptr_ioctl() an MMU notifier is set up with a (tagged)
userspace pointer. The untagged address should be used so that MMU
notifiers for the untagged address get correctly matched up with the right
BO. This function also calls radeon_ttm_tt_pin_userptr(), which uses
provided user pointers for vma lookups, which can only be done with
untagged pointers.

This patch untags user pointers in radeon_gem_userptr_ioctl().

Signed-off-by: Andrey Konovalov 
---
 drivers/gpu/drm/radeon/radeon_gem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 44617dec8183..90eb78fb5eb2 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -291,6 +291,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void 
*data,
uint32_t handle;
int r;
 
+   args->addr = untagged_addr(args->addr);
+
if (offset_in_page(args->addr | args->size))
return -EINVAL;
 
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 04/17] mm: add ksys_ wrappers to memory syscalls

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

This patch adds ksys_ wrappers to the following memory syscalls:

brk, get_mempolicy (renamed kernel_get_mempolicy -> ksys_get_mempolicy),
madvise, mbind (renamed kernel_mbind -> ksys_mbind), mincore,
mlock (renamed do_mlock -> ksys_mlock), mlock2, mmap_pgoff,
mprotect (renamed do_mprotect_pkey -> ksys_mprotect_pkey), mremap, msync,
munlock, munmap, remap_file_pages, shmat, shmdt.

The next patch in this series will add a custom implementation for these
syscalls that makes them accept tagged pointers on arm64.

Signed-off-by: Andrey Konovalov 
---
 include/linux/syscalls.h |  22 +++
 ipc/shm.c|   7 ++-
 mm/madvise.c | 129 ---
 mm/mempolicy.c   |  21 +++
 mm/mincore.c |  57 +
 mm/mlock.c   |  20 --
 mm/mmap.c|  30 ++---
 mm/mprotect.c|   6 +-
 mm/mremap.c  |  27 +---
 mm/msync.c   |  35 ++-
 10 files changed, 213 insertions(+), 141 deletions(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e446806a561f..70008f5ed84f 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1260,6 +1260,28 @@ int ksys_ipc(unsigned int call, int first, unsigned long 
second,
unsigned long third, void __user * ptr, long fifth);
 int compat_ksys_ipc(u32 call, int first, int second,
u32 third, u32 ptr, u32 fifth);
+unsigned long ksys_mremap(unsigned long addr, unsigned long old_len,
+   unsigned long new_len, unsigned long flags,
+   unsigned long new_addr);
+int ksys_munmap(unsigned long addr, size_t len);
+unsigned long ksys_brk(unsigned long brk);
+int ksys_get_mempolicy(int __user *policy, unsigned long __user *nmask,
+   unsigned long maxnode, unsigned long addr, unsigned long flags);
+int ksys_madvise(unsigned long start, size_t len_in, int behavior);
+long ksys_mbind(unsigned long start, unsigned long len,
+   unsigned long mode, const unsigned long __user *nmask,
+   unsigned long maxnode, unsigned int flags);
+__must_check int ksys_mlock(unsigned long start, size_t len, vm_flags_t flags);
+__must_check int ksys_mlock2(unsigned long start, size_t len, vm_flags_t 
flags);
+int ksys_munlock(unsigned long start, size_t len);
+int ksys_mprotect_pkey(unsigned long start, size_t len,
+   unsigned long prot, int pkey);
+int ksys_msync(unsigned long start, size_t len, int flags);
+long ksys_mincore(unsigned long start, size_t len, unsigned char __user *vec);
+unsigned long ksys_remap_file_pages(unsigned long start, unsigned long size,
+   unsigned long prot, unsigned long pgoff, unsigned long flags);
+long ksys_shmat(int shmid, char __user *shmaddr, int shmflg);
+long ksys_shmdt(char __user *shmaddr);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
diff --git a/ipc/shm.c b/ipc/shm.c
index ce1ca9f7c6e9..557b43968c0e 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1588,7 +1588,7 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg,
return err;
 }
 
-SYSCALL_DEFINE3(shmat, int, shmid, char __user *, shmaddr, int, shmflg)
+long ksys_shmat(int shmid, char __user *shmaddr, int shmflg)
 {
unsigned long ret;
long err;
@@ -1600,6 +1600,11 @@ SYSCALL_DEFINE3(shmat, int, shmid, char __user *, 
shmaddr, int, shmflg)
return (long)ret;
 }
 
+SYSCALL_DEFINE3(shmat, int, shmid, char __user *, shmaddr, int, shmflg)
+{
+   return ksys_shmat(shmid, shmaddr, shmflg);
+}
+
 #ifdef CONFIG_COMPAT
 
 #ifndef COMPAT_SHMLBA
diff --git a/mm/madvise.c b/mm/madvise.c
index 21a7881a2db4..c27f5f14e2ee 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -738,68 +738,7 @@ madvise_behavior_valid(int behavior)
}
 }
 
-/*
- * The madvise(2) system call.
- *
- * Applications can use madvise() to advise the kernel how it should
- * handle paging I/O in this VM area.  The idea is to help the kernel
- * use appropriate read-ahead and caching techniques.  The information
- * provided is advisory only, and can be safely disregarded by the
- * kernel without affecting the correct operation of the application.
- *
- * behavior values:
- *  MADV_NORMAL - the default behavior is to read clusters.  This
- * results in some read-ahead and read-behind.
- *  MADV_RANDOM - the system should read the minimum amount of data
- * on any access, since it is unlikely that the appli-
- * cation will need more than what it asks for.
- *  MADV_SEQUENTIAL - pages in the given range will probably be accessed
- * once, so they can be aggressively read ahead, and
- * can be freed soon after they are accessed.
- *  MADV_WILLNEED - the

[PATCH v15 03/17] lib, arm64: untag user pointers in strn*_user

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

strncpy_from_user and strnlen_user accept user addresses as arguments, and
do not go through the same path as copy_from_user and others, so here we
need to handle the case of tagged user addresses separately.

Untag user pointers passed to these functions.

Note that this patch only temporarily untags the pointers to perform
validity checks, but then uses them as is to perform user memory accesses.

Signed-off-by: Andrey Konovalov 
---
 lib/strncpy_from_user.c | 3 ++-
 lib/strnlen_user.c  | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 58eacd41526c..6209bb9507c7 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -107,7 +108,7 @@ long strncpy_from_user(char *dst, const char __user *src, 
long count)
return 0;
 
max_addr = user_addr_max();
-   src_addr = (unsigned long)src;
+   src_addr = (unsigned long)untagged_addr(src);
if (likely(src_addr < max_addr)) {
unsigned long max = max_addr - src_addr;
long retval;
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 1c1a1b0e38a5..8ca3d2ac32ec 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -2,6 +2,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -109,7 +110,7 @@ long strnlen_user(const char __user *str, long count)
return 0;
 
max_addr = user_addr_max();
-   src_addr = (unsigned long)str;
+   src_addr = (unsigned long)untagged_addr(str);
if (likely(src_addr < max_addr)) {
unsigned long max = max_addr - src_addr;
long retval;
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 13/17] IB, arm64: untag user pointers in ib_uverbs_(re)reg_mr()

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

ib_uverbs_(re)reg_mr() use provided user pointers for vma lookups (through
e.g. mlx4_get_umem_mr()), which can only be done with untagged pointers.

Untag user pointers in these functions.

Signed-off-by: Andrey Konovalov 
---
 drivers/infiniband/core/uverbs_cmd.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/infiniband/core/uverbs_cmd.c 
b/drivers/infiniband/core/uverbs_cmd.c
index 062a86c04123..36e7b52577d0 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -708,6 +708,8 @@ static int ib_uverbs_reg_mr(struct uverbs_attr_bundle 
*attrs)
if (ret)
return ret;
 
+   cmd.start = untagged_addr(cmd.start);
+
if ((cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK))
return -EINVAL;
 
@@ -790,6 +792,8 @@ static int ib_uverbs_rereg_mr(struct uverbs_attr_bundle 
*attrs)
if (ret)
return ret;
 
+   cmd.start = untagged_addr(cmd.start);
+
if (cmd.flags & ~IB_MR_REREG_SUPPORTED || !cmd.flags)
return -EINVAL;
 
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 02/17] arm64: untag user pointers in access_ok and __uaccess_mask_ptr

2019-05-06 Thread Andrey Konovalov

This patch is part of a series that extends the arm64 kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.

copy_from_user (and a few other similar functions) are used to copy data
from user memory into the kernel memory or vice versa. Since a user can
provide a tagged pointer to one of the syscalls that use copy_from_user,
we need to correctly handle such pointers.

Do this by untagging user pointers in access_ok and in __uaccess_mask_ptr,
before performing access validity checks.

Note that this patch only temporarily untags the pointers to perform the
checks, but then passes them as is into the kernel internals.

Reviewed-by: Catalin Marinas 
Signed-off-by: Andrey Konovalov 
---
 arch/arm64/include/asm/uaccess.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index e5d5f31c6d36..9164ecb5feca 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -94,7 +94,7 @@ static inline unsigned long __range_ok(const void __user 
*addr, unsigned long si
return ret;
 }
 
-#define access_ok(addr, size)  __range_ok(addr, size)
+#define access_ok(addr, size)  __range_ok(untagged_addr(addr), size)
 #define user_addr_max  get_fs
 
 #define _ASM_EXTABLE(from, to) \
@@ -226,7 +226,8 @@ static inline void uaccess_enable_not_uao(void)
 
 /*
  * Sanitise a uaccess pointer such that it becomes NULL if above the
- * current addr_limit.
+ * current addr_limit. In case the pointer is tagged (has the top byte set),
+ * untag the pointer before checking.
  */
 #define uaccess_mask_ptr(ptr) (__typeof__(ptr))__uaccess_mask_ptr(ptr)
 static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
@@ -234,10 +235,11 @@ static inline void __user *__uaccess_mask_ptr(const void 
__user *ptr)
void __user *safe_ptr;
 
asm volatile(
-   "   bics    xzr, %1, %2\n"
+   "   bics    xzr, %3, %2\n"
"   csel    %0, %1, xzr, eq\n"
: "=&r" (safe_ptr)
-   : "r" (ptr), "r" (current_thread_info()->addr_limit)
+   : "r" (ptr), "r" (current_thread_info()->addr_limit),
+ "r" (untagged_addr(ptr))
: "cc");
 
csdb();
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-06 Thread Andrey Konovalov

=== Overview

arm64 has a feature called Top Byte Ignore, which allows embedding pointer
tags into the top byte of each pointer. Userspace programs (such as
HWASan, a memory debugging tool [1]) might use this feature and pass
tagged user pointers to the kernel through syscalls or other interfaces.

Right now the kernel is already able to handle user faults with tagged
pointers, due to these patches:

1. 81cddd65 ("arm64: traps: fix userspace cache maintenance emulation on a
 tagged pointer")
2. 7dcd9dd8 ("arm64: hw_breakpoint: fix watchpoint matching for tagged
  pointers")
3. 276e9327 ("arm64: entry: improve data abort handling of tagged
  pointers")

This patchset extends tagged pointer support to syscall arguments.

As per the proposed ABI change [3], tagged pointers are only allowed to be
passed to syscalls when they point to memory ranges obtained by anonymous
mmap() or sbrk() (see the patchset [3] for more details).

For non-memory syscalls this is done by untagging user pointers when the
kernel performs pointer checking to find out whether the pointer comes
from userspace (most notably in access_ok). The untagging is done only
when the pointer is being checked; the tag is preserved as the pointer
makes its way through the kernel and stays tagged when the kernel
dereferences the pointer when performing user memory accesses.

Memory syscalls (mmap, mprotect, etc.) don't do user memory accesses but
rather deal with memory ranges, and untagged pointers are better suited to
describe memory ranges internally. Thus for memory syscalls we untag
pointers completely when they enter the kernel.
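
As a rough illustration of the two policies described above, the following userspace sketch models a tag in bits 63:56 of a 64-bit address. All names (`set_tag()`, `get_tag()`, `untag_addr()`, `user_range_ok()`) are hypothetical, and the limit check is only in the spirit of access_ok(), not the arm64 implementation. Untagging here copies bit 55 into the top byte, so user addresses lose their tag and kernel addresses are left unchanged.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 56
#define TAG_MASK  (0xffULL << TAG_SHIFT)

/* Place a tag in the top byte of an address. */
static uint64_t set_tag(uint64_t addr, uint8_t tag)
{
	return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

static uint8_t get_tag(uint64_t addr)
{
	return (uint8_t)(addr >> TAG_SHIFT);
}

/* Replace the top byte with copies of bit 55 (sign extension). */
static uint64_t untag_addr(uint64_t addr)
{
	if (addr & (1ULL << 55))
		return addr | TAG_MASK;
	return addr & ~TAG_MASK;
}

/* Non-memory syscall policy: untag only for the range check. The
 * caller keeps using the original (still tagged) pointer afterwards. */
static int user_range_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
	uint64_t start = untag_addr(addr);

	assert(limit != 0);
	return start < limit && size <= limit - start;
}
```

A memory syscall under the second policy would instead overwrite its argument with untag_addr(addr) on entry and work with the untagged range from then on.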

=== Other approaches

One of the alternative approaches to untagging that was considered is to
completely strip the pointer tag as the pointer enters the kernel with
some kind of a syscall wrapper, but that won't work with the countless
number of different ioctl calls. With this approach we would need a custom
wrapper for each ioctl variation, which doesn't seem practical.

An alternative approach to untagging pointers in memory syscall prologues
is to instead allow tagged pointers to be passed to find_vma() (and other
vma related functions) and untag them there. Unfortunately, a lot of
find_vma() callers then compare or subtract the returned vma start and end
fields against the pointer that was being searched. Thus this approach
would still require changing all find_vma() callers.
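
The problem with untagging only inside the lookup can be sketched as follows. This is a hypothetical userspace model: `struct region`, `demo_regions`, `lookup_untagging()` and `fits_in_region()` are illustrative stand-ins for vmas, find_vma() and a typical caller-side bounds check, and untagging is simplified to clearing the top byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vma: the half-open range [start, end). */
struct region {
	uint64_t start;
	uint64_t end;
};

static const struct region demo_regions[] = {
	{ 0x1000, 0x3000 },
};

static uint64_t strip_top_byte(uint64_t addr)
{
	return addr & 0x00ffffffffffffffULL;
}

/* Lookup that tolerates tagged addresses by untagging internally. */
static const struct region *lookup_untagging(const struct region *r,
					     size_t n, uint64_t addr)
{
	uint64_t a = strip_top_byte(addr);
	size_t i;

	for (i = 0; i < n; i++)
		if (a >= r[i].start && a < r[i].end)
			return &r[i];
	return NULL;
}

/* Typical caller-side check against the returned range. It uses the
 * caller's own copy of the address, which is still tagged. */
static int fits_in_region(const struct region *r, uint64_t addr,
			  uint64_t len)
{
	return addr >= r->start && addr + len <= r->end;
}
```

The lookup succeeds for a tagged address, but the follow-up comparison against the region end fails because the caller still holds the tagged value, so each caller would need its own fix anyway.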

=== Testing

The following testing approaches have been taken to find potential issues
with user pointer untagging:

1. Static testing (with sparse [2] and separately with a custom static
   analyzer based on Clang) to track casts of __user pointers to integer
   types to find places where untagging needs to be done.

2. Static testing with grep to find parts of the kernel that call
   find_vma() (and other similar functions) or directly compare against
   vm_start/vm_end fields of vma.

3. Static testing with grep to find parts of the kernel that compare
   user pointers with TASK_SIZE or other similar consts and macros.

4. Dynamic testing: adding BUG_ON(has_tag(addr)) to find_vma() and running
   a modified syzkaller version that passes tagged pointers to the kernel.

Based on the results of the testing the required patches have been added
to the patchset.

=== Notes

This patchset is meant to be merged together with "arm64 relaxed ABI" [3].

This patchset is a prerequisite for ARM's memory tagging hardware feature
support [4].

This patchset has been merged into the Pixel 2 & 3 kernel trees and is
now being used to enable testing of Pixel phones with HWASan.

Thanks!

[1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html

[2] https://github.com/lucvoo/sparse-dev/commit/5f960cb10f56ec2017c128ef9d16060e0145f292

[3] https://lkml.org/lkml/2019/3/18/819

[4] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture-2018-developments-armv85a

Changes in v15:
- Removed unnecessary untagging from radeon_ttm_tt_set_userptr().
- Removed unnecessary untagging from amdgpu_ttm_tt_set_userptr().
- Moved untagging to validate_range() in userfaultfd code.
- Moved untagging to ib_uverbs_(re)reg_mr() from mlx4_get_umem_mr().
- Rebased onto 5.1.

Changes in v14:
- Moved untagging for most memory syscalls to an arm64 specific
  implementation, instead of doing that in the common code.
- Dropped "net, arm64: untag user pointers in tcp_zerocopy_receive", since
  the provided user pointers don't come from an anonymous map and thus are
  not covered by this ABI relaxation.
- Dropped "kernel, arm64: untag user pointers in prctl_set_mm*".
- Moved untagging from __check_mem_type() to tee_shm_register().
- Updated untagging for the amdgpu and radeon drivers to cover the MMU
  notifier, as suggested by Felix.
- Since this ABI relaxation doesn't actually allow tagged instruction
  pointers, dropped the following patches:
  - "tracing, arm64: untag user pointers in seq_print_user_ip".

[PATCH v15 17/17] selftests, arm64: add a selftest for passing tagged pointers to kernel

2019-05-06 Thread Andrey Konovalov

This patch is a part of a series that extends arm64 kernel ABI to allow to
pass tagged user pointers (with the top byte set to something else other
than 0x00) as syscall arguments.

This patch adds a simple test, that calls the uname syscall with a
tagged user pointer as an argument. Without the kernel accepting tagged
user pointers the test fails with EFAULT.
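The tag arithmetic used by the test can be checked on plain integers without
making any syscall. In the sketch below SHIFT_TAG and SET_TAG are copied from
the patch, while GET_TAG is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the selftest: place a tag in the top byte of a pointer value. */
#define SHIFT_TAG(tag)		((uint64_t)(tag) << 56)
#define SET_TAG(ptr, tag)	(((uint64_t)(ptr) & ~SHIFT_TAG(0xff)) | \
					SHIFT_TAG(tag))

/* Hypothetical helper, not part of the patch: read the top byte back. */
#define GET_TAG(ptr)		((uint8_t)((uint64_t)(ptr) >> 56))
```

Setting tag 0x42 on the value 0x0000ffff12345678 yields 0x4200ffff12345678, and
setting tag 0 on the result restores the original value, so the low 56 bits
that identify the memory are left untouched.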

Signed-off-by: Andrey Konovalov 
---
 tools/testing/selftests/arm64/.gitignore  |  1 +
 tools/testing/selftests/arm64/Makefile| 11 ++
 .../testing/selftests/arm64/run_tags_test.sh  | 12 +++
 tools/testing/selftests/arm64/tags_test.c | 21 +++
 4 files changed, 45 insertions(+)
 create mode 100644 tools/testing/selftests/arm64/.gitignore
 create mode 100644 tools/testing/selftests/arm64/Makefile
 create mode 100755 tools/testing/selftests/arm64/run_tags_test.sh
 create mode 100644 tools/testing/selftests/arm64/tags_test.c

diff --git a/tools/testing/selftests/arm64/.gitignore 
b/tools/testing/selftests/arm64/.gitignore
new file mode 100644
index ..e8fae8d61ed6
--- /dev/null
+++ b/tools/testing/selftests/arm64/.gitignore
@@ -0,0 +1 @@
+tags_test
diff --git a/tools/testing/selftests/arm64/Makefile 
b/tools/testing/selftests/arm64/Makefile
new file mode 100644
index ..a61b2e743e99
--- /dev/null
+++ b/tools/testing/selftests/arm64/Makefile
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
+
+# ARCH can be overridden by the user for cross compiling
+ARCH ?= $(shell uname -m 2>/dev/null || echo not)
+
+ifneq (,$(filter $(ARCH),aarch64 arm64))
+TEST_GEN_PROGS := tags_test
+TEST_PROGS := run_tags_test.sh
+endif
+
+include ../lib.mk
diff --git a/tools/testing/selftests/arm64/run_tags_test.sh 
b/tools/testing/selftests/arm64/run_tags_test.sh
new file mode 100755
index ..745f11379930
--- /dev/null
+++ b/tools/testing/selftests/arm64/run_tags_test.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+echo ""
+echo "running tags test"
+echo ""
+./tags_test
+if [ $? -ne 0 ]; then
+   echo "[FAIL]"
+else
+   echo "[PASS]"
+fi
diff --git a/tools/testing/selftests/arm64/tags_test.c 
b/tools/testing/selftests/arm64/tags_test.c
new file mode 100644
index ..2bd1830a7ebe
--- /dev/null
+++ b/tools/testing/selftests/arm64/tags_test.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SHIFT_TAG(tag) ((uint64_t)(tag) << 56)
+#define SET_TAG(ptr, tag)  (((uint64_t)(ptr) & ~SHIFT_TAG(0xff)) | \
+   SHIFT_TAG(tag))
+
+int main(void)
+{
+   struct utsname *ptr = (struct utsname *)malloc(sizeof(*ptr));
+   void *tagged_ptr = (void *)SET_TAG(ptr, 0x42);
+   int err = uname(tagged_ptr);
+
+   free(ptr);
+   return err;
+}
-- 
2.21.0.1020.gf2820cf01a-goog

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v14 13/17] IB/mlx4, arm64: untag user pointers in mlx4_get_umem_mr

2019-05-06 Thread Andrey Konovalov

On Fri, May 3, 2019 at 7:03 PM Catalin Marinas  wrote:
>
> On Tue, Apr 30, 2019 at 03:25:09PM +0200, Andrey Konovalov wrote:
> > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > pass tagged user pointers (with the top byte set to something else other
> > than 0x00) as syscall arguments.
> >
> > mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can
> > only be done with untagged pointers.
> >
> > Untag user pointers in this function.
> >
> > Signed-off-by: Andrey Konovalov 
> > Reviewed-by: Leon Romanovsky 
> > ---
> >  drivers/infiniband/hw/mlx4/mr.c | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/mlx4/mr.c 
> > b/drivers/infiniband/hw/mlx4/mr.c
> > index 395379a480cb..9a35ed2c6a6f 100644
> > --- a/drivers/infiniband/hw/mlx4/mr.c
> > +++ b/drivers/infiniband/hw/mlx4/mr.c
> > @@ -378,6 +378,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata 
> > *udata, u64 start,
> >* again
> >*/
> >   if (!ib_access_writable(access_flags)) {
> > + unsigned long untagged_start = untagged_addr(start);
> >   struct vm_area_struct *vma;
> >
> >   down_read(&current->mm->mmap_sem);
> > @@ -386,9 +387,9 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata 
> > *udata, u64 start,
> >* cover the memory, but for now it requires a single vma to
> >* entirely cover the MR to support RO mappings.
> >*/
> > - vma = find_vma(current->mm, start);
> > - if (vma && vma->vm_end >= start + length &&
> > - vma->vm_start <= start) {
> > + vma = find_vma(current->mm, untagged_start);
> > + if (vma && vma->vm_end >= untagged_start + length &&
> > + vma->vm_start <= untagged_start) {
> >   if (vma->vm_flags & VM_WRITE)
> >   access_flags |= IB_ACCESS_LOCAL_WRITE;
> >   } else {
>
> Discussion ongoing on the previous version of the patch but I'm more
> inclined to do this in ib_uverbs_(re)reg_mr() on cmd.start.

OK, I want to publish v15 sooner to fix the issue with email
addresses, so I'll implement this approach there for now.



>
> --
> Catalin

[PATCH v15 05/17] arm64: untag user pointers passed to memory syscalls

2019-05-06 Thread Andrey Konovalov

This patch is a part of a series that extends arm64 kernel ABI to allow to
pass tagged user pointers (with the top byte set to something else other
than 0x00) as syscall arguments.

This patch allows tagged pointers to be passed to the following memory
syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2,
mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
remap_file_pages, shmat and shmdt.

This is done by untagging pointers passed to these syscalls in the
prologues of their handlers.
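The prologue pattern can be modeled in user space as a thin wrapper that strips
the tag and then delegates to the common handler. Everything in the sketch
below is a hypothetical stand-in (sketch_untag(), sketch_ksys_munmap(),
sketch_arm64_munmap()), and it assumes a tag in bits 63:56 and 4 KiB pages; the
common handler simply rejects any address it could not match against a mapping.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EINVAL		22
#define SKETCH_PAGE_MASK	UINT64_C(0xfff)	/* assumption: 4 KiB pages */

/* Assumption: the tag lives in bits 63:56 and is zero for a plain address. */
static inline uint64_t sketch_untag(uint64_t addr)
{
	return addr & ~(UINT64_C(0xff) << 56);
}

/*
 * Stand-in for a common handler that expects an untagged, page-aligned
 * address. A tagged address is rejected because it cannot match a mapping.
 */
static inline long sketch_ksys_munmap(uint64_t addr, uint64_t len)
{
	if ((addr & SKETCH_PAGE_MASK) || len == 0)
		return -SKETCH_EINVAL;
	if (addr >> 56)
		return -SKETCH_EINVAL;
	return 0;
}

/* Stand-in for the arch wrapper: untag in the prologue, then delegate. */
static inline long sketch_arm64_munmap(uint64_t addr, uint64_t len)
{
	addr = sketch_untag(addr);
	return sketch_ksys_munmap(addr, len);
}
```

The common handler stays unaware of tags; only the architecture-specific entry
point knows that the top byte has to be cleared first.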

Signed-off-by: Andrey Konovalov 
---
 arch/arm64/kernel/sys.c | 128 +++-
 1 file changed, 127 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
index b44065fb1616..933bb9f3d6ec 100644
--- a/arch/arm64/kernel/sys.c
+++ b/arch/arm64/kernel/sys.c
@@ -35,10 +35,33 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, 
len,
 {
if (offset_in_page(off) != 0)
return -EINVAL;
-
+   addr = untagged_addr(addr);
return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
 }
 
+SYSCALL_DEFINE6(arm64_mmap_pgoff, unsigned long, addr, unsigned long, len,
+   unsigned long, prot, unsigned long, flags,
+   unsigned long, fd, unsigned long, pgoff)
+{
+   addr = untagged_addr(addr);
+   return ksys_mmap_pgoff(addr, len, prot, flags, fd, pgoff);
+}
+
+SYSCALL_DEFINE5(arm64_mremap, unsigned long, addr, unsigned long, old_len,
+   unsigned long, new_len, unsigned long, flags,
+   unsigned long, new_addr)
+{
+   addr = untagged_addr(addr);
+   new_addr = untagged_addr(new_addr);
+   return ksys_mremap(addr, old_len, new_len, flags, new_addr);
+}
+
+SYSCALL_DEFINE2(arm64_munmap, unsigned long, addr, size_t, len)
+{
+   addr = untagged_addr(addr);
+   return ksys_munmap(addr, len);
+}
+
 SYSCALL_DEFINE1(arm64_personality, unsigned int, personality)
 {
if (personality(personality) == PER_LINUX32 &&
@@ -47,10 +70,113 @@ SYSCALL_DEFINE1(arm64_personality, unsigned int, 
personality)
return ksys_personality(personality);
 }
 
+SYSCALL_DEFINE1(arm64_brk, unsigned long, brk)
+{
+   brk = untagged_addr(brk);
+   return ksys_brk(brk);
+}
+
+SYSCALL_DEFINE5(arm64_get_mempolicy, int __user *, policy,
+   unsigned long __user *, nmask, unsigned long, maxnode,
+   unsigned long, addr, unsigned long, flags)
+{
+   addr = untagged_addr(addr);
+   return ksys_get_mempolicy(policy, nmask, maxnode, addr, flags);
+}
+
+SYSCALL_DEFINE3(arm64_madvise, unsigned long, start,
+   size_t, len_in, int, behavior)
+{
+   start = untagged_addr(start);
+   return ksys_madvise(start, len_in, behavior);
+}
+
+SYSCALL_DEFINE6(arm64_mbind, unsigned long, start, unsigned long, len,
+   unsigned long, mode, const unsigned long __user *, nmask,
+   unsigned long, maxnode, unsigned int, flags)
+{
+   start = untagged_addr(start);
+   return ksys_mbind(start, len, mode, nmask, maxnode, flags);
+}
+
+SYSCALL_DEFINE2(arm64_mlock, unsigned long, start, size_t, len)
+{
+   start = untagged_addr(start);
+   return ksys_mlock(start, len, VM_LOCKED);
+}
+
+SYSCALL_DEFINE2(arm64_mlock2, unsigned long, start, size_t, len)
+{
+   start = untagged_addr(start);
+   return ksys_mlock(start, len, VM_LOCKED);
+}
+
+SYSCALL_DEFINE2(arm64_munlock, unsigned long, start, size_t, len)
+{
+   start = untagged_addr(start);
+   return ksys_munlock(start, len);
+}
+
+SYSCALL_DEFINE3(arm64_mprotect, unsigned long, start, size_t, len,
+   unsigned long, prot)
+{
+   start = untagged_addr(start);
+   return ksys_mprotect_pkey(start, len, prot, -1);
+}
+
+SYSCALL_DEFINE3(arm64_msync, unsigned long, start, size_t, len, int, flags)
+{
+   start = untagged_addr(start);
+   return ksys_msync(start, len, flags);
+}
+
+SYSCALL_DEFINE3(arm64_mincore, unsigned long, start, size_t, len,
+   unsigned char __user *, vec)
+{
+   start = untagged_addr(start);
+   return ksys_mincore(start, len, vec);
+}
+
+SYSCALL_DEFINE5(arm64_remap_file_pages, unsigned long, start,
+   unsigned long, size, unsigned long, prot,
+   unsigned long, pgoff, unsigned long, flags)
+{
+   start = untagged_addr(start);
+   return ksys_remap_file_pages(start, size, prot, pgoff, flags);
+}
+
+SYSCALL_DEFINE3(arm64_shmat, int, shmid, char __user *, shmaddr, int, shmflg)
+{
+   shmaddr = untagged_addr(shmaddr);
+   return ksys_shmat(shmid, shmaddr, shmflg);
+}
+
+SYSCALL_DEFINE1(arm64_shmdt, char __user *, shmaddr)
+{
+   shmaddr = untagged_addr(shmaddr);
+   return ksys_shmdt(shmaddr);
+}
+
 /*
  * Wrappers to pass the pt_regs argument.
  */
 #define sys_personality                sys_arm64_personality
+#define sys_mmap_pgoff sys_arm64_mmap_pgoff
+#define sys_mremap

[PATCH v15 06/17] mm: untag user pointers in do_pages_move

2019-05-06 Thread Andrey Konovalov

This patch is a part of a series that extends arm64 kernel ABI to allow to
pass tagged user pointers (with the top byte set to something else other
than 0x00) as syscall arguments.

do_pages_move() is used in the implementation of the move_pages syscall.

Untag user pointers in this function.

Signed-off-by: Andrey Konovalov 
---
 mm/migrate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/migrate.c b/mm/migrate.c
index 663a5449367a..c014a07135f0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1617,6 +1617,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t 
task_nodes,
if (get_user(node, nodes + i))
goto out_flush;
addr = (unsigned long)p;
+   addr = untagged_addr(addr);
 
err = -ENODEV;
if (node < 0 || node >= MAX_NUMNODES)
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 07/17] mm, arm64: untag user pointers in mm/gup.c

2019-05-06 Thread Andrey Konovalov

This patch is a part of a series that extends arm64 kernel ABI to allow to
pass tagged user pointers (with the top byte set to something else other
than 0x00) as syscall arguments.

mm/gup.c provides a kernel interface that accepts user addresses and
manipulates user pages directly (for example get_user_pages, that is used
by the futex syscall). Since a user can provide tagged addresses, we need
to handle this case.

Add untagging to gup.c functions that use user addresses for vma lookups.

Signed-off-by: Andrey Konovalov 
---
 mm/gup.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index 91819b8ad9cc..2f477a0a7180 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -696,6 +696,8 @@ static long __get_user_pages(struct task_struct *tsk, 
struct mm_struct *mm,
if (!nr_pages)
return 0;
 
+   start = untagged_addr(start);
+
VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
 
/*
@@ -858,6 +860,8 @@ int fixup_user_fault(struct task_struct *tsk, struct 
mm_struct *mm,
struct vm_area_struct *vma;
vm_fault_t ret, major = 0;
 
+   address = untagged_addr(address);
+
if (unlocked)
fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 
-- 
2.21.0.1020.gf2820cf01a-goog


[PATCH v15 09/17] fs, arm64: untag user pointers in copy_mount_options

2019-05-06 Thread Andrey Konovalov

This patch is a part of a series that extends arm64 kernel ABI to allow to
pass tagged user pointers (with the top byte set to something else other
than 0x00) as syscall arguments.

In copy_mount_options a user address is being subtracted from TASK_SIZE.
If the address is lower than TASK_SIZE, the size is calculated to not
allow the exact_copy_from_user() call to cross TASK_SIZE boundary.
However if the address is tagged, then the size will be calculated
incorrectly.

Untag the address before subtracting.
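A worked example makes the miscalculation concrete. The sketch below is a
user-space model, not the kernel function: SKETCH_TASK_SIZE (a 48-bit user
address space) and SKETCH_PAGE_SIZE (4 KiB) are assumed values, and
sketch_untag() and sketch_copy_size() are hypothetical helpers that mirror the
subtraction and clamp from copy_mount_options().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TASK_SIZE	(UINT64_C(1) << 48)	/* assumption: 48-bit user VA */
#define SKETCH_PAGE_SIZE	UINT64_C(4096)		/* assumption: 4 KiB pages */

/* Assumption: the tag lives in bits 63:56 and is zero for a plain address. */
static inline uint64_t sketch_untag(uint64_t addr)
{
	return addr & ~(UINT64_C(0xff) << 56);
}

/* Mirrors the size computation: bytes left below the limit, capped to a page. */
static inline uint64_t sketch_copy_size(uint64_t data)
{
	uint64_t size = SKETCH_TASK_SIZE - data;

	if (size > SKETCH_PAGE_SIZE)
		size = SKETCH_PAGE_SIZE;
	return size;
}
```

For an address 16 bytes below the limit the untagged computation gives 16. With
tag 0x42 the unsigned subtraction wraps around to a huge value that is then
clamped to a full page, so the copy would be allowed to run past the limit.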

Signed-off-by: Andrey Konovalov 
---
 fs/namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index c9cab307fa77..c27e5713bf04 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2825,7 +2825,7 @@ void *copy_mount_options(const void __user * data)
 * the remainder of the page.
 */
/* copy_from_user cannot cross TASK_SIZE ! */
-   size = TASK_SIZE - (unsigned long)data;
+   size = TASK_SIZE - (unsigned long)untagged_addr(data);
if (size > PAGE_SIZE)
size = PAGE_SIZE;
 
-- 
2.21.0.1020.gf2820cf01a-goog


Re: [PATCH] drm/amdgpu: rename amdgpu_prime.[ch] into amdgpu_dma_buf.[ch]

2019-05-06 Thread Alex Deucher

On Mon, May 6, 2019 at 7:24 AM Christian König
 wrote:
>
> We are getting a dma-buf implementation completely separate from drm prime,
> so rename the files now and cleanup the code a bit.
>
> No functional change.
>
> Signed-off-by: Christian König 

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|   1 +
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   1 +
>  .../{amdgpu_prime.c => amdgpu_dma_buf.c}  | 131 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h   |  46 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h   |  16 ---
>  7 files changed, 116 insertions(+), 83 deletions(-)
>  rename drivers/gpu/drm/amd/amdgpu/{amdgpu_prime.c => amdgpu_dma_buf.c} (93%)
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 7d539ba6400d..11a651ff7f0d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -49,7 +49,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
> amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \
> amdgpu_pm.o atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \
> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
> -   amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
> +   amdgpu_dma_buf.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
> amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
> amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o 
> amdgpu_atomfirmware.o \
> amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index aeead072fa79..e829c53accf5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include "amdgpu.h"
>  #include "amdgpu_gfx.h"
> +#include "amdgpu_dma_buf.h"
>  #include 
>  #include 
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 047bba8c62d6..2bc80942e5d5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -30,6 +30,7 @@
>  #include "amdgpu_object.h"
>  #include "amdgpu_vm.h"
>  #include "amdgpu_amdkfd.h"
> +#include "amdgpu_dma_buf.h"
>
>  /* Special VM and GART address alignment needed for VI pre-Fiji due to
>   * a HW bug.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> similarity index 93%
> rename from drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> rename to drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index a38e0fb4a6fe..4711cf1b5bd2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright 2012 Advanced Micro Devices, Inc.
> + * Copyright 2019 Advanced Micro Devices, Inc.
>   *
>   * Permission is hereby granted, free of charge, to any person obtaining a
>   * copy of this software and associated documentation files (the "Software"),
> @@ -103,7 +103,8 @@ void amdgpu_gem_prime_vunmap(struct drm_gem_object *obj, 
> void *vaddr)
>   * Returns:
>   * 0 on success or a negative error code on failure.
>   */
> -int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct 
> *vma)
> +int amdgpu_gem_prime_mmap(struct drm_gem_object *obj,
> + struct vm_area_struct *vma)
>  {
> struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
> struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> @@ -137,57 +138,6 @@ int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, 
> struct vm_area_struct *vma
> return ret;
>  }
>
> -/**
> - * amdgpu_gem_prime_import_sg_table - &drm_driver.gem_prime_import_sg_table
> - * implementation
> - * @dev: DRM device
> - * @attach: DMA-buf attachment
> - * @sg: Scatter/gather table
> - *
> - * Imports shared DMA buffer memory exported by another device.
> - *
> - * Returns:
> - * A new GEM BO of the given DRM device, representing the memory
> - * described by the given DMA-buf attachment and scatter/gather table.
> - */
> -struct drm_gem_object *
> -amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
> -struct dma_buf_attachment *attach,
> -struct sg_table *sg)
> -{
> -   struct reservation_object *resv = attach->dmabuf->resv;
> -   struct amdgpu_device *adev = dev->dev_private;
> -   struct amdgpu_bo *bo;
> -   struct amdgpu_bo_param bp;
> -   int ret;
> -
> -   memset(&bp, 0, sizeof(bp));
> -   bp.size = attach->dmabuf->size;
> -   bp.byte_align = PAGE_SIZE;
> -   bp.domain = AMDGPU_GEM_DOMAIN_CPU;

Re: [PATCH v14 10/17] fs, arm64: untag user pointers in fs/userfaultfd.c

2019-05-06 Thread Andrey Konovalov

On Fri, May 3, 2019 at 6:56 PM Catalin Marinas  wrote:
>
> On Tue, Apr 30, 2019 at 03:25:06PM +0200, Andrey Konovalov wrote:
> > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > pass tagged user pointers (with the top byte set to something else other
> > than 0x00) as syscall arguments.
> >
> > userfaultfd_register() and userfaultfd_unregister() use provided user
> > pointers for vma lookups, which can only be done with untagged pointers.
> >
> > Untag user pointers in these functions.
> >
> > Signed-off-by: Andrey Konovalov 
> > ---
> >  fs/userfaultfd.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > index f5de1e726356..fdee0db0e847 100644
> > --- a/fs/userfaultfd.c
> > +++ b/fs/userfaultfd.c
> > @@ -1325,6 +1325,9 @@ static int userfaultfd_register(struct 
> > userfaultfd_ctx *ctx,
> >   goto out;
> >   }
> >
> > + uffdio_register.range.start =
> > + untagged_addr(uffdio_register.range.start);
> > +
> >   ret = validate_range(mm, uffdio_register.range.start,
> >uffdio_register.range.len);
> >   if (ret)
> > @@ -1514,6 +1517,8 @@ static int userfaultfd_unregister(struct 
> > userfaultfd_ctx *ctx,
> >   if (copy_from_user(&uffdio_unregister, buf, 
> > sizeof(uffdio_unregister)))
> >   goto out;
> >
> > + uffdio_unregister.start = untagged_addr(uffdio_unregister.start);
> > +
> >   ret = validate_range(mm, uffdio_unregister.start,
> >uffdio_unregister.len);
> >   if (ret)
>
> Wouldn't it be easier to do this in validate_range()? There are a few
> more calls in this file, though I didn't check whether a tagged address
> would cause issues.

Yes, I think it makes more sense, will do in v15, thanks!

>
> --
> Catalin

Re: [PATCH] drm/amdgpu: Use FW addr returned by PSP for VF MM

2019-05-06 Thread Deucher, Alexander

As long as this doesn't break bare metal, I'm ok with it.
Acked-by: Alex Deucher 

From: amd-gfx on behalf of Trigger Huang
Sent: Thursday, May 2, 2019 8:56 AM
To: amd-gfx@lists.freedesktop.org
Cc: Huang, Trigger
Subject: [PATCH] drm/amdgpu: Use FW addr returned by PSP for VF MM

On a Vega10 SR-IOV VF, the FW address returned by PSP should be
set into the init table instead of the original BO mc address.
Otherwise, the UVD and VCE IB tests will fail under Vega10 SR-IOV.

reference:
commit bfcea5204287 ("drm/amdgpu:change VEGA booting with firmware 
loaded by PSP")
commit aa5873dca463 ("drm/amdgpu: Change VCE booting with firmware 
loaded by PSP")

Signed-off-by: Trigger Huang 
---
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 17 +++--
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index dc461df..2191d3d 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -787,10 +787,13 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device 
*adev)
   0x, 
0x0004);
/* mc resume*/
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
-   
lower_32_bits(adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].mc_addr));
-   
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
-   
upper_32_bits(adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].mc_addr));
+   
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i,
+   
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
+   
adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].tmr_mc_addr_lo);
+   
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i,
+   
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
+   
adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].tmr_mc_addr_hi);
+   
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0), 
0);
offset = 0;
} else {

MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
@@ -798,10 +801,11 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device 
*adev)

MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),

upper_32_bits(adev->uvd.inst[i].gpu_addr));
offset = size;
+   
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0),
+   
AMDGPU_UVD_FIRMWARE_OFFSET >> 3);
+
}

-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_VCPU_CACHE_OFFSET0),
-   AMDGPU_UVD_FIRMWARE_OFFSET 
>> 3);
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_VCPU_CACHE_SIZE0), size);

MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, 
mmUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW),
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index f3f5938..c0ec279 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -244,13 +244,18 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
*adev)
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_SWAP_CNTL1), 0);
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VM_CTRL), 0);

+   offset = AMDGPU_VCE_FIRMWARE_OFFSET;
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+   uint32_t low = 
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].tmr_mc_addr_lo;
+   uint32_t hi = 
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].tmr_mc_addr_hi;
+   uint64_t tmr_mc_addr = (uint64_t)(hi) << 32 | low;
+
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
-   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-

Re: [PATCH v14 11/17] drm/amdgpu, arm64: untag user pointers

2019-05-06 Thread Andrey Konovalov

On Tue, Apr 30, 2019 at 8:03 PM Kuehling, Felix  wrote:
>
> On 2019-04-30 9:25 a.m., Andrey Konovalov wrote:
> > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > pass tagged user pointers (with the top byte set to something else other
> > than 0x00) as syscall arguments.
> >
> > amdgpu_ttm_tt_get_user_pages() uses provided user pointers for vma
> > lookups, which can only be done with untagged pointers. This patch
> > untags user pointers when they are being set in
> > amdgpu_ttm_tt_set_userptr().
> >
> > In amdgpu_gem_userptr_ioctl() and amdgpu_amdkfd_gpuvm.c/init_user_pages()
> > an MMU notifier is set up with a (tagged) userspace pointer. The untagged
> > address should be used so that MMU notifiers for the untagged address get
> > correctly matched up with the right BO. This patch untags user pointers in
> > amdgpu_gem_userptr_ioctl() for the GEM case and in
> > amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu() for the KFD case.
> >
> > Suggested-by: Kuehling, Felix 
> > Signed-off-by: Andrey Konovalov 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  | 2 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  | 2 +-
> >   3 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index 1921dec3df7a..20cac44ed449 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -1121,7 +1121,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
> >  alloc_flags = 0;
> >  if (!offset || !*offset)
> >  return -EINVAL;
> > -   user_addr = *offset;
> > +   user_addr = untagged_addr(*offset);
> >  } else if (flags & ALLOC_MEM_FLAGS_DOORBELL) {
> >  domain = AMDGPU_GEM_DOMAIN_GTT;
> >  alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > index d21dd2f369da..985cb82b2aa6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > @@ -286,6 +286,8 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, 
> > void *data,
> >  uint32_t handle;
> >  int r;
> >
> > +   args->addr = untagged_addr(args->addr);
> > +
> >  if (offset_in_page(args->addr | args->size))
> >  return -EINVAL;
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 73e71e61dc99..1d30e97ac2c4 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1248,7 +1248,7 @@ int amdgpu_ttm_tt_set_userptr(struct ttm_tt *ttm, 
> > uint64_t addr,
> >  if (gtt == NULL)
> >  return -EINVAL;
> >
> > -   gtt->userptr = addr;
> > +   gtt->userptr = untagged_addr(addr);
>
> Doing this here seems unnecessary. You already untagged the address in
> both callers of this function. Untagging in the two callers ensures that
> the userptr and MMU notifier are in sync, using the same untagged
> address. Doing it again here is redundant.

Will fix in v15, thanks!

>
> Regards,
>Felix
>
>
> >  gtt->userflags = flags;
> >
> >  if (gtt->usertask)
> > --
> > 2.21.0.593.g511ec345e18-goog
> >

Re: [PATCH v14 08/17] mm, arm64: untag user pointers in get_vaddr_frames

2019-05-06 Thread Andrey Konovalov

On Fri, May 3, 2019 at 6:51 PM Catalin Marinas  wrote:
>
> On Tue, Apr 30, 2019 at 03:25:04PM +0200, Andrey Konovalov wrote:
> > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > pass tagged user pointers (with the top byte set to something else other
> > than 0x00) as syscall arguments.
> >
> > get_vaddr_frames uses provided user pointers for vma lookups, which can
> > only be done with untagged pointers. Instead of locating and changing
> > all callers of this function, perform untagging in it.
> >
> > Signed-off-by: Andrey Konovalov 
> > ---
> >  mm/frame_vector.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> > index c64dca6e27c2..c431ca81dad5 100644
> > --- a/mm/frame_vector.c
> > +++ b/mm/frame_vector.c
> > @@ -46,6 +46,8 @@ int get_vaddr_frames(unsigned long start, unsigned int 
> > nr_frames,
> >   if (WARN_ON_ONCE(nr_frames > vec->nr_allocated))
> >   nr_frames = vec->nr_allocated;
> >
> > + start = untagged_addr(start);
> > +
> >   down_read(&mm->mmap_sem);
> >   locked = 1;
> >   vma = find_vma_intersection(mm, start, start + 1);
>
> Is this some buffer that the user may have malloc'ed? I got lost when
> trying to track down the provenience of this buffer.

The caller that I found when I was looking at this:

drivers/gpu/drm/exynos/exynos_drm_g2d.c:482
exynos_g2d_set_cmdlist_ioctl()->g2d_map_cmdlist_gem()->g2d_userptr_get_dma_addr()->get_vaddr_frames()

>
> --
> Catalin

Re: [PATCH v14 12/17] drm/radeon, arm64: untag user pointers

2019-05-06 Thread Andrey Konovalov

On Tue, Apr 30, 2019 at 7:57 PM Kuehling, Felix  wrote:
>
> On 2019-04-30 9:25 a.m., Andrey Konovalov wrote:
> > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > pass tagged user pointers (with the top byte set to something else other
> > than 0x00) as syscall arguments.
> >
> > radeon_ttm_tt_pin_userptr() uses provided user pointers for vma
> > lookups, which can only be done with untagged pointers. This patch
> > untags user pointers when they are being set in
> > radeon_ttm_tt_pin_userptr().
> >
> > In radeon_gem_userptr_ioctl() an MMU notifier is set up with a (tagged)
> > userspace pointer. The untagged address should be used so that MMU
> > notifiers for the untagged address get correctly matched up with the right
> > BO. This patch untags user pointers in radeon_gem_userptr_ioctl().
> >
> > Signed-off-by: Andrey Konovalov 
> > ---
> >   drivers/gpu/drm/radeon/radeon_gem.c | 2 ++
> >   drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
> >   2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> > b/drivers/gpu/drm/radeon/radeon_gem.c
> > index 44617dec8183..90eb78fb5eb2 100644
> > --- a/drivers/gpu/drm/radeon/radeon_gem.c
> > +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> > @@ -291,6 +291,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, 
> > void *data,
> >  uint32_t handle;
> >  int r;
> >
> > +   args->addr = untagged_addr(args->addr);
> > +
> >  if (offset_in_page(args->addr | args->size))
> >  return -EINVAL;
> >
> > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
> > b/drivers/gpu/drm/radeon/radeon_ttm.c
> > index 9920a6fc11bf..dce722c494c1 100644
> > --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> > @@ -742,7 +742,7 @@ int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, 
> > uint64_t addr,
> >  if (gtt == NULL)
> >  return -EINVAL;
> >
> > -   gtt->userptr = addr;
> > +   gtt->userptr = untagged_addr(addr);
>
> Doing this here seems unnecessary, because you already untagged the
> address in the only caller of this function in radeon_gem_userptr_ioctl.
> The change there will affect both the userptr and MMU notifier setup and
> makes sure that both are in sync, using the same untagged address.

Will fix in v15, thanks!

>
> Regards,
>Felix
>
>
> >  gtt->usermm = current->mm;
> >  gtt->userflags = flags;
> >  return 0;
> > --
> > 2.21.0.593.g511ec345e18-goog
> >
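
To illustrate the review point above (untag once in the ioctl so that the
userptr and the MMU notifier setup both see the same address), here is a
minimal user-space sketch in C. It is not the kernel's untagged_addr()
macro: sketch_untag(), struct sketch_userptr_args, sketch_userptr_ioctl()
and SKETCH_PAGE_SIZE are hypothetical names, and the model assumes the tag
occupies bits 63:56 with bit 55 distinguishing user from kernel addresses.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/*
 * Model of untagging: keep the low 56 bits and refill the top byte from
 * bit 55, so a tagged user pointer loses its tag and a kernel-style
 * address (bit 55 set) is left unchanged.
 */
static inline uint64_t sketch_untag(uint64_t addr)
{
	uint64_t low = addr & ((1ULL << 56) - 1);

	if (addr & (1ULL << 55))
		low |= 0xFFULL << 56;
	return low;
}

/* Hypothetical stand-in for the ioctl argument structure. */
struct sketch_userptr_args {
	uint64_t addr;
	uint64_t size;
};

/*
 * Hypothetical ioctl-style entry point: untag once at the top, then hand
 * the same untagged value to both consumers.
 */
static int sketch_userptr_ioctl(struct sketch_userptr_args *args,
				uint64_t *userptr_out,
				uint64_t *notifier_addr_out)
{
	args->addr = sketch_untag(args->addr);

	/* Same idea as offset_in_page(addr | size) in the quoted patch. */
	if ((args->addr | args->size) & (SKETCH_PAGE_SIZE - 1))
		return -EINVAL;

	*userptr_out = args->addr;		/* what set_userptr would store */
	*notifier_addr_out = args->addr;	/* what the MMU notifier would track */
	return 0;
}
```

Because the address is rewritten in the argument structure before anything
else reads it, a second untagging further down the call chain would be a
no-op in this model, which is the reason the change in
radeon_ttm_tt_set_userptr() was called unnecessary.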

RE: [PATCH] drm/amdgpu: Use FW addr returned by PSP for VF MM

2019-05-06 Thread Huang, Trigger

Ping again.


Thanks & Best Wishes,
Trigger Huang

-Original Message-
From: Trigger Huang  
Sent: Thursday, May 02, 2019 8:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Huang, Trigger 
Subject: [PATCH] drm/amdgpu: Use FW addr returned by PSP for VF MM

On a Vega10 SR-IOV VF, the FW address returned by PSP should be set into the
init table instead of the original BO MC address.
Otherwise, the UVD and VCE IB tests will fail under Vega10 SR-IOV.

reference:
commit bfcea5204287 ("drm/amdgpu:change VEGA booting with firmware 
loaded by PSP")
commit aa5873dca463 ("drm/amdgpu: Change VCE booting with firmware 
loaded by PSP")

Signed-off-by: Trigger Huang 
---
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 17 +++--
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index dc461df..2191d3d 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -787,10 +787,13 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device *adev)
 						    0x, 0x0004);
 			/* mc resume*/
 			if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-				MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
-							    lower_32_bits(adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].mc_addr));
-				MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
-							    upper_32_bits(adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].mc_addr));
+				MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i,
+							    mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
+							    adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].tmr_mc_addr_lo);
+				MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i,
+							    mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
+							    adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].tmr_mc_addr_hi);
+				MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0), 0);
 				offset = 0;
 			} else {
 				MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
@@ -798,10 +801,11 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device *adev)
 				MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
 							    upper_32_bits(adev->uvd.inst[i].gpu_addr));
 				offset = size;
+				MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0),
+							    AMDGPU_UVD_FIRMWARE_OFFSET >> 3);
+
 			}
 
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_VCPU_CACHE_OFFSET0),
-						    AMDGPU_UVD_FIRMWARE_OFFSET >> 3);
 			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_VCPU_CACHE_SIZE0), size);
 
 			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, i, mmUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW),
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index f3f5938..c0ec279 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -244,13 +244,18 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_SWAP_CNTL1), 0);
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VM_CTRL), 0);
 
+		offset = AMDGPU_VCE_FIRMWARE_OFFSET;
 		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+			uint32_t low = adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].tmr_mc_addr_lo;
+			uint32_t hi = adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].tmr_mc_addr_hi;
+			uint64_t tmr_mc_addr = (uint64_t)(hi) << 32 | low;
+
 			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
-						mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-						adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+
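
The patch above splits and recombines 64-bit MC addresses as 32-bit
halves. A small stand-alone C sketch of that arithmetic follows; the
sketch_* helpers are hypothetical user-space stand-ins written for this
note, not the kernel's lower_32_bits()/upper_32_bits() macros, and the
sample address in the comment is made up.

```c
#include <stdint.h>

/* Low 32 bits of a 64-bit address. */
static inline uint32_t sketch_lower_32_bits(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

/* High 32 bits of a 64-bit address. */
static inline uint32_t sketch_upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/*
 * Rebuild the 64-bit value from its halves. The cast to uint64_t has to
 * happen before the shift: shifting a 32-bit value left by 32 is
 * undefined behaviour in C. Example: hi = 0x12, lo = 0x34567890 gives
 * 0x1234567890.
 */
static inline uint64_t sketch_join_hi_lo(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

The patch's own expression, (uint64_t)(hi) << 32 | low, performs the same
widening cast before the shift.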

Re: [PATCH v3 00/26] compat_ioctl: cleanups

2019-05-06 Thread Andy Shevchenko

On Tue, Apr 16, 2019 at 11:23 PM Arnd Bergmann  wrote:
>
> Hi Al,
>
> It took me way longer than I had hoped to revisit this series, see
> https://lore.kernel.org/lkml/20180912150142.157913-1-a...@arndb.de/
> for the previously posted version.
>
> I've come to the point where all conversion handlers and most
> COMPATIBLE_IOCTL() entries are gone from this file, but for
> now, this series only has the parts that have either been reviewed
> previously, or that are simple enough to include.
>
> The main missing piece is the SG_IO/SG_GET_REQUEST_TABLE conversion.
> I'll post the patches I made for that later, as they need more
> testing and review from the scsi maintainers.
>
> I hope you can still take these for the coming merge window, unless
> new problems come up.

>  drivers/platform/x86/wmi.c  |   2 +-

Acked-by: Andy Shevchenko 

-- 
With Best Regards,
Andy Shevchenko

Re: Bug Report: [PowerPlay] MCLK can't be set above 1107MHz on Vega 64

2019-05-06 Thread Yanik Yiannakis


Hello Evan,

Yes, I always used that command to commit my changes. I also have 
amdgpu.ppfeaturemask=0x as a boot parameter and I set 
power_dpm_force_performance_level to manual. Sorry for omitting that; I 
assumed it was evident.


I have heard that the MCLK can only be as high as the SOCCLK. That would 
make sense because the SOCCLK of my Vega 64 is 1107MHz in its highest 
state. I noticed that on Windows the SOCCLK is raised automatically if 
the user sets the MCLK high enough through Wattman.


To replicate this on Linux I manually edited the pp_table to change the 
MCLK to 1175MHz and the SOCCLK to 1180MHz. The new SOCCLK was displayed 
in pp_dpm_socclk and in Unigine Superposition the FPS increased as 
expected (compared to an MCLK of 1107MHz). As a final test I edited the 
pp_table to set the MCLK to 1220MHz (this was unstable on Windows) and 
the SOCCLK to 1250MHz. This resulted in a crash (just like on Windows) 
which indicates that the MCLK really was set to 1220MHz.


My understanding of the situation is that powerplay doesn't 
automatically raise the SOCCLK like Wattman.
It would be cool if the user had the ability to overclock the SOCCLK 
through powerplay.


Greetings,
Yanik


On 06.05.19 10:13, Quan, Evan wrote:


+Alex,

Hi Yanik,

Did you run the following command to commit your OD settings 
(before running games)? Without it, the settings never actually take 
effect.


echo "c" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage


Regards,

Evan

*From:*Yanik Yiannakis 
*Sent:* Monday, April 29, 2019 7:44 AM
*To:* rex@amd.com; Quan, Evan ; 
amd-gfx@lists.freedesktop.org
*Subject:* Bug Report: [PowerPlay] MCLK can't be set above 1107MHz on 
Vega 64


Hello,

I experience a bug that prevents me from setting the MCLK of my Vega 
64 LC above 1107MHz.


I am using Unigine Superposition 1.1 in "Game"-mode to check the 
performance by watching the FPS.


*Behaviour with a single monitor:*

First I set the MCLK to a known stable value below 1108MHz:

$ echo "m 3 1100 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage


In Unigine Superposition the FPS increase as expected.

pp_dpm_mclk also confirms the change.

$ watch cat 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_dpm_mclk


0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1100Mhz *

After that I set the MCLK to a stable value above 1107MHz:

$ echo "m 3 1200 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage


In Unigine Superposition the FPS drop drastically.

pp_dpm_mclk indicates that the MCLK is stuck in state 0 (167MHz):

0: 167Mhz *
1: 500Mhz
2: 800Mhz
3: 1200Mhz
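
For anyone who wants to check the active level programmatically rather
than by eye, the listing format shown above ("<level>: <clock>Mhz", with
" *" marking the active level) can be parsed with a few lines of C. This
is only a sketch based on the output pasted in this report;
sketch_active_level() is a hypothetical helper, and the format is assumed
from the samples here rather than from any documented interface.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a pp_dpm_mclk-style listing and return the index of the level
 * marked with '*', storing its clock (in MHz) in *mhz. Returns -1 if no
 * line carries the marker.
 */
static int sketch_active_level(const char *text, int *mhz)
{
	const char *line = text;

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);
		int level, clock;

		/* Parse "<level>: <clock>Mhz" and look for '*' on this line only. */
		if (sscanf(line, "%d: %dMhz", &level, &clock) == 2 &&
		    memchr(line, '*', len)) {
			*mhz = clock;
			return level;
		}
		line = eol ? eol + 1 : NULL;
	}
	return -1;
}
```

Fed the listing directly above, it would report level 0 at 167 MHz, which
is the "stuck in state 0" symptom described here.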

*Behaviour with multiple monitors that have different refresh rates:*

My monitors have different refresh rates. This causes the MCLK to stay 
in state 3 (945MHz stock) which is the expected behaviour as I 
understand it.


Now I try to set the MCLK to a value above 1107MHz:

$ echo "m 3 1200 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage


The FPS in Unigine Superposition remain the same as they were with 945MHz.

pp_dpm_mclk shows however that the value was set:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1200Mhz *

Then I set the MCLK to a value of 1107MHz or lower:

$ echo "m 3 1100 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage


The FPS in Unigine Superposition *increase*.

pp_dpm_mclk again confirms the set value:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1100Mhz *

Finally I increase MCLK to a known unstable value:

$ echo "m 3 1300 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage


The FPS in Unigine Superposition remain the same. I therefore believe 
the value was not actually applied.


However pp_dpm_mclk shows that it was:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1300Mhz *

amdgpu_pm_info also claims that the value was set:

$ sudo watch cat /sys/kernel/debug/dri/1/amdgpu_pm_info

GFX Clocks and Power:
    1300 MHz (MCLK)
    27 MHz (SCLK)
    1348 MHz (PSTATE_SCLK)
    800 MHz (PSTATE_MCLK)
    825 mV (VDDGFX)
    4.0 W (average GPU)

Again, I think the displayed MCLK is false and the memory still runs 
at 1100MHz because the performance in Unigine Superposition indicates 
this and 1300MHz would cause a crash immediately.


A stable value (e.g. 1200MHz) causes the same behaviour. I just chose 
1300MHz to be sure.


Tested on these Kernels:

Arch-Linux 5.0.9 (Arch)

Linux 5.1-rc6 (Ubuntu)

Linux 5.0 with amd-staging-drm-next (Ubuntu)
(https://github.com/M-Bab/linux-kernel-amdgpu-binaries)

(Same behaviour on every kernel.)

Tested on this hardware:

CPU: Intel i7-8700k

Re: How to dump gfx and waves after GPU reset happened?

2019-05-06 Thread Koenig, Christian

Am 06.05.19 um 14:20 schrieb Mikhail Gavrilov:
> On Sun, 5 May 2019 at 15:18, Christian König
>  wrote:
>> Yeah, but for most end users we need to get the GPU working as fast as
>> possible on a lockup.
>>
>> Saving all the state (which actually can be a couple of gigabytes if you
>> include all textures etc..) is not really an option then.
>>
>> What we could probably do rather easily is to add a function to run a
>> script instead of a GPU reset on lockup detection.
>>
> This would be useful if the script ran before the GPU reset rather
> than instead of it.

That won't work. The kernel can't wait for spawned processes to finish 
because it is holding locks.

The script could trigger a manual reset as its last operation, but that 
would not be the same as a timeout reset because you don't know the 
cause of it and would always need to do a full engine reset.

Sorry, but what you are suggesting here (collect data and then reset) is 
not easily doable.

Christian.

>
> --
> Best Regards,
> Mike Gavrilov.


Re: How to dump gfx and waves after GPU reset happened?

2019-05-06 Thread Mikhail Gavrilov

On Sun, 5 May 2019 at 15:18, Christian König
 wrote:
>
> Yeah, but for most end users we need to get the GPU working as fast as
> possible on a lockup.
>
> Saving all the state (which actually can be a couple of gigabytes if you
> include all textures etc..) is not really an option then.
>
> What we could probably do rather easily is to add a function to run a
> script instead of a GPU reset on lockup detection.
>

This would be useful if the script ran before the GPU reset rather
than instead of it.

--
Best Regards,
Mike Gavrilov.

[PATCH] drm/amdgpu: rename amdgpu_prime.[ch] into amdgpu_dma_buf.[ch]

2019-05-06 Thread Christian König

We are getting a dma-buf implementation completely separate from drm prime,
so rename the files now and clean up the code a bit.

No functional change.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|   1 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   1 +
 .../{amdgpu_prime.c => amdgpu_dma_buf.c}  | 131 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h   |  46 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h   |  16 ---
 7 files changed, 116 insertions(+), 83 deletions(-)
 rename drivers/gpu/drm/amd/amdgpu/{amdgpu_prime.c => amdgpu_dma_buf.c} (93%)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 7d539ba6400d..11a651ff7f0d 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -49,7 +49,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \
amdgpu_pm.o atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \
atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
-   amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
+   amdgpu_dma_buf.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index aeead072fa79..e829c53accf5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -25,6 +25,7 @@
 #include 
 #include "amdgpu.h"
 #include "amdgpu_gfx.h"
+#include "amdgpu_dma_buf.h"
 #include 
 #include 
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 047bba8c62d6..2bc80942e5d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -30,6 +30,7 @@
 #include "amdgpu_object.h"
 #include "amdgpu_vm.h"
 #include "amdgpu_amdkfd.h"
+#include "amdgpu_dma_buf.h"
 
 /* Special VM and GART address alignment needed for VI pre-Fiji due to
  * a HW bug.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
similarity index 93%
rename from drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
rename to drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index a38e0fb4a6fe..4711cf1b5bd2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -1,5 +1,5 @@
 /*
- * Copyright 2012 Advanced Micro Devices, Inc.
+ * Copyright 2019 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -103,7 +103,8 @@ void amdgpu_gem_prime_vunmap(struct drm_gem_object *obj, 
void *vaddr)
  * Returns:
  * 0 on success or a negative error code on failure.
  */
-int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct 
*vma)
+int amdgpu_gem_prime_mmap(struct drm_gem_object *obj,
+ struct vm_area_struct *vma)
 {
struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
@@ -137,57 +138,6 @@ int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, 
struct vm_area_struct *vma
return ret;
 }
 
-/**
- * amdgpu_gem_prime_import_sg_table - &drm_driver.gem_prime_import_sg_table
- * implementation
- * @dev: DRM device
- * @attach: DMA-buf attachment
- * @sg: Scatter/gather table
- *
- * Imports shared DMA buffer memory exported by another device.
- *
- * Returns:
- * A new GEM BO of the given DRM device, representing the memory
- * described by the given DMA-buf attachment and scatter/gather table.
- */
-struct drm_gem_object *
-amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
-struct dma_buf_attachment *attach,
-struct sg_table *sg)
-{
-   struct reservation_object *resv = attach->dmabuf->resv;
-   struct amdgpu_device *adev = dev->dev_private;
-   struct amdgpu_bo *bo;
-   struct amdgpu_bo_param bp;
-   int ret;
-
-   memset(&bp, 0, sizeof(bp));
-   bp.size = attach->dmabuf->size;
-   bp.byte_align = PAGE_SIZE;
-   bp.domain = AMDGPU_GEM_DOMAIN_CPU;
-   bp.flags = 0;
-   bp.type = ttm_bo_type_sg;
-   bp.resv = resv;
-   ww_mutex_lock(&resv->lock, NULL);
-   ret = amdgpu_bo_create(adev, &bp, &bo);
-   if (ret)
-   goto error;
-
-   bo->tbo.sg = sg;
-   bo->tbo.ttm->sg = sg;
-   bo->allowed_domains = AMDGPU_GEM_DOMAIN_GTT;

Re: [PATCH] drm/amdgpu: treat negative lockup timeout as 'infinite timeout'

2019-05-06 Thread Christian König


Am 05.05.19 um 16:23 schrieb Evan Quan:

Negative lockup timeout is valid and will be treated as
'infinite timeout'.

Change-Id: I0d8387956a9c744073c0281ef2e1a547d4f16dec
Signed-off-by: Evan Quan 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 ++
  1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 5b03e17e6e06..4d6dff6855f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -233,13 +233,14 @@ module_param_named(msi, amdgpu_msi, int, 0444);
   * Set GPU scheduler timeout value in ms.
   *
   * The format can be [Non-Compute] or [GFX,Compute,SDMA,Video]. That is there 
can be one or
- * multiple values specified. 0 and negative values are invalidated. They will 
be adjusted
- * to default timeout.
+ * multiple values specified.
   *  - With one value specified, the setting will apply to all non-compute 
jobs.
   *  - With multiple values specified, the first one will be for GFX. The 
second one is for Compute.
   *And the third and fourth ones are for SDMA and Video.
   * By default(with no lockup_timeout settings), the timeout for all 
non-compute(GFX, SDMA and Video)
   * jobs is 1. And there is no timeout enforced on compute jobs.
+ * Value 0 is invalidated, will be adjusted to default timeout settings.
+ * Negative values mean 'infinite timeout' (MAX_JIFFY_OFFSET).
   */
  MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: 1 for 
non-compute jobs and no timeout for compute jobs), "
"format is [Non-Compute] or [GFX,Compute,SDMA,Video]");
@@ -1248,11 +1249,16 @@ int amdgpu_device_get_job_timeout_settings(struct 
amdgpu_device *adev)
if (ret)
return ret;
  
-			/* Invalidate 0 and negative values */

-   if (timeout <= 0) {
+   /*
+* Value 0 will be adjusted to default timeout settings.
+* Negative values mean 'infinite timeout' 
(MAX_JIFFY_OFFSET).
+*/
+   if (!timeout) {
index++;
continue;
}
+   if (timeout < 0)
+   timeout = MAX_JIFFY_OFFSET;


This is superfluous and maybe even harmful; msecs_to_jiffies() should 
take care of this conversion.


Maybe even convert the values directly here.

Christian.

  
  			switch (index++) {

case 0:
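
The semantics under discussion (0 keeps the default, a negative value
means "infinite", and the ms-to-jiffies helper already maps negative
inputs to the maximum) can be modelled in user space as below. This is a
simplified sketch, not the kernel implementation: SKETCH_HZ,
SKETCH_DEFAULT_TIMEOUT_MS and the sketch_* functions are hypothetical,
the default value is a placeholder, and the rounding only approximates
what msecs_to_jiffies() does.

```c
#include <limits.h>

#define SKETCH_HZ 250				/* assumed tick rate */
#define SKETCH_MAX_JIFFY_OFFSET ((LONG_MAX >> 1) - 1)
#define SKETCH_DEFAULT_TIMEOUT_MS 10000		/* placeholder default */

/*
 * Model of the conversion helper: negative input means "infinite",
 * everything else is rounded up to whole ticks.
 */
static long sketch_msecs_to_jiffies(int ms)
{
	if (ms < 0)
		return SKETCH_MAX_JIFFY_OFFSET;
	return (long)(((long long)ms * SKETCH_HZ + 999) / 1000);
}

/*
 * Model of the parameter handling: 0 falls back to the default, and
 * negative values are passed straight through, so no separate
 * "timeout = MAX_JIFFY_OFFSET" assignment is needed before converting.
 */
static long sketch_timeout_to_jiffies(int timeout_ms)
{
	if (timeout_ms == 0)
		timeout_ms = SKETCH_DEFAULT_TIMEOUT_MS;
	return sketch_msecs_to_jiffies(timeout_ms);
}
```

In this model, assigning MAX_JIFFY_OFFSET to the millisecond value first
and converting afterwards would apply the conversion to a number that is
already in jiffies, which is one way the extra assignment could do harm.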



RE: Bug Report: [PowerPlay] MCLK can't be set above 1107MHz on Vega 64

2019-05-06 Thread Quan, Evan

+Alex,

Hi Yanik,

Did you run the following command to commit your OD settings (before 
running games)? Without it, the settings never actually take effect.
echo "c" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage

Regards,
Evan
From: Yanik Yiannakis 
Sent: Monday, April 29, 2019 7:44 AM
To: rex@amd.com; Quan, Evan ; 
amd-gfx@lists.freedesktop.org
Subject: Bug Report: [PowerPlay] MCLK can't be set above 1107MHz on Vega 64


Hello,

I experience a bug that prevents me from setting the MCLK of my Vega 64 LC 
above 1107MHz.

I am using Unigine Superposition 1.1 in "Game"-mode to check the performance by 
watching the FPS.


Behaviour with a single monitor:

First I set the MCLK to a known stable value below 1108MHz:

$ echo "m 3 1100 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage

In Unigine Superposition the FPS increase as expected.

pp_dpm_mclk also confirms the change.

$ watch cat 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_dpm_mclk

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1100Mhz *



After that I set the MCLK to a stable value above 1107MHz:

$ echo "m 3 1200 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage

In Unigine Superposition the FPS drop drastically.

pp_dpm_mclk indicates that the MCLK is stuck in state 0 (167MHz):

0: 167Mhz *
1: 500Mhz
2: 800Mhz
3: 1200Mhz



Behaviour with multiple monitors that have different refresh rates:

My monitors have different refresh rates. This causes the MCLK to stay in state 
3 (945MHz stock) which is the expected behaviour as I understand it.



Now I try to set the MCLK to a value above 1107MHz:

$ echo "m 3 1200 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage

The FPS in Unigine Superposition remain the same as they were with 945MHz.

pp_dpm_mclk shows however that the value was set:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1200Mhz *



Then I set the MCLK to a value of 1107MHz or lower:

$ echo "m 3 1100 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage

The FPS in Unigine Superposition increase.

pp_dpm_mclk again confirms the set value:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1100Mhz *


Finally I increase MCLK to a known unstable value:

$ echo "m 3 1300 950" > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:00.0/:03:00.0/pp_od_clk_voltage

The FPS in Unigine Superposition remain the same. I therefore believe the value 
was not actually applied.

However pp_dpm_mclk shows that it was:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1300Mhz *



amdgpu_pm_info also claims that the value was set:

$ sudo watch cat /sys/kernel/debug/dri/1/amdgpu_pm_info

GFX Clocks and Power:
1300 MHz (MCLK)
27 MHz (SCLK)
1348 MHz (PSTATE_SCLK)
800 MHz (PSTATE_MCLK)
825 mV (VDDGFX)
4.0 W (average GPU)

Again, I think the displayed MCLK is false and the memory still runs at 1100MHz 
because the performance in Unigine Superposition indicates this and 1300MHz 
would cause a crash immediately.

A stable value (e.g. 1200MHz) causes the same behaviour. I just chose 1300MHz 
to be sure.





Tested on these Kernels:

Arch-Linux 5.0.9 (Arch)

Linux 5.1-rc6 (Ubuntu)

Linux 5.0 with amd-staging-drm-next (Ubuntu) 
(https://github.com/M-Bab/linux-kernel-amdgpu-binaries)

(Same behaviour on every kernel.)



Tested on this hardware:

CPU: Intel i7-8700k

Motherboard: MSI Z370 Gaming Pro Carbon

GPU: Powercolor Vega 64 Liquid Cooled (Memory stable below 1220MHz, tested on 
Windows 10 with Wattman and Unigine Superposition)



Unigine Superposition "Game"-Mode settings:

Preset: Custom

Fullscreen: Disabled

Resolution: 3840x2160 (4K UHD)

Shaders Quality: Extreme

Textures Quality: High

Vsync: Off

Depth of Field: On

Motion Blur: On



I hope this helps.

Yanik Yiannakis
