Re: [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register

2019-06-06 Thread John Hubbard
On 6/6/19 11:44 AM, Jason Gunthorpe wrote: > From: Jason Gunthorpe > > Ralph observes that hmm_range_register() can only be called by a driver > while a mirror is registered. Make this clear in the API by passing in the > mirror structure as a parameter. > > This also simplifies understanding

Re: [PATCH v2 hmm 01/11] mm/hmm: fix use after free with struct hmm in the mmu notifiers

2019-06-06 Thread John Hubbard
On 6/6/19 11:44 AM, Jason Gunthorpe wrote: > From: Jason Gunthorpe ... > diff --git a/mm/hmm.c b/mm/hmm.c > index 8e7403f081f44a..547002f56a163d 100644 > --- a/mm/hmm.c > +++ b/mm/hmm.c ... > @@ -125,7 +130,7 @@ static void hmm_free(struct kref *kref) > mm->hmm = NULL; >

Re: [PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm

2019-06-06 Thread John Hubbard
On 6/6/19 11:44 AM, Jason Gunthorpe wrote: > From: Jason Gunthorpe > > So long as a struct hmm pointer exists, so should the struct mm it is > linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it > once the hmm refcount goes to zero. > > Since mmdrop() (ie a 0 kref on

Re: [PATCH v16 14/16] tee, arm64: untag user pointers in tee_shm_register

2019-06-06 Thread Jens Wiklander
On Mon, Jun 3, 2019 at 6:56 PM Andrey Konovalov wrote: > > This patch is a part of a series that extends arm64 kernel ABI to allow to > pass tagged user pointers (with the top byte set to something else other > than 0x00) as syscall arguments. > >

Re: [PATCH] drm/amdkfd: Initialize dqm earlier

2019-06-06 Thread Kuehling, Felix
On 2019-06-06 5:51 p.m., Zeng, Oak wrote: > dqm is referenced in function kfd_topology_add_device. > Move dqm initialization up to avoid NULL pointer reference. This addresses a pretty unlikely race condition where someone looks at /sys/kernel/debug/kfd/hqds during the device initialization. We

Re: [PATCH] drm/amd/amdgpu: remove vram_page_split kernel option (v2)

2019-06-06 Thread Tom St Denis
ping? On Tue, Jun 4, 2019 at 1:15 PM StDenis, Tom wrote: > This option is no longer needed. The default code paths > are now the only option. > > v2: Add HPAGE support and a default for non contiguous maps > > Signed-off-by: Tom St Denis > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h |

Re: [PATCH] drm/amd/amdgpu: remove vram_page_split kernel option (v2)

2019-06-06 Thread Christian König
On 04.06.19 at 19:15, StDenis, Tom wrote: This option is no longer needed. The default code paths are now the only option. v2: Add HPAGE support and a default for non contiguous maps Signed-off-by: Tom St Denis --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 -

Re: [PATCH][next] drm/amd/display: remove redundant assignment to status

2019-06-06 Thread Dan Carpenter
On Fri, May 31, 2019 at 08:19:03PM +, Harry Wentland wrote: > On 2019-05-30 12:12 p.m., Colin King wrote: > > From: Colin Ian King > > > > The variable status is initialized with a value that is never read > > and status is reassigned several statements later. This initialization > > is

Re: [PATCH] drm/amd/amdgpu: remove vram_page_split kernel option (v3)

2019-06-06 Thread StDenis, Tom
On 2019-06-06 7:49 a.m., Christian König wrote: > On 06.06.19 at 12:50, StDenis, Tom wrote: >> This option is no longer needed. The default code paths >> are now the only option. >> >> v2: Add HPAGE support and a default for non contiguous maps >> v3: Misread 512 pages as MiB ... >> >>

Re: [PATCH] drm/amd/amdgpu: remove vram_page_split kernel option (v3)

2019-06-06 Thread Christian König
On 06.06.19 at 13:51, StDenis, Tom wrote: On 2019-06-06 7:49 a.m., Christian König wrote: On 06.06.19 at 12:50, StDenis, Tom wrote: This option is no longer needed. The default code paths are now the only option. v2: Add HPAGE support and a default for non contiguous maps v3: Misread 512

[PATCH 1/2] drm/amd/amdgpu: Add module parameter for specifying default ABM level

2019-06-06 Thread Nicholas Kazlauskas
[Why] It's non trivial to configure or specify an ABM reduction level for userspace outside of X. There is also no method to specify the default ABM value at boot time. A parameter should be added to configure this. [How] Expose a module parameter that can specify the default ABM level to use

Re: [PATCH] drm/amd/amdgpu: remove vram_page_split kernel option (v3)

2019-06-06 Thread Christian König
On 06.06.19 at 12:50, StDenis, Tom wrote: This option is no longer needed. The default code paths are now the only option. v2: Add HPAGE support and a default for non contiguous maps v3: Misread 512 pages as MiB ... Signed-off-by: Tom St Denis --- drivers/gpu/drm/amd/amdgpu/amdgpu.h

Re: FW: [PATCH] drm/ttm: fix ttm client driver (e.g. amdgpu) reload issue

2019-06-06 Thread Christian König
It is part of amd-staging-drm-next and has Alex's Signed-off-by tag. So it should definitely be upstream. Alex, any idea why that patch isn't in drm-next? Christian. On 05.06.19 at 20:10, Koenig, Christian wrote: Mhm, looks like that somehow got dropped during rebase. Going to dig up where

Re: [PATCH] drm/amd/amdgpu: remove vram_page_split kernel option (v2)

2019-06-06 Thread StDenis, Tom
Ah ya I misread the original default as MiB instead of pages. Tom On 2019-06-06 6:35 a.m., Christian König wrote: > On 04.06.19 at 19:15, StDenis, Tom wrote: >> This option is no longer needed. The default code paths >> are now the only option. >> >> v2: Add HPAGE support and a default for

[PATCH] drm/amd/amdgpu: remove vram_page_split kernel option (v3)

2019-06-06 Thread StDenis, Tom
This option is no longer needed. The default code paths are now the only option. v2: Add HPAGE support and a default for non contiguous maps v3: Misread 512 pages as MiB ... Signed-off-by: Tom St Denis --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 -

[PATCH 2/2] drm/amd/display: Set default ABM level to module parameter

2019-06-06 Thread Nicholas Kazlauskas
[Why] The module parameter to specify the default ABM level is now defined, so hook it up in DM. [How] On connector reset specify the default level. DC will program this as part of the modeset since it gets passed onto the stream in dm_update_crtc_state. It's only set for eDP connectors, but it

Re: FW: [PATCH] drm/ttm: fix ttm client driver (e.g. amdgpu) reload issue

2019-06-06 Thread Deucher, Alexander
It's upstream: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bd4264112f93045704731850c5e4d85db981cd85 and in drm-next: https://cgit.freedesktop.org/drm/drm/commit/?id=bd4264112f93045704731850c5e4d85db981cd85 and in amd-staging-drm-next:

Re: [PATCH 0/2] Two bug-fixes for HMM

2019-06-06 Thread Jason Gunthorpe
On Fri, May 10, 2019 at 07:53:21PM +, Kuehling, Felix wrote: > These problems were found in AMD-internal testing as we're working on > adopting HMM. They are rebased against glisse/hmm-5.2-v3. We'd like to get > them applied to a mainline Linux kernel as well as drm-next and >

Re: [PATCH 2/2] drm/amd/display: Set default ABM level to module parameter

2019-06-06 Thread Francis, David
Series is Reviewed-by: David Francis From: amd-gfx on behalf of Nicholas Kazlauskas Sent: June 6, 2019 9:02:13 AM To: amd-gfx@lists.freedesktop.org Cc: Francis, David; Wentland, Harry; Kazlauskas, Nicholas Subject: [PATCH 2/2] drm/amd/display: Set default ABM

[PATCH 5/6] drm/amdkfd: Fix a circular lock dependency

2019-06-06 Thread Zeng, Oak
The idea to break the circular lock dependency is to move allocate_mqd out of dqm lock protection. See callstack #1 below. [ 59.510149] [drm] Initialized amdgpu 3.30.0 20150101 for :04:00.0 on minor 0 [ 513.604034] == [ 513.604205]

[PATCH 6/6] drm/amdkfd: Fix sdma queue allocate race condition

2019-06-06 Thread Zeng, Oak
SDMA queue allocation requires the dqm lock as it modifies the global dqm members. Move up the dqm_lock so sdma queue allocation is enclosed in the critical section. Move mqd allocation out of critical section to avoid circular lock dependency. Change-Id: I96abd42eae6e77c82a5ba1b8e600af3efe8d791d
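The ordering described in this patch can be sketched in userspace C as follows. This is only an illustration under assumptions, not the amdkfd code: every name here (sketch_dqm, sketch_queue, sketch_create_queue, the bitmap layout, the 64-byte buffer) is hypothetical, and a pthread mutex stands in for the dqm lock. The buffer standing in for the mqd is allocated before the lock is taken, and the shared SDMA slot bitmap is only touched inside the critical section.

```c
#include <pthread.h>
#include <stdlib.h>

#define SKETCH_NUM_SDMA_QUEUES 8   /* hypothetical number of SDMA slots */

/* Hypothetical stand-in for the device queue manager. */
struct sketch_dqm {
    pthread_mutex_t lock;       /* plays the role of the dqm lock */
    unsigned int sdma_bitmap;   /* bit set = SDMA slot is free */
};

/* Hypothetical stand-in for a queue with its memory queue descriptor. */
struct sketch_queue {
    void *mqd;
    int sdma_id;
};

/* Returns 0 on success, -1 if memory or SDMA slots are exhausted. */
static int sketch_create_queue(struct sketch_dqm *dqm, struct sketch_queue *q)
{
    int i;

    /* Allocation may block or take other locks, so do it unlocked. */
    q->mqd = calloc(1, 64);
    q->sdma_id = -1;
    if (!q->mqd)
        return -1;

    /* The shared bitmap is read and modified only under the lock. */
    pthread_mutex_lock(&dqm->lock);
    for (i = 0; i < SKETCH_NUM_SDMA_QUEUES; i++) {
        if (dqm->sdma_bitmap & (1u << i)) {
            dqm->sdma_bitmap &= ~(1u << i);
            q->sdma_id = i;
            break;
        }
    }
    pthread_mutex_unlock(&dqm->lock);

    if (q->sdma_id < 0) {
        /* No free slot: release the buffer, again outside the lock. */
        free(q->mqd);
        q->mqd = NULL;
        return -1;
    }
    return 0;
}
```

With the slot search inside the critical section, two concurrent callers cannot claim the same slot, and nothing that allocates memory runs while the lock is held.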

[PATCH 4/6] drm/amdkfd: Separate mqd allocation and initialization

2019-06-06 Thread Zeng, Oak
Introduce a new mqd allocation interface and split the original init_mqd function into two functions: allocate_mqd and init_mqd. Also renamed uninit_mqd to free_mqd. This is preparation work to fix a circular lock dependency. Change-Id: I26e53ee1abcdd688ad11d35b433da77e3fa1bee7 Signed-off-by: Oak

Re: [PATCH 2/7] drm/dp_mst: Register AUX devices for MST ports

2019-06-06 Thread Lyude Paul
On Thu, 2019-06-06 at 19:41 +, Li, Sun peng (Leo) wrote: > > On 2019-06-03 3:28 p.m., Lyude Paul wrote: > > > I'm reproducing this just by reloading i915 on a machine with some MST > > > displays connected. I uploaded a copy of the script that I use to do > > > this > > > here: > > > > > >

[PATCH 4/6] drm/amdkfd: Separate mqd allocation and initialization

2019-06-06 Thread Zeng, Oak
Introduce a new mqd allocation interface and split the original init_mqd function into two functions: allocate_mqd and init_mqd. Also renamed uninit_mqd to free_mqd. This is preparation work to fix a circular lock dependency. Change-Id: I26e53ee1abcdd688ad11d35b433da77e3fa1bee7 Signed-off-by: Oak

[PATCH] drm/amdkfd: Initialize dqm earlier

2019-06-06 Thread Zeng, Oak
dqm is referenced in function kfd_topology_add_device. Move dqm initialization up to avoid NULL pointer reference. Change-Id: Id6cb2541af129826b7621ceaa8e06e638c7bb122 Signed-off-by: Oak Zeng --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 16 1 file changed, 8 insertions(+), 8

Re: [PATCH 4/6] drm/amdkfd: Separate mqd allocation and initialization

2019-06-06 Thread Kuehling, Felix
On 2019-06-06 5:40 p.m., Zeng, Oak wrote: > Introduce a new mqd allocation interface and split the original > init_mqd function into two functions: allocate_mqd and init_mqd. > Also renamed uninit_mqd to free_mqd. This is preparation work to > fix a circular lock dependency. > > Change-Id:

Re: [PATCH 6/6] drm/amdkfd: Fix sdma queue allocate race condition

2019-06-06 Thread Kuehling, Felix
Patches 5 and 6 are Reviewed-by: Felix Kuehling On 2019-06-06 2:25 p.m., Zeng, Oak wrote: > SDMA queue allocation requires the dqm lock as it modifies > the global dqm members. Move up the dqm_lock so sdma > queue allocation is enclosed in the critical section. Move > mqd allocation out of

RE: [PATCH revert] Revert "drm/amdgpu: use GMC v9 KIQ workaround only for the GFXHUB"

2019-06-06 Thread Zeng, Oak
I'm not sure whether this is related to this change, but when I boot a system with amdgpu blacklisted and then modprobe amdgpu, I hit the following issue. The failing path is gmc_v9_0_flush_gpu_tlb calling amdgpu_virt_kiq_req_write_reg_wait. If I boot w/o amdgpu blacklisted, then it is fine (load amdgpu directly during

[PATCH 14/24] drm/amd/display: 3.2.35

2019-06-06 Thread Bhawanpreet Lakha
From: Aric Cyr Signed-off-by: Aric Cyr Reviewed-by: Aric Cyr Acked-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index

[PATCH 15/24] drm/amd/display: Clean up scdc_test_data struct

2019-06-06 Thread Bhawanpreet Lakha
From: Chris Park These are no longer needed. Also added RESERVED bits. Signed-off-by: Chris Park Reviewed-by: Charlene Liu Acked-by: Bhawanpreet Lakha Acked-by: Harry Wentland --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 1 - drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c |

[PATCH 19/24] drm/amd/display: Gamma logic limitations causing unintended use of RAM over ROM.

2019-06-06 Thread Bhawanpreet Lakha
From: Harmanprit Tatla [Why] Our existing logic in deciding whether to use RAM or ROM depends on whether we are dealing with an identity gamma ramp. [How] In addition to the is_identity flag a new is_logical_identity flag has been added. The is_identity flag now denotes whether the OS gamma is

[PATCH 02/24] drm/amd/display: Update link rate from DPCD 10

2019-06-06 Thread Bhawanpreet Lakha
From: Wesley Chalmers [WHY] Some panels return a link rate of 0 (unknown) in DPCD 0. In this case, an appropriate mode cannot be set, and certain panels will show corruption as they are forced to use a mode they do not support. [HOW] Read DPCD 10 in the case where supported link rate from DPCD

[PATCH 07/24] drm/amd/display: add i2c_hw_Status check to make sure as HW I2c in use

2019-06-06 Thread Bhawanpreet Lakha
From: Derek Lai 1. Add i2c_hw_Status check to make sure when HW i2c is in use. 2. Don't reset HW engine in is_hw_busy() and instead do this in process_transaction() because SW i2c does not check if hw i2c is in use Signed-off-by: Derek Lai Reviewed-by: Charlene Liu Acked-by: Bhawanpreet Lakha

[PATCH 06/24] drm/amd/display: Dont aser if DP_DPHY_INTERNAL_CTRL

2019-06-06 Thread Bhawanpreet Lakha
From: Eric Bernstein No need to assert just return Signed-off-by: Eric Bernstein Reviewed-by: Charlene Liu Acked-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git

[PATCH 05/24] drm/amd/display: 3.2.33

2019-06-06 Thread Bhawanpreet Lakha
From: Aric Cyr Signed-off-by: Aric Cyr Reviewed-by: Aric Cyr Acked-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index

[PATCH 12/24] drm/amd/display: fix gamma logic breaking driver unload

2019-06-06 Thread Bhawanpreet Lakha
From: Krunoslav Kovac Using this logic breaks driver unload. This is a temporary fix; a follow-up patch will fix this properly. Signed-off-by: Krunoslav Kovac Reviewed-by: Aric Cyr Acked-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 6 ++ 1 file

[PATCH 20/24] drm/amd/display: Use stream opp_id instead of hubp

2019-06-06 Thread Bhawanpreet Lakha
From: Wesley Chalmers [WHY] By the time output csc matrix is being programmed, stream connection to OPP has been established, but this information has not been relayed back to HUBP. Signed-off-by: Wesley Chalmers Reviewed-by: Anthony Koo Acked-by: Bhawanpreet Lakha Acked-by: Krunoslav Kovac

[PATCH 04/24] drm/amd/display: add some math functions for dcn_calc_math

2019-06-06 Thread Bhawanpreet Lakha
From: Charlene Liu Implement floor, ceil, and fabs Signed-off-by: Charlene Liu Reviewed-by: Charlene Liu Acked-by: Bhawanpreet Lakha --- .../drm/amd/display/dc/calcs/dcn_calc_math.c | 20 +++ .../drm/amd/display/dc/calcs/dcn_calc_math.h | 3 +++
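For readers wondering what such helpers can look like without a math library, here is a minimal userspace sketch. It is not the code from this patch: the names sketch_fabs, sketch_floor and sketch_ceil are hypothetical, and the approach (truncate through an integer cast, then correct by one) is an assumption that only holds for inputs whose integer part fits in a long long.

```c
/* Absolute value: flip the sign of negative inputs. */
static float sketch_fabs(float x)
{
    return (x < 0.0f) ? -x : x;
}

/* Largest integral value not greater than x. */
static float sketch_floor(float x)
{
    long long t = (long long)x;   /* the cast truncates toward zero */
    float f = (float)t;

    /* For negative non-integers truncation rounded up, so step down. */
    return (f > x) ? f - 1.0f : f;
}

/* Smallest integral value not less than x. */
static float sketch_ceil(float x)
{
    long long t = (long long)x;   /* the cast truncates toward zero */
    float f = (float)t;

    /* For positive non-integers truncation rounded down, so step up. */
    return (f < x) ? f + 1.0f : f;
}
```

For example, sketch_floor(-1.5f) truncates to -1.0f, sees that this is above the input, and returns -2.0f, while sketch_ceil(-1.5f) keeps -1.0f.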

[PATCH 01/24] drm/amd/display: fix resource saving missing when power state switch

2019-06-06 Thread Bhawanpreet Lakha
From: "Tao.Huang" Signed-off-by: Tao.Huang Reviewed-by: Jun Lei Acked-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/dc/core/dc.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index

[PATCH 00/24] DC Patches 06 Jun 2019

2019-06-06 Thread Bhawanpreet Lakha
Summary Of Changes *Rework CRTC color management *Add underflow asserts *i2c fix *gamma fixes Anthony Koo (1): drm/amd/display: fix issue with eDP not detected on driver load Aric Cyr (3): drm/amd/display: 3.2.33 drm/amd/display: 3.2.34 drm/amd/display: 3.2.35 Charlene Liu (2):

[PATCH 09/24] drm/amd/display: Use macro for invalid OPP ID

2019-06-06 Thread Bhawanpreet Lakha
From: Wesley Chalmers [WHY] This is meant to make it clearer that 0xf is not a valid OPP ID, and that code making use of OPP IDs should not accept this value. Signed-off-by: Wesley Chalmers Reviewed-by: Charlene Liu Acked-by: Bhawanpreet Lakha ---
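The idea of naming the sentinel can be sketched as below. Only the value 0xf comes from the message above; the macro name SKETCH_OPP_ID_INVALID, the SKETCH_MAX_OPP limit and both helper functions are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>

/* 0xf is the value the message above calls "not a valid OPP ID". */
#define SKETCH_OPP_ID_INVALID 0xfu
/* Hypothetical count of OPP instances, for illustration only. */
#define SKETCH_MAX_OPP 6u

/* True only for IDs that refer to a real OPP instance. */
static bool sketch_opp_id_is_valid(unsigned int opp_id)
{
    /* The sentinel check is spelled out even though the range check
     * would also reject it, so the intent is visible at the call site. */
    return opp_id != SKETCH_OPP_ID_INVALID && opp_id < SKETCH_MAX_OPP;
}

/* Hypothetical consumer: refuses to program an invalid OPP. */
static int sketch_program_output_csc(unsigned int opp_id)
{
    if (!sketch_opp_id_is_valid(opp_id))
        return -1;
    /* Real code would write the CSC matrix for this OPP here. */
    return 0;
}
```

Compared with a bare 0xf scattered through the code, a named constant makes a rejected ID recognisable at a glance and gives reviewers a single place to look for the sentinel value.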

[PATCH 11/24] drm/amd/display: fix issue with eDP not detected on driver load

2019-06-06 Thread Bhawanpreet Lakha
From: Anthony Koo [Why] HPD will not go high if panel VDD is off, and all AUX transactions will fail :( [How] 1. Power on VDD before attempting detection if it isn't already on 2. Improve the robustness by having a retry mechanism on the first DPCD read after VDD on. If a particular board

[PATCH 10/24] drm/amd/display: Rework CRTC color management

2019-06-06 Thread Bhawanpreet Lakha
From: Nicholas Kazlauskas [Why] To prepare for the upcoming DRM plane color management properties we need to correct a lot of wrong behavior and assumptions made for CRTC color management. The documentation added by this commit in amdgpu_dm_color explains how the HW color pipeline works and its

[PATCH 03/24] drm/amd/display: Copy stream updates onto streams

2019-06-06 Thread Bhawanpreet Lakha
From: Nicholas Kazlauskas [Why] Almost every function in DC that works with stream state expects that the current state on the stream is the one that it should be writing out. These functions are typically triggered by specifying a particular stream update - but the actual contents of the stream

[PATCH 16/24] drm/amd/display: Remove superflous error message

2019-06-06 Thread Bhawanpreet Lakha
From: Jordan Lazare [Why] VBios sometimes reports incorrect object type as encoder instead of connector [How] Change error message to debug message Signed-off-by: Jordan Lazare Reviewed-by: Harry Wentland Acked-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 2 +-

[PATCH 18/24] drm/amd/display: Add Underflow Asserts to dc

2019-06-06 Thread Bhawanpreet Lakha
From: Thomas Lim [Why] For debugging underflow issues it can be useful to have asserts when the underflow initially occurs. [How] Read the underflow status registers after actions that have a high risk of causing underflow and assert that no underflow occurred. If underflow occurred, clear the

[PATCH 08/24] drm/amd/display: add audio related regs

2019-06-06 Thread Bhawanpreet Lakha
From: Charlene Liu Signed-off-by: Charlene Liu Reviewed-by: Chris Park Acked-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/dc/dce/dce_audio.c | 4 +--- drivers/gpu/drm/amd/display/dc/dce/dce_audio.h | 7 +++ 2 files changed, 8 insertions(+), 3 deletions(-) diff --git

[PATCH 22/24] drm/amd/display: Do not grant POST_LT_ADJ when TPS4 is used

2019-06-06 Thread Bhawanpreet Lakha
From: abdoulaye berthe [Description] The spec does not allow POST_LT_ADJ_GRANTED to be set when TPS4 is used. Signed-off-by: abdoulaye berthe Acked-by: Bhawanpreet Lakha --- .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 56 ++- 1 file changed, 31 insertions(+), 25

[PATCH 24/24] drm/amd/display: set link->dongle_max_pix_clk to 0 on a disconnect

2019-06-06 Thread Bhawanpreet Lakha
From: Samson Tam [Why] Found issue in EDID Emulation where if we connect a display using a passive HDMI-DP dongle, disconnect it and then try to emulate a display using DP, we could not see 4K modes. This was because on a disconnect, dongle_max_pix_clk was still set so when we emulate using

[PATCH 23/24] drm/amd/display: make clk_mgr call enable_pme_wa

2019-06-06 Thread Bhawanpreet Lakha
From: Su Sung Chung Refactor the code to call clk_mgr's enable_pme_wa function so that pme_wa can be used for future ASICs. This way we don't need to worry about different ASICs, since clk_mgr already has that information Signed-off-by: Su Sung Chung Reviewed-by: Eric Yang Acked-by:

[PATCH 13/24] drm/amd/display: 3.2.34

2019-06-06 Thread Bhawanpreet Lakha
From: Aric Cyr Signed-off-by: Aric Cyr Reviewed-by: Aric Cyr Acked-by: Bhawanpreet Lakha --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index

[PATCH 17/24] drm/amd/display: move vmid determination logic out of dc

2019-06-06 Thread Bhawanpreet Lakha
From: Dmytro Laktyushkin Currently vmid is decided internally inside dc. This makes it difficult to use vmid with external components. This change moves vmid logic outside dc and allows vmid to be passed in as a parameter to DC. Signed-off-by: Dmytro Laktyushkin Reviewed-by: Charlene

[PATCH 21/24] drm/amd/display: S3 Resume time increase after decoupling DPMS from fast boot

2019-06-06 Thread Bhawanpreet Lakha
From: SivapiriyanKumarasamy [Why] We incorrectly began powering down the display at boot/resume whenever fast boot was not possible. This should not be done in the case where there exists a stream for the eDP since this implies that we want to turn it on. [How] Add check for eDP stream to

Re: [PATCH 0/2] Two bug-fixes for HMM

2019-06-06 Thread Kuehling, Felix
On 2019-06-06 11:11 a.m., Jason Gunthorpe wrote: > On Fri, May 10, 2019 at 07:53:21PM +, Kuehling, Felix wrote: >> These problems were found in AMD-internal testing as we're working on >> adopting HMM. They are rebased against glisse/hmm-5.2-v3. We'd like to get >> them applied to a mainline

Re: [PATCH 2/7] drm/dp_mst: Register AUX devices for MST ports

2019-06-06 Thread Li, Sun peng (Leo)
On 2019-06-03 3:28 p.m., Lyude Paul wrote: >> I'm reproducing this just by reloading i915 on a machine with some MST >> displays connected. I uploaded a copy of the script that I use to do this >> here: >> >> https://people.freedesktop.org/~lyudess/archive/06-03-2019/unloadgpumod.sh >

[PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe Ralph observes that hmm_range_register() can only be called by a driver while a mirror is registered. Make this clear in the API by passing in the mirror structure as a parameter. This also simplifies understanding the lifetime model for struct hmm, as the hmm pointer must

[PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe No other register/unregister kernel API attempts to provide this kind of protection as it is inherently racy, so just drop it. Callers should provide their own protection, it appears nouveau already does, but just in case drop a debugging POISON. Signed-off-by: Jason

[PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe As coded this function can false-fail in various racy situations. Make it reliable by running only under the write side of the mmap_sem and avoiding the false-failing compare/exchange pattern. Also make the locking very easy to understand by only ever reading or writing

[PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe The wait_event_timeout macro already tests the condition as its first action, so there is no reason to open code another version of this, all that does is skip the might_sleep() debugging in common cases, which is not helpful. Further, based on prior patches, we can no

[PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe hmm_release() is called exactly once per hmm. ops->release() cannot accidentally trigger any action that would recurse back onto hmm->mirrors_sem. This fixes a use after-free race of the form: CPU0 CPU1

[PATCH v2 hmm 06/11] mm/hmm: Hold on to the mmget for the lifetime of the range

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe Range functions like hmm_range_snapshot() and hmm_range_fault() call find_vma, which requires holding the mmget() and the mmap_sem for the mm. Make this simpler for the callers by holding the mmget() inside the range for the lifetime of the range. Other functions that

[PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe So we can check locking at runtime. Signed-off-by: Jason Gunthorpe Reviewed-by: Jérôme Glisse --- v2 - Fix missing & in lockdeps (Jason) --- mm/hmm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index

[PATCH v2 hmm 01/11] mm/hmm: fix use after free with struct hmm in the mmu notifiers

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe mmu_notifier_unregister_no_release() is not a fence and the mmu_notifier system will continue to reference hmm->mn until the srcu grace period expires. Resulting in use after free races like this: CPU0 CPU1

[PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe This list is always read and written while holding hmm->lock so there is no need for the confusing _rcu annotations. Signed-off-by: Jason Gunthorpe Reviewed-by: Jérôme Glisse --- mm/hmm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/hmm.c

[PATCH v2 hmm 00/11] Various revisions from a locking/code review

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe For hmm.git: This patch series arose out of discussions with Jerome when looking at the ODP changes, particularly informed by use after free races we have already found and fixed in the ODP code (thanks to syzkaller) working with mmu notifiers, and the discussion with

[PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe So long as a struct hmm pointer exists, so should the struct mm it is linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it once the hmm refcount goes to zero. Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL mm->hmm delete the

[PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister

2019-06-06 Thread Jason Gunthorpe
From: Jason Gunthorpe Trying to misuse a range outside its lifetime is a kernel bug. Use WARN_ON and poison bytes to detect this condition. Signed-off-by: Jason Gunthorpe Reviewed-by: Jérôme Glisse --- v2 - Keep range start/end valid after unregistration (Jerome) --- mm/hmm.c | 7 +-- 1

Re: [PATCH 0/2] Two bug-fixes for HMM

2019-06-06 Thread Kuehling, Felix
[resent with correct address for Alex] On 2019-06-06 11:11 a.m., Jason Gunthorpe wrote: > On Fri, May 10, 2019 at 07:53:21PM +, Kuehling, Felix wrote: >> These problems were found in AMD-internal testing as we're working on >> adopting HMM. They are rebased against glisse/hmm-5.2-v3. We'd

Re: [PATCH 0/2] Two bug-fixes for HMM

2019-06-06 Thread Jason Gunthorpe
On Thu, Jun 06, 2019 at 07:04:46PM +, Kuehling, Felix wrote: > On 2019-06-06 11:11 a.m., Jason Gunthorpe wrote: > > On Fri, May 10, 2019 at 07:53:21PM +, Kuehling, Felix wrote: > >> These problems were found in AMD-internal testing as we're working on > >> adopting HMM. They are rebased

Re: [PATCH 4/6] drm/amdkfd: Separate mqd allocation and initialization

2019-06-06 Thread Kuehling, Felix
On 2019-06-06 2:25 p.m., Zeng, Oak wrote: > Introduce a new mqd allocation interface and split the original > init_mqd function into two functions: allocate_mqd and init_mqd. > Also renamed uninit_mqd to free_mqd. This is preparation work to > fix a circular lock dependency. > > Change-Id: