Re: two KASANs in TTM logic
On 09/08/2018 05:23 AM, Huang Rui wrote:
> On Fri, Sep 07, 2018 at 04:59:11PM +0800, Christian König wrote:
>> Hi Ray,
>>
>> in the meantime can we disable the feature once more in the kernel until
>> we have hammered out all possible corner cases?
>
> That's fine. So far, we have to disable it again. I will do more testing
> and reproduce Tom's issue first.
>
>> As Tom figured out, commenting out setting "bulk_moveable" to true should
>> be enough.
>
> I saw you already removed the "bulk_moveable = true" in amdgpu_vm_init().
> Do you mean we should also comment out the one in
> amdgpu_vm_move_to_lru_tail() to disable bulk_move totally for the moment?

Hi Ray,

I just commented out the assignment of true.

Tom

> Thanks,
> Ray
>
>> Thanks,
>> Christian.
>>
>> Am 07.09.2018 um 08:51 schrieb Huang, Ray:
>>> Hi Tom,
>>>
>>> Thanks for tracing this issue. I am trying to reproduce it on
>>> amd-staging-drm-next with piglit.
>>> May I know the steps/configurations to repro it?
>>>
>>> Thanks,
>>> Ray
>>>
>>> -----Original Message-----
>>> From: amd-gfx On Behalf Of Tom St Denis
>>> Sent: Wednesday, September 5, 2018 9:27 PM
>>> To: Koenig, Christian; Daenzer, Michel; amd-gfx@lists.freedesktop.org; Deucher, Alexander
>>> Subject: Re: two KASANs in TTM logic
>>>
>>> Logs attached.
>>>
>>> Tom
>>>
>>> On 09/05/2018 08:02 AM, Christian König wrote:
>>>> Still not the slightest idea what is causing this and the patch
>>>> definitely fixes things a lot.
>>>>
>>>> Can you try to enable list debugging in your kernel?
>>>>
>>>> Thanks,
>>>> Christian.
>>>>
>>>> Am 04.09.2018 um 19:18 schrieb Tom St Denis:
>>>>> Sure:
>>>>>
>>>>> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
>>>>> commit d2917f399e0b250f47d07da551a335843a24f835
>>>>> Author: Christian König
>>>>> Date: Thu Aug 30 10:04:53 2018 +0200
>>>>>
>>>>> drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
>>>>>
>>>>> First step to fix the LRU corruption, we accidentially tried to
>>>>> move things on the LRU after dropping the lock.
>>>>>
>>>>> Signed-off-by: Christian König
>>>>> Tested-by: Michel Dänzer
>>>>>
>>>>> :04 04 ed5be1ad4da129c4154b2b43acf7ef349a470700 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers
>>>>>
>>>>> The commit before that I can run xonotic-glx and piglit on my Carrizo
>>>>> without a KASAN.
>>>>>
>>>>> Tom
>>>>>
>>>>> On 09/04/2018 10:05 AM, Christian König wrote:
>>>>>> The first one should already be fixed.
>>>>>>
>>>>>> Not sure where the second comes from. Can you narrow that down further?
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
>>>>>>> First is caused by this commit while running a GL heavy application.
>>>>>>>
>>>>>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>>>>>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>>>>>>> Author: Michel Dänzer
>>>>>>> Date: Wed Aug 29 11:59:38 2018 +0200
>>>>>>>
>>>>>>> Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>>>>>>
>>>>>>> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>>>>>>
>>>>>>> It triggered various badness on my development machine when
>>>>>>> running the piglit gpu profile with radeonsi on Bonaire, looks like
>>>>>>> memory corruption due to insufficiently protected list manipulations.
>>>>>>>
>>>>>>> Signed-off-by: Michel Dänzer
>>>>>>> Signed-off-by: Alex Deucher
>>>>>>>
>>>>>>> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>>>>>>>
>>>>>>> The second is caused by something between that and the tip of the
>>>>>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>>>>>>> loading GNOME.
>>>>>>>
>>>>>>> Tom

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: two KASANs in TTM logic
On Fri, Sep 07, 2018 at 04:59:11PM +0800, Christian König wrote:
> Hi Ray,
>
> in the meantime can we disable the feature once more in the kernel until
> we have hammered out all possible corner cases?

That's fine. So far, we have to disable it again. I will do more testing
and reproduce Tom's issue first.

>
> As Tom figured out, commenting out setting "bulk_moveable" to true should
> be enough.

I saw you already removed the "bulk_moveable = true" in amdgpu_vm_init().
Do you mean we should also comment out the one in
amdgpu_vm_move_to_lru_tail() to disable bulk_move totally for the moment?

Thanks,
Ray

>
> Thanks,
> Christian.
>
> Am 07.09.2018 um 08:51 schrieb Huang, Ray:
> > Hi Tom,
> >
> > Thanks for tracing this issue. I am trying to reproduce it on
> > amd-staging-drm-next with piglit.
> > May I know the steps/configurations to repro it?
> >
> > Thanks,
> > Ray
> >
> > -----Original Message-----
> > From: amd-gfx On Behalf Of Tom St Denis
> > Sent: Wednesday, September 5, 2018 9:27 PM
> > To: Koenig, Christian; Daenzer, Michel;
> > amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > Subject: Re: two KASANs in TTM logic
> >
> > Logs attached.
> >
> > Tom
> >
> > On 09/05/2018 08:02 AM, Christian König wrote:
> >> Still not the slightest idea what is causing this and the patch
> >> definitely fixes things a lot.
> >>
> >> Can you try to enable list debugging in your kernel?
> >>
> >> Thanks,
> >> Christian.
> >>
> >> Am 04.09.2018 um 19:18 schrieb Tom St Denis:
> >>> Sure:
> >>>
> >>> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
> >>> commit d2917f399e0b250f47d07da551a335843a24f835
> >>> Author: Christian König
> >>> Date: Thu Aug 30 10:04:53 2018 +0200
> >>>
> >>> drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
> >>>
> >>> First step to fix the LRU corruption, we accidentially tried to
> >>> move things
> >>> on the LRU after dropping the lock.
> >>>
> >>> Signed-off-by: Christian König
> >>> Tested-by: Michel Dänzer
> >>>
> >>> :04 04 ed5be1ad4da129c4154b2b43acf7ef349a470700
> >>> 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers
> >>>
> >>>
> >>> The commit before that I can run xonotic-glx and piglit on my Carrizo
> >>> without a KASAN.
> >>>
> >>> Tom
> >>>
> >>> On 09/04/2018 10:05 AM, Christian König wrote:
> >>>> The first one should already be fixed.
> >>>>
> >>>> Not sure where the second comes from. Can you narrow that down further?
> >>>>
> >>>> Christian.
> >>>>
> >>>> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
> >>>>> First is caused by this commit while running a GL heavy application.
> >>>>>
> >>>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
> >>>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
> >>>>> Author: Michel Dänzer
> >>>>> Date: Wed Aug 29 11:59:38 2018 +0200
> >>>>>
> >>>>> Revert "drm/amdgpu: move PD/PT bos on LRU again"
> >>>>>
> >>>>> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
> >>>>>
> >>>>> It triggered various badness on my development machine when
> >>>>> running the
> >>>>> piglit gpu profile with radeonsi on Bonaire, looks like memory
> >>>>> corruption due to insufficiently protected list manipulations.
> >>>>>
> >>>>> Signed-off-by: Michel Dänzer
> >>>>> Signed-off-by: Alex Deucher
> >>>>>
> >>>>> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d
> >>>>> 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
> >>>>>
> >>>>> The second is caused by something between that and the tip of the
> >>>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
> >>>>> loading GNOME.
> >>>>>
> >>>>> Tom
> >>>>>
> >>>>>
> >>>>>
> >>>>> ___
> >>>>> amd-gfx mailing list
> >>>>> amd-gfx@lists.freedesktop.org
> >>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: two KASANs in TTM logic
Hi Ray,

in the meantime can we disable the feature once more in the kernel until
we have hammered out all possible corner cases?

As Tom figured out, commenting out setting "bulk_moveable" to true should
be enough.

Thanks,
Christian.

Am 07.09.2018 um 08:51 schrieb Huang, Ray:
> Hi Tom,
>
> Thanks for tracing this issue. I am trying to reproduce it on
> amd-staging-drm-next with piglit.
> May I know the steps/configurations to repro it?
>
> Thanks,
> Ray
>
> -----Original Message-----
> From: amd-gfx On Behalf Of Tom St Denis
> Sent: Wednesday, September 5, 2018 9:27 PM
> To: Koenig, Christian; Daenzer, Michel; amd-gfx@lists.freedesktop.org; Deucher, Alexander
> Subject: Re: two KASANs in TTM logic
>
> Logs attached.
>
> Tom
>
> On 09/05/2018 08:02 AM, Christian König wrote:
>> Still not the slightest idea what is causing this and the patch
>> definitely fixes things a lot.
>>
>> Can you try to enable list debugging in your kernel?
>>
>> Thanks,
>> Christian.
>>
>> Am 04.09.2018 um 19:18 schrieb Tom St Denis:
>>> Sure:
>>>
>>> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
>>> commit d2917f399e0b250f47d07da551a335843a24f835
>>> Author: Christian König
>>> Date: Thu Aug 30 10:04:53 2018 +0200
>>>
>>> drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
>>>
>>> First step to fix the LRU corruption, we accidentially tried to
>>> move things on the LRU after dropping the lock.
>>>
>>> Signed-off-by: Christian König
>>> Tested-by: Michel Dänzer
>>>
>>> :04 04 ed5be1ad4da129c4154b2b43acf7ef349a470700 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers
>>>
>>> The commit before that I can run xonotic-glx and piglit on my Carrizo
>>> without a KASAN.
>>>
>>> Tom
>>>
>>> On 09/04/2018 10:05 AM, Christian König wrote:
>>>> The first one should already be fixed.
>>>>
>>>> Not sure where the second comes from. Can you narrow that down further?
>>>>
>>>> Christian.
>>>>
>>>> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
>>>>> First is caused by this commit while running a GL heavy application.
>>>>>
>>>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>>>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>>>>> Author: Michel Dänzer
>>>>> Date: Wed Aug 29 11:59:38 2018 +0200
>>>>>
>>>>> Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>>>>
>>>>> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>>>>
>>>>> It triggered various badness on my development machine when
>>>>> running the piglit gpu profile with radeonsi on Bonaire, looks like
>>>>> memory corruption due to insufficiently protected list manipulations.
>>>>>
>>>>> Signed-off-by: Michel Dänzer
>>>>> Signed-off-by: Alex Deucher
>>>>>
>>>>> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>>>>>
>>>>> The second is caused by something between that and the tip of the
>>>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>>>>> loading GNOME.
>>>>>
>>>>> Tom

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
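For readers following along, the workaround discussed above can be pictured with a small user-space sketch. It assumes, as the thread implies, that "bulk_moveable" gates a fast path and is only set to true after a full slow-path pass. The names `struct vm_sketch`, `allow_bulk_move` and `move_to_lru_tail_sketch()` are hypothetical and chosen for illustration; this is not the amdgpu code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VM object; only the flag name
 * "bulk_moveable" comes from the thread. */
struct vm_sketch {
    bool bulk_moveable;     /* true: the cached bulk range is trusted */
    unsigned bulk_moves;    /* number of fast-path (bulk) moves taken */
    unsigned single_moves;  /* number of slow-path (per-BO) passes taken */
};

/* Assumption: false mimics "commenting out the assignment of true". */
static bool allow_bulk_move = false;

static void move_to_lru_tail_sketch(struct vm_sketch *vm)
{
    assert(vm != NULL);

    if (vm->bulk_moveable) {
        /* Fast path: move the whole cached range in one step. */
        vm->bulk_moves++;
        return;
    }

    /* Slow path: walk every buffer object individually. */
    vm->single_moves++;

    /* The assignment under discussion. If it never runs, the flag
     * stays false and the fast path is never entered. */
    if (allow_bulk_move)
        vm->bulk_moveable = true;
}
```

With `allow_bulk_move` left false, every call takes the slow path, which is the "disable bulk_move totally" behaviour being asked about. Any other place that sets the flag would need the same treatment.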
RE: two KASANs in TTM logic
Hi Tom,

Thanks for tracing this issue. I am trying to reproduce it on
amd-staging-drm-next with piglit.
May I know the steps/configurations to repro it?

Thanks,
Ray

-----Original Message-----
From: amd-gfx On Behalf Of Tom St Denis
Sent: Wednesday, September 5, 2018 9:27 PM
To: Koenig, Christian; Daenzer, Michel; amd-gfx@lists.freedesktop.org; Deucher, Alexander
Subject: Re: two KASANs in TTM logic

Logs attached.

Tom

On 09/05/2018 08:02 AM, Christian König wrote:
> Still not the slightest idea what is causing this and the patch
> definitely fixes things a lot.
>
> Can you try to enable list debugging in your kernel?
>
> Thanks,
> Christian.
>
> Am 04.09.2018 um 19:18 schrieb Tom St Denis:
>> Sure:
>>
>> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
>> commit d2917f399e0b250f47d07da551a335843a24f835
>> Author: Christian König
>> Date: Thu Aug 30 10:04:53 2018 +0200
>>
>> drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
>>
>> First step to fix the LRU corruption, we accidentially tried to
>> move things
>> on the LRU after dropping the lock.
>>
>> Signed-off-by: Christian König
>> Tested-by: Michel Dänzer
>>
>> :04 04 ed5be1ad4da129c4154b2b43acf7ef349a470700
>> 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers
>>
>>
>> The commit before that I can run xonotic-glx and piglit on my Carrizo
>> without a KASAN.
>>
>> Tom
>>
>> On 09/04/2018 10:05 AM, Christian König wrote:
>>> The first one should already be fixed.
>>>
>>> Not sure where the second comes from. Can you narrow that down further?
>>>
>>> Christian.
>>>
>>> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
>>>> First is caused by this commit while running a GL heavy application.
>>>>
>>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>>>> Author: Michel Dänzer
>>>> Date: Wed Aug 29 11:59:38 2018 +0200
>>>>
>>>> Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>>>
>>>> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>>>
>>>> It triggered various badness on my development machine when
>>>> running the
>>>> piglit gpu profile with radeonsi on Bonaire, looks like memory
>>>> corruption due to insufficiently protected list manipulations.
>>>>
>>>> Signed-off-by: Michel Dänzer
>>>> Signed-off-by: Alex Deucher
>>>>
>>>> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d
>>>> 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>>>>
>>>> The second is caused by something between that and the tip of the
>>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>>>> loading GNOME.
>>>>
>>>> Tom
>>>>
>>>>
>>>>
>>>> ___
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: two KASANs in TTM logic
Logs attached.

Tom

On 09/05/2018 08:02 AM, Christian König wrote:
> Still not the slightest idea what is causing this and the patch
> definitely fixes things a lot.
>
> Can you try to enable list debugging in your kernel?
>
> Thanks,
> Christian.
>
> Am 04.09.2018 um 19:18 schrieb Tom St Denis:
>> Sure:
>>
>> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
>> commit d2917f399e0b250f47d07da551a335843a24f835
>> Author: Christian König
>> Date: Thu Aug 30 10:04:53 2018 +0200
>>
>> drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
>>
>> First step to fix the LRU corruption, we accidentially tried to
>> move things on the LRU after dropping the lock.
>>
>> Signed-off-by: Christian König
>> Tested-by: Michel Dänzer
>>
>> :04 04 ed5be1ad4da129c4154b2b43acf7ef349a470700 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers
>>
>> The commit before that I can run xonotic-glx and piglit on my Carrizo
>> without a KASAN.
>>
>> Tom
>>
>> On 09/04/2018 10:05 AM, Christian König wrote:
>>> The first one should already be fixed.
>>>
>>> Not sure where the second comes from. Can you narrow that down further?
>>>
>>> Christian.
>>>
>>> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
>>>> First is caused by this commit while running a GL heavy application.
>>>>
>>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>>>> Author: Michel Dänzer
>>>> Date: Wed Aug 29 11:59:38 2018 +0200
>>>>
>>>> Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>>>
>>>> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>>>
>>>> It triggered various badness on my development machine when
>>>> running the piglit gpu profile with radeonsi on Bonaire, looks like
>>>> memory corruption due to insufficiently protected list manipulations.
>>>>
>>>> Signed-off-by: Michel Dänzer
>>>> Signed-off-by: Alex Deucher
>>>>
>>>> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>>>>
>>>> The second is caused by something between that and the tip of the
>>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>>>> loading GNOME.
>>>>
>>>> Tom

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[0.00] Linux version 4.19.0-rc1+ (root@raven) (gcc version 8.1.1 20180712 (Red Hat 8.1.1-5) (GCC)) #24 SMP Wed Sep 5 08:59:20 EDT 2018
[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.19.0-rc1+ root=UUID=66163c80-0ca1-4beb-aeba-5cc130b813e6 ro rhgb quiet modprobe.blacklist=amdgpu,radeon LANG=en_CA.UTF-8
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009d3ff] usable
[0.00] BIOS-e820: [mem 0x0009d400-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x03ff] usable
[0.00] BIOS-e820: [mem 0x0400-0x04009fff] ACPI NVS
[0.00] BIOS-e820: [mem 0x0400a000-0x09bf] usable
[0.00] BIOS-e820: [mem 0x09c0-0x09ff] reserved
[0.00] BIOS-e820: [mem 0x0a00-0x0aff] usable
[0.00] BIOS-e820: [mem 0x0b00-0x0b01] reserved
[0.00] BIOS-e820: [mem 0x0b02-0x73963fff] usable
[0.00] BIOS-e820: [mem 0x73964000-0x7397cfff] ACPI data
[0.00] BIOS-e820: [mem 0x7397d000-0x7a5aafff] usable
[0.00] BIOS-e820: [mem 0x7a5ab000-0x7a6c2fff] reserved
[0.00] BIOS-e820: [mem 0x7a6c3000-0x7a6cefff] ACPI data
[0.00] BIOS-e820: [mem 0x7a6cf000-0x7a7d1fff] usable
[0.00] BIOS-e820: [mem 0x7a7d2000-0x7ab89fff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7ab8a000-0x7b942fff] reserved
[0.00] BIOS-e820: [mem 0x7b943000-0x7dff] usable
[0.00] BIOS-e820: [mem 0x7e00-0xbfff] reserved
[0.00] BIOS-e820: [mem 0xf800-0xfbff] reserved
[0.00] BIOS-e820: [mem 0xfd80-0xfdff] reserved
[0.00] BIOS-e820: [mem 0xfea0-0xfea0] reserved
[0.00] BIOS-e820: [mem 0xfeb8-0xfec01fff] reserved
[0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] reserved
[0.00] BIOS-e820: [mem 0xfec3-0xfec30fff] reserved
[0.00]
Re: two KASANs in TTM logic
Hi Christian,

Will in a sec. I'm doing a piglit run with Felix's KFD patch on top of
HEAD~ just to verify that everything before that is peachy on my
Raven+Polaris rig.

Tom

On 09/05/2018 08:02 AM, Christian König wrote:
> Still not the slightest idea what is causing this and the patch
> definitely fixes things a lot.
>
> Can you try to enable list debugging in your kernel?
>
> Thanks,
> Christian.
>
> Am 04.09.2018 um 19:18 schrieb Tom St Denis:
>> Sure:
>>
>> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
>> commit d2917f399e0b250f47d07da551a335843a24f835
>> Author: Christian König
>> Date: Thu Aug 30 10:04:53 2018 +0200
>>
>> drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
>>
>> First step to fix the LRU corruption, we accidentially tried to
>> move things on the LRU after dropping the lock.
>>
>> Signed-off-by: Christian König
>> Tested-by: Michel Dänzer
>>
>> :04 04 ed5be1ad4da129c4154b2b43acf7ef349a470700 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers
>>
>> The commit before that I can run xonotic-glx and piglit on my Carrizo
>> without a KASAN.
>>
>> Tom
>>
>> On 09/04/2018 10:05 AM, Christian König wrote:
>>> The first one should already be fixed.
>>>
>>> Not sure where the second comes from. Can you narrow that down further?
>>>
>>> Christian.
>>>
>>> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
>>>> First is caused by this commit while running a GL heavy application.
>>>>
>>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>>>> Author: Michel Dänzer
>>>> Date: Wed Aug 29 11:59:38 2018 +0200
>>>>
>>>> Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>>>
>>>> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>>>
>>>> It triggered various badness on my development machine when
>>>> running the piglit gpu profile with radeonsi on Bonaire, looks like
>>>> memory corruption due to insufficiently protected list manipulations.
>>>>
>>>> Signed-off-by: Michel Dänzer
>>>> Signed-off-by: Alex Deucher
>>>>
>>>> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>>>>
>>>> The second is caused by something between that and the tip of the
>>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>>>> loading GNOME.
>>>>
>>>> Tom

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: two KASANs in TTM logic
Still not the slightest idea what is causing this and the patch
definitely fixes things a lot.

Can you try to enable list debugging in your kernel?

Thanks,
Christian.

Am 04.09.2018 um 19:18 schrieb Tom St Denis:
> Sure:
>
> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
> commit d2917f399e0b250f47d07da551a335843a24f835
> Author: Christian König
> Date: Thu Aug 30 10:04:53 2018 +0200
>
> drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
>
> First step to fix the LRU corruption, we accidentially tried to
> move things on the LRU after dropping the lock.
>
> Signed-off-by: Christian König
> Tested-by: Michel Dänzer
>
> :04 04 ed5be1ad4da129c4154b2b43acf7ef349a470700 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers
>
> The commit before that I can run xonotic-glx and piglit on my Carrizo
> without a KASAN.
>
> Tom
>
> On 09/04/2018 10:05 AM, Christian König wrote:
>> The first one should already be fixed.
>>
>> Not sure where the second comes from. Can you narrow that down further?
>>
>> Christian.
>>
>> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
>>> First is caused by this commit while running a GL heavy application.
>>>
>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>>> Author: Michel Dänzer
>>> Date: Wed Aug 29 11:59:38 2018 +0200
>>>
>>> Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>>
>>> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>>
>>> It triggered various badness on my development machine when
>>> running the piglit gpu profile with radeonsi on Bonaire, looks like
>>> memory corruption due to insufficiently protected list manipulations.
>>>
>>> Signed-off-by: Michel Dänzer
>>> Signed-off-by: Alex Deucher
>>>
>>> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>>>
>>> The second is caused by something between that and the tip of the
>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>>> loading GNOME.
>>>
>>> Tom

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
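The list debugging Christian asks for is presumably the kernel's CONFIG_DEBUG_LIST option, which validates neighbour pointers before list insertions and deletions so that corruption is reported where it is first noticed. Below is a minimal user-space sketch of that kind of check. The names `struct node`, `checked_add()` and `checked_del()` are hypothetical, and this is an illustration of the idea, not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Circular doubly linked list node, loosely modelled on the
 * kernel's struct list_head (hypothetical simplified version). */
struct node {
    struct node *prev;
    struct node *next;
};

/* An empty list is a head node that points to itself. */
static void node_init(struct node *n)
{
    n->prev = n;
    n->next = n;
}

/* Insertion between prev and next is only safe if the two are
 * really adjacent and the new node is not one of them. */
static bool add_valid(const struct node *new_node,
                      const struct node *prev, const struct node *next)
{
    return next->prev == prev && prev->next == next &&
           new_node != prev && new_node != next;
}

/* Insert new_node between prev and next; refuse if the list looks corrupt. */
static bool checked_add(struct node *new_node,
                        struct node *prev, struct node *next)
{
    assert(new_node != NULL && prev != NULL && next != NULL);
    if (!add_valid(new_node, prev, next))
        return false;
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;
    return true;
}

/* Unlink entry; refuse on double deletion or inconsistent neighbours. */
static bool checked_del(struct node *entry)
{
    assert(entry != NULL);
    if (entry->prev == NULL || entry->next == NULL)
        return false;               /* already removed earlier */
    if (entry->prev->next != entry || entry->next->prev != entry)
        return false;               /* neighbours do not point back */
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->prev = NULL;             /* mark as removed */
    entry->next = NULL;
    return true;
}
```

A list entry that is moved without the protecting lock held, as the quoted commit message describes, tends to leave neighbours that no longer point back at each other, which is the condition these checks reject.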
Re: two KASANs in TTM logic
Sure:

d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
commit d2917f399e0b250f47d07da551a335843a24f835
Author: Christian König
Date: Thu Aug 30 10:04:53 2018 +0200

drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2

First step to fix the LRU corruption, we accidentially tried to
move things on the LRU after dropping the lock.

Signed-off-by: Christian König
Tested-by: Michel Dänzer

:04 04 ed5be1ad4da129c4154b2b43acf7ef349a470700 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers

The commit before that I can run xonotic-glx and piglit on my Carrizo
without a KASAN.

Tom

On 09/04/2018 10:05 AM, Christian König wrote:
> The first one should already be fixed.
>
> Not sure where the second comes from. Can you narrow that down further?
>
> Christian.
>
> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
>> First is caused by this commit while running a GL heavy application.
>>
>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>> Author: Michel Dänzer
>> Date: Wed Aug 29 11:59:38 2018 +0200
>>
>> Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>
>> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>
>> It triggered various badness on my development machine when
>> running the piglit gpu profile with radeonsi on Bonaire, looks like
>> memory corruption due to insufficiently protected list manipulations.
>>
>> Signed-off-by: Michel Dänzer
>> Signed-off-by: Alex Deucher
>>
>> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>>
>> The second is caused by something between that and the tip of the
>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>> loading GNOME.
>>
>> Tom

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: two KASANs in TTM logic
The first one should already be fixed.

Not sure where the second comes from. Can you narrow that down further?

Christian.

Am 04.09.2018 um 15:46 schrieb Tom St Denis:
> First is caused by this commit while running a GL heavy application.
>
> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
> Author: Michel Dänzer
> Date: Wed Aug 29 11:59:38 2018 +0200
>
> Revert "drm/amdgpu: move PD/PT bos on LRU again"
>
> This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>
> It triggered various badness on my development machine when
> running the piglit gpu profile with radeonsi on Bonaire, looks like
> memory corruption due to insufficiently protected list manipulations.
>
> Signed-off-by: Michel Dänzer
> Signed-off-by: Alex Deucher
>
> :04 04 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>
> The second is caused by something between that and the tip of the
> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
> loading GNOME.
>
> Tom

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx