Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-10 Thread StDenis, Tom
Hi Mike, This might be an issue better suited for our llvm team since umr just uses llvm-dev to access the disassembly code. I'll make sure the key folk are aware. Cheers, Tom On 2019-01-10 10:22 a.m., Mikhail Gavrilov wrote: > On Thu, 10 Jan 2019 at 00:36, Mikhail Gavrilov > wrote: >> >>

Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-10 Thread Mikhail Gavrilov
On Thu, 10 Jan 2019 at 00:36, Mikhail Gavrilov wrote: > > All new one logs attached here. > > Thanks. > > P.S. This time I had to terminate command `./umr -O verbose,follow -R > gfx[.] > gfx.log 2>&1` cause it tried to write log infinitely. > I also had to terminate command `./umr -O

Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-10 Thread Michel Dänzer
On 2019-01-10 10:42 a.m., Mikhail Gavrilov wrote: > On Thu, 10 Jan 2019 at 13:54, Michel Dänzer wrote: >> >> Assuming that's using DXVK, it could be an issue between DXVK and RADV. >> I'd start by filing a bug report against RADV. > > In the case of the last report, I agree it makes sense. > But

Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-10 Thread Mikhail Gavrilov
On Thu, 10 Jan 2019 at 13:54, Michel Dänzer wrote: > > Assuming that's using DXVK, it could be an issue between DXVK and RADV. > I'd start by filing a bug report against RADV. > In the case of the last report, I agree it makes sense. But from the beginning I started this discussion because "*

Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-10 Thread Michel Dänzer
On 2019-01-09 10:12 p.m., Mikhail Gavrilov wrote: > On Thu, 10 Jan 2019 at 01:35, Grodzovsky, Andrey > wrote: >> >> I think the 'verbose' flag causes it to dump so much output, maybe try >> without it in ALL the commands above. >> Are you aware of any particular application during which run

Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-09 Thread Mikhail Gavrilov
On Thu, 10 Jan 2019 at 01:35, Grodzovsky, Andrey wrote: > > I think the 'verbose' flag causes it to dump so much output, maybe try > without it in ALL the commands above. > Are you aware of any particular application during which run this happens > ? > Last logs related to situation when I

Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-09 Thread Grodzovsky, Andrey
Can you launch Proton from the command line, prepending GALLIUM_DDEBUG=1000 to the command? This should create Mesa debug info in ~/ddebug_dumps/ Andrey On 01/09/2019 04:12 PM, Mikhail Gavrilov wrote: > On Thu, 10 Jan 2019 at 01:35, Grodzovsky, Andrey > wrote: >> I think the 'verbose' flag

Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-09 Thread Grodzovsky, Andrey
On 01/09/2019 02:36 PM, Mikhail Gavrilov wrote: > On Mon, 7 Jan 2019 at 23:47, Grodzovsky, Andrey > wrote: >> I see 'no active waves' print meaning it's not shader hang. >> >> We can try and estimate around which commands the hang occurred - in >> addition to what you already print please also

Re: After Vega 56/64 GPU hang I unable reboot system

2019-01-07 Thread Grodzovsky, Andrey
I see 'no active waves' print meaning it's not shader hang. We can try and estimate around which commands the hang occurred - in addition to what you already print please also dump sudo umr -O many,bits  -r *.*.mmGRBM_STATUS* && sudo umr -O many,bits  -r *.*.mmCP_EOP_* && sudo umr -O many,bits

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-22 Thread Mikhail Gavrilov
On Thu, 20 Dec 2018 at 21:20, StDenis, Tom wrote: > > Sorry I didn't mean to be dismissive. It's just not a bug in umr though. > > On Fedora I can access those files as root just fine: > > tom@fx8:~$ sudo bash > [sudo] password for tom: > root@fx8:/home/tom# cd /sys/kernel/debug/dri/0 >

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-20 Thread StDenis, Tom
On 2018-12-20 11:07 a.m., Mikhail Gavrilov wrote: > On Thu, 20 Dec 2018 at 19:19, StDenis, Tom wrote: >> >> Ya I was right. With a plain build I can access the files just fine. >> >> >> >> I did manage to get into a weird shell where I couldn't cat >> amdgpu_gca_config from bash though after a

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-20 Thread Mikhail Gavrilov
On Thu, 20 Dec 2018 at 19:19, StDenis, Tom wrote: > > Ya I was right. With a plain build I can access the files just fine. > > > > I did manage to get into a weird shell where I couldn't cat > amdgpu_gca_config from bash though after a reboot (had updates pending) > it works fine. > > If you

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-20 Thread StDenis, Tom
On 2018-12-20 9:08 a.m., Tom St Denis wrote: > On 2018-12-20 9:06 a.m., Tom St Denis wrote: >> On 2018-12-20 6:45 a.m., Mikhail Gavrilov wrote: >>> On Thu, 20 Dec 2018 at 16:17, StDenis, Tom wrote: Well yup the kernel is not letting you open the files: As sudo/root you

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-20 Thread StDenis, Tom
On 2018-12-20 9:06 a.m., Tom St Denis wrote: > On 2018-12-20 6:45 a.m., Mikhail Gavrilov wrote: >> On Thu, 20 Dec 2018 at 16:17, StDenis, Tom wrote: >>> >>> Well yup the kernel is not letting you open the files: >>> >>> >>> As sudo/root you should be able to open these files with umr.  What >>>

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-20 Thread StDenis, Tom
On 2018-12-20 6:45 a.m., Mikhail Gavrilov wrote: > On Thu, 20 Dec 2018 at 16:17, StDenis, Tom wrote: >> >> Well yup the kernel is not letting you open the files: >> >> >> As sudo/root you should be able to open these files with umr. What >> happens if you just open a shell as root and run it? >>

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-19 Thread Mikhail Gavrilov
On Thu, 20 Dec 2018 at 03:41, StDenis, Tom wrote: > sudo strace umr -R gfx[.] 2>&1 | tee strace.log > > will capture everything. > > In the mean time I can fix at least the segfault. > > The issue is why can't it open "amdgpu_ring_gfx". > > Tom > strace file is attached here. -- Best Regards,

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-19 Thread Mikhail Gavrilov
On Thu, 20 Dec 2018 at 01:56, StDenis, Tom wrote: > > Sorry missed the gfx ring in the reply. > > Um what kernel version? 4.20.0-0.rc6 > Is this the latest umr? yes, master branch, commit 546c30a71f7b87f97f2a96eab184c3973b014711 > Maybe capture a trace of umr to see what is happening. Cannot

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-19 Thread Mikhail Gavrilov
I see that backtrace in my previous message are borked. I place backtrace in text file for more comfort reading in this message. -- Best Regards, Mike Gavrilov. Cannot seek to MMIO address: Bad file descriptor [ERROR]: Could not open ring debugfs file Program received signal SIGSEGV,

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-19 Thread StDenis, Tom
On 2018-12-19 4:21 p.m., Mikhail Gavrilov wrote: > I see that backtrace in my previous message are borked. > I place backtrace in text file for more comfort reading in this message. The backtrace points to the segfault in umr caused when it fails to read the file. We want to know why it can't

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-19 Thread StDenis, Tom
No gfx ring? You can specify a ring name for --waves; it should be in the docs. It's not in the web docs but it is in the help text https://cgit.freedesktop.org/amd/umr/tree/src/app/main.c#n643 I'll fix the web docs when I'm in next. Tom On December 19, 2018 3:21:25 PM EST, "Grodzovsky, Andrey"

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-19 Thread Grodzovsky, Andrey
+Tom Andrey On 12/19/2018 01:35 PM, Mikhail Gavrilov wrote: > On Tue, 18 Dec 2018 at 00:08, Grodzovsky, Andrey > wrote: >> Please install UMR and dump gfx ring content and waves after the hang is >> happening. >> >> UMR at - https://cgit.freedesktop.org/amd/umr/ >> Waves dump >> sudo umr -O

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-19 Thread Mikhail Gavrilov
On Tue, 18 Dec 2018 at 00:08, Grodzovsky, Andrey wrote: > > Please install UMR and dump gfx ring content and waves after the hang is > happening. > > UMR at - https://cgit.freedesktop.org/amd/umr/ > Waves dump > sudo umr -O verbose,halt_waves -wa > GFX ring dump > sudo umr -O verbose,follow -R

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-17 Thread Grodzovsky, Andrey
On 12/17/2018 01:51 PM, Wentland, Harry wrote: > On 2018-12-15 4:42 a.m., Mikhail Gavrilov wrote: >> On Sat, 15 Dec 2018 at 00:36, Wentland, Harry wrote: >>> Looks like there's an error before this happens that might get us into this >>> mess: >>> >>> [ 229.741741] [drm:amdgpu_job_timedout

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-17 Thread Wentland, Harry
On 2018-12-15 4:42 a.m., Mikhail Gavrilov wrote: > On Sat, 15 Dec 2018 at 00:36, Wentland, Harry wrote: >> >> Looks like there's an error before this happens that might get us into this >> mess: >> >> [ 229.741741] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, >> signaled

Re: After Vega 56/64 GPU hang I unable reboot system

2018-12-14 Thread Wentland, Harry
On 2018-12-04 5:18 p.m., Mikhail Gavrilov wrote: > Hi guys. > I having troubles when Vega GPU is hang I unable reboot system. > > Can somebody help me please. > > I tried enter follow commands: > # init 6 > # reboot > But no one of them having success. > > SysRq-W showing to us this: > Looks