Carsten Haitzler <ras...@rasterman.com> wrote (on Wed, 5 Jan 2022, 11:54):
>
> On Wed, 5 Jan 2022 08:41:05 +0100 "ezerot...@gmail.com" <ezerot...@gmail.com>
> said:
>
> > Carsten Haitzler <ras...@rasterman.com> wrote (on Wed, 5 Jan 2022, 0:37):
> > >
> > > On Tue, 4 Jan 2022 22:31:26 +0100 "ezerot...@gmail.com"
> > > <ezerot...@gmail.com> said:
> > >
> > > > Carsten Haitzler <ras...@rasterman.com> wrote (on Tue, 4 Jan 2022,
> > > > 15:21):
> > > > >
> > > > > On Tue, 4 Jan 2022 11:56:00 +0100 "ezerot...@gmail.com"
> > > > > <ezerot...@gmail.com> said:
> > > > >
> > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (on Mon, 3 Jan
> > > > > > 2022, 22:49):
> > > > > > >
> > > > > > > On Mon, 3 Jan 2022 22:28:19 +0100 "ezerot...@gmail.com"
> > > > > > > <ezerot...@gmail.com> said:
> > > > > > >
> > > > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (on Mon, 3 Jan
> > > > > > > > 2022, 21:36):
> > > > > > > > >
> > > > > > > > > On Mon, 3 Jan 2022 19:34:41 +0100 "ezerot...@gmail.com"
> > > > > > > > > <ezerot...@gmail.com> said:
> > > > > > > > >
> > > > > > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (on Mon,
> > > > > > > > > > 3 Jan 2022, 19:13):
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 3 Jan 2022 17:07:43 +0100 "ezerot...@gmail.com"
> > > > > > > > > > > <ezerot...@gmail.com> said:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > > I've a brand new amd laptop with an nvidia mobile GPU. It
> > > > > > > > > > > > > arrived with TuxedoOS (ubuntu 20.04 + budgie wm)
> > > > > > > > > > > > > preinstalled. That setup works fine out of the box, but I
> > > > > > > > > > > > > want to replace budgie with enlightenment, because that's
> > > > > > > > > > > > > what I always use on linux.
> > > > > > > > > > > >
> > > > > > > > > > > > I've compiled E 0.25 from git (using
> > > > > > > > > > > > https://github.com/batden/esteem), and it seemed to work
> > > > > > > > > > > > fine. Unfortunately, when I tested suspend+resume, I had
> > > > > > > > > > > > a problem. The desktop resumes, but only with minimal
> > > > > > > > > > > > brightness, and then it seems to freeze (no
> > > > > > > > > > > > keyboard/mouse). I can ssh into the laptop, and killing
> > > > > > > > > > > > enlightenment sends me back to the lightdm login prompt.
> > > > > > > > > > > >
> > > > > > > > > > > > dmesg has this:
> > > > > > > > > > > >
> > > > > > > > > > > > [11814.110778] PM: suspend exit
> > > > > > > > > > > > [11814.630838] NVRM: GPU at PCI:0000:01:00:
> > > > > > > > > > > > GPU-589fde69-1161-f26b-1773-e5bcda70d601
> > > > > > > > > > > > [11814.630845] NVRM: Xid (PCI:0000:01:00): 13, pid=5525,
> > > > > > > > > > > > Graphics Exception: Shader Program Header 11 Error
> > > > > > > > > > > > [11814.630855] NVRM: Xid (PCI:0000:01:00): 13, pid=5525,
> > > > > > > > > > > > Graphics Exception: Shader Program Header 18 Error
> > > > > > > > > > > > [11814.630865] NVRM: Xid (PCI:0000:01:00): 13, pid=5525,
> > > > > > > > > > > > Graphics Exception: ESR 0x405840=0xa2040800
> > > > > > > > > > > > [11814.630877] NVRM: Xid (PCI:0000:01:00): 13, pid=5525,
> > > > > > > > > > > > Graphics Exception: ESR 0x405848=0x80000000
> > > > > > > > > > > >
> > > > > > > > > > > > The problem happens with both the sw and the opengl
> > > > > > > > > > > > compositors.
> > > > > > > > > > > >
> > > > > > > > > > > > > When I suspend from the lightdm prompt or from the budgie
> > > > > > > > > > > > > desktop, resuming works fine. So it seems something is
> > > > > > > > > > > > > happening/not happening with the nvidia card when the
> > > > > > > > > > > > > suspend is started from E.
> > > > > > > > > > > >
> > > > > > > > > > > > > Does anyone have any idea how to debug this?
> > > > > > > > > > > i suspect it may have to do with vblank interrupts. the
> > > > > > > > > > > nvidia driver doesn't produce them anymore? a quick way to
> > > > > > > > > > > test this:
> > > > > > > > > > >
> > > > > > > > > > > touch ~/.ecore-no-vsync
> > > > > > > > > > >
> > > > > > > > > > > restart e then do your suspend/resume
> > > > > > > > > >
> > > > > > > > > > Thanks for your reply. Unfortunately the problem seems to be
> > > > > > > > > > somewhere else, as resuming still fails the same way.
> > > > > > > > > > Anything else to try? Could rebuilding E in debugging mode
> > > > > > > > > > help?
> > > > > > > > >
> > > > > > > > > > probably not - btw - those shader exceptions might have to do
> > > > > > > > > > with it. evas caches binaries for shaders. rm -rf
> > > > > > > > > > ~/.cache/evas_gl_common_caches/ - but beyond that the only thing
> > > > > > > > > > left is your driver. those are its shaders it compiled.
> > > > > > > > >
> > > > > > > > > google for it: "Graphics Exception: Shader Program Header 11
> > > > > > > > > Error"
> > > > > > > > >
> > > > > > > > > seems to actually be OS independent and happen on windows too.
> > > > > > > > >
> > > > > > > > > https://forums.developer.nvidia.com/t/screen-system-is-dead-on-resume-unable-to-resume-with-all-current-drivers/29872/57?page=3
> > > > > > > > >
> > > > > > > > > this has been there for a long time... and it seems it doesn't
> > > > > > > > > get resolved.
> > > > > > > > >
> > > > > > > > > https://github.com/Bumblebee-Project/Bumblebee/issues/739
> > > > > > > >
> > > > > > > > Yeah, I've tried googling for this too, but found no solutions
> > > > > > > > either.
> > > > > > > >
> > > > > > > > > it could be that evas uses egl+gles and the nvidia driver
> > > > > > > > > implementation for egl+gles is buggy - you can rebuild efl to
> > > > > > > > > use full desktop opengl+glx (-Dopengl=full).
> > > > > > > >
> > > > > > > > I've deleted the evas cache, and set the compositor to SW to make
> > > > > > > > sure that it's not an evas egl problem. The exceptions are still
> > > > > > > > there. Actually there are 3 exceptions for the kernel thread
> > > > > > > > "[irq/92-nvidia]", and 1 for Xorg. When the compositor was set to
> > > > > > > > opengl there were more exceptions, and one of them was for the
> > > > > > > > enlightenment process.
> > > > > > > >
> > > > > > > > So my guess is, that this may not be a problem in E, but maybe a
> > > > > > > > missing/extra step during suspend/resume. I'll look into this
> > > > > > > > tomorrow.
> > > > > > > >
> > > > > > > > Thanks for your help, Laszlo
> > > > > > >
> > > > > > > hmm i wonder why the nvidia driver is complaining - something is
> > > > > > > using a shader program of some sort and it's not happy at all. there
> > > > > > > is something deeper going on here. but yes - with e using opengl for
> > > > > > > compositing it'll be driving the gpu (via opengl) and thus more
> > > > > > > chance of something going wrong.
> > > > > >
> > > > > > I've found another strange thing. In my original configuration I used
> > > > > > amdgpu+nvidia X drivers. Now I switched to modesetting+nvidia.
> > > > > > Resuming fails again, but there is a different new problem. After
> > > > > > starting E from lightdm as usual, I press ctrl+alt+end to restart E,
> > > > > > it fades to black as usual, then it switches to something that looks
> > > > > > like a console (empty black screen with a cursor line) and stays
> > > > > > there. I can not restore the desktop until I kill E.  No exceptions
> > > > > > from nvidia in the dmesg this time. Any idea for this?
> > > > >
> > > > > so this is an optimus setup of some sort but now with amd + nvidia... i
> > > > > might imagine something goes wrong setting up randr maybe? simotek found
> > > > > his optimus setup required a forced refresh of randr info ... and e has
> > > > > that in it (otherwise edid info would not be populated right). check
> > > > > ~/.e-log.log - it will tell you what e is doing randr-wise and what it
> > > > > sees, but you should end up with some kind of screen. perhaps go back
> > > > > away from modesetting to amdgpu + nvidia?
> > > >
> > > > I've switched off the optimus stuff, and checked what happens with the
> > > > nvidia only setup. Unfortunately it failed with the usual GPU error.
> > > >
> > > > Then I switched back to amdgpu+nvidia again, and saved the log file.
> > > > Maybe you can see something in it:
> > > >
> > > > https://drive.google.com/file/d/1r69Bw43uMS8xWM2wemqxUvIAr0xH76pp/view?usp=sharing
> > >
> > > resume has nothing odd to do with randr.. but this smells a bit weird:
> > >
> > > ERROR: ecore_animator thread - epoll_wait(..., 200) at 3870,51700 should
> > > have slept ~ 0,01667s but took 1,65593s!
> > >
> > > that smells very wrong - the animator thread asked to sleep for 16.67ms but
> > > slept 1650ms instead ... and this is measuring monotonic time - not wall
> > > clock. monotonic stops ticking when suspended. this thread is dedicated to
> > > ticking for animation so will not be blocked by the mainloop... this is
> > > kernel not sleeping for anywhere near the time it should.
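
For reference, a minimal Python sketch of the kind of check that log line describes: ask for a short sleep, time it on the monotonic clock, and flag it when the real duration is far beyond the request. This is not the ecore_animator code. time.sleep stands in for epoll_wait, and the names measure_sleep and is_late as well as the factor of 10 are assumptions made up for illustration.

```python
import time

FRAME = 1.0 / 60.0  # ~0.01667 s, the interval seen in the log line


def measure_sleep(requested):
    """Sleep for `requested` seconds and return the elapsed monotonic time."""
    start = time.monotonic()
    time.sleep(requested)  # stand-in for epoll_wait(..., timeout)
    return time.monotonic() - start


def is_late(requested, elapsed, factor=10.0):
    """True when a sleep took more than `factor` times the requested time.

    The factor is an arbitrary illustration threshold, not a value from EFL.
    """
    return elapsed > requested * factor


if __name__ == "__main__":
    elapsed = measure_sleep(FRAME)
    print("asked for %.5fs, slept %.5fs, late=%s"
          % (FRAME, elapsed, is_late(FRAME, elapsed)))
```

With the figures from the log (0.01667 requested, 1.65593 taken) is_late returns True, while a sleep of about 0.017 is not flagged.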
> > >
> > > so with amdgpu+nvidia it works? i'm not sure from your mail.
> >
> > None of amdgpu+nvidia, modesetting+nvidia, and nvidia alone work - GPU
> > shader error when resuming. Desktop is at minimal brightness, no
> > inputs accepted.
>
> Well it could be E is hung - you will only know if you send a SEGV signal
> (kill -SEGV `pidof enlightenment`) then collect a backtrace with gdb and see
> where it's at.

Actually it seems that it is not E that is hung, but rather the X server.
When I kill E, it gets restarted (new PID) but the desktop remains frozen. I
have to kill enlightenment_start to get back to the lightdm login
prompt.
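
One way to look at a suspect process over ssh is its state letter in /proc. Below is a small Python sketch, assuming a Linux /proc filesystem, that reads that letter from /proc/<pid>/stat (R running, S sleeping, D uninterruptible sleep, and so on). A process that stays in D for a long time is usually blocked inside the kernel or a driver. The helper names are made up for illustration, and this is only a sketch of a possible check, not the method used above.

```python
import sys


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line.

    The second field (comm) is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the LAST ')' instead of on spaces.
    """
    rest = stat_line[stat_line.rindex(")") + 1:]
    return rest.split()[0]


def read_state(pid):
    """Read the state of a live process (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_state(f.read())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 procstate.py <pid>   (the script name is arbitrary)
    print(read_state(int(sys.argv[1])))
```

For example, it could be run as python3 procstate.py `pidof Xorg` and again with the enlightenment PID to compare the two.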

> > With modesetting+nvidia there is a new problem: restarting E with
> > ctrl+alt+end does not work (switches to console mode). Suspend/resume
> > is not involved in this, and there is no GPU error.
>
> I can't help a lot with nvidia - I gave up on them years ago because they
> didn't want to play ball with Wayland like everyone else and frankly having
> their kernel driver keep breaking on kernel upgrades (kernel changes api/abi -
> nvidia driver can't build anymore and i'm forced to manually downgrade my
> kernel). I can say that all of my machines run arch linux (except some of my
> arm devices - they are special and mostly used as testbeds and not stable
> systems) and they all use either amd or intel graphics and suspend/resume
> works.

Well, I originally wanted to buy an amd CPU+amd GPU laptop, but none
of the ones I found ticked all the boxes. Now I have amd CPU+nvidia GPU
and an ugly shader error... :-/

After some googling, I found that it's possible to disable the nvidia
GPU in nvidia-settings, and use amdgpu exclusively. I've tried this,
and E+resume works as it should! Unfortunately I have no external
monitor outputs in this mode, because only nvidia is wired to the
hdmi/DP ports. Oh well.

> I am wondering if there is some bizarre side effect from the gesture support.
> go into e_main.c in e's src and find e_gesture_init() and e_gesture_shutdown()
> and comment those lines out and rebuild e. there is a bizarre side effect
> inside vbox with the xorg vmware driver that this causes (doesn't happen on
> real hardware like intel/amd).

Thanks for the idea, I'll look into this.

Laszlo