On Wed, 5 Jan 2022 13:57:39 +0100 "ezerot...@gmail.com" <ezerot...@gmail.com>
said:

> Carsten Haitzler <ras...@rasterman.com> wrote (Wed, 5 Jan 2022,
> 11:54):
> >
> > On Wed, 5 Jan 2022 08:41:05 +0100 "ezerot...@gmail.com"
> > <ezerot...@gmail.com> said:
> >
> > > Carsten Haitzler <ras...@rasterman.com> wrote (Wed, 5 Jan 2022,
> > > 0:37):
> > > >
> > > > On Tue, 4 Jan 2022 22:31:26 +0100 "ezerot...@gmail.com"
> > > > <ezerot...@gmail.com> said:
> > > >
> > > > > Carsten Haitzler <ras...@rasterman.com> wrote (Tue, 4 Jan 2022,
> > > > > 15:21):
> > > > > >
> > > > > > On Tue, 4 Jan 2022 11:56:00 +0100 "ezerot...@gmail.com"
> > > > > > <ezerot...@gmail.com> said:
> > > > > >
> > > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (Mon, 3 Jan
> > > > > > > 2022, 22:49):
> > > > > > > >
> > > > > > > > On Mon, 3 Jan 2022 22:28:19 +0100 "ezerot...@gmail.com"
> > > > > > > > <ezerot...@gmail.com> said:
> > > > > > > >
> > > > > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (Mon, 3 Jan
> > > > > > > > > 2022, 21:36):
> > > > > > > > > >
> > > > > > > > > > On Mon, 3 Jan 2022 19:34:41 +0100 "ezerot...@gmail.com"
> > > > > > > > > > <ezerot...@gmail.com> said:
> > > > > > > > > >
> > > > > > > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (Mon, 3 Jan
> > > > > > > > > > > 2022, 19:13):
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 3 Jan 2022 17:07:43 +0100 "ezerot...@gmail.com"
> > > > > > > > > > > > <ezerot...@gmail.com> said:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've a brand new amd laptop with an nvidia mobile
> > > > > > > > > > > > > GPU. It arrived with TuxedoOS (ubuntu 20.04 + budgie
> > > > > > > > > > > > > wm) preinstalled. That setup works fine out of the
> > > > > > > > > > > > > box, but I want to replace budgie with enlightenment,
> > > > > > > > > > > > > because that's what I always use on linux.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've compiled E 0.25 from git (using
> > > > > > > > > > > > > https://github.com/batden/esteem), and it seemed to
> > > > > > > > > > > > > work fine. Unfortunately, when I tested
> > > > > > > > > > > > > suspend+resume, I had a problem. The desktop resumes,
> > > > > > > > > > > > > but only with minimal brightness, and then it seems
> > > > > > > > > > > > > to freeze (no keyboard/mouse). I can ssh into the
> > > > > > > > > > > > > laptop, and killing enlightenment sends me back to
> > > > > > > > > > > > > the lightdm login prompt.
> > > > > > > > > > > > >
> > > > > > > > > > > > > dmesg has this:
> > > > > > > > > > > > >
> > > > > > > > > > > > > [11814.110778] PM: suspend exit
> > > > > > > > > > > > > [11814.630838] NVRM: GPU at PCI:0000:01:00: GPU-589fde69-1161-f26b-1773-e5bcda70d601
> > > > > > > > > > > > > [11814.630845] NVRM: Xid (PCI:0000:01:00): 13, pid=5525, Graphics Exception: Shader Program Header 11 Error
> > > > > > > > > > > > > [11814.630855] NVRM: Xid (PCI:0000:01:00): 13, pid=5525, Graphics Exception: Shader Program Header 18 Error
> > > > > > > > > > > > > [11814.630865] NVRM: Xid (PCI:0000:01:00): 13, pid=5525, Graphics Exception: ESR 0x405840=0xa2040800
> > > > > > > > > > > > > [11814.630877] NVRM: Xid (PCI:0000:01:00): 13, pid=5525, Graphics Exception: ESR 0x405848=0x80000000
> > > > > > > > > > > > >
> > > > > > > > > > > > > The problem happens with both the sw and the opengl
> > > > > > > > > > > > > compositors.
> > > > > > > > > > > > >
> > > > > > > > > > > > > When I suspend from the lightdm prompt or from the
> > > > > > > > > > > > > budgie desktop, resuming works fine. So it seems
> > > > > > > > > > > > > something is happening/not happening with the nvidia
> > > > > > > > > > > > > card when the suspend is started from E.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Does anyone have any idea how to debug this?
> > > > > > > > > > > >
> > > > > > > > > > > > i suspect it may have to do with vblank interrupts. the
> > > > > > > > > > > > nvidia driver doesn't produce them anymore? a quick way
> > > > > > > > > > > > to test this:
> > > > > > > > > > > >
> > > > > > > > > > > > touch ~/.ecore-no-vsync
> > > > > > > > > > > >
> > > > > > > > > > > > restart e then do your suspend/resume
> > > > > > > > > > >
> > > > > > > > > > > Thanks for your reply. Unfortunately the problem seems to
> > > > > > > > > > > be somewhere else, as resuming still fails the same way.
> > > > > > > > > > > Anything else to try? Could rebuilding E in debugging mode
> > > > > > > > > > > help?
> > > > > > > > > >
> > > > > > > > > > probably not - btw - those shader exceptions might have to
> > > > > > > > > > do with it. evas caches binaries for shaders. rm -rf
> > > > > > > > > > ~/.cache/evas_gl_common_caches/ - but beyond that the only
> > > > > > > > > > thing left is your driver. those are its shaders it
> > > > > > > > > > compiled.
> > > > > > > > > >
> > > > > > > > > > google for it: "Graphics Exception: Shader Program Header 11
> > > > > > > > > > Error"
> > > > > > > > > >
> > > > > > > > > > seems to actually be OS independent and happen on windows
> > > > > > > > > > too.
> > > > > > > > > >
> > > > > > > > > > https://forums.developer.nvidia.com/t/screen-system-is-dead-on-resume-unable-to-resume-with-all-current-drivers/29872/57?page=3
> > > > > > > > > >
> > > > > > > > > > this has been there for a long time... and it seems it
> > > > > > > > > > doesn't get resolved.
> > > > > > > > > >
> > > > > > > > > > https://github.com/Bumblebee-Project/Bumblebee/issues/739
> > > > > > > > >
> > > > > > > > > Yeah, I've tried googling for this too, but found no solutions
> > > > > > > > > either.
> > > > > > > > >
> > > > > > > > > > it could be that evas uses egl+gles and the nvidia driver
> > > > > > > > > > implementation for egl+gles is buggy - you can rebuild efl
> > > > > > > > > > to use full desktop opengl+glx (-Dopengl=full).
> > > > > > > > >
> > > > > > > > > I've deleted the evas cache, and set the compositor to SW to
> > > > > > > > > make sure that it's not an evas egl problem. The exceptions
> > > > > > > > > are still there. Actually there are 3 exceptions for the
> > > > > > > > > kernel thread "[irq/92-nvidia]", and 1 for Xorg. When the
> > > > > > > > > compositor was set to opengl there were more exceptions, and
> > > > > > > > > one of them was for the enlightenment process.
> > > > > > > > >
> > > > > > > > > So my guess is, that this may not be a problem in E, but
> > > > > > > > > maybe a missing/extra step during suspend/resume. I'll look
> > > > > > > > > into this tomorrow.
> > > > > > > > >
> > > > > > > > > Thanks for your help, Laszlo
> > > > > > > >
> > > > > > > > hmm i wonder why the nvidia driver is complaining - something is
> > > > > > > > using a shader program of some sort and it's not happy at all.
> > > > > > > > there is something deeper going on here. but yes - with e using
> > > > > > > > opengl for compositing it'll be driving the gpu (via opengl)
> > > > > > > > and thus more chance of something going wrong.
> > > > > > >
> > > > > > > I've found another strange thing. In my original configuration I
> > > > > > > used amdgpu+nvidia X drivers. Now I switched to
> > > > > > > modesetting+nvidia. Resuming fails again, but there is a
> > > > > > > different new problem. After starting E from lightdm as usual, I
> > > > > > > press ctrl+alt+end to restart E, it fades to black as usual, then
> > > > > > > it switches to something that looks like a console (empty black
> > > > > > > screen with a cursor line) and stays there. I can not restore the
> > > > > > > desktop until I kill E.  No exceptions from nvidia in the dmesg
> > > > > > > this time. Any idea for this?
> > > > > >
> > > > > > so this is an optimus setup of some sort but now with amd +
> > > > > > nvidia... i might imagine something goes wrong setting up randr
> > > > > > maybe? simotek found his optimus setup required a forced refresh of
> > > > > > randr info ... and e has that in it (otherwise edid info would not
> > > > > > be populated right). check ~/.e-log.log - it will tell you what e
> > > > > > is doing randr-wise and what it sees, but you should end up with
> > > > > > some kind of screen. perhaps go back away from modesetting to
> > > > > > amdgpu + nvidia?
> > > > >
> > > > > I've switched off the optimus stuff, and checked what happens with the
> > > > > nvidia only setup. Unfortunately it failed with the usual GPU error.
> > > > >
> > > > > Then I switched back to amdgpu+nvidia again, and saved the log file.
> > > > > Maybe you can see something in it:
> > > > >
> > > > > https://drive.google.com/file/d/1r69Bw43uMS8xWM2wemqxUvIAr0xH76pp/view?usp=sharing
> > > >
> > > > resume has nothing odd to do with randr.. but this smells a bit weird:
> > > >
> > > > ERROR: ecore_animator thread - epoll_wait(..., 200) at 3870,51700 should
> > > > have slept ~ 0,01667s but took 1,65593s!
> > > >
> > > > that smells very wrong - the animator thread asked to sleep for 16.67ms
> > > > but slept 1650ms instead ... and this is measuring monotonic time - not
> > > > wall clock. monotonic stops ticking when suspended. this thread is
> > > > dedicated to ticking for animation so will not be blocked by the
> > > > mainloop... this is kernel not sleeping for anywhere near the time it
> > > > should.
> > > >
> > > > so with amdgpu+nvidia it works? i'm not sure from your mail.
> > >
> > > None of amdgpu+nvidia, modesetting+nvidia, and nvidia alone work - GPU
> > > shader error when resuming. Desktop is at minimal brightness, no
> > > inputs accepted.
> >
> > Well it could be E is hung - you will only know if you send a SEGV signal
> > (kill -SEGV `pidof enlightenment`) then collect a backtrace with gdb and
> > see where it's at.
> 
> Actually it seems that it's not E that is hung, but rather the X server.
> When I kill E, it gets restarted (new PID) but the desktop remains
> frozen. I have to kill enlightenment_start to get back to the lightdm
> login prompt.

wow.. well then... maybe e hit on an xorg/nvidia driver bug? some people have
reported bad things with sddm - somehow it has caused e to launch in wayland ..
or xwayland (i dont know how it could do the latter so i assume it launched in
wl mode).

> > > With modesetting+nvidia there is a new problem: restarting E with
> > > ctrl+alt+end does not work (switches to console mode). Suspend/resume
> > > is not involved in this, and there is no GPU error.
> >
> > I can't help a lot with nvidia - I gave up on them years ago because they
> > didn't want to play ball with Wayland like everyone else and, frankly,
> > because their kernel driver kept breaking on kernel upgrades (kernel changes
> > api/abi - nvidia driver can't build anymore and i'm forced to manually
> > downgrade my kernel). I can say that all of my machines run arch linux
> > (except some of my arm devices - they are special and mostly used as
> > testbeds and not stable systems) and they all use either amd or intel
> > graphics and suspend/resume works.
> 
> Well, I originally wanted to buy an amd CPU+amd GPU laptop, but none
> of the ones I found ticked all the boxes. Now I have amd CPU+nvidia
> GPU and an ugly shader error... :-/

well this is personal - but i'd just veto any choices that involve an nvidia
gpu. if nvidia drivers were all oss like amd - i wouldn't have as much of an
issue. i know it doesn't help you now, but maybe in future choices.

> After some googling, I found that it's possible to disable the nvidia
> GPU in nvidia-settings, and use amdgpu exclusively. I've tried this,
> and E+resume works as it should! Unfortunately I have no external
> monitor outputs in this mode, because only nvidia is wired to the
> hdmi/DP ports. Oh well.

well wow.. so something to do with nvidia maybe optimus ... but... hmmm. but at
least see if you can get a backtrace from e to see where it is stuck - if it
is. that will tell me some information at least.
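
a rough sketch of one way to grab that backtrace over ssh, assuming gdb is installed on the laptop. the function name and the default output path are made up for illustration, and on ubuntu you will probably need to run it as root (sudo), since the default ptrace_scope setting stops gdb from attaching to a process it did not start:

```shell
# Hypothetical helper (not part of e or efl): attach gdb to a running
# process by name and dump a backtrace of every thread to a file.
collect_bt() {
    name="${1:-enlightenment}"
    out="${2:-/tmp/e-backtrace.txt}"
    # pidof may print several pids; take the first one
    pid="$(pidof "$name" 2>/dev/null | awk '{print $1}')"
    if [ -z "$pid" ]; then
        echo "no running process named $name" >&2
        return 1
    fi
    # -batch makes gdb run the -ex command, then detach and exit
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}
```

run collect_bt while the desktop is frozen and send the resulting file. gdb only attaches and detaches here, so e is left running (or stuck) exactly as it was.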

> > I am wondering if there is some bizarre side effect from the gesture
> > support. go into e_main.c in e's src and find e_gesture_init() and
> > e_gesture_shutdown() and comment those lines out and rebuild e. there is a
> > bizarre side effect inside vbox with the xorg vmware driver that this
> > causes (doesnt happen on real hardware like intel/amd).
> 
> Thanks for the idea, I'll look into this.
> 
> Laszlo
> 


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - ras...@rasterman.com



_______________________________________________
enlightenment-users mailing list
enlightenment-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-users