On Wed, 5 Jan 2022 13:57:39 +0100 "ezerot...@gmail.com" <ezerot...@gmail.com> said:

> Carsten Haitzler <ras...@rasterman.com> wrote (on Wed, 5 Jan 2022, 11:54):
> >
> > On Wed, 5 Jan 2022 08:41:05 +0100 "ezerot...@gmail.com"
> > <ezerot...@gmail.com> said:
> >
> > > Carsten Haitzler <ras...@rasterman.com> wrote (on Wed, 5 Jan 2022, 0:37):
> > > >
> > > > On Tue, 4 Jan 2022 22:31:26 +0100 "ezerot...@gmail.com"
> > > > <ezerot...@gmail.com> said:
> > > >
> > > > > Carsten Haitzler <ras...@rasterman.com> wrote (on Tue, 4 Jan 2022, 15:21):
> > > > > >
> > > > > > On Tue, 4 Jan 2022 11:56:00 +0100 "ezerot...@gmail.com"
> > > > > > <ezerot...@gmail.com> said:
> > > > > >
> > > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (on Mon, 3 Jan 2022, 22:49):
> > > > > > > >
> > > > > > > > On Mon, 3 Jan 2022 22:28:19 +0100 "ezerot...@gmail.com"
> > > > > > > > <ezerot...@gmail.com> said:
> > > > > > > >
> > > > > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (on Mon, 3 Jan 2022, 21:36):
> > > > > > > > > >
> > > > > > > > > > On Mon, 3 Jan 2022 19:34:41 +0100 "ezerot...@gmail.com"
> > > > > > > > > > <ezerot...@gmail.com> said:
> > > > > > > > > >
> > > > > > > > > > > Carsten Haitzler <ras...@rasterman.com> wrote (on Mon, 3 Jan 2022, 19:13):
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 3 Jan 2022 17:07:43 +0100 "ezerot...@gmail.com"
> > > > > > > > > > > > <ezerot...@gmail.com> said:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have a brand new amd laptop with an nvidia mobile GPU. It
> > > > > > > > > > > > > arrived with TuxedoOS (ubuntu 20.04 + budgie wm) preinstalled.
> > > > > > > > > > > > > That setup works fine out of the box, but I want to replace
> > > > > > > > > > > > > budgie with enlightenment, because that's what I always use
> > > > > > > > > > > > > on linux.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've compiled E 0.25 from git (using
> > > > > > > > > > > > > https://github.com/batden/esteem), and it seemed to work fine.
> > > > > > > > > > > > > Unfortunately, when I tested suspend+resume, I had a problem.
> > > > > > > > > > > > > The desktop resumes, but only with minimal brightness, and
> > > > > > > > > > > > > then it seems to freeze (no keyboard/mouse). I can ssh into
> > > > > > > > > > > > > the laptop, and killing enlightenment sends me back to the
> > > > > > > > > > > > > lightdm login prompt.
> > > > > > > > > > > > >
> > > > > > > > > > > > > dmesg has this:
> > > > > > > > > > > > >
> > > > > > > > > > > > > [11814.110778] PM: suspend exit
> > > > > > > > > > > > > [11814.630838] NVRM: GPU at PCI:0000:01:00: GPU-589fde69-1161-f26b-1773-e5bcda70d601
> > > > > > > > > > > > > [11814.630845] NVRM: Xid (PCI:0000:01:00): 13, pid=5525, Graphics Exception: Shader Program Header 11 Error
> > > > > > > > > > > > > [11814.630855] NVRM: Xid (PCI:0000:01:00): 13, pid=5525, Graphics Exception: Shader Program Header 18 Error
> > > > > > > > > > > > > [11814.630865] NVRM: Xid (PCI:0000:01:00): 13, pid=5525, Graphics Exception: ESR 0x405840=0xa2040800
> > > > > > > > > > > > > [11814.630877] NVRM: Xid (PCI:0000:01:00): 13, pid=5525, Graphics Exception: ESR 0x405848=0x80000000
> > > > > > > > > > > > >
> > > > > > > > > > > > > The problem happens with both the sw and the opengl
> > > > > > > > > > > > > compositors.
> > > > > > > > > > > > >
> > > > > > > > > > > > > When I suspend from the lightdm prompt or from the budgie
> > > > > > > > > > > > > desktop, resuming works fine.
> > > > > > > > > > > > > So it seems something is happening/not happening with the
> > > > > > > > > > > > > nvidia card when the suspend is started from E.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Does anyone have any idea how to debug this?
> > > > > > > > > > > > i suspect it may have to do with vblank interrupts. the nvidia
> > > > > > > > > > > > driver doesn't produce them anymore? a quick way to test this:
> > > > > > > > > > > >
> > > > > > > > > > > > touch ~/.ecore-no-vsync
> > > > > > > > > > > >
> > > > > > > > > > > > restart e then do your suspend/resume
> > > > > > > > > > >
> > > > > > > > > > > Thanks for your reply. Unfortunately the problem seems to be
> > > > > > > > > > > somewhere else, as resuming still fails the same way. Anything
> > > > > > > > > > > else to try? Could rebuilding E in debugging mode help?
> > > > > > > > > >
> > > > > > > > > > probably not - btw - those shader exceptions might have to do with
> > > > > > > > > > it. evas caches binaries for shaders. rm -rf
> > > > > > > > > > ~/.cache/evas_gl_common_caches/ - but beyond that the only thing
> > > > > > > > > > left is your driver. those are its shaders it compiled.
> > > > > > > > > >
> > > > > > > > > > google for it: "Graphics Exception: Shader Program Header 11 Error"
> > > > > > > > > >
> > > > > > > > > > seems to actually be OS independent and happen on windows too.
> > > > > > > > > >
> > > > > > > > > > https://forums.developer.nvidia.com/t/screen-system-is-dead-on-resume-unable-to-resume-with-all-current-drivers/29872/57?page=3
> > > > > > > > > >
> > > > > > > > > > this has been there for a long time... and it seems it doesn't get
> > > > > > > > > > resolved.
> > > > > > > > > >
> > > > > > > > > > https://github.com/Bumblebee-Project/Bumblebee/issues/739
> > > > > > > > >
> > > > > > > > > Yeah, I've tried googling for this too, but found no solutions
> > > > > > > > > either.
> > > > > > > > >
> > > > > > > > > > it could be that evas uses egl+gles and the nvidia driver
> > > > > > > > > > implementation for egl+gles is buggy - you can rebuild efl to use
> > > > > > > > > > full desktop opengl+glx (-Dopengl=full).
> > > > > > > > >
> > > > > > > > > I've deleted the evas cache, and set the compositor to SW to make
> > > > > > > > > sure that it's not an evas egl problem. The exceptions are still
> > > > > > > > > there. Actually there are 3 exceptions for the kernel thread
> > > > > > > > > "[irq/92-nvidia]", and 1 for Xorg. When the compositor was set to
> > > > > > > > > opengl there were more exceptions, and one of them was for the
> > > > > > > > > enlightenment process.
> > > > > > > > >
> > > > > > > > > So my guess is that this may not be a problem in E, but maybe a
> > > > > > > > > missing/extra step during suspend/resume. I'll look into this
> > > > > > > > > tomorrow.
> > > > > > > > >
> > > > > > > > > Thanks for your help,
> > > > > > > > > Laszlo
> > > > > > > >
> > > > > > > > hmm i wonder why the nvidia driver is complaining - something is using
> > > > > > > > a shader program of some sort and it's not happy at all. there is
> > > > > > > > something deeper going on here. but yes - with e using opengl for
> > > > > > > > compositing it'll be driving the gpu (via opengl) and thus more chance
> > > > > > > > of something going wrong.
> > > > > > >
> > > > > > > I've found another strange thing. In my original configuration I used
> > > > > > > amdgpu+nvidia X drivers. Now I switched to modesetting+nvidia. Resuming
> > > > > > > fails again, but there is a different new problem.
> > > > > > > After starting E from lightdm as usual, I press ctrl+alt+end to restart
> > > > > > > E, it fades to black as usual, then it switches to something that looks
> > > > > > > like a console (empty black screen with a cursor line) and stays there.
> > > > > > > I cannot restore the desktop until I kill E. No exceptions from nvidia
> > > > > > > in the dmesg this time. Any idea for this?
> > > > > >
> > > > > > so this is an optimus setup of some sort but now with amd + nvidia... i
> > > > > > might imagine something goes wrong setting up randr maybe? simotek found
> > > > > > his optimus setup required a forced refresh of randr info ... and e has
> > > > > > that in it (otherwise edid info would not be populated right). check
> > > > > > ~/.e-log.log - it will tell you what e is doing randr-wise and what it
> > > > > > sees, but you should end up with some kind of screen. perhaps go back
> > > > > > away from modesetting to amdgpu + nvidia?
> > > > >
> > > > > I've switched off the optimus stuff, and checked what happens with the
> > > > > nvidia only setup. Unfortunately it failed with the usual GPU error.
> > > > >
> > > > > Then I switched back to amdgpu+nvidia again, and saved the log file.
> > > > > Maybe you can see something in it:
> > > > >
> > > > > https://drive.google.com/file/d/1r69Bw43uMS8xWM2wemqxUvIAr0xH76pp/view?usp=sharing
> > > >
> > > > resume has nothing odd to do with randr.. but this smells a bit weird:
> > > >
> > > > ERROR: ecore_animator thread - epoll_wait(..., 200) at 3870,51700 should have slept ~ 0,01667s but took 1,65593s!
> > > >
> > > > that smells very wrong - the animator thread asked to sleep for 16.67ms
> > > > but slept 1650ms instead ... and this is measuring monotonic time - not
> > > > wall clock. monotonic stops ticking when suspended. this thread is
> > > > dedicated to ticking for animation so will not be blocked by the
> > > > mainloop...
> > > > this is the kernel not sleeping for anywhere near the time it should.
> > > >
> > > > so with amdgpu+nvidia it works? i'm not sure from your mail.
> > >
> > > None of amdgpu+nvidia, modesetting+nvidia, and nvidia alone work - GPU
> > > shader error when resuming. Desktop is at minimal brightness, no
> > > inputs accepted.
> >
> > Well it could be E is hung - you will only know if you send a SEGV signal
> > (kill -SEGV `pidof enlightenment`) then collect a backtrace with gdb and
> > see where it's at.
>
> Actually it seems that it is not E that is hung, but rather the X server.
> When I kill E, it gets restarted (new PID) but the desktop remains frozen.
> I have to kill enlightenment_start to get back to the lightdm login
> prompt.

wow.. well then... maybe e hit on an xorg/nvidia driver bug? some people have
reported bad things with sddm - somehow it has caused e to launch in wayland ..
or xwayland (i don't know how it could do the latter so i assume it launched in
wl mode).

> > > With modesetting+nvidia there is a new problem: restarting E with
> > > ctrl+alt+end does not work (switches to console mode). Suspend/resume
> > > is not involved in this, and there is no GPU error.
> >
> > I can't help a lot with nvidia - I gave up on them years ago because they
> > didn't want to play ball with Wayland like everyone else and frankly having
> > their kernel driver keep breaking on kernel upgrades (kernel changes
> > api/abi - nvidia driver can't build anymore and i'm forced to manually
> > downgrade my kernel). I can say that all of my machines run arch linux
> > (except some of my arm devices - they are special and mostly used as
> > testbeds and not stable systems) and they all use either amd or intel
> > graphics and suspend/resume works.
>
> Well, I originally wanted to buy an amd CPU+amd GPU laptop, but none
> that I found ticked all the boxes. Now I have amd CPU+nvidia GPU and an
> ugly shader error...
> :-/

well this is personal - but i'd just veto any choices that involve an nvidia
gpu. if nvidia drivers were all oss like amd - i wouldn't have as much of an
issue. i know it doesn't help you now, but maybe in future choices.

> After some googling, I found that it's possible to disable the nvidia
> GPU in nvidia-settings, and use amdgpu exclusively. I've tried this,
> and E+resume works as it should! Unfortunately I have no external
> monitor outputs in this mode, because only nvidia is wired to the
> hdmi/DP ports. Oh well.

well wow.. so something to do with nvidia maybe optimus ... but... hmmm. but at
least see if you can get a backtrace from e to see where it is stuck - if it
is. that will tell me some information at least.

> > I am wondering if there is some bizarre side effect from the gesture
> > support. go into e_main.c in e's src and find e_gesture_init() and
> > e_gesture_shutdown() and comment those lines out and rebuild e. there is a
> > bizarre side effect inside vbox with the xorg vmware driver that this
> > causes (doesn't happen on real hardware like intel/amd).
>
> Thanks for the idea, I'll look into this.
>
> Laszlo
>

--
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - ras...@rasterman.com

_______________________________________________
enlightenment-users mailing list
enlightenment-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-users