On Sun, 8 Apr 2018 13:50:14 +0900
Carsten Haitzler (The Rasterman) <ras...@rasterman.com> wrote:
>
> terminate sends a term signal. process may trap this and not respond
> if it so chooses. kill signals can't be trapped so kill will kill.

Sure, the parent entrance process is trapping those signals, but not
the client. The problem is a breakdown between server and client. It
seems the ecore_exe stuff is not helping in this case.

The parent entrance process traps the kill signal, then tells the
client to stop, and that stops the parent server process. The server
process stopping depends on being able to control the client process.

I can see the signals trapped and the call to ecore_exe_terminate. But
the client isn't terminated properly, or something is happening that
normally should not. It seems to be something funky in the Travis
Ubuntu env. I cannot replicate the same issues on live systems or via
Xephyr.

> to know the result the ECORE_EXE_DEL event will happen - listen for
> it with the exe that exited and why (exit code). if this event
> doesn't come in - the app hasn't exited yet.

Something is going wrong. The client does exit, but the
handler/callback is never fired. I am doing a ps afterwards and I can
see no entrance_client process.
https://travis-ci.org/Obsidian-StudiosInc/entrance/jobs/363681776#L1261

Yet I never see the log from the callback.

src/daemon/entrance.c:302 _entrance_client_del() client have terminated

ecore_event_handler_add(ECORE_EXE_EVENT_DEL, _entrance_client_del, NULL);

Those [Xorg] <defunct> and [bash] <defunct> processes may be a sign of
the problem. The bash one is from entrance_client. But the hang may be
related to Xorg itself.

> if the process has exited already these will have no effect. it
> could be the process is stuck in a kernel syscall (does happen) and
> the kernel is refusing to end the process.

Seems like something is stuck or going wrong, but with entrance, not
entrance_client. Whatever is going wrong seems to affect their
communication. I had some trials before where I killed the client. It
got stuck trying to kill the parent or something.
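
For reference, the terminate-then-kill behaviour discussed above can be
sketched with plain POSIX calls. This is only an illustration under
stated assumptions, not what entrance or ecore_exe does internally:
stop_child(), spawn_sleeper() and the grace_ms parameter are
hypothetical names made up for this example. It also shows why
<defunct> entries appear: an exited child stays a zombie until its
parent collects it with waitpid().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that just sleeps. If ignore_term is
 * set, the child ignores SIGTERM. The pipe makes the parent wait until
 * the child has set its signal disposition, so there is no startup race. */
static pid_t spawn_sleeper(int ignore_term)
{
    int ready[2];
    if (pipe(ready) == -1)
        return -1;
    pid_t pid = fork();
    if (pid == 0) {
        if (ignore_term)
            signal(SIGTERM, SIG_IGN);
        close(ready[0]);
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        close(ready[1]);
        for (;;)
            pause();
    }
    close(ready[1]);
    if (pid > 0) {
        char c;
        if (read(ready[0], &c, 1) != 1) {
            /* child died early; the caller will see that via waitpid() */
        }
    }
    close(ready[0]);
    return pid;
}

/* Hypothetical helper: send SIGTERM to a direct child, poll for up to
 * grace_ms, then escalate to SIGKILL. Returns the last signal sent
 * before the child was reaped (SIGTERM or SIGKILL), 0 if there was
 * nothing to signal or reap, -1 on any other error. */
static int stop_child(pid_t pid, int grace_ms, int *status)
{
    assert(pid > 0);
    if (kill(pid, SIGTERM) == -1)
        return errno == ESRCH ? 0 : -1;  /* ESRCH: no such pid */

    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, status, WNOHANG);
        if (r == pid)
            return SIGTERM;              /* exited and reaped, no zombie left */
        if (r == -1)
            return errno == ECHILD ? 0 : -1;
        struct timespec ts = { 0, 10 * 1000 * 1000 };  /* 10 ms */
        nanosleep(&ts, NULL);
    }

    /* SIGTERM was trapped or ignored. SIGKILL cannot be, but a process
     * stuck in an uninterruptible syscall would still block this wait. */
    if (kill(pid, SIGKILL) == -1)
        return -1;
    pid_t r;
    while ((r = waitpid(pid, status, 0)) == -1 && errno == EINTR)
        ;
    return r == pid ? SIGKILL : -1;
}
```

Calling stop_child(pid, 500, &status) on a child that ignores SIGTERM
falls through to SIGKILL after roughly half a second. Calling waitpid()
directly in a program that uses ecore_exe could interfere with its own
child handling, so treat this as a standalone illustration only.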

This was the furthest I ever got
https://travis-ci.org/Obsidian-StudiosInc/entrance/jobs/363371263#L1242

But that got stuck and never logged "login shutdown".
https://github.com/Obsidian-StudiosInc/entrance/blob/master/src/bin/entrance_client.c#L87

The message shown comes from entrance_gui_shutdown();
https://github.com/Obsidian-StudiosInc/entrance/blob/master/src/bin/entrance_gui.c#L234

I cannot see anything there that would cause that to hang, but it
never makes it to "login shutdown".

Another attempt later never made it past _entrance_server_del
https://travis-ci.org/Obsidian-StudiosInc/entrance/jobs/363663070#L1359

I thought that one was due to failing to write profile files. But with
that resolved, it seems to just hang...
https://travis-ci.org/Obsidian-StudiosInc/entrance/jobs/363681776#L1307

> kill() (the syscall) will not tell you this. the
> only things it will tell you is if you don't have permission to kill
> the process (another uid and you are not root for example), or the
> pid does not exist.

I was thinking more of the case where there was an error killing the
process. Then I could take further action. Issue another kill, maybe
change signals from 9 to 15. Or some action to kill the stuck process.

> i assume you know the pid exists and it's there
> as you say it's not going away...

That is what I was thinking about doing after calling
ecore_exe_terminate. See if the PID existed and if not, take action.

> so either it's a different uid (setuid?)

That is taking place in the client, but ps is showing all processes
running as root. That suid does not cause problems on normal systems.
The only time I cannot shut down entrance normally is when a desktop
session is running, an active E desktop session.
https://github.com/Obsidian-StudiosInc/entrance/issues/5

Not sure if that is related to the problem I am having or not.

> , or it's ignoring term signals but kill should work (unless
> permissions fail), and other than that ... it may be kernel holding
> on in a syscall.
> i have seen this happen often enough over the years
> and end up with an unkillable process because of it.

I do not believe it's being ignored, but something funky is happening.
I am not sure if it is related or not, but to get things to start, I
have to send, via kill, a signal that I believe the X server should
send.

kill -SIGUSR1
https://github.com/Obsidian-StudiosInc/entrance/blob/master/tests/run-tests.sh#L25

I could not figure out why that was not sent on its own.

It could be I am rushing things. I increased the timeout between start
and kill to 30, but am not really seeing any difference. Not sure if I
need a delay between the start of X and stopping/killing entrance. For
normal hardware I can see it taking time to start. For virtual with
dummy, I am not sure there should be any delay.

There could be some general issue with signals, or something else; I
am not sure. Lots of funky issues in the Travis environment with
entrance. But I see that as a good thing, as it is helping to improve
error handling. It is surely presenting some odd scenarios.

-- 
William L. Thomson Jr.
_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel