On Mon, 8 Jan 2024 23:20:16 -0800 Ross Vandegrift <r...@kallisti.us> said:

> On Mon, Jan 08, 2024 at 11:08:55PM +0000, Carsten Haitzler wrote:
> > try run the above eina test suite and pipe to something that makes it
> > timeout... and strace it - or gdb attach to it and find out where it's
> > sitting? it should complete in < 1 sec so launch and immediately try and
> > strace and/or gdb attach and find out where it's at - if it is still around.
> > 
> > is somehow a forked child not coming back that it expects to... ?
> 
> Yea, it's something like this.  I found out it hangs for exactly 60s, which
> led me to timeout.c.  I also learned strace -f triggers the issue.
> CK_FORK=no fixes the hang as well.

I just reproduced the issue... So it looks like it's a check bug of some sort.
i literally see it hang - i've gone through and forced a longer timeout of
240sec on everything so i have a lot longer to play (probably a good idea
anyway)... and literally all the test suite processes have exited... but check
still thinks they are running. this would be in the check meson/ninja infra
waiting on the child procs. either it should have failed ... or it should have
succeeded by the time the efl test suite process exits... but they are all gone
and check is still waiting on a whole bunch of them...

> I added debug printfs to efl_check.h and timeout.c - when eina_suite tries to
> kill timeout, it kills the wrong pid:
> 
>   $ ./build/src/tests/eina/eina_suite fp
>   Running suite(s): eina_init_module
>   100%: Checks: 0, Failures: 0, Errors: 0
>   -------------------- efl_check forked timeout: 296393    <-----
>   -------------------- efl_check forked timeout: 0
>   Running suite(s): Eina
>   -------------------- timeout.c my pid: 296396            <-----
>   Max delta(multiplication): 0.007627 (0.061668%)
>   Max delta(division): 0.000173 (0.740211%)
>   100%: Checks: 4, Failures: 0, Errors: 0
>   -------------------- efl_check killing timeout child: 296393
>   -------------------- efl_check cleared timeout_pid: 0
> 
> So eina_suite.c gets the wrong pid from fork().  In a simple standalone
> program, fork() behaves as expected.

now a returned pid of 0 ... THAT IS WRONG! well, unless it's inside the child.
the parent should get -1 for a failure or > 0 for the child pid. the child gets
0 ... but see below

> I'm going to compare the arch & debian check packages for any suspicious
> differences.  And maybe walk through more carefully with gdb.  But I'm out of
> time tonight.

Yeah - I see the problem in debian SID (this is on aarch64 btw) ... but it
totally smells of a check bug as above. just a cursory glance tells me that
something is wrong over there - if check is sitting and waiting on test suite
procs that have already exited/gone...

part of my pstree:

        │         │         │                 │               ├─terminology─┬─zsh───ninja───sh───meson
        │         │         │                 │               │             └─3*[{terminology}]

that's a ninja test... sitting there:

[20-39/39] 🌔 ecore_wl2-suite                           87/240s

i.e. waiting on tests 20-39, now with a 240sec timeout (my local mods to force
this), with no child processes. so something is up there in check+meson...

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - ras...@rasterman.com



