On Tue, Jan 9, 2024 at 9:17 AM Carsten Haitzler <ras...@rasterman.com> wrote:
>
> On Mon, 8 Jan 2024 23:20:16 -0800 Ross Vandegrift <r...@kallisti.us> said:
>
> > On Mon, Jan 08, 2024 at 11:08:55PM +0000, Carsten Haitzler wrote:
> > > try running the above eina test suite and pipe it to something that makes it
> > > time out... and strace it - or gdb attach to it and find out where it's
> > > sitting? it should complete in < 1 sec so launch and immediately try to
> > > strace and/or gdb attach and find out where it's at - if it is still around.
> > >
> > > is somehow a forked child not coming back that it expects to... ?
> >
> > Yea, it's something like this.  I found out it hangs for exactly 60s, which
> > led me to timeout.c.  I also learned strace -f triggers the issue.
> > CK_FORK=no fixes the hang as well.
>
> I just reproduced the issue... So it looks like it's a check bug of some sort.
> literally i see it hang - i've gone through and forced a longer timeout of
> 240sec on everything so i have a lot longer to play (probably a good idea
> anyway)... and literally all the test suite processes have exited... but check
> still thinks they are running. this would be in the check meson/ninja infra
> waiting on the child procs. either it should have failed ... or it should have
> succeeded by the time the efl test suite process exits... but they are all gone
> and check is still waiting on a whole bunch of them...
>
> > I added debug printfs to efl_check.h and timeout.c - when eina_suite tries to
> > kill timeout, it kills the wrong pid:
> >
> >   $ ./build/src/tests/eina/eina_suite fp
> >   Running suite(s): eina_init_module
> >   100%: Checks: 0, Failures: 0, Errors: 0
> >   -------------------- efl_check forked timeout: 296393    <-----
> >   -------------------- efl_check forked timeout: 0
> >   Running suite(s): Eina
> >   -------------------- timeout.c my pid: 296396            <-----
> >   Max delta(multiplication): 0.007627 (0.061668%)
> >   Max delta(division): 0.000173 (0.740211%)
> >   100%: Checks: 4, Failures: 0, Errors: 0
> >   -------------------- efl_check killing timeout child: 296393
> >   -------------------- efl_check cleared timeout_pid: 0
> >
> > So eina_suite.c gets the wrong pid from fork().  In a simple standalone
> > program, fork() behaves as expected.
>
> now a returned pid of 0 ... THAT IS WRONG! well unless it's inside the child.
> the parent should get -1 for a failure or > 0 for the child pid. the child gets
> 0 ... but see below
>
> > I'm going to compare the arch & debian check packages for any suspicious
> > differences.  And maybe walk through more carefully with gdb.  But I'm out of
> > time tonight.
>
> Yeah - I see the problem in debian SID (this is on aarch64 btw) ... but it
> totally smells of a check bug as above. just a cursory glance tells me that
> something is wrong over there - if check is sitting and waiting on test suite
> procs that have already exited/gone...
>
> part of my pstree:
>
>         │         │         │                 │               
> ├─terminology─┬─zsh───ninja───sh───meson
>         │         │         │                 │               │             
> └─3*[{terminology}]
>
> that's a ninja test... sitting there:
>
> [20-39/39] 🌔 ecore_wl2-suite                           87/240s
>
> i.e. waiting on tests 20-39, now with a 240sec timeout (my local mods to
> force this), with no child processes. so something is up there in check+meson...

have you tried without fork:

https://libcheck.github.io/check/doc/check_html/check_4.html#No-Fork-Mode

Vincent


_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
