On Mon, Nov 6, 2017 at 9:19 AM Warren Young <war...@etr-usa.com> wrote:

> On Nov 3, 2017, at 12:08 PM, Richard Hipp <d...@sqlite.org> wrote:
> >
> > On 11/3/17, Olivier R. <m...@grammalecte.net> wrote:
> >>
> >>
> >> Sorry. My knowledge of the C toolchain is null.
> >
> > The next step will be to figure out
> > how to attach the debugger to a hung process.
>
> Problem #1 could be fixed (in principle) without any more help from you,
> Olivier: PIDs 888 and 893 are zombies, meaning Fossil is forking off
> children without calling wait() on them.  That’s why their VIRT column
> shows as 0 in your screenshot: the kernel has stripped every resource it
> can from them, and is holding onto only the exit status and such for the
> parent’s benefit.  This is a bug in Fossil, plain and simple.
>
> That said, zombies are nearly harmless, merely adding noise to the process
> table.  They don’t explain your actual symptom.
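
For reference on the wait() point, here is a minimal sketch in C of how a
forking server can avoid leaving zombies behind.  This is not Fossil's
actual code; the names spawn_worker, reap_exited_children and
install_child_reaper are made up for illustration, and it assumes a POSIX
system.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork one child per connection.  The child runs
 * handler() (if non-NULL) and exits; the parent gets the child's PID
 * back right away, or -1 if fork() failed. */
static pid_t spawn_worker(void (*handler)(void))
{
    pid_t pid = fork();
    if (pid == 0) {
        if (handler != NULL)
            handler();
        _exit(0);
    }
    return pid;
}

/* Collect every child that has already exited, without blocking.
 * Returns how many children were reaped. */
static int reap_exited_children(void)
{
    int reaped = 0;
    int saved_errno = errno;   /* do not disturb interrupted code */

    for (;;) {
        pid_t pid = waitpid(-1, NULL, WNOHANG);
        if (pid > 0) {
            reaped++;          /* one zombie removed from the table */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;          /* interrupted, try again */
        break;                 /* 0: others still running; -1: no children */
    }

    errno = saved_errno;
    return reaped;
}

/* SIGCHLD handler: waitpid() is async-signal-safe, so this is allowed. */
static void on_sigchld(int signo)
{
    (void)signo;
    reap_exited_children();
}

/* Install the handler.  Returns 0 on success, -1 on failure. */
static int install_child_reaper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

A server built this way would call install_child_reaper() once before its
accept loop and spawn_worker() per connection.  The loop around waitpid()
matters because several children can exit while only one SIGCHLD is
delivered.
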
>
> The remaining PIDs are almost certainly a single parent with multiple
> children.  You’d have to run top in “tree” mode or show the PPID column to
> find out which one is the parent.  You can tell without doing that by the
> fact that all of the VIRT column values are identical, meaning that within
> the limits of top’s reporting resolution, the children are allocating no
> dynamic virtual memory of their own, which is what we’d expect from a
> forking HTTP child-per-conn model.
>
> Given all of that, I’d just pick one of the PIDs and attach to it:
>
>     $ gdb -p 26819
>
> If that works, say “bt” when attached, then “quit” to detach again.  Post
> the backtrace output here, Olivier.
>
> If it doesn’t work, it’s probably due to lack of debugging permission on
> the target system, in which case you’ve got some sysadminning ahead of you,
> not on topic here.
>
> But, this does not look like a madly-spinning system.  The CPU is idle and
> the PIDs are pretty far apart.
>


Does contemporary Linux not randomize its PIDs?


> Basically, it’s looking like each one is the result of an HTTP transaction
> and the child just isn’t dying at transaction end as it should.  This
> should only be a serious problem when the children collectively hold so
> many resources that the system can’t run properly.
>
> Bottom line, I don’t think the top output explains the problem.
> _______________________________________________
> fossil-users mailing list
> fossil-users@lists.fossil-scm.org
> http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
>
