On Mon, Nov 6, 2017 at 9:19 AM Warren Young <war...@etr-usa.com> wrote:
> On Nov 3, 2017, at 12:08 PM, Richard Hipp <d...@sqlite.org> wrote:
> >
> > On 11/3/17, Olivier R. <m...@grammalecte.net> wrote:
> >>
> >> Sorry. My knowledge of the C toolchain is null.
> >
> > The next step will be to figure out
> > how to attach the debugger to a hung process.
>
> Problem #1 could be fixed (in principle) without any more help from you,
> Oliver: PIDs 888 and 893 are zombies, meaning Fossil is forking off
> children without calling wait() on them. That’s why their VIRT column
> shows as 0 in your screenshot: the kernel has stripped all resources from
> them it can, and is holding onto only the exit status and such for the
> parent’s benefit. This is a bug in Fossil, plain and simple.
>
> That said, zombies are nearly harmless, merely adding noise to the process
> table. They don’t explain your actual symptom.
>
> The remaining PIDs are all certainly a single parent with multiple
> children. You’d have to run top in “tree” mode or show the PPID column to
> find out which one is the parent. You can tell without doing that by the
> fact that all of the VIRT column values are identical, meaning that within
> the limits of top’s reporting resolution, the children are allocating no
> dynamic virtual memory of their own, which is what we’d expect from a
> forking HTTP child-per-conn model.
>
> Given all of that, I’d just pick one of the PIDs and attach to it:
>
>     $ gdb -p 26819
>
> If that works, say “bt” when attached, then “quit” to detach again. Post
> the backtrace output here, Oliver.
>
> If it doesn’t work, it’s probably due to lack of debugging permission on
> the target system, in which case you’ve got some sysadminning ahead of you,
> not on topic here.
>
> But, this does not look like a madly-spinning system. The CPU is idle and
> the PIDs are pretty far apart.

Does contemporary Linux not randomize its PIDs?

> Basically, it’s looking like each one is the result of an HTTP transaction
> and the child just isn’t dying at transaction end as it should. This
> should only be a serious problem when the children collectively hold so
> many resources that the system can’t run properly.
>
> Bottom line, I don’t think the top output explains the problem.
> _______________________________________________
> fossil-users mailing list
> fossil-users@lists.fossil-scm.org
> http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
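
For anyone following along who has not met zombies before, here is a minimal sketch of the fork-without-wait() pattern Warren describes. It is in Python rather than C for brevity, and it is not Fossil's code: the helper names (process_state, fork_child, reap) are made up for illustration, and process_state assumes a Linux-style /proc filesystem.

```python
import os


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None.

    Assumes a Linux-style /proc; returns None where it is unavailable.
    """
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except OSError:
        return None
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before reading the state field.
    return stat.rsplit(")", 1)[1].split()[0]


def fork_child(exit_code):
    """Fork a child that exits at once; return its PID without reaping it."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping Python-level cleanup handlers.
        os._exit(exit_code)
    return pid


def reap(pid):
    """Collect the child's exit status, which frees its process-table slot."""
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None
```

Between fork_child() and reap(), once the child has actually exited, process_state(pid) should report "Z" on Linux, which is the state top shows for the two zombie PIDs. Calling reap() is the step the parent is apparently skipping.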