On Thu, Feb 1, 2018 at 4:32 PM, Mark Blackman <m...@exonetric.com> wrote:
>> On 1 Feb 2018, at 12:36, Yann Ylavic <ylavic....@gmail.com> wrote:
>>
>> Hi Mark,
>>
>> On Thu, Feb 1, 2018 at 10:29 AM, Mark Blackman <m...@exonetric.com>
>> wrote:
>>>
>>>
>>> Just to confirm, you expect that patch to handle SHM clean-up
>>> even in the “nasty error” case?
>>
>> Not really, no patch can avoid a crash in crashing code :/ The
>> "stop_signals-PR61558.patch" patch avoids a known httpd crash in
>> some circumstances, but...
>
> Well, I just mean: if sig_coredump gets called, will the patch result
> in the normal SHM clean-up routines getting called, where they would
> not have been called before?

No, unfortunately there's nothing fancy in there; keep in mind that it's a
root process faulting, so I don't think much should be done...

> SHM clean-up is the key here and any patch that doesn’t contribute to
> that has no immediate value for me.

What you may want to try is removing "s->defn_line_number" from the id here:
https://github.com/apache/httpd/blob/trunk/modules/proxy/mod_proxy_balancer.c#L787
If your configuration file changes often, that line number contributes to
changing the name of the SHM...
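
For illustration only (an untested sketch, not the exact code at that line;
it assumes the id is assembled with apr_psprintf() from server_rec fields),
the idea is to build the name from stable vhost properties and leave the
definition line number out, e.g.:

    /* Sketch: derive the balancer SHM id from stable vhost properties,
     * omitting s->defn_line_number so that editing the configuration
     * file does not rename the segment on the next restart. */
    const char *id = apr_psprintf(pconf, "%s.%s.%d.%s",
                                  s->server_scheme ? s->server_scheme : "",
                                  s->server_hostname ? s->server_hostname : "",
                                  (int)s->port,
                                  s->defn_name ? s->defn_name : "");

That way a restart after unrelated configuration edits would reuse the same
segment name instead of creating a new one.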

>
>>
>>> I suspect that nasty error is triggered by the Weblogic plugin
>>> based on the adjacency in the logs, but the tracing doesn’t
>>> reveal any details, so an strace will probably be required to get
>>> more detail.
>
> Tracing has confirmed that this really is a segmentation fault, despite the
> lack of host-level messages, and that reading a 3rd-party module (but
> not Weblogic) is the last thing that happens before the segmentation
> fault; that pattern is fairly consistent. Now we need to ensure
> coredumps are generated.
>
> Finally, there are no orphaned child httpd processes with a PPID of
> 1.  Just thousands and thousands of SHM segments with no processes
> attached to them.

Which brings us back to the question of why attach and/or create fail if
nothing is attached to them.
These are SHMs (per "ipcs -m"), right? Not semaphores ("ipcs -s")?

"thousands and thousands" is kind of exponential, even for thousands
of vhosts, do the names of SHMs change for each startup?
(besides the generation number if you use that patch, I'm hardly
thinking that the processes would crash arbitrarily at generation
[0..1000]...)
If so, does it relate to configuration changes?

We are not talking about fixing the root issue here :/


Regards,
Yann.
