Justin Erenkrantz wrote:
I have these two corefiles set aside and can do any examination folks would
like to see on it.  I don't know enough about Perl's internal structure to do
much good by myself.

You'll want to use Perl_sv_dump() instead of just trying to print the pointer directly; it gives much more useful detail. That said, the three backtraces you have sent have nothing in common, so it seems likely to me that you are running into not one problem but a host of them. The first one looked like a signal race condition, and the second looks awfully suspicious:

        #0  Perl_malloc (nbytes=672735716) at malloc.c:1514

That's about 641MB in a single allocation, which is probably far more than Perl should really be asking for. The third one seems to be an attempt to clear a null SV. I don't know that access to the core files would help much, because by the time you get to the segfault itself, the damage has already been done way upstream (and that kind of corruption is hard to track down).
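As a quick sanity check on that figure, here is a minimal Python sketch that converts the nbytes value from the backtrace into mebibytes. The helper name bytes_to_mib is hypothetical and exists only for illustration; the one input taken from the backtrace is the byte count itself.

```python
def bytes_to_mib(nbytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return nbytes / 2**20


# nbytes as reported in the Perl_malloc frame of the second backtrace
requested = 672735716

print(f"{bytes_to_mib(requested):.1f} MiB")  # prints "641.6 MiB"
```

Measured in decimal megabytes (10**6 bytes) the same request is roughly 673MB; either way it is far beyond what a single message should need.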

However, my professional recommendation is to add at least 2 if not 4 other MX boxes and share the load a little. The zombie army tends to get fixated on a single IP even with multiple MX records, but the legit mail will load balance much better (and if you segfault handling spam, too bad ;-). You may just be trying to push too much traffic through a single qpsmtpd instance.

Our current configuration is two equal-distance MX boxes running equivalent configurations (including virus scanning, blacklisting and address validation). Once a message has been accepted for delivery (i.e. it is neither blacklisted nor infected), it is relayed to the actual mail server via qmail-qmqpc, with a fallback to qmail-queue/smtproutes. We aren't handling nearly the same volume you are, but our MX boxes are Cobalt RaQ3s (barely 450MHz Pentium-equivalent) and they are quite happily chugging along with loads below 0.5.

If you want more details, e-mail me directly and I'll tell you what I am doing...

John
