--On Tuesday, January 19, 2016 8:04 PM -0500 listsb-ama...@bitrate.net wrote:

On Jan 19, 2016, at 18.09, Quanah Gibson-Mount <qua...@zimbra.com> wrote:

amavis-services[18544]: PID 13724 went away, 13724-01

does $log_level = 5 reveal any additional clues about what happened to
the process?

It looks like it's related to the recent changes around the use of the "file" binary. Every process that "goes away" occurs here:

Jan 21 07:01:24 zqa-211 amavis[5414]: (04970-01-6) open_on_specific_fd: target fd0 closing, to become < /dev/null Jan 21 07:01:24 zqa-211 amavis[5414]: (04970-01-6) open_on_specific_fd: target fd1 closing, to become (65) &=25 Jan 21 07:01:24 zqa-211 amavis[5414]: (04970-01-6) open_on_specific_fd: target fd1 dup2 from fd25 (65) &=25 Jan 21 07:01:24 zqa-211 amavis[5414]: (04970-01-6) open_on_specific_fd: source fd25 closed Jan 21 07:01:24 zqa-211 amavis[5414]: (04970-01-6) open_on_specific_fd: target fd2 closing, to become (65) &1 Jan 21 07:01:24 zqa-211 amavis[5414]: (04970-01-6) open_on_specific_fd: target fd2 dup2 from fd1 (65) &1 Jan 21 07:01:24 zqa-211 amavis[4970]: (04970-01-6) result line from file(1): p001: HTML document, ASCII text, with very long lines\n Jan 21 07:01:24 zqa-211 amavis[4970]: (04970-01-6) lookup_re("HTML document, ASCII text, with very long lines") matches key "(?^i:\\btext\\b)", result="asc" Jan 21 07:01:24 zqa-211 amavis[4970]: (04970-01-6) lookup [map_full_type_to_short_type] => true, "HTML document, ASCII text, with very long lines" matches, result="asc", matching_key="(?^i:\\btext\\b)" Jan 21 07:01:24 zqa-211 amavis[4970]: (04970-01-6) File-type of p001: HTML document, ASCII text, with very long lines; (asc) Jan 21 07:01:24 zqa-211 amavis[4970]: (04970-01-6) do_ascii: Decoding part p001 Jan 21 07:01:24 zqa-211 amavis[4970]: (04970-01-6) do_ascii: Setting sigaction handler, was 0 Jan 21 07:01:24 zqa-211 amavis[4970]: (04970-01-6) get_deadline do_ascii_pre - deadline in 479.9 s, set to 288.000 s Jan 21 07:01:24 zqa-211 amavis[4970]: (04970-01-6) prolong_timer do_ascii_pre: timer 288, was 0, deadline in 479.9 s


Then we see (all together):
Jan 21 07:01:34 zqa-211 amavis-services[6954]: PID 3480 went away, 03480-01-25 Jan 21 07:01:34 zqa-211 amavis-services[6954]: PID 4970 went away, 04970-01-6 Jan 21 07:01:34 zqa-211 amavis-services[6954]: PID 2609 went away, 02609-01-31
Jan 21 07:01:36 zqa-211 amavis-services[6954]: PID 5406 went away, 05406-01
Jan 21 07:01:38 zqa-211 amavis-services[6954]: PID 5416 went away, 05416-03
Jan 21 07:01:38 zqa-211 amavis-services[6954]: PID 5421 went away, 05421-01

I.e., every single one of the above processes are in the same function.


It's anecdotal, but, on a handful of occasions, we have had our mail
server use up all of its
memory, and iirc, it seemed that amavis had trouble handling that
elegantly, and troubleshooting
was a little obscure. most recently, the culprit was something wrt razor
servers changing
[hostname, ip address, or such], which caused amavis children to get
stuck.

To be clear, this happens on any server (we have hundreds) if we put amavis under load. They have plenty of memory, and are not running out. I think this is related to the changes made here:

- use a perl module File::LibMagic when available, instead of spawning
 a file(1) utility for classifying contents of mail parts.
 By using a direct interface to a libmagic library the startup cost
 of spawning an external process is avoided. Benchmarking shows that
 using libmagic is significantly faster especially for checking a small
 number of files - takes 4 ms for checking one file with libmagic
 vs. 27 ms with a spawned file(1); based on a patch by Markus Benning;

or possibly this:


- adjusted some timeouts to leave more reserve for later stages of
 mail processing and forwarding;


--Quanah


--

Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra ::  the leader in open source messaging and collaboration

Reply via email to