Replies...
> On Nov 12, 2021, at 10:56 PM, Bill Cole via MIMEDefang > <[email protected]> wrote: > > On 2021-11-12 at 15:36:51 UTC-0500 (Fri, 12 Nov 2021 13:36:51 -0700) > Philip Prindeville via MIMEDefang <[email protected]> > is rumored to have said: > > [...] > >> Okay, seeing this: >> >> >> ... >> Nov 12 11:43:37.936 [36367] dbg: rules: ran eval rule __UPPERCASE_25_50 >> ======> got hit (1) >> Nov 12 11:45:38.048 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH_DOC >> ======> got hit (1) >> ... >> Nov 12 11:45:38.063 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH >> ======> got hit (1) >> Nov 12 11:49:58.565 [36367] info: check: exceeded time limit in >> Mail::SpamAssassin::Plugin::Check::_eval_tests_type11_pri0_set1, skipping >> further tests >> >> >> Wait, what??? Why is this taking 4:30s to execute??? > > Because some rule that was in the compiled ruleset of the moment somewhere > after __ANY_TEXT_ATTACH went insane. The most common cause of such insanity > is a regex (i.e. body, rawbody, header or full) rule which causes O(n!) or > similarly complex behavior, usually because it has to backtrack frequently > when scanning for a match. The root cause is usually a greedy '.*' or '.+' in > a rule where the initial '.' might match in many places that the full pattern > doesn't match. It is a good idea to anchor rule patterns when possible, use > {min,max} ranges where possible instead of * or +, and use specific character > classes rather than '.' where possible. A runaway regex in a SA rule usually > requires the combination of a vulnerable rule with an uncommon sort of > message. > >> And how to dig into this further? > > Unfortunately, we do not know which rule caused the problem because of how > the rules are loaded and compiled. We only know what rules had hits before > the one that caused the timeout. In this case we got a completely bizarre > function name for where the bad rule lives (_eval_tests_type11_pri0_set1) > which means that it is one of the many "body eval" rules that are registered > . The order of rules should be the same between runs of spamassassin or spamd > scanner processes using the same rules & config, but not so much between > different systems. One identification trick is to run a similar but > non-hanging message through a '-D all' scan and looking for what matches in > the good message right after the last match in the bad message. Another > approach is to temporarily raise the time_limit setting to something extreme > to see if you can work through it and possibly get a hit on the bad rule > (unlikely.) The more sure method is a bisecting approach, truncating the bad > message at various! lengths to find where the problematic content is in the message body. You can also work by eliminating/including rules (or whole files of rules such as local.cf) to find the problematic rule. At least for debugging purposes, it would be useful to have a mode that forced strictly serial execution... if that's even possible. > >> Googling the error message doesn't return much that's useful... > > Yeah, it won't. The method name "_eval_tests_type11_pri0_set1" identifies a > block of code assembled at runtime from a subset (priority 0, set 1) of the > current "body eval" rules (type 11.) It's not a function that exists in the > SA code on disk and the only way to stumble across it is if SA breaks loudly > while that ephemeral code runs (e.g. by timing out.) What about fingering the source file and line number? Almost as good... -Philip > > > -- > Bill Cole > [email protected] or [email protected] > (AKA @grumpybozo and many *@billmail.scconsult.com addresses) > Not Currently Available For Hire > _______________________________________________ NOTE: If there is a disclaimer or other legal boilerplate in the above message, it is NULL AND VOID. You may ignore it. MIMEDefang mailing list [email protected] https://lists.mimedefang.org/mailman/listinfo/mimedefang_lists.mimedefang.org
