https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8303

            Bug ID: 8303
           Summary: sa-learn crash ( sort of ) when dealing with
                    compressed files > max_size
           Product: Spamassassin
           Version: 4.0.1
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Learner
          Assignee: dev@spamassassin.apache.org
          Reporter: p...@casapigi.pigi.org
  Target Milestone: Undefined

After upgrading to 4.0.1, I noticed that my sa-learn script produces this
output:
bzip2: I/O or other error, bailing out.  Possible reason follows.
bzip2: Broken pipe
        Input file =
/tmp/ham/cur//1709112771.M515865P31750.puma,S=562472,W=570714:2,S, output file
= (stdout)
error closing input file:  at
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/ArchiveIterator.pm line 388. at
/usr/bin/sa-learn line 499.

and it stops at the first error encountered.
(My sa-learn script is as simple as: sa-learn --ham /tmp/ham/cur/* )

The IMAP message file (in this case) is a bzip2-compressed one:
file /tmp/ham/cur/1709112771.M515865P31750.puma\,S\=562472\,W\=570714\:2\,S
/tmp/ham/cur/1709112771.M515865P31750.puma,S=562472,W=570714:2,S: bzip2
compressed data, block size = 600k

ls -l /tmp/ham/cur/1709112771.M515865P31750.puma\,S\=562472\,W\=570714\:2\,S 
-rw------- 1 root root 300775 Dec 29 07:43
'/tmp/ham/cur/1709112771.M515865P31750.puma,S=562472,W=570714:2,S'

Extracting:

bzcat /tmp/ham/cur/1709112771.M515865P31750.puma\,S\=562472\,W\=570714\:2\,S >
EXTRACTED
ls -l EXTRACTED 
-rw-r--r-- 1 root root 562472 Dec 29 07:44 EXTRACTED


The default max-size value is 500K (512000 bytes), so the extracted message
(562472 bytes) exceeds it.

The message is alarming, to say the least; it comes from the "die" in the code:

'close $fh  or die "error closing input file: $!";'
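
A possible explanation for the empty error text ("error closing input file:  at"),
offered as a sketch rather than a confirmed diagnosis: the file handle appears
to be a pipe from an external bzip2 process. Perl's close on a piped handle
returns false when the child exits with a non-zero status, and in that case $!
is set to 0, so the message carries no reason. The Python sketch below
reproduces the same pattern. The child command is a hypothetical stand-in for
"bzip2 -dc FILE", and the chunk size is an assumption, not taken from
ArchiveIterator.pm.

```python
import subprocess
import sys

LIMIT = 512000  # bytes; the limit reported in the -D output
CHUNK = 8192    # assumed read size, for illustration only

# Hypothetical stand-in for "bzip2 -dc FILE": a child that writes ~4 MB to stdout.
CHILD = "import sys; sys.stdout.buffer.write(b'x' * 4_000_000)"


def read_until_limit(limit=LIMIT):
    """Read from the child until `limit` is exceeded, then close the pipe early.

    Returns (bytes_read, child_exit_status).
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the child's "broken pipe" complaint
    )
    total = 0
    while total <= limit:
        chunk = proc.stdout.read(CHUNK)
        if not chunk:
            break
        total += len(chunk)
    # The reader gives up while the writer still has data to send:
    # the writer then fails with a broken pipe and exits non-zero.
    proc.stdout.close()
    status = proc.wait()
    return total, status


if __name__ == "__main__":
    total, status = read_until_limit()
    print(f"read {total} bytes, child exit status {status}")
```

If this is what happens in sa-learn, the non-zero status after an intentional
early close is expected and would not need to be fatal when the message is
being skipped for size anyway.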

The easiest workaround to avoid the message is to add --max-size 0 to the
sa-learn invocation in the script, although that carries a risk of its own
(some messages could be very large, leading to high CPU or RAM usage).
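
An alternative that keeps the size limit would be to filter out oversized
messages before calling sa-learn. The following is a minimal sketch, assuming
the files are bzip2-compressed as in this report; the function name
exceeds_limit and the 64 KB chunk size are illustrative choices, not part of
SpamAssassin.

```python
import bz2

LIMIT = 512000  # bytes; the limit reported in the -D output


def exceeds_limit(path, limit=LIMIT, chunk=65536):
    """Return True if the bzip2 file at `path` decompresses to more than `limit` bytes.

    Decompression stops as soon as the limit is crossed, so a huge message
    is never fully expanded in memory.
    """
    total = 0
    with bz2.open(path, "rb") as fh:
        while True:
            data = fh.read(chunk)
            if not data:
                return False  # reached end of file without crossing the limit
            total += len(data)
            if total > limit:
                return True
```

Only the files for which exceeds_limit returns False would then be passed to
sa-learn --ham, so the size check inside sa-learn never triggers.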


Running with -D I get the following:
...
...

Dec 29 07:51:11.334 [23730] dbg: archive-iterator: detected bzip2 file
/tmp/ham/cur/1709112771.M515865P31750.puma,S=562472,W=570714:2,S, reopening
with bzip2
Dec 29 07:51:11.396 [23730] info: archive-iterator: skipping large message:
read 524288, limit 512000 bytes

and then the usual message:
bzip2: I/O or other error, bailing out.  Possible reason follows.
bzip2: Broken pipe
        Input file =
/tmp/ham/cur/1709112771.M515865P31750.puma,S=562472,W=570714:2,S, output file =
(stdout)
Learned tokens from 0 message(s) (0 message(s) examined)
Dec 29 07:51:11.398 [23730] dbg: plugin:
Mail::SpamAssassin::Plugin::Bayes=HASH(0x55c937f66f90) implements
'learner_close', priority 0
Dec 29 07:51:11.398 [23730] dbg: bayes: untie-ing
Dec 29 07:51:11.406 [23730] dbg: bayes: files locked, now unlocking lock
Dec 29 07:51:11.406 [23730] dbg: locker: safe_unlock: unlink
/var/spamassasin/bayes.lock
error closing input file:  at
/usr/share/perl5/site_perl/Mail/SpamAssassin/ArchiveIterator.pm line 388. at
/usr/bin/site_perl/sa-learn line 499.
