Hi,
I have deployed a mail-receiving Postfix server combined with amavisd (with
ClamAV and SpamAssassin), following the guide at
http://wiki.centos.org/HowTos/Amavisd
on a CentOS 5.5 (64-bit) machine.
I have now added the FuzzyOcr module to catch image spam, but the test cases
provided with FuzzyOcr are failing.
SpamAssassin version: 3.3.1
Perl version: 5.8.8
FuzzyOcr version: 3.6.0
Postfix version: 3.6.5
The following is the debug output I get for one such test case:
$ spamassassin --debug FuzzyOcr < /mnt/fuzzyOCR/FuzzyOcr-3.6.0/samples/ocr-wrongext.eml > /dev/null
Jun 22 15:44:48.653 [12849] dbg: FuzzyOcr: focr_bin_helper:
'pnmnorm,pnminvert,convert,ppmtopgm,tesseract'
Jun 22 15:44:48.653 [12849] info: FuzzyOcr: Adding <5> new helper apps
Jun 22 15:44:48.665 [12849] info: FuzzyOcr: Starting preprocessor parser for
file "/etc/mail/spamassassin/FuzzyOcr.preps"...
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: preprocessor normalize {
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: command = pnmnorm
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: }
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: preprocessor invert {
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: command = pnminvert
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: }
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: preprocessor ppmtopgm {
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: command = ppmtopgm
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: }
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: preprocessor pamtopnm {
Jun 22 15:44:48.665 [12849] dbg: FuzzyOcr: line: command = pamtopnm
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: }
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: preprocessor pamthreshold {
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: command = pamthreshold
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: args = -simple -threshold 0.5
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: }
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: preprocessor maketiff {
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: command = pnmtotiff
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: args = -color -truecolor
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line: }
Jun 22 15:44:48.666 [12849] info: FuzzyOcr: Starting scanset parser for file
"/etc/mail/spamassassin/FuzzyOcr.scansets"...
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line scanset ocrad {
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line command = $ocrad
Jun 22 15:44:48.666 [12849] dbg: FuzzyOcr: line args = -s5 $input
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line }
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line scanset ocrad-invert {
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line command = $ocrad
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line args = -s5 -i $input
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line }
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line scanset ocrad-decolorize-invert
{
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line preprocessors = ppmtopgm,
pamthreshold, pamtopnm
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line command = $ocrad
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line args = -s5 -i $input
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line }
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line scanset ocrad-decolorize {
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line preprocessors = ppmtopgm,
pamthreshold, pamtopnm
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line command = $ocrad
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line args = -s5 $input
Jun 22 15:44:48.667 [12849] dbg: FuzzyOcr: line }
Jun 22 15:44:48.668 [12849] dbg: FuzzyOcr: line scanset gocr {
Jun 22 15:44:48.668 [12849] dbg: FuzzyOcr: line command = $gocr
Jun 22 15:44:48.668 [12849] dbg: FuzzyOcr: line args = -i $input
Jun 22 15:44:48.668 [12849] dbg: FuzzyOcr: line }
Jun 22 15:44:48.668 [12849] dbg: FuzzyOcr: line scanset gocr-180 {
Jun 22 15:44:48.668 [12849] dbg: FuzzyOcr: line command = $gocr
Jun 22 15:44:48.668 [12849] dbg: FuzzyOcr: line args = -l 180 -d 2 -i $input
Jun 22 15:44:48.668 [12849] dbg: FuzzyOcr: line }
Jun 22 15:44:49.009 [12849] info: FuzzyOcr: Searching in: /usr/local/netpbm/bin
Jun 22 15:44:49.009 [12849] info: FuzzyOcr: Searching in: /usr/local/bin
Jun 22 15:44:49.009 [12849] info: FuzzyOcr: Searching in: /usr/bin
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using gifsicle => /usr/bin/gifsicle
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using giffix => /usr/bin/giffix
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using giftext => /usr/bin/giftext
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using gifinter => /usr/bin/gifinter
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using giftopnm => /usr/bin/giftopnm
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using jpegtopnm =>
/usr/bin/jpegtopnm
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using pngtopnm => /usr/bin/pngtopnm
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using bmptopnm => /usr/bin/bmptopnm
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using tifftopnm =>
/usr/bin/tifftopnm
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using ppmhist => /usr/bin/ppmhist
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using pamfile => /usr/bin/pamfile
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using ocrad => /usr/bin/ocrad
Jun 22 15:44:49.010 [12849] info: FuzzyOcr: Using gocr => /usr/bin/gocr
Jun 22 15:44:49.011 [12849] info: FuzzyOcr: Using pnmnorm => /usr/bin/pnmnorm
Jun 22 15:44:49.011 [12849] info: FuzzyOcr: Using pnminvert =>
/usr/bin/pnminvert
Jun 22 15:44:49.011 [12849] info: FuzzyOcr: Using convert => /usr/bin/convert
Jun 22 15:44:49.011 [12849] info: FuzzyOcr: Using ppmtopgm => /usr/bin/ppmtopgm
Jun 22 15:44:49.011 [12849] info: FuzzyOcr: Using tesseract =>
/usr/bin/tesseract
Jun 22 15:44:49.011 [12849] dbg: FuzzyOcr: Threshold[max_hash] => 5
Jun 22 15:44:49.011 [12849] dbg: FuzzyOcr: Threshold[c] => 5
Jun 22 15:44:49.011 [12849] dbg: FuzzyOcr: Threshold[s] => 0.01
Jun 22 15:44:49.011 [12849] dbg: FuzzyOcr: Threshold[w] => 0.01
Jun 22 15:44:49.011 [12849] dbg: FuzzyOcr: Threshold[h] => 0.01
Jun 22 15:44:49.011 [12849] dbg: FuzzyOcr: Threshold[cn] => 0.01
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_add_score => 1
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_autodisable_negative_score => -5
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_autodisable_score => 1000
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_autosort_buffer => 10
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_autosort_scanset => 1
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_base_score => 5
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_corrupt_score => 2.5
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_corrupt_unfixable_score => 5
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_counts_required => 2
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_db_hash =>
/etc/mail/spamassassin/FuzzyOcr.db
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_db_max_days => 35
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_db_safe =>
/etc/mail/spamassassin/FuzzyOcr.safe.db
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_digest_db =>
/etc/mail/spamassassin/FuzzyOcr.hashdb
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_enable_image_hashing => 2
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_global_timeout => 0
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_global_wordlist =>
/etc/mail/spamassassin/FuzzyOcr.words
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_hashing_learn_scanned => 1
Jun 22 15:44:49.012 [12849] dbg: FuzzyOcr: focr_keep_bad_images => 0
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_log_pmsinfo => 1
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_log_stderr => 1
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_max_height => 800
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_max_width => 800
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_min_height => 4
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_min_width => 4
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_minimal_scanset => 1
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_mysql_db => FuzzyOcr
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_mysql_hash => Hash
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_mysql_host => localhost
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_mysql_port => 3306
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_mysql_safe => Safe
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_mysql_update_hash => 0
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_mysql_user => fuzzyocr
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_no_homedirs => 0
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_path_bin =>
/usr/local/netpbm/bin:/usr/local/bin:/usr/bin
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_pdf_maxpages => 1
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_personal_wordlist =>
__userstate__/FuzzyOcr.words
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_preprocessor_file =>
/etc/mail/spamassassin/FuzzyOcr.preps
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_scan_pdfs => 0
Jun 22 15:44:49.013 [12849] dbg: FuzzyOcr: focr_scanset_file =>
/etc/mail/spamassassin/FuzzyOcr.scansets
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_score_ham => 0
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_skip_bmp => 0
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_skip_gif => 0
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_skip_jpeg => 0
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_skip_png => 0
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_skip_tiff => 0
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_skip_updates => 0
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_strip_numbers => 1
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_threshold => 0.25
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_timeout => 10
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_twopass_scoring_factor => 1.5
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_unique_matches => 0
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_verbose => 3
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_wrongctype_score => 1.5
Jun 22 15:44:49.014 [12849] dbg: FuzzyOcr: focr_wrongext_score => 1.5
Jun 22 15:44:49.014 [12849] info: FuzzyOcr: Loaded preprocessor normalize:
/usr/bin/pnmnorm
Jun 22 15:44:49.014 [12849] info: FuzzyOcr: Loaded preprocessor invert:
/usr/bin/pnminvert
Jun 22 15:44:49.014 [12849] info: FuzzyOcr: Loaded preprocessor ppmtopgm:
/usr/bin/ppmtopgm
Jun 22 15:44:49.014 [12849] info: FuzzyOcr: Loaded preprocessor pamtopnm:
pamtopnm
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Loaded preprocessor pamthreshold:
pamthreshold -simple -threshold 0.5
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Loaded preprocessor maketiff:
pnmtotiff -color -truecolor
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Using scan ocrad: /usr/bin/ocrad
-s5 $input
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Using scan ocrad-invert:
/usr/bin/ocrad -s5 -i $input
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Using scan ocrad-decolorize-invert:
/usr/bin/ocrad -s5 -i $input
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Using scan ocrad-decolorize:
/usr/bin/ocrad -s5 $input
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Using scan gocr: /usr/bin/gocr -i
$input
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Using scan gocr-180: /usr/bin/gocr
-l 180 -d 2 -i $input
Jun 22 15:44:49.015 [12849] info: FuzzyOcr: Added <43> words from
"/etc/mail/spamassassin/FuzzyOcr.words"
Jun 22 15:44:50.042 [12849] info: rules: meta test ADVANCE_FEE_3_NEW_FORM has
dependency 'ADVANCE_FEE_3_NEW' with a zero score
Jun 22 15:44:50.056 [12849] info: rules: meta test ADVANCE_FEE_3_NEW_MONEY has
dependency 'ADVANCE_FEE_3_NEW' with a zero score
Jun 22 15:44:50.102 [12849] dbg: FuzzyOcr: Starting FuzzyOcr...
Jun 22 15:44:50.102 [12849] info: FuzzyOcr: Processing Message with ID
"<[email protected]>" (Clifton Ballard
<[email protected]> -> [email protected])
Jun 22 15:44:50.102 [12849] dbg: FuzzyOcr: fname: "sbillet" => "sbillet"
Jun 22 15:44:50.102 [12849] info: FuzzyOcr: GIF: [327x549] sbillet (7239)
Jun 22 15:44:50.102 [12849] dbg: FuzzyOcr: Saved:
/tmp/.spamassassin12849jvyvJgtmp/sbillet
Jun 22 15:44:50.102 [12849] dbg: FuzzyOcr: Saved:
/tmp/.spamassassin12849jvyvJgtmp/raw.eml
Jun 22 15:44:50.103 [12849] info: FuzzyOcr: Found: 1 images
Jun 22 15:44:50.103 [12849] dbg: FuzzyOcr: pfile =>
/tmp/.spamassassin12849jvyvJgtmp/sbillet.pnm
Jun 22 15:44:50.103 [12849] dbg: FuzzyOcr: efile =>
/tmp/.spamassassin12849jvyvJgtmp/sbillet.err
Jun 22 15:44:50.103 [12849] dbg: FuzzyOcr: Errors to:
/tmp/.spamassassin12849jvyvJgtmp/raw.err
Jun 22 15:44:50.103 [12849] dbg: FuzzyOcr: File has Content-Type "image/jpeg"
and no File Extension
Jun 22 15:44:50.103 [12849] info: FuzzyOcr: Found GIF header name="sbillet"
Jun 22 15:44:50.103 [12849] info: FuzzyOcr: Image has format "GIF" but
content-type is "image/jpeg"
Jun 22 15:44:50.116 [12850] dbg: FuzzyOcr: Exec : /usr/bin/giftext
/tmp/.spamassassin12849jvyvJgtmp/sbillet
Jun 22 15:44:50.117 [12850] dbg: FuzzyOcr: Stdout:
>/tmp/.spamassassin12849jvyvJgtmp/giftext.info
Jun 22 15:44:50.117 [12850] dbg: FuzzyOcr: Stderr:
>>/tmp/.spamassassin12849jvyvJgtmp/giftext.err
save_execute: Insecure dependency in open while running with -T switch at
/etc/mail/spamassassin/FuzzyOcr/Misc.pm line 92.
save_execute: Insecure dependency in open while running with -T switch at
/etc/mail/spamassassin/FuzzyOcr/Misc.pm line 92.
Jun 22 15:44:50.134 [12849] dbg: FuzzyOcr: Saved pid: 12850
Jun 22 15:44:50.134 [12849] dbg: FuzzyOcr: Elapsed [12850]: 0.030674 sec.
(/usr/bin/giftext: exit 8)
Jun 22 15:44:50.134 [12849] warn: readline() on closed filehandle INFILE at
/etc/mail/spamassassin/FuzzyOcr/Misc.pm line 205.
Jun 22 15:44:50.135 [12849] info: FuzzyOcr: Image is single non-interlaced...
Jun 22 15:44:50.142 [12851] dbg: FuzzyOcr: Exec : /usr/bin/giffix
/tmp/.spamassassin12849jvyvJgtmp/sbillet
Jun 22 15:44:50.143 [12851] dbg: FuzzyOcr: Stdout:
>/tmp/.spamassassin12849jvyvJgtmp/sbillet-fixed.gif
Jun 22 15:44:50.143 [12851] dbg: FuzzyOcr: Stderr:
>>/tmp/.spamassassin12849jvyvJgtmp/sbillet.err
save_execute: Insecure dependency in open while running with -T switch at
/etc/mail/spamassassin/FuzzyOcr/Misc.pm line 92.
save_execute: Insecure dependency in open while running with -T switch at
/etc/mail/spamassassin/FuzzyOcr/Misc.pm line 92.
Jun 22 15:44:50.149 [12849] dbg: FuzzyOcr: Saved pid: 12851
Jun 22 15:44:50.149 [12849] dbg: FuzzyOcr: Elapsed [12851]: 0.014443 sec.
(/usr/bin/giffix: exit 8)
Jun 22 15:44:50.156 [12852] dbg: FuzzyOcr: Exec : /usr/bin/giftopnm
/tmp/.spamassassin12849jvyvJgtmp/sbillet-fixed.gif
Jun 22 15:44:50.157 [12852] dbg: FuzzyOcr: Stdout:
>/tmp/.spamassassin12849jvyvJgtmp/sbillet.pnm
Jun 22 15:44:50.157 [12852] dbg: FuzzyOcr: Stderr:
>>/tmp/.spamassassin12849jvyvJgtmp/sbillet.err
save_execute: Insecure dependency in open while running with -T switch at
/etc/mail/spamassassin/FuzzyOcr/Misc.pm line 92.
save_execute: Insecure dependency in open while running with -T switch at
/etc/mail/spamassassin/FuzzyOcr/Misc.pm line 92.
Jun 22 15:44:50.173 [12849] dbg: FuzzyOcr: Saved pid: 12852
Jun 22 15:44:50.174 [12849] dbg: FuzzyOcr: Elapsed [12852]: 0.023864 sec.
(/usr/bin/giftopnm: exit 8)
Jun 22 15:44:50.174 [12849] error: FuzzyOcr: /usr/bin/giftopnm: Returned
[2048], skipping...
Jun 22 15:44:50.175 [12849] dbg: FuzzyOcr: Remove DIR:
/tmp/.spamassassin12849jvyvJgtmp
Jun 22 15:44:50.175 [12849] dbg: FuzzyOcr: FuzzyOcr ending successfully...
Jun 22 15:44:50.175 [12849] dbg: FuzzyOcr: Processed in 0.073397 sec.
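For what it is worth, the `Returned [2048]` in the giftopnm error above looks like a raw wait status rather than an exit code. Assuming the usual POSIX encoding (exit code in the high byte, terminating signal in the low seven bits), it decodes to the same `exit 8` that the log reports for giftopnm. A minimal Python sketch of that decoding follows; the function names are my own and are not part of FuzzyOcr:

```python
def exit_code_from_wait_status(status: int) -> int:
    """Return the exit code stored in the high byte of a POSIX wait status."""
    return (status >> 8) & 0xFF


def signal_from_wait_status(status: int) -> int:
    """Return the terminating signal number (0 if the process exited normally)."""
    return status & 0x7F


# Value taken from the "Returned [2048]" line in the log above.
raw_status = 2048
print("exit code:", exit_code_from_wait_status(raw_status))
print("signal:", signal_from_wait_status(raw_status))
```

If that reading is right, giftopnm exited on its own with code 8 and was not killed by a signal, which matches the exit 8 also logged for giftext and giffix.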
Can anybody tell me what's wrong with my deployment?
Thanks in advance,
Ashish Sharma