I've finally managed to get something working with the Outlook addin and
Skip's cool new ocrad stuff.  The results look promising! :)  A summary of
my results is below.  The comparison is 'Tokenizer:x-image_size' and
'Tokenizer:x-crack_images' both set to False, versus both set to True.  It
looks like a 13% relative reduction in false negatives (293 down to 254
unique fn), with false positives staying at zero, which is nothing to sneeze
at!  I've never been an expert at reading these results though, so let me
know if there is anything interesting I missed or neglected to send.

Cheers,

Mark

false positive percentages
<snip 10 lines of zeros>
won   0 times
tied 10 times
lost  0 times

total unique fp went from 0 to 0 tied
mean fp % went from 0.0 to 0.0 tied

false negative percentages
    5.859  5.455  won     -6.90%
    5.410  4.363  won    -19.35%
    5.794  5.234  won     -9.67%
    3.008  2.820  won     -6.25%
    5.588  4.817  won    -13.80%
    4.469  3.166  won    -29.16%
    5.051  4.646  won     -8.02%
    5.829  5.647  won     -3.12%
    6.842  5.789  won    -15.39%
    7.356  5.964  won    -18.92%

won  10 times
tied  0 times
lost  0 times
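
For anyone reading along, here is a rough sketch of how the won/tied/lost
tally above can be reproduced from the ten per-run pairs.  It assumes a
lower false negative percentage counts as a win and that ties mean exact
equality of the printed values; the script that produced the table may
compare the unrounded numbers instead.

```python
# Per-run false negative percentages copied from the table above:
# (both options False, both options True).
fn_runs = [
    (5.859, 5.455), (5.410, 4.363), (5.794, 5.234), (3.008, 2.820),
    (5.588, 4.817), (4.469, 3.166), (5.051, 4.646), (5.829, 5.647),
    (6.842, 5.789), (7.356, 5.964),
]

def tally(pairs):
    """Count wins, ties, and losses, assuming lower is better."""
    counts = {"won": 0, "tied": 0, "lost": 0}
    for before, after in pairs:
        if after < before:
            counts["won"] += 1
        elif after > before:
            counts["lost"] += 1
        else:
            counts["tied"] += 1
    return counts

result = tally(fn_runs)
for label in ("won", "tied", "lost"):
    print(f"{label:<4} {result[label]:>2} times")
```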

total unique fn went from 293 to 254 won    -13.31%
mean fn % went from 5.52048164035 to 4.79002154629 won    -13.23%
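
The percentages on the two summary lines above are relative changes, not
percentage-point differences.  A minimal sketch of the arithmetic, using
only the figures printed above (the helper name pct_change is mine, not
something from the test scripts):

```python
def pct_change(before, after):
    """Relative change from before to after, expressed in percent."""
    if before == 0:
        # The report prints "+(was 0)" for this case rather than a number.
        raise ValueError("relative change is undefined when before is 0")
    return (after - before) / before * 100.0

total_fn_change = pct_change(293, 254)
mean_fn_change = pct_change(5.52048164035, 4.79002154629)

print(f"total unique fn: {total_fn_change:+.2f}%")
print(f"mean fn %:       {mean_fn_change:+.2f}%")
```

So the mean fn rate drops by about 0.73 percentage points, which is the
roughly 13% relative reduction.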

ham mean                     ham sdev
   0.00    0.00 +(was 0)        0.05    0.05   +0.00%
   0.03    0.04  +33.33%        0.48    0.70  +45.83%
   0.06    0.04  -33.33%        0.82    0.53  -35.37%
   0.08    0.08   +0.00%        1.53    1.53   +0.00%
   0.00    0.00 +(was 0)        0.01    0.01   +0.00%
   0.08    0.09  +12.50%        1.89    1.90   +0.53%
   0.00    0.00 +(was 0)        0.08    0.08   +0.00%
   0.00    0.00 +(was 0)        0.08    0.07  -12.50%
   0.01    0.01   +0.00%        0.17    0.17   +0.00%
   0.01    0.01   +0.00%        0.21    0.18  -14.29%

ham mean and sdev for all runs
   0.03    0.03   +0.00%        0.83    0.82   -1.20%

spam mean                    spam sdev
  90.19   90.67   +0.53%       24.58   23.81   -3.13%
  91.43   91.74   +0.34%       22.99   22.24   -3.26%
  91.51   91.85   +0.37%       23.66   22.79   -3.68%
  93.62   93.94   +0.34%       18.73   17.90   -4.43%
  90.62   91.11   +0.54%       23.83   22.99   -3.52%
  91.07   91.66   +0.65%       22.71   21.31   -6.16%
  90.85   91.31   +0.51%       23.28   22.58   -3.01%
  89.74   90.19   +0.50%       25.29   24.57   -2.85%
  89.49   90.15   +0.74%       25.99   24.73   -4.85%
  88.57   89.45   +0.99%       26.84   25.00   -6.86%

spam mean and sdev for all runs
  90.72   91.21   +0.54%       23.91   22.90   -4.22%

ham/spam mean difference: 90.69 91.18 +0.49
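
The difference line above looks consistent with taking the all-runs spam
mean minus the all-runs ham mean for each configuration.  A quick check
under that assumption, using the rounded values printed above:

```python
# All-runs means copied from the summary lines above:
# (both options False, both options True).
ham_mean = (0.03, 0.03)
spam_mean = (90.72, 91.21)

# Assumed definition: separation = spam mean - ham mean.
sep_before = round(spam_mean[0] - ham_mean[0], 2)
sep_after = round(spam_mean[1] - ham_mean[1], 2)
sep_change = round(sep_after - sep_before, 2)

print(f"ham/spam mean difference: {sep_before} {sep_after} {sep_change:+.2f}")
```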

_______________________________________________
spambayes-dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/spambayes-dev