    >> Perhaps we could replace '\r' with ' ' in the subject before
    >> tokenizing without losing much/any accuracy.  I don't believe we can
    >> get whitespace in body tokens.

    Tony> +1.

    Tony> (I presume that this is a nicer solution than having our own csv
    Tony> subclass that has the problem fixed?)

Well, given that the bug is in the underlying _csv extension module, I
suspect so. ;-)

Checked in as tokenizer.py 1.34.
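For the archive, the workaround described above amounts to roughly the following (a minimal sketch only; the function name and surrounding code are illustrative, not the actual tokenizer.py 1.34 change):

```python
def tokenize_subject(subject):
    """Sketch of the workaround: scrub carriage returns from a Subject
    header before tokenizing, sidestepping the bug in the underlying
    _csv extension module that mishandles embedded '\r'.

    Illustrative only -- not the real spambayes tokenizer code.
    """
    # Subjects shouldn't contain meaningful carriage returns, so
    # replacing '\r' with a space loses little or no accuracy.
    cleaned = subject.replace('\r', ' ')
    return cleaned.split()

print(tokenize_subject("cheap\rpills now"))  # ['cheap', 'pills', 'now']
```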

Skip


_______________________________________________
spambayes-dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/spambayes-dev
