On Fri, 26 Sep 2014, dar...@chaosreigns.com wrote:

> I created some rules to match Polish text:
> http://www.chaosreigns.com/sa/polish.txt
>
> The rules with only ascii characters work, the ones with utf8 characters
> don't.  According to hexedit, they're identical in my maildir and in my
> /etc/spamassassin/local.cf.

Put the hex escapes (\xNN) for the UTF-8 bytes of the accented characters into the RE, rather than the literal characters.

I've had the best reliability from placing each byte in its own character class: [\xd0][\x80]
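
As a rough illustration of that approach, here is a small Python sketch that turns a word into a pattern with each non-ASCII UTF-8 byte in its own character class. The helper name utf8_byte_classes and the rule name POLISH_EXAMPLE are made up for this example, and it assumes the rule is matched against the raw UTF-8 bytes of the message, which may depend on your SpamAssassin charset settings.

```python
import re


def utf8_byte_classes(text: str) -> str:
    """Build a regex fragment for `text` with every non-ASCII UTF-8 byte
    written as its own single-byte character class, e.g. [\\xc5][\\xbc]."""
    parts = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            # Plain ASCII: keep the character, escaping regex metacharacters.
            parts.append(re.escape(chr(byte)))
        else:
            # Non-ASCII byte: hex escape inside its own character class.
            parts.append("[\\x%02x]" % byte)
    return "".join(parts)


if __name__ == "__main__":
    fragment = utf8_byte_classes("żółć")
    # POLISH_EXAMPLE is a hypothetical rule name, not one from polish.txt.
    print("body  POLISH_EXAMPLE  /%s/" % fragment)
```

The printed line can be pasted into local.cf and checked with "spamassassin --lint" before relying on it.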

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  [People] are socialists because they are blinded by
  envy and ignorance.       -- economist Ludwig von Mises (1881-1973)
-----------------------------------------------------------------------
 848 days since the first successful private support mission to ISS (SpaceX)
