The problem is more complex. Some scripts are written without a space character,
so where do you break the string into tokens? If both hiragana and katakana are
present, I have been told that you should break at the transitions between
scripts. Chinese characters could each be considered a token. Stevan, please
tell us more.
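
To make the idea concrete, here is a rough sketch in Python. It is only an illustration of the approach described above, not what dspam actually does, and the names script_of and tokenize are made up for this example. It assumes the script of a character can be guessed from its Unicode character name, which is a heuristic and ignores many edge cases (rare ideograph blocks, half-width kana, and so on).

```python
import unicodedata


def script_of(ch):
    """Guess a coarse script class from the Unicode character name (heuristic)."""
    if not ch.isalnum():
        return None  # whitespace, punctuation, symbols act as separators
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "han"
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    return "other"  # Latin letters, digits, and everything else


def tokenize(text):
    """Split on separators and script transitions; each Han character is a token."""
    tokens = []
    current = ""
    current_script = None
    for ch in text:
        script = script_of(ch)
        if script is None or script == "han":
            # A separator or an ideograph always closes the running token.
            if current:
                tokens.append(current)
            current, current_script = "", None
            if script == "han":
                tokens.append(ch)
        elif script == current_script:
            current += ch
        else:
            # Script transition: close the old token and start a new one.
            if current:
                tokens.append(current)
            current, current_script = ch, script
    if current:
        tokens.append(current)
    return tokens


if __name__ == "__main__":
    print(tokenize("東京タワー hello"))
```

Whether single-character tokens for Chinese give a statistical filter enough signal is a separate question; pairs of adjacent characters might work better, but that is a guess on my part.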
Though most of my mail is in English, charset support would be useful because
some correspondents have Cantonese in the sig.
Cheers-- Rick
ML mail <mlnos...@yahoo.com> wrote:
>Hi Tom,
>
>Thanks for your feedback...
>
>@Stevan, any input regarding support of dspam of various charsets?
>
>Regards
>ML
>
>
>
>
>On Thursday, March 13, 2014 9:53 PM, Tom Hendrikx <t...@whyscream.net>
>wrote:
>
>On 12-03-14 14:29, ML mail wrote:
>> Hello,
>>
>> Two questions:
>>
>> How well does dspam perform with more "exotic" foreign languages
>> such as arabic, chinese, etc?
>
>As long as dspam can break up the strings into tokens, it should be able
>to do something with it. I don't know if the charset actually has any
>effect; maybe that's more a question for Stevan...
>
>>
>> and how does dspam also work for fighting spam mails which include
>> their content in pictures jpeg/png/etc ?
>
>Dspam does nothing with pictures. Extracting text from images (OCR) is
>a completely different task, which dspam has no support for. You could look
>into various projects with support for OCR.
>
>Tom
>
>
>------------------------------------------------------------------------------
>Learn Graph Databases - Download FREE O'Reilly Book
>"Graph Databases" is the definitive new guide to graph databases and
>their
>applications. Written by three acclaimed leaders in the field,
>this first edition is now available. Download your free book today!
>http://p.sf.net/sfu/13534_NeoTech
>_______________________________________________
>Dspam-user mailing list
>Dspam-user@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/dspam-user
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.