You may need to narrow down exactly what causes the crashes, so that you
can take a more targeted approach.

I.e., is this a case of the client not handling an EAI message (raw UTF-8
in the From header), is it specific to particular characters, or is it
specific to them being in the "comment" portion of the address?
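
As a rough way to tell those cases apart, something like the sketch
below could be run over the raw From header bytes of an offending
message.  The function name describe_from and the returned keys are made
up for illustration, and the encoded-word regex is only an approximation
of the RFC 2047 syntax.

```python
import re
import unicodedata

# Approximate match for an RFC 2047 encoded-word: =?charset?B|Q?text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def describe_from(raw: bytes) -> dict:
    """Summarise how non-ASCII content appears in a raw From header value."""
    text = raw.decode("utf-8", errors="replace")
    non_ascii = [c for c in text if ord(c) > 127]
    return {
        # True when the header carries 8-bit bytes directly (EAI style).
        "raw_8bit": any(b > 127 for b in raw),
        # True when the header uses RFC 2047 encoded-words instead.
        "encoded_words": bool(ENCODED_WORD.search(text)),
        # Code point and Unicode general category of each non-ASCII character.
        "chars": [(f"U+{ord(c):04X}", unicodedata.category(c)) for c in non_ascii],
    }
```

Comparing this summary for messages that crash the client against
messages that don't should show whether the trigger is the raw 8-bit
form or a particular class of characters.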

If it's specific to EAI messages, you could rewrite the messages to use
RFC 2047 encoding instead.  If your system isn't supposed to handle EAI
messages, you could even block them at SMTP time.  (Note that not
everything allowed in an EAI message can be translated to a traditional
RFC 5322 message, but the majority of what clients actually use can be.)

If it's specific to the characters or their location, you can reject
messages containing those characters or strip the characters out.
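
If you go the removal route, one sketch that avoids a hand-maintained
emoji regex is to filter on Unicode general categories, which leaves
letters and combining marks from any script alone.  The choice of
categories here is my assumption, not a complete definition of "emoji":
"So" also covers symbols such as the copyright sign, and "Cn" means
unassigned in whatever Unicode version your Python ships with.  The name
strip_symbols is made up, and the input is assumed to be an already
decoded display name.

```python
import unicodedata

# Assumed categories to drop: So = Symbol/other (most emoji),
# Cn = unassigned in this Python's Unicode database (newer code points).
DROP_CATEGORIES = {"So", "Cn"}
# Zero-width joiner and emoji variation selector used in emoji sequences.
JOINERS = {"\u200d", "\ufe0f"}


def is_dropped(ch: str) -> bool:
    """Decide whether a single character should be removed."""
    if ch in JOINERS:
        return True
    # Emoji skin-tone modifiers (category Sk) are listed by range.
    if 0x1F3FB <= ord(ch) <= 0x1F3FF:
        return True
    return unicodedata.category(ch) in DROP_CATEGORIES


def strip_symbols(name: str) -> str:
    """Remove symbol-like characters and tidy up the leftover whitespace."""
    kept = "".join(ch for ch in name if not is_dropped(ch))
    return " ".join(kept.split())
```

Names such as "André Andersson" or "Recep Tayyip Erdoğan" pass through
unchanged, because their characters are letters, not symbols.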

Another option is to run a webmail client so that your users can handle
or remove those messages themselves when their regular client crashes.

Brandon

On Mon, Mar 4, 2024 at 9:09 PM Philip Paeps via mailop <[email protected]>
wrote:

> On 2024-03-05 05:40:46 (+0800), Sebastian Nielsen via mailop wrote:
> > Anyone that have a general algoritm to filter out emoji from sender
> > addresses?
> >
> > How I do in regexp to identify emoji? (its such a stupid thing)..
>
> Today's regular expression will not capture tomorrow's emoji.  The nice
> people who standardise Unicode keep allocating more code points to more
> characters.
>
> > A guy sent a email containing emoji in the name part of a email sender
> > address in MIME FROM (like: Name [EMOJI] <[email protected]>). This
> > caused a few email clients to crash completely and being unable to
> > reopen until I had deleted the offending email from the inbox manually
> > in the server.
> >
> > So now I need to construct a rule to delete all emoji from both From:
> > header and To: header.
>
> You have constructed a textbook example of an "XY problem".
>
> > Im thinking to do same as I do when I filter emoji from subject lines,
> > but this will also filter out umlaits from people's names so "André
> > Andersson" becomes "Andr Andersson" and "Recep Tayyip Erdoğan" would
> > become "Recep Tayyip Erdoan".
> >
> > Which isn't a good thing to do.
>
> How do you deal with users who write in 漢字 or देवनागरी?
>
> > So I need a rule to filter it more specifically, just delete all emoji
> > but not other Unicode like characters and names from other countries.
>
> That is not a sustainable solution.
>
> Replacing the clients that crash seems much easier.
>
> Philip
> _______________________________________________
> mailop mailing list
> [email protected]
> https://list.mailop.org/listinfo/mailop
>