RE: Scientific American: Open Source infected with malware from invisible Unicode characters

Peter Constable via Unicode Mon, 23 Mar 2026 08:14:40 -0700

From the SA article:

>In 2021 researchers at the University of Cambridge identified a class of 
>attacks they called "Trojan Source," which exploited Unicode...


I wouldn't word things quite that way: it's not Unicode that's exploited; 
rather, it's using Unicode characters to craft an exploit against software 
editing tools and human code reviewers. It's a kind of social-engineering 
attack.

UTS 55 was created to address this class of issue:

https://www.unicode.org/reports/tr55/

For handling of invisible characters, see in particular section 4.2.

https://www.unicode.org/reports/tr55/#Invisibles

Something that wasn't specifically covered in 4.2, however, was guidance for 
display of contiguous sequences of variation selectors (which, btw, are not a 
conformant use of Unicode).
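
A minimal sketch of how a tool might surface such runs, assuming a "run" means
two or more adjacent code points from the two variation selector blocks
(U+FE00..U+FE0F and U+E0100..U+E01EF). The function names and the placeholder
format are hypothetical and are not taken from UTS 55.

```python
import re

# Variation selectors: VS1-VS16 (U+FE00..U+FE0F) and VS17-VS256
# (U+E0100..U+E01EF). Two or more in a row is treated as suspicious here.
VS_RUN = re.compile("[\uFE00-\uFE0F\U000E0100-\U000E01EF]{2,}")


def find_variation_selector_runs(text):
    """Return (offset, length) for each run of adjacent variation selectors."""
    return [(m.start(), len(m.group())) for m in VS_RUN.finditer(text)]


def make_visible(text):
    """Replace each run with a visible placeholder (hypothetical format)."""
    return VS_RUN.sub(lambda m: f"<VS x {len(m.group())}>", text)
```

A single variation selector after a base character is left alone, so ordinary
variation sequences such as emoji presentation selectors are not flagged.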



Peter


-----Original Message-----
From: Unicode <[email protected]> On Behalf Of Nitai Sasson via 
Unicode
Sent: March 22, 2026 3:38 PM
To: Karl Williamson <[email protected]>
Cc: [email protected]
Subject: Re: Scientific American: Open Source infected with malware from 
invisible Unicode characters

Thank you for sharing, this is quite interesting. I tried to find examples of 
how this actually works. Found this article:  
https://www.koi.ai/blog/glassworm-first-self-propagating-worm-using-invisible-code-hits-openvsx-marketplace

The screenshot in it shows a clearly suspicious line of code:

var decodedBytes = decode(' ... a very long invisible string ... ');

So yes, the string is invisible, but it's not in itself executable. It needs to 
be decoded using a small amount of normal, visible and very suspicious code. So 
the claim that the vulnerability is invisible and can't be caught by normal 
code review seems a bit disingenuous. It's just a new way to obfuscate a string.
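
To make that concrete, here is a rough sketch of one way such a string could
carry bytes, assuming each byte is mapped to one of the 256 variation
selectors. The mapping and both function names are assumptions for
illustration; the encoding GlassWorm actually uses is not described in this
thread.

```python
def encode(data: bytes) -> str:
    """Map each byte to one variation selector (assumed scheme)."""
    # Bytes 0..15 -> U+FE00..U+FE0F, bytes 16..255 -> U+E0100..U+E01EF.
    return "".join(
        chr(0xFE00 + b) if b < 16 else chr(0xE0100 + b - 16) for b in data
    )


def decode(hidden: str) -> bytes:
    """Recover the bytes; any non-selector characters are skipped."""
    out = bytearray()
    for ch in hidden:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            out.append(cp - 0xFE00)
        elif 0xE0100 <= cp <= 0xE01EF:
            out.append(cp - 0xE0100 + 16)
    return bytes(out)
```

The encoded string renders as nothing in most editors, but it stays inert
until a visible call such as decode(...) sits next to it, and that call is the
part a reviewer can spot.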

I haven't found any other description of what compromised source code looks 
like in practice. So best I can tell, while this is really interesting, it's 
not as undetectable to the naked eye as it sounds.

Still, very interesting! And if anyone has information that I haven't found, 
please share. Any technical dive into it would likely be a good read.

- Nitai

-------- Original Message --------
On Sunday, 03/22/26 at 11:12 Karl Williamson via Unicode 
<[email protected]> wrote:
Open-source software has an invisible vulnerability. Hackers have found it

A cybercrime campaign called GlassWorm is hiding malware in invisible 
characters and spreading it through software that millions of developers rely 
on

The danger in the code came from characters that are invisible to the human 
eye. In early March researchers at several security firms examined what looked 
like empty space and found hidden Unicode characters that decoded into a 
malicious program. Investigators soon traced hundreds of compromised 
open-source components spread across GitHub, npm and

Read in Scientific American: https://apple.news/ACCjFPpifQlCNSMetYCJ2Dg
