On Wed, Nov 10, 2021 at 6:30 AM Gaëtan Rivet <[email protected]> wrote:
>
> On Tue, Nov 2, 2021, at 19:43, Mike Pattrick wrote:
> > Recently there has been a lot of press about the "trojan source" attack,
> > where Unicode characters are used to obfuscate the true functionality of
> > code. This attack didn't affect OVS, but adding the check here will help
> > guard against it sneaking in later.
> >
> > Signed-off-by: Mike Pattrick <[email protected]>
>
> Hi,
>
> What did you base the selection of characters to blacklist on?

I believe this list was sourced from https://unicode.org/reports/tr9/

> Reading issues open on other languages, I haven't found a good comprehensive
> set of characters that would need to be blacklisted. I'm not sure it is a
> sufficient approach: getting creative and circumventing this kind of
> blacklist would be a sport.
>
> Instead, shouldn't we take the reverse approach and whitelist single-byte
> chars? (warn on multi-byte unicode sequence). It would be sufficient for
> the vast majority of C sources (and scripts).

I've been going back and forth on that idea. I'm afraid of making a change
that seems exclusionary to people with non-Latin characters in their names.
There are a few pre-canned lists of homoglyphs; maybe I could add those to
the blacklist?

> If there are exceptions, at least checkpatch would still show a warning about
> the introduced characters and they could be reviewed on a case-by-case basis.
> The idea is only to make invisible chars visible to reviewers.
>
> WDYT?
>
> --
> Gaetan Rivet

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
