(If you don't know much about character encodings and how they can cause issues, I just posted a blog post for you: https://www.oxymoronical.com/blog/2019/07/Please-watch-your-character-encodings)

I've run into a few bugs recently where non-English characters were causing things to break because we were encoding or decoding strings incorrectly. Please watch out for this; even better, add tests using non-English characters where it makes sense.

A couple of specific things worth keeping in mind:

* nsAString is documented as always being encoded in (potentially invalid) UTF-16.
* nsACString is documented as not having any defined encoding. Look back over where your string is coming from and see if you can infer the encoding from there. Ideally document it!
* Use the right IDL type when passing strings through XPCOM between C++ and JavaScript. Even though you are working with an nsACString in C++, ACString is not the right IDL type to use unless you know that there can be no international characters involved; they will be mangled without any kind of warning. Instead consider AUTF8String, which will encode/decode the nsACString as UTF-8 when converting from or to a JavaScript string.

As a concrete example, I skimmed over the IDLs that use ACString today and came across nsINetUtil.idl. The escape and unescape functions look suspect, as they take and return ACStrings and it seems likely that someone may want to escape international characters. Sure enough:

  netUtils.escapeString("Ć", netUtils.ESCAPE_XALPHAS) == "%06" (should be "%C4%86")
  netUtils.unescapeString("%C4%86", 0) == "Ä\u0086"

Maybe those functions are only meant to work with single-byte characters, but that isn't clear from the comments. Currently these two functions are only in use in tests, so we can probably just remove them rather than figure out whether to fix them or not.
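
For anyone who wants to see where "%06" and "Ä\u0086" in the escapeString/unescapeString results come from, here is a minimal sketch in plain JavaScript. It is only a model, not the XPCOM code: it assumes the ACString conversion keeps the low byte of each UTF-16 code unit on the way in and inflates each byte to one code unit on the way out, which is what the results suggest. All the helper names are hypothetical, and percentEscape is simplified to escape every byte.

```javascript
// Hypothetical helpers that model the conversions; none of these are Mozilla APIs.

// Lossy model: keep only the low 8 bits of each UTF-16 code unit.
function truncateToBytes(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    bytes.push(str.charCodeAt(i) & 0xff);
  }
  return bytes;
}

// Reverse of the lossy model: turn each byte into one UTF-16 code unit.
function inflateBytes(bytes) {
  return String.fromCharCode(...bytes);
}

// UTF-8 model: encode and decode with the standard TextEncoder/TextDecoder.
function utf8Bytes(str) {
  return Array.from(new TextEncoder().encode(str));
}

function utf8Decode(bytes) {
  return new TextDecoder("utf-8").decode(new Uint8Array(bytes));
}

// Simplified percent-escaping: escapes every byte, unlike a real URL escaper.
function percentEscape(bytes) {
  return bytes
    .map((b) => "%" + b.toString(16).toUpperCase().padStart(2, "0"))
    .join("");
}

const input = "\u0106"; // "Ć", LATIN CAPITAL LETTER C WITH ACUTE

const mangled = percentEscape(truncateToBytes(input)); // "%06"
const correct = percentEscape(utf8Bytes(input));       // "%C4%86"

const misdecoded = inflateBytes([0xc4, 0x86]);         // "\u00C4\u0086", i.e. "Ä\u0086"
const decoded = utf8Decode([0xc4, 0x86]);              // "\u0106", i.e. "Ć"

console.log(mangled, correct, JSON.stringify(misdecoded), decoded);
```

A test with a single character outside the single-byte range, like the one above, is enough to catch this kind of mismatch.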