> For the other files, it seems strange to force the use of a charset
> which is different from the charset of record for all our source files
> (i.e. US-ASCII).

Can you clarify where this "charset of record" rule comes from? Is it written down somewhere, or is it more of an oral tradition?

The non-ASCII characters I'm working with are, in fact, in the original Markdown sources. If it's really important to avoid those in all sources, I could (reluctantly) use a different strategy.

If the consensus is that the build tools should standardize on US-ASCII, I guess there's a separate question: are we willing to rely on the implicit platform default (now uniformly US-ASCII via command-line args), or is it better to be explicit about it (s/UTF_8/US_ASCII/ in my changeset)?
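
To make the explicit-versus-implicit question concrete, here is a rough sketch. It is not code from my changeset: the class `CharsetChoice` and its helper names are made up for illustration, and it only uses standard `java.nio.charset` APIs. It shows a strict decode with an explicitly named charset, which fails loudly on non-ASCII bytes regardless of what the platform default happens to be.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class CharsetChoice {

    // Decode strictly: report malformed or unmappable input instead of
    // silently substituting a replacement character.
    static String decodeStrict(byte[] bytes, Charset cs) throws CharacterCodingException {
        return cs.newDecoder()
                 .onMalformedInput(CodingErrorAction.REPORT)
                 .onUnmappableCharacter(CodingErrorAction.REPORT)
                 .decode(ByteBuffer.wrap(bytes))
                 .toString();
    }

    // True if every byte is valid US-ASCII.
    static boolean isPureAscii(byte[] bytes) {
        try {
            decodeStrict(bytes, StandardCharsets.US_ASCII);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, written as an escape so this file stays ASCII.
        byte[] markdown = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Implicit: depends on the JVM version and on flags such as -Dfile.encoding.
        System.out.println("platform default: " + Charset.defaultCharset());

        // Explicit: the result does not depend on how the JVM was launched.
        System.out.println("pure US-ASCII? " + isPureAscii(markdown));
    }
}
```

With the explicit form, a stray non-ASCII character in a source file shows up as a decoding error at the call site, whereas the implicit form behaves differently depending on how the tool was invoked.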