Hi Dan,
I think it's a combination of oral tradition and long-standing precedent.
Earlier this year, I raised this general issue, partly because of
inconsistent use of -encoding in the build system. The response was
that there was some concern that not all tools in the tool chain could
handle UTF-8 files.
$ find open/make -name \*.gmk | xargs grep -o -e '-encoding [^ ]*'
open/make/Docs.gmk:-encoding ISO-8859-1
open/make/Docs.gmk:-encoding ISO-8859-1
open/make/common/SetupJavaCompilers.gmk:-encoding ascii
open/make/common/SetupJavaCompilers.gmk:-encoding ascii
I think we should be consistent, but (at the time) it did not seem worth
pushing for UTF-8 everywhere.
-- Jon
On 11/27/2019 07:23 PM, Dan Smith wrote:
>> For the other files, it seems strange to force the use of a charset
>> which is different from the charset of record for all our source files
>> (i.e. US-ASCII).
>
> Can you clarify where this "charset of record" rule comes from? Is this written
> down somewhere, or more of an oral tradition?
>
> The non-ASCII characters I'm working with are, in fact, in the original
> Markdown sources. If it's really important to avoid those in all sources, I
> could (reluctantly) use a different strategy.
>
> If the consensus is that the build tools should standardize on US-ASCII, I
> guess there's a separate question about whether we're willing to rely on the
> implicit platform default (now uniformly US-ASCII via command-line args), or
> whether it's better to be explicit about it (s/UTF_8/US_ASCII/ in my changeset).
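
To make the explicit-versus-implicit question concrete, here is a minimal sketch, not taken from the changeset under discussion. The class name ExplicitCharsetDemo, the helpers decodeStrict and isDecodable, and the sample strings are all hypothetical. It assumes a tool that decodes its input strictly, so that non-ASCII bytes are reported rather than silently replaced.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK build tools.
class ExplicitCharsetDemo {

    // Decode with an explicitly named charset and fail on bad input,
    // instead of substituting U+FFFD as new String(bytes, cs) would.
    static String decodeStrict(byte[] bytes, Charset cs) throws CharacterCodingException {
        return cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // True if the bytes decode cleanly under the given charset.
    static boolean isDecodable(byte[] bytes, Charset cs) {
        try {
            decodeStrict(bytes, cs);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Sample inputs (assumed): one pure-ASCII line, one with curly quotes.
        byte[] plain = "section 4.1".getBytes(StandardCharsets.UTF_8);
        byte[] curly = "\u201cquoted\u201d".getBytes(StandardCharsets.UTF_8);

        // The implicit default depends on the platform and on command-line args.
        System.out.println("default charset: " + Charset.defaultCharset());

        System.out.println(isDecodable(plain, StandardCharsets.US_ASCII)); // true
        System.out.println(isDecodable(curly, StandardCharsets.US_ASCII)); // false
        System.out.println(isDecodable(curly, StandardCharsets.UTF_8));    // true
    }
}
```

With the charset named at the call site, the result no longer depends on how the JVM was launched, which is the argument for being explicit whichever charset is chosen.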