We need to define source and execution character sets (see section 5.2.1
of the C11 standard) for SDCC.

The easiest approach would be to support only the basic character set as
required by the standard. Currently, we have partial support for
ISO 8859-1.

Here's a suggestion for supporting a bit more:

The execution character set and encoding are the same as the source
character set and encoding.
The source file shall be written in a character set and encoding for
which the following properties hold:
1) Every character from the basic character set is encoded as a value in
the range 0 to 127, which is the same as in the execution character set
of the compiler used to compile SDCC.
2) Every multibyte character that contains a byte of value in the range
0 to 127 consists of exactly one byte.
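
To illustrate why property 2 matters, here is a minimal sketch (a
hypothetical helper, not actual SDCC code) of a scanner that looks for
the end of a string literal byte by byte, without decoding multibyte
characters. It assumes an ASCII-compatible host, so that '\\' is 0x5C
and '"' is 0x22. With UTF-8 every byte of a multibyte character is 128
or above, so the scan cannot be confused. Shift-JIS violates property 2:
the second byte of a two-byte character may be 0x5C, which this scanner
would mistake for a backslash.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only.
 * s points just past the opening '"' of a string literal and len is the
 * number of bytes available. Returns the index of the closing '"',
 * or -1 if none is found.
 * The scan is purely bytewise; that is only safe if no multibyte
 * character contains a byte in the range 0 to 127 (property 2). */
long find_closing_quote(const unsigned char *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (s[i] == '\\') {
            i++;                /* skip the byte following a backslash */
        } else if (s[i] == '"') {
            return (long)i;     /* unescaped quote ends the literal */
        }
    }
    return -1;
}
```

For the UTF-8 bytes C3 A4 22 (an a-umlaut followed by a quote) the
function finds the quote at index 2. For the Shift-JIS bytes 83 5C 22
the 0x5C trail byte is taken for a backslash, the quote is skipped and
the end of the literal is missed.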

However, unless the source is written in UTF-8-encoded Unicode, the
following restrictions to standard-compliance apply:
3) The value of universal character constants is undefined.
4) The value of u8-prefixed string literals is undefined.
5) The use of u- and U-prefixed string literals results in undefined
behaviour.
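
For the UTF-8 case, the bytes that a universal character name such as
\u00E4 has to produce in a string literal follow directly from the code
point, so no per-character-set conversion table is needed. A minimal
sketch of that mapping, assuming the compiler has already parsed the
code point (the function name is hypothetical and this is not SDCC
code):

```c
#include <stddef.h>

/* Hypothetical helper for illustration only.
 * Encodes the Unicode code point cp as UTF-8 into out, which must have
 * room for at least 4 bytes. Returns the number of bytes written, or 0
 * if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {                        /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For any other source encoding the compiler would have to map the code
point into that encoding, which is what restrictions 3 to 5 avoid.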

This should be easy to implement, and it covers most of the common
character sets one might want to use on an embedded device today. For
today's de-facto standard, UTF-8-encoded Unicode, we get full
standard-compliance.

Philipp



------------------------------------------------------------------------------
_______________________________________________
Sdcc-user mailing list
Sdcc-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sdcc-user