The ACES image container specification, which is meant to be compatible with OpenEXR, prescribes UTF-8 for the representation of strings. Therefore I suggest that OpenEXR adopt the following rules:
- All text strings are to be interpreted as Unicode, encoded as UTF-8. This includes attribute names and strings contained in attributes, for example, channel names.
- Text strings stored in files must be in Normalization Form C (NFC, canonical decomposition followed by canonical composition).
- Where text strings need to be collated, strcmp() is used to compare the corresponding char sequences: string A comes before (or is less than) string B if strcmp(A,B) < 0. (Note: this is not ambiguous; the C99 standard specifies that strcmp() interprets the bytes that make up a string as unsigned char. Only the sign of the result is specified, so the test must be < 0 rather than == -1.)
- Text strings passed to the IlmImf library must be encoded as UTF-8 and in Normalization Form C.

As far as I can tell, these rules are entirely compatible with all existing versions of the IlmImf library. Users whose writing system includes non-ASCII Unicode characters can continue to employ the existing library versions without change.

Future versions of the library should verify that text strings are valid UTF-8. In addition, the library should either verify that strings are normalized to NFC, or normalize to NFC on the fly.

Florian
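
A minimal sketch in C of two of the rules above: collation via strcmp() and a check that a string is well-formed UTF-8. The helper names exr_name_less() and exr_is_valid_utf8() are hypothetical and are not part of IlmImf. NFC verification is left out, since it needs Unicode data tables and would in practice rely on a Unicode library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an IlmImf function): true if a sorts before b.
   strcmp() guarantees only the sign of its result, so test for < 0. */
static bool exr_name_less(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}

/* Hypothetical helper: true if the NUL-terminated string s is well-formed
   UTF-8. Rejects stray or missing continuation bytes, overlong encodings,
   UTF-16 surrogates, and code points above U+10FFFF. */
static bool exr_is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned char c = *p;
        size_t n;               /* number of continuation bytes expected */
        unsigned long cp, min;  /* decoded code point, smallest legal value */

        if (c < 0x80) { p++; continue; }                    /* ASCII */
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return false;      /* stray continuation byte or invalid lead */

        p++;
        for (size_t i = 0; i < n; i++, p++) {
            /* A terminating NUL fails this test too, so we never read
               past the end of a truncated sequence. */
            if ((*p & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}
```

Because UTF-8 preserves code point order under unsigned byte comparison, exr_name_less("Z", "\xC3\xA9") is true: U+00E9 sorts after every ASCII character.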