>That's why I repeat: if we have ISOLat1 characters to output, these should be
>encoded as 2-byte sequences in case of UTF-8. Thus, the output files we have
>at the moment *cannot* be interpreted as UTF-8, since they
>are not.
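To make the quoted point concrete, here is a minimal Python sketch (the sample string and the helper name is_valid_utf8 are only illustrative, not anything from debiandoc). It shows that a character from the upper half of ISO Latin-1, such as the copyright sign, is one byte in Latin-1 but two bytes in UTF-8, and that the raw Latin-1 byte is rejected by a UTF-8 decoder:

```python
# Illustrative sample: COPYRIGHT SIGN (U+00A9) followed by plain ASCII.
text = "\u00a9 1999"

latin1_bytes = text.encode("latin-1")  # one byte for U+00A9
utf8_bytes = text.encode("utf-8")      # two bytes for U+00A9


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(latin1_bytes)                 # b'\xa9 1999'
print(utf8_bytes)                   # b'\xc2\xa9 1999'
print(is_valid_utf8(latin1_bytes))  # False
print(is_valid_utf8(utf8_bytes))    # True
```

Only the characters above 0x7F are affected; pure ASCII output is byte-identical in both encodings.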
Hmm. OK, this might be a problem. I don't know. It's up to the application
developer (Ardo) to determine what he wants to do.

>You see, the construct \|...\| can be easily caught since it's a special thing
>(`\' in input will be escaped with \ giving \\ in output). Well, in case of
>SDATA-entities, I see how to make use of them.

I don't see why \|...\| and © can't be caught equally easily. They are both
unique! Furthermore, if we can get the charset of the debiandoc char stream
sorted out, you can hook up *standard*, already written tools to go from one
char set to another.

>> >One more issue (I just took a more thorough look at the entities supplied
>> >by sgml-data): why do some files provide Unicode equivalents for entities
>> >and some proprietary SDATA? Is this by design?
>>
>> There are none that use SDATA AFAIK. You might be mixing up sgml-data
>> with some other packages which put stuff in /usr/lib/sgml/entities.
>
>I am sorry to say that the freshly downloaded and unpacked in a separate
>directory sgml-data package has ISO* files that define SDATA-entities.

Yes indeed. This inconsistency seems to be a bug.

>Well, and now returning to `stock' SGML entities. copy, and certain other
>entities (like nbsp, for example) are from ISOnum, while in the sgml-data
>package they are defined in both of them (and they are different, BTW).

Some overlap may be OK. ISO defines it -- not Debian!

>As for working out this problem, there are two possibilities: to make use of
>SDATA entities in all programs that come with Debian; or to use some Unicode
>encoding for intermediate/output files.

I opt for Unicode. Unless there is a standard that the copyright circle 'c'
glyph needs to be '[copy ]' and not '[copy ]' nor '[COPY ]', that is, unless I
am given guidelines by which to distinguish the proper notation from the
impostor, I am very hesitant to do that. I would like someone to tell me what
should be done, using the standards out there to back up their arguments.
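For reference, the \|...\| convention quoted earlier in this message can be picked apart mechanically. The following is only a sketch in Python, assuming exactly the escaping described in the quote (a literal backslash is written as \\ and an SDATA entity is bracketed by \|); the function name split_sdata and the sample string are hypothetical, and other backslash escapes are passed through untouched:

```python
def split_sdata(data: str):
    """Split escaped character data into ('text', s) and ('sdata', s) pieces.

    Assumes only the convention described in the thread: a literal backslash
    is written as two backslashes, and SDATA text is bracketed by
    backslash-pipe on each side.
    """
    pieces = []
    buf = []
    kind = "text"
    i = 0
    while i < len(data):
        ch = data[i]
        if ch == "\\" and i + 1 < len(data):
            nxt = data[i + 1]
            if nxt == "\\":
                buf.append("\\")          # escaped backslash becomes one backslash
            elif nxt == "|":
                # An SDATA bracket: flush what we have and flip the mode.
                if buf or kind == "sdata":
                    pieces.append((kind, "".join(buf)))
                buf = []
                kind = "sdata" if kind == "text" else "text"
            else:
                buf.append(ch + nxt)      # leave any other escape as it is
            i += 2
        else:
            buf.append(ch)
            i += 1
    if buf:
        pieces.append((kind, "".join(buf)))
    return pieces


# Sample input: an escaped backslash, then an SDATA entity, then plain text.
sample = r"a\\b \|[copy ]\| 1999"
for kind, value in split_sdata(sample):
    print(kind, repr(value))
```

Once the SDATA pieces are isolated like this, mapping a name such as '[copy ]' to a particular output character is a separate table lookup, and that table is exactly the part that needs a standard behind it.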

I am willing to provide SDATA encodings but not as the *default* unless they
are defined by some standard and it doesn't break the fundamental jade/dsssl
toolchain.

-- 
.....Adam Di [EMAIL PROTECTED]<URL:http://www.onShore.com/>

