> I do not quite understand why you think that debiandoc-sgml's output is
> encoded in ISOLat1. The output of debiandoc-sgml is just a plain 8-bit
> stream. I believe nobody can tell what it is.

Well, I don't know for sure, but:

(a) it's obviously not us-ascii (7bit)
(b) it could be interpreted as UTF8, but if that were the case, why would
    Russian/Japanese not use Unicode too?
(c) it uses the stock SGML entities we supply, which are Unicode/ISOLat1.

Your distinction between Unicode (UTF8) and ISOLat1 is basically
irrelevant, since, as far as I know, all the character entities that
debiandoc-sgml uses are represented identically in both.

I don't understand why you are saying that you *can't* capture the
character '©' but you can capture '[copy ]'. Isn't that the simplest way
to solve things, rather than breaking the rest of the Roman character set
languages?

> One more issue (I just took a more thorough look at the entities supplied
> by sgml-data): why do some files provide Unicode equivalents for entities
> and some proprietary SDATA? Is this by design?

There are none that use SDATA, AFAIK. You might be mixing up sgml-data
with some other packages which put stuff in /usr/lib/sgml/entities.

-- 
.....Adam Di [EMAIL PROTECTED] <URL:http://www.onShore.com/>
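
To make the Latin-1 versus UTF-8 comparison concrete, here is a minimal Python sketch. It says nothing about what debiandoc-sgml actually emits, and the helper name `encodings` is hypothetical, chosen only for illustration. It relies on one well-known fact: the first 256 Unicode code points match ISO Latin-1, so a character like '©' has the same code point number in both, while its UTF-8 byte sequence differs from the single Latin-1 byte once the code point is above 0x7F.

```python
def encodings(ch):
    """Return the code point and the Latin-1 / UTF-8 byte forms of one character.

    Hypothetical helper for illustration only; not part of debiandoc-sgml.
    """
    return {
        "code_point": ord(ch),           # Unicode code point number
        "latin1": ch.encode("latin-1"),  # single byte for U+0000..U+00FF
        "utf8": ch.encode("utf-8"),      # one byte for ASCII, two or more otherwise
    }


copy = encodings("\u00a9")   # the copyright sign
plain = encodings("A")       # a plain ASCII letter

for name, info in (("copyright sign", copy), ("letter A", plain)):
    print(name, hex(info["code_point"]), info["latin1"], info["utf8"])
```

So whether the two are "the same" depends on the level being compared: the code point numbers agree for this range, but an 8-bit output stream only matches byte for byte in the ASCII range.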

