As it says in the CHANGES entry for OpenProofPower 3.1w5: > The Reader/Writer now supports input and output in the UTF-8 encoding > of a Unicode representation of the mathematical symbols. This gives an > alternative to the ProofPower extended character set that is more > portable and convenient for many purposes. (This is not yet supported by > xpp or the document preparation system.)
I would welcome feedback on the following thoughts about what to do next with xpp and the document preparation system. For the document preparation system, we want to support documents that use the UTF-8 encoding. Some systems try to distinguish between UTF-8 and other encodings automatically or by including the UTF-8 byte order mark (BOM) as a magic number at the the head of the file. I don’t think either of these approaches is promising for the ProofPower document preparation system. So I propose using a new filename extension “.ppd” for UTF-8 encoded ProofPower documents that would be recognised by docsml and doctex. docsml would simply pass UTF-8 encodings on unchanged and the pp scripts would now default to running the ProofPower database with input and output in UTF-8. This will mean that if you run docsml and then run pp in a terminal window on the resulting “.sml” file, you will see the proper mathematical symbols, provided your system has reasonable support for UTF-8 and Unicode. It also means that text-processing utilities like “grep” will work properly on “.sml” files without you having to set LANG=C. Likewise, off-the-shelf text editors should work well on all the files involved. doctex would map the UTF-8 encodings for ProofPower symbols into the appropriate LaTeX commands. I propose that it will leave UTF-8 encodings that are not recognised as ProofPower symbols unchanged (so if you you are using the utf8 package in LaTeX for purposes of internationalization, then things will work as you would expect). The sieve keyword file format would be extended to allow Unicode code points to be given as equivalents for keywords, allowing you to extend the range of supported mathematical symbols if you desire. The sieve view file and sieve keyword file format would support a keyword indicating that the file uses the UTF-8 encoding. This is just for backwards compatibility for any users who have their own versions of these files and don’t want to upgrade them to UTF-8 just yet - the files supplied with the ProofPower distribution would have this keyword. I have been investigating the options for xpp and have concluded that making it use UTF-8 internally would require either some substantial work on the OpenMotif GUI toolkit that it uses or porting it to a more modern GUI toolkit. These aren’t very attractive options in the short term. So my proposal is that xpp would use the existing extended character set internally and convert to and from UTF-8 as required (as inferred from the filename or by selecting an option). To allow xpp to work this way, there will be a way of invoking pp so that it does input and output in the ProofPower encoding. The intention of the above is that all the text files in the ProofPower distribution will use the UTF-8 encoding from now on. Your existing documents will still work and you can convert to using UTF-8 gradually if you wish. If for some reason you maintain “.sml” files rather than just generating them as needed from “.doc” (or now “.ppd” files), then you can use xpp (or the pputf8 and utf8pp programs in the pputf8 contrib) to convert old “.sml” files to UTF-8 if you wish. Regards, Rob. _______________________________________________ Proofpower mailing list [email protected] http://lemma-one.com/mailman/listinfo/proofpower_lemma-one.com
