Rob,

12/03/15 12:31, Rob Arthan wrote:
I am currently working on support for input and output using Unicode
into ProofPower. I would appreciate any feedback on the translation
scheme I am proposing to use, which is described here:

http://www.lemma-one.com/ProofPower/unicode/pp-unicode.html

I doubt %xNNNNNN% occurs in any existing scripts, so I think that's fine.

Your Unicode values for %lseq% and %rseq% have not taken into account the correction to the Z Standard in Technical Corrigendum 1:
Page 24, in 6.4.6.5
In line 3, replace “0000 3008 LEFT ANGLE BRACKET” by “0000 27E8 MATHEMATICAL LEFT ANGLE BRACKET”. In line 4, replace “0000 3009 RIGHT ANGLE BRACKET” by “0000 27E9 MATHEMATICAL RIGHT ANGLE BRACKET”.
So
   - 3008 should be 27E8
   - 3009 should be 27E9

You currently have
2588 FULL BLOCK
for %EFT%. This renders quite large. I'm guessing that the description 'full block' means that the intention is for the glyph to be entirely black. This is larger that the square we're used to and it may not even be square. I think
25A0 BLACK SQUARE
could be more suitable (and would be square!)
To compare in your email client:  █ (2588) and ■ (25A0)
What do you think?

You currently have
208D SUBSCRIPT LEFT PARENTHESIS     for %ulbegin%
208E SUBSCRIPT RIGHT PARENTHESIS    for %ulend%
but these parentheses don't have underscores like the ProofPower glyphs. One option is
298B LEFT SQUARE BRACKET WITH UNDERBAR
298C RIGHT SQUARE BRACKET WITH UNDERBAR
They appear as ⦋ (298B) and ⦌ (298C) but I think they're too close to refinement symbols. I think this could be a case where you probably want to combine another glyph with the existing parenthesis using the combining diacritics. I think the best options are
0331 COMBINING MACRON BELOW
which gives ̱( ̱)
These work in my email client and terminal.  Alternatively
0332 COMBINING LOW LINE
which gives (̲ )̲
These don't overlay correctly in my email client or terminal but are a single wide character.

For epsilon, I can see why
03F5 GREEK LUNATE EPSILON SYMBOL
has been used instead of
03B5 GREEK SMALL LETTER EPSILON
- it is the correct shape for the equivalent LaTeX character. I was finding that the lunate epsilon is very similar to the membership operator. The lower case LaTeX mathematical greek letters appear italicized so I wonder whether the standard should have used e.g.
𝜃 (1D703) instead of θ (03B8)
𝜆 (1D706) instead of λ (03BB)
𝜇 (1D707) instead of μ (03BC)
so you would have chosen
𝜖 (1D716) instead of ϵ (03F5)
That would have meant also that any Greek authors of Z don't find themselves deprived of certain letters. As things stand, I agree with your choice.

There are some multi-character operators that could be translated to a single character operator that would result in Unicode documents more in line with ISO Z.
\%down%s        ---> ⧹ (29F9) for schema hiding
%filter%%down%s ---> ⨡ (2A21) for schema projection
%rcomp%%down%s  ---> ⨟ (2A1F) for schema composition
This would require ProofPower-Z to be updated to add the new symbols. I mention it now because the transition to Unicode could be used as an opportunity to eliminate the non-ISO forms from Unicode documents, if you wished to transition to the ISO symbol.

Either way, you probably want to add schema projection and composition to the ML function utf8_to_pp_replacement_data.

The situation with hyphen minus is unfortunate. I presume the issue is that you can't identify which instances of ASCII hyphen-minus are Z subtraction operators.

The following are not well supported on my system:
01F13C SQUARED LATIN CAPITAL LETTER M    for %SML%
01F143 SQUARED LATIN CAPITAL LETTER T    for %<:%
01F149 SQUARED LATIN CAPITAL LETTER Z    for %SZT%
What do you consider converting them to the expanded form, e.g.
  %SZT% ---> ⌜↘Z↕
?
I think a single character is preferable and ideally it would be somehow based on the opening corner '⌜'. I wonder if there is a combining trick possible here.


I am particularly interested to know how good the coverage for the
Unicode characters needed is on different systems and on different
applications. You can see how good your web browser is just by looking
at the above Web page, which includes GIFs that show how LaTeX renders
the ProofPower symbols alongside the Unicode translation as rendered by
your browser. To see how good your mail client is, the following should
look something like the HOL definition of distributed union:

⌜∀ x a⦁ x ∈ ⋃ a ⇔ (∃ s⦁ x ∈ s ∧ s ∈ a)⌝

This is readable on my system - all glyphs are available. The sizes are a little odd but that could be due to font substitution.


(There is a complete list of the symbols used at the end of this e-mail.)

Current indications are that Mac OS can do everything that is required,
but that recent Linux systems and MS Windows tend to miss one or two
glyphs: the squared upper case letters I am using for some of the
quotation symbols seem to be poorly supported, for example. Most text
editors and word processors on the systems I have tried do support
UTF-8. If anyone can advise on how to extend the range of Unicode glyphs
available on Linux or on how to get font substitution working nicely
with OpenMotif, I would be very grateful.

On Linux, perhaps it helps to install or upgrade the package
  stix-fonts (Fedora)
  fonts-stix (Ubuntu)
or, without using packages, follow these instructions:
http://www.stixfonts.org/install.html
With fontconfig, I suspect a user installation just requires the font to be dropped into ~/.fonts and it just works.


There is a prototype implementation of the translation as two
free-standing programs pputf8 and utf8pp here:

https://github.com/RobArthan/pp-contrib

(These have been around for a while, but I have just changed them to
implement the somewhat simplified and extended scheme described in the
Web page cited above.)

My immediate plan is to integrate the translation into ProofPower
itself, so that Unicode input and output are options you can select with
set_flag (similar to the way that you can currently select ASCII output,
by calling set_flag(“use_extended_chars”, false)). I will provide
separate flags for input and output.

Great - I look forward to seeing Z fly past in my shell!! Now I need to work out how to get the same key mapping working in Linux generally.


Below, as a further test of your e-mail client, you will find the ASCII
encoding and UTF-8 encoding of the corresponding Unicode for all the
characters in the ProofPower extended character set.

I have checked the Unicode support with a few programs. I suspect that the user experience will also be affected by which fonts are available, in particular STIX, because fontconfig performs glyph substitution.


Thunderbird 17.0.2, this email

The following have no glyph available:
 >       2  %<:%: 🅃
 >      20  %SML%: 🄼
 >      24  %SZT%: 🅉

Also:
 >      27  %Upsilon%: Υ    -- glyph appears identical to capital Y
 >     108  %select%: ϵ     -- very similar to ∈


Firefox 18.0.2, pp-unicode.html

The following have no glyph available:
0x01F13C
0x01F143
0x01F149

0x0003A5 is identical to capital Y


I'm using Fedora 16 (still!) and have package versions as follows:

$ rpm -q stix-fonts fontconfig thunderbird firefox
stix-fonts-1.0.0-2.fc15.noarch
fontconfig-2.8.0-8.fc16.x86_64
fontconfig-2.8.0-8.fc16.i686
thunderbird-17.0.2-1.fc16.x86_64
firefox-18.0.2-1.fc16.x86_64


Regards,
Phil

_______________________________________________
Proofpower mailing list
Proofpower@lemma-one.com
http://lemma-one.com/mailman/listinfo/proofpower_lemma-one.com

Reply via email to