der Mouse wrote:
You'll have to preprocess the text into something compatible with
one of the defined FMapType values (unless you're lucky enough to
have a printer with something more modern than Level 2, I suppose,
if such a thing exists).
Ok, can you suggest what kind of multi-byte encoding would work?
UTF-16?

I'm not sure; the Red Book is at home.  But the simplest thing is to
restrict yourself to the BMP and then encode each character as two
octets, using a composite font with a suitable suite of base fonts and an
FMapType specifying either 9/7 or 8/8 encoding (depending on whether
you have 512 7-bit base fonts or 256 8-bit base fonts - well, minus
whatever ones aren't needed because you don't need those characters).
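
To make the two-octet idea concrete, here is a rough sketch of how a BMP code point would split into a base font index and a character code under an 8/8 or a 9/7 division. The function name split_code and its font_bits parameter are made up for illustration; this only shows the arithmetic, not how the composite font itself would be built.

```python
def split_code(code_point, font_bits=8):
    """Split a BMP code point into (base font index, character code).

    font_bits=8 models an 8/8 split (256 base fonts, 8-bit codes);
    font_bits=9 models a 9/7 split (512 base fonts, 7-bit codes).
    Hypothetical helper, not part of any existing code.
    """
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("only BMP code points fit in two octets")
    char_bits = 16 - font_bits
    font_index = code_point >> char_bits              # high bits pick the base font
    char_code = code_point & ((1 << char_bits) - 1)   # low bits pick the glyph slot
    return font_index, char_code


print(split_code(0x03BC, font_bits=8))  # (3, 188)
print(split_code(0x03BC, font_bits=9))  # (7, 60)
```

So under 8/8, U+03BC would live at slot 0xBC of base font number 3, which is why unused high bytes translate directly into base fonts you can leave out.
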
PostScript's notion of fonts is pretty limited if you want to stay compatible with Level 2. A base font can only address 256 glyphs at a time through its encoding vector, but the mapping is relatively flexible: the encoding vector lets you put an arbitrary glyph name into each slot.

The current PostScript code does the following:
1) Assume most of the characters come from the Latin-1 encoding.
2) Scan all the strings in the current document for non-Latin-1 characters (e.g. ones that arrive as multi-byte UTF-8).
3) Enumerate and keep track of all non-Latin-1 characters.
4) Take the list of characters and map them to the 'unused' region in the font map, starting at index 128 (I think).
5) Emit a font remap command that re-maps the aforementioned characters to the Adobe-sanctioned Unicode glyph names.
6) When strings print, replace the non-Latin-1 characters recognized in step 2 with the re-encoded index.
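
The steps above can be sketched roughly as follows. This is not the actual code, just an illustration under some assumptions: the function names are invented, the free slots are taken to be 128 through 159 (the range Latin-1 leaves without printable characters), and the glyph names use the "uniXXXX" form from Adobe's glyph naming convention.

```python
def build_remap(strings, first_slot=128, last_slot=159):
    """Steps 2-4: collect non-Latin-1 characters and give each a free slot.

    The slot range is an assumption, not taken from the real code.
    """
    slots = {}
    for text in strings:
        for ch in text:
            if ord(ch) > 0xFF and ch not in slots:
                slot = first_slot + len(slots)
                if slot > last_slot:
                    raise ValueError("no free encoding slots left")
                slots[ch] = slot
    return slots


def glyph_name(ch):
    """Step 5: glyph name for a character, in 'uniXXXX' / 'uXXXXX' form."""
    cp = ord(ch)
    return ("uni%04X" if cp <= 0xFFFF else "u%04X") % cp


def encode(text, slots):
    """Step 6: Latin-1 passes through, remapped characters use their slot."""
    return bytes(slots[ch] if ord(ch) > 0xFF else ord(ch) for ch in text)


strings = ["R1 10k\u03a9", "C1 1\u03bcF"]
slots = build_remap(strings)
for ch, slot in slots.items():
    print(slot, glyph_name(ch))          # 128 uni03A9, then 129 uni03BC
print(encode("C1 1\u03bcF", slots))      # b'C1 1\x81F'
```

The obvious limit of this scheme is the number of free slots: once a document uses more distinct non-Latin-1 characters than there are unused positions, something has to give.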

This way, most Latin-1 characters pass through without any changes, but any non-Latin-1 characters come out re-mapped to the upper half of the font.


The immediate problem, it seems to me, is how to refer to the specific
glyph image you want.  Let's pick the "mu" symbol as an example.  I
can find its Unicode code point, but what would the PostScript font
know this as?

00b5?  Or 03bc?  It depends on the fonts in use, of course, but the
former would probably be encoded as two bytes 0x00 0xb5 and the latter
as 0x03 0xbc (in PostScript strings, most likely \0\265 and \3\274).
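
For what it's worth, producing that kind of string mechanically is simple. A minimal sketch, assuming every character is in the BMP and is written as two octal-escaped bytes, high byte first (the helper name ps_two_byte_string is made up):

```python
def ps_two_byte_string(text):
    """Render text as a PostScript string literal of two-byte codes.

    Each BMP character becomes two octal escapes, high byte first.
    Three-digit escapes are always used so a following digit cannot
    be misread as part of the escape.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            raise ValueError("character outside the BMP: U+%X" % cp)
        parts.append("\\%03o\\%03o" % (cp >> 8, cp & 0xFF))
    return "(" + "".join(parts) + ")"


print(ps_two_byte_string("\u00b5"))  # (\000\265)
print(ps_two_byte_string("\u03bc"))  # (\003\274)
```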

Why does the 'mu' character not render? Possibly because the font is missing the 'mu' glyph definition. The last time this came up, I think we wrote it off as a missing encoding in the font. I can't remember whether I found a font that actually had that character.
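
One way to check the "missing glyph" theory, assuming an AFM metrics file is available for the font in question, is to list the glyph names it declares. The sketch below reads the "N <name>" field from the character metric lines; the sample data is invented for illustration and is not taken from any real font.

```python
def afm_glyph_names(afm_text):
    """Return the set of glyph names declared in AFM character metrics."""
    names = set()
    for line in afm_text.splitlines():
        # Character metric lines start with "C <code>" or "CH <hex code>".
        if not (line.startswith("C ") or line.startswith("CH ")):
            continue
        for field in line.split(";"):
            parts = field.split()
            if len(parts) == 2 and parts[0] == "N":
                names.add(parts[1])
    return names


# Hypothetical sample, not real font data.
SAMPLE = """\
StartCharMetrics 2
C 32 ; WX 278 ; N space ; B 0 0 0 0 ;
C 181 ; WX 576 ; N mu ; B 24 -207 600 523 ;
EndCharMetrics
"""

names = afm_glyph_names(SAMPLE)
print("mu" in names)       # True
print("uni03BC" in names)  # False
```

If the name the remap step emits is not in that set, the font has nothing to draw for it, whatever the encoding says.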


_______________________________________________
geda-dev mailing list
[email protected]
http://www.seul.org/cgi-bin/mailman/listinfo/geda-dev
