After some thorough research on the subject I decided to post my conclusions/thoughts here. Beware, this is a long one.

Font problems
=============

There are currently no good, complete, free, Unicode Open/TrueType math fonts. We will have to wait for the STIX fonts. The site says that a beta version of the fonts will be available in September, so the next SoC could probably cover that - if we're lucky ;).

I had a look at the following Open/TrueType Unicode fonts:

* CMU fonts. These fonts have practically no math symbols, so they're not a solution.
* The fonts used by Open Office - Open Symbol (opens___.ttf), which has a decent set of (Unicode) symbols. These fonts were made to play well with Times, and could perhaps be used in mathtext together with the Nimbus Roman fonts.
* FreeFont. GPL fonts, available on any Linux box. They have an extensive list of supported symbols. Probably the best free TrueType fonts out there.

The best solution to the font problem would be to use the currently available CM and AMS (and other) Type1 fonts, which are free and come with every TeX distribution. These fonts are complete and have pretty good Unicode support, as illustrated by the following code:

    from matplotlib.ft2font import FT2Font
    import unicodedata

    # Path to a Type1 font
    filename = r'c:\texmf\fonts\type1\bluesky\symbols\msam10.pfb'
    f = FT2Font(filename)
    # Walk over the (glyph index, Unicode code point) pairs of the charmap
    indexes = f.get_charmap()
    for index, uni in indexes.items():
        try:
            name = unicodedata.name(unichr(uni))
        except ValueError:
            name = None
        print f.get_glyph_name(index), index, name, repr(unichr(uni))

which outputs

    space 128 SPACE u' '
    diamond 6 BLACK DIAMOND SUIT u'\u2666'
    therefore 41 THEREFORE u'\u2234'
    because 42 BECAUSE u'\u2235'
    muchless 110 MUCH LESS-THAN u'\u226a'
    muchgreater 111 MUCH GREATER-THAN u'\u226b'
    dblarrowleft 18 LEFT RIGHT DOUBLE ARROW u'\u21d4'
    dblarrowright 19 RIGHTWARDS DOUBLE ARROW u'\u21d2'
    lessorgreater 55 LESS-THAN OR GREATER-THAN u'\u2276'
    greaterorless 63 GREATER-THAN OR LESS-THAN u'\u2277'
    angle 92 ANGLE u'\u2220'
    proportional 95 PROPORTIONAL TO u'\u221d'

The msam10 font was used in the above code, but other
fonts behave similarly.

Unfortunately, the most important method of the FT2Font class, f.get_glyph(index), raises "ValueError: Glyph index out of range" for Type1 fonts, but I think that this could be easily fixed. The current C++ code for get_glyph:

    char FT2Font::get_glyph__doc__[] =
    "get_glyph(num)\n"
    "\n"
    "Return the glyph object with num num\n"
    ;
    Py::Object
    FT2Font::get_glyph(const Py::Tuple & args){
      _VERBOSE("FT2Font::get_glyph");
      args.verify_length(1);
      int num = Py::Int(args[0]);

      if ( (size_t)num >= gms.size())
        throw Py::ValueError("Glyph index out of range");

      //todo: refcount?
      return Py::asObject(gms[num]);
    }

The problem with this solution (if we get get_glyph to work with Type1) could be the backends. Agg wouldn't have to change much (if at all), but I don't know about the PS and SVG backends. Type1 fonts are installable on both Windows (via .pfm files) and Unix systems, so I guess SVG files could be viewed/changed without much hassle, and the PS backend could be changed a bit to support Type1 fonts. Also, the characters are spread across a fairly large number of files, but I suppose that with a little code this can be overcome.

Unicode problems
================

The following is assembled from the report "Unicode Support for Mathematics", which is the primary source of information regarding mathematics and Unicode.

The biggest problem with *proper* math Unicode is the "Mathematical Alphanumeric Symbols" block, which is in the 1D400..1D7FF range, not in the Basic Multilingual Plane. These characters are not found in any free font.

I also noticed that Python's support for Unicode outside the BMP is not very good. The following example works on Linux (Ubuntu 6.06), but doesn't work on Windows XP (32):

    >>> import unicodedata
    >>> unicodedata.name(u'\U0001d400')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: need a single Unicode character as parameter

The output should be:

    MATHEMATICAL BOLD CAPITAL A

The "Mathematical Alphanumeric Symbols" block contains:

* Mathematical bold letters
* Mathematical italic letters (used for variables, default font in TeX math mode)
* Mathematical bold italic letters
* Mathematical script (calligraphic) letters
* Mathematical bold script letters
* Mathematical fraktur letters
* Mathematical double-struck letters
* Mathematical bold fraktur letters
* Mathematical sans-serif letters
* Mathematical sans-serif bold letters
* Mathematical sans-serif italic letters
* Mathematical sans-serif bold italic letters
* Mathematical monospace letters
* Dotless symbols
* Bold Greek symbols
* Additional bold Greek symbols
* Italic Greek symbols
* Additional italic Greek symbols
* Bold italic Greek symbols
* Additional bold italic Greek symbols
* Sans-serif bold Greek symbols
* Sans-serif bold italic Greek symbols
* Additional sans-serif bold Greek symbols
* Additional sans-serif bold italic Greek symbols
* Bold digits
* Double-struck digits
* Sans-serif digits
* Sans-serif bold digits
* Monospace digits

These were all put in the Unicode character set because of their semantic meaning in mathematics, although practically all of them are just font variations (<font>). The roman math letters (serif, normal, used for digits) default to the "Basic Latin" block.

It is interesting to note that the "Mathematical Alphanumeric Symbols" block doesn't seem to be supported by, for example, Arial Unicode MS (it supports only the BMP).

This issue cannot be successfully solved until the STIX fonts come out. If they package them right (and they ought to), we could have a single .ttf file for all the glyphs needed for mathtext. Until then, any solution will need some sort of mapping between Unicode blocks (character ranges) and font files (at least for italic, calligraphic etc.
fonts)

Possible enhancements
=====================

I think there should be a thin Python wrapper around the FreeType2 FT2Font class. Then, for example, all the caching could be handled by that class. This would allow caching not only for mathtext but even for *plain text*, and would clean up the code. It would also allow adding new functionality without messing around with C++, and without breaking old code.

One could then, for example, have a method get_unicode_glyph on the FT2Font class that would return a glyph based on its Unicode index, or better yet, the following code would be easy to implement:

    glyphs = FT2Font('/path/to/font')
    glypha = glyphs['a']

or even:

    text_to_render = glyphs.text('Some lame text')

or something similar. Again, this would not break old code and would ease writing new code.

However, as John once said:

    The font library is probably an SOC project of its own, because we would like to settle on one freetype library that both matplotlib and enthought/chaco can use. How to deal with this issue without becoming consumed by it will require some thought.

Conclusion
==========

John, what should I do? Please comment.

I think that the best solution right now is, unfortunately, the BaKoMa fonts. If we could get the Type1 fonts to work, then I could probably easily integrate them into the existing model. I could also try to do something with the Open Symbol fonts and FreeFont (Windows users could download them separately).

Cheers,
Edin
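
P.S. To make the "mapping between Unicode ranges and font files" and the thin wrapper idea a bit more concrete, here is a rough sketch, not a finished design. The GlyphSource class, the RANGES table and the font file names are all made up for illustration; nothing like this exists in matplotlib. The loader argument stands in for something like FT2Font, so the sketch runs without any fonts installed.

```python
import bisect

# Hypothetical table: (first code point, last code point, font file).
# The file names are placeholders and say nothing about which real
# fonts actually cover these ranges.
RANGES = [
    (0x0020, 0x007E, 'roman.pfb'),     # printable Basic Latin
    (0x2200, 0x22FF, 'symbols.pfb'),   # Mathematical Operators
    (0x1D400, 0x1D433, 'bold.pfb'),    # Mathematical bold A..z
    (0x1D434, 0x1D467, 'italic.pfb'),  # Mathematical italic A..z
]


class GlyphSource(object):
    """Map characters to font files and cache the loaded fonts."""

    def __init__(self, ranges, loader):
        self._ranges = sorted(ranges)
        self._starts = [first for first, last, name in self._ranges]
        self._loader = loader   # callable: font file name -> font object
        self._fonts = {}        # font file name -> loaded font object

    def fontfile_for(self, code):
        """Return the font file whose range contains the code point."""
        # Find the last range that starts at or before `code`.
        i = bisect.bisect_right(self._starts, code) - 1
        if i >= 0:
            first, last, name = self._ranges[i]
            if first <= code <= last:
                return name
        raise KeyError('no font covers U+%04X' % code)

    def __getitem__(self, char):
        """Return (font object, code point) for a character or an int."""
        code = char if isinstance(char, int) else ord(char)
        name = self.fontfile_for(code)
        if name not in self._fonts:
            # Each font file is loaded at most once.
            self._fonts[name] = self._loader(name)
        return self._fonts[name], code
```

With a real loader this would be used as glyphs = GlyphSource(RANGES, FT2Font) and then glyphs['a'] or glyphs[0x1D400]. Passing the code point as an integer sidesteps the problem that ord() cannot handle characters outside the BMP on Python builds that store them as surrogate pairs.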