Re: [matplotlib-devel] Unicode to Tex symbols, Type1 names, and vice versa

Edin Salković Fri, 23 Jun 2006 02:50:31 -0700

On 6/22/06, John Hunter <[EMAIL PROTECTED]> wrote:
> Since you asked :-)
>
> I may not have mentioned this but the style conventions for mpl code
> are
>
>   functions : lower or lower_score_separated
>   variables and attributes : lower or lowerUpper
>   classes : Upper or MixedUpper


OK

> Also, I am not too fond of the dict of dicts -- why not use variable
> names?

I used a dict of dicts because it let me generate the separate
pickle files (one for each dict in the top-level dict), and
everything else (see the final script), from the corresponding top-level
dict name. I thought it was better, for practical/speed reasons, to
have a separate pickle file for every dict.
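
A minimal sketch of that idea, written for Python 3 rather than the Python 2 used elsewhere in this thread. The helper name dump_by_name and the sample entries are made up for illustration; the point is only that the top-level key doubles as the file name:

```python
import os
import pickle
import tempfile


def dump_by_name(dicts, directory):
    """Write each inner dict to <directory>/<name>.pcl and return the paths."""
    paths = {}
    for name, mapping in dicts.items():
        path = os.path.join(directory, name + '.pcl')
        # Binary mode is required for pickle files on Python 3.
        with open(path, 'wb') as fh:
            pickle.dump(mapping, fh)
        paths[name] = path
    return paths


# Illustrative entries only, not taken from the STIX table.
dicts = {'uni2tex': {0x00d7: '\\times'}, 'tex2uni': {'\\times': 0x00d7}}
outdir = tempfile.mkdtemp()
paths = dump_by_name(dicts, outdir)

# Each dict can be loaded back on its own, without touching the others.
with open(paths['uni2tex'], 'rb') as fh:
    reloaded = pickle.load(fh)
```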

>     for line in file(fname):
>         if line[:2]!=' 0': continue # using continue avoids unnecessary indent

Thanks for the tip!

>         uninum = line[2:6].strip().lower()
>         type1name = line[12:37].strip()
>         texname = line[83:110].strip()
>
>         uninum = int(uninum, 16)

I thought that the idea was to allow users to write unicode strings
directly in TeX (OK, this isn't much of an excuse :). That's why I
used the eval approach, to get the dict keys (or values) to be unicode
strings. I'm also aware that indexing by ints is faster, and that the
underlying FT2 functions work with ints... OK, I'm now convinced that
your approach is better :)
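
For what it's worth, int keys do not stop users from passing unicode characters; the character just has to go through ord() first. A small Python 3 sketch, where tex_for_char is a hypothetical helper and the table holds a single illustrative entry:

```python
# Table keyed by integer code point (illustrative entry only).
uni2tex = {0x00d7: '\\times'}


def tex_for_char(char, table):
    """Look up the TeX name for a single unicode character."""
    # ord() turns the character into the int key the table uses.
    return table.get(ord(char))


# The hex column from the table file and the character itself
# lead to the same key.
key = int('00d7', 16)
name = tex_for_char('\u00d7', uni2tex)
```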

>     pickle.dump((uni2type1, type12uni, uni2tex, tex2uni),
>                 file('unitex.pcl','w'))
>
>     # An example
>     unichar = int('00d7', 16)
>     print uni2tex.get(unichar)
>     print uni2type1.get(unichar)
>
> Also, I am a little hesitant to use pickle files for the final
> mapping.  I suggest you write a script that generates the python code
> that contains the dictionaries you need (that is how much of
> _mathtext_data was generated).
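
One way the suggestion above might look, as a rough Python 3 sketch. The function name render_module and the sample entry are assumptions, and this is not how _mathtext_data was actually produced; it only shows dicts being written out as importable Python source instead of a pickle:

```python
import pprint


def render_module(tables):
    """Return Python source text that defines one dict per table name."""
    lines = ['# Automatically generated file. Do not edit by hand.', '']
    for name in sorted(tables):
        # pformat() emits a literal that evaluates back to the same dict.
        lines.append('%s = %s' % (name, pprint.pformat(tables[name])))
        lines.append('')
    return '\n'.join(lines)


# Illustrative entry only.
source = render_module({'uni2tex': {0x00d7: '\\times'}})

# Executing the generated text stands in for importing the module.
namespace = {}
exec(source, namespace)
```

The generated text would normally be written to a .py file and imported, which keeps the data readable and diffable.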

The reason why I used pickle - from the Python docs:
=====
Strings can easily be written to and read from a file. Numbers take a
bit more effort, since the read() method only returns strings, which
will have to be passed to a function like int(), which takes a string
like '123' and returns its numeric value 123. However, when you want
to save more complex data types like lists, dictionaries, or class
instances, things get a lot more complicated.

Rather than have users be constantly writing and debugging code to
save complicated data types, Python provides a standard module called
pickle. This is an amazing module that can take almost any Python
object (even some forms of Python code!), and convert it to a string
representation; this process is called pickling. Reconstructing the
object from the string representation is called unpickling. Between
pickling and unpickling, the string representing the object may have
been stored in a file or data, or sent over a network connection to
some distant machine.
=====
So I thought that pickling was the obvious way to go. And, of course,
unpickling with cPickle is very fast. I also think that no human being
should change the automatically generated dicts. Rather, we should add
a separate python file (e.g. _mathtext_manual_data.py) where anybody
who wants to manually override the automatically generated values, or
add new (key, value) pairs, can do so.

The idea:

_mathtext_manual_data.py:
=======
uni2tex = {key1:value1, key2:value2}
tex2uni = {}
uni2type1 = {}
type12uni = {}

uni2tex.py:
=======
from cPickle import load

uni2tex = load(open('uni2tex.pcl'))
try:
    import _mathtext_manual_data
    uni2tex.update(_mathtext_manual_data.uni2tex)
except (TypeError, SyntaxError): # Just these exceptions should be raised
    raise
except: # All other exceptions should be silent
    pass
=====
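
The override step on its own could be sketched like this in Python 3. Both helper names are hypothetical, and catching only ImportError is a narrower choice than the bare except above, so treat it as one possible variant rather than the proposed code:

```python
import importlib


def load_overrides(module_name, attr):
    """Return the manual dict from module_name, or {} if it is absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # No manual data file present: nothing to override.
        return {}
    return dict(getattr(module, attr, {}))


def apply_overrides(generated, manual):
    """Return a new dict in which manual entries win over generated ones."""
    merged = dict(generated)
    merged.update(manual)
    return merged


generated = {1: 'a', 2: 'b'}
merged = apply_overrides(generated, {2: 'c', 3: 'd'})
missing = load_overrides('_no_such_manual_data_module_', 'uni2tex')
```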

Finally, I added lines for automatically generating pretty much
everything that can be automatically generated:

stix-tbl2py.py
=======
'''A script for seamlessly copying the data from the stix-tbl.ascii*
file to a set of python dicts. The dicts are then pickled to
corresponding files, for later retrieval.
Currently used table file:
http://www.ams.org/STIX/bnb/stix-tbl.ascii-2005-09-24
'''

import pickle

tablefilename = 'stix-tbl.ascii-2005-09-24'
dictnames = ['uni2type1', 'type12uni', 'uni2tex', 'tex2uni']
dicts = {}
# initialize the dicts
for name in dictnames:
    dicts[name] = {}

for line in file(tablefilename):
    if line[:2]!=' 0': continue
    uninum = int(line[2:6].strip().lower(), 16)
    type1name = line[12:37].strip()
    texname = line[83:110].strip()
    if type1name:
        dicts['uni2type1'][uninum] = type1name
        dicts['type12uni'][type1name] = uninum
    if texname:
        dicts['uni2tex'][uninum] = texname
        dicts['tex2uni'][texname] = uninum

template = '''# Automatically generated file.
from cPickle import load

%(name)s = load(open('%(name)s.pcl'))
try:
    import _mathtext_manual_data
    %(name)s.update(_mathtext_manual_data.%(name)s)
except (TypeError, SyntaxError): # Just these exceptions should be raised
    raise
except: # All other exceptions should be silent
    pass
'''

# pickling the dicts to corresponding .pcl files
# automatically generating .py module files, used by importers
for name in dictnames:
    pickle.dump(dicts[name], open(name + '.pcl','w'))
    file(name + '.py','w').write(template%{'name':name})

# An example
from uni2tex import uni2tex
from uni2type1 import uni2type1

unichar = u'\u00d7'
uninum = ord(unichar)
print uni2tex[uninum]
print uni2type1[uninum]
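
The column slicing from the script can also be pulled out into a function so it can be tried without the table file. This is a Python 3 sketch; parse_line is a hypothetical name, and the sample line is made up to fit the slice offsets, not copied from the real table:

```python
def parse_line(line):
    """Return (uninum, type1name, texname), or None for lines to skip."""
    # Same filter as the script: only lines starting with ' 0' are data.
    if line[:2] != ' 0':
        return None
    uninum = int(line[2:6].strip(), 16)
    type1name = line[12:37].strip()
    texname = line[83:110].strip()
    return uninum, type1name, texname


# Made-up line padded so the fields land in the expected columns.
sample = (' 000D7'.ljust(12) + 'multiply').ljust(83) + '\\times'
result = parse_line(sample)
skipped = parse_line('# a header or comment line')
```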

Cheers,
Edin

