On Tue, May 5, 2015 at 9:41 AM, Armin Rigo <ar...@tunes.org> wrote:

> Hi Mariano,
>
> On 5 May 2015 at 09:57, Mariano Reingart <reing...@gmail.com> wrote:
> > I'll try to clone PyPy too, I think it could be even easier to
> > internationalize, as it is pure python and no C API should be changed, but
> > please correct me if I'm wrong.
>
> It's easier in some ways, but harder in others, because there is no
> experience about doing that on an RPython project (unlike C). To get
> started:
>
> * let's say the goal is to use gettext via the dgettext() API. This
>   is a function that takes and returns a C string, i.e. a "char *",
>   which is different from an RPython string. We need to interface to
>   dgettext() via rffi (see examples in rpython/rlib/*.py, like e.g.
>   rzlib.py or rpoll.py or others).
>

Or using a pure python implementation already in the stdlib?
https://hg.python.org/cpython/file/2.7/Lib/gettext.py

> * I think the sanest is to assume that the language doesn't change
>   dynamically, so that we can use caching techniques; for now it can be
>   done by adding "@jit.elidable" as a decorator, which means that at
>   least the JIT will constant-fold calls to dgettext().
>

Ok, I'll investigate this...

> * the first place to use this would be in the error message
>   construction. Unfortunately, this is also a place where we do custom
>   things instead of relying on a standard printf() format:
>   pypy/interpreter/error.py. To see examples, grep for "raise oefmt"
>   anywhere in pypy/. A message is specified like "expected %d
>   arguments, got %d", but it is broken in advance (during "translation",
>   i.e. when we turn PyPy from Python code to an actual executable) into
>   two strings, "expected " and " arguments, got "; at runtime we build
>   the final string by concatenating the pieces. This logic needs to
>   change, but the idea of needing only to concatenate some pieces should
>   stay, as it is essential to get good JIT performance. We probably
>   need to call dgettext() on the whole input string, and then split it
>   up to prepare the concatenation-of-pieces for future calls --- done at
>   runtime but still cached so that it occurs only once for each input
>   string. (Obviously we need a way to reorder the arguments, too.)
>

Ok, understood.
Anyway, do you think this really has any performance implication?
dgettext is mainly a lookup and interpolation; can't it just be used to
return the final string?
Splitting the text can be direct for some languages, but I don't know how it
will work for right-to-left or non-Latin-1 character sets...

> This should be discussed in more detail. Maybe you should show up on
> irc, channel #pypy on irc.freenode.net.
>

Ok, I'm downloading pypy and will build it and try to investigate further.
BTW, I see a lot of heads in the mercurial repo, which version should I use?
tip?

I prefer communicating on the mailing list, as it gives me more time to reply
properly, but tell me when you will be available next week and I'll try to
show up on IRC.

Best regards,

Mariano Reingart
http://www.sistemasagiles.com.ar
http://reingart.blogspot.com
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev