Jim Jewett wrote:
> By knowing that there is only one possible representation for a given
> string, he skips the equivalency cache. On the other hand, he also
> loses the equivalency cache.

What is an equivalency cache, and why would one want to have one?

> When Python 2.x chooses the Unicode width, it tries to match Tcl;
> under a "minimal size possible" scheme, strings that fit in ASCII
> will have to be recoded twice on every round trip. The same problem
> pops up with other extension modules, and with system encodings.

In _tkinter, strings have to be copied *always*, whether they use the
same representation or a different one. Tcl requires strings to be
represented as a Tcl_Obj; you cannot pass a Python string object
directly into Tcl. Since you have to copy anyway, it doesn't matter
whether you do size conversions in the process.
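To illustrate, here is a simplified sketch of the kind of conversion
_tkinter performs (not the actual code; it assumes Py_UNICODE and
Tcl_UniChar have the same width, a case the real module has to check
for):

    #include "Python.h"
    #include <tcl.h>

    /* Converting a Python unicode object to a Tcl_Obj always copies
       the character data, whatever the representation on either
       side happens to be. */
    static Tcl_Obj *
    as_tcl_obj(PyObject *value)
    {
        Py_UNICODE *buf = PyUnicode_AS_UNICODE(value);
        Py_ssize_t size = PyUnicode_GET_SIZE(value);

        /* Tcl_NewUnicodeObj copies the characters into storage owned
           by the new Tcl_Obj; the Python buffer cannot be shared. */
        return Tcl_NewUnicodeObj((Tcl_UniChar *)buf, (int)size);
    }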
> By exposing the full object instead of the abstract interface,
> compilers can do pointer addition instead of calling a get_data
> function. But they still don't know (until run time) how wide the
> data at that pointer will be, and we're locked into binary
> compatibility.

That's not true. The internal representation of objects can and did
change across releases. People have to, and do, recompile their
extension modules for every new feature release.

>> I doubt any kind of "pluggable" representation could work in a
>> reasonable way. With that generality, you lose any information
>> as to what the internal representation is, and then code becomes
>> tedious to write and slow to run.
>
> Instead of working with ((string)obj).data directly, you work with
> string.recode(object, desired)

... causing a copy of the data, right? This is expensive.

> If you're saying this will be slow because it is a C function call,
> then I can't really argue; I just think it will be a good trade for
> all the times we don't recode at all (or recode only once per
> encoding).

It's not the function call that makes it slow. It's the copying of
potentially large string data that a recoding requires. In addition,
for some encodings, the transformation algorithm itself is fairly
slow.

> I'll admit that I'm not sure what sort of data would make a
> real-world (as opposed to contrived) benchmark.

Any kind of text application will suffer if strings are constantly
recoded.
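To make the cost concrete, here is a minimal sketch of the widening
pass that a single UCS-2 to UCS-4 recoding implies (the function and
type names are invented for illustration):

    #include <stdlib.h>

    typedef unsigned short ucs2_t;   /* hypothetical, for illustration */
    typedef unsigned int   ucs4_t;

    /* Widening allocates a second O(n) buffer and touches every
       character once.  This pass, not the call into the recoding
       function, is what dominates the cost. */
    static ucs4_t *
    recode_ucs2_to_ucs4(const ucs2_t *in, size_t n)
    {
        ucs4_t *out = malloc(n * sizeof(ucs4_t));
        if (out == NULL)
            return NULL;
        for (size_t i = 0; i < n; i++)
            out[i] = in[i];
        return out;
    }

And that is before any actual transcoding work; encodings with more
involved algorithms pay on top of the copy.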
Regards,
Martin