On Mon, Mar 19, 2012 at 9:15 PM, Peter Cock <p.j.a.c...@googlemail.com> wrote:
> On Mon, Mar 19, 2012 at 6:36 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> append_charpsize is special - it's not the *actual* implementation,
>> the actual implementation is buried somewhere in
>> rpython/lltypesystem/rbuilder.py, with the one you're mentioning being
>> just a fake implementation for tests. StringBuilder is special in the
>> sense that it has some special GC support (which we can probably
>> improve upon).
>>
>> Cheers,
>> fijal
>
> I guess you are referring to the copy_string_contents function here:
> https://bitbucket.org/pypy/pypy/src/default/pypy/rpython/lltypesystem/rstr.py
>
> However, neither ll_append_multiple_char nor ll_append_charpsize, as
> defined in rbuilder, seems to use this - they both use a for loop char-by-char,
> https://bitbucket.org/pypy/pypy/src/default/pypy/rpython/lltypesystem/rbuilder.py
>
> My hunch would be to replace this:
>
>     @staticmethod
>     def ll_append_charpsize(ll_builder, charp, size):
>         used = ll_builder.used
>         if used + size > ll_builder.allocated:
>             ll_builder.grow(ll_builder, size)
>         for i in xrange(size):
>             ll_builder.buf.chars[used] = charp[i]
>             used += 1
>         ll_builder.used = used
>
> with this:
>
>     @staticmethod
>     def ll_append_charpsize(ll_builder, charp, size):
>         used = ll_builder.used
>         if used + size > ll_builder.allocated:
>             ll_builder.grow(ll_builder, size)
>         assert size >= 0
>         ll_str.copy_contents(charp, ll_builder.buf, 0, used, size)
>         ll_builder.used += size
>
> (and similarly for ll_append_multiple_char above it)
>
> Like an onion - more and more layers ;) I'm beginning to suspect
> speeding up append_charpsize in order to make passing strings
> to/from C code faster is a bit too ambitious for a first contribution
> to PyPy! [*]
>
> Peter
>
> [*] Especially as after three hours it is still building from source:
> $ python translate.py --opt=jit targetpypystandalone.py

OK, so let me reply a bit more :)

First of all, you don't have to translate pypy to see changes. We mostly
run tests to see if they work. You can also write a very small rpython
program in translator/goal (look at targetnopstandalone.py) if you just
want to test the performance of a single function.

I suppose your code is indeed a bit faster, but my bet would be that it's
not much faster (feel free to prove me wrong, especially on older GCCs,
which might not figure out that a loop is vectorizable, for example).

The main reason passing strings to C is slow, however, is copying the
string from the GC area to a non-moving one, raw-malloced in C. There are
various strategies for approaching this. One of them would be pinning, so
that the GC structures don't move and you can pass a pointer to C. This is
however definitely not a good first patch to pypy ;-)

What I would suggest:

* Your patch looks good to me, although I'm not sure if
  copy_string_contents would accept raw memory. Check if the tests pass.

* If you want to benchmark, write a small test for passing such strings
  in translator/goal and see if it works.

We're usually available for help on IRC, and thanks for tackling this
problem!

Cheers,
fijal
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev
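
For reference, a small target of the kind suggested above might look roughly like the sketch below. It assumes the entry_point/target convention used by targetnopstandalone.py. The file name targetcopybench.py, the helper build_payload and the workload itself are hypothetical placeholders, not existing PyPy code. The loop body would be replaced by whatever string-passing path is being measured.

```python
# Hypothetical file: translator/goal/targetcopybench.py
# Sketch only: modelled on the entry_point/target layout of
# targetnopstandalone.py; the workload is a stand-in, not PyPy code.


def build_payload(n):
    # Placeholder workload: build an n-character string one char at a time.
    # Replace this with the code path you actually want to time.
    chars = []
    for i in range(n):
        chars.append(chr(ord('a') + i % 26))
    return ''.join(chars)


def entry_point(argv):
    # argv[1], if given, is the payload size (assumed convention).
    n = 1000
    if len(argv) > 1:
        n = int(argv[1])
    total = 0
    for _ in range(100):
        total += len(build_payload(n))
    # Print the result so the work cannot be optimised away entirely.
    print(total)
    return 0


def target(*args):
    # The translator calls target() to find the entry point.
    return entry_point, None
```

Because the file avoids RPython-only imports, entry_point(["prog", "10"]) can also be called from a plain Python prompt as a quick sanity check before translating it and timing the resulting binary.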