Re: [pypy-dev] CFFI and UTF8

Armin Rigo Thu, 31 Jul 2014 08:40:46 -0700

Hi,

On 31 July 2014 16:47, Eleytherios Stamatogiannakis <est...@gmail.com> wrote:
> Wouldn't it be faster to have a ffi.stringUTF8 for the case where we know
> the input is in UTF8?


Actually, it's the opposite of what you'd expect.  Right now,
`ffi.string(p).decode('utf-8')` does two copies: one from the raw
memory into a byte string, and one from the byte string into a unicode
string.  In the proposed UTF8 future of PyPy, the same expression
might be done with only one copy (because `s = ffi.string(p)` and
`s.decode('utf-8')` could share the same byte string).
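
For concreteness, here is a minimal sketch of the two steps as they
work today, assuming the `cffi` package is importable.  The helper name
`decode_c_string` and the sample data are made up for illustration;
only `ffi.new()`, `ffi.string()` and `bytes.decode()` are real API.

```python
from cffi import FFI

ffi = FFI()

def decode_c_string(p):
    """Hypothetical helper: turn a NUL-terminated UTF-8 char* into a str."""
    raw = ffi.string(p)          # copy 1: raw C memory -> byte string
    return raw.decode('utf-8')   # copy 2 (today): byte string -> unicode string

# Stand-in for a char* returned by a C library (sample data only).
p = ffi.new("char[]", u"h\u00e9llo".encode('utf-8'))
text = decode_c_string(p)
```

A dedicated `ffi.stringUTF8()` would only save the second step, which
is the one that could become free anyway if byte strings and unicode
strings end up sharing UTF-8 data.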

It doesn't mean the idea of `ffi.stringUTF8()` is necessarily bad, but
it should be a CFFI discussion instead of a PyPy one.  I'm "-0" on the
idea, as it adds more complexity to the API for just a minor
performance gain (particularly one that disappears in the UTF8 future
of PyPy).

> Ideally we could also have a ffi.stringUTF8const, which knowing that the
> char* is const (won't be changed by the C side), won't do a copy at all?

That's not possible: a PyPy string object cannot point directly to raw
memory, but must contain its own data, just like a CPython
(byte)string object.
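
A small sketch of what "contains its own data" means in practice,
again assuming `cffi` is importable (the buffer contents and variable
names are just examples): the result of `ffi.string()` is a snapshot,
so later changes to the raw memory are not visible through it.

```python
from cffi import FFI

ffi = FFI()

buf = ffi.new("char[]", b"hello")   # raw C memory, owned by cffi
snapshot = ffi.string(buf)          # byte string holding its own copy

buf[0] = b"j"                       # the C side changes the memory later

fresh = ffi.string(buf)             # a new copy picks up the change
# snapshot still holds the old contents; only fresh reflects the write
```

Even if the `char*` is declared const, the string object still needs
its own storage, so the copy cannot be skipped.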


A bientôt,

Armin.
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev
