On Mon, 16 May 2022 11:13:56 +0200
Victor Stinner <vstin...@python.org> wrote:
> Hi,
>
> I propose adding a new C API to "build an Unicode string". What do you
> think? Would it be efficient with any possible Unicode string storage
> and any Python implementation?
>
> PyPy has an UnicodeBuilder type in Python, but here I only propose C
> API. Later, if needed, it would be easy to add a Python API for it.
> PyPy has UnicodeBuilder to replace "str += str" pattern which is
> inefficient in PyPy: CPython has a micro-optimization (in ceval.c) to
> keep this pattern performance interesting. Adding a Python API was
> discussed in 2020, see the LWN article:
> https://lwn.net/Articles/816415/
>
> Example without error handling, naive implementation which doesn't use
> known length of key and value strings (calling Preallocate may be more
> efficient):
> ---------------------------
> // Format "key=value"
> PyObject *format_with_builder(PyObject *key, PyObject *value)
> {
>     assert(PyUnicode_Check(key));
>     assert(PyUnicode_Check(value));
>
>     // Allocated on the stack
>     PyUnicodeBuilder builder;
>     PyUnicodeBuilder_Init(&builder);
>
>     // Overallocation is more efficient if the final length is unknown
>     PyUnicodeBuilder_EnableOverallocation(&builder);
>     PyUnicodeBuilder_WriteStr(&builder, key);
>     PyUnicodeBuilder_WriteChar(&builder, '=');
>
>     // Disable overallocation before the last write
>     PyUnicodeBuilder_DisableOverallocation(&builder);

Having to manually enable or disable overallocation doesn't sound right.
Overallocation should be done *before* writing, not after. If there are
N bytes remaining and you write N bytes, then no reallocation should
occur.

Regards

Antoine.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XOBUBUUCUS252CHFZA7I2HXEDUQ2G45P/
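
To illustrate the "overallocate before writing" idea, here is a minimal
sketch on a plain byte buffer. It is not the proposed PyUnicodeBuilder API
and not CPython code: buf_t, buf_reserve(), buf_write() and the 25% growth
factor are all hypothetical names and choices made up for this example, and
integer overflow checks are left out for brevity. The buffer grows only
when the remaining space is too small, and the slack is added at that
point, so a write that exactly fills the remaining space never reallocates
and no enable/disable switch is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer, for illustration only. */
typedef struct {
    char *data;
    size_t len;      /* bytes written so far */
    size_t cap;      /* bytes allocated */
    int reallocs;    /* number of realloc() calls, kept only for the demo */
} buf_t;

/* Make room for n more bytes *before* the write happens.
   Grows only when the remaining space is too small, and adds
   about 25% slack at that point (assumed growth factor). */
static int buf_reserve(buf_t *b, size_t n)
{
    assert(b->len <= b->cap);
    if (n <= b->cap - b->len) {
        return 0;  /* enough room already, including the exact-fit case */
    }
    size_t needed = b->len + n;            /* overflow check omitted */
    size_t newcap = needed + needed / 4;   /* overallocate up front */
    char *p = realloc(b->data, newcap);
    if (p == NULL) {
        return -1;  /* the old buffer is still valid on failure */
    }
    b->data = p;
    b->cap = newcap;
    b->reallocs++;
    return 0;
}

/* Append n bytes from src, growing first if needed. */
static int buf_write(buf_t *b, const char *src, size_t n)
{
    if (buf_reserve(b, n) < 0) {
        return -1;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

With these assumptions, writing 8 bytes into an empty buffer allocates 10
bytes, and a following 2-byte write fills the buffer exactly without
another realloc() call. Only the next write after that grows it again.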