New submission from STINNER Victor:

The _PyUnicodeWriter API avoids creating temporary Unicode strings and performs very well when building Unicode strings under PEP 393 (compact Unicode strings).
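To make the writer idea concrete, here is a minimal sketch in plain C. It is not the real _PyUnicodeWriter: toy_writer, its helper functions and repr_long_array() are hypothetical names invented for this illustration, the buffer holds plain bytes rather than PEP 393 code units, and the 25% overallocation factor is an arbitrary assumption. It only shows the general pattern: every piece is appended to one growing buffer, which is trimmed once at the end, instead of building a temporary string per element and joining them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy writer: illustrates the idea only, not CPython's API. */
typedef struct {
    char *buf;    /* single output buffer, grown in place */
    size_t len;   /* bytes written so far */
    size_t cap;   /* bytes currently allocated */
} toy_writer;

static void toy_writer_init(toy_writer *w)
{
    w->buf = NULL;
    w->len = 0;
    w->cap = 0;
}

/* Ensure room for `extra` more bytes; overallocate to amortize growth.
   Returns 0 on success, -1 on allocation failure. */
static int toy_writer_reserve(toy_writer *w, size_t extra)
{
    size_t needed = w->len + extra;
    if (needed <= w->cap)
        return 0;
    size_t newcap = needed + needed / 4;  /* 25% headroom (assumed policy) */
    char *p = realloc(w->buf, newcap);
    if (p == NULL)
        return -1;
    w->buf = p;
    w->cap = newcap;
    return 0;
}

/* Append n bytes from s to the buffer. */
static int toy_writer_write(toy_writer *w, const char *s, size_t n)
{
    if (toy_writer_reserve(w, n) < 0)
        return -1;
    memcpy(w->buf + w->len, s, n);
    w->len += n;
    return 0;
}

/* Format an integer directly into the buffer: no temporary string. */
static int toy_writer_write_long(toy_writer *w, long v)
{
    /* 32 bytes is enough for a 64-bit long, its sign and a NUL. */
    if (toy_writer_reserve(w, 32) < 0)
        return -1;
    int n = snprintf(w->buf + w->len, 32, "%ld", v);
    assert(n > 0 && n < 32);
    w->len += (size_t)n;
    return 0;
}

/* Shrink the buffer to its final size and hand it to the caller. */
static char *toy_writer_finish(toy_writer *w)
{
    char *p = realloc(w->buf, w->len + 1);
    if (p == NULL) {
        free(w->buf);
        toy_writer_init(w);
        return NULL;
    }
    p[w->len] = '\0';
    toy_writer_init(w);
    return p;
}

/* Build "[a, b, c]" for an array of longs using one writer.
   Returns a malloc'ed string, or NULL on allocation failure. */
char *repr_long_array(const long *items, size_t n)
{
    toy_writer w;
    toy_writer_init(&w);
    if (toy_writer_write(&w, "[", 1) < 0)
        goto error;
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && toy_writer_write(&w, ", ", 2) < 0)
            goto error;
        if (toy_writer_write_long(&w, items[i]) < 0)
            goto error;
    }
    if (toy_writer_write(&w, "]", 1) < 0)
        goto error;
    return toy_writer_finish(&w);

error:
    free(w.buf);
    return NULL;
}
```

A caller passes an array and its length to repr_long_array() and frees the returned string with free(). The only allocations are the writer's own buffer growths and the final shrink, which is the property the writer approach is after.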
The attached patch adds a _PyObject_ReprWriter() function to avoid creating temporary Unicode strings when calling repr(obj) on containers such as tuple, list or dict. I did something similar for str%args and str.format(args).

To avoid the following code, we might add something to PyTypeObject, maybe a new tp_repr_writer field.

+    if (PyLong_CheckExact(v)) {
+        return _PyLong_FormatWriter(writer, v, 10, 0);
+    }
+    if (PyUnicode_CheckExact(v)) {
+        return _PyUnicode_ReprWriter(writer, v);
+    }
+    if (PyList_CheckExact(v)) {
+        return _PyList_ReprWriter(writer, v);
+    }
+    if (PyTuple_CheckExact(v)) {
+        return _PyTuple_ReprWriter(writer, v);
+    }
+    if (PyDict_CheckExact(v)) {
+        return _PyDict_ReprWriter(writer, v);
+    }

For example, repr(list(range(10))) ('[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]') should allocate only one buffer of 37 bytes and then shrink it to 30 bytes.

I guess that benchmarks are required to justify such changes.

----------
files: repr_writer.patch
keywords: patch
messages: 203371
nosy: haypo, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Generalize usage of _PyUnicodeWriter for repr(obj): add _PyObject_ReprWriter()
versions: Python 3.4
Added file: http://bugs.python.org/file32701/repr_writer.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19653>
_______________________________________