On Sat, May 28, 2011 at 2:37 AM, Stefan Behnel <stefan...@behnel.de> wrote:
> Robert Bradshaw, 28.05.2011 00:39:
>>
>> On Fri, May 27, 2011 at 3:32 PM, Stefan Behnel wrote:
>>>
>>> I recently stumbled over a tradeoff question with AttributeError, and now
>>> found the same situation for UnboundLocalError in Vitja's control flow
>>> branch. So here it is.
>>>
>>> When we raise an exception several times in different parts of the code
>>> with a message that only differs slightly each time (usually something
>>> like "'NoneType' has no attribute X", or "local variable X referenced
>>> before assignment"), we have three choices to handle this:
>>>
>>> 1) Optimise for speed: create a Python string object at module
>>> initialisation time and call PyErr_SetObject(exc_type, msg_str_obj).
>>>
>>> 2) Current way: let CPython create the string object when raising the
>>> exception and just call PyErr_SetString(exc_type, "complete message").
>>>
>>> 3) Trade speed for size and allow the C compiler to reduce the storage
>>> redundancy: write only the message template and the names as C char*
>>> constants by calling PyErr_Format(exc_type, "message template %s", "X").
>>>
>>> Assuming that exceptions should be exceptional, I'm leaning towards 3).
>>> This would allow the C compiler to collapse multiple usages of the same C
>>> string into one data constant, thus reducing a bit of redundancy in the
>>> shared library size and the memory footprint. However, it would
>>> (slightly?) slow down the exception raising due to the additional string
>>> formatting, even when compared to the need to build a Python string
>>> object that it shares with 2). While 1) would obviously be the fastest
>>> way to raise an exception (no memory allocation, only refcounting), I
>>> think it's not worth it for exceptions as it increases both the runtime
>>> memory overhead and the module startup time.
>>
>> Any back-of-the-envelope calculations on how much the savings would
>> be?
>
> As a micro benchmark, I wrote three C functions that do 10 exception
> setting calls and then clear the exception, and called those 10x in a loop
> (i.e. 100 exceptions). Results:
>
> 1) PyErr_SetObject(PyExc_TypeError, Py_None)
> Py3.3: 1000000 loops, best of 3: 1.42 usec
> Py2.7: 1000000 loops, best of 3: 0.965 usec
>
> 2) PyErr_SetString(PyExc_TypeError, "[complete message]")
> Py3.3: 100000 loops, best of 3: 11.2 usec
> Py2.7: 100000 loops, best of 3: 4.85 usec
>
> 3) PyErr_Format(PyExc_TypeError, "[message %s template]", "Abc1")
> Py3.3: 10000 loops, best of 3: 37.3 usec
> Py2.7: 10000 loops, best of 3: 25.3 usec
>
> Observations: these are really tiny numbers for 100 exceptions. The string
> formatting case is only some 0.3 microseconds (25x) slower per exception
> than the constant pointer case, and about 0.2 microseconds (4-5x) slower
> than the C string constant case.
>
> Note that this only benchmarks the exception setting, not the catching,
> i.e. without the instantiation of the exception object etc., which is
> identical for all three cases.
>
> This change would only apply to Cython generated exceptions (from None
> safety checks, unbound locals, etc.), which can appear in a lot of places
> in the C code but should not normally be triggered in production code. If
> they occur, we'd lose about 0.2 microseconds per exception, comparing 2)
> and 3). I think that's totally negligible, given that these exceptions
> potentially indicate a bug in the user code.
>
> "strings" tells me that the C compiler really only keeps one copy of the
> string constants. The savings per exception message are somewhere between
> 30 and 40 bytes. Not much in today's categories. Assuming even 1000 such
> exceptions in a large module, that's only some 30K of savings, whereas
> such a module would likely have a total stripped size of a *lot* more than
> 1MB.
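
For illustration, the storage argument quoted above can be written down as a small standalone C sketch. It is not taken from Cython or from the benchmark in this thread: the template text, the variable names and both function names are hypothetical, and it assumes the compiler stores each distinct string constant exactly once. It only counts string bytes and ignores the code needed to pass the extra argument to PyErr_Format().

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical template and variable names, chosen only for illustration. */
#define MSG_TEMPLATE "local variable '%s' referenced before assignment"

static const char *const NAMES[] = { "x", "count", "result" };
#define N_NAMES (sizeof(NAMES) / sizeof(NAMES[0]))

/* Option 2: every complete message is stored as its own C string constant. */
static size_t complete_message_bytes(void)
{
    size_t total = 0;
    for (size_t i = 0; i < N_NAMES; i++) {
        /* With a NULL buffer of size 0, snprintf only reports the length. */
        int len = snprintf(NULL, 0, MSG_TEMPLATE, NAMES[i]);
        if (len < 0)
            return 0;                      /* formatting error, not expected */
        total += (size_t)len + 1;          /* + terminating NUL */
    }
    return total;
}

/* Option 3: one shared template plus the bare names. */
static size_t template_bytes(void)
{
    size_t total = sizeof(MSG_TEMPLATE);   /* includes the terminating NUL */
    for (size_t i = 0; i < N_NAMES; i++)
        total += strlen(NAMES[i]) + 1;     /* + terminating NUL */
    return total;
}
```

For these three hypothetical names, complete_message_bytes() returns 153 and template_bytes() returns 64. The exact figures depend entirely on the template and names chosen, so they are not expected to match the 30 to 40 bytes measured with "strings" above.
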
>
> Personally, I think that the performance degradation is basically
> non-existent, so the space savings come almost for free, however tiny they
> may be.
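
The per-exception figures quoted above follow from dividing each timing by the 100 exceptions raised per loop. A minimal C sketch of that arithmetic, with hypothetical helper names that are not part of the original benchmark:

```c
/* Each timed loop raised and cleared 100 exceptions (10 calls, run 10x). */
#define EXCEPTIONS_PER_LOOP 100.0

/* Cost of a single exception in microseconds, given "usec per loop". */
static double per_exception_usec(double usec_per_loop)
{
    return usec_per_loop / EXCEPTIONS_PER_LOOP;
}

/* Extra microseconds per exception of the slower variant over the faster. */
static double extra_usec(double slower_loop, double faster_loop)
{
    return per_exception_usec(slower_loop) - per_exception_usec(faster_loop);
}

/* How many times slower the slower variant is. */
static double slowdown_factor(double slower_loop, double faster_loop)
{
    return slower_loop / faster_loop;
}
```

With the Py2.7 timings for 3) and 2), extra_usec(25.3, 4.85) is about 0.2 and slowdown_factor(25.3, 4.85) is about 5.2, in line with the estimate quoted above.
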

Sounds good. I'm fine with 2) or 3). Despite the performance advantage of 1),
raising this kind of error should be the exceptional case, and the module
initialization time is an issue.

- Robert
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel