On Fri, 25 Aug 2006 10:53:15 -0700, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>On 8/25/06, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
>> >For the record, I think this is a major case of YAGNI. You appear way
>> >too obsessed with performance of some microscopic aspect of the
>> >language. Please stop firing random proposals until you actually have
>> >working code and proof that it matters. Speeding up microbenchmarks is
>> >irrelevant.
>>
>>Twisted's core loop uses string views to avoid unnecessary copying. This
>>has proven to be a real-world speedup. This isn't a synthetic benchmark
>>or a micro-optimization.
>
>OK, that's the kind of data I was hoping for; if this was mentioned
>before I apologize. Did they implement this in C or in Python? Can you
>point us to the docs for their API?

One instance of this is an implementation detail which doesn't impact any
application-level APIs:

  http://twistedmatrix.com/trac/browser/trunk/twisted/internet/abstract.py?r=17451#L88

Another instance of this is implemented in C++:

  http://twistedmatrix.com/trac/browser/sandbox/itamar/cppreactor/fusion

but doesn't interact a lot with Python code. The C++ API uses char* with a
length (a natural way to implement string views in C/C++). The Python API
just uses strings, because Twisted has always used str here, and passing in
a buffer would break everything expecting something with str methods.

>>I don't understand the resistance. Is it really so earth-shatteringly
>>surprising that not copying memory unnecessarily is faster than copying
>>memory unnecessarily?
>
>It depends on how much bookkeeping is needed to properly free the
>underlying buffer when it is no longer referenced, and whether the
>application repeatedly takes short long-lived slices of long otherwise
>short-lived buffers. Unless you have a heuristic for deciding to copy
>at some point, you may waste a lot of space.

Certainly. The first link above includes an example of such a heuristic.

>>If the goal is to avoid speeding up Python programs because views are too
>>complex or unpythonic or whatever, fine. But there isn't really any
>>question as to whether or not this is a real optimization.
>
>There are many ways to implement views. It has often been proposed to
>make views an automatic feature of the basic string object. There the
>optimization in one case has to be weighed against the pessimization
>in another case (like the bookkeeping overhead everywhere and the
>worst-case scenario I mentioned above).

I'm happy to see things progress one step at a time. Having them _at all_
(buffer) was a good place to start. A view which has string methods is a
nice incremental improvement.
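
To make the trade-off above concrete, here is a minimal sketch of a write
buffer that hands out views instead of copies and falls back to a single
copy once the already-sent prefix dominates the buffer. It is not Twisted's
actual code: the class name SendBuffer, its methods, and the
COMPACT_THRESHOLD value are hypothetical, and it uses Python 3's memoryview
where the thread talks about buffer.

```python
class SendBuffer:
    """Hypothetical outgoing-data buffer (illustration only, not Twisted's code)."""

    # Assumed value, chosen only for illustration.
    COMPACT_THRESHOLD = 4096

    def __init__(self):
        self._data = b""
        self._offset = 0  # number of bytes in _data that were already sent

    def append(self, chunk):
        # bytes is immutable, so this builds a new object and leaves
        # any previously handed-out view valid.
        self._data += chunk

    def peek(self, size):
        # Slicing a memoryview does not copy the underlying bytes.
        view = memoryview(self._data)
        return view[self._offset:self._offset + size]

    def consume(self, count):
        # Record that `count` bytes were written; usually no copy at all.
        self._offset += count
        if self._offset >= len(self._data):
            self._data = b""
            self._offset = 0
        elif (self._offset >= self.COMPACT_THRESHOLD
              and self._offset * 2 >= len(self._data)):
            # Heuristic: the dead prefix is large and at least half of the
            # buffer, so pay for one copy to release that memory.
            self._data = self._data[self._offset:]
            self._offset = 0

    def __len__(self):
        return len(self._data) - self._offset


buf = SendBuffer()
buf.append(b"x" * 10000)
first = buf.peek(10)   # a view, not a copy
buf.consume(100)       # small dead prefix: only the offset moves
buf.consume(5000)      # dead prefix now dominates: one compacting copy
```

Without the compaction step, the consumed bytes would stay alive for as long
as the buffer does, which is the kind of wasted space described in the quoted
text. The threshold trades that waste against the cost of the copy.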
Maybe somewhere down the line there can be a single type which magically
knows how to behave optimally for all programs, but I'm not asking for
that yet. ;)

>If views have to be explicitly
>requested that may not be a problem because the app author will
>(hopefully) understand the issues. But even if it was just a standard
>library module, I would worry that many inexperienced programmers
>would complicate their code by using the string views module without
>real benefits. Sort of the way some folks have knee-jerk habits to
>write
>
>  def foo(x, None=None):
>
>if they use None anywhere in the body of the function. This should be
>done only as a last resort when real-life measurements have shown that
>foo() is a performance show-stopper.
>

I don't think we see people overusing buffer() in ways which damage
readability now, and buffer is even a builtin. Tossing something off into
a module somewhere shouldn't really be a problem. To most people who don't
actually know what they're doing, the idea to optimize code by reducing
memory copying usually just doesn't come up.

Jean-Paul
_______________________________________________
Python-3000 mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe: http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com
