Sending again to the list (sorry boB)... On 29 May 2013 17:51, boB Stepp <robertvst...@gmail.com> wrote: >> I don't know exactly how str.join is implemented but it does not use >> this quadratic algorithm. For example if str.join would first compute >> the length of the resulting string first then it can allocate memory >> for exactly one string of that length and copy each substring to the >> appropriate place (actually I imagine it uses an exponentially >> resizing buffer but this isn't important). >> > > ...str.join gets around these issues?
As I said I don't know how this is implemented in CPython (I hoped Eryksun might chime in there :) ). > In the linked article it was > discussing increasing memory allocation by powers of two instead of > trying to determine the exact length of the strings involved, > mentioning that the maximum wasted memory would be 50% of what was > actually needed. Is Python more clever in its implementation? Actually the maximum memory wastage is 100% of what is needed or 50% of what is actually used. This is if the amount needed is one greater than a power of two and you end up doubling to the next power of two. I don't see how CPython could be much cleverer in its implementation. There aren't that many reasonable strategies here (when implementing strings as linear arrays like CPython does). >> * CPython actually has an optimisation that can append strings in >> precisely this situation. However it is an implementation detail of >> CPython that may change and it does not work in other interpreters >> e.g. Jython. Using this kind of code can damage portability since your >> program may run fine in CPython but fail in other interpreters. >> > You are speaking of "appending" and not "concatenation" here? In this case I was just talking about single characters so you could think of it as either. However, yes the optimisation is for concatenation and in particular the '+' and '+=' operators. > I had not even considered other Python interpreters than CPython. More > complexity to consider for the future... It's only a little bit of complexity. Just bear in mind the distinction between a "language feature" that is true in any conforming implementation and an "implementation detail" that happens to be true in some or other interpreter but is not a specified part of the language, In practise this means not really thinking too hard about how CPython implements things and just using the recommended idioms e.g. str.join. I don't know if it is documented anywhere that str.join is linear rather than quadratic but I consider that to be a language feature. Exactly how it achieves linear behaviour (precomputing, resizing, etc.) is an implementation detail. If your code relies only on language features then it should not have problems when changing interpreters. Oscar _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor