On 8/29/06, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> "Guido van Rossum" <[EMAIL PROTECTED]> wrote:
> > For operations that may be forced to return a new string (e.g.
> > concatenation) I think the return value should always be a new string,
> > even if it could be optimized. So for example if v is a view and s is
> > a string, v+s should always return a new string, even if s is empty.
>
> I'm on the fence about this. On the one hand, I understand the
> desirability of being able to get the underlying string object without
> difficulty. On the other hand, its performance characteristics could be
> confusing to users of Python who may have come to expect that "st+''" is
> a constant time operation, regardless of the length of st.

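To make the rule quoted above concrete, here is a minimal sketch of a string view whose concatenation always produces a new str, even when the other operand is empty. The class name StrView and all of its methods are hypothetical, written only for illustration; they are not taken from any implementation discussed in this thread.

```python
class StrView:
    """Hypothetical read-only view onto part of a str (illustration only)."""

    def __init__(self, base, start=0, stop=None):
        if stop is None:
            stop = len(base)
        self._base = base
        self._start = start
        self._stop = stop

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing yields another view on the same base string.
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("this sketch supports only step 1")
            stop = max(start, stop)
            return StrView(self._base, self._start + start, self._start + stop)
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("view index out of range")
        return self._base[self._start + index]

    def __str__(self):
        # The explicit way to turn a view into a string.
        return self._base[self._start:self._stop]

    def __add__(self, other):
        # Concatenation is an operation that may be forced to build a new
        # string, so it always returns a str, never a view.
        if isinstance(other, StrView):
            other = str(other)
        if not isinstance(other, str):
            return NotImplemented
        return str(self) + other

    def __radd__(self, other):
        if not isinstance(other, str):
            return NotImplemented
        return other + str(self)


v = StrView("hello world", 0, 5)
joined = v + ""        # a str, even though the right operand is empty
sub = v[1:3]           # still a view
```

The type of the result depends only on the operation (slicing gives a view, concatenation gives a str), never on the operand values, which is the predictability argument made in the reply below.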
Well, views aren't strings. And s+t (for s and t strings) normally
takes O(len(s)+len(t)) time. Type consistency and predictability are
more important to me. I didn't mean to recommend v+"" as the best way
to turn a view v into a string; that would be str(v).

> The non-null string addition case, I agree that it could make some sense
> to return the string (considering you will need to copy it anyways), but
> if one returned a view on that string, it would be more consistent with
> other methods, and getting the string back via str(view) would offer
> equivalent functionality. It would also require the user to be explicit
> about what they really want; though there is the argument that if I'm
> passing a string as an operand to addition with a view, I actually want
> a string, so give me one.

I strongly believe you're mistaken here. I don't think users will have
any trouble with the concept "operations that don't (necessarily)
return a substring will return a new string".

> I'm going to implement it as returning a view, but leave commented
> sections for some of them to return a string.
>
> > BTW beware that in py3k, strings (which will always be unicode
> > strings) won't support the buffer API -- bytes objects will. Would you
> > want views on strings or on bytes or on both?
>
> That's tricky. Views on bytes will come for free, like array, mmap, and
> anything else that supports the buffer protocol. It requires the removal
> of the __hash__ method for mutables, but that is certainly expected.

The question is, how useful is the buffer protocol going to be? We
don't know yet.

> Right now, a large portion of standard library code uses strings and
> string methods to handle parsing, etc. Removing immutable byte strings
> from 3.x seems likely to result in a huge amount of rewriting necessary
> to utilize either bytes or text (something I have mentioned before).
> I believe that with views on bytes (and/or sufficient bytes methods), the
> vast majority would likely result in the use of bytes.

Um, unless you consider decoding a GIF file "parsing", parsing would
seem to naturally fall in the realm of text (characters), not bytes.

> Having a text view for such situations that works with the same kinds of
> semantics as the bytes view would be nice from a purity/convenience
> standpoint, and only needing to handle a single data type (text) could
> make its implementation easier. I don't have any short-term plans of
> writing text views, but it may be somewhat easier to do after I'm done
> with string/byte views.

Unifying the semantics between byte views and text views will be
difficult since bytes are mutable. I recommend that you have a good
look at the bytes implementation in the p3yk branch.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
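The points in the exchange above about views on bytes, mutability, and __hash__ can be tried out in a current Python 3, with the caveat that the names used here postdate this message: it is an assumption of this sketch that today's bytearray plays the role of the mutable bytes type under discussion, and that the built-in memoryview stands in for a view obtained through the buffer protocol. The helper is_hashable is hypothetical.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (hypothetical helper)."""
    try:
        hash(obj)
    except (TypeError, ValueError):
        return False
    return True


frozen = b"GIF89a header"      # immutable byte string
mutable = bytearray(frozen)    # mutable copy of the same bytes

frozen_view = memoryview(frozen)
mutable_view = memoryview(mutable)

# Slicing a memoryview does not copy the underlying bytes.
header = mutable_view[:6]

# A change to the underlying buffer is visible through the view,
# which is why a view on mutable data cannot offer a stable hash.
mutable[0] = ord("g")
seen_through_view = bytes(header)
```

In this sketch the view over the immutable buffer is hashable and the view over the mutable one is not, so the two kinds of view already differ in semantics before any text view enters the picture.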
