Paul Sokolovsky wrote:
> Roughly speaking, the answer would be about the same in idea as answers
> to the following questions:
> [snip]

I would say the difference between this proposal so far and the ones listed
is that the latter emphasized concrete, real-world examples from existing
code, either in the stdlib or "out in the wild", showing clear before and
after benefits of the proposed syntax. It may not seem necessary to the
person proposing the feature, and it does take some time to research, but
it creates a drastically stronger argument for the new feature. The code
examples I've seen so far in the proposal have been mostly abstract or
simple toy examples. To get a general idea, I'd recommend looking over the
examples in their respective PEPs, and then trying to do something similar
in your own arguments.

> The scholasm of "there's only one way to do it" is getting old for this
> language. Have you already finished explaining everyone why we needed
> assignment expressions, and why Python originally had % as a formatting
> operator, and some people swear to keep not needing anything else?

While I agree that it's sometimes okay to go outside the strict bounds of
"only one way to do it", there needs to be adequate justification for doing
so which provides a demonstrable benefit in real-world code. So the default
should be just having one way, unless we have a very strong reason to
consider adding an alternative. This was the case for the features you
mentioned above.

> Please let people learn
> computer science inside Python, not learn bag of tricks to then escape
> in awe and make up haikus along the lines of:
>
> A language, originally for kids,
> Now for grown-up noobs.

Considering the current widespread usage of Python in the software
development industry and others, characterizing it as a language for
"grown-up noobs" seems rather disingenuous (even if partially in jest). We
emphasize readability and beginner-friendliness, but Python is very far
from beginner-only and I don't think it's even reasonable to say that it's
going in that direction. In some ways, it simplifies operations that would
otherwise be more complicated, but that's largely the point of a high-level
language: abstracting the complex and low-level parts to focus more on the
core business logic.

Also, while I can see that blindly relying on "str += part" can sidestep
the underlying computer science to some degree, I find that appending the
parts to a list and joining the elements is conceptually very similar to
using a string buffer/builder, even if the syntax differs significantly
from how other languages do it.
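
To make that comparison concrete, here is a minimal sketch of the three
approaches side by side. The function names are made up for illustration.
All three produce the same string; they differ mainly in where the
intermediate pieces live while the result is being built.

```python
import io


def build_with_concat(parts):
    # Repeated concatenation: each += may copy the accumulated string.
    # Whether an implementation optimises this is an implementation detail.
    result = ""
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    # Collect the pieces in a list, then join them once at the end.
    pieces = []
    for part in parts:
        pieces.append(part)
    return "".join(pieces)


def build_with_stringio(parts):
    # Write the pieces into an in-memory text buffer, then read it back.
    buf = io.StringIO()
    for part in parts:
        buf.write(part)
    return buf.getvalue()


parts = ["spam", " ", "eggs"]
assert build_with_concat(parts) == build_with_join(parts) == build_with_stringio(parts)
```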

Regarding the proposal in general though, I actually like the main idea of
having "StringBuffer/StringBuilder"-like behavior, *assuming* it provides
substantial benefits to alternative Python implementations compared to
``"".join()``. As someone who regularly uses other languages with something
similar, I find the syntax to be appealing, but not strong enough on its
own to justify a stdlib version (mainly since a wrapper would be very
trivial to implement).

But, I'm against the idea of adding this to the existing StringIO class,
largely for the reasons cited above, of it being outside of the scope of
its intended use case. There's also a significant discoverability factor to
consider. Based on the name and its use case in existing versions of
Python, I don't think a substantial number of users will even consider
using it for the purpose of building strings. As it stands, the only people
who could end up benefiting from it would be the alternative
implementations and their users, assuming they spend time *actively
searching* for a way to build strings with reduced memory usage. So I would
greatly prefer to see it as a separate class with a more informative name,
even if it ends up being effectively implemented as a subset of StringIO
with much of the same logic.

For example:

def build(parts):
    buf = StringBuilder()  # feel free to bikeshed over the name
    for part in parts:
        # __iadd__ would presumably call something like buf.append() or
        # buf.write()
        buf += part
    return str(buf)

This would be highly similar to existing string building classes in other
popular languages, such as Java and C#.
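
As a rough illustration of how thin such a wrapper could be, here is one
possible sketch built on top of io.StringIO. The class name, the append()
method and the build() helper are all hypothetical; nothing like this exists
in the stdlib today, and a real implementation could make different choices.

```python
import io


class StringBuilder:
    """Hypothetical string builder; a sketch, not an existing stdlib class."""

    def __init__(self, initial=""):
        self._buf = io.StringIO()
        if initial:
            # Write rather than pass to StringIO(), so the position ends up
            # after the initial text and later writes append to it.
            self._buf.write(initial)

    def append(self, part):
        self._buf.write(part)
        return self

    def __iadd__(self, part):
        # "buf += part" lands here; returning self keeps the name bound
        # to the builder instead of rebinding it to a new object.
        if not isinstance(part, str):
            return NotImplemented
        self._buf.write(part)
        return self

    def __str__(self):
        return self._buf.getvalue()


def build(parts):
    buf = StringBuilder()
    for part in parts:
        buf += part
    return str(buf)
```

Returning NotImplemented for non-str operands means "buf += 1" raises
TypeError, which mirrors what happens with a plain str.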

Also, on the point of memory usage: I'd very much like to see some real
side-by-side comparisons of the ``''.join(parts)`` memory usage across
Python implementations compared to ``StringIO.write()``. I saw some earlier in
the thread, but the results were inaccurate since they relied entirely on
``sys.getsizeof()``, as mentioned earlier. IMO, having accurate memory
benchmarks is critical to this proposal. As Chris Angelico mentioned, this
can be observed through monitoring the before and after RSS (or equivalent
on platforms without it). On Linux, I typically use something like this:

```
import os

def show_rss():
    # Print this process's resident set size (Linux only, reads /proc).
    os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
```
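
Building on the same idea, a before/after comparison could be sketched
roughly as below. This is only an assumption about how one might structure
the measurement: read_rss_kib(), rss_around() and the two build functions are
hypothetical names, and the helper returns None where /proc is unavailable.
Reading RSS after the build can miss the peak, and results depend on the
allocator, so each variant should really be run in a fresh process.

```python
import io
import os


def read_rss_kib():
    """Return this process's VmRSS in KiB, or None if /proc is unavailable."""
    try:
        with open(f"/proc/{os.getpid()}/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    # The line looks like "VmRSS:     12345 kB".
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def build_join(parts):
    return "".join(parts)


def build_stringio(parts):
    buf = io.StringIO()
    for part in parts:
        buf.write(part)
    return buf.getvalue()


def rss_around(build, n_parts=100_000):
    """Return (rss_before, rss_after, result_length) for one build call."""
    before = read_rss_kib()
    result = build(str(i) for i in range(n_parts))
    after = read_rss_kib()
    return before, after, len(result)


if __name__ == "__main__":
    for build in (build_join, build_stringio):
        print(build.__name__, rss_around(build))
```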

With the above in mind, I'm currently +0 on the proposal. It seems like it
might be a reasonable overall idea, but the arguments of its benefits need
to be much more concrete before I'm convinced.

On Wed, Apr 1, 2020 at 5:45 PM Paul Sokolovsky <pmis...@gmail.com> wrote:

> Hello,
>
> On Wed, 1 Apr 2020 10:01:06 +0100
> Paul Moore <p.f.mo...@gmail.com> wrote:
>
> > On Wed, 1 Apr 2020 at 02:07, Steven D'Aprano <st...@pearwood.info>
> > wrote:
> > > Paul has not suggested making StringIO look and feel like a string.
> > > Nobody is going to add 45+ string methods to StringIO. This is a
> > > minimal extension to the StringIO class which will allow people to
> > > improve their string building code with a minimal change.
> >
> > Thanks for paring the proposal down to its bare bones, there's a lot
> > of side questions being discussed here that are confusing things for
> > me.
> >
> > With this in mind, and looking at the bare proposal, my immediate
> > thought is who's going to use this new approach:
> >
> []
>
> >
> > I hope this isn't going to trigger another digression, but it seems to
> > me that the answer is "nobody, unless they are taught about it, or
> > work it out for themselves[1]".
>
> Roughly speaking, the answer would be about the same in idea as answers
> to the following questions:
>
> * Who'd be using assignment expressions? (2nd way to do assignment,
>   whoa!)
> * Who'd be using f-strings? (3rd (or more) way to do string formatting,
>   bhoa!)
> * Who'd be writing s = s.removeprefix("foo") instead of
>   "if s.startswith("foo"): s = s[3:]" (PEP616)?
> * Who'd be using binary operator @ ?
> * Who'd be using unary operator + ?
>
>
> > My reasons for saying this are that it
> > adds no value over the current idiom of building a list then using
> > join(), so people who already write efficient code won't need to
> > change. The people who *might* change to this are people currently
> > writing
> >
> >     buf = ''
> >     # repeated many times
> >     buf += 'substring'
> >
> > Those people have presumably not yet learned about the (language
> > independent) performance implication of repeated concatenation of
> > immutable strings[2].
>
> Ok, so we found the answers to all those questions - people who might
> have a need to use, would use it. You definitely may argue of how many
> people (in absolute and relative figures) would use it. Let the binary
> operator @ and unary operator + be your aides in this task.
>
>
> > At the moment, the
> > message is relatively clear - "build a list and join it" (it's very
> > rare that anyone suggests StringIO currently).
>
> I don't know how much you mix with other Pythonistas, but word "clear"
> is an exaggeration. From those who don't like it, the usual word is
> "ugly", though I've seen more vivid epithets, like "repulsive":
> https://mail.python.org/pipermail/python-list/2006-January/403480.html
>
> More cool-headed guys like me just call it "complete wastage of memory".
>
> > This proposal is
> > presumably intended to make "use StringIO and +=" a more attractive
> > alternative proposal (because it avoids the need to
> > rewrite all those += lines).
>
> Aye.
>
> > So we now find ourselves in the position
> > of having *two* "recommended approaches" to addressing the performance
> > issue with string concatenation.
>
> The scholasm of "there's only one way to do it" is getting old for this
> language. Have you already finished explaining everyone why we needed
> assignment expressions, and why Python originally had % as a formatting
> operator, and some people swear to keep not needing anything else?
>
> What's worse, is that "there's only one way to do it" gets routinely
> misinterpreted as "One True Way (tm)". And where Python is deficient to
> other languages, there's rising small-scale exceptionalism along the
> lines "we don't have it, and - we don't need it!". The issue is that
> some (many) Python programmers use a lot of different languages, and
> treat Python first of all as a generic programming language, not as a
> bag of tricks of a particular implementation. And of course, there
> never will be agreement between the one-true-way-tm vs
> nice-generic-languages factions of the community.
>
> > I'd contend that there's a benefit in having a single well-known idiom
> > for fixing this issue when beginners hit it. Clarity of teaching, and
> > less confusion for people who are learning that they need to address
> > an issue that they weren't previously aware of.
>
> Another acute and beaten topic in the community. Python is a melting pot
> for diverse masses - beginners, greybeards, data scientists, scripting
> kiddies, PhD, web programmers, etc. That's one of the greatest
> achievements of Python, but also one of the pain points. I wonder how
> many people escaped from Python to just not be haunted by that
> "beginners" chanting.
>
> Python is beginners-friendly language, period, can't change that.
> Please don't bend it to be beginner-only. Please let people learn
> computer science inside Python, not learn bag of tricks to then escape
> in awe and make up haikus along the lines of:
>
> A language, originally for kids,
> Now for grown-up noobs.
>
> (Actual haiku seen on Reddit, sorry, can't find a link now, reproduced
> from memory, the original might have sounded better).
>
> []
>
> --
> Best regards,
>  Paul                          mailto:pmis...@gmail.com
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/P6S5K6X4ZBEBLRVPNOUNFUYW6WNSQUNS/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/22BGCY27READCPVNVE4WETG6EZ4OGZO5/
Code of Conduct: http://python.org/psf/codeofconduct/
