[Python-ideas] Re: Python 4000: Have stringlike objects provide sequence views rather than being sequences

Christopher Barker Wed, 23 Oct 2019 18:27:29 -0700

There's a reason I've never actually proposed adding a char ....

On Wed, Oct 23, 2019 at 5:34 PM Andrew Barnert <abarn...@yahoo.com> wrote:


> Well, just adding a char type (and presumably a way of defining char
> literals) wouldn’t be too disruptive.

sure.

> But changing str to iterate chars instead of strs, that probably would be.

And that would be the whole point -- a char type by itself isn't very
useful. In some sense, the only difference between a char and a str would
be that a char isn't iterable -- but the benefit would be that a string is
an iterable (and sequence) of chars, rather than an (infinitely recursive)
iterable of strings.
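
To make the "infinitely recursive" point concrete, here is a minimal sketch
in current Python 3. The flatten() helper is a hypothetical example written
for this illustration, not a standard library function; it stands in for
any code that recurses into "anything iterable".

```python
def flatten(obj):
    """Naively flatten nested iterables into a flat list of leaves."""
    try:
        items = iter(obj)
    except TypeError:
        return [obj]  # not iterable, so treat it as a leaf
    result = []
    for item in items:
        result.extend(flatten(item))
    return result


s = "abc"
first = s[0]
# Indexing a one-character str yields an equal one-character str, forever.
assert first == first[0] == first[0][0]
assert isinstance(first, str)

# Nested lists of ints bottom out, because an int is not iterable.
assert flatten([1, [2, [3]]]) == [1, 2, 3]

# A str never bottoms out: "a" iterates to "a", which iterates to "a", ...
try:
    flatten(["ab"])
except RecursionError:
    print("RecursionError: a str element is itself an iterable of str")
```

With a separate, non-iterable char type, the recursion would stop at the
characters the same way it stops at the ints.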

> Also, you’d have to go through a lot of functions and decide what types
> they should take.

sure would -- a lot of thought to see how disruptive it would be ...

> For example, does str.join still accept a string instead of an iterable
> of strings? Does it accept other iterables of char too?

if it accepted an iterable of either char or str, then I *think* there
would be little disruption.
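
For reference, a short sketch of what str.join accepts in current Python 3.
It shows only today's behavior; how a char type would fit in is exactly the
open question above.

```python
# A str is itself an iterable of one-character strs, so this already works.
joined_from_str = "-".join("abc")

# Any iterable of str works the same way: list, tuple, generator, ...
joined_from_list = "-".join(["a", "b", "c"])
joined_from_tuple = "-".join(("ab", "c"))

assert joined_from_str == "a-b-c"
assert joined_from_list == "a-b-c"
assert joined_from_tuple == "ab-c"

# Elements that are not str are rejected.
try:
    "-".join([1, 2])
except TypeError as exc:
    join_error = exc
    print("TypeError:", exc)
```

If join accepted iterables of either char or str, the first call would keep
working unchanged, since iterating "abc" would simply yield chars.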

> Can you pass a char to str.__contains__

yes, that's a no-brainer: the whole point is that a string would be a
sequence of chars.

> or str.endswith?

I would think so -- a char would behave like a length-one string as much as
possible.

> What about a tuple of chars?

that's an odd one -- but I'm not sure I see the point: if you have a tuple
of chars, you could "".join() them if you want a string, in any context.

> Or should we take the backward-compat breaking opportunity to eliminate
> the “str or tuple of str” thing and instead use *args, or at least change
> it to “str or iterable of str (which no longer includes str itself)”?

Is this for .endswith() and friends? If so, there was discussion a while
back about that -- but this is probably not the time to introduce even more
backward incompatible changes.
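
For reference, this is how "in" and .endswith() treat single characters and
the "str or tuple of str" argument in current Python 3. The filename is
made up for the example.

```python
name = "report.txt"

# Membership works for a single character and for a longer substring.
assert "t" in name
assert "txt" in name

# endswith accepts a single str (including a one-character one) ...
assert name.endswith("t")
# ... or a tuple of str, which matches if any candidate suffix matches.
assert name.endswith((".txt", ".csv"))
assert name.endswith(("t", "v"))

# Other iterables are rejected, which is the "str or tuple of str" rule.
try:
    name.endswith([".txt", ".csv"])
except TypeError as exc:
    endswith_error = exc
    print("TypeError:", exc)
```

The tuple restriction exists partly because a str is itself an iterable of
str, so "any iterable of suffixes" would be ambiguous when a plain string
is passed.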

And I'm not sure how much string functionality a char should have --
probably next to none, as the point is that it would be easy to distinguish
from a string that happened to have one character.

> Surely you’d want to be able to do things like isdigit or swapcase. Even
> C has functions to do most of that kind of stuff on chars.

probably -- it would be least disruptive for a char to act as much as
possible like a length-one string -- so maybe iterability and
indexability would be the only differences.
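
A rough sketch of that idea, assuming a char is modelled as a str subclass
that refuses iteration and indexing. The Char class is purely hypothetical:
nothing like it exists in Python, and a real design would need to settle
the many questions raised in this thread.

```python
class Char(str):
    """Hypothetical char: a length-one str that is not iterable or indexable."""

    def __new__(cls, value):
        if len(value) != 1:
            raise ValueError("Char must be exactly one character")
        return super().__new__(cls, value)

    def __iter__(self):
        raise TypeError("'Char' object is not iterable")

    def __getitem__(self, index):
        raise TypeError("'Char' object is not subscriptable")


c = Char("7")

# Everything inherited from str still works like a length-one string.
assert c == "7"
assert c.isdigit()
assert c in "a7"
assert "a7".endswith(c)
assert "".join([Char("a"), Char("b")]) == "ab"

# But the recursion stops here: a Char cannot be looped over or indexed.
for operation in (lambda: iter(c), lambda: c[0]):
    try:
        operation()
    except TypeError as exc:
        print("TypeError:", exc)
```

Note that in this sketch methods such as swapcase() return a plain str
rather than a Char, which is one of the details a real proposal would have
to decide.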

> But I think that, other than join and maybe encode and translate,

not sure why encode or translate should be an issue off the top of my head
-- it would surely be a unicode char :-)

> there’s an obvious right answer for every str method and operator, so
> this isn’t too much of a problem.

well, we'd have to go through all of them, and do a lot of thinking...

I think the greater confusion is: where else can you use a char instead of
a string? Accepting one as a filename, for instance, would make the char
type pointless for at least the cases I commonly deal with (a list of
filenames).

I can only imagine how many "things" take a string where a char would make
sense, but then it gets harder to distinguish them all.
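
The list-of-filenames case looks like this in current Python 3. The
function names and filenames are made up for the example; label_files()
stands in for any function that expects an iterable of filename strings.

```python
def label_files(filenames):
    """Return one label per filename; expects an iterable of filename strs."""
    return [f"file:{name}" for name in filenames]


# Intended use: a list of filenames.
assert label_files(["a.txt", "b.txt"]) == ["file:a.txt", "file:b.txt"]

# Common mistake: a single filename is silently treated as five filenames,
# one per character, because a str is an iterable of str.
assert label_files("a.txt") == ["file:a", "file:.", "file:t", "file:x", "file:t"]


def label_files_checked(filenames):
    """Same as label_files, but wraps a lone str in a list first."""
    if isinstance(filenames, str):
        filenames = [filenames]
    return label_files(filenames)


assert label_files_checked("a.txt") == ["file:a.txt"]
```

The isinstance() check is today's usual workaround. A char type only helps
here if a char is not accepted where a filename is expected, so that the
mistaken call fails loudly instead of producing one label per character.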

> Speaking of operators, should char+int and char-int and char-char be
> legal? (What about char%int? A thousand students doing the rot13 assignment
> would rejoice, but allowing % without * and // is kind of weird, and
> allowing * and // even weirder, as well as potentially confusing with
> str*int being legal but meaning something very different.)

I would say no -- in C a char IS an unsigned 8-bit int, but that's C -- in
Python a char and a number are very different things.

ord() and chr() would work, of course.
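
As a sketch of how the rot13 assignment looks with only ord() and chr(),
and no arithmetic on characters themselves. The helper names are made up
for this example, and only ASCII letters are rotated.

```python
def rot13_char(ch):
    """Rotate one ASCII letter by 13 places; return anything else unchanged."""
    if "a" <= ch <= "z":
        base = ord("a")
    elif "A" <= ch <= "Z":
        base = ord("A")
    else:
        return ch
    # Convert to a number explicitly, do the arithmetic, convert back.
    return chr((ord(ch) - base + 13) % 26 + base)


def rot13(text):
    """Apply rot13 to every character of text."""
    return "".join(rot13_char(ch) for ch in text)


assert rot13("Hello") == "Uryyb"
assert rot13(rot13("Hello, world!")) == "Hello, world!"

# Mixing a character and a number directly is already an error today.
try:
    "a" + 1
except TypeError as exc:
    print("TypeError:", exc)
```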

>> By the way, the bytes and bytearray types already do this -- index into
>> or loop through a bytes object, you get an int.

> Sure, but b'abc'.find(66) is -1, and b'abc'.replace(66, 70) is a TypeError,
> and so on.

I wonder if they need to be -- would we need a "byte" type, or would it be
OK to accept an int in all those sorts of places?
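
For reference, a short sketch of how bytes treats ints in current Python 3,
using the same b'abc' value as above. Some methods accept an int and others
do not, which is the inconsistency under discussion.

```python
data = b"abc"

# Indexing and iteration yield ints; slicing yields bytes.
assert data[0] == 97
assert list(data) == [97, 98, 99]
assert data[0:1] == b"a"

# Membership and find() accept an int in range(256) ...
assert 97 in data
assert data.find(97) == 0
assert data.find(66) == -1  # 66 is ord("B"), which is not in b"abc"

# ... but replace() insists on bytes-like arguments.
try:
    data.replace(66, 70)
except TypeError as exc:
    replace_error = exc
    print("TypeError:", exc)

assert data.replace(b"a", b"A") == b"Abc"
```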

> Fixing those inconsistencies is what I meant by “go all the way to making
> them sequences of ints”. But it might be friendlier to undo the changes and
> instead add a byte type like the char type for bytes to be a sequence of.
> I’m not sure which is better.

me neither.

> But anyway, I think all of these questions are questions for a new
> language. If making str not iterate str was too big a change even for 3.0,
> how could it be reasonable for any future version?

Well, I don't know that it was seriously considered -- with the Unicode
changes, that WOULD have been the time to do it!

Again though, it seems like it would be pretty disruptive, so a
non-starter, but maybe not?

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython

