On Tue, Mar 3, 2020 at 10:13 AM Steven D'Aprano <st...@pearwood.info> wrote:
>
> On Sun, Feb 23, 2020 at 01:46:53PM -0500, Richard Damon wrote:
>
> > I would agree with this. In my mind, fundamentally a 'string' is a
> > sequence of characters, not strings,
>
> If people are going to seriously propose this Character type, I think
> they need to be more concrete about the proposal and not just hand-wave
> it as "strings are sequences of characters".
>
> Presumably you would want `mystring[0]` to return a char, not a str, but
> there are plenty of other unspecified details.
>
> - Should `mystring[0:1]` return a char or a length 1 str?

I'm not seriously proposing it, and I am in fact against the proposal
quite strongly, but ISTM the only sane way to do things is to mirror
the Py3 bytes object. Just as mybytes[0] returns an int, not a bytes,
this should return a char. And that can then be the pattern for
anything else that's similar.
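
For reference, this is how the bytes precedent behaves in current Python 3,
next to what str does today. The comments about a mirrored str are an
assumption about the hypothetical design, not existing behaviour.

```python
# Current Python 3: indexing bytes gives an int, slicing gives bytes.
mybytes = b"spam"
byte_item = mybytes[0]      # an int (115), not a length-1 bytes
byte_slice = mybytes[0:1]   # a length-1 bytes object, b"s"

# Current str behaviour: both indexing and slicing give a str.
mystring = "spam"
str_item = mystring[0]      # length-1 str today; a char if str mirrored bytes
str_slice = mystring[0:1]   # length-1 str today; presumably still a str
```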

> - Presumably "Z" remains a length-1 str for backward compatibility,
>   so how do you create a char directly?

There would probably need to be an alternative literal form. In C, "Z"
is a string, and 'Z' is a char; in Python, a more logical way to do it
would probably be a prefix like c"Z" - or perhaps just "Z"[0] and have
done with it.

> - Does `chr(n)` continue to return a str?

Logically it should return a char, and in fact chr would probably want
to be the type itself, just as str/int/float etc. are.

> - Is the char type a subclass of str?

That way lies madness. I suggest not.

> - Do we support mixed concatenation between str and char?

For the sake of backward compatibility, probably yes. But that's a
weak opinion and could easily be swayed.

> - If so, does concatenating the empty string to a char give a char
>   or a length-1 string?

A length 1 string (or, per above, TypeError).
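
A rough sketch of the "not a subclass, but mixed concatenation allowed"
option, purely for illustration. The class name Char and everything about
it is hypothetical; no such type exists in Python.

```python
class Char:
    """Hypothetical single-code-point type; deliberately NOT a str subclass."""

    def __init__(self, codepoint):
        if not 0 <= codepoint <= 0x10FFFF:
            raise ValueError("code point out of range")
        self.codepoint = codepoint

    def __str__(self):
        return chr(self.codepoint)

    def __add__(self, other):
        # char + str -> str (assumed rule for mixed concatenation)
        if isinstance(other, str):
            return str(self) + other
        return NotImplemented

    def __radd__(self, other):
        # str + char -> str, including "" + char
        if isinstance(other, str):
            return other + str(self)
        return NotImplemented


z = Char(ord("Z"))
joined = "" + z        # a length-1 str, not a Char
suffixed = z + "ed"    # an ordinary str
```

Under the stricter option, __add__ and __radd__ would simply be left out,
and "" + z would raise TypeError.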

> - Are chars indexable?
>
> - Do they support len()?

No. A character is a single entity, just as an integer is. (NOTE: This
discussion has been talking about "characters", but I think logically
they have to be single Unicode codepoints. Thus the "length" of a
character is not a meaningful quantity.)
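
To show why "character" has to mean "code point" here, a small example
with today's str: the same visible character can be one code point or two,
and indexing only ever sees code points.

```python
import unicodedata

decomposed = "e\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)   # single code point U+00E9

decomposed_length = len(decomposed)   # counts code points, not visible characters
composed_length = len(composed)
first_item = decomposed[0]            # just the base letter, without the accent
```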

> If char is not a subclass of string, that's going to break code that
> expects `all(isinstance(c, str) for c in obj)` to be true when
> `obj` happens to be a string.

Backward compatibility WOULD be broken by this proposal (which is part
of why I'm so against it). This is one of those eggs that has to be
broken to make this omelette.
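
For concreteness, this is the invariant that holds today and that existing
code can rely on. Whether something like str.join would accept chars under
the proposal is unspecified, so the comments only describe current
behaviour.

```python
obj = "spam"

# Today, every item yielded by iterating a str is itself a str.
items_are_str = all(isinstance(c, str) for c in obj)

# Code like this depends on that: str.join requires str items.
rebuilt = "".join(c.upper() for c in obj)

# Indexing a length-1 str gives back an equal str, however often it is repeated.
nested = obj[0][0][0]
```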

> If char is a subclass, that means we can no longer deny that strings are
> sequences of strings, since chars are strings. It also means that it
> will break code that expects strings to be iterable,

And that's why I say this way lies madness.

> I don't have a good intuition for how much code will break or simply
> stop working correctly if we changed string iteration to yield a new
> char type instead of length-1 strings.
>
> Nor do I have a good intuition for whether this will *actually* help
> much code. It seems to me that there's a good chance that this could end
> up simply shifting isinstance tests for str in some contexts to
> isinstance tests for char in different contexts.

Agreed.

ChrisA
