> From: Mikhail V
> Sent: Wednesday, October 12, 2016 9:57 PM
> Subject: Re: [Python-ideas] Proposal for default character representation
Hello, and welcome to Python-ideas, where only a small portion of ideas go
further, and where most newcomers who wish to improve the language get hit
by the reality bat! I hope you enjoy your stay :)
> On 13 October 2016 at 01:50, Chris Angelico <ros...@gmail.com> wrote:
> > On Thu, Oct 13, 2016 at 10:09 AM, Mikhail V <mikhail...@gmail.com>
> > Way WAY less readable, and I'm comfortable working in both hex and
> Please don't mix the readability and personal habit, which previous
> repliers seems to do as well. Those two things has nothing
> to do with each other. If you are comfortable with old roman numbering
> system this does not make it readable.
> And I am NOT comfortable with hex, as well as most people would
> be glad to use single notation.
> But some of them think that they are cool because they know several
> numbering notations ;) But I bet few can actually understand which is more
I'll turn your argument around: Not being comfortable with hex does not make
it unreadable; it's a matter of habit (as Brendan pointed out in his
> > You're the one who's non-standard here. Most of the world uses hex for
> > Unicode codepoints.
> No I am not the one, many people find it silly to use different notations
> for same thing - index of the element, and they are very right about that.
> I am not silly, I refuse to use it and luckily I can. Also I know that
> is more readable than hex so my choice is supported by the
> understanding and not simply refusing.
Unicode code points are written in hex notation virtually everywhere
I have ever seen them. Your Unicode-code-points-as-decimal website was a new
discovery for me (and, I presume, many others on this list). Since hex is so
widely used in the world, going against it effectively makes you
non-standard. That doesn't mean it's necessarily a bad thing, but it does
mean that your chances (or anyone's chances) of actually changing that are
equal to zero (and this isn't some gross exaggeration),
> >> PS:
> >> that is rather peculiar, three negative replies already but with no
> >> arguments why it would be bad to stick to decimal only, only some
> >> "others do it so" and "tradition" arguments.
> > "Others do it so" is actually a very strong argument. If all the rest
> > of the world uses + to mean addition, and Python used + to mean
> > subtraction, it doesn't matter how logical that is, it is *wrong*.
> This actually supports my proposal perfectly, if everyone uses decimal
> why suddenly use hex for same thing - index of array. I don't see how
> your analogy contradicts with my proposal, it's rather supporting it.
I fail to see your point here. Where does "everyone uses decimal" come from?
Unless you stopped talking about representation in strings (which seems
likely, as you're talking about indexing?), code points are written in hex.
> But I do want that you could abstract yourself from your habit for a while
> and talk about what would be better for the future usage.
I'll be that guy and tell you that you need to step back from your own idea
for a while and consider your proposal and the current state of things. I'll
also take the opportunity to reiterate that there is virtually no chance of
changing this behaviour. This doesn't, however, prevent you or anyone from
talking about the topic, either for fun, or for finding other (related or
otherwise) areas of interest that you think might be worth investigating
further. A lot of threads actually branch off into different topics that come
up during discussion, and that are interesting enough to pursue on their own.
> > everyone has to do the conversion from that to 201C.
> Nobody need to do ANY conversions if use decimal,
> and as said everything is decimal: numbers, array indexes,
> ord() function returns decimal, you can imagine more examples
> so it is not only more readable but also more traditional.
You're mixing up more than one concept here:
- Integer literals; I assume this is what you meant, and you seem to forget
(or maybe you didn't know, in which case here's to learning something new!)
that 0xff is perfectly valid syntax, and stores exactly the same integer as
the decimal literal 255.
- Indexing, and that's completely irrelevant to the topic at hand (see also
the bullet point above).
- ord() which returns an integer (which can be interpreted in any base!),
and that's both an argument for and against this proposal; the "against"
side is actually that decimal notation has no defined boundary for when to
stop (and before you argue that it does, I'll point out that the
separations, e.g. grouping by the thousands, are culture-driven and not an
international standard). There's actually a precedent for this in Python 2
with the \x escape (need I remind anyone why Python 3 was created again? :),
but that's exactly a stone in the "don't do that" camp, instead of the other
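To make the bullets above concrete, here is a minimal Python 3 sketch. The
variable names (code_point, letters, and so on) are mine and purely
illustrative. The point it tries to show is that the base belongs to the
notation, not to the integer:

```python
# A hex literal and a decimal literal denote the same integer.
assert 0xff == 255

# ord() returns a plain int; a base only appears when we format it.
code_point = ord("\u201c")            # LEFT DOUBLE QUOTATION MARK
as_decimal = str(code_point)          # decimal spelling
as_hex = format(code_point, "04X")    # hex spelling, as in U+201C
print(as_decimal, as_hex)             # 8220 201C

# Indexing accepts any int, however the literal was spelled.
letters = "abcdefghijklmnopqrstuvwxyz"
assert letters[0x0a] == letters[10] == "k"
```

So nothing stops anyone from displaying ord() results in decimal today; the
question in this thread is only about the default notation in escapes and
repr().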
> > How many decimal digits would you use to denote a single character?
> for text, three decimal digits would be enough for me personally,
> and in long perspective when the world's alphabetical garbage will
> disappear, two digits would be ok.
You seem to have misunderstood the question - in "\u00123456", there is no
ambiguity that this is a string consisting of 5 characters; the first one is
'\u0012', the second one is '3', the third one is '4', the fourth one is
'5', and the last one is '6'. Now take "\d00123456" (using \d as a
hypothetical escape; regex gurus can go read #27364 ;): how many
characters does this string contain? It's decimal, so should the escape grab the
first 5 digits? Or 6 maybe? You tell me.
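Here is a rough sketch of that question in Python 3. The first half uses the
real \u escape; the second half simulates the \d escape, which does not
exist, with a helper I made up for this mail (decode_decimal_escape), so
treat it as an illustration only:

```python
s = "\u00123456"

# \u always consumes exactly four hex digits, so the parse is unambiguous.
chars = list(s)
print(len(s))   # 5
print(chars)    # ['\x12', '3', '4', '5', '6']


def decode_decimal_escape(digits, width):
    """Hypothetical \\d escape: read the first `width` digits as a decimal
    code point and keep the remaining digits as literal text."""
    return chr(int(digits[:width])) + digits[width:]


# The same run of digits gives a different string for each assumed width.
digits = "00123456"
candidates = {w: decode_decimal_escape(digits, w) for w in (5, 6, 7)}
for width, text in candidates.items():
    print(width, len(text), ascii(text))
```

With a width of 5 you get 4 characters, with 6 you get 3, and with 7 you get
2, so the width has to be fixed by the language for the literal to mean
anything.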
> > you have to pad everything to seven digits (\u0000034 for an ASCII
> > quote)?
> Depends on case, for input -
> some separator, or padding is also ok,
> I don't have problems with both. For printing obviously don't show
> leading zeros, but rather spaces.
No leading zeros? That means you don't have a fixed number of digits, and
your string is suddenly very ambiguous (also see my point above).
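To illustrate, here is a small sketch (Python 3) that lists every way a
variable-width decimal escape could be read. possible_readings is a
hypothetical helper of mine, not anything in the standard library, and it
assumes the escape may consume any non-empty prefix of the digits:

```python
import sys


def possible_readings(digits):
    """Every reading of a hypothetical variable-width decimal escape.

    The escape consumes some non-empty prefix of `digits` as a decimal code
    point; whatever is left over is treated as literal text.
    """
    readings = []
    for width in range(1, len(digits) + 1):
        code_point = int(digits[:width])
        if code_point > sys.maxunicode:
            break  # longer prefixes can only be larger
        readings.append(chr(code_point) + digits[width:])
    return readings


readings = possible_readings("34567")
for text in readings:
    print(len(text), ascii(text))
```

For "34567" that gives five different strings, one of which is an ASCII
quote followed by "567". Without padding or a separator, the parser has no
way to pick one.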
> But as said I find this Unicode only some temporary happening,
> it will go to history in some future and be
> used only to study extinct glyphs.
Unicode, a temporary happening? Well, strictly speaking, nobody can know
that, but I'd expect that it's going to, someday, be *the* common standard.
I'm under no illusions, though.
All in all, that's a pretty interesting idea. However, it has no chance of
happening, because a lot of code would break, Python would deviate from the
rest of the world, this wouldn't be backwards compatible (and another
backwards-incompatible major release isn't happening; the community still
hasn't fully caught up with the one 8 years ago), and it would be
unintuitive to anyone who's done computer programming before (or after, or
during, or anytime).
I do see some bits worth pursuing in your idea, though, and I encourage you
to keep going! As I said earlier, Python-ideas is a place where a lot of
ideas are born and die, and that shouldn't stop you from trying to
contribute. Python is 25 years old, and a bunch of stuff is there just for
backwards compatibility; these kinds of things can't be changed easily. The
older (older by contribution period, not actual age) contributors still
active don't try to fix what's not broken (to them). Newcomers, such as you,
are a breath of fresh air to the language, and what helps make it thrive
even more! By bringing new, uncommon ideas, you're challenging the status
quo and potentially changing it for the better. But keep in mind that, with no
clear consensus, the status quo always wins a stalemate.
I hope that makes sense!
Python-ideas mailing list
Code of Conduct: http://python.org/psf/codeofconduct/