On Sat, 9 Apr 2022 at 02:31, Christopher Barker <python...@gmail.com> wrote:
>
> On Fri, Apr 8, 2022 at 8:29 AM Chris Angelico <ros...@gmail.com> wrote:
>>
>> > > another that's using nautical miles, but *not both in the same
>> > > module*?
>
> Absolutely!
>
> There is talk about "the Application" as though that's one thing, but Python applications these days can be quite large collections of third-party packages -- each of which knows nothing about the others, and each of which may be using units in different ways.
>
> For example, I have an application that literally depends on four different JSON libraries -- each used by a different third-party package. Imagine if the configurable JSON encoding/decoding settings were global state -- that would be a disaster.
>
You're misunderstanding the difference between "application" and "library" here. Those are four separate libraries, and each one has a single purpose: encoding/decoding stuff. It is not the application. It is not the primary purpose of the process. If one of those JSON libraries were to change your process's working directory, you would be extremely surprised. We aren't bothered by the fact that os.chdir() is global; we just accept that it belongs to the application, not a library.

The Application *is* one thing. It calls on libraries, but there's only one thing that has command of this sort of thing.

General rule: a library is allowed to change things that belong to the application if, and only if, it is at the behest of the application. That's a matter of etiquette rather than a hard-and-fast rule, but we decry badly-behaved libraries for violating it, rather than blaming the feature for being global.

> Granted:
> * Python is dynamic and has a global module namespace, so packages CAN monkey-patch and make a mess of virtually anything.
> * "Well behaved" packages would not mess with the global configuration.
>
> But that doesn't mean that it wouldn't happen -- why encourage it? Why have a global registry and then tell people not to use it?

For precisely the same reason that we have so many other global registries. It is simplest and cleanest to maintain consistency rather than try to have per-module boundaries. When you have per-module features, refactoring becomes more of a hassle. I've fielded multiple questions from people who do "import sys" in one module, and then try to use "sys.argv" in another module, not realising that the namespace into which the name 'sys' was imported belonged only to that module. It's not too hard to explain, but it's a thing that has to be learned. The more things that are per-module, the more things you have to think about when you refactor.

It is a *good thing*, not a bad thing, that a large number of settings are completely global. We do not need per-module settings for everything, and it would be a nightmare to work with if we did.

> Having a global registry/context/whatever for something that is designed/expected to be configured is dangerous and essentially useless.

Only if it's expected to be configured with some granularity. And, as with decimal.localcontext(), it's perfectly possible to have scopes much smaller than modules. So my question to you, just as to D'Aprano, is: why should this be at the module scope, not global, and not narrower?

> I'm not sure if this is really a good analogy, but it reminds me of the issues with system locale settings:
>
> Back in the day, it seemed like a great idea to have one central place on a computer to set these nifty things that apply to that particular computer. But enter the internet, where the location of the computer the code is running on could be completely unrelated to where the user is and what the user wants to see, and it's a complete mess. Add to that different operating systems, etc.
>
> To this day, Python struggles with these issues -- if you use the default settings to open a text file, it may get virtually any encoding depending on what system the program is running on -- there is a PEP in progress to fix that, but it's been a long time!

What we now have is an even broader setting: the entire *planet* is being set into a default of UTF-8, one programming language at a time.
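(As a rough illustration -- assuming a CPython where UTF-8 mode is not enabled, and an imaginary notes.txt -- here's how the same open() call means different things on different machines, and how to opt out of the guessing:)

    import locale

    # Whatever the host system happens to be configured with -- cp1253,
    # ISO-8859-7, UTF-8, ... -- is what open() quietly uses for text files.
    print(locale.getpreferredencoding(False))

    # Relying on the system default: the same source line reads differently
    # on differently-configured machines.
    with open("notes.txt") as f:
        text = f.read()

    # Being explicit takes the machine out of the picture entirely.
    with open("notes.txt", encoding="utf-8") as f:
        text = f.read()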
We don't need that setting to be per-process any more, and we definitely never wanted it to be per-module or any other finer scope. The reason for having it centralized on the computer has always been that different applications could then agree on something. Let's say you set your computer to use ISO-8859-7 (or, if you're a Microsoft shop, you might use code page 1253 for the same purpose). You're telling every single application that you're planning to use Greek text, and that it should assume that eight-bit data is most likely to be in Greek. Since text files don't have inherent metadata identifying their encodings, it's not unreasonable to let the system decide it. Of course, that never worked all that well, so I'm not sorry to see more and more things go UTF-8 by default...

> Datetime handling has the same issues -- I think the C libs STILL use the system timezone settings. And an early version of the numpy datetime implementation did too -- really bad idea.
>
> In short: The context in which code is run should be in complete control of the person writing the code, not the person writing the "application".

Not sure what you mean there. Obviously any date/time with inherent timezone data should simply use that, but if a library is parsing something like "2022-04-09 02:46:17", should every single library have a way for you to tell it what timezone that is, or should it just use the system settings? I put it to you that this is something that belongs to the application, unless there's a VERY VERY VERY good reason for the library to override that. (In the case of timezone settings, that could mean having some sort of hidden metadata about that string, e.g. you're working with the Widgets Inc API and the library knows that Widgets Inc always sends their timestamps in the Europe/Elbonia timezone.)

And if you mean the interpretation of timezones themselves... that definitely does NOT belong in the library. I don't want to have to dig through every single dependency to see if it needs to have tzdata updated. One single global tzdata is absolutely fine, thank you very much. You may want to use one from PyPI or one from your operating system, and there are good reasons for both, but you definitely don't want every single library having its own copy. (It's big, anyhow.)

> Again: a practical use case with units:
>
> I maintain a primitive unit conversion lib -- in that lib, I have a "registry" of units and names and synonyms, etc. That registry is loaded at module import, and at that time it checks for conflicts, etc. Being Python, the registry could be altered at run time, but that is not exposed as part of the public API, and it's not a recommended or standard practice. And this lets me make all sorts of arbitrary decisions about what "mile" and "oz" and all that means, and it's not going to get broken by someone else who prefers different uses -- at least if they use the public API.

Cool. The global repository that I suggest would be completely independent, unless you choose to synchronize them. The registry that you have would be used by your tools, and source code would use the interpreter-wide ones. This is not a conflict.

Of course, since you have all the functionality already, it would make a lot of sense to offer an easy way to register all of your library's units with the system repository, thus making them all available; but that would be completely optional to both you and your users.
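(Just to sketch the shape of that last part -- every name below is hypothetical, a stand-in for whatever the interpreter-wide repository would actually look like, alongside a simplified stand-in for the library's existing private registry:)

    # Hypothetical sketch only: "units_repo" and its register() call are
    # invented names for the proposed interpreter-wide repository.
    import units_repo

    # The library's existing private registry, simplified to a dict here
    # (metres per unit).
    _MY_UNITS = {"nmi": 1852.0, "furlong": 201.168}

    def register_all_units():
        # One opt-in call; nothing touches the global repository unless
        # the *application* decides to call this.
        for name, metres in _MY_UNITS.items():
            units_repo.register(name, metres=metres)

The application then decides, once, whether to call register_all_units() at startup -- no library does it behind anyone's back.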
ChrisA