Re: [Distutils] Entry points: specifying and caching

Donald Stufft Thu, 19 Oct 2017 11:10:50 -0700


> On Oct 19, 2017, at 12:14 PM, Thomas Kluyver <tho...@kluyver.me.uk> wrote:
> 
> On Thu, Oct 19, 2017, at 04:10 PM, Donald Stufft wrote:
>> I’m in favor, although one question I guess is whether it should be a
>> PEP or an ad hoc spec. Given (2) it should *probably* be a PEP (since
>> without (2), it’s just another file in the .dist-info directory and that
>> doesn’t actually need to be standardized at all). I don’t think that this
>> will be a very controversial PEP though, and should be pretty easy.
> 
> I have opened a PR to document what is already there, without adding any
> new features. I think this is worth doing even if we don't change
> anything, since it's a de-facto standard used for different tools to
> interact.
> 
> https://github.com/pypa/python-packaging-user-guide/pull/390
> 
> We can still write a PEP for caching if necessary.


I think documenting what’s there is a reasonable goal, but if we’re going to 
add caching we should just PEP the whole thing, changing it from a de facto 
standard to an actual standard + caching. Generally we should only use non-PEP 
“specs” in places where we’re just trying to document what exists already, but 
where we’re not really happy with the current solution or we plan to alter it 
eventually.

For this, I think the entry points solution is generally a good one with some 
alterations (namely, the addition of caching). Although now that I think about 
it, maybe this isn’t really a packaging problem at all, and I’m not sure that 
it benefits from standardization.

So stepping back a second, here’s what entry points provide today:

1. A way to define an interface that other packages can provide 
implementations for.
2. A way to specify script wrappers that will be automatically generated.
3. A way to define extras that must be installed in order for a particular 
entry point to be available.
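
To make the three uses concrete, here is a minimal sketch that parses the INI-style entry_points.txt file that setuptools writes into the metadata directory. The sample contents, the group name `myapp.plugins`, and the helper `parse_entry_points` are hypothetical and exist only for illustration.

```python
import configparser

# Hypothetical contents of an entry_points.txt file. The package, group,
# and object names are made up for this example.
SAMPLE = """\
[console_scripts]
mytool = mypkg.cli:main

[myapp.plugins]
pdf = mypkg.export:PdfExporter [pdf]
"""


def parse_entry_points(text):
    """Return {group: {name: (target, extras)}} for an entry_points.txt body."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry point names case-sensitive
    parser.read_string(text)
    result = {}
    for group in parser.sections():
        for name, value in parser.items(group):
            # "module:object [extra1,extra2]" -> target plus a list of extras
            target, _, extras_part = value.partition("[")
            extras = [e.strip() for e in extras_part.rstrip("]").split(",") if e.strip()]
            result.setdefault(group, {})[name] = (target.strip(), extras)
    return result


entry_points = parse_entry_points(SAMPLE)
```

In this sample, the `myapp.plugins` group corresponds to (1), `console_scripts` to (2), and the trailing `[pdf]` to (3).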

Off the bat I’m going to say we don’t need to worry about (2) in this 
hypothetical system, because I think the fact that it is currently implemented 
via this system is mostly a historical accident, and it’s not something we 
should be looking at in the future. Script wrappers should have some dedicated 
metadata, not piggyback off of the plugin system.

For (3) I don’t believe that what extras were installed is recorded anywhere, 
so I’m going to guess that this works by looking up what extras are *available* 
for a particular package and then seeing if all of the requirements of that 
distribution are satisfied. Assuming that’s the case then that’s not really 
something that requires deep integration with the packaging toolchain, it just 
needs the APIs to look those things up.
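
A rough sketch of that guess, assuming it is correct: an extra counts as available when every requirement it names is installed. The data below is hypothetical, and `extra_is_satisfied` is a made-up helper, not an existing API; it also ignores version specifiers and environment markers, which a real check would need to handle.

```python
# Hypothetical data standing in for what a packaging API would report.
INSTALLED = {"mypkg": "1.0", "reportlab": "3.4"}
EXTRAS = {"mypkg": {"pdf": ["reportlab"], "docs": ["sphinx"]}}


def extra_is_satisfied(dist, extra, installed=INSTALLED, extras=EXTRAS):
    """Return True if every requirement of `extra` for `dist` is installed.

    Nothing records which extras were requested at install time, so this
    only checks whether the extra's requirements happen to be present.
    """
    required = extras.get(dist, {}).get(extra)
    if required is None:
        return False  # the distribution does not define this extra
    return all(name in installed for name in required)
```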

Finally we come to (1), which is in my opinion the meat of what you’re hoping 
to achieve here (and what most people are using entry points for outside of 
console scripts). What I notice about (1) is that it really has absolutely 
nothing to do with packaging at all. It would likely use some of the APIs 
provided by the packaging toolchain (for instance, the ability to add custom 
files to a .dist-info directory, the ability to iterate over installed 
packages, etc.), but on the whole none of pip, setuptools, twine, PyPI, etc. 
needs to know anything about it.

EXCEPT that, given the desire to cache things, it would be beneficial to 
“hook” into the lifecycle of a package install. However, I know that there are 
other plugin systems out there that would also like to be able to do that 
(Twisted Plugins come to mind), and I think such a mechanism is likely to be 
useful in general for cases outside of plugin systems too.

So here’s a different idea that is a bit more ambitious but that I think is a 
better overall idea. Let entry points be a setuptools thing, and let’s define 
some key lifecycle hooks during the installation of a package and some 
mechanism in the metadata to let other tools subscribe to those hooks. Then a 
caching layer could be written for setuptools entry points to make that faster 
without requiring standardization, but a whole new, better plugin system 
could use the hooks too, Twisted plugins could benefit, etc. [1].
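
To illustrate the shape of that idea, here is a minimal sketch of a hook registry that an installer could fire during a package's lifecycle. Everything in it is hypothetical: the class `HookRegistry`, the event names `post-install` and `post-uninstall`, and the in-memory cache are assumptions for illustration, not an existing or proposed API.

```python
from collections import defaultdict


class HookRegistry:
    """Hypothetical registry mapping lifecycle events to subscriber callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, callback):
        self._subscribers[event].append(callback)

    def fire(self, event, dist_name):
        # An installer would call this at the matching point in its lifecycle.
        for callback in self._subscribers[event]:
            callback(dist_name)


# A caching layer for entry points would be just one subscriber among others.
entry_point_cache = {}


def add_to_cache(dist_name):
    # Placeholder value; a real subscriber would read the package's metadata.
    entry_point_cache[dist_name] = "entry points of " + dist_name


def remove_from_cache(dist_name):
    entry_point_cache.pop(dist_name, None)


registry = HookRegistry()
registry.subscribe("post-install", add_to_cache)
registry.subscribe("post-uninstall", remove_from_cache)

registry.fire("post-install", "mypkg")
```

The installer only needs to know the event names; what a subscriber does with them (caching, Twisted's plugin cache, something else) stays outside the packaging tools.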

One thing that I like about all of our recent work in packaging is that a lot 
of it has been about making it so there isn’t just one standard set of tools, 
and I think that providing lifecycle hooks is another step along that path.

> 
>> I’m also in favor of this. Although I would suggest SQLite rather than a
>> JSON file for the primary reason being that a JSON file isn’t
>> multiprocess safe without being careful (and possibly introducing
>> locking) whereas SQLite has already solved that problem.
> 
> SQLite was actually my first thought, but from experience in Jupyter &
> IPython I'm wary of it - its built-in locking does not work well over
> NFS, and it's easy to corrupt the database. I think careful use of
> atomic writing can be more reliable (though that has given us some
> problems too).
> 
> That may be easier if there's one cache per user, though - we can
> perhaps try to store it somewhere that's not NFS.
> 


I don’t have a lot of experience using SQLite in this way so it’s entirely 
possible it’s not as robust as we want/need it to be. I’m not wedded to this 
idea (but then if we do what I said above, this idea becomes something for any 
individual implementation of plugins to decide and we don’t need to pick a 
standard here at all!).
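
For comparison, the "careful use of atomic writing" alternative for a JSON cache might look roughly like the sketch below. The function name `write_cache_atomically` and the cache layout are assumptions for illustration. It relies on `os.replace`, which swaps the file in a single step on a local filesystem; it makes no claim about behaviour over NFS.

```python
import json
import os
import tempfile


def write_cache_atomically(path, data):
    """Write `data` as JSON to `path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem)
    # as the target for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Concurrent writers can still overwrite each other's updates (the last writer wins), which is the part SQLite's locking would otherwise handle.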


[1] I realize the irony in saying a plugin system isn’t a packaging problem and 
then proposing a plugin system for packaging hooks, but I think it can be very 
simple: it doesn’t need to be designed for reuse outside of that context, 
speed is less of a concern, etc.

_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
