[Python-Dev] Move ensurepip blobs to external place
Currently the repository contains bundled pip and setuptools (2 MB total),
which are updated with every release of pip and setuptools. This increases
the size of the repository by around 2 MB several times per year. There
have been 37 updates of Lib/ensurepip/_bundled in total, so the repository
contains up to 70 MB of unused blobs. The size of the repository is 350 MB.
Currently the blobs take up to 20% of the size of the repository, but this
percentage will likely grow in future, because they were added only 4 years
ago.

Wouldn't it be better to put them into a separate repository, like Tcl/Tk
and other external binaries for Windows, and download only the most recent
version?
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Better support for consuming vendored packages
On 23 March 2018 at 02:58, Gregory Szorc wrote:
> I'd like to start a discussion around practices for vendoring package
> dependencies. I'm not sure python-dev is the appropriate venue for this
> discussion. If not, please point me to one and I'll gladly take it there.

Since you mainly seem interested in the import side of things (rather than
the initial vendoring process), python-ideas is probably the most suitable
location (we're not at the stage of a concrete design proposal that would
be appropriate for python-dev, and this doesn't get far enough into import
system arcana to really need to be an import-sig discussion rather than a
python-ideas one).

> What we've done is effectively rename the "shrubbery" package to
> "knights.vendored.shrubbery." If a module inside that package attempts an
> `import shrubbery.x`, this could fail because "shrubbery" is no longer the
> package name. Or worse, it could pick up a separate copy of "shrubbery"
> somewhere else in `sys.path` and you could have a Frankenstein package
> pulling its code from multiple installs. So for this to work, all
> package-local imports must be using relative imports. e.g. `from . import
> x`.

If it's the main application doing the vendoring, then the following kind
of snippet can be helpful:

    from knights.vendored import shrubbery
    import sys
    sys.path["shrubbery"] = shrubbery

So doing that kind of aliasing on a process-wide basis is already possible,
as long as you have a point where you can inject the alias (and by
combining it with a lazy importer, you can defer the actual import until
someone actually uses the module).

Limiting aliasing to a particular set of modules *doing* imports would be
much harder though, since we don't pass that information along (although
context variables would potentially give us a way to make it available
without having to redefine all the protocol APIs).

Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Better support for consuming vendored packages
On 24 March 2018 at 19:29, Nick Coghlan wrote:
> On 23 March 2018 at 02:58, Gregory Szorc wrote:
>
>> I'd like to start a discussion around practices for vendoring package
>> dependencies. I'm not sure python-dev is the appropriate venue for this
>> discussion. If not, please point me to one and I'll gladly take it there.
>
> Since you mainly seem interested in the import side of things (rather than
> the initial vendoring process), python-ideas is probably the most suitable
> location (we're not at the stage of a concrete design proposal that would
> be appropriate for python-dev, and this doesn't get far enough into import
> system arcana to really need to be an import-sig discussion rather than a
> python-ideas one).
>
>> What we've done is effectively rename the "shrubbery" package to
>> "knights.vendored.shrubbery." If a module inside that package attempts an
>> `import shrubbery.x`, this could fail because "shrubbery" is no longer the
>> package name. Or worse, it could pick up a separate copy of "shrubbery"
>> somewhere else in `sys.path` and you could have a Frankenstein package
>> pulling its code from multiple installs. So for this to work, all
>> package-local imports must be using relative imports. e.g. `from . import
>> x`.
>
> If it's the main application doing the vendoring, then the following kind
> of snippet can be helpful:
>
>     from knights.vendored import shrubbery
>     import sys
>     sys.path["shrubbery"] = shrubbery

Oops, s/path/modules/ :)

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Better support for consuming vendored packages
On Sat, Mar 24, 2018 at 9:29 AM, Nick Coghlan wrote:
> On 23 March 2018 at 02:58, Gregory Szorc wrote:
>
>> I'd like to start a discussion around practices for vendoring package
>> dependencies. I'm not sure python-dev is the appropriate venue for this
>> discussion. If not, please point me to one and I'll gladly take it there.
>
> [...]
>
> If it's the main application doing the vendoring, then the following kind
> of snippet can be helpful:
>
>     from knights.vendored import shrubbery
>     import sys
>     sys.path["shrubbery"] = shrubbery

I suspect you meant

    sys.modules["shrubbery"] = shrubbery
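A minimal sketch of the corrected aliasing, for anyone who wants to try it without a real vendored tree. The in-memory stand-in modules built with types.ModuleType and the NAME attribute are assumptions of this sketch; in a real application "knights.vendored.shrubbery" would be an actual package on disk.

```python
import sys
import types

# Hypothetical stand-ins: build "knights.vendored.shrubbery" in memory so the
# sketch runs without any files on disk.
knights = types.ModuleType("knights")
vendored = types.ModuleType("knights.vendored")
vendored_copy = types.ModuleType("knights.vendored.shrubbery")
vendored_copy.NAME = "vendored copy"
knights.vendored = vendored
vendored.shrubbery = vendored_copy
sys.modules["knights"] = knights
sys.modules["knights.vendored"] = vendored
sys.modules["knights.vendored.shrubbery"] = vendored_copy

# The alias from the thread, with sys.modules rather than sys.path.
from knights.vendored import shrubbery
sys.modules["shrubbery"] = shrubbery

# Any later absolute import in this process now resolves to the vendored copy.
import shrubbery as top_level
print(top_level is vendored_copy)
```

The alias is process-wide: every module that runs `import shrubbery` after this point gets the vendored copy, which is why it only suits the main application and not a library.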
Re: [Python-Dev] Move ensurepip blobs to external place
On 24 March 2018 at 18:50, Serhiy Storchaka wrote:
> Currently the repository contains bundled pip and setuptools (2 MB total),
> which are updated with every release of pip and setuptools. This increases
> the size of the repository by around 2 MB several times per year. There
> have been 37 updates of Lib/ensurepip/_bundled in total, so the repository
> contains up to 70 MB of unused blobs. The size of the repository is 350 MB.
> Currently the blobs take up to 20% of the size of the repository, but this
> percentage will likely grow in future, because they were added only 4 years
> ago.
>
> Wouldn't it be better to put them into a separate repository, like Tcl/Tk
> and other external binaries for Windows, and download only the most recent
> version?

Specifically, I believe that would entail adding them to
https://github.com/python/cpython-bin-deps, and then updating the makefile
to do a shallow clone of the relevant branch and copy the binaries to a
point where ensurepip expects to find them?

I'm fine with the general idea of moving these out to the bin-deps repo, as
long as cloning the main CPython repo and running "./configure && make &&
./python -m test test_ensurepip" still works. We'd also want to add docs to
the developer guide on how to update them (those docs are missing at the
moment, since the update process is just dropping the new wheel files
directly into the right place).

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Move ensurepip blobs to external place
On 24 March 2018 at 10:50, Nick Coghlan wrote:
> On 24 March 2018 at 18:50, Serhiy Storchaka wrote:
>>
>> Currently the repository contains bundled pip and setuptools (2 MB total),
>> which are updated with every release of pip and setuptools. This increases
>> the size of the repository by around 2 MB several times per year. There
>> have been 37 updates of Lib/ensurepip/_bundled in total, so the repository
>> contains up to 70 MB of unused blobs. The size of the repository is 350 MB.
>> Currently the blobs take up to 20% of the size of the repository, but this
>> percentage will likely grow in future, because they were added only 4 years
>> ago.
>>
>> Wouldn't it be better to put them into a separate repository, like Tcl/Tk
>> and other external binaries for Windows, and download only the most recent
>> version?
>
> Specifically, I believe that would entail adding them to
> https://github.com/python/cpython-bin-deps, and then updating the makefile
> to do a shallow clone of the relevant branch and copy the binaries to a
> point where ensurepip expects to find them?
>
> I'm fine with the general idea of moving these out to the bin-deps repo, as
> long as cloning the main CPython repo and running "./configure && make &&
> ./python -m test test_ensurepip" still works. We'd also want to add docs to
> the developer guide on how to update them (those docs are missing at the
> moment, since the update process is just dropping the new wheel files
> directly into the right place)

I don't have a problem with moving the pip/setuptools wheels - as long as
(as a pip dev doing a release) I know where to put the files, it makes
little difference to me. But as Nick says, if the files aren't in the main
CPython repository, the build process (both the Unix and the Windows
processes) will need updating to ensure that the files are taken from
where they do reside and put in the right places.

I'd assume that a change like that is big enough that it would be targeted
at 3.8, BTW (and so won't affect what I need to do for 3.7).

Paul
[Python-Dev] Replacing self.__dict__ in __init__
Hi Python-dev,
I'm one of the core attrs contributors, and I'm contemplating applying an
optimization to our generated __init__s. Before someone warns me python-dev
is for the development of the language itself, there are two reasons I'm
posting this here:
1) it's a very low level question that I'd really like the input of the
core devs on, and
2) maybe this will find its way into dataclasses if it works out.
I've found that, if a class has more than one attribute, instead of
creating an init like this:
self.a = a
self.b = b
self.c = c
it's faster to do this:
self.__dict__ = {'a': a, 'b': b, 'c': c}
i.e. to replace the instance dictionary altogether. On PyPy, their core
devs inform me this is a bad idea because the instance dictionary is
special there, so we won't be doing this on PyPy.
But is it safe to do on CPython?
To make the question simpler, disregard the possibility of custom setters
on the attributes.
Thanks in advance!
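A rough way to reproduce the comparison is sketched below. The class names Assign and Replace and the iteration count are assumptions for illustration, not attrs code, and the timings will vary with the interpreter and its version, so no particular result is claimed here.

```python
import timeit


class Assign:
    """Conventional __init__: one attribute assignment per field."""

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


class Replace:
    """Variant under discussion: replace the instance dictionary outright."""

    def __init__(self, a, b, c):
        self.__dict__ = {'a': a, 'b': b, 'c': c}


# Time instance creation for both variants (iteration count is arbitrary).
assign_time = timeit.timeit(lambda: Assign(1, 2, 3), number=100_000)
replace_time = timeit.timeit(lambda: Replace(1, 2, 3), number=100_000)
print(f"assign:  {assign_time:.4f}s")
print(f"replace: {replace_time:.4f}s")
```

Both classes end up with the same instance attributes; only the way the dictionary is populated differs.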
Re: [Python-Dev] Replacing self.__dict__ in __init__
2018-03-24 17:18 GMT+03:00 Tin Tvrtković :
>
> I've found that, if a class has more than one attribute, instead of
> creating an init like this:
>
> self.a = a
> self.b = b
> self.c = c
>
> it's faster to do this:
>
> self.__dict__ = {'a': a, 'b': b, 'c': c}
>
> i.e. to replace the instance dictionary altogether. On PyPy, their core
> devs inform me this is a bad idea because the instance dictionary is
> special there, so we won't be doing this on PyPy.
>
But why do you need to replace it, when you can just update it:

    class C:
        def __init__(self, a, b, c):
            self.__dict__.update({'a': a, 'b': b, 'c': c})
I'm certainly not a developer. Just out of curiosity.
With kind regards,
-gdg
Re: [Python-Dev] Replacing self.__dict__ in __init__
On Sat, Mar 24, 2018 at 02:18:14PM +, Tin Tvrtković wrote:
> self.__dict__ = {'a': a, 'b': b, 'c': c}
>
> i.e. to replace the instance dictionary altogether. On PyPy, their core
> devs inform me this is a bad idea because the instance dictionary is
> special there, so we won't be doing this on PyPy.
>
> But is it safe to do on CPython?
I don't know if it's safe, but replacing __dict__ is certainly an old
and famous idiom:
https://code.activestate.com/recipes/66531-singleton-we-dont-need-no-stinkin-singleton-the-bo/
--
Steve
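For readers who don't follow the link, the pattern it describes (usually called "Borg") looks roughly like the sketch below. This is written from the general shape of the idiom rather than copied from the recipe, and the attribute name _shared_state is this sketch's own choice.

```python
class Borg:
    # One dictionary shared by every instance of the class.
    _shared_state = {}

    def __init__(self):
        # Replace the per-instance dictionary with the shared one.
        self.__dict__ = self._shared_state


first = Borg()
second = Borg()
first.x = 1

# Distinct objects, but attribute state is shared between them.
print(first is second, second.x)
```

Note that this use of __dict__ replacement has a different goal (shared state) from the speed optimisation asked about, but it shows the assignment itself has a long history on CPython.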
Re: [Python-Dev] Replacing self.__dict__ in __init__
> On Mar 24, 2018, at 7:18 AM, Tin Tvrtković wrote:
>
> it's faster to do this:
>
> self.__dict__ = {'a': a, 'b': b, 'c': c}
>
> i.e. to replace the instance dictionary altogether. On PyPy, their core devs
> inform me this is a bad idea because the instance dictionary is special
> there, so we won't be doing this on PyPy.
>
> But is it safe to do on CPython?
This should work. I've seen it done in other production tools without any ill
effect.
The dict can be replaced during __init__() and still get benefits of
key-sharing. That benefit is lost only when the instance dict keys are
modified downstream from __init__(). So, from a dict size point of view, your
optimization is fine.
Still, you should look at whether this would affect static type checkers, lint
tools, and other tooling.
Raymond
Re: [Python-Dev] Move ensurepip blobs to external place
Or we could just pull the right version directly from PyPI? (Note that
updating the version should be an explicit step, as it is today, but the
file should be identical to what’s on PyPI, right? And a urlretrieve is
easier than pulling from a git repo.)

Top-posted from my Windows phone

From: Paul Moore
Sent: Saturday, March 24, 2018 4:17
To: Nick Coghlan
Cc: Serhiy Storchaka; python-dev
Subject: Re: [Python-Dev] Move ensurepip blobs to external place

On 24 March 2018 at 10:50, Nick Coghlan wrote:
> On 24 March 2018 at 18:50, Serhiy Storchaka wrote:
>>
>> Currently the repository contains bundled pip and setuptools (2 MB total),
>> which are updated with every release of pip and setuptools. This increases
>> the size of the repository by around 2 MB several times per year. There
>> have been 37 updates of Lib/ensurepip/_bundled in total, so the repository
>> contains up to 70 MB of unused blobs. The size of the repository is 350 MB.
>> Currently the blobs take up to 20% of the size of the repository, but this
>> percentage will likely grow in future, because they were added only 4 years
>> ago.
>>
>> Wouldn't it be better to put them into a separate repository, like Tcl/Tk
>> and other external binaries for Windows, and download only the most recent
>> version?
>
> Specifically, I believe that would entail adding them to
> https://github.com/python/cpython-bin-deps, and then updating the makefile
> to do a shallow clone of the relevant branch and copy the binaries to a
> point where ensurepip expects to find them?
>
> I'm fine with the general idea of moving these out to the bin-deps repo, as
> long as cloning the main CPython repo and running "./configure && make &&
> ./python -m test test_ensurepip" still works. We'd also want to add docs to
> the developer guide on how to update them (those docs are missing at the
> moment, since the update process is just dropping the new wheel files
> directly into the right place)

I don't have a problem with moving the pip/setuptools wheels - as long as
(as a pip dev doing a release) I know where to put the files, it makes
little difference to me. But as Nick says, if the files aren't in the main
CPython repository, the build process (both the Unix and the Windows
processes) will need updating to ensure that the files are taken from
where they do reside and put in the right places.

I'd assume that a change like that is big enough that it would be targeted
at 3.8, BTW (and so won't affect what I need to do for 3.7).

Paul
Re: [Python-Dev] Move ensurepip blobs to external place
On Mar 24, 2018, at 16:13, Steve Dower wrote:
> Or we could just pull the right version directly from PyPI? (Note that
> updating the version should be an explicit step, as it is today, but the file
> should be identical to what’s on PyPI, right? And a urlretrieve is easier
> than pulling from a git repo.)

I think the primary original rationale for having the pip wheel and its
dependencies checked into the cpython repo was so that users would be able
to install pip even if they did not have an Internet connection. But
perhaps that requirement can be relaxed a bit if we say that the necessary
wheels are vendored into all of our downloadable release items, that is,
included in the packaging of source release files (the various tarballs)
and the Windows and macOS binary installers. The main change would likely
be making ensurepip a bit smarter to download if the bundled wheels are not
present in the source directory. Assuming that people building from a
cpython repo need to have a network connection if they want to run
ensurepip, at least for the first time, is probably not an onerous
requirement.

--
  Ned Deily
  [email protected] -- []
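The "download only if the bundled wheel is missing" behaviour could be sketched as below. This is not ensurepip's actual code: the function name locate_wheel, its parameters, the wheel file name and the example.invalid URL are all hypothetical, and the demo swaps in a fake fetcher so that nothing touches the network.

```python
import os
import tempfile
import urllib.request


def locate_wheel(bundled_dir, filename, url, fetch=urllib.request.urlretrieve):
    """Return the path of a bundled wheel, fetching it only when it is absent.

    Hypothetical helper: `fetch` takes (url, destination_path), which matches
    how urllib.request.urlretrieve is called here.
    """
    path = os.path.join(bundled_dir, filename)
    if not os.path.exists(path):
        os.makedirs(bundled_dir, exist_ok=True)
        fetch(url, path)
    return path


# Offline demo with a stand-in fetcher that records calls and writes a dummy file.
calls = []


def fake_fetch(url, path):
    calls.append(url)
    with open(path, "wb") as f:
        f.write(b"not a real wheel")


with tempfile.TemporaryDirectory() as tmp:
    url = "https://example.invalid/pip-example.whl"  # placeholder, not a real URL
    first = locate_wheel(tmp, "pip-example.whl", url, fetch=fake_fetch)
    second = locate_wheel(tmp, "pip-example.whl", url, fetch=fake_fetch)

print(len(calls))
```

Release tarballs and installers would ship the wheel in place, so only a build from a fresh repository checkout would ever reach the download branch.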
Re: [Python-Dev] Replacing self.__dict__ in __init__
On 25 March 2018 at 00:18, Tin Tvrtković wrote:
> But is it safe to do on CPython?

That depends on what you mean by "safe" :)

It won't crash, but it will lose any existing entries that a metaclass,
subclass, or __new__ method implementation might have added to the instance
dictionary before calling the __init__ method. That can be OK in a tightly
controlled application specific class hierarchy, but it would be
questionable in a general purpose utility library that may be combined with
arbitrary other types.

As Kirill suggests, `self.__dict__.update(new_attrs)` is likely to be
faster than repeated assignment statements, without the potentially odd
interactions with other instance initialisation code.

It should also be explicitly safe to do in the case of "type(self) is
__class__ and not self.__dict__", which would let you speed up the common
case of direct instantiation, while falling back to the update based
approach when combined with other classes at runtime.

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
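A minimal sketch of that guarded approach follows. The class names Point and Tagged and the "tag" attribute are made up for illustration; the guard expression is the one quoted above, and it relies on the implicit __class__ cell that CPython provides to methods defined in a class body (code generated with exec, as attrs does, would need to pass the class in some other way).

```python
class Point:
    def __init__(self, a, b, c):
        new_attrs = {'a': a, 'b': b, 'c': c}
        if type(self) is __class__ and not self.__dict__:
            # Direct instantiation with an empty instance dict: nothing to lose.
            self.__dict__ = new_attrs
        else:
            # Subclass, or something already stored attributes: keep them.
            self.__dict__.update(new_attrs)


class Tagged(Point):
    def __new__(cls, *args):
        inst = super().__new__(cls)
        inst.tag = "set in __new__"  # added before __init__ runs
        return inst


p = Point(1, 2, 3)
t = Tagged(1, 2, 3)
print(vars(p))
print(sorted(vars(t)))
```

With an unconditional `self.__dict__ = new_attrs`, the "tag" entry set by Tagged.__new__ would be discarded, which is the interaction described above.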
Re: [Python-Dev] Move ensurepip blobs to external place
On 25 March 2018 at 06:52, Ned Deily wrote:
> On Mar 24, 2018, at 16:13, Steve Dower wrote:
> > Or we could just pull the right version directly from PyPI? (Note that
> > updating the version should be an explicit step, as it is today, but the
> > file should be identical to what’s on PyPI, right? And a urlretrieve is
> > easier than pulling from a git repo.)
>
> I think the primary original rationale for having the pip wheel and its
> dependencies checked into the cpython repo was so that users would be able
> to install pip even if they did not have an Internet connection. But
> perhaps that requirement can be relaxed a bit if we say that the necessary
> wheels are vendored into all of our downloadable release items, that is,
> included in the packaging of source release files (the various tarballs)
> and the Windows and macOS binary installers. The main change would likely
> be making ensurepip a bit smarter to download if the bundled wheels are not
> present in the source directory. Assuming that people building from a
> cpython repo need to have a network connection if they want to run
> ensurepip, at least for the first time, is probably not an onerous
> requirement.

Right, having the wheels in the release artifacts is a requirement, as is
having them available for use when running the test suite, but having them
in the git repo isn't.

Adding them directly to the repo was just the simplest approach to getting
ensurepip working, since it didn't require any changes to the build
process.

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Move ensurepip blobs to external place
As I recall, Git LFS makes storing large binary objects in some external
object storage fairly seamless - might be a good fit for keeping the same
workflow and not bloating the repo.

M

--
Matt Billenstein
[email protected]

Sent from my iPhone 6 (this put here so you know I have one)

> On Mar 24, 2018, at 8:27 PM, Nick Coghlan wrote:
>
>> On 25 March 2018 at 06:52, Ned Deily wrote:
>> On Mar 24, 2018, at 16:13, Steve Dower wrote:
>> > Or we could just pull the right version directly from PyPI? (Note that
>> > updating the version should be an explicit step, as it is today, but the
>> > file should be identical to what’s on PyPI, right? And a urlretrieve is
>> > easier than pulling from a git repo.)
>>
>> I think the primary original rationale for having the pip wheel and its
>> dependencies checked into the cpython repo was so that users would be able
>> to install pip even if they did not have an Internet connection. But
>> perhaps that requirement can be relaxed a bit if we say that the necessary
>> wheels are vendored into all of our downloadable release items, that is,
>> included in the packaging of source release files (the various tarballs) and
>> the Windows and macOS binary installers. The main change would likely be
>> making ensurepip a bit smarter to download if the bundled wheels are not
>> present in the source directory. Assuming that people building from a
>> cpython repo need to have a network connection if they want to run
>> ensurepip, at least for the first time, is probably not an onerous
>> requirement.
>
> Right, having the wheels in the release artifacts is a requirement, as is
> having them available for use when running the test suite, but having them in
> the git repo isn't.
>
> Adding them directly to the repo was just the simplest approach to getting
> ensurepip working, since it didn't require any changes to the build process.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | [email protected] | Brisbane, Australia
