[Python-Dev] Move ensurepip blobs to external place

2018-03-24 Thread Serhiy Storchaka
Currently the repository contains bundled pip and setuptools (2 MB
total) which are updated with every release of pip and setuptools. This
increases the size of the repository by around 2 MB several times per
year. There have been 37 updates of Lib/ensurepip/_bundled in total, so
the repository contains up to 70 MB of unused blobs. The size of the
repository is 350 MB. Currently the blobs take up to 20% of the size of
the repository, but this percentage will likely grow in the future,
because they were added only 4 years ago.
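
As a rough check of these figures, here is a back-of-the-envelope sketch. It uses only the approximate numbers quoted above and assumes that every update adds the full 2 MB to the history:

```python
# All figures are the approximate ones quoted in this message.
UPDATES = 37         # updates of Lib/ensurepip/_bundled
MB_PER_UPDATE = 2    # combined size of the bundled pip and setuptools
REPO_MB = 350        # total size of the repository

upper_bound_mb = UPDATES * MB_PER_UPDATE  # worst case if nothing is shared
share = 70 / REPO_MB                      # the "up to 70 MB" estimate

print(f"upper bound: {upper_bound_mb} MB, share of repository: {share:.0%}")
```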


Wouldn't it be better to put them into a separate repository, like Tcl/Tk
and other external binaries for Windows, and download only the most recent
version?


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Better support for consuming vendored packages

2018-03-24 Thread Nick Coghlan
On 23 March 2018 at 02:58, Gregory Szorc  wrote:

> I'd like to start a discussion around practices for vendoring package
> dependencies. I'm not sure python-dev is the appropriate venue for this
> discussion. If not, please point me to one and I'll gladly take it there.
>
>
Since you mainly seem interested in the import side of things (rather than
the initial vendoring process), python-ideas is probably the most suitable
location (we're not at the stage of a concrete design proposal that would
be appropriate for python-dev, and this doesn't get far enough into import
system arcana to really need to be an import-sig discussion rather than a
python-ideas one).


> What we've done is effectively rename the "shrubbery" package to
> "knights.vendored.shrubbery." If a module inside that package attempts an
> `import shrubbery.x`, this could fail because "shrubbery" is no longer the
> package name. Or worse, it could pick up a separate copy of "shrubbery"
> somewhere else in `sys.path` and you could have a Frankenstein package
> pulling its code from multiple installs. So for this to work, all
> package-local imports must be using relative imports. e.g. `from . import
> x`.
>

If it's the main application doing the vendoring, then the following kind
of snippet can be helpful:

from knights.vendored import shrubbery
import sys
sys.path["shrubbery"] = shrubbery

So doing that kind of aliasing on a process-wide basis is already possible,
as long as you have a point where you can inject the alias (and by
combining it with a lazy importer, you can defer the actual import until
someone actually uses the module).
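
A minimal sketch of the lazy variant, built on `importlib.util.LazyLoader` and on aliasing through `sys.modules` (the mapping the import system consults first). The helper name `lazy_alias` is hypothetical, and the stdlib module `colorsys` is only a stand-in for `knights.vendored.shrubbery` so that the example runs anywhere:

```python
import importlib.util
import sys


def lazy_alias(real_name, alias):
    """Register a lazily loaded module under both its real name and an alias."""
    spec = importlib.util.find_spec(real_name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    # Both names must point at the same module object.
    sys.modules[real_name] = module
    sys.modules[alias] = module
    # This only arms the lazy loader; the module body runs when an
    # attribute of the module is first read.
    loader.exec_module(module)
    return module


# "colorsys" stands in for the vendored package in this sketch.
lazy_alias("colorsys", "shrubbery")

import shrubbery  # resolved through sys.modules

print(shrubbery.rgb_to_hsv(1.0, 0.0, 0.0))
```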

Limiting aliasing to a particular set of modules *doing* imports would be
much harder though, since we don't pass that information along (although
context variables would potentially give us a way to make it available
without having to redefine all the protocol APIs)

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Better support for consuming vendored packages

2018-03-24 Thread Nick Coghlan
On 24 March 2018 at 19:29, Nick Coghlan  wrote:

> On 23 March 2018 at 02:58, Gregory Szorc  wrote:
>
>> I'd like to start a discussion around practices for vendoring package
>> dependencies. I'm not sure python-dev is the appropriate venue for this
>> discussion. If not, please point me to one and I'll gladly take it there.
>>
>>
> Since you mainly seem interested in the import side of things (rather than
> the initial vendoring process), python-ideas is probably the most suitable
> location (we're not at the stage of a concrete design proposal that would
> be appropriate for python-dev, and this doesn't get far enough into import
> system arcana to really need to be an import-sig discussion rather than a
> python-ideas one).
>
>
>> What we've done is effectively rename the "shrubbery" package to
>> "knights.vendored.shrubbery." If a module inside that package attempts an
>> `import shrubbery.x`, this could fail because "shrubbery" is no longer the
>> package name. Or worse, it could pick up a separate copy of "shrubbery"
>> somewhere else in `sys.path` and you could have a Frankenstein package
>> pulling its code from multiple installs. So for this to work, all
>> package-local imports must be using relative imports. e.g. `from . import
>> x`.
>>
>
> If it's the main application doing the vendoring, then the following kind
> of snippet can be helpful:
>
> from knights.vendored import shrubbery
> import sys
> sys.path["shrubbery"] = shrubbery
>

Oops, s/path/modules/ :)
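
With that substitution applied, the aliasing can be sketched as below. To keep the example self-contained, a module object created with `types.ModuleType` stands in for the real `knights.vendored.shrubbery` import, and the `HEIGHT` attribute is made up for illustration:

```python
import sys
import types

# Hypothetical stand-in for: from knights.vendored import shrubbery
vendored = types.ModuleType("knights.vendored.shrubbery")
vendored.HEIGHT = 3  # illustrative attribute, not part of any real package

# Alias the vendored copy under its original top-level name.
sys.modules["shrubbery"] = vendored

# Any later "import shrubbery" in the process now gets the vendored copy,
# because sys.modules is checked before sys.path is searched.
import shrubbery

print(shrubbery is vendored)
```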

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Better support for consuming vendored packages

2018-03-24 Thread Steve Holden
On Sat, Mar 24, 2018 at 9:29 AM, Nick Coghlan  wrote:

> On 23 March 2018 at 02:58, Gregory Szorc  wrote:
>
>> I'd like to start a discussion around practices for vendoring package
>> dependencies. I'm not sure python-dev is the appropriate venue for this
>> discussion. If not, please point me to one and I'll gladly take it there.
>>
>>
> [...]
>
> If it's the main application doing the vendoring, then the following kind
> of snippet can be helpful:
>
> from knights.vendored import shrubbery
> import sys
> sys.path["shrubbery"] = shrubbery
>

I suspect you meant

sys.modules["shrubbery"] = shrubbery


Re: [Python-Dev] Move ensurepip blobs to external place

2018-03-24 Thread Nick Coghlan
On 24 March 2018 at 18:50, Serhiy Storchaka  wrote:

> Currently the repository contains bundled pip and setuptools (2 MB total)
> which are updated with every release of pip and setuptools. This increases
> the size of the repository by around 2 MB several times per year. There
> have been 37 updates of Lib/ensurepip/_bundled in total, so the repository
> contains up to 70 MB of unused blobs. The size of the repository is 350 MB.
> Currently the blobs take up to 20% of the size of the repository, but this
> percentage will likely grow in the future, because they were added only 4
> years ago.
>
> Wouldn't it be better to put them into a separate repository, like Tcl/Tk
> and other external binaries for Windows, and download only the most recent
> version?
>

Specifically, I believe that would entail adding them to
https://github.com/python/cpython-bin-deps, and then updating the make file
to do a shallow clone of the relevant branch and copy the binaries to a
point where ensurepip expects to find them?
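
A rough sketch of what such a build step might do, written in Python rather than make for readability. The helper names and the branch name in the usage comment are hypothetical; only the repository URL and the bundled directory come from this thread:

```python
import shutil
import subprocess
from pathlib import Path


def shallow_clone(repo_url, branch, dest):
    """Fetch only the tip commit of a single branch."""
    subprocess.run(
        ["git", "clone", "--depth", "1", "--branch", branch,
         repo_url, str(dest)],
        check=True,
    )


def copy_wheels(src_dir, bundled_dir):
    """Copy every wheel from the checkout to where ensurepip looks for it."""
    bundled_dir = Path(bundled_dir)
    bundled_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for wheel in sorted(Path(src_dir).glob("*.whl")):
        shutil.copy2(wheel, bundled_dir / wheel.name)
        copied.append(wheel.name)
    return copied


# Hypothetical usage; the branch name is an assumption:
# shallow_clone("https://github.com/python/cpython-bin-deps",
#               "ensurepip-wheels", "externals/ensurepip")
# copy_wheels("externals/ensurepip", "Lib/ensurepip/_bundled")
```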

I'm fine with the general idea of moving these out to the bin-deps repo, as
long as cloning the main CPython repo and running "./configure && make &&
./python -m test test_ensurepip" still works. We'd also want to add docs to
the developer guide on how to update them (those docs are missing at the
moment, since the update process is just dropping the new wheel files
directly into the right place)

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Move ensurepip blobs to external place

2018-03-24 Thread Paul Moore
On 24 March 2018 at 10:50, Nick Coghlan  wrote:
> On 24 March 2018 at 18:50, Serhiy Storchaka  wrote:
>>
>> Currently the repository contains bundled pip and setuptools (2 MB total)
>> which are updated with every release of pip and setuptools. This increases
>> the size of the repository by around 2 MB several times per year. There
>> have been 37 updates of Lib/ensurepip/_bundled in total, so the repository
>> contains up to 70 MB of unused blobs. The size of the repository is 350 MB.
>> Currently the blobs take up to 20% of the size of the repository, but this
>> percentage will likely grow in the future, because they were added only 4
>> years ago.
>>
>> Wouldn't it be better to put them into a separate repository, like Tcl/Tk
>> and other external binaries for Windows, and download only the most recent
>> version?
>
>
> Specifically, I believe that would entail adding them to
> https://github.com/python/cpython-bin-deps, and then updating the make file
> to do a shallow clone of the relevant branch and copy the binaries to a
> point where ensurepip expects to find them?
>
> I'm fine with the general idea of moving these out to the bin-deps repo, as
> long as cloning the main CPython repo and running "./configure && make &&
> ./python -m test test_ensurepip" still works. We'd also want to add docs to
> the developer guide on how to update them (those docs are missing at the
> moment, since the update process is just dropping the new wheel files
> directly into the right place)

I don't have a problem with moving the pip/setuptools wheels - as long
as (as a pip dev doing a release) I know where to put the files, it
makes little difference to me. But as Nick says, if the files aren't
in the main CPython repository, the build process (both the Unix and
the Windows processes) will need updating to ensure that the files are
taken from where they do reside and put in the right places.

I'd assume that a change like that is big enough that it would be
targeted at 3.8, BTW (and so won't affect what I need to do for 3.7).

Paul


[Python-Dev] Replacing self.__dict__ in __init__

2018-03-24 Thread Tin Tvrtković
Hi Python-dev,

I'm one of the core attrs contributors, and I'm contemplating applying an
optimization to our generated __init__s. Before someone warns me python-dev
is for the development of the language itself, there are two reasons I'm
posting this here:

1) it's a very low level question that I'd really like the input of the
core devs on, and
2) maybe this will find its way into dataclasses if it works out.

I've found that, if a class has more than one attribute, instead of
creating an init like this:

self.a = a
self.b = b
self.c = c

it's faster to do this:

self.__dict__ = {'a': a, 'b': b, 'c': c}

i.e. to replace the instance dictionary altogether. On PyPy, their core
devs inform me this is a bad idea because the instance dictionary is
special there, so we won't be doing this on PyPy.
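
For anyone who wants to reproduce the comparison, here is a small timing sketch. The class names are made up for illustration, and the numbers will vary by interpreter version and machine:

```python
import timeit


class Assign:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


class Replace:
    def __init__(self, a, b, c):
        # Swap in a freshly built instance dictionary.
        self.__dict__ = {'a': a, 'b': b, 'c': c}


# Both variants end up with the same attributes.
assert vars(Assign(1, 2, 3)) == vars(Replace(1, 2, 3))

for cls in (Assign, Replace):
    seconds = timeit.timeit(lambda: cls(1, 2, 3), number=200_000)
    print(f"{cls.__name__}: {seconds:.3f} s")
```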

But is it safe to do on CPython?

To make the question simpler, disregard the possibility of custom setters
on the attributes.

Thanks in advance!


Re: [Python-Dev] Replacing self.__dict__ in __init__

2018-03-24 Thread Kirill Balunov
2018-03-24 17:18 GMT+03:00 Tin Tvrtković :
>
> I've found that, if a class has more than one attribute, instead of
> creating an init like this:
>
> self.a = a
> self.b = b
> self.c = c
>
> it's faster to do this:
>
> self.__dict__ = {'a': a, 'b': b, 'c': c}
>
> i.e. to replace the instance dictionary altogether. On PyPy, their core
> devs inform me this is a bad idea because the instance dictionary is
> special there, so we won't be doing this on PyPy.
>

But why do you need to replace it, when you can just update it?

class C:
    def __init__(self, a, b, c):
        self.__dict__.update({'a': a, 'b': b, 'c': c})

I'm certainly not a developer. Just out of curiosity.

With kind regards,
-gdg


Re: [Python-Dev] Replacing self.__dict__ in __init__

2018-03-24 Thread Steven D'Aprano
On Sat, Mar 24, 2018 at 02:18:14PM +, Tin Tvrtković wrote:
 
> self.__dict__ = {'a': a, 'b': b, 'c': c}
> 
> i.e. to replace the instance dictionary altogether. On PyPy, their core
> devs inform me this is a bad idea because the instance dictionary is
> special there, so we won't be doing this on PyPy.
> 
> But is it safe to do on CPython?

I don't know if it's safe, but replacing __dict__ is certainly an old
and famous idiom:

https://code.activestate.com/recipes/66531-singleton-we-dont-need-no-stinkin-singleton-the-bo/
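
The idea behind that recipe can be sketched in a few lines. This is a simplified illustration of the pattern, not a copy of the recipe's code:

```python
class Borg:
    # One dictionary shared by every instance of the class.
    _shared_state = {}

    def __init__(self):
        # Replace the per-instance dictionary with the shared one.
        self.__dict__ = self._shared_state


first = Borg()
second = Borg()
first.answer = 42

# Distinct objects, but an attribute set on one is visible on the other.
print(first is second, second.answer)
```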


-- 
Steve


Re: [Python-Dev] Replacing self.__dict__ in __init__

2018-03-24 Thread Raymond Hettinger

> On Mar 24, 2018, at 7:18 AM, Tin Tvrtković  wrote:
> 
> it's faster to do this:
> 
> self.__dict__ = {'a': a, 'b': b, 'c': c}
> 
> i.e. to replace the instance dictionary altogether. On PyPy, their core devs 
> inform me this is a bad idea because the instance dictionary is special 
> there, so we won't be doing this on PyPy. 
> 
> But is it safe to do on CPython?

This should work. I've seen it done in other production tools without any ill 
effect.

The dict can be replaced during __init__() and still get benefits of 
key-sharing.  That benefit is lost only when the instance dict keys are 
modified downstream from __init__().  So, from a dict size point of view, your 
optimization is fine.

Still, you should look at whether this would affect static type checkers, lint 
tools, and other tooling.


Raymond


Re: [Python-Dev] Move ensurepip blobs to external place

2018-03-24 Thread Steve Dower
Or we could just pull the right version directly from PyPI? (Note that updating 
the version should be an explicit step, as it is today, but the file should be 
identical to what’s on PyPI, right? And a urlretrieve is easier than pulling 
from a git repo.)
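
A minimal sketch of that download step, assuming the build pins both the file URL and its SHA-256 hash. The function name `fetch_wheel` and its parameters are hypothetical, and no real PyPI URL is shown here:

```python
import hashlib
from pathlib import Path
from urllib.request import urlretrieve


def fetch_wheel(url, dest, expected_sha256):
    """Download one pinned wheel and check that it matches the expected hash."""
    dest = Path(dest)
    urlretrieve(url, str(dest))
    digest = hashlib.sha256(dest.read_bytes()).hexdigest()
    if digest != expected_sha256:
        # Do not leave an unverified file where ensurepip could pick it up.
        dest.unlink()
        raise ValueError(f"hash mismatch for {dest.name}")
    return dest
```

Pinning the hash keeps the version update an explicit step: changing the bundled version means changing both the URL and the hash in the repository.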

Top-posted from my Windows phone

From: Paul Moore
Sent: Saturday, March 24, 2018 4:17
To: Nick Coghlan
Cc: Serhiy Storchaka; python-dev
Subject: Re: [Python-Dev] Move ensurepip blobs to external place

On 24 March 2018 at 10:50, Nick Coghlan  wrote:
> On 24 March 2018 at 18:50, Serhiy Storchaka  wrote:
>>
>> Currently the repository contains bundled pip and setuptools (2 MB total)
>> which are updated with every release of pip and setuptools. This increases
>> the size of the repository by around 2 MB several times per year. There
>> have been 37 updates of Lib/ensurepip/_bundled in total, so the repository
>> contains up to 70 MB of unused blobs. The size of the repository is 350 MB.
>> Currently the blobs take up to 20% of the size of the repository, but this
>> percentage will likely grow in the future, because they were added only 4
>> years ago.
>>
>> Wouldn't it be better to put them into a separate repository, like Tcl/Tk
>> and other external binaries for Windows, and download only the most recent
>> version?
>
>
> Specifically, I believe that would entail adding them to
> https://github.com/python/cpython-bin-deps, and then updating the make file
> to do a shallow clone of the relevant branch and copy the binaries to a
> point where ensurepip expects to find them?
>
> I'm fine with the general idea of moving these out to the bin-deps repo, as
> long as cloning the main CPython repo and running "./configure && make &&
> ./python -m test test_ensurepip" still works. We'd also want to add docs to
> the developer guide on how to update them (those docs are missing at the
> moment, since the update process is just dropping the new wheel files
> directly into the right place)

I don't have a problem with moving the pip/setuptools wheels - as long
as (as a pip dev doing a release) I know where to put the files, it
makes little difference to me. But as Nick says, if the files aren't
in the main CPython repository, the build process (both the Unix and
the Windows processes) will need updating to ensure that the files are
taken from where they do reside and put in the right places.

I'd assume that a change like that is big enough that it would be
targeted at 3.8, BTW (and so won't affect what I need to do for 3.7).

Paul


Re: [Python-Dev] Move ensurepip blobs to external place

2018-03-24 Thread Ned Deily
On Mar 24, 2018, at 16:13, Steve Dower  wrote:
> Or we could just pull the right version directly from PyPI? (Note that 
> updating the version should be an explicit step, as it is today, but the file 
> should be identical to what’s on PyPI, right? And a urlretrieve is easier 
> than pulling from a git repo.)

I think the primary original rationale for having the pip wheel and its 
dependencies checked into the cpython repo was so that users would be able to 
install pip even if they did not have an Internet connection.  But perhaps that 
requirement can be relaxed a bit if we say that the necessary wheels are 
vendored into all of our downloadable release items, that is, included in the 
packaging of source release files (the various tarballs) and the Windows and 
macOS binary installers.  The main change would likely be making ensurepip a 
bit smarter to download if the bundled wheels are not present in the source 
directory.  Assuming that people building from a cpython repo need to have a 
network connection if they want to run ensurepip, at least for the first time, 
is probably not an onerous requirement.
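
The "use the bundled wheel if present, otherwise download" logic could look roughly like this. It is a sketch only: `locate_wheel` and the `download` callback are hypothetical names, not part of ensurepip:

```python
from pathlib import Path


def locate_wheel(bundled_dir, project, download):
    """Return a bundled wheel for project, downloading one only if none exists."""
    bundled_dir = Path(bundled_dir)
    bundled_dir.mkdir(parents=True, exist_ok=True)
    matches = sorted(bundled_dir.glob(f"{project}-*.whl"))
    if matches:
        # Offline path: release tarballs and installers ship the wheels.
        # Taking the lexicographically last match is a simplification;
        # a real implementation would pin an exact version.
        return matches[-1]
    # Repository checkout without wheels: needs the network the first time.
    return download(project, bundled_dir)
```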


--
  Ned Deily
  [email protected] -- []



Re: [Python-Dev] Replacing self.__dict__ in __init__

2018-03-24 Thread Nick Coghlan
On 25 March 2018 at 00:18, Tin Tvrtković  wrote:

> But is it safe to do on CPython?
>

That depends on what you mean by "safe" :)

It won't crash, but it will lose any existing entries that a metaclass,
subclass, or __new__ method implementation might have added to the instance
dictionary before calling the __init__ method. That can be OK in a tightly
controlled application specific class hierarchy, but it would be
questionable in a general purpose utility library that may be combined with
arbitrary other types.

As Kirill suggests, `self.__dict__.update(new_attrs)` is likely to be
faster than repeated assignment statements, without the potentially odd
interactions with other instance initialisation code.

It should also be explicitly safe to do in the case of "type(self) is
__class__ and not self.__dict__", which would let you speed up the common
case of direct instantiation, while falling back to the update based
approach when combined with other classes at runtime.
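
A sketch of that guarded fast path. The class names `Point` and `Tagged` are illustrative; a generated `__init__` would build `new_attrs` from its own attribute list:

```python
class Point:
    def __init__(self, x, y):
        new_attrs = {"x": x, "y": y}
        if type(self) is __class__ and not self.__dict__:
            # Direct instantiation with an empty instance dict:
            # replacing it cannot lose anything.
            self.__dict__ = new_attrs
        else:
            # Subclass, or something already stored attributes:
            # merge instead of replacing.
            self.__dict__.update(new_attrs)


class Tagged(Point):
    def __new__(cls, *args, **kwargs):
        obj = super().__new__(cls)
        obj.tag = "set in __new__"  # would be lost by a plain replacement
        return obj


print(vars(Point(1, 2)))
print(vars(Tagged(1, 2)))
```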

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Move ensurepip blobs to external place

2018-03-24 Thread Nick Coghlan
On 25 March 2018 at 06:52, Ned Deily  wrote:

> On Mar 24, 2018, at 16:13, Steve Dower  wrote:
> > Or we could just pull the right version directly from PyPI? (Note that
> updating the version should be an explicit step, as it is today, but the
> file should be identical to what’s on PyPI, right? And a urlretrieve is
> easier than pulling from a git repo.)
>
> I think the primary original rationale for having the pip wheel and its
> dependencies checked into the cpython repo was so that users would be able
> to install pip even if they did not have an Internet connection.  But
> perhaps that requirement can be relaxed a bit if we say that the necessary
> wheels are vendored into all of our downloadable release items, that is,
> included in the packaging of source release files (the various tarballs)
> and the Windows and macOS binary installers.  The main change would likely
> be making ensurepip a bit smarter to download if the bundled wheels are not
> present in the source directory.  Assuming that people building from a
> cpython repo need to have a network connection if they want to run
> ensurepip, at least for the first time, is probably not an onerous
> requirement.
>

Right, having the wheels in the release artifacts is a requirement, as is
having them available for use when running the test suite, but having them
in the git repo isn't.

Adding them directly to the repo was just the simplest approach to getting
ensurepip working, since it didn't require any changes to the build process.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Move ensurepip blobs to external place

2018-03-24 Thread Matt Billenstein
As I recall, Git LFS makes storing large binary objects in external object
storage fairly seamless. It might be a good fit for keeping the same workflow
without bloating the repo.

M

--
Matt Billenstein
[email protected]

Sent from my iPhone 6 (this put here so you know I have one)

> On Mar 24, 2018, at 8:27 PM, Nick Coghlan  wrote:
> 
>> On 25 March 2018 at 06:52, Ned Deily  wrote:
>> On Mar 24, 2018, at 16:13, Steve Dower  wrote:
>> > Or we could just pull the right version directly from PyPI? (Note that 
>> > updating the version should be an explicit step, as it is today, but the 
>> > file should be identical to what’s on PyPI, right? And a urlretrieve is 
>> > easier than pulling from a git repo.)
>> 
>> I think the primary original rationale for having the pip wheel and its 
>> dependencies checked into the cpython repo was so that users would be able 
>> to install pip even if they did not have an Internet connection.  But 
>> perhaps that requirement can be relaxed a bit if we say that the necessary 
>> wheels are vendored into all of our downloadable release items, that is, 
>> included in the packaging of source release files (the various tarballs) and 
>> the Windows and macOS binary installers.  The main change would likely be 
>> making ensurepip a bit smarter to download if the bundled wheels are not 
>> present in the source directory.  Assuming that people building from a 
>> cpython repo need to have a network connection if they want to run 
>> ensurepip, at least for the first time, is probably not an onerous 
>> requirement.
> 
> Right, having the wheels in the release artifacts is a requirement, as is 
> having them available for use when running the test suite, but having them in 
> the git repo isn't.
> 
> Adding them directly to the repo was just the simplest approach to getting 
> ensurepip working, since it didn't require any changes to the build process.
> 
> Cheers,
> Nick.
> 
> -- 
> Nick Coghlan   |   [email protected]   |   Brisbane, Australia