[Python-Dev] Does Zip Importer have to be Special?

2014-07-24 Thread Phil Thompson
I have an importer for use in applications that embed an interpreter 
that does a similar job to the Zip importer (except that the storage is 
a C data structure rather than a .zip file). Just like the Zip importer 
I need to import my importer and add it to sys.path_hooks. However the 
earliest opportunity I have to do this is after the Py_Initialize() call 
returns - but this is too late because some parts of the standard 
library have already needed to be imported.
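
To make the setup concrete, here is a minimal sketch of an importer of this kind, written in Python rather than C. The names `EMBEDDED_SOURCES`, `EMBEDDED_PREFIX` and `EmbeddedImporter` are assumptions made up for the illustration (a dict stands in for the C data structure); only the `sys.path_hooks`, `sys.path` and `importlib` calls are real APIs.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical stand-in for the embedded C data structure: name -> source text.
EMBEDDED_SOURCES = {
    "embedded_hello": "GREETING = 'hello from the embedded table'\n",
}

# Hypothetical sys.path entry that only this importer recognises.
EMBEDDED_PREFIX = ":/embedded"


class EmbeddedImporter(importlib.abc.InspectLoader):
    """Acts as both path entry finder and loader for the in-memory table."""

    def __init__(self, path_entry):
        # Raising ImportError tells PathFinder to try the next path hook.
        if path_entry != EMBEDDED_PREFIX:
            raise ImportError("not an embedded path entry", path=path_entry)

    def find_spec(self, fullname, target=None):
        # Returning None means "this path entry does not provide the module".
        if fullname not in EMBEDDED_SOURCES:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def is_package(self, fullname):
        # The sketch only handles plain modules, not packages.
        return False

    def get_source(self, fullname):
        try:
            return EMBEDDED_SOURCES[fullname]
        except KeyError:
            raise ImportError("no embedded module", name=fullname) from None


def install():
    """Register the hook and the path entry it claims."""
    sys.path_hooks.insert(0, EmbeddedImporter)
    sys.path.insert(0, EMBEDDED_PREFIX)
    # Forget any finder (or None) cached for the entry before the hook existed.
    sys.path_importer_cache.pop(EMBEDDED_PREFIX, None)


install()
import embedded_hello

print(embedded_hello.GREETING)
```

Everything in `install()` can only run once the interpreter is up, which is exactly the ordering problem: standard library modules imported during initialization never get offered to the hook.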


My current workaround is to include a modified version of _bootstrap.py 
as a frozen module that has the necessary steps added to the end of its 
_install() function.


The Zip importer doesn't have this problem because it gets special 
treatment - the call to its equivalent code is hard-coded and happens 
exactly when needed.
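
A quick way to observe this from inside a running interpreter is to look at what is already registered and already imported. This is only an inspection sketch; `describe_path_hooks` is a helper name invented here, and the exact contents of both lists depend on the Python version and how the interpreter was started.

```python
import sys


def describe_path_hooks():
    """Return a readable name for each entry currently in sys.path_hooks."""
    return [getattr(hook, "__qualname__", repr(hook)) for hook in sys.path_hooks]


# On a stock CPython the zip importer is expected to show up here without
# the application having registered it.
print(describe_path_hooks())

# Standard library modules that initialization pulled in on its own, before
# any application code could add a path hook.
print(sorted(name for name in ("codecs", "encodings") if name in sys.modules))
```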


What would help is a table of functions that were called where 
_PyImportZip_Init() is currently called. By default the only entry in 
the table would be _PyImportZip_Init. There would be a way of modifying 
the table, either like how PyImport_FrozenModules is handled or how 
Inittab is handled.


...or if there is a better solution that I have missed that doesn't 
require a modified _bootstrap.py.


Thanks,
Phil
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Does Zip Importer have to be Special?

2014-07-24 Thread Brett Cannon
On Thu Jul 24 2014 at 1:07:12 PM, Phil Thompson wrote:

> I have an importer for use in applications that embed an interpreter
> that does a similar job to the Zip importer (except that the storage is
> a C data structure rather than a .zip file). Just like the Zip importer
> I need to import my importer and add it to sys.path_hooks. However the
> earliest opportunity I have to do this is after the Py_Initialize() call
> returns - but this is too late because some parts of the standard
> library have already needed to be imported.
>
> My current workaround is to include a modified version of _bootstrap.py
> as a frozen module that has the necessary steps added to the end of its
> _install() function.
>
> The Zip importer doesn't have this problem because it gets special
> treatment - the call to its equivalent code is hard-coded and happens
> exactly when needed.
>
> What would help is a table of functions that were called where
> _PyImportZip_Init() is currently called. By default the only entry in
> the table would be _PyImportZip_Init. There would be a way of modifying
> the table, either like how PyImport_FrozenModules is handled or how
> Inittab is handled.
>
> ...or if there is a better solution that I have missed that doesn't
> require a modified _bootstrap.py.
>

Basically you want a way to specify arguments into
importlib._bootstrap._install() so that sys.path_hooks and sys.meta_path
were configurable instead of hard-coded (it could also be done just past
importlib being installed, but that's a minor detail). Either way there is
technically no reason not to allow for it, just lack of motivation since
this would only come up for people who embed the interpreter AND have a
custom importer which affects loading the stdlib as well (any reason you
can't freeze the stdlib as a solution?).

We could go the route of some static array that people could modify.
Another option would be to allow for the specification of a single function
which is called just prior to importing the rest of the stdlib.

The problem with all of this is you are essentially asking for a hook to
let you have code have access to the interpreter state before it is fully
initialized. Zipimport and the various bits of code that get loaded during
startup are special since they are coded to avoid touching anything that
isn't ready to be used. So if we expose something that allows access prior
to full initialization it would have to be documented as having no
guarantees of interpreter state, etc. so we are not held to some API that
makes future improvements difficult.

IOW allowing for easy patching of Python is probably the best option I can
think of. Would tweaking importlib._bootstrap._install() to accept
specified values for sys.meta_path and sys.path_hooks be enough so that you
can change the call site for those functions?
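
As a rough model of that idea, written at the Python level: the function below is a hypothetical helper, NOT the real signature of importlib._bootstrap._install(), and `refuse_everything` is a placeholder hook. It only shows what "caller-supplied values for sys.meta_path and sys.path_hooks" could amount to.

```python
import sys


def install_import_hooks(path_hooks=(), meta_path=()):
    """Hypothetical helper: put caller-supplied entries ahead of the defaults.

    This models the proposal only; it is not how importlib installs itself.
    """
    # Prepend so the custom entries are consulted before the built-in ones.
    sys.path_hooks[:0] = list(path_hooks)
    sys.meta_path[:0] = list(meta_path)
    # Drop cached finders so existing sys.path entries are offered to the
    # new hooks the next time something is imported.
    sys.path_importer_cache.clear()


def refuse_everything(path_entry):
    """Placeholder path hook that declines every path entry."""
    raise ImportError("declined", path=path_entry)


install_import_hooks(path_hooks=[refuse_everything])
print(sys.path_hooks[0] is refuse_everything)
```

The point of the real change would be to run the equivalent of this before the rest of the stdlib is imported, which a helper called after Py_Initialize() cannot do.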


Re: [Python-Dev] Does Zip Importer have to be Special?

2014-07-24 Thread Phil Thompson

On 24/07/2014 6:48 pm, Brett Cannon wrote:
> On Thu Jul 24 2014 at 1:07:12 PM, Phil Thompson wrote:
>
>> I have an importer for use in applications that embed an interpreter
>> that does a similar job to the Zip importer (except that the storage is
>> a C data structure rather than a .zip file). Just like the Zip importer
>> I need to import my importer and add it to sys.path_hooks. However the
>> earliest opportunity I have to do this is after the Py_Initialize() call
>> returns - but this is too late because some parts of the standard
>> library have already needed to be imported.
>>
>> My current workaround is to include a modified version of _bootstrap.py
>> as a frozen module that has the necessary steps added to the end of its
>> _install() function.
>>
>> The Zip importer doesn't have this problem because it gets special
>> treatment - the call to its equivalent code is hard-coded and happens
>> exactly when needed.
>>
>> What would help is a table of functions that were called where
>> _PyImportZip_Init() is currently called. By default the only entry in
>> the table would be _PyImportZip_Init. There would be a way of modifying
>> the table, either like how PyImport_FrozenModules is handled or how
>> Inittab is handled.
>>
>> ...or if there is a better solution that I have missed that doesn't
>> require a modified _bootstrap.py.
>
> Basically you want a way to specify arguments into
> importlib._bootstrap._install() so that sys.path_hooks and sys.meta_path
> were configurable instead of hard-coded (it could also be done just past
> importlib being installed, but that's a minor detail). Either way there is
> technically no reason not to allow for it, just lack of motivation since
> this would only come up for people who embed the interpreter AND have a
> custom importer which affects loading the stdlib as well (any reason you
> can't freeze the stdlib as a solution?).


Not really. I'd lose the compression my importer implements.

(Are there any problems with freezing packages rather than simple 
modules?)



> We could go the route of some static array that people could modify.
> Another option would be to allow for the specification of a single function
> which is called just prior to importing the rest of the stdlib.
>
> The problem with all of this is you are essentially asking for a hook to
> let you have code have access to the interpreter state before it is fully
> initialized. Zipimport and the various bits of code that get loaded during
> startup are special since they are coded to avoid touching anything that
> isn't ready to be used. So if we expose something that allows access prior
> to full initialization it would have to be documented as having no
> guarantees of interpreter state, etc. so we are not held to some API that
> makes future improvements difficult.
>
> IOW allowing for easy patching of Python is probably the best option I can
> think of. Would tweaking importlib._bootstrap._install() to accept
> specified values for sys.meta_path and sys.path_hooks be enough so that you
> can change the call site for those functions?


My importer runs under PathFinder so it needs sys.path as well (and 
doesn't need sys.meta_path).
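
A small sketch of why the path entry matters: PathFinder only offers a hook the entries of the path list it is searching (sys.path by default), so a hook with no matching entry is never asked anything. `recording_hook` and the `":/demo-entry"` string are made up for the illustration.

```python
import sys
from importlib.machinery import PathFinder

DEMO_ENTRY = ":/demo-entry"  # hypothetical path entry, not a real directory
SEEN_ENTRIES = []


def recording_hook(path_entry):
    """Hypothetical path hook: note what it is asked about, then decline."""
    SEEN_ENTRIES.append(path_entry)
    raise ImportError("declined", path=path_entry)


sys.path_importer_cache.pop(DEMO_ENTRY, None)
sys.path_hooks.insert(0, recording_hook)
try:
    # PathFinder walks the given path list and hands each uncached entry to
    # the hooks in order; the hook sees nothing outside that list.
    spec = PathFinder.find_spec("no_such_module_for_demo", path=[DEMO_ENTRY])
finally:
    # Undo the changes so the demo leaves the import system as it found it.
    sys.path_hooks.remove(recording_hook)
    sys.path_importer_cache.pop(DEMO_ENTRY, None)

print(SEEN_ENTRIES, spec)
```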


Phil


Re: [Python-Dev] Does Zip Importer have to be Special?

2014-07-24 Thread Brett Cannon
On Thu Jul 24 2014 at 2:12:20 PM, Phil Thompson wrote:

> On 24/07/2014 6:48 pm, Brett Cannon wrote:
> > On Thu Jul 24 2014 at 1:07:12 PM, Phil Thompson wrote:
> >
> >> I have an importer for use in applications that embed an interpreter
> >> that does a similar job to the Zip importer (except that the storage
> >> is
> >> a C data structure rather than a .zip file). Just like the Zip
> >> importer
> >> I need to import my importer and add it to sys.path_hooks. However the
> >> earliest opportunity I have to do this is after the Py_Initialize()
> >> call
> >> returns - but this is too late because some parts of the standard
> >> library have already needed to be imported.
> >>
> >> My current workaround is to include a modified version of
> >> _bootstrap.py
> >> as a frozen module that has the necessary steps added to the end of
> >> its
> >> _install() function.
> >>
> >> The Zip importer doesn't have this problem because it gets special
> >> treatment - the call to its equivalent code is hard-coded and happens
> >> exactly when needed.
> >>
> >> What would help is a table of functions that were called where
> >> _PyImportZip_Init() is currently called. By default the only entry in
> >> the table would be _PyImportZip_Init. There would be a way of
> >> modifying
> >> the table, either like how PyImport_FrozenModules is handled or how
> >> Inittab is handled.
> >>
> >> ...or if there is a better solution that I have missed that doesn't
> >> require a modified _bootstrap.py.
> >>
> >
> > Basically you want a way to specify arguments into
> > importlib._bootstrap._install() so that sys.path_hooks and
> > sys.meta_path
> > were configurable instead of hard-coded (it could also be done just
> > past
> > importlib being installed, but that's a minor detail). Either way there
> > is
> > technically no reason not to allow for it, just lack of motivation
> > since
> > this would only come up for people who embed the interpreter AND have a
> > custom importer which affects loading the stdlib as well (any reason
> > you
> > can't freeze the stdlib as a solution?).
>
> Not really. I'd lose the compression my importer implements.
>
> (Are there any problems with freezing packages rather than simple
> modules?)
>

Nope, modules and packages are both supported.


>
> > We could go the route of some static array that people could modify.
> > Another option would be to allow for the specification of a single
> > function
> > which is called just prior to importing the rest of the stdlib.
> >
> > The problem with all of this is you are essentially asking for a hook
> > to
> > let you have code have access to the interpreter state before it is
> > fully
> > initialized. Zipimport and the various bits of code that get loaded
> > during
> > startup are special since they are coded to avoid touching anything
> > that
> > isn't ready to be used. So if we expose something that allows access
> > prior
> > to full initialization it would have to be documented as having no
> > guarantees of interpreter state, etc. so we are not held to some API
> > that
> > makes future improvements difficult.
> >
> > IOW allowing for easy patching of Python is probably the best option I
> > can
> > think of. Would tweaking importlib._bootstrap._install() to accept
> > specified values for sys.meta_path and sys.path_hooks be enough so that
> > you
> > can change the call site for those functions?
>
> My importer runs under PathFinder so it needs sys.path as well (and
> doesn't need sys.meta_path).
>

sys.path can be set via PYTHONPATH, etc. so that shouldn't be as much of an
issue.


Re: [Python-Dev] Does Zip Importer have to be Special?

2014-07-24 Thread Nick Coghlan
On 25 Jul 2014 03:51, "Brett Cannon" wrote:

> The problem with all of this is you are essentially asking for a hook to
> let you have code have access to the interpreter state before it is fully
> initialized. Zipimport and the various bits of code that get loaded during
> startup are special since they are coded to avoid touching anything that
> isn't ready to be used. So if we expose something that allows access prior
> to full initialization it would have to be documented as having no
> guarantees of interpreter state, etc. so we are not held to some API that
> makes future improvements difficult.

Note that this is *exactly* the problem PEP 432 is designed to handle:
separating the configuration of the core interpreter from the configuration
of the operating system interfaces, so the latter can run relatively
normally (at least compared to today).

As you say, though, it's a niche problem compared to something like
packaging, which is why it got bumped down my personal priority list. I
haven't even got back to the first preparatory step I identified which is
to separate out our main functions to a separate "Programs" directory so
it's easier to distinguish "embeds Python" sections of the code from the
more typical "is part of Python" and "extends Python" code.

> IOW allowing for easy patching of Python is probably the best option I
> can think of.

Yeah, that sounds reasonable - IIRC, Christian ended up going with a
similar "make it patch friendly" approach for the hashing changes, rather
than going overboard with configuration options.

Cheers,
Nick.