[Python-ideas] Link accepted PEPs to their whatsnew section?

2018-06-12 Thread Neil Schemenauer
I'm testing "Data Classes" for Python 3.7.  Awesome new feature,
BTW.  The PEP is the first search result when I look up "dataclass
python".  Given that the PEP is not the best documentation for an
end user, I wonder if we should have a link in the header section of
the PEP that goes to better documentation.  We could link accepted
PEPs to their section of the whatsnew.  Or, link to the language
documentation that describes the feature.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Move optional data out of pyc files

2018-04-14 Thread Neil Schemenauer
On 2018-04-12, M.-A. Lemburg wrote:
> This leaves the proposal to restructure pyc files into a sectioned
> file and possibly indexed file to make access to (lazily) loaded
> parts faster.

I would like to see a format that can hold one or more modules in a
single file.  Something like the zip format but optimized for fast
interpreter startup time.  It should support lazy loading of module
parts (e.g. maybe my lazy bytecode execution idea[1]).  Obviously a
lot of details to work out.

The design should also take into account the widespread use of
virtual environments.  So, it should be easy and space efficient to
build virtual environments using this format (e.g. maybe allow
overlays so that the stdlib package is not copied into the virtual
environment; virtual packages would be overlaid on the stdlib file).
Also, it should be easy to bundle all modules into an "uber" package and
append it to the Python executable.  CPython should provide
out-of-box support for single-file executables.
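
For comparison, CPython can already import several modules from one zip
archive placed on sys.path.  The sketch below only shows that existing
baseline, not the startup-optimized format described above; the archive
and module names are made up for the demo.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding two modules (hypothetical names).
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("bundle_demo_a.py", "VALUE = 1\n")
    zf.writestr(
        "bundle_demo_b.py",
        "import bundle_demo_a\nVALUE = bundle_demo_a.VALUE + 1\n",
    )

# A zip file on sys.path is searched much like a directory.
sys.path.insert(0, archive)
importlib.invalidate_caches()
mod_b = importlib.import_module("bundle_demo_b")
print(mod_b.VALUE, mod_b.__file__)
```

Nothing here is lazy: both module bodies run as soon as they are
imported, which is the part the proposed format would want to improve.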


1. https://github.com/python/cpython/pull/6194


Re: [Python-ideas] f-string literals by default?

2017-12-05 Thread Neil Schemenauer
On 2017-12-05, Joseph Jevnik wrote:
> This would break code that uses str.format everywhere for very
> little benefit.

That is a very strong reason not to do it.  I think we can end this
thread.  Thanks.


[Python-ideas] f-string literals by default?

2017-12-05 Thread Neil Schemenauer
I think most people who have tried f-strings have found them handy.
Could we transition to making the default string literal into an
f-string?  I think there is a smooth migration path.

f-strings without embedded expressions already compile to the same
bytecode as normal string literals.  I.e. no overhead.  The issue
will be literal strings that contain the f-string format characters.
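
A quick way to check the "same bytecode" claim on a given interpreter is
to compile both spellings and compare the code objects.  This is a small
sketch, assuming CPython 3.6 or later; the file name "<demo>" is
arbitrary.

```python
# Compile the same assignment with and without the f prefix.
plain = compile('s = "hello"', "<demo>", "exec")
fstr = compile('s = f"hello"', "<demo>", "exec")

same_code = plain.co_code == fstr.co_code
same_consts = plain.co_consts == fstr.co_consts
print(same_code, same_consts)

# The two spellings only diverge once braces appear in the literal.
x = 1
print("{x}", f"{x}")
```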

We could add a future import, e.g.

    from __future__ import fstring_literals

that would make all literal strings in the module into f-strings.
In some future release, we could warn about literal strings in
modules without the future import that contain f-string format
characters.  Eventually, we can change the default.

To make migration easier, we can provide a source-to-source
translation tool.  It is quite simple to do that using
the tokenize module.
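
As a rough idea of what the first step of such a tool could look like,
the sketch below uses the standard tokenize module to list string
literals that would change meaning under the proposal.  The function
name and the rule for skipping prefixes are my assumptions, not an
existing tool.

```python
import io
import tokenize


def literals_with_braces(source):
    """Return (line, text) for non-f, non-bytes literals containing { or }."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue  # newer Pythons tokenize f-strings as separate token types
        # The prefix is everything before the first quote character.
        quote_at = min(
            i for i in (tok.string.find("'"), tok.string.find('"')) if i != -1
        )
        prefix = tok.string[:quote_at].lower()
        if "f" in prefix or "b" in prefix:
            continue  # already an f-string, or a bytes literal
        if "{" in tok.string or "}" in tok.string:
            found.append((tok.start[0], tok.string))
    return found


sample = 'a = "plain"\nb = "total: {n}"\nc = f"{a}"\nd = b"{raw}"\n'
print(literals_with_braces(sample))
```

A real translator would then rewrite each hit, for example by doubling
the braces, and write the tokens back out.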


Re: [Python-ideas] Provide a way to import module without exec body

2017-12-02 Thread Neil Schemenauer
On 2017-12-03, Nick Coghlan wrote:
> There'd be some subtleties around handling backwards compatibility
> with __import__ overrides (essentially, CREATE_MODULE would have to
> revert to doing all the work, while EXEC_MODULE would become a no-op),
> but the basic idea seems plausible.

Right now (half-baked ideas), I'm thinking:

IMPORT_RESOLVE

Gives the abs_name for a module (to feed to _find_and_load())

IMPORT_LOAD

Calls _find_and_load() with abs_name as argument.  The body of
the module is not executed yet.  Could return a spec or a module
with the spec that contains the code object of the body.

IMPORT_EXEC

Executes the body of the module.

IMPORT_FROM

Calls _handle_fromlist().
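
The four stages can be approximated today with public importlib calls.
This is only a Python-level analogy, assuming the stdlib json package as
the target; the opcodes themselves do not exist, and unlike a real import
the module below is never put in sys.modules.

```python
import importlib.util

# IMPORT_RESOLVE: turn a relative name into an absolute one.
abs_name = importlib.util.resolve_name(".decoder", package="json")

# IMPORT_LOAD: find the spec and create the module without running its body.
spec = importlib.util.find_spec(abs_name)
module = importlib.util.module_from_spec(spec)
loaded_before_exec = hasattr(module, "JSONDecoder")

# IMPORT_EXEC: run the module body.
spec.loader.exec_module(module)

# IMPORT_FROM: fetch a name that a "from ... import" would bind.
JSONDecoder = getattr(module, "JSONDecoder")
print(abs_name, loaded_before_exec, JSONDecoder.__name__)
```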

Props to Brett for making importlib in such a way that this clean
separation should be relatively easy to do.

To handle custom __import__ hook, I think we can do the following.
Have each opcode detect if __import__ is overridden.  There is
already such a test (import_name fast path).  If it is overridden,
IMPORT_RESOLVE and IMPORT_LOAD will gather up info and then
IMPORT_EXEC will call __import__() using compatible arguments.

Initially, the benefit of making these changes is not some
performance improvement or some functionality we didn't previously
have.  importlib does all this already and probably just as quickly.
The benefit is that the import system becomes more understandable.

If we decide it is a good idea, we could expose hooks for these
opcodes.  Not like __import__ though.  Maybe there should be a
function like sys.set_import_hook(, func).  That will keep ceval
fast as it will know if there is a hook or not, without having to
crawl around in builtins.

Regards,

  Neil


Re: [Python-ideas] Provide a way to import module without exec body

2017-12-01 Thread Neil Schemenauer
On 2017-12-01, Chris Angelico wrote:
> Can you elaborate on where this is useful, please?

Introspection tools, for example, might want to look at the module
without executing it.  Also, it is a building block to make lazy loading
of modules work.  As Nick points out, importlib can do this already.
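
For reference, the way importlib can do this already is roughly the
LazyLoader recipe from the importlib documentation.  The sketch below
follows that recipe; the choice of colorsys as the target module is
arbitrary.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose body runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # only arranges the deferred load
    return module


colorsys = lazy_import("colorsys")
# The module body executes here, when an attribute is first needed.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))
```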

Currently, the IMPORT_NAME opcode both loads the code for a module and also
executes it.  The exec happens fairly deep in the guts of importlib.
This makes import.c and ceval.c mutually recursive.  The locking gets
complicated.  There are hacks like _call_with_frames_removed() to hide
the recursion going on.

Instead, we could have two separate opcodes, one that gets the module
but does not exec it (i.e. a function like __import__() that returns a
future) and another opcode that actually does the execution.  Figuring
out all the details is complicated.

Possible benefits:

- importlib is simpler

- reduce the amount of stack space used (removing recursion by
  "continuation passing style").

- makes profiling Python easier.  Tools like valgrind get confused
  by the call cycle between ceval.c and import.c.

- easier to implement lazy loading of modules (not necessarily a
  standard Python feature but will make 3rd party implementations
  cleaner)

I'm CCing Brett as I'm sure he has thoughts on this, given his intimate
knowledge of importlib.  To me, it seems like __import__() has a
terribly complicated API because it does so many different things.

Maybe two opcodes is not even enough.  Maybe we should have one to
resolve relative imports (i.e. import.c:resolve_name), one to load but
not exec a module given its absolute name (i.e.  _find_and_load()
without the exec), one to exec a loaded module, one or more to handle
the horror of "fromlist" (i.e.  _handle_fromlist()).

Regards,

  Neil


[Python-ideas] Provide a way to import module without exec body

2017-12-01 Thread Neil Schemenauer
I have been working on reducing Python startup time.  It would be
nice if there was some way to load a module into memory without exec
of its body code.  I'm sure other people have wished for this.

Perhaps there could be a new special function, similar to __import__
for this purpose.  E.g.  __load_module__().  To actually execute the
module, I had the idea to make module objects callable, i.e. tp_call
for PyModule_Type.  That's a little too cute though and will cause
confusion.  Maybe instead, add a function attribute to modules, e.g.
mod.__exec__().

I have a little experimental code, just a small step:

https://github.com/nascheme/cpython/tree/import_defer_exec

We need importlib to give us the module object and the bytecode
without doing the exec().  My hackish solution is to set properties
on __spec__ and then have PyImport_ImportModuleLevelObject() do the
exec().
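
At the Python level, the "module object plus bytecode, no exec" split
can be sketched with existing importlib calls.  The helper name
load_without_exec and the temporary file are assumptions for the demo;
this is not what the experimental branch does internally.

```python
import importlib.util
import os
import tempfile


def load_without_exec(name, path):
    """Return (module, code) for a source file; the body has not been run."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    code = spec.loader.get_code(name)
    return module, code


# Write a tiny module to disk so there is something to load.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "deferred_demo.py")
with open(path, "w") as f:
    f.write("VALUE = 6 * 7\n")

module, code = load_without_exec("deferred_demo", path)
ran_early = hasattr(module, "VALUE")
exec(code, module.__dict__)  # the deferred "exec" step
print(ran_early, module.VALUE)
```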
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Modules as global namespaces rather than dicts

2017-11-15 Thread Neil Schemenauer
On 2017-11-15, Koos Zevenhoven wrote:
> Another point, perhaps more difficult to address: Would for instance
> globals() then return a module instead of a dict/mapping?

For compatibility, it would definitely have to return a dict.  As a
result, calling globals() would cause the "fast globals" flag to be
cleared.  My hope is that getting a reference to the module dict
from user code is the exception rather than the norm.  I.e. most
modules would never have globals() called within their code.  If it
is too common, the flag will be cleared on most modules and we won't
gain much speed.
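
To make the flag behavior concrete, here is a toy model in pure Python.
It is not CPython code: ToyModule and its methods are invented for
illustration, and a real implementation would resolve slot indexes at
compile time instead of looking them up by name.

```python
class ToyModule:
    """Toy model of a module with slot-based globals and a fallback dict."""

    def __init__(self, names):
        self._index = {name: i for i, name in enumerate(names)}
        self._slots = [None] * len(names)
        self._dict = None
        self.fast_globals = True

    def store(self, name, value):
        if self.fast_globals:
            self._slots[self._index[name]] = value
        else:
            self._dict[name] = value

    def load(self, name):
        if self.fast_globals:
            return self._slots[self._index[name]]
        return self._dict[name]

    def globals(self):
        # Once user code holds the dict it can add or change names freely,
        # so clear the flag and use dict lookups from now on.
        if self.fast_globals:
            self._dict = {n: self._slots[i] for n, i in self._index.items()}
            self.fast_globals = False
        return self._dict


m = ToyModule(["x"])
m.store("x", 1)
fast_before = m.fast_globals
m.globals()["y"] = 2  # mutation through the dict
print(fast_before, m.fast_globals, m.load("x"), m.load("y"))
```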


[Python-ideas] Modules as global namespaces rather than dicts

2017-11-14 Thread Neil Schemenauer
This is an idea I have been playing with and seems to hold some
promise.  I think we should use a module instance as the standard
global namespace rather than directly using its dict.  I have a
prototype version of CPython that does this, not working 100% yet
though.

Major changes:

- In the frameobject, remove f_builtins and f_globals, add
  f_namespace that refers to the module.  Make f_globals and
  f_builtins into properties.

- Change ceval to use f_namespace, rather than carrying around
  globals and builtins.  Change functions that take globals as a
  dict so they also accept a module object.

- Change the module object to keep track of the builtins for it.

- Change funcobject (e.g. PyFunction_NewWithQualName) to
  accept 'globals' as a module object

- When given a dict and we now expect a module object, create an
  anonymous module to wrap it.  This part is a bit tricky due to
  reference cycles and speed requirements.  However, my hope is that
  if the internals of CPython can be made to pass modules instead of
  dicts around, these anonymous modules will become a rare case and
  so won't matter if they are a little slower.


So, what is the purpose of all this trouble?

- I believe quite a lot of Python internals can be simpler.  For
  example, importlib is complicated by the fact that a dict is
  passed around when most of the logic would prefer to have the
  module.  Grubbing in sys.modules to look up the module object is
  ugly.  The expression "exec(code, module)" is elegant to me.

- I believe it will be possible to make global variable access work
  similar to fast locals.  I.e. have each module contain an indexed
  list of global names, and use an array lookup rather than a dict
  lookup in the normal case.  For backwards compatibility, my idea
  is to keep a flag on the module object noting if fast global
  behavior is okay.  If someone grabs the module dict, e.g. using
  module.__dict__ , vars() or frame.f_globals then we clear the flag
  and revert to the slow dict behavior.  Also, if a piece of code is
  executed in a different namespace, we have to revert to the slow
  case.  That does not seem so hard to do.

- I want to have properties for module globals and have them work
  from within the module.  I.e. changing something from a global
  variable to a property should not require changes to other module
  code.  I.e. LOAD_GLOBAL should trigger a property get.
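
The gap behind the last point can be seen on current CPython: a property
on a module's class works for attribute access from outside, but code
running inside the module does a plain dict lookup and never sees it.
A small sketch, with made-up names:

```python
import types


class DemoModule(types.ModuleType):
    @property
    def version(self):
        return "1.0"


mod = types.ModuleType("demo")
mod.__class__ = DemoModule  # module __class__ assignment works since 3.5

from_outside = mod.version  # attribute access goes through the property

try:
    # Code executed with the module dict as globals looks only in the dict
    # (and builtins), so the property is invisible here.
    exec("from_inside = version", mod.__dict__)
    inside_error = None
except NameError as exc:
    inside_error = exc
print(from_outside, repr(inside_error))
```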

Regards,

  Neil


Re: [Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr()

2017-09-12 Thread Neil Schemenauer
On 2017-09-12, Eric Snow wrote:
> Yeah, good luck! :). If I weren't otherwise occupied with my own crazy
> endeavor I'd lend a hand.

No problem.  It makes sense to have a proof of concept before
spending time on a PEP.  If the idea breaks too much old code it is
not going to happen.  So, I will work on a slow but mostly
compatible implementation for now.

Regards,

  Neil


[Python-ideas] LOAD_NAME/LOAD_GLOBAL should be use getattr()

2017-09-12 Thread Neil Schemenauer
This is my idea of making module properties work.  It is necessary
for various lazy-loading module ideas and it cleans up the language
IMHO.  I think it may be possible to do it with minimal backwards
compatibility problems and performance regression.

To me, the main issue with module properties (or module __getattr__)
is that you introduce another level of indirection on global
variable access.  Anywhere the module.__dict__ is used as the
globals for code execution, changing LOAD_NAME/LOAD_GLOBAL to have
another level of indirection is necessary.  That seems inescapable.

Introducing another special feature of modules to make this work is
not the solution, IMHO.  We should make module namespaces be more
like instance namespaces.  We already have a mechanism and it is
getattr on objects.
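
The effect can be emulated on an unmodified interpreter, because
module-level code uses LOAD_NAME/STORE_NAME and exec() accepts any
mapping as locals.  The sketch below forwards those lookups to
getattr()/setattr() on a module object; AttrNamespace and PropModule are
hypothetical helper names, and this is not how the prototype branch is
implemented.

```python
import types


class PropModule(types.ModuleType):
    @property
    def answer(self):
        return 42


class AttrNamespace:
    """Mapping that turns name lookups into getattr()/setattr() calls."""

    def __init__(self, module):
        self.module = module

    def __getitem__(self, name):
        try:
            return getattr(self.module, name)
        except AttributeError:
            raise KeyError(name) from None  # lets lookup fall back to builtins

    def __setitem__(self, name, value):
        setattr(self.module, name, value)

    def __delitem__(self, name):
        try:
            delattr(self.module, name)
        except AttributeError:
            raise KeyError(name) from None


mod = PropModule("demo")
# "answer" is resolved through the property, "result" is stored via setattr.
exec("result = answer + 1", {}, AttrNamespace(mod))
print(mod.result)
```

This only covers top-level code; functions defined inside would still
use LOAD_GLOBAL against a real dict, which is the part that needs
interpreter changes.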

I have a very early prototype of this idea.  See:

https://github.com/nascheme/cpython/tree/exec_mod

Issues to be resolved:

- __namespace__ entry in the __dict__ creates a reference cycle.
  Maybe could use a weakref somehow to avoid it.  Maybe we just
  explicitly break it.

- getattr() on the module may return things that LOAD_NAME and
  LOAD_GLOBAL don't expect (e.g. things from the module type).  I
  need to investigate that.

- Need to fix STORE_* opcodes to do setattr() rather than
  __setitem__.

- Need to optimize the implementation.  Maybe the module instance
  can know if any properties or __getattr__ are defined.  If no,
  have __getattribute__ grab the variable directly from md_dict.

- Need to fix eval() to allow module as well as dict.

- Need to change logic where global dict is passed around.  Pass the
  module instead so we don't have to keep retrieving __namespace__.
  For backwards compatibility, need to keep functions that take
  'globals' as dict and use PyModule_GetDict() on public APIs that
  return globals as a dict.

- interp->builtins should be a module, not a dict.

- module shutdown procedure needs to be investigated and fixed.  I
  think it may get simpler.

- importlib needs to be fixed to pass modules to exec() and not
  dicts.  From my initial experiments, it looks like importlib gets
  a lot simpler.  Right now we pass around dicts in a lot of places
  and then have to grub around in sys.modules to get the module
  object, which is what importlib usually wants.

I have requested help in writing a PEP for this idea but so far no
one is foolish enough to join my crazy endeavor. ;-)

Regards,

  Neil


Re: [Python-ideas] Hexadecimal floating literals

2017-09-11 Thread Neil Schemenauer
On 2017-09-12, Victor Stinner wrote:
> Instead of modifying the Python grammar, the alternative is to enhance
> float(str) to support it:
> 
> k = float("0x1.2492492492492p-3") # 1/7

Making it a different function from float() would avoid backwards
compatibility issues. I.e. float() no longer returns errors on some
inputs.

E.g.

    from math import hexfloat
    k = hexfloat("0x1.2492492492492p-3")

I still think a literal syntax has merits.  The above cannot be
optimized by the compiler as it doesn't know what hexfloat() refers
to.  That in turn destroys constant folding peephole stuff that uses
the literal.
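
Note that math.hexfloat above is hypothetical, but the existing
float.fromhex() classmethod already parses this notation.  The sketch
below uses it and also shows the constant-folding point: a literal ends
up in the code object's constants, while the result of a call does not.

```python
# Parse the hexadecimal form with the existing classmethod.
k = float.fromhex("0x1.2492492492492p-3")
print(k == 1 / 7, k.hex())

# A float literal is stored as a constant at compile time...
literal = compile("k = 0.125", "<demo>", "exec")
# ...but a call is opaque to the compiler, so only the string is stored.
call = compile("k = float.fromhex('0x1p-3')", "<demo>", "exec")
print(0.125 in literal.co_consts, 0.125 in call.co_consts)
```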


Re: [Python-ideas] lazy import via __future__ or compiler analysis

2017-09-11 Thread Neil Schemenauer
On 2017-09-11, Neil Schemenauer wrote:
> A module can be a singleton instance of a singleton ModuleType
> instance.

Maybe more accurate to say each module would have its own unique
__class__ associated with it.  So, you can add properties to the
class without affecting other modules.  For backwards compatibility,
we can create anonymous modules as needed if people are passing
'dict' objects to the legacy APIs.
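
A rough approximation with today's tools is to give each module its own
ModuleType subclass.  The helper name private_class is an assumption for
the demo; in the proposal the interpreter would do this automatically.

```python
import types


def private_class(module):
    """Give `module` its own ModuleType subclass and return that class."""
    cls = type("ModuleOf_" + module.__name__, (types.ModuleType,), {})
    module.__class__ = cls
    return cls


a = types.ModuleType("a")
b = types.ModuleType("b")

# A property added to a's class is not visible on b.
private_class(a).answer = property(lambda self: 42)
private_class(b)
print(a.answer, hasattr(b, "answer"))
```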


Re: [Python-ideas] lazy import via __future__ or compiler analysis

2017-09-11 Thread Neil Schemenauer
On 2017-09-11, C Anthony Risinger wrote:
> I'm getting at, is can we find a way to make modules a real type? So dunder
> methods are activated? This would make modules phenomenally powerful
> instead of just a namespace (or resorting to after the fact __class__
> reassignment hacks).

My __namespace__ idea will allow this.  A module can be a singleton
instance of a singleton ModuleType instance.  So, you can assign a
property like:

    <module>.__class__.prop = <property>

and have it just work.  Each module would have a singleton class
associated with it to store the properties.  The spelling of
<module> will need to be worked out.  It could be
sys.modules[__name__].__class__ or perhaps we can have a weakref, so
this:

__module__.__class__.prop = ...

Need to think about this.

I have done import hooks before and I know the pain involved.
importlib cleans things up a lot.  However, if my early prototype
work is an indication, the import stuff gets a whole lot simpler.
Instead of passing around a dict and then grubbing around
sys.modules because the module is actually what you want, you just
pass the module around directly.

Thanks for your feedback.

Regards,

  Neil


Re: [Python-ideas] lazy import via __future__ or compiler analysis

2017-09-11 Thread Neil Schemenauer
On 2017-09-11, C Anthony Risinger wrote:
> I'm not sure I follow the `exec(code, module)` part from the other thread.
> `exec` needs a dict to exec code into [..]
[..]
> How do you handle lazy loading when a defined function requests a global
> via LOAD_NAME? Are you suggesting to change function.__globals__ to
> something not-a-dict, and/or change LOAD_NAME to bypass
> function.__globals__ and instead do something like:

I propose to make function.__namespace__ be a module (or other
namespace object).  function.__globals__ would be a property that
calls vars(function.__namespace__).   Implementing this is a lot of
work, need to fix LOAD_NAME, LOAD_GLOBAL and a whole heap of other
things.  I have a partly done proof-of-concept implementation.  It
crashes immediately on Python startup at this point but so far I
have not seen any insurmountable issues.
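
The relationship already holds in the other direction for ordinary
functions, which may make the proposal easier to picture.  In this
sketch sys.modules[func.__module__] stands in for the proposed
function.__namespace__ attribute, which does not exist today.

```python
import json
import sys

func = json.dumps

# Stand-in for the proposed func.__namespace__: the defining module.
namespace = sys.modules[func.__module__]

# Today __globals__ is stored directly; under the proposal it would be
# computed as vars(namespace).  For a normal module they are one object.
same_dict = vars(namespace) is func.__globals__
print(namespace.__name__, same_dict)
```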

Doing it while preserving backwards compatibility will be a
challenge.  Doing it without losing performance (LOAD_GLOBAL using
the fact that f_globals is an honest 'dict') is also hard.  At this
point, I think there is a chance we can do it.  It is a conceptual
simplification of Python that gives the language more consistency
and more power.

> All this chatter about modifying opcodes, adding future statements, lazy
> module opt-in mechanisms, special handling of __init__ or __getattr__ or
> SOME_CONSTANT suggesting modules-are-almost-a-class-but-not-quite feel like
> an awful lot of work to me, adding even more cognitive load to an already
> massively complex import system. They seem to make modules even less like
> other objects or types.

I disagree.  It would make for less cognitive load as LOAD_ATTR
would be very similar to LOAD_NAME/LOAD_GLOBAL.  It makes modules
*more* like other objects and types.

I'm busy with "real work" this week and so can't follow the
discussion closely or work on my proof-of-concept prototype.  I hope
we can come up with an elegant solution and not some special hack
just to make module properties work.

Regards,

  Neil


Re: [Python-ideas] PEP 562

2017-09-10 Thread Neil Schemenauer
On 2017-09-10, Neil Schemenauer wrote:
> I have something 90% working, only 90% left to go. ;-)

Prototype:

https://github.com/warsaw/lazyimport/blob/master/lazy_demo.py
https://github.com/nascheme/cpython/tree/exec_mod

Next step is to do the compiler and change importlib to do
exec(code, module) rather than exec(code, module.__dict__).
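
For contrast, this is how the two spellings behave on an unmodified
interpreter (a sketch, assuming a current CPython release without the
prototype changes):

```python
import types

module = types.ModuleType("demo")
code = compile("x = 1", "<demo>", "exec")

# What importlib effectively does today.
exec(code, module.__dict__)

try:
    # The proposed spelling; stock CPython requires globals to be a dict.
    exec(code, module)
    accepted = True
except TypeError:
    accepted = False
print(module.x, accepted)
```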


Re: [Python-ideas] lazy import via __future__ or compiler analysis

2017-09-08 Thread Neil Schemenauer
On 2017-09-08, Joshua Morton wrote:
> In general this won't work. It's not generally possible to know if a given
> statement has side effects or not.

That's true but with the AST static analysis, we find anything that
has potential side effects.  The question is whether any useful subset
of real modules passes these checks.  If we flag everything as not
lazy import safe then we don't gain anything.

> As an example, one normally wouldn't expect function or class
> definition to have side effects, but if a function is decorated,
> the decorators are evaluated at function "compilation"/import
> time, and may have side effects.

Decorators are handled in my latest prototype (the module is flagged as not lazy).

>  As another example, one can put arbitrary expressions in a
>  function annotation, and those are evaluated at import time.

Not handled yet but no reason they can't be.

> As a result of this, you can't even know if an import is safe,
> because that module may have side effects. That is, the module
> foo.py:
> 
> import bar
> 
> isn't known to be lazy, because bar may import and start the logging
> module, as an example.

That is handled as well.  We only need to know if the current module
is lazy safe or not.  Imports of submodules that have side-effects
will have those side effects happen like they do now.

The major challenge I see right now is 'from .. import' and class
bases (i.e. metaclass behavior).  If we do the safe thing then all
from-imports make the module unsafe for lazy loading and any class
definition that has a base class is also unsafe.
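
To give a feel for the conservative rules, here is a small checker
written with the ast module.  It is my own sketch of "do the safe
thing", not the prototype's analysis; the function name and the exact
set of rules are assumptions, and annotations are not examined.

```python
import ast


def unsafe_reasons(source):
    """List top-level constructs the conservative rules treat as unsafe."""
    reasons = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ImportFrom):
            reasons.append(f"line {node.lineno}: from-import")
        elif isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        ):
            if node.decorator_list:
                reasons.append(f"line {node.lineno}: decorator")
            if isinstance(node, ast.ClassDef) and (node.bases or node.keywords):
                reasons.append(f"line {node.lineno}: class with bases")
        elif isinstance(node, ast.Import):
            continue  # plain imports are dealt with by the import machinery
        elif isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant):
            continue  # docstring or bare constant
        else:
            reasons.append(f"line {node.lineno}: {type(node).__name__} statement")
    return reasons


sample = (
    "import os\n"
    "from sys import path\n"
    "@decorate\n"
    "def f(): pass\n"
    "class A: pass\n"
    "class B(A): pass\n"
    "print('hi')\n"
)
for reason in unsafe_reasons(sample):
    print(reason)
```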

I think the idea is not yet totally dead though.  We could have a
command-line option to enable it.  Modules that depend on
side-effects of from-import and from base classes could let the
compiler know about that somehow (make it explicit).  That would
also allow a good fraction of modules to be lazy import safe.

Regards,

  Neil


[Python-ideas] Lazy creation of module level functions and classes

2017-09-07 Thread Neil Schemenauer
This is an idea that came out of the lazy loading modules idea.
Larry Hastings mentioned what a good improvement this was for PHP.
I think it would help Python a lot too.  Very many functions and
classes are not actually needed but are instantiated anyhow.

Back of napkin idea:

Write AST transformer tool, change top-level functions and classes
to be like properties (can use __class__ I guess)

Transform is something like:

    # old code
    def inc(x):
        return x + 1

    # transformed code
    def __make_inc(code=):
        obj = eval(code)
        _ModuleClass.inc = obj # only do eval once
        return obj
    inc = property(__make_inc)

Totally seat of pants idea but I can't think of a reason why it
shouldn't work.  It seems much more powerful than lazy loading
modules.  In the lazy module case, you load the whole module if any
part is touched.  Many modules only have a small fraction of their
functions and classes actually used.

If this transformer idea works, the standard Python compiler could
be changed to do the above stuff, no transformer needed.


Re: [Python-ideas] lazy import via __future__ or compiler analysis

2017-09-07 Thread Neil Schemenauer
Barry Warsaw wrote:
> There are a few other things that might end up marking a module as
> "industrious" (my thesaurus's antonym for "lazy").

Good points.  The analysis can be simple at first and then we can
enhance it to be smarter about what is okay and still lazy load.  We
may evolve it over time too, letting things that are not strictly
safe skip the "industrious" flag and load lazily anyhow.

Another idea is to introduce __lazy__ or some such in the global
namespace of the module, if present, e.g.

    __lazy__ = True

then the analysis doesn't do anything except return True.  The
module has explicitly stated that side-effects in the top-level code
are okay to be done in a lazy fashion.

Perhaps with a little bit of smarts in the analysis and a little
sprinkling of __lazy__ flags, we can get a big chunk of modules to
lazy load.
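
Detecting the flag statically is straightforward.  A minimal sketch,
assuming Python 3.8 or later (where True parses as ast.Constant) and
assuming the flag must be a plain top-level assignment; declares_lazy is
a made-up name:

```python
import ast


def declares_lazy(source):
    """True if the module has a top-level `__lazy__ = True` assignment."""
    for node in ast.parse(source).body:
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
            and node.targets[0].id == "__lazy__"
            and isinstance(node.value, ast.Constant)
            and node.value.value is True
        ):
            return True
    return False


print(declares_lazy("__lazy__ = True\nprint('side effect')\n"))
print(declares_lazy("print('side effect')\n"))
```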



[Python-ideas] lazy import via __future__ or compiler analysis

2017-09-07 Thread Neil Schemenauer
This is a half baked idea that perhaps could work.  Maybe call it
2-stage module load instead of lazy.

Introduce a lazy module import process that modules can opt-in to.
The opt-in would either be with a __future__ statement or the
compiler would statically analyze the module and determine if it is
safe.  E.g. if the module has no module level statements besides
imports.

.pyc files get some other bits of information:

A) whether the module has opted for lazy import (IS_LAZY)

B) the modules imported by the module (i.e. top-level imports,
IMPORT_LIST)
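
Both pieces of information can be derived from source with the ast
module.  The sketch below takes the strictest reading of the rule above
(only imports and a docstring are allowed at module level); the function
name analyze and the handling of relative imports are my assumptions.

```python
import ast


def analyze(source):
    """Return (is_lazy, import_list) for module source text."""
    is_lazy = True
    import_list = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            import_list.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports would need the package name to resolve,
            # so only absolute ones are recorded here.
            if node.level == 0 and node.module:
                import_list.append(node.module)
        elif isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant):
            pass  # module docstring
        else:
            is_lazy = False  # any other top-level statement disqualifies
    return is_lazy, import_list


print(analyze("import os, sys\nfrom json import dumps\n"))
print(analyze("import os\nprint('side effect')\n"))
```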

Make __import__ understand this data and do lazy loading for modules
that want it.  Sub-modules that have import side-effects will still
execute as normal and the side effects will happen when the parent
module is imported.

This would consist of a recursive process, something like:

    def load_module(name):
        if not IS_LAZY(name):
            import as usual
        else:
            create lazy version of module 'name'
            for subname in IMPORT_LIST(name):
                load_module(subname)
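
The recursion can be tried out on a made-up module graph.  Everything in
this sketch is hypothetical: the tables stand in for data that would
come from .pyc files, and the function only records what a two-stage
import would do instead of importing anything.

```python
# Hypothetical per-module metadata, as if read from .pyc files.
IS_LAZY = {"app": True, "helpers": True, "logging_setup": False}
IMPORT_LIST = {
    "app": ["helpers", "logging_setup"],
    "helpers": [],
    "logging_setup": [],
}


def load_module(name, log, seen=None):
    """Record what a two-stage import of `name` would do."""
    seen = set() if seen is None else seen
    if name in seen:
        return  # already handled; also guards against import cycles
    seen.add(name)
    if not IS_LAZY[name]:
        log.append(("exec now", name))  # import as usual
    else:
        log.append(("lazy stub", name))  # create lazy version of the module
        for subname in IMPORT_LIST[name]:
            load_module(subname, log, seen)


log = []
load_module("app", log)
print(log)
```

The non-lazy module is still executed right away even though its parent
is lazy, which is what keeps import side effects happening at the same
time as they do now.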

An additional idea from Barry W, if a module wants lazy loading but
wants to do some init when the module is "woken up", define a
__init__ top-level function.  Python would call that function when
attributes of the module are first actually used.

My plan was to implement this with a Python __import__
implementation. I would unmarshal the .pyc, compute IS_LAZY and
IMPORT_LIST at import time.  So, not gaining a lot of speedup.  It
would prove if the idea works in terms of not causing application
crashes, etc.  I could try running it with bigger apps and see how
many modules are flagged for lazy loading.