[Python-Dev] Re: Accepting PEP 675 - Arbitrary Literal String Type

2022-03-21 Thread Neil Schemenauer
On 2022-03-21, Gregory P. Smith wrote:
> TL;DR - PEP 675 allows type checkers to help prevent bugs allowing
> attacker-controlled data to be passed to APIs that declare themselves as
> requiring literal, in-code strings.

Great idea.  I did something like this for HTML templating in the
Quixote web framework (to avoid XSS bugs).  I did it as a special
kind of module with a slightly different compiler (using an AST
transform).  With the LiteralString feature, I can implement the
same kind of thing directly in Python.
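
Roughly, the kind of API this enables (just a sketch; html() and the
template strings are made up, and LiteralString needs Python 3.11+ or
typing_extensions):

    # Sketch only: html() and the strings below are hypothetical, not part
    # of Quixote or the stdlib.
    from typing import LiteralString  # 3.11+; typing_extensions on older versions

    def html(template: LiteralString) -> str:
        # A type checker only accepts literal, in-code strings here.
        return template

    def render(user_input: str) -> str:
        html("<p>hello</p>")               # OK: literal string
        html("<p>" + user_input + "</p>")  # type checker error: not LiteralString
        return ""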


[Python-Dev] Re: Move the pythoncapi_compat project under the GitHub Python or PSF organization?

2022-02-11 Thread Neil Schemenauer

On 2022-02-11 06:14, Petr Viktorin wrote:


Sounds reasonable, but...

The implication of endorsing code like this is that *we cannot change 
private API even in patch releases*, which I don't think is documented 
anywhere, and might be a bit controversial.


I think we are still allowed to change them.  We should be aware of the
impact, though.  If an API is supposed to be private but is actually used
by a large number of 3rd party extensions, we need to consider carefully
before changing it.  I don't have much sympathy for the extra work caused
for people using clearly marked private APIs.  OTOH, practicality beats
purity and we want them to be able to somehow use new versions of Python.




[Python-Dev] Re: Move the pythoncapi_compat project under the GitHub Python or PSF organization?

2022-02-10 Thread Neil Schemenauer

On 2022-02-10 2:58 p.m., Victor Stinner wrote:

Would it make sense to move the pythoncapi_compat project under the
GitHub Python or PSF organization to make it more "official" and a
little bit more sustainable?


I think that makes sense.  Extensions typically have this kind of
compatibility code built into them so they can support multiple Python
versions.  It makes more sense to centralize that code: it will be
easier to keep up to date and will be of better quality.


I think having the project more tightly associated with CPython is good 
too.  When a change is made to the CPython extension API, it would be 
good to consider how pythoncapi_compat will need to be updated.  E.g. 
how can extensions that want to support both old and new versions of the 
API work?


Regards,

   Neil



[Python-Dev] Re: Request to revert unittest and configparser incompatible changes in Python 3.11

2022-01-26 Thread Neil Schemenauer

On 2022-01-18 23:14, Gregory P. Smith wrote:
Our stdlib unittest already enables warnings by default per 
https://bugs.python.org/issue10535.


Getting the right people to pay attention to them is always the hard part.


I wonder if we can do a bit better in that regard.  When I install 3rd 
party packages, I create a usercustomize.py file that uses 
filterwarnings() to turn off all the warnings I don't care about.  I 
don't know how but maybe we could make that easier to do.  That way, you 
don't get buried in warnings coming from code you don't maintain.
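
For what it's worth, a minimal sketch of the kind of usercustomize.py I
mean (the package name is only an example):

    # usercustomize.py -- runs automatically at startup when site is enabled.
    import warnings

    # Silence DeprecationWarnings coming from a noisy third-party package;
    # "somepackage" is a placeholder.
    warnings.filterwarnings(
        "ignore",
        category=DeprecationWarning,
        module=r"somepackage(\..*)?",
    )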


Additionally, maybe we should be more aggressive about showing 
PendingDeprecationWarning if it comes from code that seems to be written 
by the user, e.g. outside site-packages or not from a package installed 
by pip.  The exact logic of that is complicated though.


[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)

2021-12-15 Thread Neil Schemenauer

On 2021-12-15 2:57 p.m., Guido van Rossum wrote:

But as long as the imbalance is less than 0x_2000_, the refcount 
will remain in the inclusive range [ 0x_4000_ , 0x_7FFF_ ] and 
we can test for immortality by testing a single bit:


if (o->ob_refcnt & 0x_4000_)


Could we have a full GC pass reset those counts to make it even more 
unlikely to get out of bounds?


Allocating immortal objects from a specific memory region seems like
another idea worth pursuing.  It seems mimalloc has the ability to
allocate pools aligned to certain large boundaries.  That takes some
platform-specific magic.  If we can do that, the test for immortality
is pretty cheap.  However, if you can't allocate them at a fixed region
determined at compile time, I don't think you can match the performance
of the code above.  Maybe it helps that you could determine immortality
by looking at the PyObject pointer, without loading the ob_refcnt
value from memory?  You would do something like:


if (((uintptr_t)o) & _Py_immortal_mask)

The _Py_immortal_mask value would not be known at compile time but would 
be a global constant.  So, it would be cached by the CPU.


[Python-Dev] Difficulty of testing beta releases now available

2021-05-25 Thread Neil Schemenauer
On 2021-05-04, Łukasz Langa wrote:
> We strongly encourage maintainers of third-party Python projects
> to test with 3.10 during the beta phase and report issues found to
> the Python bug tracker  as soon as
> possible.

Testing with Python 3.10b1 is not easy, at least for me.  Here is a
list of initial problems I ran into, from memory:

- Cython doesn't work because of the _PyGen_Send change [1]

- scipy cannot be installed because it has requires_python =
  ">=3.7,<3.10".  If you manually install from source, it seems to
  work.

- numpy cannot be installed because of the _Py_HashDouble() change [2]

- trio cannot be used because of TracebackException.__init__ changes [3]

For the above problems, I would say the issue lies with the 3rd party
package and not with the Python release.  However, I guess that few
people are using Python without 3rd party packages.  So, it seems
unsurprising that beta and RC releases are not well tested.  It has
taken me quite a few hours to get a working version of Python 3.10
with all required dependencies such that I can run unit tests for
some application code.

Can we do anything to improve the situation?  Perhaps using the
pre-release functionality of PyPI would help.  We would have to
somehow encourage 3rd party packages to upload pre-releases that are
compatible with our beta/RC releases.


1. https://github.com/cython/cython/issues/3876
2. https://github.com/numpy/numpy/issues/19033
3. https://github.com/python-trio/trio/issues/1899


[Python-Dev] Re: Future PEP: Include Fine Grained Error Locations in Tracebacks

2021-05-07 Thread Neil Schemenauer
On 2021-05-07, Pablo Galindo Salgado wrote:
> Technically the main concern may be the size of the unmarshalled
> pyc files in memory, more than the storage size of disk.

It would be cool if we could mmap the pyc files and have the VM run
code without an unmarshal step.  One idea is something similar to
the Facebook "not another freeze" PR but with a twist.  Their
approach was to dump out code objects so they could be loaded as if
they were statically defined structures.

Instead, could we dump out the pyc data in a format similar to Cap'n
Proto?  That way no unmarshal is needed.  The VM would have to be
extensively changed to run code in that format.  That's the hard
part.

The benefit would be faster startup times.  The unmarshal step is
costly.  It would mostly solve the concern about these larger
linenum/colnum tables.  We would only load that data into memory if
the table is accessed.
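
For a feel of the cost, you can time the unmarshal step of an existing
.pyc by hand (rough sketch; "some_module.py" is a placeholder and the
16-byte header size assumes the 3.7+ pyc format):

    import importlib.util
    import marshal
    import time

    pyc_path = importlib.util.cache_from_source("some_module.py")  # placeholder
    with open(pyc_path, "rb") as f:
        data = f.read()

    start = time.perf_counter()
    code = marshal.loads(data[16:])   # skip the 16-byte pyc header (3.7+ format)
    print("unmarshal took", time.perf_counter() - start, "seconds")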


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Neil Schemenauer
On 2021-01-12, Petr Viktorin wrote:
> Unfortunately, it's not just the creation that needs to be changed.
> You also need to decref Foo_Type somewhere.

Add the type to the module dict?


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Neil Schemenauer
On 2021-01-12, Pablo Galindo Salgado wrote:
> One worry that I have in general with this move is the usage of
> _PyType_GetModuleByDef to get the type object from the module
> definition. This normally involves getting a TLS in every instance
> creation, which can impact notably performance for some
> perf-sensitive types or types that are created a lot.

I would say _PyType_GetModuleByDef is the problem.  Why do we need
to use such an ugly approach (walking the MRO) when Python-defined
classes don't have the same performance issue?  E.g.

    class A:
        def b():
            pass

    A.b.__globals__

IMHO, we should be working to make types and functions defined in
extensions more like the pure Python versions.

Related, my "__namespace__" idea[1] might be helpful in reducing the
differences between pure Python modules and extension modules.
Rather than functions having a __globals__ property, which is a
dict, they would have a __namespace__, which is a module object.
Basically, functions and methods known which global namespace
(module) they have been defined in.  For extension modules, when you
call a function or method defined in the extension, it could be
passed the module instance, by using the __namespace__ property.
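
To make the contrast concrete (the __namespace__ attribute is
hypothetical, not in CPython today):

    import json

    # Today: a pure-Python function carries the defining module's namespace
    # as a plain dict.
    assert json.dumps.__globals__ is vars(json)

    # Proposed (hypothetical): the function would reference the module object
    # itself, and extension functions/methods could be handed the same thing.
    # assert json.dumps.__namespace__ is json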

Maybe I'm missing some details on why this approach wouldn't work.
However, at a high level, I don't see why it shouldn't.  Maybe
performance would be an issue?  Reducing the number of branches in
code paths like CALL_FUNCTION should help.

1. https://github.com/nascheme/cpython/tree/frame_no_builtins


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Neil Schemenauer
On 2021-01-12, Victor Stinner wrote:
> It seems like a safer approach is to continue the work on
> bpo-40077: "Convert static types to PyType_FromSpec()".

I agree that trying to convert static types is a good idea.  Another
possible bonus might be that we can gain some performance by
integrating garbage collection with the Python object memory
allocator.  Static types frustrate that effort.

Could we have something easier to use than PyType_FromSpec(), for
the purposes of converting existing code?  I was thinking of
something like:

    static PyTypeObject Foo_TypeStatic = {
    };
    static PyTypeObject *Foo_Type;

    PyInit_foo(void)
    {
        Foo_Type = PyType_FromStatic(&Foo_TypeStatic);
    }


The PyType_FromStatic() would return a new heap type, created by
copying the static type.  The static type could be marked as being
unusable (e.g. with a type flag).


[Python-Dev] Re: PEP: Deferred Evaluation Of Annotations Using Descriptors

2021-01-11 Thread Neil Schemenauer
On 2021-01-11, Łukasz Langa wrote:
> The stringification process which your PEP describes as costly
> only happens during compilation of a .py file to .pyc. Since
> pip-installing pre-compiles modules for the user at installation
> time, there is very little runtime penalty for a fully annotated
> application.

It should be possible to make Larry's approach cheap as well.  I
have an old experiment stashed away[1] where I made the code objects
for functions be created lazily.  I.e. when a module is first
loaded, functions are not fully loaded until they are first
executed.  My goal was to reduce startup time.  It didn't show a
significant gain so I didn't pursue it further.

In my experiment, I deferred the unmarshal of the code object.
However, it occurs to me you could go a bit further and have the
function object be mostly skeletal until someone runs it or tries to
inspect it.  The skeleton would really be nothing but a file offset
(or memory offset, if using mmap) into the .pyc file.

Of course this would be some work to implement but then all Python
functions would benefit and likely Python startup time would be
reduced.  I think memory use would be reduced too since typically
you import a lot of modules but only use some of the functions in
them.

I like the idea of Larry's PEP.  I understand why string-based
annotations were done (I use the __future__ import for my own code).
Using eval() is ugly though and Larry's idea seems like a nice way
to remove the need to call eval().
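
For context, this is the eval()-based mechanism I mean: with the
__future__ import, annotations are stored as strings and only evaluated
on demand (sketch):

    from __future__ import annotations  # PEP 563 stringified annotations
    import typing

    def f(x: int) -> list[str]:
        return [str(x)]

    print(f.__annotations__)         # {'x': 'int', 'return': 'list[str]'}
    print(typing.get_type_hints(f))  # evaluated back to real types via eval()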


[1] https://github.com/nascheme/cpython/commits/lazy_codeobject


[Python-Dev] Re: Pattern matching reborn: PEP 622 is dead, long live PEP 634, 635, 636

2020-10-23 Thread Neil Schemenauer
Woah, this is both exciting and scary.  If adopted, it will be a major
change to how Python programs are written.  It seems a lot of work has
been put into polishing the design.  That's good because if we do this,
it will not be easy to fix design errors later.


One of my first thoughts is this sounds similar to Clojure's "spec"[1].  
Are the pattern matching PEP authors aware of it?  I don't mean we need 
to copy what spec does (really, I'm not all that familiar with it).  I 
do notice that spec patterns are not just used for case statement 
de-structuring.  Maybe we should think about a future Python that would 
use similar patterns for those things too.  I believe "spec" has been in 
use for a number of years and so the Clojure community has useful 
experience with the design.


An example where "spec" and this proposal agree is the matching of
mappings: if there are additional key/value pairs, they don't cause
the match to fail by default.  That's important for loose coupling
between systems.
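
With the proposed syntax that looks roughly like (sketch):

    # Mapping patterns ignore extra keys in the subject (proposed semantics).
    message = {"type": "greet", "name": "world", "trace_id": 123}

    match message:
        case {"type": "greet", "name": name}:
            print(f"hello {name}")   # matches even though "trace_id" is present
        case _:
            print("unhandled message")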



[1] https://clojure.org/about/spec



[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Neil Schemenauer
On 2020-10-15, Serhiy Storchaka wrote:
> [..] it seems that there are no usages the __version__ variable in
> top 4K pypi packages.

Given that, I think it's fine to remove them.  If we find broken
code during the alpha release we still have a chance to revert.
However, it would seem quite unlikely there would be a problem.
Thanks to Batuhan for the useful search tool.


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-14 Thread Neil Schemenauer
On 2020-10-14, Serhiy Storchaka wrote:
> I propose to remove __version__ in all stdlib modules. Are there any
> exceptions?

I agree that these kinds of meta attributes are not useful and it
would be nice to clean them up.  However, IMHO, maybe the cleanup is
not worth breaking Python programs.  We could remove them from the
documentation, add comments (or deprecation warnings) telling people
not to use them.
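
A softer option than outright removal would be a module-level
__getattr__ (PEP 562) that keeps the attribute working but warns.  A
minimal sketch, with a made-up module and value:

    # somemodule.py -- sketch of deprecating __version__ without removing it.
    import warnings

    _VERSION = "1.0"   # legacy value, kept only for backwards compatibility

    def __getattr__(name):
        if name == "__version__":
            warnings.warn("somemodule.__version__ is deprecated",
                          DeprecationWarning, stacklevel=2)
            return _VERSION
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")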

I think it would be okay to remove them if we could show that the
top N PyPI packages don't use these attributes, or that at least very
few of them do.  As someone who regularly tests alpha releases, I've
found it quite painful since nearly every release breaks 3rd party
packages that my code depends on.  I feel we should try hard to
avoid breaking things unless there is a strong reason and there is
no easy way to provide backwards compatibility.


[Python-Dev] Re: Deferred, coalescing, and other very recent reference counting optimization

2020-09-02 Thread Neil Schemenauer
On 2020-09-02, Greg Ewing wrote:
> On 2/09/20 8:32 am, Neil Schemenauer wrote:
> > The most obvious approach is to adopt a multi-threaded model like is
> > done by modern Java.  I.e. no GIL and non-thread safe core data
> > structures.  That sounds a bit scary but based on Java experience it
> > seems programmers can manage it.
> 
> I think that depends on how non-thread-safe it is. If it's
> "weird things can happen" it might be all right. But if
> it's "can crash the interpreter" it might not be all right.

Weird things would include unexpected exceptions.  This seems a
relevant discussion:


https://softwareengineering.stackexchange.com/questions/262428/race-conditions-in-jvm-languages-versus-c-c#262440

The Java spec seems to contain the details but I admit I haven't
studied them:

https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html

Getting exceptions if your locking is incorrect seems an okay
tradeoff to me.  My knowledge of Java is pretty limited but I
believe they originally tried to make the core data structures
thread-safe (e.g. maps, vectors).  That turned out to be too
difficult or too expensive.  Instead, the core collection types are
not thread-safe and they introduced new "concurrent" collections.
That way, you only pay the cost of synchronization if you need it.


[Python-Dev] Re: Deferred, coalescing, and other very recent reference counting optimization

2020-09-01 Thread Neil Schemenauer
On 2020-09-01, Larry Hastings wrote:
> Personally I think the future of CPython is to change completely
> over to tracing garbage collection.  It's so much friendlier to
> multicore, which is clearly the future of programming.  I'd rather
> see efforts in this area directed towards that goal.

I think either CPython does that or some other implementation is
going to displace it.  CPython doesn't have a good way of utilizing
multi-core CPUs.  The various multi-process approaches don't solve
the problem of efficiently passing data between threads of
execution.

An elegant approach would be to use message passing, as is done by
Erlang.  However, given that Python is not a functional language and
that most core data structures are mutable, it seems a poor fit.

The most obvious approach is to adopt a multi-threaded model like is
done by modern Java.  I.e. no GIL and non-thread safe core data
structures.  That sounds a bit scary but based on Java experience it
seems programmers can manage it.

If it weren't for CPython's reference-counted GC, that kind of
threading model would be relatively easy to implement.  Getting libgc
working would be a useful first step:

  https://discuss.python.org/t/switching-from-refcounting-to-libgc/1641

I guess it's hard to get people excited about that work because you
have to go backwards in performance before you can possibly go
forward.  A non-reference-counted CPython is going to be much
slower and recovering that performance will be a long slog.


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-23 Thread Neil Schemenauer
On 2020-06-23, Thomas Wouters wrote:
> I think the ability for per-type allocation/deallocation routines isn't
> really about efficiency, but more about giving more control to embedding
> systems (or libraries wrapped by extension modules) about how *their*
> objects are allocated. It doesn't make much sense, however, because Python
> wouldn't allocate their objects anyway, just the Python objects wrapping
> theirs. Allocating CPython objects should be CPython's job.

My thinking is that, eventually, we would like to allow CPython to
use something other than reference counting for internal PyObject
memory management.  In other systems with garbage collection, the
memory allocator is typically tightly integrated with the garbage
collector.  To get good efficiency, they need to cooperate.  E.g.
newly allocated objects are allocated in nursery memory arenas.  

The current API doesn't allow that because you can allocate memory
via some custom allocator and then pass that memory to be
initialized and treated as a PyObject.  That's one thing locking
us into reference counting.

This relates to the sub-interpreter discussion.  I think the
sub-interpreter cleanup work is worth doing, if only because it will
make embedding CPython cleaner.  I have some doubts that
sub-interpreters will help much in terms of multi-core utilization.
Efficiently sharing data between interpreters seems like a huge
challenge.  I think we should also pursue Java style multi-threading
and complete the "gilectomy".  To me, that means killing reference
counting for internal PyObject management.


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-22 Thread Neil Schemenauer
Hi Victor,

Thanks for putting work into this.  I support the idea of slowly
evolving the C API.  It must be done carefully so as to not
unnecessarily break 3rd party extensions.  Changes must be made for
well founded reasons and not just because we think it makes a
"cleaner" API.  I believe you are following those principles.

One aspect of the API that could be improved is memory management
for PyObjects.  The current API is quite a mess and for no good
reason except legacy, IMHO.  The original API design allowed
extension types to use their own memory allocator.  E.g. they could
call their own malloc()/free() implementation and the rest of the
CPython runtime would handle that.  One consequence is that
Py_DECREF() cannot call PyObject_Free() but instead has to call
tp_dealloc().  There were supposed to be multiple layers of
allocators, PyMem vs PyObject, but since the layering was not
enforced, we ended up with a bunch of aliases to the same underlying
function.

Perhaps there are a few cases when the flexibility to use a custom
object allocator is useful.  I think in practice it is very rare
that an extension needs to manage memory itself.  To achieve
something similar, allow a PyObject to have a reference to some
externally managed resource and then the tp_del method would take
care of freeing it.  IMHO, the Python runtime should be in charge of
allocating and freeing PyObject memory.

I believe fixing this issue is not tricky, just tedious.  The
biggest hurdle might be dealing with statically allocated objects.
IMHO, they should go away and there should only be heap allocated
PyObjects (created and freed by calling CPython API functions).
That change would affect most extensions, unfortunately.

Another place for improvement is that the C API is unnecessarily
large.  E.g. we don't really need PyList_GetItem(),
PyTuple_GetItem(), and PyObject_GetItem().  Every extra API is a
potential leak of implementation details and a burden for
alternative VMs.  Maybe we should introduce something like
WIN32_LEAN_AND_MEAN that hides all the extra stuff.  The
Py_LIMITED_API define doesn't really mean the same thing since it
tries to give ABI compatibility.  It would make sense to cooperate
with the HPy project on deciding what parts are unnecessary.  Things
like Cython might still want to use the larger API, to extract every
bit of performance.  The vast majority of C extensions don't require
that.

One final comment: I think even if we manage to clean up the API and
make it friendly for other Python implementations, there is going to
be a fair amount of overhead.  If you look at other "managed
runtimes" that just seems unavoidable (e.g. Java, CLR, V8, etc).
You want to design the API so that you maximize the amount of useful
work done with each API call.  Using something like
PyList_GET_ITEM() to iterate over a list is not a good pattern.  So
keep in mind that an extension API is going to have some overhead.


Regards,

  Neil


[Python-Dev] Re: Pass the Python thread state to internal C functions

2019-11-16 Thread Neil Schemenauer
On AMD64 Linux, the location of the thread local data seems to be
stored in the GS CPU register[1].  It seems likely other platforms
and other operating systems could do something similar.  Passing
threadstate as an explicit argument could be either faster or slower
depending on how often you use it.  If you use threadstate often,
passing it explicitly (which likely uses a CPU register) could be a
win.  If you use it rarely, that CPU register would be better
utilized for passing function arguments you actually use.

Doing some experiments with optimized (i.e. using platform specific)
TLS would seem a useful step before undertaking a major refactoring.
Explicit passing could be a lot of code churn for no practical gain.

1. 
https://stackoverflow.com/questions/6611346/how-are-the-fs-gs-registers-used-in-linux-amd64


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Neil Schemenauer
On 2019-08-10, Serhiy Storchaka wrote:
> Actually we need to distinguish the author and the user of the code and
> show warnings only to the author. Using .pyc files was just an heuristic:
> the author compiles the Python code, and the user uses compiled .pyc files.
> Would be nice to have more reliable way to determine the owning of the code.
> It is related not only to SyntaxWarnings, but to runtime
> DeprecationWarnings. Maybe silence warnings only for readonly files and make
> files installed by PIP readonly?

Identifying the author vs the user seems like a good idea.  Relying
on the OS filesystem seems like a solution that would cause some
challenges.  Can we embed that information in the .pyc file instead?
That way, Python knows that it is a module/package that has been
installed with pip or similar and that the end user is likely not the
author.


[Python-Dev] Re: How to extricate large set of diffs from hg.python.org/sandbox?

2019-08-07 Thread Neil Schemenauer
On 2019-08-07, Skip Montanaro wrote:
> Victor's experiments into a register-based virtual machine live here:
> 
> https://hg.python.org/sandbox/registervm
> 
> I'd like to revive them, if for no other reason to understand what he
> did. I see no obvious way to collect them all as a massive diff.

I think this might work:

$ hg diff -r fb80df16c4ff -r tip

Not sure fb80df16c4ff is the correct base revision.  It seems to be
the base of Victor's work.  I put the resulting patch file here:

http://python.ca/nas/python/registervm-victor.txt

If you are actively working on the register VM idea, shoot me an
email.  I'm interested in collaborating.

Regards,

  Neil


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-06 Thread Neil Schemenauer


Making it an error so soon would be a mistake, IMHO.  That will break
currently working code for little benefit.  When Python was a young
language with a few thousand users, it was easier to make these
kinds of changes.  Now, we should be much more conservative and give
people a long time and a lot of warning.  Ideally, we should provide
tools to fix code if possible.

Could PyPI and pip gain the ability to warn and even fix these
issues?  Having a warning from pip at install time could be better
than a warning at import time.  If linting was built into PyPI, we
could even do a census to see how many packages would be affected by
turning it into an error.

On 2019-08-05, raymond.hettin...@gmail.com wrote:
> P.S. In the world of C compilers, I suspect that if the relatively
> new compiler warnings were treated as errors, the breakage would
> be widespread. Presumably that's why they haven't gone down this
> road.

The comparison with C compilers is relevant.  C and C++ represent a
fairly extreme position on not breaking working code.  E.g. K & R
style function declarations were supported for decades.  I don't
think we need to go quite that far, but one or two releases is also
not enough time.

Regards,

  Neil


[Python-Dev] Re: Optimizing pymalloc (was obmalloc

2019-07-10 Thread Neil Schemenauer
On 2019-07-09, Inada Naoki wrote:
> PyObject_Malloc inlines pymalloc_alloc, and PyObject_Free inlines 
> pymalloc_free.
> But compiler doesn't know which is the hot part in pymalloc_alloc and
> pymalloc_free.

Hello Inada,

I don't see this on my PC.  I'm using GCC 8.3.0.  I have configured
the build with --enable-optimizations.  To speed up the profile
generation, I have changed PROFILE_TASK to only run these tests:

test_shelve test_set test_pprint test_pickletools
test_ordered_dict test_tabnanny test_difflib test_pickle
test_json test_collections

I haven't spent much time trying to figure out what set of tests is
best but the above set runs pretty quickly and seems to work okay.

I have run pyperformance to compare CPython 'master' with your PR
14674.  There doesn't seem to be a difference (table below).  If I
look at the disassembly, it seems that the hot paths of
pymalloc_alloc and pymalloc_free are being inlined as you would
hope, without needing the LIKELY/UNLIKELY annotations.

OTOH, your addition of LIKELY() and UNLIKELY() in the PR is a pretty
small change and probably doesn't hurt anything.  So, I think it
would be fine to merge it.

Regards,

  Neil


+-+-+-+
| Benchmark   | master  | PR-14674|
+=+=+=+
| 2to3| 305 ms  | 304 ms: 1.00x faster (-0%)  |
+-+-+-+
| chaos   | 109 ms  | 110 ms: 1.01x slower (+1%)  |
+-+-+-+
| crypto_pyaes| 118 ms  | 117 ms: 1.01x faster (-1%)  |
+-+-+-+
| django_template | 112 ms  | 114 ms: 1.02x slower (+2%)  |
+-+-+-+
| fannkuch| 446 ms  | 440 ms: 1.01x faster (-1%)  |
+-+-+-+
| float   | 119 ms  | 120 ms: 1.01x slower (+1%)  |
+-+-+-+
| go  | 247 ms  | 250 ms: 1.01x slower (+1%)  |
+-+-+-+
| json_loads  | 25.1 us | 24.4 us: 1.03x faster (-3%) |
+-+-+-+
| logging_simple  | 8.86 us | 8.66 us: 1.02x faster (-2%) |
+-+-+-+
| meteor_contest  | 97.5 ms | 97.7 ms: 1.00x slower (+0%) |
+-+-+-+
| nbody   | 140 ms  | 142 ms: 1.01x slower (+1%)  |
+-+-+-+
| pathlib | 19.2 ms | 18.9 ms: 1.01x faster (-1%) |
+-+-+-+
| pickle  | 8.95 us | 9.08 us: 1.02x slower (+2%) |
+-+-+-+
| pickle_dict | 18.1 us | 18.0 us: 1.01x faster (-1%) |
+-+-+-+
| pickle_list | 2.75 us | 2.68 us: 1.03x faster (-3%) |
+-+-+-+
| pidigits| 182 ms  | 184 ms: 1.01x slower (+1%)  |
+-+-+-+
| python_startup  | 7.83 ms | 7.81 ms: 1.00x faster (-0%) |
+-+-+-+
| python_startup_no_site  | 5.36 ms | 5.36 ms: 1.00x faster (-0%) |
+-+-+-+
| raytrace| 495 ms  | 499 ms: 1.01x slower (+1%)  |
+-+-+-+
| regex_dna   | 173 ms  | 170 ms: 1.01x faster (-1%)  |
+-+-+-+
| regex_effbot| 2.79 ms | 2.67 ms: 1.05x faster (-4%) |
+-+-+-+
| regex_v8| 21.1 ms | 21.2 ms: 1.00x slower (+0%) |
+-+-+-+
| richards| 68.2 ms | 68.7 ms: 1.01x slower (+1%) |
+-+-+-+
| scimark_monte_carlo | 103 ms  | 102 ms: 1.01x faster (-1%)  |
+-+-+-+
| scimark_sparse_mat_mult | 4.37 ms | 4.35 ms: 1.00x faster (-0%) |
+-+-+-+
| spectral_norm   | 132 ms  | 133 ms: 1.01x slower (+1%)  |
+-+-+-+
| sqlalchemy_imperative   | 30.3 ms | 30.7 ms: 1.01x slower 

[Python-Dev] Re: Optimizing pymalloc (was obmalloc

2019-07-09 Thread Neil Schemenauer
On 2019-07-09, Inada Naoki wrote:
> So I tried to use LIKELY/UNLIKELY macro to teach compiler hot part.
> But I need to use
> "static inline" for pymalloc_alloc and pymalloc_free yet [1].

I think LIKELY/UNLIKELY is not helpful if you compile with LTO/PGO
enabled.  So, I would try that first.  Also, if you use relatively
small static functions that are defined before use (no forward
declarations), I have found that GCC is usually smart about inlining
them.  So, I don't think you should have to use "static inline"
rather than just "static".

Good work looking into this.  Should be some relatively easy
performance win.

Cheers,

  Neil


[Python-Dev] Re: Removing dead bytecode vs reporting syntax errors

2019-07-05 Thread Neil Schemenauer
On 2019-07-06, Victor Stinner wrote:
> More people seems to expect "if 0: ..." to be removed, than people who
> care of syntax errors on "if 0".

One small data point: I have shipped code that depended on 'if 0'
removing code from the .pyc file.  The code inside was not meant to
be released publicly, in case someone inspects the .pyc file.

I could have solved the problem in a different way, e.g. have a tool
that removes all the code inside the 'if'.  Having a tool that
toggles 'if DEV_MODE' to 'if 0' was simpler.

I will freely admit that is a bit of a dirty solution.  I knew that
Python removes those blocks but I'm not sure that is guaranteed
anywhere.  I think it is maybe more important that we give the
syntax errors.
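
The behavior I relied on is easy to observe (current CPython behavior,
not a documented guarantee):

    import dis

    def f():
        if 0:
            secret = "not for release"   # this block is dropped by the compiler
        return 1

    dis.dis(f)   # the disassembly contains no trace of the "if 0:" body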

So, I don't care strongly one way or another.  However, there is
other code out there that likely depends on the behavior (not just
for code coverage).

Regards,

  Neil


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-21 Thread Neil Schemenauer
For those who would like to test with something compatible with
Python 3.7.3, I made re-based branches here:

 https://github.com/nascheme/cpython/tree/obmalloc_radix_v37
 https://github.com/nascheme/cpython/tree/obmalloc_big_pools_v37

They should be ABI compatible with Python 3.7.3.  So, if you just
re-build the "python" executable, you don't have to rebuild anything
else.  Both those use the same arena/pool sizes and they both have
Tim's arena thrashing fix.


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-21 Thread Neil Schemenauer
On 2019-06-21, Tim Peters wrote:
> [Thomas Wouters ]
> > Getting rid of address_in_range sounds like a nice idea, and I
> > would love to test how feasible it is -- I can run such a change
> > against a wide selection of code at work, including a lot of
> > third-party extension modules, but I don't see an easy way to do
> > it right now.
> 
> Neil's branch is here:
> 
>  https://github.com/nascheme/cpython/tree/obmalloc_radix_tree

If you can test vs some real-world programs, that would be great.
I was trying to run some tests this afternoon.  Testing with Python
3.8+ is a pain because of the PyCode_New and tp_print changes.  I've
just added two fixes to the head of the obmalloc_radix_tree branch
so that you can compile code generated by old versions of Cython.
Without those fixes, building 3rd party extensions can be a real
pain.

> My PR uses 16K pools and 1M arenas, quadrupling the status quo.
> Because "why not?" ;-)
> 
> Neil's branch has _generally_, but not always, used 16 MiB arenas.
> The larger the arenas in his branch, the smaller the radix tree needs
> to grow.

Currently I have it like your big-pool branch (16 KB pools, 1 MB arenas).


[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-15 Thread Neil Schemenauer
On 2019-06-15, Tim Peters wrote:
> At the start, obmalloc never returned arenas to the system.  The vast
> majority of users were fine with that.

Yeah, I was totally fine with that back in the day.  However, I
wonder now if there is a stronger reason to try to free memory back
to the OS.  Years ago, people would typically have swap space that
was as large or larger than their real RAM.  So, if the OS had to
swap out unused pages, it wasn't a big deal.  Now that disks are
relatively so much slower and RAM is larger, people don't have as
much swap.  Some Linux systems get set up without any.  Freeing
arenas seems more important than it used to be.

OTOH, I don't think obmalloc should try too hard. The whole point of
the small object allocator is to be really fast.  Anti-fragmentation
heuristics are going to slow it down.  As far as I'm concerned, it
works well enough as it is.


[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-15 Thread Neil Schemenauer
On 2019-06-15, Antoine Pitrou wrote:
> We should evaluate what problem we are trying to solve here, instead
> of staring at micro-benchmark numbers on an idle system.

I think a change to obmalloc is not going to get accepted unless we
can show it doesn't hurt these micro-benchmarks.  To displace the
status quo, it has to give other advantages as well.  I don't have
any agenda or "problem to solve".  After Tim made a PR to allow
obmalloc to use larger pools, I thought it would be interesting to
see if an arena mapping scheme based on radix trees could be
performance competitive.  I'm not proposing any changes to CPython
at this point.  I'm sharing the results of an experiment.  I thought
it was interesting.  I guess you don't.


[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-15 Thread Neil Schemenauer
Here are benchmark results for 64 MB arenas and 16 kB pools.  I ran
without the --fast option and on a Linux machine in single-user
mode.  The "base" column is the obmalloc-big-pools branch with
ARENA_SIZE = 64 MB and POOL_SIZE = 16 kB.  The "radix" column is
obmalloc_radix_tree (commit 5e00f6041) with the same arena and pool
sizes.

+-+-+-+
| Benchmark   | base (16kB/64MB)| radix (16KB/64MB)   |
+=+=+=+
| 2to3| 290 ms  | 292 ms: 1.00x slower (+0%)  |
+-+-+-+
| crypto_pyaes| 114 ms  | 116 ms: 1.02x slower (+2%)  |
+-+-+-+
| django_template | 109 ms  | 106 ms: 1.03x faster (-3%)  |
+-+-+-+
| dulwich_log | 75.2 ms | 74.5 ms: 1.01x faster (-1%) |
+-+-+-+
| fannkuch| 454 ms  | 449 ms: 1.01x faster (-1%)  |
+-+-+-+
| float   | 113 ms  | 111 ms: 1.01x faster (-1%)  |
+-+-+-+
| hexiom  | 9.45 ms | 9.47 ms: 1.00x slower (+0%) |
+-+-+-+
| json_dumps  | 10.6 ms | 11.1 ms: 1.04x slower (+4%) |
+-+-+-+
| json_loads  | 24.4 us | 25.2 us: 1.03x slower (+3%) |
+-+-+-+
| logging_simple  | 8.19 us | 8.37 us: 1.02x slower (+2%) |
+-+-+-+
| mako| 15.1 ms | 15.1 ms: 1.01x slower (+1%) |
+-+-+-+
| meteor_contest  | 98.3 ms | 97.1 ms: 1.01x faster (-1%) |
+-+-+-+
| nbody   | 142 ms  | 140 ms: 1.02x faster (-2%)  |
+-+-+-+
| nqueens | 93.8 ms | 93.0 ms: 1.01x faster (-1%) |
+-+-+-+
| pickle  | 8.89 us | 8.85 us: 1.01x faster (-0%) |
+-+-+-+
| pickle_dict | 17.9 us | 18.2 us: 1.01x slower (+1%) |
+-+-+-+
| pickle_list | 2.68 us | 2.64 us: 1.01x faster (-1%) |
+-+-+-+
| pidigits| 182 ms  | 184 ms: 1.01x slower (+1%)  |
+-+-+-+
| python_startup_no_site  | 5.31 ms | 5.33 ms: 1.00x slower (+0%) |
+-+-+-+
| raytrace| 483 ms  | 476 ms: 1.02x faster (-1%)  |
+-+-+-+
| regex_compile   | 167 ms  | 169 ms: 1.01x slower (+1%)  |
+-+-+-+
| regex_dna   | 170 ms  | 171 ms: 1.01x slower (+1%)  |
+-+-+-+
| regex_effbot| 2.70 ms | 2.75 ms: 1.02x slower (+2%) |
+-+-+-+
| regex_v8| 21.1 ms | 21.3 ms: 1.01x slower (+1%) |
+-+-+-+
| scimark_fft | 368 ms  | 371 ms: 1.01x slower (+1%)  |
+-+-+-+
| scimark_monte_carlo | 103 ms  | 101 ms: 1.02x faster (-2%)  |
+-+-+-+
| scimark_sparse_mat_mult | 4.31 ms | 4.27 ms: 1.01x faster (-1%) |
+-+-+-+
| spectral_norm   | 131 ms  | 135 ms: 1.03x slower (+3%)  |

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-14 Thread Neil Schemenauer
On 2019-06-14, Tim Peters wrote:
> However, last I looked there Neil was still using 4 KiB obmalloc
> pools, all page-aligned.  But using much larger arenas (16 MiB, 16
> times bigger than my branch, and 64 times bigger than Python currently
> uses).

I was testing it versus your obmalloc-big-pool branch and trying to
make it a fair comparison.  You are correct: 4 KiB pools and 16 MiB
arenas.  Maybe I should test with 16 KiB pools and 16 MiB arenas.
That seems a more optimized setting for current machines and
workloads.


[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-14 Thread Neil Schemenauer
On 2019-06-15, Inada Naoki wrote:
> Oh, do you mean your branch doesn't have headers in each page?

That's right.  Each pool still has a header but pools can be larger
than the page size.  Tim's obmalloc-big-pool idea writes something
to the head of each page within a pool.  The radix tree doesn't need
that and actually doesn't care about OS page size.

BTW, the current radix tree doesn't even require that pools are
aligned to POOL_SIZE.  We probably want to keep pools aligned
because other parts of obmalloc rely on that.

Here is the matchup of the radix tree vs the current
address_in_range() approach.

- nearly the same in terms of performance.  It might depend on OS
  and workload but based on my testing on Linux, they are very
  close.  Would be good to do more testing but I think the radix
  tree is not going to be faster, only slower.

- radix tree uses a bit more memory overhead.  Maybe 1 or 2 MiB on a
  64-bit OS.  The radix tree uses more as memory use goes up but it
  is a small fraction of total used memory.  The extra memory use is
  the main downside at this point, I think.

- the radix tree doesn't read uninitialized memory.  The current
  address_in_range() approach has worked very well but is relying on
  some assumptions about the OS (how it maps pages into the program
  address space).  This is the only aspect where the radix tree is
  clearly better.  I'm not sure this matters enough to offset the
  extra memory use.

- IMHO, the radix tree code is a bit simpler than Tim's
  obmalloc-big-pool code.  That's not a big deal though as long as
  the code works and is well commented (which Tim's code is).

My feeling right now is that Tim's obmalloc-big-pool is the best
design at this point.  Using 8 KB or 16 KB pools seems to be better
than 4 KB.  The extra complexity added by Tim's change is not so
nice.  obmalloc is already extremely subtle and obmalloc-big-pool
makes it more so.

Regards,

Neil


[Python-Dev] radix tree arena map for obmalloc

2019-06-14 Thread Neil Schemenauer
I've been working on this idea for a couple of days.  Tim Peters has
been helping me out and I think it has come far enough to get some
more feedback.  It is not yet a good replacement for the current
address_in_range() test.  However, performance-wise, it is very
close.  Tim figures we are not done optimizing it yet so maybe it
will get better.
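
For anyone who hasn't followed the thread, the core idea is to replace
address_in_range() with a lookup keyed on the high bits of the address.
A toy Python model of the lookup (illustrative only; the real code is C
inside obmalloc and the constants are not the real values):

    # Toy model; a dict stands in for the fixed-depth radix tree used in the
    # real (C) implementation, and the constants are made up.
    ARENA_BITS = 24                  # pretend arenas are 16 MiB and aligned

    arena_map = {}                   # high address bits -> arena record

    def register_arena(arena_base, arena):
        arena_map[arena_base >> ARENA_BITS] = arena

    def arena_for_address(addr):
        # Replacement for address_in_range(): a pure lookup, so it never reads
        # possibly uninitialized memory near the pool it is asked about.
        return arena_map.get(addr >> ARENA_BITS)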

Code is available on my github branch:

https://github.com/nascheme/cpython/tree/obmalloc_radix_tree

Tim's "obmalloc-big-pools" is what I have been comparing it to.  It
seems 8 KB pools are faster than 4 KB.  I applied Tim's arena
trashing fix (bpo-37257) to both branches.  Some rough (--fast)
pyperformance benchmark results are below.


+-+-+-+
| Benchmark   | obmalloc-big-pools  | obmalloc_radix  |
+=+=+=+
| crypto_pyaes| 168 ms  | 170 ms: 1.01x slower (+1%)  |
+-+-+-+
| hexiom  | 13.7 ms | 13.6 ms: 1.01x faster (-1%) |
+-+-+-+
| json_dumps  | 15.9 ms | 15.6 ms: 1.02x faster (-2%) |
+-+-+-+
| json_loads  | 36.9 us | 37.1 us: 1.01x slower (+1%) |
+-+-+-+
| meteor_contest  | 141 ms  | 139 ms: 1.02x faster (-2%)  |
+-+-+-+
| nqueens | 137 ms  | 140 ms: 1.02x slower (+2%)  |
+-+-+-+
| pickle_dict | 26.2 us | 25.9 us: 1.01x faster (-1%) |
+-+-+-+
| pickle_list | 3.91 us | 3.94 us: 1.01x slower (+1%) |
+-+-+-+
| python_startup_no_site  | 8.00 ms | 7.78 ms: 1.03x faster (-3%) |
+-+-+-+
| regex_dna   | 246 ms  | 241 ms: 1.02x faster (-2%)  |
+-+-+-+
| regex_v8| 29.6 ms | 30.0 ms: 1.01x slower (+1%) |
+-+-+-+
| richards| 93.9 ms | 92.7 ms: 1.01x faster (-1%) |
+-+-+-+
| scimark_fft | 525 ms  | 531 ms: 1.01x slower (+1%)  |
+-+-+-+
| scimark_sparse_mat_mult | 6.32 ms | 6.24 ms: 1.01x faster (-1%) |
+-+-+-+
| spectral_norm   | 195 ms  | 198 ms: 1.02x slower (+2%)  |
+-+-+-+
| sqlalchemy_imperative   | 49.5 ms | 50.5 ms: 1.02x slower (+2%) |
+-+-+-+
| sympy_expand| 691 ms  | 695 ms: 1.01x slower (+1%)  |
+-+-+-+
| unpickle_list   | 5.09 us | 5.32 us: 1.04x slower (+4%) |
+-+-+-+
| xml_etree_parse | 213 ms  | 215 ms: 1.01x slower (+1%)  |
+-+-+-+
| xml_etree_generate  | 134 ms  | 136 ms: 1.01x slower (+1%)  |
+-+-+-+
| xml_etree_process   | 103 ms  | 104 ms: 1.01x slower (+1%)  |
+-+-+-+

Not significant (34): 2to3; chameleon; chaos; deltablue;
django_template; dulwich_log; fannkuch; float; go; html5lib;
logging_format; logging_silent; logging_simple; mako; nbody;
pathlib; pickle; pidigits; python_startup; raytrace; regex_compile;
regex_effbot; scimark_lu; scimark_monte_carlo; scimark_sor;
sqlalchemy_declarative; sqlite_synth; sympy_integrate; sympy_sum;
sympy_str; telco; unpack_sequence; unpickle; xml_etree_iterparse

[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-10 Thread Neil Schemenauer
On 2019-06-09, Tim Peters wrote:
> And now there's a PR that removes obmalloc's limit on pool sizes, and,
> for a start, quadruples pool (and arena!) sizes on 64-bit boxes:

Neat.

> As the PR says,
> 
> """
> It would be great to get feedback from 64-bit apps that do massive
> amounts of small-object allocations and deallocations.
> """

I've done a little testing of the pool overhead.  I have an application
that uses many small dicts as holders of data.  The function:

sys._debugmallocstats()

is useful to get stats for the obmalloc pools.  Total data allocated
by obmalloc is 262 MB.  At the 4*PAGE_SIZE pool size, the wasted
space due to partly filled pools is only 0.18%.  For 16*PAGE_SIZE
pools, 0.71%.

I have a set of stats for another program.  In that case, total
memory allocated by obmalloc is 14 MB.  For 4*PAGE_SIZE pools,
wasted space is 0.78% of total.  At 16*PAGE_SIZE, it is 2.4%.

Based on that small set of data, using 4*PAGE_SIZE seems
conservative.  As I'm sure you realize, making pools bigger will
waste actual memory, not just virtual address space because you
write the arena pointer to each OS page.

I want to do performance profiling using Linux perf.  That should
show where the hotspot instructions are in the obmalloc code.  Maybe
that will be useful to you.

Another thought about address_in_range(): some operating systems
allow you to allocate memory at specific alignments.  Or, you can
even allocate a chunk of memory at a fixed memory location if you do
the correct magic incantation.  I noticed that Go does that.  I
imagine doing that has a bunch of associated challenges.
However, if we could control the alignment and memory location of
obmalloc arenas, we would not have the segv problem of
address_in_range().  It's probably not worth going down that path
due to the problems involved.

Regards,

  Neil


[Python-Dev] Re: Expected stability of PyCode_New() and types.CodeType() signatures

2019-06-08 Thread Neil Schemenauer
On 2019-05-31, Simon Cross wrote:
> As the maintainer of Genshi, one the libraries affected by the CodeType and
> similar changes, I thought I could add a users perspective to the
> discussion:
[...]

Thanks.  I think this change to PyCode_New() could have been handled
a bit better.  Couldn't we introduce a new CPP define that enables
the revised API of PyCode_New()?  In 3.8, extensions would get the
backwards-compatible API (and a warning) unless they set the define.
In 3.9, we would enable the new API by default.  That gives
3rd party extensions one release cycle to catch up to the change.
Perhaps something similar could be done for CodeType called from
within Python code.

In this case, it seems that introducing a new API like
PyCode_NewEx() is not the way.  However, just changing an API like
that is not very friendly to 3rd party extensions, even if we don't
claim it is a stable API.  For me, the change to PyCode_New() means
that I can't test 3.8.0b1 because the 3rd party extensions I rely on
don't compile with it.  Normally I try to test my application code
with the latest alpha and beta releases.

It would be great if we had a system that did CI testing with the
top PyPI modules.  E.g. pull the latest versions of the top 100 PyPI
modules and test them with the latest CPython branch.  With that, at
least we would know what the fallout would be from some incompatible
CPython change.  Setting that system up would be a fair amount of
work but I suspect the PSF could fund someone who puts together a
plan to do it.  Such a system would be even more useful if we start
moving stuff out of stdlib into PyPI.


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-07 Thread Neil Schemenauer
On 2019-06-06, Tim Peters wrote:
> The doubly linked lists in gc primarily support efficient
> _partitioning_ of objects for gc's purposes (a union of disjoint sets,
> with constant-time moving of an object from one set to another, and
> constant-time union of disjoint sets).  "All objects" is almost never
> interesting to it (it is only when the oldest non-frozen generation is
> being collected).

My current idea is to put partitioning flags on the interior radix
tree nodes.  If you mark an object as "finalizer reachable", for
example, it would mark all the nodes on the path from the root with
that flag.  Then, when you want to iterate over all the GC objects
with a flag, you can avoid uninteresting branches of the tree.
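
A toy sketch of that idea, with a made-up fanout and made-up names,
just to show the shape of it.  Each interior node keeps the union of
the flags below it, so a traversal can prune whole subtrees that
contain nothing of interest:

    #include <stdint.h>
    #include <stddef.h>

    #define RADIX_FANOUT 256
    #define FLAG_FINALIZER_REACHABLE 0x01

    typedef struct radix_node {
        uint8_t flags;                           /* union of all flags below */
        struct radix_node *child[RADIX_FANOUT];  /* NULL where absent */
    } radix_node;

    /* Flag one object: set the flag on every node along the path from the
       root down to the leaf that covers the object. */
    static void
    mark_path(radix_node **path, int depth, uint8_t flag)
    {
        for (int i = 0; i < depth; i++)
            path[i]->flags |= flag;
    }

    /* Iterate only over subtrees that can contain flagged objects. */
    static void
    visit_flagged(radix_node *node, uint8_t flag, void (*visit)(radix_node *))
    {
        if (node == NULL || (node->flags & flag) == 0)
            return;                    /* prune this whole branch */
        visit(node);
        for (int i = 0; i < RADIX_FANOUT; i++)
            visit_flagged(node->child[i], flag, visit);
    }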

For generations, maybe tracking them at the pool level is good
enough.  Interior nodes can track generations too (i.e. the youngest
generation contained under them).

My gut feeling is that the prev/next pointer updates done by
move_unreachable() and similar functions must be quite expensive.
Doing the traversal with an explicit stack is a lot less elegant but
I think it should be faster.  At least, when you are dealing with a
big set of GC objects that don't fit in the CPU cache.

Regards,

  Neil
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J422RWENKJAYHMXSZVRV5KGWSHNMAMJF/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-07 Thread Neil Schemenauer
On 2019-06-06, Tim Peters wrote:
> Like now:  if the size were passed in, obmalloc could test the size
> instead of doing the `address_in_range()` dance(*).  But if it's ever
> possible that the size won't be passed in, all the machinery
> supporting `address_in_range()` still needs to be there, and every
> obmalloc spelling of malloc/realloc needs to ensure that machinery
> will work if the returned address is passed back to an obmalloc
> free/realloc spelling without the size.


We can almost make it work for GC objects; the use of obmalloc is
quite well encapsulated.  I think I intentionally designed the
PyObject_GC_New/PyObject_GC_Del/etc APIs that way.

Quick and dirty experiment is here:

https://github.com/nascheme/cpython/tree/gc_malloc_free_size

The major hitch seems to be my new gc_obj_size() function.  We can't be
sure the 'nbytes' passed to _PyObject_GC_Malloc() is the same as
what is computed by gc_obj_size().  It usually works but there are
exceptions (the freelists for frame objects and tuple objects, for example).

A nasty problem is the weirdness with PyType_GenericAlloc() and the
sentinel item.  _PyObject_GC_NewVar() doesn't include space for the
sentinel but PyType_GenericAlloc() does.  When you get to
gc_obj_size(), you don't know if you should use "nitems" or "nitems+1".
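
Roughly, the helper looks like this (a simplified sketch, not the
exact code in the branch above); the comment marks exactly where the
sentinel ambiguity bites:

    #include <Python.h>

    static size_t
    gc_obj_size(PyObject *op)
    {
        PyTypeObject *tp = Py_TYPE(op);
        size_t nitems = 0;
        if (tp->tp_itemsize) {
            /* Py_SIZE(op) or Py_SIZE(op) + 1?  PyType_GenericAlloc()
               allocated space for the sentinel, _PyObject_GC_NewVar()
               did not. */
            nitems = (size_t)Py_SIZE(op);
        }
        return sizeof(PyGC_Head) + tp->tp_basicsize + nitems * tp->tp_itemsize;
    }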

I'm not sure how to fix the sentinel issue.  Maybe a new type slot
or a type flag?  In any case, making a change like my git branch
above would almost certainly break extensions that don't play
nicely.  It won't be hard to make it a build option, like the
original gcmodule was.  Then, assuming there is a performance boost,
people can enable it if their extensions are friendly.


> The "only"problem with address_in_range is that it limits us to a
> maximum pool size of 4K.  Just for fun, I boosted that to 8K to see
> how likely segfaults really are, and a Python built that way couldn't
> even get to its first prompt before dying with an access violation
> (Windows-speak for segfault).

If we can make the above idea work, you could set the pool size to
8K without issue.  A possible problem is that the obmalloc and
gcmalloc arenas are separate.  I suppose that affects 
performance testing.

> We could eliminate the pool size restriction in many ways.  For
> example, we could store the addresses obtained from the system
> malloc/realloc - but not yet freed - in a set, perhaps implemented as
> a radix tree to cut the memory burden.  But digging through 3 or 4
> levels of a radix tree to determine membership is probably
> significantly slower than address_in_range.

You are likely correct. I'm hoping to benchmark the radix tree idea.
I'm not too far from having it working such that it can replace
address_in_range().  Maybe allocating gc_refs as a block would
offset the radix tree cost vs address_in_range().  If the above idea
works, we know the object size at free() and realloc(), so we don't
need address_in_range() for those code paths.

Regards,

  Neil
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ILFK2MTCVA7GB7JGBVSUWASKJ7T4LLJE/


Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-22 Thread Neil Schemenauer
On 2019-05-21, Brett Cannon wrote:
> On Tue., May 21, 2019, 13:07 Neil Schemenauer, 
> wrote:

> > Here is an alternative, straw man, proposal.  Split the CPython repo
> > into two parts:
> >
> > - core Python: minimal possible stdlib
> > - everything else
> 
> How to this lighten the burden of maintaining those modules which aren't in
> the core install? Are you suggesting modules in the core install get
> serious focus and the rest is more of a legacy, unsupported release for
> some time to give people an extended period to shift to the core install?
> Or do you have something else in mind?

It would give us the freedom to choose how we want to do it.  It
would give a lightweight Python install for the people who don't need
all the batteries, much lighter than what the PEP 594 strategy could
provide.

For CI, we can decide what should be tested.  Obviously the core
part is always tested.  Initially, we could continue testing all
parts of non-core.  Later, we could decide that certain parts of
non-core get tested less often (e.g. full nntplib tests only
nightly).  BTW, I think it would be great if automated nightly jobs
could also run tests for PyPI modules like NumPy, requests,
Pandas, etc.

The installers could offer options as to which parts of the non-core
library to install.  Modules that no longer receive the same quality
of development and testing could get moved to a "deprecated"
section.  Users who want the best backwards compatibility would
install everything.  If we want to remove something from the
"deprecated" section, I think we should give a lot of notice.  A
couple of years is not enough.

Here is a sketch for a Linux-like package system:

python3-stdlib-base
all recommended stdlib packages (e.g. same as stdlib after
PEP 594 removals)

python3-stdlib-deprecated
packages suggested for removal by PEP 594

python3-stdlib-broken
packages that have bugs or are really not recommended to
be used.  I'm not sure if we need this but stuff like crypt
could go here.

python3-stdlib-all
depends on the above three packages


Ideally these packages don't contain the module files themselves.
Instead, they depend on individual packages for each module.  E.g.
python3-stdlib-deprecated would depend on python3-nntplib.  So,
someone could install python3-stdlib-base and python3-nntplib if
that's all they need.

I'm not familiar with the internals of 'pip' but I would guess we
could do a similar thing by creating meta PyPI packages that
correspond to these sets of packages.  So, someone could download
the small "core" Python installer and then run:

pip install stdlib-base

or something like that.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-21 Thread Neil Schemenauer
On 2019-05-21, Terry Reedy wrote:
> The problem with this argument, taken by itself, it that it would argue for
> adding to the stdlib 100s or 1000s of modules or packages that would be
> useful to many more people than the modules proposed to be dropped.

I don't think it does.  We are not talking about 100s or 1000s of
modules.  We are talking about modules which have been in Python's
stdlib for years or decades.  If I have a script that uses one of
these modules and it gets removed, my script breaks.

Installing it from PyPI is not really a great solution.  We are
going to be breaking working scripts just like if we add new
language keywords, etc.  I think we need to be extremely careful
with trying to maintain backwards compatibility, at least as far as
we reasonably can.

The problem I have with this PEP is that I think it is both too
aggressive and too conservative at the same time.  For almost all
modules on the list, I'm sure there will be many people who are
harmed by its removal.  OTOH, having to maintain all of the modules
in the stdlib is a heavy burden.  Also, new users can be lured into
using a module that is not really the best solution anymore.

Here is an alternative, straw man, proposal.  Split the CPython repo
into two parts:

- core Python: minimal possible stdlib
- everything else

When Python is released, provide installers for a Python that only
includes the "core" part and a second installer that includes
everything else.  I realize this is more work for the release team.
Hopefully with some scripting, it will not be too labour intensive.

The core Python installer should become the recommended installer.
People who need backwards compatibility with older versions of Python
can download the big installer package.

To help the people who need 100s or 1000s of extra PyPI packages, we
could develop a tool that creates a "sumo" Python installer,
grabbing packages from PyPI and building an installer package.  To
install that package, you would not need network access.  That
doesn't need to happen right away.  Also, maybe other Python
distributions can fill that need if core Python devs don't want to
build it.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode

2019-04-29 Thread Neil Schemenauer
On 2019-04-27, Nathaniel Smith wrote:
> For Py_TRACE_REFS specifically, IIUC the only goal is to be able to produce
> a list of all live objects on demand. If that's the goal, then static type
> objects aren't a huge deal. You can't add extra data into the type objects
> themselves, but since there's a fixed set of them and they're immortal, you
> can just build a static list of all of them in PyType_Ready.

As far as I understand, we have a similar problem already for
gc.get_objects() because those static type objects don't have a
PyGC_Head.  My 2-cent proposal for fixing things in the long term
would be to introduce a function like PyType_Ready that returns a
pointer to the new type.  The argument to it would be the
current static type structure.  The function would copy things from
the static type structure into a newly allocated type structure.

We have a kind of solution already with PyType_FromSpec, etc.
However, I think it is harder to convert existing extension module
source code to use that API.  We want to make it very easy for
people to fix source code.

If we can remove static types, that would allow us to kill off
Py_TYPE(o)->tp_is_gc(o).  I understand why that exists but I think
it is quite an ugly detail of the current GC implementation.  I
wonder about the performance impact of it given current memory
latencies.  When we do a full GC run, we call PyObject_IS_GC() on
many objects.  I fear having to lookup and call tp_is_gc could be
quite expensive.

I've been playing with the idea of using memory bitmaps rather than
the PyGC_Head.  That idea seems to depend on removing static type
objects.  Initially I was thinking of it as reducing the memory
overhead for GC types.  Now I think the memory overhead doesn't
matter too much but perhaps the bitmaps would be much faster due to
memory latency.  There is an interesting Youtube video that compares
vector traversals vs linked list traversals in C++.  Linked lists on
modern machines are really terrible.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode

2019-04-24 Thread Neil Schemenauer
On 2019-04-24, Victor Stinner wrote:
> The current blocker issue is that the Py_DEBUG define imply the
> Py_TRACE_REFS define

I think your change to make Py_TRACE_REFS a separate configure flag
is fine.  I've used the trace fields to debug occasionally but I
don't use them often enough to need them enabled by Py_DEBUG.

> Being able to switch between Python in release mode and Python in
> debug mode is a first step. My long term plan would be to better
> separate "Python" from its "runtime".

Regarding the Py_TRACE_REFS fields, I think we can't do them without
breaking the ABI because of the following.  For GC objects, they are
always allocated by _PyObject_GC_New/_PyObject_GC_NewVar.  So, we
can allocate the extra space needed for the GC linked list.  For
non-GC objects, that's not the case.  Extensions can allocate using
malloc() directly or their own allocator and then pass that memory
to be initialized as a PyObject.

I think that's a poor design and I think we should try to make slow
progress in fixing it.  I think non-GC objects should also get
allocated by a Python API.  In that case, the Py_TRACE_REFS
functionality could be implemented in a way that doesn't break the
ABI.  It also makes the CPython API more friendly for alternative
Python runtimes like PyPy, etc.

Note that this change would not prevent an extension from allocating
memory with its own allocator.  It just means that memory can't
hold a PyObject.  The extension PyObject would need to have a
pointer that points to this externally allocated memory.
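
To make that concrete, this is the (currently legal) pattern that
causes the trouble.  It is only a sketch: MyObject and MyType are
invented here and MyType is assumed to be defined elsewhere in the
extension:

    #include <Python.h>
    #include <stdlib.h>

    typedef struct {
        PyObject_HEAD
        int payload;
    } MyObject;

    extern PyTypeObject MyType;     /* assumed to exist in the extension */

    static PyObject *
    make_one(void)
    {
        /* The extension obtains raw memory itself... */
        MyObject *op = (MyObject *)malloc(sizeof(MyObject));
        if (op == NULL)
            return PyErr_NoMemory();
        /* ...and only afterwards turns it into a PyObject.  The runtime
           never gets a chance to reserve extra space in front of the
           object (e.g. the Py_TRACE_REFS link fields), which is why those
           fields change the ABI. */
        (void)PyObject_INIT((PyObject *)op, &MyType);
        op->payload = 0;
        return (PyObject *)op;
    }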

I can imagine there could be some situations when people really
want a PyObject to reside in a certain memory location.  E.g. maybe
you have some kind of special shared memory area.  In that case, I
think we could have specialized APIs to create PyObjects using a
specialized allocator.  Those APIs would not be supported by
some runtimes (e.g. tracing/moving GC for PyObjects) and the APIs
would not be used by most extensions.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 556 threaded garbage collection & linear recursion in gc

2019-03-28 Thread Neil Schemenauer
On 2019-03-28, Antoine Pitrou wrote:
> On Wed, 27 Mar 2019 15:59:25 -0700
> "Gregory P. Smith"  wrote:
> > 
> > That had a C++ stack trace 1000+ levels deep repeating the pattern of
> > 
> > ...
> > @ 0x564d59bd21de 32  func_dealloc
> > @ 0x564d59bce0c1 32  cell_dealloc
> > @ 0x564d5839db41 48  tupledealloc
> > @ 0x564d59bd21de 32  func_dealloc
> > @ 0x564d59bce0c1 32  cell_dealloc
> > @ 0x564d5839db41 48  tupledealloc
> > ...
> 
> As Tim said, if you still have a core dump somewhere (or can reproduce
> the issue) it would be nice to know why the "trashcan" mechanism didn't
> trigger.

To expand on this, every time tupledealloc gets called,
Py_TRASHCAN_SAFE_BEGIN also gets invoked.  It increments
tstate->trash_delete_nesting.  As Tim suggests, maybe
PyTrash_UNWIND_LEVEL is too large given the size of the C stack
frames from func_dealloc + cell_dealloc + tupledealloc.

That theory seems hard to believe though, unless the C stack is
quite small.  I see PyTrash_UNWIND_LEVEL = 50.  Perhaps the stack
could have been mostly used up before the dealloc sequence started.

The other option is that there is some bug in the trashcan
mechanism.  It certainly is some very tricky code.
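
For anyone who hasn't looked inside it, here is a tiny standalone
model of the mechanism (not the real macros, and the constants are
only illustrative): recursion proceeds normally until the nesting
counter hits the limit, anything deeper is parked on a "delete later"
list, and that list is drained iteratively once the outermost call
unwinds:

    #include <stdio.h>
    #include <stdlib.h>

    #define UNWIND_LEVEL 50           /* cf. PyTrash_UNWIND_LEVEL */

    typedef struct node {
        struct node *next;            /* the object's own reference */
        struct node *trash_next;      /* link reused for the "later" list */
    } node;

    static int nesting = 0;
    static node *delete_later = NULL;

    static void dealloc(node *n);

    static void
    dealloc_inner(node *n)
    {
        node *child = n->next;
        free(n);
        if (child != NULL)
            dealloc(child);           /* this is where recursion happens */
    }

    static void
    dealloc(node *n)
    {
        if (nesting >= UNWIND_LEVEL) {
            /* Too deep: park the object for an outer frame to handle. */
            n->trash_next = delete_later;
            delete_later = n;
            return;
        }
        nesting++;
        dealloc_inner(n);
        nesting--;
        if (nesting == 0) {
            /* Drain the parked objects without growing the C stack. */
            while (delete_later != NULL) {
                node *t = delete_later;
                delete_later = t->trash_next;
                nesting++;
                dealloc_inner(t);
                nesting--;
            }
        }
    }

    int
    main(void)
    {
        node *head = NULL;
        for (int i = 0; i < 100000; i++) {   /* deep chain, shallow C stack */
            node *n = malloc(sizeof(node));
            if (n == NULL)
                abort();
            n->next = head;
            n->trash_next = NULL;
            head = n;
        }
        dealloc(head);
        printf("done without deep recursion\n");
        return 0;
    }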

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Register-based VM [Was: Possible performance regression]

2019-03-11 Thread Neil Schemenauer
On 2019-02-27, Victor Stinner wrote:
> The compiler begins with using static single assignment form (SSA) but
> then uses a register allocator to reduce the number of used registers.
> Usually, at the end you have less than 5 registers for a whole
> function.

In case anyone is interested on working on this, I dug up some
discussion from years ago.  Advice from Tim Peters:

[Python-Dev] Rattlesnake progress
https://mail.python.org/pipermail/python-dev/2002-February/020172.html
https://mail.python.org/pipermail/python-dev/2002-February/020182.html
https://mail.python.org/pipermail/python-dev/2002-February/020146.html

Doing a prototype register-based compiler in Python seems like a
good idea.  Using the 'compiler' package would give you a good
start.  I think this is the most recent version of that package:

https://github.com/pfalcon/python-compiler

Based on a little poking around, I think it has not been updated for
the 16-bit word code.  Shouldn't be too hard to make it work though.

I was thinking about the code format on the weekend.  Using
three-register opcodes seems like a good idea.  We could retain
the 16-bit word code format.  For opcodes that use three registers,
use a second word for the last two registers.  I.e.

<8 bit opcode><8 bit register #>
<8 bit register #><8 bit register #>

Limit the number of registers to 256.  If you run out, just push and
pop from the stack.  You want to keep the instruction decode path in the
evaluation loop simple and not confuse the CPU branch predictor.
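
A sketch of what decoding that format could look like in the eval
loop.  The struct, the names, and the "opcodes >= 128 take three
registers" convention are all invented here for illustration:

    #include <stdint.h>

    typedef struct {
        int opcode;
        int r_dst;
        int r_src1;
        int r_src2;
    } decoded_instr;

    /* Made-up convention so the sketch is self-contained. */
    static int
    opcode_has_three_regs(int opcode)
    {
        return opcode >= 128;
    }

    static const uint16_t *
    decode(const uint16_t *next_instr, decoded_instr *out)
    {
        uint16_t word = *next_instr++;
        out->opcode = word >> 8;          /* 8-bit opcode */
        out->r_dst  = word & 0xff;        /* 8-bit register number */
        out->r_src1 = out->r_src2 = 0;
        if (opcode_has_three_regs(out->opcode)) {
            uint16_t word2 = *next_instr++;
            out->r_src1 = word2 >> 8;
            out->r_src2 = word2 & 0xff;
        }
        return next_instr;
    }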

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [RELEASE] Python 3.8.0a1 is now available for testing

2019-02-28 Thread Neil Schemenauer
On 2019-02-26, Stephane Wirtel wrote:
> I also filled an issue [2] for brotlipy (used by httpbin and requests).
> The problem is with PyInterpreterState.

I tried compiling psycopg2 today and it has a similar problem:

psycopg/psycopgmodule.c: In function ‘psyco_is_main_interp’:
psycopg/psycopgmodule.c:689:18: error: dereferencing pointer to incomplete 
type ‘PyInterpreterState’ {aka ‘struct _is’}
 while (interp->next)

That code is inside a function:

/* Return nonzero if the current one is the main interpreter */
static int
psyco_is_main_interp(void)
...

I believe the correct fix is to use PEP 3121 per-interpreter module
state.  I created a new issue:

https://github.com/psycopg/psycopg2/issues/854

I think the fix is not trivial as the psycopgmodule.c source code has
to change a fair bit to use the PEP 3121 APIs.
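
For reference, the PEP 3121 shape the module would have to move to
looks roughly like this (a sketch, not actual psycopg2 code; the
names are illustrative).  Per-interpreter state lives in memory owned
by the module object instead of in C statics, so there is no need to
walk interp->next at all:

    #include <Python.h>

    typedef struct {
        PyObject *some_cached_object;    /* example of per-interpreter state */
    } module_state;

    static PyMethodDef module_methods[] = {
        {NULL, NULL, 0, NULL}
    };

    static struct PyModuleDef psycopg_def = {
        PyModuleDef_HEAD_INIT,
        "_psycopg",                      /* name is illustrative */
        NULL,                            /* docstring */
        sizeof(module_state),            /* m_size > 0: per-interpreter state */
        module_methods,
    };

    PyMODINIT_FUNC
    PyInit__psycopg(void)
    {
        PyObject *m = PyModule_Create(&psycopg_def);
        if (m == NULL)
            return NULL;
        module_state *st = PyModule_GetState(m);
        st->some_cached_object = NULL;   /* filled in during module init */
        return m;
    }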

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Register-based VM [Was: Possible performance regression]

2019-02-26 Thread Neil Schemenauer
On 2019-02-27, Greg Ewing wrote:
> Joe Jevnik via Python-Dev wrote:
> > If Python switched to a global stack and global registers we may be able
> > to eliminate a lot of instructions that just shuffle data from the
> > caller's stack to the callee's stack.
> 
> That would make implementing generators more complicated.

Right.  I wonder though, could we avoid allocating the Python frame
object until we actually need it?  Two situations when you need a
heap allocated frame come to mind immediately: generators that are
suspended and frames as part of a traceback.  I guess
sys._getframe() is another.  Any more?

I'm thinking that perhaps for regular Python functions and regular
calls, you could defer creating the full PyFrame object and put the
locals, stack, etc on the C stack.  That would make calling Python
functions a lot similar to the machine calling convention and
presumably could be much faster.  If you do need the frame object,
copy over the data from the C stack into the frame structure.

I'm sure there are all kinds of reasons why this idea is not easy to
implement or not possible.  It seems somewhat possible though.  I
wonder how IronPython works in this respect?  Apparently it doesn't
support sys._getframe().

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Neil Schemenauer
On 2019-02-26, Raymond Hettinger wrote:
> That said, I'm only observing the effect when building with the
> Mac default Clang (Apple LLVM version 10.0.0 (clang-1000.11.45.5).
> When building GCC 8.3.0, there is no change in performance.

My guess is that the code in _PyEval_EvalFrameDefault() got changed
enough that Clang started emitting a bit different machine code.  If
the conditional jumps are a bit different, I understand that could
make a significant difference to performance.

Are you compiling with --enable-optimizations (i.e. PGO)?  In my
experience, that is needed to get meaningful results.  Victor also
mentions that on his "how-to-get-stable-benchmarks" page.  Building
with PGO is really (really) slow so I supect you are not doing it
when bisecting.  You can speed it up greatly by using a simpler
command for PROFILE_TASK in Makefile.pre.in.  E.g.

PROFILE_TASK=$(srcdir)/my_benchmark.py

Now that you have narrowed it down to a single commit, it would be
worth doing the comparison with PGO builds (assuming Clang supports
that).

> That said, it seems to be compiler specific and only affects the
> Mac builds, so maybe we can decide that we don't care.

I think the key question is if the ceval loop got a bit slower due
to logic changes or if Clang just happened to generate a bit worse
code due to source code details.  A PGO build could help answer
that.  I suppose trying to compare machine code is going to produce
too large of a diff.

Could you try hoisting the eval_breaker expression, as suggested by
Antoine:

https://discuss.python.org/t/profiling-cpython-with-perf/940/2

If you think a slowdown affects most opcodes, I think the DISPATCH
change looks like the only cause.  Maybe I missed something though.

Also, maybe there would be some value in marking key branches as
likely/unlikely if it helps Clang generate better machine code.
Then, even if you compile without PGO (as many people do), you still
get the better machine code.
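
The usual GCC/Clang spelling of that, just as a sketch (the
eval_breaker line is only an illustration of where it would go):

    #if defined(__GNUC__) || defined(__clang__)
    #  define LIKELY(x)   __builtin_expect(!!(x), 1)
    #  define UNLIKELY(x) __builtin_expect(!!(x), 0)
    #else
    #  define LIKELY(x)   (x)
    #  define UNLIKELY(x) (x)
    #endif

    /* e.g. in the dispatch code, the eval_breaker check is almost always
       false, so something like
           if (UNLIKELY(eval_breaker_is_set)) { ... handle signals etc ... }
       lets the compiler lay out the hot path even without PGO. */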

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Compile-time resolution of packages [Was: Another update for PEP 394...]

2019-02-26 Thread Neil Schemenauer
On 2019-02-26, Gregory P. Smith wrote:
> On Tue, Feb 26, 2019 at 9:55 AM Barry Warsaw  wrote:
> For an OS distro provided interpreter, being able to restrict its use to
> only OS distro provided software would be ideal (so ideal that people who
> haven't learned the hard distro maintenance lessons may hate me for it).

Interesting idea.  I remember when I was helping develop Debian
packaging guides for Python software.   I had to fight with people
to convince them that Debian packages should use 

#!/usr/bin/pythonX.Y

rather than

#!/usr/bin/env python

The situation is much better now but I still sometimes have
packaged software fail because it picks up my version of
/usr/local/bin/python.  I don't understand how people can believe
grabbing /usr/local/bin/python is going to be a way to build a
reliable system.

> Such a restriction could be implemented within the interpreter itself. For
> example: Say that only this set of fully qualified path whitelisted .py
> files are allowed to invoke it, with no interactive, stdin, or command line
> "-c" use allowed.

I think this is related to an idea I was tinkering with on the
weekend.  Why shouldn't we do more compile time linkage of Python
packages?  At least, I think we should give people the option to do it.
Obviously you still need to also support run-time import search
(interactive REPL, support for __import__(unknown_at_compiletime)).

Here is the sketch of the idea (probably half-baked, as most of my
ideas are):

- add PYTHONPACKAGES envvar and -p options to 'python'

- the argument for these options would be a colon separated list of
  Python package archives (crates, bales, bundles?).  The -p option
  could be a colon separated list or provided multiple times to
  specify more packages.

- the modules/packages contained in those archives become the
  preferred bytecode source when those names are imported.  We
  look there first.  The crawling around behavior (dynamic import
  based on sys.path) happens only if a module is not found and could
  be turned off.

- the linking of the modules could be computed when the code is
  compiled and the package archive created, rather than when the
  'import' statement gets executed.  This would provide a number of
  advantages.  It would be faster.  Code analysis tools could
  statically determine which module the imported code corresponds to.
  E.g. if your code calls module.foo, assuming no monkey patching,
  you know what code 'foo' actually is.

- to get extra fancy, the package archives could be dynamic
  link libraries containing "frozen modules" like this FB experiment:
  https://github.com/python/cpython/pull/9320
  That way, you avoid the unmarshal step and just execute the module
  bytecode directly.  On startup, Python would dlopen all of the
  package archives specified by PYTHONPACKAGES.  On init, it would
  build an index of the package tree and it would have the memory
  location for the code object for each module.
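
As a concrete sketch of that last point, the existing frozen-module
machinery could carry it: a package archive would export a table like
the one below, and startup code would point PyImport_FrozenModules at
it before Py_Initialize().  The byte array here is only a placeholder;
a real archive would embed marshalled code objects generated at build
time, and the names are invented:

    #include <Python.h>

    /* Placeholder; real data would be valid marshalled code objects. */
    static const unsigned char frozen_mypkg_util_code[] = { 0 };

    static const struct _frozen bundled_modules[] = {
        {"mypkg.util", frozen_mypkg_util_code,
         (int)sizeof(frozen_mypkg_util_code)},
        {NULL, NULL, 0}
    };

    void
    install_bundled_modules(void)
    {
        /* Must run before Py_Initialize(). */
        PyImport_FrozenModules = bundled_modules;
    }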

That would seem like quite a useful thing.  For an application like
Mercurial, they could build all the modules/packages required into a
single package archive.  Or, there would be a small number of
archives (one for standard Python library, one for everything else
that Mercurial needs).

Now that I write this, it sounds a lot like the debate between
static linking and dynamic linking.  Golang does static linking and
people seem to like the single executable distribution.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Register-based VM [Was: Possible performance regression]

2019-02-26 Thread Neil Schemenauer
On 2019-02-26, Victor Stinner wrote:
> I made an attempt once and it was faster:
> https://faster-cpython.readthedocs.io/registervm.html

Interesting.  I don't think I have seen that before.  Were you aware
of "Rattlesnake" before you started on that?  It seems your approach
is similar.  Probably not because I don't think it is easy to find.
I uploaded a tarfile I had on my PC to my web site:

http://python.ca/nas/python/rattlesnake20010813/

It seems his name doesn't appear in the readme or source but I think
Rattlesnake was Skip Montanaro's project.  I suppose my idea of
unifying the local variables and the registers could have came from
Rattlesnake.  Very little new in the world. ;-P

Cheers,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Possible performance regression

2019-02-26 Thread Neil Schemenauer
On 2019-02-25, Eric Snow wrote:
> So it looks like commit ef4ac967 is not responsible for a performance
> regression.

I did a bit of exploration myself and that was my conclusion as
well.  Perhaps others would be interested in how to use "perf" so I
did a little write up:

https://discuss.python.org/t/profiling-cpython-with-perf/940

To me, it looks like using a register based VM could produce a
pretty decent speedup.  Research project for someone. ;-)

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Another update for PEP 394 -- The "python" Command on Unix-Like Systems

2019-02-13 Thread Neil Schemenauer
On 2019-02-13, Terry Reedy wrote:
> It appears python is already python3 for a large majority of human users (as
> opposed to machines).

IMHO, the question about where /usr/bin/python points is more
important for machines than for humans.  Think about changing
/bin/sh to some different version of the Borne Shell that changes
'echo'.  Or changing 'awk' to some incompatible version.  That's
going to break a lot of scripts (cron jobs, etc).

I experienced the bad old days when you couldn't rely on /bin/sh to
be a proper POSIX shell.  It was a mess and it wasted countless
hours of human life to work around all the flavours.  Python is not
as fundamental as the Unix shell but it has replaced a lot of shell
scripting.

How can we avoid making a lot of work for people?  I don't see an
easy answer.  We don't want Python to become frozen forever (whether
it is called 'python', 'python3', or 'py').  OTOH, making
/usr/bin/python point to the most recent X.Y release doesn't seem
like a good solution either.  For example, if I used 'async' as a
variable in some of my scripts and then 3.7 broke them.

Should we dust off PEP 407 "New release cycle and introducing
long-term support versions"?  Having /usr/bin/python point to a LTS
release seems better to me.  I don't know if the core developers are
willing to support PEP 407 though.  Maybe OS packagers like Red Hat
and Debian will already do something like LTS releases and core
developers don't need to.  /usr/bin/python in Red Hat has behaved
like that, as far as I know.

Another idea is that we could adopt something like the Rust
"language edition" system.  Obviously lots of details to be worked
out.  If we had that, the 'py' command could take an argument to
specify the Python edition.  OTOH, perhaps deprecation warnings and
__future__ achieves most of the same benefits.  Maintaining
different editions sounds like a lot of work.  More work than doing
LTS releases.

Maybe the solution is just that we become a lot more careful about
making incompatible changes.  To me, that would seem to reduce the
rate that Python is improving.  However, a less evolved but more
stable Python could actually have a higher value to society.

We could create an experimental branch of Python, e.g. python-ng.
Then, all the crazy new ideas go in there.  Only after they are
stable would we merge them into the stable version of Python.  I'm
not sure how well that works in practice though.  That's similar to
what Linux did with the even/odd version numbering.  It turned into
a mess because the experimental (next) version quickly outran the
stable version and merging fixes between them was difficult.  They
abandoned that and now use something like PEP 407 for LTS releases.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Another update for PEP 394 -- The "python" Command on Unix-Like Systems

2019-02-13 Thread Neil Schemenauer
On 2019-02-13, Barry Warsaw wrote:
> I personally would like for `python` to be the latest Python 3
> version (or perhaps Brett’s launcher), `python2` to be Python 2.7
> where installed (and not mandatory).  `python3` would be an alias
> for the latest Python 3.

To me, having 'py' on Unix would be a good thing(tm).  If we have
that then I suppose we will encourage people to prefer it over
'python', 'python3', and 'python2'.  At that point, where 'python'
points would be less of an issue.

I'm not opposed to making 'python' configurable or eventually
pointing it to python3.  However, if we do go with 'py' as the
preferred command in the future, it seems to be some pain for little
gain.  If the OS already allows it to be re-directed, maybe that's
good enough.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] About the future of multi-process Python

2019-02-07 Thread Neil Schemenauer
On 2019-02-06, Antoine Pitrou wrote:
> For maximum synergy between these initiatives and the resulting APIs,
> it is better if things are done in the open ;-)

Hi Antoine,

It would be good if we could have some feedback from alternative
Python implementations as well.  I suspect they might want to 
support these APIs.  Doing zero-copy or sharing memory areas could
be a challenge with a compacting GC, for example.  In that case,
having something in the API that tells the VM that a certain chunk
of memory cannot move would be helpful.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Asking for reversion

2019-02-05 Thread Neil Schemenauer
I wrote:
> Could we somehow mark these APIs as experimental in 3.8?

It seems the change "e5ef45b8f519a9be9965590e1a0a587ff584c180" is the
one we are discussing.  It adds two new files:

  Lib/multiprocessing/shared_memory.py
  Modules/_multiprocessing/posixshmem.c

It doesn't introduce new C APIs.  So, only
multiprocessing.shared_memory seems public.  I see we have PEP 411
that should cover this case:

  https://www.python.org/dev/peps/pep-0411/

The setup.py code could be more defensive.  Maybe only build on
platforms that have supported word sizes etc?  For 3.8, could it be
activated by uncommenting a line in Modules/Setup, rather than by
setup.py?

What happens in shared_memory if the _posixshmem module is not
available?  On Windows it seems like an import error is raised.
Otherwise, _PosixSharedMemory becomes 'object'.  Does that mean the
API still works but you lose the zero-copy speed?

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Asking for reversion

2019-02-05 Thread Neil Schemenauer
On 2019-02-05, Giampaolo Rodola' wrote:
> The main problem I have with this PR is that it seems to introduce
> 8 brand new APIs, but since there is no doc, docstrings or tests
> it's unclear which ones are supposed to be used, how or whether
> they are supposed to supersede or deprecate older (slower) ones
> involving inter process communication.

New or changed APIs are my major concern as well.  Localized
problems can be fixed later without much trouble.  However, APIs
"lock" us in and make it harder to change things later.  Also, will
new APIs need to be eventually supported by other Python
implementations?  I would imagine that doing zero-copy mixed with
alternative garbage collection strategies could be complicated.
Could we somehow mark these APIs as experimental in 3.8?

My gut reaction is that we shouldn't revert.  However, looking at
the changes, it seems 'multiprocessing.shared_memory' could be an
external extension package that lives in PyPI.  It doesn't require
changes to other interpreter internals.  It doesn't seem to require
internal Python header files.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Add more SyntaxWarnings?

2019-01-24 Thread Neil Schemenauer
On 2019-01-24, Terry Reedy wrote:
> Serhiy Storchaka suggested a compiler SyntaxWarning and uploaded a
> proof-of-concept diff that handled the above and many similar cases.

I believe that in general we should give better errors or warnings
if we can do it without huge difficulty.  Serhiy's patch is quite
simple.  The same check *could* be done by a linting tool.  Putting
it in CPython will make it more widely available.  These checks
could be helpful to beginners who probably won't have linting tools
setup.

I think we should not make it an error, otherwise we have changed
Python "the language".  We don't want to force other Python
implementations to do the same check.  It might be hard for them to
implement.  So, SyntaxWarning seems like a reasonable compromise.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] C API changes

2018-11-30 Thread Neil Schemenauer
On 2018-11-29, Armin Rigo wrote:
> ...Also, although I'm discussing it here, I think the whole approach
> would be better if done as a third-party extension for now, without
> requiring changes to CPython---just use the existing C API to
> implement the CPython version.

Hello Armin,

Thank you for providing your input on this subject.  I too like the
idea of an API "shim layer" as a separate project.

What do you think of writing the shim layer in C++?  I'm not a C++
programmer but my understanding is that modern C++ compilers are
much better than years ago.  Using C++ would allow us to provide a
higher level API with smaller runtime costs.  However, it would
require that any project using the shim layer would have to be
compiled with a C++ compiler (CPython and PyPy could still expose a
C compatible API).

Perhaps it is a bad idea.  If someone does create such a shim layer,
it will already be challenging to convince extension authors to move
to it.  If it requires them to switch to using a C++ compiler rather
than a C compiler, maybe that's too much effort.  OTOH, with C++ I
think you could do things like use smart pointers to automatically
handle refcounts on the handles.  Or maybe we should just skip C++
and implement the layer in Rust.  Then the Rust borrow checker can
handle the refcounts. ;-)

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Neil Schemenauer
On 2018-11-19, Antoine Pitrou wrote:
> There are important use cases for the C API where it is desired to have
> fast type-specific access to Python objects such as tuples, ints,
> strings, etc.  This is relied upon by modules such as _json and _pickle,
> and third-party extensions as well.

Thank you for pointing this out.  The feedback from Stefan on what
Cython would like (e.g. more access to functions that are currently
"internal") is useful too.  Keeping our dreams tied to reality
is important. ;-P

It seems to me that we can't "have our cake and eat it too". I.e. on
the one hand hide CPython implementation internals but on the other
hand allow extensions that want to take advantage of those internals
to provide the best performance.

Maybe we could have a multiple levels of API:

A) maximum portability (Py_LIMITED_API)

B) source portability (non-stable ABI, inlined functions)

C) portability but poor performance on non-CPython VMs
   (PySequence_Fast_ITEMS, borrowed refs, etc)

D) non-portability, CPython specific (access to more internals like
   Stefan was asking for).  The extension would have to be
   re-implemented on each VM or provide a pure Python
   alternative.

I think it would be nice if the extension module could explicitly
choose which level of API it wants to use.

It would be interesting to do a census on what extensions are out
there.  If they mostly fall into wanting level "C" then I think this
API overhaul is not going to work out too well.  Level C is mostly
what we have now.  No point in putting the effort into A and B if no
one will use them.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-19 Thread Neil Schemenauer
On 2018-11-19, Victor Stinner wrote:
> Moreover, I failed to find anyone who can explain me how the C API
> is used in the wild, which functions are important or not, what is
> the C API, etc.

One idea is to download a large sample of extension modules from
PyPI and then analyze them with some automated tool (maybe
libclang).  I guess it is possible there is a large non-public set
of extensions that we would miss.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-16 Thread Neil Schemenauer
On 2018-11-16, Nathaniel Smith wrote:
> [..] it seems like you should investigate (a) whether you can make
> Py_LIMITED_API *be* that API, instead of having two different
> ifdefs

That might be a good idea.  One problem is that we might like to
make backwards incompatible changes to Py_LIMITED_API.  Maybe it
doesn't matter if no extensions actually use Py_LIMITED_API.
Keeping API and ABI compatibility with the existing Py_LIMITED_API
could be difficult.

What would be the downside of using a new CPP define?  We could
deprecate Py_LIMITED_API and the new API could do the job.

Also, I think extensions should have the option to turn the ABI
compatibility off.  For some extensions, they will not want to
convert if there is a big performance hit (some macros turn into
non-inlined functions, call functions rather than access a
non-opaque structure).

Maybe there is a reason my toggling idea won't work.  If we can use
a CPP define to toggle between inline and non-inline functions, I
think it should work.  Maybe it will get complicated.

Providing ABI compatibility like Py_LIMITED_API is a different goal
than making the API more friendly to alternative Python VMs.  So,
maybe it is a mistake to try to tackle both goals at once.  However,
the goals seem closely related and so it would be a shame to do a
bunch of work and not achieve both.


Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-16 Thread Neil Schemenauer
On 2018-11-16, Brett Cannon wrote:
> I think part of the challenge here (and I believe it has been
> brought up elsewhere) is no one knows what kind of API is
> necessary for some faster VM other than PyPy.

I think we have some pretty good ideas as to what are the
problematic parts of the current API.  Victor's C-API web site has
details[1].  We can ask other implementors which parts are hard to
support.

Here are my thoughts about some desired changes:

- We are *not* getting rid of refcounting for extension modules.  That
  would require a whole new API.  We might as well start from
  scratch with Python 4.  No one wants that.  However, it is likely
  different VMs use a different GC internally and only use
  refcounting for objects passed through the C-API.  Using
  refcounted handles is the usual implementation approach.  We can
  make some changes to make that easier.  I think making PyObject an
  opaque pointer would help.

- Borrowed references are a problem.  However, because they are so
  commonly used and because the source code changes needed to change
  to a non-borrowed API is non-trivial, I don't think we should try
  to change this.  Maybe we could just discourage their use?  For
  CPython, using a borrowed reference API is faster.  For other
  Python implementations, it is likely slower and maybe much slower.
  So, if you are an extension module that wants to work well with
  other VMs, you should avoid those APIs.

- It would be nice to make PyTypeObject an opaque pointer as well.
  I think that's a lot more difficult than making PyObject opaque.
  So, I don't think we should attempt it in the near future.  Maybe
  we could make a half-way step and discourage accessing ob_type
  directly.  We would provide functions (probably inline) to do what
  you would otherwise do by using op->ob_type-> (see the sketch after
  this list).

  One reason you want to discourage access to ob_type is that
  internally there is not necessarily one PyTypeObject structure for
  each Python level type.  E.g. the VM might have specialized types
  for certain sub-domains.  This is like the different flavours of
  strings, depending on the set of characters stored in them.  Or,
  you could have different list types.  One type of list if all
  values are ints, for example.

  Basically, with CPython op->ob_type is super fast.  For other VMs,
  it could be a lot slower.  By accessing ob_type you are saying
  "give me all possible type information for this object pointer".
  By using functions to get just what you need, you could be putting
  less burden on the VM.  E.g. "is this object an instance of some
  type" is faster to compute.

- APIs that return pointers to the internals of objects are a
  problem.  E.g. PySequence_Fast_ITEMS().  For CPython, this is
  really fast because it is just exposing the internal details of
  the layout that is already in the correct format.  For other VMs,
  that API could be expensive to emulate.  E.g. you have a list to
  store only ints.  If someone calls PySequence_Fast_ITEMS(), you
  have to create real PyObjects for all of the list elements.

- Reducing the size of the API seems helpful.  E.g. we don't need
  PyObject_CallObject() *and* PyObject_Call().  Also, do we really
  need all the type specific APIs, PyList_GetItem() vs
  PyObject_GetItem()?  In some cases maybe we can justify the bigger
  API due to performance.  To add a new API, someone should have a
  benchmark that shows a real speedup (not just that they imagine it
  makes a difference).
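
Here is the kind of accessor I mean (a sketch; the Px_ names are
invented).  They are trivial inline functions on CPython, so there is
no cost there, but they let another VM answer a narrow question
without the extension reaching through ob_type itself:

    #include <Python.h>

    static inline int
    Px_IsExactly(PyObject *op, PyTypeObject *tp)
    {
        /* "is this object an instance of exactly this type?" */
        return Py_TYPE(op) == tp;
    }

    static inline const char *
    Px_TypeName(PyObject *op)
    {
        /* "give me the type name", without handing out the type object */
        return Py_TYPE(op)->tp_name;
    }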

I don't think we should change CPython internals to try to use this
new API.  E.g. we know that getting ob_type is fast so just leave
the code that does that alone.  Maybe in the far distant future,
if we have successfully got extension modules to switch to using
the new API, we could consider changing CPython internals.  There
would have to be a big benefit though to justify the code churn.
E.g. if my tagged pointers experiment shows significant performance
gains (it hasn't yet).

I like Nathaniel Smith's idea of doing the new API as a separate
project, outside the cpython repo.  It is possible that in that
effort, we would like some minor changes to cpython in order to make
the new API more efficient, for example.  Those should be pretty
limited changes because we are hoping that the new API will work on
top of old Python versions, e.g. 3.6.

To avoid exposing APIs that should be hidden, re-organizing include
files is an idea.  However, that doesn't help for old versions of
Python.  So, I'm thinking that Dino's idea of just duplicating the
prototypes would be better.  We would like a minimal API and so the
number of duplicated prototypes shouldn't be too large.

Victor's recent work in changing some macros to inline functions is
not really related to the new API project, IMHO.  I don't think
there is a problem to leave an existing macro as a macro.  If we
need to introduce new APIs, e.g. to help hide PyTypeObject, those
APIs could use inline functions.  That 

Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-10 Thread Neil Schemenauer
On 2018-11-09, Dino Viehland wrote:
> Rather than adding yet another pre-processor directive for this I would
> suggest just adding a new header file that only has the new stable API.
> For example it could just be "py.h" or "pyapi.h".  It would have all of the
> definitions for the stable API.

I like this idea.  It will be easier to define a minimal and clean
API with this approach.  I believe it can mostly be a subset of the
current API.

I think we could combine Dino's idea with Nathaniel's suggestion of
developing it separately from CPython.  Victor's C-API project is
already attempting to provide backwards compatibility.  I.e. you can
have an extension module that uses the new API but compiles and runs
with older versions of Python (e.g. 3.6).  So, whatever is inside
this new API, it must be possible to build it on top of the existing
Python API.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Rename Include/internals/ to Include/pycore/

2018-11-02 Thread Neil Schemenauer
On 2018-10-28, Benjamin Peterson wrote:
> I don't think more or less API should be magically included based
> on whether Py_BUILD_CORE is defined or not.

I agree.

> If we want to have private headers, we should include them where
> needed and not install them. Really, Py_BUILD_CORE should go away.
> We should be moving away from monolithic includes like Python.h to
> having each C file include exactly what it uses, private or not.

It seems that is best practice (e.g. look at Linux kernel include
file style).  I wonder, however, what the real benefits are of having
modular include files and directly using them as needed?

Pros for modular includes:

- speeds up the build process if you have good dependency info in the
  build system.  Right now, change Python.h and everything gets
  rebuilt.  I'm not sure this is a huge advantage anymore.

- makes it clearer where an API is implemented?

Cons:

- more work to include the correct headers

- build system dependency definitions are more complicated.  Other
  systems generally have automatic dependency generation (i.e. parse
  C files and find used includes).

A simple approach would be to introduce something like
Python-internal.h.  If you are a Python internal unit, you can
include both Python.h and Python-internal.h.  We could, over time,
split Python-internal.h into smaller modular includes.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] dear core-devs

2018-10-02 Thread Neil Schemenauer
On 2018-10-02, Michael Felt wrote:
> I am sorry, for myself obviously - but also for Python. Obviously, I am
> doing it all wrong - as I see lots of other issues being picked up
> immediately.

I'm not sure that's the case.  There are a lot of PRs or bugs that
sit there without getting reviews.  The problem is that few (or no)
core developers get paid to work on Python.  So, the time they spend
is motivated by their specific "itch".  Getting reviews on any PR is
difficult, even for core developers.  In their case, they have the
option of forcing the issue, I guess.

This is a problem we should try to deal with somehow.  Turning off
valuable contributors like you is bad.  I'm not sure how to do it
though.  At the core Python sprint in September there was some talk
about how CPython developers might get funding.  Maybe that could
help deal with the backlog of reviews required.

> And, while you may not give a damn about anything other than Windows,
> macos and/or Linux - there are other platforms that would like a stable
> Python.

There is probably some truth in not caring about other platforms.
The problem from the reviewer perspective is the question of "what
is the potential downsides of this PR vs what are the benefits?".
The safest thing is to not approve the PR.  No core developer wants
to be the person who broke CPython.  You must admit, AIX is an
extremely niche platform at this point.  I bet if you picked 1000
software developers at random, it would be likely that zero of them
have ever used AIX.  So, it's not that we don't care at all about
AIX but that the cost/benefit equation makes accepting AIX specific
changes more difficult.

One specific suggestion I have about your PR is to try to make your
changes not AIX specific.  Or at least, make the AIX checking as
localized as possible.  So, as an example, in test_uuid you have:

_notAIX = not sys.platform.startswith("aix")

then later in the module you check that flag.  While that is the
most direct approach to fixing the issue and making the test pass,
it is not good for the long term maintainability of the code.  You
end up with boolean flags like _notAIX spread about the logic.  Over
time, code like that becomes a nightmare to maintain.

Instead, I would suggest test_uuid is making platform specific
assumptions that are not true on AIX and possibly other platforms.
So, do something like:


_IS_AIX = sys.platform.startswith("aix")

_HAVE_MACADDR = (os.name == 'posix' and not _IS_AIX)

@unittest.skipUnless(_HAVE_MACADDR, 'requires Posix with macaddr')
def test_arp_getnode(self):
...

The _HAVE_MACADDR test is relatively simple and clear, does this
platform have this capability.  Later in the code, a check for
_HAVE_MACADDR is also quite clear.  If someone comes along with
another platform that doesn't support macaddr, they only have to
change one line of code.

This kind of capability checking is similar to what happened with
web browsers.  In that case, people discovered that checking the
User Agent header was a bad idea.  Instead, you should probe for
specific functionality and not assume based on browser IDs.  For the
macaddr case, is there some way you can probe the arp command to see
if supports macaddr?  That way your test doesn't have to include any
AIX specific check at all.  Further, it would have some hope of
working on platforms other than AIX that also don't support macaddr
but are POSIX and have 'arp'.  The code could be something like:

_HAVE_MACADDR = False
if os.name == 'posix':
    if probe_arp_supports_macaddr():  # hypothetical probe; the original
                                      # placeholder was lost in the archive
        _HAVE_MACADDR = True

Hope that is helpful.

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Neil Schemenauer
On 2018-09-18, Carl Shapiro wrote:
> How might people feel about using the linker to bundle a list of pre-loaded
> modules into a single-file executable?

The users of Python are pretty diverse so it depends on who you ask.
Some would like a giant executable that includes everything they
need (sort of like the Go approach).  Other people want an executable
that has just importlib inside it and then mix-and-match different
shared libs for their different purposes.  Some will want to work
"old school" and load from separate .py or .pyc files.

I see no reason why we can't support all these options.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-16 Thread Neil Schemenauer
On 2018-09-15, Paul Moore wrote:
> On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer  wrote:
> > We could have a new format, .pya (compiled python archive) that has
> > data for many .pyc files in it.
[..]
> Isn't that essentially what putting the stdlib in a zipfile does? (See
> the windows embedded distribution for an example). It probably uses
> normal IO rather than mmap, but maybe adding a "use mmap" flag to the
> zipfile module would be a more general enhancement that zipimport
> could use for free.

Yeah, it's close to the same thing.  If avoiding the syscalls is what
gives the speedup, using a better zipfile implementation might give
nearly the same benefit.

At the sprint we discussed a variation of Larry's (FB's) patch.
Allow the frozen data to be in DLLs as well as in the python
executable data segment.  So, importlib would be frozen into the
exe.  The standard library could become another DLL.  The user could
provide one or more DLLs that contains their app code and package
deps.  In general, I think there would only be two DLLs: stdlib and
app+deps.

My suggestion of a special format (similar to zipfile) was
motivated by the wish to avoid platform build tools.  E.g. Windows
users would have a harder time to build DLLs.  However, I now think
depending on platform build tools is fine.  The people who will
build these DLLs will have the tools and skills to do so.  Even if
there is only a DLLs for the stdlib, it will be a win.  If no DLLs
are provided, you get the same behavior as current Python (i.e.
importlib is frozen in, everything else can come from .py files).

I think there is no question that Larry's PR will be faster than the
zipfile approach.  It removes the unmarshal step.  Maybe that benefit
will be small but I think it should count.  Also, I suspect the OS
can page-in the DLL on-demand and perhaps leave parts of module .pyc
data on disk.  Larry had the idea of keeping code objects frozen
until they need to be executed.  It's a cool idea that would be
enabled by this first step.

I'm excited about Larry's PR.  I think if we get it cleaned up and
into Python 3.8, we will clearly leave Python 2.7 behind in terms of
startup performance.  That has been a goal of mine for a couple
years now.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-14 Thread Neil Schemenauer
On 2018-09-14, Larry Hastings wrote:
> [..] adding the stat calls back in costs you half the startup.  So
> any mechanism where we're talking to the disk _at all_ simply
> isn't going to be as fast.

Okay, so if we use hundreds of small .pyc files scattered all over
the disk, that's bad?  Who would have thunk it. ;-P

We could have a new format, .pya (compiled python archive) that has
data for many .pyc files in it.  In normal runs you would have one
or just a handful of these things (e.g. one for stdlib, one for
your app and all the packages it uses).  Then you mmap these just
once and rely on OS page faults to bring in the data as you need it.
The .pya would have a hash table at the start or end that tells you
the offset for each module.
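Something along these lines, say (the exact layout here -- an 8-byte
index size followed by a marshalled {name: (offset, size)} dict -- is
invented purely for illustration):

    import marshal, mmap, struct

    class PycArchive:
        def __init__(self, path):
            with open(path, "rb") as f:
                self._map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
            index_size, = struct.unpack_from("<Q", self._map, 0)
            self._index = marshal.loads(self._map[8:8 + index_size])

        def get_code(self, name):
            # Only the pages actually touched here get faulted in by the OS.
            offset, size = self._index[name]
            return marshal.loads(self._map[offset:offset + size])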

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-14 Thread Neil Schemenauer
On 2018-09-14, Larry Hastings wrote:
[...]
> improvement 0.21242667903482038 %

I assume that should be 21.2 % otherwise I recommend you abandon the
idea. ;-P

> The downside of the patch: for these modules it ignores the Python files on
> disk--it doesn't even stat them.

Having a command-line/env var to turn this on/off would be an
acceptable fix, IMHO.  If I'm running Python on a server, I don't need
to be editing .py modules and have them be recognized.  Maybe have
it turned off by default, at least at first.

> Is it worth working on?

I wonder how much of the speedup relies on putting it in the data
segment (i.e. using linker/loader to essentially handle the
unmarshal).  What if you had a new marshal format that only needed a
light 2nd pass in order to fix up the data loaded from disk?  Yuri
suggested looking at formats like Cap'n Proto.  If the cost of the
2nd pass was not bad, you wouldn't have to rely on the platform C
toolchain.  Instead we can write .pyc files that hold this data.

Then the speedup can work on all compiled Python modules, not just
the ones you go through the special process that links them into the
data segment.  I suppose that might mean that .pyc files become arch
specific.  Maybe that's okay.

As you said last night, there doesn't seem to be much low hanging
fruit around anymore.  So, 21% looks pretty decent.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Heap-allocated StructSequences

2018-09-14 Thread Neil Schemenauer
On 2018-09-04, Eddie Elizondo wrote:
> Solution:
> 
>   *   Fix the implementation of PyStructSequence_NewType:
> 
> The best solution would be to fix the implementation of this
> function. This can easily be done by dynamically creating a
> PyType_Spec and calling PyType_FromSpec

Hello Eddie,

Thank you for spending time to look into this.  Without studying the
details of your patch, your approach sounds correct to me.  I think
we should be allocating types from the heap and use PyType_FromSpec.
Having static type definitions living in the data segment causes too
many issues.

We have to assess how 3rd party extension modules would be affected
by this change.  Unless it is too hard to do, they should still
compile (perhaps with warnings) after your fix.  Do you know if
that's the case?  Looking at your changes to structseq.c, I can't
tell easily.

In any case, this should go into Victor's pythoncapi fork.  That
fork includes all the C-API cleanup we are hoping to make to CPython
(assuming we can figure out the backwards and forwards compatibility
issues). 

Here is the project site:

https://pythoncapi.readthedocs.io

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bpo-34595: How to format a type name?

2018-09-13 Thread Neil Schemenauer
On 2018-09-13, Victor Stinner wrote:
> Right, that's a side effect of the discussion on the C API. It seems
> like Py_TYPE() has to go in the new C API. Sorry, the rationale is not
> written down yet, but Dino convinced me that Py_TYPE() has to go :-)

My understanding is that using Py_TYPE() inside the CPython
internals is okay (i.e. using a borrowed reference).  However,
extension modules would preferably not use APIs that give back
borrowed references.  A clean API redesign would remove all of
those.

So, what are extension modules supposed to do?  We want to give them
an easy to use API.  If we give them %t that takes an object and
internally does the Py_TYPE() call, they have a simple way to do the
right thing.

E.g.

    PyErr_Format(PyExc_TypeError,
                 "\"%s\" must be string, not %.200s", name,
                 src->ob_type->tp_name);

becomes

    PyErr_Format(PyExc_TypeError,
                 "\"%s\" must be string, not %t", name, src);

This kind of code occurs often in extension modules.  If you make
them get a strong reference to the type, they have to remember to
decref it.  It's not a huge deal but is a bit harder to use.  I like
the proposal to provide both %t and %T.  Our format code is a bit
more complicated but many extension modules get a bit simpler.
That's a win, IMHO.

For the Python side, I don't think you need the % format codes.  You
need an idiomatic way of getting the type name.  repr() and str() of
the type object are not it.  I don't think changing them at this
point is a good idea.  So, having a new property would seem the
obvious solution.  E.g.

 f'"{name}" must be string, not {src.__class__.__qualname__}'

That __qualname__ property will be useful for other things, not just
building type error messages.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Let's change to C API!

2018-08-22 Thread Neil Schemenauer
On 2018-07-31, Victor Stinner wrote:
> It would be nice to be able to use something to "generate" C
> extensions, maybe even from pure Python code. But someone has to
> work on a full solution to implement that.

Perhaps a "argument clinic on steroids" would be the proper
approach.  So, extensions would mostly be written in C.  However, we
would have a pre-processor that does some "magic" to make using the
Python API cleaner.  Defining new types using static structures, for
example, is not the way to build a good API.  The other approach
would be something like mypyc + cffi.  I agree with others that
Cython is not the right tool.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] When tp_clear returns non-zero?

2018-05-28 Thread Neil Schemenauer
On 2018-05-28, Serhiy Storchaka wrote:
> I'm interesting what the result of this function means. In what
> cases it can return non-zero, and can it set an exception?

My memory is fuzzy (nearly 20 years since I wrote that code).  My
best guess is that I thought a return value might be useful somehow.
As you have noticed, the return type probably should have been void.

If you want to see one of the first implementations of the Python
GC, I still have a patch:

http://python.ca/nas/python/gc/gc-cycle-152.diff

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 574 (pickle 5) implementation and backport available

2018-05-25 Thread Neil Schemenauer
On 2018-05-25, Antoine Pitrou wrote:
> The question is what purpose does it serve for pickle to do it rather
> than for the user to compress the pickle themselves.  You're basically
> saving one line of code.

It's one line of code everywhere pickling or unpickling happens.  And
you probably need to import a compression module, so at least two
lines.  Then maybe you need to figure out if the pickle is
compressed and what kind of compression is used.  So, add a few more
lines.
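
Concretely, the do-it-yourself version that every caller has to repeat
looks something like this (zlib level 1 is Z_BEST_SPEED):

    import pickle, zlib

    def dumps_small(obj):
        return zlib.compress(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL), 1)

    def loads_small(data):
        return pickle.loads(zlib.decompress(data))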

It seems logical to me that users of pickle want it to be fast and
produce small pickles.  Compressing by default seems the right
choice, even though it complicates the implementation.  Ivan brings
up a valid point that compressed pickles are harder to debug.
However, I think that's much less important than being small.

> it requires us to ship the lz4 library with Python

Yeah, that's not so great.  I think zlib with Z_BEST_SPEED would be
fine.  However, some people might worry it is too slow or doesn't
compress enough.  Having lz4 as a battery included seems like a good
idea anyhow.  I understand that it is pretty well established as a
useful compression method.  Obviously requiring a new C library to
be included expands the effort of implementation a lot.

This discussion can easily lead into bikeshedding (e.g. relative
merits of different compression schemes).  Since I'm not
volunteering to implement anything, I will stop responding at this
point. ;-)

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 574 (pickle 5) implementation and backport available

2018-05-25 Thread Neil Schemenauer
On 2018-05-25, Antoine Pitrou wrote:
> Do you have something specific in mind?

I think compressed by default is a good idea.  My quick proposal:

- Use fast compression like lz4 or zlib with Z_BEST_SPEED

- Add a 'compress' keyword argument with a default of None.  For
  protocol 5, None means to compress.  Providing 'compress' != None
  for older protocols will raise an error.

The compression overhead will be small compared to the
pickle/unpickle costs.  If someone wants to apply their own (e.g.
better) compression, they can set compress=False.

An alternative idea is to have two different protocol formats.  E.g.
5 and 6.  One is "pickle 5" with compression, one without
compression.  I don't like that as much since it breaks the idea
that higher protocol numbers are "better".

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python startup time

2018-05-07 Thread Neil Schemenauer
On 2018-05-03, Lukasz Langa wrote:
> > On May 2, 2018, at 8:57 PM, INADA Naoki  wrote:
> > * Add lazy compiling API or flag in `re` module.  The pattern is compiled
> > when first used.
> 
> How about go the other way and allow compiling at Python
> *compile*-time? That would actually make things faster instead of
> just moving the time spent around.

Lisp has a special form 'eval-when'.  It can be used to cause
evaluation of the body expression at compile time.

In Carl's "A fast startup patch" post, he talks about getting rid of
the unmarshal step and storing objects in the heap segment of the
executable.  Those would be the objects necessary to evaluate code.
The marshal module has a limited number of types that it handles.
I believe they are: bool, bytes, code objects, complex, Ellipsis,
float, frozenset, int, None, tuple and str.

If the same mechanism could handle more types, rather than storing
the code to be evaluated, we could store the objects created after
evaluation of the top-level module body.  Or, have a mechanism to
mark which code should be evaluated at compile time (much like the
eval-when form).

For the re.compile example, the compiled regex could be what is
stored after compiling the Python module (i.e. the re.compile gets
run at compile time).  The objects created by re.compile (e.g.
SRE_Pattern) would have to be something that the heap dumper could
handle.

Traditionally, Python has had the model "there is only runtime".
So, starting to do things at compile time complicates that model.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python startup time

2018-05-02 Thread Neil Schemenauer
Antoine:
> The overhead of importing is not in trying too many names, but in
> loading the module and executing its bytecode.

That was my conclusion as well when I did some profiling last fall
at the Python core sprint.  My lazy execution experiments are an
attempt to solve this:

https://github.com/python/cpython/pull/6194

I expect that Mercurial is already doing a lot of tricks to make
execution more lazy.  They have a lazy module import hook but they
probably do other things to not execute more bytecode at startup
then is needed.  My lazy execution idea is that this could happen
more automatically.  I.e. don't pay for something you don't use.
Right now, with eager module imports, you usually pay a price for
every bit of bytecode that your program potentially uses.

Another idea, suggested to me by Carl Shapiro, is to store
unmarshalled Python data in the heap section of the executable (or
in DLLs).  Then, the OS page fault handling would take care of only
loading the data into RAM that is actually being used.  The linker
would take care of fixing up pointer references.  There are a lot of
details to work out with this idea but I have heard that Jeethu Rao
(Carl's colleague at Instagram) has a prototype implementation that
shows promise.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 'continue'/'break'/'return' inside 'finally' clause

2018-01-04 Thread Neil Schemenauer
On 2018-01-04, Guido van Rossum wrote:
> We should interview you for the paper we may be writing for HOPL.

History of Programming Languages?

I did some more digging this afternoon, trying to find source code
between versions 1.0.1 and 0.9.1.  No luck though.  It looks like
0.9.1 might have been the last one you uploaded to alt.sources.
Later 0.9.X releases were uploaded to ftp.cwi.nl and
wuarchive.wustle.edu.  No one seems to have an archive of those.

I think all my old PCs have been sent to the scrapyard.  I might
have some old hard disk images somewhere.  Maybe on a writable DVD
or CDR.  Probably unreadable at this point.  I don't know exactly
which version of Python I first downloaded.  No earlier than the
fall of 1992 and maybe 1993 but it could have been pre-1.0.  I do
recall running a DOS port at some point.

Here is the announcement of 0.9.4alpha:

http://legacy.python.org/search/hypermail/python-1992/0270.html

The Misc/HISTORY file has quite a lot of details.  It shows that
'continue' was added in 0.9.2.

Back on topic, it looks like allowing 'continue' will be trivial once
Serhiy's unwind stack PR lands.  Just a few lines of code and I
think everything works.  If Mark implements his alternative
"wordcode for finally blocks gets copied" thing, it will make things
more complicated but not much more so than handling 'break' and
'return'.  So, those three should probably be all allowed or all
forbidden.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 'continue'/'break'/'return' inside 'finally' clause

2018-01-03 Thread Neil Schemenauer
On 2018-01-03, Guido van Rossum wrote:
> I'm sorry, I don't think more research can convince me either way.
> I want all three of return/break/continue to work inside finally
> clauses, despite there being few use cases.

That's fine.  The history of 'continue' inside 'finally' is
interesting.  The restriction dates back to at least when Jeremy
committed the AST-based compiler (I have fond memories of hacking on
it with Armin Rigo and others at a Python core sprint).  Going
further back, I looked at 1.5.2 and there is the comment in
compile.c:

TO DO:
...
XXX Allow 'continue' inside try-finally

So if we allow 'continue' we will be knocking off a nearly 20 year
old todo item. ;-)

For giggles, I unpacked a Python 0.9.1 tarball.  The source code is
all under 'src' in that version.  There doesn't seem to be a
restriction on 'continue' but only because the grammar doesn't
include it!  Without doing more research, I think the restriction
could be as old as the 'continue' keyword.

BTW, the bytecode structure for try/except shown in the compile.c
comments is very similar to what is currently generated.  It is quite
remarkable how well your initial design and implementation have stood
the test of time.  Thank you for making it open source.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 'continue'/'break'/'return' inside 'finally' clause

2018-01-03 Thread Neil Schemenauer
On 2018-01-03, Serhiy Storchaka wrote:
> I haven't found 'finally' clauses in
> https://github.com/gevent/gevent/blob/master/src/gevent/libev/corecffi.py.
> Perhaps this code was changed in recent versions.

Yes, I was looking at git revision bcf4f65e.  I reran my AST
checker and found this:

./src/gevent/_ffi/loop.py: 181: return inside finally

> In any case we now know that this combination is occurred (but
> very rarely) in the wild.

Looks like it.  If we do want to seriously consider changing the
grammar, I will download more packages from PyPI and check them.

BTW, ./src/gevent/threadpool.py doesn't compile with 3.7 because it
uses 'async' as a variable name.  So either they didn't notice the
deprecation warnings or they didn't care to update their code.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 'continue'/'break'/'return' inside 'finally' clause

2018-01-03 Thread Neil Schemenauer
Generally I think programming language implementers don't get to
decide how the language works. You just have to implement it as
specified, inconvenient as that might be.

However, from a language design perspective, I think there is a good
argument that this is a corner of the language we should consider
changing.  First, I analyzed over one million lines of Python code
with my AST walker and only found this construct being used in four
different places.  It seems to be extremely rare.

Second, the existence of a pylint warning for it suggests that it is
confusing. I did a little more searching using the pylint warning
and found these pages:


https://stackoverflow.com/questions/35505624/break-statement-in-finally-block-swallows-exception

http://thegreyblog.blogspot.ca/2011/02/do-not-return-in-finally-block-return.html

So, given the above and that the implementation (both compiler and
bytecode evaluator) is pretty complicated, I vote that we should
disallow it.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 'continue'/'break'/'return' inside 'finally' clause

2018-01-02 Thread Neil Schemenauer
Serhiy Storchaka  wrote:
> Currently 'break' and 'return' are never used inside 'finally'
> clause in the stdlib.

See the _recv_bytes() function:

Lib/multiprocessing/connection.py: 316

> I would want to see a third-party code that uses them.

These are the only ones I found so far:

./gevent/src/gevent/libev/corecffi.py: 147
./gevent/src/gevent/threadpool.py: 226

I have an AST walker script that finds them.
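Not the actual script, but the idea is roughly this (a real checker
also has to ignore jumps that stay inside a nested loop or function):

    import ast

    def find_finally_jumps(source, filename="<source>"):
        hits = []

        def scan(node, in_finally):
            if in_finally and isinstance(node, (ast.Return, ast.Break, ast.Continue)):
                hits.append((filename, node.lineno, type(node).__name__.lower()))
            for field, value in ast.iter_fields(node):
                children = value if isinstance(value, list) else [value]
                for child in children:
                    if isinstance(child, ast.AST):
                        scan(child, in_finally or
                             (isinstance(node, ast.Try) and field == "finalbody"))

        scan(ast.parse(source), False)
        return hits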

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Enabling depreciation warnings feature code cutoff

2017-11-06 Thread Neil Schemenauer
On 2017-11-06, Nick Coghlan wrote:
> Gah, seven years on from Python 2.7's release, I still get caught by
> that. I'm tempted to propose we reverse that decision and go back to
> enabling them by default :P

Either enable them by default or make them really easy to enable for
development environments.  I think some setting of the PYTHONWARNINGS
environment variable should do it.  It is not obvious to me how to do
it though.  Maybe there should be an environment variable that does
it more directly.  E.g.

PYTHONWARNDEPRECATED=1

Another idea is to have venv turn them on by default or, based on
a command-line option, do it.  Or, maybe the unit testing frameworks
should turn on the warnings when they run.
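
For reference, the spelling that works today is a
"default::DeprecationWarning" filter, whether set through
PYTHONWARNINGS, the -W option, or in code:

    import warnings
    warnings.simplefilter("default", DeprecationWarning)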

The current "disabled by default" behavior is obviously not working
very well.  I had them turned on for a while and found quite a
number of warnings in what are otherwise high-quality Python
packages.  Obviously the vast majority of developers don't have them
turned on.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Reorganize Python categories (Core, Library, ...)?

2017-10-17 Thread Neil Schemenauer
Antoine Pitrou  wrote:
> There is no definite "correct category" when you're mixing different
> classification schemes (what kind of bug it is --
> bug/security/enhancement/etc. --, what functional domain it pertains
> to -- networking/concurrency/etc. --, which stdlib API it affects).

I think there should be a set of tags rather than a single category.
In the blurb entry, you could apply all the tags that are relevant.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-17 Thread Neil Schemenauer
Christian Heimes  wrote:
> That approach could work, but I think that it is the wrong
> approach. I'd rather keep Python optimized for long-running
> processes and introduce a new mode / option to optimize for
> short-running scripts.

Another idea is to run a fake transaction through the process
before forking.  That will "warm up" things so that most of the lazy
init is already done.

After returning from the core sprint, I have gotten over my initial
enthusiasm for my "lazy module defs" idea.  It is just too big of a
change for Python to accept at this point.  I still hope there
would be a way to make LOAD_NAME/LOAD_GLOBAL trigger something like
__getattr__().  That would allow libraries that want to aggressively
do lazy-init to do so in the clean way.

The main reason that Python startup is slow is that we do far too
much work on module import (e.g. initializing data structures that
never get used).  Reducing that work will almost necessarily impact
pre-fork model programs (e.g. they expect the init to be done before
the fork).

As someone who uses that model heavily, I would still be okay with
the "lazification" as I think there are many more programs that
would be helped vs the ones hurt.  Initializing everything that your
program might possibly need right at startup time doesn't seem
like a goal to strive for.  I can understand if you have a different
opinion though.

A third approach would be to do more init work at compile time.
E.g. for re.compile, if the compiled result could be stored in the
.pyc, that would eliminate a lot of time for short scripts and for
long-running programs.  Some Lisp systems have "compiler macros".
They are basically a hook to allow programs to do some work before
the code is sent to the compiler.  If something like that existed in
Python, it could be used by re.compile to generate a compiled
representation of the regex to store in the .pyc file.  That kind of
behavior is pretty different than the "there is only runtime" model
that Python generally tries to follow.

Spit-ball idea, thought up just now:

PAT = __compiled__(re.compile(...))

The expression in __compiled__(..) would be evaluated by the
compiler and the resulting value would become the value to store in
the .pyc.  If you are running the code as a script, __compiled__
just returns its argument unchanged.
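
The plain-runtime fallback really could be that trivial (sketch):

    def __compiled__(expr):
        # No compile-time evaluation happened; just return the value.
        return expr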

Cheers,

  Neil

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Lazy initialization of module global state

2017-09-08 Thread Neil Schemenauer
This is an idea that came out of the lazy module loading (via AST
analysis), posted to python-ideas.  The essential idea is to split
the marshal data stored in the .pyc into smaller pieces and only
load the parts as they are accessed.  E.g. use a __getattr__ hook on
the module to unmarshal+exec the code.

I have a very early prototype:

https://github.com/warsaw/lazyimport/blob/master/lazy_compile.py

It would work like a "compile_all.py" tool.  It writes standard .pyc
files right now.  It is not there yet but I should use the AST
analysis, like the lazy module load stuff, to determine if things
have potential side-effects on module import.  Those things will get
loaded eagerly, like they do now.

Initially I was thinking of class definitions and functions but now
I realize any global state could get this treatment.  E.g. if you
have a large dictionary global, don't unmarshal it until someone
accesses the module attribute.

This should be pretty safe to do and should give a significant
benefit in startup time and memory usage.
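
As a rough illustration of the module __getattr__ hook (this is not the
prototype linked above; _lazy_parts and the lazy compiler that would
fill it in are made up for the example):

    import marshal

    _lazy_parts = {}   # attribute name -> marshalled code, written by the compiler

    def __getattr__(name):
        try:
            code = _lazy_parts.pop(name)
        except KeyError:
            raise AttributeError(name)
        exec(marshal.loads(code), globals())
        return globals()[name]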
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

2017-09-07 Thread Neil Schemenauer
Larry Hastings  wrote:
> The TL;DR summary: add support for property objects to modules.
> I've already posted a prototype.

I posted an idea to python-ideas about lazy module loading.  If the
lazy loading idea works, having properties would allow modules to
continue to be "lazy safe" but to easily do init logic when needed,
e.g. when the property is first accessed.

There should be a very clean way to do that, IMHO.  Using __class__
is not clean and it would be unfortunate to have the __class__
song-and-dance in a bunch of modules.  Using property() seems more
Pythonic.
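
For contrast, this is the __class__ song-and-dance being referred to
(_load_config is a made-up stand-in for whatever init logic is needed):

    import sys
    import types

    def _load_config():
        return {"initialized": True}   # placeholder for real init logic

    class _ThisModule(types.ModuleType):
        @property
        def config(self):
            return _load_config()      # runs only when module.config is read

    sys.modules[__name__].__class__ = _ThisModule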

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Consolidate stateful runtime globals

2017-09-07 Thread Neil Schemenauer
Is there any issue with unit-at-a-time optimization?  I would
imagine that a static global would allow optimizations that are not
safe for an exported global (not sure of the C term for it).

I suspect it doesn't matter and I support the idea in general.
Global variables in extension modules kill the idea of a
mark-and-sweep or some other GC mechanism.  That's probably not
going to happen but identifying all of the global state seems like a
step forward.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Memory bitmaps for the Python cyclic garbage collector

2017-09-07 Thread Neil Schemenauer
Python objects that participate in cyclic GC (things like lists, dicts,
sets but not strings, ints and floats) have extra memory overhead.  I
think it is possible to mostly eliminate this overhead.  Also, while
the GC is running, this GC state is mutated, which destroys
copy-on-write optimizations.  This change would mostly fix that
issue.

All objects that participate in cyclic GC have the Py_TPFLAGS_HAVE_GC
bit set in their type.  That causes an extra chunk of memory to be
allocated *before* the ob_refcnt struct member.  This is the PyGC_Head
struct.

The whole object looks like this in memory (PyObject pointer is at
arrow):

union __gc_head *gc_next;
union __gc_head *gc_prev;
Py_ssize_t gc_refs;
-->
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
[rest of PyObject members]


So, 24 bytes of overhead on a 64-bit machine.  The smallest Python
object that can have a pointer to another object (e.g. a single PyObject
* member) is 48 bytes.  Removing PyGC_Head would cut the size of these
objects in half.

Carl Shapiro questioned me today on why we use a doubly linked list and
not a memory bitmap.  I think the answer is that there is no good
reason.  We use a doubly linked list only due to historical constraints
that are no longer present.

Long ago, Python objects could be allocated using the system malloc or
other memory allocators.  Since we could not control the memory
location, bitmaps would be inefficient.  Today, we allocate all Python
objects via our own function.  Python objects under a certain size are
allocated using our own malloc, obmalloc, and are stored in memory
blocks known as "arenas".

The PyGC_Head struct performs three functions.  First, it allows the GC
to find all Python objects that will be checked for cycles (i.e. follow
the linked list).  Second, it stores a single bit of information to let
the GC know if it is safe to traverse the object, set with
PyObject_GC_Track().  Finally, it has a scratch area to compute the
effective reference count while tracing refs (gc_refs).

Here is a sketch of how we can remove the PyGC_Head struct for small
objects (say less than 512 bytes).  Large objects or objects created by
a different memory allocator will still have the PyGC_Head overhead.

* Have memory arenas that contain only objects with the
  Py_TPFLAGS_HAVE_GC flag.  Objects like ints, strings, etc will be
  in different arenas, not have bitmaps, not be looked at by the
  cyclic GC.

* For those arenas, add a memory bitmap.  The bitmap is a bit array that
  has a bit for each fixed size object in the arena.  The memory used by
  the bitmap is a fraction of what is needed by PyGC_Head.  E.g. an
  arena that holds up to 1024 objects of 48 bytes in size would have a
  bitmap of 1024 bits.

* The bits will be set and cleared by PyObject_GC_Track/Untrack()

* We also need an array of Py_ssize_t to take over the job of gc_refs.
  That could be allocated only when GC is working and it only needs to
  be the size of the number of true bits in the bitmap.  Or, it could be
  allocated when the arena is allocated and be sized for the full arena.

* Objects that are too large would still get the PyGC_Head struct
  allocated "in front" of the PyObject.  Because they are big, the
  overhead is not so bad.

* The GC process would work nearly the same as it does now.  Rather than
  only traversing the linked list, we would also have to crawl over the
  GC object arenas, check blocks of memory that have the tracked bit
  set.

There are a lot of smaller details to work out but I see no reason
why the idea should not work.  It should significantly reduce memory
usage.  Also, because the bitmap and gc_refs are contiguous in
memory, locality will be improved.  Łukasz Langa has mentioned that
the current GC causes issues with copy-on-write memory in big
applications.  This change should solve that issue.
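
As a toy pure-Python model of the bookkeeping (the real thing would of
course be C code living next to obmalloc):

    class ArenaBitmap:
        # One bit per fixed-size slot in an arena, set by
        # PyObject_GC_Track() and cleared by PyObject_GC_Untrack(),
        # replacing the gc_next/gc_prev list pointers.
        def __init__(self, nslots):
            self._bits = bytearray((nslots + 7) // 8)

        def track(self, slot):
            self._bits[slot >> 3] |= 1 << (slot & 7)

        def untrack(self, slot):
            self._bits[slot >> 3] &= ~(1 << (slot & 7))

        def tracked(self, slot):
            return bool(self._bits[slot >> 3] & (1 << (slot & 7)))

        def tracked_slots(self):
            # The collector crawls this instead of following a linked list.
            return [i for i in range(len(self._bits) * 8) if self.tracked(i)]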

To implement, I think the easiest path is to create a new malloc to be
used by small GC objects, e.g. gcmalloc.c.  It would be similar to
obmalloc but have the features needed to keep track of the bitmap.
obmalloc has some quirks that makes it hard to use for this purpose.
Once the idea is proven, gcmalloc could be merged or made to be a
variation of obmalloc.  Or, maybe just optimized and remain
separate.  obmalloc is complicated and highly optimized.  So, adding
additional functionality to it will be challenging.

I believe this change would be ABI compatible.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] To reduce Python "application" startup time

2017-09-06 Thread Neil Schemenauer
INADA Naoki  wrote:
> Current `python -v` is not useful to optimize import.
> So I use this patch to profile import time.
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb

I have implemented DTrace probes that do almost the same thing.
Your patch is better in that it does not require an OS with DTrace
or SystemTap.  The DTrace probes are better in that they can be a
part of the standard Python build.

https://github.com/nascheme/cpython/tree/dtrace-module-import

DTrace script:

https://gist.github.com/nascheme/c1cece36a3369926ee93cecc3d024179

Pretty printer for script output (very minimal):

https://gist.github.com/nascheme/0bff5c49bb6b518f5ce23a9aea27f14b


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Smoothing the transition from Python 2 to 3

2016-06-10 Thread Neil Schemenauer
Nick Coghlan  wrote:
> It could be very interesting to add an "ascii-warn" codec to Python
> 2.7, and then set that as the default encoding when the -3 flag is
> set.

I don't think that can work.  The library code in Python would spew
out warnings even in the cases when nothing is wrong with the
application code.  I think warnings have to be added to a Python
where str and bytes have been properly separated.  Without extreme
backporting efforts, that means 3.x.

We don't want to saddle 3.x with a bunch of backwards compatibility
cruft.  Maybe some of my runtime warning changes could be merged
using a command line flag to enable them.  It would be nice to have
the stepping stone version just be normal 3.x with a command line
option.  However, for the sanity of people maintaining 3.x, I think
perhaps we don't want to do it.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Smoothing the transition from Python 2 to 3

2016-06-10 Thread Neil Schemenauer

On 6/10/2016 10:49 AM, Nick Coghlan wrote:

> What Brett said is mostly accurate for me, except with one slight
> caveat: I've been explicitly trying to nudge you towards making the
> *existing tools better*, rather than introducing new tools. With
> modernize and futurize we have a fairly clear trade-off ("Do you want
> your code to look more like Python 2 or more like Python 3?"), and
> things like "pylint --py3k" and the static analyzers are purely
> additive to the migration process (so folks can take them or leave
> them), but alternate interpreter builds and new converters have really
> high barriers to adoption.


I agree with that idea.  If there is anything that is "clean" enough, it 
should be merged with either 2.7.x or 3.x.  There is nothing in my tree 
that can be usefully merged though.



> More -3 warnings in Python 2.7 are definitely welcome (since those can
> pick up runtime behaviors that the static analysers miss), and if
> there are things the existing code converters and static analysers
> *could* detect but don't, that's a fruitful avenue for improvement as
> well.

We are really limited on what can be done with the bytes/string issue
because in Python 2 there is no distinct type for bytes.  Also, the
standard library does all sorts of unclean mixing of str and unicode so
a warning would spew a lot of noise.


Likewise, a warning about comparison behavior (None, default ordering of 
types) would also not be useful because there is so much standard 
library code that would spew warnings.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Smoothing the transition from Python 2 to 3

2016-06-09 Thread Neil Schemenauer
On 2016-06-09, Brett Cannon wrote:
> I don't think you meant for what you said to sound insulting,
> Neil, but it did feel like it upon first reading.

Sorry, I think I misunderstood what you and Nick were saying.  I've
experienced a fair amount of negative feedback on my idea so I'm
pretty cranky at this point.  Amber Brown claimed that she spent
$60k of her time porting Twisted to Python 3.  I think there is lots
of room to make our porting tools better.

Using something like modernize, 2to6, or sixer seems like a better
idea than trying to improve on 2to3.  I agree on that point.
However, those tools combined with my modified Python 3.6 make for
a much easier migration path than going directly to Python 3.x.  My
runtime warnings catch many common problems and make it easy to see
what needs fixing.

We have a lot more freedom to put ugly, backwards compatibility
hacks into this stepping stone version, rather than changing either
Python 2.7.x or the main 3.x line.  I'm hoping to get community
contributions to add more backwards compatibility and runtime
warnings.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Smoothing the transition from Python 2 to 3

2016-06-09 Thread Neil Schemenauer
On 2016-06-09, Brett Cannon wrote:
> On Thu, 9 Jun 2016 at 14:56 Nick Coghlan  wrote:
> > Once you switch to those now recommended more conservative migration
> > tools, the tool suite you request already exists:
> >
> > - update your code with modernize or futurize
> > - check it still runs on Python 2.7
> > - check it doesn't generate warnings under 2.7's "-3" switch
> > - check it passes "pylint --py3k"
> > - check if it runs on Python 3.5
> >
> 
> `python3.5 -bb` is best to help keep Python 2.7 compatibility, otherwise
> what Nick said. :)

I have to wonder if you guys actually ported a lot of Python 2
code.  Maybe you somehow avoided the problematic behavior.  Below is
a pretty trivial set of functions.  The tools you recommend do not
help at all.  One problem is that the str literals should be bytes
literals.  Comparison with None needs to be avoided.

With Python 2 the code runs successfully.  With Python 3 the code
crashes with a traceback.  With my modified Python 3.6, the code
runs successfully but generates the following warnings:

test.py:13: DeprecationWarning: encoding bytes to str
  output.write('%d:' % len(s))
test.py:14: DeprecationWarning: encoding bytes to str
  output.write(s)
test.py:15: DeprecationWarning: encoding bytes to str
  output.write(',')
test.py:5: DeprecationWarning: encoding bytes to str
  if c == ':':
test.py:9: DeprecationWarning: encoding bytes to str
  size += c
test.py:24: DeprecationWarning: encoding bytes to str
  data = data + s
test.py:26: DeprecationWarning: encoding bytes to str
  if input.read(1) != ',':
test.py:31: DeprecationWarning: default compare is depreciated
  if a > 0:

It is very easy for me to find code written for Python 2 that will
fail in the same way.  According to you guys, there is no problem
and we already have good enough tooling. ;-(

def ns_read_size(input):
    size = ''
    while 1:
        c = input.read(1)
        if c == ':':
            break
        elif not c:
            raise IOError('short netstring read')
        size += c
    return int(size)

def ns_write_string(s, output):
    output.write('%d:' % len(s))
    output.write(s)
    output.write(',')

def ns_read_string(input):
    size = ns_read_size(input)
    data = ''
    while size > 0:
        s = input.read(size)
        if not s:
            raise IOError('short netstring read')
        data = data + s
        size -= len(s)
    if input.read(1) != ',':
        raise IOError('missing netstring terminator')
    return data

def compare(a, b):
    if a > 0:
        return b + 10
    return 0

def main():
    import tempfile
    out = tempfile.TemporaryFile()
    ns_write_string('Hello world', out)
    out.seek(0)
    s = ns_read_string(out)
    if s != 'Hello world':
        print('Failed')
    else:
        print('Ok')
    if (compare(None, 5) == 0 and
        compare(1, 5) == 15):
        print('Ok')
    else:
        print('Failed')

if __name__ == '__main__':
    main()
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Smoothing the transition from Python 2 to 3

2016-06-08 Thread Neil Schemenauer
[I've posted something about this on python-ideas but since I now
have some basic working code, I think it is more than an idea.]

I think the uptake of Python 3 is starting to accelerate.  That's
good.  However, there are still millions or maybe billions of lines
of Python code that still needs to be ported.  It is beneficial to
the Python ecosystem if this code can get ported.

My idea is to make a stepping stone version of Python, between 2.7.x
and 3.x that eases the porting job.  The high level goals are:

- code coming out of 2to3 runs correctly on this modified Python

- code that runs without warnings on this modified Python will run
  correctly on Python 3.x.

Achieving these goals completely is not technically possible.  Still, I want to
reduce as much as possible the manual work involved in porting.
Incrementally fixing code that generates warnings is a lot easier
than trying to fix an entire application or library at once.

I have a very early version on github:

https://github.com/nascheme/ppython

I'm hoping if people find it useful then they would contribute
backwards compatibility fixes that help their applications or
libraries run.  I am currently running a newly 2to3 ported
application on it.  At this time there is no warning generated but I
would rather get a warning than have one of my customers run into a
porting bug.

To be clear, I'm not proposing that these backwards compatibility
features go into Python 3.x or that this modified Python becomes the
standard version.  It is purely an intermediate step in getting code
ported to Python 3.

I've temporarily named it "Pragmatic Python".  I'd like a better
name if someone can suggest one.  Maybe something like Perverted,
Debauched or Impure Python.

Regards,

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyGC_Collect ignores state of `enabled`

2016-05-20 Thread Neil Schemenauer
Nick Coghlan  wrote:
> PEP 3121 is insufficient, since a lot of extension modules can't (or
> at least haven't) adopted it in practice.
> https://www.python.org/dev/peps/pep-0489/ has some more background on
> that (since it was the first step towards tackling the problem in a
> different way that extension module authors may be more likely to
> actually adopt)

My idea is that if we can clean up the built-in extension modules
then maybe that would be enough to stop doing a lot of the
finalization hacks (e.g. module dict clearing).  If 3rd party
extension modules are not fixed, then some finalizers that used to
be called might no longer get called.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyGC_Collect ignores state of `enabled`

2016-05-18 Thread Neil Schemenauer
Benjamin Peterson  wrote:
> Adding PyGC_CollectIfEnabled() and calling it in Py_Finalize is probably
> fine. I don't think the contract of PyGC_Collect itself (or gc.collect()
> for that matter) should be changed. You might want to disable GC but
> invoke it yourself.

Yes, that sounds okay to me.

I poked around at the calls to PyGC_Collect() and
_PyGC_CollectNoFail().  The cyclic garbage collector gets invoked at
least three times during shutdown.  Once by Py_FinalizeEx() and two
times by PyImport_Cleanup().  That seems excessively expensive
to me.  The collection time can be significant for programs with a
lot of "container" objects in memory.

The whole finalize/shutdown logic of the CPython interpreter could
badly use some improvement.  Currently it is a set of ugly hacks
piled on top of each other.  Now that we have PEP 3121,

Extension Module Initialization and Finalization
https://www.python.org/dev/peps/pep-3121/

we should be able to clean up this mess.  PyImport_Cleanup() is the
main area of trouble.  I don't think we should be clearing
sys.modules and we should certainly not be clearing module dicts.

If there is some whippersnapper out there who wants to get their
hands dirty with Python internals, fixing PyImport_Cleanup() would
be a juicy project.

  Neil

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyGC_Collect ignores state of `enabled`

2016-05-15 Thread Neil Schemenauer
Hi,

I intended for gc.collect() to actually collect cycles even if the
auto-GC was disabled.  Having Py_Finalize() run GC even when it has
been disabled seems wrong to me.  Originally, cyclic GC was supposed
to be optional.  Back then, most programs did not leak cycles.  I
would vote for Py_Finalize() checking the 'enabled' flag and not
calling PyGC_Collect if false.

Regards,

  Neil

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] obmalloc mmap/munmap thrashing

2016-04-21 Thread Neil Schemenauer
I was running Python 2.4.11 under strace and I noticed some odd
looking system calls:

mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9848681000
munmap(0x7f9848681000, 262144)  = 0
mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9848681000
munmap(0x7f9848681000, 262144)  = 0
[... repeated a number of times ...]

Looking at obmalloc.c, there doesn't seem to be any high/low
watermark (hysteresis) associated with deallocating arenas.  Is that
true?  If so, does it seem prudent to implement something to avoid
this behavior?  It seems potentially expensive if your program is
running just at the threshold of needing another arena.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Migration from Python 2.7 and bytes formatting

2014-01-18 Thread Neil Schemenauer
On 2014-01-18, Stephen J. Turnbull wrote:
> The above are descriptions of current behavior (ie, unchanged by PEPs
> 460, 461), and this:
[..]
> is the content of this proposal, is that right?

The proposal is that -2 enables the following:

- %r as an alias for %a (i.e. calls ascii())

- %s will fall back to calling PyObject_Str() and then
  call _PyUnicode_AsASCIIString(obj, strict) to
  convert to bytes

That's it.  After sleeping on it, I'm not sure that's enough Python
2.x compatibility to help a lot.  I haven't ported much code to 3.x
yet but I imagine the following are major challenges:

- comparisons between str and bytes always return unequal

- indexing/iterating bytes returns integers, not bytes objects

- concatenation of str and bytes fails (not so bad since
  a TypeError is generated right away).


Maybe the -2 command line option could revert to Python 2.x behavior
for the above but I'm worried it might break working 3.x library
code (the %r/%s change is very safe).  I think I'll play with the
idea and see which unit tests get broken.  Ideally, there would be
warnings generated when each backwards compatible behavior kicks in,
that would greatly help when fixing up code.
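
For concreteness, a rough Python-level model of the two -2 fallbacks
described above (the helper names are invented):

    def _format_r(obj):
        # %r would act as an alias for %a
        return ascii(obj).encode("ascii")

    def _format_s_fallback(obj):
        # %s falls back to str(), then a strict ASCII encode
        return str(obj).encode("ascii", "strict")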

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 461 Final?

2014-01-18 Thread Neil Schemenauer
Ethan Furman et...@stoneleaf.us wrote:
 So, if %a is added it would act like:

 -
 %a % some_obj
 -
 tmp = str(some_obj)
 res = b''
 for ch in tmp:
     if ord(ch) < 256:
         res += bytes([ord(ch)])
     else:
         res += unicode_escape(ch)
 -

 where 'unicode_escape' would yield something like \u0440 ?

My patch on the tracker already implements %a, it's simple.  Just
call PyObject_ASCII() (same as ascii()) then call
PyUnicode_AsLatin1String(s) to convert it to bytes and stick it in.
PyObject_ASCII does not return non-ASCII characters, no decode error
is possible.  We could call _PyUnicode_AsASCIIString(s, strict)
instead if we are afraid for non-ASCII bytes coming out of
PyObject_ASCII.

  Neil

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 461 Final?

2014-01-18 Thread Neil Schemenauer
Steven D'Aprano st...@pearwood.info wrote:
 To properly handle int and float subclasses, int(), index(), and float()
 will be called on the objects intended for (d, i, u), (b, o, x, X), and
 (e, E, f, F, g, G).


 -1 on this idea.

 This is a rather large violation of the principle of least surprise, and 
 radically different from the behaviour of Python 3 str. In Python 3, 
 '%d' interpolation calls the __str__ method, so if you subclass, you can 
 get the behaviour you want:

 py> class HexInt(int):
 ... def __str__(self):
 ...     return hex(self)
 ...
 py> '%d' % HexInt(23)
 '0x17'


 which is exactly what we should expect from a subclass.

 You're suggesting that bytes should ignore any custom display 
 implemented by subclasses, and implicitly coerce them to the superclass 
 int. What is the justification for this? You don't define or even 
 describe what you consider properly handle.

The proposed behavior (at least as I understand it and as I've
implemented in my proposed patch) matches Python 2 str/unicode and
Python 3 str behavior for these codes.  If you want to allow
subclasses to have control or to use duck-typing, you have to use
str and __format__.  I'm okay with the limitation, bytes formatting
can be simple, limited and fast.

  Neil

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

