[Python-Dev] Re: [python-committers] Re: [IMPORTANT] Preparations for 3.11.0 beta 1

2022-04-08 Thread Inada Naoki
Thank you, Victor.

I had considered dropping (a) from the PEP, but I kept it because:

* I rushed to write the PEP before the 3.11 beta.
* In the "Backward compatibility" section of the PEP, I want to
mention `locale.getencoding()` and `encoding="locale"`.
* But they are not fixed in the main branch yet, so the PEP needs to
include what must be fixed in 3.11.

But now we are close to merging `locale.getencoding()`,
and I am wary of merging it before the PEP is accepted, even though it is
documented in the PEP...

Now I think the best way is:

* Withdraw the PEP submission temporarily.
* Implement `locale.getencoding()` and fix `encoding="locale"` in the
main branch.
* Remove them from the PEP.
* Resubmit the PEP.

And if the PEP is accepted, I want to do this in the 3.11 branch (even
though it will be beta already):

* Improve the documentation about UTF-8 mode and EncodingWarning based on the PEP.
* Add (opt-in) EncodingWarning to `locale.getpreferredencoding()` and
`subprocess.Popen(text=True)`.

On Thu, Apr 7, 2022 at 9:42 PM Victor Stinner  wrote:
>
> IMO adding locale.getencoding() to Python 3.11 is not controversial
> and is useful even if PEP 686 is rejected. This function was discussed
> for 1 year (bpo-43510, bpo-43552, bpo-43557, bpo-47000) and there is
> an agreement that there is a need for this function.
>
> > Making `open(path, encoding="locale")` use locale encoding in UTF-8 mode 
> > (Python 3.10 used UTF-8)
>
> If someone explicitly opts in for the "locale encoding", it sounds
> surprising that the locale (encoding) is ignored and that UTF-8 is
> used if the Python UTF-8 Mode is enabled. I'm fine with this change.
> If you always want UTF-8, pass UTF-8 explicitly:
>
> # no surprise, always decode file content from UTF-8
> json_file = open(filename, encoding="utf-8")
>
> --
>
> I will not comment on PEP 686 here. It's being discussed on Discourse:
>
> * https://discuss.python.org/t/14435
> * https://discuss.python.org/t/14737
>
> Victor
>
> On Thu, Apr 7, 2022 at 5:35 AM Inada Naoki  wrote:
> >
> > Hi, Pablo.
> >
> > I just submitted PEP 686 to the SC.
> > https://github.com/python/steering-council/issues/118
> >
> > In this PEP, I am proposing:
> >
> > a. Small improvements to UTF-8 mode in Python 3.11.
> > b. Making UTF-8 mode the default in Python 3.13.
> >
> > (a) is an important change for (b), so I included it in the PEP.
> > More precisely, (a) contains two changes:
> >
> > * Making `open(path, encoding="locale")` use the locale encoding in UTF-8
> > mode (Python 3.10 used UTF-8).
> > * Adding `locale.getencoding()`, which is the same as
> > `locale.getpreferredencoding(False)` but returns the locale encoding even
> > in UTF-8 mode.
> >
> > These changes are important for (b),
> > but they are not big changes that need a PEP.
> >
> > What should I do?
> >
> > * Do not merge anything until the PEP is accepted.
> > * Merge (a) without waiting for the PEP to be accepted.
> > * Merge (a) and remove it from the PEP.
> >
> > FYI, Victor and I are implementing `locale.getencoding()` now.
> >
> > https://bugs.python.org/issue47000
> > https://github.com/python/cpython/pull/32068
> >
> > Regards,
> > --
> > Inada Naoki  
> > ___
> > python-committers mailing list -- python-committ...@python.org
> > To unsubscribe send an email to python-committers-le...@python.org
> > https://mail.python.org/mailman3/lists/python-committers.python.org/
> > Message archived at 
> > https://mail.python.org/archives/list/python-committ...@python.org/message/7E4QEKZ6HNDDPDL76LP3TBBKLAUQ7AHB/
> > Code of Conduct: https://www.python.org/psf/codeofconduct/
>
>
>
> --
> Night gathers, and now my watch begins. It shall not end until my death.



-- 
Inada Naoki  
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XVJEAF7S2OORL77QMLLQTWKHLRDFA3KH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP 686 – Make UTF-8 mode default

2022-04-07 Thread Inada Naoki
Hi, all.

I wrote a new PEP last month.
I'm sorry that I forgot to announce it here.

The PEP is here:
https://peps.python.org/pep-0686/

Discussions:
* https://discuss.python.org/t/14737 (current thread)
* https://discuss.python.org/t/14435 (previous thread)

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AQ2ZN475KSTPGUURG4Y3ZKBIDSBOBYHY/


[Python-Dev] Re: [python-committers] [IMPORTANT] Preparations for 3.11.0 beta 1

2022-04-06 Thread Inada Naoki
Hi, Pablo.

I just submitted PEP 686 to the SC.
https://github.com/python/steering-council/issues/118

In this PEP, I am proposing:

a. Small improvements to UTF-8 mode in Python 3.11.
b. Making UTF-8 mode the default in Python 3.13.

(a) is an important change for (b), so I included it in the PEP.
More precisely, (a) contains two changes:

* Making `open(path, encoding="locale")` use the locale encoding in UTF-8
mode (Python 3.10 used UTF-8).
* Adding `locale.getencoding()`, which is the same as
`locale.getpreferredencoding(False)` but returns the locale encoding even
in UTF-8 mode.
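
The difference between the two functions can be inspected with a minimal sketch like the one below. It assumes Python 3.11 or later for `locale.getencoding()` and guards the call with `hasattr` so that it still runs on older versions. The helper name `describe_encodings` is made up for illustration.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings discussed above into a dict."""
    info = {
        # True when Python runs in UTF-8 mode (e.g. `python -X utf8`).
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Reports UTF-8 when UTF-8 mode is enabled.
        "preferred": locale.getpreferredencoding(False),
    }
    # locale.getencoding() only exists on Python 3.11+ (assumption: newer
    # interpreter); it reports the locale encoding even in UTF-8 mode.
    if hasattr(locale, "getencoding"):
        info["locale"] = locale.getencoding()
    return info


if __name__ == "__main__":
    print(describe_encodings())
```

Running the script once normally and once with `python -X utf8` should make the difference between the "preferred" and "locale" entries visible on systems whose locale encoding is not UTF-8.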

These changes are important for (b),
but they are not big changes that need a PEP.

What should I do?

* Do not merge anything until the PEP is accepted.
* Merge (a) without waiting for the PEP to be accepted.
* Merge (a) and remove it from the PEP.

FYI, Victor and I are implementing `locale.getencoding()` now.

https://bugs.python.org/issue47000
https://github.com/python/cpython/pull/32068

Regards,
-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7E4QEKZ6HNDDPDL76LP3TBBKLAUQ7AHB/


[Python-Dev] Re: New PEP website is horrible to read on mobile device

2022-03-16 Thread Inada Naoki
I can reproduce it. I reported it on
https://github.com/python/peps/issues/2437

On Thu, Mar 17, 2022 at 1:29 AM Patrick Arminio
 wrote:
>
> I've tried to reproduce the bug on ios and android (emulators) with no 
> success, usually something like this
> happens when the
>
> 
>
> meta is missing, but we do have it in the content, so it is strange, can you 
> reproduce consistently?
>
>
> On Tue, 15 Mar 2022 at 13:28, Nathan Cook  wrote:
>>
>> Please make https://peps.python.org/ more responsive to various form factors
>>
>> See attached screenshot from Chrome version 99.0.4844.58 on my Pixel 3aXL 
>> running Android 12
>>
>> Thank you,
>> Nathan Cook
>> Message archived at 
>> https://mail.python.org/archives/list/python-dev@python.org/message/TTZBNWW67IR26VWLV4NFTHP6WBQOD5FI/
>
>
>
> --
> Patrick Arminio
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/LA6U263OUVJ2RBFHFYNFXZ2QSCZHVVUW/



-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/22XIKX2S3RRLGYBNTCCWKSZZ6O25PXYY/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-22 Thread Inada Naoki
On Wed, Feb 23, 2022 at 1:46 AM Eddie Elizondo via Python-Dev
 wrote:
>
>
> That article is five years old so it doesn't reflect the current state of the 
> system! We have continuous profiling and monitoring of Copy on Writes and 
> after introducing the techniques described in this PEP, we have largely fixed 
> the majority of scenarios where this happens.
>
> You are right in the fact that just addressing reference counting will not 
> fix all CoW issues. The trick here is also to leverage the permanent GC 
> generation used for the `gc.freeze` API. That is, if you have a container 
> that is known to be immortal, it should be pushed into the permanent GC 
> generation. This will guarantee that the GC itself will not change the GC 
> headers of said instance.
>
> Thus, if you immortalize your heap before forking (using the techniques in: 
> https://github.com/python/cpython/pull/31489) then you'll end up removing the 
> vast majority of scenarios where CoW takes place. I can look into writing a 
> new technical article for Instagram with more up to date info but this might 
> take time to get through!
>
> Now, I said that we've largely fixed the CoW issue because there are still 
> places where it happens such as: free lists, the small object allocator, etc. 
> But these are relatively small compared to the ones coming from reference 
> counts and the GC head mutations.

The same technique doesn't guarantee the same benefit. Just as gc.freeze() is
needed before immortalizing to avoid CoW, some other tricks may be
needed too.
A new article is welcome, but I want a sample application that we can run,
profile, and measure the benefits on.
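
For reference, the `gc.freeze()` step mentioned above can be sketched as follows. This is only a minimal illustration of the documented "freeze right before fork" pattern, not the Instagram setup; the helper names `freeze_before_fork` and `fork_worker` are hypothetical, and the sketch does nothing about refcount writes, which is the part the PEP is about.

```python
import gc
import os


def freeze_before_fork():
    """Move all currently tracked objects to the permanent GC generation.

    Returns the number of frozen objects reported by the gc module.
    """
    gc.freeze()  # later collections will ignore these objects
    return gc.get_freeze_count()


def fork_worker(work):
    """Fork a child that runs work(). POSIX only: os.fork is missing on Windows."""
    frozen = freeze_before_fork()
    pid = os.fork()
    if pid == 0:
        # Child: pages holding frozen objects stay shared until something
        # writes to them (refcount updates still count as writes).
        try:
            gc.enable()
            work()
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    return frozen
```

Measuring how many pages actually stay shared still requires an OS-level tool, which is why a runnable sample application matters.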

Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AUF5R62E7YT22LL4DJ5HI3FCS3ZPHSTL/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Inada Naoki
On Wed, Feb 23, 2022 at 10:12 AM Eric Snow  wrote:
>
> Thanks for the feedback.  I've responded inline below.
>
> -eric
>
> On Sat, Feb 19, 2022 at 8:50 PM Inada Naoki  wrote:
> > I hope per-interpreter GIL succeeds at some point, and I know this is
> > needed for per-interpreter GIL.
> >
> > But I worry that per-interpreter GIL may be too complex to
> > implement and maintain for core developers and extension writers.
> > As you know, immortal doesn't mean shareable between interpreters. It is
> > too difficult to know which objects can be shared, and where the
> > shareable objects are leaked to other interpreters.
> > So I am not sure that per-interpreter GIL is an achievable goal.
>
> I plan on addressing this in the PEP I am working on for
> per-interpreter GIL.  In the meantime, I doubt the issue will impact
> any core devs.
>

It's nice to hear!


> > So I think it's too early to introduce immortal objects in Python
> > 3.11, unless it *improves* performance without per-interpreter GIL.
> > Instead, we can add a configuration option such as
> > `--enable-experimental-immortal`.
>
> I agree that immortal objects aren't quite as appealing in general
> without per-interpreter GIL.  However, there are actual users that
> will benefit from it, assuming we can reduce the performance penalty
> to acceptable levels.  For a recent example, see
> https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/.
>

It is not a proven example, just a hope at the moment. So an option is
fine for proving the idea.

Although I cannot read the code, they said "patching ASLR by patching
`ob_type` fields;".
That will cause CoW for most objects, won't it?

So reducing memory writes doesn't directly mean reducing CoW.
Unless we can stop writing to a page completely, the page will be copied.


> > On Sat, Feb 19, 2022 at 4:52 PM Eric Snow  
> > wrote:
> > >
> > > Reducing CPU Cache Invalidation
> > > ---
> > >
> > > Avoiding Data Races
> > > ---
> > >
> >
> > Both benefits require a per-interpreter GIL.
>
> CPU cache invalidation exists regardless.  With the current GIL the
> effect is reduced significantly.
>

It's an interesting point. We cannot see the benefit from
pyperformance, because it doesn't use much data and it runs one process
at a time.
So pyperformance cannot put enough stress on the last-level
cache, which is shared by many cores.

We need a multiprocess performance benchmark apart from pyperformance
to stress the last-level cache from multiple cores.
It would help not only this PEP, but also optimizing containers like dict and set.


> >
> > As I wrote before, fork is very difficult to use safely. We cannot
> > recommend it to many users.
> > And I don't think reducing the size of the patches at Instagram or YouTube
> > is a good rationale for this kind of change.
>
> What do you mean by "this kind of change"?  The proposed change is
> relatively small.  It certainly isn't nearly as intrusive as many
> changes we make to internals without a PEP.  If you are talking about
> the performance penalty, we should be able to eliminate it.
>

Can the proposed optimizations to eliminate the penalty guarantee that
no __del__ or weakref is broken,
and that no memory leak occurs when the Python interpreter is initialized
and finalized multiple times?
I haven't confirmed it yet.


> > > Also note that "fork" isn't the only operating system mechanism
> > > that uses copy-on-write semantics.  Anything that uses ``mmap``
> > > relies on copy-on-write, including sharing data from shared objects
> > > files between processes.
> > >
> >
> > It is very difficult to reduce CoW with mmap(MAP_PRIVATE).
> >
> > You may need to write the hash of bytes and unicode objects. You may need to
> > write `ob_type`.
> > Immortal objects can "reduce" memory writes. But "at least one
> > memory write" is enough to trigger CoW.
>
> Correct.  However, without immortal objects (AKA immutable per-object
> runtime-state) it goes from "very difficult" to "basically
> impossible".
>

A configuration option won't make it impossible.


> > >
> > > Constraints
> > > ---
> > >
> > > * ensure that otherwise immutable objects can be truly immutable
> > > * be careful when immortalizing objects that are not otherwise immutable
> >
> > I am not sure about what this means.
> > For example, unicode objects are not immutable because they have hash,
> > u

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-19 Thread Inada Naoki
Hi,

I hope per-interpreter GIL succeeds at some point, and I know this is
needed for per-interpreter GIL.

But I worry that per-interpreter GIL may be too complex to
implement and maintain for core developers and extension writers.
As you know, immortal doesn't mean shareable between interpreters. It is
too difficult to know which objects can be shared, and where the
shareable objects are leaked to other interpreters.
So I am not sure that per-interpreter GIL is an achievable goal.

So I think it's too early to introduce immortal objects in Python
3.11, unless it *improves* performance without per-interpreter GIL.
Instead, we can add a configuration option such as
`--enable-experimental-immortal`.


On Sat, Feb 19, 2022 at 4:52 PM Eric Snow  wrote:
>
> Reducing CPU Cache Invalidation
> ---
>
> Avoiding Data Races
> ---
>

Both benefits require a per-interpreter GIL.

>
> Avoiding Copy-on-Write
> --
>
> For some applications it makes sense to get the application into
> a desired initial state and then fork the process for each worker.
> This can result in a large performance improvement, especially
> memory usage.  Several enterprise Python users (e.g. Instagram,
> YouTube) have taken advantage of this.  However, the above
> refcount semantics drastically reduce the benefits and
> has led to some sub-optimal workarounds.
>

As I wrote before, fork is very difficult to use safely. We cannot
recommend it to many users.
And I don't think reducing the size of the patches at Instagram or YouTube
is a good rationale for this kind of change.


> Also note that "fork" isn't the only operating system mechanism
> that uses copy-on-write semantics.  Anything that uses ``mmap``
> relies on copy-on-write, including sharing data from shared objects
> files between processes.
>

It is very difficult to reduce CoW with mmap(MAP_PRIVATE).

You may need to write the hash of bytes and unicode objects. You may need to
write `ob_type`.
Immortal objects can "reduce" memory writes. But "at least one
memory write" is enough to trigger CoW.
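
The private-mapping semantics behind this point can be seen from Python with a small sketch. It assumes that `mmap.ACCESS_COPY` is an adequate stand-in for `mmap(MAP_PRIVATE)` (that is how the `mmap` module documents it), and the function name `private_mapping_demo` is made up. It shows only that a write lands in a private copy; it does not count copied pages.

```python
import mmap
import tempfile


def private_mapping_demo():
    """Write through a copy-on-write mapping and compare memory with the file."""
    with tempfile.TemporaryFile() as f:
        f.write(b"hello world")
        f.flush()
        # ACCESS_COPY: writes affect this process's memory, not the file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
            m[0:5] = b"HELLO"  # one write is enough to privatize the page
            in_memory = bytes(m[:])
        f.seek(0)
        on_disk = f.read()
    return in_memory, on_disk
```

The mapping sees the modified bytes while the file keeps the original ones, because the OS copied the touched page for this process.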


> Accidental Immortality
> --
>
> While it isn't impossible, this accidental scenario is so unlikely
> that we need not worry.  Even if done deliberately by using
> ``Py_INCREF()`` in a tight loop and each iteration only took 1 CPU
> cycle, it would take 2^61 cycles (on a 64-bit processor).  At a fast
> 5 GHz that would still take nearly 500,000,000 seconds (over 5,000 days)!
> If that CPU were 32-bit then it is (technically) more possible though
> still highly unlikely.
>

Technically, `[obj] * (2**(32-4))` is a 1 GB array on a 32-bit build.
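
Both figures can be checked with a few lines of arithmetic. The sketch below assumes 4-byte pointers on a 32-bit build and one reference per list slot; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def seconds_to_reach(refcount_bits, hz):
    """Time to perform 2**refcount_bits increments at one increment per cycle."""
    return 2 ** refcount_bits / hz


def pointer_array_bytes(n_refs, pointer_size):
    """Memory used by the item array of a list holding n_refs references."""
    return n_refs * pointer_size


# Figure quoted from the PEP: 2^61 cycles at 5 GHz.
secs = seconds_to_reach(61, 5e9)
days = secs / SECONDS_PER_DAY

# Figure from the reply: 2**(32-4) references with 4-byte pointers.
size_32bit = pointer_array_bytes(2 ** (32 - 4), 4)

print(f"{secs:.3g} s, about {days:.0f} days; list array: {size_32bit} bytes")
```

So on a 64-bit build the loop really does take thousands of days, while on a 32-bit build a single list repetition adds 2**28 references using 2**30 bytes for the array.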


>
> Constraints
> ---
>
> * ensure that otherwise immutable objects can be truly immutable
> * be careful when immortalizing objects that are not otherwise immutable

I am not sure about what this means.
For example, unicode objects are not immutable because they have a hash,
a utf8 cache, and a wchar_t cache. (The wchar_t cache will be removed in Python
3.12.)


>
> Object Cleanup
> --
>
> In order to clean up all immortal objects during runtime finalization,
> we must keep track of them.
>

I don't think we need to clean up all immortal objects.

Of course, we should take care of objects that are immortal by default.
But for user-marked immortal objects, it's very difficult to guarantee that
__del__ or weakref callbacks are called safely.

Additionally, if they are marked immortal to avoid CoW, cleanup causes CoW.

Regards,
-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7FCNNQOTIUZTBFZUPYRDSLND6WCVM3JO/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Inada Naoki
On Thu, Feb 17, 2022 at 7:01 AM Eric Snow  wrote:
>
> > > Also note that "fork" isn't the only operating system mechanism
> > > that uses copy-on-write semantics.
> >
> > Could you elaborate? mmap, maybe?
> > [snip]
> > So if you know how to get benefit from CoW without fork, I want to know it.
>
> Sorry if I got your hopes up.  Yeah, I was talking about mmap.
>

Is there any common tool that utilizes CoW via mmap?
If you know of one, please add a link to it in the PEP.
If there is no common tool, most Python users cannot benefit from this.

Generally speaking, fork is a legacy API. It is too difficult to know
which libraries are fork-safe, even in the stdlib. And Windows users
cannot use fork.
Optimizing for non-fork use cases is much better than optimizing for
fork use cases.

* https://gist.github.com/nicowilliams/a8a07b0fc75df05f684c23c18d7db234
* https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
* https://www.evanjones.ca/fork-is-dangerous.html
* https://bugs.python.org/issue33725

I hope per-interpreter GIL replaces fork use cases.
But tools using CoW without fork are also welcome, especially if they
support Windows.

Anyway, I don't yet believe that stopping refcounting will fix the CoW
issue. See this article [1] again.

[1] 
https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

Note that they failed to fix CoW by stopping refcounting on code objects! (*)
Most CoW was caused by the cyclic GC and by finalization.

(*) It is not surprising to me because the eval loop doesn't incref/decref
most code attributes. It borrows references from the code object.

So we need to profile a sample application before saying it fixes CoW.
Could you provide some data, or drop the CoW issue from this PEP until
it is proven?

Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J53GY7XKFOI4KWHSTTA7FUL7TJLE7WG6/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-15 Thread Inada Naoki
+1 for the overall idea.

Some comments:

>
> Also note that "fork" isn't the only operating system mechanism
> that uses copy-on-write semantics.
>

Could you elaborate? mmap, maybe?

Generally speaking, fork is very difficult to use safely.
My company's web apps load applications and libraries *after* fork,
not *before* fork, for safety.
We changed multiprocessing to use spawn by default on macOS.
So I don't recommend fork to most Python users.

So if you know how to benefit from CoW without fork, I want to know about it.
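
For readers who want the spawn behaviour on every platform, the start method can be requested explicitly. This is a minimal sketch with made-up helper names (`square`, `run_with_spawn`); it trades the memory sharing of fork for a clean child process.

```python
import multiprocessing as mp


def square(x):
    """Trivial picklable task; a stand-in for real work."""
    return x * x


def run_with_spawn(values, processes=2):
    """Run square() over values in freshly spawned worker processes."""
    # Explicit context, independent of the platform's default start method.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=processes) as pool:
        return pool.map(square, values)
```

Because spawn re-imports the main module in each child, `run_with_spawn([1, 2, 3])` should be called from a script under an `if __name__ == "__main__":` guard.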

>
> Immortal Global Objects
> ---
>
> The following objects will be made immortal:
>
> * singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
> * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
> * all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
>   small ints)
>
> There will likely be others we have not enumerated here.
>

How about interned strings?
Should the interned dict belong to the runtime or to the (sub)interpreter?

If the interned dict belongs to the runtime, all interned strings should
be immortal so they can be shared between subinterpreters.
If the interned dict belongs to the interpreter, should we register
immortalized strings in all interpreters?
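
As a reminder of what interning means at the Python level, here is a small sketch using `sys.intern`. The helper name `intern_copies` is made up, and the example says nothing about whether the table lives in the runtime or in each interpreter, which is the open question above.

```python
import sys


def intern_copies(text):
    """Build two separate copies of text and intern both."""
    copy_one = "".join(list(text))
    copy_two = "".join(list(text))
    # sys.intern returns the single canonical object for equal strings.
    return sys.intern(copy_one), sys.intern(copy_two)


first, second = intern_copies("python_dev")
print(first is second)
```

Both calls return the same object, so whichever table holds that object decides who shares it.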

Regards,
-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DQV6ECSUB2VD2EXX6CVCC45RJA6NR2ZZ/


[Python-Dev] Re: Require a C compiler supporting C99 to build Python 3.11

2022-02-11 Thread Inada Naoki
On Thu, Feb 10, 2022 at 6:31 PM Petr Viktorin  wrote:
>
> >
> > I like it. I want to use anonymous unions. They would make complex structures
> > like PyDictKeysObject a little simpler.
> >
> > I confirmed that XLC supports it.
> > https://www.ibm.com/docs/en/xl-c-and-cpp-aix/13.1.3?topic=types-structures-unions#strct__anonstruct
>
> Ah, I've also wanted anonymous unions in the past!
> There's a little problem in that they're not valid in C++, so we can't
> have them in public headers.
>

C++11 supports anonymous unions with some reasonable limitations.
https://en.cppreference.com/w/cpp/language/union

XL C/C++ also supports them. So we can use them if we decide to.

Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GE3RFUXUDFY3GZQHFEVZAIQW3CCMLFK7/


[Python-Dev] Re: Require a C compiler supporting C99 to build Python 3.11

2022-02-09 Thread Inada Naoki
On Thu, Feb 10, 2022 at 3:49 AM Brett Cannon  wrote:
> On Wed, Feb 9, 2022 at 4:19 AM Petr Viktorin  wrote:
>> On 09. 02. 22 4:39, h.vetin...@gmx.com wrote:
>>
>> That's an interesting idea -- what's keeping us from C11?
>
> No one asking before, probably because we have been trying to get to C99 for 
> so long. 
>
>> In other words: the main thing keeping us from C99 is MSVC support, and
>> since that compiler apparently skipped C99, should we skip it as well?
>
> If we think "C11 without optional features" is widely supported then I think 
> that's a fine target to have.
>
> For anyone not sure what's optional in C11, I found 
> https://en.wikipedia.org/wiki/C11_%28C_standard_revision%29#Optional_features 
> . Other than atomics being discussed on Discord for mimalloc, leaving those 
> things out seem reasonable to me.
>

I like it. I want to use anonymous unions. They would make complex structures
like PyDictKeysObject a little simpler.

I confirmed that XLC supports it.
https://www.ibm.com/docs/en/xl-c-and-cpp-aix/13.1.3?topic=types-structures-unions#strct__anonstruct

Regards,
-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MMTUDLFIB7ET6T3Q73HDLODDGZY74X54/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-07 Thread Inada Naoki
On Tue, Feb 8, 2022 at 1:47 PM Guido van Rossum  wrote:
>
> Thanks for trying it! I'm curious why it would be slower (perhaps less 
> locality? perhaps the ...Id... APIs have some other trick up their sleeve?) 
> but since it's also messier and less backwards compatible than just leaving 
> _Py_IDENTIFIER alone and just not using it, I'd say let's not spend more time 
> on that alternative and just focus on the two other horses still in the race: 
> immortal objects or what you have now.
>

I think it's because statically allocated strings are not interned.
I think deepfreeze should stop using statically allocated strings for
interned strings too.

--
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7OILLUEZSQY6OTY5WY543JTUGRVSFNPS/


[Python-Dev] Re: Require a C compiler supporting C99 to build Python 3.11

2022-02-07 Thread Inada Naoki
https://www.ibm.com/docs/en/SSGH3R_16.1.0/pdf/getstart.pdf

As far as I can tell from page 3,
xlclang fully supports C89/C99/C11.
xlc fully supports C89/C99 and partially supports C11.

On Tue, Feb 8, 2022 at 8:57 AM Brett Cannon  wrote:
>
>
>
> On Mon, Feb 7, 2022 at 9:12 AM Victor Stinner  wrote:
>>
>> Hi,
>>
>> I made a change to require C99  "NAN" constant and I was
>> asked to update PEP 7 to clarify which C subset is needed to build
>> Python 3.11.
>>
>> Python 3.6 requires a subset of the C99 standard to build defined by the PEP 
>> 7:
>> https://www.python.org/dev/peps/pep-0007/
>>
>> I modified Python 3.11 to require more C99 features of the  header:
>>
>> * bpo-45440: copysign(), hypot(), isfinite(), isinf(), isnan(), round()
>> * bpo-46640: NAN
>>
>> After my NAN change (bpo-46640), Petr Viktorin asked me to update the
>> PEP 7. I proposed a change to simply say that "Python 3.11 and newer
>> versions use C99":
>> https://github.com/python/peps/pull/2309
>>
>> I would prefer to not have to give an exhaustive list of C99 features
>> used by CPython, since it's unclear to me what belongs to C99 or to
>> ISO C89. As I wrote before, Python already uses C99 features since
>> Python 3.6.
>>
>> On my PEP PR, Guido van Rossum asked me to escalate the discussion to
>> python-dev, so here I am :-)
>>
>> In "C99", the number "99" refers to the year 1999, the standard is now
>> 23 years old:
>> https://en.wikipedia.org/wiki/C99
>>
>> In 2022, C99 is now well supported by C compilers supported by Python:
>> GCC, clang, MSVC.
>
>
> I think if those compilers fully support C99 at this point we should consider just 
> moving completely over to C99.
>
> -Brett
>
>>
>>
>> I don't know if AIX XLC supports C99. AIX provides a "c99" compiler
>> compatible with C99. It also seems like GCC is usable on AIX.
>>
>> I don't know if ICC supports C99. Python doesn't officially support the ICC
>> compiler; the ICC buildbots went away a few years ago. But sometimes I
>> make some changes to enhance the ICC support, when the change is small
>> enough.
>>
>> Note: Python also uses C11 , but it's not required: there
>> are fallbacks for compilers which don't support it.
>>
>> Victor
>> --
>> Night gathers, and now my watch begins. It shall not end until my death.
>> Message archived at 
>> https://mail.python.org/archives/list/python-dev@python.org/message/J5FSP6J4EITPY5C2UJI7HSL2GQCTCUWN/
>
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/ZLDOBJUVMTIRETVRNHPWWO5MBHTXYEW3/



-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XIMQWFVP7RJW7CTV74YPWI74L7ZUU2PX/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-02 Thread Inada Naoki
+1 overall.

On Thu, Feb 3, 2022 at 7:45 AM Eric Snow  wrote:
>
>
> I'd also like to actually get rid of _Py_IDENTIFIER(), along with
> other related API including ~14 (private) C-API functions.  Dropping
> all that helps reduce maintenance costs.  However, at least one PyPI
> project (blender) is using _Py_IDENTIFIER().  So, before we could get
> rid of it, we'd first have to deal with that project (and any others).
>

It would be nice to provide something similar to _Py_IDENTIFIER, but
designed (and documented) for 3rd party modules, like this:

```
typedef struct {
    Py_IDENTIFIER(foo);
    ...
} Modstate;
...
// in some func
Modstate *state = (Modstate*)PyModule_GetState(module);
PyObject_GetAttr(o, PY_IDENTIFIER_STR(state->foo));
...
// m_free()
static void mod_free(PyObject *module) {
    Modstate *state = (Modstate*)PyModule_GetState(module);
    Py_IDENTIFIER_CLEAR(state->foo);
}
```


-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZZ5QOZDOAO734SDRJGMXW6AJGAVEPUHE/


[Python-Dev] Re: Increase of Spammy PRs and PR reviews

2022-01-30 Thread Inada Naoki
On Sun, Jan 30, 2022 at 7:37 PM Irit Katriel  wrote:
>
> If people spam the approvals (i.e., approve PRs without reviewing them) then 
> the distinction between the labels becomes meaningless, of course. Though I 
> do wonder what the motivation for doing that repeatedly would be.  My basic 
> assumption is that people usually try not to make fools of themselves.
>

Some people may "approve without review" to get attention from core
devs. They just want the issue to be fixed soon. This is not so bad.

Some people may "approve without review" to make their "Profile"
page richer, because GitHub counts it as a contribution.
Spam issues or pull requests can be reported as spam very
easily. But "approve without review" is hard to report as spam.
So approving random pull requests is the easiest way to earn contributions
without being reported as spam.

For example, see this user's contributions. They reviewed 32 pull
requests in cpython. It seems they approve random pull requests after
someone else has approved them.
https://github.com/raghavthind2005

Of course, approving already-approved pull requests without review
doesn't change the pull request status. So I can ignore them.
I am just explaining what the motivation for repeatedly approving
without review might be.

I don't watch the cpython repository, so I don't suffer from spammy
approvals and I have no vote on this. I am just mentioning an option
we have.

Regards,

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/UO6ZSNWLLXWU7AZ7NGQDTOQ2WVX2ZAZN/


[Python-Dev] Re: Increase of Spammy PRs and PR reviews

2022-01-29 Thread Inada Naoki
On Sun, Jan 30, 2022 at 12:03 PM Ethan Smith  wrote:
>
> As a non-committer, I want to make a plea for non-committer approval reviews, 
> or at least that they have a place. When asked how outsiders can contribute I 
> frequently see "review open PRs" as a suggested way of contributing to 
> CPython. Not being able to approve PRs that are good would be a barrier to 
> those contributions.
>
> Furthermore, I am collaborating with a couple of core devs, it would make 
> collaboration more difficult if I couldn't review their work and say that I 
> thought the changes looked good.
>

You can still write a review comment, including "LGTM". What you
cannot do is label the PR as "Approved".
So I don't think it makes collaboration difficult.
By preventing approvals from others, we can easily find PRs that are
approved by core devs or triage members but not merged yet.

> I know "drive by approvals" are annoying but I think it is unfortunately part 
> of open source projects.
>

Sorry, but I don't think so.

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/X4PIFVMHMIECGAEP7A2SKMUXZD4BVWIM/


[Python-Dev] Re: Increase of Spammy PRs and PR reviews

2022-01-29 Thread Inada Naoki
On Sun, Jan 30, 2022 at 10:39 AM Ethan Furman  wrote:
>
>
>  > And lots of non-committer PR reviews that only approve.
>
> I have seen this.  Quite irritating.
>

We can prohibit approvals from non-core developers. Do we use this
setting already?
https://github.blog/2021-11-01-github-keeps-getting-better-for-open-source-maintainers/


-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KHXSE2MSEC5JR2QB5F6HJUFYCC4SHGFF/


[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)

2022-01-05 Thread Inada Naoki
On Thu, Jan 6, 2022 at 7:00 AM Trent Nelson  wrote:
>
> I did some research on this a few years back.  I was curious what sort
> of "max reference counts" were encountered in the wild, in long-running
> real life programs.  For the same reason: I wanted to get some insight
> into how many unused bits could possibly be repurposed for future
> shenanigans (I had PyParallel* in the mind at the time).
>

I think we can assume the upper bound of the reference count is the
same as the upper bound on the number of pointers.
On a 32-bit machine, the memory space is 2**32 bytes and a pointer
takes 4 bytes. And a pointer cannot be stored at NULL. So the upper
bound of refcnt is 2**30-1.
So we have two free bits in the refcnt.

On a 64-bit machine, we have at least four free bits for the same reason.
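
As a rough check, the 32-bit arithmetic can be written out in a few
lines of Python. This is only a sketch: `free_refcount_bits` and its
parameters are hypothetical names, and it assumes that the refcount
field is as wide as a pointer and that every reference is held in a
pointer-sized slot.

```python
def free_refcount_bits(address_bits: int, pointer_bytes: int) -> int:
    """Bits of a pointer-wide refcount field that can never be set.

    Assumption: every reference lives in a pointer-sized slot, so the
    number of references cannot exceed the number of slots that fit in
    the address space.
    """
    max_slots = 2 ** address_bits // pointer_bytes
    max_refcnt = max_slots - 1       # the slot at NULL cannot hold a reference
    field_bits = pointer_bytes * 8   # refcount field is as wide as a pointer
    return field_bits - max_refcnt.bit_length()


print(free_refcount_bits(address_bits=32, pointer_bytes=4))
```

For a 64-bit build the result depends on how many address bits are
assumed to be usable, so that is left as a parameter.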

Regards,

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/UNYX6VOQLDMXMWXP54KJUXCHSWOA5YKT/


[Python-Dev] Re: The current state of typing PEPs

2021-11-29 Thread Inada Naoki
On Thu, Nov 18, 2021 at 8:00 AM Barry Warsaw  wrote:
>
> Does PEP 563 or 649 satisfy static and dynamic typing needs?
>
> In the interest of full transparency, we want to let the Python community 
> know that the Steering Council continues to discuss PEP 563 (Postponed 
> Evaluation of Annotations) and PEP 649 (Deferred Evaluation Of Annotations 
> Using Descriptors).  We have concluded that we lack enough detailed 
> information to make a decision in favor of either PEP.  As we see it, 
> adopting either PEP 563 or PEP 649 as the default would be insufficient. They 
> will not fully resolve the existing problems these PEPs intend to fix, will 
> break some existing code, and likely don’t address all the use cases and 
> requirements across the static and dynamic typing constituents.  We are also 
> uncertain as to the best migration path from the current state of affairs.
>
> Defer decision on PEP 563 and 649 in 3.11
>
> As such, at this time, the only reasonable path forward that the SC sees is 
> to defer the decision in Python 3.11 again, essentially keeping the 3.10 
> status quo.  We know that this is far from ideal, but it’s also the safest 
> path since we can’t clearly make the situation better, and we don’t have 
> confidence that either PEP solves the problems once and for all.  
> Pragmatically, we don’t want to make the situation worse, and we really don’t 
> want to find ourselves back here again in a couple of releases because we 
> overlooked an important requirement for a set of users.
>

In my opinion, we can agree that we cannot make PEP 563 the default in
the future because it will break too many use cases.
Is anyone against making a statement that "PEP 563 will never be the
default behavior"?

Then we do not need to decide "PEP 563 or 649".
We can focus on whether or not we can replace "stock semantics +
opt-in PEP 563" with PEP 649.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KCZKREGT66VE67MPT7NNJ5UPCACY6X7A/


[Python-Dev] Re: Do we need to remove everything that's deprecated?

2021-11-14 Thread Inada Naoki
On Mon, Nov 15, 2021 at 7:58 AM Victor Stinner  wrote:
>
> On Sun, Nov 14, 2021 at 6:34 PM Eric V. Smith  wrote:
> > On second thought, I guess the existing policy already does this. Maybe
> > we should make it more than 2 versions for deprecations? I've written
> > libraries where I support 4 or 5 released versions. Although maybe I
> > should just trim that back.
>
> If I understood correctly, the problem is more for how long is the new
> way available?
>

I think the main problem is how much user code will be broken,
weighed against the merit of the removal.

For example, PEP 623 will remove some legacy C APIs in Python 3.12.
https://www.python.org/dev/peps/pep-0623/

There are a few modules the PEP will break. But the PEP has
significant merit (it reduces the memory usage of all string objects).
So I want to remove them with the minimum deprecation period, and I am
helping people move to the new APIs. (*)

* e.g. https://github.com/jamesturk/cjellyfish/pull/12

So I don't want to increase the minimum required deprecation period.
But I agree that a longer deprecation period is good when keeping
the deprecated stuff has nearly zero cost.

Regards,

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/3UE3SNH3DG5HE22EZ57NM5BFJ7ZANUJC/


[Python-Dev] Re: Type annotations, PEP 649 and PEP 563

2021-10-23 Thread Inada Naoki
On Sun, Oct 24, 2021 at 6:03 AM Bluenix  wrote:
>
> Hmm, I thought I responded to this on Gmail but it hasn't appeared here on 
> the archive so I'll send it again..
>
> Is it known how much more/less the annotations impact performance compared to 
> function defaults?
>

Basically, PEP 563 overhead is the same as function defaults.

* Function annotations are one constant tuple per function.
* Functions having the same signature share the same annotation tuple.
* No GC overhead because it is constant (not tracked by GC).
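
A minimal sketch of what "constant" means here, using only the
standard library: under `from __future__ import annotations` each
annotation is stored as a plain string, so nothing is evaluated when
the function is defined. The function name `f`, the undefined name
`Undefined`, and the use of `exec` on a source string are only for
illustration.

```python
source = '''
from __future__ import annotations

def f(a: Undefined, b: list[int]) -> None:
    pass
'''

namespace = {}
# Does not raise NameError: "Undefined" is stored as text, never looked up.
exec(source, namespace)
print(namespace["f"].__annotations__)
```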


On the other hand, it is difficult to predict PEP 649 overhead.

When the namespace is not kept:

* Annotations are just a constant tuple too.
* No GC overhead either.
* The constant tuple will be slightly bigger than PEP 563. (It can be
the same as PEP 563 when all annotations are constant.)
* Although they are constant tuples, they may have their own names and
firstlineno. So it is difficult to share whole annotation data between
functions having the same signature. (*)

(*) I proposed to drop firstlineno and names to share more
annotations. See
https://github.com/larryhastings/co_annotations/pull/9

When the namespace is kept:

* Annotations will be GC-tracked objects.
* They will be heavier than in the "namespace is not kept" case, in
both startup time and memory consumption.
* They have some GC overhead.

To answer "how much more", we need a target application. But I don't
have a good target application for now.

Additionally, even if we have a good target application, it is still
difficult to estimate real future impact.

For example, SQLAlchemy has very heavy docstrings so the `-OO` option
has a big impact.
But SQLAlchemy doesn't use function annotations yet. So stock/PEP
563/PEP 649 have the same (zero) cost. Yay!
But if SQLAlchemy starts using annotations, *all applications* using
SQLAlchemy will be impacted in the future.

Regards,
--
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/WHLQAVAXXXACUABGS2EJLYWU336ISZDD/


[Python-Dev] Re: Type annotations, PEP 649 and PEP 563

2021-10-23 Thread Inada Naoki
On Sat, Oct 23, 2021 at 5:55 AM Bluenix  wrote:
>
>
> > Is the performance of PEP 649 and PEP 563 similar enough that we can
> > outright discount it as a concern? Does anyone actually care about the
> > overhead of type annotations anymore? Are there other options to alleviate
> > this potential issue (like a process-wide switch to turn off annotations)?
>
> In my opinion this shouldn't warrant any concern as these costs are only
> on startup of Python. The difference is not enough for me to care at least.
>

Costs are not only at startup time.
Memory consumption is a cost over the whole process lifetime.
And longer GC time is paid every time a full GC happens.

So performance is a problem for both short-lived applications
like CLI tools and long-lived applications like web apps.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZQE3BXOTCETH6CDMGZHDGDEKV6SRQGQR/


[Python-Dev] Re: Type annotations, PEP 649 and PEP 563

2021-10-20 Thread Inada Naoki
On Thu, Oct 21, 2021 at 1:08 PM Henry Fredrick Schreiner
 wrote:
>
> > typing features from future Python versions
>
> I second both of these uses, but especially this (which seems to be missing 
> from the original post), it’s been by far the main reason I’ve used this mode 
> and I’ve seen this used, and is the main feature to look forward to when 
> dropping Python 3.7 support. The new features coming to typing make static 
> typing much easier, but no libraries can drop Python 3.7/3.8/3.9 support; but 
> static typing doesn’t need old versions.
>

I agree with this point. We shouldn't emit DeprecationWarning for
`from __future__ import annotations` for at least 3 versions (3.11 ~
3.13).

> When talking about speed, one of the important things to consider here is the 
> difference between the two proposals. PEP 649 was about the same as the 
> current performance, but PEP 563 was significantly faster, since it doesn’t 
> instantiate or deal with objects at all, which both the current default and 
> PEP 563 do. You could even protect imports with TYPE_CHECKING with PEP 563, 
> and further reduce the runtime cost of adding types - which could be seen as 
> a reason to avoid adding types. To the best of my knowledge, it hasn’t been a 
> blocker for packages, but something to include.
>
> Also, one of the original points for static typing is that strings can be 
> substituted for objects. “Type” is identical, from a static typing 
> perspective, to Type. You can swap one for the other, and for a Python 3.6+ 
> codebase, using something like “A | B” (with the quotes) is a valid way to 
> have static types in 3.6 that pass MyPy (though are not usable at runtime, 
> obviously, but that’s often not a requirement). NumPy, for example, makes 
> heavy usage of unions and other newer additions in static typing.
>

We may be able to provide a tool to rewrite Python sources, like 2to3:

* Remove `from __future__ import annotations`
* Stringify annotations unless they are constants (e.g. `None`, `42`,
`"foo"` are not rewritten).

This tool can ease the transition from PEP 563 to 649, and solve
performance issues too.
PEP 649 can have the same performance as PEP 563 if all annotations
are constants.
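
A rough sketch of such a rewriter, built on the standard `ast` module
(it needs `ast.unparse`, so Python 3.9 or later). No such tool exists
in this thread: the names `StringifyAnnotations` and `rewrite` are
hypothetical, and a real tool would also have to preserve comments and
formatting, which `ast.unparse` does not.

```python
import ast


class StringifyAnnotations(ast.NodeTransformer):
    """Drop the PEP 563 future import and quote non-constant annotations."""

    @staticmethod
    def _stringify(node):
        # Constants such as None, 42 or "foo" are left untouched.
        if node is None or isinstance(node, ast.Constant):
            return node
        return ast.copy_location(ast.Constant(value=ast.unparse(node)), node)

    def visit_ImportFrom(self, node):
        if node.module == "__future__":
            node.names = [a for a in node.names if a.name != "annotations"]
            if not node.names:
                return None  # remove the now-empty import statement
        return node

    def visit_arg(self, node):
        node.annotation = self._stringify(node.annotation)
        return node

    def _visit_function(self, node):
        self.generic_visit(node)  # handles arguments and nested definitions
        node.returns = self._stringify(node.returns)
        return node

    visit_FunctionDef = _visit_function
    visit_AsyncFunctionDef = _visit_function

    def visit_AnnAssign(self, node):
        self.generic_visit(node)
        node.annotation = self._stringify(node.annotation)
        return node


def rewrite(source: str) -> str:
    tree = StringifyAnnotations().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(rewrite(
    "from __future__ import annotations\n"
    "def f(a: list[int], b: None = None) -> int:\n"
    "    return 1\n"
))
```

The rewritten source carries string constants where the original had
expressions, so it no longer depends on the future import.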

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZFGRO6UGFVOBPTW2EFNLUC7WEYPCRLAQ/


[Python-Dev] Re: Type annotations, PEP 649 and PEP 563

2021-10-20 Thread Inada Naoki
On Thu, Oct 21, 2021 at 6:38 AM Christopher Barker  wrote:
>
> Thanks to the SC for such a thoughtful note. I really like where this is 
> going.
>
> One thought.
>
> On Wed, Oct 20, 2021 at 6:21 AM Thomas Wouters  wrote:
>>
>> Is the performance of PEP 649 and PEP 563 similar enough that we can 
>> outright discount it as a concern? Does anyone actually care about the 
>> overhead of type annotations anymore? Are there other options to alleviate 
>> this potential issue (like a process-wide switch to turn off annotations)?
>
> Annotations are used at runtime by at least one std lib module: dataclasses, 
> and who knows how many third party libs. So that may not be practical.
>

This is similar to docstrings. Some tools that use docstrings (e.g.
docopt) prevent using the -OO option.
Although some libraries (e.g. SQLAlchemy) have huge docstrings, we
cannot use -OO if even a tiny set of modules depends on docstrings or
assertions.

So I think we need some mechanism to disable optimizations like
dropping assertions, docstrings, and annotations per module.
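
To illustrate why one docstring-dependent module blocks -OO for the
whole process, here is a small sketch using the `optimize` argument of
the built-in `compile()`, which applies the same stripping to a single
piece of source. The helper `load` and the sample function are
hypothetical, and this is not the per-module switch asked for above.

```python
SOURCE = '''
def f():
    "Usage: prog [options]"
    assert False, "checked only when assertions are enabled"
    return "ran"
'''


def load(optimize):
    """Compile SOURCE at the given optimization level and return f."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["f"]


plain = load(0)      # like running without -O
stripped = load(2)   # like running with -OO
print(plain.__doc__, stripped.__doc__)
```

With level 2 both the docstring and the assertion are gone, so a tool
that parses `__doc__` has nothing to read.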

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/DMJOQ6JXDQSPRZPOLL4FRADIMP5EZTF6/


[Python-Dev] Re: Regressions caused the recent work on ceval.c and frame objects

2021-09-20 Thread Inada Naoki
The msgpack 1.0.2 sdist includes C++ code generated by Cython 0.29.21,
released on 2020-07-09.
Python 3.10 can build it without any source modifications.
So I don't expect Python 3.10 to require regeneration for most Cython users.

Regards,

On Mon, Sep 20, 2021 at 4:11 PM  wrote:
>
> > Will all packages that use Cython have to upgrade Cython to work with 3.10?
>
> For this particular issue you'll only have to upgrade if you use "profiling" 
> (I doubt that many packages routinely build with Cython profiling turned on). 
> However, it's possible there are other 3.10 bugfixes in Cython - I'm not 
> completely sure. I think the easiest way to know is to try to build a package 
> for 3.10.
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/52VFBHR7AHTXPLC434E4BPXNXVUU3SVF/



-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7MLTC3NK535YOQAKD6VQTT4QZIREG57N/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations

2021-08-12 Thread Inada Naoki
Lazily loading code objects solves only half of the problem.
I am worried about the function objects for annotations too.

Function objects are heavier than code objects. And they are GC-tracked objects.

I want to know how we can reduce the number of function objects
created for annotations in PEP 649 before deprecating PEP 563.
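
The GC-tracking difference can be observed with `gc.is_tracked()` on
CPython. This is only a sketch: the function `annotate` is a
hypothetical stand-in for a per-annotation function, not what the
PEP 649 reference implementation generates.

```python
import gc


def annotate():
    # Stand-in for a function that computes annotations lazily.
    return {"x": int}


# A function object takes part in cyclic garbage collection...
print(gc.is_tracked(annotate))
# ...while a stringified annotation such as "int" does not.
print(gc.is_tracked("int"))
```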

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/OTZ4EPAQ2DTOIFFRBNG5CDMSRBKDUME5/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations

2021-08-11 Thread Inada Naoki

> On 2021/08/11 19:22, Paul Moore wrote:
> 
> Also, I don't think that improving performance is a
> justification for a non-trivial backward compatibility break (I don't
> recall a case where we've taken that view in the past) so "PEP 649
> solves forward references without a backward compatibility impact, and
> performance is a small issue in the face of that" is a reasonable
> position to take.

OK. I will stop talking about import time.

But memory footprint and GC time are still issues.
Annotations under PEP 649 semantics can be much heavier than docstrings.

So I want to measure them before deprecating PEP 563.

IMHO, we cannot make PEP 563 the default anymore.
The FastAPI ecosystem changed how annotations are used after PEP 563 was accepted.

So I think two or three options are on the table:

a. Keep the Python 3.10 state. PEP 563 stays opt-in for the foreseeable future.
b. Accept PEP 649 and deprecate both the old semantics and PEP 563.
c?. Accept PEP 649 and deprecate the old semantics, but keep PEP 563 as opt-in.

(I exclude "Accept PEP 649 but keep old semantics" option because backward 
compatibility is not a problem if we keep old semantics.)

Of course, (b) is the simplest solution.
But I am not sure that (b) doesn't cause a significant memory usage and GC time 
regression.

We can keep (a) in Python 3.11, so we don't have to hurry to accept PEP 
649.

Regards,

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QTAXVWFNF632TPCCLNFJTAXPRMV45SO6/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations

2021-08-10 Thread Inada Naoki
On Tue, Aug 10, 2021 at 5:11 PM Mark Shannon  wrote:
>
> >>
> >
> > Currently, reference implementation of PEP 649 has been suspended.
> > We need to revive it and measure performance/memory impact.
> >
> > As far as I remember, the reference implementation created a function
> > object for each methods.
>
> No function object is created under normal circumstances.
> __annotations__ is a property that calls the underlying
> __co_annotations__ property, which lazily creates a callable.
>
> I'll leave it to Larry to explain why __co_annotations__ isn't just a
> code object.
>

I am talking about methods.
As far as I remember, a function object is created for each method to
keep the class namespace.
Larry explained it and a possible optimization. That's what I am waiting for.
https://mail.python.org/archives/list/python-dev@python.org/message/2OOCEE6OMBQYEIJXEGFWIBE62VPIJHP5/

Regards,

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ROPYNKOG5GJIM233LEESA5AH75W7G2YI/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations

2021-08-09 Thread Inada Naoki
On Tue, Aug 10, 2021 at 12:30 AM Eric V. Smith  wrote:
>
> Personally, I'd like to see PEP 649 accepted. There are a number of
> issues on bpo where people expect dataclass's field.type to be an actual
> Python object representing a type, and not a string. This field is
> copied directly from __annotations__. If we stick with PEP 563's string
> annotations, I'll probably just close these issues as "won't fix", and
> tell the caller they need to convert the strings to Python objects
> themselves. If 649 is accepted, then they'll all get their wish, and in
> addition I can remove some ugly logic from dataclasses.
>

I don't think there is much difference.

PEP 563 is not the default. And PEP 563 is not the only source of
stringified annotations.
So accepting PEP 649 doesn't mean "they'll all get their wish". We
need to say "won't fix" anyway.
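
A short sketch of that point: a manually quoted annotation reaches
`field.type` as a string even without PEP 563, and the caller can
resolve it with `typing.get_type_hints()`. The class `Point` is a
hypothetical example.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class Point:
    x: "int"  # stringified by hand, no __future__ import involved
    y: int


raw = {f.name: f.type for f in fields(Point)}
resolved = get_type_hints(Point)
print(raw)
print(resolved)
```

`raw` keeps the string for `x`, while `resolved` maps both fields to
the `int` type.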


> Do we need to do anything here to move forward on this issue? I've
> chatted with Larry and Mark Shannon, who have some additional thoughts
> and I'm sure will chime in.
>

Currently, the reference implementation of PEP 649 is suspended.
We need to revive it and measure the performance/memory impact.

As far as I remember, the reference implementation created a function
object for each method.
That doubles the number of function objects. It has a major impact on
memory usage, startup time, and GC time.

There was an idea to avoid creating function objects for most cases.
But it was not implemented.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QNZE4GX6RD4QLTIM4U247G5RNWFX6BOQ/


[Python-Dev] Re: Repealing PEP 509 (Add a private version to dict)

2021-07-29 Thread Inada Naoki
+1

On Thu, Jul 29, 2021 at 19:46 Mark Shannon  wrote:

> Hi everyone,
>
> I would like to repeal PEP 509. We don't really have a process for
> repealing a PEP. Presumably I would just write another PEP.
>
> Before I do so, I would like to know if anyone thinks we should keep
> PEP 509.
>
> The dictionary version number is currently unused in CPython and just
> wastes memory. I am not claiming that we will never need it, just that
> we shouldn't be required to have it. It should be an internal
> implementation detail that we can add or remove depending on requirements.
>
> Thoughts?
>
> Cheers,
> Mark.
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/GV4CW3T7SUTJOYSCP6IJMV4AHDNNZIPV/
>
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/5DH7TBYAFN3CPCYPGZU73VBM57YDZPXL/


[Python-Dev] Re: [python-committers] Roundup to GitHub Issues migration

2021-06-23 Thread Inada Naoki
FWIW, GitHub announced the new, more powerful Issues today.
https://github.com/features/issues

It may fill some of the gaps between GitHub Issues and Roundup.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/IVTIRU4X6PXAAMUKFJU3IJB35BAR4A67/


[Python-Dev] Re: Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-09 Thread Inada Naoki
I think the stable ABI preserves symbols, signatures, and memory layouts.
I don't think the stable ABI preserves all behaviors.

For example, Py_CompileString() is part of the stable ABI.
When we added the `async` keyword, Py_CompileString() started raising
an error for source code using `async` as a name.
Is it an ABI change? I don't think so.
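
The same behavior change is visible from Python through the built-in
`compile()`, used here as a stand-in for Py_CompileString(). This is a
sketch: the helper `compiles` is hypothetical, and the result assumes
Python 3.7 or later, where `async` is a reserved keyword.

```python
def compiles(source: str) -> bool:
    """Return True if the source compiles, False on SyntaxError."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True


# The entry point and its signature are unchanged; only the set of
# accepted programs differs between Python versions.
print(compiles("value = 1"))
print(compiles("async = 1"))
```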

I want to drop Py_UNICODE support in Python 3.12. It is another
incompatible change in the PyArg_Parse*() *API*.
Users cannot use the "u" format after that. It is an incompatible
*API* change, but not an *ABI* change.

I suspect we have already made many incompatible *API* changes in the
stable ABI.

If I am wrong, can we stop keeping the stable ABI at Python 3.12?
Python 4.0 won't come in the foreseeable future. The stable ABI blocks
Python's evolution.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/T7CPD4LHAVU5TMMCZ7CXNMOUL3D7ZR5O/


[Python-Dev] Re: Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-08 Thread Inada Naoki
On Tue, Jun 8, 2021 at 6:02 PM Petr Viktorin  wrote:
>
>
> > * Make function PyArg_Parse always raising an exception.
>
> This would break extensions that use the stable ABI.
> (Yes, even starting to raise RuntimeError in 3.10 broke things. And yes,
> it's not strictly an ABI issue, but it has the same effect for users:
> they still need to recompile extensions to keep them working.)
>

I think we can skip this step.
Extension modules using the # format without PY_SSIZE_T_CLEAN have
already been broken since Python 3.10.

Adding # format support with ssize_t won't break much.
We can do it in Python 3.12 or 3.13.

Regards,

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/DN6OAOEP6MF75VJC6O6ATRQREPXE6CSU/


[Python-Dev] Re: Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-07 Thread Inada Naoki
On Tue, Jun 8, 2021 at 12:53 AM Hai Shi  wrote:
>
> > So how about making PY_SSIZE_T_CLEAN not mandatory in Python 3.11?
> > Extension modules can use '#' format with ssize_t, without
> > PY_SSIZE_T_CLEAN defined.
> > Or should we wait one more version?
>
> Hi, Inada,
> I suggest we should wait until at least Python 3.12 or Python 4.0.
>

Serhiy and you both suggested Python 3.12, and I agree.
Thank you for your reply.

> There have an another question. There have many C API defined under 
> PY_SSIZE_T_CLEAN, for example: _PyArg_Parse_SizeT().
> Could we remove or merge them after making PY_SSIZE_T_CLEAN not mandatory?

They are part of the stable ABI. So we can remove/merge them in Python 4.0.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KNA7KVN6ZIBXWASUAJQRGT6OPCBDULFW/


[Python-Dev] Re: Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-07 Thread Inada Naoki
On Mon, Jun 7, 2021 at 4:52 PM Serhiy Storchaka  wrote:
>
> Many users still use 3.6 or 3.7. Jumping from 3.7 to 3.11 could break
> extensions in bad way (crash, truncated data, leaked sensitive
> information, execution of arbitrary code). Also, deprecation warnings in
> 3.8 and 3.9 can be easily ignored.
>
> I propose to wait until both of conditions became true:
>
> * 3.7 no longer maintained
> * 3.10 reaches security-only mode.
>

Makes sense.

Python 3.7 will get security fixes until 2023-06.
https://www.python.org/dev/peps/pep-0537/#and-beyond-schedule

Python 3.12 will be released in 2023-10.
So we can make PY_SSIZE_T_CLEAN the default from 3.12.

Regards,

-- 
Inada Naoki  
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QZSSHINMXXZ4I3ODPDFPGFC7MBSKIOVB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-06 Thread Inada Naoki
Hi, folks,

Since Python 3.8, the PyArg_Parse*() and Py_BuildValue() APIs have
emitted a DeprecationWarning when the '#' format is used without
PY_SSIZE_T_CLEAN defined.
In Python 3.10, they raise a RuntimeError instead of a warning.
Extension modules can no longer use the '#' format with int.

So how about making PY_SSIZE_T_CLEAN not mandatory in Python 3.11?
Extension modules can use '#' format with ssize_t, without
PY_SSIZE_T_CLEAN defined.

Or should we wait one more version?

Regards,

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KSREO43D6GQWO5LMVIU2LF7CP4IBYT2C/


[Python-Dev] Re: GDB not breaking at the right place

2021-05-25 Thread Inada Naoki
On Tue, May 25, 2021 at 5:38 AM Guido van Rossum  wrote:
>
> To the contrary, I think if you want the CI jobs to be faster you should add 
> the CFLAGS to the configure call used to run the CI jobs.
>

-Og speeds up not only CI jobs, but also the everyday "edit code and
run `make test` with all assertions" cycle.

I have no opinion on which should be the default (+0 for -O0).
I use -Og by default and use -O0 only when I really need it.

FWIW, we can disable optimization on a per-file basis while debugging.

  // Put this line on files you want to debug.
  #pragma GCC optimize ("O0")

Regards,

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OJJZKWS446PJPXHUBNUVIYE756D5HHP4/


[Python-Dev] Re: Using FutureWarning for last version before deletion.

2021-05-11 Thread Inada Naoki
On Tue, May 11, 2021 at 5:30 PM Petr Viktorin  wrote:
>
> Test tools should treat DeprecationWarning as error by default [0][1].
> So even if end users don't really see it, I don't consider it "hidden".
>

*should* is not *do*. For example, nosetests doesn't show DeprecationWarning.
And there are many scripts without tests.

So it is hidden from some people.

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GO7LZDHH4PEB57FRH4XHZZNZACWP5SRG/


[Python-Dev] Using FutureWarning for last version before deletion.

2021-05-10 Thread Inada Naoki
Hi, folks.

Now that Python 3.11 development is open, I am carefully removing some
deprecated stuff.

I am considering the `configparser.ParsingError.filename` property, which has
been deprecated since Python 3.2.
https://github.com/python/cpython/blob/8e8307d70bb9dc18cfeeed3277c076309b27515e/Lib/configparser.py#L315-L333

My random thoughts about it:

* It has been deprecated long enough.
* But the maintenance burden is low enough.
* If we don't remove long deprecated stuff like this, Python 4.0 will
be a big breaking change.

My proposal:

* Change DeprecationWarning to FutureWarning and wait one more version.
  * DeprecationWarning is suppressed by default to hide noise from end users.
  * But a sudden breaking change is more annoying to end users.
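
The difference between the two categories can be seen in a small sketch. It only approximates what end users see for library code (DeprecationWarning ignored, everything else shown); `old_api` and `old_api_last_call` are hypothetical names, not part of configparser.

```python
import warnings


def old_api():
    # Hypothetical deprecated function using the usual category.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)


def old_api_last_call():
    # Same deprecation, escalated to FutureWarning in the last version.
    warnings.warn("old_api() will be removed in the next release",
                  FutureWarning, stacklevel=2)


def shown_categories():
    """Return the names of the warning categories that pass the filters."""
    with warnings.catch_warnings(record=True) as caught:
        # Assumption: mimic the end-user view of library code, where
        # DeprecationWarning is ignored and other categories are shown.
        warnings.simplefilter("always")
        warnings.filterwarnings("ignore", category=DeprecationWarning)
        old_api()
        old_api_last_call()
    return [w.category.__name__ for w in caught]


print(shown_categories())
```

Only the FutureWarning gets through these filters, which is the point of using it for the last version before deletion.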

I am not proposing to change PEP 387 "Backwards Compatibility Policy".
This is just a new convention.

Another idea:

* Stop suppressing DeprecationWarning by default
* Use at least one PendingDeprecationWarning and one DeprecationWarning.
  * More than two PendingDeprecationWarning periods is preferred.

What do you think?

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DLRLO7HKJQK7PB6LHQK7RXYW53F72QR4/


[Python-Dev] Re: Can't sync cpython main to my fork

2021-05-06 Thread Inada Naoki
Your main branch on GitHub has some commits that are not in python/cpython.
https://github.com/smontanaro/cpython/commits/main

If you don't mind discarding all changes in your local and GitHub main
branches, you can sync them with:

# delete all changes in your worktree and local main branch.
$ git reset --hard upstream/main
# overwrite your GitHub main branch with the local one.
$ git push --force origin main

On Thu, May 6, 2021 at 9:41 PM Skip Montanaro  wrote:
>
> (Sorry, this is probably not really python-dev material, but I'm stuck
> trying to bring my fork into sync with python/cpython.)
>
> I don't know if I did something to my fork or if the master->main
> change did something to me, but I am unable to sync my
> smontanaro/cpython main with the python/cpython main. The dev guide
> gives this simple recipe:
>
> git checkout main
> git pull upstream main
> git push origin main
>
> Here's how that goes:
>
> (python39) rvm% git co main
> Already on 'main'
> Your branch is up to date with 'upstream/main'.
> (python39) rvm% git pull upstream main
> From git://github.com/python/cpython
>  * branch  main   -> FETCH_HEAD
> Already up to date.
> (python39) rvm% git push origin main
> To github.com:smontanaro/cpython.git
>  ! [rejected]  main -> main (non-fast-forward)
> error: failed to push some refs to 'github.com:smontanaro/cpython.git'
> hint: Updates were rejected because the tip of your current branch is behind
> hint: its remote counterpart. Integrate the remote changes (e.g.
> hint: 'git pull ...') before pushing again.
> hint: See the 'Note about fast-forwards' in 'git push --help' for details.
>
> I looked at the fast-forward stuff in 'git push --help' but couldn't
> decipher what it told me, or more importantly, how it related to my
> problem. It's not clear to me how python/cpython:main can be behind
> smontanaro/cpython:main. I've attached my .git/config file in case
> that provides clues to the Git aficionados.
>
> Thx...
>
> Skip
> ___
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/J6GGEKUBMPU3X3WNKUG2XUD3GDV7L2FK/



-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EWSK2TM4QD3BM5G2CMEIJB6PVFO3FNSO/


[Python-Dev] Re: Keeping Python a Duck Typed Language.

2021-04-22 Thread Inada Naoki
On Fri, Apr 23, 2021 at 10:33 AM Chris Angelico  wrote:
>
> On Fri, Apr 23, 2021 at 11:22 AM Larry Hastings  wrote:
> >
> >
> > On 4/20/21 10:03 AM, Mark Shannon wrote:
> >
> > If you guarded your code with `isinstance(foo, Sequence)` then I could not 
> > use it with my `Foo` even if my `Foo` quacked like a sequence. I was forced 
> > to use nominal typing; inheriting from Sequence, or explicitly registering 
> > as a Sequence.
> >
> >
> > If I'm reading the library correctly, this is correct--but, perhaps, it 
> > could be remedied by adding a __subclasshook__ to Sequence that looked for 
> > an __iter__ attribute.  That technique might also apply to other ABCs in 
> > collections.abc, Mapping for example.  Would that work, or am I missing a 
> > critical detail?
> >
>
> How would you distinguish between a Sequence and a Mapping? Both have
> __iter__ and __len__. Without actually calling those methods, how
> would the subclass hook tell them apart?
>
> ChrisA

We could check for .keys() on Mapping to distinguish Mapping from Sequence.
But that is a breaking change, of course. We shouldn't do it.

I think using an ABC to distinguish a sequence from a mapping is a bad idea.

There are three policies:

a) Use duck typing; just use it as a sequence. No type check at all.
b) Use strict type checking; isinstance(x, list) / isinstance(x, (list, tuple)).
c) Use an ABC.

But (c) is broken by design. It is not fixable.
IMHO, we should choose (a) or (b) and reject any idea relying on the Sequence ABC.
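
A minimal sketch of the problem with (c), assuming a hypothetical class `Foo` that quacks like a sequence without inheriting from or registering with Sequence:

```python
from collections.abc import Iterable, Mapping, Sequence, Sized


class Foo:
    """Quacks like a sequence but does not inherit from Sequence."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __iter__(self):
        return iter(self._items)


foo = Foo("abc")

# (a) Duck typing: just use it; no type check is needed.
print([item.upper() for item in foo])

# (c) ABC checks: Iterable and Sized recognize Foo structurally,
# but Sequence only accepts subclasses and registered classes.
print(isinstance(foo, Iterable), isinstance(foo, Sized))
print(isinstance(foo, Sequence))

# A dict has the same three methods, so these method names alone
# cannot tell a Sequence from a Mapping.
print(all(hasattr(dict, name) for name in ("__len__", "__getitem__", "__iter__")))
print(isinstance({}, Mapping), isinstance({}, Sequence))
```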

Regards,

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ESLOPO4GLC2QZW4ZDBYEQDPPGB4ZYDWM/


[Python-Dev] Re: PEP 563 in light of PEP 649

2021-04-20 Thread Inada Naoki
And I tried removing co_firstlineno in the optimize branch.
https://github.com/larryhastings/co_annotations/pull/9

Microbenchmarks.
(https://gist.github.com/methane/abb509e5f781cc4a103cc450e1e7925d)

```
# co_annotations branch (63b415c3)
$ ./python ~/ann_test.py 3
code size: 229679 bytes
memory: 209077 bytes
unmarshal: avg: 639.631ms +/-0.254ms
exec: avg: 95.979ms +/-0.033ms

$ ./python ~/ann_test_method.py 3
code size: 245729 bytes
memory: 339109 bytes
unmarshal: avg: 672.997ms +/-9.039ms
exec: avg: 259.286ms +/-4.841ms

# optimize branch (fbf0ad725f)
$ ./python ~/ann_test.py 3
code size: 113082 bytes
memory: 209077 bytes
unmarshal: avg: 318.437ms +/-0.171ms
exec: avg: 100.187ms +/-0.141ms

$ ./python ~/ann_test_method.py 3
code size: 129134 bytes
memory: 284565 bytes
unmarshal: avg: 357.157ms +/-0.971ms
exec: avg: 262.066ms +/-5.258ms
```

By the way, this microbenchmark uses 3 arguments and 1 return value.
Each annotation value is chosen from 3 candidates (e.g. ["int", "str", "foo.bar.baz"]).
So there are 3*3*3*3=81 signatures, not only 27.
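
The count can be checked by enumerating the combinations. This is only an illustration of the arithmetic; `count_signatures` is a hypothetical helper, not part of the benchmark script.

```python
from itertools import product


def count_signatures(choices, slots=4):
    """Count distinct signatures: 3 arguments plus 1 return value = 4 slots."""
    return len(set(product(choices, repeat=slots)))


print(count_signatures(["int", "str", "foo.bar.baz"]))  # prints 81
```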

Anyway, 81/1000 may not be realistic.
When I changed ann_test to choose each annotation value from 5 candidates (e.g. 625/1000):

```
# co_annotations
$ ./python ~/ann_test.py 3
code size: 236106 bytes
memory: 208261 bytes
unmarshal: avg: 653.788ms +/-1.257ms
exec: avg: 95.783ms +/-0.169ms

# optimize
$ ./python ~/ann_test.py 3
code size: 162097 bytes
memory: 208261 bytes
unmarshal: avg: 458.959ms +/-0.163ms
exec: avg: 98.327ms +/-0.065ms
```


--
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FN26LGNAOMRPRVR2THUBBBUFZWPEOWSB/


[Python-Dev] Re: PEP 563 in light of PEP 649

2021-04-20 Thread Inada Naoki
On Tue, Apr 20, 2021 at 4:24 PM Inada Naoki  wrote:
>
> Just an idea: do not save co_name and co_firstlineno in code object
> for function annotations.
> When creating a function object from a code object, they can be copied
> from annotated function.
>

I created a pull request. It uses `__co_annotations__` for the name, but
uses `.__co_annotations__` for the qualname.
https://github.com/larryhastings/co_annotations/pull/11


-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/D2X42E54IM2ASKDE5VJP6YJX557OOCHU/


[Python-Dev] Re: PEP 563 in light of PEP 649

2021-04-20 Thread Inada Naoki
Just an idea: do not save co_name and co_firstlineno in the code object
for function annotations.
When creating a function object from a code object, they can be copied
from the annotated function.

I think co_name and co_firstlineno prevent code objects from being
shared at compile time.
We can share only co_names and co_consts for now. If we could share the
entire code object, it would reduce pyc file size and unmarshal time
(i.e. part of import time).
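
As a rough illustration, using ordinary functions as a stand-in for annotation code objects (`load_user` and `load_item` are hypothetical), two code objects with identical bodies still differ in exactly these fields:

```python
def load_user(a, b):
    return a + b


def load_item(a, b):
    return a + b


first, second = load_user.__code__, load_item.__code__

# Fields that describe the body are identical...
shared = [name for name in ("co_code", "co_consts", "co_names", "co_varnames")
          if getattr(first, name) == getattr(second, name)]
# ...but the name and the first line number are not.
differing = [name for name in ("co_name", "co_firstlineno")
             if getattr(first, name) != getattr(second, name)]

print(shared)
print(differing)
print(first == second)  # the code objects as a whole compare unequal
```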

On Tue, Apr 20, 2021 at 6:15 AM Larry Hastings  wrote:
>
> On 4/19/21 1:37 PM, Ethan Furman wrote:
>
> On 4/19/21 10:51 AM, Larry Hastings wrote:
>
> Something analogous /could/ happen in the PEP 649 branch but currently 
> doesn't.  When running Inada Naoki's benchmark, there are a total of nine 
> possible annotations code objects.  Except, each function generated by the 
> benchmark has a unique name, and I incorporate that name into the name given 
> to the code object (f"{function_name}.__co_annotations__"). Since each 
> function name is different, each code object name is different, so each code 
> object /hash/ is different, and since they aren't /exact/ duplicates they are 
> never consolidated.
>
>
> I hate anonymous functions, so the name is very important to me.  The primary 
> code base I work on does have hundreds of methods with the same signature -- 
> unfortunately, many of them also have the same name (four levels of super() 
> calls is not unusual, and all to the same read/write/create parent methods 
> from read/write/create child methods).  In such a case would the name make a 
> meaningful difference?
>
> Or maybe the name can be stored when running in debug mode, and not stored 
> with -O ?
>
>
> I think it needs to have a name.  But if it made a difference, perhaps it 
> could use f"{function_name}.__co_annotations__" normally, and simply 
> "__co_annotations__" with -O.
>
> Note also that this is the name of the annotations code object, although I 
> think the annotations function object reuses the name too.  Anyway, under 
> normal circumstances, the Python programmer would have no reason to interact 
> directly with the annotations code/function object, so it's not likely it 
> will affect them one way or another.  The only time they would see it would 
> be, say, if the calculation of an annotation threw an exception, in which 
> case it seems like seeing f"{function_name}.__co_annotations__" in the 
> traceback might be a helpful clue in diagnosing the problem.
>
>
> I'd want to see some real numbers before considering changes here.  If it has 
> a measurable and beneficial effect on real-world code, okay! let's change it! 
>  But my suspicion is that it doesn't really matter.
>
>
> Cheers,
>
>
> /arry
>
> ___
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/ZAPCP4MFDOF34E3G2TWAVY7JUQRHDOOB/



-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NVVLCBI3I75C5N67BP7TJJ2JX2LT6OP6/


[Python-Dev] Re: In support of PEP 649

2021-04-18 Thread Inada Naoki
As far as I know, both Pydantic and marshmallow started using annotations
for runtime types after PEP 563 was accepted. Am I right?
When PEP 563 was accepted, there were some tools using runtime types in
annotations, but they were not widely adopted yet.

But we didn't have any good way to emit a DeprecationWarning for use
cases that PEP 563 breaks.
Maybe we should have changed the default behavior sooner, in Python 3.8.
During the long preparation period without a DeprecationWarning, the use cases
broken by PEP 563 became numerous.

I still love PEP 563. Python is a dynamic language, so runtime types and
static types cannot be 100% consistent.
Annotations are a handy tool. But there is only one annotation per object.
When we start using annotations for multiple purposes, they soon become an
ugly monster.

Using annotation syntax only for static typing, like TypeScript, is the ideal.
TypeScript has been far more successful than Python at gradual typing.

But the current situation has already gone where I did not want it to go.
Annotations are already used for multiple purposes.
I'm sad and disappointed about it, but that's that.

I'm OK with keeping PEP 563 opt-in. And I think we should do it. (Of
course, I'm not an SC member. I will follow the SC decision.)

And if we keep PEP 563 opt-in, there is no need to compare PEP 563 with PEP 649.
PEP 649 should be compared with "stock semantics + opt-in PEP 563 semantics".

Of course, one semantics is better than two.
But we would need to have three semantics until we can remove the stock and
PEP 563 semantics.
We should think about PEP 649 very carefully and spend more time on it.

Regards,
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BYSKN5TXEVIOHN6WVMSFRIVMF66OEPOT/


[Python-Dev] Re: PEP 563 in light of PEP 649

2021-04-17 Thread Inada Naoki
On Sat, Apr 17, 2021 at 1:38 PM Guido van Rossum  wrote:
>
> I'm not going to report results, but we could use mypy itself as an example 
> real-world code base. Mypy is almost 100% annotated. It does not include 
> `from __future__ import annotations` lines but those could easily be added 
> mechanically for some experiment.
>
> ISTM that the unmarshal times reported by Inada are largely proportional to 
> the code size numbers, so perhaps the following three-way experiment would 
> give an indication:
>
> (1) Add the sizes of all pyc files for mypy run with Python 3.9 (classic)
> (2) Ditto run with Python 3.10a7 (PEP 563)
> (3) Ditto run with Larry's branch (PEP 649, assuming it's on by default there 
> -- otherwise, modify the source by inserting the needed future import at the 
> top)
>

Please don't use 3.10a7; use the latest master branch instead.
The CFG optimizer broke some PEP 563 optimizations, and I fixed it yesterday.
https://github.com/python/cpython/pull/25419

> The repo is github.com/python/mypy, the subdirectory to look is mypy, WITH 
> THE EXCLUSION OF THE typeshed SUBDIRECTORY THEREOF.
>

I want to measure import time and memory usage. Will `import
mypy.main` import all the important modules?

Here are my quick results for (1) and (2). I could not try (3) because of a
memory error (see below).

## memory usage

```
$ cat a.py
import tracemalloc
tracemalloc.start()
import mypy.main
print("memory:", tracemalloc.get_traced_memory()[0])

# (1)
$ python3 a.py
memory: 8963137
$ python3 -OO a.py
memory: 8272848

# (2)
$ ~/local/python-dev/bin/python3 a.py
memory: 8849216
$ ~/local/python-dev/bin/python3 -OO a.py
memory: 8104730

>>> (8963137-8849216)/8963137
0.012709947421310196
>>> (8272848-8104730)/8272848
0.020321659481716575
```

PEP 563 saved 1~2% memory.

## GC time

```
$ pyperf timeit -s 'import mypy.main, gc' -- 'gc.collect()'

3.9: . 2.68 ms +- 0.02 ms
3.10: . 2.23 ms +- 0.01 ms

Mean +- std dev: [3.9] 2.68 ms +- 0.02 ms -> [3.10] 2.23 ms +- 0.01 ms: 1.20x faster
```

PEP 563 is 1.2x faster!

## import time

```
$ python3 -m pyperf command python3 -c 'import mypy.main'

(1) command: Mean +- std dev: 99.6 ms +- 0.3 ms
(2) command: Mean +- std dev: 93.3 ms +- 1.2 ms

>>> (99.6-93.3)/99.6
0.06325301204819275
```

PEP 563 reduced import time by 6%.

## memory error on co_annotations

I modified py_compile to add `from __future__ import co_annotations`
automatically.

```
$ ../co_annotations/python -m compileall mypy
Listing 'mypy'...
Compiling 'mypy/checker.py'...
free(): corrupted unsorted chunks
Aborted

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x77c73859 in __GI_abort () at abort.c:79
#2  0x77cde3ee in __libc_message
(action=action@entry=do_abort, fmt=fmt@entry=0x77e08285 "%s\n") at
../sysdeps/posix/libc_fatal.c:155
#3  0x77ce647c in malloc_printerr
(str=str@entry=0x77e0a718 "free(): corrupted unsorted chunks") at
malloc.c:5347
#4  0x77ce81c2 in _int_free (av=0x77e39b80 ,
p=0x55d1db30, have_lock=) at malloc.c:4356
#5  0x55603906 in PyMem_RawFree (ptr=) at
Objects/obmalloc.c:1922
#6  _PyObject_Free (ctx=, p=) at
Objects/obmalloc.c:1922
#7  _PyObject_Free (ctx=, p=) at
Objects/obmalloc.c:1913
#8  0x5567caa9 in compiler_unit_free (u=0x55ef0fd0) at
Python/compile.c:583
#9  0x5568aea5 in compiler_exit_scope (c=0x7fffc3d0) at
Python/compile.c:760
#10 compiler_function (c=0x7fffc3d0, s=,
is_async=0) at Python/compile.c:2529
#11 0x5568837d in compiler_visit_stmt (s=,
c=0x7fffc3d0) at Python/compile.c:3665
#12 compiler_body (c=c@entry=0x7fffc3d0, stmts=0x56222450) at
Python/compile.c:1977
#13 0x55688e51 in compiler_class (c=c@entry=0x7fffc3d0,
s=s@entry=0x56222a60) at Python/compile.c:2623
#14 0x55687ce3 in compiler_visit_stmt (s=,
c=0x7fffc3d0) at Python/compile.c:3667
#15 compiler_body (c=c@entry=0x7fffc3d0, stmts=0x563014c0) at
Python/compile.c:1977
#16 0x5568db00 in compiler_mod (filename=0x772e6770,
mod=0x563017b0, c=0x7fffc3d0) at Python/compile.c:2001
```

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2UON4FZ5UJ3RYE3ZO5Q445RVPMFAR2SW/


[Python-Dev] Re: In support of PEP 649

2021-04-16 Thread Inada Naoki
On Fri, Apr 16, 2021 at 6:03 AM Bluenix  wrote:
>
> Please accept PEP 649!
>
> Python's type hinting has become so much more useful than originally thought, 
> and without this change much of that will be hindered. For example (you 
> already know about Pydantic and FastAPI) 
> [discord.py](https://github.com/Rapptz/discord.py)'s commands system allows 
> you to use typehinting to specify how arguments should be converted. Take the 
> following code:
>
> ```py
> import discord
> from discord.ext import commands
>
> bot = commands.Bot(command_prefix='>')
>
> @bot.command()
> # discord.py reads the typehints and converts the arguments accordingly
> async def reply(ctx, member: discord.Member, *, text: str):  # ctx is always passed
>     await ctx.send(f'{member.mention}! {text}')
>
> bot.run('token')
> ```
>
> I must say, this is extremely ergonomic design! Don't break it :)

Are you sure that PEP 563 breaks it and that discord.py cannot fix it?

As far as I understand, PEP 563 doesn't hurt this use case much:

* This use case evaluates runtime type information only once. So
eval() overhead is not a problem.
* Currently, annotations are very complex and varied. For example,
List[int] can be written as `List[int]`, `List['int']`, `'List[int]'`,
`'List["int"]'`, `List[ForwardRef('int')]` etc...
  After PEP 563, only `'List[int]'` is practical, so we can stop
supporting `List["int"]` and others at some point.
  So playing with runtime types will become easier in the future.
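
A minimal sketch of the "evaluate only once" point, assuming a hypothetical `command` decorator (this is not discord.py's API). The annotations are written as strings, which is how PEP 563 stores them, and `typing.get_type_hints()` resolves them a single time at registration:

```python
import typing
from typing import List


def command(func):
    # Hypothetical decorator: resolve the string annotations once, when the
    # command is registered, and cache the result on the function.
    func.converters = typing.get_type_hints(func)
    return func


@command
def reply(count: "int", tags: "List[str]"):
    return count, tags


print(reply.converters)
```

After registration, the framework can look up `reply.converters` for every call without touching eval() again.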

Am I wrong?

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NSZGWCABWFYWZZTNCE5VE5ZVC3OUJCNU/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-15 Thread Inada Naoki
>
> I will read PEP 649 implementation to find missing optimizations other
> than GH-25419 and GH-23056.
>

I found that each "__co_annotation__" has its own name, like "func0.__co_annotation__".
It increases pyc size a little.
I created a draft pull request that cherry-picks GH-25419 and
GH-23056 and uses just "__co_annotation__" as the name.
https://github.com/larryhastings/co_annotations/pull/9/commits/48a99e0aafa2dd006d72194bc1d7d47443900502

```
# previous result
$ ./python ~/ann_test.py 3
code size: 204963 bytes
memory: 209257 bytes
unmarshal: avg: 587.336ms +/-2.073ms
exec: avg: 97.056ms +/-0.046ms

# Use single name
$ ./python ~/ann_test.py 3
code size: 182088 bytes
memory: 209257 bytes
unmarshal: avg: 539.841ms +/-0.227ms
exec: avg: 98.351ms +/-0.064ms
```

It reduced code size and unmarshal time by about 10%.
I confirmed that GH-25419 and GH-23056 work very well. All identical constants
are shared.

Unmarshal time is still slow. It is caused by unmarshaling code objects.
But I don't know where the bottleneck is: parsing the marshal file, or
creating code objects.

---

Then, I tried to measure method annotation overhead.
Code: 
https://gist.github.com/methane/abb509e5f781cc4a103cc450e1e7925d#file-ann_test_method-py
Result:

```
# No annotation
$ ./python ~/ann_test_method.py 0
code size: 113019 bytes
memory: 256008 bytes
unmarshal: avg: 336.665ms +/-6.185ms
exec: avg: 176.791ms +/-3.067ms

# PEP 563
$ ./python ~/ann_test_method.py 2
code size: 120532 bytes
memory: 269004 bytes
unmarshal: avg: 348.285ms +/-0.102ms
exec: avg: 176.933ms +/-4.343ms

# PEP 649 (all optimization included)
$ ./python ~/ann_test_method.py 3
code size: 196135 bytes
memory: 436565 bytes
unmarshal: avg: 579.680ms +/-0.147ms
exec: avg: 259.781ms +/-7.087ms
```

PEP 563 vs 649
* code size: +63%
* memory: +62%
* import time: +60%

PEP 649 annotation overhead (compared with no annotation):
* code size: +83 byte/method
* memory: +180 byte/method
* import time: +326 us/method
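
These figures can be recomputed from the results above. The sketch assumes that import time means unmarshal + exec and that the benchmark generates 1000 methods; both are assumptions, marked in the code.

```python
# Figures copied from the ann_test_method.py results above.
# Assumption: import_ms is unmarshal + exec.
RUNS = {
    "no annotation": {"code": 113019, "memory": 256008, "import_ms": 336.665 + 176.791},
    "PEP 563": {"code": 120532, "memory": 269004, "import_ms": 348.285 + 176.933},
    "PEP 649": {"code": 196135, "memory": 436565, "import_ms": 579.680 + 259.781},
}
N_METHODS = 1000  # assumption: the benchmark generates 1000 methods


def percent_increase(new, old):
    return (new / old - 1) * 100


def per_method(new, old):
    return (new - old) / N_METHODS


base, pep563, pep649 = RUNS["no annotation"], RUNS["PEP 563"], RUNS["PEP 649"]
for key in ("code", "memory", "import_ms"):
    print(f"{key}: +{percent_increase(pep649[key], pep563[key]):.0f}% vs PEP 563, "
          f"+{per_method(pep649[key], base[key]):.3f} per method vs no annotation")
```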

It is disappointing because having thousands of methods is very common
for web applications.

Unlike the simple function case, PEP 649 creates a function object instead
of a code object for the __co_annotation__ of methods.
This causes the overhead.  Can we avoid creating functions for each annotation?

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4KUXC373ZOP7YHZI6N4NKOOOB3DCL7NW/


[Python-Dev] Re: In support of PEP 649

2021-04-15 Thread Inada Naoki
On Fri, Apr 16, 2021 at 9:49 AM Oscar Benjamin
 wrote:
>
>
> > That said, I agree it is better that this came up before the feature freeze 
> > than after the release. And I am willing to accept that the hypothetical 
> > future where annotations are not always syntactically expressions (which 
> > did not even exist before this week) is less important than backwards 
> > compatibility.
>
> Would it be problematic to postpone making __future__.annotations the default?
>

__future__.annotations has been the default since 2020-10-04.
https://github.com/python/cpython/commit/044a1048ca93d466965afc027b91a5a9eb9ce23c#diff-ebc983d9f91e5bcf73500e377ac65e85863c4f77fd5b6b6caf4fcdf7c0f0b057

After that, many changes were made to the compiler and other places.
So reverting the change is not so simple.

And personally, I love static typing, but I don't use type hints for
performance/memory usage reasons.
I spent much effort optimizing PEP 563 to minimize type hinting overhead.
So it would be very sad if I could not use type hinting once I can drop
Python 3.9 support.

So if PEP 649 is accepted, I want it to be available from Python 3.10.
Otherwise, I cannot use type hinting even after I drop Python 3.9
support.

But it is up to release manager and steering council.

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6ZYCD63KBP2EPDZIYJD2IPHCVRV4LQGP/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-15 Thread Inada Naoki
I updated the benchmark a little:

* Added no annotation mode for baseline performance.
* Better stats output.

https://gist.github.com/methane/abb509e5f781cc4a103cc450e1e7925d

```
# No annotation (master + GH-25419)
$ ./python ~/ann_test.py 0
code size: 102967 bytes
memory: 181288 bytes
unmarshal: avg: 299.301ms +/-1.257ms
exec: avg: 104.019ms +/-0.038ms

# PEP 563 (master + GH-25419)
$ ./python ~/ann_test.py 2
code size: 110488 bytes
memory: 193572 bytes
unmarshal: avg: 313.032ms +/-0.068ms
exec: avg: 108.456ms +/-0.048ms

# PEP 649 (co_annotations + GH-25419 + GH-23056)
$ ./python ~/ann_test.py 3
code size: 204963 bytes
memory: 209257 bytes
unmarshal: avg: 587.336ms +/-2.073ms
exec: avg: 97.056ms +/-0.046ms

# Python 3.9
$ python3 ann_test.py 0
code size: 108955 bytes
memory: 173296 bytes
unmarshal: avg: 333.527ms +/-1.750ms
exec: avg: 90.810ms +/-0.347ms

$ python3 ann_test.py 1
code size: 121011 bytes
memory: 385200 bytes
unmarshal: avg: 334.929ms +/-0.055ms
exec: avg: 400.260ms +/-0.249ms
```

## Rough estimation of annotation overhead

Python 3.9 w/o PEP 563
code (pyc) size: +11%
memory usage: +122%  (211 bytes / function)
import time: +73% (*)

PEP 563
code (pyc) size: +7.3%
memory usage: +6.8%  (12.3 bytes / function)
import time: +4.5%

PEP 649
code (pyc) size: +99%
memory usage: +15%  (28 bytes / function)
import time: +70%

(*) import time can be much slower for complex annotations.
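
The estimates can be rederived from the raw results. A minimal sketch, assuming import time means unmarshal + exec and that each run is compared against the no-annotation run of the same interpreter:

```python
# (code size, memory, unmarshal ms, exec ms) copied from the results above.
BASELINE_39 = (108955, 173296, 333.527, 90.810)   # Python 3.9, no annotation
STOCK_39 = (121011, 385200, 334.929, 400.260)     # Python 3.9, stock semantics
BASELINE = (102967, 181288, 299.301, 104.019)     # master, no annotation
PEP_563 = (110488, 193572, 313.032, 108.456)
PEP_649 = (204963, 209257, 587.336, 97.056)
N_FUNCTIONS = 1000  # the benchmark creates 1000 annotated functions


def pct(new, old):
    return (new / old - 1) * 100


def overhead(run, base):
    """Return (code %, memory %, memory bytes per function, import time %)."""
    code, mem, unmarshal, exec_ms = run
    b_code, b_mem, b_unmarshal, b_exec_ms = base
    return (pct(code, b_code), pct(mem, b_mem), (mem - b_mem) / N_FUNCTIONS,
            pct(unmarshal + exec_ms, b_unmarshal + b_exec_ms))


for label, run, base in [("3.9 stock", STOCK_39, BASELINE_39),
                         ("PEP 563", PEP_563, BASELINE),
                         ("PEP 649", PEP_649, BASELINE)]:
    code, mem, per_func, imp = overhead(run, base)
    print(f"{label}: code +{code:.1f}%, memory +{mem:.1f}% "
          f"({per_func:.1f} bytes/function), import +{imp:.1f}%")
```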

## Conclusion

* PEP 563 is close to "zero overhead" in memory consumption, and its
import time overhead is ~5%. Users can write type annotations without
worrying about overhead.

* PEP 649 is much better than the old semantics for memory usage and
import time. But import time is still longer than for code without annotations.

  * The import time overhead comes from unmarshal, not from
eval().  If we implement a "lazy load" mechanism for docstrings and
annotations, the overhead will become smaller.
  * pyc files become bigger (but who cares?)

I will read the PEP 649 implementation to find missing optimizations other
than GH-25419 and GH-23056.

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/K5PFGS6DQUHUG63UWRXYNLLMXAVELP32/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-14 Thread Inada Naoki
On Thu, Apr 15, 2021 at 11:09 AM Larry Hastings  wrote:
>
> Thanks for doing this!  I don't think PEP 649 is going to be accepted or 
> rejected based on either performance or memory usage, but it's nice to see 
> you confirmed that its performance and memory impact is acceptable.
>
>
> If I run "ann_test.py 1", the annotations are already turned into strings.  
> Why do you do it that way?  It makes stock semantics look better, because 
> manually stringized annotations are much faster than evaluating real 
> expressions.
>

Because `if TYPE_CHECKING` and manually stringified annotations are used
in real-world applications.
I want to mix both use cases.
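
For readers unfamiliar with the pattern, a minimal sketch follows. The `total` function and the choice of `Decimal` are illustrative assumptions, not taken from the benchmark.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; never imported at runtime.
    from decimal import Decimal


def total(prices: "list[Decimal]") -> "Decimal":
    # The annotations are plain strings, so Decimal need not exist at runtime.
    return sum(prices)


print(total.__annotations__)
```

Because the annotations stay strings, defining the function costs no import and no expression evaluation.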

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MKAGFBTVZ4LXYJC5X6P3LXXXRD7P2WH5/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-14 Thread Inada Naoki
I added memory usage data measured with tracemalloc.

```
# Python 3.9 w/ old semantics
$ python3 ann_test.py 1
code size: 121011
memory: (385200, 385200)
unmarshal: avg: 0.3341682574478909 +/- 3.700437551781949e-05
exec: avg: 0.4067857594229281 +/- 0.0006858555167675445

# Python 3.9 w/ PEP 563 semantics
$ python3 ann_test.py 2
code size: 121070
memory: (398675, 398675)
unmarshal: avg: 0.3352349083404988 +/- 7.749102039824168e-05
exec: avg: 0.24610224328935146 +/- 0.0008628035427956459

# master + optimization w/ PEP 563 semantics
$ ./python ~/ann_test.py 2
code size: 110488
memory: (193572, 193572)
unmarshal: avg: 0.31316645480692384 +/- 0.00011766086337841035
exec: avg: 0.11456295938696712 +/- 0.0017481202239372398

# co_annotations + optimization w/ PEP 649 semantics
$ ./python ~/ann_test.py 3
code size: 204963
memory: (208273, 208273)
unmarshal: avg: 0.597023528907448 +/- 0.00016614519056599577
exec: avg: 0.09546191191766411 +/- 0.00018099485135812695
```

Summary:

* Both PEP 563 and PEP 649 have lower memory consumption than Python 3.9.
* Import time (unmarshal+exec) is about 0.7sec with the old semantics and
PEP 649, and 0.43sec with PEP 563.

On Thu, Apr 15, 2021 at 10:31 AM Inada Naoki  wrote:
>
> I created simple benchmark:
> https://gist.github.com/methane/abb509e5f781cc4a103cc450e1e7925d
>
> This benchmark creates 1000 annotated functions and measures the time to
> load and exec them.
> And here is the result. All interpreters are built without --with-pydebug,
> --enable-optimizations, and --with-lto.
>
> ```
> # Python 3.9 w/ stock semantics
>
> $ python3 ~/ann_test.py 1
> code size: 121011
> unmarshal: avg: 0.33605549649801103 +/- 0.007382938279889738
> exec: avg: 0.395090194279328 +/- 0.001004608380122509
>
> # Python 3.9 w/ PEP 563 semantics
>
> $ python3 ~/ann_test.py 2
> code size: 121070
> unmarshal: avg: 0.3407619891455397 +/- 0.0011833618746421965
> exec: avg: 0.24590165729168803 +/- 0.0003123404336687428
>
> # master branch w/ PEP 563 semantics
>
> $ ./python ~/ann_test.py 2
> code size: 149086
> unmarshal: avg: 0.45410854648798704 +/- 0.00107521956753799
> exec: avg: 0.11281821667216718 +/- 0.00011939747308270317
>
> # master branch + optimization (*) w/ PEP 563 semantics
> $ ./python ~/ann_test.py 2
> code size: 110488
> unmarshal: avg: 0.3184352931333706 +/- 0.0015278719180908732
> exec: avg: 0.11042822999879717 +/- 0.00018108884723599264
>
> # co_annotations reference implementation w/ PEP 649 semantics
>
> $ ./python ~/ann_test.py 3
> code size: 229679
> unmarshal: avg: 0.6402394526172429 +/- 0.0006400500128250688
> exec: avg: 0.09774857209995388 +/- 9.275466265195788e-05
>
> # co_annotations reference implementation + optimization (*) w/ PEP 649 semantics
>
> $ ./python ~/ann_test.py 3
> code size: 204963
> unmarshal: avg: 0.5824743471574039 +/- 0.007219086642131638
> exec: avg: 0.09641968684736639 +/- 0.0001416784753249878
> ```
>
> (*) I found that constant folding creates a new tuple every time, even though
> the same tuple is already in the constant table.
> See https://github.com/python/cpython/pull/25419
> For co_annotations, I cherry-picked
> https://github.com/python/cpython/pull/23056 too.
>
>
> --
> Inada Naoki  



-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JIZ6PCV5SSIL7BUKZUCVF45OFNO4H26I/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-14 Thread Inada Naoki
I created a simple benchmark:
https://gist.github.com/methane/abb509e5f781cc4a103cc450e1e7925d

This benchmark creates 1000 annotated functions and measures the time to
load and exec them.
Here is the result. All interpreters are built without --with-pydebug,
--enable-optimizations, and --with-lto.

```
# Python 3.9 w/ stock semantics

$ python3 ~/ann_test.py 1
code size: 121011
unmarshal: avg: 0.33605549649801103 +/- 0.007382938279889738
exec: avg: 0.395090194279328 +/- 0.001004608380122509

# Python 3.9 w/ PEP 563 semantics

$ python3 ~/ann_test.py 2
code size: 121070
unmarshal: avg: 0.3407619891455397 +/- 0.0011833618746421965
exec: avg: 0.24590165729168803 +/- 0.0003123404336687428

# master branch w/ PEP 563 semantics

$ ./python ~/ann_test.py 2
code size: 149086
unmarshal: avg: 0.45410854648798704 +/- 0.00107521956753799
exec: avg: 0.11281821667216718 +/- 0.00011939747308270317

# master branch + optimization (*) w/ PEP 563 semantics
$ ./python ~/ann_test.py 2
code size: 110488
unmarshal: avg: 0.3184352931333706 +/- 0.0015278719180908732
exec: avg: 0.11042822999879717 +/- 0.00018108884723599264

# co_annotations reference implementation w/ PEP 649 semantics

$ ./python ~/ann_test.py 3
code size: 229679
unmarshal: avg: 0.6402394526172429 +/- 0.0006400500128250688
exec: avg: 0.09774857209995388 +/- 9.275466265195788e-05

# co_annotations reference implementation + optimization (*) w/ PEP 649 semantics

$ ./python ~/ann_test.py 3
code size: 204963
unmarshal: avg: 0.5824743471574039 +/- 0.007219086642131638
exec: avg: 0.09641968684736639 +/- 0.0001416784753249878
```

(*) I found that constant folding creates a new tuple every time, even though
the same tuple is already in the constant table.
See https://github.com/python/cpython/pull/25419
For co_annotations, I cherry-picked
https://github.com/python/cpython/pull/23056 too.


-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JEP4MXAHCVGQF7AI5OUUSGOENXAOR43O/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-13 Thread Inada Naoki
On Wed, Apr 14, 2021 at 10:44 AM Larry Hastings  wrote:
>
>
> On 4/13/21 1:52 PM, Guido van Rossum wrote:
>
>
> Because typing is, to many folks, a Really Important Concept, and it's 
> confusing to use the same syntax ("x: blah blah") for different purposes, in 
> a way that makes it hard to tell whether a particular "blah blah" is meant as 
> a type or as something else -- because you have to know what's introspecting 
> the annotations before you can tell. And that introspection could be 
> signalled by a magical decorator, but it could also be implicit: maybe you 
> have a driver that calls a function based on a CLI entry point name, and 
> introspects that function even if it's not decorated.
>
>
> I'm not sure I understand your point.  Are you saying that we need to take 
> away the general-purpose functionality of annotations, that's been in the 
> language since 3.0, and restrict annotations to just type hints... because 
> otherwise an annotation might not be used for a type hint, and then the 
> programmer would have to figure out what it means?  We need to take away the 
> functionality from all other use cases in order to lend clarity to one use 
> case?
>

I don't think we need to take away "general purpose functionality".
But if we define type hinting as the first-class use case of annotations,
annotations should be optimized for type hinting. General-purpose use
cases should accept some limitations and overhead.

On the other hand, if we decide general-purpose functionality is
first-class too, we shouldn't make annotation syntax different from Python
syntax.

But annotations should be optimized for type hinting anyway. General-purpose
use cases appear only in a limited part of an application. On the
other hand, type hints can be used almost everywhere in an application's
code base. They must be cheap enough.

Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UGFWTZUGH6QZRHF3FKTQHZLYG2ZNX5EG/


[Python-Dev] Making staticmethod callable, any opposition?

2021-04-13 Thread Inada Naoki
Hi, all.

I am implementing PEP 597. During review, Victor suggested deprecating
`OpenWrapper`. `OpenWrapper` is defined only for compatibility between
the C function and the Python function:

```
from _pyio import open as py_open
from _io import open as c_open

class C:
    py_open = py_open
    c_open = c_open

C().c_open("README.rst")  # works
# TypeError: expected str, bytes or os.PathLike object, not C
C().py_open("README.rst")
```

So the builtin open is not io.open but io.OpenWrapper in Python 3.9.
Making staticmethod callable fixes this issue.

```
@staticmethod
def open(file, mode="r", buffering=-1, encoding=None, errors=None,
         newline=None, closefd=True, opener=None):
    ...
```

Now the open defined in Python behaves like the C function. We don't need
OpenWrapper anymore.
This has already been committed with Guido's approval. staticmethod is
callable, and OpenWrapper is just a deprecated alias of open in the
master branch.
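To make the difference concrete, here is a small sketch of the behavior being discussed. `func`, `Plain`, and `Wrapped` are made-up names for illustration; the direct call to the staticmethod object is guarded because it is only expected to work on interpreters that include this change (Python 3.10 and later).

```python
import sys


def func():
    return "my func"


class Plain:
    method = func                # becomes a bound method on instances


class Wrapped:
    method = staticmethod(func)  # the descriptor hands back func unchanged


# Access through the class works in both cases.
assert Plain.method() == "my func"
assert Wrapped.method() == "my func"

# Access through an instance: the plain function receives the instance
# as an unexpected extra positional argument.
try:
    Plain().method()
    plain_instance_ok = True
except TypeError:
    plain_instance_ok = False

wrapped_instance_result = Wrapped().method()

# Calling the staticmethod object itself needs staticmethod.__call__().
sm = staticmethod(func)
direct_call_ok = callable(sm)
if direct_call_ok:
    direct_result = sm()

print(plain_instance_ok, wrapped_instance_result, direct_call_ok)
```

With a callable staticmethod, one decorated function can be used as a module-level function and as a class attribute without a wrapper class such as OpenWrapper.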

But Mark Shannon said we shouldn't make such a change without
discussing it on python-dev.
I don't know whether we *should*, but I agree that it is *ideal*.

So, does anyone oppose this change?

Historically, this idea was rejected once. bpo-20309 proposed
making classmethod and staticmethod callable.
https://bugs.python.org/issue20309

It was rejected with these comments:

"I don't agree that this is a bug that should be fixed.  It adds code
that will likely never get called or needed (i.e. there has never been
a request for this in the decade long history of desciptors and it
seems like a made up requirement to me.  "
https://bugs.python.org/issue20309#msg240843

"actually supporting this would mean adding code that would need to be
maintained indefinitely without providing a compensating practical
benefit,"
https://bugs.python.org/issue20309#msg240898

But the situation has changed now. We already have OpenWrapper. It proves
that a callable staticmethod is "called and needed".
Although there is only one use case, we can remove more code than we add.

staticmethod.__call__() is a simple C function.
https://github.com/python/cpython/pull/25117/files#diff-57bc77178b3d6f1010dd924722c87522f224d93bc341f0e46c0945094124d8f2

Victor has already removed the OpenWrapper class, and we can remove `DocDescriptor` too.
https://github.com/python/cpython/pull/25354/files#diff-bcdfa9cbb0764d7959cda48f9084d79785f87c5ad7460f27ba2678b0bda76e38R314-L327

I think the maintenance burden of staticmethod.__call__() is no higher
than that of OpenWrapper and DocDescriptor.
Additionally, if we have the same issue in another module, we can just use
staticmethod instead of copying OpenWrapper and DocDescriptor.

So it provides a "compensating practical benefit".


Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MGQMXSMQIPI5ZKS2T5YHNM47PPWSSRD5/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-12 Thread Inada Naoki
On Tue, Apr 13, 2021 at 11:18 AM Paul Bryan  wrote:
>
> On Tue, 2021-04-13 at 10:47 +0900, Inada Naoki wrote:
>
> On Tue, Apr 13, 2021 at 9:57 AM Larry Hastings  wrote:
>
>
> This is really the heart of the debate over PEP 649 vs PEP 563.  If you 
> examine an annotation, and it references an undefined symbol, should that 
> throw an error?  There is definitely a contingent of people who say "no, 
> that's inconvenient for us".  I think it should raise an error.  Again from 
> the Zen: "Special cases aren't special enough to break the rules."  
> Annotations are expressions, and if evaluating an expression fails because of 
> an undefined name, it should raise a NameError.
>
>
> I agree that this is the heart of the debate. I think "annotations are
> for type hints". They are for:
>
> * Static type checkers
> * Documentation
>
>
> + dynamic type validation, encoding and decoding (Pydantic, FastAPI, Fondat, 
> et al.)
>
> Paul
>

OK. That is an important use case too.

But such use cases don't justify raising NameError instead of returning
stringified type hints for documentation use cases.

On the other hand, if "dynamic type validation" is used heavily, eval()
performance can be a problem.
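A rough way to see the eval() cost is to compare reading stringified annotations with evaluating them through `typing.get_type_hints()`. This is only a sketch: the function `f` and its annotations are made up for illustration, and the timings will vary by Python version and machine.

```python
import timeit
import typing


# Manually quoted annotations stand in for PEP 563 stringified annotations.
def f(a: "typing.List[int]", b: "typing.Dict[str, int]") -> "typing.Optional[str]":
    pass


namespace = {"typing": typing}

# Reading the raw strings does not evaluate anything.
read_strings = timeit.timeit(lambda: f.__annotations__, number=1000)

# get_type_hints() has to evaluate every string on each call.
evaluate = timeit.timeit(
    lambda: typing.get_type_hints(f, globalns=namespace), number=1000
)

hints = typing.get_type_hints(f, globalns=namespace)
print(f"read strings: {read_strings:.6f}s, evaluate: {evaluate:.6f}s")
```

Libraries that evaluate hints at runtime could cache the result per function or class, which would limit the cost to the first access.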

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ICKGBL674MUOYBOGNGKKJDPHYD3TYGER/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-12 Thread Inada Naoki
On Tue, Apr 13, 2021 at 9:57 AM Larry Hastings  wrote:
>
>
> On 4/12/21 4:50 PM, Inada Naoki wrote:
>
> PEP 563 solves all problems relating to types not accessible in runtime.
> There are many reasons users can not get types used in annotations at runtime:
>
> * To avoid circular import
> * Types defined only in pyi files
> * Optional dependency that is slow to import or hard to install
>
> It only "solves" these problems if you leave the annotation as a string.  If 
> PEP 563 is active, but you then use typing.get_type_hints() to examine the 
> actual Python value of the annotation, all of these examples will fail with a 
> NameError.  So, in this case, "solves the problem" is a positive way of 
> saying "hides a runtime error".
>

Of course, "getting a type that is unavailable at runtime" is an unsolvable
problem. PEP 649 doesn't solve it either. The author needs to quote the hint
manually, and `typing.get_type_hints()` raises NameError too.
And if the author forgets to quote it, users cannot get any type hints.
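The PEP 563 side of this can be checked with a short sketch. `fakemod` and `FakeType` are the placeholder names used in this thread; the `exec(compile(...))` wrapper is only there so that `from __future__ import annotations` applies to the snippet regardless of where it is pasted.

```python
import typing

SOURCE = '''
from __future__ import annotations

if 0:
    from fakemod import FakeType  # placeholder module, never imported

def f(a: FakeType) -> None:
    pass
'''

namespace = {}
exec(compile(SOURCE, "<pep563_demo>", "exec"), namespace)
f = namespace["f"]

# The stringified annotations are available without evaluating anything.
annotations = f.__annotations__

# Evaluating them fails, because FakeType does not exist at runtime.
try:
    typing.get_type_hints(f)
    resolved = True
except NameError:
    resolved = False

print(annotations, resolved)
```

So documentation tools can still read the strings, while tools that need real type objects get the NameError.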


> I don't know what the use cases are for examining type hints at runtime, so I 
> can't speak as to how convenient or inconvenient it is to deal with them 
> strictly as strings.  But it seems to me that examining annotations as their 
> actual Python values would be preferable.
>

These are use cases for examining type hints at runtime where stringified
hints are OK:

* Sphinx autodoc
* help()
* IPython and other REPLs showing type hints in a popup.


>
> ```
> from dataclasses import dataclass
>
> if 0:
>     from fakemod import FakeType
>
> @dataclass
> class C:
>     a: FakeType = 0
> ```
>
> This works on PEP 563 semantics (Python 3.10a7). Users can get the
> stringified annotation.
>
> With stock semantics, it causes a NameError on import, so the author can
> notice they need to quote "FakeType".
>
> With PEP 649 semantics, the author may not notice that this annotation
> causes an error. Users cannot get any type hints at runtime.
>
> Again, by "works on PEP 563 semantics", you mean "doesn't raise an error".  
> But the code has an error.  It's just that it has been hidden by PEP 563 
> semantics.
>
> I don't agree that changing Python to automatically hide errors is an 
> improvement.  As the Zen says: "Errors should never pass silently."
>
> This is really the heart of the debate over PEP 649 vs PEP 563.  If you 
> examine an annotation, and it references an undefined symbol, should that 
> throw an error?  There is definitely a contingent of people who say "no, 
> that's inconvenient for us".  I think it should raise an error.  Again from 
> the Zen: "Special cases aren't special enough to break the rules."  
> Annotations are expressions, and if evaluating an expression fails because of 
> an undefined name, it should raise a NameError.
>

I agree that this is the heart of the debate. I think "annotations are
for type hints". They are for:

* Static type checkers
* Documentation

So I don't think the `if TYPE_CHECKING` idiom violates the Zen of Python.


Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DAMYYG5CY6MGRCAKIWRA4IKXW75DYXW6/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-12 Thread Inada Naoki
On Tue, Apr 13, 2021 at 8:58 AM Larry Hastings  wrote:
>
> On 4/12/21 4:50 PM, Inada Naoki wrote:
>
> As PEP 597 says, eval() is slow. But it can avoidable in many cases
> with PEP 563 semantics.
>
> PEP 597 is "Add optional EncodingWarning".  You said PEP 597 in one other 
> place too.  Did you mean PEP 649 in both places?
>

You're right. I meant PEP 649 vs PEP 563. I'm sorry.

>
> Cheers,
>
>
> /arry



-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CLTMVQS7PMLEM237IAY4WCXC7M5DL7T6/


[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

2021-04-12 Thread Inada Naoki
I still prefer PEP 563.
I will describe what we lose if PEP 649 is accepted and PEP 563 is rejected.

### Types not accessible at runtime

First of all, PEP 563 solves more than forward references.

Note that PEP 563 says: "we'll call any name imported or defined
within a `if TYPE_CHECKING:` block a forward reference, too."
https://www.python.org/dev/peps/pep-0563/#forward-references

PEP 563 solves all problems related to types not accessible at runtime.
There are many reasons users cannot get the types used in annotations at runtime:

* To avoid circular imports
* Types defined only in pyi files
* Optional dependencies that are slow to import or hard to install

This is the clearest point where PEP 563 is better for some users.
See this example:

```
from dataclasses import dataclass

if 0:
    from fakemod import FakeType

@dataclass
class C:
    a: FakeType = 0
```

This works on PEP 563 semantics (Python 3.10a7). Users can get the
stringified annotation.

With stock semantics, it causes a NameError on import, so the author can
notice they need to quote "FakeType".

With PEP 649 semantics, the author may not notice that this annotation
causes an error. Users cannot get any type hints at runtime.


### Type alias

Another PEP 563 benefit is that users can see a simple type alias.
Consider this example:

```
from typing import Dict, List, Set, Tuple, Union

AliasType = Union[List[Dict[Tuple[int, str], Set[int]]], Tuple[str, List[str]]]

def f() -> AliasType:
    pass

help(f)
```

Currently, help() calls `typing.get_type_hints()`. So it shows:

```
f() -> Union[List[Dict[Tuple[int, str], Set[int]]], Tuple[str, List[str]]]
```

But with PEP 563 semantics, we can stop evaluating annotations, and
users can see the more readable alias type.

```
f() -> AliasType
```

As PEP 649 says, eval() is slow. But it can be avoided in many cases
with PEP 563 semantics.
I am not sure, but I expect dataclasses can avoid eval() too with PEP 563 semantics.

Sphinx uses this feature already.
See 
https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#confval-autodoc_type_aliases
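The alias point can be shown with a short sketch under PEP 563 semantics. The `exec(compile(...))` wrapper is only there so that `from __future__ import annotations` applies to the snippet; how help() or Sphinx would render the result is not shown here.

```python
import typing

SOURCE = '''
from __future__ import annotations
from typing import Dict, List, Set, Tuple, Union

AliasType = Union[List[Dict[Tuple[int, str], Set[int]]], Tuple[str, List[str]]]

def f() -> AliasType:
    pass
'''

namespace = {}
exec(compile(SOURCE, "<alias_demo>", "exec"), namespace)
f = namespace["f"]

# The unevaluated annotation keeps the short alias name as written.
raw = f.__annotations__["return"]

# Evaluating it replaces the alias with the full Union[...] object.
evaluated = typing.get_type_hints(f)["return"]

print(raw)
print(evaluated)
```

A documentation tool that prints `raw` shows the name the author chose, while one that prints `evaluated` shows the expanded type.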


### Relaxing annotation syntax

As discussed in the PEP 647 thread, we can consider having a different
syntax for annotations with PEP 597 semantics.


Regards,

On Mon, Apr 12, 2021 at 10:58 AM Larry Hastings  wrote:
>
>
> Attached is my second draft of PEP 649.  The PEP and the prototype have both 
> seen a marked improvement since round 1 in January; PEP 649 now allows 
> annotations to refer to any variable they could see under stock semantics:
>
> * Local variables in the current function scope or in enclosing function
>   scopes become closures and use LOAD_DEREF.
> * Class variables in the current class scope are made available using a new
>   mechanism, in which the class dict is attached to the bound annotation
>   function, then loaded into f_locals when the annotation function is run,
>   thus permitting LOAD_NAME opcodes to function normally.
>
>
> I look forward to your comments,
>
>
> /arry
>
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/QSASX6PZ3LIIFIANHQQFS752BJYFUFPY/



-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KK5QG75RWSDBU4E36XAVSDPY5OUERA73/


[Python-Dev] Re: When we remove 'U' mode of open()?

2021-04-07 Thread Inada Naoki
On Thu, Apr 8, 2021 at 9:54 AM Inada Naoki  wrote:
>
> We are close to the 3.10 beta, and it is not an ideal time for removal.
> So my proposal is:
>
> * Remove 'U' in fileinput, because it makes my task a little simpler.
> * Remove 'U' in other places in Python 3.11, after the 3.10 branch is
> created (and the master branch is renamed to main).
>

I rejected bpo-36865, and created a pull request fixing bpo-5758 and
bpo-43712 without touching the `mode`.
There is no need to remove 'U' from fileinput soon either. We can remove
them all in Python 3.11.

https://bugs.python.org/issue36865
https://bugs.python.org/issue5758
https://bugs.python.org/issue43712
https://github.com/python/cpython/pull/25272


-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E62553ZLB3OBFNU3VMOLECIDKRHQ74UC/


[Python-Dev] Re: When we remove 'U' mode of open()?

2021-04-07 Thread Inada Naoki
We are close to the 3.10 beta, and it is not an ideal time for removal.
So my proposal is:

* Remove 'U' in fileinput, because it makes my task a little simpler.
* Remove 'U' in other places in Python 3.11, after the 3.10 branch is
created (and the master branch is renamed to main).

On Thu, Apr 8, 2021 at 5:45 AM Brett Cannon  wrote:
>
>
>
> On Wed, Apr 7, 2021 at 10:01 AM Serhiy Storchaka  wrote:
>>
>> 07.04.21 19:13, Victor Stinner wrote:
>> > Hi Inada-san,
>> >
>> > I'm +0 on removing again the flag, but I would prefer to not endorse
>> > the responsibility. I am already responsible for enough incompatible
>> > changes in Python 3.10 :-D
>> >
>> > Some context on this "U" open mode. The flag is accepted by many
>> > functions opening files. It is deprecated (emit DeprecationWarning)
>> > for 9 years (Python 3.3, 2012).
>>
>> It was silently deprecated before 3.3 (perhaps it was no-op since 3.0).
>>
>> I added DeprecationWarning with intention to remove this option in all
>> functions accepting it. The only non-trivial support of the "U" mode was
>> left in ZipFile.open(), and it was broken since beginning.
>
>
> I think at this point the DeprecationWarning has definitely been on long 
> enough, there was an explicit warning about it in Python 3.9, and 3.10 will 
> be nearly 2 years removed from 2.7 reaching EOL which is the only place where 
> "U" may still be used. So I think it's fine to drop "U" in 3.10.
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/VTROKN5UOU3EN6F3OLX5RUK7TVETAXKB/



-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NYVORKRSH562UMAXXLSJOOW5ECBA3HC5/


[Python-Dev] Re: When we remove 'U' mode of open()?

2021-04-07 Thread Inada Naoki
On Wed, Apr 7, 2021 at 11:29 PM Miro Hrončok  wrote:
>
> On 07. 04. 21 14:53, Inada Naoki wrote:
> > 'U' mode was removed once and resurrected.
> > https://bugs.python.org/issue39674
> >
> > As far as I can see, it is postponed to Python 3.10. Am I right?
> > Can we remove 'U' mode in Python 3.10?
>
> What is the benefit of doing it? Is the current compatibility layer to do
> nothing when "U" is passed difficult to maintain?
>

I am working on fileinput module:

* https://bugs.python.org/issue43712
* https://bugs.python.org/issue5758
* https://bugs.python.org/issue36865

It supports the deprecated 'U' mode for now, same as the builtin open().
When I read and write code and tests, I need to pay attention to
all allowed combinations of the mode string.

It is not difficult to maintain, but it has a significant support cost.
If we never remove it, the accumulated cost will be much higher
than users think.
It is technical debt.
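For code that still passes 'U', migration should be cheap, because plain text mode already gives universal newlines when `newline` is left at its default of None. The sketch below shows that; the file name and contents are made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mixed_newlines.txt")

    # Write Unix, Windows, and old Mac line endings in binary mode.
    with open(path, "wb") as f:
        f.write(b"unix\nwindows\r\nold mac\rlast")

    # Plain "r" mode: "\r\n" and "\r" are translated to "\n" on read,
    # so there is no need to pass "U".
    with open(path, "r", encoding="ascii") as f:
        lines = f.read().split("\n")

print(lines)
```

Replacing "rU" with "r" keeps this behavior. On interpreters where 'U' has been removed, passing it is rejected as an invalid mode.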

Regards,
-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YBT47H36E4IWMLOFATQEIJOTQEGZ4SQE/


[Python-Dev] When we remove 'U' mode of open()?

2021-04-07 Thread Inada Naoki
'U' mode was removed once and resurrected.
https://bugs.python.org/issue39674

As far as I can see, it is postponed to Python 3.10. Am I right?
Can we remove 'U' mode in Python 3.10?

Regards,
-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VM3GYZDHQCXSDHZCGDA5W7ZPBYRIJPGA/


[Python-Dev] Re: Weird io.OpenWrapper hack to use a function as method

2021-04-01 Thread Inada Naoki
On Thu, Apr 1, 2021 at 11:52 AM Brett Cannon  wrote:
>
> On Wed., Mar. 31, 2021, 18:56 Inada Naoki,  wrote:
>>
>> Do we need _pyio at all?
>> Does PyPy or any other Python implementation use it?
>
> https://www.python.org/dev/peps/pep-0399/ would suggest rolling back Python 
> support is something to avoid.
>

Thank you.
If we follow PEP 399, we need an easy way to keep the Python function
and the C function consistent.

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GWWJEK3ANQKJTXBJBBYM5T4XBTTFB3UU/


[Python-Dev] Re: Weird io.OpenWrapper hack to use a function as method

2021-03-31 Thread Inada Naoki
Do we need _pyio at all?
Does PyPy or any other Python implementation use it?

On Wed, Mar 31, 2021 at 9:36 PM Victor Stinner  wrote:
>
> Hi,
>
> The io module provides an open() function. It also provides an
> OpenWrapper which only exists to be able to store open as a method
> (class or instance method). In the _pyio module, pure Python
> implementation of the io module, OpenWrapper is implemented as:
>
> class OpenWrapper:
>     """Wrapper for builtins.open
>
>     Trick so that open won't become a bound method when stored
>     as a class variable (as dbm.dumb does).
>
>     See initstdio() in Python/pylifecycle.c.
>     """
>     def __new__(cls, *args, **kwargs):
>         return open(*args, **kwargs)
>
> I would like to remove this class which is causing troubles in the PEP
> 597 implementation, but I don't know how. Simplified problem:
> ---
> def func():
>     print("my func")
>
> class MyClass:
>     method = func
>
> func() # A
> MyClass.method() # B
> obj = MyClass()
> obj.method() # C
> ---
>
> With this syntax, A and B work, but C fails with TypeError: func()
> takes 0 positional arguments but 1 was given.
>
> If I decorate func() with @staticmethod, B and C work, but A fails
> with TypeError: 'staticmethod' object is not callable.
>
> Is OpenWrapper the only way to have a callable object which works in
> the 3 variants A, B and C?
>
> A, B and C work if MyClass is modified to use staticmethod:
>
> class MyClass:
>     method = staticmethod(func)
>
> Victor
> --
> Night gathers, and now my watch begins. It shall not end until my death.
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/QZ7SFW3IW3S2C5RMRJZOOUFSHHUINNME/



-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BOSZENKZRZCTIYWDRBRLWT4GKHWGDLWP/


[Python-Dev] Re: pth file encoding

2021-03-17 Thread Inada Naoki
On Wed, Mar 17, 2021 at 5:33 PM Paul Moore  wrote:
>
> On Wed, 17 Mar 2021 at 08:13, Michał Górny  wrote:
> >
> > On Wed, 2021-03-17 at 13:55 +0900, Inada Naoki wrote:
> > > OK. setuptools doesn't specify encoding at all. So locale-specific
> > > encoding is used.
> > > We can not fix it in short term.
> >
> > How about writing paths as bytestrings in the long term?  I think this
> > should eliminate the necessity of knowing the correct encoding for
> > the filesystem.
>
> If I have a path in my Python program that is "a£b" (a unicode string)
> and I want to write it to a .pth file, what encoding should I use to
> "write it as a bytestring"? I don't understand what you're trying to
> suggest here.
> Paul

On Windows, it must be UTF-8. For example, we use `chcp 65001` in
`activate.bat` to support Unicode paths.
On Unix, a raw path is a bytestring, so paths can be written as-is. Python
decodes them with the filesystem encoding.

So I think this is the ideal solution. But it requires
platform-specific code in site.py.
I don't think pth files are important enough to justify this complexity.

The second-best idea is using UTF-8. It is the best encoding for Windows,
and most Unix systems use UTF-8 too.
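As a sketch of what the UTF-8 option would mean for a tool that writes .pth files: the writer passes the encoding explicitly instead of relying on the locale. The file name `example.pth` is made up, and the path entry is the one from Paul's question. This only illustrates the writer side; it does not change how site.py reads the file.

```python
import os
import tempfile

entry = "a£b"  # the non-ASCII path from Paul's question

with tempfile.TemporaryDirectory() as tmp:
    pth = os.path.join(tmp, "example.pth")

    # Write with an explicit encoding instead of the locale default.
    with open(pth, "w", encoding="utf-8", newline="\n") as f:
        f.write(entry + "\n")

    # The bytes on disk are the same on every platform and locale.
    with open(pth, "rb") as f:
        raw = f.read()

    # A reader that also assumes UTF-8 gets the original path back.
    with open(pth, encoding="utf-8") as f:
        decoded = f.read().rstrip("\n")

print(raw, decoded)
```

A reader that decodes the same bytes with a different locale encoding could get a different string, which is the mismatch this thread is about.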

Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NWBYQHLUIIWU2U2MX4KZXJH4PBTNJYAW/


[Python-Dev] Re: pth file encoding

2021-03-16 Thread Inada Naoki
OK. setuptools doesn't specify an encoding at all, so the locale-specific
encoding is used.
We cannot fix it in the short term.

On Wed, Mar 17, 2021 at 4:56 AM Brett Cannon  wrote:
>
>
>
> On Mon, Mar 15, 2021 at 7:53 PM Inada Naoki  wrote:
>>
>> Hi, all.
>>
>> I found that .pth files are decoded with the default (i.e. locale-specific) encoding.
>> https://github.com/python/cpython/blob/0269ce87c9347542c54a653dd78b9f60bb9fa822/Lib/site.py#L173
>>
>> pth files contain:
>>
>> * import statements
>> * paths
>>
>> For import statements, UTF-8 is the default Python source encoding.
>> For paths, fsencoding is the right encoding. It is UTF-8 on Windows
>> (unless PYTHONLEGACYWINDOWSFSENCODING is set), and the locale-specific
>> encoding on Linux.
>>
>> What encoding should we use?
>>
>> * UTF-8
>> * sys.getfilesystemencoding()
>> * Keep status-quo.
>
>
> What are packaging tools like pip and setuptools writing .pth files out as?



-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/B5EWSS6GT5O4HBUJTMCKWKZMTC6U6VTV/


[Python-Dev] pth file encoding

2021-03-15 Thread Inada Naoki
Hi, all.

I found that .pth files are decoded with the default (i.e. locale-specific) encoding.
https://github.com/python/cpython/blob/0269ce87c9347542c54a653dd78b9f60bb9fa822/Lib/site.py#L173

pth files contain:

* import statements
* paths

For import statements, UTF-8 is the default Python source encoding.
For paths, fsencoding is the right encoding. It is UTF-8 on Windows
(unless PYTHONLEGACYWINDOWSFSENCODING is set), and the locale-specific
encoding on Linux.

What encoding should we use?

* UTF-8
* sys.getfilesystemencoding()
* Keep status-quo.
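To see how the three candidates differ on a given machine, they can be printed side by side. This is only an inspection sketch; the dictionary labels are made up, and `locale.getpreferredencoding(False)` is used here as an approximation of the locale-specific default.

```python
import locale
import sys

candidates = {
    "UTF-8": "utf-8",
    "sys.getfilesystemencoding()": sys.getfilesystemencoding(),
    "status quo (locale encoding)": locale.getpreferredencoding(False),
}

for label, encoding in candidates.items():
    print(f"{label}: {encoding}")
```

On many Linux systems all three are likely to report UTF-8, so the choice mostly matters on Windows and on non-UTF-8 locales.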

Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RKXH7QGIBC3UNCLGUSCLWLZX2WM6IGWW/


[Python-Dev] Re: Accepting PEP 624 (Remove Py_UNICODE encoder APIs)

2021-03-15 Thread Inada Naoki
Thank you, SC members, Victor, and Marc.

On Tue, Mar 16, 2021 at 3:49 AM Thomas Wouters  wrote:
>
>
> Hi Inada,
>
> Thank you for submitting PEP 624 (Remove Py_UNICODE encoder APIs). The 
> Steering Council is happy to accept it, but we do have two conditions. We 
> want to make sure that the documentation is clear on what is deprecated, and 
> when they are scheduled to be removed. For example, 
> PyUnicode_TransformDecimalToASCII is itself not currently marked as 
> deprecated (although the section header does mention the deprecation, that is 
> easy to miss), PyUnicode_TranslateCharmap is scheduled for removal in 4.0, 
> and PyUnicode_AsUnicode has two deprecation notices, one mentioning removal 
> in 3.10 and one in 3.12.
>
> We would also like to make sure users who need to migrate off of these APIs 
> have the information necessary to do so. The PEP currently lists alternatives 
> with caveats, and it's not immediately obvious from the PEP or the API 
> documentation what the right alternative is for those caveats. As a condition 
> of this PEP’s acceptance, we request that you fully document the recommended 
> workarounds for these caveats. We do recognise that PyUnicode_EncodeDecimal 
> is currently entirely undocumented. Documenting at this stage is probably not 
> worth the effort, but perhaps it could be mentioned in a brief ‘porting’ 
> section in the PEP instead.
>
> With the Python Steering Council's gratitude,
> Thomas.
> --
> Thomas Wouters 
>
> Hi! I'm an email virus! Think twice before sending your email to help me 
> spread!



-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6VOKLYSEHGPX3IIXGR2NBXN45LSBOFAM/


[Python-Dev] Re: [python-committers] Re: Accepting PEP 597 (Add optional EncodingWarning)

2021-03-15 Thread Inada Naoki
Thank you, Council members and everyone who joined the long discussion.


On Tue, Mar 16, 2021 at 8:29 AM Guido van Rossum  wrote:
>
>>
>> Once the whole stdlib and most of the top PyPI projects are fixed to
>> no longer emit EncodingWarning, it will become safer to opt in to
>> UTF-8 by default by enabling the Python UTF-8 Mode!
>> https://docs.python.org/dev/library/os.html#python-utf-8-mode
>>
>> One day, we will silently switch Python to UTF-8 by default, and
>> nobody will notice! ;-)
>
>
> In particular it's important that nobody living in Japan or China should 
> notice. This is also still the biggest challenge. :-(
>

Java has a very similar problem and proposal. See JEP 400 (*) that was
updated recently.
If JEP 400 is accepted, users can use `-Dfile.encoding=COMPAT` for
legacy behavior.
If UTF-8 mode is enabled by default, users can use `PYTHONUTF8=0` or
`-Xutf8=0` for legacy behavior.

(*) https://openjdk.java.net/jeps/400

Anyway, PEP 597 adds the `encoding="locale"` option. Let's implement it in
Python 3.10 and wait 4 years.
Many libraries will use only UTF-8, or can drop Python 3.9 support and
use `encoding="locale"` where the locale encoding is needed.

Regards,

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EMYJ6KVCRNM3ZGPY6BWBVXHYBXIDPNU7/


[Python-Dev] Re: Steering Council update for February

2021-03-09 Thread Inada Naoki
On Wed, Mar 10, 2021 at 10:10 AM Ivan Pozdeev via Python-Dev
 wrote:
>
> On 10.03.2021 3:53, Chris Angelico wrote:
> > On Wed, Mar 10, 2021 at 11:47 AM Damian Shaw
> >  wrote:
> >>> Does 'master' confuse people?
> >> There's a general movement to replace language from common programming 
> >> practises that derive from, or are associated with, the dehumanization of 
> >> people. Such as master and slave, as well as whitelist and blacklist.
> >>
> > Is that *actually* the origin of the term in this context, or is it
> > the "master", the pristine, the original from which copies are made?
> > There's no "slave" branch anywhere in the git repository.
>
> It is, actually, the ultimate origin of the term.
>
> A more immediate origin is the master-slave architecture (the master agent 
> initiates some operation and slave agents respond to it and/or
> carry it out).
>

Petr Baudis (who named the "master" branch) says its origin is "master
recording". So it is unrelated to master-slave.
https://twitter.com/xpasky/status/1272280760280637441

>
> Anyway, this is yet another SJW non-issue (countries other than US don't have 
> a modern history of slavery) so this change is a political
> statement rather than has any technical merit.
>

Yes. If we don't change the name, we will have to spend energy on the same
discussion every year.
That is not productive. Let's change the name and stop further discussion.


-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HXSURYDZJPFGPJ6G44RKOE6723BYPCVH/


[Python-Dev] Re: PEP 597 bikeshedding: envvar / option name.

2021-02-21 Thread Inada Naoki
Thank you, all.

I have finally submitted PEP 597 with PYTHONWARNDEFAULTENCODING /
warn_default_encoding.

On Mon, Feb 15, 2021 at 2:28 PM Inada Naoki  wrote:
>
> I am updating PEP 597 to include discussions in the thread.
> Before finishing the PEP, I need to decide the option name.
>
> I used PYTHONWARNDEFAULTENCODING (envvar name) /
> warn_default_encoding (-X option and sys.flags name) before, but it
> seems too long and hard to type, easy to misspell.
>
> Currently, I use PYTHONWARNENCODING / warn_encoding, but it is not so 
> explicit.
>
> Which name is the best balance between explicitness and readability?
>
> * PYTHONWARNENCODING / warn_encoding
> * PYTHONWARNOMITENCODING / warn_omit_encoding
> * PYTHONWARNDEFAULTENCODING / warn_default_encoding
> * Anything else
>
> Regards,
> --
> Inada Naoki  



-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AX3YE6AM7FVV44GNQLDAFMSBWRZFWR4B/


[Python-Dev] Re: PEP 597 bikeshedding: envvar / option name.

2021-02-19 Thread Inada Naoki
On Mon, Feb 15, 2021 at 3:00 PM Paul Bryan  wrote:
>
> Let the bikeshedding begin. How about with the underscores in place? More 
> readable to my eyes.
>

I agree with you. Although it is not consistent with many existing
option names, it is much more readable.

Ivan, Victor, what do you think about PYTHON_WARN_DEFAULT_ENCODING?

---
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TN4UE2WRKA3F6BBS7TYEYMBRSARYIUQM/


[Python-Dev] PEP 597 bikeshedding: envvar / option name.

2021-02-14 Thread Inada Naoki
I am updating PEP 597 to include discussions in the thread.
Before finishing the PEP, I need to decide the option name.

I used PYTHONWARNDEFAULTENCODING (envvar name) /
warn_default_encoding (-X option and sys.flags name) before, but it
seems too long and hard to type, easy to misspell.

Currently, I use PYTHONWARNENCODING / warn_encoding, but it is not so explicit.

Which name is the best balance between explicitness and readability?

* PYTHONWARNENCODING / warn_encoding
* PYTHONWARNOMITENCODING / warn_omit_encoding
* PYTHONWARNDEFAULTENCODING / warn_default_encoding
* Anything else

Regards,
-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JMTBJKNMO7AHDNXFSIB4EYQK33D2ODZD/


[Python-Dev] Re: PEP: Deferred Evaluation Of Annotations Using Descriptors

2021-02-14 Thread Inada Naoki
On Mon, Feb 15, 2021 at 10:20 AM Joseph Perez  wrote:
>
> > How about having a pseudo-module called __typing__ that is
> > ignored by the compiler:
> >
> > from __typing__ import ...
> >
> > would be compiled to a no-op, but recognised by type checkers.
>
> If you want to do run-time typing stuff, you would use
> There is already a way of doing that: `if typing.TYPE_CHECKING: ...` 
> https://docs.python.org/3/library/typing.html#typing.TYPE_CHECKING
> But yes, the issue with it is that this constant is defined in the `typing` 
> module …
>
> However, I think this is a part of the solution. Indeed, the language could 
> define another builtin constants, let's name it `__static__`, which would 
> simply be always false (at runtime), while linters/type checkers would use it 
> the same way `typing.TYPE_CHECKING` is used:
> ```python
> if __static__:
>     import typing
>     import expensive_module
> ```


Please note that this is a thread about PEP 649.

If PEP 649 is accepted and PEP 563 dies, all such idioms will break
annotations completely.

Users would need to import all the heavy modules and circular references used
only in type hints, or they could not get even the string form of annotations,
which is very useful for REPLs.
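
To make the concern concrete: under PEP 563 (`from __future__ import annotations`), an annotation that names a module imported only under `typing.TYPE_CHECKING` is still available at runtime in string form. The following is a minimal sketch under that assumption; `decimal` stands in for an expensive module, and the module source is compiled from a string only so that the `__future__` import sits at the top of its own compilation unit.

```python
import textwrap

# Module source kept in a string so the __future__ import comes first.
SOURCE = textwrap.dedent('''
    from __future__ import annotations   # PEP 563: annotations stay as strings
    from typing import TYPE_CHECKING

    if TYPE_CHECKING:                     # False at runtime, so nothing is imported
        from decimal import Decimal       # stand-in for an expensive module

    def price(amount: Decimal) -> Decimal:
        return amount
''')

namespace = {}
exec(compile(SOURCE, "<pep563-demo>", "exec"), namespace)

# Decimal was never imported, yet the string form of the annotation exists.
annotations = namespace["price"].__annotations__
print(annotations)
```

The string form is what a REPL or documentation tool can display without importing the module behind the name.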


-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GB5ZD2OQ5XALMZX46DK3HWVV7ROZJHH2/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-13 Thread Inada Naoki
To demonstrate how this warning is useful, I used my reference implementation.

When I tried `pip install`, I soon found these issues.

https://bugs.python.org/issue43214 (Open pth file with locale-encoding)
https://github.com/pypa/pip/pull/9608 (Not a real bug, but open JSON
file with locale-encoding)

And when creating a PR for pip, I found this issue in tox:

https://github.com/tox-dev/tox/issues/1908 (Open toml file with
locale-encoding, may not work on Windows)

Although most developers won't use this option, I and a few other
developers can put `export PYTHONWARNENCODING=1` in .bashrc and will
find many possible bugs that happen only on Windows, even if we
don't use Windows for daily development.

Isn't this option worthwhile?
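
For readers who want to try this, here is a minimal sketch that runs a child interpreter with the option enabled and collects its stderr. It assumes Python 3.10 or later and uses the names that were finally submitted with PEP 597 (`-X warn_default_encoding` / `PYTHONWARNDEFAULTENCODING`), not the `PYTHONWARNENCODING` spelling used in this message. `run_with_encoding_warning` is a hypothetical helper.

```python
import subprocess
import sys


def run_with_encoding_warning(code):
    """Run `code` in a child interpreter with the opt-in warning enabled.

    Returns the child's stderr, which is where warnings are printed.
    """
    result = subprocess.run(
        [sys.executable, "-X", "warn_default_encoding", "-c", code],
        capture_output=True,
        encoding="utf-8",      # decode the child's output explicitly
        errors="replace",
    )
    return result.stderr


# open() without an explicit encoding is what the option is meant to flag.
snippet = "import os; open(os.devnull).close()"
stderr_text = run_with_encoding_warning(snippet)
print(stderr_text)
```

On interpreters older than 3.10 the `-X` option is unknown and no warning is expected.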
-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/K7PVGEHDB3BXLNFZ6UWFJOKCC337UTWO/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-12 Thread Inada Naoki
On Sat, Feb 13, 2021 at 4:53 AM Jim J. Jewett  wrote:
>
> Offering encoding="locale" (or open.locale or ... ) instead of a long 
> function call using False (locale.getpreferredencoding(False)) seems like a 
> win for Explicit is Better Than Implicit.  It would then be possible to say 
> "yeah, locale really is what I meant".
>
> Err... unless the charset determination is so tricky that it ends up just 
> adding another not-quite-right near-but-not-exact-synonym.
>
> Adding a new Warning subclass, and maybe a new warning type, and maybe a new 
> environment variable, and maybe a new launch flag ... these all seem to risk 
> just making things more complicated without sufficient gain.
>
> Would a recipe for site-packages be sufficient, or does this need to run too 
> early in the bootstrapping process?
>
> -jJ

What does "a recipe for site-packages" mean?

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4ZOMEDEZ72SU7FDTTF5XUIPOA5SU72R6/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-11 Thread Inada Naoki
On Fri, Feb 12, 2021 at 12:45 PM Jim J. Jewett  wrote:
>
> On Thu, Feb 11, 2021 at 7:35 PM Inada Naoki  wrote:
>
> > The PEP helps developers working in a UTF-8 locale find missing
> > `encoding="utf-8"` bugs.
> > This type of bug is very common, and many Windows users suffer from
> > it when reading JSON, YAML, TOML, Markdown, or any other UTF-8
> > files.
>
> I think this is where we have been talking past each other.
>
> You seem to be assuming that the programmer knows the correct
> encoding, presumably because they (or their program) wrote it.

Not always, but many times.

>  You
> then assume that they neglected to mention the encoding out of
> forgetfulness, perhaps because on their system, everything is always
> UTF-8.  This clearly does happen, but the people who would make this
> mistake most often -- they probably wouldn't think to test their code
> under a special mode that catches only this.  (They might run a linter
> that looked for all sorts of problems, including this.)
>

Some Python experts can write `export PYTHONWARNENCODING=1` in their .bashrc.
They can find such mistakes not only in their own code but also in the
libraries they are using.
Since they are experts, they can understand the warning and report it
to the library author correctly.

So this option helps library authors even if they don't use it themselves.


> I instead assume that the programmer really doesn't know the encoding,
> because the file is supplied by the user.  (The user may not know
> either, since it is really supplied by some other program, but ...
> neither python nor the programmer knows for sure.)
>  In this case, the
> warning is not just a false alarm, but is actively misleading.
>
> -jJ

This option is opt-in.  People who don't understand what this warning
means should not opt in to it.

Regards,
-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KLYUYKLHWCTTK7HOYNPDRPRS6WIQQU7K/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-11 Thread Inada Naoki
On Fri, Feb 12, 2021 at 12:28 PM Jim J. Jewett  wrote:
>
> (I apologize if my summaries distort what Inada Naoki
>  explained.)
>
> He said that some people use the default None when they really want
> either UTF-8 or ASCII.

Yes. Even Python core developers.
For example: https://bugs.python.org/issue33684

This is just one example. I have seen a lot of code using the default
encoding to read JSON, YAML, TOML, Markdown, etc.


>
> My concern is that the warning will be a false alarm if they really do
> need whatever locale returns, and that case may still be common.  (If
> web browsers had stopped bothering to sniff for other charsets, then
> maybe that situation really was getting rare.)
>

That's one of the reasons why this warning is opt-in, like BytesWarning.

> I asked when encoding=None is actually different from encoding=locale,
> currently spelled encoding=locale.getpreferredencoding(False)
>

I don't understand this sentence. This PEP proposes
`encoding="locale"`, which is equivalent to `encoding=None` but doesn't emit
EncodingWarning.

There was a discussion about the difference between `encoding=None` and
`encoding=locale.getpreferredencoding(False)` in this thread.


> They can be different on Windows console, presumably because the
> environment settings that control locale may differ from the charset
> actually used by the console.  Even then, it only differs for open()
> when PYTHONLEGACYWINDOWSSTDIO is set, and for TextIOWrapper() When the
> file is not _WindowsConsoleIO
>
> To me, that sounds narrow enough to be a windows issue, rather than an
> issue with open.

Yes. So if users want to specify the locale-specific encoding and don't
want to drop Python 3.9 support, they can use
encoding=locale.getpreferredencoding(False).

But this PEP doesn't recommend it. Third party libraries can use
`encoding="locale"` after they drop Python 3.9 support.
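
A minimal sketch of how a library that still supports Python 3.9 might choose between the two spellings; `locale_encoding_argument` is a hypothetical helper name, not something the PEP defines.

```python
import locale
import sys


def locale_encoding_argument():
    """Value to pass as `encoding=` when the locale encoding is really wanted."""
    if sys.version_info >= (3, 10):
        return "locale"  # accepted by open() since Python 3.10
    # Older interpreters do not know "locale"; name the codec explicitly.
    return locale.getpreferredencoding(False)


def read_in_locale_encoding(path):
    # The explicit argument documents that the locale encoding is intended.
    with open(path, encoding=locale_encoding_argument()) as f:
        return f.read()
```

Once Python 3.9 support is dropped, the helper collapses to the literal `encoding="locale"`.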


>  Is there some way to write an encoding that sniffs
> for charsets, particularly on windows, and to use that as the default
> instead of assuming that locale will be correct?
>
> -jJ

There is no reliable way, AFAIK.

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LJASRUN5G2PYEUOT7H34LGGBYEHBUB3C/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-11 Thread Inada Naoki
On Thu, Feb 11, 2021 at 4:44 PM Jim J. Jewett  wrote:
>
> The PEP helps when the locale is ASCII or C, but that isn't enforced in 
> actual files.  I am confident that this is a frequent problem for packages 
> downloaded from mostly-English sites, including many software repositories.
>

The PEP helps developers working in a UTF-8 locale find missing
`encoding="utf-8"` bugs.
This type of bug is very common, and many Windows users suffer from
it when reading JSON, YAML, TOML, Markdown, or any other UTF-8
files.
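
As an illustration of the fix being discussed, a minimal sketch that reads and writes a JSON file with the encoding stated explicitly; the helper names are hypothetical.

```python
import json


def dump_json_utf8(path, obj):
    """Write JSON as UTF-8, keeping non-ASCII characters readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, ensure_ascii=False)


def load_json_utf8(path):
    """Read a JSON file as UTF-8 regardless of the platform's locale."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Without `encoding="utf-8"`, the same `open()` call would decode with the locale encoding, which is where a non-UTF-8 Windows code page causes the failures described above.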


> It does not seem to be a win when the locale is something incompatible with 
> utf-8, such as Latin-1, or whatever is still common in Japan.  The 
> surrogate-escape mechanism allows a proper round-trip, but python itself will 
> stop processing the characters correctly.
>

The surrogate-escape mechanism is not related to this PEP.


> For interactive use, when talking to another program (such as a terminal) 
> instead of an already existing file, the backwards compatibility problem 
> seems worse.
>

This PEP is 100% backward compatible.


> Changing the default to utf-8 (after a deprecation period showing how to make 
> locale an explicit default) may be reasonable, but claiming that it is 
> backwards compatible ... I didn't get that impression from the PEP.
>

This PEP doesn't propose to change the default encoding.

*If* we decide to change the default encoding in the future (maybe
2025 or later) and start emitting DeprecationWarning where the `encoding`
option is omitted, this PEP helps because:

* The `encoding="locale"` option can be used since Python 3.10, and
* The number of DeprecationWarnings shown is reduced because we can
add `encoding="utf-8"` in many places before then. At least, we can
fix all EncodingWarnings in the stdlib.

Maybe the "Prepare to change the default encoding to UTF-8" section is misleading.
I will try to fix or remove the section.

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JBBRBR6AUTGP2SAVAUJVZJ3GM6FJQEBV/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-11 Thread Inada Naoki
On Fri, Feb 12, 2021 at 5:18 AM Jim J. Jewett  wrote:
>
> Inada Naoki wrote:
>
> > Default encoding is used for:
>
> > a. Really need to use locale specific encoding
> > b. UTF-8 (bug. not work on Windows)
> > c. ASCII (not a bug, but slow on Windows)
>
> > I assume most usages are (b) and (c). This PEP can reduce them soon.
>
> Is this just an assumption, based on those times being visible to someone who 
> installs a lot of packages, or has the use of any locale other than UTF-8 and 
> ASCII really gone down a lot?  Have browsers stopped using charset sniffing?
>

Using "most" is my fault. I am not good at English. I should have used "many" here.
You can see many bugs caused by not specifying `encoding="utf-8"` on Q&A sites.
I wrote some numbers about these common bugs in the PEP.

UTF-8 is used by 96.3% of web sites [1], although browsers still use
charset sniffing. But how does that relate to this PEP?
[1] https://w3techs.com/technologies/details/en-utf8


> > Additionally, encoding="locale" will be backward/forward compatible
>
> What would be the problem with changing the default from None to locale?

It doesn't work on Python 3.9 and earlier.
So using `encoding="locale"` is not recommended until
the user drops Python 3.9 support.

> (I think you mentioned that they are the same 99% of the time; is that other 
> 1% likely to be cases where locale is wrong but None is right?  Would there 
> be a better way to represent that 1%?)
>

`encoding="locale"` and `encoding=None` have the same behavior, except that
`encoding="locale"` doesn't emit EncodingWarning even when the warning is
enabled.

There is little difference between `encoding=None` and
`encoding=locale.getpreferredencoding(False)`. They differ only:

* When Python is running on Windows, and
* When the file is a console, and
* (for open()) When PYTHONLEGACYWINDOWSSTDIO is set
* (for TextIOWrapper()) When the file is not _WindowsConsoleIO

encoding=None uses console codepage but
encoding=locale.getpreferredencoding(False) uses
Otherwise, encoding=None and
encoding=locale.getpreferredencoding(False) are same.

So `encoding=locale.getpreferredencoding(False)` can be used to
specify the locale-specific encoding explicitly.
But this PEP doesn't recommend it. This PEP recommends using
EncodingWarning just to find missing `encoding="utf-8"` (or any
other specific encoding).

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PD4BTBAQHFUYOCF5QKIBDIMHATPVEFPW/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-11 Thread Inada Naoki
On Fri, Feb 12, 2021 at 6:34 AM Paul Moore  wrote:
>
> On Thu, 11 Feb 2021 at 21:05, Jim J. Jewett  wrote:
> >
> > Who will benefit from this new warning?
> >
> > Is this basically just changing builtins.open by adding:
> >
> > if encoding is None and sys.flags.encoding_warning: # and not Android 
> > and not -X utf8 ?
> > warnings.warn(EncodingWarning("Are you sure you want locale instead 
> > of utf-8?"))
> >
> > Even for the few people with the knowledge, time, interest, and authority 
> > to fix the code, is that really helpful?
> >
> > Helpful enough to put it directly in python as an optional mode, separate 
> > from the dev mode or show all warnings mode?  Why not just add it to a 
> > linter, or write a 2to3 style checker?  Or at least emit or not based on a 
> > warnings filter?
>
> That's a very good point. If this warning is of use, why have none of
> the well-known linters implemented it? And why not prototype the
> proposal in them, at least? Python-ideas posts routinely get pushed to
> justify "why can't this be done in an external library?" and that
> probably applies here too.
>

* Linters cannot add `encoding="locale"` to Python.
* This PEP provides a way to shift where the warning is emitted.

import io

def my_read_file(filename, encoding=None):
    encoding = io.text_encoding(encoding)
    with open(filename, encoding=encoding) as f:
        return f.read()

This function itself is not warned about. The caller of this function is warned
instead. That is difficult to implement in a linter.
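
The mechanism behind this shift is the `stacklevel` argument of `warnings.warn()`. The sketch below imitates it with stand-in names so that it runs on any Python 3: `DefaultEncodingWarning` and `text_encoding_sketch` are hypothetical, and unlike the real `io.text_encoding()` the sketch always warns instead of checking the opt-in flag.

```python
import locale
import warnings


class DefaultEncodingWarning(Warning):
    """Stand-in for EncodingWarning so the sketch runs before Python 3.10."""


def text_encoding_sketch(encoding, stacklevel=2):
    """Simplified imitation of io.text_encoding(), for illustration only."""
    if encoding is None:
        # stacklevel + 1 skips this helper and the function that called it,
        # so the warning is attributed to that function's caller.
        warnings.warn(
            "'encoding' argument not specified",
            DefaultEncodingWarning,
            stacklevel + 1,
        )
        encoding = locale.getpreferredencoding(False)
    return encoding


def read_file_sketch(filename, encoding=None):
    encoding = text_encoding_sketch(encoding)
    with open(filename, encoding=encoding) as f:
        return f.read()
```

Calling `read_file_sketch(path)` produces one warning that points at the calling line, while `read_file_sketch(path, encoding="utf-8")` produces none.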

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CKMRUBEI3UHEXSELZIQBA6NZCK77O75T/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-11 Thread Inada Naoki
On Thu, Feb 11, 2021 at 4:44 PM Jim J. Jewett  wrote:
>
> I just reread PEP 597, then re-reread the Rationale.
>

Did you read the current PEP 597, or the old PEP 597 on discuss.python.org?


-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UKGDVMHUNNNRA4D4UCG4RLPZDIVKNNEY/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-10 Thread Inada Naoki
On Wed, Feb 10, 2021 at 11:58 PM Anders Munch  wrote:
>
> On Wed, Feb 10, 2021 at 1:46 AM Anders Munch  wrote:
> >> How about swapping around "locale" and None?
> Inada Naoki   wrote:
> >
> > I thought it, but it may not work. Consider about function like this:
> >
> > ```
> > def read_text(self, encoding=None):
> >     with open(self._filename, encoding=encoding) as f:
> >         return f.read()
> > ```
> >
> > If `encoding=None` suppresses the warning, functions like this would never be warned about.
>
> I don't see why they should be.  The author clearly knew about the encoding
> argument to open, they clearly intended for a None value to be given in some
> cases, and at the time of writing None meant to use a locale-dependent 
> encoding.
>

It is not clear. The author may just want to "use the same default encoding
as open()".
If so, the caller of the function should be warned. To warn the caller,
this function can use
`encoding=io.text_encoding(encoding)` as described in the PEP.


> > We are not discussing about changing default encoding for now.
>
> The section "Prepare to change the default encoding to UTF-8" gave me the
> impression that this was meant as a stepping stone on the way to doing just
> that.  If that was not the intention, my apologies for the misread.
>

This *can* be a stepping stone. But it is not the first goal. This PEP
doesn't discourage omitting the encoding option anytime soon when the user
really needs to use the locale encoding.

Default encoding is used for:

 a. Really need to use locale specific encoding
 b. UTF-8 (bug. not work on Windows)
 c. ASCII (not a bug, but slow on Windows)

I assume most usages are (b) and (c). This PEP can reduce them soon.

If we decide to change the default encoding in the future, we will need to
warn about omitting the encoding option. Reducing (b) and (c) will reduce the
total number of warnings shown in the future. This is what "Prepare" means.

Additionally, `encoding="locale"` will be a backward/forward compatible
way to use the locale-specific encoding when we decide to change the
default encoding.
So this PEP can be a very important stepping stone.

On the other hand, it is not a problem that we cannot use
`encoding="locale"` in backward-compatible code *for now*.
Python 3.9 becomes EOL in 2025. We won't emit warnings for the default
encoding until then.

People can use `encoding="locale"` after they drop Python 3.9 support.
No problem.

Regards,
-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DBDI5FEJCF2IOTSAS7VELO27MNEQMK2Z/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-10 Thread Inada Naoki
On Wed, Feb 10, 2021 at 11:14 PM Anders Munch  wrote:
>
> This program runs just fine on 3.8.7 Windows, against a file.txt that 
> contains latin-1 text:
>
> with open('file.txt', 'rt') as f:
>     print(f.read())
>
> But if I change it to this:
>
> with open('file.txt', 'rt', encoding='utf-8') as f:
>     print(f.read())
>
> then it fails with UnicodeDecodeError.   How is that backwards compatible?
>

There are several ways:

* encoding="latin1" -- This is the best. Works perfectly.
* Don't touch -- You don't need to enable EncodingWarning.
* encoding=locale.getpreferredencoding(False) -- Backward compatible.
But doesn't work if you enable UTF-8 mode.
* encoding="mbcs" -- Backward compatible. Works even when you enable
UTF-8 mode. But it works only on Windows.
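
The first option can be checked with a small self-contained sketch. It writes a Latin-1 file into a temporary directory (the file name and content are made up for the example) and reads it back with both encodings.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt")
    with open(path, "wb") as f:
        f.write("café".encode("latin-1"))  # the last byte is 0xE9

    # Naming the real encoding decodes the file on every platform.
    with open(path, "rt", encoding="latin-1") as f:
        text_latin1 = f.read()

    # Forcing UTF-8 onto Latin-1 bytes fails, as in the quoted report.
    try:
        with open(path, "rt", encoding="utf-8") as f:
            f.read()
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True
```

The point is that the warning asks for the file's actual encoding to be written down, not for `utf-8` to be added everywhere.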

Regards,

-- 
Inada Naoki  
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XPBVG5GU37UDQPDTZIFIGI2WOFYHYQBU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-10 Thread Inada Naoki
On Wed, Feb 10, 2021 at 5:00 PM Paul Moore  wrote:
>
> Let's just assume until you can convince me that setting UTF-8 mode
> globally is a good idea,

Oh, you misunderstood me. My proposal is not setting UTF-8 mode globally.
What I proposed is setting UTF-8 mode per env (e.g. installation,
venv, or conda env).

But this is off topic. The thread for this topic is here.
https://mail.python.org/archives/list/python-id...@python.org/thread/LQVK2UKPSOI2AHYFUWK6ZII2U6QKK6BP/

Regards,
-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZXPBI3WSZ6FCAWWKXNBRNKYXUXUG5FEH/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-09 Thread Inada Naoki
On Wed, Feb 10, 2021 at 5:50 AM Paul Moore  wrote:
>
> On Tue, 9 Feb 2021 at 16:54, Inada Naoki  wrote:
> >
> > On Tue, Feb 9, 2021 at 9:31 PM Paul Moore  wrote:
> > >
> > > Personally, I'm not at all keen on the idea of making users always
> > > specify encoding in the first place, even if it's "just for the
> > > transition".
> >
> > I agree with you. But as I wrote in the PEP, omitted encodings have
> > already caused a lot of trouble.
> > Windows users can not just `pip install somepkg` because some library
> > authors write `long_description=open("README.md").read()` in setup.py.
> >
> > I am trying to fix this situation by two parallel approaches:
> >
> > * (This PEP) Provide a tool for finding this type of bugs, and
> > recommend `encoding="utf-8"` for cross-platform library authors.
> > * (Another thread) Make UTF-8 mode more usable for Windows users,
> > especially students.
>
> Thanks for explaining (again). There's so much debate, across multiple
> proposals, that I can barely follow it. I'm impressed that you're
> managing to keep things straight at all :-)
>
> I guess my views on this PEP come down to
>
> * I see no harm in having a tool that helps developers spot
> platform-specific assumptions about encoding.
> * Realistically, I'd be surprised if developers actually use such a
> tool. If they were likely to do so, they could probably just as easily
> locate all the uses of open() in their code, and check that way. So
> I'm not sure this proposal is actually worth it, even if the end
> result would be very beneficial.
> * In the setup.py case, why don't those same Windows users complain
> that the library fails to install? A quick bug report, followed by a
> simple fix, seems more likely to happen than the developer suddenly
> deciding to scan their code for encoding issues.
>

Yes, some issues are solved already.
On the other hand, there are dozens of questions about UnicodeDecodeError
on Q&A sites like Stack Overflow.
Many people don't know what the error means or how to report it correctly.

I sometimes set PYTHONWARNINGS=default in my .bashrc, find
DeprecationWarnings in the libraries I am using, and report them.

On the other hand, it is difficult to find an omitted `encoding="utf-8"`,
because I use macOS and Linux in daily development.
If there were a PYTHONWARNENCODING option, I could write `export
PYTHONWARNENCODING=1` in my .bashrc.
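
As a rough sketch of that workflow: the option that eventually shipped in Python 3.10 is spelled `-X warn_default_encoding` (environment variable `PYTHONWARNDEFAULTENCODING`), not the `PYTHONWARNENCODING` name used in this message. The helper name `find_default_encoding_use` below is hypothetical; it runs a snippet in a child interpreter with the warning enabled and returns whatever was written to stderr.

```python
import subprocess
import sys


def find_default_encoding_use(snippet: str) -> str:
    """Run `snippet` in a child interpreter with EncodingWarning enabled.

    Returns the child's stderr, where any EncodingWarning is reported.
    Assumes Python 3.10+; older versions silently ignore the -X option.
    """
    result = subprocess.run(
        [sys.executable, "-X", "warn_default_encoding", "-c", snippet],
        capture_output=True,
        text=True,
        encoding="utf-8",
        errors="replace",
    )
    return result.stderr


if __name__ == "__main__":
    # open() without encoding= is the pattern the warning is meant to find.
    print(find_default_encoding_use("import os; open(os.devnull).close()"))
```

The same effect should be available for a whole shell session by exporting `PYTHONWARNDEFAULTENCODING=1`, which is closer to the .bashrc usage described above.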


> Regarding the wider question of UTF8 as default, my views can probably
> be summarised as follows:
>
> * If you want to write correct code to deal with encodings, there is
> no substitute for carefully considering every bytes/string conversion,
> deciding how you are going to identify the encoding to use, and then
> specifying that encoding explicitly. Default values for encodings have
> no place in such code.
> * In reality, though, that's far too much work for many situations.
> Default encodings are a necessary convenience, particularly for simple
> scripts, or for people who can't, or don't want to, do the analysis
> that the "correct" approach implies.

Yes. And UTF-8 is already the default encoding for s.encode().
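
A small illustration of that point (the sample string is arbitrary): `str.encode()` and `bytes.decode()` default to UTF-8 regardless of the locale, unlike `open()` without an `encoding` argument.

```python
text = "naïve café"

# Both calls produce the same bytes: UTF-8 is the default for str.encode().
encoded_default = text.encode()
encoded_utf8 = text.encode("utf-8")

# bytes.decode() also defaults to UTF-8, so the round trip is lossless.
round_tripped = encoded_default.decode()
```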

> * Picking the right default is *hard*. Changing the default is even
> harder, unfortunately.
> * I feel that we already have a number of mechanisms (PEPs 538 and
> 540) trying to tackle this issue. Adding yet more suggests to me that
> we'd be better off pausing and working out why we still have an issue.
> We should be moving towards *fewer* mechanisms, not more.
> * We have UTF-8 mode, and users can set it per-process (via flag or
> environment variable) per-user or per-site (by environment variable).
> I don't honestly believe that a user (whatever OS they work on) who is
> capable of writing Python code, can't be shown how to set an
> environment variable. I see no reason to suggest we need yet another
> way to set UTF-8 mode, or that a per-interpreter or per-virtualenv
> setting is particularly crucial (suggestions that have been made in
> the Python-Ideas threads).

Note that many Python users don't use consoles. They just start
Jupyter Notebook, or they just write a .py file and run it in a
Minecraft mod.

> * UTF-8 is likely to be the most appropriate default encoding for
> Python in the longer term, and I agree that Windows is fast
> approaching the point where a UTF-8 encoding is more appropriate than
> the ANSI codepage for "new stuff". But there's a lot of legacy files
> and applications around, and I suspect that a UTF-8 default will
> inconvenience a lot of people working with such data. But equally,
> such people may not be in a huge rush to switch to the latest Python
> version. Whichever way we go, thou

[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-09 Thread Inada Naoki
On Wed, Feb 10, 2021 at 1:46 AM Anders Munch  wrote:
>
>
> Inada Naoki   wrote:
> > This warning is opt-in warning like BytesWarning.
>
> What use is a warning that no-one sees?

At least I will see it.
We can fix the stdlib and its tests first, and then fix some major tools too.

After that, `encoding="locale"` will become backward/forward compatible at
some point.

> When the default is switched to encoding="utf8", it will break software, and 
> people need to be warned of that.
> UnicodeDecodeError's will abound when files that used to be read in a 
> single-byte encoding fails to decode as utf-8. All it takes is a single é.
> If the default encoding is ever to change, there's no way around a noisy 
> warning.
>

Please read the PEP and some of my posts in this thread.
We are not discussing changing the default encoding for now.

For now, this PEP provides a tool to find missing `encoding="utf-8"` bugs.
The goal of the PEP is to encourage `encoding="utf-8"` when the user
assumes the encoding is UTF-8.

If we decide to change the default encoding, EncodingWarning can be
used to discourage omitting the `encoding` option.
But that is out of scope for the PEP. We don't discourage omitting
the encoding option in Python 3.10.


> How about swapping around "locale" and None?  That is, make "locale" the new 
> default that emits a warning, and encoding=None emits no warning.  That has 
> the advantage that old code can be updated to say encoding=None, and then it 
> will work on both old and new Pythons without warning.
>

I thought about it, but it may not work. Consider a function like this:

```
def read_text(self, encoding=None):
    with open(self._filename, encoding=encoding) as f:
        return f.read()
```

If `encoding=None` suppressed the warning, functions like this would never
emit it.
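
For wrapper functions like this, the version of PEP 597 that was eventually accepted added an `io.text_encoding()` helper (Python 3.10+), which emits the warning on behalf of the wrapper's caller when `encoding` is `None`. A minimal sketch, in which the `TextFile` class is hypothetical and the `hasattr` guard is only there to keep it usable on Python 3.9 and older:

```python
import io


class TextFile:
    """Hypothetical wrapper that forwards `encoding` to open()."""

    def __init__(self, filename):
        self._filename = filename

    def read_text(self, encoding=None):
        if hasattr(io, "text_encoding"):
            # With the opt-in warning enabled and encoding=None, this emits
            # EncodingWarning attributed to the caller of read_text().
            encoding = io.text_encoding(encoding)
        with open(self._filename, encoding=encoding) as f:
            return f.read()
```

Callers that pass an explicit `encoding` are unaffected; only the `encoding=None` path is reported.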

So I think the current PEP is better.
If users want to use the locale encoding, they don't need to fix the
warning anytime soon. They can wait until they drop Python 3.9 support.
If they want to fix all warnings soon, they can use
`encoding=locale.getpreferredencoding(False)`.
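
A minimal sketch of that last option, assuming the file really is meant to be read in the locale encoding (the function name `read_locale_text` is hypothetical):

```python
import locale


def read_locale_text(path):
    """Read a file using the current locale encoding, stated explicitly."""
    # Passing the encoding explicitly means open() is never called with
    # encoding=None, so the opt-in warning has nothing to report here.
    encoding = locale.getpreferredencoding(False)
    with open(path, encoding=encoding) as f:
        return f.read()
```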

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4Q74PW673RMBMQTDZXHTVE6X7FT6DSAL/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-09 Thread Inada Naoki
On Tue, Feb 9, 2021 at 9:31 PM Paul Moore  wrote:
>
> Personally, I'm not at all keen on the idea of making users always
> specify encoding in the first place, even if it's "just for the
> transition".

I agree with you. But as I wrote in the PEP, omitting the encoding has
already caused a lot of trouble.
Windows users can not just `pip install somepkg` because some library
authors write `long_description=open("README.md").read()` in setup.py.

I am trying to fix this situation with two parallel approaches:

* (This PEP) Provide a tool for finding this type of bug, and
recommend `encoding="utf-8"` for cross-platform library authors.
* (Another thread) Make UTF-8 mode more usable for Windows users,
especially students.
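
For the setup.py case above, the recommended fix amounts to naming the encoding explicitly. A minimal sketch, assuming the README is stored as UTF-8 (the helper name `read_long_description` is made up for illustration):

```python
def read_long_description(path="README.md"):
    """Read the README as UTF-8 regardless of the platform's locale encoding."""
    with open(path, encoding="utf-8") as f:
        return f.read()
```

In setup.py this would be used as `long_description=read_long_description()`, which behaves the same on Windows, macOS, and Linux.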


> If we want to switch the default encoding from the locale encoding to
> UTF-8, we should find a way to do that which *doesn't* mean that
> there's a "transitional" state where using the default is considered
> bad practice. That helps no-one, and just adds confusion, which will
> last far longer than that one release (there will be people
> encountering StackOverflow questions on the topic long after the
> default has changed).
>
> Maybe we just have to accept that we can't work out what people are
> intending, and just state in advance in the documentation that the
> default will change, then it's documented as an upcoming breaking
> change that people can address (if they read the release notes, but we
> seem to be assuming they'll spot a warning, so why not assume they
> read the release notes, too?).
>

This PEP doesn't cover how to change the default encoding. So this is
slightly off topic.
I have two ideas for changing the default encoding:

(a) Regular deprecation period: emit EncodingWarning by default
(3.14 or later), and change the default encoding later (3.17 or
later).
(b) Enable UTF-8 mode by default on Windows. Users can disable UTF-8 mode
for backward compatibility.

Steve Dower was very strongly against (b). He suggested emitting
DeprecationWarning.
https://discuss.python.org/t/pep-597-enable-utf-8-mode-by-default-on-windows/3122/16

On the other hand, some core devs don't like emitting a warning for every
omitted `encoding` option.

So I don't have a strong opinion about which approach is better. I want
to see how EncodingWarning and UTF-8 mode are adopted.

I want to implement both EncodingWarning and a per-site UTF-8 mode
setting in Python 3.10.
In 5+ years, we will see which approach users have adopted.

* If EncodingWarning is widely adopted by many developers, we can
discuss approach (a).
* If UTF-8 mode becomes the best practice for Windows users, we can
discuss approach (b).

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/DY4OPCBKHHRJZMXEJ43MXPNXJ4EUS6MM/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-09 Thread Inada Naoki
On Wed, Feb 10, 2021 at 1:19 AM Paul Moore  wrote:
>
> But people who currently don't specify the encoding, and *don't* have
> any issue (because the system locale is correct) will be getting told
> to introduce a bug into their code, if they follow that advice :-(
>

This warning is an opt-in warning, like BytesWarning.

It will be a good tool for finding potential problems for people who know
what the problem is.
But it is not recommended for users who don't understand what the problem is.

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/SJKTVKW3DQCPRFRTGOUL73EI6BOGWDFF/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-09 Thread Inada Naoki
On Tue, Feb 9, 2021 at 7:23 PM Victor Stinner  wrote:
>
> I recall that something like 1 year ago, I basically tried to
> implement something like your PEP, to see if the stdlib calls open()
> without specifying an encoding. There were so many warnings, that the
> output was barely readable.
>
> The warning would only be useful if there is a way to modify the code
> to make the warning quiet (fix the issue) without losing support with
> Python 3.9 and older.
>
> I understand that open(filename) must be replaced with open(filename,
> encoding=("locale" if sys.version_info >= (3, 10) else None)) to make
> it backward and forward compatibility without emitting an
> EncodingWarning.

I think most of them must be replaced with encoding="ascii" or encoding="utf-8".

And encoding=locale.getpreferredencoding(False) is a backward/forward
compatible way.
There is a very small difference between encoding=None and
encoding=locale.getpreferredencoding(False),
but it is not a problem for most use cases.
Only applications that use PYTHONLEGACYWINDOWSSTDIO and open() for
console I/O are affected by the difference between them.
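
The version-check spelling from the quoted message can be written once and reused. This is only a sketch of that pattern; the constant and function names are hypothetical.

```python
import sys

# "locale" is only understood by open()/TextIOWrapper on Python 3.10+;
# on older versions encoding=None selects the locale encoding instead.
LOCALE_ENCODING = "locale" if sys.version_info >= (3, 10) else None


def open_with_locale_encoding(path, mode="r"):
    """Open a text file in the locale encoding without an implicit default."""
    return open(path, mode, encoding=LOCALE_ENCODING)
```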


> One issue is that some people may blindly copy/paste
> this code pattern without thinking if "locale" is the proper encoding.
>

Isn't it the same if the code pattern becomes `encoding=getattr(io,
"LOCALE_ENCODING", None)`,
or `encoding=locale.getpreferredencoding(False)`?

I think the only thing we can do is document the option like this:

"""
EncodingWarning is a warning to find a missing encoding="utf-8" option. It
is a common pitfall for many Windows users.
Don't try to fix the warnings if you need to use a locale-specific encoding.
"""

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/YLAC2WJZ2TX7I3I6TSWA4GWPP5NNETUH/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-06 Thread Inada Naoki
I sent a pull request: https://github.com/python/peps/pull/1799

* Add Backward/Forward Compatibility section
* Add How to teach this section
* Remove io.LOCALE_ENCODING constant


-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/TRIGYFRJSVSUWFQDYIUZI64BB4J323UN/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-06 Thread Inada Naoki
On Tue, Feb 2, 2021 at 1:40 PM Inada Naoki  wrote:
>
> On Tue, Feb 2, 2021 at 12:23 AM Victor Stinner  wrote:
> >
> >
> > > Add ``io.LOCALE_ENCODING = "locale"`` constant too. This constant can
> > > be used to avoid confusing ``LookupError: unknown encoding: locale``
> > > error when the code is run in old Python accidentally.
> >
> > I'm not sure that it is useful. I like a simple "locale" literal
> > string. If there is a constant in io, people may start to think that
> > it's specific and will add "import io" just to get the string
> > "locale".
> >
> > I don't think that we should care too much about the error message
> > raised by old Python versions.
> >
>
> This constant is not only for replacing the "locale" literal. As in the
> example code in the PEP, it can be used to test whether TextIOWrapper
> supports `encoding="locale"`.
>
> `open(fn, encoding=getattr(io, "LOCALE_ENCODING", None))` works both
> for Python 3.9 and older and for Python 3.10 and newer.
>

I changed my mind. Since there is no plan to change the default
encoding for now,
there is no need to encourage `encoding="locale"` soon.

Until users can drop Python 3.9 support, they can use EncodingWarning
only for finding missing `encoding="utf-8"` or `encoding="ascii"`.

I will remove io.LOCALE_ENCODING.

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4SRSQQXRLQSXG4RLZGXHFEFTTBVDKPWK/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-04 Thread Inada Naoki
On Tue, Feb 2, 2021 at 8:16 PM Victor Stinner  wrote:
>
> > > I understand that encoding=locale.get_locale_encoding() would be
> > > different from encoding="locale":
> > > encoding=locale.get_locale_encoding() doesn't call
> > > os.device_encoding(), right?
> > >
> >
> > Yes.
>
> Would it be useful to add a io.get_locale_encoding(fd)->str (maybe
> "get_default_encoding"?) function which gives the chosen encoding from
> a file descriptor, similar to open(fd, encoding="locale").encoding?
> The os.device_encoding() call is not obvious.
>

I don't think it's that useful. encoding=None is 99% the same as
encoding=locale.getpreferredencoding(False).

On Unix, os.device_encoding() just returns the locale encoding.
On Windows, os.device_encoding() is very unlikely to be used: open() uses
WindowsConsoleIO for the console unless PYTHONLEGACYWINDOWSSTDIO is set,
and the encoding for it is UTF-8.
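
The two inputs being compared here can be inspected directly. A minimal sketch (the function name `candidate_encodings` is hypothetical) that reports what `os.device_encoding()` and `locale.getpreferredencoding(False)` return for a given file descriptor:

```python
import locale
import os


def candidate_encodings(fd: int) -> dict:
    """Report the two encodings discussed above for a file descriptor."""
    return {
        # None unless fd is connected to a terminal.
        "device": os.device_encoding(fd),
        # The locale encoding, without calling setlocale().
        "locale": locale.getpreferredencoding(False),
    }
```

For a regular file the "device" entry is `None`, which is why the difference only matters for console I/O.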

And that's why I removed the detailed behavior from the PEP. It is too
detailed and almost unrelated to EncodingWarning.
I wrote a simple comment in this section instead.
https://www.python.org/dev/peps/pep-0597/#locale-is-not-a-codec-alias

>
> > > > Opt-in warning
> > > > --------------
> > > >
> > > > Although ``DeprecationWarning`` is suppressed by default, emitting
> > > > ``DeprecationWarning`` always when ``encoding`` option is omitted
> > > > would be too noisy.
> > >
> > > The PEP is not very clear. Does "-X warn_encoding" only emits the
> > > warning, or does it also display it by default? Does it add a warning
> > > filter for EncodingWarning?
> > >
> >
> > This section is not the spec. This section is the rationale for adding
> > EncodingWarning instead of using DeprecationWarning.
> >
> > As the spec says, EncodingWarning is a subclass of Warning, so it is
> > displayed by default. But it is not emitted by default.
> >
> > When -X encoding_warning (or -X warn_default_encoding) is used, the
> > warning is emitted and shown unless the user suppresses warnings.
>
> I understand that EncodingWarning is always displayed by default
> (default warning filters don't ignore it, whereas DeprecationWarning
> are ignored by default), but no warning is emitted by default. Ok,
> that makes sense. Maybe try to say it explicitly in the PEP.
>
>
> > This PEP doesn't have "backward compatibility" section because the PEP
> > doesn't break any backward compatibility.
>
> IMO it's a good thing to always have the section, just to say that you
> took time to think about backward compatibility ;-) The section can be
> empty, like just say "there is no incompatible change" ;-)
>
>
> > And if developers want to support Python ~3.9 and use -X
> > warn_default_encoding on 3.10, they need to write
> > `encoding=getattr(io, "LOCALE_ENCODING", None)`, as written in the
> > spec.
>
> Maybe repeat it in the Backward Compatibility section.
>
> It's important to provide a way to prevent the warning without losing
> the support for old Python versions.
>

will do.

>
> > > The main question is if it's possible to use encoding="locale" on
> > > Python 3.6-3.9 (maybe using some ugly hacks).
> >
> > No.
>
> Hum. To write code compatible with Python 3.9, I understand that
> encoding=None is the closest to encoding="locale".
>
> And I understand that encoding=getattr(io, "LOCALE_ENCODING", None) is
> backward and forward compatible ;-)
>
> Well, encoding=None will hopefully remain accepted with your PEP
> anyway for lazy developers ;-)
>

Yes. I don't think this warning will be enabled by default in the near future.
So developers can just use the option to find missing `encoding="utf-8"` bugs.
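
Whether the opt-in is active can be checked at runtime. The sketch below assumes the names that shipped in Python 3.10 (`EncodingWarning` and `sys.flags.warn_default_encoding`) and falls back gracefully on older versions; the function name is hypothetical.

```python
import builtins
import sys


def encoding_warning_status() -> dict:
    """Summarize whether EncodingWarning exists and whether it is enabled."""
    category = getattr(builtins, "EncodingWarning", None)  # None before 3.10
    return {
        "available": category is not None,
        # Set by -X warn_default_encoding or PYTHONWARNDEFAULTENCODING.
        "enabled": bool(getattr(sys.flags, "warn_default_encoding", 0)),
        # DeprecationWarning is hidden by the default filters; a direct
        # subclass of Warning is not.
        "hidden_like_deprecation": category is not None
        and issubclass(category, DeprecationWarning),
    }
```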


>
> > Oh, I'm sorry. I want to make it in 3.10.
>
> Since it doesn't change anything by default, the warning is only
> displayed when you opt-in for it, IMO Python 3.10 target is
> reasonable.
>
> Victor
> --
> Night gathers, and now my watch begins. It shall not end until my death.

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/FZ567UQIEKO5IIVSQPUFCSZJOZBMYD4D/


[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-04 Thread Inada Naoki
On Tue, Feb 2, 2021 at 8:40 PM Inada Naoki  wrote:
>
> On Tue, Feb 2, 2021 at 7:37 PM M.-A. Lemburg  wrote:
> >
> > BTW: I don't understand this comment:
> > "They are inefficient on platforms wchar_t* is UTF-16. It is because
> > built-in codecs supports only UCS-1, UCS-2, and UCS-4 input."
> >
> > Windows is one such platform. Java (indirectly) is another. They both
> > store UTF-16LE in those arrays and Python's codecs handle this just
> > fine.
> >
>
> I'm sorry that the section is not clear.
>
> For example, if wchar_t* is UCS4, ucs4_utf8_encoder() can encode
> wchar_t* into UTF-8.
>
> But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle
> surrogate escape.
> We need to use a temporary Unicode object. That is what "inefficient" means.
>
> I will elaborate on this in the section.
>

I updated the "Alternative Ideas" section of the PEP.
https://www.python.org/dev/peps/pep-0624/#alternative-ideas

They replace `Py_UNICODE*` with `PyObject*`, `Py_UCS4*`, and `wchar_t*`.
I explicitly noted that some codecs can bypass temporary Unicode objects:

"""
UTF-8, UTF-16, UTF-32 encoders support Py_UCS4 internally. So
PyUnicode_EncodeUTF8(), PyUnicode_EncodeUTF16(), and
PyUnicode_EncodeUTF32() can avoid creating a temporary Unicode
object.
"""

-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/AD7YKV33JAQXIXDTGUMH7UDSMQUEKVMG/


[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-02 Thread Inada Naoki
On Tue, Feb 2, 2021 at 9:40 PM Emily Bowman  wrote:
>
> On Tue, Feb 2, 2021 at 3:47 AM Inada Naoki  wrote:
>>
>> But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle
>> surrogate escape.
>> We need to use a temporary Unicode object. That is what "inefficient" means.
>
>
> Since real UCS-2 is effectively dead, maybe it should be flipped around: Make 
> UTF-16 be the efficient path and UCS-2 be the path that needs to round-trip 
> through Unicode. But I suppose that's out of scope for this PEP.
>
> -Em

Note that ucs2_utf8_encoder() is used only for encoding Python Unicode
objects for now.
A Unicode object is latin1, UCS2, or UCS4. It is never UTF-16.
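
The distinction can be seen from Python itself. This is only an illustration at the `str` level, not the C encoder; the helper names are made up. A `str` stores whole code points, so a character outside the BMP is one element, while its UTF-16 form is a surrogate pair, and a lone surrogate cannot be encoded to UTF-8.

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to store `text` as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def encodes_to_utf8(text: str) -> bool:
    """Return True if `text` can be encoded as strict UTF-8."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


emoji = "\U0001F600"        # one code point outside the BMP
lone_surrogate = "\ud83d"   # half of a UTF-16 surrogate pair
```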

So if we add UTF-16 support to ucs2_utf8_encoder(), it means
we need to add and maintain code only for PyUnicode_EncodeUTF8 (encoding
from wchar_t* into char*).

I don't think it is a good deal. As described in the PEP, the encoder APIs
are used very rarely.
We must not add any maintenance costs for them.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KDYTBQDA4UFE6XWYENOV32ZRTCTAYEPC/


[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-02 Thread Inada Naoki
On Tue, Feb 2, 2021 at 7:37 PM M.-A. Lemburg  wrote:
>
> >> That would keep extensions working after a recompile, since
> >> Py_UNICODE is already a typedef to wchar_t.
> >>
> >
> > That idea is written in the PEP already.
> > https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t
>
> Right and I think this is a more workable approach than removing
> APIs.
>
> BTW: I don't understand this comment:
> "They are inefficient on platforms wchar_t* is UTF-16. It is because
> built-in codecs supports only UCS-1, UCS-2, and UCS-4 input."
>
> Windows is one such platform. Java (indirectly) is another. They both
> store UTF-16LE in those arrays and Python's codecs handle this just
> fine.
>

I'm sorry that the section is not clear.

For example, if wchar_t* is UCS4, ucs4_utf8_encoder() can encode
wchar_t* into UTF-8.

But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle
surrogate escape.
We need to use a temporary Unicode object. That is what "inefficient" means.

I will elaborate on this in the section.
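
At the Python level, the two-step path through a temporary Unicode object (wide-character buffer to `str`, then `str` to `bytes`) looks roughly like the following. This is only an illustration using `ctypes`, with a hypothetical helper name; in C the equivalent would go through `PyUnicode_FromWideChar()` followed by an encoder such as `PyUnicode_AsUTF8String()`.

```python
import ctypes


def wide_buffer_to_utf8(wide_buffer) -> bytes:
    """Encode a ctypes wchar_t array to UTF-8 via a temporary str."""
    text = wide_buffer.value      # wchar_t[] -> temporary str object
    return text.encode("utf-8")   # str -> bytes


# create_unicode_buffer() allocates a NUL-terminated wchar_t array.
example_buffer = ctypes.create_unicode_buffer("héllo")
```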

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QUGBVLQNBFVNX25AEIL77WSFOHQES6LJ/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-01 Thread Inada Naoki
On Tue, Feb 2, 2021 at 12:23 AM Victor Stinner  wrote:
>
> Hi Inada-san,
>
> I followed the discussions on your different PEP and I like overall
> your latest PEP :-) I have some minor remarks.
>
> On Mon, Feb 1, 2021 at 6:55 AM Inada Naoki  wrote:
> > The warning is disabled by default. New ``-X warn_encoding``
> > command-line option and ``PYTHONWARNENCODING`` environment variable
> > are used to enable the warnings.
>
> Maybe "warn implicit encoding" or "warn omit encoding" (not sure if
> it's make sense written like that in english ;-)) would be more
> explicit.
>

Yes, it's explicit, so I used `PYTHONWARNDEFAULTENCODING` at first.
But I feel it's unreadable. That's why I shortened the option name.

I will wait for more feedback about the naming.

>
> > Options to enable the warning
> > -----------------------------
> >
> > ``-X warn_encoding`` option and the ``PYTHONWARNENCODING``
> > environment variable are added. They are used to enable the
> > ``EncodingWarning``.
> >
> > ``sys.flags.encoding_warning`` is also added. The flag represents
> > ``EncodingWarning`` is enabled.
>
> Nitpick: I would prefer using the same name for the -X option and the
> sys.flags attribute (ex: sys.flags.warn_encoding).
>

OK, I will change the flag name to match the option name.

>
> > ``encoding="locale"`` option
> > ----------------------------
> >
> > ``io.TextIOWrapper`` accepts ``encoding="locale"`` option. It means
> > same to current ``encoding=None``. But ``io.TextIOWrapper`` doesn't
> > emit ``EncodingWarning`` when ``encoding="locale"`` is specified.
>
> Can you please define if os.device_encoding(fd) is called if
> encoding="locale" is used? It seems so, so it's not obvious from the
> PEP.
>

OK.

>
> In Python 3.10, I added _locale._get_locale_encoding() function which
> is exactly what the encoding used by open() when no encoding is
> specified (encoding=None) and when os.device_encoding(fd) returns
> None. See _Py_GetLocaleEncoding() for the C implementation
> (Python/fileutils.c).
>
> Maybe we should add a public locale.get_locale_encoding() function? On
> Unix, this function uses nl_langinfo(CODESET) *without* setting
> LC_CTYPE locale to the user preferred locale.
>

I can not imagine any use case. Isn't it just confusing?


> I understand that encoding=locale.get_locale_encoding() would be
> different from encoding="locale":
> encoding=locale.get_locale_encoding() doesn't call
> os.device_encoding(), right?
>

Yes.

>
> Maybe the PEP should also explain (in a "How to teach this" section?)
> when encoding="locale" is better than a specific encoding, like
> encoding="utf-8" or encoding="cp1252". In my experience, it's mostly
> for the inter-operability which other applications which also use the
> current locale encoding.
>

This option is for experts who are publishing cross-platform
libraries, frameworks, etc.

For students, I am suggesting another idea that makes UTF-8 mode more accessible.

>
> > Add ``io.LOCALE_ENCODING = "locale"`` constant too. This constant can
> > be used to avoid confusing ``LookupError: unknown encoding: locale``
> > error when the code is run in old Python accidentally.
>
> I'm not sure that it is useful. I like a simple "locale" literal
> string. If there is a constant in io, people may start to think that
> it's specific and will add "import io" just to get the string
> "locale".
>
> I don't think that we should care too much about the error message
> raised by old Python versions.
>

This constant is not only for replacing the "locale" literal. As in the
example code in the PEP, it can be used to test whether TextIOWrapper
supports `encoding="locale"`.

`open(fn, encoding=getattr(io, "LOCALE_ENCODING", None))` works both
for Python 3.9 and older and for Python 3.10 and newer.


>
>
> > Opt-in warning
> > --------------
> >
> > Although ``DeprecationWarning`` is suppressed by default, emitting
> > ``DeprecationWarning`` always when ``encoding`` option is omitted
> > would be too noisy.
>
> The PEP is not very clear. Does "-X warn_encoding" only emits the
> warning, or does it also display it by default? Does it add a warning
> filter for EncodingWarning?
>

This section is not the spec. This section is the rationale for adding
EncodingWarning instead of using DeprecationWarning.

As the spec says, EncodingWarning is a subclass of Warning, so it is
displayed by default. But it is not emitted by default.

When -X encoding_warning (or -X warn_default_encoding) is used, the
warning is emitted and 

[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-01 Thread Inada Naoki
On Tue, Feb 2, 2021 at 4:28 AM Steve Dower  wrote:
>
>
> I'm not defending the choice of wchar_t over UTF-8 (but I can: most of
> these systems chose Unicode before UTF-8 was invented and never took the
> backwards-incompatible change because they were so popular), but if we
> want to pragmatically weigh the needs of our users above our desire for
> purity, then we should try and support both equally wherever possible.
>

Note that we don't have a direct "utf8 (char*) to Python bytes object"
encoder API.
If PEP 624 is accepted, utf8 and wchar_t* will be treated equally.

So please don't think PEP 624 neglects only wchar_t*.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZZLY6AFXYEQQ7PI6IXRNU3FWQ23MXPZU/


[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-01 Thread Inada Naoki
On Tue, Feb 2, 2021 at 12:43 AM M.-A. Lemburg  wrote:
>
> Hi Inada-san,
>
> thank you for adding some comments, but they are not really capturing
> what I think is missing:
>
> """
> Removing these APIs removes ability to use codec without temporary Unicode.
>
> Codecs can not encode Unicode buffer directly without temporary Unicode
> object since Python 3.3. All these APIs creates temporary Unicode object for
> now. So removing them doesn't reduce any abilities.
> """
>
> The point is that while the decoders allow going from a C object
> to a Python object directly, we are missing a way to do the same
> for the encoders, since the Python 3.3 change in the Unicode internals.
>
> At the very least, we should have such APIs for going from wchar_t*
> to a Python object.

We already have PyUnicode_FromWideChar(). So I assume you mean
"wchar_t* to Python bytes object".

>
> The alternatives you provide all require creating an intermediate
> Python object for this purpose. The APIs you want to remove do that
> as well, but that's not the point. The point is to expose the codecs'
> decode mechanism which is available in the C code, but currently
> not exposed via C APIs, e.g. ucs4lib_utf8_encode().
>
> It would be a breaking change, but those APIs in your list could
> simply be changed from using Py_UNICODE to using wchar_t instead
> and then interface directly to the internal functions we have for
> the encoders.
>

OK, I see codecs.h has three encoders.

* utf8_encode
* utf16_encode
* utf32_encode

But there are 13 encoders in my PEP:

PyUnicode_Encode()
PyUnicode_EncodeASCII()
PyUnicode_EncodeLatin1()
PyUnicode_EncodeUTF7()
PyUnicode_EncodeUTF8()
PyUnicode_EncodeUTF16()
PyUnicode_EncodeUTF32()
PyUnicode_EncodeUnicodeEscape()
PyUnicode_EncodeRawUnicodeEscape()
PyUnicode_EncodeCharmap()
PyUnicode_TranslateCharmap()
PyUnicode_EncodeDecimal()
PyUnicode_TransformDecimalToASCII()

Do you want to keep all of the encoders, or only those 3?


> That would keep extensions working after a recompile, since
> Py_UNICODE is already a typedef to wchar_t.
>

That idea is written in the PEP already.
https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/USUH2YDEXW64NQYGJPG2OOLEJS3NJLXG/

