[Python-Dev] Re: Compiling of ast.Module in Python 3.10 and co_firstlineno behavior

2022-02-17 Thread Gabriele
Hi Fabio

Does the actual function object get re-created as well during the
recompilation process that you have described? Perhaps it might help
to note that the __code__ attribute of a function object f can be
mutated and that f is hashable?
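
A minimal sketch of that point, using two made-up functions `f` and `g` and a plain dict standing in for a debugger-side cache (all three are illustrative assumptions, not code from IPython or debugpy). Assigning to `f.__code__` swaps the body while the function object, and therefore its hash and any dict entry keyed on it, stays the same:

```python
def f():
    return "old body"


def g():
    return "new body"


# Hypothetical cache keyed on the function object itself rather than on
# (co_firstlineno, co_name, co_filename).
cache = {f: "state saved by the debugger"}
identity_before = id(f)
hash_before = hash(f)

# Swap in a different code object; the function object is not re-created.
f.__code__ = g.__code__

print(f())                        # the new body runs
print(id(f) == identity_before)   # same object
print(hash(f) == hash_before)     # same hash
print(cache[f])                   # the cache entry is still reachable
```

Whether this helps depends on the answer to the question above: it only applies if the same function object survives the recompilation.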

Cheers,
Gabriele

On Thu, 17 Feb 2022 at 19:33, Fabio Zadrozny wrote:
>
>
> On Thu, 17 Feb 2022 at 16:05, Mark Shannon wrote:
>>
>> Hi Fabio,
>>
>> This happened as part of implementing PEP 626.
>> The previous behavior isn't very robust w.r.t doc strings and
>> compiler optimizations.
>>
>> OOI, why would you want to revert to the old behavior?
>>
>
> Hi Mark,
>
> The issue I'm facing is that IPython obtains the AST for a function to be
> executed and then goes through it node by node, compiling and executing
> each one.
>
> When running in the debugger, the debugger caches some information keyed
> on (co_firstlineno, co_name, co_filename) so that it is preserved across
> multiple calls to the same function. This works in general because each
> function in a given Python file has its own co_firstlineno. In this
> specific case, though, IPython takes a single function and recompiles it
> expression by expression, so every resulting code object has the same
> co_filename and the same co_name, and only co_firstlineno used to differ
> (because each statement is on a different line). With Python 3.10 this
> assumption fails, as even co_firstlineno is now the same...
>
> You can see the actual issues at: 
> https://github.com/microsoft/vscode-jupyter/issues/8803 / 
> https://github.com/ipython/ipykernel/issues/841/ 
> https://github.com/microsoft/debugpy/issues/844
>
> After tinkering a bit, it seems possible to create a new code object
> from an existing one with `code.replace` (re-assembling the
> co_lnotab/co_firstlineno), so I'm going to propose that as a fix to
> IPython. Still, I found it really strange that this changed in Python 3.10
> in the first place, as the old behavior seemed reasonable to me (i.e.: with
> the new behavior it's a bit strange that the user compiles something with a
> single statement on line 99 and yet the resulting code object has
> co_firstlineno == 1).
>
> -- note: I also couldn't find any mention of this in the changelog, so, I 
> thought this could've happened by mistake.
>
> Best regards,
>
> Fabio



-- 
"Egli è scritto in lingua matematica, e i caratteri son triangoli,
cerchi, ed altre figure
geometriche, senza i quali mezzi è impossibile a intenderne umanamente parola;
senza questi è un aggirarsi vanamente per un oscuro laberinto."

-- G. Galilei, Il saggiatore.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O5EEGEHE7G6UFTYO4UX7Y7QHZXA4ACYG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Compiling of ast.Module in Python 3.10 and co_firstlineno behavior

2022-02-17 Thread Fabio Zadrozny
On Thu, 17 Feb 2022 at 16:05, Mark Shannon wrote:

> Hi Fabio,
>
> This happened as part of implementing PEP 626.
> The previous behavior isn't very robust w.r.t doc strings and
> compiler optimizations.
>
> OOI, why would you want to revert to the old behavior?
>
>
Hi Mark,

The issue I'm facing is that IPython obtains the AST for a function to be
executed and then goes through it node by node, compiling and executing
each one.

When running in the debugger, the debugger caches some information keyed on
(co_firstlineno, co_name, co_filename) so that it is preserved across
multiple calls to the same function. This works in general because each
function in a given Python file has its own co_firstlineno. In this
specific case, though, IPython takes a single function and recompiles it
expression by expression, so every resulting code object has the same
co_filename and the same co_name, and only co_firstlineno used to differ
(because each statement is on a different line). With Python 3.10 this
assumption fails, as even co_firstlineno is now the same...
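
The collision can be reproduced with the standard library alone. The sketch below compiles each statement of a two-statement source as its own module, builds the `(co_firstlineno, co_name, co_filename)` key described above, and then derives the first real line from `dis.findlinestarts` as one possible extra key component. The helper names `cache_key` and `first_real_line` and the filename `<cell>` are made up for illustration; this is not how debugpy or IPython actually implement it.

```python
import ast
import dis

source = "\nprint(1)\n\nprint(2)\n"
tree = ast.parse(source, filename="<cell>")


def cache_key(code):
    # The key the debugger is described as using.
    return (code.co_firstlineno, code.co_name, code.co_filename)


def first_real_line(code):
    # Smallest line number attached to any instruction; None and 0 entries
    # (synthetic instructions on newer versions) are skipped.
    return min(line for _, line in dis.findlinestarts(code) if line)


codes = []
for stmt in tree.body:
    # Recompile one statement at a time, as in the scenario described above.
    single = ast.Module(body=[stmt], type_ignores=[])
    codes.append(compile(single, "<cell>", "exec"))

for code in codes:
    # On 3.10+ the plain keys are identical; the extended keys still differ.
    print(cache_key(code), "->", cache_key(code) + (first_real_line(code),))
```

Reading the line from the instruction table rather than from co_firstlineno is only one option; it is shown here because it needs nothing beyond the code object itself.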

You can see the actual issues at:
https://github.com/microsoft/vscode-jupyter/issues/8803 /
https://github.com/ipython/ipykernel/issues/841/
https://github.com/microsoft/debugpy/issues/844

After tinkering a bit, it seems possible to create a new code object
from an existing one with `code.replace` (re-assembling the
co_lnotab/co_firstlineno), so I'm going to propose that as a fix to
IPython. Still, I found it really strange that this changed in Python 3.10
in the first place, as the old behavior seemed reasonable to me (i.e.: with
the new behavior it's a bit strange that the user compiles something
with a single statement on line 99 and yet the resulting code object
has co_firstlineno == 1).

-- note: I also couldn't find any mention of this in the changelog, so, I
thought this could've happened by mistake.

Best regards,

Fabio
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/DVP4VK3BY4XDC6B6HSVPLJTPCQKISAPC/


[Python-Dev] Re: Compiling of ast.Module in Python 3.10 and co_firstlineno behavior

2022-02-17 Thread Mark Shannon

Hi Fabio,

This happened as part of implementing PEP 626.
The previous behavior isn't very robust w.r.t doc strings and
compiler optimizations.

OOI, why would you want to revert to the old behavior?

Cheers,
Mark.

On 17/02/2022 5:52 pm, Fabio Zadrozny wrote:

Hi all,

I'm running into an issue where the co_firstlineno behavior changed from
Python 3.9 to Python 3.10, and I was wondering whether this was intentional.

i.e.: whenever code is compiled in Python 3.10, `code.co_firstlineno` is
now always 1, whereas previously it was equal to the line of the first
statement.

Also, does anyone know if there is any way to restore the old behavior in
Python 3.10? I tried setting `module.lineno`, but it didn't make any
difference...

As an example, given the code below:

import dis
import sys
from ast import Module, PyCF_ONLY_AST

source = '''
print(1)

print(2)
'''

# Parse the source into an ast.Module instead of compiling it directly.
initial_module = compile(source, '', 'exec', PyCF_ONLY_AST, 1)

print(sys.version)

# Recompile each top-level statement as its own single-statement module.
for i in range(2):
    module = Module([initial_module.body[i]], [])
    module_code = compile(module, '', 'exec')
    print(' --> First lineno:', module_code.co_firstlineno)
    print(' --> Line starts :', list(lineno for offset, lineno in
                                     dis.findlinestarts(module_code)))
    print(' dis ---')
    dis.dis(module_code)



I have the following outputs for Python 3.9/Python 3.10:

3.9.6 (default, Jul 30 2021, 11:42:22) [MSC v.1916 64 bit (AMD64)]
 --> First lineno: 2
 --> Line starts : [2]
 dis ---
  2           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (1)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
 --> First lineno: 4
 --> Line starts : [4]
 dis ---
  4           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (2)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE



3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)]
 --> First lineno: 1
 --> Line starts : [2]
 dis ---
  2           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (1)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
 --> First lineno: 1
 --> Line starts : [4]
 dis ---
  4           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (2)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE

Thanks,

Fabio


___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/2YHNQVGQEDDDKF7MVZIQA4GBIMYC2CJD/


[Python-Dev] Compiling of ast.Module in Python 3.10 and co_firstlineno behavior

2022-02-17 Thread Fabio Zadrozny
Hi all,

I'm running into an issue where the co_firstlineno behavior changed from
Python 3.9 to Python 3.10, and I was wondering whether this was
intentional.

i.e.: whenever code is compiled in Python 3.10, `code.co_firstlineno`
is now always 1, whereas previously it was equal to the line of the first
statement.

Also, does anyone know if there is any way to restore the old behavior in
Python 3.10? I tried setting `module.lineno`, but it didn't make
any difference...

As an example, given the code below:

import dis
import sys
from ast import Module, PyCF_ONLY_AST

source = '''
print(1)

print(2)
'''

# Parse the source into an ast.Module instead of compiling it directly.
initial_module = compile(source, '', 'exec', PyCF_ONLY_AST, 1)

print(sys.version)

# Recompile each top-level statement as its own single-statement module.
for i in range(2):
    module = Module([initial_module.body[i]], [])
    module_code = compile(module, '', 'exec')
    print(' --> First lineno:', module_code.co_firstlineno)
    print(' --> Line starts :', list(lineno for offset, lineno in
                                     dis.findlinestarts(module_code)))
    print(' dis ---')
    dis.dis(module_code)



I have the following outputs for Python 3.9/Python 3.10:

3.9.6 (default, Jul 30 2021, 11:42:22) [MSC v.1916 64 bit (AMD64)]
 --> First lineno: 2
 --> Line starts : [2]
 dis ---
  2           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (1)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
 --> First lineno: 4
 --> Line starts : [4]
 dis ---
  4           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (2)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE



3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)]
 --> First lineno: 1
 --> Line starts : [2]
 dis ---
  2           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (1)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
 --> First lineno: 1
 --> Line starts : [4]
 dis ---
  4           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (2)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE

Thanks,

Fabio
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VXW3TVHVYOMXDQIQBJNZ4BTLXFT4EPQZ/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-17 Thread Eric Snow
Again, thanks for the reply.  It's helpful.  My further responses are
inline below.

-eric

On Thu, Feb 17, 2022 at 3:42 AM Petr Viktorin  wrote:
> > Agreed.  However, what behavior do users expect and what guarantees do
> > we make?  Do we indicate how to interpret the refcount value they
> > receive?  What are the use cases under which a user would set an
> > object's refcount to a specific value?  Are users setting the refcount
> > of objects they did not create?
>
> That's what I hoped the PEP would tell me. Instead of simply claiming
> that there won't be issues, it should explain why we won't have any issues.
> [snip]
> IMO, the reasoning should start from the assumption that things will
> break, and explain why they won't (or why the breakage is acceptable).
> If the PEP simply tells me upfront that things will be OK, I have a hard
> time trusting it.
>
> IOW, it's clear you've thought about this a lot (especially after
> reading your replies here), but it's not clear from the PEP.
> That might be editorial nitpicking, if it wasn't for the fact that I
> want to find any gaps in your research and reasoning, and invite everyone
> else to look for them as well.

Good point.  It's easy to dump a bunch of unnecessary info into a PEP,
and it was hard for me to know where the line was in this case.  There
hadn't been much discussion previously about the possible ways this
change might break users.  So thanks for bringing this up.  I'll be
sure to put a more detailed explanation in the PEP, with a bit more
evidence too.

> Ah, I see. I was confused by this:

No worries!  I'm glad we cleared it up.  I'll make sure the PEP is
more understandable about this.

> > This is also true even with the GIL, though the impact is smaller.
>
> Smaller than what? The baseline for that comparison is a hypothetical
> GIL-less interpreter, which is only introduced in the next section.
> Perhaps say something like "Python's GIL helps avoid this effect, but
> doesn't eliminate it."

Good point.  I'll clarify the point.

> >> Weren't you planning a PEP on subinterpreter GIL as well? Do you want to
> >> submit them together?
> >
> > I'd have to think about that.  The other PEP I'm writing for
> > per-interpreter GIL doesn't require immortal objects.  They just
> > simplify a number of things.  That's my motivation for writing this
> > PEP, in fact. :)
>
> Please think about it.
> If you removed the benefits for per-interpreter GIL, the motivation
> section would be reduced to memory savings for fork/CoW. (And lots of
> performance improvements that are great in theory but sum up to a 4% loss.)

Sounds good.  Would this involve more than a note at the top of the PEP?

And just to be clear, I don't think the fate of a per-interpreter GIL
PEP should depend on this one.

> > It wouldn't match _Py_IMMORTAL_REFCNT, but the high bit of
> > _Py_IMMORTAL_REFCNT would still match.  That bit is what we would
> > actually be checking, rather than the full value.
>
> It makes sense once you know _Py_IMMORTAL_REFCNT has two bits set. Maybe
> it'd be good to note that detail -- it's an internal detail, but crucial
> for making things safe.

Will do.
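
To make the bit-check argument concrete, here is a toy model in Python (the real logic lives in C macros). The constants are illustrative assumptions only: the thread says that `_Py_IMMORTAL_REFCNT` has two bits set but does not give its value, so `IMMORTAL_BIT` and `IMMORTAL_REFCNT` below are made-up numbers chosen to have that shape, and `is_immortal`, `incref` and `decref_old_abi` are hypothetical helpers.

```python
# Made-up values: a marker bit plus the bit below it, so the count can drift
# a long way in either direction before the marker bit flips.
IMMORTAL_BIT = 1 << 62
IMMORTAL_REFCNT = IMMORTAL_BIT | (IMMORTAL_BIT >> 1)


def is_immortal(refcnt):
    # Check only the marker bit, not equality with IMMORTAL_REFCNT.
    return bool(refcnt & IMMORTAL_BIT)


def incref(refcnt):
    # Immortal-aware incref: a no-op once the marker bit is set.
    return refcnt if is_immortal(refcnt) else refcnt + 1


def decref_old_abi(refcnt):
    # Stands in for an extension built against older macros that know
    # nothing about immortality and always decrement.
    return refcnt - 1


refcnt = IMMORTAL_REFCNT
for _ in range(1000):
    refcnt = decref_old_abi(refcnt)

print(refcnt == IMMORTAL_REFCNT)  # False: the exact value has drifted
print(is_immortal(refcnt))        # True: the marker bit is still set
print(incref(refcnt) == refcnt)   # True: aware code leaves it alone
```

With these made-up numbers, unaware code would need on the order of 2**61 unbalanced decrefs to clear the marker bit, which is the kind of margin the second set bit is meant to provide.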

> >> What about extensions compiled with Python 3.11 (with this PEP) that use
> >> an older version of the stable ABI, and thus should be compatible with
> >> 3.2+? Will they use the old versions of the macros? How will that be 
> >> tested?
> >
> > It wouldn't matter unless an object's refcount reached
> > _Py_IMMORTAL_REFCNT, at which point incref/decref would start
> > noop'ing.  What is the likelihood (in real code) that an object's
> > refcount would grow that far?  Even then, would such an object ever be
> > expected to go back to 0 (and be dealloc'ed)?  Otherwise the point is
> > moot.
>
> Those are exactly the questions I'd hope the PEP would answer. I could
> estimate that likelihood myself, but I'd really rather just check your
> work ;)
>
> (Hm, maybe I couldn't even estimate this myself. The PEP doesn't say
> what the value of _Py_IMMORTAL_REFCNT is, and in the ref implementation
> a comment says "This can be safely changed to a smaller value".)

Got it.  I'll be sure that the PEP is more clear about that.  Thanks
for letting me know.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/LRUQDLVTC7GV4K3HHZK2ESPW3AHW4NKJ/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-17 Thread Eric Snow
On Wed, Feb 16, 2022 at 10:43 PM Jim J. Jewett  wrote:
> I suggest being a little more explicit (even blatant) that the particular
> details of:
> [snip]
> are not only CPython-specific, but are also private implementation details
> that are expected to change in subsequent versions.

Excellent point.

> Ideally, things like the interned string dictionary or the constants from a 
> pyc file will be not merely immortal, but stored in an immortal-only memory 
> page, so that they won't be flushed or CoW-ed when a nearby non-immortal 
> object is modified.

That's definitely worth looking into.

> Getting those details right will make a difference to performance, and you 
> don't want to be locked in to the first draft.

Yep, that is one big reason I was trying to avoid spelling out every
detail of our plan. :)

-eric
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/535SKVXHPFZQMKRB2YC6UVQLN2TZ4RMY/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-17 Thread Petr Viktorin

On 17. 02. 22 2:13, Eric Snow wrote:

Thanks for the feedback.  My responses are inline below.

-eric


On Wed, Feb 16, 2022 at 6:36 AM Petr Viktorin  wrote:

Thank you very much for writing this down! It's very helpful to see a
concrete proposal, and the current state of this idea.
I like the change,


That's good to hear. :)


but I think it's unfortunately more complicated than
the PEP suggests.


That would be unsurprising. :)


This proposal is CPython-specific and, effectively, describes
internal implementation details.


I think that is a naïve statement. Refcounting is
implementation-specific, but it's hardly an *internal* detail.


Sorry for any confusion.  I didn't mean to say that refcounting is an
internal detail.  Rather, I was talking about how the proposed change
in refcounting behavior doesn't affect any guaranteed/documented
behavior, hence "internal".

Perhaps I missed some documented behavior?  I was going off the following:

* 
https://docs.python.org/3.11/c-api/intro.html#objects-types-and-reference-counts
* https://docs.python.org/3.11/c-api/structures.html#c.Py_REFCNT


There is
code that targets CPython specifically, and relies on the details.


Could you elaborate?  Do you mean such code relies on specific refcount values?


The refcount has public getters and setters,


Agreed.  However, what behavior do users expect and what guarantees do
we make?  Do we indicate how to interpret the refcount value they
receive?  What are the use cases under which a user would set an
object's refcount to a specific value?  Are users setting the refcount
of objects they did not create?


That's what I hoped the PEP would tell me. Instead of simply claiming 
that there won't be issues, it should explain why we won't have any issues.




and you need a pretty good
grasp of the concept to write a C extension.


I would not expect this to be affected by this PEP, except in cases
where users are checking/modifying refcounts for objects they did not
create (since none of their objects will be immortal).


I think that it's safe to assume that this will break people's code,


Do you have some use case in mind, or an example?  From my perspective
I'm having a hard time seeing what this proposed change would break.

That said, Kevin Modzelewski indicated [1] that there were affected
cases for Pyston (though their change in behavior is slightly
different).

[1] 
https://mail.python.org/archives/list/python-dev@python.org/message/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/


IMO, the reasoning should start from the assumption that things will 
break, and explain why they won't (or why the breakage is acceptable).
If the PEP simply tells me upfront that things will be OK, I have a hard 
time trusting it.


IOW, it's clear you've thought about this a lot (especially after 
reading your replies here), but it's not clear from the PEP.
That might be editorial nitpicking, if it wasn't for the fact that I
want to find any gaps in your research and reasoning, and invite everyone
else to look for them as well.



[...]

Every modification of a refcount causes the corresponding cache
line to be invalidated.  This has a number of effects.

For one, the write must be propagated to other cache levels
and to main memory.  This has small effect on all Python programs.
Immortal objects would provide a slight relief in that regard.

On top of that, multi-core applications pay a price.  If two threads
are interacting with the same object (e.g. ``None``)  then they will
end up invalidating each other's caches with each incref and decref.
This is true even for otherwise immutable objects like ``True``,
``0``, and ``str`` instances.  This is also true even with
the GIL, though the impact is smaller.


This looks out of context. Python has a per-process GIL. It should go
after the next section.


This isn't about a data race.  I'm talking about how if an object is
active in two different threads (on distinct cores) then incref/decref
in one thread will invalidate the cache (line) in the other thread.
The only impact of the GIL in this case is that the two threads aren't
running simultaneously and the cache invalidation on the idle thread
has less impact.

Perhaps I've missed something?


Ah, I see. I was confused by this:

This is also true even with the GIL, though the impact is smaller.


Smaller than what? The baseline for that comparison is a hypothetical 
GIL-less interpreter, which is only introduced in the next section.
Perhaps say something like "Python's GIL helps avoid this effect, but 
doesn't eliminate it."




The proposed solution is obvious enough that two people came to the
same conclusion (and implementation, more or less) independently.


Who was it? Assuming it's not a secret :)


Me and Eddie. :)  I don't mind saying so.


In the case of per-interpreter GIL, the only realistic alternative
is to move all global objects into ``PyInterpreterState`` and add
one or more lookup functions to access them.  Then we'd have to