[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-29 Thread Thomas Grainger
Can you ping me on the airflow PR for this change? (@graingert)

On Fri, Apr 29, 2022, 7:54 AM Malthe  wrote:

> On Fri, 29 Apr 2022 at 06:50, Thomas Grainger wrote:
> > You can use a `__del__` method that warns on collection - like an
> > unawaited coroutine
> >
> > Also if you're in control of importing the dagfile you can record all
> > created dags and report any that are missing from the globals of the module
>
> Yes and I think this is the best we can do given how frames are being
> cleared.
>
> We can notify the user that a DAG was instantiated and not exposed at
> the top-level which is almost guaranteed to be a mistake. There's
> probably no good way currently to do better (for some value of
> "better").
>
> Thanks
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QEYZLMYN4OIV4Q7JTIBP7RHEI37QPJAS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-29 Thread Thomas Grainger
Does this only apply to DAG files? E.g.
https://airflow.apache.org/docs/apache-airflow/1.10.12/concepts.html#scope

You can use a `__del__` method that warns on collection - like an unawaited
coroutine
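
A minimal sketch of the `__del__` idea, assuming a flag that a loader sets once it has discovered the object. `TrackedDag`, `mark_picked_up` and `demo` are hypothetical names for illustration, not Airflow APIs, and the moment `__del__` runs depends on the implementation:

```python
import gc
import warnings


class TrackedDag:
    """Hypothetical stand-in for a DAG-like object (not the Airflow class)."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self._picked_up = False  # a loader would flip this on discovery

    def mark_picked_up(self):
        self._picked_up = True

    def __del__(self):
        # Called when the object is finalized; the timing is not specified
        # by the language, only by the implementation in use.
        if not self._picked_up:
            warnings.warn(
                f"DAG {self.dag_id!r} was never picked up", ResourceWarning
            )


def demo():
    """Create one dropped and one picked-up object; return warning texts."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        TrackedDag("dropped")  # no reference is kept
        kept = TrackedDag("kept")
        kept.mark_picked_up()
        del kept
        gc.collect()  # nudge implementations without reference counting
    return [str(w.message) for w in caught]
```

This mirrors how an unawaited coroutine warns: the object remembers whether it was used and complains at finalization if it was not.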

Also if you're in control of importing the dagfile you can record all
created dags and report any that are missing from the globals of the module
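
The second suggestion can be sketched as follows, assuming the loader executes the file itself and the constructor appends to a registry. `Dag`, `_created` and `load_and_report` are hypothetical names, not Airflow's loader:

```python
import types

_created = []  # hypothetical registry, filled in by the constructor


class Dag:
    """Hypothetical stand-in for a DAG-like object (not the Airflow class)."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        _created.append(self)  # strong reference keeps the object alive

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False


def load_and_report(source, name="dagfile"):
    """Execute `source` as a module and return (exposed, missing) dag ids."""
    start = len(_created)
    module = types.ModuleType(name)
    module.Dag = Dag
    exec(compile(source, name, "exec"), module.__dict__)
    created = _created[start:]
    # Objects reachable from the module's top-level names.
    exposed_ids = {id(v) for v in vars(module).values() if isinstance(v, Dag)}
    exposed = [d.dag_id for d in created if id(d) in exposed_ids]
    missing = [d.dag_id for d in created if id(d) not in exposed_ids]
    return exposed, missing


SOURCE = '''
dag = Dag("kept")
with Dag("dropped"):
    pass
'''
```

Calling `load_and_report(SOURCE)` separates the ids bound at module level from those that were created but never exposed. Because the registry holds strong references, the report does not depend on when finalizers run.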


On Fri, Apr 29, 2022, 7:45 AM Malthe  wrote:

> On Fri, 29 Apr 2022 at 06:38, Thomas Grainger wrote:
> > Can you show a run-able example of the successful and unsuccessful usage
> > of `with DAG(): ... `?
>
> from airflow import DAG
>
> # correct:
> dag = DAG("my_dag")
>
> # incorrect:
> DAG("my_dag")
>
> The with construct really has nothing to do with it, but it is a
> common source of confusion:
>
> # incorrect
> with DAG("my_dag"):
>     ...
>
> It is less obvious (to some) in this way that the entire DAG will not
> be picked up. You will in fact have to write:
>
> # correct
> with DAG("my_dag") as dag:
>     ...
>
> This way, you're capturing the DAG in the top-level scope which is the
> requirement.
>


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-29 Thread Malthe
On Fri, 29 Apr 2022 at 06:50, Thomas Grainger  wrote:
> You can use a `__del__` method that warns on collection - like an unawaited 
> coroutine
>
> Also if you're in control of importing the dagfile you can record all created 
> dags and report any that are missing from the globals of the module

Yes and I think this is the best we can do given how frames are being cleared.

We can notify the user that a DAG was instantiated and not exposed at
the top-level which is almost guaranteed to be a mistake. There's
probably no good way currently to do better (for some value of
"better").

Thanks


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-28 Thread Malthe
On Fri, 29 Apr 2022 at 06:38, Thomas Grainger  wrote:
> Can you show a run-able example of the successful and unsuccessful usage of 
> `with DAG(): ... `?

from airflow import DAG

# correct:
dag = DAG("my_dag")

# incorrect:
DAG("my_dag")

The with construct really has nothing to do with it, but it is a
common source of confusion:

# incorrect
with DAG("my_dag"):
    ...

It is less obvious (to some) in this way that the entire DAG will not
be picked up. You will in fact have to write:

# correct
with DAG("my_dag") as dag:
    ...

This way, you're capturing the DAG in the top-level scope which is the
requirement.
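
To make the binding rule concrete, here is a small sketch that executes a snippet and lists which top-level names end up bound to a DAG-like object. `Dag` is a hypothetical stand-in class and `names_bound` a hypothetical helper, not part of Airflow:

```python
class Dag:
    """Hypothetical stand-in for a DAG-like object (not the Airflow class)."""

    def __init__(self, dag_id):
        self.dag_id = dag_id

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False


def names_bound(source):
    """Run `source` and return the top-level names that refer to a Dag."""
    namespace = {"Dag": Dag}
    exec(source, namespace)
    return sorted(k for k, v in namespace.items() if isinstance(v, Dag))


WITHOUT_AS = 'with Dag("my_dag"):\n    pass\n'
WITH_AS = 'with Dag("my_dag") as dag:\n    pass\n'
```

`names_bound(WITHOUT_AS)` finds no name, while `names_bound(WITH_AS)` finds `dag`: only the `as` clause (or a plain assignment) leaves a reference in the module namespace.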


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-28 Thread Thomas Grainger
Can you show a run-able example of the successful and unsuccessful usage of
`with DAG(): ... `?

On Fri, Apr 29, 2022, 6:31 AM Malthe  wrote:

> Pablo Galindo Salgado wrote:
> > As it has been mentioned there is no guarantee that your variable will
> > even be finalized (or even destroyed) after the frame finishes. For
> > example, if your variable goes into a reference cycle for whatever reason
> > it may not be cleared until a GC run happens (and in some situations it
> > may not even be cleared at any point).
>
> I think there is a reasonable guarantee in CPython that it will happen
> exactly when you leave the frame, assuming there are no cycles or other
> references to the object. There's always the future, but I don't see a very
> near future where this will change fundamentally.
>
> Relying too much on CPython's behavior is a bad thing, but I think there
> are cases where it makes sense and can be a pragmatic choice. Certainly
> lots of programs have successfully relied on `sys._getframe` over the years.


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-28 Thread Malthe
Pablo Galindo Salgado wrote:
> As it has been mentioned there is no guarantee that your variable will even
> be finalized (or even destroyed) after the frame finishes. For example, if
> your variable goes into a reference cycle for whatever reason it may not be
> cleared until a GC run happens (and in some situations it may not even be
> cleared at any point).

I think there is a reasonable guarantee in CPython that it will happen exactly 
when you leave the frame, assuming there are no cycles or other references to 
the object. There's always the future, but I don't see a very near future where 
this will change fundamentally.

Relying too much on CPython's behavior is a bad thing, but I think there are 
cases where it makes sense and can be a pragmatic choice. Certainly lots of 
programs have successfully relied on `sys._getframe` over the years.
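
A small sketch of the behaviour being relied on here, assuming CPython's reference counting and an object that is not part of a cycle. `A`, `test` and `run` are illustrative names, and other implementations may record the events in a different order or not finalize at all before `run` returns:

```python
events = []


class A:
    def __del__(self):
        events.append("finalized")


def test():
    a = A()  # the only reference is the local variable
    events.append("leaving test")


def run():
    """Call test() and return the order in which events were recorded."""
    events.clear()
    test()
    events.append("after call")
    return list(events)
```

On CPython the finalizer runs between the end of `test` and the statement after the call, because the last reference disappears when the frame's locals are released.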


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-28 Thread Malthe
Dennis Sweeney wrote:
> I don't know if there's anything specifically stopping this, but from what I 
> understand, the precise moment that a finalizer gets called is unspecified, 
> so relying on any sort of behavior there is undefined and non-portable. 
> Implementations like PyPy don't always use reference counting, so their 
> garbage collection might get called some unspecified amount of time later.

It's unspecified of course for the language as such, but in the specific case 
of CPython (which we're targeting), I think the refcounting logic is here to 
stay and generally speaking, can be relied on. Of course some version may come 
along to break expectations and I suppose we might cross that bridge when we 
get to it.

> I'm not familiar with Airflow, but would you be able to decorate the create() 
> function to check for good return values?

We could, but for the most part people don't define DAGs inside functions – it 
happens, but it is not the simplest usage pattern. It's not so much about 
the function itself, but about being able to determine if a DAG was dropped at 
the top-level of the module.

If the frame clearing behavior was changed so that locals were reclaimed before 
popping the frame, I think the line number (i.e. `f_lineno`) would have to be 
that of the function definition, i.e. `def test():` in the examples above.


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-28 Thread Pablo Galindo Salgado
As has been mentioned, there is no guarantee that your variable will even
be finalized (or even destroyed) after the frame finishes. For example, if
your variable goes into a reference cycle for whatever reason it may not be
cleared until a GC run happens (and in some situations it may not even be
cleared at any point). The language gives you no guarantees over when or
how objects will be finalized or destroyed and any attempt at relying on
specific behaviour is doomed to fail because it can change between versions
and implementations.
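
The reference-cycle case can be sketched like this, assuming CPython with automatic collection temporarily disabled so that nothing but the explicit `gc.collect()` call can break the cycle. `Node`, `make_cycle` and `run` are illustrative names:

```python
import gc


class Node:
    finalized = 0  # class-level counter of finalizer calls

    def __init__(self):
        self.me = self  # the object refers to itself: a reference cycle

    def __del__(self):
        Node.finalized += 1


def make_cycle():
    Node()  # dropped immediately, but the cycle keeps the refcount above zero


def run():
    """Return the finalizer count before and after an explicit GC run."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection from interfering
    try:
        Node.finalized = 0
        make_cycle()
        before = Node.finalized
        gc.collect()
        after = Node.finalized
    finally:
        if was_enabled:
            gc.enable()
    return before, after
```

Leaving the frame is not enough here: the finalizer only runs once the cyclic collector gets to the object.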



On Thu, 28 Apr 2022, 14:14 Malthe,  wrote:

> Consider this example code:
>
> def test():
>     a = A()
>
> test()
>
> Currently, the locals (i.e. `a`) are cleared only after the function
> has returned:
>
> If we attach a finalizer to `a` immediately after the declaration then
> the frame stack available via `sys._getframe()` inside the finalizer
> function does not include the frame used to evaluate the function
> (i.e. with the code object of the `test` function).
>
> The nearest frame is that of the top-level module (where we make the
> call to the function).
>
> This is in practical terms no different than:
>
> def test():
>     return A()
>
> test()
>
> There's no way to distinguish between the two cases even though in the
> second example, the object is dropped only after the frame (used to
> evaluate the function) has been cleared.
>
> The effect I am trying to achieve is:
>
> def test():
>     a = A()
>     del a
>
> Here's a use-case to motivate this need:
>
> In Airflow, we're considering introducing some "magic" to help users write:
>
> with DAG(...):
>     # some code here
>
> That is, without declaring a top-level variable such as `dag`.
>
> However, we can't detect the following situation:
>
> def create():
>     with DAG(...) as dag:
>         # some code here
>
> create()
>
> The DAG is not returned from the function but nevertheless, we can't
> distinguish between this code and the correct version:
>
> def create():
>     with DAG(...) as dag:
>         # some code here
>     return dag
>
> In this case, calling `create` will then "return" the DAG and of
> course, without a variable assignment, the finalizer will be called –
> but now we can detect this.
>
> I'm thinking that it ought to be possible to clear out
> `frame->localsplus` before leaving the function frame.
>
> I played around with "ceval.c" and only got segfaults. It's
> complicated machinery :-)
>
> Thoughts?


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-28 Thread Dennis Sweeney
I don't know if there's anything specifically stopping this, but from what I 
understand, the precise moment that a finalizer gets called is unspecified, so 
relying on any sort of behavior there is undefined and non-portable. 
Implementations like PyPy don't always use reference counting, so their garbage 
collection might get called some unspecified amount of time later.

I'm not familiar with Airflow, but would you be able to decorate the create() 
function to check for good return values? Something like

    import functools

    def dag_initializer(func):
        @functools.wraps(func)
        def wrapper():
            with DAG(...) as dag:
                result = func(dag)
            del dag
            if not isinstance(result, DAG):
                raise ValueError(f"{func.__name__} did not return a dag")
            return result
        return wrapper

    @dag_initializer
    def create(dag):
        "some code here"
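
A runnable variant of the same idea, assuming a stand-in `DAG` class so it can be tried without Airflow installed. The stand-in, the use of the function name as the DAG id, and the `good`/`bad` functions are assumptions for illustration only:

```python
import functools


class DAG:
    """Hypothetical stand-in so the sketch runs without Airflow installed."""

    def __init__(self, dag_id):
        self.dag_id = dag_id

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False


def dag_initializer(func):
    @functools.wraps(func)
    def wrapper():
        # Assumption: the decorated function's name doubles as the DAG id.
        with DAG(func.__name__) as dag:
            result = func(dag)
        if not isinstance(result, DAG):
            raise ValueError(f"{func.__name__} did not return a dag")
        return result
    return wrapper


@dag_initializer
def good(dag):
    return dag


@dag_initializer
def bad(dag):
    "some code here"  # falls off the end, so the function returns None
```

Calling `good()` hands back the DAG, while `bad()` raises `ValueError` because nothing was returned. The check happens on the return value, so it does not depend on finalizer timing.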