Hi Martin,

Short answer: yes, I agree.
Slightly longer: I would be eternally grateful if you wish to contribute to
the PEP with any such expansion of the Motivation and Expansion.

On Wed, Jun 22, 2022 at 10:47 AM Martin Di Paola <martinp.dipa...@gmail.com>
wrote:

> Hi David, I read the PEP and I think it would be useful to expand the
> Motivation and Examples sections.
>
> While indeed Dask uses lazy evaluation to build a complex computation
> without executing it, I don't think that it is the whole story.
>
> Dask takes this deferred complex computation and *plans* how to execute it
> and then it *executes* it in non-obvious/direct ways.
>
> For example, the computation of the min() of a dataframe can be done
> computing the min() of each partition of the dataframe and then
> computing the min() of them. Here is where the plan and the execution
> stages play.
>
> All of this is hidden from the developer. From his/her perspective the
> min() is called once over the whole dataframe.
>
> Dask's deferred computations are "useless" without the
> planning/execution plan.
>
> PySpark, like Dask, does exactly the same.
>
> But what about Django's ORM? Indeed Django allows you the build a SQL
> query without executing it. You can then perform more subqueries,
> joins and group by without executing them.
>
> Only when you need the real data the query is executed.
>
> This is another example of deferred execution similar to Dask/PySpark
> however when we consider the planning/execution stages the similarities
> ends there.
>
> Django's ORM writes a SQL query and send it to a SQL database.
>
> Another example of deferred execution would be my library to interact
> with web pages programmatically: selectq.
>
> Very much like an ORM, you can select elements from a web page, perform
> subselections and unions without really interacting with the web page.
>
> Only when you want to get the data from the page is when the deferred
> computations are executed and like an ORM, the plan done by selectq is
> to build a single xpath and then execute it using Selenium.
>
> So...
>
> Three cases: Dask/PySpark, Django's ORM and selectq. All of them
> implement deferred expressions but all of them "compute" them in very
> specific ways (aka, they plan and execute the computation differently).
>
> Would those libs (and probably others) do benefit from the PEP? How?
>
> Thanks,
> Martin.
>
> On Tue, Jun 21, 2022 at 04:53:44PM -0400, David Mertz, Ph.D. wrote:
> >Here is a very rough draft of an idea I've floated often, but not with
> much
> >specification.  Take this as "ideas" with little firm commitment to
> details
> >from me. PRs, or issues, or whatever, can go to
> >https://github.com/DavidMertz/peps/blob/master/pep-9999.rst as well as
> >mentioning them in this thread.
> >
> >PEP: 9999
> >Title: Generalized deferred computation
> >Author: David Mertz <dme...@gnosis.cx>
> >Discussions-To:
> >https://mail.python.org/archives/list/python-ideas@python.org/thread/
> >Status: Draft
> >Type: Standards Track
> >Content-Type: text/x-rst
> >Created: 21-Jun-2022
> >Python-Version: 3.12
> >Post-History:
> >
> >Abstract
> >========
> >
> >This PEP proposes introducing the soft keyword ``later`` to express the
> >concept
> >of deferred computation.  When an expression is preceded by the keyword,
> the
> >expression is not evaluated but rather creates a "thunk" or "deferred
> >object."
> >Reference to the deferred object later in program flow causes the
> >expression to
> >be executed at that point, and for both the value and type of the object
> to
> >become the result of the evaluated expression.
> >
> >
> >Motivation
> >==========
> >
> >"Lazy" or "deferred" evaluation is useful paradigm for expressing
> >relationships
> >among potentially expensive operations prior their actual computation.
> Many
> >functional programming languages, such as Haskell, build laziness into the
> >heart of their language.  Within the Python ecosystem, the popular
> >scientific
> >library `dask-delayed <dask-delayed>`_ provides a framework for lazy
> >evaluation
> >that is very similar to that proposed in this PEP.
> >
> >.. _dask-delayed:
> >   https://docs.dask.org/en/stable/delayed.html
> >
> >
> >Examples of Use
> >===============
> >
> >While the use of deferred computation is principally useful when
> >computations
> >are likely to be expensive, the simple examples shown do not necessarily
> use
> >such expecially spendy computations.  Most of these are directly inspired
> by
> >examples used in the documentation of dask-delayed.
> >
> >In dask-delayed, ``Delayed`` objects are create by functions, and
> operations
> >create a *directed acyclic graph* rather than perform actual computations.
> >For
> >example::
> >
> >    >>> import dask
> >    >>> @dask.delayed
> >    ... def later(x):
> >    ...     return x
> >    ...
> >    >>> output = []
> >    >>> data = [23, 45, 62]
> >    >>> for x in data:
> >    ...     x = later(x)
> >    ...     a = x * 3
> >    ...     b = 2**x
> >    ...     c = a + b
> >    ...     output.append(c)
> >    ...
> >    >>> total = sum(output)
> >    >>> total
> >    Delayed('add-8f4018dbf2d3c1d8e6349c3e0509d1a0')
> >    >>> total.compute()
> >    4611721202807865734
> >    >>> total.visualize()
> >
> >.. figure:: pep-9999-dag.png
> >   :align: center
> >   :width: 50%
> >   :class: invert-in-dark-mode
> >
> >   Figure 1.  Dask DAG created from simple operations.
> >
> >Under this PEP, the soft keyword ``later`` would work in a similar manner
> to
> >this dask.delayed code.  But rather than requiring calling ``.compute()``
> >on a
> >``Delayed`` object to arrive at the result of a computation, every
> >reference to
> >a binding would perform the "compute" *unless* it was itself a deferred
> >expression.  So the equivalent code under this PEP would be::
> >
> >    >>> output = []
> >    >>> data = [23, 45, 62]
> >    >>> for later x in data:
> >    ...     a = later (x * 3)
> >    ...     b = later (2**x)
> >    ...     c = later (a + b)
> >    ...     output.append(later c)
> >    ...
> >    >>> total = later sum(output)
> >    >>> type(total)  # type() does not un-thunk
> >    <class 'DeferredObject'>
> >    >>> if value_needed:
> >    ...     print(total)  # Actual computation occurs here
> >    4611721202807865734
> >
> >In the example, we assume that the built-in function `type()` is special
> in
> >not
> >counting as a reference to the binding for purpose of realizing a
> >computation.
> >Alternately, some new special function like `isdeferred()` might be used
> to
> >check for ``Deferred`` objects.
> >
> >In general, however, every regular reference to a bound object will force
> a
> >computation and re-binding on a ``Deferred``.  This includes access to
> >simple
> >names, but also similarly to instance attributes, index positions in lists
> >or
> >tuples, or any other means by which an object may be referenced.
> >
> >
> >Rejected Spellings
> >==================
> >
> >A number of alternate spellings for creating a ``Deferred`` object are
> >possible.  This PEP-author has little preference among them.  The words
> >``defer`` or ``delay``, or their past participles ``deferred`` and
> >``delayed``
> >are commonly used in discussions of lazy evaluation.  All of these would
> >work
> >equally well as the suggested soft keyword ``later``.  The keyword
> ``lazy``
> >is
> >not completely implausible, but does not seem to read as well.
> >
> >No punctuation is immediately obvious for this purpose, although
> surrounding
> >expressions with backticks is somewhat suggestive of quoting in Lisp, and
> >perhaps slightly reminiscent of the ancient use of backtick for shell
> >commands
> >in Python 1.x.  E.g.::
> >
> >    might_use = `math.gcd(a, math.factorial(b))`
> >
> >
> >Relationship to PEP-0671
> >========================
> >
> >The concept of "late-bound function argument defaults" is introduced in
> >:pep:`671`.  Under that proposal, a special syntactic marker would be
> >permitted
> >in function signatures with default arguments to allow the expressions
> >indicated as defaults to be evaluated at call time rather than at runtime.
> >In
> >current Python, we might write a toy function such as::
> >
> >    def func(items=[], n=None):
> >        if n is None:
> >            n = len(items)
> >        items.append("Hello")
> >        print(n)
> >
> >    func([1, 2, 3])  # prints: 3
> >
> >Using the :pep:`671` approach this could be simplified somewhat as::
> >
> >    def func(items=[], n=>len(items)):
> >        # late-bound defaults act as if bound here
> >        items.append("Hello")
> >        print(n)
> >
> >    func([1, 2, 3])  # prints: 3
> >
> >Under the current PEP, evaluation of a ``Deferred`` object only occurs
> upon
> >reference.  That is, for the current toy function, the evaluation would
> not
> >occur until the ``print(n)`` line.::
> >
> >    def func(items=[], n=later len(items)):
> >        items.append("Hello")
> >        print(n)
> >
> >    func([1, 2, 3])  # prints: 4
> >
> >To completely replicate the behavior of PEP-0671, an extra line at the
> >start of
> >the function body would be required::
> >
> >    def func(items=[], n=later len(items)):
> >        n = n  # Evaluate the Deferred and re-bind the name n
> >        items.append("Hello")
> >        print(n)
> >
> >    func([1, 2, 3])  # prints: 3
> >
> >
> >References
> >==========
> >
> >https://github.com/DavidMertz/peps/blob/master/pep-9999.rst
> >
> >Copyright
> >=========
> >
> >This document is placed in the public domain or under the
> >CC0-1.0-Universal license, whichever is more permissive.
> >
> >
> >
> >--
> >Keeping medicines from the bloodstreams of the sick; food
> >from the bellies of the hungry; books from the hands of the
> >uneducated; technology from the underdeveloped; and putting
> >advocates of freedom in prisons.  Intellectual property is
> >to the 21st century what the slave trade was to the 16th.
>
>
> >_______________________________________________
> >Python-ideas mailing list -- python-ideas@python.org
> >To unsubscribe send an email to python-ideas-le...@python.org
> >https://mail.python.org/mailman3/lists/python-ideas.python.org/
> >Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/DQTR3CYWMLSRRKR6MBLZNTGCG762QNDF/
> >Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/SK267MWXWE3CBMUY46NVD5OBZNT7KSY4/
> Code of Conduct: http://python.org/psf/codeofconduct/
>


-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PCKAUMW4DOA3ZN6AQ6GXGWYVVIGVJ5XI/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to