I still can't follow this explanation.

Operators in Python have a priority (precedence), which determines how the
operands of an expression are grouped.

In the example

  A ~ B + C

is this equivalent to

  A ~ (B + C)

or to

  (A ~ B) + C

???
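
For comparison, existing operators already show both possible groupings. The sketch below is only an illustration: `Sym` is a hypothetical helper (not from the proposal) that records how Python groups an expression, and `@` and `|` stand in for the proposed binary `~`, whose precedence has not been specified.

```python
class Sym:
    """Hypothetical helper that records how Python groups an expression."""

    def __init__(self, text):
        self.text = text

    def _binary(self, op, other):
        # Wrap every binary operation in parentheses so the grouping is visible.
        return Sym(f"({self.text} {op} {other.text})")

    def __matmul__(self, other):  # @ binds tighter than +
        return self._binary("@", other)

    def __add__(self, other):
        return self._binary("+", other)

    def __or__(self, other):  # | binds looser than +
        return self._binary("|", other)


A, B, C = Sym("A"), Sym("B"), Sym("C")

tight = (A @ B + C).text  # grouped like (A @ B) + C
loose = (A | B + C).text  # grouped like A | (B + C)

print(tight)
print(loose)
```

A proposal for binary `~` would have to say which of these two behaviours it wants.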

It's not unheard of to add an operator to Python that's primarily important
for a certain subcommunity -- in particular, we've done this for @
(__matmul__).

But the motivation as well as the specifics of the proposal must be
understandable for people outside that subcommunity. You could look at PEP
465 for some hints on how to do this.

Assuming that the reader is familiar with the example `Lottery ~ Literacy +
Wealth + Region` is *not* going to work. I have literally no idea from what
field that is taken or what the purpose of the example is. Please don't
expect that I can just Google it: I did, found
https://www.statsmodels.org/stable/example_formulas.html, and I still have
no idea what it's about.

On Sun, Feb 23, 2020 at 3:07 PM Brendan Barnwell <brenb...@brenbarn.net>
wrote:

> On 2020-02-23 14:38, Steven D'Aprano wrote:
> > Hi Aaron, and welcome!
> >
> > Your proposal would be a lot more interesting to me if I knew what this
> > binary ~ would actually do, without having to go learn R or LaTeX.
> >
> > You say:
> >
> >> I think it would be awesome to have in the language, as it would allow
> >> modelling along the lines of R that we currently only get with text,
> >> e.g.:
> >>
> >> smf.ols(formula='Lottery ~ Literacy + Wealth + Region', data=df)
> >>
> >> With a binary context for ~, we could write the above string as pure
> >> Python
> >
> > I'm confused. Why can't you just write
> >
> >      'Lottery ~ Literacy + Wealth + Region'
> >
> > as a literal string? That's an exact copy and paste from your example,
> > and it works for me.
>
>         I'm not the OP but. . .
>
>         In R there is a tilde operator that is used to indicate
> "depends on" when separating the dependent and independent variables
> in a statistical model formulation.  The example given is how it has
> to be done in Python.  In R you just write
> `Lottery ~ Literacy + Wealth + Region` (i.e., as code with no quotes).
>
>         That said, the way this works in R depends on additional
> "features" of R whose absence in Python makes it a heavier lift than
> just adding a tilde.  R can magically defer evaluation of names so
> that you can write something like that tilde expression and pass an
> additional argument specifying the table whose columns are the given
> variables (i.e., a table with columns for Lottery, Literacy, etc.),
> and then it will later evaluate the names by looking them up as
> columns.  This won't work in Python because even if you had the
> tilde, you couldn't do this:
>
> ols(Lottery ~ Literacy + Wealth + Region, data=df)
>
>         Because that model expression is a function argument, Python
> semantics require it to be evaluated before the call is made, so you
> can't defer evaluation and later use the names as column names to look
> up in the provided table.
>
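
The eager evaluation described above can be observed directly. This is a minimal sketch under stated assumptions: `ols` is a hypothetical stand-in (not the statsmodels function), `df` is a plain dict rather than a real table, and `+` stands in for the proposed binary `~`, which is not valid syntax today.

```python
calls = []


def ols(formula, data=None):
    # Hypothetical stand-in: it only records that the call was reached.
    calls.append(formula)


# A plain dict playing the role of the table; the column names exist only as keys.
df = {"Lottery": [1, 2], "Literacy": [3, 4]}

try:
    # The argument expression is evaluated before ols() is entered, so the
    # bare names are looked up as ordinary variables and the lookup fails.
    ols(Lottery + Literacy, data=df)
except NameError as exc:
    error = str(exc)

print(error)
print(calls)  # still empty: ols() was never called
```

The failure happens in the caller, before `ols` has any chance to consult `data`.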
>         In order to make it work you'd need something else that I've
> sometimes wished for, which is a smooth way to create and pass around
> unevaluated expressions, and then later trigger their evaluation in
> the context of a given namespace (such as the one where the evaluation
> is triggered).  Right now the only approximation to this is lambda,
> but lambda closes over variables based on the lexical context where
> it's defined, not where it's called, so it doesn't really work.  In
> other words, what I'd like is the ability to do something like this:
>
> def foo():
>     # `deferred` is hypothetical; a, b, c are deliberately not defined here
>     expr = deferred(a + b + c)
>     bar(expr)
>
> def bar(expr):
>     a, b, c = 1, 2, 3
>
>     # this should return 6
>     return expr.evaluate()
>
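
Both halves of that point can be tried in current Python. The following is a rough sketch, not a proposal: `Deferred`, `bar_lambda`, and `bar_deferred` are hypothetical names, and the workaround stores the expression as a string and needs the namespace passed in explicitly, which is exactly the kind of text-based approach the original poster wanted to get away from.

```python
class Deferred:
    """Hypothetical approximation: keep the expression as compiled source text."""

    def __init__(self, source):
        self.code = compile(source, "<deferred>", "eval")

    def evaluate(self, namespace):
        # Names are resolved in the namespace supplied by the caller.
        return eval(self.code, {"__builtins__": {}}, dict(namespace))


def bar_lambda(expr):
    a, b, c = 1, 2, 3  # invisible to a lambda defined elsewhere
    return expr()


def bar_deferred(expr):
    a, b, c = 1, 2, 3
    return expr.evaluate(locals())


try:
    # The lambda resolves a, b, c where it was defined, not inside bar_lambda().
    lambda_result = bar_lambda(lambda: a + b + c)
except NameError:
    lambda_result = "NameError"

deferred_result = bar_deferred(Deferred("a + b + c"))

print(lambda_result)
print(deferred_result)
```

The lambda version fails because the names are looked up in the defining scope, while the string-based version works only because the caller hands over its own `locals()`.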
>         If such functionality existed, then a tilde operator could
> indeed be used to create model definitions using deferred evaluations
> like in R.
>
>         However, I think deferred evaluation is the more important
> functionality here.  If we had deferred evaluation without the tilde, we
> could still do what R does by using a different operator instead of
> tilde, at worst perhaps having to parenthesize the dependent-variable
> expression (in case our alternative "depends" operator had the wrong
> precedence).  But without deferred evaluation, the tilde operator gains
> little, at least in terms of providing model-evaluation expressions like
> those in R.
>
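
As a rough illustration of the "different operator" idea, an existing operator can already build a formula if the column names are declared as objects first. Everything here is an assumption made for the sketch: `Term` is a hypothetical class, `<<` is an arbitrary choice of "depends on" operator, and declaring the names up front sidesteps deferred evaluation rather than providing it.

```python
class Term:
    """Hypothetical placeholder for one or more column names; no data lookup."""

    def __init__(self, *names):
        self.names = names

    def __add__(self, other):
        # Adding terms just collects the names in order.
        return Term(*self.names, *other.names)

    def __lshift__(self, other):
        # `<<` plays the role of "depends on" and renders an R-style formula.
        return f"{' + '.join(self.names)} ~ {' + '.join(other.names)}"


Lottery, Literacy, Wealth, Region = (
    Term(name) for name in ("Lottery", "Literacy", "Wealth", "Region")
)

# `+` binds tighter than `<<`, so no parentheses are needed on the right.
formula = Lottery << Literacy + Wealth + Region
print(formula)
```

Because `<<` has lower precedence than `+`, the right-hand side is grouped first; an operator with higher precedence than `+` would need the parentheses mentioned above.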
> --
> Brendan Barnwell
> "Do not follow where the path may lead.  Go, instead, where there is no
> path, and leave a trail."
>     --author unknown
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/FU5P4Q7EQV6VOE7DFSISMSEFW3JD6OEZ/
> Code of Conduct: http://python.org/psf/codeofconduct/
>


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/IEYEFJN7VMQPQMPOB4LW65M4JMDDSAPN/