(Re-sending, because this was originally a reply to an off-list message by Nima 
Hamidi)

On Jul 13, 2019, at 14:12, Nima Hamidi <ham...@stanford.edu> wrote:
>  
> Sometimes it's necessary not to evaluate the expression. Two such 
> applications of NSE in R are as follows:
>  
> 1. Data-tables have cleaner syntax. For example, letting dt be a data-table 
> with a column called price, one can retrieve items cheaper than $1 using the 
> following: dt [price < 1]. Pandas syntax requires something like dt[dt.price 
> < 1]. This is currently inevitable as the expression is evaluated *before* 
> __getitem__ is invoked. Using NSE, dt.__getitem__ can, first, add its columns 
> to locals() dictionary and then evaluate the expression in the new context.

This one looks good. I can also imagine it being useful for SQLAlchemy, 
appscript, etc. just as it is for Pandas.

But in your proposal, wouldn’t this have to be written as dt[`price < 1`]? I 
think the cost of putting the expression in ticks is at least as bad as the 
cost of repeating dt.
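
To make the data-table case concrete, here is a minimal sketch of a 
__getitem__ that evaluates an unevaluated expression against the table's 
columns. Everything in it is an assumption for illustration: Table is a 
hypothetical stand-in (not pandas or data.table), and a plain string takes the 
place of the proposed backtick expression, since no BoundExpression type exists 
today.

```python
class Table:
    """Hypothetical stand-in for a data table: a dict of equal-length columns."""

    def __init__(self, **columns):
        self.columns = columns

    def __len__(self):
        return len(next(iter(self.columns.values()), []))

    def __getitem__(self, expr):
        # Compile the unevaluated expression once, then evaluate it row by
        # row with that row's column values as the local namespace.
        code = compile(expr, "<expr>", "eval")
        names = list(self.columns)
        keep = []
        for row in zip(*self.columns.values()):
            if eval(code, {"__builtins__": {}}, dict(zip(names, row))):
                keep.append(row)
        # Transpose the kept rows back into columns.
        cols = zip(*keep) if keep else [() for _ in names]
        return Table(**{n: list(c) for n, c in zip(names, cols)})


dt = Table(item=["gum", "pen", "book"], price=[0.5, 1.5, 0.99])
cheap = dt["price < 1"]  # the string stands in for `price < 1`
print(cheap.columns)     # {'item': ['gum', 'book'], 'price': [0.5, 0.99]}
```

Note that the caller still has to quote the expression somehow, which is the 
cost in question.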

Also: dt.price < 1 is a perfectly valid expression, with a useful value. You 
can store it in a temporary variable to avoid repeating it, or stash it for 
later, or print it out to see what’s happening. But price < 1 on its own is a 
NameError, and I’m not sure what `price < 1` is worth on its own. Would this 
invite code that’s hard to refactor and even harder to debug?

> 2. Pipe-lining in R is also much cleaner. Dplyr provided an operator %>% 
> which passes the return value of its LHS as the first argument of its RHS. In 
> other words, f() %>% g() is equivalent to g(f()). This is pretty useful for 
> long pipelines. The way that it works is that the operator %>% changes AST 
> and then evaluates the modified expression. In this example, evaluating g() 
> is undesirable.

This doesn’t seem necessary in a language with first-class functions. Why can’t 
you just write the pipeline as something like f %>% g, much as you would in, 
say, Haskell? That would just define a function (presumably equivalent to 
either lambda: g(f()) or lambda *a, **kw: g(f(*a, **kw))) that represents the 
pipeline, which you then call normally. I don’t see the benefit in being able 
to write g() instead of g here, and in fact it seems actively misleading, 
because it implies calling g on no arguments instead of one.
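
As a rough sketch of the first-class-function version in today's Python: the 
helper name pipeline below is made up for this example (it is not a stdlib or 
proposed function), and it takes the functions themselves rather than call 
expressions.

```python
from functools import reduce


def pipeline(*funcs):
    """Build one function that calls funcs[0] with the given arguments,
    then passes each result on to the next function in turn."""
    if not funcs:
        raise ValueError("pipeline() needs at least one function")
    first, *rest = funcs

    def run(*args, **kwargs):
        return reduce(lambda value, func: func(value), rest,
                      first(*args, **kwargs))

    return run


def f():
    return [3, 1, 2]


g = sorted

h = pipeline(f, g)  # roughly lambda *a, **kw: g(f(*a, **kw))
print(h())          # [1, 2, 3]
```

Nothing here needs unevaluated expressions: g is passed as a value and only 
called once its argument exists.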

Also, given that Python doesn’t have such an operator, or a way to define 
custom operators, and that proposals for even simpler operators on functions 
like @ for compose have been rejected every time they’ve been suggested, I 
wouldn’t expect much traction from this example. Is there something similar 
that could plausibly be done in Python, and feel Pythonic?

---

A couple more things I thought of since the initial reply…

I’m pretty sure Python’s AST objects don’t contain the original source text. 
So, what is your plot function going to actually do with its arguments to get 
the axes? What if it’s called with plot(`x[..., 3:]`)? Will plot—and every 
other function that wants to do something similar—need to come up with a way to 
generate the nicest source text that could produce the given AST? Or do we need 
to add a decompile to the stdlib for them? I suppose you could solve this by 
just adding more fields to BoundExpression, but I’m not sure that wouldn’t make 
it a lot harder to implement the backtick feature.
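
For illustration, here is what the stdlib ast module offers on this point. The 
sketch assumes Python 3.9 or later, where ast.unparse exists; 
ast.get_source_segment (3.8+) only works if the caller still has the original 
source string, which a function receiving just an AST would not.

```python
import ast

source = "x [ ... , 3: ]"
tree = ast.parse(source, mode="eval")
node = tree.body

# Nodes record positions in the source, not the source text itself.
print(node.lineno, node.col_offset)  # 1 0

# Regenerating text from the AST gives normalized source, which need not
# match what the user typed character for character.
regenerated = ast.unparse(tree)
print(regenerated)

# Recovering the exact text requires the original string as well.
original = ast.get_source_segment(source, node)
print(original)  # x [ ... , 3: ]
```

So a plot function that wants the user's own spelling for an axis label would 
need the source text carried alongside the AST.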

Backticks are supposed to be banned for the life of Python since 3.0 eliminated 
them as shorthand for repr. That could be revisited, but it might be a tough 
sell. Maybe the original “grit on Tim’s screen” reason is no longer as 
compelling because of higher-res screens and more uniform console fonts, but 
the rise to ubiquity of markdown seems like an even better reason not to use 
them. Today, you can paste Python code between backticks to mark it as code in 
markdown; if Python code can contain backticks, that’s no longer true. People 
who use languages that rely on backticks have been complaining about that for 
years; do we really want to join them?

Finally, I think you need a fully worked-through example, not just a 
description of one. Show what the implementation of plot would look like if it 
could be handed BoundExpression objects. (Although pd.DataFrame.__getitem__ 
seems like the killer use case here, so maybe show that one instead, even 
though it’s probably more complicated.)
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BG6T2XF6VDKDQQ5M4EIHGGIO5OLKDQCH/