Re: [PHP-DEV] [RFC] Pipe Operator (again)

Larry Garfield Sat, 05 Apr 2025 10:09:28 -0700

On Thu, Mar 27, 2025, at 9:30 AM, Ilija Tovilo wrote:
> Hi Larry
>
> Sorry for the late response.
>
> On Fri, Feb 7, 2025 at 5:58 AM Larry Garfield <la...@garfieldtech.com> wrote:
>>
>> https://wiki.php.net/rfc/pipe-operator-v3
>
> We have already discussed this topic extensively off-list, so let me
> bring the list up-to-date.
>
> The current pipes proposal is elegantly simple. This has many upsides,
> but it comes with an obvious limitation:
> It only works well when the called function takes only a single argument.
>
> $sourceCode |> lexer(...) |> parser(...) |> compiler(...) |> vm(...)
>
> Such code is nice, but is also quite niche. I have argued off-list
> that the predominant use-case for pipes is arrays and iterators
> (including strings immediately split into chunks), and it seems most
> agree. However, most array/iterator functions (e.g. filter, map,
> reduce, first, all, etc.) don't fall into the one-parameter category.
>
> A slightly simplified example from the RFC:
>
> $result = "Hello World"
>     |> str_split(...)
>     |> fn($x) => array_map(strtoupper(...), $x)
>     |> fn($x) => array_filter($x, fn($v) => $v != 'O');
>
> IMO, this is harder to understand than the alternative of using
> multiple statements with a temporary variable.
>
> $tmp = "Hello World";
> $tmp = str_split($tmp);
> $tmp = array_map(strtoupper(...), $tmp);
> $result = array_filter($tmp, fn($v) => $v != 'O');
>
> The RFC has a solution for this: Partial function application [1].
>
> $result = "Hello World"
>     |> str_split(...)
>     |> array_map(strtoupper(...), ?)
>     |> array_filter(?, fn($v) => $v != 'O');
>
> This still causes more cognitive overhead than it should, at least to me.
>
> * The placement of ? is hard to detect, especially when it's not the
> first argument.
> * The user now has to think about immediately-invoked closures that
> exist solely for argument-reordering. The closure can be elided
> through the optimizer, but we cannot elide the additional cognitive
> overhead in the user.
> * The implementation of ? is significantly more complex than that of
> pipes, making the supposed simplicity of pipes somewhat misleading.
>
> If my assumption is correct that the primary use-case for pipes is
> arrays, it might be worth investigating the possibility of introducing
> a new iterator API, which has been proposed before [2], optimized for
> pipes. Specifically, this API would ensure consistent placement of the
> subject, i.e. the iterable in this case, as the first argument. Pipes
> would no longer have the form of expr |> expr, where the
> right-hand-side is expected to return a callable. Instead, it would
> have the form of expr |> function_call, where the left-hand-side is
> implicitly inserted as the first parameter of the call.
>
> namespace Iter {
>     function map(iterable $iterable, \Closure $callback): \Iterator;
>     function filter(iterable $iterable, \Closure $callback): \Iterator;
> }
>
> namespace {
>     use function Iter\{map, filter};
>
>     $result = "Hello World"
>         |> str_split()
>         |> map(strtoupper(...))
>         |> filter(fn($v) => $v != 'O');
> }
>
> This is the same approach taken by Elixir [3]. It has a few benefits:
>
> * We don't need to think about closures that are immediately invoked,
> because there are none. The code is exactly the same as if you had
> written it through nested function calls. This simplifies things
> significantly for both the engine and the user.
> * It closely resembles code that would be written in an
> object-oriented manner, making it more familiar.
> * It is the shortest and most readable of all the proposed options.
>
> As with everything, there are downsides.
>
> * It only works well for subject-first APIs. A not insignificant number
> of existing functions do not follow this convention (e.g. explode(),
> preg_match(), etc.). That said, explode(' ', $s) |> filter($c1) |>
> map($c2) still composes well, given explode() is usually first in the
> chain, while preg_match() is rarely chained at all.
> * People have voiced concerns for potential confusion regarding the
> right-hand-side. It may not be any arbitrary expression, but is
> restricted to a function call. Hence, `$param |> $myClosure` is not
> valid code, requiring additional braces: `$param |> $myClosure()`.
> This approach resembles the -> operator, where at least conceptually,
> the left-hand-side is implicitly passed as a $this parameter. However,
> the spaces between |> do not signal this fact as well, making it look
> like the right-hand-side is evaluated separately. Potentially, a
> different symbol might work better.
>
> Internal reactions to this idea were mixed, so I'm interested to hear
> what the community thinks about it.
>
> Ilija
>
> [1] https://wiki.php.net/rfc/partial_function_application
> [2] https://externals.io/message/118896
> [3] https://elixirschool.com/en/lessons/basics/pipe_operator


To clarify my stance on the above: I am open to this, and I agree with Ilija 
that in the typical case it would be more convenient.  The argument that it 
would be confusing to have a "hidden" first param is valid, but as with any new 
feature I think it's obvious once you know it, so that's a small issue.  I 
didn't propose it originally as I suspected folks would balk at the added 
complexity, but I do like the concept.

Part of Ilija's proposal does include offering $val |> ($expr) (or similar) to 
allow arbitrary expressions on the right, which would need to evaluate to a 
unary function.  Basically the () would make it behave the same as what the RFC 
does now.
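
To make the two forms concrete, here is a rough sketch of what each would mean, written as plain PHP without any pipe syntax.  iter_filter() is a made-up stand-in for the proposed Iter\filter(); its name and signature are assumptions for illustration, not an existing API.

```php
<?php
// Hypothetical subject-first helper standing in for the proposed Iter\filter().
// The name and signature are assumptions, not an existing API.
function iter_filter(iterable $iterable, \Closure $callback): \Generator
{
    foreach ($iterable as $value) {
        if ($callback($value)) {
            yield $value;
        }
    }
}

$isNotO = fn(string $v): bool => $v !== 'O';
$chars = str_split("HELLO");

// First-arg form: `$chars |> iter_filter($isNotO)` would read as this direct call,
// with the left-hand side inserted as the first argument.
$firstArgForm = iter_filter($chars, $isNotO);

// Parenthesized form: `$chars |> (fn($x) => iter_filter($x, $isNotO))` would
// evaluate the right-hand side to a unary callable, then invoke it with $chars.
$unary = fn(iterable $x): \Generator => iter_filter($x, $isNotO);
$parenForm = $unary($chars);

// Materialize both lazy results as lists so they can be compared.
$a = iterator_to_array($firstArgForm, false);
$b = iterator_to_array($parenForm, false);
```

Both paths should yield the same values; the difference is only whether an intermediate closure exists.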

However, it also received significant pushback off-list from folks who felt it 
was too much magic.  I don't want to torpedo pipes by over-reaching.  But 
without feedback from other voters, I don't know if this is over-reaching.  Is 
it?  Please, someone tell me which approach you'd be more willing to vote for. 
:-)

One concern with this approach is that it gets even closer to "real" extension 
functions.  But real extension functions (which let you write code that looks 
like you're adding arbitrary methods to arbitrary objects, even though under 
the hood it's just a plain function that takes an object as a parameter) also 
run into a lot of additional complexity.  Chief among the problems: plain 
functions don't handle name collisions, so you can have only one "map" function 
rather than one-per-class.  The way around that is an alternate syntax for 
extension functions that specifies the type they work on (which is what Kotlin 
does), but then you run into questions around inheritance and polymorphism that 
are hard to resolve in a runtime-centric environment.  I haven't fully thought 
through all of these details.
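
As a rough illustration of the name-collision point, here is what a single shared "map" ends up looking like with plain functions.  The Maybe class and this particular map() are hypothetical, invented only for the example.

```php
<?php
// Hypothetical container type, used only for illustration.
final class Maybe
{
    public function __construct(public readonly mixed $value) {}
}

// With plain functions there can be only one `map` per namespace, so a single
// definition has to branch on the runtime type of its subject.
function map(iterable|Maybe $subject, \Closure $fn): array|Maybe
{
    if ($subject instanceof Maybe) {
        // Apply the callback only when a value is present.
        return new Maybe($subject->value === null ? null : $fn($subject->value));
    }
    $out = [];
    foreach ($subject as $key => $value) {
        $out[$key] = $fn($value);
    }
    return $out;
}

$upper = map(['a', 'b'], strtoupper(...));
$maybe = map(new Maybe(2), fn(int $n): int => $n * 10);
```

Every new subject type means another branch in the same function (or a differently named function), which is the kind of thing type-qualified extension functions try to avoid.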

It's also been proposed to use +> as an operator for extension functions and/or 
first-param pipes like Elixir.  I'm not sure how I feel about that; my main 
concern is which one it would apply to, since as noted above full extension 
functions introduce a lot of extra considerations.

But I really don't want to hold up pipes on speculation about multiple future 
maybe-features.  As the RFC notes, there are a number of follow-ups, and I want 
to try to get at least some of them into the same release.

So, consider this me begging for voters to actually speak up on this issue and 
give feedback on a way forward, because right now I have no idea what to do 
with it.

--Larry Garfield
