Hi Larry

Sorry for the late response.

On Fri, Feb 7, 2025 at 5:58 AM Larry Garfield <la...@garfieldtech.com> wrote:
>
> https://wiki.php.net/rfc/pipe-operator-v3

We have already discussed this topic extensively off-list, so let me
bring the list up-to-date.

The current pipes proposal is elegantly simple. This has many upsides,
but it comes with an obvious limitation:
It only works well when the called function takes only a single argument.

$sourceCode |> lexer(...) |> parser(...) |> compiler(...) |> vm(...)

Such code is nice, but is also quite niche. I have argued off-list
that the predominant use-case for pipes is arrays and iterators
(including strings immediately split into chunks), and it seems most
agree. However, most array/iterator functions (e.g. filter, map,
reduce, first, all, etc.) don't fall into the one-parameter category.

A slightly simplified example from the RFC:

$result = "Hello World"
    |> str_split(...)
    |> fn($x) => array_map(strtoupper(...), $x)
    |> fn($x) => array_filter($x, fn($v) => $v != 'O');

IMO, this is harder to understand than the alternative of using
multiple statements with a temporary variable.

$tmp = "Hello World";
$tmp = str_split($tmp);
$tmp = array_map(strtoupper(...), $tmp);
$result = array_filter($tmp, fn($v) => $v != 'O');

The RFC has a solution for this: Partial function application [1].

$result = "Hello World"
    |> str_split(...)
    |> array_map(strtoupper(...), ?)
    |> array_filter(?, fn($v) => $v != 'O');

This still causes more cognitive overhead than it should, at least to me.

* The placement of ? is hard to detect, especially when it's not the
first argument.
* The user now has to think about immediately-invoked closures that
exist solely for argument-reordering. The optimizer can elide the
closure, but we cannot elide the additional cognitive overhead for
the user.
* The implementation of ? is significantly more complex than that of
pipes, making the supposed simplicity of pipes somewhat misleading.
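
To make the second point concrete, here is a minimal sketch of what
the chain above amounts to, written in today's PHP (8.1+ for the
first-class callable syntax). It assumes each `?` stage desugars into
a one-parameter closure that is invoked immediately with the previous
result. The names $stage1 to $stage3 are mine, purely for
illustration; the RFC does not prescribe this exact lowering.

```php
<?php
// Each pipe stage is a unary callable. Stages written with `?` are
// assumed to become closures equivalent to these.
$stage1 = str_split(...);
$stage2 = fn($x) => array_map(strtoupper(...), $x);
$stage3 = fn($x) => array_filter($x, fn($v) => $v != 'O');

// "Hello World" |> $stage1 |> $stage2 |> $stage3, spelled out as calls.
$result = $stage3($stage2($stage1("Hello World")));

// array_filter() preserves keys; implode() only looks at the values.
echo implode('', $result), "\n"; // prints "HELL WRLD"
```

Two of the three stages exist only to move the piped value into the
right argument position.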

If my assumption is correct that the primary use-case for pipes is
arrays, it might be worth investigating the possibility of introducing
a new iterator API (something that has been proposed before [2]) that
is optimized for pipes. Specifically, this API would ensure consistent
placement of the subject, i.e. the iterable in this case, as the first
argument. Pipes would no longer have the form of expr |> expr, where
the right-hand-side is expected to return a callable. Instead, they
would have the form of expr |> function_call, where the left-hand-side
is implicitly inserted as the first argument of the call.

namespace Iter {
    function map(iterable $iterable, \Closure $callback): \Iterator;
    function filter(iterable $iterable, \Closure $callback): \Iterator;
}

namespace {
    use function Iter\{map, filter};

    $result = "Hello World"
        |> str_split()
        |> map(strtoupper(...))
        |> filter(fn($v) => $v != 'O');
}

This is the same approach taken by Elixir [3]. It has a few benefits:

* We don't need to think about closures that are immediately invoked,
because there are none. The code is exactly the same as if you had
written it through nested function calls. This simplifies things
significantly for both the engine and the user.
* It closely resembles code that would be written in an
object-oriented manner, making it more familiar.
* It is the shortest and most readable of all the proposed options.
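
As a rough sketch of the first point: under this form, the chain
above would be equivalent to plain nested calls with the subject as
the first argument. The generator bodies below are my assumption of
how such helpers could look in userland; only the signatures come
from the example above. To keep the sketch in a single runnable file,
everything lives in the Iter namespace, and the built-in functions
resolve through PHP's global fallback.

```php
<?php
namespace Iter;

// Hypothetical subject-first helper: lazily applies $callback to
// every value, keeping the original keys.
function map(iterable $iterable, \Closure $callback): \Iterator {
    foreach ($iterable as $key => $value) {
        yield $key => $callback($value);
    }
}

// Hypothetical subject-first helper: lazily yields only the values
// for which $callback returns a truthy result.
function filter(iterable $iterable, \Closure $callback): \Iterator {
    foreach ($iterable as $key => $value) {
        if ($callback($value)) {
            yield $key => $value;
        }
    }
}

// Assumed expansion of:
//   "Hello World" |> str_split() |> map(strtoupper(...))
//                 |> filter(fn($v) => $v != 'O')
// No intermediate closures are created for the pipe itself.
$result = filter(
    map(str_split("Hello World"), strtoupper(...)),
    fn($v) => $v != 'O',
);

$chars = iterator_to_array($result);
echo implode('', $chars), "\n"; // prints "HELL WRLD"
```

Note that the result is a lazy iterator here, so it has to be
materialized (e.g. with iterator_to_array()) before printing.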

As with everything, there are downsides.

* It only works well for subject-first APIs. A not insignificant
number of existing functions do not follow this convention (e.g.
explode(), preg_match(), etc.). That said,
explode(' ', $s) |> filter($c1) |> map($c2) still composes well, given
that explode() is usually first in the chain, while preg_match() is
rarely chained at all.
* People have voiced concerns about potential confusion regarding the
right-hand-side. It may not be an arbitrary expression, but is
restricted to a function call. Hence, `$param |> $myClosure` is not
valid code, requiring additional parentheses: `$param |> $myClosure()`.
This approach resembles the -> operator, where at least conceptually,
the left-hand-side is implicitly passed as a $this parameter. However,
the spaces around |> do not signal this fact as well, making it look
like the right-hand-side is evaluated separately. Potentially, a
different symbol might work better.

Internal reactions to this idea were mixed, so I'm interested to hear
what the community thinks about it.

Ilija

[1] https://wiki.php.net/rfc/partial_function_application
[2] https://externals.io/message/118896
[3] https://elixirschool.com/en/lessons/basics/pipe_operator
