Re: [PHP-DEV] [RFC] Pipe Operator (again)

Larry Garfield Fri, 07 Feb 2025 13:05:44 -0800

Merging a few replies together here, since they overlap.  Also reordering a few 
of Tim's comments...

On Fri, Feb 7, 2025, at 7:32 AM, Tim Düsterhus wrote:
> Hi
>
> Am 2025-02-07 05:57, schrieb Larry Garfield:
>> It is now back with a better implementation (many thanks to Ilija for 
>> his help and guidance in that), and it's nowhere close to freeze, so 
>> here we go again:
>> 
>> https://wiki.php.net/rfc/pipe-operator-v3
>
> There's some editorial issues:
>
> 1. Status: Draft needs to be updated.
> 2. The RFC needs to be added to the overview page.
> 3. List formatting issues in “Future Scope” and “Patches and Tests”.
>
> Would also help having a closed voting widget in the “Proposed Voting 
> Choices” section to be crystal clear on what is being voted on (see 
> below the next quote).

I split pipes off from the Composition RFC late last night right before 
posting; I guess I missed a few things while doing so. :-/  Most notably, the 
Compose section is now removed from pipes, as it is not in scope for this RFC.  
(As noted, it's going to be more work so has its own RFC.)  Sorry for the 
confusion.  I think it should all be handled now.

> 5. The “References” (as in reference variables) section would do well 
> with an example of what doesn't work.

Example block added.

> 9. In the “Why in the engine?” section: The RFC makes a claim about 
> performance.
>
> Do you have any numbers?

Not currently.  The statements here are based on simply counting the number of 
function calls necessary, and PHP function calls are sadly non-cheap.  In 
previous benchmarks of my own libraries using my Crell/fp library, I did find 
that the number of function calls involved in some tight pipe operations was 
both a performance and debugging concern, but I don't have any hard numbers 
lying about at present to share.

If you think that's critical, please advise on how to best get meaningful 
numbers here.

Regarding the equivalency of pipes:

Tim Düsterhus wrote:
> 4. “That is, the following two code fragments are also exactly 
> equivalent:”.
>
> I do not believe this is true (specifically referring to the “exactly” 
> word in there), since the second code fragment does not have the short 
> closures, which likely results in an observable behavioral difference 
> when throwing Exceptions (in the stack trace) and also for debuggers. Or 
> is the implementation able to elide the extra closure? (Of course 
> there's also the difference between the temporary variable existing, 
> which would be observable for `get_defined_vars()` and possibly 
> destructors / object lifetimes).

Thomas Hruska wrote:
> The repeated assignment to $temp in your second example is _not_ 
> actually equal to the earlier example as you claim.  The second example 
> with all of the $temp variables should, IMO, just be:
>
> $temp = "Hello World";
> $result = array_filter(array_map('strtoupper', 
> str_split(htmlentities($temp))), fn($v) => $v != 'O');

Juris Evertovskis wrote:
> 3. Does the implementation actually turn `1 |> f(...) |> g(...)` into 
> `$π = f(1); g($π)`? Is `g(f(1))` not performanter? Or is the engine 
> clever enough with the var reuse anyways?

There's some subtlety here on these points.  The v2 RFC used the lexer to 
mutate $a |> $b |> $c into the same AST as $c($b($a)), which would then compile 
as though that had been written in the first place.  However, that made 
addressing references much harder, and there's an important caveat around order 
of operations. (See below.)  The v3 RFC instead uses a compile function to take 
the AST of $a |> $b |> $c and produce opcodes that are effectively equivalent 
to $t = $b($a); $t = $c($t);  I have not compared to see if they are the 
precise same opcodes, but the net effect is the same.  So "effectively 
equivalent" may be a more accurate statement.

In particular, Tim is correct that, technically, the short lambdas would be 
used as-is, so you'd end up with the equivalent of:

$temp = (fn($x) => array_map(strtoupper(...), $x))($temp);
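To make the observable difference concrete, here is a minimal sketch (not taken from the RFC or its tests) that counts stack frames with and without a wrapping short closure.  stackDepth() is a hypothetical probe invented for this illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical probe: reports how many frames are on the call stack
// at the moment it is called.
function stackDepth(mixed $value): int
{
    return count(debug_backtrace());
}

// Direct call, as in the hand-written temp-var version.
$direct = stackDepth('x');

// The same call routed through a wrapping short closure, which is what
// a pipe step with extra arguments needs today.
$wrapped = (fn($x) => stackDepth($x))('x');

// How many additional frames the wrapper introduced.
$extraFrames = $wrapped - $direct;
```

The wrapper shows up as one additional frame, which is what a stack trace or a debugger would also see.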

I'm not sure if there's a good way to automatically unwrap the closure there.  
(If someone knows of one, please share; I'm fine with including it.)  However, 
the intent is that it would be largely unnecessary in the future with a revised 
PFA implementation, which would obviate the need for the explicit wrapping 
closure.  You would instead write

$a |> array_map(strtoupper(...), ?);

Alternatively, one can use higher order user-space functions already.  In 
trivial cases:

function amap(Closure $fn): Closure {
  return fn(array $x) => array_map($fn, $x);
}

$a |> amap(strtoupper(...));

Which I am already using in Crell/fp and several libraries that leverage it, 
and it's quite ergonomic.

There's a whole bunch of such simple higher order functions here:
https://github.com/Crell/fp/blob/master/src/array.php
https://github.com/Crell/fp/blob/master/src/string.php
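As a rough sketch of how such helpers chain together (written for this message, not copied from Crell/fp): pipe() below is a hypothetical user-space stand-in for the operator so the example runs without it, and afilter() is an assumed companion to amap().

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the pipe operator: feeds $value through
// each callable in turn, left to right.
function pipe(mixed $value, callable ...$steps): mixed
{
    foreach ($steps as $step) {
        $value = $step($value);
    }
    return $value;
}

// Same shape as amap() above: returns a closure that maps $fn over an array.
function amap(Closure $fn): Closure
{
    return fn(array $x): array => array_map($fn, $x);
}

// Hypothetical companion: returns a closure that keeps only the
// elements $fn accepts (original keys are preserved).
function afilter(Closure $fn): Closure
{
    return fn(array $x): array => array_filter($x, $fn);
}

$result = pipe(
    'Hello World',
    str_split(...),
    amap(strtoupper(...)),
    afilter(fn(string $c): bool => $c !== 'O'),
);
```

Each step is a single-argument callable, so no wrapping short closure is needed at the call site.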

Which leads to the subtle difference between that and the v2 implementation, 
and why Thomas' statement is incorrect.  If the expression on the right side 
that produces a Closure has side effects (output, DB interaction, etc.), then 
the order in which those side effects happen may change with the different 
restructuring.  With all pure functions, that won't make a practical 
difference, and normally one should be using pure functions, but that's not 
something PHP can enforce.

I don't think there would be an appreciable performance difference between the 
two compiled versions, either way, but using the temp-var approach makes 
dealing with references easier, so it's what we're doing.
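A minimal sketch of that ordering difference, written without the operator so it runs on current PHP.  step() is a hypothetical helper that logs when a closure is produced and when it is run; the two forms below only approximate what v2 and v3 would compile to.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: logs when the closure is created ("make") and
// when it is invoked ("run"). The closure itself just adds one.
function step(string $name, ArrayObject $log): Closure
{
    $log->append("make $name");
    return function (int $x) use ($name, $log): int {
        $log->append("run $name");
        return $x + 1;
    };
}

// Nested form, roughly the v2 shape: g(f(1)) with f and g produced
// by expressions that have side effects.
$nested = new ArrayObject();
$r1 = step('g', $nested)(step('f', $nested)(1));

// Temp-var form, roughly the v3 shape: one step at a time.
$sequential = new ArrayObject();
$t = step('f', $sequential)(1);
$r2 = step('g', $sequential)($t);
```

Both forms compute the same value, but the nested form evaluates the expression that produces g before f runs, while the temp-var form produces each closure right before it is called.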

Juris Evertovskis wrote:
> 1. Do you think it would be hard to add some shorthand for `|> 
> $condition ? $callable : fn($😐) => $😐`?

I'm not sure I follow here.  Assuming you're talking about "branch in the next 
step", the standard way of doing that is with a higher order user-space 
function.  Something like:

function cond(bool $cond, Closure $t, Closure $f): Closure {
  return $cond ? $t : $f;
}

$a |> cond($config > 10, bigval(...), smallval(...)) |> otherstuff(...);

I think it's premature to try and bake that logic into the language, especially 
when I don't know of any other function-composition-having language that does 
so at the language level rather than the standard library level.  (There are a 
number of fun operations people build into pipelines, but they are all 
generally done in user space.)
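For the specific "apply this step or pass the value through" case, a user-space helper could look something like the sketch below.  when() is a hypothetical name, not something in the RFC or in Crell/fp.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns $fn when $condition holds, otherwise an
// identity closure that passes its input through unchanged.
function when(bool $condition, Closure $fn): Closure
{
    return $condition ? $fn : fn(mixed $x): mixed => $x;
}

// Called directly here so the sketch runs without the pipe operator.
$trimmed = when(true, trim(...))('  padded  ');
$untouched = when(false, trim(...))('  padded  ');
```

With the operator, the same helper would sit on the right-hand side of a pipe step like any other expression that produces a Closure.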

--Larry Garfield
