Hi folks.  We have discovered a subtle issue with the pipes implementation that 
needs to be addressed, although we're not entirely sure how.  Specifically, 
Derick found it while trying to ensure Xdebug plays nice with pipes.

The problem has to do with the operator precedence of short closures vs pipes.  
For example:

$result = $arr
  |> fn($x) => array_map(strtoupper(...), $x)
  |> fn($x) => array_filter($x, fn($v) => $v != 'O');
  
It's logical to assume that would be parsed as 

$result = $arr
  |> (fn($x) => array_map(strtoupper(...), $x))
  |> (fn($x) => array_filter($x, fn($v) => $v != 'O'));

Which then compiles into, essentially:

$temp = (fn($x) => array_map(strtoupper(...), $x))($arr);
$temp = (fn($x) => array_filter($x, fn($v) => $v != 'O'))($temp);
$result = $temp;

Or 

$result = (fn($x) => array_filter($x, fn($v) => $v != 'O'))((fn($x) => 
array_map(strtoupper(...), $x))($arr));

(depending on how you want to visualize it.)

That was the intent of the RFC.

However, because short closures are "greedy" and assume anything before the 
next semi-colon is part of the closure body, it's actually getting parsed like 
this:

$result = $arr
    |> fn($x) => (
        array_map(strtoupper(...), $x)
            |> fn($x) => (
                array_filter($x, fn($v) => $v != 'O')
            )
    )
;

Which would compile into something like:

$result = (fn($x) => (fn($x) => array_filter($x, fn($v) => $v != 
'O'))(array_map(strtoupper(...), $x)))($arr);

Which is not the intent.

Humorously, if all the functions and closures involved are pure, this parsing 
difference *usually* doesn't matter.  The result is still computed as intended. 
 That's why it wasn't caught during earlier reviews or by automated tests.  
However, there are cases where it matters.  For example:

42
    |> fn($x) => $x < 42
    |> fn($x) => var_dump($x);
    
One would expect that to evaluate to false in the first segment, then 
var_dump() false.  But it actually var_dump()s 42, because it gets interpreted 
as (42 |> fn($x) => var_dump($x)) first.

The incorrect wrapping also makes debugging vastly harder, even if the computed 
result is the same, as the expression is broken up "wrong" into multiple nested 
closures, stack traces are different than one would expect, etc.

The challenge is conflicting requirements.  Closures have extremely low 
precedence right now, specifically so that they will grab everything that comes 
after them as a single expression.  However, we also want pipes to allow a step 
to be a closure; that would typically mean making pipe bind even lower than 
closures, but that's not viable because it would result in 

$result = 'x' |> fn ($x) => strtoupper($x) 

being interpreted as

($result = 'x') |> (fn ($x) => strtoupper($x))

Which would be rather pointless.

So far, the best suggestion that's been put forward (though we've not tried 
implementing it yet) is to disallow a pipe inside a short-closure body, unless 
the body is surrounded by ().  So this:

fn($x) => $x 
  |> fn($x) => array_map(strtoupper(...), $x)
  |> fn($x) => array_filter($x, fn($v) => $v != 'O');
 
Today, that would run somewhat by accident, as the outer-most closure would 
claim everything after it.  With the new approach, it would be interpreted as 
passing `fn($x) => $x` as the argument to the first pipe segment, which would 
then be mapped over, which would fail.  You'd instead need to do this:

fn($x) => ($x 
  |> (fn($x) => array_map(strtoupper(...), $x))
  |> (fn($x) => array_filter($x, fn($v) => $v != 'O'))
);

Which is not wonderful, but it's not too bad, either.  That's probably the only 
case where pipes inside a short-closure body would be useful anyway.  And if 
PFA (https://wiki.php.net/rfc/partial_function_application_v2) and similar 
closure improvements pass, it will greatly reduce the need for mixing short 
closures and pipes together in either precedence, so it won't come up very 
often.

There are a few other operators that bind lower than pipe (see 
https://github.com/php/php-src/blob/fd8dfe1bfda62a3bd9dd1ff7c0577da75db02fcf/Zend/zend_language_parser.y#L56-L73),
 which would therefore need wrapping parentheses.  For many of them we do want 
them to be lower than pipe, so just moving pipe's priority down isn't viable. 
However, most of those are unlikely to be used inside a pipe segment, so are 
less likely to come up.  The most likely would be a bail-out exception:

$value = null;
$value
    |> fn ($x) => $x ?? throw new Exception('Value may not be null')
    |> fn ($x) => var_dump($x);

Which would currently be interpreted something like:

$c = function ($x) {
  $c = function ($x) {
    return var_dump($x);
  };
  return $x ?? throw $c(new Exception('Value may not be null'));
};
$c(null);

This would not throw the exception as expected, unless parentheses are added.  
It would var_dump() an exception and then try to throw the return of 
varl_dump(), which would fatal.

RM approval to address this during the 8.5 beta phase has been given, but we 
still want to have some discussion to make sure we have a good solution.

So, the question:

1. Does this seem like a good solution, or is there a problem we've not spotted 
yet?
2. Does anyone have a better solution to suggest?

Thanks to Derick, Ilija, and Tim for tracking down this annoying edge case.

-- 
  Larry Garfield
  la...@garfieldtech.com

Reply via email to