Re: [PHP-DEV] [RFC] Pipe Operator (again)

Larry Garfield Thu, 03 Apr 2025 21:49:42 -0700

On Thu, Apr 3, 2025, at 4:06 PM, Rowan Tommins [IMSoP] wrote:
> On 03/04/2025 18:06, Larry Garfield wrote:
>> So if we expect higher order functions to be common (and I would probably 
>> mainly use them myself), then it would be wise to figure out some way to 
>> make them more efficient.  Auto-first-arg is one way. 
>
> From this angle, auto-first-arg is a very limited compiler optimisation 
> for partial application.

I'd say it has the dual benefit of optimization and ergonomics.  (Though see 
discussion below.)

> With PFA and one-arg-callable pipes, you could add a parser rule that 
> matches this, with the same output:
>
> $foo |> bar(?, $baz);
>
> But you'd also be able to do this:
>
> $baz |> bar($foo, ?);
>
> And maybe the compiler could optimise that case too.

From what Arnaud has told me, any PFA that has a single, fixed-position-number 
argument remaining should be optimizable.  (Though that's a task for whenever 
PFA is next worked on, if it is next worked on.)
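
To make the two quoted cases concrete, here is a rough sketch of what each PFA 
expression would mean when written as an ordinary closure in today's PHP.  
bar() and the variable values are hypothetical stand-ins, and this shows only 
the semantics, not how the engine would compile or optimise them.

```php
<?php
declare(strict_types=1);

// Hypothetical two-argument function standing in for bar() from the thread.
function bar(string $subject, string $suffix): string
{
    return $subject . $suffix;
}

$foo = 'pipe';
$baz = '-operator';

// Meaning of `$foo |> bar(?, $baz)`: a unary closure with the first slot open.
$firstSlot = fn(string $x): string => bar($x, $baz);

// Meaning of `$baz |> bar($foo, ?)`: a unary closure with the second slot open.
$secondSlot = fn(string $x): string => bar($foo, $x);

// Both end up as the same direct call, bar($foo, $baz).
$viaFirst  = $firstSlot($foo);
$viaSecond = $secondSlot($baz);
```

In both cases exactly one argument is left open at a known position, which is 
the shape described above as optimizable: in principle the closure could be 
skipped and the call made directly.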

> Neither helps with the performance of higher order functions which are 
> doing more than partial application, like map and filter themselves. I 
> understand there's a high cost to context-switching between C and PHP; 
> presumably if there was an easy solution for that someone would have 
> done it already.

> On 03/04/2025 18:39, Ilija Tovilo wrote: 
>> To me, pipes improve readability when they behave like methods, i.e.
>> they perform some operation on a subject. This resembles Swift's
>> protocol extensions or Rust's trait default implementations, except
>> using a different "method" call operator. 
>> [...]
>> If we decide not to add an iterator API that works well with
>> first-arg, then I agree that this is not the right approach. But if we
>> do, then neither of your examples are problematic. 
>
>
> I guess those two things go together quite well as a mental model: 
> pipes as a way to implement extension methods, and new functions 
> designed for use as extension methods.
>
> I think I'd be more welcoming of it if we actually implemented 
> extension methods instead of pipes, and then the new iterator API was 
> extension-method-only. It feels less like "one of the arguments is 
> missing" if that argument is *always* expressed as the left-hand side 
> of an arrow of some sort.

As I've noted, classic pipes (current RFC, unary function only) and extension 
functions are not mutually exclusive, and I see no reason we couldn't add both.  
Auto-partialing first-arg pipes and dedicated extension functions step on each 
other's toes a bit more, however.

To address both this and Ilija's email, I was toying with extension functions 
as a concept a while back.  I also did extensive research into "collections" in 
other languages last year with Derick.  (See the discussion in a previous PHP 
Foundation report[1].)  That led me to a number of conclusions that I still 
hold:

* A new iterable API is absolutely a good thing and we should do it.
* That said, we *need* to split Sequence, Set, and Dictionary into separate 
types.  PHP is the only language I reviewed that doesn't have them as separate 
constructs with their own APIs.
* The use of the same construct (arrays and iterables) for all three types is a 
fundamental and core flaw in PHP's design that we should not double-down on.  
It's ergonomically awful, it's bad for performance, and it invites major 
security holes.  (The "Drupageddon" remote exploit was caused by using an array 
and assuming it was sequential when it was actually a map.)

So while I want a new iterable API, the more I think on it, the more I think a 
bunch of map(iterable $it, callable $fn) style functions would not be the right 
way to do it.  That would be easy, but also ineffective.

The behavior of even basic operations like map and filter is subtly different 
depending on which type you're dealing with.  Whether the input is lazy or not 
is the least of the concerns.  The bigger issue is when to pass keys to the 
$fn; probably always in Dict, probably never in Seq, and certainly never in Set 
(as there are no meaningful keys).  Similarly, when filtering a Dict, you would 
want keys preserved.  When filtering a Seq, you'd want the indexes re-zeroed.  
(Or to seem like it, give or take implementation details.)  And then, yes, 
there's the laziness question.
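
The key-handling difference is easy to see with today's array functions.  This 
is only an illustration using plain arrays (the sample data is made up); it is 
not a proposal for how a new API should look.

```php
<?php
declare(strict_types=1);

$isEven = fn(int $n): bool => $n % 2 === 0;

// Dictionary-style filter: the keys are meaningful, so preserving them is right.
$dict = ['a' => 1, 'b' => 2, 'c' => 3, 'd' => 4];
$dictFiltered = array_filter($dict, $isEven);   // ['b' => 2, 'd' => 4]

// Sequence-style filter: array_filter() also preserves keys here, which
// leaves gaps in the indexes, so the result is no longer a list.
$seq = [1, 2, 3, 4];
$seqRaw = array_filter($seq, $isEven);          // [1 => 2, 3 => 4]

// A sequence has to be re-indexed explicitly to stay a sequence.
$seqFiltered = array_values($seqRaw);           // [0 => 2, 1 => 4]
```

The same function call is correct for the Dict case and subtly wrong for the 
Seq case, and nothing in the array type tells you which one you are holding.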

So we'd effectively want three different versions of map(), filter(), etc. if 
we didn't want to perpetuate and further entrench the design flaw and security 
hole that is "sequences and hashes are the same thing if you squint."  And... 
frankly I'd probably vote against an iterable/collections API that didn't 
address that issue.

However, a simple "first arg" pipe wouldn't allow for that.  Or rather, we'd 
need to implement seqMap(iterable $it, callable $fn), setMap(iterable $it, 
callable $fn), and dictMap(iterable $it, callable $fn).  And the same split for 
filter, and probably a few other things.  That seems ergonomically suspect, at 
best, and still wouldn't really address the issue since you would have no way 
to ensure you're using the "right" version of each function. Similarly, a dict 
version of implode() would likely need to take 2 separators, whereas the other 
types would take only one.
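
As a sketch of the implode() point: seqImplode() and dictImplode() below are 
hypothetical names and signatures invented purely for illustration, written 
against plain arrays.

```php
<?php
declare(strict_types=1);

// Hypothetical sequence version: one separator, values only.
function seqImplode(string $separator, array $seq): string
{
    return implode($separator, $seq);
}

// Hypothetical dictionary version: one separator between pairs,
// and a second one between each key and its value.
function dictImplode(string $pairSeparator, string $keyValueSeparator, array $dict): string
{
    $pairs = [];
    foreach ($dict as $key => $value) {
        $pairs[] = $key . $keyValueSeparator . $value;
    }
    return implode($pairSeparator, $pairs);
}

$settings = ['a' => 1, 'b' => 2];

$asDict = dictImplode(', ', '=', $settings);   // "a=1, b=2"

// Nothing prevents passing the same dictionary to the sequence version;
// the keys are silently dropped.
$asSeq = seqImplode(', ', $settings);          // "1, 2"
```

The last call is the problem case: both functions accept any array, so picking 
the "right" one is left entirely to the caller.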

So the more I think on it, the more I think the sort of iterable API that 
first-arg pipes would make easy is... probably not the iterable API we want 
anyway.  There may well be other cases for Elixir-style first-arg pipes, but a 
new iterable API isn't one of them, at least not in this form.

Which brings us then to extension functions.  Pipes and higher order functions, 
or first-arg pipes, can act as a sort of "junior" extension functions, but for 
the reasons listed above fall short of being real extension functions.

For comparison, extension functions in Kotlin look like this:

fun SomeType.foo(a: Int) {
  // a is a variable. "this" is the SomeType the function was called on.
  // However, this is still "external" scope so only public members are usable.
}

val s = SomeType()
s.foo(5)

(Kotlin doesn't have a "new" keyword; the above is how you instantiate an 
object.)

Arguably, Go is entirely built as extension functions. It looks like this:

func (st SomeType) foo(a int) {
  // st and a are both variables here.  Do as you will.
}

Notably for us, the same function can be defined multiple times against 
different types.  That allows the system to differentiate between A.foo() and 
B.foo().  You can also attach extension functions to interfaces.  In fact, most 
of Kotlin's collections (list, set, map) API is implemented as extension 
functions on interfaces, of which they have many.

However, both Go and Kotlin are compiled languages, which means the compiler 
has a complete view of the code at compile time, and can sort out which 
extension function to use in a given situation statically.  That is, of course, 
not the case in PHP.

That means even if we figure out a way to define multiple foo() functions that 
apply to different types, and can agree that doing so is not evil (some have 
argued it's too close to function/method overloading, which they claim is evil; 
I disagree with both points), there is still a very non-trivial task of 
figuring out how to resolve the function to call at runtime, probably somehow 
leveraging autoloading, which also then runs us up against function 
autoloading, etc.  I hope that is a solvable problem, but I don't currently 
know how to solve it.
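
To show the shape of the runtime-resolution problem, here is a deliberately 
naive userland sketch.  ExtensionRegistry, Seq, and Dict are all hypothetical 
names made up for this example; nothing like this exists in the engine, and it 
sidesteps autoloading entirely by requiring everything to be registered up 
front.

```php
<?php
declare(strict_types=1);

// Hypothetical registry mapping (function name, type) to an implementation.
final class ExtensionRegistry
{
    /** @var array<string, array<string, callable>> */
    private array $functions = [];

    public function register(string $type, string $name, callable $fn): void
    {
        $this->functions[$name][$type] = $fn;
    }

    // Resolution happens on every call by checking the subject's runtime type.
    public function call(object $subject, string $name, mixed ...$args): mixed
    {
        foreach ($this->functions[$name] ?? [] as $type => $fn) {
            if ($subject instanceof $type) {
                return $fn($subject, ...$args);
            }
        }
        throw new BadFunctionCallException(
            sprintf('No extension function "%s" for %s', $name, get_debug_type($subject))
        );
    }
}

// Hypothetical collection types, for illustration only.
final class Seq  { public function __construct(public array $items) {} }
final class Dict { public function __construct(public array $items) {} }

$registry = new ExtensionRegistry();

// The same name, "filter", defined once per type with different semantics.
$registry->register(Seq::class, 'filter',
    fn(Seq $s, callable $fn): Seq => new Seq(array_values(array_filter($s->items, $fn))));
$registry->register(Dict::class, 'filter',
    fn(Dict $d, callable $fn): Dict => new Dict(array_filter($d->items, $fn)));

$isEven = fn(int $n): bool => $n % 2 === 0;

$seqResult  = $registry->call(new Seq([1, 2, 3, 4]), 'filter', $isEven);
$dictResult = $registry->call(new Dict(['a' => 1, 'b' => 2]), 'filter', $isEven);
```

Even this toy version pays for a type scan on every call and has no answer for 
how an unregistered filter() would be found and loaded on demand, which is 
where it runs into the function autoloading question.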

So "real" extension functions are an epic unto themselves, even though I really 
really want them.  (They are fantastically ergonomic for converting from one 
representation to another, like from an ORM entity to a minimal struct to 
serialize as JSON, and vice versa.  I quite miss them from Kotlin.)

It would be really nice if we could follow Kotlin's example and build 3 
different collection types (likely via objects), and then build most of the API 
for them in extension functions rather than as methods.  However, that sounds 
harder every time I dig into it.

As a side note to Yakov[2], a Uniform Function Call Syntax in PHP would have 
all the same problems as extension functions, even before we get into the issue 
that Rowan, Tim, and others have brought up that PHP is wildly inconsistent in 
having the "subject" first in a function call.  Without that, UFCS doesn't make 
much sense.  While I appreciate the elegance of it, in practice, figuring out 
extension functions as a dedicated syntax (akin to Kotlin or Go above) is 
probably the best we could do, if we can even do that.

All of which is to say... I think I may have talked myself back around to just 
using basic unary function pipes and "sucking it up" on the extra call for higher 
order functions for now, unless someone can show a fair number of non-iterable 
use cases where it would be helpful.  That then would unblock the other 
incremental improvements listed in the RFC (compose, PFA, and $$->foo()).  True 
extension functions could then be explored later (likely by people with way 
more engine knowledge than me) as their own thing, whether using ->, +>, or 
something else entirely.  We just need to agree that the existence of pipes 
does not render extension functions moot.
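
For reference, this is roughly what the "extra call" looks like.  The sketch 
runs on current PHP by using a hypothetical pipe() helper as a stand-in for 
the proposed operator; map() and filter() are likewise hypothetical userland 
helpers, not a proposed API.

```php
<?php
declare(strict_types=1);

// Hypothetical higher-order helpers: each call returns a unary closure.
// That returned closure is the "extra call" paid at every pipe stage.
function map(callable $fn): Closure
{
    return fn(array $items): array => array_map($fn, $items);
}

function filter(callable $fn): Closure
{
    return fn(array $items): array => array_values(array_filter($items, $fn));
}

// Stand-in for `$value |> $f1 |> $f2 ...`: pass the value through each
// unary callable in order.
function pipe(mixed $value, callable ...$stages): mixed
{
    foreach ($stages as $stage) {
        $value = $stage($value);
    }
    return $value;
}

$result = pipe(
    [1, 2, 3, 4, 5],
    filter(fn(int $n): bool => $n % 2 === 1),   // keep 1, 3, 5
    map(fn(int $n): int => $n * 10),            // 10, 30, 50
    fn(array $items): int => array_sum($items), // 90
);
```

Every stage is a plain unary callable, so nothing here depends on first-arg 
conventions; the cost is one additional closure invocation per higher-order 
stage.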

Thoughts?

--Larry Garfield

[1] https://thephp.foundation/blog/2024/08/19/state-of-generics-and-collections/
[2] https://externals.io/message/127037
