Hi Máté and all,

> On Mar 25, 2025, at 03:45, Máté Kocsis <kocsismat...@gmail.com> wrote:
Regarding Rowbot slowness compared to the RFC:

> I can only assume that the excessive usage of objects makes the library much
> slower than what's possible
> even for a userland library (obviously, an internal C implementation will
> always be faster). According to my results, the RFC's implementation was
> **two orders of magnitude** faster than the Rowbot library for parsing a very
> basic "https://example.com" URL 1000 times (~0.002 sec vs ~0.56 sec).

I would not presume that the dedicated value objects are what "makes the
[Rowbot] library much slower" than the RFC -- instead, my first intuition is
that the *parsing* operations are slower in userland than in C, and are
primarily responsible for the comparative slowness. Speedwise, creation of
multiple objects from the parsed results would be a rounding error compared to
the parsing itself.

> What I want to say with this is that it's perfectly fine to optimize a
> userland library for ergonomics and for the usage of advanced OOP in mind,
> but an internal
> implementation should also keep efficiency in mind besides developer
> experience. That's why I don't see myself implement separate objects for some
> of
> the components for now. But nothing would block us from doing it later, if we
> found out it's necessary.

I think that's fair. The main thing that stands out to me is not the Scheme,
Host, etc. value objects, but that the RFC presents no UrlRecord -- which is
very definitely part of the WHATWG-URL specification. That is, from reading the
spec, I'd expect to see a UrlRecord, and a Url composed from it.

> I believe the most fundamental difference between the Rowbot library and the
> RFC is that the RFC has native support for percent-decoding (because
> most properties are accessible in 2 variants), while the library completely
> leaves this task for the user.

I have some thoughts on that, but I'll save them for later.
Meanwhile, AFAICT, neither Rowbot nor the RFC provides a percent *en*coding
mechanism for consumers to put together properly-encoded values. Have I missed
it in the RFC, or is it somehow not necessary, or something else?

> This RFC is a synthesis of almost a year of discussion and refinement,
> collaborated by some very clever folks, who have a lot of hands-on experience
> of
> URL parsing and handling.

I would not presume otherwise! Even so, everyone makes mistakes and oversights
from time to time, including very clever folks of the kind you describe above.

> That's why I would say that input from Trevor Rowbotham is also welcome in
> the discussion (especially his experience of some edge cases he had to deal
> with)

I agree -- it would be great for the RFC team to seek him out and invite him to
comment in this thread.

> but the said library is nowhere near as widely adopted for it to qualify as
> something we must definitely take into consideration
> when designing PHP's new URL parsing API.

Not to be too blunt, but the Rowbot library is far more widely adopted than the
RFC currently is; I think Rowbot represents an intersection of theory and
practice that one would be unwise to discard without intentional and extensive
consideration.

>> A URLSearchParams class:
>
> I like this concept too. And in fact, support for such a class is on my to-do
> list, and is mentioned in the "Future Scope".

Because it is part of the WHATWG-URL spec, I think it deserves first-class
treatment in this RFC ...

> I just didn't want to make the RFC even longer, because we already have a lot
> of details to discuss.

... but yeah, the sheer volume of the RFC makes it difficult to review and pick
apart.

Which leads to my last point: I would really like to see at least two separate
RFCs here.
They would be a lot easier to review and critique that way:

- one for dealing with URIs as they exist now, especially one that honors the
  ways-of-working that exist in userland; and,

- one for dealing with WHATWG-URL in its entirety, with all its differences
  (some subtle, some not) from URIs.

I can see arguments for either one being the "base" on which the other would
build.

-- pmj