Hi Andreas,

2017-08-01 6:57 GMT+02:00 Andreas Hennings <andr...@dqxtech.net>:

> Hello list,
> a quite common use case is that one needs to find out if a string
> $haystack begins or ends with another string $needle.
> Or in other words, if $needle is a prefix or a suffix of $haystack.
>
> One prominent example would be in PSR-4 or PSR-0 class loaders.
> Maybe the use case also occurs when writing parsers..
> In each of these two examples (parsers, class loaders), we care about
> performance.
>
> (forgive me if this was discussed before, I did not find it anywhere
> in the archives)
>
> --------------------------
>
> Existing solutions to this problem feel non-trivial, and/or are
> suboptimal in performance.
> https://stackoverflow.com/questions/2790899/how-to-check-if-a-string-starts-with-a-specified-string
> https://stackoverflow.com/questions/834303/startswith-and-endswith-functions-in-php
> This answer compares different solutions,
> https://stackoverflow.com/a/7168986/246724
>
> Existing solutions:
> (Let's focus on string_starts_with(), the other case is mostly
> equivalent / symmetric)
>
> if (0 === strpos($haystack, $needle)) {..}
> I have often seen this presented as the preferable solution.
> Unfortunately, this searches the entire string, not just the
> beginning. Especially if $haystack is really long, this can be a
> waste.
> E.g. if (0 === strpos(file_get_contents('some_source_file.php'),
> '<?php')) {..} will search the entire file for an occurrence of
> '<?php'.
>
> if ($needle === substr($haystack, 0, strlen($needle))) {..}
> This reserves new memory for the substring, which later needs to be
> garbage-collected.
> Also, this requires an additional function call to strlen() - which
> adds even more clutter if $needle is an expression, not just a
> variable.
>
> if (0 === strncmp($haystack, $needle, strlen($needle))) {..}
> Needs the additional call to strlen().
> Otherwise, this seems like a really good solution.
>
> if ('' === $needle || false !== strrpos($haystack, $needle,
> -strlen($haystack))) {..}
> This is the funky solution from https://stackoverflow.com/a/10473026/246724
> The author says that it will be outperformed by strncmp() - so..
>
> if (preg_match('/^' . preg_quote($needle, '/') . '/', $haystack)) {..}
> Clearly gonna be slower than other options.
>
> As said, all these solutions do work, but they are either suboptimal,
> or they add clutter and overhead, or feel a bit like mind acrobatics.
>
> -----------------
>
> So, I wonder if it would be worthwhile to add new functions
> string_starts_with() / string_has_prefix(), and string_ends_with() /
> string_has_suffix().
>
> (Or maybe change strncmp(), so that the 3rd parameter $len is
> optional. If $len is NULL / not provided, it would use the length of
> the second (or first?) string.
> (idea was that second parameter = needle).)
>
> For me personally, I am sure that I would use a new
> string_starts_with() a lot more often than a lot of the other existing
> string functions.
> I don't think it is an exotic or niche use case.
>
> --------------
>
> Spinning this further:
> A lot of times if I want to check if $haystack begins with $needle, I
> will then need the rest of the string after $needle.
> So
> if (string_starts_with($haystack, $needle)) {
>   $suffix = substr($haystack, strlen($needle));
> }
> or
> if (string_ends_with($filename, '.php')) {
>   $basename = substr($filename, 0, -4);
> }
>
> I wonder if this could be somehow combined.
> E.g.
> if (FALSE !== $basename = string_clip_suffix($filename, '.php')) {
>   // Do something with $basename.
> }
>
> ------------------
>
> One flaw of these new functions would be that they are less versatile
> than other string functions.
> They solve this problem, and nothing else.
> On the other hand, this is the point, to avoid unnecessary overhead.
>
> The other problem would be, of course, "feature creep" aka "we have so
> many string functions already".
> This is a matter of opinion.
> I would imagine the "cost" of new native functions is:
> - global namespace pollution
> - increased mental load to learn and remember all of them
> - higher memory footprint of php engine?
> - more C code to maintain
> - a new doc page.
> Did I miss something?
>
> ------------------
>
> -- Andreas
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>

This idea was discussed 11 months ago:
https://externals.io/message/94787

There is also a proper RFC:
https://wiki.php.net/rfc/add_str_begin_and_end_functions

You might want to contact Will to get feedback on the idea.

--
regards / pozdrawiam,
--
Michał Brzuchalski
about.me/brzuchal
brzuchalski.com
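
For reference, the checks discussed in the quoted mail can be approximated in userland today. The following is only a rough sketch, assuming PHP 7.0+ for the scalar type declarations. The names string_starts_with(), string_ends_with() and string_clip_suffix() are taken from the quoted proposal; they are not existing PHP functions, and the RFC may well settle on different names and semantics. The prefix check uses the strncmp() variant from the quoted list; the suffix check uses substr_compare() with a negative offset, which is my own choice here and not something the thread prescribes.

```php
<?php
// Hypothetical userland helpers; names follow the quoted proposal,
// they are NOT native PHP functions.

function string_starts_with(string $haystack, string $needle): bool
{
    // strncmp() compares at most strlen($needle) bytes, so it does not
    // scan $haystack beyond the prefix region.
    return 0 === strncmp($haystack, $needle, strlen($needle));
}

function string_ends_with(string $haystack, string $needle): bool
{
    $length = strlen($needle);
    if (0 === $length) {
        return true; // treat the empty string as a suffix of any string
    }
    if ($length > strlen($haystack)) {
        return false; // a longer needle can never be a suffix
    }
    // Negative offset: compare only the last $length bytes of $haystack.
    return 0 === substr_compare($haystack, $needle, -$length, $length);
}

/**
 * Returns $haystack without the trailing $needle, or false if $haystack
 * does not end with $needle.
 *
 * @return string|false
 */
function string_clip_suffix(string $haystack, string $needle)
{
    if (!string_ends_with($haystack, $needle)) {
        return false;
    }
    // Explicit length instead of a negative one, so an empty $needle
    // leaves $haystack unchanged.
    return substr($haystack, 0, strlen($haystack) - strlen($needle));
}
```

With these helpers, the combined check from the quoted mail reads `if (false !== $basename = string_clip_suffix($filename, '.php')) {..}`. Native implementations could avoid the extra strlen() calls, which is part of what the proposal is after.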