Re: [PHP-DEV] Adding `final class Deque` to PHP

2021-09-20 Thread Mike Schinkel
> On Sep 19, 2021, at 8:03 PM, tyson andre wrote:
> 
> Hi internals,
> 
> I've created a new RFC https://wiki.php.net/rfc/deque to add a `final class Deque`
> 
> This is based on the `Teds\Deque` implementation I've worked on
> for the https://github.com/TysonAndre/pecl-teds PECL.

With one caveat, this is a much stronger RFC than the Vector one.  Good job!

However...

> On Sep 20, 2021, at 4:25 PM, Rowan Tommins wrote:
> 
> On 20/09/2021 14:46, tyson andre wrote:
>> The choice of global namespace maintains consistency with the namespace used 
>> for general-purpose collections already in the SPL
> 
> I find this argument unconvincing. If the intention is for this to fit with 
> existing classes in the SPL, it should be called "SplDeque", or more 
> consistently "SplDoubleEndedQueue", and the RFC should talk about how the 
> design aligns with those existing classes.
> 
> If it is intended to be the first of a new set of data structures which are 
> *not* aligned with the existing SPL types, then putting it in a new namespace 
> would make most sense.
> 
> In the RFC and the list you've mentioned a few comparisons, but I don't think 
> any of them hold:
> 
> * ArrayObject, WeakReference, and WeakMap are all classes for binding to 
> specific engine behaviour, not generic data structures
> * Iterators all have an "Iterator" suffix (leading to some quite awkward 
> names)
> * Reflection classes all have a "Reflection" prefix
> * Having both "Queue" and "SplQueue", or both "Stack" and "SplStack" would be 
> a terrible idea, and is a pretty strong argument *not* to add data structures 
> with such plain names

I am in complete agreement with Rowan.  

Honestly, at first I confused `Deque` with `Dequeue` and was wondering why we 
would name a class with a verb. It wasn't until Rowan's comment that I 
realized `Deque` is an abbreviation.  

Which raises the question: how many other PHP developers will know computer 
science terms like this well enough to know `Deque` is a noun when they see it, 
and, more importantly, how many PHP developers will think to search for `Deque` 
when they need a queue?

So here is a straw man proposal: name the class one of:

- DataStruct\DoubleEndedQueue, or
- DataStruct\DE_Queue

Let the bike-shedding begin! 

(Or is that "Let it continue?")

-Mike

P.S. BTW re: https://github.com/TysonAndre/pecl-teds, who is Ted? (pun intended)

[PHP-DEV] (Planned) Straw poll: Naming pattern for `*Deque`

2021-09-20 Thread tyson andre
Hi internals,

Because the naming choice for new datastructures is a question that has been 
asked many times,
I plan to create another straw poll (single transferable vote) on wiki.php.net 
to gather feedback on the naming pattern to use for future additions of 
datastructures to the SPL,
with the arguments for and against each naming pattern.

https://wiki.php.net/rfc/namespaces_in_bundled_extensions recently passed.
It permits using the same namespace that is already used in an extension,
but offers guidance in choosing namespace names and allows for using namespaces 
in new categories of functionality.

The planned options are:

1. `\Deque`, the name currently used in the RFC/implementation. See 
https://wiki.php.net/rfc/deque#global_namespace

   This was my preference because it was short, making it easy to remember and 
convenient to use.
2. `\SplDeque`, similar to datastructures added to the `Spl` in PHP 5.3.

   (I don't prefer that name because `SplDoublyLinkedList`, `SplStack`, and 
`SplQueue` are subclasses of a doubly linked list with poor performance,
   and this name would easily get confused with them. Also, historically, none 
of the functionality with that naming pattern has been final.
   However, good documentation (e.g. suggesting `*Deque` instead where possible 
in the manual) would make that less of an issue.)

   See https://wiki.php.net/rfc/deque#lack_of_name_prefix (and arguments for 
https://externals.io/message/116100#116111)
3. `\Collection\Deque` - the singular form is proposed because this namespace 
   might grow long-term to contain not just collections,
   but also functionality related to collections (e.g. helper classes for 
   building classes, such as an `ImmutableSequenceBuilder` for building an 
   `ImmutableSequence`, global functions, traits/interfaces,
   collections of static methods, etc.),
   especially since https://wiki.php.net/rfc/namespaces_in_bundled_extensions 
   prevents more than one level of namespaces.

   Additionally, all existing extension names in php-src are singular, not 
   plural: https://github.com/php/php-src/tree/master/ext 
   (except for `sockets`, but that defines `socket_*` and `class Socket`, and 
   I'd assume it would be named `Socket\` anyway; the RFC didn't say the names 
   have to match exactly, I think).

   So the namespace's contents might not just be `Collections`, but rather all 
   functionality related to a `Collection`.
   Also, the examples in the "namespaces in bundled extensions" RFC were all 
   singular:

   > For example, the `array_is_list()` function added in PHP 8.1 should indeed 
be called `array_is_list()`
   > and should not be introduced as `Array\is_list()` or similar.
   > Unless and until existing `array_*()` functions are aliased under an 
Array\* namespace,
   > new additions should continue to be of the form `array_*()` to maintain 
horizontal consistency.

   See https://wiki.php.net/rfc/deque#global_namespace (and 
https://externals.io/message/116100#116111)

   Also, while straw polls for other categories of functionality 
   (https://wiki.php.net/rfc/cachediterable_straw_poll#namespace_choices) 
   had shown that around half of voters were interested in adopting namespaces,
   there was disagreement about the best namespace to use (e.g. none were 
   preferred to the global namespace),
   making me hesitant to propose namespaces in any RFC. For an ordinary 
   collection datastructure, the situation may be different.

While there is considerable division over whether or not members of internals 
want to adopt namespaces,
I hope that the final outcome of the poll will be accepted by members of 
internals 
as representing what the majority of the members of internals 
(from diverse backgrounds such as contributors/leaders of userland 
applications/frameworks/composer libraries written in PHP,
documentation contributors, PECL authors, php-src maintainers, etc., all of 
whom I expect are also end users of PHP)
want to use as a naming choice in future datastructure additions to PHP
(and I hope there is a clear majority).

-

Are there any other suggestions to consider for namespaces to add to the straw 
poll?

Several suggestions that have been brought up in the past are forbidden by the 
accepted policy RFC (https://wiki.php.net/rfc/namespaces_in_bundled_extensions)
and can't be used in an RFC.

- `Spl\`, `Core\`, and `Standard\` are forbidden: "Because these extensions 
combine a lot of unrelated or only tangentially related functionality, symbols 
should not be namespaced under the `Core`, `Standard` or `Spl` namespaces.
  Instead, these extensions should be considered as a collection of different 
components, and should be namespaced according to these."
- More than one namespace component (`A\B\`) is forbidden
- Namespace names should follow CamelCase.

Thanks,
Tyson

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] Adding `final class Deque` to PHP

2021-09-20 Thread Rowan Tommins

On 20/09/2021 14:46, tyson andre wrote:
> The choice of global namespace maintains consistency with the namespace used 
> for general-purpose collections already in the SPL



I find this argument unconvincing. If the intention is for this to fit 
with existing classes in the SPL, it should be called "SplDeque", or 
more consistently "SplDoubleEndedQueue", and the RFC should talk about 
how the design aligns with those existing classes.


If it is intended to be the first of a new set of data structures which 
are *not* aligned with the existing SPL types, then putting it in a new 
namespace would make most sense.


In the RFC and the list you've mentioned a few comparisons, but I don't 
think any of them hold:


* ArrayObject, WeakReference, and WeakMap are all classes for binding to 
specific engine behaviour, not generic data structures
* Iterators all have an "Iterator" suffix (leading to some quite awkward 
names)
* Reflection classes all have a "Reflection" prefix
* Having both "Queue" and "SplQueue", or both "Stack" and "SplStack" 
would be a terrible idea, and is a pretty strong argument *not* to add 
data structures with such plain names



Regards,

--
Rowan Tommins
[IMSoP]




Re: [PHP-DEV] Adding `final class Deque` to PHP

2021-09-20 Thread Pierre

On 20/09/2021 at 15:46, tyson andre wrote:
> Hi Pierre,
> I'm not certain what you mean by "normalize".
> https://www.merriam-webster.com/dictionary/normalize mentions


Please at least take this seriously; I think you understood what I 
meant. I'm in no position to argue about the technical details of how it 
should be implemented, because the C language is not a place where I 
shine, and I trust people like you to do it best.


Nevertheless, my only point was: please put all data structures 
together, and please just don't throw them in the global namespace. 
Otherwise, as we say in French, "ça va être un sacré bordel là dedans" 
(actually, it already is "un sacré bordel").


I'll let someone better at English than me translate; I'm not a native 
English speaker and I could get it wrong.



> If you also mean all datastructure RFCs should be combined into a single RFC,
> I'd considered combining the Vector RFC with https://wiki.php.net/rfc/deque,
> but decided against combining the RFCs in this instance, because of:


No, not necessarily. They don't need to be in the same RFC; having one 
per data structure is probably the way to go, since you'll maximize the 
chances that each one passes the vote. Nevertheless, many RFCs exist, and 
if there are many others to come, no matter in which order and at which 
pace they happen, they can still be seen as a "whole", and a 
namespace is, in my opinion, still a good idea.


I won't debate the rest, because you are much more involved and 
technically competent than I am. As someone stupid, I need 
things to be well organized to find them easily; that's the only 
constructive argument I have to bring to this discussion.


Regards,

--

Pierre




Re: [PHP-DEV] RFC: Add `final class Vector` to PHP

2021-09-20 Thread tyson andre
Hi Peter Bowyer,

> That is a fair point. Vector is an overloaded and common word. For me a
> vector will always default to an entity characterized by a magnitude and a
> direction, because that's what I learned and used for years. The next
> definition I learned was the Numpy one.
> 
> That for me is the sticking point if this Vector allows mixed types which
> include arrays or vectors. Store them inside a Vector and then you end up
> with a matrix, a tensor and so on in something identified as a Vector,
> which is nonsense. Yes C++ does that [1]. Yes with generics it sort-of
> makes sense. Numpy gets round it by calling the type `ndarray` and a vector
> is a specialised one-dimensional array.
> 
> If it's a high-performance array and that's the goal, call it hparray. Call
> it a tuple. Call it a dictionary.

- `hparray`: I think putting high performance in any class name in core is a 
mistake,
  and generally poor naming choice, and will mislead users now or in the future.
  (unless it is literally an API client for a database or server that includes 
high performance in the server software's name)

  Benchmarks currently show it using less memory but some more time than 
`array`,
  and those benchmarks will change as opcache's internals or PHP's 
representation 
  of `object`s or `array`s change.

  Which choice of data structure is highest performance would depend on the 
benchmark or needs of the application/library.
- `tuple`: In mathematics, most references to tuples that I've seen are to 
  tuples of a fixed size (n-tuples). In programming, Python, C++, and various 
  other languages
  use tuple to refer to a fixed-size (and, in Python, immutable) data structure,
  making this naming choice extremely confusing.
  https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences
  https://en.cppreference.com/w/cpp/utility/tuple

  > (In C++) Class template std::tuple is a fixed-size collection of heterogeneous values.
- `dictionary` - Wikipedia refers to this as an associative array 
https://en.wikipedia.org/wiki/Associative_array
  which is the exact opposite of what my Vector RFC is proposing.
 
So I don't consider any of those proposed names appropriate alternatives, 
and expect much, much stronger opposition to an RFC using that naming choice 
for this functionality.

I expect opposition to any naming choice I propose; `Vector` is what I expect 
to have the least opposition.

Thanks,
Tyson




Re: [PHP-DEV] Adding `final class Deque` to PHP

2021-09-20 Thread tyson andre
Hi Pierre,

> It seems that you are writing more than one RFC to add many data 
> structures. I love that you're doing that, but I suggest that you'd 
> normalize them all

I'm not certain what you mean by "normalize".
https://www.merriam-webster.com/dictionary/normalize mentions

1. "to make conform to or reduce to a norm or standard"
2. 
https://www.freetext.org/Introduction_to_Linear_Algebra/Basic_Vector_Operations/Normalization/
   (no pun intended)
3. "to bring or restore to a normal condition"

If you mean to make Vector's and Deque's APIs consistent with each other,
I plan to make changes to Vector (e.g. remove $preserveKeys, add isEmpty), but 
the Vector RFC is currently on hold.

If you also mean all datastructure RFCs should be combined into a single RFC,
I'd considered combining the Vector RFC with https://wiki.php.net/rfc/deque,
but decided against combining the RFCs in this instance, because of:

1. Current discussion about whether or not to choose an alternate name for a 
`Vector`
2. The fact that `Deque` has much better performance for various queue workloads
   on both time and memory usage than `array`
   (and significantly better performance than `SplDoublyLinkedList`).
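As a rough illustration of what a queue workload looks like (a sketch only, not taken from the RFC's benchmarks), the following compares a plain `array` drained with `array_shift()` against the existing `SplQueue`. The proposed `Deque` is deliberately not used, since it is not assumed to be available; the function names `drainArrayQueue` and `drainSplQueue` and the size `$n` are made up for this example, and the timings will vary by machine and PHP version.

```php
<?php
// FIFO workload on a plain array: push to the back, remove from the front.
function drainArrayQueue(int $n): int {
    $queue = [];
    for ($i = 0; $i < $n; $i++) {
        $queue[] = $i;
    }
    $sum = 0;
    while ($queue !== []) {
        // array_shift() renumbers the remaining integer keys on every call.
        $sum += array_shift($queue);
    }
    return $sum;
}

// The same workload on SplQueue (a subclass of SplDoublyLinkedList).
function drainSplQueue(int $n): int {
    $queue = new SplQueue();
    for ($i = 0; $i < $n; $i++) {
        $queue->enqueue($i);
    }
    $sum = 0;
    while (!$queue->isEmpty()) {
        $sum += $queue->dequeue();
    }
    return $sum;
}

$n = 5000; // arbitrary size chosen for the illustration
foreach (['drainArrayQueue', 'drainSplQueue'] as $fn) {
    $start = hrtime(true);
    $sum = $fn($n);
    $ms = (hrtime(true) - $start) / 1e6;
    printf("%-16s sum=%d  %.2f ms\n", $fn, $sum, $ms);
}
```

Both functions return the same sum; only the time and memory used to get there differ, which is the kind of comparison the RFC's benchmarks make against `Deque`.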

Still, I may consider the approach for future RFCs, given that

1. Many developers in internals have expressed a desire for having a 
significantly 
   larger data structure library in core along the lines of what php-ds 
provides,
   but may be uninterested in some of the individual datastructures or design 
choices.

   E.g. if 60% of developers were in favor of a sorted set and its proposed 
   API/name (along the lines of https://cplusplus.com/reference/set/set/),
   and 60% were in favor of an immutable sequence of values and its proposed 
   API/name (similar to 
   https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences),
   then with the 2/3 voting threshold,
   neither of those RFCs would pass but a proposal combining those two could 
   pass,
   despite ~95% of developers wanting some type of improved datastructures 
   added to core in general (I would guess).
2. This would allow seeing how datastructures compare to each other.

Combining RFCs has the drawback of significantly increasing the implementation, 
discussion, review,
delays, and time involvement for the volunteer RFC authors and voters,
and may lead to a larger number of last-minute concerns raised after voting has 
started when more time 
is spent trying out the new code and looking at the RFC.

> and place all new classes in a single new dedicated 
> namespace.

My rationale for deciding against a dedicated namespace is in 
https://wiki.php.net/rfc/deque#global_namespace
which I've recently expanded on.

The `Deque` proposal is normalized with respect to the namespace choice of data 
structures that already exist.

The choice of global namespace maintains consistency with the namespace used 
for general-purpose collections already in the SPL 
(as well as relatively recent additions such as `WeakReference` (PHP 7.4) and 
`WeakMap` (PHP 8.0)).
Other recent additions to PHP such as `ReflectionIntersectionType` in PHP 8.1 
have 
also continued to use the global namespace when adding classes with 
functionality related to other classes.

Additionally, prior polls for namespacing choices of other datastructure 
functionality showed preferences 
for namespacing and not namespacing were evenly split in a straw poll for a new 
iterable type
(https://wiki.php.net/rfc/cachediterable_straw_poll#namespace_choices)

Introducing a new namespace for data structures would also raise the question 
of whether existing datastructures 
should be moved to that new namespace (for consistency), and that process would:

1. Raise the amount of work needed for end users or 
library/framework/application authors to migrate to new PHP versions.
2. Cause confusion and inconvenience for years about which namespace can or 
should be used in an application 
   (`SplObjectStorage` vs `Xyz\SplObjectStorage`), especially for 
developers working on projects supporting different php version ranges.
3. Prevent applications/libraries from easily supporting as wide of a range of 
php versions as they otherwise could.
4. Cause serialization/unserialization issues when migrating to different php 
versions,
   if the old or new class name in the serialized data did not exist in the 
other php version and was not aliased.
   For example, an older PHP version could not `unserialize()` 
`Xyz\SplObjectStorage` 
   and would silently create a `__PHP_Incomplete_Class` 
   (see 
https://www.php.net/manual/en/language.oop5.serialization.php#language.oop5.serialization)
   without any warnings or notices.
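
The serialization point can be reproduced with a hand-built payload. This is a minimal sketch, assuming default settings (no `unserialize_callback_func` and no autoloader that would define the class); `Xyz\SplObjectStorage` is the hypothetical name from the example above and does not exist in any PHP release.

```php
<?php
// Hypothetical namespaced name; no such class is defined here.
$className = 'Xyz\SplObjectStorage';

// Hand-built serialized payload for an object of that class with no properties.
$payload = sprintf('O:%d:"%s":0:{}', strlen($className), $className);

$value = unserialize($payload);

// The class is unknown, so PHP falls back to a placeholder object
// instead of failing.
var_dump(class_exists($className, false));           // false
var_dump($value instanceof __PHP_Incomplete_Class);  // true
var_dump(get_class($value));                         // "__PHP_Incomplete_Class"
```

Code that later calls methods on `$value` fails far away from the `unserialize()` call, which is what makes this kind of migration problem awkward to track down.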

Thanks,
Tyson




Re: [PHP-DEV] Proposal: Adding an ARRAY_FILTER_REINDEX flag to array_values

2021-09-20 Thread Guilliam Xavier
On Sun, Sep 19, 2021 at 3:11 PM tyson andre wrote:

> Hi internals,
>
> Currently, array_filter will always return the original keys.
> This often requires an additional wrapping call of
> array_values(array_filter(...)) to reindex the keys and return a list.
> (or applications may not realize there will be gaps in the keys until it
> causes a bug or unexpected JSON encoding, etc.)
>

Hi,

This "issue" is not limited to array_filter(); there's also array_unique()
and even array_diff()/array_intersect(), whose variadic signatures don't
allow adding a flag (I think?)...

PS: and array_column() has the opposite issue...
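
A small sketch of the behaviours mentioned above, using made-up sample data. The reading of "the opposite issue" for `array_column()` (that it discards the outer keys) is an assumption about what was meant.

```php
<?php
$input = ['a', 'b', 'a', 'c'];

// array_unique() keeps the key of the first occurrence of each value.
var_dump(array_keys(array_unique($input)));              // keys 0, 1, 3

// array_diff() and array_intersect() keep the keys of the first array.
var_dump(array_keys(array_diff($input, ['a'])));         // keys 1, 3
var_dump(array_keys(array_intersect($input, ['c'])));    // key 3

// Wrapping the call in array_values() produces a list again.
var_dump(array_values(array_diff($input, ['a'])));       // ['b', 'c']

// array_column() goes the other way: the outer keys are dropped
// unless an index key is passed as the third argument.
$rows = [
    'x' => ['id' => 7, 'name' => 'foo'],
    'y' => ['id' => 9, 'name' => 'bar'],
];
var_dump(array_column($rows, 'name'));                   // keys 0, 1
var_dump(array_column($rows, 'name', 'id'));             // keys 7, 9
```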

-- 
Guilliam Xavier


Re: [PHP-DEV] Proposal: Adding an ARRAY_FILTER_REINDEX flag to array_values

2021-09-20 Thread Marco Pivetta
On Mon, Sep 20, 2021 at 12:02 PM Eugene Sidelnyk wrote:

> From my experience it is not that easy to locate bug like this.
>

Heyo, just a tip for you (and others in this thread), but if you use
`vimeo/psalm` and declare a `list` for a type, then `array_filter()` calls
that are missing an `array_values()` around them will be caught :)

```php
/** @return list */
function foo(array $input): array {
    return array_filter($input);
}
```

Will produce:

```
INFO: MixedReturnTypeCoercion - 5:12 - The type 'array' is more general than the declared return type 'list' for foo
```

See https://psalm.dev/r/f9c51f72c2
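
For comparison, here is a sketch of the same function with the `array_values()` wrapper added. The name `fooReindexed` and the `list<mixed>` / `array<array-key, mixed>` annotations are assumptions made for this example (the snippet above only shows a bare `list`), and I have not run Psalm on it.

```php
<?php
/**
 * @param array<array-key, mixed> $input
 * @return list<mixed>
 */
function fooReindexed(array $input): array {
    // array_filter() without a callback drops falsy values but keeps the keys;
    // array_values() then renumbers the result from 0.
    return array_values(array_filter($input));
}

var_dump(fooReindexed([0, 'a', null, 'b'])); // ['a', 'b'] with keys 0 and 1
```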

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/


Re: [PHP-DEV] Proposal: Adding an ARRAY_FILTER_REINDEX flag to array_values

2021-09-20 Thread Eugene Sidelnyk
Hi, I have faced such bugs myself because array_filter preserves keys. From
my experience, it is not that easy to locate a bug like this. In my case, I
rewrote the solution in a different way than it was originally written. Only
later did I realize that the root cause was array_filter.

On Sun, Sep 19, 2021, 4:11 PM tyson andre wrote:

> Hi internals,
>
> Currently, array_filter will always return the original keys.
> This often requires an additional wrapping call of
> array_values(array_filter(...)) to reindex the keys and return a list.
> (or applications may not realize there will be gaps in the keys until it
> causes a bug or unexpected JSON encoding, etc.)
>
> PHP is also more memory/time efficient at creating packed arrays than it
> is at creating associative arrays.
>
> What are your thoughts on adding `ARRAY_FILTER_REINDEX`, to ignore the
> original int/string keys and replace them with `0, 1, 2, ...`
>
> ```
> php > echo json_encode(array_filter([5,6,7,8], fn($value) => $value % 2 > 0));
> {"0":5,"2":7}
> // proposed flag
> php > echo json_encode(array_filter([5,6,7,8], fn($value) => $value % 2 > 0, ARRAY_FILTER_REINDEX));
> [5,7]
> ```
>
> https://www.php.net/array_filter already has the `int $mode = 0` parameter,
> which accepts the bit flags `ARRAY_FILTER_USE_KEY` and `ARRAY_FILTER_USE_BOTH`.
> These could then be combined with the proposed bit flag
> `ARRAY_FILTER_REINDEX`, e.g. to filter an array based on both the array
> keys and values, and return a list without gaps
> (and if $callback is null, this would return a list containing only the
> truthy values).
>
> Thoughts?
>
> Thanks,
> Tyson


Re: [PHP-DEV] Adding `final class Deque` to PHP

2021-09-20 Thread Christian Schneider
On 20.09.2021 at 10:36, Pierre wrote:
> On 20/09/2021 at 02:03, tyson andre wrote:
>> I've created a new RFC https://wiki.php.net/rfc/deque to add a `final class Deque`
> 
> It seems that you are writing more than one RFC to add many data structures. 
> I love that you're doing that, but I suggest that you'd normalize them all 
> and place all new classes in a single new dedicated namespace.

+1

- Chris




Re: [PHP-DEV] Proposal: Adding an ARRAY_FILTER_REINDEX flag to array_values

2021-09-20 Thread Peter Bowyer
On Sun, 19 Sept 2021 at 14:12, tyson andre wrote:

> What are your thoughts on adding `ARRAY_FILTER_REINDEX`, to ignore the
> original int/string keys and replace them with `0, 1, 2, ...`
>

If it's measurably faster/more memory efficient than
array_values(array_filter(...)) across a range of array sizes, I'm strongly
in favour as I write this a lot.
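
One way to get a first feel for this is a userland stand-in. The sketch below assumes a hypothetical helper named `filter_reindexed()` that emulates the proposed behaviour for value-only callbacks; it is not the proposed native implementation, so its timing says little about how a native `ARRAY_FILTER_REINDEX` would perform.

```php
<?php
/**
 * Hypothetical userland stand-in for the proposed ARRAY_FILTER_REINDEX
 * behaviour (value-only callback, no mode flags).
 */
function filter_reindexed(array $input, ?callable $callback = null): array {
    $result = [];
    foreach ($input as $value) {
        // With no callback, keep truthy values, as array_filter() does.
        if ($callback === null ? (bool) $value : $callback($value)) {
            $result[] = $value; // appending builds a list with keys 0, 1, 2, ...
        }
    }
    return $result;
}

$isOdd = fn($value) => $value % 2 > 0;
$data = range(1, 100000); // arbitrary size chosen for the illustration

$start = hrtime(true);
$viaArrayValues = array_values(array_filter($data, $isOdd));
$nsArrayValues = hrtime(true) - $start;

$start = hrtime(true);
$viaHelper = filter_reindexed($data, $isOdd);
$nsHelper = hrtime(true) - $start;

var_dump($viaArrayValues === $viaHelper); // both approaches give the same list
printf("array_values(array_filter()): %.2f ms\n", $nsArrayValues / 1e6);
printf("filter_reindexed():           %.2f ms\n", $nsHelper / 1e6);
```

Repeating the measurement across several array sizes, and with `memory_get_peak_usage()`, would be needed before drawing any conclusion.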

Peter


Re: [PHP-DEV] RFC: Add `final class Vector` to PHP

2021-09-20 Thread Peter Bowyer
Hi Tyson,

On Sat, 18 Sept 2021 at 16:46, tyson andre wrote:

> Many of php's names are based on the naming choices in libraries made in
> C/C++.
> So using https://cplusplus.com/reference/vector/vector/ for my RFC
> https://wiki.php.net/rfc/vector
> seems like the most natural naming choice,
> and would make it easier for people with backgrounds in that family of
> languages to find the functionality they're looking for.
> PHP already has a SplStack, SplQueue, etc, like C++'s `stack`, `queue`,
> etc.
>

That is a fair point. Vector is an overloaded and common word. For me a
vector will always default to an entity characterized by a magnitude and a
direction, because that's what I learned and used for years. The next
definition I learned was the Numpy one.

That for me is the sticking point if this Vector allows mixed types which
include arrays or vectors. Store them inside a Vector and then you end up
with a matrix, a tensor and so on in something identified as a Vector,
which is nonsense. Yes C++ does that [1]. Yes with generics it sort-of
makes sense. Numpy gets round it by calling the type `ndarray` and a vector
is a specialised one-dimensional array.

If it's a high-performance array and that's the goal, call it hparray. Call
it a tuple. Call it a dictionary.


> Also, your comment is ambiguous. Are you saying that you personally object
> to the name,
> or that you're fine with the name but think that the comments by
> Larry/Chris/Pierre in this email thread are representative of voters.
>

Both.

I object to the name for what's being proposed, but am not necessarily
against what's being proposed if it looks more useful than the Spl* stuff.

I'm fine with the name but for something other than what's being proposed.

HTH
Peter

1. https://www.geeksforgeeks.org/vector-of-vectors-in-c-stl-with-examples/


Re: [PHP-DEV] Adding `final class Deque` to PHP

2021-09-20 Thread Pierre

On 20/09/2021 at 02:03, tyson andre wrote:
> Hi internals,
>
> I've created a new RFC https://wiki.php.net/rfc/deque to add a `final class Deque`


Hello,

It seems that you are writing more than one RFC to add many data 
structures. I love that you're doing that, but I suggest that you'd 
normalize them all and place all new classes in a single new dedicated 
namespace.


Regards,

--

Pierre
