Re: On "A New Collections Framework for the Standard Library"

2017-11-13 Thread Andrei Alexandrescu via Digitalmars-d

On 11/13/2017 05:19 AM, Mark wrote:

On Thursday, 18 May 2017 at 18:27:22 UTC, Andrei Alexandrescu wrote:

On 05/18/2017 11:18 AM, Jack Stouffer wrote:

I just got around to watching Eduard Staniloiu's talk at DConf [1]
about the collections library he was working on. One thing seemed
odd, in that Eduard seems to be saying that the container and the
range over the container's elements are one in the same and the
range primitives are on the container itself.

Is this accurate?


That is correct. The fundamental equation is:

Ranges + Optional Primitives = Containers

Andrei


Is the code for the new collections available on github or somewhere 
else? Even if it's a work in progress, I'm curious to see what it looks 
like.




+Eduard


Re: On "A New Collections Framework for the Standard Library"

2017-11-13 Thread Mark via Digitalmars-d
On Thursday, 18 May 2017 at 18:27:22 UTC, Andrei Alexandrescu 
wrote:

On 05/18/2017 11:18 AM, Jack Stouffer wrote:
I just got around to watching Eduard Staniloiu's talk at DConf 
[1]
about the collections library he was working on. One thing 
seemed
odd, in that Eduard seems to be saying that the container and 
the

range over the container's elements are one in the same and the
range primitives are on the container itself.

Is this accurate?


That is correct. The fundamental equation is:

Ranges + Optional Primitives = Containers

Andrei


Is the code for the new collections available on github or 
somewhere else? Even if it's a work in progress, I'm curious to 
see what it looks like.




Re: On "A New Collections Framework for the Standard Library"

2017-05-19 Thread Jonathan M Davis via Digitalmars-d
On Friday, May 19, 2017 1:22:50 AM PDT Jack Stouffer via Digitalmars-d 
wrote:
> First off, how are you going to do something like a map over a
> immutable container then, as map uses the range primitives and
> not foreach? There's no reason in principal that that should
> cause an issue. But with that design popFront is mutating the
> underlying data, and therefore should not be allowed. But this
> works with the std.container design because popFront is mutating
> the range view which has no reason to be immutable.

Andrei was talking at dconf about this and said that we'd be adding a new
range primitive called tail, and we'd basically end up with car and cdr for
ranges. This makes sense for immutable ranges, and it's simple to wrap
existing input ranges to makes them work with such primitives, but I'm not
sure that it plays at all nicely once you start considering higher order
ranges - and it has the serious problem that it would require revamping all
range-based code to support it. That _might_ make sense for Phobos, since
it's the standard library, but I expect that its DOA pretty much everywhere
else (especially when it's for immutable containers which can be very useful
but are usually very niche), and even for Phobos, it's going to be a lot of
churn. Personally, I'm not convinced that immutable containers are worth
enough to merit such a change, but Andrei clearly thinks that they are.
Interesingly, he doesn't seem to think that it's worth solving the
tail-const issue with ranges, and without a solution for that, const and
ranges basically so not work together at all. And _that_ is a problem that
keeps coming up, whereas most folks don't seem to be clammering for
immutable containers.

Now, if you're willing to add a layer of indirection, then it's trivial to
make existing ranges over immutable containers work, because you can have a
mutable range refer to an immutable object, but that is either going to be
unsafe or require that the range be allocated on the heap - which is
presumably why Andrei didn't go with that solution, but I don't know.

But overall, it sounds like once again Andrei is getting very experimental
with what he's doing (or getting his mentee to do). So if it works out well,
we could end up with something pretty interesting and innovative, but it
also risks being a bit of a disaster, and since it's experimental, there's
really no way to know how it's going to go (even less so when all we know
about it is what was in the talk and whatever tidbits Andrei has discussed
in the newsgroup or at dconf). So, I guess that we'll have to wait and see -
particularly since we haven't even seen the actual API yet.

- Jonathan M Davis



Re: On "A New Collections Framework for the Standard Library"

2017-05-18 Thread Eugene Wissner via Digitalmars-d

On Thursday, 18 May 2017 at 15:18:00 UTC, Jack Stouffer wrote:
I just got around to watching Eduard Staniloiu's talk at DConf 
[1] about the collections library he was working on. One thing 
seemed odd, in that Eduard seems to be saying that the 
container and the range over the container's elements are one 
in the same and the range primitives are on the container 
itself.


Is this accurate?

If so, then this is completely different than the currently 
accepted design of ranges and containers. Currently, a 
container allows the removal and insertion of elements, and to 
get a range of the elements you have to call a function or 
slice the container. This is how std.containers and Brain's 
EMSI container library. In this new library, what happens when 
I iterate the elements of one of these containers using 
foreach? Do the range primitives empty the container? How do I 
ask for a new, full range of my data in the collection to use 
in a range algorithm after I've already iterated over the 
elements?


As Andrei originally laid out in this talk introducing ranges 
[2], ranges only shrink and never grow. Put another way, Output 
Ranges and Input Ranges are two separate things. Andrei states 
durning that talk (at 1:20:20)


"Can ranges ever grow? ... No, never, because it would be 
fundamentally unsafe."


I'm not sure what Andrei meant by this, but he believes that 
growing ranges is also a bad idea.


[1] https://www.youtube.com/watch?v=k88QSIC-Na0
[2] 
https://archive.org/details/AndreiAlexandrescuKeynoteBoostcon2009


I didn't look the talk yet, but about a month ago I tried to 
implement a similar concept and wanted to apply it to all 
containers as well.
My idea was to make the containers/slices reference counted and 
copy on write (or to be more precise, on size changing, similar 
how D built-in slices behave).


1) All containers have a pointer to the allocated memory and its 
length.
2) All containers have a pointer to the beginning and the end of 
the range they refer to (so popFront, popBack can be implemented 
without changing the real size of the allocated memory)
3) If you do something like insertBack or insertFront and the 
reference counter is greater than 1, then the counter decreases 
and the container is copied.


But at the end I couldn't deal with the constness. Slice!(ubyte) 
and Slice!(const ubyte) are different types - that isn't nice. 
But for "const Slice!ubyte" you can't do popFront and popBack. It 
could be solved with some casting or other hacks but I don't know 
how it would work in a multithreaded envrionment. So I stopped 
the experiment for now.


Re: On "A New Collections Framework for the Standard Library"

2017-05-18 Thread Jack Stouffer via Digitalmars-d
On Thursday, 18 May 2017 at 18:27:22 UTC, Andrei Alexandrescu 
wrote:

Iterating over a container using e.g. foreach won't consume the
container same as iterating over int[] won't consume the slice.


I don't understand why you're mapping the behavior of 
ranges/slices, which theoretically are shrinking views of data, 
onto containers of data which can grow or shrink. They seem 
antithetical to me.


The only reason foreach over a dynamic array doesn't consume the 
slice is because foreach uses index based iteration on static and 
dynamic arrays and not the std.range.primitives functions.


It's not evident to me that the non-consuming behavior makes 
sense to map onto something completely different like containers 
that won't even necessarily have index based access.


It seems odd to me that slices would be the design basis when 
slices


1. Are intentionally limited in scope design wise to be a 
shrinking view of data

2. Only make sense behavior wise for array containers
3. Their ability to grow is a GC oddity which may end up copying 
a whole lot of data depending on the origin of the slice


There is no way to reclaim the original container. If you 
create a container, you get a range positioned to "see" the 
entire container. Once you popFront from that range, you lose 
all access to the first element in the container, unless you 
have a separate copy of the range. This is not new and not 
different from:


auto r = new int[10];
r.popFront;

What happens here is an array of 10 elements gets created. But 
you don't get a type "array of 10 integers". You get "a slice 
of integers, incidentally initialized to refer to the entire 
array of 10 integers that was just created". Next, you decide 
you don't care about the first element in the array. Once you 
call r.popFront, access to that element is lost forever.


First off, how are you going to do something like a map over a 
immutable container then, as map uses the range primitives and 
not foreach? There's no reason in principal that that should 
cause an issue. But with that design popFront is mutating the 
underlying data, and therefore should not be allowed. But this 
works with the std.container design because popFront is mutating 
the range view which has no reason to be immutable.


Secondly, if I want to say, perform a map over the elements of a 
container for some output, and I want to do things with the 
elements later in the code, then I have to create a copy of the 
container and pass the original to map? If that's the case then 
why not just go with the std.container behavior of getting a 
range from a primitive if you end up having to create range 
copies? I mean it's fundamentally the same, but that way you 
don't have to worry about the copies changing the data of the 
container out from under you.


What happens to the copy of a container when the original 
modifies the underlying data? For example, what should this code 
print?


auto container = Array!int(MAllocator.instance);
container.put(iota(10));

auto container_copy = container;
container.popBackN(5);
container.insertBack(1);

container_copy.each!(a => write(a, ", "));

In my estimation, a properly designed containers library should 
print


0, 1, 2, 3, 4, 1, 5, 6, 7, 8, 9

But from what you're describing, that the containers act like 
slices, the code with print


0, 1, 2, 3, 4, 1, 6, 7, 8, 9


Re: On "A New Collections Framework for the Standard Library"

2017-05-18 Thread Andrei Alexandrescu via Digitalmars-d

On 05/18/2017 02:17 PM, Jack Stouffer wrote:

On Thursday, 18 May 2017 at 15:18:00 UTC, Jack Stouffer wrote:

...


Also, shared allocators raise another problem entirely. Let's say for
the sake of clarity that these future containers will use a separate
range type.

Let's say you have a container with five elements, and then you give a
range of those elements to a function that caches the length somewhere
because it's available (very common as ranges have never been expected
to change before). What happens when you're iterating over the range and
then another thread calls removeBack before the loop is completed? All
sorts of functions in Phobos will blow up in interesting and different
ways.


If two or more threads have access to a container, that would be shared. 
Many traditional containers aren't usable gainfully as shared, so we 
plan to implement a few specialized ones. Anyhow, for now you won't be 
able to use containers that are shared. -- Andrei


Re: On "A New Collections Framework for the Standard Library"

2017-05-18 Thread Andrei Alexandrescu via Digitalmars-d

On 05/18/2017 11:18 AM, Jack Stouffer wrote:

I just got around to watching Eduard Staniloiu's talk at DConf [1]
about the collections library he was working on. One thing seemed
odd, in that Eduard seems to be saying that the container and the
range over the container's elements are one in the same and the
range primitives are on the container itself.

Is this accurate?


That is correct. The fundamental equation is:

Ranges + Optional Primitives = Containers


If so, then this is completely different than the currently accepted
 design of ranges and containers. Currently, a container allows the
removal and insertion of elements, and to get a range of the elements
you have to call a function or slice the container. This is how
std.containers and Brain's EMSI container library. In this new
library, what happens when I iterate the elements of one of these
containers using foreach?


Iterating over a container using e.g. foreach won't consume the
container same as iterating over int[] won't consume the slice.


Do the range primitives empty the container? How do I ask for a new,
full range of my data in the collection to use in a range algorithm
after I've already iterated over the elements?


There is no way to reclaim the original container. If you create a 
container, you get a range positioned to "see" the entire container. 
Once you popFront from that range, you lose all access to the first 
element in the container, unless you have a separate copy of the range. 
This is not new and not different from:


auto r = new int[10];
r.popFront;

What happens here is an array of 10 elements gets created. But you don't 
get a type "array of 10 integers". You get "a slice of integers, 
incidentally initialized to refer to the entire array of 10 integers 
that was just created". Next, you decide you don't care about the first 
element in the array. Once you call r.popFront, access to that element 
is lost forever.


As an aside, we briefly experimented with a type called "new T[]" which 
meant to refer to that initial array. I tried to redo a bunch of code 
with it. It was a failure - figuring out what's an array and what's a 
slice in the array was a continuous hassle offering no redeeming 
insight. It took me years to figure actually there's no need for 
containers and ranges - all you need is ranges.



As Andrei originally laid out in this talk introducing ranges [2],
ranges only shrink and never grow. Put another way, Output Ranges and
Input Ranges are two separate things. Andrei states durning that talk
(at 1:20:20)

"Can ranges ever grow? ... No, never, because it would be
fundamentally unsafe."

I'm not sure what Andrei meant by this, but he believes that growing
 ranges is also a bad idea.


Yes, you're not supposed to grow a range on its existing support without 
knowing whether you have room. That's a distinct matter. You can always 
grow a range by means of safe primitives that may duplicate it if 
needed, such as ~ and ~=.



Andrei


Re: On "A New Collections Framework for the Standard Library"

2017-05-18 Thread Jack Stouffer via Digitalmars-d

On Thursday, 18 May 2017 at 15:18:00 UTC, Jack Stouffer wrote:

...


Also, shared allocators raise another problem entirely. Let's say 
for the sake of clarity that these future containers will use a 
separate range type.


Let's say you have a container with five elements, and then you 
give a range of those elements to a function that caches the 
length somewhere because it's available (very common as ranges 
have never been expected to change before). What happens when 
you're iterating over the range and then another thread calls 
removeBack before the loop is completed? All sorts of functions 
in Phobos will blow up in interesting and different ways.


I think you'd have to make all ranges created from mutable 
containers input ranges with no extra primitives in order to 
avoid this problem.


Plus, you also have a problem with things like SList where 
removeFront can be called after a different thread creates a 
range. All of a sudden your range `front` is pointing to invalid 
data. So for that, you need to either


1. Make a thread safe way to invalidate all old ranges (possibly 
by marking them empty immediately)
2. Allow the program to blow-up and mark all remove functions as 
@system


Re: On "A New Collections Framework for the Standard Library"

2017-05-18 Thread Jack Stouffer via Digitalmars-d

On Thursday, 18 May 2017 at 15:38:39 UTC, Jonathan M Davis wrote:
That point concerned me as well. Dynamic arrays in D are very 
strange beasts indeed, and while it works for them to function 
as both ranges and (sort of) containers, it's also created a 
fair bit of confusion, and it really a fair bit of what is done 
with dynamic arrays are _not_ range-based functions (e.g. 
appending), making the whole situation that much more confusing 
when range-based functions and array-specific functions are 
mixed (which is definitely going to happen in stuff like string 
code).


Yes, adding in the free function versions of the range primitives 
in std.range, while convenient initially, seems to have had large 
negative effects down the line.


Re: On "A New Collections Framework for the Standard Library"

2017-05-18 Thread Jonathan M Davis via Digitalmars-d
On Thursday, May 18, 2017 15:18:00 Jack Stouffer via Digitalmars-d wrote:
> I just got around to watching Eduard Staniloiu's talk at DConf
> [1] about the collections library he was working on. One thing
> seemed odd, in that Eduard seems to be saying that the container
> and the range over the container's elements are one in the same
> and the range primitives are on the container itself.
>
> Is this accurate?
>
> If so, then this is completely different than the currently
> accepted design of ranges and containers. Currently, a container
> allows the removal and insertion of elements, and to get a range
> of the elements you have to call a function or slice the
> container. This is how std.containers and Brain's EMSI container
> library. In this new library, what happens when I iterate the
> elements of one of these containers using foreach? Do the range
> primitives empty the container? How do I ask for a new, full
> range of my data in the collection to use in a range algorithm
> after I've already iterated over the elements?
>
> As Andrei originally laid out in this talk introducing ranges
> [2], ranges only shrink and never grow. Put another way, Output
> Ranges and Input Ranges are two separate things. Andrei states
> durning that talk (at 1:20:20)
>
> "Can ranges ever grow? ... No, never, because it would be
> fundamentally unsafe."
>
> I'm not sure what Andrei meant by this, but he believes that
> growing ranges is also a bad idea.
>
> [1] https://www.youtube.com/watch?v=k88QSIC-Na0
> [2]
> https://archive.org/details/AndreiAlexandrescuKeynoteBoostcon2009

That point concerned me as well. Dynamic arrays in D are very strange beasts
indeed, and while it works for them to function as both ranges and (sort of)
containers, it's also created a fair bit of confusion, and it really a fair
bit of what is done with dynamic arrays are _not_ range-based functions
(e.g. appending), making the whole situation that much more confusing when
range-based functions and array-specific functions are mixed (which is
definitely going to happen in stuff like string code). And a container as a
range really makes no sense to me. It makes slightly more sense than an
iterator being a container (since it is a range of elements not just one),
but it still goes against how containers normally work. A big part of what's
different with dynamic arrays is that they don't actually own or manage
their own memory - rather they refer to memory from elsewhere, and how that
memory is managed depends on where that memory comes from. Yes, after
appending, it can be guaranteed to be the GC, but even then, it's the GC
that manages what goes on with the elements. No one dynamic array owns them,
since you can have multiple of them referring to the same memory. So,
they're quite schizophrenic.

Now, I could see it working to treat a container as a range if you're
talking about _immutable_ containers - especially if their memory is owned
by the GC (and immutable containers are definitely part of what's being
worked on) - since you don't really add to those (at most, you create a new
container with the same elements plus more), but for mutable containers,
that just doesn't seem right.

So, yes, that bit about modeling containers after dynamic arrays greatly
concerns me, but I figured that I'd wait and see the actual API before
complaining about it. And it may very well turn out that what they're doing
actually makes a lot of sense once you actually see the details. I don't
know.

- Jonathan M Davis