Re: [rust-dev] Persistent data structures

2013-12-05 Thread Michael Woerister

On 05.12.2013 03:15, Bill Myers wrote:
The reason I introduced COW when RC  1 is that it allows persistent 
data structures to be mutated in place if there aren't extra 
references, just like non-persistent data structures.
That's a great idea that an might obviate the need to have a separate 
`Builder` type (as in .Net [1]) or explicit `transients` (as in Clojure 
[2]).


However, one has to be careful to lock the whole path when working in 
a graph structure in a concurrent setting, so no other task/thread 
changes a node further up, while the first task does changes to the some 
node further down. This can happen, when relying on the reference count 
alone because there can be N Arc-references + M borrowed references to 
some memory box---and the reference count will only be N. Conceptually, 
the `get()` method in Arc also should increase the reference count for 
the lifetime of the borrowed reference. For safe memory management, this 
is not necessary because the compiler will ensure that no borrowed 
reference outlives all memory owners, but for counting who has access to 
the memory contents, the reference count alone is inadequate.


This should not be hard to mitigate manually but it's a potential source 
for some nasty race conditions ;)


-Michael

[1] https://msdn.microsoft.com/en-us/library/dn385366(v=vs.110).aspx
[2] http://clojure.org/transients
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-05 Thread Bill Myers
No, the problem you describe does not exist in my implementation because it 
requires an mut to the smart pointer.

In particular, if the reference count is 1, then there is no other Rc and Arc 
pointing to the same data, and because we have an mut there is also no other 
borrowed reference to the Rc or Arc we are manipulating.

Hence, if the reference count is 1 and we have an mut to the Rc or Arc, it's 
safe to return an mut pointing to the contents and thus mutate its contents in 
place using it.

If the reference count is more than 1, then it uses read-only access to clone 
the contents, giving a new allocation with reference count 1 that can then be 
mutated in place as above.

  ___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-05 Thread Michael Woerister

On 05.12.2013 16:59, Bill Myers wrote:
No, the problem you describe does not exist in my implementation 
because it requires an mut to the smart pointer.

You are right. I needed some time to wrap my head around this :)
This is really a very clever approach! I'm looking forward to 
experimenting with it in my HAMT implementation.


-Michael
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Patrick Walton

On 12/3/13 11:28 PM, Isaac Dupree wrote:

Is Rc the best smart pointer for persistent data structures?


I would think so, for non-thread safe ones.


Is it
possible for the structure to be parametrized on smart pointer?


Not without higher kinded types (which eventually we do want--so the 
answer is not yet).



Rc requires the contained data to be Freeze or Send or risk reference
cycles.


I want to lift this restriction, BTW. Trying to prevent cycles through 
the type system has never really worked for us.


  Gc requires T:'static (which means no borrowed pointers besides

'static ones within the type).


Yes, but I wouldn't worry about this restriction biting users of your 
structure too much. Rust data structures rarely ever store non-static 
references in them, as the stack discipline that references must follow 
is fairly limited. (I can probably count the number of times I've put a 
non-static `` reference into a dynamic vector on one hand, and I don't 
think I've ever put references into a hash map.)


  Every Send type is 'static, but not

every Freeze type is 'static, so neither Rc nor Gc is strictly more
flexible.  Arc is Send, unlike either Rc or Gc, but has more overhead
and can only contain Freeze+Send data; someone wanting to share
persistence between tasks (conceivably for the sake of memory-use or
asymptotic time) would want it.


Really what you want here for maximum flexibility is higher-kinded types 
and an implementation parameterized over RcT or ArcT. That doesn't 
work in today's Rust, so you'll have to implement the data structure 
separately for non-thread-safe and thread-safe applications, unless you 
use macros. (Though often times you may want separate logic either way, 
necessitating a separate implementation. I'm not sure about the 
particular data structure you had in mind though.)



Is it possible to implement FromIteratorT for ListT without using
O(n) temporary space or unsafe code?  The problem is that the list
comes out reversed in an obvious implementation. (O(n) stack space via
recursion, or an O(n) intermediate data structure, or unsafely
constructing a cons cell before constructing its tail.)


I think it should be possible to reverse the list after it's 
constructed, but then of course it has to be mutable (at least 
temporarily). Or you could carry around a pointer to the end of the list 
(which, again, requires mutation). I don't think this is solvable 
without mutation in a strict (non-lazy) language, unless I'm missing 
something.


Patrick

___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Kevin Ballard
On Dec 4, 2013, at 12:23 AM, Patrick Walton pcwal...@mozilla.com wrote:

 Yes, but I wouldn't worry about this restriction biting users of your 
 structure too much. Rust data structures rarely ever store non-static 
 references in them, as the stack discipline that references must follow is 
 fairly limited. (I can probably count the number of times I've put a 
 non-static `` reference into a dynamic vector on one hand, and I don't think 
 I've ever put references into a hash map.)

I've put non-static [u8]s into a map. Specifically, I have a function in one 
of my sources that looks vaguely like

pub fn process_input'a(input: 'a [u8]) - int {
let mut map: HashMap'a [u8], int = HashMap::new();

// .. process the input using the map, then throw away the map

return result;
}

-Kevin
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Michael Woerister

I've implemented a persistent HAMT [1] a while back:
https://github.com/michaelwoerister/rs-persistent-datastructures/blob/master/hamt.rs

It's the data structure used for persistent maps in Clojure (and Scala, 
I think). It's not too hard to implement and it's pretty nifty. I'm not 
sure about the performance with atomic reference counting being used for 
memory management. It will definitely be worse than with a  
stop-the-world garbage collector. Although, it's noteworthy that look 
ups in the data structure only have to copy one `Arc` for returning the 
result, so the high fan-out of the data structure should not hurt if you 
mostly read from it. I'd be very interested in a performance comparison 
to other persistent map implementations in Rust (e.g. a red-black tree 
or splay tree).


Here are some things I came across during implementing this:
* I too discovered that I couldn't parametrize on the type of reference 
being used without higher kinded types. I did the implementation with 
regular @ pointers at first and later switched to `Arc`, since 
concurrent contexts are where persistent data structures really shine. 
Switching the implementation from @ to Arc was pretty straight forward.
* I found there is no standardized trait for persistent maps in libstd 
or libextra. It would be nice to have one!
* It's probably a very good idea to provide a non-persistent builder 
that avoids the excessive copying during the insertion phase. In Clojure 
one can switch between transient and persistent mode for a data 
structure instance which also allows for optimized batch modifications. 
An `insert_from(iterator)` method might also do the trick. There's quite 
a bit of design space here.
* I would have liked to avoid some allocations and pointer chasing by 
using fixed size vectors directly within nodes but I could not get that 
to work without a lot of unsafe code that I was not sure would be 
correct in all cases. So I just used owned vectors in the end.


Looking forward to seeing more in this area :)

-Michael

[1] https://en.wikipedia.org/wiki/Hash_array_mapped_trie

On 04.12.2013 08:28, Isaac Dupree wrote:
I'm interested in having persistent data structures[1] in Rust.  To 
start, I implemented the simplest one, the cons/nil list (it looks 
like extra::list::List has another implementation of it).  Am I using 
Rust conventions properly?


My persistent list code:
https://github.com/idupree/rust-code/blob/master/persistent.rs

My next goal is a persistent tree-map, probably cribbing from 
Haskell's Data.Map.



Is Rc the best smart pointer for persistent data structures?  Is it 
possible for the structure to be parametrized on smart pointer?


Rc requires the contained data to be Freeze or Send or risk reference 
cycles.  Gc requires T:'static (which means no borrowed pointers 
besides 'static ones within the type).  Every Send type is 'static, 
but not every Freeze type is 'static, so neither Rc nor Gc is strictly 
more flexible.  Arc is Send, unlike either Rc or Gc, but has more 
overhead and can only contain Freeze+Send data; someone wanting to 
share persistence between tasks (conceivably for the sake of 
memory-use or asymptotic time) would want it.



Is it possible to implement FromIteratorT for ListT without using 
O(n) temporary space or unsafe code?  The problem is that the list 
comes out reversed in an obvious implementation. (O(n) stack space via 
recursion, or an O(n) intermediate data structure, or unsafely 
constructing a cons cell before constructing its tail.)



[1] https://en.wikipedia.org/wiki/Persistent_data_structure , like 
every data structure in Haskell


-Isaac
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Huon Wilson

On 04/12/13 20:51, Michael Woerister wrote:

I've implemented a persistent HAMT [1] a while back:
https://github.com/michaelwoerister/rs-persistent-datastructures/blob/master/hamt.rs 



It's the data structure used for persistent maps in Clojure (and 
Scala, I think). It's not too hard to implement and it's pretty nifty. 
I'm not sure about the performance with atomic reference counting 
being used for memory management. It will definitely be worse than 
with a  stop-the-world garbage collector. Although, it's noteworthy 
that look ups in the data structure only have to copy one `Arc` for 
returning the result, so the high fan-out of the data structure should 
not hurt if you mostly read from it. I'd be very interested in a 
performance comparison to other persistent map implementations in Rust 
(e.g. a red-black tree or splay tree).


Here are some things I came across during implementing this:
* I too discovered that I couldn't parametrize on the type of 
reference being used without higher kinded types. I did the 
implementation with regular @ pointers at first and later switched to 
`Arc`, since concurrent contexts are where persistent data structures 
really shine. Switching the implementation from @ to Arc was pretty 
straight forward.
* I found there is no standardized trait for persistent maps in libstd 
or libextra. It would be nice to have one!
* It's probably a very good idea to provide a non-persistent builder 
that avoids the excessive copying during the insertion phase. In 
Clojure one can switch between transient and persistent mode for a 
data structure instance which also allows for optimized batch 
modifications. An `insert_from(iterator)` method might also do the 
trick. There's quite a bit of design space here.



For reference, the FromIterator  Extendable traits are the things to 
implement if one has a structure that can be constructed from/extended 
with an iterator and wishes to share the behaviour.



http://static.rust-lang.org/doc/master/std/iter/trait.FromIterator.html
http://static.rust-lang.org/doc/master/std/iter/trait.Extendable.html


Huon

* I would have liked to avoid some allocations and pointer chasing by 
using fixed size vectors directly within nodes but I could not get 
that to work without a lot of unsafe code that I was not sure would be 
correct in all cases. So I just used owned vectors in the end.


Looking forward to seeing more in this area :)

-Michael

[1] https://en.wikipedia.org/wiki/Hash_array_mapped_trie


___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Michael Woerister


For reference, the FromIterator  Extendable traits are the things to 
implement if one has a structure that can be constructed from/extended 
with an iterator and wishes to share the behaviour.



http://static.rust-lang.org/doc/master/std/iter/trait.FromIterator.html
http://static.rust-lang.org/doc/master/std/iter/trait.Extendable.html


Thanks for pointing that out :)
Implementing FromIterator should work out fine. Extendable would need a 
persistent version returning the newly created instance, something like:


mod persistent {
pub trait ExtendableA: FromIterator 
http://static.rust-lang.org/doc/master/std/iter/trait.FromIterator.htmlA 
{
fn extend 
http://static.rust-lang.org/doc/master/std/iter/trait.Extendable.html#tymethod.extendT: 
Iterator 
http://static.rust-lang.org/doc/master/std/iter/trait.Iterator.htmlA(self, 
iterator: mut T) - Self;

}
}
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Niko Matsakis
On Wed, Dec 04, 2013 at 12:23:35AM -0800, Patrick Walton wrote:
 Not without higher kinded types (which eventually we do want--so the
 answer is not yet).

This. That said, I imagine that if we do it right, it'll be possible
to write one persistent map implementation that can be used with GC,
ARC, and also an arena. The latter would -- at least in principle --
make it possible to store references. I haven't thought hard about
this though and it may be that complications arise.


Niko
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Michael Woerister



Is it
possible for the structure to be parametrized on smart pointer?


Not without higher kinded types (which eventually we do want--so the 
answer is not yet).
I've been thinking about that a bit more and I think it might be 
possible to support different reference types without higher-kinded 
types. We can define a `Reference` trait that is then implemented for 
`Rc`, `Arc` and what have you. Something like:


trait ReferenceT : Clone {
fn borrow'a('a self) - 'a T;
}

If we also want to create references without knowing their concrete 
type, we also need a trait for creating them:


trait ReferenceFactoryT, TRef: ReferenceT {
fn create_reference(self, val: T) - TRef;
}

And then we can define a container type, using the generic reference type:

struct ContainerT, TRef, TRefFactory {
items: ~[TRef],
reference_factory: TRefFactory
}

It contains a `ReferenceFactory` value that is used to create 
`Reference` instances when the container needs one (see the `add` method 
below):


impl
T,
TRef: ReferenceT,
TRefFactory: ReferenceFactoryT, TRef

ContainerT, TRef, TRefFactory {

pub fn create(factory: TRefFactory) - ContainerT, TRef, 
TRefFactory {

Container {
reference_factory: factory,
items: ~[]
}
}

pub fn add(mut self, val: T) {
self.items.push(self.reference_factory.create_reference(val));
}
}

This setup is a bit roundabout but it should get the job done. Of 
course, I may have overlooked something but at least rustc swallows the 
above code without complaint ;)
If we could call static methods on type parameters, the factory 
functionality could be moved to the `Reference` trait:


trait ReferenceT : Clone {
fn borrow'a('a self) - 'a T;
fn new(value: T) - Self;
}

// Now using the static `new` method instead of the `ReferenceFactory` trait
pub fn add(mut self, val: T) {
self.items.push(TRef::new(val));
}

With this, the code would actually be of acceptable complexity in my eyes.

-Michael

___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Bill Myers



Hello, I already implemented a persistent tree-map called SnapMap: you can find 
the source code at https://github.com/mozilla/rust/pull/9816

I stopped working on it before I made a serious effort to push it into the Rust 
codebase and don't have time to work further on it, so it would be awesome if 
you were interested in continuing that effort.

It's pretty much finished, but pretty much completely untested (however, it was 
derived from the existing treemap, so the amount of bugs should not be as high 
as something written from scratch and untested).

The first thing that needs to be done is to run the existing tests inherited 
from treemap, fix any bug that shows up, and then write more tests to test 
SnapMap specific usage patterns and features.

Then, one should look at the mutable/handle-based iterators in the code and 
decide whether they should be removed, kept as is, or improved, possibly after 
changing the iterator interface to return an object with the lifetime of the 
function call instead of the iterator.

The rest of the code should be fine (except for bugs due to no testing), but 
there might be places where unnecessary copy-on-write is performed.

My code used an improved version of Rc that supports copy-on-write by cloning 
only when reference count  1.


  ___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Isaac Dupree

On 12/04/2013 02:09 PM, Michael Woerister wrote:

Is it
possible for the structure to be parametrized on smart pointer?


Not without higher kinded types (which eventually we do want--so the
answer is not yet).


And then we can define a container type, using the generic reference type:

struct ContainerT, TRef, TRefFactory {
 items: ~[TRef],
 reference_factory: TRefFactory
}


Clever solution!  In the cons list, it needs to be a reference to 
NodeT, not to T.  Every persistent container library would have to 
expose its node type and take as a parameter an allocator for its 
NodeT.  Simplified example:


enum NodeT, TNodeRef {
  Nil,
  Cons(T, TNodeRef)
}

type IntList = RcNodeint, IntList;

But that is an infinite type and, as such, does not compile.  Instead using

struct IntList(RcNodeint, IntList);

works and could implement the necessary traits.  Or better,

struct RcListT(RcNodeT, RcListT);

Ooh, this is within the realm of maybe being reasonable.  A persistent 
data structure module could provide RcList and ArcList versions.  Is 
performance affected by the use of traits for this strategy, or does the 
way Rust generics are compiled make that a non-issue?  Are there other 
reasons this is a terrible idea? :)


-Isaac

___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Daniel Micay
On Wed, Dec 4, 2013 at 4:17 PM, Isaac Dupree
m...@isaac.cedarswampstudios.org wrote:
 On 12/04/2013 02:09 PM, Michael Woerister wrote:

 Is it
 possible for the structure to be parametrized on smart pointer?


 Not without higher kinded types (which eventually we do want--so the
 answer is not yet).


 And then we can define a container type, using the generic reference type:

 struct ContainerT, TRef, TRefFactory {
  items: ~[TRef],
  reference_factory: TRefFactory
 }


 Clever solution!  In the cons list, it needs to be a reference to NodeT,
 not to T.  Every persistent container library would have to expose its node
 type and take as a parameter an allocator for its NodeT.  Simplified
 example:

 enum NodeT, TNodeRef {
   Nil,
   Cons(T, TNodeRef)
 }

 type IntList = RcNodeint, IntList;

 But that is an infinite type and, as such, does not compile.  Instead using

 struct IntList(RcNodeint, IntList);

 works and could implement the necessary traits.  Or better,

 struct RcListT(RcNodeT, RcListT);

 Ooh, this is within the realm of maybe being reasonable.  A persistent data
 structure module could provide RcList and ArcList versions.  Is performance
 affected by the use of traits for this strategy, or does the way Rust
 generics are compiled make that a non-issue?  Are there other reasons this
 is a terrible idea? :)

 -Isaac

Generics are equivalent to substituting all of the type parameters
with the concrete type by hand. Specialized code is generated for each
set of type parameters with removing the duplication left up to LLVM
(`mergefunc` is close to being completely stable).
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Isaac Dupree

On 12/04/2013 03:36 PM, Bill Myers wrote:

Hello, I already implemented a persistent tree-map called SnapMap: you
can find the source code at https://github.com/mozilla/rust/pull/9816

I stopped working on it before I made a serious effort to push it into
the Rust codebase and don't have time to work further on it, so it would
be awesome if you were interested in continuing that effort.

It's pretty much finished, but pretty much completely untested (however,
it was derived from the existing treemap, so the amount of bugs should
not be as high as something written from scratch and untested).

The first thing that needs to be done is to run the existing tests
inherited from treemap, fix any bug that shows up, and then write more
tests to test SnapMap specific usage patterns and features.


Yay!  Looks like I should start by collecting the various existing 
persistent structures (your SnapMap, Michael's HAMT, extra::list) and 
polishing them up.



Then, one should look at the mutable/handle-based iterators in the code
and decide whether they should be removed, kept as is, or improved,
possibly after changing the iterator interface to return an object with
the lifetime of the function call instead of the iterator.

The rest of the code should be fine (except for bugs due to no testing),
but there might be places where unnecessary copy-on-write is performed.

My code used an improved version of Rc that supports copy-on-write by
cloning only when reference count  1.


Did COW improve performance?  What's a good way to do performance 
testing of Rust code?


Should I try to get your patches to Rc and Arc merged?  They look 
generally useful, if folks think they fit the design of Rc.  You also 
have a patch that adds OwnT to std which is equivalent to ~T but has 
methods that look like Rc's.  You use this, and putting most of your 
treemap.rs inside macro definitions, to get a sort of 
reference-type-polymorphism in your treemap.  Shall I continue with this 
approach and try to get OwnT merged too?  (Opinions from anybody are 
welcome.)


(I used https://github.com/mozilla/rust/pull/9816/files + click Show 
Diff Stats to look through the changes)


-Isaac

___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread Bill Myers
 Did COW improve performance?  What's a good way to do performance 
 testing of Rust code?

The reason I introduced COW when RC  1 is that it allows persistent data 
structures to be mutated in place if there aren't extra references, just like 
non-persistent data structures.

Lots of languages with persistent data structures can't do that because they 
use garbage collection (and thus they can't tell whether there are other 
references due to lack of a reference count), but it seems an essential feature 
to have if one is using reference counting in a language like Rust that is 
supposed to be a systems language producing optimal code.

 Should I try to get your patches to Rc and Arc merged?  They look 
 generally useful, if folks think they fit the design of Rc.  You also 
 have a patch that adds OwnT to std which is equivalent to ~T but has 
 methods that look like Rc's.  You use this, and putting most of your 
 treemap.rs inside macro definitions, to get a sort of 
 reference-type-polymorphism in your treemap.  Shall I continue with this 
 approach and try to get OwnT merged too?  (Opinions from anybody are 
 welcome.)

I did it because I realized that having two different balanced tree 
implementations would have imposed a significant maintenance burden, and thus 
the macro and OwnT approach would have allowed to avoid that by replacing the 
current treemap code completely with the new optionally-persistent version.

Also, having both Rc and Arc versions seems quite useful anyway, and currently 
I think macros are the best way to have both, until Rust starts supporting 
higher-kinded types (which would allow to pass Rc as a parameter), or at least 
recursive types (which would allow to express RcTreeNodeT, RcTreeNode, T, 
... and pass it as a parameter).

I don't have much more input on what the best names for the methods are, 
whether to have OwnT instead of ~T and so on; I guess someone on the Rust 
core team will have to decide on that, and there's already some discussion on 
that on https://github.com/mozilla/rust/pull/9786

  ___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread David Piepgrass
My next goal is a persistent tree-map, probably cribbing from Haskell's
Data.Map.

I look forward to hearing how that goes!

I've been meaning to make a data structure in Rust too, but it's hard to
find the time, so how's about I tell you guys about it instead.

I call my data structure an optionally-persistent hashtable or hashset.
In C# I implemented this optionally-persistent hashtable and hashset with
mutable and immutable variants of each (the term semi-persistent was
already taken and means something else). I've been meaning to write an
article about this, but didn't get around to it yet.

Optionally-persistent means that it's structured as if it were persistent,
but each node can be either mutable (locally) or immutable (recursively).
Each node has a frozen flag which implicitly applies recursively to all
children. Converting the tree from immutable to mutable, or mutable to
immutable takes O(1) time. Immutable to mutable is essentially a no-op (the
mutable copy has copy-on-write behavior), while mutable-to-immutable simply
requires marking the root node as frozen.

The hashtable is really a tree that is up to 8 levels deep, with each
level representing 4 bits of the hashcode (not sure if this is the best
approach). Lots more details in the doc comment:

https://sourceforge.net/p/loyc/code/HEAD/tree/Src/Loyc.Collections/Sets/InternalSet.cs

My benchmarks show that mutating such a set/map is dramatically faster than
immutable editing (which requires copying multiple nodes for each change),
and not that much slower than a traditional hashtable, so I think it's
hands down superior to a traditional persistent hash tree.

In my version, from the end-user's perspective, there's a separate data
type for immutable and mutable versions of the data structure (MMapT and
MSetT are mutable, MapT and SetT are immutable). Both data types
encapsulate an instance of InternalSetT which is the real data
structure. InternalSetT manages a tree of nodes, where each node has 16
entries and represents 4 bits of the hashcode. There's also an interesting
variation called InvertibleSetT; an invertible set can represent a
negative set such as all integers except 0 and 1.

-- 
- David
http://loyc-etc.blogspot.com
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-04 Thread David Piepgrass
Please disregard this message; I hadn't seen Bill Myers' solution
(copy-on-write
by cloning only when reference count  1), which sounds like it's probably
perfect for Rust.


On Wed, Dec 4, 2013 at 9:03 PM, David Piepgrass qwertie...@gmail.comwrote:

 My next goal is a persistent tree-map, probably cribbing from Haskell's
 Data.Map.

 I look forward to hearing how that goes!

 I've been meaning to make a data structure in Rust too, but it's hard to
 find the time, so how's about I tell you guys about it instead.

 I call my data structure an optionally-persistent hashtable or hashset.
 In C# I implemented this optionally-persistent hashtable and hashset with
 mutable and immutable variants of each (the term semi-persistent was
 already taken and means something else). I've been meaning to write an
 article about this, but didn't get around to it yet.

 Optionally-persistent means that it's structured as if it were persistent,
 but each node can be either mutable (locally) or immutable (recursively).
 Each node has a frozen flag which implicitly applies recursively to all
 children. Converting the tree from immutable to mutable, or mutable to
 immutable takes O(1) time. Immutable to mutable is essentially a no-op (the
 mutable copy has copy-on-write behavior), while mutable-to-immutable simply
 requires marking the root node as frozen.

 The hashtable is really a tree that is up to 8 levels deep, with each
 level representing 4 bits of the hashcode (not sure if this is the best
 approach). Lots more details in the doc comment:


 https://sourceforge.net/p/loyc/code/HEAD/tree/Src/Loyc.Collections/Sets/InternalSet.cs

 My benchmarks show that mutating such a set/map is dramatically faster
 than immutable editing (which requires copying multiple nodes for each
 change), and not that much slower than a traditional hashtable, so I think
 it's hands down superior to a traditional persistent hash tree.

 In my version, from the end-user's perspective, there's a separate data
 type for immutable and mutable versions of the data structure (MMapT and
 MSetT are mutable, MapT and SetT are immutable). Both data types
 encapsulate an instance of InternalSetT which is the real data
 structure. InternalSetT manages a tree of nodes, where each node has 16
 entries and represents 4 bits of the hashcode. There's also an interesting
 variation called InvertibleSetT; an invertible set can represent a
 negative set such as all integers except 0 and 1.

 --
 - David
 http://loyc-etc.blogspot.com




-- 
- David
http://qism.blogspot.com
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


[rust-dev] Persistent data structures

2013-12-03 Thread Isaac Dupree
I'm interested in having persistent data structures[1] in Rust.  To 
start, I implemented the simplest one, the cons/nil list (it looks like 
extra::list::List has another implementation of it).  Am I using Rust 
conventions properly?


My persistent list code:
https://github.com/idupree/rust-code/blob/master/persistent.rs

My next goal is a persistent tree-map, probably cribbing from Haskell's 
Data.Map.



Is Rc the best smart pointer for persistent data structures?  Is it 
possible for the structure to be parametrized on smart pointer?


Rc requires the contained data to be Freeze or Send or risk reference 
cycles.  Gc requires T:'static (which means no borrowed pointers besides 
'static ones within the type).  Every Send type is 'static, but not 
every Freeze type is 'static, so neither Rc nor Gc is strictly more 
flexible.  Arc is Send, unlike either Rc or Gc, but has more overhead 
and can only contain Freeze+Send data; someone wanting to share 
persistence between tasks (conceivably for the sake of memory-use or 
asymptotic time) would want it.



Is it possible to implement FromIteratorT for ListT without using 
O(n) temporary space or unsafe code?  The problem is that the list 
comes out reversed in an obvious implementation. (O(n) stack space via 
recursion, or an O(n) intermediate data structure, or unsafely 
constructing a cons cell before constructing its tail.)



[1] https://en.wikipedia.org/wiki/Persistent_data_structure , like every 
data structure in Haskell


-Isaac
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev


Re: [rust-dev] Persistent data structures

2013-12-03 Thread Tony Arcieri
On Tue, Dec 3, 2013 at 11:28 PM, Isaac Dupree 
m...@isaac.cedarswampstudios.org wrote:

 I'm interested in having persistent data structures[1] in Rust.


Nice!

-- 
Tony Arcieri
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev