Re[2]: [Haskell-cafe] trees and pointers

2010-07-16 Thread Bulat Ziganshin
Hello Jake,

Friday, July 16, 2010, 7:26:22 AM, you wrote:

 Excluding DiffArray under certain usage patterns of course, but
 DiffArray is slow for unknown reasons besides algorithmic complexity.

unknown reason = MVar usage

ArrayRef library contains parameterized DiffArray implementation that
may be specialized either to MVar or IORef usage


-- 
Best regards,
 Bulatmailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-16 Thread Jan-Willem Maessen
2010/7/16 wren ng thornton w...@freegeek.org:
 Jake McArthur wrote:

 On 07/15/2010 05:33 PM, Victor Gorokhov wrote:

 From the docs, lookup is O(min(n,W))

 Actually worse than O(log n).

 Perhaps I am misunderstanding you, but O(min(n,W)) is either better than
 or the same as O(log n), depending on how you look at things, but I don't
 see any way that the former could be *worse* than the latter.

 For n  W: min(n,W)  log n

 So, when you can guarantee that n  W ---which is almost always the case for
 IntMap---, then O(min(n,W)) is linear and therefore worse than O(log n).

Indeed---though you see worst-case behavior only for carefully-chosen
key sets (eg successive powers of 2).  If the n values in an IntMap
are, say, consecutive or nearly-consecutive, we obtain a log n bound.
I suspect in practice most programmers will see logarithmic behavior.

 But even so, if your constant factors are k  c, then k*n  c*log n is
 perfectly possible for all n  W, and therefore what matters in the real
 world here is the constant factors. The reason why is that for asymptotic
 purposes O(min(n,W)) and O(log n) belong to the same class of functions
 between constant and linear, so they're effectively the same (in
 asymptotic-land).

The argument for constant-time IntMap performance is essentially
similar to the following argument:

There are balanced trees that provide an O(lg n) bound on tree depth
for a tree containing n elements.  Our computer has only k bits of
address space and therefore the number of elements in any in-memory
tree is O(k).  Thus there is a constant (and smallish) upper bound on
tree depth, O(lg k).  Therefore every balanced tree implementation
offers constant-time access.

As you observe, it's really down to constant factors.  The reason
IntMap (or any digital trie) is so interesting is that it is simple
enough that the constant factors are quite good---in particular we
don't waste a lot of time figuring out if we're going to need to
rearrange the tree structure on the fly.  That turns out to amortize
quite a few extra level traversals in a lot of settings.

It also offers really elegant implementations of union and unions.
Whether that means they're quickish I leave to the benchmarkers.

-Jan-Willem Maessen


 --
 Live well,
 ~wren
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-16 Thread wren ng thornton

Jan-Willem Maessen wrote:

As you observe, it's really down to constant factors.  The reason
IntMap (or any digital trie) is so interesting is that it is simple
enough that the constant factors are quite good---in particular we
don't waste a lot of time figuring out if we're going to need to
rearrange the tree structure on the fly.  That turns out to amortize
quite a few extra level traversals in a lot of settings.


This simplicity of not rebalancing also allows for more sharing in a 
persistent setting, which can be a big gain for programs with the right 
kinds of data distributions.




It also offers really elegant implementations of union and unions.
Whether that means they're quickish I leave to the benchmarkers.


In my experience (NLP work), the performance of unions for tries is one 
of their major selling points. In Okasaki's venerable benchmarks[1], 
they vastly outperform all competitors for merging. Even in 
imperative-land, that's one of the reasons tries are so common in NLP 
and are often chosen over hashmaps for certain tasks.



[1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.5452

--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Stephen Tetley
2010/7/15 Jake McArthur jake.mcart...@gmail.com:
 On 07/14/2010 05:01 PM, Victor Gorokhov wrote:

 You can implement pure pointers on top of Data.Map with O(log n) time

 Or on top of Data.IntMap with O(1) time. ;)

Unlikely...

From the docs, lookup is O(min(n,W))
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Felipe Lessa
On Thu, Jul 15, 2010 at 4:30 AM, Stephen Tetley
stephen.tet...@gmail.com wrote:
 2010/7/15 Jake McArthur jake.mcart...@gmail.com:
 On 07/14/2010 05:01 PM, Victor Gorokhov wrote:

 You can implement pure pointers on top of Data.Map with O(log n) time

 Or on top of Data.IntMap with O(1) time. ;)

 Unlikely...

 From the docs, lookup is O(min(n,W))

W is a constant, 32 or 64 on most machines, so this is really O(W) = O(1).

Should someone create an IntegerMap, then lookup wouldn't be O(1) as W
would be variable.

Cheers!

-- 
Felipe.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Ivan Lazar Miljenovic
Stephen Tetley stephen.tet...@gmail.com writes:

 2010/7/15 Jake McArthur jake.mcart...@gmail.com:
 On 07/14/2010 05:01 PM, Victor Gorokhov wrote:

 You can implement pure pointers on top of Data.Map with O(log n) time

 Or on top of Data.IntMap with O(1) time. ;)

 Unlikely...

From the docs, lookup is O(min(n,W))

Yeah, I was trying to work out how the O(1) time worked as well...

-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
IvanMiljenovic.wordpress.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Sergey Mironov
2010/7/15 Serguey Zefirov sergu...@gmail.com:
 2010/7/14 Sergey Mironov ier...@gmail.com:
 Hi cafe! I have a question of C-to-Haskell type:)

 Imagine web application wich allows users to browse some shared
 filesystem located at the server.
 Application stores every users's position within that filesystem
 (current directory or file).

 In C this can be implemented with the help of following data types:

 Any ideas?

 Use IORef. ;)

 PS
 MVar is better, actually.


Somehow I forgot about them:) Code will turn into something like

data TreeNodeData = File | Dir (IORef TreeNode)
data TreeNode = TreeNode {
next :: Maybe (IORef TreeNode),
prev :: Maybe (IORef TreeNode),
up :: Maybe (IORef TreeNode), -- missed it in original C example
payload :: TreeNodeData
}

data User = User {
position :: IORef TreeNode,
-- ...
}

It really should work! (we don't take multithreading issues into
account for now)
Slightly annoying thing is that 1-to-1 mapping from C to Haskell also
forces programmer to perform C-like low-level pointer linking.

-- 
Thanks,
Sergey
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Sergey Mironov
15 июля 2010 г. 2:01 пользователь Victor Gorokhov m...@rkit.pp.ru написал:
 You can implement pure pointers on top of Data.Map with O(log n) time:

 {-# LANGUAGE ExistentialQuantification #-}
 import Data.Map ( Map )
 import qualified Data.Map as Map
 import Data.Typeable
 import Control.Monad.State
 import Data.Maybe

 type PointerSpace = Map Int PackedValue
 newtype Pointer a = Pointer Int
 data PackedValue = forall a. Typeable a = PackedValue a

 readPointer :: Pointer a - State PointerSpace a
 readPointer ( Pointer key ) =  do
  space - get
  return $ fromJust $ cast $ Map.find key space

 writePointer :: a - Pointer a - State PointerSpace ()
 writePointer a ( Pointer key ) = do
  space - get
  put $ Map.insert key ( PackedValue a ) space

 newPointer :: a - State PointerSpace ( Pointer a )
 newPointer a = do
  space - get
  let key = findEmptyKey space -- implement it yourself
     p = Pointer key
  writePointer a p
  return p

Thanks for an example! Probably, one can think about using Arrays
instead of Map or IntMap in order to achieve 'true' O(1) in pure. But
I suppose that there are some trouble with array expanding. Or
somebody would already make it.

-- 
Thanks,
Sergey
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Serguey Zefirov
2010/7/15 Sergey Mironov ier...@gmail.com:
 2010/7/15 Serguey Zefirov sergu...@gmail.com:
 2010/7/14 Sergey Mironov ier...@gmail.com:
 Hi cafe! I have a question of C-to-Haskell type:)

 Imagine web application wich allows users to browse some shared
 filesystem located at the server.
 Application stores every users's position within that filesystem
 (current directory or file).

 In C this can be implemented with the help of following data types:

 Any ideas?

 Use IORef. ;)

 PS
 MVar is better, actually.
 Somehow I forgot about them:) Code will turn into something like

 It really should work! (we don't take multithreading issues into
 account for now)
 Slightly annoying thing is that 1-to-1 mapping from C to Haskell also
 forces programmer to perform C-like low-level pointer linking.

This is just straightforward solution and it contains some number of
traps. What if someone disconnected a part of file system while some
user browses it? That user will be trapped inside that island (or get
a core dump), How do users get notifications about changes in their
parts of structures?

You can do better but, of course, it will be harder.

The browsing itself is a simple variant of collaborative editing:
http://en.wikipedia.org/wiki/Collaborative_editing Your variant is
simpler that editing because only one member produce changes in
structure. So you could send tree diffs in Zipper format to all your
users or accumulate diffs and give them to users when they ask for it.

Adding tree diff over user position described as Zipper won't put user
into an isolated island.

And if you later decide that there are two parties who can change the
world, you are almost fully prepared for it.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Jake McArthur

On 07/15/2010 02:30 AM, Stephen Tetley wrote:

2010/7/15 Jake McArthurjake.mcart...@gmail.com:

On 07/14/2010 05:01 PM, Victor Gorokhov wrote:


You can implement pure pointers on top of Data.Map with O(log n) time


Or on top of Data.IntMap with O(1) time. ;)


Unlikely...

 From the docs, lookup is O(min(n,W))


Exactly. O(min(n,32)) or O(min(n,64))

- Jake
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Victor Gorokhov



Thanks for an example! Probably, one can think about using Arrays
instead of Map or IntMap in order to achieve 'true' O(1) in pure. But
I suppose that there are some trouble with array expanding. Or
somebody would already make it.

  

Pure arrays have O(n) modification time.


From the docs, lookup is O(min(n,W))


Actually worse than O(log n).


B-tree with 4 or even 8 child nodes will be the best solution. This 
trees have better lookup time and worse space efficiency, but we can 
almost eliminate space overhead by using dense keys.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread Jake McArthur

On 07/15/2010 05:33 PM, Victor Gorokhov wrote:



Thanks for an example! Probably, one can think about using Arrays
instead of Map or IntMap in order to achieve 'true' O(1) in pure. But
I suppose that there are some trouble with array expanding. Or
somebody would already make it.


Pure arrays have O(n) modification time.


Excluding DiffArray under certain usage patterns of course, but 
DiffArray is slow for unknown reasons besides algorithmic complexity.





From the docs, lookup is O(min(n,W))


Actually worse than O(log n).


Perhaps I am misunderstanding you, but O(min(n,W)) is either better than 
or the same as O(log n), depending on how you look at things, but I 
don't see any way that the former could be *worse* than the latter.


- Jake
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-15 Thread wren ng thornton

Jake McArthur wrote:

On 07/15/2010 05:33 PM, Victor Gorokhov wrote:

From the docs, lookup is O(min(n,W))

Actually worse than O(log n).


Perhaps I am misunderstanding you, but O(min(n,W)) is either better than 
or the same as O(log n), depending on how you look at things, but I 
don't see any way that the former could be *worse* than the latter.


For n  W: min(n,W)  log n

So, when you can guarantee that n  W ---which is almost always the case 
for IntMap---, then O(min(n,W)) is linear and therefore worse than O(log n).


But even so, if your constant factors are k  c, then k*n  c*log n is 
perfectly possible for all n  W, and therefore what matters in the real 
world here is the constant factors. The reason why is that for 
asymptotic purposes O(min(n,W)) and O(log n) belong to the same class of 
functions between constant and linear, so they're effectively the same 
(in asymptotic-land).


--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] trees and pointers

2010-07-14 Thread Sergey Mironov
Hi cafe! I have a question of C-to-Haskell type:)

Imagine web application wich allows users to browse some shared
filesystem located at the server.
Application stores every users's position within that filesystem
(current directory or file).

In C this can be implemented with the help of following data types:

struct tree_node {
union item {
// some file data
struct file *file;

// struct dir has link to another list of tree_node
struct dir *dir;
};
int type;

// List of tree_nodes
struct tree_node *next;
struct tree_node *prev;
};

struct user {
struct tree_node *position;

// List of users
struct user *next;
struct user *prev;
};

This implementation will give us
1) O(1) time to insert to shared tree
2) O(1) time to access user's current position

Is it possible to reach this requirements in haskell?

For example, managing distinct tree type like

data TreeNode = File | Dir [TreeNode]

will lead to failure of req. 2 (have to traverse this
tree to find each user's position).

Also one could manage several zipper types (one for every user):

data TreeNodeCtx = Top | TreeNodeCtx {
left :: [TreeNode],
right :: [TreeNode],
up :: TreeNodeCtx
}

data TreeNodeZ = TreeNodeZ {
ctx :: [TreeNodeCtx]
pos :: TreeNode
}

It works for one user but not for many because of req. 1 (have to
insert new item into
several zippers).

Any ideas?

-- 
Thanks,
Sergey
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-14 Thread Serguey Zefirov
2010/7/14 Sergey Mironov ier...@gmail.com:
 Hi cafe! I have a question of C-to-Haskell type:)

 Imagine web application wich allows users to browse some shared
 filesystem located at the server.
 Application stores every users's position within that filesystem
 (current directory or file).

 In C this can be implemented with the help of following data types:

 Any ideas?

Use IORef. ;)

PS
MVar is better, actually.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-14 Thread Andrew Coppin

Serguey Zefirov wrote:

Use IORef. ;)

PS
MVar is better, actually


TVar is better still. ;-)

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-14 Thread Gregory Crosswhite
Or you can get the best of all worlds by combining all three!

data User = User
{userNext :: IORef (MVar (TVar User)))
,userPrev :: IORef (MVar (TVar User)))
}


On 07/14/10 14:39, Andrew Coppin wrote:
 Serguey Zefirov wrote:
 Use IORef. ;)

 PS
 MVar is better, actually

 TVar is better still. ;-)

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-14 Thread Victor Gorokhov

You can implement pure pointers on top of Data.Map with O(log n) time:

{-# LANGUAGE ExistentialQuantification #-}
import Data.Map ( Map )
import qualified Data.Map as Map
import Data.Typeable
import Control.Monad.State
import Data.Maybe

type PointerSpace = Map Int PackedValue
newtype Pointer a = Pointer Int
data PackedValue = forall a. Typeable a = PackedValue a

readPointer :: Pointer a - State PointerSpace a
readPointer ( Pointer key ) =  do
 space - get
 return $ fromJust $ cast $ Map.find key space

writePointer :: a - Pointer a - State PointerSpace ()
writePointer a ( Pointer key ) = do
 space - get
 put $ Map.insert key ( PackedValue a ) space

newPointer :: a - State PointerSpace ( Pointer a )
newPointer a = do
 space - get
 let key = findEmptyKey space -- implement it yourself
 p = Pointer key
 writePointer a p
 return p

Code can contain some typos.

Sergey Mironov пишет:

Hi cafe! I have a question of C-to-Haskell type:)

Imagine web application wich allows users to browse some shared
filesystem located at the server.
Application stores every users's position within that filesystem
(current directory or file).

In C this can be implemented with the help of following data types:

struct tree_node {
union item {
// some file data
struct file *file;

// struct dir has link to another list of tree_node
struct dir *dir;
};
int type;

// List of tree_nodes
struct tree_node *next;
struct tree_node *prev;
};

struct user {
struct tree_node *position;

// List of users
struct user *next;
struct user *prev;
};

This implementation will give us
1) O(1) time to insert to shared tree
2) O(1) time to access user's current position

Is it possible to reach this requirements in haskell?

For example, managing distinct tree type like

data TreeNode = File | Dir [TreeNode]

will lead to failure of req. 2 (have to traverse this
tree to find each user's position).

Also one could manage several zipper types (one for every user):

data TreeNodeCtx = Top | TreeNodeCtx {
left :: [TreeNode],
right :: [TreeNode],
up :: TreeNodeCtx
}

data TreeNodeZ = TreeNodeZ {
ctx :: [TreeNodeCtx]
pos :: TreeNode
}

It works for one user but not for many because of req. 1 (have to
insert new item into
several zippers).

Any ideas?

  


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trees and pointers

2010-07-14 Thread Jake McArthur

On 07/14/2010 05:01 PM, Victor Gorokhov wrote:

You can implement pure pointers on top of Data.Map with O(log n) time


Or on top of Data.IntMap with O(1) time. ;)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe