Re: [Haskell-cafe] function result caching

2006-10-14 Thread Ketil Malde
Robert Dockins [EMAIL PROTECTED] writes:

 slowFunctionCacheList = [slowFunction (i) | i <- [0..5000000]]
 and use slowFunctionCacheList !! i instead of slowFunction (i)

 Not much different in principle, but better in practice - you could
 use an array rather than a list.  O(1) lookups should make things (a
 lot) faster.

 Well, this is true only if the domain of the function is small and 
 fairly dense.  

I don't think so.

 With 5000000 elements, you're looking at allocating about 20Mb of
 memory

On the other hand, the list allocates the 20Mb of pointers, *and*
another 20Mb of cons cells for the list. 

 to hold pointers to closures_ and then allocating and filling out 500 
 closures, all before you get down to doing any real work!  

If I interpret you correctly, you want to make the array's contents
strict?  Not a good idea when the domain is sparse, but on the other
hand it would let you unbox the contents, which means you'd only need
to store the actual values. For boolean values, I think GHC
stores one bit per value, i.e. less than a MB for this range.
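[Editorial note: the strict, unboxed variant described above can be sketched as follows. This is a hypothetical example: the predicate `p` is a made-up stand-in for the real function, and the bound is shrunk for illustration. GHC bit-packs `UArray Int Bool`, so even the full `[0..5000000]` table would stay well under a megabyte.]

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))

-- A made-up predicate standing in for the real function being cached.
p :: Int -> Bool
p n = n `mod` 7 == 0

bound :: Int
bound = 100000   -- the thread's [0..5000000] works the same way

-- Strict and unboxed: every element is computed when the array is forced,
-- but Bool elements are bit-packed, so the table costs roughly bound/8 bytes.
pTable :: UArray Int Bool
pTable = listArray (0, bound) [p i | i <- [0 .. bound]]

cachedP :: Int -> Bool
cachedP = (pTable !)
```

As the paragraph above notes, this only pays off when the domain is dense enough that precomputing everything is worthwhile.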

 Little-endian patricia trees [...]

Yes, sure, if you can't afford a 20Mb index.  On the other hand,
changing the function to use an array is a very small modification,
and probably more than good enough in many cases. 

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] function result caching

2006-10-14 Thread Robert Dockins
On Saturday 14 October 2006 13:13, Ketil Malde wrote:
 Robert Dockins [EMAIL PROTECTED] writes:
  slowFunctionCacheList = [slowFunction (i) | i <- [0..5000000]]
  and use slowFunctionCacheList !! i instead of slowFunction (i)
 
  Not much different in principle, but better in practice - you could
  use an array rather than a list.  O(1) lookups should make things (a
  lot) faster.
 
  Well, this is true only if the domain of the function is small and
  fairly dense.

 I don't think so.

  With 5000000 elements, you're looking at allocating about 20Mb of
  memory

 On the other hand, the list allocates the 20Mb of pointers, *and*
 another 20Mb of cons cells for the list.

True, but only if you access deeply into the tail of the list.  If one 
accesses only the first several hundred elements, say, then you'll only 
allocate the space needed for those.

Of course, if you only want to access a small portion at the beginning, then 
why create such a big list in the first place?  Moral: lists will lose this 
contest in almost all cases.

  to hold pointers to closures_ and then allocating and filling out 500
  closures, all before you get down to doing any real work!

 If I interpret you correctly, you want to make the array's contents
 strict?  Not a good idea when the domain is sparse, but on the other
 hand it would let you unbox the contents, which means you'd only need
 to store the actual values. For boolean values, I think GHC
 stores one bit per value, i.e. less than a MB for this range.

No, I didn't suggest that the elements be strict.  That would involve 
precomputing the entire table.  You _could_ do that if you anticipate a LOT 
of accesses, sufficient to outweigh the initial cost.  But that seems 
unlikely for a sparse domain, as you mentioned.

However, even non-strict arrays are created all at once (ie, they are strict 
in their _structure_), and the closures have to be allocated as the array is 
being created.  Creating a closure isn't terribly expensive, but creating 
5000000 closures might take a while and cost a lot of memory if the closure 
has a large number of free variables (which depends on the function 
definition and the exact details of the lambda lifter).  Also, large arrays 
tend to play havoc with GHC's garbage collector; it has to scan all elements 
on every major GC, IIRC.  That alone may offset any advantages won.

In the end, the only way to be sure which method is best is to test it against 
your usage profile.  My guess is that the array method will have enough 
overhead that it will lose against a tree.  However I may be wrong, 
especially if the program will have a very long runtime and if a warm-up 
period is acceptable.

  Little-endian patricia trees [...]

 Yes, sure, if you can't afford a 20Mb index.  On the other hand,
 changing the function to use an array is a very small modification,
 and probably more than good enough in many cases.

I completely agree; it is good for many cases, and can be a very useful 
technique.  I just don't think it will be good for _this_ case (large, sparse 
domain where f(n) doesn't depend on all f(m) where m < n).  That probability 
is positively correlated with the size of the domain.  Again, the only way to 
really know is to implement and benchmark.  Thankfully, caching techniques 
are completely local and can be changed easily.

 -k

-- 
Rob Dockins

Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
   -- TMBG


Re: [Haskell-cafe] function result caching

2006-10-14 Thread Silviu Gheorghe
thank you for all the answers. i was aware lists are not the best solution,
but i was too keen to see the actual result. I'll do some tests though using
different variants, because i have the feeling that in my next program I'll
face the strong form of this problem.
On 10/13/06, Silviu Gheorghe [EMAIL PROTECTED] wrote:
 it does, thank you very much for the quick answer, unfortunately as I
 understand it, it doesn't work well on ints :(

 for just now i created a list

 slowFunctionCacheList = [slowFunction (i) | i <- [0..5000000]]
 and use slowFunctionCacheList !! i instead of slowFunction (i)

 it helped a lot (i mean i stopped the program after 3 hours still working
 and got the result in 2 minutes :))

 i am still curious about a better method (and a general one), because this
 is ugly, and it only works on ints as i see it. but then again thank you
 for telling me it doesn't do it, because i had the false impression it does
 and i wouldn't stop it otherwise

 On 10/13/06, Tom Phoenix [EMAIL PROTECTED] wrote:
  On 10/12/06, Silviu Gheorghe [EMAIL PROTECTED] wrote:
   I'd like to know if the results are cached by the compiler

  Hardly ever, as I understand things.

   if they are not I'd like to know what is the best way to cache them
   manually, and where can I read more about this, and the optimizations
   the compiler does, because I've searched the web before and i found
   very little on this topic.

  You need to search for the word "memoize" (or "memoise"). Here's a
  page about a memo function for GHC.

  http://www.haskell.org/ghc/docs/6.4.2/html/hslibs/memo-library.html

  Hope this helps!

  --Tom Phoenix




Re: [Haskell-cafe] function result caching

2006-10-13 Thread Tomasz Zielonka
On Thu, Oct 12, 2006 at 08:40:44PM -0700, John Meacham wrote:
 it is too bad IntSet and IntMap are strict in their subtrees, it would
 have been nice to provide things like
 
 out of curiosity, why are IntMap and IntSet strict in their subtrees.

I guess the reason is balancing. I can't think of any way of balancing a
lazy tree that wouldn't break abstraction.

Perhaps it would be possible to use some trick to rebalance an existing
tree to account for what's currently evaluated. But it could be very
tricky to get it right and it would certainly go beyond Haskell 98.

Best regards
Tomasz


Re: [Haskell-cafe] function result caching

2006-10-13 Thread Jan-Willem Maessen


On Oct 13, 2006, at 4:05 AM, Tomasz Zielonka wrote:

 On Thu, Oct 12, 2006 at 08:40:44PM -0700, John Meacham wrote:
  it is too bad IntSet and IntMap are strict in their subtrees, it would
  have been nice to provide things like

  out of curiosity, why are IntMap and IntSet strict in their subtrees.

 I guess the reason is balancing. I can't think of any way of balancing a
 lazy tree that wouldn't break abstraction.

Uh, Patricia trees aren't balanced in the usual sense.  There is exactly 
one tree structure for a given set of keys, regardless of insertion order 
etc.  (IntSet and IntMap work approximately as Carl Witty described last I 
checked, though I won't swear to whether bits are taken low to high or vice 
versa.)

I had assumed the strictness was to avoid deferring O(n) insertion work to 
the first lookup operation---though really it makes no difference in an 
amortized sense.

-Jan-Willem Maessen









Re: [Haskell-cafe] function result caching

2006-10-13 Thread Tomasz Zielonka
On Fri, Oct 13, 2006 at 08:52:09AM -0400, Jan-Willem Maessen wrote:
  I guess the reason is balancing. I can't think of any way of
  balancing a lazy tree that wouldn't break abstraction.
 
 Uh, Patricia trees aren't balanced in the usual sense.
 There is exactly one tree structure for a given set of keys,
 regardless of insertion order etc.  (IntSet and IntMap work
 approximately as Carl Witty described last I checked, though I won't
 swear to whether bits are taken low to high or vice versa.)

Ah, IntMaps! Forget what I said.

Best regards
Tomasz


Re: [Haskell-cafe] function result caching

2006-10-13 Thread Ketil Malde
Silviu Gheorghe [EMAIL PROTECTED] writes:

 slowFunctionCacheList = [slowFunction (i) | i <- [0..5000000]]
 and use slowFunctionCacheList !! i instead of slowFunction (i)

 i am still curious about a better method (and a general one)

Not much different in principle, but better in practice - you could
use an array rather than a list.  O(1) lookups should make things (a
lot) faster.
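[Editorial note: as a sketch, the change from list to array is small. Here `slowFunction` is a stand-in for the poster's actual function, and the bound is shrunk for illustration.]

```haskell
import Data.Array (Array, listArray, (!))

-- A stand-in for the poster's expensive function.
slowFunction :: Int -> Integer
slowFunction n = sum [1 .. toInteger n]

bound :: Int
bound = 100000   -- the thread's [0..5000000] works the same way

-- Built once; each element is a lazy thunk, computed on first lookup
-- and remembered thereafter.
slowFunctionCache :: Array Int Integer
slowFunctionCache = listArray (0, bound) [slowFunction i | i <- [0 .. bound]]

cachedSlowFunction :: Int -> Integer
cachedSlowFunction i = slowFunctionCache ! i   -- O(1) lookup vs O(i) for (!!)
```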

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants



Re: [Haskell-cafe] function result caching

2006-10-13 Thread Robert Dockins
On Friday 13 October 2006 16:15, Ketil Malde wrote:
 Silviu Gheorghe [EMAIL PROTECTED] writes:
  slowFunctionCacheList = [slowFunction (i) | i <- [0..5000000]]
  and use slowFunctionCacheList !! i instead of slowFunction (i)
 
  i am still curious about a better method (and a general one)

 Not much different in principle, but better in practice - you could
 use an array rather than a list.  O(1) lookups should make things (a
 lot) faster.

Well, this is true only if the domain of the function is small and 
fairly dense.  He said it was large and sparsely populated.  

With 5000000 elements, you're looking at allocating about 20Mb of memory _just 
to hold pointers to closures_ and then allocating and filling out 5000000 
closures, all before you get down to doing any real work!  That's just not 
going to be very fast.  You're going to take a HUGE constant runtime and 
space penalty for that O(1) lookup.

Little-endian patricia trees are probably the right way to go here (which I 
think has been mentioned already).  You get O(log n) lookup and good behavior 
for a sparse and/or infinite domain because only the parts of the tree that 
are explored get unfolded.


 -k

-- 
Rob Dockins

Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
   -- TMBG


[Haskell-cafe] function result caching

2006-10-12 Thread Silviu Gheorghe
hello,

I'm new to Haskell and I'm using it to do some simulations for a game, and I
have some functions that take as argument just one int (the current
situation), but they do a lot of computations after that (future evolutions
etc).

I'd like to know if the results are cached by the compiler (there are only a
few thousand values i call the functions on, but they are distributed over a
fairly large interval (0-100), because of the codification).

if they are not I'd like to know what is the best way to cache them manually,
and where can I read more about this, and the optimizations the compiler
does, because I've searched the web before and i found very little on this
topic.

thank you very much
Silviu

P.S. sorry for my English :)


Re: [Haskell-cafe] function result caching

2006-10-12 Thread Tom Phoenix

On 10/12/06, Silviu Gheorghe [EMAIL PROTECTED] wrote:

 I'd like to know if the results are cached by the compiler

Hardly ever, as I understand things.


 if they are not I'd like to know what is the best way to cache them
 manually, and where can I read more about this, and the optimizations the
 compiler does, because I've searched the web before and i found very little
 on this topic.


You need to search for the word "memoize" (or "memoise"). Here's a
page about a memo function for GHC.

   http://www.haskell.org/ghc/docs/6.4.2/html/hslibs/memo-library.html

Hope this helps!

--Tom Phoenix


Re: [Haskell-cafe] function result caching

2006-10-12 Thread Silviu Gheorghe
it does, thank you very much for the quick answer, unfortunately as I
understand it, it doesn't work well on ints :(

for just now i created a list

slowFunctionCacheList = [slowFunction (i) | i <- [0..5000000]]
and use slowFunctionCacheList !! i instead of slowFunction (i)

it helped a lot (i mean i stopped the program after 3 hours still working and
got the result in 2 minutes :))

i am still curious about a better method (and a general one), because this is
ugly, and it only works on ints as i see it. but then again thank you for
telling me it doesn't do it, because i had the false impression it does and i
wouldn't stop it otherwise
On 10/13/06, Tom Phoenix [EMAIL PROTECTED] wrote:
 On 10/12/06, Silviu Gheorghe [EMAIL PROTECTED] wrote:
  I'd like to know if the results are cached by the compiler

 Hardly ever, as I understand things.

  if they are not I'd like to know what is the best way to cache them
  manually, and where can I read more about this, and the optimizations
  the compiler does, because I've searched the web before and i found
  very little on this topic.

 You need to search for the word "memoize" (or "memoise"). Here's a
 page about a memo function for GHC.

 http://www.haskell.org/ghc/docs/6.4.2/html/hslibs/memo-library.html

 Hope this helps!

 --Tom Phoenix


Re: [Haskell-cafe] function result caching

2006-10-12 Thread Carl Witty
On Fri, 2006-10-13 at 01:27 +0300, Silviu Gheorghe wrote:
 it does, thank you very much for the quick answer, unfortunately as I
 understand it, it doesn't work well on ints :(
 
 for just now i created a list 
 
  slowFunctionCacheList = [slowFunction (i) | i <- [0..5000000]] 
  and use slowFunctionCacheList !! i instead of slowFunction (i)
 
  it helped a lot (i mean i stopped the program after 3 hours still
  working and got the result in 2 minutes :))
 
 i am still curious about a better method (and a general one), because
 this is ugly, and it only works on ints as i see it.

I can describe a method that's uglier, faster, and more general; is that
better or not?

You're using an infinite list to store your cached results.  (Well, your
list is actually finite, but an infinite list would work just as well.)

Instead of using an infinite list, you can use an infinite binary tree,
with a cached result at every node.  Construct a binary tree with the
following property: Consider the path from the root to a node, where
left branches are called 0 and right branches are called 1.  This
sequence of 0's and 1's is the binary expansion of the key whose cached
value is stored at that node (with the least-significant-bit at the root
of the tree).  (Actually constructing this tree, and looking things up
in it, is left as an exercise for the reader; but it isn't very hard.)

This generalizes to any kind of key that can be uniquely serialized into
bits; looking up the corresponding value then takes O(# of bits in the
key) extra time and space, which is far better than the O(2^(# of bits
in the key)) that an infinite list would use.
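[Editorial note: one possible version of the construction left as an exercise above, for non-negative Int keys only; the names `BitTrie`, `tabulate`, `index`, and `memoInt` are this sketch's own.]

```haskell
-- Infinite lazy binary tree: the value for key n lives at the node reached
-- by reading n's bits least-significant first (0 = left, 1 = right),
-- stopping when the remaining key is 0.
data BitTrie a = Node a (BitTrie a) (BitTrie a)

-- Build the (conceptually infinite) tree for f.  Laziness means only the
-- nodes actually visited are ever constructed, so sparse domains are cheap.
tabulate :: (Int -> a) -> BitTrie a
tabulate f = Node (f 0)
                  (tabulate (\m -> f (2 * m)))      -- keys whose low bit is 0
                  (tabulate (\m -> f (2 * m + 1)))  -- keys whose low bit is 1

index :: BitTrie a -> Int -> a
index (Node v _ _) 0 = v
index (Node _ l r) n
  | even n    = index l (n `div` 2)
  | otherwise = index r (n `div` 2)

-- Memoize: the shared tree caches each f n the first time it is demanded.
memoInt :: (Int -> a) -> (Int -> a)
memoInt f = index t  where t = tabulate f
```

Note that the result of `memoInt` must be bound once (e.g. `fast = memoInt slow`) so that all calls share the same tree; each lookup costs O(number of bits in the key), as described above.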

Carl Witty




Re: [Haskell-cafe] function result caching

2006-10-12 Thread John Meacham
On Thu, Oct 12, 2006 at 04:01:14PM -0700, Carl Witty wrote:
 Instead of using an infinite list, you can use an infinite binary tree,
 with a cached result at every node.  Construct a binary tree with the
 following property: Consider the path from the root to a node, where
 left branches are called 0 and right branches are called 1.  This
 sequence of 0's and 1's is the binary expansion of the key whose cached
 value is stored at that node (with the least-significant-bit at the root
 of the tree).  (Actually constructing this tree, and looking things up
 in it, is left as an exercise for the reader; but it isn't very hard.)
 
 This generalizes to any kind of key that can be uniquely serialized into
 bits; looking up the corresponding value then takes O(# of bits in the
 key) extra time and space, which is far better than the O(2^(# of bits
 in the key)) that an infinite list would use.

it is too bad IntSet and IntMap are strict in their subtrees, it would
have been nice to provide things like

infiniteMap :: (Int -> a) -> IntMap a
infiniteMap = ...

cachingSet :: (Int -> Bool) -> IntSet
cachingSet = ...

out of curiosity, why are IntMap and IntSet strict in their subtrees?
since their subtrees are not of CPR form, I would think the benefit
would not be that great and might even hurt in some situations... was
there testing of the 'lazy' version? 

John

-- 
John Meacham - ⑆repetae.net⑆john⑈