Re: slightly ot: data structure question

2002-08-02 Thread George Russell

Andrew Bromage wrote
> On Mon, Jul 29, 2002 at 02:50:55PM -0700, Hal Daume III wrote:
> 
> > I need a data structure which is a map from Ints to Doubles; the
> > distribution of the Ints is in the range say 0-2 and a map will
> > contain somewhere around 100-200 elements.  I need to be able to query
> > *very* quickly, insert *quite* quickly, and iterate through the elements
> > very quickly.  I'm thinking hash tables, but I'm not sure.  Does anyone
> > have any thoughts on this?
> 
> How about this?
> 
> http://citeseer.nj.nec.com/briggs93efficient.html
> 
> Another paper which might be useful:
> 
> Aasa, A., Holstrom, S. and Nilsson, C. (1988),
>"An Efficiency Comparison of Some Representations of Purely
>Functional Arrays". BIT 28(3):490--503.
You aren't doing sparse matrix computations are you?  In that case there are
specialised storage techniques around, which are probably better than any
general solution.  If you ask on sci.math.num-analysis you'll probably get
more references than you know what to do with.

The method in the first paper effectively has one array of length 2 allowing
you to query *very* quickly, plus a smaller array listing the elements.
I think the main problem with this would be the array of 2 elements.  Apart from
the cost of memory (which is probably pretty insignificant), you have the problem that
2 elements is 160kB which is quite a lot to all keep in cache at once.  If the 
elements 
are uniformly distributed it seems to me you might get better performance by, say, 
using
a closed hash table with size say 1024 and indexing just by some selection of 10 bits 
from
the integers.  (Of course if the integers are not uniformly distributed you will need
a better hash function.)  Then in 80-90% of cases you will get a positive/negative 
answer
to the query in just one lookup; if it's a don't know (there is a different element
at this position) you will only have to scan through the same cache line (almost 
certainly)
until you get the right answer.  You can also dispense with the first paper's auxiliary
array this way by setting the integers in the array to out-of-range values (or by using
signalling NaN's for the doubles); iterating through the array checking all the 
integers
shouldn't take too long.
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Re: slightly ot: data structure question

2002-08-01 Thread Andrew J Bromage

G'day all.

On Mon, Jul 29, 2002 at 02:50:55PM -0700, Hal Daume III wrote:

> I need a data structure which is a map from Ints to Doubles; the
> distribution of the Ints is in the range say 0-2 and a map will
> contain somewhere around 100-200 elements.  I need to be able to query
> *very* quickly, insert *quite* quickly, and iterate through the elements
> very quickly.  I'm thinking hash tables, but I'm not sure.  Does anyone
> have any thoughts on this?

How about this?

http://citeseer.nj.nec.com/briggs93efficient.html

Another paper which might be useful:

Aasa, A., Holstrom, S. and Nilsson, C. (1988),
   "An Efficiency Comparison of Some Representations of Purely
   Functional Arrays". BIT 28(3):490--503.

Cheers,
Andrew Bromage
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



slightly ot: data structure question

2002-07-29 Thread Hal Daume III

Not really Haskell-specific, but...

I need a data structure which is a map from Ints to Doubles; the
distribution of the Ints is in the range say 0-2 and a map will
contain somewhere around 100-200 elements.  I need to be able to query
*very* quickly, insert *quite* quickly, and iterate through the elements
very quickly.  I'm thinking hash tables, but I'm not sure.  Does anyone
have any thoughts on this?

 - Hal

--
Hal Daume III

 "Computer science is no more about computers| [EMAIL PROTECTED]
  than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume

___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell