Hal Daume III writes: : | *all* i care about is being able to quickly calculate the size of | the intersection of two sets. these sets are, in general, very | sparse, which means that the intersections tend to be small. | | for example, i might have two sets reprsented by the arrays: | | {0,1,10,346,398,1039,3289,3853,9811,89231,50913} | {0,3,98,183,398,1038,5319,7642,9811,13893,93123} | | and all i need to be able to do is respond with "3" very very | quickly. using sorted arrays, this takes O(n+m) time (where n and | m are the sizes of the arrays). | | i'm willing to pay some up front cost for creating a data structure | that could do this more quickly (i.e., in logarithmic time). | | is such a thing possible? "does anyone have a haskell | implementation of" such a thing? :P
Hi. The total time (including the up front time for building the data structure) can't go below O(n+m), because if it did, you'd be neglecting to look at some of the elements at all. To move some of the effort up front, I suspect you're looking at a compression problem in disguise. For example, if your data weren't so sparse, you might try something like this: IntSet = [(Int, Int)] -- ordered list of non-overlapping ranges setToList s = concat [[a..b] | (a, b) <- s] intersectionSize :: IntSet -> IntSet -> Int intersectionSize s1@((a, b):s1') s2@((c, d):s2') = rangeSize (max a c, min b d) + case compare b d of LT -> intersectionSize s1' s2 EQ -> intersectionSize s1' s2' GT -> intersectionSize s1 s2' intersectionSize _ _ = 0 So... how *compressible* are your data? - Tom _______________________________________________ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell