[given a set of ranges, find items that are in range.
one way is for each range, look at each item.
runtime is proportional to sizes: range set X item set
if this is too slow, what is a faster way?]
*Design notes*
If theory bores you, skip to *Implementation notes*.
1. Changing when you compare things.
The original approach slows down in proportion to
the product of the dataset sizes. If you double the
size of the two datasets, the time taken will, ignoring
some practical issues, be four times as much.
If you are comparing the elements of datasets,
think about sorting.
Sorting can be done so that its time growth is not
much worse than linear -- doubling the dataset size
results in little more than a doubling of the time to
sort the set.
2. Changing the problem.
The problem as stated allows that ranges may
overlap arbitrarily. Perhaps this is not the case,
in which case there might be scope for improving
things based on the real limits.
If they do overlap arbitrarily, and there's a lot of
overlapping going on, then it might make sense
to convert the current set to a new set of ordered
non overlapping ranges.
For example, if you start with:
1 - 4
2 - 3
3 - 5
6 - 7
then you would end up with:
1 - 5
6 - 7
----------
*Implementation notes*
If we stick with your original idea of having the
data be stored in hashes, then we need to sort
hashes.
If you don't do the reduce-to-non-overlapping
ranges thing, then the range list should at
least be sorted with vals descending within
keys ascending.
If you can have elements be added to their
respective hashes already in order according
to the times in their key, then perhaps the neatest
solution is to use Tie::IxHash. This neat module
saves you code and execution time by making
the keys, each, and values functions automatically
return results in the order they were added.
If not, well, just use something like:
@one = (sort byrange keys %one)
@two = (sort keys %two)
and define the byrange sort to sort vals descending
within keys ascending.
Then compare, pulling more from each dataset only
as needed. I'm going to assume use of Tie::IxHash.
I'm not actually sure this code is right, but I think it's
sorta there:
$t = $et = --$st; # tricks to get ball rolling
{
($t, $val) = each %two if $t < $st;
($st, $et) = each %one if $t >= $et;
last if $t = undef or $st = undef;
if $t >= $st and $t < $et
{
push @match, $val;
($t, $val) = each %one;
}
} while 1
--------
there's bound to completely different ways of
looking at the problem that kick the above in
to touch. but i gotta run...