It was inferred from my original post that the start/stop times were all
integers, which actually was not true.  That's why the bitmap union idea
sounds good, but doesn't work with non-integers.

Thanks to some ideas I got from several people on the list, however, I came
up with an algorithm that seems to work:

# loop through, grabbing start/stop times from file data making sure start
< stop
# push time onto times array, keeping track of whether it's a start/stop
time

# sort times array

# step through times starting at smallest, incrementing counter (c) for
start time,
# decrementing for stop;  if c is going from 0 to 1, i.e. beginning of a
"covered"
# time, set $lTime to current time; if c is going from 1 to 0, i.e. end of
coverage,
# set $lWin (total window) to $lWin + (current time - $lTime)

Here's the code, in case anyone's interested (the first little part deals
with getting the data out of the pile I've stuffed it into (watch for
wrapping):

  # grab line from each file's array, $f
  for ($f=0;$f<=$fIndex;$f++) {
    $lines[$f] = [ split(" ", $fContent[$f][$i]) ];  # $lines[file][field]
    @tmp = sort {$a <=> $b} @{$lines[$f]}[15,18];  # get and sort times
    push(@lTs, [ (shift(@tmp),0) ]);  # push smaller time onto @lTs with
flag, $lTs[index][type: time-0, start/stop-1]
    push(@lTs, [ (shift(@tmp),1) ]); # push larger time onto @lTs
  }

  # CALCULATE WINDOWS

  @lTs = sort {$a->[0] <=> $b->[0]} @lTs;  # sort @lTs array on element 0
  if ($lTs[-1][0] == -1) { $lWin = -1; }   # check for no-data case
  else {
    $c = $lWin = $lastT = 0;  # initialize counter, window size, and last
time
    while ($#lTs > -1) {  # loop through all times
      ($lTs[0][1] == 0) && (++$c == 1) && ($lastT = $lTs[0][0]); # if a
start time, inc $c and if we're now 1, set $lastT
      ($lTs[0][1] == 1) && (--$c == 0) && ($lWin += $lTs[0][0] - $lastT); #
if a stop time, dec $c and if we're now 0, add time to $lWin
      shift @lTs;
    }
  }


Thanks again to everyone for the help on this...

- Bryan


__________________


From: "Bryan R Harris" <[EMAIL PROTECTED]>
> I'm trying to come up with an algorithm that seems like it ought to be
> really easy, but it's turning out to be pretty tough...
>
> Basically I have three pairs of start/stop times, e.g.:
>
>    3, 5
>    4, 10
>   15, 20
>
> I want the total time covered by all these ranges.  I can't just say
> (5-3 + 10-4 + 20-15) because the 3,5 and 4,10 ranges overlap.  The
> desired answer for this case is 13, 3 to 10 and 15 to 20.  I have the
> start times in two arrays, @starts and @stops.
>
> I have to do this probably a million+ times.  Any ideas on how I could
> go about this?

Strange. Am I the only one who thinks about bitmaps as soon as
someone mentions unions?

Let's look at it like this. Each range specifies a set of hours, if we
implement the set as a bit map we can implement the union by
simple bitwise OR and then get the number of elements in the total
union by counting the 1s.

(I am assuming here that you only have full hours in the ranges and
that there are 24 hours a day (well, that there are at most 32 hours
a day actualy) and that range 1-3 is 2 hours long.

     use strict;
     my @ranges = ([2,5],[4,7],[12,14],[15,15]);

     my $total = 0;
     foreach my $int (@ranges) {
          my $mask = 0xFFFFFFFF & (0xFFFFFFFF << ($int->[0]-
1));
          $mask &= 0xFFFFFFFF >> (33-$int->[1]);
          $total = $total | $mask;
          print "$int->[0] .. $int->[1] = ", unpack("%32b*",
pack('N',$mask)), "\n";
     }

     print "Total: ", unpack("%32b*", pack('N',$total)), "\n";

Jenda

=========== [EMAIL PROTECTED] == http://Jenda.Krynicky.cz ==========
There is a reason for living. There must be. I've seen it somewhere.
It's just that in the mess on my table ... and in my brain
I can't find it.
                         --- me

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to