--- "Jackson, Harry" <[EMAIL PROTECTED]> wrote: > 
> I have been looking back at this problem and here is what I have found.
> 
> Let's take the following set of times:
>        A  , B
> 1     (4  , 5)
> 2     (9  , 10)
> 3     (11 , 12)
> 4     (12 , 14)
> 5     (12 , 18)
> 6     (14 , 15)
> 
> If we sort by column A, the set inherits the following property, or Rule 1:
> 
>     a1 <= a2 is always true
> 
> This means that we are only interested in b1's relation to a2 and b2.
> If (b1 >= b2) then, because of Rule 1, we can ignore the values in the
> second set and remove them.

This seems wrong: why (b1 >= b2)?  That's just an assertion that we aren't
looking backwards in time.  (Let's assume the input has been checked for this;
it makes the calculations easier, though I could solve it with a double pass.)

If I assume you meant (a1 >= b2), then it is not valid:

>        A  , B
> 1     (4  , 5)
> 2     (9  , 10)

Which is already sorted on A.  Now we can see that a1 >= b2 should never be
true.  So, let's assume you meant (b1 >= a2):

>        A  , B
> 1     (4  , 5)
> 2     (9  , 10)

Now, if 5 >= 9 were true, we could merge these into the single new set
(4, 10).  In this case it isn't, so the two sets stay separate.
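The overlap test is easier to see in code.  A minimal Python sketch (the
function name and tuple shapes are mine, not from Harry's code):

```python
def try_merge(first, second):
    """Merge two (start, stop) sets sorted on start, if they overlap.

    Assumes first[0] <= second[0] (the list is sorted on column A).
    Returns the merged set, or None if the two are disjoint.
    """
    a1, b1 = first
    a2, b2 = second
    if b1 >= a2:  # first stops at or after second starts: overlap
        # max() also covers the nested case (b1 >= b2) from Rule 1
        return (a1, max(b1, b2))
    return None   # (b1 < a2): disjoint, no merge

print(try_merge((4, 5), (9, 10)))   # None: 5 < 9, no overlap
print(try_merge((4, 10), (9, 12)))  # (4, 12): overlap, stops combined
```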

> If this is not the case, then if (b1 < a2) we can safely deduce from Rule 1
> that it has no overlap with any other set, and calculate the time and store it.

I think this might be true, providing you have collapsed the above.  It
warrants someone better than I am at algorithms to look at it.

> If however (b1 < a2) is false then because of Rule 1 we can safely replace
> b1 with b2 and remove the second set.

I'm lost.  :P
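For what it's worth, here is my best guess at the whole procedure as runnable
Python (the names are mine, not Harry's): sort on column A, grow the current
set while (b1 < a2) is false, otherwise close it off and add its duration to
the total.

```python
def total_time(intervals):
    """Sum the time covered by a list of (start, stop) sets,
    counting overlapping periods only once.

    A sketch of the merge-while-scanning idea from the thread; the
    caller's list is left untouched because sorted() makes a copy.
    """
    total = 0
    merged = None  # the set currently being grown
    for a2, b2 in sorted(intervals):
        if merged is None:
            merged = [a2, b2]
        elif merged[1] >= a2:             # overlap: "replace b1 with b2"
            merged[1] = max(merged[1], b2)
        else:                             # (b1 < a2): no later set can overlap
            total += merged[1] - merged[0]
            merged = [a2, b2]
    if merged is not None:
        total += merged[1] - merged[0]
    return total

sample = [(4, 5), (9, 10), (11, 12), (12, 14), (12, 18), (14, 15)]
print(total_time(sample))  # 9: (4,5) + (9,10) + (11,18) = 1 + 1 + 7
```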

> The code I have used to get the results is at the end. I have no idea if it
> is a fast solution. I am not sure if it is the best way to code it either
> and I am sure some people can optimise it for best performance. I wanted to
> use a for loop but went for the easier while loop instead. 

If you only have to sort once, and you can step through in a linear fashion,
then it will be pretty fast.  It is not dissimilar to what I did (and nobody
has complained yet); it does, however, have an unfortunate side effect:

  the original sample of data gets destroyed

but that is easily fixable.  The time complexity of our solutions is going
to be the same [I think]; however, it could be made even more efficient if
we preprocess the input data, which is not unlikely.  Imagine a calendar
with associated timeslices, where you can do some preprocessing and even
store the results to disk.

I was going to try using an AVL tree, but someone hasn't written Tree::AVL
yet!  I don't have time just now, so it's going to have to be on the back
burner for the moment.

> Some other considerations that may apply if this is going to
> become a module.

Some of which I've covered.

> For thousands of records: if we find the greatest stop time, then if the set
> we are scanning ever has this value as its stop time, we can calculate the
> total and exit without iterating over any more sets, providing the data has
> been sorted on column A.

Huh?
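On reflection, here is one reading of the quoted optimisation (my guess at the
intent, not a claim from the thread): once the data is sorted on A, a merged
set whose stop time equals the overall maximum stop time contains every later
set, since later sets start no earlier and can stop no later.  A Python sketch:

```python
def total_time_early_exit(intervals):
    """Interval-sum with the early exit I think Harry describes: once the
    growing set's stop time reaches the overall maximum stop time, every
    later set (sorted on start) is contained in it, so scanning can stop.
    """
    if not intervals:
        return 0
    max_stop = max(b for _, b in intervals)
    total = 0
    merged = None
    for a, b in sorted(intervals):
        if merged is None:
            merged = [a, b]
        elif merged[1] >= a:
            merged[1] = max(merged[1], b)
        else:
            total += merged[1] - merged[0]
            merged = [a, b]
        if merged[1] == max_stop:  # nothing later can extend the coverage
            break
    return total + merged[1] - merged[0]

sample = [(4, 5), (9, 10), (11, 12), (12, 14), (12, 18), (14, 15)]
print(total_time_early_exit(sample))  # 9: same answer, (14,15) never visited
```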

> When we have sets with matching start times we are only interested in the
> set with the largest difference and vice versa for the stop times.

Time complexity to detect that case is poor, is it not?
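Then again, once the data is sorted, detection can be done in a single linear
pass: sort on (start, -stop) so the widest set of each start-time group comes
first.  A Python sketch (my code, handling only the matching-start case):

```python
def widest_per_start(intervals):
    """Keep only the set with the largest stop for each start time.

    Sorting on (start, -stop) puts the widest set first within each
    start-time group, so one linear pass suffices: O(n log n) for the
    sort, O(n) after it.
    """
    kept = []
    for a, b in sorted(intervals, key=lambda iv: (iv[0], -iv[1])):
        if not kept or kept[-1][0] != a:  # first (widest) of its start group
            kept.append((a, b))
    return kept

print(widest_per_start([(12, 14), (12, 18), (4, 5)]))  # [(4, 5), (12, 18)]
```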

Jonathan Paton

=====
s''-//--/\\///|-\/\|--\--/-\-/\-//\-|/\\\|/\///|-\--\\\\',
s''/-\\\/|///|-|/|/--\--/--//\|\/\||/|/-/\\\-/\///|-\-\-',
y'|\/-'3210',$_=join qq\\,map{s|2|10|||s|3|11|||s|^|0|;$_}
m|.|g;map{print chr unpack'N',pack'B32','0'x24 .$_}/.{8}/g
