Re: Merging overlapping spans/ranges

2005-05-11 Thread Jordan Rastrick
Should work fine as far as I can see. Of course, thats not very
'pythonic' - I should probably give at least 10 different unit tests
that it passes ;)

Its gratifying to know I'm not the only one who needed that final
yield.

Jim Sizelove wrote:
 Bengt Richter wrote:
  On Tue, 10 May 2005 15:14:47 +0200, Max M [EMAIL PROTECTED] wrote:
 
 
 I am writing a find-free-time function for a calendar. There are
a lot
 of time spans with start end times, some overlapping, some not.
 
 To find the free time spans, I first need to convert the events
into a
 list of non overlapping time spans meta-spans.
 
 This nice ascii graph should show what I mean.
 
 1) ---
 2) ---
 3)   ---
 4)  -
 5) -
 
 
 ---   # meta spans
 
 I can then iterate through the meta-spans and find non-busy times.
 
 I have written the class below, but it is rather O^2, so I wondered
if
 anybody has an idea for a better approach?
 
 
  Maybe (not tested beyond what you see ;-)

 It is with some trepidation that I write this message; after hanging
 around in c.l.p for the better part of a year, I have come to expect
 Bengt's messages to be right-on and error-free.

 However this time...  :)
def mergespans(spans):
   ...  start = end = None
   ...  for s,e in sorted(spans):
   ...  if start is None: start, end = s,e; continue
   ...  if s = end: end = e; continue
   ...  yield start, end
   ...  start,end = s,e
   ...  if start is not None: yield start, end
   ...
spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
list(mergespans(spans))
   [(0, 7), (9, 17)]

 There can be a problem in mergespans if one span fits inside another:

spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
list(mergespans(spans))
   [(0, 3), (4, 7), (9, 17)]

 Here is a revised version (not tested beyond what you see ;-)
def mergespans(spans):
   ... start = end = None
   ... for s,e in sorted(spans):
   ... if start is None:
   ... start, end = s,e
   ... continue
   ... if s = end:
   ... if end  e:
   ... end = e
   ... continue
   ... yield start, end
   ... start,end = s,e
   ... if start is not None:
   ... yield start, end
   ...
list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)]))
   [(0, 7), (9, 17)]
list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))
   [(0, 7), (9, 17)]

 Can anyone find any other errors in this? ;-)
 
 With humble regards,
 Jim Sizelove

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-11 Thread Max M
Jim Sizelove wrote:


Wow! c.l.py is allmost like an AI program generator. But I guess it 
helps to ask questions about problems that programmers find interresting :-)

The linear approach is pretty simple to code and understand. I am just 
afraid what happens when many users tries to book that 14 day conference 
in the calendar sometimes this year.

365 * 24 * 60 = 525,600 booleans in the array

Well ok a few MBytes ain't too bad on a modern machine. It would even be 
possible to cut the resolution down to 15 minutes. Yielding 35,040 
bools, but the solution is still O^2 with regards to size of search 
window and number of events.

Bengts/Jims and Jordans solutions seems to be relatively similar. I will 
use one of those instead of my own code.


Thanks!


-- 

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-11 Thread Bengt Richter
On Tue, 10 May 2005 23:53:43 GMT, Jim Sizelove [EMAIL PROTECTED] wrote:
Bengt Richter wrote:
[...]
 Maybe (not tested beyond what you see ;-)
I had that feeling ... somehow the word max was trying to get my attention,
but I posted without figuring out why ;-/

It is with some trepidation that I write this message; after hanging 
around in c.l.p for the better part of a year, I have come to expect 
Bengt's messages to be right-on and error-free.
Would that 'twere so ;-)


However this time...  :)
   def mergespans(spans):
  ...  start = end = None
  ...  for s,e in sorted(spans):
  ...  if start is None: start, end = s,e; continue
  ...  if s = end: end = e; continue
  ...  yield start, end
  ...  start,end = s,e
  ...  if start is not None: yield start, end
  ...
   spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
   list(mergespans(spans))
  [(0, 7), (9, 17)]

There can be a problem in mergespans if one span fits inside another:

Good call. I should have thought of it.

   spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
   list(mergespans(spans))
  [(0, 3), (4, 7), (9, 17)]

Here is a revised version (not tested beyond what you see ;-)
   def mergespans(spans):
  ... start = end = None
  ... for s,e in sorted(spans):
  ... if start is None:
  ... start, end = s,e
  ... continue
  ... if s = end:
  ... if end  e:
  ... end = e
  ... continue
  ... yield start, end
  ... start,end = s,e
  ... if start is not None:
  ... yield start, end
  ...
   list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)]))
  [(0, 7), (9, 17)]
   list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))
  [(0, 7), (9, 17)]

Can anyone find any other errors in this? ;-)

Not off hand. OTOH, I really didn't like the aesthetics of all that crufty 
logic in my original.
Maybe (again, barely tested ;-)

  def mergespans(spans):
 ... it = iter(sorted(spans))
 ... start, end = it.next()
 ... for s,e in it:
 ... if s = end:
 ... end = max(end, e) # was what max was trying to tell me, I 
think
 ... else:
 ... yield start, end
 ... start,end = s,e
 ... yield start, end
 ...
  list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)]))
 [(0, 7), (9, 17)]
  list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))
 [(0, 7), (9, 17)]


With humble regards,
Me too. UIAM we're all still human here (even my favorite 'bots, I think  ;-)

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list


Merging overlapping spans/ranges

2005-05-10 Thread Max M
I am writing a find-free-time function for a calendar. There are a lot 
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a 
list of non overlapping time spans meta-spans.

This nice ascii graph should show what I mean.

1) ---
2) ---
3)   ---
4)  -
5) -

  ---   # meta spans

I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if 
anybody has an idea for a better approach?


##
# -*- coding: latin-1 -*-


1) ---
2) ---
3)   ---
4)  -
5) -

  ---  


class MetaSpans:

 
 Populate with a list of span tuples [(start,end)], and it will make 
meta
 spans, with overlapping spans folded into one span.
 

 def __init__(self):
 self.spans = []

 def add(self, span):
 start, end = span
 overlapping = [span]
 non_overlapping = []
 for spn in self.spans:
 spn_start, spn_end = spn
 # span rules for iterated spans
 starts_before = spn_start = start
 ends_after = spn_end = end
 is_outside = starts_before and ends_after
 starts_inside = start = spn_start = end
 ends_inside =  start = spn_end = end
 overlaps = starts_inside or ends_inside or is_outside
 if overlaps:
 overlapping.append(spn)
 else:
 non_overlapping.append(spn)
 # overlapping spans are changed to one span
 starts = []
 ends = []
 for start, end in overlapping:
 starts.append(start)
 ends.append(end)
 min_start = min(starts)
 max_end = max(ends)
 non_overlapping.append( (min_start, max_end) )
 self.spans = non_overlapping


 def findFreeTime(self, duration):
 self.spans.sort()




if __name__ == '__main__':

 ms = MetaSpans()
 ms.add((0,3))
 ms.add((4,7))
 ms.add((2,5))
 ms.add((9,14))
 ms.add((12,17))
 print ms.spans


 from datetime import datetime
 ms = MetaSpans()
 ms.add((datetime(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3, 0, 0)))
 ms.add((datetime(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7, 0, 0)))
 ms.add((datetime(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5, 0, 0)))
 ms.add((datetime(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14, 0, 0)))
 ms.add((datetime(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17, 0, 
0)))
 print ms.spans





-- 

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-10 Thread bearophileHUGS
This is the problem of finding the connected components inside an
interval graph. You can implement the algorithms yourself, of you can
use my graph data structure here:

http://sourceforge.net/projects/pynetwork/

The graph methods:
createIntervalgGraph
And:
connectedComponents
can probably solve your problem quite fast, algorithmically, and with
few lines of code.
If you need more help, then ask for it and I'll give the little code
needed.

Bear hugs,
Bearophile

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-10 Thread Jordan Rastrick

Max M wrote:
 I am writing a find-free-time function for a calendar. There are a
lot
 of time spans with start end times, some overlapping, some not.

 To find the free time spans, I first need to convert the events into
a
 list of non overlapping time spans meta-spans.

 This nice ascii graph should show what I mean.

 1) ---
 2) ---
 3)   ---
 4)  -
 5) -

   ---   # meta spans

 I can then iterate through the meta-spans and find non-busy times.

 I have written the class below, but it is rather O^2, so I wondered
if
 anybody has an idea for a better approach?


 ##
 # -*- coding: latin-1 -*-

 
 1) ---
 2) ---
 3)   ---
 4)  -
 5) -

   ---  
 

 class MetaSpans:

  
  Populate with a list of span tuples [(start,end)], and it will
make
 meta
  spans, with overlapping spans folded into one span.
  

  def __init__(self):
  self.spans = []

  def add(self, span):
  start, end = span
  overlapping = [span]
  non_overlapping = []
  for spn in self.spans:
  spn_start, spn_end = spn
  # span rules for iterated spans
  starts_before = spn_start = start
  ends_after = spn_end = end
  is_outside = starts_before and ends_after
  starts_inside = start = spn_start = end
  ends_inside =  start = spn_end = end
  overlaps = starts_inside or ends_inside or is_outside
  if overlaps:
  overlapping.append(spn)
  else:
  non_overlapping.append(spn)
  # overlapping spans are changed to one span
  starts = []
  ends = []
  for start, end in overlapping:
  starts.append(start)
  ends.append(end)
  min_start = min(starts)
  max_end = max(ends)
  non_overlapping.append( (min_start, max_end) )
  self.spans = non_overlapping


  def findFreeTime(self, duration):
  self.spans.sort()




 if __name__ == '__main__':

  ms = MetaSpans()
  ms.add((0,3))
  ms.add((4,7))
  ms.add((2,5))
  ms.add((9,14))
  ms.add((12,17))
  print ms.spans


  from datetime import datetime
  ms = MetaSpans()
  ms.add((datetime(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3,
0, 0)))
  ms.add((datetime(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7,
0, 0)))
  ms.add((datetime(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5,
0, 0)))
  ms.add((datetime(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14,
0, 0)))
  ms.add((datetime(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17,
0,
 0)))
  print ms.spans




I think the following code does what you want. It should be O(n log n)
- at least I hope thats what Python takes to sort the list of spans :)
Of course I've assumed you have the spans available to you all at once
as a list, and dont need to add them one at a time as you did in your
original code.

def get_metaspans(spans):
Given a list of span tuples [(start,end)], will generate all
meta spans, with overlapping spans folded into one span.

spans.sort()
spans = iter(spans)
metaspan = spans.next()
for span in spans:
start, end = span
m_start, m_end = metaspan
if start  m_end:
yield metaspan
metaspan = span
elif end  m_end:
metaspan = (m_start, end)
# Need to yield the final metaspan once the span list is exhausted
yield metaspan

def get_breaks(metaspans):
Gets all the breaks in between a sequence of metaspans
metaspans = iter(metaspans)
_, prev_end = metaspans.next()
for metaspan in metaspans:
start, end = metaspan
yield (prev_end, start)
prev_end = end

I must admit I'm a bit of a generatoraholic, I'll tend to throw yields
at anything given half a chance :) Having to yield once more at the end
of the get_metaspans loop seems a little inelegant, maybe it could be
done a bit better. But its nice the way empty lists are handled
gracefully - the StopIteration thrown by the .next() calls are just
absorbed by the, uh, generatorness.

A little bit of testing:

 spans = [(12, 13), (0,3), (2,5), (1,4), (4,6), (1,2), (8,9), (9,
10)]
 print list(get_metaspans(spans))
[(0, 6), (8, 10), (12, 13)]
 print list(get_breaks(get_metaspans(spans)))
[(6, 8), (10, 12)]

Is that more or less what was required?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-10 Thread elbertlev
The linear method:

You create an array - one bool per minute. For one day 24 * 60 entries
is enough. Spans (Start, End) are in minutes from midnight. Set array
slots in range(Start, End) to True for each input span.

Scan the array and find metaspans - contiguous sequences of False.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-10 Thread Mike Rovner
Max M wrote:
 I am writing a find-free-time function for a calendar. There are a lot 
 of time spans with start end times, some overlapping, some not.
 
 To find the free time spans, I first need to convert the events into a 
 list of non overlapping time spans meta-spans.
 

Almost linear method (includes .sort() which is afaik N*logN):
[Set all priority to 1, or you will get changes in priorities as well.]

def merge(p):
  p is a seq of (start, priority, end)
 returns list of merged events:
   (start1, pri), (end1, 0), (start2, pri), (end2, 0), ...
 
 all=[] # list of x, h, up, id
 for id,(l,h,r) in enumerate(p):
 all.append((l,h,1,id))
 all.append((r,h,0,id))
 all.sort()

 sl = [] # skyline solution
 slx = slh = 0 # current values
 vis = {} # active {id:h}
 for x,h,up,id in all:
 if up:
 vis[id]=h
 if hslh:
 sl.append((x,h))
 slh=h
 else:
 del vis[id]
 assert h=slh
 if h==slh:
 v = vis.values()
 if v:
 h = max(v)
 else:
 h = 0
 sl.append((x,h))
 slh=h
 slx=x
 # merge same time events
 s=dict(sl)
 sl=s.keys()
 sl.sort()
 return [(k,s[k]) for k in sl]

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-10 Thread Steven Bethard
[EMAIL PROTECTED] wrote:
 The linear method:
 
 You create an array - one bool per minute. For one day 24 * 60 entries
 is enough. Spans (Start, End) are in minutes from midnight. Set array
 slots in range(Start, End) to True for each input span.
 
 Scan the array and find metaspans - contiguous sequences of False.

Yeah, this would basically have been my suggestion.  One implementation:

py def merge(spans, nbins):
... bins = [0]*nbins
... for start, end in spans:
... for bin in xrange(start, end):
... bins[bin] += 1
... def key((i, count)):
... return count != 0
... index_groups = itertools.groupby(enumerate(bins), key=key)
... for isnonzero, indices in index_groups:
... if isnonzero:
... indices = [i for (i, count) in indices]
... yield (indices[0], indices[-1])
...
py spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
py list(merge(spans, 20))
[(0, 6), (9, 16)]

Obviously, this needs some modification to handle datetime objects.

STeVe
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-10 Thread Bengt Richter
On Tue, 10 May 2005 15:14:47 +0200, Max M [EMAIL PROTECTED] wrote:

I am writing a find-free-time function for a calendar. There are a lot 
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a 
list of non overlapping time spans meta-spans.

This nice ascii graph should show what I mean.

1) ---
2) ---
3)   ---
4)  -
5) -

  ---   # meta spans

I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if 
anybody has an idea for a better approach?

Maybe (not tested beyond what you see ;-)

  def mergespans(spans):
 ...  start = end = None
 ...  for s,e in sorted(spans):
 ...  if start is None: start, end = s,e; continue
 ...  if s = end: end = e; continue
 ...  yield start, end
 ...  start,end = s,e
 ...  if start is not None: yield start, end
 ...
  spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
  list(mergespans(spans))
 [(0, 7), (9, 17)]


Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-10 Thread Bengt Richter
On Tue, 10 May 2005 15:14:47 +0200, Max M [EMAIL PROTECTED] wrote:

I am writing a find-free-time function for a calendar. There are a lot 
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a 
list of non overlapping time spans meta-spans.

This nice ascii graph should show what I mean.

1) ---
2) ---
3)   ---
4)  -
5) -

  ---   # meta spans

I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if 
anybody has an idea for a better approach?



  spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
 
Non-recommended one-liner:

  reduce(lambda L,se: (L and se[0]=L[-1][1] and (L.append((L.pop()[0], 
  se[1])) or L)) or L.append(se) or L ,sorted(spans), [])
 [(0, 7), (9, 17)]

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Merging overlapping spans/ranges

2005-05-10 Thread Jim Sizelove
Bengt Richter wrote:
 On Tue, 10 May 2005 15:14:47 +0200, Max M [EMAIL PROTECTED] wrote:
 
 
I am writing a find-free-time function for a calendar. There are a lot 
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a 
list of non overlapping time spans meta-spans.

This nice ascii graph should show what I mean.

1) ---
2) ---
3)   ---
4)  -
5) -


---   # meta spans

I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if 
anybody has an idea for a better approach?

 
 Maybe (not tested beyond what you see ;-)

It is with some trepidation that I write this message; after hanging 
around in c.l.p for the better part of a year, I have come to expect 
Bengt's messages to be right-on and error-free.

However this time...  :)
   def mergespans(spans):
  ...  start = end = None
  ...  for s,e in sorted(spans):
  ...  if start is None: start, end = s,e; continue
  ...  if s = end: end = e; continue
  ...  yield start, end
  ...  start,end = s,e
  ...  if start is not None: yield start, end
  ...
   spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
   list(mergespans(spans))
  [(0, 7), (9, 17)]

There can be a problem in mergespans if one span fits inside another:

   spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
   list(mergespans(spans))
  [(0, 3), (4, 7), (9, 17)]

Here is a revised version (not tested beyond what you see ;-)
   def mergespans(spans):
  ... start = end = None
  ... for s,e in sorted(spans):
  ... if start is None:
  ... start, end = s,e
  ... continue
  ... if s = end:
  ... if end  e:
  ... end = e
  ... continue
  ... yield start, end
  ... start,end = s,e
  ... if start is not None:
  ... yield start, end
  ...
   list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)]))
  [(0, 7), (9, 17)]
   list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))
  [(0, 7), (9, 17)]

Can anyone find any other errors in this? ;-)

With humble regards,
Jim Sizelove
-- 
http://mail.python.org/mailman/listinfo/python-list