On Aug 27, 3:00 pm, Gerard flanagan <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > I have a list that starts with zeros, has sporadic data, and then has
> > good data. I define the point at which the data turns good to be the
> > first index with a non-zero entry that is followed by at least 4
> > consecutive non-zero data items (i.e. a week's worth of non-zero
> > data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
> > 9], I would define the point at which data turns good to be 4 (1
> > followed by 2, 3, 4, 5).
> >
> > I have a simple algorithm to identify this changepoint, but it looks
> > crude: is there a cleaner, more elegant way to do this?
> >
> > flag = True
> > i=-1
> > j=0
> > while flag and i < len(retHist)-1:
> >     i += 1
> >     if retHist[i] == 0:
> >         j = 0
> >     else:
> >         j += 1
> >         if j == 5:
> >             flag = False
> >
> > del retHist[:i-4]
> >
> > Thanks in advance for your help
> >
> > Thomas Philips
>
> data = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>
> def itergood(indata):
>     indata = iter(indata)
>     buf = []
>     while len(buf) < 4:
>         buf.append(indata.next())
>         if buf[-1] == 0:
>             buf[:] = []
>     for x in buf:
>         yield x
>     for x in indata:
>         yield x
>
> for d in itergood(data):
>     print d
This seems the most efficient so far for arbitrary iterables. With a
few micro-optimizations it becomes:

from itertools import chain

def itergood(indata, good_ones=4):
    indata = iter(indata); get_next = indata.next
    buf = []; append = buf.append
    while len(buf) < good_ones:
        next = get_next()
        if next: append(next)
        else: del buf[:]
    return chain(buf, indata)

$ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
100 loops, best of 3: 3.09 msec per loop

And with Psyco enabled:

$ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
1000 loops, best of 3: 466 usec per loop

George
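
P.S. For anyone reading this on Python 3, where iterators no longer have a .next method, here is a minimal sketch of the same buffer-and-chain idea. Two things in it are my own assumptions rather than part of the code above: an input that runs out before a long enough run is found yields an empty iterator instead of letting StopIteration escape, and good_ones is expected to be at least 1. Passing good_ones=5 would match the original definition (a non-zero entry followed by 4 more non-zero entries).

```python
from itertools import chain

def itergood(indata, good_ones=4):
    """Return an iterator over indata starting at the first run of
    good_ones consecutive non-zero (truthy) items."""
    it = iter(indata)
    buf = []
    for item in it:
        if item:
            buf.append(item)
            if len(buf) >= good_ones:
                # Run is long enough: emit the buffered run, then the rest.
                return chain(buf, it)
        else:
            # A zero breaks the run, so start over.
            del buf[:]
    # Assumption: input exhausted without a long enough run -> nothing is good.
    return iter(())

data = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
good = list(itergood(data, good_ones=5))
```

Because only the current run is buffered, this should work on generators and other one-shot iterables as well as on lists.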