Thanks for the help. This is considerably faster and easier to read (see below). I changed it to avoid the "break", which I think makes it easier to understand. Checking the conditions on every row slows it down, but that is worth it to me at this time.

Thanks again,
Vincent
import csv

def read_data_file(filename):
    reader = csv.reader(open(filename, "U"), delimiter='\t')

    data = []
    mask = []
    outliers = []
    modified = []

    # Bind the append methods once to avoid attribute lookups inside the loop.
    data_append = data.append
    mask_append = mask.append
    outliers_append = outliers.append
    modified_append = modified.append

    # Each counter becomes non-zero once its section marker has been seen.
    maskcount = 0
    outliercount = 0
    modifiedcount = 0

    for row in reader:
        if '[MASKS]' in row:
            maskcount += 1
        if '[OUTLIERS]' in row:
            outliercount += 1
        if '[MODIFIED]' in row:
            modifiedcount += 1

        # Empty rows are skipped; every other row goes to the current section.
        if not any((maskcount, outliercount, modifiedcount, not row)):
            data_append(row)
        elif not any((outliercount, modifiedcount, not row)):
            mask_append(row)
        elif not any((modifiedcount, not row)):
            outliers_append(row)
        else:
            if row:
                modified_append(row)

    # Drop the leading non-data rows from each section.
    data = data[1:]
    mask = mask[3:]
    outliers = outliers[3:]
    modified = modified[3:]
    return [data, mask, outliers, modified]

*Vincent Davis
720-301-3003 *
vinc...@vincentdavis.net

my blog <http://vincentdavis.net> | LinkedIn <http://www.linkedin.com/in/vincentdavis>

On Fri, Feb 19, 2010 at 4:36 PM, Jonathan Gardner <jgard...@jonathangardner.net> wrote:

> On Fri, Feb 19, 2010 at 1:58 PM, Vincent Davis <vinc...@vincentdavis.net> wrote:
>
>> In reference to the several comments about "[x for x in read] is basically
>> a copy of the entire list. This isn't necessary." or list(read). I had
>> thought I had a problem with having iterators in the takewhile() statement.
>> I thought I tested and it didn't work. It seems I was wrong. It clearly
>> works. I'll make this change and see if it is any better.
>>
>> I actually don't plan to read them all in at once, only as needed, but I
>> do need the whole file in an array to perform some mathematics on them and
>> compare different files. So my interest was in making it faster to open them
>> as needed. I guess part of it is that they are about 5mb so I guess it might
>> be disk speed in part.nks
>>
>
> Record your numbers in an array and then work your magic on them later.
> Don't store the entire file in memory, though.
>
> --
> Jonathan Gardner
> jgard...@jonathangardner.net
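
For what it's worth, the takewhile() idea from the quoted message can be sketched directly on the csv reader, without copying it into a list first. This is only a rough sketch under a few assumptions: the names split_sections and MARKERS are made up for illustration, the three markers are assumed to always appear in this order, and the header rows that read_data_file() slices off are left in place here.

```python
import csv
from itertools import takewhile

# Hypothetical constant: the section markers, in the order they are assumed to appear.
MARKERS = ("[MASKS]", "[OUTLIERS]", "[MODIFIED]")

def split_sections(lines):
    """Split tab-delimited lines into four lists of rows using takewhile().

    `lines` is any iterable of strings, for example an open file object.
    """
    reader = csv.reader(lines, delimiter="\t")
    sections = []
    for marker in MARKERS:
        # takewhile() pulls rows straight from the reader iterator, so no
        # list copy of the whole file is made before splitting.
        rows = takewhile(lambda row, m=marker: m not in row, reader)
        # Consume the rows right away and skip empty ones.
        sections.append([row for row in rows if row])
    # Whatever remains after the last marker belongs to the final section.
    sections.append([row for row in reader if row])
    return sections
```

Note that takewhile() consumes the marker row when it stops, so the marker rows never land in any list. The slice offsets used in read_data_file() (mask[3:] and so on) would therefore need adjusting if this approach were used.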
--
http://mail.python.org/mailman/listinfo/python-list