Hi, I have an Excel file (8000 rows) that is read into Python:
    from csv import reader, writer

    incsv = reader(open(MY_FILE), dialect='excel')
    keys = incsv.next()

There are mixed datatypes. The last column contains a cumulative
frequency, running in order from 0.0000 to 1.0000 over the 8000 rows.
For a loop of 100,000 iterations I want to take a new random number
each time and find the row with the next biggest cumulative frequency
value. Here's my current (pseudo)code:

    for _ in range(100000):
        myRand = random.random()
        for line in incsv:
            if float(line[-1]) > myRand:
                resline = []
                for item in line:
                    try:
                        i = int(item)
                    except ValueError:
                        try:
                            i = float(item)
                        except ValueError:
                            i = item
                    resline.append(i)
                # Here we construct a dict of pair values:
                # {'ID': 18, ...}
                res = dict(zip(keys, resline))
                break
            else:
                continue
        # do some stuff with res

I'm scanning over each line of the csv and deciding which row to
select, 100k times:

    for line in incsv:
        if float(line[-1]) > random.random():

This is just not very efficient. How can I improve this code? I can
use numpy etc., whatever.

--
http://mail.python.org/mailman/listinfo/python-list