On Fri, 14 Jan 2005, kumar s wrote:
> >>>for i in range(len(what)): > ele = split(what[i],'\t') > cor1 = ele[0] > for k in range(len(my_report)): > cols = split(my_report[k],'\t') > cor = cols[0] > if cor1 == cor: > print cor+'\t'+ele[1]+'\t'+cols[1]+'\t'+cols[2] Hi Kumar, Ok, this calls for the use of an "associative map" or "dictionary". The main time sink is the loop here: > for k in range(len(my_report)): > cols = split(my_report[k],'\t') > cor = cols[0] > if cor1 == cor: > print cor+'\t'+ele[1]+'\t'+cols[1]+'\t'+cols[2] Conceptually, my_report can be considered a list of key/value pairs. For each element in 'my_report', the "key" is the first column (cols[0]), and the "value" is the rest of the columns (cols[1:]). The loop above can, in a pessimistic world, require a search across the whole of 'my_report'. This can take time that is proportional to the length of 'my_report'. You mentioned earlier that each list might be of length 249502, so we're looking into a process whose overall cost is gigantic. [Notes on calculating runtime cost: when the structure of the code looks like: for element1 in list1: for element2 in list2: some_operation_that_costs_K_time() then the overall cost of running this loop will be K * len(list1) * len(list2) ] We can do much better than this if we use a "dictionary" data structure. A "dictionary" can reduce the time it takes to do a lookup search down from a linear-time operation to an atomic-time one. Do you know about dictionaries yet? You can take a look at: http://www.ibiblio.org/obp/thinkCSpy/chap10.htm which will give an overview of a dictionary. It doesn't explain why dictionary lookup is fast, but we can talk about that later if you want. Please feel free to ask any questions about dictionaries and their use. Learning how to use a dictionary data structure is a skill that pays back extraordinarily well. Good luck! _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor