Hi Yogi

On Fri, Apr 4, 2008 at 10:05 PM, yogi <[EMAIL PROTECTED]> wrote:
> Hi ,
> Here is my first usable Python code.
> The code works.

Woohoo! Congratulations.

> Here is what I'm trying to do.
> I have two huge text files. After some processing, One is 12M (file A) and
> the other 1M (file B) .
> The files have columns which are of interest to me.
...
> Question1 : Is there a better way ?

I admit that I didn't spend too much time trying to understand your code. But at first glance your logic looks like it could be easily represented in SQL. I bet a relational database could do your lookup faster than doing it in pure Python. I do this kind of thing frequently: use Python to import delimited data into a relational database like PostgreSQL, add indexes where they make sense, then query the database for the results. It can all be done from inside Python but it doesn't have to be.

    SELECT a.*
    FROM a INNER JOIN b ON a.field0=b.field0
    WHERE b.field3=0
    AND a.field3 >= (b.field1-1000000)
    AND a.field3 <= (b.field2+1000001)
    ... etc.

> Question2 : For now I'm using shells time call for calculating time
> required. Does Python provide a more fine grained check.

I think so but I've not used it: timeit. Search this mailing list's archives for 'timeit' and/or try this at the Python command line:

    import timeit
    help(timeit)

> Question 2: If I have convert this code into a function.
> Should I ?

Yes, especially if it helps make your code easier to read and understand.
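
To make the database suggestion a little more concrete, here is a rough sketch of the import, index, and query steps. It uses Python's built-in sqlite3 module instead of PostgreSQL, purely so the example runs without a server. The table names a and b and the columns field0 to field3 come from the query in this message; the column types, the sample rows, and the function name find_matches are assumptions made up for illustration, not your real data.

```python
import sqlite3

def find_matches(rows_a, rows_b):
    """Load both row sets into an in-memory database and run the join.

    Hypothetical helper: each row is assumed to be a 4-tuple of
    (field0, field1, field2, field3), with field0 as the join key.
    """
    conn = sqlite3.connect(":memory:")
    for name in ("a", "b"):
        conn.execute(
            "CREATE TABLE %s (field0 TEXT, field1 INTEGER, "
            "field2 INTEGER, field3 INTEGER)" % name
        )
    conn.executemany("INSERT INTO a VALUES (?, ?, ?, ?)", rows_a)
    conn.executemany("INSERT INTO b VALUES (?, ?, ?, ?)", rows_b)

    # Indexes on the columns used by the join and the range test.
    conn.execute("CREATE INDEX idx_a ON a (field0, field3)")
    conn.execute("CREATE INDEX idx_b ON b (field0)")

    # Same conditions as the query in the message, plus an ORDER BY
    # so the output order is predictable.
    query = """
        SELECT a.*
        FROM a INNER JOIN b ON a.field0 = b.field0
        WHERE b.field3 = 0
          AND a.field3 >= (b.field1 - 1000000)
          AND a.field3 <= (b.field2 + 1000001)
        ORDER BY a.field0, a.field3
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Made-up sample rows, only to show the shape of the data.
sample_a = [("k1", 0, 0, 5000000), ("k1", 0, 0, 9000000), ("k2", 0, 0, 5000000)]
sample_b = [("k1", 4500000, 5500000, 0), ("k2", 100, 200, 1)]
print(find_matches(sample_a, sample_b))
```

With real files you would read the delimited lines (for example with the csv module) and pass the parsed rows to executemany instead of the sample lists. Note that a row of a is returned once per matching row of b, so duplicates are possible if ranges in b overlap.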
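
On the timing question, here is a small sketch of how timeit can be called from inside a script. The function lookup is a hypothetical stand-in for whatever code you want to measure; only timeit.repeat is the real library call.

```python
import timeit

def lookup():
    """Hypothetical stand-in for the work being timed."""
    table = {n: n * n for n in range(1000)}
    return sum(table[n] for n in range(1000))

CALLS_PER_TRIAL = 100

# repeat() runs the callable CALLS_PER_TRIAL times per trial, for 5 trials,
# and returns the total elapsed seconds of each trial as a list.
timings = timeit.repeat(lookup, number=CALLS_PER_TRIAL, repeat=5)

# The fastest trial is usually the least disturbed by other processes.
best = min(timings) / CALLS_PER_TRIAL
print("best time per call: %.6f seconds" % best)
```

The same module can also be run from the shell with python -m timeit followed by a quoted statement, which may be a closer replacement for the shell's time command.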
