Hello, I recently converted one of my perl scripts to python. What the script does is simply search a lot of big mail files (~40MB) to retrieve specific emails. I simply converted the script line by line to python, keeping the algorithms & functions as they were in perl (no optimization). The purpose was mainly to learn python and see the differences with perl.
Now, once the converted script was finished, I was amazed to find that the python version runs about 8 times faster. Needless to say, I was very intrigued and wanted to know what causes such a performance gap between the two versions.

To keep my story short: after some research and a few tests, I found that file IO is the main cause of the difference. I wrote two short test scripts, one in perl and one in python (see below), and timed them. As the results show, the bigger the file, the larger the gap: about 3.6x for a 10MB file and about 7.9x for a 40MB file (wall-clock time).

I'm fairly new to python and don't know much about its inner workings, so I wonder if someone could explain why it is so much faster in python to open a file and load it into a list/array?

Thanks

-----
#!/usr/bin/python
# Read the whole file into a list of lines, 20 times.
for i in range(20):
    Data = open('data.test').readlines()
-----
#!/usr/bin/perl
# Read the whole file into an array of lines, 20 times.
for ($i = 0; $i < 20; $i++) {
    open(DATA, "data.test");
    @Data = <DATA>;
    close(DATA);
}
-----
Running tests (data.test = 10MB text file):

[EMAIL PROTECTED] blop $ time ./ftest.py
real 0m6.408s
user 0m4.552s
sys  0m1.826s

[EMAIL PROTECTED] blop $ time ./ftest.pl
real 0m22.855s
user 0m21.946s
sys  0m0.822s
-----
Running tests (data.test = 40MB text file):

[EMAIL PROTECTED] blop $ time ./ftest.py
real 0m26.235s
user 0m18.238s
sys  0m7.872s

[EMAIL PROTECTED] blop $ time ./ftest.pl
real 3m26.741s
user 3m22.168s
sys  0m3.764s
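
In case it helps anyone reproduce this: the `time` numbers above include interpreter startup, so it may be worth timing the read loop from inside the script as well. Below is a minimal sketch of how that could look on the python side. The function name `time_readlines` and its `repeats` parameter are made up for illustration and are not part of my original test, and `time.perf_counter` assumes Python 3.

```python
import os
import time


def time_readlines(path, repeats=20):
    """Read `path` into a list of lines `repeats` times.

    Returns (elapsed_seconds, line_count_of_last_read).
    Hypothetical helper, not part of the original benchmark.
    """
    line_count = 0
    start = time.perf_counter()
    for _ in range(repeats):
        # Same operation as the benchmark: load the whole file into a list.
        with open(path) as f:
            data = f.readlines()
        line_count = len(data)
    elapsed = time.perf_counter() - start
    return elapsed, line_count


if __name__ == "__main__":
    # Only runs if the test file from the post is present.
    if os.path.exists("data.test"):
        elapsed, n = time_readlines("data.test")
        print(f"{n} lines per read, 20 reads in {elapsed:.3f}s")
```

Comparing the elapsed time reported here with the `real` figure from `time` should show how much of the total is spent in the read loop itself rather than in startup.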