Not to pick at you, but an hour and a half sounds a bit long for 171k entries. If you did not write the code for corporate interests, would you mind sharing it on something like GitHub or Bitbucket?
I mean, I know nothing of a) the problem you are trying to solve or b) the constraints you are facing in programming a solution, so I will obstinately hold to my view that there must be a problem with either the algorithm you employ or the way you've coded it, even in the face of precise and clear evidence.

--Rory

On Fri, Jan 9, 2009 at 2:01 PM, Daniel Kersten wrote:
>
> Hi all,
>
> I wrote a Python program to generate a report from log files. The log
> files are generated by a test-suite used to test a Java program. The
> report gives me details about how many messages (it's a message-based
> program being tested) were processed, how long each operation (which
> may consist of processing one or more messages) took, counts of which
> operations passed or didn't pass, messages processed per second etc.
> The idea is that the main program can be run over a weekend or week or
> whatever and the log files from the test suite are checked by my
> Python program.
>
> The log files can be huge.
>
> Yesterday, I ran my program on a log file with 171K entries - it took
> an hour and a half! (This is why I'm interested in the speedup patch)
> There are some algorithmic changes which would be beneficial, but that
> would require significant code restructuring which, right now, I don't
> have time for. So I'm looking for simpler ways.
>
> I decided to give Cython (cython.org) a shot, since it compiles Python
> code to C. It supports almost all of Python's constructs, the only
> major limitation (IMHO - that is, the only feature I really use which
> Cython does not support) being nested functions and lambdas. Removing
> them from my code slowed it down a small bit, due to one of my
> functions accessing a variable from the outer scope, so I couldn't
> simply move it into the global scope - and I couldn't pass it as an
> argument because I was storing the function as a callback.
> Besides that, I made NO other changes to my Python code.
>
> The code that took 1 hour and 32 minutes to execute with the pure
> Python version completed in 48 minutes!!
>
> This can be improved more still, by strategically declaring functions
> and variables as C types.
>
> Just thought I'd share, in case someone else needs more performance
> out of their Python and doesn't know where to turn.
>
> --
> Daniel Kersten.
> Leveraging dynamic paradigms since the synergies of 1985.
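
For anyone hitting the same nested-function limitation mentioned in the quoted message, here is a minimal sketch of one possible workaround, assuming a module-level callable class is acceptable wherever the callback is stored. The names `SlowOperationCounter`, `threshold_ms` and `process` are hypothetical and not taken from Daniel's program; the idea is only that the object carries the state a closure would otherwise capture from the outer scope.

```python
class SlowOperationCounter:
    """Hypothetical callback object: holds the state that a nested
    function would otherwise read from its enclosing scope."""

    def __init__(self, threshold_ms):
        self.threshold_ms = threshold_ms  # formerly an outer-scope variable
        self.slow = 0

    def __call__(self, elapsed_ms):
        # Called like a plain function, so it can be stored as a callback.
        if elapsed_ms > self.threshold_ms:
            self.slow += 1
        return self.slow


def process(durations, callback):
    """Hypothetical driver: invokes the stored callback once per entry."""
    for elapsed_ms in durations:
        callback(elapsed_ms)


counter = SlowOperationCounter(threshold_ms=100)
process([20, 150, 99, 300], counter)
slow_count = counter.slow
```

Because the instance is created once and passed around like any other function, no nested `def` or `lambda` is needed. Whether this is faster or slower than the original closure would have to be measured.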
