Re: Optimization during extraction

m.nooning Wed, 24 Jan 2007 05:56:44 -0800

Steffen Mueller wrote:

Hi list,
you are probably all aware of the long time it takes to extract all data from a pp'd binary or a .par file for large applications.
For a sample .par which contains a couple of modules for testing, I did a profiling run and it turns out that a lot (20%) of the time is spent in an accessor in Archive::Zip, namely Archive::Zip::Member::fileName. (There are over 100k calls to that in my example package. Real world use might end up five times that number.)
Now, replacing the use of that accessor with a bare hash access in a single place in PAR.pm results in a reduction to about three thousand calls to that accessor. The extraction process runs, on my rather fast machine, about 0.3 seconds faster (of 1.2 seconds script run-time which includes loading all those modules). I'd expect the extraction to make up about 0.8-0.9 seconds of the total run-time. That's definitely a noticeable speed-up. A simple-minded micro-benchmark shows that direct hash access is about three times faster than calling the method.
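For readers who want to try this kind of comparison themselves, here is a minimal sketch of such a micro-benchmark. It is not the benchmark referred to above. Demo::Member is a hypothetical stand-in class, not Archive::Zip's real member class, and names_via_accessor and names_via_hash are made-up helper names. The ratio you see will depend on your perl and your machine.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Hypothetical stand-in for a zip member; NOT Archive::Zip's real class.
package Demo::Member;
sub new      { my ($class, $name) = @_; return bless { fileName => $name }, $class }
sub fileName { return $_[0]->{fileName} }

package main;

# 1000 fake members with made-up file names.
my @members = map { Demo::Member->new("lib/Module$_.pm") } 1 .. 1000;

# Build a name => member map through the accessor (respects encapsulation).
sub names_via_accessor { my %n = map { ($_->fileName(), $_) } @_; return \%n }

# Build the same map via direct hash access (breaks encapsulation).
sub names_via_hash { my %n = map { ($_->{fileName}, $_) } @_; return \%n }

# Run each variant for at least one CPU second, then print a comparison table.
my $results = timethese(-1, {
    accessor => sub { names_via_accessor(@members) },
    hash     => sub { names_via_hash(@members) },
}, 'none');
cmpthese($results);
```

Both helpers build the same map; only the way the file name is read differs, so the table printed by cmpthese isolates the cost of the method call.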
I'm aware that this is breaking encapsulation. This is bad. But it's also a *huge* gain! Would you consider it feasible to do such a hack if it helps this much?
The code I'm referring to is in PAR.pm's _first_member function: my %names = map { ...->fileName... } $zip->members;
An alternative would be to convince the A::Zip maintainer to provide a $zip->member_names() function which breaks encapsulation of A::Zip::Member from within the Archive::Zip distribution. The problems are that a) A::Zip is currently very strict about this and I don't want to be the one to change that and b) the author has been away for some time and A::Zip is currently community maintained.
What do you think?

Steffen

This might sound like passing the buck, but processors are getting so fast now, and are due to get faster. The extra speed would help with old machines, but I am not sure a saved half second or so would be much appreciated during a pp'd application's initial startup.

I suppose it could be argued that if there were a pp'd application that an end user had to execute and exit, repeatedly, then there would be a real difference. However, it is a contrived example. If that were the need, I imagine the original programmer would simply design the program to stay running, and have a "run again" button, or some similar way to keep the application alive.

The real goal of any software is to solve a problem for the end user. In the long run, maintainability, and especially ease of adding onto existing code (extensibility), will do more for the end user. It also lets programmers whose skills only go so far (like me) do what we can do, such as debugging other people's (modular) code.


Just an opinion.
