Might help if I had attached the patch..
Alec Joseph Warner wrote:
Author : TGL
Purpose :
While profiling a script that reads CONTENTS files using
'dblink.getcontents()', I've seen that it was spending quite some time
in ~400k calls to 'os.path.normpath()' (one per referenced element). I
think there is nothing to normalize here (paths in CONTENTS files are
already normal by construction) and thus these calls could be avoided.
The one-line patch I will attach does this. Its effect is easy to
check, for instance with equery:
Before:
% time equery belongs /bin/bash > /dev/null
real 0m13.583s
user 0m13.215s
sys 0m0.249s
After:
% time equery belongs /bin/bash > /dev/null
real 0m6.526s
user 0m6.218s
sys 0m0.246s
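For a rough feel of the per-call cost outside of equery, something like the following could be used. This is only a sketch: the path list is synthetic, the helper names (with_normpath, without_normpath, timed) are made up for illustration, and it uses posixpath directly so it behaves the same on any platform. Absolute numbers will differ a lot between machines and Python versions.

```python
import posixpath
import time

N = 400_000  # roughly the call count mentioned above

# Synthetic, already-normal absolute paths (an assumption; not real CONTENTS data).
paths = ["/usr/share/doc/pkg-%d/file-%d" % (i % 1000, i) for i in range(N)]

def with_normpath(root, entries):
    # Old behaviour: normalize every joined path.
    return [posixpath.normpath(root + e[1:]) for e in entries]

def without_normpath(root, entries):
    # Patched behaviour: plain string concatenation.
    return [root + e[1:] for e in entries]

def timed(fn, *args):
    # Return the function result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

normalized, t_norm = timed(with_normpath, "/", paths)
plain, t_plain = timed(without_normpath, "/", paths)

print("with normpath:    %.3fs" % t_norm)
print("without normpath: %.3fs" % t_plain)
```

Both variants return the same list for this input, so any time difference comes from the normpath calls alone.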
A quick test of normpath in Python shows it acts how I think it does,
which means I think this patch is good to put in: it's short, simple,
and shouldn't break anything.
I kind of see Brian's point about being defensive with filenames though
(a path like "///////var/db/pkg/fex", which normpath would clean up,
being invalid in a CONTENTS file but still usable by most(?) calls in
Python, which will just figure it all out anyway).
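The kind of quick check meant here can be sketched like this. It is not the actual portage code: join_root is a hypothetical helper that mirrors the patched line, the sample paths are made up, and it assumes root always ends in "/" (with a root lacking the trailing slash, plain concatenation would glue the two parts together). posixpath is used so the result does not depend on the platform.

```python
import posixpath

def join_root(root, contents_path):
    # Mirrors the patched line: drop the leading "/" and prepend root.
    return root + contents_path[1:]

# Well-formed CONTENTS-style paths (illustrative values only).
samples = ["/bin/bash", "/usr/lib/portage/pym/portage.py"]

# For clean input, normpath is a no-op on the joined path.
unchanged = all(
    posixpath.normpath(join_root(root, p)) == join_root(root, p)
    for root in ("/", "/mnt/gentoo/")
    for p in samples
)

# A malformed path is where normpath would actually make a difference.
messy = "///////var/db/pkg/fex"
cleaned = posixpath.normpath(messy)

print(unchanged)
print(cleaned)
```

So the call only matters for input that should not appear in a CONTENTS file in the first place.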
If portage errored on invalid paths in CONTENTS files, I could see
leaving it there, but if it doesn't help anything (because it doesn't
hurt anything either, right? :) ) then why do the work?
*has ranted too long :P*
-Alec Warner (antarus)
--- pym/portage.py.1 2005-04-25 23:32:45.000000000 +0200
+++ pym/portage.py 2005-04-26 00:05:38.000000000 +0200
@@ -3834,7 +3834,7 @@
# we do this so we can remove from non-root filesystems
# (use the ROOT var to allow maintenance on other partitions)
try:
- mydat[1]=os.path.normpath(root+mydat[1][1:])
+ mydat[1]=root+mydat[1][1:]
if mydat[0]=="obj":
#format: type, mtime, md5sum
pkgfiles[" ".join(mydat[1:-2])]=[mydat[0], mydat[-1], mydat[-2]]