2008/4/14, John Mandereau <[EMAIL PROTECTED]>:
> Hi,
>
> in current lilypond-book, it seems that using sets for reading existing
> output file names is a significant speed bottleneck. I see no good
> reason for using sets here, as I'd expect glob.glob and os.listdir to
> return lists (or tuples) without duplicate file names. Using a list iso
> a set speeds up lp-book a lot for me; may I commit and push the
> following patch?
>
> Cheers,
> John
>
> diff --git a/scripts/lilypond-book.py b/scripts/lilypond-book.py
> index b29cb97..025861c 100644
> --- a/scripts/lilypond-book.py
> +++ b/scripts/lilypond-book.py
> @@ -1635,14 +1635,14 @@ def write_file_map (lys, name):
>  def split_output_files(directory):
>      """Returns directory entries in DIRECTORY/XX/ , where XX are hex digits.
>
> -    Return value is a set of strings.
> +    Return value is a list of strings.
>      """
> -    files = set ()
> +    files = []
>      for subdir in glob.glob (os.path.join (directory, '[a-f0-9][a-f0-9]')):
>          base_subdir = os.path.split (subdir)[1]
>          sub_files = [os.path.join (base_subdir, name)
>                       for name in os.listdir (subdir)]
> -        files = files.union (sub_files)
> +        files += sub_files
Please convert back to a set before returning. Subsequent functions do
a lot of 'if file in files' tests, and lists get very slow when there
are 10000 elements in them.

>      return files
>
>  def do_process_cmd (chunks, input_name, options):
>
> _______________________________________________
> lilypond-devel mailing list
> [email protected]
> http://lists.gnu.org/mailman/listinfo/lilypond-devel

-- 
Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
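
To illustrate the request to return a set, here is a minimal sketch. It assumes the list-building loop from the patch is kept and only the return value changes; the code actually committed may differ. Accumulating into a list avoids creating a new set on every files.union() call, while a single set() conversion at the end keeps later 'file in files' tests at roughly constant time on average instead of a linear scan.

```python
import glob
import os


def split_output_files(directory):
    """Return directory entries in DIRECTORY/XX/, where XX are hex digits.

    Return value is a set of strings.
    """
    # Accumulate in a list: cheap appends, no intermediate sets.
    files = []
    for subdir in glob.glob(os.path.join(directory, '[a-f0-9][a-f0-9]')):
        base_subdir = os.path.split(subdir)[1]
        files += [os.path.join(base_subdir, name)
                  for name in os.listdir(subdir)]
    # Convert once, so callers get fast membership tests.
    return set(files)
```

With this shape, a caller checking many names against the result pays the conversion cost only once per call.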
