In relation to this, I just noticed something very important: If you "set auto_zoom, off" before loading a series of structures, you'll boost PyMOL's loading performance dramatically: by 10X at least...perhaps much more. Just zoom once manually after everything is loaded in.
cmd.set("auto_zoom", "off")

# now load your structures
...

# now zoom
cmd.zoom()

This change makes it possible to load hundreds of structures containing
hundreds of thousands of atoms in a reasonable amount of time. For
example, on my dual 1 GB G5 with Shark-optimized G5 beta code, I loaded
800 PDB structures containing a total of 1.4 million atoms in just over
426 seconds -- apparently the situation isn't nearly as bad as I'd
feared.

Cheers,
Warren

--
mailto:war...@delsci.com
Warren L. DeLano, Ph.D.
Principal Scientist
DeLano Scientific LLC
Voice (650)-346-1154
Fax (650)-593-4020

> -----Original Message-----
> From: Ben Allen [mailto:benal...@caltech.edu]
> Sent: Tuesday, September 07, 2004 2:48 PM
> To: Warren DeLano
> Cc: pymol-users@lists.sourceforge.net
> Subject: Re: [PyMOL] long loading times as the number of
> existing objects increases
>
> Warren-
> Thanks for your prompt response!
> Given the fundamental issues you mentioned, I think I will
> change my script so that it loads files only when they are
> needed and deletes the associated objects when they are no
> longer being displayed. Initially, I rejected this solution
> as less efficient, but apparently the specific situation with
> PyMOL actually makes it more efficient!
>
> Although the current version of my script uses a lot of
> outside information to determine which files are loaded, how
> they are colored, and how they are aligned (and so I haven't
> included it), the following illustrates what I'm talking about:
>
> #!/usr/bin/env python
>
> from glob import glob
> from time import time
>
> if __name__ == 'pymol':
>     from pymol import cmd
>     t1 = time()
>     for pdb in glob('*.pdb'):
>         print pdb
>         cmd.load(pdb)
>     t2 = time()
>     print t2 - t1
>
> If (from PyMOL) I cd to a directory that has 50 pdb files and
> run this script, it takes about 105 sec to complete.
> If I include simple alignment and color commands, as follows:
>
> #!/usr/bin/env python
>
> from glob import glob
> from time import time
>
> if __name__ == 'pymol':
>     from pymol import cmd
>     t1 = time()
>     objects = []
>     for pdb in glob('*.pdb'):
>         print pdb
>         cmd.load(pdb)
>         objects.append(pdb[:-4])
>         cmd.fit(objects[-1] + ' and name ca', objects[0] + ' and name ca')
>         cmd.color('wheat', objects[-1] + ' and elem c')
>     t2 = time()
>     print t2 - t1
>
> it still takes the same amount of time as before. This is only
> one data point (50 structures), because I didn't want to
> repeat the benchmarks for larger sets of structures, but it
> seems to indicate that the limiting step is the actual
> loading of the pdb files, and not the subsequent
> aligning/coloring steps.
>
> Thanks again for letting me know which direction I should go.
> I'll let you know if I get any insight into the origin of the
> original issue.
>
> -Ben
>
> On Sep 7, 2004, at 11:51 AM, Warren DeLano wrote:
>
> > Ben,
> >
> > Thanks for the great benchmarks! PyMOL is definitely showing
> > non-linear behavior when it comes to loading a lot of objects...
> > I don't know exactly why this is, but I can tell you that I didn't
> > originally envision (and thus didn't optimize PyMOL for) loading
> > so many objects.
> >
> > As it currently stands, there are a number of places where PyMOL
> > does things using lists when it should be using hashes, and there
> > are many tasks (such as selecting atoms) that are linearly
> > dependent (or worse) on the total number of atoms and coordinate
> > sets present in the system. All of these issues will be addressed
> > in time, but it may take considerable work to correct them.
> > Unfortunately, these are more than just bugs -- they are
> > limitations in the original design. Such limitations are now the
> > bane of my existence; my dreams are filled with questions of "How
> > do we fix or improve the software without breaking existing PyMOL
> > usage?"
> > Remodeling an airplane full of passengers while you're flying it
> > is much more challenging than when it is empty and on the
> > ground. : )
> >
> > My current advice is to find creative ways of limiting the total
> > number of atoms and objects loaded into PyMOL at one time. One way
> > to do this is to create subsets which contain just those atoms
> > you'd like to see. Another approach is to run multiple PyMOL
> > instances simultaneously.
> >
> > Cheers,
> > Warren
> >
> > PS. It would be great if you could send us one of your more
> > challenging example scripts to use as a test case for improvement
> > -- and if you do spot simple bottlenecks in the code, such
> > information could be very helpful.
> >
> > --
> > mailto:war...@delsci.com
> > Warren L. DeLano, Ph.D.
> > Principal Scientist
> > DeLano Scientific LLC
> > Voice (650)-346-1154
> > Fax (650)-593-4020
> >
> > > -----Original Message-----
> > > From: pymol-users-ad...@lists.sourceforge.net
> > > [mailto:pymol-users-ad...@lists.sourceforge.net] On Behalf Of
> > > Ben Allen
> > > Sent: Tuesday, September 07, 2004 10:32 AM
> > > To: pymol-users@lists.sourceforge.net
> > > Subject: [PyMOL] long loading times as the number of existing
> > > objects increases
> > >
> > > I have a situation in which I need to load a large number of
> > > separate pdb files into a single PyMOL session. In this case,
> > > the number is ~150, but it could potentially be more. However,
> > > the amount of time required to load a file appears to be
> > > strongly dependent on the number of files already loaded. For
> > > example:
> > >
> > > # of structures loaded    time to load all structures (seconds)
> > >           5                           0.82
> > >          10                           2.49
> > >          20                          11.05
> > >          30                          29.85
> > >          40                          62.48
> > >          50                         115.25
> > >          60                         189.79
> > >          70                         302.67
> > >          80                         432.82
> > >          90                         589.23
> > >
> > > Unfortunately, this means that loading 150 structures takes
> > > over an hour. I observe this behavior whether I am loading the
> > > structures all at once using a Python script, or one at a time.
> > > In both cases, I am using the cmd.load() API function, but the
> > > built-in load command gives similar results. The structures I
> > > am loading are (nearly) identical: each has 263 residues (in a
> > > single chain); each individual pdb file is about 215 KB.
> > >
> > > I am running this on a dual 2.0 GHz G5 system with 1.5 GB of
> > > memory. The long loading times are consistent between the two
> > > versions of PyMOL I have installed: OSX/X11 hybrid version 0.97
> > > and MacPyMOL version 0.95. During the long loading times, there
> > > is plenty of memory available, but the processor load stays at
> > > 50% (i.e., one processor on my machine is fully loaded
> > > throughout).
> > >
> > > My gut feeling is that this situation should not be, but I
> > > don't yet understand the structure of the code well enough to
> > > debug it. Can anyone shed light on this issue?
> > >
> > > Thanks in advance,
> > > Ben Allen
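[As a back-of-the-envelope check, not part of the original thread: a least-squares power-law fit of Ben's timing table quantifies the "non-linear behavior" Warren mentions. Fitting t = C * n**k on log-transformed data gives an exponent k of roughly 2.3, i.e. load time grows much faster than linearly in the number of structures.]

```python
import math

# Ben's benchmark table: (structures loaded, total load time in seconds)
data = [(5, 0.82), (10, 2.49), (20, 11.05), (30, 29.85), (40, 62.48),
        (50, 115.25), (60, 189.79), (70, 302.67), (80, 432.82), (90, 589.23)]

# Least-squares fit of log(t) against log(n); the slope k is the
# exponent of the power law t ~ C * n**k. k near 1 would be linear.
xs = [math.log(n) for n, _ in data]
ys = [math.log(t) for _, t in data]
xbar = sum(xs) / len(xs)
ybar = sum(ys) / len(ys)
k = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
     / sum((x - xbar) ** 2 for x in xs))
print(round(k, 2))  # exponent of the fit, about 2.3 -- strongly superlinear
```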
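[Ben's plan -- load a structure only when it needs to be displayed and delete its object when it no longer does -- can be sketched as a small eviction cache. This sketch is not from the thread: ObjectPool and its method names are illustrative, and the loader/deleter callbacks stand in for PyMOL's cmd.load and cmd.delete so the code runs outside a PyMOL session.]

```python
from collections import OrderedDict

class ObjectPool:
    """Keep at most `capacity` structures loaded, evicting the least
    recently shown object before loading a new one. Loader and deleter
    are injected; in a real session you would pass cmd.load and
    cmd.delete so that PyMOL never holds more than `capacity` objects."""

    def __init__(self, capacity, loader, deleter):
        self.capacity = capacity
        self.loader = loader
        self.deleter = deleter
        self.loaded = OrderedDict()  # object name -> pdb path, oldest first

    def show(self, name, path):
        """Ensure `name` is loaded, deleting the oldest object if full."""
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently shown
            return
        if len(self.loaded) >= self.capacity:
            oldest, _ = self.loaded.popitem(last=False)
            self.deleter(oldest)
        self.loader(path, name)
        self.loaded[name] = path

# Stub callbacks record calls instead of touching PyMOL.
calls = []
pool = ObjectPool(capacity=2,
                  loader=lambda path, name: calls.append(("load", name)),
                  deleter=lambda name: calls.append(("delete", name)))
pool.show("1abc", "1abc.pdb")
pool.show("2xyz", "2xyz.pdb")
pool.show("3def", "3def.pdb")  # pool is full, so 1abc is deleted first
print(calls)
# -> [('load', '1abc'), ('load', '2xyz'), ('delete', '1abc'), ('load', '3def')]
```

The key point, per the thread, is that deleting objects keeps the total object count low, which sidesteps the superlinear cost of loading into a session that already holds many objects.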