It was ~5% faster for ConnectTheDots alone, but that's *very*
preliminary. I wrote a Python script that loaded just the atoms into an
OBMol and ran ConnectTheDots, then profiled it (`python -m cProfile -s time
test_script.py`). The script looped through a random folder of
metal-organic frameworks that I had lying around, so it's in no way
indicative of general-use-case performance.
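
For anyone who wants to time a single call rather than the whole script, here
is a rough sketch of the same measurement done from inside Python with the
standard-library cProfile and pstats modules. It is not the script I ran:
`profile_call` and `naive_pair_count` are hypothetical names, and the pairwise
loop is only a stand-in workload so the example runs without the Open Babel
bindings. With the bindings installed, the callable would be the molecule's
ConnectTheDots method instead.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile and return (result, report text).

    The report is sorted by internal time, like `-s time` on the command line.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("time").print_stats(top)
    return result, buffer.getvalue()


def naive_pair_count(coords, cutoff):
    """Hypothetical stand-in workload: count atom pairs closer than cutoff."""
    cutoff_sq = cutoff * cutoff
    count = 0
    for i, (xi, yi, zi) in enumerate(coords):
        for xj, yj, zj in coords[i + 1:]:
            if (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2 < cutoff_sq:
                count += 1
    return count


if __name__ == "__main__":
    # Four atoms on a line, 1.0 apart; only neighbours are within 1.5.
    coords = [(float(i), 0.0, 0.0) for i in range(4)]
    pairs, report = profile_call(naive_pair_count, coords, 1.5)
    print(pairs)
    print(report)
```

Profiling only the one call keeps file parsing and interpreter start-up out of
the numbers, which matters when the difference being measured is around 5%.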

One area worth looking into is the hash sizing logic. In my example, with
a 12,000-atom unit cell and a 1024-element hash, there are probably a whole
lot of unnecessary collisions. Maybe there could be a load factor instead
of a fixed hash size, so that the size grows and shrinks with the number of
atoms (hash_size = num_atoms / loading_factor, where loading_factor could
be something like 0.5).
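
To put numbers on that, here is a small sketch of the sizing rule. It is only
an illustration of the formula above, not code from the commit: the function
names and the `minimum` floor are my own assumptions, and it assumes one hash
entry per atom.

```python
import math


def hash_size_for(num_atoms, loading_factor=0.5, minimum=1):
    """Size the table so that num_atoms / hash_size <= loading_factor."""
    if loading_factor <= 0:
        raise ValueError("loading_factor must be positive")
    return max(minimum, math.ceil(num_atoms / loading_factor))


def average_occupancy(num_atoms, hash_size):
    """Average number of atoms per hash slot."""
    return num_atoms / hash_size


if __name__ == "__main__":
    num_atoms = 12000
    fixed = 1024
    scaled = hash_size_for(num_atoms)
    print(fixed, average_occupancy(num_atoms, fixed))
    print(scaled, average_occupancy(num_atoms, scaled))
```

For the 12,000-atom cell this gives 24,000 slots at 0.5 atoms per slot on
average, versus roughly 11.7 atoms per slot with the fixed 1024-element hash.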


On Mon, Feb 10, 2014 at 2:57 AM, Noel O'Boyle <baoille...@gmail.com> wrote:

> Can you check the code into a branch?
>
> Do you mean 5% faster at ConnectTheDots, or 5% faster over the whole
> reading in?
>
> - Noel
>
> On 10 February 2014 02:40, Geoffrey Hutchison <geoff.hutchi...@gmail.com>
> wrote:
> > I got the hash-based algorithm up and running (commit here). I ran it on
> > a folder of ~100 cif files, and it's about 5% faster. I don't think the
> > small performance gain is worth migrating away from tried and tested
> > code, but, if there's another opinion out there, let me know.
> >
> >
> > The current ConnectTheDots is generally N log N since the atoms are
> > sorted. But my question would be whether the PerceiveBondOrders is being
> > triggered on your CIF files. Try, for example, running:
> >
> > obabel file.cif -O file.cml -as  # output only single bonds from ConnectTheDots()
> >
> > I suspect the problem is not in perceiving bonds, but the ring perception
> > and aromaticity detection needed in the PerceiveBondOrders.
> >
> > BTW, I'd try benchmarking before and after the commit with -as for
> > perceiving only single bonds. I'd be curious on large CIF files how much
> > the hash-based algorithm helps.
> >
> > Geoff - This might be more of a question for the Avogadro mailing list,
> > but do you think there's a more elegant way to handle these large cifs
> > in a GUI? The example cif currently freezes up Avogadro (and the internal
> > front-end software at my job) with a ton of CPU work. Maybe this could be
> > run in a background thread with a timeout?
> >
> >
> > Avogadro *does* read molecule files in a background thread. Granted, you
> > can't do anything until the molecule is read, but the interface still
> > works. Again, if you're seeing a huge CPU hit, my guess is that it's from
> > PBO. Now if you have a bit of free time for coding, I've got a great idea
> > there for a huge performance win. :-)
> >
> > -Geoff
>
_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel
