Geoff - Thanks for the references (I can access all of them). I'd be open
to trying out a couple of kekulization implementations, although it'll be
at a couple-of-hours-per-week pace.
Noel - I don't think I can make a pull request to a new branch on
openbabel/openbabel, but you can use my "connectthedots" branch at
patrickfuller/openbabel.
On Mon, Feb 10, 2014 at 11:23 AM, Patrick Fuller <patrickful...@gmail.com> wrote:
> It was ~5% faster with connect the dots alone, but that's *very*
> preliminary. I wrote a Python script that loaded just atoms into an OBMol,
> ran ConnectTheDots, and then profiled it (`python -m cProfile -s time
> test_script.py`). This script looped through a random folder of
> metal-organic frameworks that I had lying around, so it's in no way
> indicative of general-use-case performance.
>
> One area worth looking into is the hash sizing logic. In my example -
> with a 12,000-atom unit cell and a 1024-element hash size - there are
> probably a whole lot of unnecessary collisions. Maybe there could be a
> loading factor instead of a fixed hash size, so that the size expands and
> contracts based on the number of atoms (hash_size = num_atoms /
> loading_factor, and loading_factor could be something like 0.5).
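
A minimal sketch of the load-factor sizing idea above, in Python for
illustration only. The names hash_size_for and average_occupancy and the
min_size floor are assumptions made for this example; none of them come from
the Open Babel source. With a fixed 1024-element hash, 12,000 atoms average
roughly 11.7 atoms per bucket, while a loading factor of 0.5 gives 24,000
buckets.

```python
import math


def hash_size_for(num_atoms, load_factor=0.5, min_size=16):
    """Return a bucket count so that num_atoms / buckets <= load_factor.

    load_factor and min_size are illustrative defaults, not Open Babel values.
    """
    if load_factor <= 0:
        raise ValueError("load_factor must be positive")
    # hash_size = num_atoms / loading_factor, rounded up, with a small floor
    # so that tiny molecules still get a usable table.
    return max(min_size, math.ceil(num_atoms / load_factor))


def average_occupancy(num_atoms, hash_size):
    """Average number of atoms per bucket, assuming a uniform spread."""
    return num_atoms / hash_size


if __name__ == "__main__":
    for n in (100, 12000):
        scaled = hash_size_for(n)
        print(n, "atoms:",
              "fixed 1024 ->", average_occupancy(n, 1024), "per bucket;",
              "scaled", scaled, "->", average_occupancy(n, scaled), "per bucket")
```

The trade-off is memory: the table grows linearly with the atom count, so a
cap on the maximum size may be worth adding for very large cells.
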
>
>
> On Mon, Feb 10, 2014 at 2:57 AM, Noel O'Boyle <baoille...@gmail.com> wrote:
>
>> Can you check the code into a branch?
>>
>> Do you mean 5% faster at ConnectTheDots, or 5% faster over the whole
>> reading in?
>>
>> - Noel
>>
>> On 10 February 2014 02:40, Geoffrey Hutchison <geoff.hutchi...@gmail.com>
>> wrote:
>> > I got the hash-based algorithm up and running (commit here). I ran it
>> > on a folder of ~100 cif files, and it's about 5% faster. I don't think
>> > the small performance gain is worth migrating away from tried and
>> > tested code, but, if there's another opinion out there, let me know.
>> >
>> >
>> > The current ConnectTheDots is generally N log N since the atoms are
>> > sorted. But my question would be whether the PerceiveBondOrders is
>> > being triggered on your CIF files. Try, for example, running:
>> >
>> > obabel file.cif -O file.cml -as  # output only single bonds from ConnectTheDots()
>> >
>> > I suspect the problem is not in perceiving bonds, but the ring
>> > perception and aromaticity detection needed in the PerceiveBondOrders.
>> >
>> > BTW, I'd try benchmarking before and after the commit with -as for
>> > perceiving only single bonds. I'd be curious on large CIF files how
>> > much the hash-based algorithm helps.
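
One way to run the comparison suggested above is a small timing harness. This
is only a sketch: the helper names time_command and compare_obabel are
hypothetical, file.cml is a placeholder output path, and it assumes an obabel
binary is on the PATH. It times the same conversion with and without -as, so
the gap between the two approximates the cost of bond-order perception.

```python
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times and return the wall-clock time of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard the converter's output; only the elapsed time matters here.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def compare_obabel(cif_path, out_path="file.cml", repeats=3):
    """Return the best time for a conversion with and without -as.

    Assumes `obabel` is installed; the paths are placeholders.
    """
    runs = {
        "single bonds only (-as)": ["obabel", cif_path, "-O", out_path, "-as"],
        "default bond perception": ["obabel", cif_path, "-O", out_path],
    }
    # Take the minimum of the repeats to reduce noise from other processes.
    return {label: min(time_command(cmd, repeats)) for label, cmd in runs.items()}
```

Calling compare_obabel("file.cif") once against a build from before the commit
and once against a build from after it gives the four numbers needed for the
before/after comparison.
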
>> >
>> > Geoff - This might be more of a question for the Avogadro mailing list,
>> > but do you think there's a more elegant way to handle these large cifs
>> > in a GUI? The example cif currently freezes up Avogadro (and the
>> > internal front-end software at my job) with a ton of CPU work. Maybe
>> > this could be run in a background thread with a timeout?
>> >
>> >
>> > Avogadro *does* read molecule files in a background thread. Granted, you
>> > can't do anything until the molecule is read, but the interface still
>> > works. Again, if you're seeing a huge CPU hit, my guess is that it's
>> > from PBO. Now if you have a bit of free time for coding, I've got a
>> > great idea there for a huge performance win. :-)
>> >
>> > -Geoff
>>
>
>
_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel