Hi,

Thank you all very much for the detailed information. The link to the Dr. 
Dobb's article should be very useful.

Does anyone know whether I can assume that RDKit's canonical SMILES are the 
same as Open Babel's?

Am I doing something wrong when replying to the mailing list? It looks like all 
my answers are logged as separate messages, as opposed to being logged in the 
same thread. Please let me know, I don't want to make it all untidy!

Thanks.

> From: da...@dalkescientific.com
> Date: Fri, 13 Feb 2009 23:21:01 +0100
> To: rdkit-discuss@lists.sourceforge.net
> Subject: Re: [Rdkit-discuss] Canonical SMILES
> 
> On Feb 13, 2009, at 9:14 PM, TJ O'Donnell wrote:
> > Yes, InChI is unique across different packages. This is because
> > there is one definitive source for the code and algorithm. This was
> > a design goal of InChI.
> 
> 
> Or to twist TJ's words around .. it's exactly the same as with 
> canonical SMILES - every implementation of InChI does it a different 
> way. It's just that there's only one InChI implementation.
> 
> >> The book I was referring to is An Introduction to 
> >> Chemoinformatics from A.R. Leach and V.J. Gillet. Yes, they refer 
> >> to the CANGEN algorithm and to the Weininger paper you mentioned.
> >> It doesn't matter, as long as I'm aware of the scope of 
> >> 'uniqueness'.
> 
> Then it's an eerie coincidence that Schneider and Baringhaus use 
> exactly the same example, with exactly the same SMILES. ;)
> 
> http://books.google.com/books?id=feNn-JcC1KgC&pg=PA25&lpg=PA25&dq=canonical+SMILES&source=web&ots=CeTadvKPxA&sig=46za2byYVjkOtYM1cs5-xs6Bch0&hl=en&ei=ia2VSbf1FMyL-gbbguWQCQ&sa=X&oi=book_result&resnum=6&ct=result
> 
> 
> > in this case probably to do with which branch to deal with first)
> 
> 
> As I recall when trying to implement the algorithm, the ambiguity is 
> in dealing with ties. The algorithm assigns a unique ordering to the 
> atoms, up to symmetry, but it's defined at the atom level. Given an 
> atom A bonded to atoms B1 and B2, it's possible for B1 and B2 to be 
> in the same symmetry class, but with different bond types going to B1 
> and B2.
> 
> I asked Weininger about it and he said "choose the highest order bond 
> first", which mostly works but I think can be ambiguous for a few 
> rare cases.
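
The tie described above can be illustrated with a small pure-Python sketch. It is not Weininger's CANGEN, nor what RDKit or Open Babel actually do: the function names `symmetry_classes` and `branch_order`, the simplified rank refinement, and the "lower rank first" neighbor ordering are all assumptions made for illustration. Only the "highest order bond first" tie-break comes from the message. The input is benzene written in Kekule form, where every atom lands in the same symmetry class but each atom has one single and one double bond to its ring neighbors.

```python
from collections import defaultdict


def symmetry_classes(atoms, bonds):
    """Simplified atom-level rank refinement (hypothetical, not CANGEN).

    Bond orders are deliberately ignored, so the ranks only reflect
    element, degree, and the ranks of neighboring atoms.
    """
    neighbors = defaultdict(list)
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def rank(keys):
        # Map each distinct key to its position in sorted order.
        order = {k: i for i, k in enumerate(sorted(set(keys)))}
        return [order[k] for k in keys]

    # Initial invariant: (element symbol, number of neighbors).
    ranks = rank([(sym, len(neighbors[i])) for i, sym in enumerate(atoms)])
    while True:
        # Refine each atom's key with the sorted ranks of its neighbors.
        keys = [(ranks[i], tuple(sorted(ranks[j] for j in neighbors[i])))
                for i in range(len(atoms))]
        new_ranks = rank(keys)
        # Refinement can only split classes; stop when nothing splits.
        if len(set(new_ranks)) == len(set(ranks)):
            return new_ranks
        ranks = new_ranks


def branch_order(atom, ranks, bonds):
    """Order the neighbors of `atom`.

    Assumption: lower rank first. Ties in rank are broken by taking the
    higher-order bond first, as suggested in the message above.
    """
    candidates = []
    for (a, b), order in bonds.items():
        if atom in (a, b):
            other = b if a == atom else a
            candidates.append((ranks[other], -order, other))
    return [other for _, _, other in sorted(candidates)]


# Kekule benzene: six carbons, alternating double (2) and single (1) bonds.
atoms = ["C"] * 6
bonds = {(0, 1): 2, (1, 2): 1, (2, 3): 2, (3, 4): 1, (4, 5): 2, (0, 5): 1}

ranks = symmetry_classes(atoms, bonds)
print(ranks)                          # all six atoms share one class
print(branch_order(0, ranks, bonds))  # double-bonded neighbor comes first
```

Atom 0's two neighbors cannot be separated by rank alone, so the bond order decides which one is visited first. This sketch does not try to reproduce the rare cases where that rule is still ambiguous.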
> 
> There may be other under-specified aspects. I haven't looked at the 
> paper in 10 years.
> 
> Brian Kelley wrote an article about canonicalization, with code, for 
> Dr. Dobb's magazine. It's online at
> http://www.ddj.com/architect/184405341
> 
> The algorithm isn't that hard to implement, and it can be useful (at 
> very rare times) for doing things like canonicalizing SMARTS.
> 
> 
> Andrew
> da...@dalkescientific.com
