When I convert the molecules as given with obabel, you're right: you
run into a bug, since fixed on the development branch, where
aromaticity is perceived differently depending on the presence or
absence of explicit hydrogens:

> obabel 3rlb_ligand.* -osmi
Cc1nc(N)c(Cn2csc(CCO)c2C)cn1    3rlb_ligand
Cc1nc(N)c(CN2CSC(=C2C)CCO)cn1   ./3rlb_ligand.pdb

If you delete the explicit Hs first, you can get the same aromaticity
perception for both:
> obabel 3rlb_ligand.* -d -O tmp.sdf
> obabel tmp.sdf -osmi
Cc1nc(N)c(CN2=CSC(=C2C)CCO)cn1  3rlb_ligand
Cc1nc(N)c(CN2CSC(=C2C)CCO)cn1   ./3rlb_ligand.pdb

If you paste these SMILES into MarvinSketch you can see the
difference. The MOL2 file contains an extra double bond to a nitrogen.
So what's going on?
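
As a crude text-level check of that difference (a sketch only, not a
substitute for looking at the structures), the explicit double-bond
symbols in the two hydrogen-suppressed SMILES above can be counted. The
helper name countDoubleBonds is made up for this example, and it ignores
aromatic bonds, which SMILES writes implicitly between lowercase atoms.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper, not part of Open Babel: counts the explicit '='
// bond symbols in a SMILES string. Aromatic bonds are not counted.
int countDoubleBonds(const std::string& smiles) {
    return static_cast<int>(std::count(smiles.begin(), smiles.end(), '='));
}
```

Called on the two strings above, it returns 2 for the MOL2-derived
SMILES and 1 for the PDB-derived one, which is the extra double bond
to the nitrogen.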

I'm guessing that the correct structure is in the MOL2 file, but it
was read incorrectly by Open Babel and so is missing the charge on the
4-valent nitrogen. MOL2 is a horrible format but we should do a better
job. I note in passing that MarvinSketch interprets it the same as
Open Babel but that's no excuse.
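
To make the valence argument concrete, here is a minimal sketch of the
kind of check that would flag the problem: a nitrogen whose bond orders
sum to 4 most plausibly carries a +1 charge. This is not Open Babel
code. The Atom struct and the impliedNitrogenCharge function are
hypothetical stand-ins, and the rule is deliberately simplified to
cover only this one case.

```cpp
#include <numeric>
#include <vector>

// Hypothetical stand-in for an atom record; not an Open Babel type.
struct Atom {
    int atomicNumber;             // 7 for nitrogen
    int formalCharge;             // charge as read from the file
    std::vector<int> bondOrders;  // order of each bond to a neighbour (1, 2 or 3)
};

// Returns +1 for a nitrogen whose bond orders sum to 4 (assumed rule for
// this sketch). For any other atom or valence, the charge read from the
// file is returned unchanged.
int impliedNitrogenCharge(const Atom& atom) {
    const int valence =
        std::accumulate(atom.bondOrders.begin(), atom.bondOrders.end(), 0);
    if (atom.atomicNumber == 7 && valence == 4) {
        return 1;
    }
    return atom.formalCharge;
}
```

For the ring nitrogen in the MOL2-derived SMILES (two single bonds and
one double bond, read with charge 0), impliedNitrogenCharge returns 1.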

The PDB file of course does not contain any bond orders and so we
guess them. We do an okay job, but this is an example where we miss a
double bond. If you removed the bond orders from the MOL2 file you
would get the same wrong structure too.

- Noel



On 23 May 2017 at 15:24, Marcos Villarreal <mvillarr...@unc.edu.ar> wrote:
> Here is one example from the PDBBind refine data set.
> Please find below the code and the output; the mol2 and pdb input
> files are attached.
>
> Code:
>
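> #include <iostream>
> #include <string>
> #include <openbabel/obconversion.h>
> #include <openbabel/obiter.h>
> #include <openbabel/mol.h>
> #include <openbabel/atom.h>
>
> int main(int argc, char **argv)
> {
>   // Expect the input file (mol2 or pdb) as the only argument.
>   if (argc < 2) {
>     std::cerr << "Usage: " << argv[0] << " <input file>" << std::endl;
>     return 1;
>   }
>
>   OpenBabel::OBConversion conv;
>   OpenBabel::OBMol mol;
>   std::string filename = argv[1];
>
>   // The input format is guessed from the file extension.
>   if (!conv.ReadFile(&mol, filename)) {
>     std::cerr << "Could not read " << filename << std::endl;
>     return 1;
>   }
>
>   // Drop explicit hydrogens, then re-perceive connectivity and bond
>   // orders from the coordinates and reset the aromaticity flag.
>   mol.DeleteHydrogens();
>   mol.ConnectTheDots();
>   mol.PerceiveBondOrders();
>   mol.UnsetAromaticPerceived();
>
>   // Print one aromaticity flag (0 or 1) per atom.
>   FOR_ATOMS_OF_MOL(atom, mol) {
>      std::cout << atom->IsAromatic();
>   }
>   std::cout << std::endl;
>
>   return 0;
> }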
>
> Output:
> 000000111110000000 (mol2)
> 000000000000111111 (pdb)
>
>
>
> 2017-05-23 9:43 GMT-03:00 Noel O'Boyle <baoille...@gmail.com>:
>>
>> Maybe if you can give an example of the problem with aromaticity, we
>> can help? The only information that is used by that function is the
>> structure, so it was probably wrong at that point.
>>
>> On 23 May 2017 at 13:16, Marcos Villarreal <mvillarr...@unc.edu.ar> wrote:
>> >
>> > Dear Noel, thank you for your answer. Please see my comments below.
>> >
>> > 2017-05-22 16:00 GMT-03:00 Noel O'Boyle <baoille...@gmail.com>:
>> >>
>> >> In other words, you want to assign atom types based on the structure.
>> >
>> >
>> >    Yes, that's right.
>> >
>> >>
>> >> The source of the structure is immaterial except in so far as it
>> >> introduces noise. For example, to read a PDB file you need to guess
>> >> various things. To read a MOL file, you don't need to guess anything.
>> >
>> >
>> > That noise is what we are trying to avoid by always calculating
>> > (guessing) things with the same algorithm.
>> >
>> >>
>> >> Regarding your code, you should never throw away information and then
>> >> try to guess it.
>> >
>> >
>> > Well, that depends on your faith in the quality of the information put
>> > into the input format.
>> > One can always set a flag to keep the input information if it's
>> > considered accurate enough, but if you want consistency regardless of
>> > the input file format, I don't see any other way but to strip all the
>> > information from the input and recalculate it.
>> >
>> >> Also, I note in passing that DeleteHydrogens()
>> >> doesn't delete anything, it just suppresses any explicit hydrogens.
>> >
>> >
>> >> I'm a bit unclear why you are using the internal Open Babel atom
>> >> types. Personally, I would avoid this as the atom types may not be
>> >> suitable.
>> >>
>> >> Instead, just implement your own atom type function to suit
>> >> your needs. Any atom typing can be implemented as a function that
>> >> takes an OBAtom* and returns the type, perhaps as an enum.
>> >
>> >
>> > Are you referring to functions like "IsAmideNitrogen"? We used these
>> > functions, and they worked just fine for our needs.
>> > The problem we faced was with "IsAromatic", which we couldn't make
>> > input-format agnostic. Our guess is that some information from the
>> > input format always remains when calling it, even if
>> > UnsetAromaticPerceived and the like were called beforehand.
>> > This led us to try the route of expressing all the atom types as
>> > internal Open Babel types and building upon them.
>> >
>> >>
>> >> - Noel
>> >>
>> >> On 22 May 2017 at 18:56, Marcos Villarreal <mvillarr...@unc.edu.ar>
>> >> wrote:
>> >> > Hello,
>> >> >
>> >> > For an application we are developing, we would like to get an atom
>> >> > typing independent of the input format.
>> >> > For example, a mol2 with all hydrogen atoms and a pdb without
>> >> > hydrogens of the same molecule (i.e. identical heavy atom
>> >> > coordinates) should get the same atom types.
>> >> > The attached program is our attempt in that direction, but
>> >> > unfortunately without success. How could one get rid of all the
>> >> > input information and let babel do all the new calculations of atom
>> >> > types?
>> >> >
>> >> > Thank you in advance.
>> >> >
>> >> >
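>> >> > #include <iostream>
>> >> > #include <string>
>> >> > #include <openbabel/obconversion.h>
>> >> > #include <openbabel/obiter.h>
>> >> > #include <openbabel/mol.h>
>> >> > #include <openbabel/atom.h>
>> >> >
>> >> > int main(int argc, char **argv)
>> >> > {
>> >> >   if (argc < 2) {
>> >> >     std::cerr << "Usage: " << argv[0] << " <input file>" << std::endl;
>> >> >     return 1;
>> >> >   }
>> >> >
>> >> >   OpenBabel::OBConversion conv;
>> >> >   OpenBabel::OBMol mol;
>> >> >   std::string filename = argv[1];
>> >> >
>> >> >   conv.ReadFile(&mol, filename);
>> >> >
>> >> >   mol.DeleteHydrogens();
>> >> >   mol.ConnectTheDots();
>> >> >   mol.PerceiveBondOrders();
>> >> >
>> >> >   // Print the 1-based atom index and the internal Open Babel atom type.
>> >> >   int i = 0;
>> >> >   FOR_ATOMS_OF_MOL(atom, mol) {
>> >> >      i++;
>> >> >      std::cout << i << ": " << atom->GetType() << std::endl;
>> >> >   }
>> >> >
>> >> >   return 0;
>> >> > }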
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Marcos Villarreal
>> >> > Dpto de Química Teórica y Computacional
>> >> > Facultad de Ciencias Químicas
>> >> > Universidad Nacional de Córdoba
>> >> > Argentina.
>> >> >
>> >> >
>> >> >
>> >> > ------------------------------------------------------------------------------
>> >> > Check out the vibrant tech community on one of the world's most
>> >> > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> >> > _______________________________________________
>> >> > OpenBabel-discuss mailing list
>> >> > OpenBabel-discuss@lists.sourceforge.net
>> >> > https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>> >> >
