Thank you, Noel, for looking into this.
So how do you suggest doing this inside the code, that is, without passing
through an intermediate file?
I remind you that our goal is to get the same atom types (say, aromatics)
regardless of the input format.
For now we are interested in consistency before "accuracy", which is
another subject. As a related note, we have tested several atom typing
programs (Knodle, I-interpret, Unicon and also Open Babel) and the
perceived number of aromatic atoms typically differs by 10-20 % when
analyzing 3600 structures in the PDBbind database.
2017-05-23 11:47 GMT-03:00 Noel O'Boyle <baoille...@gmail.com>:
> When I convert the molecules as given with obabel, you're right - you
> run into a bug that's been fixed on the development branch -
> aromaticity is perceived differently depending on the presence/absence
> of explicit hydrogens:
>
> > obabel 3rlb_ligand.* -osmi
> Cc1nc(N)c(Cn2csc(CCO)c2C)cn1 3rlb_ligand
> Cc1nc(N)c(CN2CSC(=C2C)CCO)cn1 ./3rlb_ligand.pdb
>
> If you delete the explicit Hs first, you can get the same aromaticity
> perception for both:
> >obabel 3rlb_ligand.* -d -O tmp.sdf
> >obabel tmp.sdf -osmi
> Cc1nc(N)c(CN2=CSC(=C2C)CCO)cn1 3rlb_ligand
> Cc1nc(N)c(CN2CSC(=C2C)CCO)cn1 ./3rlb_ligand.pdb
>
> If you paste these SMILES into Marvin Sketch you can see the
> difference. The MOL2 file contains an extra double bond to a nitrogen.
> So what's going on?...
>
> I'm guessing that the correct structure is in the MOL2 file, but it
> was read incorrectly by Open Babel and so is missing the charge on the
> 4-valent nitrogen. MOL2 is a horrible format but we should do a better
> job. I note in passing that MarvinSketch interprets it the same as
> Open Babel but that's no excuse.
>
> The PDB file of course does not contain any bond orders and so we
> guess them. We do an okay job - this is an example where we miss the
> bond. If you removed these bond orders from the MOL2 file you would
> get the same wrong structure too.
>
> - Noel
>
>
>
> On 23 May 2017 at 15:24, Marcos Villarreal <mvillarr...@unc.edu.ar> wrote:
> > Here is one example from the PDBbind refined data set.
> > Please find below the code, the output, and, attached, the mol2 and pdb
> > input files.
> >
> > Code:
> >
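> > #include <iostream>
> > #include <string>
> > #include <openbabel/obconversion.h>
> > #include <openbabel/obiter.h>
> > #include <openbabel/mol.h>
> > #include <openbabel/atom.h>
> >
> > int main(int argc, char **argv)
> > {
> >     if (argc < 2) {
> >         std::cerr << "usage: " << argv[0] << " <input file>" << std::endl;
> >         return 1;
> >     }
> >
> >     OpenBabel::OBConversion conv;
> >     OpenBabel::OBMol mol;
> >     std::string filename = argv[1];
> >
> >     // Read the molecule from the file given on the command line.
> >     conv.ReadFile(&mol, filename);
> >
> >     // Remove explicit hydrogens, then re-run connectivity and bond order
> >     // perception and clear the "aromaticity perceived" flag.
> >     mol.DeleteHydrogens();
> >     mol.ConnectTheDots();
> >     mol.PerceiveBondOrders();
> >     mol.UnsetAromaticPerceived();
> >
> >     // Print one aromaticity flag (0 or 1) per atom.
> >     FOR_ATOMS_OF_MOL(atom, mol) {
> >         std::cout << atom->IsAromatic();
> >     }
> >     std::cout << std::endl;
> > }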
> >
> > Output:
> > 000000111110000000 (mol2)
> > 000000000000111111 (pdb)
> >
> >
> >
> > 2017-05-23 9:43 GMT-03:00 Noel O'Boyle <baoille...@gmail.com>:
> >>
> >> Maybe if you can give an example of the problem with aromaticity, we
> >> can help? The only information that is used by that function is the
> >> structure, so it was probably wrong at that point.
> >>
> >> On 23 May 2017 at 13:16, Marcos Villarreal <mvillarr...@unc.edu.ar> wrote:
> >> >
> >> > Dear Noel, thank you for your answer. Please see my comments below.
> >> >
> >> > 2017-05-22 16:00 GMT-03:00 Noel O'Boyle <baoille...@gmail.com>:
> >> >>
> >> >> In other words, you want to assign atom types based on the structure.
> >> >
> >> >
> >> > Yes, that's right.
> >> >
> >> >>
> >> >> The source of the structure is immaterial except in so far as it
> >> >> introduces noise. For example, to read a PDB file you need to guess
> >> >> various things. To read a MOL file, you don't need to guess anything.
> >> >
> >> >
> >> > That noise is what we are trying to avoid by always calculating
> >> > (guessing) things with the same algorithm.
> >> >
> >> >>
> >> >> Regarding your code, you should never throw away information and then
> >> >> try to guess it.
> >> >
> >> >
> >> > Well, that depends on your faith in the quality of the information put
> >> > in the input format.
> >> > One can always set a flag to keep the input information if it is
> >> > considered accurate enough, but if you want consistency regardless of
> >> > the input file format, I don't see any other way but to strip all the
> >> > information from the input and recalculate it.
> >> >
> >> >> Also, I note in passing that DeleteHydrogens()
> >> >> doesn't delete anything, it just suppresses any explicit hydrogens.
> >> >
> >> >
> >> >> I'm a bit unclear why you are using the internal Open Babel atom
> >> >> types. Personally, I would avoid this as the atom types may not be
> >> >> suitable.
> >> >>
> >> >> Instead, just implement your own atom type function to suit
> >> >> your needs. Any atom typing can be implemented as a function that
> >> >> takes an OBAtom* and returns the type, perhaps as an enum.
> >> >
> >> >
> >> > Are you referring to functions like "IsAmideNitrogen"? We used these
> >> > functions, and they worked just fine for our needs.
> >> > The problem we faced was with "IsAromatic", which we couldn't make
> >> > input-format agnostic. Our guess is that some information from the
> >> > input format always remains when calling it, even if
> >> > UnsetAromaticPerceived and the like were called beforehand.
> >> > This led us to try the route of putting all the atom types into
> >> > internal Open Babel types and building upon them.
> >> >
> >> >>
> >> >> - Noel
> >> >>
> >> >> On 22 May 2017 at 18:56, Marcos Villarreal <mvillarr...@unc.edu.ar> wrote:
> >> >> > Hello,
> >> >> >
> >> >> > For an application we are developing, we would like to get an atom
> >> >> > typing independent of the input format.
> >> >> > For example, a mol2 file with all hydrogen atoms and a pdb file
> >> >> > without hydrogens of the same molecule (i.e. identical heavy atom
> >> >> > coordinates) should get the same atom types.
> >> >> > The attached program is our attempt in that direction, but
> >> >> > unfortunately without success. How could one get rid of all the
> >> >> > input information and let babel do all the new calculations of atom
> >> >> > types?
> >> >> >
> >> >> > Thank you in advance.
> >> >> >
> >> >> >
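> >> >> > #include <iostream>
> >> >> > #include <string>
> >> >> > #include <openbabel/obconversion.h>
> >> >> > #include <openbabel/obiter.h>
> >> >> > #include <openbabel/mol.h>
> >> >> > #include <openbabel/atom.h>
> >> >> >
> >> >> > int main(int argc, char **argv)
> >> >> > {
> >> >> >     if (argc < 2) return 1;  // an input file name is required
> >> >> >
> >> >> >     OpenBabel::OBConversion conv;
> >> >> >     OpenBabel::OBMol mol;
> >> >> >     std::string filename = argv[1];
> >> >> >
> >> >> >     conv.ReadFile(&mol, filename);
> >> >> >
> >> >> >     // Remove explicit hydrogens, then re-run connectivity and bond
> >> >> >     // order perception.
> >> >> >     mol.DeleteHydrogens();
> >> >> >     mol.ConnectTheDots();
> >> >> >     mol.PerceiveBondOrders();
> >> >> >
> >> >> >     // Print the 1-based atom index and the internal atom type.
> >> >> >     int i = 0;
> >> >> >     FOR_ATOMS_OF_MOL(atom, mol) {
> >> >> >         i++;
> >> >> >         std::cout << i << ": " << atom->GetType() << std::endl;
> >> >> >     }
> >> >> > }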
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Marcos Villarreal
> >> >> > Dpto de Química Teórica y Computacional
> >> >> > Facultad de Ciencias Químicas
> >> >> > Universidad Nacional de Córdoba
> >> >> > Argentina.
> >> >> >
> >> >> >
> >> >> >
> >> >> > ------------------------------------------------------------------------------
> >> >> > Check out the vibrant tech community on one of the world's most
> >> >> > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >> >> > _______________________________________________
> >> >> > OpenBabel-discuss mailing list
> >> >> > OpenBabel-discuss@lists.sourceforge.net
> >> >> > https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
> >> >> >
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Marcos Villarreal
> >> > Dpto de Química Teórica y Computacional
> >> > Facultad de Ciencias Químicas
> >> > Universidad Nacional de Cordoba
> >
> >
> >
> >
> > --
> > Marcos Villarreal
> > Dpto de Química Teórica y Computacional
> > Facultad de Ciencias Químicas
> > Universidad Nacional de Cordoba
>
--
Marcos Villarreal
Dpto de Química Teórica y Computacional
Facultad de Ciencias Químicas
Universidad Nacional de Córdoba