Hi Lionel,
my guess (and it is only a guess) is that the molecules which have a [Zn]
atom with no charge might feature bonds between the zinc and the atoms
that coordinate the metal, e.g.:
In [1]: from rdkit import Chem
In [2]: querySmiles = Chem.MolFromSmiles('[Zn]')
In [3]: querySmarts = Chem.MolFromSmarts('[Zn]')
In [4]: mol = Chem.MolFromSmiles('N[Zn]N')
In [5]: mol.GetSubstructMatch(querySmiles)
Out[5]: ()
In [6]: mol.GetSubstructMatch(querySmarts)
Out[6]: (1,)
In [7]: znAtom = mol.GetAtomWithIdx(1)
In [8]: znAtom.GetFormalCharge()
Out[8]: 0
Best,
Paolo
On 11/30/17 16:47, Lionel Colliandre wrote:
Hi Paolo,
I am not sure I understand. If I concentrate on these searches:
(q)mol_from_smiles('[Zn]') => finds neither the mol containing [Zn]
nor the mol containing [Zn+2]
(q)mol_from_smiles('[Zn+2]') => finds the mol containing [Zn+2]
mol_from_smarts('[Zn]') => finds the mol containing [Zn] and the mol
containing [Zn+2]
mol_from_smarts('[Zn+2]') => finds the mol containing [Zn+2]
I understand all results except the first one: why is at least [Zn]
not retrieved? To me, both mols should be retrieved, as with the SMARTS
search.
Cheers,
Lionel
On 30/11/2017 at 14:27, Paolo Tosco wrote:
Hi Lionel,
the success or failure of the SMILES searches depends on whether you
specify the exact formal charge present in the database molecule,
which in turn depends on whether (and how) it was set in the input
molecule when it was loaded into the database. SMARTS searches based
on the element only, by contrast, will succeed whatever the formal
charge is, as they do not take the formal charge into account at all.
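The distinction described above can be mimicked with a toy model in plain Python. This is only a sketch of the behaviour as explained in this message, not RDKit's actual matching code; the `Atom` tuple and the two helper functions are hypothetical names made up for the illustration.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    """Hypothetical stand-in for an atom: element symbol plus formal charge."""
    element: str
    charge: int = 0


def matches_exact_charge(query: Atom, target: Atom) -> bool:
    # Models the SMILES-style query as described above: the element and the
    # formal charge both have to agree with the database atom.
    return query.element == target.element and query.charge == target.charge


def matches_element_only(query: Atom, target: Atom) -> bool:
    # Models an element-only SMARTS such as '[Zn]': the charge is ignored.
    return query.element == target.element


stored = Atom("Zn", 2)  # how the atom happens to be stored, e.g. as [Zn+2]

print(matches_exact_charge(Atom("Zn", 0), stored))  # False
print(matches_exact_charge(Atom("Zn", 2), stored))  # True
print(matches_element_only(Atom("Zn", 0), stored))  # True
```

In this simplified picture, whether the first kind of query hits depends entirely on the charge recorded when the molecule was loaded, while the second kind does not care.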
Best,
p.
On 11/30/17 13:21, Lionel Colliandre wrote:
Hi all,
Regarding the molecules that cannot be found by searching, I finally
found a solution by treating my queries as SMARTS:
SELECT id FROM rdk.mols WHERE m@>mol_from_smarts('[Zn]');
All the queries presented below now give the expected results, although
I am not sure what changes when I treat the query as SMARTS rather than
SMILES, given that the queries are valid SMILES.
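For anyone running these searches from a script, the two variants can be generated side by side for comparison. The helper below is a hypothetical sketch: the name `substructure_query` is invented here, the `%s` placeholder assumes a driver with psycopg2-style parameters, and the table and column names are simply the ones from the query above.

```python
def substructure_query(pattern: str, as_smarts: bool = True):
    """Return (sql, params) for a substructure search on rdk.mols."""
    # mol_from_smarts is the function used in the query above;
    # qmol_from_smiles is the SMILES-based counterpart discussed in this thread.
    parser = "mol_from_smarts" if as_smarts else "qmol_from_smiles"
    # The pattern is passed as a parameter rather than pasted into the SQL.
    sql = f"SELECT id FROM rdk.mols WHERE m@>{parser}(%s)"
    return sql, (pattern,)


sql, params = substructure_query("[Zn]")
print(sql)     # SELECT id FROM rdk.mols WHERE m@>mol_from_smarts(%s)
print(params)  # ('[Zn]',)
```

With an open cursor `cur` (an assumption, not shown here), the pair would be passed along as `cur.execute(sql, params)`, and switching `as_smarts` makes it easy to compare the hits of both query types for the same pattern.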
Lionel
2- for a lot of compounds, the ctab is valid and I can convert them
into mol and obtain the SMILES in the rdk.mols table. However, I
cannot find them when I search for part of the SMILES.
** First, for molecules with metals:
m1 = [Mn+2].[Zn+2]...
m2 = [Ag+].[Na+]...
m3 = [Ca+2]....
m4 = [Na+].c1ccc([B-](c2ccccc2)(c2ccccc2)c2ccccc2)cc1
m5 = [V+2]=O
m6 = [Rh+]...
m7 = [Cu].[Zn]
m8 = [Fe+2]...
For a database containing those molecules, these searches give:
[Mn] or [Mn+2] => 0 results (bad)
[Zn] => 0 (bad) but [Zn+2] => m1 (ok)
[Ag] or [Ag+] => m2 (ok)
[Na] => 0 (bad) why is Ag found and not Na in the same molecule?
but [Na+] => m2 + m4 (ok)
[Ca] => 0 (bad) but [Ca+2] => m3 (ok)
[B] or [B-] => 0 (bad)
[V] or [V+2] => 0 (bad)
[Rh] or [Rh+] => m6 (ok)
[Cu] => m7 (ok) but [Zn] => 0 (bad)
[Fe] => m8 (ok) but [Fe+2] => 0 (bad)
I cannot find the logic: sometimes the atom is found and not the ion,
sometimes it is the reverse, and sometimes, in the same molecule, one
is found and not the other. Does anyone have an explanation?
** Second, for N3:
m9 = [N-]=[N+]=[N-]
the following searches give:
[N-] or [N+] => 0 (bad)
[N-]=N => m9 (ok)
[N-]=[N+] => 0 (bad)
[N-]=[N+]=N => m9 (ok)
[N-]=[N+]=[N-] => m9 (ok)
Once again, I cannot find the logic. Does anyone have an explanation?
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org!http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss