I'm not sure if this helps, but maybe... If you have xyz files, how about
trying xyz2mol:
https://github.com/jensengroup/xyz2mol
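
As a rough sketch of that idea: RDKit releases from 2022.09 onward include an implementation of the xyz2mol approach in the rdDetermineBonds module, so the assumption here is that such a release is installed. The methane coordinates are made up for illustration and the net charge of 0 is assumed, not read from any file.

```python
from rdkit import Chem
from rdkit.Chem import rdDetermineBonds

# Illustrative XYZ block: atom count, a comment line, then element + coordinates.
xyz_block = """5
methane (illustrative coordinates)
C  0.000  0.000  0.000
H  0.629  0.629  0.629
H -0.629 -0.629  0.629
H -0.629  0.629 -0.629
H  0.629 -0.629 -0.629
"""

# MolFromXYZBlock gives atoms and a conformer but no bonds.
mol = Chem.MolFromXYZBlock(xyz_block)

# Perceive connectivity and bond orders from the 3D geometry.
# The total charge has to be supplied by the caller (assumed 0 here).
rdDetermineBonds.DetermineBonds(mol, charge=0)

smiles = Chem.MolToSmiles(Chem.RemoveHs(mol))
```

For a dataset like QM9 the same call would be applied to each xyz file in turn, with failures caught and logged rather than assumed away.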

-greg


On Wed, Oct 23, 2019 at 4:57 PM Navid Shervani-Tabar <nshe...@gmail.com>
wrote:

> Thanks for the prompt response, Greg. These are from the QM9 dataset (from
> the original paper). Do you know of any package that has already fixed them,
> by any chance? I used to use Chainer Chemistry to load the dataset, but that
> does not seem to include coordinate information.
>
> Navid
>
> On Wed, Oct 23, 2019 at 10:53 AM Greg Landrum <greg.land...@gmail.com>
> wrote:
>
>>
>> Given that those molecules are not chemically reasonable, I would suggest
>> either fixing them by hand or removing them.
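
One way to find candidates for removal before RDKit is involved at all is a simple valence pre-check on the graph itself. This is a minimal sketch, not an RDKit feature: the `MAX_VALENCE` table and the `overvalent_atoms` helper are hypothetical names, and the limits assume neutral H, C, N, O and F (the elements in QM9), so legitimately charged atoms would also be flagged.

```python
# Assumed maximum total bond order for *neutral* atoms, keyed by atomic number.
MAX_VALENCE = {1: 1, 6: 4, 7: 3, 8: 2, 9: 1}

def overvalent_atoms(atomic_nums, adjacency):
    """Return (index, atomic number, bond-order sum) for atoms over the limit."""
    flagged = []
    for i, z in enumerate(atomic_nums):
        total = sum(adjacency[i])  # sum of bond orders in row i
        limit = MAX_VALENCE.get(z)
        if limit is not None and total > limit:
            flagged.append((i, z, total))
    return flagged

# Heavy-atom graph for C#CC#CN#N, one of the problem cases in this thread.
atoms = [6, 6, 6, 6, 7, 7]
adj = [
    [0, 3, 0, 0, 0, 0],
    [3, 0, 1, 0, 0, 0],
    [0, 1, 0, 3, 0, 0],
    [0, 0, 3, 0, 1, 0],
    [0, 0, 0, 1, 0, 3],
    [0, 0, 0, 0, 3, 0],
]
problems = overvalent_atoms(atoms, adj)
```

Here the only flagged atom is index 4, the nitrogen with a bond-order sum of 4, which is the same atom the sanitization error points at.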
>>
>> On Wed, 23 Oct 2019 at 16:46, Navid Shervani-Tabar <nshe...@gmail.com>
>> wrote:
>>
>>> Thanks Dan,
>>>
>>> There are two more issues after sanitizing:
>>>
>>> 1. For some molecules (e.g., c1ccnc1), I get the following error: Can't
>>> kekulize mol.  Unkekulized atoms: 0 1 2 3 4
>>>
>>> 2. For some molecules (e.g., C#CC#CN#N), I get the following error:
>>> ValueError: Sanitization error: Explicit valence for atom # 4 N, 4, is
>>> greater than permitted
>>>
>>> I asked a similar question a few weeks ago, when I got a similar error
>>> with SMILES as my input, but none of the suggestions helped. Should
>>> I just get rid of these molecules?
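
If the decision is to drop such molecules, a filter along these lines might be enough. It is a sketch under the assumption that SMILES strings are available; `is_sanitizable` is a hypothetical helper name. It relies on `Chem.SanitizeMol(mol, catchErrors=True)`, which returns the sanitization step that failed instead of raising.

```python
from rdkit import Chem

def is_sanitizable(smiles):
    """Return True if RDKit can fully sanitize the molecule, False otherwise."""
    # Parse without sanitizing so that problem molecules still yield a Mol.
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return False
    # With catchErrors=True the failed operation is returned, not raised;
    # SANITIZE_NONE means every step succeeded.
    failed_op = Chem.SanitizeMol(mol, catchErrors=True)
    return failed_op == Chem.SanitizeFlags.SANITIZE_NONE

candidates = ["c1ccncc1", "c1ccnc1", "C#CC#CN#N"]
kept = [smi for smi in candidates if is_sanitizable(smi)]
```

RDKit still writes the kekulization and valence messages to its log for the rejected entries; only the exceptions are suppressed.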
>>>
>>> Thanks,
>>> Navid
>>>
>>> On Tue, Oct 22, 2019 at 11:57 PM Dan Nealschneider <
>>> dan.nealschnei...@schrodinger.com> wrote:
>>>
>>>> Navid-
>>>> You probably need to "sanitize" the mol:
>>>>
>>>> rdkit.Chem.rdmolops.SanitizeMol(mol)
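
A small illustration of why this matters for hybridization, using a hand-built acetaldehyde skeleton (C-C=O) as an assumed example rather than anything from the dataset: a molecule assembled atom by atom carries no perceived properties until it is sanitized.

```python
from rdkit import Chem

# Build C-C=O by hand, the same way the MolFromGraphs function does.
rw = Chem.RWMol()
c1 = rw.AddAtom(Chem.Atom(6))
c2 = rw.AddAtom(Chem.Atom(6))
o1 = rw.AddAtom(Chem.Atom(8))
rw.AddBond(c1, c2, Chem.BondType.SINGLE)
rw.AddBond(c2, o1, Chem.BondType.DOUBLE)
mol = rw.GetMol()

# Before sanitization nothing has been perceived yet.
before = [atom.GetHybridization() for atom in mol.GetAtoms()]

# SanitizeMol works in place: it computes valences, aromaticity,
# hybridization and so on, and raises if the molecule is not valid.
Chem.SanitizeMol(mol)
after = [atom.GetHybridization() for atom in mol.GetAtoms()]
```

This is also why the round trip through SMILES appears to fix things: `MolFromSmiles` sanitizes by default.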
>>>>
>>>> *dan nealschneider* | senior developer
>>>>
>>>>
>>>> On Tue, Oct 22, 2019 at 6:31 PM Navid Shervani-Tabar <nshe...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am trying to load a dataset using a vector of atoms (e.g.,
>>>>> [6,6,7,6,6,8]) and the corresponding adjacency matrix. I am using the
>>>>> following script to transform these into a mol object:
>>>>>
>>>>> from rdkit import Chem
>>>>>
>>>>> def MolFromGraphs(node_list, adjacency_matrix):
>>>>>
>>>>>     # create empty editable mol object
>>>>>     mol = Chem.RWMol()
>>>>>
>>>>>     # add atoms to mol and keep track of index
>>>>>     node_to_idx = {}
>>>>>     for i in range(len(node_list)):
>>>>>         a = Chem.Atom(node_list[i].item())
>>>>>         molIdx = mol.AddAtom(a)
>>>>>         node_to_idx[i] = molIdx
>>>>>
>>>>>     # add bonds between adjacent atoms
>>>>>     for ix, row in enumerate(adjacency_matrix):
>>>>>         for iy, bond in enumerate(row):
>>>>>
>>>>>             # only traverse half the matrix
>>>>>             if iy <= ix:
>>>>>                 continue
>>>>>
>>>>>             # add relevant bond type (there are many more of these)
>>>>>             if bond == 0:
>>>>>                 continue
>>>>>             elif bond == 1:
>>>>>                 bond_type = Chem.rdchem.BondType.SINGLE
>>>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>>>             elif bond == 2:
>>>>>                 bond_type = Chem.rdchem.BondType.DOUBLE
>>>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>>>             elif bond == 3:
>>>>>                 bond_type = Chem.rdchem.BondType.TRIPLE
>>>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>>>             elif bond == 1.5:
>>>>>                 bond_type = Chem.rdchem.BondType.AROMATIC
>>>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>>>
>>>>>     # Convert RWMol to Mol object
>>>>>     mol = mol.GetMol()
>>>>>
>>>>>     return mol
>>>>>
>>>>>
>>>>> When I try to get the hybridization of atoms using the mol object
>>>>> generated from the function above, I get *UNSPECIFIED.*
>>>>>
>>>>> To make sure that this function works, I used *MolToSmiles* to
>>>>> generate a SMILES string from the generated mol object and it matched the
>>>>> actual SMILES from the dataset. Interestingly, when I regenerate the mol
>>>>> object from the SMILES that I already generated from the above function, I
>>>>> can get the hybridization from the new mol object with no problem. I was
>>>>> wondering if there is a flag or variable that I should set in the above
>>>>> function to be able to get hybridization from the generated mol object.
>>>>>
>>>>> Thanks!
>>>>> Navid
>>>>>
>>>>> _______________________________________________
>>>>> Rdkit-discuss mailing list
>>>>> Rdkit-discuss@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>>>
