One thing to note: SMILES has no radicals, only undervalent atoms, which means your formula works correctly for SMILES input. From other formats (e.g. MDL), however, you get an explicit unpaired electron added to the container. Simple rules will get you pretty far, and there are utilities like the CDK AtomTypeMatcher which provide a global model, but I would write just what you need, since different formats use different valence models and things change depending on the surroundings (e.g. oxides).
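To make the SMILES case concrete, here is a minimal, CDK-free sketch that just runs the arithmetic of your formula on the nitrogen from your SMILES. The class and method names are made up for illustration, and the inputs (valency 3, bond order sum 2, no implicit hydrogens) are the values from your description, not something read from a real container.

```java
// Hypothetical stand-alone demo: plain ints instead of CDK atoms.
class UnpairedElectronDemo {

    // Same arithmetic as the quoted formula:
    // valency - bond order sum + formal charge - implicit hydrogens.
    static int unpairedElectrons(int valency, int bondOrderSum,
                                 int formalCharge, int implicitH) {
        return valency - bondOrderSum + formalCharge - implicitH;
    }

    // The quoted rule: add one implicit hydrogen when the count is odd.
    static int fixedImplicitH(int valency, int bondOrderSum,
                              int formalCharge, int implicitH) {
        int unpaired = unpairedElectrons(valency, bondOrderSum, formalCharge, implicitH);
        return (unpaired % 2 != 0) ? implicitH + 1 : implicitH;
    }

    public static void main(String[] args) {
        // Neutral [N] with two single bonds and no implicit hydrogens.
        System.out.println(unpairedElectrons(3, 2, 0, 0));   // prints 1
        // The same atom written as [N-].
        System.out.println(unpairedElectrons(3, 2, -1, 0));  // prints 0
        // Implicit hydrogen count after applying the rule to the neutral [N].
        System.out.println(fixedImplicitH(3, 2, 0, 0));      // prints 1
    }
}
```

For the neutral [N] the count is odd, so the rule adds a hydrogen; for [N-] it is already even and nothing changes.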
I will caution against "robo chemistry" <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6086778/>, where you try to guess what the correct answer is. For example, here a negative charge looks more reasonable:

  [Na+].O=C1[N-]C=CC=2C=CC(Br)=CC12.FC(F)(F)CI

Either way, I would probably avoid the getValency() field and instead switch on the atomic number and guard against unusual charges:

  // atomicNum and charge are the atom's atomic number and formal charge
  int explValence = (int) atomContainer.getBondOrderSum(atom);
  int hcount = 0; // hydrogens needed to fill the valence
  switch (atomicNum) {
    case 6: // carbon
      if (charge == 0)
        hcount = Math.max(4 - explValence, 0);
      break;
    case 7: // nitrogen: valence 3, or 5 if already above 3
      if (charge == 0 && explValence > 3)
        hcount = Math.max(5 - explValence, 0);
      else if (charge == 0)
        hcount = Math.max(3 - explValence, 0);
      break;
  }

Further reading:

MDL valence model: https://www.ics.uci.edu/~dock/manuals/oechem/pyprog/mdlvalence.html
https://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/
OPSIN has a good valence checker: https://github.com/dan2097/opsin/blob/c827542501516a70fb91dce09bc1b275ddddb80d/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/ValencyChecker.java

John

On Mon, 22 Aug 2022 at 08:47, Uli Fechner <u...@pending.ai> wrote:
> Hi,
>
> I came across an issue today that seemed straightforward at the beginning,
> but after a while ceased to appear that easily accessible. Well, I probably
> shouldn't be surprised - I guess that is just cheminformatics at its best :)
>
> The following SMILES popped up in my workflow:
> [Na+].O=C1[N]C=CC=2C=CC(Br)=CC12.FC(F)(F)CI
>
> This translates to the sole nitrogen (valency = 3, SP2) being a radical
> with no implicit hydrogen and two neighboring carbon atoms both of which
> are connected by single bonds.
>
> Irrespective of how that radical got there I want to 'remove' it by just
> adding an implicit hydrogen to the nitrogen atom.
>
> This then led to the more general question of how to remove radicals for
> common organic elements (C, N, O, P, S seems like a good start).
>
> I came up with the following formula:
>
> int numberOfUnpairedElectrons = (int) (atom.getValency() -
>     atomContainer.getBondOrderSum(atom) + atom.getFormalCharge() -
>     atom.getImplicitHydrogenCount());
> if (numberOfUnpairedElectrons % 2 != 0) {
>     atom.setImplicitHydrogenCount(atom.getImplicitHydrogenCount() + 1);
> }
>
> As this is chemistry, I am sure there are a lot of exceptions - even if
> the elements of interest are very restricted.
>
> Is the formula above a reasonable simplification? Or am I oversimplifying
> this?
>
> Best
> Uli
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user