Ah, OK. Just a note that you could instead fragment the actual structure with Open Babel, and you would get the same results. There's a page in the docs on how aromaticity is handled: https://open-babel.readthedocs.io/en/latest/Aromaticity/Aromaticity.html
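
For anyone reading this in the archive, here is a minimal pybel sketch of the behaviour discussed in the quoted thread below. It assumes Open Babel 3.x, where pybel.readstring accepts an opt dictionary of input options and "a" is the SMILES input option that preserves the aromaticity given in the input. The helper name smarts_matches is made up for illustration and is not part of Open Babel.

```python
# SMILES and SMARTS taken from the quoted thread.
SMILES = "O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)"
PATTERN = "c1cccc2Cc3ccccc3(Oc12)"


def smarts_matches(smiles, pattern, preserve_aromaticity=False):
    """Return True if `pattern` (SMARTS) matches the molecule read from `smiles`.

    With preserve_aromaticity=True the SMILES is read with the "a" input
    option, so upper/lowercase atoms are kept as written instead of being
    re-perceived with Open Babel's default aromaticity model.
    """
    # Imported here so that loading this snippet does not itself require
    # Open Babel to be installed (an illustrative choice, not a convention).
    from openbabel import pybel

    opt = {"a": True} if preserve_aromaticity else {}
    mol = pybel.readstring("smi", smiles, opt=opt)
    # findall returns a list of atom-index tuples, one per match.
    return len(pybel.Smarts(pattern).findall(mol)) > 0
```

Going by the obabel output in the thread, smarts_matches(SMILES, PATTERN) should report no match, while smarts_matches(SMILES, PATTERN, preserve_aromaticity=True) should report one. As noted there, preserving the input aromaticity is only advisable if you know what you are doing.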

On Tue, 21 Feb 2023 at 18:05, Thomas <odioidenti...@gmail.com> wrote:

> Thank you Noel, the -a option solved my issue.
>
> I made a program that fragments molecular structures by fragmenting the
> SMILES string instead of the actual structure. The resulting SMILES
> fragments can therefore be a bit messed up, but I still want them to match
> the original structure.
> Furthermore, the SMILES that I use as input have already been processed by
> a chemical platform (called Vega), so for consistency's sake I should not
> modify the information.
>
> On Sun, 19 Feb 2023 at 18:32, Noel O'Boyle <baoille...@gmail.com> wrote:
>
>> It would be useful to know what problem you are trying to solve here.
>>
>> OB does not support canonical Kekule SMILES, if you expect different
>> resonance forms to give the same canonical Kekule SMILES. Of course, you
>> can just write out a canonical aromatic SMILES, read it back in, and then
>> write it out in Kekule form (no need for the canonical option).
>>
>> Regarding the second question, I don't know where you got that SMILES
>> from, but if you go to https://www.simolecule.com/cdkdepict/depict.html
>> and paste the SMILES into the SMILES box and the SMARTS pattern into the
>> SMARTS box, you will not see a match either. Both programs apply the
>> Daylight aromaticity model by default (as best they can), which leads to
>> the bridging O and C being aromatic.
>>
>> $ obabel -:"O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)" -osmi
>> O=C(NCCN(C)C)c1cccc2c(=O)c3ccccc3oc12
>>
>> Note the lowercase 'o' and 'c' - this is why the SMARTS won't match. I
>> don't recommend it unless you know what you're doing, but OB can preserve
>> whatever aromaticity is in the input using the "a" input option to SMILES:
>>
>> $ obabel -:"O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)" -osmi -aa
>> O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3Oc12
>>
>> Note that the uppercase 'O' and 'C' are preserved. Here's proof that it
>> matches the SMARTS:
>>
>> $ obabel -:"O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)" -s "c1cccc2Cc3ccccc3(Oc12)" -osmi
>> 0 molecules converted
>> $ obabel -:"O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)" -s "c1cccc2Cc3ccccc3(Oc12)" -osmi -aa
>> O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3Oc12
>>
>> Regards,
>> Noel
>>
>> On Fri, 17 Feb 2023 at 18:50, Thomas <odioidenti...@gmail.com> wrote:
>>
>>> Thank you Noel.
>>> I wanted to get the canonical SMILES without changing the aromaticity
>>> of the input SMILES:
>>>
>>> mol = pybel.readstring('smi', 'O=C(NCCN(C)C)C1=CC=CN2C(=O)c3ccccc3(N=C12)')
>>> mol.write(opt={"k": True, 'c': True})
>>> 'CN(CCNC(=O)C1=CC=CN2C1=NC1C=CC=CC=1C2=O)C\t\n'
>>> mol.write(opt={'c': True})
>>> 'CN(CCNC(=O)c1cccn2c1nc1ccccc1c2=O)C\t\n'
>>>
>>> Furthermore, can you explain this to me:
>>>
>>> mol = pybel.readstring('smi', "O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)")
>>> sma = pybel.Smarts("c1cccc2Cc3ccccc3(Oc12)")
>>> sma.obsmarts.Match(mol.OBMol, True)
>>> False
>>>
>>> Thank you again
>>> Thomas
>>>
>>> On Fri, 17 Feb 2023 at 18:37, Noel O'Boyle <baoille...@gmail.com> wrote:
>>>
>>>> Hi Thomas,
>>>>
>>>> OB does not sanitize molecules when reading from SMILES (or any other
>>>> format). By default it writes aromatic SMILES, though, and it sounds
>>>> like you want Kekule SMILES - see obabel -Hsmi for the list of options.
>>>> In this case you want 'k':
>>>>
>>>> $ obabel -:"O=C1C=COC(=C1(O))C" -xk -osmi
>>>> O=C1C=COC(=C1O)C
>>>>
>>>> In Python, this is something like mol.write(opt={"k": True}).
>>>>
>>>> Neither does it add Hs. A SMILES string exactly specifies the number of
>>>> Hs on each atom; this is preserved on reading/writing. If you could
>>>> provide information on a specific case, we could explain what's
>>>> happening more clearly.
>>>>
>>>> Regards
>>>> Noel
>>>>
>>>> On Fri, 17 Feb 2023 at 16:18, Thomas <odioidenti...@gmail.com> wrote:
>>>>
>>>>> Is there an option to avoid sanitization of a molecule when reading
>>>>> from SMILES?
>>>>> For example, I'd like the SMILES to remain unchanged if I read and
>>>>> write it:
>>>>>
>>>>> mol = pybel.readstring('smi', 'O=C1C=COC(=C1(O))C')
>>>>> mol.write()
>>>>> O=c1ccoc(c1O)C
>>>>>
>>>>> Besides kekulization issues, another unwanted sanitization is the
>>>>> addition of Hs when I generate the molecule from SMILES fragments
>>>>> (partial SMILES).
>>>>>
>>>>> Thank you
>>>>> Thomas
_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss