Re: [Open Babel] Avoid sanitization

Noel O'Boyle Wed, 22 Feb 2023 01:36:33 -0800

Ah ok. Just a note that you can fragment the actual structure with Open
Babel and you will get the same results. There's a page in the docs on
this:
https://open-babel.readthedocs.io/en/latest/Aromaticity/Aromaticity.html.

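For anyone following the thread from Python rather than the command line, here is a minimal sketch of the "a" SMILES input option discussed in the quoted messages below. It assumes the Open Babel 3.x Python bindings (`from openbabel import pybel`) and that `pybel.readstring` accepts input options through its `opt` argument. The helper name `count_matches` is made up for illustration, and the import is guarded so the script still loads when the bindings are missing.

```python
# Sketch only: assumes the Open Babel 3.x Python bindings ("openbabel" package).
try:
    from openbabel import pybel
except ImportError:  # bindings not installed; count_matches then returns None
    pybel = None

# SMILES and SMARTS taken verbatim from the thread.
SMILES = "O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)"
SMARTS = "c1cccc2Cc3ccccc3(Oc12)"


def count_matches(smiles, smarts, preserve_aromaticity):
    """Return the number of SMARTS hits, or None if Open Babel is missing.

    count_matches is a hypothetical helper, not part of pybel.
    """
    if pybel is None:
        return None
    # "a" is the SMILES *input* option (obabel ... -aa on the command line):
    # keep the aromaticity written in the input instead of re-perceiving it.
    opt = {"a": True} if preserve_aromaticity else {}
    mol = pybel.readstring("smi", smiles, opt=opt)
    return len(pybel.Smarts(smarts).findall(mol))


default_hits = count_matches(SMILES, SMARTS, preserve_aromaticity=False)
preserved_hits = count_matches(SMILES, SMARTS, preserve_aromaticity=True)
print("default aromaticity model:", default_hits)
print("input aromaticity preserved:", preserved_hits)
```

Going by the obabel output quoted below, the first call should find no match and the second should find at least one. As noted in the thread, preserving the input aromaticity is not recommended unless you know what you're doing.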

On Tue, 21 Feb 2023 at 18:05, Thomas <odioidenti...@gmail.com> wrote:

> Thank you Noel, the -a option solved my issue.
>
> I made a program that fragments molecular structures by fragmenting the
> SMILES string instead of the actual structure. Therefore, the resulting
> SMILES fragments can be a bit messed up, but I still want them to match
> the original structure.
> Furthermore, the SMILES that I use as input are already processed by a
> chemical platform (called Vega), so for consistency's sake I should not
> modify the information.
>
> On Sun, 19 Feb 2023 at 18:32, Noel O'Boyle <baoille...@gmail.com> wrote:
>
>> It would be useful to know what problem you are trying to solve here.
>>
>> OB does not support canonical Kekule SMILES, if you expect different
>> resonance forms to give the same canonical Kekule SMILES. Of course, you
>> can just write out a canonical aromatic SMILES, read it back in, and then
>> write it out in Kekule form (no need for canonical option).
>>
>> Regarding the second question, I don't know where you got that SMILES
>> from, but if you go to https://www.simolecule.com/cdkdepict/depict.html,
>> and paste the SMILES into the SMILES box, and the SMARTS pattern into the
>> SMARTS box, you will not see a match either. By default, both programs
>> apply the Daylight aromaticity model (as best they can), which leads to
>> the bridging O and C being aromatic.
>>
>> $ obabel -:"O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)" -osmi
>> O=C(NCCN(C)C)c1cccc2c(=O)c3ccccc3oc12
>>
>> Note the lowercase 'o' and 'c' - this is why the SMARTS won't match. I
>> don't recommend it unless you know what you're doing, but OB can preserve
>> whatever aromaticity is in the input using the "a" input option to SMILES:
>>
>> $ obabel -:"O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)" -osmi -aa
>> O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3Oc12
>>
>> Note that the uppercase 'O' and 'C' are preserved. Here's proof that it
>> matches the SMARTS:
>>
>> $ obabel -:"O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)" -s "c1cccc2Cc3ccccc3(Oc12)" -osmi
>> 0 molecules converted
>> $ obabel -:"O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)" -s "c1cccc2Cc3ccccc3(Oc12)" -osmi -aa
>> O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3Oc12
>>
>> Regards,
>> Noel
>>
>>
>>
>> On Fri, 17 Feb 2023 at 18:50, Thomas <odioidenti...@gmail.com> wrote:
>>
>>> Thank you Noel.
>>> I wanted to get the canonical SMILES, without changing the aromaticity
>>> of the input SMILES:
>>>
>>> mol = pybel.readstring('smi',
>>> 'O=C(NCCN(C)C)C1=CC=CN2C(=O)c3ccccc3(N=C12)')
>>> mol.write(opt={"k": True, 'c':True})
>>> 'CN(CCNC(=O)C1=CC=CN2C1=NC1C=CC=CC=1C2=O)C\t\n'
>>> mol.write(opt={'c':True})
>>> 'CN(CCNC(=O)c1cccn2c1nc1ccccc1c2=O)C\t\n'
>>>
>>> Furthermore, can you explain this to me:
>>>
>>> mol = pybel.readstring('smi', "O=C(NCCN(C)C)c1cccc2C(=O)c3ccccc3(Oc12)")
>>> sma = pybel.Smarts("c1cccc2Cc3ccccc3(Oc12)")
>>> sma.obsmarts.Match(mol.OBMol, True)
>>> False
>>>
>>> Thank you again
>>> Thomas
>>>
>>> On Fri, 17 Feb 2023 at 18:37, Noel O'Boyle <baoille...@gmail.com> wrote:
>>>
>>>> Hi Thomas,
>>>>
>>>> OB does not sanitize molecules when reading from SMILES (or any other
>>>> format). By default it writes aromatic SMILES, but it sounds like you
>>>> want Kekule SMILES - see obabel -Hsmi for the list of options. In
>>>> this case you want 'k':
>>>>
>>>> $ obabel -:"O=C1C=COC(=C1(O))C" -xk -osmi
>>>> O=C1C=COC(=C1O)C
>>>>
>>>> In Python, this is something like mol.write(opt={"k": True}).
>>>>
>>>> Neither does it add Hs. A SMILES string exactly specifies the number of
>>>> Hs on each atom; this is preserved on reading/writing. If you could provide
>>>> information on a specific case, we could explain what's happening more
>>>> clearly.
>>>>
>>>> Regards
>>>> Noel
>>>>
>>>>
>>>> On Fri, 17 Feb 2023 at 16:18, Thomas <odioidenti...@gmail.com> wrote:
>>>>
>>>>> Is there an option to avoid sanitization of a molecule when reading
>>>>> from SMILES?
>>>>> For example I'd like the SMILES to remain unchanged if I read and
>>>>> write it:
>>>>>
>>>>> mol = pybel.readstring('smi', 'O=C1C=COC(=C1(O))C')
>>>>> mol.write()
>>>>> O=c1ccoc(c1O)C
>>>>>
>>>>> Besides kekulization issues, other unwanted sanitizations are the
>>>>> addition of Hs if I generate the molecule from SMILES fragments (partial
>>>>> SMILES).
>>>>>
>>>>> Thank you
>>>>> Thomas
>>>>> _______________________________________________
>>>>> OpenBabel-discuss mailing list
>>>>> OpenBabel-discuss@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>>>>>
