Trouble is, you're mixing chemical operations and lexical ones. It
might be handy if this 'just worked' but in practice it's not going to
produce valid SMILES without more work.

I've written code in the past to do this kind of thing for virtual
library building, using dummy atoms to mark link positions in the
fragments, and using Perl code to transform between the dummy atoms
and bond-closure numbers to give text strings which could be assembled
to give valid dot-disconnected SMILES. This required additional
lexical transformations in order to maintain valid SMILES depending on
where the dummy atom was, and to make sure that stereochemistry worked
properly. If you want to do this kind of thing I don't think you can
expect to avoid these additional lexical operations.
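
To give a flavour of the lexical step involved, here is a minimal Python sketch. It is not the Perl code mentioned above, and the function name, the 10-99 label range and the two supported dummy-atom positions are all assumptions made for illustration. It rewrites a mapped dummy atom such as [*:11] into the bond-closure label %11, but only where a closure can legally sit (directly after an atom or after another closure), and it refuses everything else rather than guess. It makes no attempt to preserve stereochemistry.

```python
import re

# A mapped dummy atom written as its own branch, e.g. "([*:11])", or as the
# last atom of the string, e.g. "...C[*:11]". Other positions are not handled.
_DUMMY = re.compile(r"\(\[\*:(\d+)\]\)|\[\*:(\d+)\]$")


def dummies_to_closures(fragment):
    """Rewrite dummy atoms [*:n] in a fragment SMILES as %n bond closures.

    Hypothetical helper: labels must be 10-99 so they cannot collide with
    the single-digit closures already used inside the fragment.
    """
    def repl(match):
        label = int(match.group(1) or match.group(2))
        if not 10 <= label <= 99:
            raise ValueError("label %d is outside 10-99" % label)
        start = match.start()
        prev = fragment[start - 1] if start > 0 else ""
        # A closure may only follow an atom or another closure, never a
        # branch, a bond symbol or the start of the string.
        if not (prev.isalnum() or prev in ("]", "*")):
            raise ValueError("cannot place a closure at position %d" % start)
        return "%" + str(label)

    result = _DUMMY.sub(repl, fragment)
    if "[*:" in result:
        # e.g. a leading dummy atom, which needs a different rewrite
        raise ValueError("unhandled dummy atom in %r" % fragment)
    return result


# Assemble two fragments into one dot-disconnected SMILES.
parts = [dummies_to_closures(s) for s in ("Cl[*:11]", "CC([*:11])O")]
assembled = ".".join(parts)
```

A dummy atom that follows a branch, as in CC(C)[*:11], is rejected here because a closure written after a branch is exactly the kind of string under discussion in this thread. Handling that case, leading dummy atoms and chiral centres is where the additional lexical work comes in.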

I don't think it's reasonable to expect that invalid SMILES strings
should be coerced into giving a particular result for convenience when
(1) they're invalid, and (2) the current behaviour is actually a
reasonable interpretation of the order of connections in the SMILES
(even though they are invalid).

I don't think the current RDKit interpretation of these SMILES should
change, though it might be useful if it could issue a warning that
SMILES of this type are not correct.
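
Until such a warning exists, a rough user-side check is possible. The sketch below assumes that the problematic form is a bond-closure label written after a branch, as in the '[C@]([*:9])1(C)CCO1' fragment quoted below. The function name is hypothetical, the test is purely lexical, and it is not part of RDKit.

```python
import re

# A bond-closure label (a digit or %NN), optionally preceded by a bond
# symbol, appearing directly after a closing parenthesis.
_CLOSURE_AFTER_BRANCH = re.compile(r"\)[-=#$:/\\]?(?:%\d\d|\d)")


def closure_after_branch(smiles):
    """Return True if a bond-closure label directly follows a branch."""
    return _CLOSURE_AFTER_BRANCH.search(smiles) is not None


for smi in ("F9.[C@]91(C)CCO1", "[C@]([*:9])1(C)CCO1"):
    if closure_after_branch(smi):
        print("warning: closure label after a branch in", smi)
```

Only the second string is flagged. The check looks at plain SMILES text only and knows nothing about extensions that use parentheses for other purposes.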

Best regards,
Chris

On 9 November 2017 at 15:09, Brian Cole <col...@gmail.com> wrote:
> Here's an example of why this is useful at maintaining molecular
> fragmentation inside your molecular representation:
>
>>>> from rdkit import Chem
>>>> smiles = 'F9.[C@]91(C)CCO1'
>>>> fluorine, core = smiles.split('.')
>>>> fluorine
> 'F9'
>>>> fragment = core.replace('9', '([*:9])')
>>>> fragment
> '[C@]([*:9])1(C)CCO1'
>>>> mol = Chem.RWMol(Chem.MolFromSmiles(fragment))  # RDKit is flipping the stereo on me here even though the order of the bonds has not changed
>>>> idx = mol.AddAtom(Chem.Atom(0))
>>>> mol.AddBond(idx, 4, Chem.rdchem.BondType.SINGLE)
> 7
>>>> mol.GetAtomWithIdx(idx).SetIntProp("molAtomMapNumber", 8)
>>>> new_core = Chem.MolToSmiles(mol, True)
>>>> new_core = new_core.replace('([*:9])', '9').replace('([*:8])', '8')
>>>> new_core
> 'C[C@]19CC8O1'
>>>> analog_smiles = 'Cl8.' + fluorine + '.' + new_core
>>>> analog_smiles
> 'Cl8.F9.C[C@]19CC8O1'
>>>> analog = Chem.MolFromSmiles(analog_smiles)
>>>> analog.HasSubstructMatch(Chem.MolFromSmiles(smiles), useChirality=True)
>>>> # Uh oh! My original molecule didn't match
> False
>>>> analog.HasSubstructMatch(Chem.MolFromSmiles(smiles.replace('@', '@@')), useChirality=True)  # flipping the stereo of the original causes it to match again
> True
>
>
>
>
> On Thu, Nov 9, 2017 at 4:41 AM, Andrew Dalke <da...@dalkescientific.com>
> wrote:
>>
>> On Nov 9, 2017, at 08:13, Greg Landrum <greg.land...@gmail.com> wrote:
>> > As was discussed in the comments of
>> > https://github.com/rdkit/rdkit/issues/786, I think it's pretty gross that
>> > the second syntax is even legal. But that's a side point.
>>
>> To belabor that point. Neither Daylight SMILES nor OpenSMILES accept it,
>> which are the only two explicit sources of "legal" that people use.
>>
>> "allowed" might be a better term.
>>
>>                                 Andrew
>>                                 da...@dalkescientific.com
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
>
>

