Thanks Ivan -- very helpful.

Is there any consensus on idioms for identifying multiple moieties in the
same fragment?  Do I have to use len(mol.GetSubstructMatches(patt)) > 1 as
a selector and then run a graph traversal to see whether any of the
matches are covalently connected?
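
For reference, one way this could be sketched (not an established idiom):
Chem.GetMolFrags(mol) returns one tuple of atom indices per covalently
connected fragment, so each match from GetSubstructMatches can be assigned
to a fragment without a hand-written traversal. The helper names below are
made up for illustration, and the sketch assumes every pattern is itself
connected, so the first atom of a match identifies its fragment.

```python
from collections import Counter


def _fragment_lookup(frags):
    """Map every atom index to the index of the fragment containing it."""
    return {atom: i for i, frag in enumerate(frags) for atom in frag}


def count_matches_per_fragment(frags, matches):
    """Count substructure matches per connected fragment.

    frags   -- tuple of atom-index tuples, one per fragment
               (the shape returned by Chem.GetMolFrags(mol)).
    matches -- tuple of atom-index tuples, one per match
               (the shape returned by mol.GetSubstructMatches(patt)).
    """
    frag_of_atom = _fragment_lookup(frags)
    counts = Counter()
    for match in matches:
        # Assumption: the pattern is connected, so the whole match sits
        # in the fragment that holds its first atom.
        counts[frag_of_atom[match[0]]] += 1
    return counts


def fragments_hit_by_all(frags, *match_lists):
    """Return the fragment indices containing at least one match of
    every pattern (one match list per pattern)."""
    frag_of_atom = _fragment_lookup(frags)
    hit_sets = [{frag_of_atom[m[0]] for m in matches} for matches in match_lists]
    return set.intersection(*hit_sets) if hit_sets else set()


def has_fragment_with_multiple_hits(smiles, smarts):
    """Hypothetical RDKit glue: True if any single fragment contains
    more than one match of the SMARTS pattern."""
    from rdkit import Chem  # imported lazily; the helpers above are stdlib-only

    mol = Chem.MolFromSmiles(smiles)
    patt = Chem.MolFromSmarts(smarts)
    counts = count_matches_per_fragment(
        Chem.GetMolFrags(mol), mol.GetSubstructMatches(patt)
    )
    return any(n > 1 for n in counts.values())
```

With these helpers, the [Cl] and [c] matches of benzyl chloride
(ClCc1ccccc1) land in the same fragment, while for aniline hydrochloride
(Cl.Nc1ccccc1) they land in different fragments, so fragments_hit_by_all
returns an empty set.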

On Sat, Mar 7, 2020 at 3:34 PM Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:

> Hi Curt,
>
> According to
> https://www.rdkit.org/docs/RDKit_Book.html#smarts-support-and-extensions ,
> it's not supported:
>
>> Here’s the (hopefully complete) list of SMARTS features that are *not*
>> supported:
>>
>>    - Non-tetrahedral chiral classes
>>    - the @? operator
>>    - explicit atomic masses (though isotope queries are supported)
>>    - component level grouping requiring matches in different components,
>>      i.e. (C).(C)
>
> OK, the way it's worded it sounds like (C.C) might be supported (since
> that would be requiring matches in the same component), but as you've seen,
> it isn't supported either...
>
> Ivan
>
>
> On Sat, Mar 7, 2020 at 4:58 PM Curt Fischer <curt.r.fisc...@gmail.com>
> wrote:
>
>> Hi rdkit friends!
>>
>> The [Daylight SMARTS example page](
>> https://daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html)
>> gives several examples for "multiple group" smarts, including these strings:
>>
>> ([Cl!$(Cl~c)].[c!$(c~Cl)])
>> ([Cl]).([c])
>> ([Cl].[c])
>> [NX3;H2,H1;!$(NC=O)].[NX3;H2,H1;!$(NC=O)]
>>
>> In general, I cannot get these to be parsed by Chem.MolFromSmarts().
>>
>> For example,  Chem.MolFromSmarts('([Cl!$(Cl~c)].[c!$(c~Cl)])') gives me
>> this error message:
>>
>> ```
>> [13:01:41] SMARTS Parse Error: syntax error while parsing:
>> ([Cl!$(Cl~c)_100].[c!$(c~Cl)_101])
>> [13:01:41] SMARTS Parse Error: Failed parsing SMARTS
>> '([Cl!$(Cl~c)_100].[c!$(c~Cl)_101])' for input: '([Cl!$(Cl~c)].[c!$(c~Cl)])'
>> ```
>> My understanding of SMARTS is that the outermost parentheses in this
>> SMARTS string are required to force the chlorine and the aromatic carbon to
>> be somewhere in the same covalently connected fragment.  E.g. this pattern
>> *should* hit benzyl chloride ClCc1ccccc1 but should *not* hit the
>> hydrochloride salt of aniline Cl.Nc1ccccc1.
>>
>> What am I getting wrong?  Is there a way to write rdkit-parsable SMARTS
>> that achieves this?  (I want to filter out molecules that contain more than
>> one of certain moieties, while allowing molecules that have one (or zero)
>> such moieties.  But salts or covalently disconnected fragments that each
>> contain one instance of the moiety should be fine.)
>>
>> Details on my setup:
>>
>> - RDKit Version: 2019.09.3
>> - Operating system: macOS 10.15.2
>> - Python version (if relevant): 3.6
>> - Are you using conda? yes
>> - If you are using conda, which channel did you install the rdkit from?
>> `conda-forge`
>> - If you are not using conda: how did you install the RDKit?
>>
>> Curt
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
