I'm afraid that there's likely to be rather a lot of devil hiding in the
details (as is so often the case).

A simple example of one problem: let's take your [But]O case. Suppose you
do a substructure search for the molecule defined by the SMARTS "OCC". Does
that match "[But]O"?  What does it return when I ask for the substructure
matches (this function, if you aren't familiar with it, returns the indices
of the matching atoms)? What about the SMARTS "CC"?
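
For comparison, here is a minimal sketch of what those two queries return when the molecule is written out in full as "CCCCO". The helper name find_matches is made up for illustration; the RDKit calls are the standard MolFromSmiles, MolFromSmarts and GetSubstructMatches. With every atom present, each match is a plain tuple of atom indices, which is exactly what has no obvious equivalent once four of the atoms are collapsed into "[But]".

```python
from rdkit import Chem


def find_matches(smiles, smarts):
    """Return the substructure matches of `smarts` in `smiles`.

    Each match is a tuple of atom indices in the molecule, ordered
    like the atoms of the query. (Hypothetical helper for illustration.)
    """
    mol = Chem.MolFromSmiles(smiles)
    query = Chem.MolFromSmarts(smarts)
    return mol.GetSubstructMatches(query)


# In "CCCCO" atoms 0-3 are the carbons and atom 4 is the oxygen.
for smarts in ("OCC", "CC"):
    matches = find_matches("CCCCO", smarts)
    print(smarts, len(matches), matches)
```

"OCC" gives a single match covering the oxygen and its two nearest carbons, and "CC" gives one match per C-C bond. Neither answer can be expressed in atom indices if the butyl chain is a single atom.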

One solution to this that works with substructure searching is to have the
molecule contain all the atoms - "CCCCO" in your example - but to have the
four C atoms marked as a group so that drawings of the molecule display
"[But]O". Supporting this type of functionality is on the To Do list (it's
part of supporting S Groups from Mol files).
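
Until that is available, the idea can be roughly approximated by hand. The sketch below keeps all five atoms and tags the four carbons with an atom property. The property name "groupLabel" and both helper functions are assumptions invented for this example, not an RDKit convention, and nothing in the drawing code knows about them. It only shows that substructure searching keeps working while each matched atom can still be traced back to its group.

```python
from rdkit import Chem

# Hypothetical property name chosen for this sketch; RDKit attaches no meaning to it.
GROUP_PROP = "groupLabel"


def tag_group(mol, atom_indices, label):
    """Mark the given atoms as members of a named group."""
    for idx in atom_indices:
        mol.GetAtomWithIdx(idx).SetProp(GROUP_PROP, label)


def labelled_matches(mol, smarts):
    """Return (match, labels) pairs; labels[i] is the group of match[i] or None."""
    query = Chem.MolFromSmarts(smarts)
    results = []
    for match in mol.GetSubstructMatches(query):
        labels = []
        for idx in match:
            atom = mol.GetAtomWithIdx(idx)
            labels.append(atom.GetProp(GROUP_PROP) if atom.HasProp(GROUP_PROP) else None)
        results.append((match, tuple(labels)))
    return results


mol = Chem.MolFromSmiles("CCCCO")
tag_group(mol, range(4), "But")  # the four carbons form the butyl group

for match, labels in labelled_matches(mol, "OCC"):
    print(match, labels)
```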

If you just want to indicate that there is a [But] group there but not
really do anything with the group's structure, there are probably already
ways to handle this using dummy atoms and custom labels.
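
A minimal sketch of the dummy-atom approach, under some assumptions: the group is written as "*" in the SMILES, and the label is stored in the "atomLabel" atom property, which more recent RDKit releases are understood to use when drawing (whether a given version honours it should be checked). The helper has_match is made up for this example. The point is the trade-off: the label travels with the atom, but the group's internal structure is invisible to substructure queries.

```python
from rdkit import Chem

# "*" is a dummy atom (atomic number 0) standing in for the butyl group.
mol = Chem.MolFromSmiles("*O")
dummy = mol.GetAtomWithIdx(0)

# Assumption: "atomLabel" is the property the drawing code reads for custom labels.
dummy.SetProp("atomLabel", "But")


def has_match(mol, smarts):
    """True if the SMARTS pattern matches anywhere in the molecule (illustrative helper)."""
    return mol.HasSubstructMatch(Chem.MolFromSmarts(smarts))


print(dummy.GetAtomicNum(), dummy.GetProp("atomLabel"))
print("CC matches:", has_match(mol, "CC"))        # the carbons are not there to match
print("[#0]O matches:", has_match(mol, "[#0]O"))  # the dummy atom itself can be queried
```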

-greg




On Wed, Sep 27, 2017 at 9:26 PM, Kovas Palunas <kovas.palu...@arzeda.com>
wrote:

> Ideally, I'd like to treat these pseudoatoms as similarly to normal atoms
> as possible.  I would mostly want to use them for substructure matching,
> running reactions, and also display purposes.  Also, basic atom queries,
> such as getting a mapping number or an atom symbol.
>
> I was thinking that maybe this could be done by just defining the CoA atom
> type (for example) just as the carbon or oxygen atom types are defined
> (setting atomic weight, valences, etc.).
>
> Does this make sense?
>
>  - Kovas
> ------------------------------
> *From:* Greg Landrum <greg.land...@gmail.com>
> *Sent:* Wednesday, September 27, 2017 2:27:04 AM
> *To:* Kovas Palunas
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] Masking groups as atoms in RDKit
>
> Where would you want to use this?
> Is it for depiction (i.e. drawing molecules) or something else?
>
> -greg
>
>
> On Tue, Sep 26, 2017 at 10:12 PM, Kovas Palunas <kovas.palu...@arzeda.com>
> wrote:
>
>> Hi all,
>>
>>
>> Has anyone tried implementing or using a group to atom masking strategy
>> in RDKit?  By this I mean taking a piece of a molecule and representing it
>> as a single atom.  Here is an example:
>>
>>
>> CCCCO  could be represented as  [But]O, where the atom [But] represents
>> the four carbon chain.
>>
>>
>> In my case I'm particularly interested in using this strategy to
>> represent large biological molecules / molecule pieces, such as coenzyme A.
>>
>>
>>
>> If I were to implement this myself, is there a place in RDKit where atom
>> types can be defined?
>>
>>
>> Thanks!
>>
>>
>>  - Kovas
>>
>>
>> ------------------------------------------------------------
>> ------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>
