I am sorry to pick up on this again - but I still cannot get it to work.
I fixed the SMARTS definition to something obvious - just mark all
aliphatic nitrogens as acceptors.
When the order in the definition file is
AtomType NAcceptor [N]
AtomType NAcceptor [n;+0;!X3;!$([n;H1](cc)cc)]
Debugging with your suggested method gives:
Acceptor.SingleAtomAcceptor [$([N,$([n;+0;!X3;!$([n;H1](cc)cc)])])]
which looks good.
But when I invert the order of the two I get:
Acceptor.SingleAtomAcceptor [$([n;+0;!X3;!$([n;H1](cc)cc),$([N])])]
This is NOT the same as the above, because ";" (low-precedence AND) binds
more loosely than "," (OR). So the first bit to be evaluated is the
spurious clause !$([n;H1](cc)cc),$([N]), and only then is the result ANDed
with the other terms (n, +0 and !X3). I think. This of course gives me no
acceptor points, presumably because an aliphatic N can never satisfy the
leading "n" (aromatic nitrogen) term.
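
To make the grouping concrete, here is a minimal sketch in plain Python (it does not use RDKit and is not a SMARTS parser). It only splits the contents of a bracket atom on top-level ";" and then on top-level ",", which mirrors the relative precedence of those two operators. The helper names split_top_level and group_atom_expression are made up for this illustration, and the sketch ignores "&" and "!" entirely.

```python
def split_top_level(expr, sep):
    """Split expr on sep, ignoring separators nested inside () or []."""
    parts, depth, current = [], 0, []
    for ch in expr:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def group_atom_expression(expr):
    """Return the AND-ed clauses of expr, each as a list of OR-ed alternatives."""
    return [split_top_level(clause, ",") for clause in split_top_level(expr, ";")]


# Contents of the inner bracket atom from the inverted-order SMARTS.
inverted = "n;+0;!X3;!$([n;H1](cc)cc),$([N])"
groups = group_atom_expression(inverted)

# Each printed line is one clause; all clauses must hold at the same time.
for alternatives in groups:
    print(" OR ".join(alternatives))
```

The last clause comes out as the pair !$([n;H1](cc)cc) OR $([N]), while "n", "+0" and "!X3" remain separate mandatory clauses, which is the grouping described above.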
The correct way to write this would *perhaps* be:
Acceptor.SingleAtomAcceptor [$([n;+0;!X3;!$([n;H1](cc)cc)]),$([N])]
-
Jean-Paul Ebejer
Early Stage Researcher
On 15 August 2012 04:49, Greg Landrum <[email protected]> wrote:
> On Tue, Aug 14, 2012 at 11:43 AM, JP <[email protected]> wrote:
> >
> > Anyway enough of the blabber. I am using the feature definition file
> > in RDKit and was wondering why the order of the rules in the file
> > makes a difference.
> >
> > So
> >
> > AtomType NAcceptor C[N;H0]=C
> > AtomType NAcceptor [N&v3;H0;$(Nc)]
> >
> > Gives different results than
> >
> > AtomType NAcceptor [N&v3;H0;$(Nc)]
> > AtomType NAcceptor C[N;H0]=C
> >
> > These are different rules affecting different chemotypes... why does
> > the above find the CN=C acceptor feature and the below does not?
>
> The short answer is that you're using the wrong SMARTS. An AtomType
> definition should match a single Atom. What I think you mean here is:
>
> AtomType NAcceptor [N&v3;H0;$(Nc)]
> AtomType NAcceptor [$(N(C)=C)]
>
> Here's a demonstration that using this makes the order dependence go away:
>
> In [31]: fdf="""AtomType NAcceptor3 [N&v3;H0;$(Nc)]
> ....: AtomType NAcceptor3 [$(N(C)=C)]
> ....: DefineFeature SingleAtomAcceptor3 [{NAcceptor3}]
> ....: Family Acceptor3
> ....: Weights 1
> ....: EndFeature
> ....:
> ....: AtomType NAcceptor4 [$(N(C)=C)]
> ....: AtomType NAcceptor4 [N&v3;H0;$(Nc)]
> ....: DefineFeature SingleAtomAcceptor4 [{NAcceptor4}]
> ....: Family Acceptor4
> ....: Weights 1
> ....: EndFeature
> ....: """
>
> In [32]: m = Chem.MolFromSmiles('CN=C')
>
> In [33]: ff = AllChem.BuildFeatureFactoryFromString(fdf)
>
> In [34]: feats=ff.GetFeaturesForMol(m)
>
> In [35]: [x.GetFamily() for x in feats]
> Out[35]: ['Acceptor3', 'Acceptor4']
>
> Hopefully that gets your code working. You may want to stop reading here.
> :-)
>
>
> Here's what happens when I do the same thing with your definitions:
>
> In [36]: fdf="""AtomType NAcceptor1 C[N;H0]=C
> ....: AtomType NAcceptor1 [N&v3;H0;$(Nc)]
> ....: DefineFeature SingleAtomAcceptor1 [{NAcceptor1}]
> ....: Family Acceptor1
> ....: Weights 1
> ....: EndFeature
> ....:
> ....: AtomType NAcceptor2 [N&v3;H0;$(Nc)]
> ....: AtomType NAcceptor2 C[N;H0]=C
> ....: DefineFeature SingleAtomAcceptor2 [{NAcceptor2}]
> ....: Family Acceptor2
> ....: Weights 1
> ....: EndFeature
> ....: """
>
> In [37]: ff = AllChem.BuildFeatureFactoryFromString(fdf)
>
> In [38]: feats=ff.GetFeaturesForMol(m)
>
> In [39]: [x.GetFamily() for x in feats]
> Out[39]: ['Acceptor1']
>
> This is the behavior you were seeing.
>
> To understand why this happens, you need to look at the SMARTS that
> ends up being produced for each of your feature definitions:
>
> In [40]: for k,v in ff.GetFeatureDefs().iteritems(): print k,v
> Acceptor1.SingleAtomAcceptor1 [$(C[N;H0,$([N&v3;H0;$(Nc)])]=C)]
> Acceptor2.SingleAtomAcceptor2 [$([N&v3;H0;$(Nc),$(C[N;H0]=C)])]
>
> The fdef parser combines the different atom type definitions with each
> other using simple string manipulations, on the assumption that each
> one defines a single atom. It's really expecting your AtomType
> definition to start and end with a square bracket. It should be
> testing for that, but it's not.
>
> -greg
>
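
The string manipulation described in the quoted reply can be imitated with a few lines of plain Python. This is only a toy model inferred from the SMARTS strings printed in this thread, not RDKit's actual parser code: it assumes that each additional AtomType line is spliced in as ",$(...)" in front of the last "]" of the existing definition, and that the feature pattern [{Name}] expands to [$(definition)]. The function names add_alternative and feature_smarts are hypothetical.

```python
def add_alternative(existing, new):
    """Splice ',$(new)' in front of the last ']' of the existing definition."""
    cut = existing.rfind("]")
    if cut == -1:
        raise ValueError("definition contains no ']' to splice into")
    return existing[:cut] + ",$(" + new + ")" + existing[cut:]


def feature_smarts(definitions):
    """Combine AtomType definitions in file order and wrap them as [$(...)]."""
    combined = definitions[0]
    for definition in definitions[1:]:
        combined = add_alternative(combined, definition)
    return "[$(" + combined + ")]"


aromatic = "[n;+0;!X3;!$([n;H1](cc)cc)]"

# The two orders from this message.
n_first = feature_smarts(["[N]", aromatic])
n_last = feature_smarts([aromatic, "[N]"])

# The two orders from the original question (first one is not a single atom).
chain_first = feature_smarts(["C[N;H0]=C", "[N&v3;H0;$(Nc)]"])
chain_last = feature_smarts(["[N&v3;H0;$(Nc)]", "C[N;H0]=C"])

for smarts in (n_first, n_last, chain_first, chain_last):
    print(smarts)
```

Under these assumptions the sketch reproduces all four combined SMARTS strings shown in this thread, including the case where the new alternative lands inside the [N;H0] atom of C[N;H0]=C, and the case where it lands after the trailing ";" clause of the first definition.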
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss