On Tue, Sep 27, 2011 at 3:37 PM, Chris Morley <c.mor...@gaseq.co.uk> wrote:
> On 26/09/2011 11:50, Martin Guetlein wrote:
> > On Mon, Sep 19, 2011 at 11:23 AM, Martin Guetlein <martin.guetl...@googlemail.com> wrote:
> >
> >> On Mon, Sep 19, 2011 at 11:04 AM, Chris Morley <c.mor...@gaseq.co.uk> wrote:
> >>> On 19/09/2011 09:32, Martin Guetlein wrote:
> >>>> Hi,
> >>>>
> >>>> I would like to do SMARTS matching with OpenBabel, and for some
> >>>> reason I am restricted to using the command line interface.
> >>>> Is there an efficient way to match a range of SMARTS strings against a
> >>>> dataset?
> >>>>
> >>>> What I found out is that I could use the "-s" option, but I would have
> >>>> to call babel once for each smarts, right?
> >>> If the SMARTS patterns are sufficiently simple to be SMILES, then they
> >>> can all be put in a file whose name is used with the -s option. The
> >>> matching is OR - any single match will do. The file could be in any
> >>> molecular format, but must have the appropriate extension.
> >>
> >
> > I would have to run this once for each smarts, right? (This would produce
> > quite a file parsing overhead.)
>
> We are not quite understanding each other. Here is what I meant; I
> hope it is what you want. There is a dataset (in any format) and you want to
> extract from it the molecules that match at least one of a set of
> structures in a file patterns.yyy (any format). The output goes to
> results.zzz. You do a single command:
> obabel dataset.xxx -O results.zzz -s patterns.yyy
> The isomorphism filters are constructed once and applied for each input
> molecule, but you don't have to worry about this.
>
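For anyone calling this from another program, the single command above can be assembled as an argument list. This is only a sketch in Python: the helper names build_or_match_command and run_or_match are made up for illustration, the file names are the placeholders from the message, and it assumes obabel is on the PATH.

```python
import subprocess


def build_or_match_command(dataset, patterns, output):
    """Return the argument list for one OR-style substructure filter run.

    Mirrors: obabel dataset.xxx -O results.zzz -s patterns.yyy
    (hypothetical helper; the file names are placeholders).
    """
    return ["obabel", str(dataset), "-O", str(output), "-s", str(patterns)]


def run_or_match(dataset, patterns, output):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    cmd = build_or_match_command(dataset, patterns, output)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.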
Not quite. This is what I was looking for:
I have a dataset (in some format) with n compounds. I have another dataset
containing m SMARTS. I want a matrix of size n x m, where the entry in
row i, column j is:
1 : SMARTS j matches compound i
0 : otherwise
If I have to use the -s option, I only get one column of the matrix at
a time (and this requires quite some parsing work).
Defining my own SMARTS patterns via plugindefines.txt did the job (I had the
access problem on Windows, though).
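To make the target structure concrete, here is a minimal Python sketch of assembling such a matrix once the matches are known. It does not call OpenBabel at all: the function name match_matrix and the sample titles are hypothetical, and it assumes each compound has a unique title and that the titles of the compounds matching each SMARTS have already been collected somehow (from separate -s runs or from fingerprint bits).

```python
def match_matrix(compounds, matches_per_smarts):
    """Build an n x m matrix of 0/1 values.

    compounds: list of n compound titles (assumed unique).
    matches_per_smarts: list of m collections, one per SMARTS, each holding
        the titles of the compounds that matched that SMARTS.
    Row i, column j is 1 if SMARTS j matched compound i, else 0.
    """
    columns = [set(matched) for matched in matches_per_smarts]
    return [[1 if title in col else 0 for col in columns] for title in compounds]


# Made-up example data: three compounds, three SMARTS.
compounds = ["mol1", "mol2", "mol3"]
matches = [["mol1", "mol3"], [], ["mol2"]]
matrix = match_matrix(compounds, matches)
# matrix == [[1, 0, 0], [0, 0, 1], [1, 0, 0]]
```

Each inner list of the result is one compound (row); each position within it is one SMARTS (column).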
> >>>> A second option I found would be to edit the patterns.txt file and use
> >>>> the FP3 fingerprint ("-ofpt -xfFP3 -xs"). I don't like the idea of
> >>>> editing a file in advance, though (The smarts matching is part of some
> >>>> software I am working on). Is there a way to provide an own patterns
> >>>> file?
> >>>
> >>> You can make your own fingerprint by making a data file with SMARTS
> >>> patterns, like patterns.txt (or with a couple of other slightly
> >>> different formats) and specifying the details by making an entry in a
> >>> plugindefines.txt. The MACCS fingerprint is defined there in this way.
> >>
> >> This looks good, I will give it a try, thanks,
> >> Martin
> >>
> >
> >
> > Hmm, it worked fine on Linux, but not on Windows. The plugindefines.txt
> > is in a folder that requires admin rights to be edited (C:\Program
> > Files\OpenBabel-2.3.0\data).
> > Is there a way around that?
>
> 1) Copy the folder to somewhere accessible and set the environment
> variable BABEL_DATADIR to the new location. Do this via Control
> Panel/System and Security/System/Advanced system settings (in left
> panel)/Environment Variables (button). You should see BABEL_DATADIR in
> the top box.
Great, I think that will do. I will check immediately.
Thanks a lot,
Martin
> or 2) Re-install OpenBabel to a different folder. (In the next version
> the Windows Installer will use a different default location.)
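Since the matching is part of a larger program, option 1 can also be done per process instead of through the Control Panel. The following Python sketch is one way to do that. The helper names make_private_datadir and env_with_datadir are made up for illustration, and only the BABEL_DATADIR variable name comes from the advice above.

```python
import os
import shutil
from pathlib import Path


def make_private_datadir(source_dir, target_dir):
    """Copy the OpenBabel data folder to a writable location.

    shutil.copytree raises FileExistsError if target_dir already exists.
    """
    shutil.copytree(source_dir, target_dir)
    return Path(target_dir)


def env_with_datadir(datadir, base_env=None):
    """Return a copy of the environment with BABEL_DATADIR set."""
    env = dict(os.environ if base_env is None else base_env)
    env["BABEL_DATADIR"] = str(datadir)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when starting obabel, so the system-wide settings stay untouched.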
> >>
> >>
> >>>
> >>> For extra flexibility you could define a compound filter to be used with
> >>> the --filter option, where SMARTS tests can be combined with conditions
> >>> based on other molecular (or SDF type) properties.
> >>
> >
> > Same as for the -s option here, I have to run it once for each smarts,
> > right?
>
> You can run several SMARTS filters in the same command. Suppose you want
> to only convert molecules that have a pyridine ring or have a bromine atom:
> obabel dataset.xxx -O results.zzz --filter "s=n1ccccc1 || s=[Br]"
>
> There seems to be a glitch with the compound filter, which would combine
> these. I'll try to sort it out.
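When the list of SMARTS comes from a program, the OR expression in the example above can be generated rather than typed. This Python sketch simply copies the s=PATTERN || s=PATTERN form shown in the message. The function names are hypothetical, whether individual patterns need extra quoting is not checked here, and the glitch mentioned above may still apply.

```python
def build_filter_expression(smarts_list):
    """Join SMARTS tests with || in the form shown in the message."""
    if not smarts_list:
        raise ValueError("at least one SMARTS pattern is required")
    return " || ".join(f"s={smarts}" for smarts in smarts_list)


def build_filter_command(dataset, output, smarts_list):
    """Return the obabel argument list using the --filter option."""
    expression = build_filter_expression(smarts_list)
    return ["obabel", str(dataset), "-O", str(output), "--filter", expression]
```

As with -s, this keeps a molecule if any of the patterns matches, so it still does not say which pattern matched.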
>
> Chris
>
>
> ------------------------------------------------------------------------------
> All the data continuously generated in your IT infrastructure contains a
> definitive record of customers, application performance, security
> threats, fraudulent activity and more. Splunk takes this data and makes
> sense of it. Business sense. IT sense. Common sense.
> http://p.sf.net/sfu/splunk-d2dcopy1
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
--
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)761 203 7633 (office)
+49 (0)177 623 9499 (mobile)
Email:
guetl...@informatik.uni-freiburg.de