I have just found that calling kekulize() before match() does the trick,
at least for this example:
f: [#7]-[#6]=[#6]-[#6]=[#6]-[#6]-[#6]-[#6]
m: c1cc(CCNC)ncc1
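
The reason kekulizing can matter may be easier to see in a toy model. The following pure-Ruby sketch does not use OpenBabel and is not its SMARTS matcher; the 0-based atom numbering, the `:aromatic` label, and the helpers `kekule`, `extend_path?` and `path_match?` are all made up for illustration. It treats f as a linear path with explicit single and double bonds and looks for it in m, once with the ring bonds labelled aromatic and once with alternating single/double ring bonds:

```ruby
# Toy graph for m = c1cc(CCNC)ncc1 (hypothetical 0-based numbering).
# Atoms are atomic numbers; bonds are [a, b, label].
ATOMS = [6, 6, 6, 6, 6, 7, 6, 7, 6, 6]
RING  = [[0, 1], [1, 2], [2, 7], [7, 8], [8, 9], [9, 0]]
CHAIN = [[2, 3], [3, 4], [4, 5], [5, 6]].map { |a, b| [a, b, 1] }

# Ring bonds carry one :aromatic label instead of a bond order.
AROMATIC = RING.map { |a, b| [a, b, :aromatic] } + CHAIN

# Alternating double/single ring bonds; offset 0 or 1 picks the form.
def kekule(offset)
  RING.each_with_index.map { |(a, b), i| [a, b, (i + offset).even? ? 2 : 1] } + CHAIN
end

# f = [#7]-[#6]=[#6]-[#6]=[#6]-[#6]-[#6]-[#6] as a linear path.
PAT_ATOMS = [7, 6, 6, 6, 6, 6, 6, 6]
PAT_BONDS = [1, 2, 1, 2, 1, 1, 1]

# Depth-first extension of a partial path by one pattern atom at a time.
def extend_path?(adj, atoms, pat_atoms, pat_bonds, path)
  return true if path.size == pat_atoms.size
  want_atom = pat_atoms[path.size]
  want_bond = pat_bonds[path.size - 1]
  adj[path.last].any? do |nbr, label|
    label == want_bond && atoms[nbr] == want_atom && !path.include?(nbr) &&
      extend_path?(adj, atoms, pat_atoms, pat_bonds, path + [nbr])
  end
end

# True if the linear pattern occurs as a simple path in the graph.
def path_match?(atoms, bonds, pat_atoms, pat_bonds)
  adj = Hash.new { |h, k| h[k] = [] }
  bonds.each do |a, b, label|
    adj[a] << [b, label]
    adj[b] << [a, label]
  end
  atoms.each_index.any? do |start|
    atoms[start] == pat_atoms.first &&
      extend_path?(adj, atoms, pat_atoms, pat_bonds, [start])
  end
end

puts path_match?(ATOMS, AROMATIC, PAT_ATOMS, PAT_BONDS)   # false
puts path_match?(ATOMS, kekule(0), PAT_ATOMS, PAT_BONDS)  # true
puts path_match?(ATOMS, kekule(1), PAT_ATOMS, PAT_BONDS)  # false
```

In this toy model only one of the two alternating assignments contains the path, so whether kekulizing helps may depend on which Kekule form ends up assigned.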

Andreas

Noel O'Boyle wrote on 01/07/2010 01:56 PM:
> Hi Andreas, can you cc this message to the list? It may help to make
> things clearer.
> 
> - Noel
> 
> 2010/1/7 Andreas Maunz <[email protected]>:
>> Hi Noel, thanks for the reply.
>> Actually, the mol is taken from the Blood-Brain-Barrier dataset from
>> Rueckert and Kramer: "Optimizing Feature Sets for Structured Data".
>> (http://www.springerlink.com/content/x22202075j328w46/), so it should be
>> a real molecule.
>> Since, unfortunately, I am not a chemist, I can't tell you about the charge.
>>
>> Here is another example from the same dataset that doesn't work,
>> although it should:
>> f: [#7]-[#6]=[#6]-[#6]=[#6]-[#6]-[#6]-[#6]
>> m: c1cc(CCNC)ncc1
>>
>> It is even more confusing since the different linear fragments occurring
>> in benzene are found without problems:
>>
>> "[#6]-[#6]"
>> Found 3 instances. Here are the atom indices:
>>  Hit 0: [ 1 6 ]
>>  Hit 1: [ 2 3 ]
>>  Hit 2: [ 4 5 ]
>> "[#6]=[#6]"
>> Found 3 instances. Here are the atom indices:
>>  Hit 0: [ 1 2 ]
>>  Hit 1: [ 3 4 ]
>>  Hit 2: [ 5 6 ]
>> "[#6]-[#6]=[#6]"
>> Found 6 instances. Here are the atom indices:
>>  Hit 0: [ 1 6 5 ]
>>  Hit 1: [ 2 3 4 ]
>>  Hit 2: [ 3 2 1 ]
>>  Hit 3: [ 4 5 6 ]
>>  Hit 4: [ 5 4 3 ]
>>  Hit 5: [ 6 1 2 ]
>> "[#6]=[#6]-[#6]"
>> Found 6 instances. Here are the atom indices:
>>  Hit 0: [ 1 2 3 ]
>>  Hit 1: [ 2 1 6 ]
>>  Hit 2: [ 3 4 5 ]
>>  Hit 3: [ 4 3 2 ]
>>  Hit 4: [ 5 6 1 ]
>>  Hit 5: [ 6 5 4 ]
>> "[#6]=[#6]-[#6]=[#6]"
>> Found 3 instances. Here are the atom indices:
>>  Hit 0: [ 1 2 3 4 ]
>>  Hit 1: [ 2 1 6 5 ]
>>  Hit 2: [ 3 4 5 6 ]
>>
>>
>> Andreas
>>
>> Noel O'Boyle wrote on 01/07/2010 01:09 PM:
>>> I find it difficult to work out the PI electron count for these sorts
>>> of systems. Can you confirm that this is a real molecule, and that the
>>> 5-membered ring is uncharged?
>>>
>>> - Noel
>>>
>>> 2010/1/7 Andreas Maunz <[email protected]>:
>>>> Hi all,
>>>>
>>>> In my simple world, a molecule is just a graph with nodes corresponding
>>>> to atomic numbers, while the edges are either single, double, or triple
>>>> (no aromatic bonds). My graph miner reads in molecules as
>>>>
>>>> OBAtomIterator atom;
>>>> mol.BeginAtom(atom);
>>>> do {
>>>>        InputNodeLabel inputnodelabel = (*atom)->GetAtomicNum();
>>>> } while (mol.NextAtom(atom));
>>>>
>>>> Similarly for the edges, i.e. bondorder = (*bond)->GetBondOrder(). It then
>>>> computes all fragments from the graphs that adhere to certain
>>>> conditions, such as minimum frequency.
>>>>
>>>> Now, I want to match the found fragments back to (a subset of) the
>>>> molecules, e.g.:
>>>> m: n1(c2ccccc2)n(c(C)cc1=O)C
>>>> f: [#6]-[#6](=[#6]-[#6]-[#7]-[#6])(-[#7])
>>>> The example above won't match back to the molecule, although my graph
>>>> mining application has found f as a subgraph of m. I use this code (in
>>>> Ruby) to match back:
>>>>
>>>> c=OpenBabel::OBConversion.new
>>>> c.set_in_format 'smi'
>>>> m=OpenBabel::OBMol.new
>>>> c.read_string m, "n1(c2ccccc2)n(c(C)cc1=O)C"
>>>> # Seems necessary: without m.set_aromatic_perceived, even such simple
>>>> # examples as m = c1ccccc1 and f = [#6]=[#6] won't match.
>>>> m.set_aromatic_perceived
>>>> p=OpenBabel::OBSmartsPattern.new
>>>> if !p.init("[#6]-[#6](=[#6]-[#6]-[#7]-[#6])(-[#7])")
>>>>    puts "Error! Smarts pattern invalid."
>>>>    exit
>>>> end
>>>> p.match(m)
>>>>
>>>> My problem is: the last line returns 'false'. What is the problem here?
>>>>
>>>> Greetings
>>>> Andreas
>>>>
>>>> Here is a depiction of the molecule:
>>>> http://www.daylight.com/daycgi/depict?6e31286332636363636332296e28632843296363313d4f2943
>>>>
>>>> --
>>>> http://www.maunz.de
>>>> OpenPGP key: http://www.maunz.de/[email protected]_pub.asc
>>>>
>>>>             Warning: dates in calendar are closer than they appear
>>>>
>>>> ------------------------------------------------------------------------------
>>>> This SF.Net email is sponsored by the Verizon Developer Community
>>>> Take advantage of Verizon's best-in-class app development support
>>>> A streamlined, 14 day to market process makes app distribution fast and easy
>>>> Join now and get one step closer to millions of Verizon customers
>>>> http://p.sf.net/sfu/verizon-dev2dev
>>>> _______________________________________________
>>>> OpenBabel-Devel mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
>>>>
>> --
>> http://www.maunz.de
>> OpenPGP key: http://www.maunz.de/[email protected]_pub.asc
>>
>>   I do know everything, just not all at once. It's a virtual memory
>> problem.
>>
> 

-- 
http://www.maunz.de
OpenPGP key: http://www.maunz.de/[email protected]_pub.asc

   I do know everything, just not all at once. It's a virtual memory
problem.

