The next quarterly release is coming soon, which was a good incentive for
me to work on the new MCS code for RDKit.
I checked in the actual MCS code a few weeks ago, and Greg's reviewed it at
least lightly. This evening I checked in the test cases. They are at:
http://sourceforge.net/p/rdkit/code/2197/tree/trunk/rdkit/Chem/MCS.py
http://sourceforge.net/p/rdkit/code/2197/tree/trunk/rdkit/Chem/UnitTestMCS.py
Here's the API and documentation.
def FindMCS(mols, min_num_atoms=2,
            maximize = Default.maximize,
            atom_compare = Default.atom_compare,
            bond_compare = Default.bond_compare,
            match_valences = Default.match_valences,
            ring_matches_ring_only = False,
            complete_rings_only = False,
            timeout=Default.timeout,
            ):
Find the maximum common substructure of a set of molecules
@type mols: molecule iterator
@param mols: find the MCS of these molecules
@type min_num_atoms: integer
@param min_num_atoms: The minimum number of atoms which must be in the MCS.
The minimum value is 2.
@type maximize: atoms or bonds
@param maximize: The default atoms maximizes the number of atoms in
the MCS. Use bonds to maximize the number of bonds instead.
@type atom_compare: any, elements, or isotopes
@param atom_compare: Specify the atom comparison function. The default
elements says that two atoms are the same if and only if they have the
same element number. Use isotopes if you are using isotope labels to
define your own atom classes. With any, all atoms match each other.
@type bond_compare: any or bondtypes
@param bond_compare: Specify the bond comparison function. The default
bondtypes says that two bonds are the same if and only if they
have the same bond type. With any, all bonds match each other.
@type match_valences: boolean
@param match_valences: If True, atoms must also have matching valences
to match. By default this is False.
@type ring_matches_ring_only: boolean
@param ring_matches_ring_only: If True, then both bonds must either be in
a ring or not in a ring in order to match. By default this is
False.
@type complete_rings_only: boolean
@param complete_rings_only: If True, then whenever a ring bond of a molecule
is in the MCS, the corresponding MCS bond must also be in a ring. By
default this is False.
@type timeout: float
@param timeout: stop search after 'timeout' seconds and report the current
best MCS.
@rtype: MCSResult
@return: Information about the MCS search results. Attributes are
'completed' (0 if timeout reached, otherwise 1), 'num_atoms',
'num_bonds', and 'smarts'.
Is that too unreadable? I can rewrite that to be more prose than
this sort of auto-documentation format.
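
To make the return value documented above a bit more concrete, here is a
minimal sketch of how a caller might report a search result. It does not
import RDKit: FakeMCSResult and summarize are hypothetical names used only
for illustration, and the namedtuple is merely a stand-in that carries the
four documented attributes. The real MCSResult in MCS.py may differ in
detail.

```python
from collections import namedtuple

# Hypothetical stand-in for MCSResult: it only mirrors the four attributes
# named in the docstring ('completed', 'num_atoms', 'num_bonds', 'smarts').
FakeMCSResult = namedtuple(
    "FakeMCSResult", ["completed", "num_atoms", "num_bonds", "smarts"])


def summarize(result):
    """Return a one-line summary of an MCS search result."""
    # 'completed' is 0 if the timeout was reached, otherwise 1.
    if result.completed:
        status = "complete"
    else:
        status = "timed out (best so far)"
    return "%s: %d atoms, %d bonds, SMARTS %s" % (
        status, result.num_atoms, result.num_bonds, result.smarts)


finished = FakeMCSResult(1, 6, 6, "[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1")
partial = FakeMCSResult(0, 2, 1, "[#6]-[#6]")
print(summarize(finished))
print(summarize(partial))
```

The point of checking 'completed' first is that a timed-out search still
reports a usable answer, just not one that is guaranteed to be maximal.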
I wasn't sure of how/where to update the documentation. Greg? Do you have
an idea of where it might go? Or perhaps want to do it yourself?
In other news, I'll be presenting the MCS work at the Goslar conference
this fall, and I've got some funding to work on finding a threshold MCS,
that is, the maximum common substructure which is present in at least some
threshold percentage of the molecules in the set.
Can anyone suggest a good name for that? Threshold MCS? Something else?
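
To pin down what I mean by the threshold condition, here is a rough sketch.
It is only an illustration of the acceptance test, not the search algorithm,
and none of it is in the checked-in code: min_matches_needed and
meets_threshold are hypothetical names, and the "molecules" are plain
strings with substring containment standing in for a real substructure
match.

```python
import math


def min_matches_needed(num_mols, threshold):
    """Number of molecules a substructure must occur in to meet the threshold.

    'threshold' is a fraction in (0, 1]; 1.0 corresponds to the ordinary MCS,
    where the substructure must be in every molecule.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in the range (0, 1]")
    return int(math.ceil(threshold * num_mols))


def meets_threshold(candidate, mols, threshold, contains):
    """True if 'candidate' is contained in enough of 'mols'.

    'contains(mol, candidate)' is supplied by the caller; in real use it
    would be a substructure test.
    """
    hits = sum(1 for mol in mols if contains(mol, candidate))
    return hits >= min_matches_needed(len(mols), threshold)


# Toy data: strings, with substring containment as a stand-in for matching.
toy_mols = ["CCO", "CCN", "CCC", "OCN"]


def substring_match(mol, sub):
    return sub in mol


print(meets_threshold("CC", toy_mols, 0.75, substring_match))
print(meets_threshold("CC", toy_mols, 1.0, substring_match))
```

Here "CC" occurs in three of the four toy strings, so it passes a 75%
threshold but fails at 100%.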
Cheers,
Andrew
da...@dalkescientific.com
___
Rdkit-devel mailing list
Rdkit-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-devel