It would be interesting to take a set of compounds and look at a correlation
matrix of the fingerprint bits. Maybe one can identify a set of "generally"
discriminating bits that can be used for screening? Probably not, but it could
be worth a try ... memory use would then go down while discriminating power
goes up?
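
A minimal sketch of that idea, assuming each fingerprint is available as an equal-length list of 0/1 values. The function names and the 0.5 correlation cutoff are hypothetical choices for illustration, not anything from the RDKit:

```python
from math import sqrt

def bit_correlation_matrix(fps):
    """Pearson (phi) correlation between bit columns of 0/1 fingerprints."""
    n = len(fps)
    nbits = len(fps[0])
    means = [sum(fp[j] for fp in fps) / n for j in range(nbits)]
    corr = [[0.0] * nbits for _ in range(nbits)]
    for a in range(nbits):
        for b in range(a, nbits):
            cov = sum((fp[a] - means[a]) * (fp[b] - means[b]) for fp in fps) / n
            var_a = means[a] * (1 - means[a])
            var_b = means[b] * (1 - means[b])
            # Constant bits have zero variance; report 0 correlation for them.
            r = cov / sqrt(var_a * var_b) if var_a > 0 and var_b > 0 else 0.0
            corr[a][b] = corr[b][a] = r
    return corr

def pick_bits(fps, max_abs_corr=0.5):
    """Greedily keep non-constant bits weakly correlated with those already kept."""
    corr = bit_correlation_matrix(fps)
    n = len(fps)
    nbits = len(fps[0])
    density = [sum(fp[j] for fp in fps) / n for j in range(nbits)]
    # Consider bits set in about half of the compounds first.
    order = sorted(range(nbits), key=lambda j: abs(density[j] - 0.5))
    kept = []
    for j in order:
        if density[j] in (0.0, 1.0):
            continue  # a bit that never varies cannot discriminate
        if all(abs(corr[j][k]) <= max_abs_corr for k in kept):
            kept.append(j)
    return sorted(kept)

# Toy data: bits 0 and 1 are identical, bit 3 is always set.
fps = [[1, 1, 0, 1], [1, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
selected = pick_bits(fps)
```

On the toy data the redundant copy (bit 1) and the constant bit (bit 3) are dropped. Whether a bit subset chosen this way on one compound set stays discriminating on another is exactly the open question.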

Nik




Andrew Dalke <da...@dalkescientific.com> 
12.02.2009 14:56

To: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] Optimizing SSS in the RDKit






On Feb 12, 2009, at 8:46 AM, Greg Landrum wrote:
> I'm either not understanding completely or I disagree. The queries
> were constructed by fragmenting the molecules I searched through, so
> I'd expect lots of substructure hits (and a lower screen-out rate than
> arbitrary queries against arbitrary molecules).

Ahh, of course.

But I don't think fingerprint screens give, say, 0.001% false positive
rates. I think they are more in line with what you found. But if the bit
distributions were really uncorrelated for molecules where one is
not a substructure of the other, then I would expect extremely
low false positive rates. 2048 bits should give a lot of
discrimination power if the bits weren't correlated.
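
To put a rough number on that argument, here is a small sketch. It assumes fingerprints stored as Python integers, a screen that passes a target when every query bit is also set in the target, and fully independent bits with a single average density. The function names and the example density are hypothetical:

```python
def passes_screen(query_fp: int, target_fp: int) -> bool:
    """A target survives the screen only if it has every bit the query has."""
    return (query_fp & target_fp) == query_fp

def independent_pass_probability(target_density: float, n_query_bits: int) -> float:
    """Chance that a random target passes, ASSUMING independent bits.

    Each of the n_query_bits must be set in the target, and under
    independence each one is set with probability target_density.
    """
    return target_density ** n_query_bits

# Hypothetical numbers: 30% of target bits set, 20 bits set in the query.
estimate = independent_pass_probability(0.3, 20)
```

With those assumed numbers the estimate is about 3.5e-11, far below 0.001% (1e-5). Observed false positive rates that are much higher than this would be consistent with the bits being strongly correlated.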

> That's a good idea to add to the list of things to look into. It's
> also relatively easy to do because it probably just involves
> increasing the minimum path length included in fingerprints (at least
> as a first step).

Again, I don't have experience with that, but it means
that there's less ability to handle unlikely atom types.
Then again, the larger subgraphs will still include them. I don't know.
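
A toy illustration of that trade-off, assuming a molecule given as a list of atom labels plus a list of bonded index pairs. This is not the RDKit fingerprinter (there the corresponding knobs are the `minPath` and `maxPath` arguments of `Chem.RDKFingerprint`); the helper names, the CRC32 hashing and the 64-bit size are made up for the sketch:

```python
from zlib import crc32

def enumerate_paths(atoms, bonds, min_len, max_len):
    """Set of linear paths (tuples of atom labels) with min_len..max_len bonds."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    found = set()

    def extend(path):
        n_bonds = len(path) - 1
        if n_bonds >= min_len:
            labels = tuple(atoms[i] for i in path)
            # A path and its reverse are the same feature; keep one form.
            found.add(min(labels, labels[::-1]))
        if n_bonds < max_len:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return found

def path_fingerprint(atoms, bonds, min_len=1, max_len=3, n_bits=64):
    """Hash each path into one bit of an integer fingerprint."""
    bits = 0
    for labels in enumerate_paths(atoms, bonds, min_len, max_len):
        bits |= 1 << (crc32("-".join(labels).encode()) % n_bits)
    return bits

# C-C-O chain: with min_len=0 the oxygen gets its own single-atom feature,
# with min_len=2 it only shows up inside the longer C-C-O path.
atoms, bonds = ["C", "C", "O"], [(0, 1), (1, 2)]
short_paths = enumerate_paths(atoms, bonds, 0, 2)
long_paths = enumerate_paths(atoms, bonds, 2, 2)
```

Raising the minimum length removes the short features, so an unusual atom is represented only through the longer paths that contain it.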

> Looking at MACCS is a good idea. I'll also put that on the list.

Is this list on a wiki? ;)

                                                                 Andrew
 da...@dalkescientific.com




