While it is not multithreaded (yet), this is exactly the use case for the filter catalog:

http://rdkit.blogspot.com/2016/04/changes-in-201603-release-filtercatalog.html?m=1

Look for the SmartsMatcher class in the blog.

Making this multithreaded as well is a good idea; I'll add it as a 
possible enhancement.
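
For reference, here is a minimal sketch of how the filter catalog could be
used for this kind of screen. The query set (QUERIES) and the helper names
(build_catalog, screen) are made up for illustration; the real queries would
be the ~1000 structures from the question. It assumes the queries can be
written as SMARTS and that the screened set is streamed as SMILES, so only
one molecule is held in memory at a time and only the hits are stored.

```python
from rdkit import Chem
from rdkit.Chem import FilterCatalog

# Hypothetical query patterns (name -> SMARTS), standing in for the real query set.
QUERIES = {
    "benzene": "c1ccccc1",
    "carboxylic_acid": "C(=O)[OH]",
    "primary_amine": "[NX3;H2][CX4]",
}


def build_catalog(queries):
    """Wrap each SMARTS pattern in a SmartsMatcher and add it to a catalog."""
    catalog = FilterCatalog.FilterCatalog()
    for name, smarts in queries.items():
        # The third argument is the minimum number of matches required.
        matcher = FilterCatalog.SmartsMatcher(name, smarts, 1)
        catalog.AddEntry(FilterCatalog.FilterCatalogEntry(name, matcher))
    return catalog


def screen(catalog, smiles_iter):
    """Yield (smiles, names of matching queries), one molecule at a time."""
    for smi in smiles_iter:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:  # skip SMILES that fail to parse
            continue
        hits = [entry.GetDescription() for entry in catalog.GetMatches(mol)]
        yield smi, hits


catalog = build_catalog(QUERIES)
results = dict(screen(catalog, ["c1ccccc1C(=O)O", "CCN", "CCCC"]))
```

Because screen() is a generator, the SMILES can come straight from an open
file, and each result can be written out as it is produced rather than
collected in a dict as done here.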

----
Brian Kelley

> On Jun 9, 2017, at 7:04 AM, Greg Landrum <greg.land...@gmail.com> wrote:
> 
> Hi Alexis,
> 
> I would approach this by loading the 1000 queries into a list of molecules 
> and then "stream" the others past that (so that you never attempt to load the 
> full 500K set at once).
> 
> Here's a quick sketch of one way to do this:
> In [4]: queries = [x for x in Chem.ForwardSDMolSupplier('mols.1000.sdf') if x is not None]
> 
> In [5]: matches = []
> 
> In [6]: for m in Chem.ForwardSDMolSupplier('./znp.50k.sdf'):
>    ...:     if m is None:
>    ...:         continue
>    ...:     matches.append([m.HasSubstructMatch(q) for q in queries])
>    ...:     
> 
> 
> Brian has some thoughts on making this particular use case easier/faster (in 
> particular by adding multi-threading support), so maybe there will be 
> something in the next release there.
> 
> I hope this helps,
> -greg
> 
> 
>> On Sun, Jun 4, 2017 at 10:25 PM, Alexis Parenty 
>> <alexis.parenty.h...@gmail.com> wrote:
>> Dear RDKit community,
>> 
>> I need to screen for substructure relationships between two sets of 
>> structures (1 000 X 500 000): I thought I should build two lists of mol 
>> objects from SMILES, but I keep having a memory error when the second list 
>> reaches 300 000 mol. All my RAM (12G) gets consumed along with all my 
>> virtual memory.
>> 
>> Do I really have to compromise on speed and make mol objects on the fly 
>> from two lists of SMILES? Is there a more memory-efficient way to store mol 
>> objects?
>> 
>> Best,
>> 
>> Alexis
>> 
>> 
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>> 
> 
