On 2016-12-29 07:19, John M wrote:

> For why you need sub-second depiction consider these times for 92877507
> structures (current size PubChem Compound):
>
> 1s per structure = 1074 days (~3 years)
> 100 ms per structure = 107 days
> 1ms per structure = 25 hours

The Dilbert answer is to buy a better computer. The serious answer is 
that if you run millions of jobs sequentially on a single core, your 
problem is not how long a single job takes: no matter how fast you make it, it 
will only scale linearly. There will be 1B compounds in PubChem two 
years from now, and your painstakingly crafted 1ms/structure code will 
then need about 11.6 days on that one core instead of 25 hours; all the 
speedup has bought you is garbage depictions.
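
To make the linear scaling concrete, here is a rough Python sketch that 
reproduces the quoted figures. The function name sequential_time is made 
up for illustration, and it assumes a fixed cost per structure with no 
other overhead.

```python
from datetime import timedelta

# Structure count quoted above (PubChem Compound at the time of writing).
PUBCHEM_COMPOUNDS = 92_877_507


def sequential_time(n_structures: int, ms_per_structure: int) -> timedelta:
    """Time to process n_structures one after another on a single core.

    Assumes a constant cost per structure and nothing else going on.
    """
    return timedelta(milliseconds=n_structures * ms_per_structure)


for ms in (1000, 100, 1):
    days = sequential_time(PUBCHEM_COMPOUNDS, ms).total_seconds() / 86400
    print(f"{ms:>5} ms/structure: {days:9.2f} days")

# Doubling the number of structures doubles the total, whatever the
# per-structure cost is.
assert sequential_time(2 * PUBCHEM_COMPOUNDS, 1) == 2 * sequential_time(PUBCHEM_COMPOUNDS, 1)
```

Cutting the per-structure cost only changes the constant; the total still 
grows in step with the size of the database.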

Condor can be persuaded to fire up 92877507 EC2 VMs and run all of those in 
parallel -- provided you're willing to pay Amazon for it of course. If 
you can code the algorithm into GPGPU/SIMD parallel flow, you can 
probably push it into an FPGA and then get that baked into ASICs in 
China -- they'll give you a discount if you order more than ten thousand. 
That gets you a $20 USB dongle that will run them at umpteen K/second. 
And so on.
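
Since each structure is depicted independently, the job splits into 
shards that never talk to each other. The sketch below only does the 
bookkeeping: shard_ranges and wall_clock_seconds are hypothetical names, 
nothing here calls Condor or EC2, and scheduling and startup overhead 
are assumed to be zero.

```python
from typing import List


def shard_ranges(n_structures: int, n_workers: int) -> List[range]:
    """Split indices 0..n_structures-1 into at most n_workers contiguous shards."""
    if n_structures <= 0 or n_workers <= 0:
        raise ValueError("n_structures and n_workers must be positive")
    chunk = (n_structures + n_workers - 1) // n_workers  # ceiling division
    return [
        range(start, min(start + chunk, n_structures))
        for start in range(0, n_structures, chunk)
    ]


def wall_clock_seconds(n_structures: int, n_workers: int,
                       seconds_per_structure: float) -> float:
    """Elapsed time is set by the largest shard (overhead assumed to be zero)."""
    largest = max(len(r) for r in shard_ranges(n_structures, n_workers))
    return largest * seconds_per_structure


# 92877507 structures at a slow 1 s each, spread over 1000 workers.
hours = wall_clock_seconds(92_877_507, 1000, 1.0) / 3600
print(f"{hours:.1f} hours")
```

With 1000 workers the slow 1 s/structure code finishes in about the same 
time as the 1 ms/structure code does on a single core.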

If you don't want quality depictions because bad ones will work just 
fine for your needs, that's a perfectly good argument. If you don't want 
them because generating 10M sequentially on a single core will take a 
long time, that's a BS argument.

Dima


_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
