On 2016-12-29 07:19, John M wrote:
> For why you need sub-second depiction consider these times for 92877507
> structures (current size PubChem Compound):
>
> 1s per structure = 1074 days (~3 years)
> 100 ms per structure = 107 days
> 1ms per structure = 25 hours
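The quoted figures are just structure count times per-structure cost. A minimal sketch of that arithmetic, assuming a strictly sequential single-core run; the function name `sequential_runtime_days` is made up for illustration and is not part of RDKit or PubChem:

```python
SECONDS_PER_DAY = 86_400
N_STRUCTURES = 92_877_507  # PubChem Compound size quoted above


def sequential_runtime_days(n_structures: int, seconds_per_structure: float) -> float:
    """Wall-clock days for a sequential run: count times per-item cost."""
    return n_structures * seconds_per_structure / SECONDS_PER_DAY


if __name__ == "__main__":
    # The three per-structure costs from the quoted message.
    for label, cost in [("1 s", 1.0), ("100 ms", 0.1), ("1 ms", 0.001)]:
        days = sequential_runtime_days(N_STRUCTURES, cost)
        print(f"{label:>6} per structure: {days:9.2f} days ({days * 24:10.1f} hours)")
```

The results agree with the quoted numbers up to rounding, and because the cost is a plain product, doubling the structure count doubles the runtime.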

The Dilbert answer is: buy a better computer.

The serious answer is that if you run millions of jobs sequentially on a single core, your problem is not how long a single job takes: no matter how fast you make it, the total will only scale linearly. There will be 1B compounds in PubChem two years from now, and your painstakingly crafted 1ms/structure code will need more than ten times as long for them (about 11.6 days instead of 25 hours); the only difference is that you get garbage depictions.

Condor can be persuaded to fire up 92877507 EC2 VMs and run all of those in parallel, provided you're willing to pay Amazon for it, of course. If you can code the algorithm into a GPGPU/SIMD parallel flow, you can probably push it into an FPGA and then get that baked into ASICs in China; they'll give you a discount if you order more than ten thousand. That gets you a $20 USB dongle that will run them at umpteen K/second. And so on.

If you don't want quality depictions because bad ones will work just fine for your needs, that's a perfectly good argument. If you don't want them because generating 10M sequentially on a single core will take a long time, that's a BS argument.

Dima

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss