Hello everyone!

I have been looking through the mlpack codebase to find sections which could benefit from a multithreaded implementation. A couple of papers also shed some light on implementing machine learning algorithms in a multithreaded setting:
1. Map-Reduce for Machine Learning on Multicore <https://papers.nips.cc/paper/3150-map-reduce-for-machine-learning-on-multicore.pdf>
2. Parallelizing Machine Learning Algorithms <http://cs229.stanford.edu/proj2010/BatizBenetSlackSparksYahya-ParallelizingMachineLearningAlgorithms.pdf>

I have listed out some algorithms which can be implemented in this manner.

One idea that I had was to parallelize testing. Currently, mlpack builds a single mlpack_test executable, which runs all the tests on a single thread. Instead, we can build multiple test executables and use CMake's ctest tool to run them, with as many parallel jobs as we have threads to spare. More on this here <https://baptiste-wicht.com/posts/2012/10/run-boost-test-parallel-cmake.html>.

This can significantly reduce testing time, and should also help reduce the time taken by the complete matrix builds planned for the future. It wouldn't interfere with the aim of having a single, verifiable command for users to test the library, as mentioned in issue #137 <https://github.com/mlpack/mlpack/issues/137>.

Any thoughts?

Thanks,
Shikhar
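
For illustration, here is a minimal Python sketch of the parallel-testing idea described above: several independent test executables are run concurrently with a bounded number of jobs, and the failures are collected at the end. This is roughly the behaviour one gets from `ctest -j N`; it is not how ctest is implemented. The names `run_test`, `run_tests_in_parallel`, and `fake_tests` are hypothetical, and the "executables" are stand-in Python one-liners rather than real mlpack test binaries.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_test(command):
    """Run one test executable and return (command, exit code)."""
    result = subprocess.run(command, capture_output=True)
    return command, result.returncode


def run_tests_in_parallel(commands, jobs):
    """Run every command, at most `jobs` at a time; return the failing ones."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(run_test, commands))
    return [cmd for cmd, code in results if code != 0]


# Stand-ins for per-suite test executables (assumption: real ones would be
# separate binaries built from the test sources). The third one "fails".
fake_tests = [
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(1)"],
]

failures = run_tests_in_parallel(fake_tests, jobs=2)
print(f"{len(failures)} of {len(fake_tests)} test executables failed")
```

Because each executable is an independent process, the wall-clock time is bounded by the slowest group of tests rather than by the sum of all of them, while a single wrapper command can still report one overall pass/fail result.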
