[mlpack] Regarding profiling for parallelization

Shikhar Bhardwaj Tue, 28 Mar 2017 10:01:13 -0700

Hello everyone!

I have been looking through the mlpack codebase to find sections which
could benefit from a multithreaded implementation. A couple of papers also
shed some light on implementing machine learning algorithms in a
multithreaded setting:


1. Map-Reduce for Machine Learning on Multicore
<https://papers.nips.cc/paper/3150-map-reduce-for-machine-learning-on-multicore.pdf>
2. Parallelizing Machine Learning Algorithms
<http://cs229.stanford.edu/proj2010/BatizBenetSlackSparksYahya-ParallelizingMachineLearningAlgorithms.pdf>

I have listed out some algorithms which can be implemented in this manner.

One idea that I had was to parallelize testing. Currently, mlpack builds a
single mlpack_test executable, which runs the tests on a single thread.
Instead, we can build multiple test executables, and use CMake's ctest tool
to run those tests, with as many jobs as the number of extra threads we
have to spare. More on this here
<https://baptiste-wicht.com/posts/2012/10/run-boost-test-parallel-cmake.html>.
This can significantly reduce testing time, and help in reducing the time
for the complete matrix builds planned in the future.
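
To make the idea concrete, here is a rough sketch of what running several
test executables with a fixed number of parallel jobs amounts to. It is not
mlpack's build setup: in practice "ctest -j <N>" does this scheduling, and
the commands below are hypothetical stand-ins for per-suite test binaries
(mlpack currently has only the single mlpack_test executable).

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_suite(command):
    """Run one test executable and return (command, exit_code)."""
    result = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return command, result.returncode


def run_all(commands, jobs):
    """Run all suites with at most `jobs` running at once.

    Returns (all_passed, failed_commands), so the caller still gets a
    single pass/fail answer for the whole test run.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(run_suite, commands))
    failed = [cmd for cmd, code in results if code != 0]
    return len(failed) == 0, failed


# Hypothetical stand-ins for split-up test binaries; each one just exits
# with status 0, the way a passing test executable would.
SUITES = [
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(0)"],
]

if __name__ == "__main__":
    all_passed, failed = run_all(SUITES, jobs=2)
    print("all passed" if all_passed else "failed: %r" % failed)
```

The wall-clock time is then bounded by the slowest suite rather than the
sum of all suites, provided the suites are independent of each other.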

This wouldn't interfere with the aim of having a single, verifiable command
for users to test the library, as mentioned in issue #137
<https://github.com/mlpack/mlpack/issues/137>, since a single ctest
invocation can still run every test executable.

Any thoughts?

Thanks,
Shikhar

