On Fri, Mar 16, 2018 at 01:39:34PM +0530, Nikhil Goel wrote:
> Hello
>
> Thank you for your help! I had a few more questions.
> Sequential algorithms like logistic regression are very hard to
> parallelize. While researching for this project, the only way I could
> find was by computing the gradient of a batch in parallel. But from what
> I could see in mlpack, the batch is provided as a matrix. Matrix
> operations are already parallelized in mlpack because OpenBLAS is
> parallelized. So I needn't worry about such algorithms?
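[For context: the thread-count control discussed below relies on standard
environment variables rather than any mlpack-specific option. A minimal
sketch; the program name is hypothetical, but `OMP_NUM_THREADS` and
`OPENBLAS_NUM_THREADS` are the real variables read by OpenMP and OpenBLAS
respectively.]

```shell
# Cap the number of threads used by OpenMP parallel regions
# (this covers mlpack's own OpenMP-parallelized code paths).
export OMP_NUM_THREADS=4

# OpenBLAS additionally honors its own variable for BLAS-level
# parallelism, e.g. the matrix operations behind logistic regression.
export OPENBLAS_NUM_THREADS=4

# Then run any mlpack program as usual (name here is hypothetical):
# ./my_mlpack_program

echo "OMP threads capped at: $OMP_NUM_THREADS"
```

The same two variables work for any OpenMP- or OpenBLAS-based software,
which is the portability argument made in the reply below.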
Hi there Nikhil,

You are right, there are some algorithms for which specific
parallelization is not useful and it is better to depend on a parallel
BLAS. For logistic regression in particular, there are a few parallel
optimizers implemented; you might consider taking a look at those also.

> Yes, you're right that we can use environment variables, but wouldn't
> it be cleaner and better-looking to provide users with an option like
> 'cores', with a default value of the maximum number of cores available
> (or 1, whichever you choose), in algorithms that have been
> parallelized?

No, in my view this would be an unnecessary addition of an extra API that
users have to learn. If a user learns about OpenMP environment variables,
that knowledge is useful anywhere OpenMP is used; but if a user instead
learns some mlpack-specific parallelization API, it is not useful
anywhere except mlpack.

> Also, is bagging ensembling implemented in mlpack? It's a pretty
> popular algorithm and I couldn't find it in mlpack. I was wondering if
> it's needed in mlpack?

The only ensembling algorithm we have at the moment is AdaBoost. It may
be useful to add another algorithm.

Thanks,

Ryan

--
Ryan Curtin    | "I can't believe you like money too. We should
r...@ratml.org | hang out." - Frito

_______________________________________________
mlpack mailing list
mlpack@lists.mlpack.org
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack