also, mahout does have an optimizer that simply decides on the degree of parallelism of the _product_. I.e., if it computes C = A'B, it figures out that the final result should be split N ways. but it doesn't apply its own partition function; it just uses the usual hash partitioner to forward the keys. i don't think we ever override that.
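
to make the "split N ways with the usual hash partitioner" idea concrete, here is a minimal sketch in plain python. it is not mahout or spark code: `hash_partition` and `split_rows` are hypothetical names made up for illustration, and the integer row keys and N = 4 are assumptions. it only shows the general idea of sending each key to bucket hash(key) mod N.

```python
# Sketch only: illustrates the idea of default hash partitioning.
# `hash_partition` and `split_rows` are hypothetical helpers,
# not part of the Mahout or Spark APIs.


def hash_partition(key, num_partitions):
    """Map a key to a partition index in [0, num_partitions)."""
    # Python's % yields a non-negative result for a positive modulus,
    # so the index is always a valid bucket number.
    return hash(key) % num_partitions


def split_rows(row_keys, num_partitions):
    """Group the row keys of a product matrix into num_partitions buckets."""
    buckets = [[] for _ in range(num_partitions)]
    for key in row_keys:
        buckets[hash_partition(key, num_partitions)].append(key)
    return buckets


# Assumption: the optimizer decided the product C = A'B is split N = 4 ways,
# and C has 10 rows keyed by the integers 0..9.
buckets = split_rows(range(10), 4)
for index, keys in enumerate(buckets):
    print(index, keys)
```

the point of the sketch is that only N is chosen per product; where each key lands follows from the hash alone, with no randomization involved.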
On Mon, May 2, 2016 at 9:39 AM, Dmitriy Lyubimov <[email protected]> wrote:

> by probabilistic algorithms i mostly mean inference involving monte carlo
> type mechanisms (Gibbs sampling LDA, which i think might still be part of
> our MR collection, might be an example, as well as its faster counterpart,
> variational Bayes inference).
>
> the parallelization strategies are just standard spark mechanisms (in
> case of spark), mostly using their standard hash samplers (which in
> math speak are really uniform multinomial samplers).
>
> On Mon, May 2, 2016 at 9:25 AM, Khurrum Nasim <[email protected]>
> wrote:
>
>> Hey Dimitri -
>>
>> Yes I meant probabilistic algorithms. If mahout doesn’t use
>> probabilistic algos then how does it accomplish a degree of optimal
>> parallelization? Wouldn’t you need randomization to spread out the
>> processing of tasks?
>>
>> > On May 2, 2016, at 12:13 PM, Dmitriy Lyubimov <[email protected]>
>> > wrote:
>> >
>> > yes mahout has stochastic svd and pca which are described at length in
>> > the samsara book. The book examples in Andrew Palumbo's github also
>> > contain an example of computing k-means|| sketch.
>> >
>> > if you mean _probabilistic_ algorithms, although i have done some things
>> > outside the public domain, nothing has been contributed.
>> >
>> > You are very welcome to try something if you don't have big constraints
>> > on oss contribution.
>> >
>> > -d
>> >
>> > On Mon, May 2, 2016 at 7:49 AM, Khurrum Nasim <[email protected]>
>> > wrote:
>> >
>> >> Hey All,
>> >>
>> >> I’d like to know if Mahout uses any randomized algorithms. I’m
>> >> thinking it probably does. Can somebody point me to the packages
>> >> that utilize randomized algos?
>> >>
>> >> Thanks,
>> >>
>> >> Khurrum
