Thank you, Andrew and Dmitriy, for your informed responses.
> On May 2, 2016, at 8:59 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
> also, mahout does have optimizer that simply decides on degree of
> parallelism of the _product_. I.e., if it computes C=A'B then it figures
> that final results should be split N ways. but it doesn't apply the
> partition function -- it just uses the usual hash partitioner to forward
> the keys, i don't think we ever override that.
>
> On Mon, May 2, 2016 at 9:39 AM, Dmitriy Lyubimov <[email protected]> wrote:
>
>> by probabilistic algorithms i mostly mean inference involving monte carlo
>> type mechanisms (Gibbs sampling LDA which i think might still be part of
>> our MR collection might be an example, as well as its faster counterpart,
>> variational Bayes inference.
>>
>> the parallelization strategies are are just standard spark mechanisms (in
>> case of spark), mostly are using their standard hash samplers (which are in
>> math speak are uniform multinomial samplers really).
>>
>> On Mon, May 2, 2016 at 9:25 AM, Khurrum Nasim <[email protected]> wrote:
>>
>>> Hey Dimitri -
>>>
>>> Yes I meant probabilistic algorithms. If mahout doesn’t use
>>> probabilistic algos then how does it accomplish a degree of optimal
>>> parallelization ? Wouldn’t you need randomization to spread out the
>>> processing of tasks.
>>>
>>>> On May 2, 2016, at 12:13 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>
>>>> yes mahout has stochastic svd and pca which are described at length in the
>>>> samsara book. The book examples in Andrew Palumbo's github also contain an
>>>> example of computing k-means|| sketch.
>>>>
>>>> if you mean _probabilistic_ algorithms, although i have done some things
>>>> outside the public domain, nothing has been contributed.
>>>>
>>>> You are very welcome to try something if you don't have big constraints on
>>>> oss contribution.
>>>>
>>>> -d
>>>>
>>>> On Mon, May 2, 2016 at 7:49 AM, Khurrum Nasim <[email protected]> wrote:
>>>>
>>>>> Hey All,
>>>>>
>>>>> I’d like to know if Mahout uses any randomized algorithms. I’m thinking
>>>>> it probably does. Can somebody point me to the packages that utilized
>>>>> randomized algos.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Khurrum
