By probabilistic algorithms I mostly mean inference involving Monte Carlo
type mechanisms (Gibbs sampling LDA, which I think might still be part of
our MR collection, might be an example), as well as its faster counterpart,
variational Bayes inference.
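
To make the Monte Carlo part concrete, here is a tiny self-contained Scala
sketch of one collapsed Gibbs sweep for LDA. It only illustrates the sampling
mechanism and is not Mahout's LDA code; the toy corpus, the priors (alpha,
beta) and all names in it are invented for this example.

import scala.util.Random

object LdaGibbsSketch {
  def main(args: Array[String]): Unit = {
    // illustrative sketch only; everything here is made up for the example
    val rng = new Random(42)
    val numTopics = 4
    val vocabSize = 10
    val alpha = 0.1   // document-topic Dirichlet prior
    val beta  = 0.01  // topic-word Dirichlet prior

    // toy corpus: each document is a sequence of word ids
    val docs: Array[Array[Int]] = Array(
      Array(0, 1, 2, 1, 0),
      Array(3, 4, 5, 4),
      Array(6, 7, 8, 9, 7))

    // random initial topic assignment for every token
    val z: Array[Array[Int]] = docs.map(_.map(_ => rng.nextInt(numTopics)))

    // count matrices maintained by the sampler
    val docTopic   = Array.ofDim[Int](docs.length, numTopics)
    val topicWord  = Array.ofDim[Int](numTopics, vocabSize)
    val topicTotal = Array.ofDim[Int](numTopics)
    for (d <- docs.indices; i <- docs(d).indices) {
      val w = docs(d)(i); val k = z(d)(i)
      docTopic(d)(k) += 1; topicWord(k)(w) += 1; topicTotal(k) += 1
    }

    // one collapsed Gibbs sweep: resample each token's topic from its full conditional
    for (d <- docs.indices; i <- docs(d).indices) {
      val w   = docs(d)(i)
      val old = z(d)(i)
      docTopic(d)(old) -= 1; topicWord(old)(w) -= 1; topicTotal(old) -= 1

      // unnormalized p(z = k | all other assignments)
      val p = Array.tabulate(numTopics) { k =>
        (docTopic(d)(k) + alpha) *
          (topicWord(k)(w) + beta) / (topicTotal(k) + vocabSize * beta)
      }
      // draw one topic index from the discrete distribution p
      val u = rng.nextDouble() * p.sum
      var k = 0; var acc = p(0)
      while (acc < u && k < numTopics - 1) { k += 1; acc += p(k) }

      z(d)(i) = k
      docTopic(d)(k) += 1; topicWord(k)(w) += 1; topicTotal(k) += 1
    }

    println(z.map(_.mkString(",")).mkString(" | "))
  }
}

The point is just that every token's topic gets redrawn from its full
conditional, i.e. a multinomial draw, over and over across the corpus.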

The parallelization strategies (in the case of Spark) are just standard Spark
mechanisms, mostly using Spark's standard hash samplers (which, in math speak,
are really uniform multinomial samplers).
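
As a sketch of what "hash sampler = uniform multinomial sampler" means in
practice: the small standalone Scala example below hashes keys modulo the
partition count, so each key behaves like a draw from a uniform multinomial
over the partitions. The names (numPartitions, keys, partitionOf) are made up
for this example; Spark's actual HashPartitioner applies the same
non-negative modulo to key.hashCode.

object HashAsMultinomialSketch {
  def main(args: Array[String]): Unit = {
    // illustrative names only, not Spark API
    val numPartitions = 8
    val keys = (0 until 100000).map(i => s"key-$i")

    // non-negative modulo of the hash code, the way a hash partitioner computes it
    def partitionOf(key: Any): Int = {
      val h = key.hashCode % numPartitions
      if (h < 0) h + numPartitions else h
    }

    // count how many keys land in each partition; with a decent hash function
    // the counts come out close to uniform (about 1/numPartitions of the keys each)
    val counts = keys
      .groupBy(partitionOf)
      .map { case (p, ks) => (p, ks.size) }
      .toSeq
      .sortBy(_._1)
    counts.foreach { case (p, n) => println(s"partition $p: $n keys") }
  }
}

Running it prints roughly equal counts per partition, which is the uniform
multinomial behavior referred to above.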

On Mon, May 2, 2016 at 9:25 AM, Khurrum Nasim <[email protected]>
wrote:

> Hey Dmitriy -
>
> Yes, I meant probabilistic algorithms.  If Mahout doesn’t use probabilistic
> algos, then how does it accomplish a degree of optimal parallelization?
> Wouldn’t you need randomization to spread out the processing of tasks?
>
> > On May 2, 2016, at 12:13 PM, Dmitriy Lyubimov <[email protected]> wrote:
> >
> > Yes, Mahout has stochastic SVD and PCA, which are described at length in
> > the Samsara book. The book examples in Andrew Palumbo's github also
> > contain an example of computing a k-means|| sketch.
> >
> > If you mean _probabilistic_ algorithms: although I have done some things
> > outside the public domain, nothing has been contributed.
> >
> > You are very welcome to try something if you don't have big constraints
> > on OSS contribution.
> >
> > -d
> >
> > On Mon, May 2, 2016 at 7:49 AM, Khurrum Nasim <[email protected]>
> > wrote:
> >
> >> Hey All,
> >>
> >> I’d like to know if Mahout uses any randomized algorithms.  I’m thinking
> >> it probably does.  Can somebody point me to the packages that utilize
> >> randomized algos?
> >>
> >> Thanks,
> >>
> >> Khurrum
> >>
> >>
>
>
