On Tue, Sep 6, 2011 at 4:44 PM, Chris Lu <[email protected]> wrote:

> I see, thanks!
>
> Seems it should be built into the Mahout LDA algorithms, since the input file is
> usually not too large, but really needs parallel mapping processes.

If your input is not large, running a multithreaded in-memory algorithm on a relatively beefy box (16+ cores, enough RAM to fit your data + model + some spare) will actually be *much* faster than putting the same data on a cluster.

-jake
