I've given up on the CLI and am trying to do this in Java now, but it looks like I can't launch multiple KMeans drivers at once, since KMeansDriver and many of its underlying classes rely on static methods. Am I right that this will cause problems? (Sorry for the beginner question; I'm not too familiar with concurrency in Java.)
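To make the question concrete, here is a minimal sketch of what I'd like to do: submit one task per input directory to a thread pool and wait for all of them to finish. Note that runKMeans is a hypothetical placeholder standing in for the real driver call (I'm not claiming anything about Mahout's actual signatures), and the directory names are made up. As far as I understand, a static method is not a problem for threads by itself; it only becomes unsafe if it reads or writes shared mutable static state, and that is the part I can't judge for KMeansDriver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelClusteringSketch {

    // Hypothetical placeholder for one clustering run.
    // The real driver call would go here; this stub just derives an output path.
    static String runKMeans(String inputDir) {
        return inputDir + "/output";
    }

    // Runs one task per input directory on a fixed-size pool and
    // returns the results in the same order as the inputs.
    static List<String> runAll(List<String> inputDirs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (final String dir : inputDirs) {
                tasks.add(() -> runKMeans(dir));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get()); // rethrows a task failure as ExecutionException
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up directory names, for illustration only.
        List<String> dirs = Arrays.asList("clusters/1", "clusters/2", "clusters/3");
        for (String out : runAll(dirs, 3)) {
            System.out.println(out);
        }
    }
}
```

If the driver turns out not to be safe to call from several threads in one JVM, the same loop could presumably start a separate process per directory instead, at the cost of extra startup time.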
I'd really like to be able to launch multiple clustering runs at the same time, since launching them one at a time and waiting for each to finish is killing my overall performance.

On Thu, Nov 8, 2012 at 1:48 PM, Matt Molek <[email protected]> wrote:
> When doing top down clustering, I'm running a first pass of kmeans, and
> then splitting the different clusters off into their own directories with
> clusterpp. So I have a bunch of input directories that I want to run kmeans
> jobs on at the same time.
>
> Can I do that from a bash script? Right now I'm running over each input
> directory with a for loop, and each kmeans job is waiting for completion
> before the next one starts.
>
> If I can't do it with a script, could I do it in Java without having to
> modify the mahout source?
>
> Thanks for the help!
