I've started experimenting with LDA and am finding that it creates only
a single long-running map task per iteration, which doesn't scale
well.  The map takes 20 minutes for 10k of my input SparseVectors and 5
hours for 100k, i.e. about 15x the time for 10x the input (the vocabulary
size also grows when there are more vectors).

Is this expected or am I doing something wrong?  Are there any existing
performance benchmarks?

Many thanks!

Mark
