My issues of interest:
MAHOUT-516: Eigencuts produces unexpected results
This comes down to finding a decent heuristic for automatically
determining the degree fed to the Lanczos solver.
MAHOUT-517: Eigencuts needs an output format
Something better than System.out.println()
MAHOUT-518: Implement Affinity Preprocessing for Eigencuts and Spectral
KMeans
Have a map job sit in front of eigencuts/spectral k-means that converts
some standard input format (perhaps CSV?) into the affinity matrix used
by the algorithms.
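
For illustration only, here is a minimal in-memory sketch of the kind of transformation such a preprocessing step might perform. It is not the MAHOUT-518 implementation and does not use Hadoop. It assumes each CSV record is one point, that affinity is a Gaussian kernel with a hypothetical bandwidth parameter sigma, and that the output is written as "i,j,value" triples. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AffinitySketch {

    /** Parse one CSV record such as "1.0, 2.5" into a point. */
    public static double[] parsePoint(String csvLine) {
        String[] fields = csvLine.trim().split(",");
        double[] point = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            point[i] = Double.parseDouble(fields[i].trim());
        }
        return point;
    }

    /** Gaussian affinity exp(-||a - b||^2 / (2 * sigma^2)); the kernel choice is an assumption. */
    public static double affinity(double[] a, double[] b, double sigma) {
        double squaredDistance = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            squaredDistance += diff * diff;
        }
        return Math.exp(-squaredDistance / (2.0 * sigma * sigma));
    }

    /** Turn CSV records into "i,j,value" triples for every ordered pair of points. */
    public static List<String> toAffinityTriples(List<String> csvLines, double sigma) {
        List<double[]> points = new ArrayList<>();
        for (String line : csvLines) {
            points.add(parsePoint(line));
        }
        List<String> triples = new ArrayList<>();
        for (int i = 0; i < points.size(); i++) {
            for (int j = 0; j < points.size(); j++) {
                double value = affinity(points.get(i), points.get(j), sigma);
                triples.add(i + "," + j + "," + value);
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        // Three 2-D points; sigma = 1.0 is an arbitrary illustrative bandwidth.
        List<String> input = Arrays.asList("0,0", "0,1", "5,5");
        for (String triple : toAffinityTriples(input, 1.0)) {
            System.out.println(triple);
        }
    }
}
```

This version compares every pair of points in memory, so it only shows the input and output shapes. A real map job sees one record at a time and would need some other way to bring pairs of points together.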
MAHOUT-524: DisplaySpectralKMeans example fails
Something is strange in the clusters shown in the example.
MAHOUT-537: Bring DistributedRowMatrix into compliance with Hadoop 0.20.2
Somewhat on hold until a later version of Hadoop.
I'm sorry for my lack of activity; my summer internship with Google kept
me busier than I'd expected. However, now that I'm back to PhD research,
my advisor wants to push out a prototype of our framework, which uses
Mahout, very quickly (as in, within the next couple of months), so these
issues should be attended to very soon.
Shannon
On 8/17/11 6:44 AM, Grant Ingersoll wrote:
More later, but…
On Aug 17, 2011, at 5:13 AM, Sebastian Schelter wrote:
MAHOUT-767 Improve RowSimilarityJob performance for count-based distance
measures
Currently working on that.
I'm looking to test what you have on the ASF mail archives in the coming few
weeks.