Hi Jeff, Shannon,
I took a quick look at this just now. It seems that 8 clean eigenvectors
are being written by the call to
verifier.runJob(conf, lanczosSeqFiles, L.getRowPath(),
verifiedEigensPath, true, 1.0, 0.0, clusters);
in SpectralKMeansDriver.run(). The matrix W is created with numRows = 5
(clusters), and the subsequent transpose() fails at the call to:
SequentialAccessSparseVector outVector = new
SequentialAccessSparseVector(tmp);
in TransposeReducer.reduce(). It fails because the
RandomAccessSparseVector tmp has been created with size newNumCols = 5
(clusters from SpectralKMeansDriver.run()), but it appears to contain
the 8 clean eigenvectors, which then generates an IndexException in
AbstractVector.set().
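
To make the size mismatch concrete, here is a rough standalone sketch. It is not Mahout code: FixedSizeVector and firstRejectedRow are made-up names, a plain IndexOutOfBoundsException stands in for Mahout's IndexException, and it assumes the 8 eigenvector rows are indexed 0 to 7.

```java
public class TransposeSizeSketch {

  // Hypothetical stand-in for a vector with a fixed cardinality.
  static final class FixedSizeVector {
    private final double[] values;

    FixedSizeVector(int cardinality) {
      values = new double[cardinality];
    }

    // Rejects any index outside the declared cardinality.
    void set(int index, double value) {
      if (index < 0 || index >= values.length) {
        throw new IndexOutOfBoundsException(
            "index " + index + " outside cardinality " + values.length);
      }
      values[index] = value;
    }
  }

  // Returns the first row index that does not fit into a vector sized for
  // declaredRows, or -1 if all actualRows entries fit.
  static int firstRejectedRow(int declaredRows, int actualRows) {
    FixedSizeVector tmp = new FixedSizeVector(declaredRows);
    for (int row = 0; row < actualRows; row++) {
      try {
        tmp.set(row, 1.0);
      } catch (IndexOutOfBoundsException e) {
        return row;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Sized for 5 clusters but fed 8 eigenvectors: row 5 is rejected.
    System.out.println(firstRejectedRow(5, 8)); // prints 5
    // Sized to match the actual row count: everything fits.
    System.out.println(firstRejectedRow(8, 8)); // prints -1
  }
}
```

So either the number of eigenvectors written has to be capped at the clusters value, or W has to be sized from the number actually written.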
Looking back further into EigenVerificationJob.saveCleanEigens(), it
looks like it will always write out all of the clean eigenvectors, and
ignore the 'maxEigens' value, i.e. the clusters value passed to
verifier.runJob() in this case:
for (Map.Entry<MatrixSlice, EigenStatus> pruneSlice : prunedEigenMeta) {
  .
  .
  .
  int numEigensWritten = 0;
  // increment the number of eigenvectors written and see if we've
  // reached our specified limit, or if we wish to write all eigenvectors
  // (latter is built-in, since numEigensWritten will always be > 0
  numEigensWritten++;
  if (numEigensWritten == maxEigensToKeep) {
    log.info("{} of the {} total eigens have been written",
        maxEigensToKeep, prunedEigenMeta.size());
    break;
  }
}
I'm assuming the "int numEigensWritten = 0;" should appear before this
for loop?
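
Here is a small standalone sketch of why the placement matters. It is only an illustration, not the Mahout source: the class and method names are made up, the list of integers stands in for the clean eigenvectors, and written.add() stands in for whatever the elided lines do to write each one out.

```java
import java.util.ArrayList;
import java.util.List;

public class EigenLimitSketch {

  // Mirrors the quoted loop: the counter is declared inside the loop body,
  // so it is reset on every pass and is always 1 when it is compared.
  static List<Integer> counterInsideLoop(List<Integer> cleanEigens, int maxEigensToKeep) {
    List<Integer> written = new ArrayList<>();
    for (Integer eigen : cleanEigens) {
      written.add(eigen); // stand-in for the elided write
      int numEigensWritten = 0;
      numEigensWritten++;
      if (numEigensWritten == maxEigensToKeep) {
        break;
      }
    }
    return written;
  }

  // Proposed change: declare the counter once, before the loop.
  static List<Integer> counterBeforeLoop(List<Integer> cleanEigens, int maxEigensToKeep) {
    List<Integer> written = new ArrayList<>();
    int numEigensWritten = 0;
    for (Integer eigen : cleanEigens) {
      written.add(eigen); // stand-in for the elided write
      numEigensWritten++;
      if (numEigensWritten == maxEigensToKeep) {
        break;
      }
    }
    return written;
  }

  public static void main(String[] args) {
    List<Integer> cleanEigens = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      cleanEigens.add(i);
    }
    // 8 clean eigenvectors, limit of 5 (the clusters value).
    System.out.println(counterInsideLoop(cleanEigens, 5).size()); // prints 8
    System.out.println(counterBeforeLoop(cleanEigens, 5).size()); // prints 5
  }
}
```

With the counter inside the loop, the limit only takes effect when maxEigensToKeep is 1, which would match the 8 eigenvectors being written here.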
Derek
On 12/10/10 04:21, Jeff Eastman wrote:
+user@
+1 Any helpers out there want to earn a patch kudo?
On 10/11/10 6:59 PM, Shannon Quinn wrote:
Ok, this machine learning homework assignment is really brutal, due
Wednesday morning...may not get to this before then. Unless anyone would
like to help :)
Shannon
On Mon, Oct 11, 2010 at 10:35 AM, Shannon Quinn <[email protected]> wrote:
I'll have a chance to look at this later today; hopefully I'll have
something for you once you get back tonight.
On Mon, Oct 11, 2010 at 10:33 AM, Jeff Eastman <[email protected]> wrote:
Sorry, my bad. I neglected to commit the TestClusterDumper changes. It's
in now and all tests run. The DisplaySpectralKMeans example still fails
when you run it but it is not run by any of the build processes. It's
pointing to a potential problem in SpectralKMeans which I'd like to get
fixed if we can.
I'm starting work at Narus today so I won't be able to pay attention to
this until later this evening.
On 10/10/10 11:48 PM, Sean Owen wrote:
I trust this all is much more bug fix than anything else -- just
mindful of the purported "code freeze" in action now. This leaves us
with a broken build at the moment. I know the point is to get it
sorted straight away. Just wondering if we're pretty sure this isn't
opening up a new line of issues at a time we're going to bless a
state of the code for another 6-8 months.
On Mon, Oct 11, 2010 at 3:35 AM, Jeff Eastman
<[email protected]> wrote:
Hi Shannon,
I've committed a new display example that attempts to push the standard
mixture of models data set through spectral k-means. After some tweaking
of configuration arguments it gets remarkably far through, finally
failing on W.transpose() after the eigen cleanup. I can't imagine this
would all be pilot error so I wonder if you'd have a look at it to see
where it's going south?