[ https://issues.apache.org/jira/browse/MAHOUT-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917858#action_12917858 ]

Shannon Quinn commented on MAHOUT-516:
--------------------------------------
Aha. It's not a logic bug per se, but rather a hack I devised (and obviously
forgot about): there's no explicit calculation done to determine the
desiredRank for the eigen-decomposition, and there is no explicit "formula" for
this in the original literature, either. This is an item I'll be asking Dr.
Chennubhotla about tomorrow, but needless to say the stopgap measure I
implemented during GSoC is nowhere close to scalable: it was to make
desiredRank = dimensionality of input data (yes, very bad). We had been
discussing making this parameter tweakable by the user (or perhaps
"suggestible", much as the number of Hadoop M/R tasks can be suggested by the
user), but the upshot is that the algorithm still needs a reliable way to
determine desiredRank in order to function properly.
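
As a rough sketch of the "suggestable" idea above (this is not Mahout's actual
code; the class and method names are hypothetical), a user-supplied rank could
be treated as a hint that is capped at the input dimensionality, with the
current stopgap as the fallback when no hint is given:

```java
// Hypothetical helper, not part of Mahout: resolves the rank to request
// from the eigen-decomposition given an optional user suggestion.
final class DesiredRankSketch {

    private DesiredRankSketch() {
    }

    /**
     * @param dimensionality dimensionality of the input data (must be positive)
     * @param suggestedRank  user hint; zero or negative means "no suggestion"
     * @return the rank to use, never larger than the dimensionality
     */
    static int resolveDesiredRank(int dimensionality, int suggestedRank) {
        if (dimensionality <= 0) {
            throw new IllegalArgumentException("dimensionality must be positive");
        }
        if (suggestedRank <= 0) {
            // No hint: fall back to the stopgap, desiredRank = dimensionality.
            return dimensionality;
        }
        // Treat the hint as a suggestion only; it cannot exceed the dimensionality.
        return Math.min(suggestedRank, dimensionality);
    }
}
```

With this sketch, resolveDesiredRank(1000, 20) would request 20 eigenvectors,
while resolveDesiredRank(1000, 0) reproduces the full-dimensionality stopgap.
It only makes the parameter adjustable; it does not answer how a good rank
should be chosen.
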
> Eigencuts produces unexpected results
> -------------------------------------
>
> Key: MAHOUT-516
> URL: https://issues.apache.org/jira/browse/MAHOUT-516
> Project: Mahout
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 0.4
> Reporter: Jeff Eastman
> Fix For: 0.5
>
>
> Shannon reports he suspects a logic error in Eigencuts since it evidently
> does not produce exactly the expected results. It passes all current unit
> tests so we need to characterize the results differences and produce a test
> for it. Marking for 0.5 for now though we will fix it as soon as possible.