There isn't a good way to guess eigenvalues.

But the basic decomposer sees progressively more eigenvalues as it iterates,
so it should be able to tell when to stop.

I can't speak to where in the code you would stick that, but it should be
reasonably easy to find.
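
To make the idea concrete, here is a rough sketch of "stop once the
eigenvalues drop under the threshold". It is only an illustration under my own
assumptions: it uses plain power iteration with deflation on a small, dense,
symmetric positive semidefinite matrix, not Mahout's decomposer, and the names
eigenpairs_until and threshold are made up for the example.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def normalize(v):
    # Returns None when the vector is numerically zero.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 1e-12 else None


def deflate(v, basis):
    # Remove the components of v along eigenvectors found so far.
    for u in basis:
        c = dot(u, v)
        v = [x - c * y for x, y in zip(v, u)]
    return v


def eigenpairs_until(a, threshold, iters=500):
    """Find eigenpairs of a symmetric PSD matrix in decreasing order of
    eigenvalue, stopping at the first eigenvalue below `threshold`.

    Hypothetical helper for illustration only.
    """
    n = len(a)
    values, vectors = [], []
    for _ in range(n):
        # Fixed start vector with every component nonzero.
        v = normalize(deflate([1.0 / (i + 1) for i in range(n)], vectors))
        for _ in range(iters):
            if v is None:
                break
            v = normalize(deflate(matvec(a, v), vectors))
        if v is None:
            break
        value = dot(v, matvec(a, v))  # Rayleigh quotient of a unit vector
        if value < threshold:
            break  # everything after this is smaller still, so stop here
        values.append(value)
        vectors.append(v)
    return values, vectors


if __name__ == "__main__":
    a = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 0.8, 0.0, 0.0],
        [0.0, 0.0, 0.5, 0.0],
        [0.0, 0.0, 0.0, 0.1],
    ]
    values, _ = eigenpairs_until(a, threshold=0.3)
    print(values)
```

The point is only the early exit: because the eigenvalues come out in
decreasing order, the loop never has to be told the rank up front. A real
solver would do the same check on its own running estimates.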

On Thu, Oct 21, 2010 at 5:33 PM, Shannon Quinn (JIRA) <[email protected]> wrote:

>
>    [
> https://issues.apache.org/jira/browse/MAHOUT-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923711#action_12923711]
>
> Shannon Quinn commented on MAHOUT-516:
> --------------------------------------
>
> For the time being, I'm just going to go with a -k type flag for specifying
> the degree of eigendecomposition. But here are my thoughts on a more
> permanent solution:
>
> In reading up a little further on the low-rank approximations of the
> eigencuts paper, it appears that, at least for images, the eigenvalues
> follow a linear decrease from 1, i.e. each corresponding eigenvalue is <=
> the previous according to some approximately linear function. In perturbing
> the flow of probability in the underlying Markov transition graph (in order
> to determine where the clusters are), any eigenvalue/eigenvector pairs that
> fall under a certain threshold (specified by a combination of epsilon and
> beta, which are command-line arguments) are ignored. Thus, since the
> eigenvalues are monotonically decreasing, in theory we'd only need to find
> which eigenvalue falls beneath the threshold and perform a full
> decomposition up to that point.
>
> There's an obvious implementation problem there: we can't really know what
> that minimum degree is without performing a full decomposition in the first
> place. Is there a way around this? Do we have an efficient way of
> calculating, or perhaps approximating, eigenvalues without computing
> corresponding eigenvectors or otherwise performing a full decomposition?
> Maybe we could even do this probabilistically by "sampling" from the space
> of eigenvalues to make a guess on what rank we want? Just throwing ideas out
> here until the experts respond :)
>
> > Eigencuts produces unexpected results
> > -------------------------------------
> >
> >                 Key: MAHOUT-516
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-516
> >             Project: Mahout
> >          Issue Type: Bug
> >          Components: Clustering
> >    Affects Versions: 0.4
> >            Reporter: Jeff Eastman
> >             Fix For: 0.5
> >
> >         Attachments: jeastman.vcf
> >
> >
> > Shannon reports he suspects a logic error in Eigencuts since it evidently
> > does not produce exactly the expected results. It passes all current unit
> > tests so we need to characterize the results differences and produce a test
> > for it. Marking for 0.5 for now though we will fix it as soon as possible.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
