Sounds more like new requirements requiring new behavior to me. This is open source. If you see a way to make the box better, you are free to do so.
On Fri, Oct 22, 2010 at 5:10 AM, Shannon Quinn <[email protected]> wrote:

> It would have to be in the LanczosSolver itself, though - seems like kind
> of a special case to be modifying the black box?
>
> On 10/22/2010 2:10 AM, Ted Dunning wrote:
>
>> There isn't a good way to guess eigenvalues.
>>
>> But the basic decomposer can see progressively more eigenvalues as it goes
>> and should be able to know when to stop.
>>
>> I can't speak to where in the code you would stick that, but it should be
>> reasonably easy to find.
>>
>> On Thu, Oct 21, 2010 at 5:33 PM, Shannon Quinn (JIRA) <[email protected]>
>> wrote:
>>
>>> [ https://issues.apache.org/jira/browse/MAHOUT-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923711#action_12923711 ]
>>>
>>> Shannon Quinn commented on MAHOUT-516:
>>> --------------------------------------
>>>
>>> For the time being, I'm just going to go with a -k type flag for
>>> specifying the degree of eigendecomposition. But here are my thoughts
>>> on a more permanent solution:
>>>
>>> In reading up a little further on the low-rank approximations of the
>>> eigencuts paper, it appears that, at least for images, the eigenvalues
>>> follow a linear decrease from 1, i.e. each corresponding eigenvalue is
>>> <= the previous according to some approximately linear function. In
>>> perturbing the flow of probability in the underlying markov transition
>>> graph (in order to determine where the clusters are), any
>>> eigenvalue/eigenvector pairs that fall under a certain threshold
>>> (specified by a combination of epsilon and beta, which are
>>> command-line arguments) are ignored. Thus, since the eigenvalues are
>>> monotonically decreasing, in theory we'd only need to find which
>>> eigenvalue falls beneath the threshold and perform a full
>>> decomposition up to that point.
>>>
>>> There's an obvious implementation problem there: we can't really know
>>> what that minimum degree is without performing a full decomposition in
>>> the first place. Is there a way around this? Do we have an efficient
>>> way of calculating, or perhaps approximating, eigenvalues without
>>> computing corresponding eigenvectors or otherwise performing a full
>>> decomposition? Maybe we could even do this probabilistically by
>>> "sampling" from the space of eigenvalues to make a guess on what rank
>>> we want? Just throwing ideas out here until the experts respond :)
>>>
>>>> Eigencuts produces unexpected results
>>>> -------------------------------------
>>>>
>>>>                 Key: MAHOUT-516
>>>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-516
>>>>             Project: Mahout
>>>>          Issue Type: Bug
>>>>          Components: Clustering
>>>>    Affects Versions: 0.4
>>>>            Reporter: Jeff Eastman
>>>>             Fix For: 0.5
>>>>
>>>>         Attachments: jeastman.vcf
>>>>
>>>>
>>>> Shannon reports he suspects a logic error in Eigencuts since it
>>>> evidently does not produce exactly the expected results. It passes all
>>>> current unit tests so we need to characterize the results differences
>>>> and produce a test for it. Marking for 0.5 for now though we will fix
>>>> it as soon as possible.
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> -
>>> You can reply to this email to add a comment to the issue online.
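
The idea in the quoted thread (let the decomposer see eigenvalues one at a time, largest first, and stop once one falls below the cutoff) can be sketched roughly as follows. This is only an illustration, not Mahout's LanczosSolver and not Lanczos at all: it uses plain power iteration with deflation, and it assumes a symmetric matrix with non-negative eigenvalues. The names `next_eigenpair` and `eigenpairs_above` are hypothetical, and `threshold` merely stands in for whatever cutoff epsilon and beta would produce.

```python
import math
import random


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def matvec(a, x):
    return [dot(row, x) for row in a]


def normalize(x):
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x] if norm > 0.0 else x


def project_out(x, basis):
    # Deflation: remove the components along eigenvectors already found.
    for v in basis:
        c = dot(x, v)
        x = [xi - c * vi for xi, vi in zip(x, v)]
    return x


def next_eigenpair(a, found, rng, iters=500, tol=1e-12):
    """Power iteration restricted to the complement of `found`."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(len(a))]
    lam = 0.0
    for _ in range(iters):
        x = normalize(project_out(x, found))
        y = matvec(a, x)
        new_lam = dot(x, y)  # Rayleigh quotient of the current unit vector
        if abs(new_lam - lam) < tol:
            return new_lam, x
        lam, x = new_lam, y
    return lam, normalize(project_out(x, found))


def eigenpairs_above(a, threshold, seed=0):
    """Collect eigenpairs largest-first; stop at the first eigenvalue
    that falls below `threshold` (assumes symmetric `a`, eigenvalues >= 0)."""
    rng = random.Random(seed)
    found = []
    for _ in range(len(a)):
        lam, v = next_eigenpair(a, [vec for _, vec in found], rng)
        if lam < threshold:
            break  # every remaining eigenvalue is smaller still
        found.append((lam, v))
    return found


if __name__ == "__main__":
    a = [[2.0, 1.0, 0.0],
         [1.0, 2.0, 0.0],
         [0.0, 0.0, 0.5]]
    for lam, vec in eigenpairs_above(a, threshold=0.75):
        print(round(lam, 6), [round(c, 3) for c in vec])
```

The cost of not knowing the rank in advance is one extra eigenpair: the first one that lands below the cutoff gets computed and then thrown away. A Lanczos-based solver would need its own stopping test, but the control flow (decompose incrementally, check the newest eigenvalue, stop) would presumably look much the same.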
