[ 
https://issues.apache.org/jira/browse/MAHOUT-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Eastman updated MAHOUT-825:
--------------------------------

    Attachment: MAHOUT-825.patch

Modified version of canopy-radius-based-outlier-elimination. Renames the filter 
to "outlierFilter". Adjusts the filter semantics to reject any point whose 
distance exceeds outlierFilter * cluster.radius(). Emits rejected points with 
clusterId = -1 so they do not disappear silently. Includes the Apache License 
text and adjusts to the Mahout formatter style. Seems to pass the relevant tests.
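
The filter semantics described above can be sketched roughly as follows. This 
is only an illustration, not the code in the attached patch: the class and 
method names (OutlierFilterSketch, SketchCluster, assign) are hypothetical, the 
radius is treated as a single scalar, and Euclidean distance to the cluster 
center is assumed.

```java
import java.util.Arrays;
import java.util.List;

final class OutlierFilterSketch {
  /** Id given to rejected points so they are emitted rather than dropped. */
  static final int OUTLIER_ID = -1;

  /** Hypothetical stand-in for a cluster: id, center, scalar radius (assumption). */
  static final class SketchCluster {
    final int id;
    final double[] center;
    final double radius;

    SketchCluster(int id, double[] center, double radius) {
      this.id = id;
      this.center = center;
      this.radius = radius;
    }
  }

  /** Euclidean distance, assumed here for simplicity. */
  static double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /**
   * Returns the id of the closest cluster, or OUTLIER_ID when the point lies
   * farther than outlierFilter * radius from that cluster's center.
   */
  static int assign(double[] point, List<SketchCluster> clusters, double outlierFilter) {
    SketchCluster closest = null;
    double best = Double.POSITIVE_INFINITY;
    for (SketchCluster c : clusters) {
      double d = distance(point, c.center);
      if (d < best) {
        best = d;
        closest = c;
      }
    }
    // Reject when there is no cluster or the distance exceeds the scaled radius.
    if (closest == null || best > outlierFilter * closest.radius) {
      return OUTLIER_ID;
    }
    return closest.id;
  }

  public static void main(String[] args) {
    List<SketchCluster> clusters = Arrays.asList(
        new SketchCluster(0, new double[] {0.0, 0.0}, 1.0),
        new SketchCluster(1, new double[] {10.0, 10.0}, 2.0));
    // A point near the first center and a point far from both centers.
    System.out.println(assign(new double[] {0.5, 0.0}, clusters, 1.5));
    System.out.println(assign(new double[] {5.0, 5.0}, clusters, 1.5));
  }
}
```

In this sketch a point exactly at distance outlierFilter * radius is kept, 
because only distances strictly greater than the threshold are rejected.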

This should be implemented in the other clustering algorithms before committing.
                
> Canopies grouping records outside t1
> ------------------------------------
>
>                 Key: MAHOUT-825
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-825
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.6
>         Environment: windows, linux
>            Reporter: Paritosh Ranjan
>              Labels: features, newbie, patch
>             Fix For: 0.6
>
>         Attachments: Clustering Remote Points - Two Big, Useless 
> Clusters.txt, MAHOUT-825.patch, Not Clustering Remote Points - Two Meaningful 
> Clusters.txt, canopy-clusterFilter-t1, canopy-outlier-elimination, 
> canopy-outside-t1-points-patch-1, canopy-radius-based-outlier-elimination, 
> canopy-strict-clustering-flag
>
>
> While finding the closest canopy, there is no check to ensure that the 
> returned canopy is within distance t1 of the point. This produces incorrect 
> results: points outside t1 are grouped into canopies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
