[
https://issues.apache.org/jira/browse/MAHOUT-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816858#comment-13816858
]
Chisomo Sakala commented on MAHOUT-1206:
----------------------------------------
I'm really excited about this prospect.
The paper <http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6253489>
describes how to implement DBSCAN on MapReduce (DBSCAN-MR). I have a PDF
copy and can email it to anyone interested in reading it.
I emailed the author of that paper to ask whether they'd be willing to
contribute their implementation of DBSCAN-MR to Mahout, but I haven't
received a response yet.
Here is another paper discussing parallelization of DBSCAN:
<http://conferences.computer.org/sc/2012/papers/1000a053.pdf>
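For reference, the sequential algorithm that these papers set out to parallelize can be sketched roughly as below. This is only a minimal single-machine sketch in plain Java, not Mahout code and not the DBSCAN-MR implementation from the paper. The class name `DbscanSketch`, the method `cluster`, and the parameter names `eps` and `minPts` are illustrative assumptions, and the neighborhood search is a brute-force scan rather than a spatial index.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical single-machine DBSCAN sketch; not part of Mahout. */
public class DbscanSketch {
    public static final int NOISE = -1;
    private static final int UNVISITED = -2;

    /** Returns one label per point: a cluster id starting at 0, or NOISE. */
    public static int[] cluster(double[][] points, double eps, int minPts) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, UNVISITED);
        int clusterId = 0;
        for (int i = 0; i < points.length; i++) {
            if (labels[i] != UNVISITED) continue;
            List<Integer> neighbors = regionQuery(points, i, eps);
            if (neighbors.size() < minPts) {
                labels[i] = NOISE; // may be relabeled later as a border point
                continue;
            }
            // Point i is a core point: start a new cluster and expand it.
            labels[i] = clusterId;
            Deque<Integer> queue = new ArrayDeque<>(neighbors);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) labels[j] = clusterId; // border point
                if (labels[j] != UNVISITED) continue;
                labels[j] = clusterId;
                List<Integer> jNeighbors = regionQuery(points, j, eps);
                // Only core points propagate the cluster further.
                if (jNeighbors.size() >= minPts) queue.addAll(jNeighbors);
            }
            clusterId++;
        }
        return labels;
    }

    /** Brute-force neighborhood scan; the result includes the query point itself. */
    private static List<Integer> regionQuery(double[][] points, int idx, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int k = 0; k < points.length; k++) {
            if (distance(points[idx], points[k]) <= eps) result.add(k);
        }
        return result;
    }

    /** Euclidean distance between two points of equal dimension. */
    private static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

Calling `DbscanSketch.cluster(points, 1.5, 3)` on two tight groups of three 2-D points plus one distant point should give the two groups separate cluster ids and mark the distant point as `NOISE`. Every point needs its own neighborhood query, so the work of a distributed version lies mostly in partitioning the data and merging clusters that span partitions.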
> Add density-based clustering algorithms to mahout
> -------------------------------------------------
>
> Key: MAHOUT-1206
> URL: https://issues.apache.org/jira/browse/MAHOUT-1206
> Project: Mahout
> Issue Type: Improvement
> Reporter: Yexi Jiang
> Labels: clustering
> Fix For: Backlog
>
>
> The existing clustering algorithms (kmeans, fuzzy kmeans, Dirichlet
> clustering, and spectral clustering) assume that the data can be grouped
> into regular hyperspheres or ellipsoids. In practice, however, not all
> data can be clustered this way.
> Clustering algorithms such as DBSCAN, BIRCH, and CLARANS
> (http://en.wikipedia.org/wiki/Cluster_analysis#Density-based_clustering) have
> been proposed to find clusters of arbitrary shape.
> It would be good to implement one or more of these clustering algorithms
> to enrich the clustering library.