Hi all,

Browsing the source code, I found that the AgglomerativeClustering class appears to perform most forms of hierarchical clustering (not only Ward, as before). First, I would like to ask whether the code is already "stable". I know that it is unreleased, but since it is in the repository I assume it is fairly close to the version that will be released.
Also, I'm still not 100% sure how to use it based on the documentation. More specifically, I'm interested in the case of using a precomputed distance matrix. I want to confirm whether the correct way to use the class would be something like:

    import numpy as np
    from sklearn.cluster import hierarchical

    dataset = np.genfromtxt('somedatafile')
    # my_pairwise_distance is my own function; it returns the distance matrix
    X = my_pairwise_distance(dataset)
    agg_cluster = hierarchical.AgglomerativeClustering(affinity='precomputed', linkage='complete')
    agg_cluster.fit(X)
    ...

To simplify, I just need to know whether I can pass the distance matrix to the fit method. The documentation is unclear on this, but from the source code it appears that it should work, since the linkage_tree function can deal with a distance matrix.

To explain why I find the documentation unclear, the fit method's docstring simply says:

    X : array-like, shape = [n_samples, n_features]
        The samples a.k.a. observations.

It does not state that X can be a distance matrix when affinity='precomputed'.

Thanks in advance,
Flavio

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general