On Wed, Aug 29, 2012 at 11:03:01AM +0200, Matthias Ekman wrote:
> I was trying to visualize the tree structure obtained from ward, but I
> don't quite understand the data format of the children_ attribute.

> The documentation reads:
> children_     array-like, shape = [n_nodes, 2]        List of the children of
> each nodes. Leaves of the tree do not appear.

> So it is not a "left child-right Sibling representation", right?

I am not sure, this is not a term that I am familiar with. Keep in mind
that Ward gives a binary tree, so it would be more a "left child-right
child representation".

This matrix simply lists the pair of children for each node, where a node
is denoted by an integer index. It does not include the terminal nodes
(original samples) as rows, since they have no children.
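
To make the format concrete, here is a minimal sketch in plain Python. It
assumes the usual numbering convention: the original samples are nodes
0 .. n_leaves - 1, and row i of children_ describes node n_leaves + i. The
small ``children`` list and the ``leaves_under`` helper are made up for
illustration and are not part of the scikit.

```python
# Hypothetical children_ array for 4 samples (leaves are numbered 0..3).
# Row i holds the two children of internal node n_leaves + i (assumed convention).
children = [(0, 1), (2, 3), (4, 5)]
n_leaves = len(children) + 1  # a binary tree with n leaves has n - 1 merges


def leaves_under(node):
    """Return the original samples found below `node`."""
    if node < n_leaves:
        # Leaves are the samples themselves and have no row in `children`.
        return [node]
    left, right = children[node - n_leaves]
    return leaves_under(left) + leaves_under(right)


for i, (left, right) in enumerate(children):
    node = n_leaves + i
    print(node, "->", (left, right), "samples:", leaves_under(node))
```

The last row is the root of the tree, so it should cover every sample.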

> Are there any pointers to that specific format or even better does
> anyone have some advice on how to visualize the tree with
> ``scipy.cluster.hierarchy.dendrogram`` or ``graphviz``?

I couldn't figure out the structure that
scipy.cluster.hierarchy.dendrogram uses. That said, it should be possible
to adapt our representation to something usable by dendrogram, and I'd
love to merge in an example showing how to do this.
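
As a starting point, here is a rough sketch of one possible conversion. It
relies on the linkage matrix layout documented by scipy (one row per merge:
the two merged cluster indices, the merge height, and the number of samples
in the new cluster). The children array alone carries no distances, so the
sketch uses the merge order as a placeholder height; that is an assumption,
and the resulting plot shows the topology only, not real Ward distances.
``children_to_linkage`` is a hypothetical helper name.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram


def children_to_linkage(children):
    """Build a scipy-style linkage matrix from a children_-like array."""
    children = np.asarray(children)
    n_merges = children.shape[0]
    n_leaves = n_merges + 1
    counts = np.zeros(n_merges)
    for i, (left, right) in enumerate(children):
        for child in (left, right):
            # A leaf counts for one sample; an internal node for its own count.
            counts[i] += 1 if child < n_leaves else counts[child - n_leaves]
    # Placeholder heights: merge order stands in for the missing distances.
    heights = np.arange(1, n_merges + 1)
    return np.column_stack([children, heights, counts]).astype(float)


# Hypothetical children_ array for 4 samples.
children = np.array([[0, 1], [2, 3], [4, 5]])
Z = children_to_linkage(children)
tree = dendrogram(Z, no_plot=True)  # drop no_plot=True to draw with matplotlib
print(tree["ivl"])  # leaf labels in plotting order
```

If real merge distances are available from the clustering, they can replace
the placeholder ``heights`` column.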

> As a second, but slightly related question, is it possible to use the
> ward on a n_features x n_features matrix (e.g. an adjacency matrix)?
> It works, but I wasn't sure whether these results can be considered as
> meaningful.

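Ward does not work on adjacency matrices because it is specific to the
Euclidean distance: at each step it merges the two clusters that give the
smallest increase in within-cluster variance. Other hierarchical clustering
methods such as complete linkage would work.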

Complete linkage is not implemented in the scikit, but it only requires a
simple modification to the code doing Ward. I need to find time to do it
(TM).
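
In the meantime, scipy does provide complete linkage on a precomputed
distance matrix. A minimal sketch follows; the toy similarity matrix and the
``1 - similarity`` conversion to a distance are assumptions made for the
example, and the right conversion depends on what the adjacency values mean.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical symmetric similarity (adjacency-like) matrix with values in [0, 1].
similarity = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
])

# Assumed conversion: high similarity means small distance.
distance = 1.0 - similarity
np.fill_diagonal(distance, 0.0)

# linkage expects the condensed (upper-triangle) form of the distance matrix.
Z = linkage(squareform(distance), method="complete")

# Cut the tree into two flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```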

HTH,

Gael

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
