I've spent some time on this problem and have a 44-line solution for
generating dendrogram line segments and root ending locations, i.e. x
and y values. The format for cluster information is a nested tuple like
this:

    cluster=(4.5,(3.0,'c',(1.0,'a','b')),(2.0,'e',(1.5,'f','g')))

where the FP numbers are distance information and the strings are the
names of the items being clustered. The code can also handle cluster
data without distance information (pass hasDistances='no'), by assuming
a fixed distance of 1.0:

    cluster=(('c',('a','b')),('e',('f','g')))
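
To make the two input formats concrete, here is a minimal sketch that
lists the item names in either kind of nested tuple. It uses a plain
list as a stack rather than recursion. The helper name leaf_names is
made up for this illustration and is not part of the dendrogram code;
it simply treats strings as items, tuples as clusters, and skips
anything else (the distances).

```python
def leaf_names(ctree):
    """Return item names in the order they are written, without recursion.

    Hypothetical helper for illustration only: strings are items,
    tuples are clusters, and any other value (a distance) is skipped.
    """
    names = []
    stack = [ctree]  # explicit stack instead of recursive calls
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            names.append(node)
        elif isinstance(node, tuple):
            # Push children reversed so the leftmost child is popped first.
            stack.extend(reversed(node))
    return names


with_dist = (4.5, (3.0, 'c', (1.0, 'a', 'b')), (2.0, 'e', (1.5, 'f', 'g')))
no_dist = (('c', ('a', 'b')), ('e', ('f', 'g')))

print(leaf_names(with_dist))
print(leaf_names(no_dist))
```

Both calls list the same six names, since the two tuples describe the
same grouping with and without distances.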

The output is two lists: root location and item tuples, (x,y,item), and
dendrogram line segment tuples, (x1,y1,x2,y2). I've purposely avoided
recursion back into the function by using lists as stacks, because I'm
always a little leery of how much space is available on the [virtual]
machine stack.

Note that this code has not been extensively tested. One limitation of
this dendrogram code is that only pairs of objects may be clustered
together at a time. Obviously more work is needed to turn these data
into a Matplotlib plot, and I don't know how to do that. The output
could also be used to draw dendrograms in SVG or PostScript, on
mechanical plotters, or in any other vector-oriented graphics system.

Anyway, FWIW, my code is listed below. I'm sure it can be improved
upon. In hopes of someone doing something useful for others with it, I
hereby release it under the Matplotlib license, while retaining the
copyright for my own additional use. Please let me know if there is a
better way to submit the code.

--Tim

def dendrogram(ctree, hasDistances='yes', yincr=1.0):
    # Returns (baselist, linelist):
    #   baselist: (x, y, item) tuples, one per clustered item
    #   linelist: (x1, y1, x2, y2) line segment tuples
    tstack = [ctree[:]]  ## tree stack, used instead of recursion
    nstack = []          ## node stack
    baselist = []
    linelist = []
    y = 0.0
    while len(tstack) > 0:
        tob = tstack.pop()
        if hasDistances == 'yes':
            dist = tob[0]
            tob = tob[1:]
        elif hasDistances == 'ignore':
            dist = 1.0   ## distances are present but not used
            tob = tob[1:]
        elif hasDistances == 'no':
            dist = 1.0
        else:
            raise ValueError("unknown value '%s' for named argument "
                             "'hasDistances'" % hasDistances)
        obflag = False
        for ob in tob:
            if isinstance(ob, str):
                ## an item: record its location and push it as a node
                baselist.append((0.0, y, ob))
                nstack.append((0.0, y, dist))
                y += yincr
            else:
                ## a nested cluster: handle it on a later pass
                tstack.append(ob)
                obflag = True
        ## a 1-tuple marks a cluster whose children are still pending
        if obflag: nstack.append((dist,))
        ## join the top two nodes for as long as both are complete
        while len(nstack) > 1 and len(nstack[-1]) > 1 and len(nstack[-2]) > 1:
            x1, y1, d1 = nstack.pop()
            x2, y2, d2 = nstack.pop()
            d = max(d1, d2)
            xnew = max(x1, x2) + d
            ynew = (y1 + y2) / 2.0
            linelist.append((x1, y1, xnew, y1))
            linelist.append((x2, y2, xnew, y2))
            linelist.append((xnew, y1, xnew, y2))
            if len(nstack) > 0 and len(nstack[-1]) <= 1:
                dist = nstack.pop()[0]
            nstack.append((xnew, ynew, dist))
    return baselist, linelist

if __name__ == "__main__":
    baselist, linelist = dendrogram(
        (4.5,(3.0,'c',(1.0,'a','b')),(2.0,'e',(1.5,'f','g'))) )
    print(baselist)
    print("")
    print(linelist)

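
As a rough illustration of the SVG idea mentioned above, here is a
minimal sketch that turns the two output lists into an SVG string using
only the standard library. The function name segments_to_svg and the
scale, margin and label_width parameters are assumptions made for this
example, and the sample data is hand-written for two items rather than
taken from a real clustering run.

```python
from xml.sax.saxutils import escape


def segments_to_svg(baselist, linelist, scale=40.0, margin=20.0,
                    label_width=40.0):
    """Build an SVG string from (x,y,item) and (x1,y1,x2,y2) tuples.

    Hypothetical helper: the layout constants are arbitrary choices.
    """
    xmax = max([0.0] + [max(x1, x2) for x1, y1, x2, y2 in linelist])
    ymax = max([0.0] + [max(y1, y2) for x1, y1, x2, y2 in linelist]
               + [y for x, y, item in baselist])
    width = label_width + xmax * scale + 2 * margin
    height = ymax * scale + 2 * margin

    def px(x):
        # Data x to pixel x, leaving room on the left for item labels.
        return label_width + margin + x * scale

    def py(y):
        # Data y to pixel y (SVG y grows downward).
        return margin + y * scale

    parts = ['<svg xmlns="http://www.w3.org/2000/svg" '
             'width="%g" height="%g">' % (width, height)]
    for x1, y1, x2, y2 in linelist:
        parts.append('<line x1="%g" y1="%g" x2="%g" y2="%g" '
                     'stroke="black"/>' % (px(x1), py(y1), px(x2), py(y2)))
    for x, y, item in baselist:
        # Right-align each label just to the left of its line start.
        parts.append('<text x="%g" y="%g" text-anchor="end" '
                     'dominant-baseline="middle">%s</text>'
                     % (px(x) - 4, py(y), escape(str(item))))
    parts.append('</svg>')
    return "\n".join(parts)


# Hand-written sample: two items joined at distance 1.0.
sample_base = [(0.0, 0.0, 'a'), (0.0, 1.0, 'b')]
sample_lines = [(0.0, 0.0, 1.0, 0.0),
                (0.0, 1.0, 1.0, 1.0),
                (1.0, 0.0, 1.0, 1.0)]
svg = segments_to_svg(sample_base, sample_lines)
```

Writing svg to a file with a .svg extension should give something a
browser can display. The same two loops could feed any other
line-drawing backend.
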
Jouni K. Seppänen wrote:
> Timothy <[EMAIL PROTECTED]> writes:
>
>
>> It appears matplotlib does not have a dendrogram plot. I may be
>> interested in developing one, if I can get a sense for what it would
>> take. Could someone suggest some code I could look at as a model for
>> developing a new plot?
>>
>
> In the file axes.py, search for the comment "Specialized plotting" and
> look at the functions after that. The first function is "bar", which
> looks quite complicated, but perhaps "stem" would be a good starting
> point.
>
>