I'm rather rusty on graph theory, but I'm in a mood to take a whack at this anyway...
Use any convenient data structure to represent your graph. The nodes are the songs, the edges are the "relationships" between songs. You can can model the relationships in several ways. Starting with song A 1) As described, pick out the one "most similar" song B from a subset of the other nodes. This increases the weight of edge AB by one. 2) You can have the user weight the similarity of B1,B2,...Bn to A. The final weight of AB is the average weight given by users who examined those two songs together. 3) Probably a bunch more. There are two algorithms to play with. Finding connected components (there is a non-zero weight path between every node) and finding cliques (there is a non-zero weight edge between each node). My guess is that connectedness will be more interesting than cliques. You can adjust the sensitivity of the algorithm (by reducing the weights of edges, or explicitly building a threshold into the algorithm) to find more-strongly linked regions. Another possibility (especially if you are using percentage weights, like in scenario two) is to play with distance algorithms. Invert the weights, so that two completely dissimilar songs are at infinite distance, and closely related songs have low edge weights. Compute shortest paths. This could be combined with connected components to pick out the "exemplar" of the component: take the song with the shortest total distance to all others in the component. The details of the algorithms can be found in any algorithm text book. --kag _______________________________________________ Boston-pm mailing list [EMAIL PROTECTED] http://mail.pm.org/mailman/listinfo/boston-pm

