Who can I prod about setting up a UDF repo at MySQL? I think 'they' should do this ;)
http://lists.mysql.com/community/97

Anyway, I am posting this request to 'community' because I still don't know the appropriate place to post UDF-related stuff.

This is another (potentially crazy) idea for a UDF that I would find very useful in my research...

AGGLOM - Simple agglomerative clustering for MySQL ...

The UDF would work on any numeric column and return the number of 'clusters', using agglomerative clustering with a given threshold as an input.

Agglomerative clustering merges any two numbers that are within the 'threshold' of each other and replaces them with the average of the two. The clustering proceeds smallest 'gap' first and stops when no two numbers are within the threshold. The result would be the number (or perhaps the values) of the remaining clusters.

Syntax (suggested)

    AGGLOM(expr, THRESH)

where expr returns a number and THRESH is the merge threshold.

For example

Table1

    C1  C2
    A   1
    A   2
    A   3
    A   4
    A   5
    A   6
    A   7
    B   10
    B   11
    B   12
    B   56
    B   57
    B   58
    B   99
    B   101

    SELECT C1, AGGLOM(C2,1) AS C3 FROM Table1 GROUP BY C1;

    C1  C3
    A   4
    B   6

    SELECT C1, AGGLOM(C2,2) AS C3 FROM Table1 GROUP BY C1;

    C1  C3
    A   3
    B   3

    SELECT C1, AGGLOM(C2,3) AS C3 FROM Table1 GROUP BY C1;

    C1  C3
    A   2
    B   3

    SELECT C1, AGGLOM(C2,4) AS C3 FROM Table1 GROUP BY C1;

    C1  C3
    A   1
    B   3

    SELECT C1, AGGLOM(C2,50) AS C3 FROM Table1 GROUP BY C1;

    C1  C3
    A   1
    B   1

Remember: merge the numbers with the smallest difference first, and replace each pair with the average of the two. Recalculate the differences for the new number, and repeat until no two numbers are within the threshold.

This is a useful clustering 'hack' to see whether a distribution is bi-modal or multi-modal, for example. It is very quick to calculate using a hash table, and could be a great function to add.

Is this idea as crazy as I think it might be?

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
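
For anyone who wants to try the idea before a UDF exists, the merge procedure described in the post can be sketched in a few lines of Python. This is only an illustration, not a MySQL UDF: the function name `agglom` is hypothetical, "within the threshold" is assumed to be inclusive (gap <= threshold), and ties between equal gaps are assumed to go to the leftmost pair. The post does not pin down either choice, and cluster counts can depend on both, so they may not match every hand-worked figure in the example tables. The sketch also rescans the gaps on every pass for clarity rather than using the hash-table approach the post mentions.

```python
def agglom(values, threshold):
    """Return the cluster centres left after threshold-limited merging."""
    # Work on a sorted copy: in one dimension the closest pair is always
    # a pair of neighbours, and averaging two neighbours keeps the order.
    centres = sorted(float(v) for v in values)
    while len(centres) > 1:
        # Gap between each pair of neighbouring centres.
        gaps = [b - a for a, b in zip(centres, centres[1:])]
        smallest = min(gaps)
        # Assumption: "within the threshold" means gap <= threshold.
        if smallest > threshold:
            break
        # Assumption: equal gaps are resolved by taking the leftmost pair.
        i = gaps.index(smallest)
        # Replace the pair with its plain (unweighted) average,
        # as the post describes.
        centres[i:i + 2] = [(centres[i] + centres[i + 1]) / 2]
    return centres


group_b = [10, 11, 12, 56, 57, 58, 99, 101]
print(agglom(group_b, 2))       # [11.25, 57.25, 100.0]
print(len(agglom(group_b, 2)))  # 3
```

Taking `len()` of the result gives the cluster count that AGGLOM would return; the list itself corresponds to the "or perhaps the values" variant.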