I've been looking into marker clustering algorithms, as some of my projects will use them. All of the ones I've looked at seem to use similar math, which leads to what I would call "cartographic" issues. I'm sure people know of these issues (see examples below), and they are somewhat inherent to the algorithms, particularly if performance is a high priority. However, I'm wondering if anyone knows of other approaches that alleviate them. Or, more broadly, are there any good web pages that examine and compare the pros and cons of various approaches?
Here are some of the issues. These apply to the distance-based algorithms -- I'm not considering grid-based ones. Mostly, the issues result from using a marker's location as the cluster location and from considering markers one by one, in order. Note that the issues do not mean that clustering doesn't work, just that the results are not the best cartographically.

For the examples below, X is the clustering distance and E is some small epsilon distance.

Ex. #1: Different order produces different clusters

Consider the points:
  (1) <-- X-E --> (2) <-- X-E --> (3)
Which cluster as:
  (1&2) <-- 2X-2E --> (3)
Reordering to:
  (2) <-- X-E --> (1) <-- X-E --> (3)
Clusters as:
  (1&2&3)

Ex. #2: Same issue, but a less trivial case than above

Points:    (1) <------ X-E ------> (2) <-- 4E --> (3) <------ X-E ------> (4)
Clusters:  (1&2) <-------- X+3E --------> (3&4)
Ideal(?):  (1) <------ X+E ------> (2&3) <------ X+E ------> (4)

In the "Clusters" line, (1&2) sits at marker 1 and (3&4) sits at marker 3, so markers 2 and 3, which are only 4E apart, end up in different clusters.

(Hopefully the formatting is okay.)
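To make the examples concrete, here is a minimal sketch in Python of the kind of one-pass, distance-based clustering I mean. It is not taken from any particular library. The function name greedy_cluster, the 1-D positions, the values X = 10 and E = 1, and the "join the first cluster within range" rule are all assumptions chosen just to reproduce the two examples.

```python
def greedy_cluster(points, max_dist):
    """One-pass greedy clustering of (label, position) pairs on a line.

    Assumptions for illustration: positions are 1-D, each cluster is
    located at the position of its first marker, and a marker joins the
    first existing cluster whose location is within max_dist.
    """
    clusters = []  # list of (cluster_location, [member labels])
    for label, pos in points:
        for location, members in clusters:
            if abs(pos - location) <= max_dist:
                members.append(label)
                break
        else:
            # No cluster in range: this marker starts a new cluster.
            clusters.append((pos, [label]))
    return clusters


X, E = 10.0, 1.0  # hypothetical clustering distance and epsilon

# Ex. #1: same three locations, processed in two different orders.
ex1_a = greedy_cluster([(1, 0.0), (2, X - E), (3, 2 * X - 2 * E)], X)
ex1_b = greedy_cluster([(1, X - E), (2, 0.0), (3, 2 * X - 2 * E)], X)

# Ex. #2: markers 2 and 3 are only 4E apart but land in different clusters.
ex2 = greedy_cluster(
    [(1, 0.0), (2, X - E), (3, X + 3 * E), (4, 2 * X + 2 * E)], X
)

print(ex1_a)  # two clusters: members [1, 2] and [3]
print(ex1_b)  # one cluster: members [1, 2, 3]
print(ex2)    # two clusters: members [1, 2] and [3, 4]
```

Because the cluster location is fixed at the first marker and never recomputed, the result depends entirely on processing order, which is the root of both examples.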
