I'm not sure whether k-means would converge with this customized distance measure. You could instead add the (weighted) time as a feature alongside the coordinates and then use Euclidean distance. For other supported distance measures, you can check Derrick's package:
http://spark-packages.org/package/derrickburns/generalized-kmeans-clustering

-Xiangrui
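To illustrate the weighted-time suggestion, here is a minimal sketch in plain Java. It is not code from the thread or from any library: the class name `SpaceTimeFeatures`, the helper names, and the `timeWeight` parameter are hypothetical. It also treats latitude and longitude as flat coordinates, which is only a rough approximation over small areas.

```java
/** Sketch: fold a weighted timestamp into the feature vector, then use Euclidean distance. */
final class SpaceTimeFeatures {

    /**
     * Builds the vector (lat, lon, timeWeight * timeSeconds).
     * timeWeight is a hypothetical tuning knob: it sets how many coordinate
     * units one second of separation should count as.
     */
    static double[] toFeatures(double lat, double lon, double timeSeconds, double timeWeight) {
        return new double[] { lat, lon, timeWeight * timeSeconds };
    }

    /** Plain Euclidean distance between two vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Assumption for illustration: one hour apart counts like one degree apart.
        double timeWeight = 1.0 / 3600.0;
        double[] p = toFeatures(52.52, 13.40, 0.0, timeWeight);
        double[] q = toFeatures(52.52, 13.40, 7200.0, timeWeight);
        System.out.println(euclidean(p, q));
    }
}
```

Unlike the product timeDis*geoDis, which is zero whenever either factor is zero, this distance is zero only when both the location and the time coincide. The weight still has to be chosen for the data at hand.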
On Mon, May 18, 2015 at 2:30 AM, Pa Rö <paul.roewer1...@googlemail.com> wrote:
> Hello,
>
> I want to cluster geo data (lat, long, timestamp) with k-means. I am looking
> for a good core (distance) function, but I cannot find a good paper or other
> source for that. At the moment I multiply the time distance and the space
> distance:
>
> public static double dis(GeoData input1, GeoData input2)
> {
>     double timeDis = Math.abs( input1.getTime() - input2.getTime() );
>     double geoDis = geoDis(input1, input2); // extra function
>     return timeDis*geoDis;
> }
>
> Does anyone know a good core function for clustering temporal geo data?
> (I need a citation.)