fmcquillan99 commented on issue #433: Kmeans: Add automatic optimal cluster estimation
URL: https://github.com/apache/madlib/pull/433#issuecomment-529595386

minor verbosity thing

```
madlib=# SELECT madlib.kmeanspp_auto(
madlib(#     'km_sample',
madlib(#     'k_auto1',
madlib(#     'points',
madlib(#     ARRAY[2,3],
madlib(#     'madlib.squared_dist_norm2',
madlib(#     'madlib.avg',
madlib(#     20,
madlib(#     0.001,
madlib(#     1.0,
madlib(#     'elbow'
madlib(# );
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'k' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CONTEXT:  SQL statement "
    CREATE TABLE k_auto1 (
        k INTEGER,
        centroids DOUBLE PRECISION[][],
        cluster_variance DOUBLE PRECISION[],
        objective_fn DOUBLE PRECISION,
        frac_reassigned DOUBLE PRECISION,
        num_iterations INTEGER
        , elbow DOUBLE PRECISION)
    "
PL/Python function "kmeanspp_auto"
 kmeanspp_auto
---------------

(1 row)
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
