fmcquillan99 commented on issue #433: Kmeans: Add automatic optimal cluster estimation
URL: https://github.com/apache/madlib/pull/433#issuecomment-529595386
 
 
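   Minor verbosity issue: the call below prints a Greenplum NOTICE (with HINT and CONTEXT) about the missing 'DISTRIBUTED BY' clause when it creates the output table.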
   ```
   madlib=# SELECT madlib.kmeanspp_auto(
   madlib(# 'km_sample',
   madlib(# 'k_auto1',
   madlib(# 'points', 
   madlib(# ARRAY[2,3],
   madlib(# 'madlib.squared_dist_norm2',
   madlib(# 'madlib.avg', 
   madlib(# 20, 
   madlib(# 0.001,
   madlib(# 1.0,
   madlib(# 'elbow'
   madlib(# );
   NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'k' as the Greenplum Database data distribution key for this table.
   HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
   CONTEXT:  SQL statement "
           CREATE TABLE k_auto1 (
               k INTEGER,
               centroids   DOUBLE PRECISION[][],
               cluster_variance    DOUBLE PRECISION[],
               objective_fn    DOUBLE PRECISION,
               frac_reassigned DOUBLE PRECISION,
               num_iterations  INTEGER
               
               , elbow DOUBLE PRECISION)
           "
   PL/Python function "kmeanspp_auto"
    kmeanspp_auto 
   ---------------
    
   (1 row)
   
   ```
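
   A possible caller-side workaround, as a sketch only: assuming the message is emitted at the standard NOTICE level, raising `client_min_messages` for the session should hide it along with its HINT and CONTEXT lines. This only silences the output for the client; it does not change how `k_auto1` is distributed, and it is not a fix inside `kmeanspp_auto` itself. The arguments are copied from the call above, and a fresh output table name would be needed if `k_auto1` already exists.
   ```sql
   -- Assumption: the 'DISTRIBUTED BY' message is sent at NOTICE level,
   -- so a threshold of WARNING keeps it from reaching the client.
   SET client_min_messages TO warning;

   SELECT madlib.kmeanspp_auto(
       'km_sample',
       'k_auto1',
       'points',
       ARRAY[2,3],
       'madlib.squared_dist_norm2',
       'madlib.avg',
       20,
       0.001,
       1.0,
       'elbow'
   );

   -- Restore the session default so later NOTICEs are visible again.
   RESET client_min_messages;
   ```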
