fmcquillan99 edited a comment on issue #352: Feature/kd tree knn
URL: https://github.com/apache/madlib/pull/352#issuecomment-463797009
 
 
   This returns a result:
   ```
   DROP TABLE IF EXISTS knn_result_classification_kd;
   SELECT madlib.knn(
                   'knn_train_data',        -- Table of training data
                   'data',                  -- Col name of training data
                   'id',                    -- Col name of id in train data
                   NULL,                    -- Training labels
                   'knn_test_data',         -- Table of test data
                   'data',                  -- Col name of test data
                   'id',                    -- Col name of id in test data
                   'knn_result_classification_kd',  -- Output table
                    1,                      -- Number of nearest neighbors
                    True,                   -- True to list nearest-neighbors by id
                    'madlib.squared_dist_norm2', -- Distance function
                    False,                  -- For weighted average
                    'kd_tree',              -- Use kd-tree
                    'depth=1, leaf_nodes=1' -- Kd-tree options
                    );
   SELECT * FROM knn_result_classification_kd ORDER BY id;
   ```
   produces
   ```
    id |  data   | k_nearest_neighbours 
   ----+---------+----------------------
     1 | {2,1}   | {2}
     2 | {2,6}   | {3}
     3 | {15,40} | {7}
     4 | {12,1}  | {4}
     5 | {2,90}  | {9}
     6 | {50,45} | {6}
   (6 rows)
   ```
   though I have not verified that this result is correct.
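
   One way to check it would be a brute-force comparison that bypasses the kd-tree. The query below is a minimal sketch and not part of the PR. It assumes both tables have the `id` and `data` columns used above, casts `data` to `float8[]` for `madlib.squared_dist_norm2`, and breaks distance ties by the lower training id, which may differ from how `madlib.knn` breaks ties.

   ```sql
   -- Brute-force 1-nearest-neighbor check (sketch; does not use the kd-tree).
   -- For each test point, keep the single training row with the smallest distance.
   SELECT DISTINCT ON (tst.id)
          tst.id,
          tst.data,
          ARRAY[trn.id] AS k_nearest_neighbours
   FROM knn_test_data AS tst
   CROSS JOIN knn_train_data AS trn
   ORDER BY tst.id,
            madlib.squared_dist_norm2(trn.data::float8[], tst.data::float8[]),
            trn.id;  -- assumed tie-break: lower training id wins
   ```

   Because the kd-tree call above searches only a subset of leaf nodes, its answer may be approximate, so a few mismatches against this query would not necessarily indicate a bug. An empty result set would.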
   
   But if I search 31 of the 32 leaf nodes (depth=5), I get an empty result set:
   ```
   DROP TABLE IF EXISTS knn_result_classification_kd;
   SELECT madlib.knn(
                   'knn_train_data',        -- Table of training data
                   'data',                  -- Col name of training data
                   'id',                    -- Col name of id in train data
                   NULL,                    -- Training labels
                   'knn_test_data',         -- Table of test data
                   'data',                  -- Col name of test data
                   'id',                    -- Col name of id in test data
                   'knn_result_classification_kd',  -- Output table
                    1,                      -- Number of nearest neighbors
                    True,                   -- True to list nearest-neighbors by id
                    'madlib.squared_dist_norm2', -- Distance function
                    False,                  -- For weighted average
                    'kd_tree',              -- Use kd-tree
                    'depth=5, leaf_nodes=31' -- Kd-tree options
                    );
   SELECT * FROM knn_result_classification_kd ORDER BY id;
   ```
   produces
   ```
    id | data | k_nearest_neighbours 
   ----+------+----------------------
   (0 rows)
   ```
   which does not seem right.
   
   In fact, after more testing, I can't get any results for a depth greater than 1:
   ```
   DROP TABLE IF EXISTS knn_result_classification_kd;
   SELECT madlib.knn(
                   'knn_train_data',        -- Table of training data
                   'data',                  -- Col name of training data
                   'id',                    -- Col name of id in train data
                   NULL,                    -- Training labels
                   'knn_test_data',         -- Table of test data
                   'data',                  -- Col name of test data
                   'id',                    -- Col name of id in test data
                   'knn_result_classification_kd',  -- Output table
                    1,                      -- Number of nearest neighbors
                    True,                   -- True to list nearest-neighbors by id
                    'madlib.squared_dist_norm2', -- Distance function
                    False,                  -- For weighted average
                    'kd_tree',              -- Use kd-tree
                    'depth=2, leaf_nodes=1' -- Kd-tree options
                    );
   SELECT * FROM knn_result_classification_kd ORDER BY id;
   ```
   produces
   ```
    id | data | k_nearest_neighbours 
   ----+------+----------------------
   (0 rows)
   ```
   
   
   
   
