GitHub user fmcquillan99 commented on the issue:
https://github.com/apache/madlib/pull/315
I'm not sure what this is doing:
```
%%sql
DROP TABLE IF EXISTS knn_result_classification;
SELECT * FROM madlib.knn(
'knn_train_data', -- Table of training data
'array[99.]::int[] || array[99]', -- Col name of training data
'id', -- Col name of id in train data
'label', -- Training labels
'knn_test_data', -- Table of test data
'data', -- Col name of test data
'id', -- Col name of id in test data
'knn_result_classification', -- Output table
1, -- Number of nearest neighbors
True, -- True to list nearest-neighbors by id
'madlib.squared_dist_norm2' -- Distance function
);
SELECT * from knn_result_classification ORDER BY id;
```
produces
```
id | data | prediction | k_nearest_neighbours
----+---------+------------+----------------------
1 | {2,1} | 0 | {8}
2 | {2,6} | 0 | {8}
3 | {15,40} | 0 | {8}
4 | {12,1} | 0 | {8}
5 | {2,90} | 1 | {1}
6 | {50,45} | 1 | {1}
(6 rows)
```
I get the same result if I do:
```
DROP TABLE IF EXISTS knn_result_classification;
SELECT * FROM madlib.knn(
'knn_train_data', -- Table of training data
'array[0.]::int[] || array[0]', -- Col name of training data
'id', -- Col name of id in train data
'label', -- Training labels
'knn_test_data', -- Table of test data
'data', -- Col name of test data
'id', -- Col name of id in test data
'knn_result_classification', -- Output table
1, -- Number of nearest neighbors
True, -- True to list nearest-neighbors by id
'madlib.squared_dist_norm2' -- Distance function
);
SELECT * from knn_result_classification ORDER BY id;
```
gives
```
id | data | prediction | k_nearest_neighbours
----+---------+------------+----------------------
1 | {2,1} | 0 | {8}
2 | {2,6} | 0 | {8}
3 | {15,40} | 0 | {8}
4 | {12,1} | 0 | {8}
5 | {2,90} | 1 | {1}
6 | {50,45} | 1 | {1}
(6 rows)
```
---