GitHub user fmcquillan99 commented on the issue:
https://github.com/apache/madlib/pull/315
(1)
expression for test data array:
```
DROP TABLE IF EXISTS knn_result_classification;
SELECT * FROM madlib.knn(
'knn_train_data', -- Table of training data
'data', -- Col name of training data
'id', -- Col name of id in train data
'label', -- Training labels
'knn_test_data', -- Table of test data
'3 || ARRAY[4]', -- Col name of test data
'id', -- Col name of id in test data
'knn_result_classification', -- Output table
3, -- Number of nearest neighbors
True, -- True to list nearest-neighbors by id
'madlib.squared_dist_norm2' -- Distance function
);
SELECT * FROM knn_result_classification ORDER BY id;
```
produces
```
id | 3 || ARRAY[4] | prediction | k_nearest_neighbours
----+---------------+------------+----------------------
1 | {3,4} | 1 | {3,4,5}
2 | {3,4} | 1 | {3,4,5}
3 | {3,4} | 1 | {3,4,5}
4 | {3,4} | 1 | {4,3,5}
5 | {3,4} | 1 | {3,4,5}
6 | {3,4} | 1 | {4,3,5}
(6 rows)
```
(2)
another expression for test data array:
```
DROP TABLE IF EXISTS knn_result_classification;
SELECT * FROM madlib.knn(
'knn_train_data', -- Table of training data
'data', -- Col name of training data
'id', -- Col name of id in train data
'label', -- Training labels
'knn_test_data', -- Table of test data
'array[3.]::int[] || array[4]', -- Col name of test data
'id', -- Col name of id in test data
'knn_result_classification', -- Output table
3, -- Number of nearest neighbors
True, -- True to list nearest-neighbors by id
'madlib.squared_dist_norm2' -- Distance function
);
SELECT * FROM knn_result_classification ORDER BY id;
```
produces
```
id | array[3.]::int[] || array[4] | prediction | k_nearest_neighbours
----+------------------------------+------------+----------------------
1 | {3,4} | 1 | {3,4,5}
2 | {3,4} | 1 | {3,4,5}
3 | {3,4} | 1 | {4,3,5}
4 | {3,4} | 1 | {3,4,5}
5 | {3,4} | 1 | {4,3,5}
6 | {3,4} | 1 | {4,3,5}
(6 rows)
```
So passing an expression for the test data array seems to work.
---