GitHub user njayaram2 commented on a diff in the pull request:
https://github.com/apache/incubator-madlib/pull/80#discussion_r92718644
--- Diff: src/ports/postgres/modules/knn/knn.sql_in ---
@@ -0,0 +1,126 @@
+/* ----------------------------------------------------------------------- *//**
+ *
+ * @file knn.sql_in
+ *
+ * @brief Set of functions for k-nearest neighbors.
+ *
+ *
+ *//* ----------------------------------------------------------------------- */
+
+m4_include(`SQLCommon.m4')
+
+DROP TYPE IF EXISTS MADLIB_SCHEMA.knn_result CASCADE;
+CREATE TYPE MADLIB_SCHEMA.knn_result AS (
+ prediction float
+);
+DROP TYPE IF EXISTS MADLIB_SCHEMA.test_table_spec CASCADE;
+CREATE TYPE MADLIB_SCHEMA.test_table_spec AS (
+ id integer,
+ vector DOUBLE PRECISION[]
+);
+
+CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__knn_validate_src(
+rel_source VARCHAR
+) RETURNS VOID AS $$
+ PythonFunction(knn, knn, knn_validate_src)
+$$ LANGUAGE plpythonu
+m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `READS SQL DATA', `');
+
--- End diff --
I understand this is just the first version of the code, but
it would be good to include some user-facing documentation.
Try adding a function that shows help messages when the user
calls `select madlib.knn()` or `select madlib.knn('help')`.
It would also be great to include a small example dataset,
for instance small train and test tables. Functions that
display these help messages are typically written in Python;
check out other modules for ideas.
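To make the suggestion concrete, here is a minimal sketch of what such a help function could look like on the Python side. The name `knn_help`, its signature, and all of the message text are assumptions for illustration, not the module's actual API, and the argument list of `knn` is deliberately left elided since it is not part of this diff.

```python
# Hypothetical sketch of a help function for the knn module.
# The function name, signature, and message text are assumptions,
# not taken from the pull request.

def knn_help(schema_madlib, message=None, **kwargs):
    """Return help text for knn, selected by the `message` argument."""
    summary = (
        "k-nearest neighbors (knn)\n"
        "For usage details:   SELECT {schema}.knn('usage');\n"
        "For a small example: SELECT {schema}.knn('example');"
    )
    usage = (
        "Usage\n"
        "SELECT {schema}.knn(...);\n"
        "The argument list is left elided in this sketch; document each\n"
        "argument here once the function signature is final."
    )
    example = (
        "Example\n"
        "Create small train and test tables (for instance knn_train_data\n"
        "and knn_test_data), then call {schema}.knn on them."
    )

    # Normalize the request so that None, '', and ' Help ' are all handled.
    key = (message or "").strip().lower()
    if key in ("usage", "help", "?"):
        text = usage
    elif key in ("example", "examples"):
        text = example
    else:
        # No argument or an unrecognized one: fall back to the summary.
        text = summary
    return text.format(schema=schema_madlib)
```

Calling `knn_help("madlib")` returns the short summary, while `knn_help("madlib", "help")` returns the usage text. A SQL wrapper would still be needed to expose this as `madlib.knn()` and `madlib.knn('help')`.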
FYI, I created the following train and test tables locally
when I tried to run this code:
```sql
-- Training table: 2-D integer feature vectors with a 0/1 label.
drop table if exists knn_train_data;
create table knn_train_data (
id integer,
data integer[],
label float);
copy knn_train_data (id, data, label) from stdin delimiter '|';
1|{1,1}|1.0
2|{2,2}|1.0
3|{3,3}|1.0
4|{4,4}|1.0
5|{4,5}|1.0
6|{20,50}|0.0
7|{10,31}|0.0
8|{81,13}|0.0
9|{1,111}|0.0
\.
-- Test table: unlabeled 2-D points to classify.
drop table if exists knn_test_data;
create table knn_test_data (
id integer,
data integer[]);
copy knn_test_data (id, data) from stdin delimiter '|';
1|{2,1}
2|{2,6}
3|{15,40}
4|{12,1}
5|{2,90}
6|{50,45}
\.
```
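For a quick sanity check of results on these tables, a brute-force reference in plain Python may be handy. This is only a sketch: it assumes squared Euclidean distance and a simple majority vote over the k nearest labels, which may differ from what the module in this PR does.

```python
# Brute-force k-nearest-neighbor reference on the sample rows above.
# Assumptions (not from the PR): squared Euclidean distance and a
# majority vote over the k nearest labels.
from collections import Counter

# (data, label) pairs copied from knn_train_data.
train = [
    ((1, 1), 1.0), ((2, 2), 1.0), ((3, 3), 1.0), ((4, 4), 1.0), ((4, 5), 1.0),
    ((20, 50), 0.0), ((10, 31), 0.0), ((81, 13), 0.0), ((1, 111), 0.0),
]
# data column copied from knn_test_data, in id order.
test = [(2, 1), (2, 6), (15, 40), (12, 1), (2, 90), (50, 45)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(point, k):
    """Return the most common label among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: squared_distance(point, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


predictions = [knn_predict(p, k=3) for p in test]
print(predictions)
```

Under these assumptions and with k = 3, the predicted labels for test ids 1 through 6 are 1.0, 1.0, 0.0, 1.0, 0.0, 0.0.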