Github user njayaram2 commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/80#discussion_r92718644
  
    --- Diff: src/ports/postgres/modules/knn/knn.sql_in ---
    @@ -0,0 +1,126 @@
    +/* ----------------------------------------------------------------------- *//**
    + *
    + * @file knn.sql_in
    + *
    + * @brief Set of functions for k-nearest neighbors.
    + *
    + *
    + *//* ----------------------------------------------------------------------- */
    +
    +m4_include(`SQLCommon.m4')
    +
    +DROP TYPE IF EXISTS MADLIB_SCHEMA.knn_result CASCADE;
    +CREATE TYPE MADLIB_SCHEMA.knn_result AS (
    +    prediction float
    +);
    +DROP TYPE IF EXISTS MADLIB_SCHEMA.test_table_spec CASCADE;
    +CREATE TYPE MADLIB_SCHEMA.test_table_spec AS (
    +    id integer,
    +    vector DOUBLE PRECISION[]
    +);
    +
    +CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__knn_validate_src(
    +rel_source VARCHAR
    +) RETURNS VOID AS $$
    +    PythonFunction(knn, knn, knn_validate_src)
    +$$ LANGUAGE plpythonu
    +m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `READS SQL DATA', `');
    +
    --- End diff --
    
    I understand this is just the first version of this code, but
    it would be good to include some user-facing documentation.
    Try to add a function that shows the help messages when the
    user calls `select madlib.knn()` or `select madlib.knn('help')`.
    
    It would also be great to include some small example data,
    for instance small train and test tables. The functions that
    display these help messages are typically written in Python;
    check out other modules to get some ideas.
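
As a rough illustration of the help function suggested above, here is a minimal Python sketch. The name `knn_help`, its parameters, and the message keywords (`usage`, `help`, `?`, `example`) are assumptions made for illustration; they are not taken from this pull request. The argument list of `knn` is not shown in the diff, so it is left as a placeholder.

```python
def knn_help(schema_madlib, message=None):
    """Return a help string for the knn module (hypothetical helper).

    schema_madlib: name of the schema MADlib is installed in.
    message: None or empty for a summary, 'usage'/'help'/'?' for the
             usage text, 'example' for pointers to sample data.
    """
    summary = (
        "k-nearest neighbors: predicts a value for each test point "
        "from its closest training points.\n"
        "For usage details:   SELECT {schema}.knn('usage');\n"
        "For a small example: SELECT {schema}.knn('example');"
    )
    usage = (
        "Usage: SELECT {schema}.knn( <arguments> );\n"
        "(The argument list is a placeholder in this sketch.)"
    )
    example = (
        "Create the sample tables knn_train_data (id, data, label) and\n"
        "knn_test_data (id, data), then call {schema}.knn on them."
    )

    # Normalize the keyword so ' HELP ' and 'help' behave the same way.
    key = (message or "").strip().lower()
    if key in ("usage", "help", "?"):
        text = usage
    elif key == "example":
        text = example
    else:
        # No keyword or an unknown keyword: fall back to the summary.
        text = summary
    return text.format(schema=schema_madlib)


print(knn_help("madlib"))
print(knn_help("madlib", "help"))
```

A SQL wrapper with a text argument (and a zero-argument overload) would then simply return this string, so both `select madlib.knn()` and `select madlib.knn('help')` print something useful.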
    
    FYI, I created the following train and test tables locally
    when I tried to run this code:
    ```sql
    drop table if exists knn_train_data;
    create table knn_train_data (
        id  integer,
        data    integer[],
        label   float);
    copy knn_train_data (id, data, label) from stdin delimiter '|';
    1|{1,1}|1.0
    2|{2,2}|1.0
    3|{3,3}|1.0
    4|{4,4}|1.0
    5|{4,5}|1.0
    6|{20,50}|0.0
    7|{10,31}|0.0
    8|{81,13}|0.0
    9|{1,111}|0.0
    \.
    
    drop table if exists knn_test_data;
    create table knn_test_data (
        id  integer,
        data    integer[]);
    copy knn_test_data (id, data) from stdin delimiter '|';
    1|{2,1}
    2|{2,6}
    3|{15,40}
    4|{12,1}
    5|{2,90}
    6|{50,45}
    \.
    ```
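
To sanity-check results on the tables above, a brute-force reference can be sketched in a few lines of Python. This is not the module's implementation: the choice of k = 3, squared Euclidean distance, and majority voting are all assumptions for illustration, and the function names are hypothetical.

```python
from collections import Counter

# Rows copied from knn_train_data: (data, label)
TRAIN = [
    ((1, 1), 1.0), ((2, 2), 1.0), ((3, 3), 1.0), ((4, 4), 1.0),
    ((4, 5), 1.0), ((20, 50), 0.0), ((10, 31), 0.0),
    ((81, 13), 0.0), ((1, 111), 0.0),
]

# Rows copied from knn_test_data: id -> data
TEST = {1: (2, 1), 2: (2, 6), 3: (15, 40), 4: (12, 1), 5: (2, 90), 6: (50, 45)}


def squared_distance(a, b):
    """Squared Euclidean distance (assumed metric; ordering matches Euclidean)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(point, train, k=3):
    """Majority label among the k training rows closest to point."""
    nearest = sorted(train, key=lambda row: squared_distance(point, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


predictions = {tid: knn_predict(vec, TRAIN) for tid, vec in TEST.items()}
for tid, label in sorted(predictions.items()):
    print(tid, label)
```

Under these assumptions, test points 1, 2 and 4 are labeled 1.0 and points 3, 5 and 6 are labeled 0.0; a different k or distance metric could change that.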


