Github user njayaram2 commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/315#discussion_r215458345
  
    --- Diff: src/ports/postgres/modules/knn/knn.py_in ---
    @@ -53,22 +55,12 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id,
     
         if label_column_name and label_column_name.strip():
             cols_in_tbl_valid(point_source, [label_column_name], 'kNN')
    -    cols_in_tbl_valid(point_source, (point_column_name, point_id), 'kNN')
    -    cols_in_tbl_valid(test_source, (test_column_name, test_id), 'kNN')
    -
    -    if not is_col_array(point_source, point_column_name):
    -        plpy.error("kNN Error: Feature column '{0}' in train table is not"
    -                   " an array.".format(point_column_name))
    -    if not is_col_array(test_source, test_column_name):
    -        plpy.error("kNN Error: Feature column '{0}' in test table is not"
    -                   " an array.".format(test_column_name))
    --- End diff --
    
    I was trying out the function to see how user-friendly the error 
messages are. With the older kNN version, if the feature column was not of 
array type, the error was informative:
    `kNN Error: Feature column 'y' in train table is not an array.`
    With the code in this PR, the error for the same failure is:
    `function array_upper(double precision, integer) does not exist`
    
    I don't think this is informative enough for users, since it exposes an 
internal function call rather than naming the offending column. I would like 
to continue the discussion that has already taken place on this. 
@fmcquillan99 and @hpandeycodeit, any thoughts?
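
To make the concern concrete, here is a minimal sketch of the kind of up-front check that would keep the older message. It is not the code from this PR or from MADlib: `check_feature_column_is_array`, `KnnValidationError`, and the `column_types` dict are hypothetical stand-ins (the real module uses `is_col_array` and `plpy.error`, which need a database connection), and the `[]` suffix test assumes type names formatted like `double precision[]`.

```python
class KnnValidationError(Exception):
    """Hypothetical stand-in for plpy.error, which only exists in PL/Python."""


def check_feature_column_is_array(column_types, column_name, table_label):
    """Raise a readable error if column_name is not an array column.

    column_types is an assumed dict of column name -> type name string,
    e.g. {'x': 'double precision[]'}. In the real module this information
    would come from the database catalog instead.
    """
    col_type = column_types.get(column_name)
    if col_type is None:
        raise KnnValidationError(
            "kNN Error: Column '{0}' not found in {1} table."
            .format(column_name, table_label))
    # Assumption: array types are written with a trailing '[]'.
    if not col_type.endswith('[]'):
        raise KnnValidationError(
            "kNN Error: Feature column '{0}' in {1} table is not"
            " an array.".format(column_name, table_label))


# A scalar feature column is rejected before any array function is called.
try:
    check_feature_column_is_array({'y': 'double precision'}, 'y', 'train')
except KnnValidationError as err:
    message = str(err)
```

Checking the type before the query runs means the user sees the column name and table in the message, instead of a failure raised from inside `array_upper`.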

