Github user njayaram2 commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/315#discussion_r214482318

    --- Diff: src/ports/postgres/modules/knn/knn.py_in ---
    @@ -264,12 +260,17 @@ def knn(schema_madlib, point_source, point_column_name, point_id,
                     SELECT test.{test_id} AS {test_id_temp},
                         train.{point_id} as train_id,
                         {fn_dist}(
    -                        train.{point_column_name},
    -                        test.{test_column_name})
    +                        p_col_name,
    +                        t_col_name)
                         AS dist {label_out}
    -                    FROM {point_source} AS train,
    -                        {test_source} AS test
    +                    FROM
    +                    (
    +                    SELECT {point_id} , {point_column_name} as p_col_name , {label_column_name} from {point_source}
    +                    ) train,
    +                    (
    +                    SELECT {test_id} ,{test_column_name} as t_col_name from {test_source}
    +                    ) test
    --- End diff --

    Can you please use variables with unique strings for `train`, `test`, `p_col_name` and `t_col_name`? If the train or test table, or one of its columns, is named any of those, the query would fail, I guess.

    While you are at it, could you also do the same for the other variables in this query: `train_id`, `r`, `dist_inverse` and any others I may have missed listing out?
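    To illustrate what I mean, here is a rough sketch and not the actual patch. `unique_alias` and `build_dist_query` are hypothetical names made up for this example (the module may already have a helper that does this job), and the query is simplified: it leaves out `{label_out}` and `{test_id_temp}`.

```python
import uuid


def unique_alias(prefix):
    # Hypothetical helper: appends a random suffix so the alias is very
    # unlikely to match a user-supplied table or column name.
    return "__madlib_temp_{0}_{1}__".format(prefix, uuid.uuid4().hex[:12])


def build_dist_query(point_source, point_column_name, point_id,
                     test_source, test_column_name, test_id, fn_dist):
    # Generate one name per alias that used to be hard-coded in the query.
    names = {
        "train": unique_alias("train"),
        "test": unique_alias("test"),
        "p_col": unique_alias("p_col_name"),
        "t_col": unique_alias("t_col_name"),
        "train_id": unique_alias("train_id"),
    }
    # Simplified version of the distance query from the diff, with every
    # internal alias replaced by its generated name.
    return """
        SELECT {test}.{test_id} AS {test_id},
               {train}.{point_id} AS {train_id},
               {fn_dist}({p_col}, {t_col}) AS dist
        FROM
            (SELECT {point_id}, {point_column_name} AS {p_col}
             FROM {point_source}) {train},
            (SELECT {test_id}, {test_column_name} AS {t_col}
             FROM {test_source}) {test}
    """.format(point_source=point_source,
               point_column_name=point_column_name,
               point_id=point_id,
               test_source=test_source,
               test_column_name=test_column_name,
               test_id=test_id,
               fn_dist=fn_dist,
               **names)
```

    With this, a call such as `build_dist_query("train", "p_col_name", "id", "test", "t_col_name", "id", "squared_dist_norm2")` still yields a query whose internal aliases differ from the user's table and column names. The suffix is random, so the generated names change on every call.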
---