GitHub user njayaram2 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/315#discussion_r214482318
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -264,12 +260,17 @@ def knn(schema_madlib, point_source,
point_column_name, point_id,
SELECT test.{test_id} AS {test_id_temp},
train.{point_id} as train_id,
{fn_dist}(
- train.{point_column_name},
- test.{test_column_name})
+ p_col_name,
+ t_col_name)
AS dist
{label_out}
- FROM {point_source} AS train,
- {test_source} AS test
+ FROM
+ (
+ SELECT {point_id} , {point_column_name} as p_col_name , {label_column_name} from {point_source}
+ ) train,
+ (
+ SELECT {test_id} ,{test_column_name} as t_col_name from {test_source}
+ ) test
--- End diff --
Can you please use variables with unique strings for `train`, `test`,
`p_col_name` and `t_col_name`? If the train or test table is named any of
those, I think the query would fail.
While you are at it, could you also do the same for the other variables in this
query: `train_id`, `r`, `dist_inverse` and any others I may have missed?
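
To illustrate what I mean, here is a rough sketch, not the actual patch. `unique_alias` and `build_dist_query` are hypothetical names made up for this example (the module may already have a helper that does this), and the query is a simplified version of the one in the diff without `{label_out}`. The idea is to generate every alias once in Python and substitute it into the SQL, so a user table called `train` or a column called `p_col_name` cannot collide with it:

```python
import uuid


def unique_alias(prefix):
    # Hypothetical helper: builds an identifier that is very unlikely to
    # match a user-supplied table or column name. The result stays under
    # PostgreSQL's 63-byte identifier limit for short prefixes.
    return "__madlib_temp_{0}_{1}__".format(prefix, uuid.uuid4().hex)


def build_dist_query(point_source, test_source, point_id, test_id,
                     point_column_name, test_column_name, fn_dist):
    # Generate each alias once so every reference in the query agrees.
    train = unique_alias("train")
    test = unique_alias("test")
    p_col = unique_alias("p_col")
    t_col = unique_alias("t_col")
    train_id = unique_alias("train_id")
    return """
        SELECT {test}.{test_id} AS {test_id},
               {train}.{point_id} AS {train_id},
               {fn_dist}({p_col}, {t_col}) AS dist
        FROM
            (SELECT {point_id}, {point_column_name} AS {p_col}
             FROM {point_source}) {train},
            (SELECT {test_id}, {test_column_name} AS {t_col}
             FROM {test_source}) {test}
    """.format(train=train, test=test, p_col=p_col, t_col=t_col,
               train_id=train_id, point_source=point_source,
               test_source=test_source, point_id=point_id,
               test_id=test_id, point_column_name=point_column_name,
               test_column_name=test_column_name, fn_dist=fn_dist)
```

With this shape, calling `build_dist_query("train", "test", ...)` still works even though the source tables are literally named `train` and `test`, since the subquery aliases no longer reuse those names.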
---