[ 
https://issues.apache.org/jira/browse/MADLIB-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172530#comment-16172530
 ] 

Frank McQuillan commented on MADLIB-1059:
-----------------------------------------

Yes, that is correct.  The user can pick the distance function of interest, or 
use the default one.

I know [~okislal] is away for a bit, but [~njayaram] can perhaps comment on 
preferred implementation.

Also, is the above the full set of distance functions available in MADlib 
today, or are there any other ones we could add to the list?



> Add additional distance metrics for k-NN
> ----------------------------------------
>
>                 Key: MADLIB-1059
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1059
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: k-NN
>            Reporter: Frank McQuillan
>            Assignee: Himanshu Pandey
>              Labels: starter
>             Fix For: v2.0
>
>
> Follow on from https://issues.apache.org/jira/browse/MADLIB-927
> which supports one distance function.  This JIRA is to 
> (1)
> add additional distance metrics.  The model to follow is
> http://madlib.incubator.apache.org/docs/latest/group__grp__kmeans.html
> fn_dist (optional)
> TEXT, default: 'squared_dist_norm2'. The name of the function to use to 
> calculate the distance between data points.
> The following distance functions can be used (computation of barycenter/mean 
> in parentheses):
> dist_norm1: 1-norm/Manhattan (element-wise median [Note that MADlib does not 
> provide a median aggregate function for support and performance reasons.])
> dist_norm2: 2-norm/Euclidean (element-wise mean)
> squared_dist_norm2: squared Euclidean distance (element-wise mean)
> dist_angle: angle (element-wise mean of normalized points)
> dist_tanimoto: tanimoto (element-wise mean of normalized points [5])
> user defined function with signature DOUBLE PRECISION[] x, DOUBLE PRECISION[] 
> y -> DOUBLE PRECISION
> and also check if there are other distance functions under
> http://madlib.apache.org/docs/latest/group__grp__linalg.html
> that might make sense to include while you are at it, in addition to the ones 
> listed above
> (2) Add an option for weighted average in the voting.
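
To make the two requested changes concrete, here is a minimal sketch in plain 
Python, not MADlib code.  The names knn_classify, DISTANCE_FUNCTIONS, and the 
weighted flag are hypothetical and chosen only for illustration.  The distance 
functions are textbook stand-ins for the MADlib functions of the same name, 
and inverse-distance weighting is an assumed scheme, since the issue asks only 
for "weighted average in the voting" without specifying one.

```python
import math
from collections import defaultdict


def dist_norm1(x, y):
    """1-norm (Manhattan) distance."""
    return sum(abs(a - b) for a, b in zip(x, y))


def squared_dist_norm2(x, y):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def dist_norm2(x, y):
    """2-norm (Euclidean) distance."""
    return math.sqrt(squared_dist_norm2(x, y))


# Hypothetical registry mapping a function name to a callable, mirroring
# the idea of passing the distance function by name (fn_dist).
DISTANCE_FUNCTIONS = {
    "dist_norm1": dist_norm1,
    "dist_norm2": dist_norm2,
    "squared_dist_norm2": squared_dist_norm2,
}


def knn_classify(points, labels, query, k,
                 fn_dist="squared_dist_norm2", weighted=False):
    """Classify `query` by a vote among its k nearest training points.

    fn_dist is either a name in DISTANCE_FUNCTIONS or a user-defined
    callable taking two float sequences and returning a float.
    With weighted=True each neighbor votes with weight 1 / distance
    (an assumed scheme); otherwise every neighbor has one vote.
    """
    dist = fn_dist if callable(fn_dist) else DISTANCE_FUNCTIONS[fn_dist]
    # Sort (distance, label) pairs by distance only and keep the k closest.
    neighbors = sorted(
        ((dist(p, query), label) for p, label in zip(points, labels)),
        key=lambda pair: pair[0],
    )[:k]

    votes = defaultdict(float)
    eps = 1e-12  # avoids division by zero when a neighbor equals the query
    for d, label in neighbors:
        votes[label] += 1.0 / (d + eps) if weighted else 1.0
    return max(votes.items(), key=lambda item: item[1])[0]
```

With points [[0.0], [3.0], [4.0]], labels ['a', 'b', 'b'], query [0.5], and 
k = 3, the unweighted vote returns 'b' (two votes to one), while the weighted 
vote with dist_norm1 returns 'a', because the single close neighbor outweighs 
the two distant ones.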



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
