This makes sense to me; it invites fewer errors (e.g., unsupported or
misspelled metrics). We could always support the string-y syntax with a
thin logical rewrite if we wanted or needed that syntax as well for any reason.
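To make the "thin logical rewrite" idea concrete, here is a rough sketch in Python (not actual optimizer code). It maps a call using the string-y syntax onto the per-metric functions proposed in the quoted message below. The names `rewrite_call` and `METRIC_TO_FUNCTION` are made up for this illustration, and raising an error on an unknown metric is an assumption rather than something the patch specifies.

```python
# Hypothetical mapping from metric strings to per-metric function names.
# Only the two metrics named in this thread are listed.
METRIC_TO_FUNCTION = {
    "euclidean": "euclidean_dist",
    "manhattan": "manhattan_dist",
}


def rewrite_call(name, args):
    """Rewrite vector_distance(u, v, "<metric>") into <metric>_dist(u, v).

    A call is modelled as (function_name, argument_list). Any call that is
    not a three-argument vector_distance is returned unchanged.
    """
    if name != "vector_distance" or len(args) != 3:
        return name, args
    u, v, metric = args
    # Assumption: an unknown or non-string metric is rejected at rewrite time,
    # so a misspelling fails early instead of during evaluation.
    if not isinstance(metric, str) or metric not in METRIC_TO_FUNCTION:
        raise ValueError(f"unsupported metric: {metric!r}")
    return METRIC_TO_FUNCTION[metric], [u, v]
```

With a rewrite along these lines, the string-y form would add no evaluation logic of its own; it would only pick which per-metric function runs.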
On 12/2/25 12:40 PM, Ian Maxon wrote:
Hi fellow devs,
There is a nice patch up by Calvin
(https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126) that adds a
variety of distance functions for vectors. The initial patch added a
function like this:
vector_distance(u,v,metric)
This would compute the distance between two arrays of the same length
containing numerics, according to a metric given as a string, for
example "euclidean" or "manhattan".
I wasn't particularly fond of this syntax. It was inspired by
something else, and this is not a slight against Calvin's work. After discussing
informally with Calvin and some others familiar with the patch, I
changed the patch to instead add a separate function for each metric,
like:
euclidean_dist(u,v)
manhattan_dist(u,v)
...
and so on. To me it seems this fits better with the naming and syntax
patterns we already have. Code-wise each function continues to share
most of its implementation with the other vector distance calculation
functions.
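As a rough illustration of how separate functions can still share most of their implementation, here is a minimal Python sketch. It is not the code in the patch: the helper `_differences` and the same-length error are assumptions for this example, and only the two metrics named above are shown.

```python
import math


def _differences(u, v):
    """Shared step: check the lengths match and return element-wise differences."""
    if len(u) != len(v):
        # Assumption: mismatched lengths are treated as an error here.
        raise ValueError("arrays must have the same length")
    return [a - b for a, b in zip(u, v)]


def euclidean_dist(u, v):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum(d * d for d in _differences(u, v)))


def manhattan_dist(u, v):
    """Sum of absolute differences."""
    return sum(abs(d) for d in _differences(u, v))
```

Each public function adds only its own reduction over the shared differences, so a misspelled function name would fail when the name is resolved, not on a string lookup during evaluation.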
Does anyone have any other thoughts or suggestions on the matter?
-Ian