Chiwan Park created FLINK-1933:
----------------------------------
Summary: Add distance measure interface and basic implementation
to machine learning library
Key: FLINK-1933
URL: https://issues.apache.org/jira/browse/FLINK-1933
Project: Flink
Issue Type: New Feature
Components: Machine Learning Library
Reporter: Chiwan Park
Assignee: Chiwan Park
Add a distance measure interface for calculating the distance between two vectors,
along with some implementations of that interface. In FLINK-1745, [~till.rohrmann]
suggested the following interface:
{code}
trait DistanceMeasure {
  def distance(a: Vector, b: Vector): Double
}
{code}
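As a rough illustration of how an implementation might plug into this trait, here is a minimal sketch. It is not the actual Flink ML code: the standard Scala `Vector[Double]` stands in for the library's vector type, and the class names and the `nearest` helper are hypothetical.

```scala
// Assumption: scala.collection.immutable.Vector[Double] stands in for the
// ML library's own vector type, which this issue does not define.
trait DistanceMeasure {
  def distance(a: Vector[Double], b: Vector[Double]): Double
}

// Hypothetical implementation: sum of squared coordinate differences.
class SquaredEuclideanDistanceMeasure extends DistanceMeasure {
  override def distance(a: Vector[Double], b: Vector[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
  }
}

// Euclidean distance is the square root of the squared variant.
class EuclideanDistanceMeasure extends SquaredEuclideanDistanceMeasure {
  override def distance(a: Vector[Double], b: Vector[Double]): Double =
    math.sqrt(super.distance(a, b))
}

// Hypothetical caller that depends only on the trait, so any measure can be
// swapped in without changing the algorithm.
object DistanceMeasureExample {
  def nearest(
      query: Vector[Double],
      candidates: Seq[Vector[Double]],
      measure: DistanceMeasure): Vector[Double] =
    candidates.minBy(c => measure.distance(query, c))
}
```

Because callers such as `nearest` see only `DistanceMeasure`, an algorithm can take the measure as a parameter instead of hard-coding one metric.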
I think the following list of implementations is sufficient as a first set to
offer ML library users:
* Manhattan distance [1]
* Cosine distance [2]
* Euclidean distance (and squared Euclidean distance) [3]
* Tanimoto distance [4]
* Minkowski distance [5]
* Chebyshev distance [6]
[1]: http://en.wikipedia.org/wiki/Taxicab_geometry
[2]: http://en.wikipedia.org/wiki/Cosine_similarity
[3]: http://en.wikipedia.org/wiki/Euclidean_distance
[4]: http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29
[5]: http://en.wikipedia.org/wiki/Minkowski_distance
[6]: http://en.wikipedia.org/wiki/Chebyshev_distance
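The remaining measures in the list could be sketched against the proposed trait roughly as follows. This is only an illustration under stated assumptions: Scala's `Vector[Double]` stands in for the library's vector type, all class names and the `VectorOps` helper are hypothetical, and cosine and Tanimoto distance are taken as one minus the corresponding similarity coefficient, which is a common convention but not something this issue specifies.

```scala
// Assumption: Vector[Double] stands in for the ML library's vector type.
trait DistanceMeasure {
  def distance(a: Vector[Double], b: Vector[Double]): Double
}

// Hypothetical helper shared by the sketches below.
object VectorOps {
  def dot(a: Vector[Double], b: Vector[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    a.zip(b).map { case (x, y) => x * y }.sum
  }
  def absDiffs(a: Vector[Double], b: Vector[Double]): Vector[Double] = {
    require(a.length == b.length, "vectors must have the same dimension")
    a.zip(b).map { case (x, y) => math.abs(x - y) }
  }
}

// Manhattan: sum of absolute coordinate differences.
class ManhattanDistanceMeasure extends DistanceMeasure {
  def distance(a: Vector[Double], b: Vector[Double]): Double =
    VectorOps.absDiffs(a, b).sum
}

// Chebyshev: largest absolute coordinate difference (0.0 for empty vectors).
class ChebyshevDistanceMeasure extends DistanceMeasure {
  def distance(a: Vector[Double], b: Vector[Double]): Double =
    VectorOps.absDiffs(a, b).foldLeft(0.0)((m, d) => math.max(m, d))
}

// Minkowski of order p: p = 1 gives Manhattan, p = 2 gives Euclidean.
class MinkowskiDistanceMeasure(p: Double) extends DistanceMeasure {
  require(p >= 1.0, "p must be at least 1")
  def distance(a: Vector[Double], b: Vector[Double]): Double =
    math.pow(VectorOps.absDiffs(a, b).map(d => math.pow(d, p)).sum, 1.0 / p)
}

// Cosine distance, taken here as 1 - cosine similarity.
// Returns NaN if either vector has zero norm.
class CosineDistanceMeasure extends DistanceMeasure {
  def distance(a: Vector[Double], b: Vector[Double]): Double = {
    val norms = math.sqrt(VectorOps.dot(a, a)) * math.sqrt(VectorOps.dot(b, b))
    1.0 - VectorOps.dot(a, b) / norms
  }
}

// Tanimoto distance, taken here as 1 - (a.b / (|a|^2 + |b|^2 - a.b)).
// Returns NaN if both vectors are zero.
class TanimotoDistanceMeasure extends DistanceMeasure {
  def distance(a: Vector[Double], b: Vector[Double]): Double = {
    val ab = VectorOps.dot(a, b)
    1.0 - ab / (VectorOps.dot(a, a) + VectorOps.dot(b, b) - ab)
  }
}
```

A real implementation would also need to decide how to handle zero vectors and sparse vectors, which this sketch leaves open.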