GitHub user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3643#discussion_r21663352
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala ---
    @@ -264,6 +263,60 @@ object MLUtils {
         }
         Vectors.fromBreeze(vector1)
       }
    + 
    +  /**
    +   * Returns the squared distance between two Vectors.
    +   */
    +  def vectorSquaredDistance(v1: Vector, v2: Vector): Double = {
    +    var squaredDistance = 0.0
    +    (v1, v2) match { 
    +      case (v1: SparseVector, v2: SparseVector) =>
    +        v1.indices.intersect(v2.indices).foreach((idx) => {
    --- End diff --
    
    Hi, thanks, but I do not really understand your idea. The indices array in 
`SparseVector` only keeps the dimensions that have values in that 
`SparseVector`. `BLAS.dot` is used to calculate the dot product between two 
vectors. How would we use it to compute the intersection between two indices 
arrays? For example, say vector1's indices are `[1, 5, 10]` and vector2's 
indices are `[5, 6, 7]`. The dot product between them is 
1 * 5 + 5 * 6 + 10 * 7 = 105, but that is useless for computing the 
intersection, which is just `[5]`.
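
To make the example concrete, here is a minimal sketch in plain Scala with no Spark dependency. The object name and the `dotOfIndices` helper are made up for illustration and are not part of MLlib; the sketch only contrasts the two operations on the index arrays from the example.

```scala
object IndexIntersectionSketch {
  // Index arrays from the example: the dimensions that hold values.
  val indices1: Array[Int] = Array(1, 5, 10)
  val indices2: Array[Int] = Array(5, 6, 7)

  // Hypothetical helper (not an MLlib API): position-wise dot product
  // of two Int arrays of equal length.
  def dotOfIndices(a: Array[Int], b: Array[Int]): Int = {
    require(a.length == b.length, "arrays must have the same length")
    a.zip(b).map { case (x, y) => x * y }.sum
  }

  // Dimensions in which both vectors hold a value.
  val shared: Array[Int] = indices1.intersect(indices2)

  def main(args: Array[String]): Unit = {
    // The dot product collapses both arrays into a single number...
    println(s"dot of index arrays: ${dotOfIndices(indices1, indices2)}")
    // ...while the intersection keeps the shared dimensions themselves.
    println(s"shared indices: ${shared.mkString("[", ", ", "]")}")
  }
}
```

The dot product yields one scalar, so the information about which dimensions overlap is lost, whereas `intersect` returns those dimensions directly.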


