GitHub user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15413#discussion_r94078123
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
    @@ -356,13 +427,243 @@ class GaussianMixture @Since("2.0.0") (
       override def transformSchema(schema: StructType): StructType = {
         validateAndTransformSchema(schema)
       }
    +
    +  /**
    +   * Initialize weights and corresponding gaussian distributions at random.
    +   *
    +   * We start with uniform weights, a random mean from the data, and diagonal covariance matrices
    +   * using component variances derived from the samples.
    +   *
    +   * @param instances The training instances.
    +   * @param numClusters The number of clusters.
    +   * @param numFeatures The number of features of training instance.
    +   * @return The initialized weights and corresponding gaussian distributions. Note the
    +   *         covariance matrix of multivariate gaussian distribution is symmetric and
    +   *         we only save the upper triangular part as a dense vector.
    --- End diff --
    
    Document that the matrix is in column-major order.
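
To illustrate why the ordering is worth documenting, here is a minimal sketch of packing the upper triangle of a symmetric matrix column by column. It is not the code from this pull request: the object `PackedUpperTriangular` and its methods `pack`, `packedIndex`, and `unpack` are hypothetical names, and the full matrix is assumed to be a plain `Array[Double]` in column-major layout.

```scala
// Hypothetical helper, for illustration only; not part of Spark.
object PackedUpperTriangular {

  /** Pack the upper triangle (including the diagonal) of a symmetric n x n
    * matrix. `full` holds the n * n entries in column-major order, so entry
    * (i, j) lives at index i + j * n. The result walks the columns left to
    * right and, within column j, the rows 0..j.
    */
  def pack(n: Int, full: Array[Double]): Array[Double] = {
    require(full.length == n * n, "expected n * n values")
    val packed = new Array[Double](n * (n + 1) / 2)
    var k = 0
    for (j <- 0 until n; i <- 0 to j) {
      packed(k) = full(i + j * n)
      k += 1
    }
    packed
  }

  /** Position of entry (i, j) in the packed vector. Symmetry lets us swap
    * the indices when (i, j) falls in the lower triangle.
    */
  def packedIndex(i: Int, j: Int): Int = {
    val (row, col) = if (i <= j) (i, j) else (j, i)
    col * (col + 1) / 2 + row
  }

  /** Rebuild the full symmetric matrix (column-major) from the packed form. */
  def unpack(n: Int, packed: Array[Double]): Array[Double] = {
    require(packed.length == n * (n + 1) / 2, "expected n * (n + 1) / 2 values")
    val full = new Array[Double](n * n)
    for (j <- 0 until n; i <- 0 until n) {
      full(i + j * n) = packed(packedIndex(i, j))
    }
    full
  }
}
```

For the symmetric matrix with rows (1, 2, 4), (2, 3, 5), (4, 5, 6), this column-by-column packing gives (1, 2, 3, 4, 5, 6), whereas packing the same triangle row by row would give (1, 2, 4, 3, 5, 6). A reader of the packed vector cannot tell which was used without the doc comment saying so.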

