GitHub user mgaido91 opened a pull request:
https://github.com/apache/spark/pull/20719
[SPARK-23568][ML] Use metadata numAttributes if available in Silhouette
## What changes were proposed in this pull request?
Silhouette needs to know the number of features. Until now this was obtained by
calling `first` and checking the size of the returned vector. Although this works,
when the number of attributes is present in the column metadata we can use that
value instead and avoid triggering a job, which improves performance.
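
As an illustration of the idea only (not the patch itself), the lookup could be sketched as below. The object and method names `NumFeaturesSketch` and `numFeatures` are hypothetical; the sketch assumes the features column holds `org.apache.spark.ml.linalg.Vector` values and relies on `AttributeGroup.size` returning -1 when the size is not recorded in the metadata.

```scala
import org.apache.spark.ml.attribute.AttributeGroup
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.functions.col

// Hypothetical helper, not the code in this pull request.
object NumFeaturesSketch {
  def numFeatures(dataset: Dataset[_], featuresCol: String): Int = {
    // Read the ML attribute metadata attached to the vector column, if any.
    val group = AttributeGroup.fromStructField(dataset.schema(featuresCol))
    if (group.size >= 0) {
      // The size is recorded in the metadata: no Spark job is needed.
      group.size
    } else {
      // Fallback: run a job to fetch the first row and inspect its vector.
      dataset.select(col(featuresCol)).first().getAs[Vector](0).size
    }
  }
}
```

Only the fallback branch touches the data, so a dataset whose features column carries metadata (for example, one produced by `VectorAssembler`) would not need the extra job.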
## How was this patch tested?
Existing unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mgaido91/spark SPARK-23568
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20719.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20719
----
commit 73db3b93973b2fa01eb2a34235ff44302f6feaa2
Author: Marco Gaido <marcogaido91@...>
Date: 2018-03-02T17:10:32Z
[SPARK-23568][ML] Use metadata numAttributes if available in Silhouette
----