Re: Apache Spark documentation on mllib's Kmeans doesn't jibe.

2017-12-13 Thread Scott Reynolds
The train method is defined on the companion object, not on the class, so it
is documented on the object's page (note the trailing $ in the URL):
https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.mllib.clustering.KMeans$

Here is a decent resource on companion object usage:
https://docs.scala-lang.org/tour/singleton-objects.html
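
To make the class/object split concrete, here is a minimal sketch in plain
Scala. It is not Spark's code: Clusterer, run, and the placeholder "model" are
hypothetical names made up for illustration. It only shows why a call like
KMeans.train(...) resolves to a method on the companion object, which Scaladoc
lists on a separate page from the class.

```scala
// Toy stand-in for the class / companion-object split; NOT Spark's actual code.
// All names here (Clusterer, run, Demo) are hypothetical.
class Clusterer(val k: Int, val maxIterations: Int) {
  // Instance method: would be listed on the *class* page of the Scaladoc.
  def run(data: Seq[Double]): Seq[Double] = {
    // Placeholder "model": the k smallest distinct values, not real clustering.
    data.distinct.sorted.take(k)
  }
}

object Clusterer {
  // Static-like helper: would be listed on the *object* page of the Scaladoc
  // (the page whose URL ends in `$`).
  def train(data: Seq[Double], k: Int, maxIterations: Int): Seq[Double] =
    new Clusterer(k, maxIterations).run(data)
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Called on the object itself, with no `new`, in the same style as
    // KMeans.train(parsedData, numClusters, numIterations).
    val centers = Clusterer.train(Seq(3.0, 1.0, 2.0, 1.0), 2, 10)
    println(centers)
  }
}
```

In the Scaladoc UI for these versions, the class page and the object page are
separate, so looking only at the class page will not show train.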

On Wed, Dec 13, 2017 at 9:16 AM Michael Segel wrote:

> Hi,
>
> Just came across this while looking at the docs on how to use Spark’s
> Kmeans clustering.
>
> Note: This appears to be true in both 2.1 and 2.2 documentation.
>
> The overview page:
> https://spark.apache.org/docs/2.1.0/mllib-clustering.html#k-means
>
> Here, the example contains the following line:
>
> val clusters = KMeans.train(parsedData, numClusters, numIterations)
>
> I was trying to get more information on the train() method.
> So I checked out the KMeans Scala API:
> https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.mllib.clustering.KMeans
>
> The issue is that I couldn’t find the train method…
>
> So I thought I was slowly losing my mind.
>
> I checked out the entire API page… could not find any API docs which
> describe the method train().
>
> I ended up looking at the Scala source code and found the method there.
> (You can see the code here:
> https://github.com/apache/spark/blob/v2.1.0/mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala )
>
> So the method(s) exist, but are not covered in the Scala API doc.
>
> How do you raise this as a ‘bug’?
>
> Thx
>
> -Mike
>
> --

Scott Reynolds
Principal Engineer
EMAIL sreyno...@twilio.com


Apache Spark documentation on mllib's Kmeans doesn't jibe.

2017-12-13 Thread Michael Segel
Hi,

Just came across this while looking at the docs on how to use Spark’s Kmeans 
clustering.

Note: This appears to be true in both 2.1 and 2.2 documentation.

The overview page:
https://spark.apache.org/docs/2.1.0/mllib-clustering.html#k-means

Here, the example contains the following line:

val clusters = KMeans.train(parsedData, numClusters, numIterations)

I was trying to get more information on the train() method.
So I checked out the KMeans Scala API:
https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.mllib.clustering.KMeans

The issue is that I couldn’t find the train method…

So I thought I was slowly losing my mind.

I checked out the entire API page… could not find any API docs which describe 
the method train().

I ended up looking at the Scala source code and found the method there.
(You can see the code here:
https://github.com/apache/spark/blob/v2.1.0/mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala )

So the method(s) exist, but are not covered in the Scala API doc.

How do you raise this as a ‘bug’?

Thx

-Mike