Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19204#discussion_r139312034
  
    --- Diff: python/pyspark/ml/evaluation.py ---
    @@ -328,6 +329,86 @@ def setParams(self, predictionCol="prediction", labelCol="label",
             kwargs = self._input_kwargs
             return self._set(**kwargs)
     
    +
    +@inherit_doc
    +class ClusteringEvaluator(JavaEvaluator, HasPredictionCol, HasFeaturesCol,
    +                          JavaMLReadable, JavaMLWritable):
    +    """
    +    .. note:: Experimental
    +
    +    Evaluator for Clustering results, which expects two input
    +    columns: prediction and features.
    +
    +    >>> from sklearn import datasets
    +    >>> from pyspark.sql.types import *
    +    >>> from pyspark.ml.linalg import Vectors, VectorUDT
    +    >>> from pyspark.ml.evaluation import ClusteringEvaluator
    +    ...
    +    >>> iris = datasets.load_iris()
    --- End diff --
    
    Please don't involve other libraries unless necessary. The doc test here is meant to show new users how to use ```ClusteringEvaluator```, so we should focus on the evaluator and keep it as simple as possible. You can refer to the other evaluators and construct a simple dataset inline, for example as sketched below.
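    As a rough illustration of that suggestion (not the wording of any doctest in this PR), a self-contained dataset can be built directly with ```Vectors.dense``` and ```spark.createDataFrame```, using the ```predictionCol```/```featuresCol``` params implied by the ```HasPredictionCol```/```HasFeaturesCol``` mixins in the diff above. The feature values and cluster assignments here are made up for illustration:
    
    ```python
    # Minimal sketch: a self-contained dataset in the style of the other
    # evaluator doctests. The points and cluster labels are illustrative.
    from pyspark.sql import SparkSession
    from pyspark.ml.linalg import Vectors
    from pyspark.ml.evaluation import ClusteringEvaluator

    spark = SparkSession.builder.getOrCreate()

    # Two well-separated clusters, each point paired with its predicted label.
    rows = [(Vectors.dense([0.0, 0.5]), 0.0), (Vectors.dense([0.5, 0.0]), 0.0),
            (Vectors.dense([10.0, 11.0]), 1.0), (Vectors.dense([10.5, 11.5]), 1.0)]
    dataset = spark.createDataFrame(rows, ["features", "prediction"])

    # Column names follow HasPredictionCol / HasFeaturesCol from the diff above.
    evaluator = ClusteringEvaluator(predictionCol="prediction", featuresCol="features")
    silhouette = evaluator.evaluate(dataset)  # closer to 1.0 means better-separated clusters
    print(silhouette)
    ```
    
    Building the data inline like this keeps the example focused on the evaluator itself and avoids pulling scikit-learn into the doctest.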


---
