[ 
https://issues.apache.org/jira/browse/SPARK-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213488#comment-15213488
 ] 

Sean Owen commented on SPARK-14183:
-----------------------------------

It doesn't really make sense to fit a model on a single element in this case, but 
the error message should be clearer in any event. There is already a bunch of 
error checking around LogisticRegression.scala:294, and it could also verify that 
the summarizers observed at least one data point. The problem here is that the 
summarizer observes no data points at all.
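
The suggested check could be sketched roughly as follows. This is a hypothetical illustration, not Spark's actual MultiClassSummarizer code: the names, shapes, and the `numClasses` helper are invented here. The point is that guarding the `.max` call with a non-empty check turns the opaque `empty.max` into an actionable message.

{code}
// Hypothetical sketch of the proposed validation; illustrative only,
// not Spark's actual MultiClassSummarizer API.
object SummarizerCheck {
  // Labels observed by a (mock) summarizer; empty when no rows reached it.
  def numClasses(observedLabels: Seq[Double]): Int = {
    // Guard before calling .max -- on an empty collection, .max throws
    // "UnsupportedOperationException: empty.max".
    require(observedLabels.nonEmpty,
      "No data points were observed by the summarizer; " +
        "check that the training split is non-empty.")
    observedLabels.max.toInt + 1
  }

  def main(args: Array[String]): Unit = {
    println(numClasses(Seq(0d, 1d, 1d))) // prints 2
    // numClasses(Seq.empty) would now throw IllegalArgumentException
    // with the descriptive message instead of the opaque empty.max.
  }
}
{code}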

> UnsupportedOperationException: empty.max when fitting CrossValidator model 
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-14183
>                 URL: https://issues.apache.org/jira/browse/SPARK-14183
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.0.0
>            Reporter: Jacek Laskowski
>            Priority: Minor
>
> The following code produces {{java.lang.UnsupportedOperationException: 
> empty.max}}, but the error should explain what caused it and how to fix it.
> The exception:
> {code}
> scala> val model = cv.fit(df)
> java.lang.UnsupportedOperationException: empty.max
>   at scala.collection.TraversableOnce$class.max(TraversableOnce.scala:227)
>   at scala.collection.AbstractTraversable.max(Traversable.scala:104)
>   at org.apache.spark.ml.classification.MultiClassSummarizer.numClasses(LogisticRegression.scala:739)
>   at org.apache.spark.ml.classification.MultiClassSummarizer.histogram(LogisticRegression.scala:743)
>   at org.apache.spark.ml.classification.LogisticRegression.train(LogisticRegression.scala:288)
>   at org.apache.spark.ml.classification.LogisticRegression.train(LogisticRegression.scala:261)
>   at org.apache.spark.ml.classification.LogisticRegression.train(LogisticRegression.scala:160)
>   at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
>   at org.apache.spark.ml.Predictor.fit(Predictor.scala:71)
>   at org.apache.spark.ml.Estimator.fit(Estimator.scala:59)
>   at org.apache.spark.ml.Estimator$$anonfun$fit$1.apply(Estimator.scala:78)
>   at org.apache.spark.ml.Estimator$$anonfun$fit$1.apply(Estimator.scala:78)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
>   at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
>   at org.apache.spark.ml.Estimator.fit(Estimator.scala:78)
>   at org.apache.spark.ml.tuning.CrossValidator$$anonfun$fit$1.apply(CrossValidator.scala:110)
>   at org.apache.spark.ml.tuning.CrossValidator$$anonfun$fit$1.apply(CrossValidator.scala:105)
>   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>   at org.apache.spark.ml.tuning.CrossValidator.fit(CrossValidator.scala:105)
>   ... 55 elided
> {code}
> The code:
> {code}
> import org.apache.spark.ml.tuning._
> val cv = new CrossValidator
> import org.apache.spark.mllib.linalg._
> val features = Vectors.sparse(3, Array(1), Array(1d))
> val df = Seq((0, "hello world", 0d, features)).toDF("id", "text", "label", "features")
> import org.apache.spark.ml.classification._
> val lr = new LogisticRegression()
> import org.apache.spark.ml.evaluation.RegressionEvaluator
> val regEval = new RegressionEvaluator()
> val paramGrid = new ParamGridBuilder().build()
> cv.setEstimatorParamMaps(paramGrid).setEstimator(lr).setEvaluator(regEval)
> val model = cv.fit(df)
> {code}
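
As a likely explanation of how the summarizer ends up with no data here: CrossValidator defaults to 3 folds, and with a single-row DataFrame the fold that holds that row for validation has an empty training split, so LogisticRegression trains on zero rows. The sketch below uses a simplified deterministic modulo split rather than Spark's randomized kFold, purely to illustrate the effect:

{code}
// Illustrative only: a deterministic k-fold split (not Spark's random
// MLUtils.kFold), showing why a 1-row dataset yields an empty training set.
object OneRowFolds {
  def folds[A](data: Seq[A], k: Int): Seq[(Seq[A], Seq[A])] =
    (0 until k).map { i =>
      val (validation, training) =
        data.zipWithIndex.partition { case (_, idx) => idx % k == i }
      (training.map(_._1), validation.map(_._1))
    }

  def main(args: Array[String]): Unit = {
    // With one row and numFolds = 3, the fold that holds the row for
    // validation has nothing left to train on.
    folds(Seq("only-row"), 3).zipWithIndex.foreach { case ((train, valid), i) =>
      println(s"fold $i: training=$train validation=$valid")
    }
  }
}
{code}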



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
