Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22063#discussion_r212345138
--- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/Classifier.scala ---
@@ -164,19 +164,15 @@ abstract class ClassificationModel[FeaturesType, M <: ClassificationModel[Featur
var outputData = dataset
var numColsOutput = 0
if (getRawPredictionCol != "") {
- val predictRawUDF = udf { (features: Any) =>
--- End diff ---
I looked into this, and now I understand why it worked before.
Scala 2.11 can somehow generate a type tag for `Any`. Spark then tries to derive the UDF's input schema from that type tag via `Try(ScalaReflection.schemaFor(typeTag[A1]).dataType :: Nil).toOption`. For `Any`, `schemaFor` throws, so the derived input schema is `None` and no type check is applied later.
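For illustration, a minimal sketch of that schema-derivation step (assuming a Spark 2.x classpath where `ScalaReflection.schemaFor` is still available):

```scala
import scala.reflect.runtime.universe._
import scala.util.Try

import org.apache.spark.sql.catalyst.ScalaReflection

// For `Any`, ScalaReflection.schemaFor throws, the Try fails,
// and the derived input schema collapses to None, so the analyzer
// has nothing to type-check the input column against.
val inputSchema = Try(ScalaReflection.schemaFor(typeTag[Any]).dataType :: Nil).toOption
// inputSchema == None
```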
I think it makes more sense to specify a concrete type and let Spark do the type check.
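A hypothetical sketch of that suggestion, assuming the features column holds `ml.linalg.Vector` (here `toRaw` is just a stand-in for the model's `predictRaw`):

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.functions.udf

// Stand-in for the model's predictRaw; the point is the parameter type.
def toRaw(features: Vector): Vector = Vectors.dense(features.toArray)

// With a concrete input type, Spark derives a real input schema (VectorUDT)
// from the type tag, so the analyzer can type-check the input column.
val predictRawUDF = udf { features: Vector => toRaw(features) }
```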