Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20313#discussion_r194636521
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
    @@ -264,7 +265,9 @@ class CountVectorizerModel(
     
           Vectors.sparse(dictBr.value.size, effectiveCounts)
         }
    -    dataset.withColumn($(outputCol), vectorizer(col($(inputCol))))
    +    val attrs = vocabulary.map(_ => new NumericAttribute).asInstanceOf[Array[Attribute]]
    --- End diff --
    
    Sorry for replying late. Though I agree that these attributes don't provide 
much info, I'm wondering if we can let them be generated lazily. At this point, I 
think we don't know whether a following transformer will need them or not?
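
    To illustrate what "generated lazily" could mean here, below is a minimal 
    sketch in plain Scala. It does not use the Spark API: `StubAttribute`, 
    `VectorizerSketch`, and the `builds` counter are hypothetical stand-ins 
    introduced only for this example, and the sketch assumes that deferring the 
    array construction until first access is the behavior being asked for.

```scala
// Hypothetical stand-in for an attribute object; NOT Spark's NumericAttribute.
final case class StubAttribute(index: Int)

object LazyAttrsSketch {
  // Counts how many times the attribute array is actually built.
  var builds = 0

  // Hypothetical stand-in for a model that holds a vocabulary.
  final class VectorizerSketch(vocabulary: Array[String]) {
    // `lazy val` defers construction until the first access, then caches it.
    lazy val attrs: Array[StubAttribute] = {
      builds += 1
      vocabulary.indices.map(i => StubAttribute(i)).toArray
    }
  }

  def main(args: Array[String]): Unit = {
    val v = new VectorizerSketch(Array("a", "b", "c"))
    println(builds)         // 0: nothing has been built yet
    println(v.attrs.length) // 3: first access builds the array
    println(v.attrs.length) // 3: second access reuses the cached array
    println(builds)         // 1: the array was built exactly once
  }
}
```

    This only shows the deferral idea. How it would fit with the metadata passed 
    to `withColumn` is a separate question that the sketch does not answer.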


---

