Github user mpjlu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19516#discussion_r146528222
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala ---
@@ -291,9 +291,13 @@ final class ChiSqSelectorModel private[ml] (
         val featureAttributes: Array[Attribute] = if (origAttrGroup.attributes.nonEmpty) {
           origAttrGroup.attributes.get.zipWithIndex.filter(x => selector.contains(x._2)).map(_._1)
         } else {
    -      Array.fill[Attribute](selector.size)(NominalAttribute.defaultAttr)
    +      null
--- End diff --
To fix this problem, either the code here or the following code should be changed, right?
"      case nomAttr: NominalAttribute =>
        nomAttr.getNumValues match {
          case Some(numValues: Int) => Iterator(idx -> numValues)
          case None => throw new IllegalArgumentException(s"Feature $idx is marked as" +
            " Nominal (categorical), but it does not have the number of values specified.")
        }"
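
To make the interaction between the two pieces of code concrete, here is a minimal self-contained sketch. It does not use Spark: `SketchAttr`, `SketchNominal`, `SketchNumeric`, and `AttrSketch` are hypothetical stand-ins, and treating missing attribute metadata as "no categorical features" is an assumption made for illustration only. It shows that an array filled with default nominal attributes (no number of values set) reaches the throwing branch of a match shaped like the one quoted above.

```scala
// Hypothetical stand-ins for illustration only; NOT Spark's attribute classes.
sealed trait SketchAttr
final case class SketchNominal(numValues: Option[Int]) extends SketchAttr
case object SketchNumeric extends SketchAttr

object AttrSketch {
  // Stand-in for a default nominal attribute: nominal, but no value count set.
  val defaultNominal: SketchNominal = SketchNominal(None)

  // Same shape as the quoted match: a nominal feature must report its
  // number of values, otherwise an IllegalArgumentException is thrown.
  def categoricalArity(attrs: Option[Array[SketchAttr]]): Map[Int, Int] =
    attrs match {
      // Assumption: absent metadata means nothing is treated as categorical.
      case None => Map.empty[Int, Int]
      case Some(as) =>
        as.iterator.zipWithIndex.flatMap {
          case (SketchNominal(Some(n)), idx) => Iterator(idx -> n)
          case (SketchNominal(None), idx) =>
            throw new IllegalArgumentException(
              s"Feature $idx is marked as Nominal (categorical), " +
                "but it does not have the number of values specified.")
          case _ => Iterator.empty
        }.toMap
    }
}
```

In this sketch, `AttrSketch.categoricalArity(Some(Array.fill[SketchAttr](2)(AttrSketch.defaultNominal)))` throws, while `AttrSketch.categoricalArity(None)` returns an empty map. So the failure can be avoided either by not producing default nominal attributes in the first place or by relaxing the `None` case of the match.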