[
https://issues.apache.org/jira/browse/SPARK-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097724#comment-15097724
]
Earthson Lu commented on SPARK-12746:
-------------------------------------
OK, I see :)
If ML has no notion of nullability, how could we implement a Transformer that
fills in missing values (which are always represented as NULL)? I think we need
to support nullability for preprocessing, so that we can get clean data for
further operations. I can't imagine a situation where we can do nothing when
the data contains NULL.
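
To make the preprocessing point concrete, here is a minimal sketch in plain
Python. It does not use Spark's Transformer API; the function names
(mean_of_present, fill_missing) are hypothetical and only illustrate the idea
that the input column admits NULL (None here) while the output column does not.

```python
from typing import List, Optional


def mean_of_present(values: List[Optional[float]]) -> float:
    """Mean of the non-missing entries; raises if every entry is missing."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot compute a fill value: all entries are missing")
    return sum(present) / len(present)


def fill_missing(values: List[Optional[float]], default: float) -> List[float]:
    """Return a copy of the column with every missing entry replaced by default."""
    return [default if v is None else v for v in values]


# Nullable input column; the result contains no None values.
column = [1.0, None, 3.0]
cleaned = fill_missing(column, mean_of_present(column))
```

A step like this can only be expressed if the schema is allowed to say that the
input contains NULL, which is why nullability matters at the preprocessing
boundary.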
- - -
I think the type-checking API is independent of nullability in ML. It is
common for one transformer to accept either BooleanType or IntegerType. Maybe
it would be a good idea to implement the test condition and the assertion
separately.
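
A rough sketch of what I mean, in plain Python. The classes below are local
stand-ins that only mirror the names of the Spark SQL types; they are not the
Spark classes, and accepts / require_type are hypothetical helper names. The
test condition returns a boolean and ignores the containsNull flag, and the
assertion is a thin wrapper around it.

```python
from dataclasses import dataclass
from typing import Sequence


# Local stand-ins for illustration only; NOT the Spark SQL type classes.
@dataclass(frozen=True)
class StringType:
    pass


@dataclass(frozen=True)
class BooleanType:
    pass


@dataclass(frozen=True)
class IntegerType:
    pass


@dataclass(frozen=True)
class ArrayType:
    element_type: object
    contains_null: bool = True


def same_type_ignoring_nullability(actual: object, expected: object) -> bool:
    """Compare two types structurally, skipping the contains_null flag."""
    if isinstance(actual, ArrayType) and isinstance(expected, ArrayType):
        return same_type_ignoring_nullability(
            actual.element_type, expected.element_type
        )
    return actual == expected


def accepts(actual: object, allowed: Sequence[object]) -> bool:
    """Test condition: True if actual matches any allowed type. Never raises."""
    return any(same_type_ignoring_nullability(actual, t) for t in allowed)


def require_type(column: str, actual: object, allowed: Sequence[object]) -> None:
    """Assertion: raises TypeError when the test condition fails."""
    if not accepts(actual, allowed):
        raise TypeError(
            f"Column {column} must be one of {list(allowed)} but was {actual}"
        )
```

With strict equality, ArrayType(StringType(), False) does not equal
ArrayType(StringType(), True), which is the behavior reported in this issue.
With accepts, the same pair matches, and a caller can list several allowed
types, such as BooleanType() and IntegerType(), in one check.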
> ArrayType(_, true) should also accept ArrayType(_, false)
> ---------------------------------------------------------
>
> Key: SPARK-12746
> URL: https://issues.apache.org/jira/browse/SPARK-12746
> Project: Spark
> Issue Type: Bug
> Components: ML, SQL
> Affects Versions: 1.6.0
> Reporter: Earthson Lu
>
> I see that CountVectorizer has a schema check for ArrayType that requires
> ArrayType(StringType, true).
> ArrayType(StringType, false) is just a special case of
> ArrayType(StringType, true), but it does not pass this type check.