[jira] [Commented] (SPARK-12746) ArrayType(_, true) should also accept ArrayType(_, false)

Earthson Lu (JIRA) Wed, 13 Jan 2016 22:57:58 -0800

    [ 
https://issues.apache.org/jira/browse/SPARK-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097724#comment-15097724
 ]


Earthson Lu commented on SPARK-12746:
-------------------------------------

OK, I see :)

If there's no nullability in ML, how can we implement a Transformer that fills 
missing values (always represented as NULL)? I think we need to support 
nullability for preprocessing, so that we can get clean data for later stages. 
I can't imagine being unable to do anything when the data contains NULL.

- - -

I think the type-checking API is independent of nullability in ML. It is 
common for one transformer to accept either BooleanType or IntegerType. Maybe 
it would be a good idea to implement the test condition and the assertion 
separately.
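
A rough sketch of what I mean by keeping the condition and the assertion 
apart. The object and method names here (ColumnTypeChecks, isAccepted, 
requireAccepted) are made up for illustration and are not existing Spark API; 
only the imported types come from Spark SQL.

```scala
import org.apache.spark.sql.types.{BooleanType, DataType, IntegerType}

// Hypothetical helper, not part of Spark.
object ColumnTypeChecks {
  // Pure test condition: answers the question and never throws.
  def isAccepted(actual: DataType, accepted: Seq[DataType]): Boolean =
    accepted.contains(actual)

  // Assertion built on top of the condition: throws
  // IllegalArgumentException with a readable message when the check fails.
  def requireAccepted(
      colName: String,
      actual: DataType,
      accepted: Seq[DataType]): Unit =
    require(
      isAccepted(actual, accepted),
      s"Column $colName must be one of " +
        s"${accepted.mkString("[", ", ", "]")} but was $actual.")
}
```

With this split, a transformer that takes either type could call 
ColumnTypeChecks.requireAccepted("flag", IntegerType, Seq(BooleanType, IntegerType)), 
while other code could reuse isAccepted without catching exceptions.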

> ArrayType(_, true) should also accept ArrayType(_, false)
> ---------------------------------------------------------
>
>                 Key: SPARK-12746
>                 URL: https://issues.apache.org/jira/browse/SPARK-12746
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, SQL
>    Affects Versions: 1.6.0
>            Reporter: Earthson Lu
>
> I see CountVectorizer has a schema check for ArrayType that requires 
> ArrayType(StringType, true). 
> ArrayType(StringType, false) is just a special case of 
> ArrayType(StringType, true), but it does not pass this type check.
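
To illustrate the report above: ArrayType is a case class, so plain equality 
also compares containsNull and the two types differ. One possible way to 
accept both is to match on the element type only. This is just a sketch; 
ArrayTypeChecks and isStringArray are hypothetical names, and this is not 
necessarily how CountVectorizer should be fixed.

```scala
import org.apache.spark.sql.types.{ArrayType, DataType, StringType}

// Hypothetical helper, not part of Spark.
object ArrayTypeChecks {
  // Same element type, different containsNull flag.
  val nullableElems: DataType = ArrayType(StringType, containsNull = true)
  val nonNullElems: DataType = ArrayType(StringType, containsNull = false)

  // Accept an array of strings regardless of the containsNull flag.
  def isStringArray(dt: DataType): Boolean = dt match {
    case ArrayType(StringType, _) => true
    case _ => false
  }
}
```

An equality check against ArrayType(StringType, true) rejects nonNullElems, 
while isStringArray accepts both values.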



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
