[jira] [Commented] (SPARK-3683) PySpark Hive query generates "NULL" instead of None

Tamas Jambor (JIRA) Wed, 29 Oct 2014 02:50:50 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-3683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188185#comment-14188185
 ]


Tamas Jambor commented on SPARK-3683:
-------------------------------------

Thanks for the comments. 
From my perspective this is a matter of inconsistency, as all the other types 
are represented as None in Python, except string. So I run another pass on the 
data and convert all the "NULL" values to None. 
I think the problem with the literal string "NULL" is that you cannot build the 
logic to handle it in the subsequent steps, as it is not represented in the 
appropriate way (i.e. missing values are usually handled as a special case). 
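
For illustration, the extra pass described above might look roughly like the 
following. This is only a sketch in plain Python, assuming each record is 
available as a dict; the names null_to_none and clean_record are hypothetical 
and not part of PySpark. It also shows the limitation: a column that 
genuinely contains the text "NULL" cannot be told apart from a missing value.

```python
def null_to_none(value, marker="NULL"):
    """Return None for the literal marker string, otherwise the value unchanged."""
    if isinstance(value, str) and value == marker:
        return None
    return value


def clean_record(record):
    """Apply null_to_none to every field of a dict-like record (hypothetical helper)."""
    return {key: null_to_none(val) for key, val in record.items()}


# Sample records standing in for query results (made-up data).
rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "NULL"},    # string column: missing value arrived as text
    {"id": None, "name": "bob"},  # non-string column: already None
]

cleaned = [clean_record(r) for r in rows]
```

With an RDD of such dicts, the same function could presumably be passed to 
map(), at the cost of one more pass over the data.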




> PySpark Hive query generates "NULL" instead of None
> ---------------------------------------------------
>
>                 Key: SPARK-3683
>                 URL: https://issues.apache.org/jira/browse/SPARK-3683
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.1.0
>            Reporter: Tamas Jambor
>            Assignee: Davies Liu
>
> When I run a Hive query in Spark SQL, I get the new Row object, but it does 
> not convert Hive NULL into Python None; instead it keeps it as the string 
> 'NULL'. This is only an issue with the String type; it works with other types.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

