[ 
https://issues.apache.org/jira/browse/ARROW-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192701#comment-15192701
 ] 

Dan Robinson commented on ARROW-62:
-----------------------------------

For whatever it's worth: it seems PostgreSQL uses 0 in a null bitmap to 
indicate null values 
(http://www.postgresql.org/docs/8.0/static/storage-page-layout.html) while 
MySQL and SQL Server use 1 
(https://dev.mysql.com/doc/internals/en/null-bitmap.html, 
http://www.sqlpassion.at/archive/2011/06/29/the-mystery-of-the-null-bitmap-mask/).
 And of course Drill uses 0, while NumPy uses 1. So there does not seem to be 
an established convention yet. IMHO the validity-map approach, in which 0 
marks a null value, is a little more elegant.


> Format: Are the nulls bits 0 or 1 for null values?
> --------------------------------------------------
>
>                 Key: ARROW-62
>                 URL: https://issues.apache.org/jira/browse/ARROW-62
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Format
>            Reporter: Wes McKinney
>            Assignee: Wes McKinney
>
> As brought up by Dan Robinson on the mailing list (thank you for catching 
> this!), the format documents are inconsistent with the imported 
> ValueVectors code in how nulls are represented. Since I drafted these 
> format documents initially, I'll take the blame for the inconsistency, 
> but:
> * Drill / ValueVectors uses the value 0 for null data, and 1 for non-null data
> * The format document currently states the opposite (values are null if the 
> bit is set)
> I can see arguments both ways, but one argument for the ValueVectors style is 
> that values must be explicitly set to be non-null, versus uninitialized 
> values being accidentally interpreted as being non-null. When initializing a 
> bitmap, one can {{memset}} the bits to 0, then set them to 1 when non-null 
> values are appended during construction.
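
The zero-initialize-then-set pattern described in the issue can be sketched 
roughly as follows. This is only an illustration, not the ValueVectors or 
Arrow implementation: the class name ValidityBitmap and its methods are 
hypothetical, least-significant-bit-first ordering within each byte is an 
assumption, and Python's zero-filled bytearray stands in for the {{memset}} 
call.

```python
class ValidityBitmap:
    """Hypothetical validity bitmap: bit i is 1 for a value, 0 for a null."""

    def __init__(self, capacity):
        # bytearray(n) is zero-filled, the analogue of memset(bits, 0, n):
        # every slot reads as null until it is explicitly marked valid.
        self.bits = bytearray((capacity + 7) // 8)
        self.length = 0

    def append(self, is_valid):
        i = self.length
        if i >= len(self.bits) * 8:
            raise IndexError("bitmap is full")
        if is_valid:
            # Assumed bit order: least-significant bit first within a byte.
            self.bits[i // 8] |= 1 << (i % 8)
        self.length += 1

    def is_null(self, i):
        return ((self.bits[i // 8] >> (i % 8)) & 1) == 0


bitmap = ValidityBitmap(10)
for value in [1, None, 3]:
    bitmap.append(value is not None)
```

With this convention a slot that was never written, such as index 3 above, 
still reads as null, so a forgotten update cannot make garbage look like a 
valid value.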



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
