Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/13680
@cloud-fan thank you for your helpful comment. I also read the [previous
proposal](https://github.com/apache/spark/pull/12640#discussion_r61539393).
I would love to have only a single format (or implementation). Because I thought
there were reasons to keep the old format, I introduced a new dense format.
IMHO, a new unified format should have three properties.
1. Remove the indirect offset (for performance and footprint)
2. Be able to carry null bits (for generality)
3. Quickly tell whether an array contains any null value (for
performance, in particular for primitive arrays)
Based on them, how about this single format?
```
[numElements] [all zero in null bits?] [null bits] [values] [variable length portion]
```
If we want to reduce the memory footprint for primitive arrays, we
can drop the ```[null bits]``` part when ```[all zero in null bits?]``` has a special
value.
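
To make the proposed layout concrete, here is a minimal Python sketch of how such a buffer could be packed and read back. It is only an illustration, not Spark's `UnsafeArrayData` code. The field widths (a 4-byte `numElements`, a 4-byte flag, 64-bit null-bit words, 8-byte values), the flag constants `NO_NULLS` / `HAS_NULLS`, and the helper names `encode` / `decode` are all assumptions made for the example. The variable length portion is left out.

```python
import struct

# Hypothetical flag values for the [all zero in null bits?] field.
NO_NULLS = 1   # "special value": the [null bits] region is omitted
HAS_NULLS = 0  # the [null bits] region is present


def encode(values):
    """Pack ints (or None) as [numElements][flag][null bits?][values]."""
    n = len(values)
    has_null = any(v is None for v in values)
    header = struct.pack("<ii", n, HAS_NULLS if has_null else NO_NULLS)

    null_bits = b""
    if has_null:
        # One bit per element, rounded up to whole 64-bit words.
        words = [0] * ((n + 63) // 64)
        for i, v in enumerate(values):
            if v is None:
                words[i // 64] |= 1 << (i % 64)
        null_bits = struct.pack("<%dQ" % len(words), *words)

    # Values are stored inline at a fixed stride, so no indirect offset is needed.
    body = struct.pack("<%dq" % n, *[0 if v is None else v for v in values])
    return header + null_bits + body


def decode(buf):
    """Read back a buffer produced by encode()."""
    n, flag = struct.unpack_from("<ii", buf, 0)
    offset = 8
    words = ()
    if flag == HAS_NULLS:
        nwords = (n + 63) // 64
        words = struct.unpack_from("<%dQ" % nwords, buf, offset)
        offset += 8 * nwords
    raw = struct.unpack_from("<%dq" % n, buf, offset)
    return [
        None if words and (words[i // 64] >> (i % 64)) & 1 else raw[i]
        for i in range(n)
    ]
```

With these assumed widths, a three-element array without nulls takes 8 bytes of header plus 24 bytes of values, while the same array with one null adds a single 8-byte null-bit word. A reader can also check the flag once and skip per-element null checks entirely when it is `NO_NULLS`.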