Justin Talbot created ARROW-12101:
-------------------------------------

             Summary: [Format] Consider adding int0 and other small integer types for more efficient Dictionary encoding
                 Key: ARROW-12101
                 URL: https://issues.apache.org/jira/browse/ARROW-12101
             Project: Apache Arrow
          Issue Type: Wish
          Components: Format
            Reporter: Justin Talbot


I often need to store single-valued columns, i.e. columns in which every row 
holds the same value. As far as I know, the current Arrow format has no 
efficient way to represent these. One possible improvement would be to 
introduce an int0 type (where every value is 0) that, like null, has no buffer 
allocated. It could then be used as the index type of a Dictionary with a 
single value.
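
To make the potential saving concrete, here is a rough sketch of the index 
buffer arithmetic. It is an illustration only: int0 does not exist in Arrow, 
the helper names index_buffer_bytes and decode_int0 are made up for this 
example, and the sizes ignore buffer padding and validity bitmaps.

```python
def index_buffer_bytes(num_rows: int, bit_width: int) -> int:
    """Bytes needed to hold num_rows indices of bit_width bits each,
    rounded up to whole bytes (padding/alignment is ignored)."""
    return (num_rows * bit_width + 7) // 8


def decode_int0(num_rows: int, dictionary: list) -> list:
    """Decode a column whose (hypothetical) int0 indices are all implicitly 0.
    No index buffer is read; only the row count and the dictionary are needed."""
    return [dictionary[0]] * num_rows


num_rows = 1_000_000
dictionary = ["US"]  # single-valued column: one dictionary entry

# int8 is the narrowest integer type available for dictionary indices today.
int8_bytes = index_buffer_bytes(num_rows, 8)

# Hypothetical int0: zero bits per index, so no buffer at all.
int0_bytes = index_buffer_bytes(num_rows, 0)

print(int8_bytes, int0_bytes)
```

Under these assumptions the int8 index buffer costs one byte per row, while 
the int0 column would carry only its length and the one-entry dictionary.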

For low-cardinality columns, I also often find myself wishing for int1, int2, 
and int4 types to use as index types.
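
As a sketch of what sub-byte indices could look like, the snippet below picks 
the narrowest width for a given dictionary size and packs indices into bytes. 
Everything here is an assumption for illustration: the function names are 
hypothetical, indices are treated as unsigned, and the 
least-significant-bit-first packing order is just one possible layout, not 
something the Arrow format defines for these types.

```python
def smallest_index_bits(cardinality: int) -> int:
    """Narrowest width (0, 1, 2, 4, 8, ... bits) that can index
    `cardinality` distinct dictionary entries."""
    needed = max(cardinality - 1, 0).bit_length()
    for width in (0, 1, 2, 4, 8, 16, 32, 64):
        if needed <= width:
            return width
    raise ValueError("cardinality too large")


def pack_indices(indices: list, bit_width: int) -> bytes:
    """Pack unsigned indices of bit_width (0, 1, 2, 4 or 8) bits into bytes,
    filling each byte from its least significant bits upward."""
    if bit_width == 0:
        return b""  # int0: nothing to store
    per_byte = 8 // bit_width
    mask = (1 << bit_width) - 1
    out = bytearray((len(indices) + per_byte - 1) // per_byte)
    for i, value in enumerate(indices):
        if value < 0 or value > mask:
            raise ValueError(f"index {value} does not fit in {bit_width} bits")
        out[i // per_byte] |= value << ((i % per_byte) * bit_width)
    return bytes(out)


def unpack_indices(packed: bytes, bit_width: int, count: int) -> list:
    """Inverse of pack_indices; `count` is the number of rows."""
    if bit_width == 0:
        return [0] * count
    per_byte = 8 // bit_width
    mask = (1 << bit_width) - 1
    return [
        (packed[i // per_byte] >> ((i % per_byte) * bit_width)) & mask
        for i in range(count)
    ]


dictionary = ["N", "E", "S", "W"]          # cardinality 4
indices = [0, 1, 2, 3, 2, 1]
width = smallest_index_bits(len(dictionary))
packed = pack_indices(indices, width)
decoded = [dictionary[i] for i in unpack_indices(packed, width, len(indices))]
```

With four dictionary entries the chosen width is 2 bits, so the six indices 
fit in 2 bytes instead of the 6 bytes an int8 index would need.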

--
This message was sent by Atlassian Jira
(v8.3.4#803005)