Justin Talbot created ARROW-12101:
-------------------------------------
Summary: [Format] Consider adding int0 and other small integer
types for more efficient Dictionary encoding
Key: ARROW-12101
URL: https://issues.apache.org/jira/browse/ARROW-12101
Project: Apache Arrow
Issue Type: Wish
Components: Format
Reporter: Justin Talbot

I often need to store single-valued columns, and as far as I can tell the
current Arrow format has no efficient way to represent them. One possible
improvement would be to introduce an int0 type (where every value is 0) that,
like null, has no buffer allocated. It could then be used as the index type
for a Dictionary with a single value.

For low-cardinality columns, I also often find myself wishing for int1, int2,
and int4 types to use as the index type.
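
To make the space argument concrete, here is a rough sketch in plain Python.
It is not part of the Arrow format or of any Arrow library. The names
min_index_bits and index_buffer_bytes are hypothetical, the 0-, 1-, 2- and
4-bit widths are only the types proposed above, and the sketch assumes
unsigned, tightly bit-packed indices with no padding or validity bitmap.

```python
# Hypothetical index widths: 0, 1, 2 and 4 bits are the types proposed in
# this issue; 8, 16, 32 and 64 match the widths of existing integer types.
CANDIDATE_WIDTHS = (0, 1, 2, 4, 8, 16, 32, 64)


def min_index_bits(cardinality: int) -> int:
    """Smallest candidate width (in bits) able to index `cardinality` values."""
    if cardinality < 1:
        raise ValueError("a dictionary needs at least one value")
    # Bits needed to represent the largest index, cardinality - 1.
    # One value needs 0 bits, two values need 1, three or four need 2.
    needed = (cardinality - 1).bit_length()
    return next(w for w in CANDIDATE_WIDTHS if w >= needed)


def index_buffer_bytes(num_rows: int, bits: int) -> int:
    """Bytes for a bit-packed index buffer (assumption: no padding)."""
    return (num_rows * bits + 7) // 8


if __name__ == "__main__":
    rows = 1_000_000
    for cardinality in (1, 2, 4, 16, 200):
        bits = min_index_bits(cardinality)
        print(cardinality, bits, index_buffer_bytes(rows, bits))
```

Under these assumptions a single-valued column needs no index buffer at all,
and a two-valued column needs one eighth of the bytes an 8-bit index would
use.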