Wes McKinney created ARROW-81:
---------------------------------

             Summary: C++: Add a Category nested type
                 Key: ARROW-81
                 URL: https://issues.apache.org/jira/browse/ARROW-81
             Project: Apache Arrow
          Issue Type: New Feature
          Components: C++
            Reporter: Wes McKinney
            Assignee: Wes McKinney

A Category (or "factor") is a dictionary-encoded array whose dictionary has semantic meaning. The data consists of

- An array of integer "codes"
- A child array of some other type, known as the "categories" or "levels" of the array

Python and R, at least, require this type as consumers of Arrow C++.

Separately, we should consider what is necessary to transmit category data in IPCs -- possibly an expansion of the format.
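
To make the layout described in this issue concrete, here is a minimal C++ sketch of a Category as an array of integer codes plus a child array of categories. It is an illustration only and not the Arrow C++ API: the names `CategoryArray`, `EncodeCategory`, and `DecodeCategory` are hypothetical, the categories are assumed to be strings, the codes are assumed to be 32-bit, and nulls are not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration only; not the Arrow C++ API.
struct CategoryArray {
  std::vector<int32_t> codes;           // one integer code per logical value
  std::vector<std::string> categories;  // child array: the "levels"
};

// Dictionary-encode a list of strings, assigning codes in order of
// first appearance.
inline CategoryArray EncodeCategory(const std::vector<std::string>& values) {
  CategoryArray out;
  std::unordered_map<std::string, int32_t> index;
  out.codes.reserve(values.size());
  for (const std::string& v : values) {
    auto it = index.find(v);
    if (it == index.end()) {
      // First time this value is seen: append it as a new category.
      const int32_t code = static_cast<int32_t>(out.categories.size());
      it = index.emplace(v, code).first;
      out.categories.push_back(v);
    }
    out.codes.push_back(it->second);
  }
  return out;
}

// Recover the logical values by looking each code up in the categories.
inline std::vector<std::string> DecodeCategory(const CategoryArray& arr) {
  std::vector<std::string> values;
  values.reserve(arr.codes.size());
  for (int32_t code : arr.codes) {
    if (code < 0 ||
        static_cast<std::size_t>(code) >= arr.categories.size()) {
      throw std::out_of_range("code does not refer to a category");
    }
    values.push_back(arr.categories[static_cast<std::size_t>(code)]);
  }
  return values;
}
```

Under these assumptions, encoding the values a, b, a, c, b gives the codes 0, 1, 0, 2, 1 and the categories a, b, c. Because the child array is stored once and shared by every code, it can carry semantic meaning (such as the set of levels of an R factor) independently of which values happen to occur in the data.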