Hi Pino, can you reply on JIRA for the sake of keeping the discussion in one place?

From what I know about sector categorization, I think this is a slightly separate question -- here we are only concerned with the metadata and memory representation of data with a fixed number of categories (where the categories have some semantic meaning in the analysis, e.g. ordering).

On Fri, Aug 19, 2016 at 8:18 AM, pino patera <pino.pat...@gmail.com> wrote:
> For the Financial World, category time series are very important (i.e.
> industry/sector categories are different over time). How would this
> structure look like in this scenario?
>
> On Fri, Aug 19, 2016 at 5:12 PM Jacques Nadeau (JIRA) <j...@apache.org>
> wrote:
>
>>
>> [
>> https://issues.apache.org/jira/browse/ARROW-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428316#comment-15428316
>> ]
>>
>> Jacques Nadeau commented on ARROW-81:
>> -------------------------------------
>>
>> Can you guys provide two small example datasets in JSON format here?
>>
>> > C++: Add a Category nested type
>> > -------------------------------
>> >
>> > Key: ARROW-81
>> > URL: https://issues.apache.org/jira/browse/ARROW-81
>> > Project: Apache Arrow
>> > Issue Type: New Feature
>> > Components: C++
>> > Reporter: Wes McKinney
>> > Assignee: Wes McKinney
>> >
>> > A Category (or "factor") is a dictionary-encoded array whose dictionary
>> has semantic meaning. The data consists of
>> > - An array of integer "codes"
>> > - A child array of some other type, known as the "categories" or
>> "levels" of the array. Typically there is an "ordered" boolean flag
>> indicating whether the order of the categories is meaningful.
>> > Category/factor types are used in a number of common statistical
>> analyses. See, for example,
>> http://www.voteview.com/R_Ordered_Logistic_or_Probit_Regression.htm. It
>> is a basic requirement for Python and R, at least, as Arrow C++ consumers,
>> to have this type. Separately, we should consider what is necessary to be
>> able to transmit category data in IPCs -- possible an expansion of the
>> Arrow format.
>>
>> --
>> This message was sent by Atlassian JIRA
>> (v6.3.4#6332)
>>
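
For reference, the codes/categories layout described in the quoted issue can be sketched roughly as follows. This is only an illustration in plain Python: the `Category` class, its field names, and the JSON shape are hypothetical and are not part of Arrow's API or format. It also shows one possible form a small JSON example dataset could take.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Category:
    """Hypothetical illustration of a dictionary-encoded array (not an Arrow API)."""

    codes: List[int]        # one integer per value, indexing into `categories`
    categories: List[str]   # the distinct "categories" / "levels"
    ordered: bool = False   # whether the order of `categories` is meaningful

    def decode(self) -> List[str]:
        # Expand the integer codes back into the values they stand for.
        return [self.categories[code] for code in self.codes]

    def to_json(self) -> str:
        # Assumed JSON shape, chosen only for this example.
        return json.dumps(
            {
                "codes": self.codes,
                "categories": self.categories,
                "ordered": self.ordered,
            }
        )


# Four values drawn from three ordered levels.
ratings = Category(
    codes=[0, 2, 1, 2],
    categories=["low", "medium", "high"],
    ordered=True,
)
decoded = ratings.decode()
as_json = ratings.to_json()
```

Here the number of categories is fixed, and only the integer codes vary per value, which is the case the reply above is concerned with.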