Wes McKinney created ARROW-5340:
-----------------------------------

             Summary: [C++] See if possible to deduplicate dictionaries in IPC 
streams in some way
                 Key: ARROW-5340
                 URL: https://issues.apache.org/jira/browse/ARROW-5340
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Wes McKinney


As follow-on work to ARROW-3144, there are cases where a dictionary may be 
shared by multiple fields in a RecordBatch. 

{{arrow::ipc::DictionaryMemo}} presumes a 1-to-1 mapping between fields and 
dictionaries, and dictionary ids are assigned before the dictionaries 
themselves are observed, so at that point it is not known whether a 
dictionary is used more than once. Deduplication may therefore not be 
feasible, or at least not easy.
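
To make the idea concrete, here is a minimal sketch of id assignment keyed on 
dictionary identity rather than on field. It is not the existing 
{{arrow::ipc::DictionaryMemo}} API: the names {{Dictionary}}, 
{{SharedDictionaryMemo}} and {{GetOrAssignId}} are hypothetical, and the 
sketch assumes the dictionary is already in hand when the id is requested, 
which is exactly what the current id assignment order does not provide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a dictionary's values; not an Arrow type.
using Dictionary = std::vector<std::string>;

// Hypothetical memo that hands out one id per distinct dictionary object,
// so fields sharing a dictionary also share an id.
class SharedDictionaryMemo {
 public:
  // Returns the id for `dict`, assigning a fresh id only the first time
  // this particular dictionary object is seen.
  std::int64_t GetOrAssignId(const std::shared_ptr<Dictionary>& dict) {
    auto it = ids_.find(dict.get());
    if (it != ids_.end()) return it->second;
    const std::int64_t id = static_cast<std::int64_t>(ids_.size());
    ids_.emplace(dict.get(), id);
    kept_.push_back(dict);  // keep it alive so the address is never reused
    return id;
  }

  // Number of distinct dictionaries that would need to be written.
  std::size_t num_dictionaries() const { return ids_.size(); }

 private:
  std::unordered_map<const Dictionary*, std::int64_t> ids_;
  std::vector<std::shared_ptr<Dictionary>> kept_;
};
```

Keying on pointer identity only catches fields that share the same 
dictionary object. Two separately built dictionaries with equal values would 
still get two ids unless the contents were compared or hashed.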



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
