jmgpeeters opened a new pull request #9629:
URL: https://github.com/apache/arrow/pull/9629


   The only code change required, AFAICT, is the calculation of num_dicts, 
which is no longer simply the number of fields, but rather the number of unique 
dictionary ids those fields point to. I'm calculating this on demand, as it's 
quite cheap and not called frequently, but I could also precompute it on every 
addField.
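
   As a rough illustration of the on-demand calculation described above, here 
is a minimal, self-contained sketch. It is not the code in this PR: the class 
`FieldToIdSketch`, its members, and the use of a `std::map` keyed by field path 
are all hypothetical stand-ins chosen for the example. Only the names 
`num_dicts` and `addField` come from the description.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a field-path -> dictionary-id mapping.
// Not Arrow's actual class; all names here are illustrative only.
class FieldToIdSketch {
 public:
  // Record that the field at `path` uses the dictionary with this `id`.
  // Several fields may share the same id.
  void AddField(std::vector<int> path, int64_t id) {
    field_to_id_[std::move(path)] = id;
  }

  // Number of registered dictionary-encoded fields.
  int num_fields() const { return static_cast<int>(field_to_id_.size()); }

  // Number of distinct dictionaries, computed on demand by collecting
  // the unique ids that the registered fields point to.
  int num_dicts() const {
    std::set<int64_t> unique_ids;
    for (const auto& entry : field_to_id_) {
      unique_ids.insert(entry.second);
    }
    return static_cast<int>(unique_ids.size());
  }

 private:
  std::map<std::vector<int>, int64_t> field_to_id_;
};
```

   With three fields registered, two of which share id 0, `num_fields()` 
returns 3 while `num_dicts()` returns 2. The precomputing alternative would 
maintain the set of ids inside `AddField` instead of rebuilding it on each 
call.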
   
   For now, I've added tests that read materialised data generated from Java, 
as C++ doesn't yet support writing IPC with shared dictionaries either (and 
adding that is out of scope here). 
   
   Down the line, I would like full read & write support for shared 
dictionaries across at least C++, Python, Java and Julia, so I'll be coming 
back to this with follow-up PRs where needed. As part of that, I'll also 
change the tests to no longer rely on materialised files, but use the 
round-trip mechanism.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

