[GitHub] [arrow-adbc] lidavidm commented on pull request #765: feat(format): add additional features to 1.1.0 spec

via GitHub Wed, 21 Jun 2023 06:44:19 -0700


lidavidm commented on PR #765:
URL: https://github.com/apache/arrow-adbc/pull/765#issuecomment-1600863206


   Updated to include some additional common statistics after reviewing Hive. 
   
   It may also be good to include the 'histogram' statistic (Oracle, Hive, 
PostgreSQL) but the encoding for this gets very messy with Arrow: either you 
need a union[list[int], list[double], ...], or you have to devise some way to 
pack it into a binary column. (Or possibly we can just include list[binary] and 
declare that things need to be packed there.)
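
   As a rough illustration of the list[binary] option, here is a minimal sketch in Python using only the standard library. The 8-byte little-endian layout, the `pack_bounds`/`unpack_bounds` helpers, and the `kind` tag are all assumptions made for this example; nothing here is part of the ADBC spec, and a reader would still need to learn the column type from elsewhere to decode the values.

```python
import struct

# Hypothetical encoding, not defined by ADBC: each histogram bucket boundary
# is packed into its own 8-byte little-endian value, so one histogram fits in
# a single list[binary] cell regardless of the underlying column type.
_FORMATS = {"int64": "<q", "float64": "<d"}


def pack_bounds(bounds, kind):
    """Pack bucket boundaries into a list of bytes objects."""
    fmt = _FORMATS[kind]
    return [struct.pack(fmt, b) for b in bounds]


def unpack_bounds(packed, kind):
    """Reverse pack_bounds; the caller must already know the value type."""
    fmt = _FORMATS[kind]
    return [struct.unpack(fmt, p)[0] for p in packed]


# Histograms over an integer column and a floating-point column end up with
# the same physical shape (a list of 8-byte binaries).
int_hist = pack_bounds([0, 10, 100], "int64")
float_hist = pack_bounds([0.5, 1.5, 2.5], "float64")
```

   The trade-off this shows is that the schema stays simple (no union of list types), but the meaning of the bytes is no longer self-describing.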
   
   Regardless, the statistic schema is now fairly complicated with nesting, 
unions, and dictionary-encoding.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
