[GitHub] [arrow-rs] yjshen commented on pull request #2593: Comparable Row Format

GitBox Thu, 08 Sep 2022 19:34:56 -0700


yjshen commented on PR #2593:
URL: https://github.com/apache/arrow-rs/pull/2593#issuecomment-1241430450


   Great to see this happening!
   
   I suggest we move the majority of the code in this PR to the DataFusion repo 
and only keep the API changes on the arrow sort compute kernel (the visibility 
changes) in arrow-rs. My suggestion has two motivations: we could ease 
development by iterating in a single repo (DataFusion) instead of waiting 
on a separate arrow-rs release, and we could avoid the confusion of having two 
row modules in two repos.
   
   After checking the usage of this comparable row format in 
https://github.com/apache/arrow-datafusion/pull/3386, I think it's still valid 
for us to have three variants of the row format to serve different purposes: 
one for storage efficiency, one for update efficiency, and one for sort 
efficiency. For example, if we used this comparable format for the aggregation 
buffer, we would need to repeatedly flip bytes back and forth for each cell 
update.
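   
   To illustrate the update cost, here is a minimal sketch. It assumes the 
comparable format stores a signed 64-bit integer as big-endian bytes with the 
sign bit flipped, so that byte-wise comparison matches numeric order. That 
encoding and all function names below are assumptions made for illustration, 
not code taken from this PR.

```rust
/// Assumed comparable encoding of an i64: big-endian bytes with the sign bit
/// flipped, so unsigned byte-wise comparison agrees with numeric order.
fn encode_comparable(v: i64) -> [u8; 8] {
    let mut bytes = v.to_be_bytes();
    bytes[0] ^= 0x80; // flip the sign bit
    bytes
}

/// Inverse of `encode_comparable`.
fn decode_comparable(mut bytes: [u8; 8]) -> i64 {
    bytes[0] ^= 0x80; // flip the sign bit back
    i64::from_be_bytes(bytes)
}

/// Hypothetical aggregation update on a comparable cell:
/// every update has to decode, add, and re-encode.
fn add_to_comparable_cell(cell: &mut [u8; 8], delta: i64) {
    let current = decode_comparable(*cell);
    *cell = encode_comparable(current.wrapping_add(delta));
}

/// The same update on a natively stored cell is a single in-place add.
fn add_to_native_cell(cell: &mut i64, delta: i64) {
    *cell = cell.wrapping_add(delta);
}
```

   With such an encoding, a running sum kept in a comparable cell pays the 
decode and re-encode steps on every input row, while a format designed for 
updates can add in place.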


