westonpace opened a new pull request, #13218:
URL: https://github.com/apache/arrow/pull/13218

   The primary goal of this refactor of old code was to improve the readability 
and clarity of the code base. I did not make any functional changes to the code 
and if any functional changes are suggested which modify existing code I will 
happily discuss them here but defer the changes themselves to follow-up PRs. I 
would very much appreciate any feedback on naming, making sure we have 
sufficient test coverage, and overall layout of the code.
   
   * KeyRowArray -> RowTableImpl KeyEncoder -> RowTableEncoder: The old name 
made sense because this data is currently represented physically as an array of 
rows. However, the data is conceptually tabular. We are storing rows & columns. 
In particular, I found it confusing that `KeyColumnArray` was a 1D data 
structure while `KeyRowArray` was a 2D table structure.
   * KeyEncoder::Context -> LightContext: There's nothing particular to the key 
encoder here and I worry keeping it there may lead to fracturing into many 
different "context" objects. 
   * Overall structure: I created a new folder arrow/compute/row and put all 
row-based utilities in here. Most of the files are now marked as _internal and 
the content in these files is not used outside of arrow/compute/row. The 
grouper had previously been alongside the kernel code and it didn't really 
belong there as it relies very heavily on the internal structure of the row 
encoding.
   * Row structure: I documented the file arrow/compute/row/row_internal.h


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to