westonpace opened a new pull request, #13218: URL: https://github.com/apache/arrow/pull/13218
The primary goal of this refactor of old code was to improve the readability and clarity of the code base. I did not make any functional changes to the code and if any functional changes are suggested which modify existing code I will happily discuss them here but defer the changes themselves to follow-up PRs. I would very much appreciate any feedback on naming, making sure we have sufficient test coverage, and overall layout of the code. * KeyRowArray -> RowTableImpl KeyEncoder -> RowTableEncoder: The old name made sense because this data is currently represented physically as an array of rows. However, the data is conceptually tabular. We are storing rows & columns. In particular, I found it confusing that `KeyColumnArray` was a 1D data structure while `KeyRowArray` was a 2D table structure. * KeyEncoder::Context -> LightContext: There's nothing particular to the key encoder here and I worry keeping it there may lead to fracturing into many different "context" objects. * Overall structure: I created a new folder arrow/compute/row and put all row-based utilities in here. Most of the files are now marked as _internal and the content in these files is not used outside of arrow/compute/row. The grouper had previously been alongside the kernel code and it didn't really belong there as it relies very heavily on the internal structure of the row encoding. * Row structure: I documented the file arrow/compute/row/row_internal.h -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org