icexelloss commented on PR #13028: URL: https://github.com/apache/arrow/pull/13028#issuecomment-1116459280
@westonpace I am trying to extend the implementation to support multiple keys and key types, and I wonder if you can give some pointers.

Basically, I think I would create a "mapper" that maps an input "row" to a "key" and use that as the hash map key for the given row. This mapper would:
* take the column names/indices of the key columns during initialization
* map a batch + row index -> key
* I am not sure what the type of the "key" should be, but hash join seems to use just "string" as the key type (using RowEncoder::encoded_row)

I also took a look at aggregation for other options but didn't find anything obvious.

Does such a "mapper" class already exist in Arrow compute that I can use for this purpose?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
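
To make the "mapper" idea from the comment concrete, here is a minimal standalone sketch of mapping a batch + row index to a string key. It does not use Arrow: `Column`, `Batch`, `KeyMapper`, and `GroupRows` are hypothetical names invented for illustration, and the byte encoding is an assumption, not the format produced by `RowEncoder::encoded_row`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for Arrow arrays and batches (not Arrow's API).
using Column = std::variant<std::vector<int64_t>, std::vector<std::string>>;
struct Batch {
  std::vector<Column> columns;
};

// Hypothetical mapper: configured once with the key column indices,
// then maps (batch, row index) -> one string usable as a hash map key.
class KeyMapper {
 public:
  explicit KeyMapper(std::vector<std::size_t> key_columns)
      : key_columns_(std::move(key_columns)) {}

  std::string Encode(const Batch& batch, std::size_t row) const {
    std::string key;
    for (std::size_t col : key_columns_) {
      assert(col < batch.columns.size());
      // Dispatch on the column's value type and append that row's value.
      std::visit([&](const auto& values) { AppendValue(values.at(row), &key); },
                 batch.columns[col]);
    }
    return key;
  }

 private:
  // Fixed-width values are appended as raw bytes.
  static void AppendValue(int64_t value, std::string* out) {
    out->append(reinterpret_cast<const char*>(&value), sizeof(value));
  }
  // Strings are length-prefixed so ("ab", "c") and ("a", "bc") stay distinct.
  static void AppendValue(const std::string& value, std::string* out) {
    const uint64_t length = value.size();
    out->append(reinterpret_cast<const char*>(&length), sizeof(length));
    out->append(value);
  }

  std::vector<std::size_t> key_columns_;
};

// Groups row indices by encoded key, using the string as the hash map key.
std::unordered_map<std::string, std::vector<std::size_t>> GroupRows(
    const KeyMapper& mapper, const Batch& batch, std::size_t num_rows) {
  std::unordered_map<std::string, std::vector<std::size_t>> groups;
  for (std::size_t row = 0; row < num_rows; ++row) {
    groups[mapper.Encode(batch, row)].push_back(row);
  }
  return groups;
}
```

Encoding every key column into one string keeps the hash map independent of the number and types of keys, at the cost of building a string per row. The sketch ignores nulls and other column types, which a real implementation would need to handle.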
