rtpsw commented on PR #14682: URL: https://github.com/apache/arrow/pull/14682#issuecomment-1322610029
Sorry, I should have given more context for this PR. @wjones127's [comment](https://github.com/apache/arrow/pull/14682#issuecomment-1322559310) is correct. The concept of user-defined tabular (UDT) functions is [clarified here](https://github.com/apache/arrow/pull/14043#issuecomment-1257178331).

In quick reply to a couple of questions about the naming:

1. Why "UDT"? Logically, a UDT function returns a (user-defined) table. The subdivision of the resulting table into batches does not affect this.
2. Why "tabular"? Physically, the function produces (a sequence of) structures equivalent to a table. The code in this PR only supports flat struct-arrays, and interprets them as a table.

I'm open to a different name than "tabular", in particular because it caused confusion. However, I'm not in favor of names with "batch" in them due to point 1 above. Could you propose a name that takes the above 2 points into account?

Still related to context for this PR: @wjones127, you asked why there is an in-type parameter at all. While the current implementation does not handle UDT inputs, I'm designing the UDT API to support inputs - see [this explanation](https://github.com/apache/arrow/pull/14043#issuecomment-1257178331).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
