[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-09-09 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-689455228 Hooray!

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-09-06 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-687894822 FYI @jorgecarleitao: I plan to look carefully at this PR again early tomorrow morning, US East Coast (UTC-4) time

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-31 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-683714908 > Final note: this whole discussion may be a bit too detailed for udfs, and we could instead offer a simpler interface to not handle these complex cases. However, this whole

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-31 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-683713809 # E As I think you are suggesting, I think it would be possible to keep the ability to declare simple UDF functions inline even if we allowed a more general trait based general

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-31 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-683713264 # D I think in any UDF implementation, the UDF author will be required to ensure their physical expression implementation matches the type information reported to the DataFusion

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-31 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-683711848 # C > I think that we also need to have in the trait the functions' return type function Yes you are right

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-31 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-683711505 # B I see what you mean now -- I think I was getting confused about which code will do the coercion of input types. I was thinking that the built-in type coercion logic could

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-31 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-683710210 # A > I can imagine the situation where the coercer returns non-deterministic plans due to two variants being valid, and one being picked up by chance. Or, that the order on

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-30 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-683411727 > With this said, as an exercise, let me try to write how I imagine an interface could look like for option 3, just to check if I have the same understanding as you do. I

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-29 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-683275858 I wonder if it would be helpful to think about types in logical and physical plans differently. Let's take @jorgecarleitao's `concat` as an example and test the output of

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-24 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-679075478 I will try to review this carefully sometime later today -- I am on vacation this week with my family, so my responses will likely be delayed compared to normal (not that I

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-24 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-679069890 > I would prefer that the user does not have to have this burden: it registers a UDF with the type, and then just plans a call without its return type, during planning.

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-23 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-678780754 > When you say data_type, do you mean the arguments' types or the return_type? I was referring to `Expr::ScalarFunction::data_type`:

[GitHub] [arrow] alamb commented on pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-08-23 Thread GitBox
alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-678766915 @jorgecarleitao -- another thing I can think of would be to postpone the UDF resolution until the type coercion logical optimizer pass. So in other words, when converting