alamb commented on pull request #7967: URL: https://github.com/apache/arrow/pull/7967#issuecomment-679075478
I will try to review this carefully later today. I am on vacation this week with my family, so my responses will likely be delayed compared to normal (not that I have been as prompt as you have been anyway :) )

On Sun, Aug 23, 2020 at 9:32 AM Jorge Leitao <notificati...@github.com> wrote:

> The code you pointed to reads return_type: DataType. I will assume you
> mean the return type declared in Expr::ScalarFunction.
>
> Two minds thinking alike: I was just trying to do that in the codebase.
>
> Unfortunately, I do not think that is sufficient 😞 : when a
> projection is declared, we need to resolve its schema's type, which we do
> via Expr::get_type. If we do not have the UDF's return_type on
> Expr::ScalarFunction, we can't know its return type, which means we can't
> even project (even before optimizations).
>
> But to get the UDF's DataType, we need to access the UDF's registry. What
> we currently do is let the user decide the DataType for us in the logical
> plan via the call scalar_function("name", vec![args..], DATATYPE).
> Unfortunately, this means that the user needs to know the return type of
> the UDF, or it will all break during planning, when the physical plan has
> nothing to do with the logical one. I would prefer that the user does not
> have this burden: they register a UDF with its type, and then just
> plan a call without specifying its return type.
>
> I am formalizing a proposal to address this. The gist is that we can't
> have "meta" of UDFs in the logical plan: they need to know their return
> type, which means that we need to access the registry during planning.
>
> I am developing some ideas for this here
> <https://docs.google.com/document/d/1Kzz642ScizeKXmVE1bBlbLvR663BKQaGqVIyy9cAscY/edit?usp=sharing>
> .
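
To make the registry idea in the quoted message concrete, here is a minimal sketch of resolving a UDF's return type from a registry while planning, so the caller never has to pass a DataType. Everything in it is a hypothetical stand-in written for illustration: the `DataType` and `Expr` enums are simplified, and `Registry`, `register`, and this `get_type` signature are assumptions, not DataFusion's actual API or the design from the linked proposal.

```rust
use std::collections::HashMap;

/// Simplified stand-in for Arrow's `DataType` (assumption, not the real enum).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Float64,
    Utf8,
}

/// Hypothetical logical expression: the UDF call carries no return type.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    ScalarFunction { name: String, args: Vec<Expr> },
}

/// Hypothetical registry mapping a UDF name to its declared return type.
#[derive(Default)]
struct Registry {
    return_types: HashMap<String, DataType>,
}

impl Registry {
    /// The user declares the return type once, at registration time.
    fn register(&mut self, name: &str, return_type: DataType) {
        self.return_types.insert(name.to_string(), return_type);
    }
}

/// Resolve an expression's type during planning: columns come from the
/// schema, UDF calls come from the registry rather than from the caller.
fn get_type(
    expr: &Expr,
    schema: &HashMap<String, DataType>,
    registry: &Registry,
) -> Result<DataType, String> {
    match expr {
        Expr::Column(name) => schema
            .get(name)
            .cloned()
            .ok_or_else(|| format!("unknown column: {}", name)),
        Expr::ScalarFunction { name, args } => {
            // Make sure every argument can itself be typed.
            for arg in args {
                get_type(arg, schema, registry)?;
            }
            registry
                .return_types
                .get(name)
                .cloned()
                .ok_or_else(|| format!("unregistered UDF: {}", name))
        }
    }
}
```

With this shape, a call to an unregistered UDF fails while the projection's type is being resolved, and a registered one is typed without the user repeating its return type.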