wmoustafa commented on PR #36593: URL: https://github.com/apache/spark/pull/36593#issuecomment-1142712351
@beliefer, glad to see interest and progress in cross-platform SQL/UDF pushdown. Have you considered building this on frameworks such as Transport [[1](https://github.com/linkedin/transport), [2](https://engineering.linkedin.com/blog/2018/11/using-translatable-portable-UDFs)] for UDFs and Coral [[1](https://github.com/linkedin/coral), [2](https://engineering.linkedin.com/blog/2020/coral)] for SQL? With Transport, one implements a function once and it is executable in Spark as well as in other data sources. The engine-specific variants are generated automatically, and each one natively accesses the in-memory records of its engine/data source. With Coral, one can apply transformations/rewrites to built-in functions and SQL expressions so that they keep the same semantics in the underlying engine/data source. For example, it can be used to push down complex functions/SQL expressions from Spark to Trino even though the two use different syntax. This PR might not be the best place to discuss this in detail, but I am happy to file a JIRA ticket to carry this forward. cc: @xkrogen.
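
As a footnote to the Coral point above, here is a toy sketch of what a function-level rewrite step could look like. It is not Coral's API: the `Call`/`Column`/`Literal` classes, the `SPARK_TO_TRINO` table, and the `rewrite`/`to_sql` helpers are all hypothetical names made up for illustration, and the three function pairs are assumed to be close enough semantically for the purpose of the example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: Union[int, str]


@dataclass(frozen=True)
class Call:
    func: str
    args: Tuple["Expr", ...]


Expr = Union[Column, Literal, Call]

# Hypothetical rename table (an assumption for this sketch): Spark built-ins
# on the left, the Trino functions assumed to be equivalent on the right.
SPARK_TO_TRINO = {
    "instr": "strpos",
    "nvl": "coalesce",
    "array_contains": "contains",
}


def rewrite(expr: Expr) -> Expr:
    """Recursively rename function calls; columns and literals pass through."""
    if isinstance(expr, Call):
        new_name = SPARK_TO_TRINO.get(expr.func, expr.func)
        return Call(new_name, tuple(rewrite(arg) for arg in expr.args))
    return expr


def to_sql(expr: Expr) -> str:
    """Render the expression tree as a SQL string."""
    if isinstance(expr, Column):
        return expr.name
    if isinstance(expr, Literal):
        if isinstance(expr.value, str):
            # Escape single quotes by doubling them, as in standard SQL.
            return "'" + expr.value.replace("'", "''") + "'"
        return str(expr.value)
    return f"{expr.func}({', '.join(to_sql(arg) for arg in expr.args)})"


spark_expr = Call("nvl", (Call("instr", (Column("url"), Literal("?"))), Literal(0)))
print(to_sql(spark_expr))           # nvl(instr(url, '?'), 0)
print(to_sql(rewrite(spark_expr)))  # coalesce(strpos(url, '?'), 0)
```

A real translation layer would also need to handle differences in argument order, types, and null handling, which a plain rename table like this one does not cover.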
