hvanhovell opened a new pull request, #47839: URL: https://github.com/apache/spark/pull/47839
### What changes were proposed in this pull request?

This PR moves the SQL `Column` to `sql/api`. A (foreseen) consequence of this is that we need to move the Spark Connect Scala Client to the shared API, which makes the PR quite large.

### Why are the changes needed?

We want to create a Scala Client interface that is shared between Classic and Connect.

### Does this PR introduce _any_ user-facing change?

Connect does not support untyped Scala UDFs until we figure out a solution for the conf system.

### How was this patch tested?

Mostly existing tests. I had to change a couple of tests on the Connect side:

- UDAF-related tests used a feature that was not supposed to work in the first place: you could use a typed aggregator on an untyped Dataset (`Row`). I have added a test for the scenario where you use an untyped aggregator on an untyped Dataset.
- In some cases the shared Column API generates slightly different structures (more or fewer arguments, different names, or a different structure). The golden files used by `PlanGenerationTestSuite` and `ProtoToParsedPlanTestSuite` were updated to reflect this. The protos with non-trivial changes were renamed with the `_orphaned` suffix, so that we keep testing plans produced by older clients.

### Was this patch authored or co-authored using generative AI tooling?

No.
