GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/23178
[SPARK-26216][SQL] Do not use case class as public API (UserDefinedFunction)
## What changes were proposed in this pull request?
Using a case class as a public API is a bad idea because it exposes a very wide
surface: the `copy` method, its fields, the companion object, etc.
Take `UserDefinedFunction` as a particular case. It has a private constructor,
and I believe we only want users to access a few methods: `apply`, `nullable`,
`asNonNullable`, etc.
However, all of its fields, the `copy` method, and the companion object are
unintentionally public. As a result, we have had to use many tricks to work
around the binary compatibility issues.
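To illustrate how wide that surface is, here is a minimal sketch. `ExposedUdf` and `CaseClassSurface` are hypothetical names and not Spark's actual definitions, and the sketch uses an ordinary public constructor for simplicity. It only shows the members the compiler generates for any case class.

```scala
object CaseClassSurface {
  // Hypothetical stand-in for a case class that is exposed as public API.
  final case class ExposedUdf(f: Int => Int, nullable: Boolean = true)

  def main(args: Array[String]): Unit = {
    val udf = ExposedUdf(_ + 1)

    // Every constructor parameter becomes a public accessor.
    val fn: Int => Int = udf.f

    // The generated `copy` lets callers build modified instances.
    val changed = udf.copy(nullable = false)

    // The companion's generated extractor lets callers pattern match on the fields.
    val ExposedUdf(_, isNullable) = changed

    println(s"fn(1)=${fn(1)}, nullable=$isNullable")
  }
}
```

Once callers depend on any of these generated members, adding or reordering a field changes their signatures, which is where the binary compatibility trouble comes from.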
This PR proposes making only the interfaces public and hiding the
implementations behind private classes. `UserDefinedFunction` is now a pure
trait, and the concrete implementation is `SparkUserDefinedFunction`, which is private.
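The shape of this pattern can be sketched as follows. This is a simplified illustration, not the code in this PR: `SimpleUdf`, `SimpleUdfImpl`, and the `udf` factory are hypothetical names, and the function type is reduced to `Int => Int` to keep the example self-contained.

```scala
object UdfSketch {
  // Public interface: only the members users are meant to rely on.
  trait SimpleUdf {
    def nullable: Boolean
    def apply(x: Int): Int
    def asNonNullable(): SimpleUdf
  }

  // Hidden implementation. It can remain a case class internally, but because
  // it is private, its fields, `copy`, and companion are not public API.
  private final case class SimpleUdfImpl(f: Int => Int, nullable: Boolean)
      extends SimpleUdf {
    override def apply(x: Int): Int = f(x)
    override def asNonNullable(): SimpleUdf =
      if (!nullable) this else copy(nullable = false)
  }

  // Hypothetical factory: callers only ever see the trait type.
  def udf(f: Int => Int): SimpleUdf = SimpleUdfImpl(f, nullable = true)

  def main(args: Array[String]): Unit = {
    val inc = udf(_ + 1)
    val strict = inc.asNonNullable()
    println(inc(41))         // 42
    println(strict.nullable) // false
  }
}
```

With this layout, fields can be added to the implementation class without changing any signature that user code can see.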
This is the first PR in this direction. If it's accepted, I'll
create an umbrella JIRA and fix all the public case classes.
## How was this patch tested?
Existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark udf
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23178.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23178
----
commit 700334f3b14cfe88d6141c8a99ec339ec7a16afc
Author: Wenchen Fan <wenchen@...>
Date: 2018-11-29T13:38:51Z
Do not use case class as public API (UserDefinedFunction)
----