hvanhovell commented on code in PR #47882:
URL: https://github.com/apache/spark/pull/47882#discussion_r1733354114
##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala:
##########
@@ -3591,5 +1507,234 @@ class Dataset[T] private[sql] (
* We cannot deserialize a connect [[Dataset]] because of a class clash on
 * the server side. We null out the instance for now.
*/
+ @scala.annotation.unused("this is used by java serialization")
private def writeReplace(): Any = null
+
+ ////////////////////////////////////////////////////////////////////////////
+ // Return type overrides to make sure we return the implementation instead
Review Comment:
Improve this documentation a bit. There are three reasons for doing this:
- Retain the old signatures for binary compatibility.
- Java compatibility. The Java compiler uses the bytecode signatures, and
those would point to api.Dataset being returned instead of Dataset. This causes
issues when Java code tries to materialize results, or tries to use
functionality that is implementation specific.
- Scala method resolution runs into problems when ambiguous overloads are
scattered across the interface and the implementation; `drop` and `select`
suffered from this.
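
The second point can be sketched with a minimal, hypothetical example (the names `ApiDataset` and `ImplDataset` are illustrative stand-ins, not Spark's actual classes): a covariant return-type override makes the bytecode signature report the implementation type, so Java callers can chain implementation-specific methods without a cast.

```java
public class CovariantReturnDemo {

    // Stand-in for the api.Dataset interface type.
    static class ApiDataset {
        ApiDataset select(String col) { return this; }
    }

    // Stand-in for the implementation Dataset.
    static class ImplDataset extends ApiDataset {
        // Covariant override: narrows the return type so the bytecode
        // signature seen by the Java compiler is ImplDataset, not ApiDataset.
        @Override
        ImplDataset select(String col) { return this; }

        // Implementation-specific functionality not on the interface.
        String collectAsString() { return "result"; }
    }

    public static void main(String[] args) {
        ImplDataset ds = new ImplDataset();
        // Without the covariant override, select(...) would return ApiDataset
        // and this chain would not compile from Java.
        System.out.println(ds.select("a").collectAsString());
    }
}
```

If `ImplDataset` did not override `select`, the call `ds.select("a").collectAsString()` would fail to compile, because the Java compiler resolves against the declared return type in the bytecode.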
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]