[GitHub] [spark] WeichenXu123 commented on a change in pull request #27565: [SPARK-30791] Dataframe add sameSemantics and sementicHash method

GitBox Thu, 13 Feb 2020 08:07:14 -0800

WeichenXu123 commented on a change in pull request #27565: [SPARK-30791] Dataframe add sameSemantics and sementicHash method
URL: https://github.com/apache/spark/pull/27565#discussion_r378957996


 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -3308,6 +3308,33 @@ class Dataset[T] private[sql](
     files.toSet.toArray
   }
 
+  /**
+   * Returns true when the query plan of the given Dataset will return the same results as this
+   * Dataset.
+   *
+   * Since its likely undecidable to generally determine if two given plans will produce the same
+   * results, it is okay for this function to return false, even if the results are actually
+   * the same.  Such behavior will not affect correctness, only the application of performance
+   * enhancements like caching.  However, it is not acceptable to return true if the results could
+   * possibly be different.
+   *
+   * This function performs a modified version of equality that is tolerant of cosmetic
+   * differences like attribute naming and or expression id differences.
+   *
+   * @since 3.0.0
+   */
+  @DeveloperApi
+  def sameSemantics(other: Dataset[T]): Boolean = {
 
 Review comment:
   Remove @DeveloperApi. This is now a user-facing API.
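
   For readers following the contract in the Scaladoc above, here is a minimal, self-contained sketch of the idea: compare canonicalized forms so that cosmetic differences (attribute names, expression IDs) are ignored, while accepting false negatives. Everything in it (Expr, Attr, Lit, Add, SemanticCompare) is a hypothetical toy model written for illustration; it is not Spark's Catalyst code and not the implementation in this PR.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's Catalyst classes.
sealed trait Expr
case class Attr(name: String, exprId: Long) extends Expr
case class Lit(value: Int) extends Expr
case class Add(left: Expr, right: Expr) extends Expr

object SemanticCompare {
  // Replace each attribute's name with a fixed placeholder and renumber its
  // expression ID by order of first appearance, so neither cosmetic detail
  // influences equality.
  def canonicalize(e: Expr): Expr = {
    val ids = scala.collection.mutable.LinkedHashMap.empty[Long, Long]
    def go(x: Expr): Expr = x match {
      case Attr(_, id) => Attr("none", ids.getOrElseUpdate(id, ids.size.toLong))
      case Lit(v)      => Lit(v)
      case Add(l, r)   => Add(go(l), go(r))
    }
    go(e)
  }

  // Conservative check: true only when the canonical forms are structurally
  // equal. Equivalent but differently shaped expressions compare as false.
  def sameSemantics(a: Expr, b: Expr): Boolean =
    canonicalize(a) == canonicalize(b)

  // Hash of the canonical form, so expressions that pass sameSemantics
  // also share a hash code.
  def semanticHash(e: Expr): Int = canonicalize(e).hashCode
}
```

   In this sketch, Add(Attr("x", 1), Lit(1)) and Add(Attr("y", 42), Lit(1)) compare as the same, while Add(Lit(1), Attr("x", 1)) does not, because the toy canonicalizer does not reorder commutative operands. That is a false negative of the kind the Scaladoc says is acceptable; returning true for expressions that could differ would not be.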

