HyukjinKwon commented on a change in pull request #27565:
[WIP][SPARK-30791][SQL][PYTHON] Add 'sameSemantics' and 'semanticHash' methods in Dataset
URL: https://github.com/apache/spark/pull/27565#discussion_r379963546
##########
File path: python/pyspark/sql/dataframe.py
##########
@@ -2153,6 +2153,59 @@ def transform(self, func):
"should have been DataFrame." % type(result)
return result
+ @since(3.1)
+ def sameSemantics(self, other):
+ """
+ Returns `True` when the logical query plans inside both :class:`DataFrame`\\s are equal and
+ therefore return same results.
+
+ .. note:: The equality comparison here is simplified by tolerating the cosmetic differences
+ such as attribute names.
+
+ .. note::This API can compare both :class:`DataFrame`\\s very fast but can still return
Review comment:
nit: there should be a space after `note::`, i.e. `note::This` -> `note:: This`
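
For what it's worth, this kind of nit can be caught mechanically. Below is a minimal sketch, not part of this PR or of Spark's tooling: the helper names `find_missing_directive_space` and `fix_missing_directive_space` are hypothetical, and the regex assumes directives are written as `.. name::` at the start of a (possibly indented) line. As far as I understand reStructuredText, the directive marker needs trailing whitespace before inline content, so a line like `.. note::This` is unlikely to render as a note.

```python
import re

# Matches a directive marker (".. name::") that is immediately followed by a
# non-space character, e.g. ".. note::This". The pattern is an assumption
# about how directives appear in the docstrings, not an official grammar.
MISSING_SPACE = re.compile(r"^(\s*\.\. [A-Za-z][\w-]*::)(?=\S)")


def find_missing_directive_space(text):
    """Return (line_number, line) pairs where the space after '::' is missing."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if MISSING_SPACE.match(line)
    ]


def fix_missing_directive_space(text):
    """Return the text with a single space inserted after each such marker."""
    return "\n".join(
        MISSING_SPACE.sub(r"\1 ", line) for line in text.splitlines()
    )


if __name__ == "__main__":
    docstring = (
        "    .. note:: The equality comparison here is simplified\n"
        "    .. note::This API can compare both DataFrames very fast\n"
    )
    for number, line in find_missing_directive_space(docstring):
        print("line %d: %s" % (number, line.strip()))
    print(fix_missing_directive_space(docstring))
```

Lines that already have the space, or that end right after `::`, are left untouched, so running the fixer twice is harmless.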