[GitHub] [spark] HyukjinKwon commented on a change in pull request #27565: [WIP][SPARK-30791] Dataframe add sameSemantics and sementicHash method

GitBox Fri, 14 Feb 2020 02:27:28 -0800

HyukjinKwon commented on a change in pull request #27565: [WIP][SPARK-30791] Dataframe add sameSemantics and sementicHash method
URL: https://github.com/apache/spark/pull/27565#discussion_r379356166


 ##########
 File path: python/pyspark/sql/dataframe.py
 ##########
 @@ -2153,6 +2153,45 @@ def transform(self, func):
                                               "should have been DataFrame." % type(result)
         return result
 
+    @since(3.1)
+    def sameSemantics(self, other):
+        """
 
 Review comment:
   The documentation seems to be mismatched with the Scala side. I would suggest:
   
   ```
   Returns `True` when the logical query plans inside both :class:`DataFrame`\\s are equal and
   therefore return the same results.
   
   .. note:: The equality comparison here is simplified by tolerating cosmetic differences
       such as attribute names.
   
   .. note:: This API can compare both :class:`DataFrame`\\s very fast but can still return `False`
       for :class:`DataFrame`\\s that return the same results, for instance, from different plans.
       Such false negative semantics can be useful when caching, as an example.
   ```
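
To make the two notes concrete, here is a toy model in plain Python. It is only a sketch and not how Spark implements the comparison: plans are nested tuples, attributes are strings with a made-up `attr:` prefix, and `canonicalize`, `same_semantics`, and `semantic_hash` are hypothetical helper names rather than PySpark APIs.

```python
from itertools import count


def canonicalize(plan):
    """Replace attribute names with positional placeholders (toy model)."""
    ids = {}
    counter = count()

    def norm(node):
        if isinstance(node, tuple):
            # Visit children left to right so placeholders are stable.
            return tuple(norm(child) for child in node)
        if isinstance(node, str) and node.startswith("attr:"):
            # Hypothetical convention: "attr:<name>" marks an attribute.
            if node not in ids:
                ids[node] = "attr#%d" % next(counter)
            return ids[node]
        return node

    return norm(plan)


def same_semantics(left, right):
    """True when both toy plans are equal after canonicalization."""
    return canonicalize(left) == canonicalize(right)


def semantic_hash(plan):
    """Hash of the canonicalized toy plan."""
    return hash(canonicalize(plan))


# Same plan shape, only the attribute name differs (cosmetic difference).
plan_a = ("project", ("mul", "attr:x", 2), ("range", 10))
plan_b = ("project", ("mul", "attr:y", 2), ("range", 10))
# Produces the same values as plan_a, but through a different plan.
plan_c = ("project", ("add", "attr:x", "attr:x"), ("range", 10))

print(same_semantics(plan_a, plan_b))  # True
print(same_semantics(plan_a, plan_c))  # False: a false negative
print(semantic_hash(plan_a) == semantic_hash(plan_b))  # True
```

The comparison is a cheap structural check, so it never claims that two different results are the same, but it can miss plans that are equivalent only after deeper reasoning. That one-sided error is what makes it safe for deciding whether a cached result can be reused.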
