itholic commented on a change in pull request #34113:
URL: https://github.com/apache/spark/pull/34113#discussion_r719124229



##########
File path: python/pyspark/pandas/indexes/multi.py
##########
@@ -1137,6 +1137,42 @@ def intersection(self, other: Union[DataFrame, Series, 
Index, List]) -> "MultiIn
         )
         return cast(MultiIndex, DataFrame(internal).index)
 
+    def equal_levels(self, other: "MultiIndex") -> bool:
+        """
+        Return True if the levels of both MultiIndex objects are the same
+
+        Examples
+        --------
+        >>> psmidx1 = ps.MultiIndex.from_tuples([("a", "x"), ("b", "y"), ("c", 
"z")])
+        >>> psmidx2 = ps.MultiIndex.from_tuples([("b", "y"), ("a", "x"), ("c", 
"z")])
+        >>> psmidx1.equal_levels(psmidx2)
+        True
+
+        >>> psmidx2 = ps.MultiIndex.from_tuples([("a", "x"), ("b", "y"), ("c", 
"j")])
+        >>> psmidx1.equal_levels(psmidx2)
+        False
+        """
+        nlevels = self.nlevels
+        if nlevels != other.nlevels:
+            return False
+
+        self_sdf = self._internal.spark_frame
+        other_sdf = other._internal.spark_frame
+        subtract_list = []
+        for nlevel in range(nlevels):
+            self_index_scol = self._internal.index_spark_columns[nlevel]
+            other_index_scol = other._internal.index_spark_columns[nlevel]
+            self_subtract_other = self_sdf.select(self_index_scol).subtract(

Review comment:
       Yeah, the equality condition of this API is,
   
   **Is all of the unique(distinct) values of each level are the same ?**
   
   So I think applying the `subtract` for each level of index column is satisfy 
the equality condition.
   
   Because if there is at least one different unique value, it will not be 
filtered out when subtracting.
   
   I think maybe it will be more expensive if we compare after applying 
`distinct` to each level since it requires sorting ?
   
   I'm afraid it will be more expensive because we have to sort before 
comparing, if we want to apply `distinct` to each level.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to