[GitHub] [spark] aa1371 commented on a change in pull request #35083: [WIP][SPARK-37798] PySpark Pandas API: Cross and conditional merging

GitBox Sun, 02 Jan 2022 01:52:39 -0800


aa1371 commented on a change in pull request #35083:
URL: https://github.com/apache/spark/pull/35083#discussion_r777188497




##########
File path: python/pyspark/pandas/frame.py
##########
@@ -7648,52 +7654,6 @@ def to_list(os: Optional[Union[Name, List[Name]]]) -> 
List[Label]:
             else:
                 return [o if is_name_like_tuple(o) else (o,) for o in os]
 
-        if isinstance(right, ps.Series):

Review comment:
       @HyukjinKwon - I just pushed an update to address cleaning up the logic 
in the merge function (the first half of the function, up until the creation of 
the `joined_table`). And actually, I hope you don't mind, but I've also 
included the cross and conditional merge logic into the PR as well, because it 
actually is only a minimal marginal addition to the cleanup refactor to 
implement both features.
   
   If we only want the PR to address cleaning up the function (i.e. no behavior 
change/new features), the only  change we would need to make to the PR is to 
reduce this block of code:
   
https://github.com/apache/spark/blob/f052ff0c90de487342a2da71a56cdee2b1b3099f/python/pyspark/pandas/frame.py#L7720-L7737
   
   To this subset of that block of code above:
   
https://github.com/apache/spark/blob/f052ff0c90de487342a2da71a56cdee2b1b3099f/python/pyspark/pandas/frame.py#L7731-L7737
   
   I think there will still be a bit to digest in the PR, but it's all related 
to the null-op cleanup refactor, after that's been processed the new feature 
additions are very simple to follow.
   
   I think this would be a potentially good way to move forward, let me know 
your thoughts on this approach.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] aa1371 commented on a change in pull request #35083: [WIP][SPARK-37798] PySpark Pandas API: Cross and conditional merging

Reply via email to