pitrou commented on code in PR #13281:
URL: https://github.com/apache/arrow/pull/13281#discussion_r886961793
##########
python/pyarrow/_exec_plan.pyx:
##########
@@ -259,13 +259,19 @@ def _perform_join(join_type, left_operand not None,
left_keys,
left_columns = []
elif join_type == "inner":
c_join_type = CJoinType_INNER
- right_columns = set(right_columns) - set(right_keys)
+ right_columns = [
+ col for col in right_columns if col not in set(right_keys)
Review Comment:
Well, the original version (with the `set` instantiation inside the list
comprehension) clearly was a performance bug, since the set was rebuilt
for every column tested.
I agree that we may keep using a list, as the set of keys should be small in
general.
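
To illustrate the point, here is a minimal sketch of the three variants under
discussion. The function names and sample data are hypothetical and not part
of the PR; only the comprehension shapes mirror the diff above.

```python
def drop_keys_rebuilding_set(columns, keys):
    # set(keys) is evaluated again for every element of `columns`,
    # so the cost grows with len(columns) * len(keys).
    return [col for col in columns if col not in set(keys)]


def drop_keys_hoisted_set(columns, keys):
    # The set is built once, before the comprehension runs.
    key_set = set(keys)
    return [col for col in columns if col not in key_set]


def drop_keys_plain_list(columns, keys):
    # Linear membership test on the list itself; reasonable when
    # `keys` is short, which is the usual case for join keys.
    return [col for col in columns if col not in keys]


# Hypothetical sample data, for illustration only.
right_columns = ["id", "name", "price"]
right_keys = ["id"]

result = drop_keys_plain_list(right_columns, right_keys)
```

All three variants return the same list and, unlike the original set
difference, preserve the column order; they differ only in how often the
key container is rebuilt.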