zanmato1984 commented on PR #46566: URL: https://github.com/apache/arrow/pull/46566#issuecomment-2921384026
> I see, so if I understand this correctly, ideally we should probably assign distinct keys to both columns before using the filter expression, since `output_suffix_for_left` only applies to the output at the end of the workflow, right? (Sorry if this is a dumb question...) I.e., something like this won't work:
>
> ```python
> join_opts = HashJoinNodeOptions(
>     "inner", left_keys="key", right_keys="key",
>     output_suffix_for_left="_left", output_suffix_for_right="_right",
>     filter=pc.equal(pc.field('key_left'), 2))  # <------------ will hit key not found in both schemas.
> joined = Declaration(
>     "hashjoin", options=join_opts, inputs=[left_source, right_source])
> result = joined.to_table()
> ```

Sorry, I made a mistake. You are right about this. Thanks for clarifying. If you want to write a similar test case, let's just work around the constraint and use unique column names.