zanmato1984 commented on issue #44513: URL: https://github.com/apache/arrow/issues/44513#issuecomment-2550475588
Hi @kolfild26, I've successfully run the case locally (M1 MBP with 32GB memory, arrow 18.1.0) but couldn't reproduce the issue. My Python script:

```
import pandas
import pickle
import pyarrow


def main():
    print("pandas: {0}, pyarrow: {1}".format(pandas.__version__, pyarrow.__version__))

    # Load the two pickled tables attached to the issue.
    with open('small.pkl', 'rb') as f:
        small = pickle.load(f)
    with open('large.pkl', 'rb') as f:
        large = pickle.load(f)
    print("small size: {0}, large size: {1}".format(small.num_rows, large.num_rows))

    # Left outer join of the small table against the large one on the three key columns.
    join = small.join(large, keys=['ID_DEV_STYLECOLOR_SIZE', 'ID_DEPARTMENT', 'ID_COLLECTION'], join_type='left outer')
    print("join size: {0}".format(join.num_rows))


if __name__ == "__main__":
    main()
```

Result:

```
python test.py
pandas: 2.2.3, pyarrow: 18.1.0
small size: 18201475, large size: 360449051
join size: 18201475
```

Did I miss something?
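For anyone cross-checking the join size: a left outer join produces one output row per left row only if every left key matches at most one row on the right. Below is a minimal pure-Python sketch of that expected-row-count calculation. The helper name `expected_left_outer_rows` and the toy key tuples are hypothetical, not part of pyarrow or the attached data, and it assumes there are no null keys.

```python
from collections import Counter


def expected_left_outer_rows(left_keys, right_keys):
    """Expected row count of a left outer join, given each side's key tuples.

    Hypothetical helper for illustration only; assumes no null keys.
    """
    right_counts = Counter(right_keys)
    # Each left row yields one output row per matching right row,
    # or exactly one null-padded row when nothing on the right matches.
    return sum(max(right_counts[key], 1) for key in left_keys)


# Toy data: (2, "b") appears twice on the right, (3, "c") has no match.
left = [(1, "a"), (2, "b"), (3, "c")]
right = [(1, "a"), (2, "b"), (2, "b"), (9, "z")]
print(expected_left_outer_rows(left, right))
```

If the result equals the number of left rows, as in the run above, that is consistent with the right side holding at most one row per matching key combination.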