zanmato1984 commented on issue #44513:
URL: https://github.com/apache/arrow/issues/44513#issuecomment-2550475588

   Hi @kolfild26, I ran the case successfully on my local machine (M1 MBP with
   32GB memory, arrow 18.1.0) but could not reproduce the issue.
   
   My Python script:
   ```
   import pandas
   import pickle
   import pyarrow
   
   def main():
       print("pandas: {0}, pyarrow: {1}".format(pandas.__version__, pyarrow.__version__))
       with open('small.pkl', 'rb') as f: small = pickle.load(f)
       with open('large.pkl', 'rb') as f: large = pickle.load(f)
       print("small size: {0}, large size: {1}".format(small.num_rows, large.num_rows))
       join = small.join(
           large,
           keys=['ID_DEV_STYLECOLOR_SIZE', 'ID_DEPARTMENT', 'ID_COLLECTION'],
           join_type='left outer',
       )
       print("join size: {0}".format(join.num_rows))
   
   if __name__ == "__main__":
       main()
   ```
   
   Result:
   ```
   python test.py
   pandas: 2.2.3, pyarrow: 18.1.0
   small size: 18201475, large size: 360449051
   join size: 18201475
   ```
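
For context on reading the result: a join size equal to the size of `small` is what a left outer join produces when every row of `small` matches at most one row of `large`. Below is a minimal pure-Python sketch of that expectation. The helper `expected_left_outer_rows` and the toy key tuples are hypothetical, made up for illustration; this is not how pyarrow implements the join, and it ignores null keys.

```python
from collections import Counter


def expected_left_outer_rows(left_keys, right_keys):
    """Row count of a left outer join on the given key tuples.

    Each left row yields one output row per matching right row,
    or exactly one output row if it has no match.
    """
    right_counts = Counter(right_keys)
    return sum(max(1, right_counts[key]) for key in left_keys)


# Toy key tuples standing in for
# (ID_DEV_STYLECOLOR_SIZE, ID_DEPARTMENT, ID_COLLECTION); values are invented.
left = [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
unique_right = [(1, 10, 100), (2, 20, 200), (9, 90, 900)]
duplicated_right = unique_right + [(1, 10, 100)]

rows_unique = expected_left_outer_rows(left, unique_right)
rows_duplicated = expected_left_outer_rows(left, duplicated_right)
print(rows_unique, rows_duplicated)
```

With unique keys on the right side the count equals the number of left rows; a duplicated key on the right side makes it grow, which would be one way to tell an expected row count from an unexpected one.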
   
   Did I miss something?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
