vibhatha commented on pull request #12110: URL: https://github.com/apache/arrow/pull/12110#issuecomment-1009491761
> This looks good. The change from prefix to suffix looks correct to me.
>
> However, part of the ask was also to only modify the field names if the name exists on both the left and right side.
>
> For example, [in pandas](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.join.html) these are called "overlapping" columns and the suffix is only added to the columns that overlap. Notice that columns `A` and `B` do not have any suffix added.
>
> ```
> df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
>                    'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
> other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
>                       'B': ['B0', 'B1', 'B2']})
> df.join(other, lsuffix='_caller', rsuffix='_other')
>   key_caller   A key_other    B
> 0         K0  A0        K0   B0
> 1         K1  A1        K1   B1
> 2         K2  A2        K2   B2
> 3         K3  A3       NaN  NaN
> 4         K4  A4       NaN  NaN
> 5         K5  A5       NaN  NaN
> ```

Yes, you're correct. I need to fix it.
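For reference, the overlap-only renaming discussed here can be sketched in a few lines of plain Python. This is only an illustration of the intended behaviour, not the Arrow implementation; `suffix_overlapping` and its parameters are hypothetical names chosen for this sketch.

```python
def suffix_overlapping(left_names, right_names, lsuffix, rsuffix):
    """Hypothetical helper: append a suffix only to names found on both sides."""
    # Names that appear in both inputs are the "overlapping" columns.
    overlap = set(left_names) & set(right_names)
    # Left-side fields get lsuffix only when they overlap.
    left_out = [n + lsuffix if n in overlap else n for n in left_names]
    # Right-side fields get rsuffix only when they overlap.
    right_out = [n + rsuffix if n in overlap else n for n in right_names]
    return left_out + right_out


# Same column names as the pandas example: only `key` overlaps.
print(suffix_overlapping(["key", "A"], ["key", "B"], "_caller", "_other"))
# prints ['key_caller', 'A', 'key_other', 'B']
```

With these inputs `A` and `B` keep their original names, which matches the pandas output quoted in the review comment.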