Re: [PR] GH-46179: [Python] Bump index level once if pandas df already contains __index_level_i__ column [arrow]

via GitHub Mon, 25 May 2026 07:38:05 -0700


Copilot commented on code in PR #46884:
URL: https://github.com/apache/arrow/pull/46884#discussion_r3298718286



##########
python/pyarrow/pandas_compat.py:
##########
@@ -378,7 +378,10 @@ def _index_level_name(index, i, column_names):
     if index.name is not None and index.name not in column_names:
         return _column_name_to_strings(index.name)
     else:
-        return f'__index_level_{i:d}__'
+        j = i
+        while f'__index_level_{j:d}__' in column_names:
+            j += 1
+        return f'__index_level_{j:d}__'

Review Comment:
   Bumping the generated index column name changes the meaning of the numeric 
suffix (it no longer matches the index level position). Later in this module, 
`_get_index_level()` assumes that any `__index_level_{n}__` name maps to 
`df.index.get_level_values(n)`. With this change, a dataframe that already has 
`__index_level_0__` will generate `__index_level_1__` for the first unnamed 
index level, but `_get_index_level(df, '__index_level_1__')` would incorrectly 
select level 1 (or raise for a single-level index) when a user passes an 
explicit schema. Consider updating `_get_index_level()` to resolve generated 
index names using the same collision-avoidance logic as `_index_level_name` 
(based on `df.columns` and already-assigned index names), so schema-based 
conversions keep working.
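
   A minimal, self-contained sketch of the suggestion above, written in plain
Python so it does not depend on pyarrow internals. The helper names
`generated_index_names` and `resolve_index_level` are hypothetical and are not
part of `pandas_compat.py`. The sketch also assumes that each newly generated
name is recorded as taken before the next level is named, which the diff above
does not show. The idea is to recompute the generated names with the same
collision-avoidance loop and look the requested name up by position, rather
than parsing the numeric suffix.

```python
import re

# Pattern of the generated index column names discussed in the review.
_INDEX_NAME_RE = re.compile(r"^__index_level_(\d+)__$")


def generated_index_names(column_names, n_levels):
    """Hypothetical helper: one generated name per unnamed index level.

    Mirrors the collision-avoidance loop from the diff, and additionally
    (an assumption) marks each generated name as taken so that two levels
    never receive the same name.
    """
    taken = set(column_names)
    names = []
    for i in range(n_levels):
        j = i
        while f"__index_level_{j:d}__" in taken:
            j += 1
        name = f"__index_level_{j:d}__"
        taken.add(name)
        names.append(name)
    return names


def resolve_index_level(name, column_names, n_levels):
    """Hypothetical helper: map a generated name back to a level position.

    Returns None when the name is not a generated index name for this
    frame, for example when it is an ordinary data column.
    """
    if _INDEX_NAME_RE.match(name) is None:
        return None
    names = generated_index_names(column_names, n_levels)
    if name in names:
        return names.index(name)
    return None


# A frame that already has a data column called '__index_level_0__'
# and a single unnamed index level.
columns = ["__index_level_0__"]
print(generated_index_names(columns, 1))
print(resolve_index_level("__index_level_1__", columns, 1))
```

   In this sketch the single index level is named `__index_level_1__` and
resolves back to position 0, whereas reading the suffix directly would ask for
level 1, which does not exist on a single-level index. A real fix would also
have to account for named index levels, which this sketch ignores.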



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
