Re: [PR] GH-47172: [Python][Test] Add function to create Arrow table instead of pandas df [arrow]

via GitHub Wed, 30 Jul 2025 06:24:41 -0700


egolearner commented on code in PR #47199:
URL: https://github.com/apache/arrow/pull/47199#discussion_r2242686563



##########
python/pyarrow/tests/parquet/common.py:
##########
@@ -121,6 +121,11 @@ def _test_dataframe(size=10000, seed=0):
     return df
 
 
+def _test_table(size=10000, seed=0):
+    df = _test_dataframe(size, seed)
+    return pa.Table.from_pandas(df, preserve_index=False)

Review Comment:
   Thanks for your review @rok 
   
   I have added a `_test_dict` function that holds the data generation logic 
shared by both `_test_dataframe` and `_test_table`. PTAL
   
   > It might even be good to have fallback logic in _test_table for cases 
numpy is not available. This logic could use stdlib's random or some testing 
utility we have available in arrow c++.
   
   Maybe we can deal with this in a separate issue? It seems `numpy` is still 
required by a lot of test cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
