jwimberl opened a new issue, #575:
URL: https://github.com/apache/arrow-datafusion-python/issues/575
**Describe the bug**
When I try to create a `DataFrame` with `ctx.from_arrow_table` from a
`pyarrow.Table` that has one or more columns but zero rows, I hit a panic at
`src/context.rs:294`.
**To Reproduce**
```
>>> import datafusion as df
>>> import pandas as pd
>>> import pyarrow as pa
>>> ctx = df.SessionContext()
>>> pandas_df = pd.DataFrame({'col': []})
>>> emptyTable = pa.Table.from_pandas(pandas_df)
>>> emptyTable
pyarrow.Table
col: double
----
col: [[]]
>>> ctx.from_arrow_table(emptyTable)
thread '<unnamed>' panicked at src/context.rs:294:37:
index out of bounds: the len is 0 but the index is 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
pyo3_runtime.PanicException: index out of bounds: the len is 0 but the index is 0
```
**Expected behavior**
I expect this to create a `DataFrame` with zero rows, such as the following
(created via `.limit(0)` from a non-empty `DataFrame`):
```
>>> empty
DataFrame()
++
++
>>> empty.describe()
DataFrame()
+------------+-----+
| describe   | col |
+------------+-----+
| count      | 0.0 |
| null_count | 0.0 |
| mean       |     |
| std        |     |
| min        |     |
| max        |     |
| median     |     |
+------------+-----+
```
**Additional context**
- Operating system: Rocky 8
- Python version: 3.10.4
- Python module versions used:
```
>>> df.__version__
'34.0.0'
>>> pa.__version__
'15.0.0'
>>> pd.__version__
'2.2.0'
```
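A possible stopgap, sketched under assumptions that have not been checked against the 34.0.0 source: the panic message suggests that something indexes the first element of an empty list, which would be consistent with `Table.to_batches()` returning no batches for a zero-row table. The hypothetical helper below (`batches_for_dataframe` is not part of either library) guarantees at least one record batch by building a zero-row batch from the table's schema when needed.

```python
def batches_for_dataframe(table):
    """Return the table's record batches, guaranteeing at least one.

    Assumption (not confirmed from the datafusion-python source): the panic
    comes from reading the first batch of an empty batch list.
    """
    batches = table.to_batches()
    if not batches:
        # Deferred import: pyarrow is only needed on the zero-row path.
        import pyarrow as pa

        # One empty array per field, so the schema survives with zero rows.
        empty_columns = [pa.array([], type=field.type) for field in table.schema]
        batches = [pa.RecordBatch.from_arrays(empty_columns, schema=table.schema)]
    return batches
```

The result could then be passed as a single partition, for example `ctx.create_dataframe([batches_for_dataframe(emptyTable)])`. Whether `create_dataframe` accepts a zero-row batch in this version is itself an assumption.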