jwimberl opened a new issue, #575:
URL: https://github.com/apache/arrow-datafusion-python/issues/575

   **Describe the bug**
   Calling `SessionContext.from_arrow_table` on a `pyarrow.Table` that has at 
least one column but zero rows panics at `src/context.rs:294`.
   
   **To Reproduce**
   ```
   >>> import datafusion as df
   >>> import pyarrow as pa
   >>> ctx = df.SessionContext()
   >>> import pandas as pd
   >>> pdf = pd.DataFrame({'col': []})
   >>> emptyTable = pa.Table.from_pandas(pdf)
   >>> emptyTable
   pyarrow.Table
   col: double
   ----
   col: [[]]
   >>> ctx.from_arrow_table(emptyTable)
   thread '<unnamed>' panicked at src/context.rs:294:37:
   index out of bounds: the len is 0 but the index is 0
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   pyo3_runtime.PanicException: index out of bounds: the len is 0 but the index is 0
   ```
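
The panic message says that element 0 of an empty sequence was indexed. Below is a minimal pure-Python sketch of that failure mode and one possible guard. It assumes (this report does not confirm it) that the empty sequence is the list of record batches derived from the table. `Batch`, `schema_from_first_batch`, and `schema_with_fallback` are hypothetical names used only for illustration; they are not part of datafusion or pyarrow.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Batch:
    """Hypothetical stand-in for a record batch: a schema plus rows."""
    schema: Sequence[str]
    rows: List[tuple]


def schema_from_first_batch(batches: List[Batch]) -> Sequence[str]:
    # Unguarded: assumes at least one batch exists, so an empty list
    # raises IndexError (the Python analogue of the out-of-bounds panic).
    return batches[0].schema


def schema_with_fallback(
    batches: List[Batch], table_schema: Sequence[str]
) -> Sequence[str]:
    # Guarded: use the first batch when there is one, otherwise fall
    # back to the schema carried by the table itself.
    return batches[0].schema if batches else table_schema


# Assumption: a zero-row table yields no batches at all.
empty_batches: List[Batch] = []

try:
    schema_from_first_batch(empty_batches)
    failed = False
except IndexError:
    failed = True

recovered = schema_with_fallback(empty_batches, ["col"])
```

If the assumption holds, taking the schema from the table rather than from the first batch would let a zero-row table through with its column definitions intact.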
   
   **Expected behavior**
   I expect this to create a `DataFrame` with zero rows, such as the following 
(created via `.limit(0)` from a non-empty `DataFrame`):
   ```
   >>> empty
   DataFrame()
   ++
   ++
   >>> empty.describe()
   DataFrame()
   +------------+-----+
   | describe   | col |
   +------------+-----+
   | count      | 0.0 |
   | null_count | 0.0 |
   | mean       |     |
   | std        |     |
   | min        |     |
   | max        |     |
   | median     |     |
   +------------+-----+
   ```
   
   **Additional context**
   
   - Operating system: Rocky 8
   - Python version: 3.10.4
   - Python module versions used:
   ```
   >>> df.__version__
   '34.0.0'
   >>> pa.__version__
   '15.0.0'
   >>> pd.__version__
   '2.2.0'
   ```
   

