mesejo commented on issue #66:
URL:
https://github.com/apache/arrow-datafusion-python/issues/66#issuecomment-1676956208
Thanks for the report @marvin-lge. Actually, this issue is unrelated to
compression; the problem is that `file_extension` in `register_parquet`
defaults to `".parquet"`, which a file named `df.pq` does not match.
Change the code to:
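```python
from datafusion import SessionContext

# assumes df.pq already exists in the working directory
ctx = SessionContext()

# note file_extension=".pq": it must match the file's actual suffix
ctx.register_parquet(name="example_pq", path="df.pq", file_extension=".pq")

# test parquet
df = ctx.sql("SELECT * FROM example_pq")
result = df.collect()  # list of pyarrow.RecordBatch
res = result[0]
print(res)
```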
**Output**
```
pyarrow.RecordBatch
col1: int64
col2: int64
col3: int64
col4: int64
```
I agree that the usage of `register_parquet` is not very intuitive.
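
One way to avoid this mismatch is to derive the extension from the path instead of hard-coding it. Here is a minimal sketch using only `pathlib`; `parquet_extension` is a hypothetical helper name of my own, not part of DataFusion, and the fallback to `".parquet"` is an assumption that mirrors the default described above.

```python
from pathlib import Path


def parquet_extension(path: str, default: str = ".parquet") -> str:
    """Return the suffix of `path` (including the dot) for use as file_extension.

    Hypothetical helper, not part of datafusion. Falls back to `default`
    when the path has no suffix, for example a directory of Parquet files.
    """
    suffix = Path(path).suffix
    return suffix if suffix else default


# Intended use with the call from the snippet above:
# ctx.register_parquet(
#     name="example_pq",
#     path="df.pq",
#     file_extension=parquet_extension("df.pq"),
# )
```

This keeps the `path` and `file_extension` arguments in sync, so renaming the file does not silently break the registration.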