alamb commented on issue #7036:
URL:
https://github.com/apache/arrow-datafusion/issues/7036#issuecomment-1682180232
I double-checked the current behavior on main. It is now possible to specify
the sort order for the parquet file (and you can see that `output_ordering` is
correctly reflected in the physical plan):
```
❯ create external table cpu(time timestamp) stored as parquet location 'cpu.parquet' with order (time desc);
0 rows in set. Query took 0.001 seconds.
❯ select * from cpu;
+---------------------+
| time                |
+---------------------+
| 2022-09-30T12:55:00 |
+---------------------+
1 row in set. Query took 0.003 seconds.
❯ explain select * from cpu order by time desc;
+---------------+-----------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                        |
+---------------+-----------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Sort: cpu.time DESC NULLS FIRST                                                                                             |
|               |   TableScan: cpu projection=[time]                                                                                          |
| physical_plan | ParquetExec: file_groups={1 group: [[Users/alamb/Downloads/cpu.parquet]]}, projection=[time], output_ordering=[time@0 DESC] |
|               |                                                                                                                             |
+---------------+-----------------------------------------------------------------------------------------------------------------------------+
2 rows in set. Query took 0.001 seconds.
```
However, it is not possible to specify just the order without the schema:
```
❯ create external table cpu stored as parquet location 'cpu.parquet' with order (time desc);
Error during planning: Provide a schema before specifying the order while creating a table.
```
Per @edmondop's suggestion, I think the clearest thing is to close this
ticket as complete (the sort order can be specified), and I will open a new
ticket to allow specifying the order without setting the schema.