sundy-li commented on issue #5404:
URL:
https://github.com/apache/arrow-datafusion/issues/5404#issuecomment-1445557500
@jychen7 I checked on my 16-core Linux machine with an SSD, and duckdb still
reads parquet faster.
duckdb v0.6.0
```
D CREATE VIEW hits AS
> SELECT *
> REPLACE
> (epoch_ms(EventTime * 1000) AS EventTime,
> DATE '1970-01-01' + INTERVAL (EventDate) DAYS AS EventDate)
> FROM read_parquet('hits.parquet', binary_as_string=True);
D
D .timer on
D select count(1), max(URL) from hits;
┌──────────┬─────────────────────────────────────────┐
│ count(1) │               max("URL")                │
│  int64   │                 varchar                 │
├──────────┼─────────────────────────────────────────┤
│ 99997497 │ https://yugra-advert2792270][to]=&input │
└──────────┴─────────────────────────────────────────┘
Run Time (s): real 0.957 user 21.641735 sys 4.245200
```
datafusion:
```
❯ select max("URL") from hits;
+-----------------------------------------+
| MAX(hits.URL)                           |
+-----------------------------------------+
| https://yugra-advert2792270][to]=&input |
+-----------------------------------------+
1 row in set. Query took 2.849 seconds.
```
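For a rough sense of the gap, the reported timings can be compared directly. This is a minimal sketch that uses only the numbers printed in the two runs above; the variable names are illustrative, and the "busy cores" figure is just CPU seconds divided by wall-clock seconds, not something either engine reports.

```python
# Timings copied from the two runs above (seconds).
duckdb_real = 0.957        # DuckDB wall-clock time
duckdb_user = 21.641735    # DuckDB user CPU time
duckdb_sys = 4.245200      # DuckDB system CPU time
datafusion_real = 2.849    # DataFusion reported query time

# Wall-clock ratio: how many times longer the DataFusion query took.
slowdown = datafusion_real / duckdb_real

# CPU seconds spent per wall-clock second in DuckDB:
# a rough estimate of how many cores/threads were kept busy.
duckdb_busy_cores = (duckdb_user + duckdb_sys) / duckdb_real

print(f"DataFusion / DuckDB wall-clock ratio: {slowdown:.2f}x")
print(f"DuckDB average busy cores (estimate): {duckdb_busy_cores:.1f}")
```

Note that the duckdb query also computes count(1) while the datafusion query only computes max("URL"), so the two statements are not identical.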