jaylmiller commented on PR #5554:
URL:
https://github.com/apache/arrow-datafusion/pull/5554#issuecomment-1469003917
The ClickBench count distinct query is getting killed when run on a
dictionary column (this is on main) 🤔
```
❯ CREATE EXTERNAL TABLE hits_base
STORED AS PARQUET
LOCATION 'hits.parquet';
0 rows in set. Query took 0.041 seconds.
❯ CREATE TABLE hits as
select
arrow_cast("UserID", 'Dictionary(Int32, Utf8)') as "UserID"
FROM hits_base;
0 rows in set. Query took 13.887 seconds.
❯ SELECT COUNT(DISTINCT "UserID") from hits;
Killed
```
The "UserID" column is pretty high cardinality though, so is there a better
ClickBench query/column pair to bench with?