jaylmiller commented on PR #5554:
URL: 
https://github.com/apache/arrow-datafusion/pull/5554#issuecomment-1469003917

   The ClickBench count distinct query is getting killed when run against a 
dictionary-encoded column (this is on main) 🤔
   
   ```
   ❯ CREATE EXTERNAL TABLE hits_base
   STORED AS PARQUET
   LOCATION 'hits.parquet';
   0 rows in set. Query took 0.041 seconds.
   ❯ CREATE TABLE hits as
   select
     arrow_cast("UserID", 'Dictionary(Int32, Utf8)') as "UserID"
   FROM hits_base;
   
   0 rows in set. Query took 13.887 seconds.
   ❯ SELECT COUNT(DISTINCT "UserID") from hits;
   Killed
   ```
   
   The "UserID" column is pretty high cardinality though: is there a better 
ClickBench query/column pair to bench with? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
