BlakeOrth commented on PR #16971:
URL: https://github.com/apache/datafusion/pull/16971#issuecomment-3145516316

   I figured I'd put my money where my mouth is regarding my comment 
here: https://github.com/apache/datafusion/pull/16971#discussion_r2248565529, 
specifically the part about the latency penalty.
   
   I've made a very quick modification to this branch so that it implements the 
standard `get` method without issuing any requests for metadata. Results 
against a remote dataset are below:
   
   ```sql
   DataFusion CLI v49.0.0
   > CREATE EXTERNAL TABLE athena_partitioned
   STORED AS PARQUET LOCATION 
's3://clickhouse-public-datasets/hits_compatible/athena_partitioned/';
   0 row(s) fetched.
   Elapsed 3.277 seconds.
   
   > select count(*) from athena_partitioned;
   +----------+
   | count(*) |
   +----------+
   | 99997497 |
   +----------+
   1 row(s) fetched.
   Elapsed 2.469 seconds.
   
   > select count(*) from athena_partitioned;
   +----------+
   | count(*) |
   +----------+
   | 99997497 |
   +----------+
   1 row(s) fetched.
   Elapsed 0.309 seconds.
   
   > select count(*) from athena_partitioned;
   +----------+
   | count(*) |
   +----------+
   | 99997497 |
   +----------+
   1 row(s) fetched.
   Elapsed 0.159 seconds.
   
   >
   ```
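
For a rough sense of scale, the three `count(*)` timings above can be compared directly. This is only a back-of-the-envelope sketch: the elapsed values are copied from the session, while the `speedup` helper and the `elapsed_seconds` name are made up for illustration. It is a single unrepeated run, so network variance is not accounted for.

```python
# Elapsed times (seconds) for the three count(*) runs, copied from the
# datafusion-cli session above. Names here are illustrative only.
elapsed_seconds = [2.469, 0.309, 0.159]


def speedup(baseline: float, later: float) -> float:
    """Return how many times faster `later` is compared to `baseline`."""
    return baseline / later


first_run = elapsed_seconds[0]
for run_number, seconds in enumerate(elapsed_seconds[1:], start=2):
    ratio = speedup(first_run, seconds)
    print(f"run {run_number}: {seconds:.3f}s, about {ratio:.1f}x faster than run 1")
```

On these numbers the second run is roughly 8x faster than the first and the third roughly 15x faster.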
   
   @alamb, are these results more in line with what you were expecting to see, 
per your comment here?
   https://github.com/apache/datafusion/pull/16971#pullrequestreview-3073221882


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

