Lordworms commented on issue #9964:
URL: https://github.com/apache/arrow-datafusion/issues/9964#issuecomment-2039966408

   > FYI I think this is more like an Epic that can be used to coordinate individual tasks / changes rather than a specific change itself.
   > 
   > > Interested in this one
   > 
   > Thanks @Lordworms -- one thing that would probably help get this project started would be to gather some data.
   > 
   > Specifically, run the ListingTable against data on a remote object store (e.g. figure out how to write a query against 100 parquet files in an S3 bucket).
   > 
   > And then measure how much time is spent on each of the first three items below, plus the request count:
   > 
   > 1. object store listing
   > 2. fetching metadata
   > 3. pruning / fetching IO
   > 4. how many object store requests are made
   > 
   > Does anyone know a good public data set on S3 that we could use to test / benchmark with?
   
   Got it, I'll look for some data first.
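
As a rough illustration of the measurements suggested in the quoted comment, here is a minimal sketch of a per-phase timer and request counter. It does not use any DataFusion or `object_store` API: `PhaseStats`, `FakeStore`, `run_benchmark`, the `data/part-N.parquet` key layout, and the byte offsets are all hypothetical stand-ins, and a real measurement would wrap an actual S3-backed store instead of the in-memory one.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseStats:
    """Accumulates wall-clock time and request counts per named phase."""

    def __init__(self):
        self.seconds = defaultdict(float)
        self.requests = defaultdict(int)

    @contextmanager
    def phase(self, name, requests=1):
        # Time the wrapped block and attribute it to the given phase.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.seconds[name] += time.perf_counter() - start
            self.requests[name] += requests

    def total_requests(self):
        return sum(self.requests.values())

    def report(self):
        lines = [
            f"{name}: {self.seconds[name]:.6f}s over {self.requests[name]} request(s)"
            for name in self.seconds
        ]
        lines.append(f"total requests: {self.total_requests()}")
        return "\n".join(lines)


class FakeStore:
    """Hypothetical in-memory stand-in for a remote object store."""

    def __init__(self, n_files):
        self.objects = {
            f"data/part-{i}.parquet": b"x" * 1024 for i in range(n_files)
        }

    def list(self, prefix):
        return [key for key in self.objects if key.startswith(prefix)]

    def get_range(self, key, start, end):
        return self.objects[key][start:end]


def run_benchmark(store, stats):
    # Phase 1: list the objects under the table prefix.
    with stats.phase("listing"):
        keys = store.list("data/")
    # Phase 2: read the tail of each file, standing in for a metadata fetch.
    for key in keys:
        with stats.phase("metadata"):
            store.get_range(key, 1016, 1024)
    # Phase 3: read the rest of each file, standing in for data IO.
    for key in keys:
        with stats.phase("data_io"):
            store.get_range(key, 0, 1016)
    return keys


stats = PhaseStats()
keys = run_benchmark(FakeStore(100), stats)
print(stats.report())
```

The timings from the in-memory store are meaningless in themselves; the point is only the shape of the breakdown (time and request count per phase) that a run against a real bucket would need to produce.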


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.