Z0ltrix opened a new issue #2269:
URL: https://github.com/apache/drill/issues/2269


   **Is your feature request related to a problem? Please describe.**
   
   As discussed on the users mailing list, it looks like more and more people 
are using Delta Lake or Iceberg with Spark for transactional work on big 
tables.
   
   Additionally, I saw that Drill uses Iceberg as a storage engine for 
metadata.
   
   I think this kind of storage format is used more and more in cloud 
architectures because IT departments want to use as few tools as possible to 
provide a big data product. With Iceberg they can build consistent and scalable 
big data structures for stream and batch processing on the same storage layer 
with a single tool, Spark.
   
   The problem is how to provide the data to customers. In my opinion, Spark 
itself is too slow for interactive querying by many users or BI tools. 
That's where Drill comes in.
   
   **Describe the solution you'd like**
   
   I would like to query Iceberg tables with Drill the same way as a folder of 
Parquet files in DFS.
   
   `SELECT * FROM dfs.'path/to/iceberg/table' `
   
   Additionally, it would be great to make use of the time-travel feature via 
snapshots and timestamp-ms: https://iceberg.apache.org/spec/#snapshots 
   
   `SELECT snapshots[0].timestamp-ms FROM dfs.'path/to/iceberg/table' `
   
   `SELECT * FROM dfs.'path/to/iceberg/table'  WHERE snapshot-timestamp-ms = 
'2021-06-07 20:15:46.378'`
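
To illustrate what the time-travel query above would have to resolve, here is a minimal Python sketch, not Drill or Iceberg code. It assumes the snapshot list has already been read from the table metadata as dictionaries carrying the spec's `snapshot-id` and `timestamp-ms` fields, and that the timestamp literal is in UTC. The helper names `to_epoch_ms` and `snapshot_as_of` are hypothetical.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(ts: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS.fff' (assumed to be UTC) to epoch milliseconds."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=timezone.utc)
    delta = dt - EPOCH
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    return (delta.days * 86400 + delta.seconds) * 1000 + delta.microseconds // 1000


def snapshot_as_of(snapshots, as_of_ms):
    """Return the newest snapshot created at or before as_of_ms, or None if there is none."""
    candidates = [s for s in snapshots if s["timestamp-ms"] <= as_of_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s["timestamp-ms"])


if __name__ == "__main__":
    # Made-up snapshot entries, for illustration only.
    snapshots = [
        {"snapshot-id": 1, "timestamp-ms": to_epoch_ms("2021-06-01 08:00:00.000")},
        {"snapshot-id": 2, "timestamp-ms": to_epoch_ms("2021-06-07 20:15:46.378")},
        {"snapshot-id": 3, "timestamp-ms": to_epoch_ms("2021-06-09 12:30:00.000")},
    ]
    chosen = snapshot_as_of(snapshots, to_epoch_ms("2021-06-08 00:00:00.000"))
    print(chosen["snapshot-id"])
```

The sketch picks the latest snapshot at or before the requested time, which is the usual "as of" reading of time travel. A query engine would then plan the scan against that snapshot's manifest list instead of the current one.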
   
   **Describe alternatives you've considered**
   
   The alternative is to switch to another MPP system such as Dremio or Presto.
   
   **Additional context**
   -
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


