[GitHub] [druid] Thelin90 opened a new issue, #12598: Delta IO Connector

GitBox Thu, 02 Jun 2022 06:54:57 -0700


Thelin90 opened a new issue, #12598:
URL: https://github.com/apache/druid/issues/12598


   ### Motivation
   
   Hello.
   
   I have been using Apache Druid on and off for some time. Most recently, I 
used it as the base for a Lakehouse architecture I designed. I have been 
deploying it on plain Kubernetes, without any operators or hosted cloud 
services, with good success.
   However, since I am a big fan of the open source data layer created by 
Databricks, [delta.io](http://delta.io/), I ended up with a tedious pipeline: 
Delta -> raw Parquet -> load into Druid.
   
   Presto has recently added a [delta.io](http://delta.io/) connector that 
reads Delta tables directly. This is making me consider using a Presto cluster 
instead (it would scale well enough for the data load I currently have to work 
with).
   However, has anyone faced this issue and solved it in a “nice” way, or are 
there any plans to add a Delta connector similar to the one Presto has added?
   
   Here is a reference to what Presto has added: 
https://prestodb.io/blog/2022/03/15/native-delta-lake-connector-for-presto
   And a video: https://www.youtube.com/watch?v=JrXGkqpl7xk (skip ahead to 
`21:40`).
   
   ### Proposed changes
   
   Implement a delta.io connector that loads Delta tables directly into Apache 
Druid.
   
   ### Rationale
   
   Currently, when delta.io is your underlying storage layer, it is tedious and 
inefficient to run a separate service that extracts version `N` of a Delta 
table into raw Parquet just to make it accessible to Apache Druid.
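   
   To illustrate the step a connector would have to take over, here is a rough 
sketch of how the Parquet files that make up version `N` could be resolved by 
replaying the JSON commits in the table's `_delta_log` directory. The function 
name `live_files_at_version` is hypothetical, and the sketch assumes a local 
table with no checkpoints and every commit file from 0 to `N` still present. It 
also ignores partition values, schema, and deletion vectors, so it is not a 
full reader.

```python
import json
from pathlib import Path


def live_files_at_version(table_path, version):
    """Return the data files that belong to `version` of a Delta table.

    Hypothetical helper: replays JSON commits only and assumes there are
    no checkpoints and that no early commit files have been cleaned up.
    """
    log_dir = Path(table_path) / "_delta_log"
    live = set()
    for v in range(version + 1):
        # Commit files are named with the version zero-padded to 20 digits.
        commit = log_dir / f"{v:020d}.json"
        if not commit.exists():
            raise FileNotFoundError(f"missing commit file: {commit}")
        with commit.open() as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                # Each line is one action; only add/remove change the file set.
                action = json.loads(line)
                if "add" in action:
                    live.add(action["add"]["path"])
                elif "remove" in action:
                    live.discard(action["remove"]["path"])
    return sorted(live)
```

   The returned paths are relative to the table root, and they are what 
currently has to be copied out as raw Parquet before Druid can ingest them.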
   
   ### Operational impact
   
   It should be implemented in a way that does not break backward 
compatibility.
   
   

