In practice, the most common engines are Spark, Flink, and Trino/Presto, followed by Hive, Dremio, and cloud "warehouses" such as Snowflake (external Iceberg tables), BigQuery, and Redshift. The cloud warehouses are more relevant as sources/sinks than as execution engines.
Also, DuckDB would be nice to have (their DuckLake is quite promising).

Kaustubh

On Fri, Jan 30, 2026 at 3:45 PM Zoi Kaoudi via dev <[email protected]> wrote:

> I think going into data lake settings makes lots of sense.
> We already have Parquet source operators for Java and Spark and a sink for
> Spark; we are missing a sink for Java.
> There is also a PR by Christoffer with Iceberg sources and sinks [1].
> The question is: which data engines do organizations with data lake
> settings use? Spark is one of them, I guess some DBMSs as well, but what
> else is there that we do not support? Apache DataFusion? Others?
>
> [1] https://github.com/apache/wayang/pull/656
>
> On Friday, January 30, 2026 at 10:30:11 AM CET, Alexander Alten
> <[email protected]> wrote:
>
> Agree with Kaustubh, a Parquet/Avro source and sink sound great and
> enable Iceberg integration.
>
> —Alex
>
> > On Jan 30, 2026, at 10:04, Kaustubh Beedkar <[email protected]> wrote:
> >
> > Thanks Zoi.
> >
> > An interesting project could be supporting Iceberg tables, as that is
> > gaining a lot of traction in data lake settings.
> >
> > Best
> > Kaustubh
> >
> > Sent from my iPhone
> >
> >> On 30 Jan 2026, at 1:52 PM, Zoi Kaoudi <[email protected]> wrote:
> >>
> >> Hello everyone,
> >>
> >> I believe it's about time to think about some project ideas to propose
> >> for the Google Summer of Code. Here is a link to a guide for mentoring [1]
> >> and here is a link to some ideas so far [2].
> >>
> >> I just sent an email to ask if there is a deadline for submitting ideas,
> >> but in the meantime we could start discussing some ideas here. I will come
> >> back with some ideas I had towards the end of the day.
> >>
> >> Please feel free to use this thread and propose any ideas you might have.
> >>
> >> [1] https://community.apache.org/gsoc/guide-to-being-a-mentor.html
> >> [2] https://cwiki.apache.org/confluence/display/COMDEV/GSoC+2026+Ideas+list
> >>
> >> Best
> >> --
> >> Zoi
> >>
> >>> On 2025/11/25 10:28:02 Zoi Kaoudi via dev wrote:
> >>> Oh yes, the link directs to the 2025 page. I think the application
> >>> starts some time in February. I will monitor and let you know.
> >>> Best
> >>> --
> >>> Zoi
> >>> On Tuesday, November 25, 2025 at 10:19:37 AM CET, Kaustubh Beedkar
> >>> <[email protected]> wrote:
> >>>
> >>> Agree. Currently it says organization registration is closed, but
> >>> perhaps that is for 2025.
> >>>
> >>> Kaustubh
> >>>
> >>>> On Tue, Nov 25, 2025 at 2:07 PM Alexander Alten <[email protected]> wrote:
> >>>>
> >>>> +1 - good idea!
> >>>>
> >>>> —Alex
> >>>>
> >>>>> On Nov 25, 2025, at 09:08, Zoi Kaoudi via dev <[email protected]> wrote:
> >>>>>
> >>>>> Hello everyone,
> >>>>> I was wondering whether we could apply to the Google Summer of Code
> >>>>> as a mentor: https://summerofcode.withgoogle.com/
> >>>>> I think we had applied before, back in the early days, but I do not
> >>>>> recall what happened back then.
> >>>>> What do you think?
> >>>>> Best
> >>>>> --
> >>>>> Zoi
> >>>>>
> >>>>
> >>>> --
> >>>> *Scalytics Connect*
> >>>> The foundation for secure, scalable, and transparent AI.
> >>>> www.scalytics.io <http://www.scalytics.io>
> >>>>
> >>>> -- Please consider the environment before printing this email --
> >>>>
> >>>> Disclaimer:
> >>>> The content of this message is confidential. If you have received it by
> >>>> mistake, please inform us by an email reply and then delete the message.
> >>>> It is forbidden to copy, forward, or in any way reveal the contents of
> >>>> this message to anyone. The integrity and security of this email cannot
> >>>> be guaranteed over the Internet. Therefore, the sender will not be held
> >>>> liable for any damage caused by the message.
