+1

On Wed, Apr 9, 2025 at 4:33 PM Szehon Ho <szehon.apa...@gmail.com> wrote:
> +1, really excited to see Materialized View finally make its way to
> Spark, as many other ecosystem projects (Trino, StarRocks, soon Iceberg)
> already support it.
>
> Thanks
> Szehon
>
> On Wed, Apr 9, 2025 at 2:33 AM Martin Grund <mar...@databricks.com.invalid>
> wrote:
>
>> +1
>>
>> On Wed, Apr 9, 2025 at 9:37 AM Mich Talebzadeh <mich.talebza...@gmail.com>
>> wrote:
>>
>>> +1
>>>
>>> Dr Mich Talebzadeh,
>>> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
>>>
>>> view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>> On Wed, 9 Apr 2025 at 08:07, Peter Toth <peter.t...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> On Wed, Apr 9, 2025 at 8:51 AM Cheng Pan <pan3...@gmail.com> wrote:
>>>>
>>>>> +1 (non-binding)
>>>>>
>>>>> Glad to see Spark SQL extended to streaming use cases.
>>>>>
>>>>> Thanks,
>>>>> Cheng Pan
>>>>>
>>>>> On Apr 9, 2025, at 14:43, Anton Okolnychyi <aokolnyc...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> +1
>>>>>
>>>>> On Tue, 8 Apr 2025 at 23:36, Jacky Lee <qcsd2...@gmail.com> wrote:
>>>>>
>>>>>> +1 I'm delighted that it will be open-sourced, enabling greater
>>>>>> integration with Iceberg/Delta to unlock more value.
>>>>>>
>>>>>> Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote on Wed, Apr 9, 2025 at 10:47:
>>>>>> >
>>>>>> > +1 looking forward to seeing this make progress!
>>>>>> >
>>>>>> > On Wed, Apr 9, 2025 at 11:32 AM Yang Jie <yangji...@apache.org> wrote:
>>>>>> >>
>>>>>> >> +1
>>>>>> >>
>>>>>> >> On 2025/04/09 01:07:57 Hyukjin Kwon wrote:
>>>>>> >> > +1
>>>>>> >> >
>>>>>> >> > I am actually pretty excited to have this. Happy to see this being proposed.
>>>>>> >> >
>>>>>> >> > On Wed, 9 Apr 2025 at 01:55, Chao Sun <sunc...@apache.org> wrote:
>>>>>> >> >
>>>>>> >> > > +1. Super excited about this effort!
>>>>>> >> > >
>>>>>> >> > > On Tue, Apr 8, 2025 at 9:47 AM huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>>>> >> > >
>>>>>> >> > >> +1 I support this SPIP because it simplifies data pipeline management and
>>>>>> >> > >> enhances error detection.
>>>>>> >> > >>
>>>>>> >> > >> On Tue, Apr 8, 2025 at 9:33 AM Dilip Biswal <dkbis...@gmail.com> wrote:
>>>>>> >> > >>
>>>>>> >> > >>> Excited to see this heading toward open source; materialized views and
>>>>>> >> > >>> other features will bring a lot of value.
>>>>>> >> > >>> +1 (non-binding)
>>>>>> >> > >>>
>>>>>> >> > >>> On Mon, Apr 7, 2025 at 10:37 AM Sandy Ryza <sa...@apache.org> wrote:
>>>>>> >> > >>>
>>>>>> >> > >>>> Hi Khalid – the CLI in the current proposal will need to be built on
>>>>>> >> > >>>> top of internal APIs for constructing and launching pipeline executions.
>>>>>> >> > >>>> We'll have the option to expose these in the future.
>>>>>> >> > >>>>
>>>>>> >> > >>>> It would be worthwhile to understand the use cases in more depth before
>>>>>> >> > >>>> exposing these, because APIs are one-way doors and can be costly to
>>>>>> >> > >>>> maintain.
>>>>>> >> > >>>>
>>>>>> >> > >>>> On Sat, Apr 5, 2025 at 11:59 PM Khalid Mammadov <khalidmammad...@gmail.com> wrote:
>>>>>> >> > >>>>
>>>>>> >> > >>>>> Looks great!
>>>>>> >> > >>>>> QQ: will users be able to run this pipeline from normal code? I.e. can I
>>>>>> >> > >>>>> trigger a pipeline from *driver* code based on some condition etc., or
>>>>>> >> > >>>>> must it be executed via a separate shell command?
>>>>>> >> > >>>>> As background, Databricks imposes a similar limitation, whereby you
>>>>>> >> > >>>>> cannot run normal Spark code and DLT on the same cluster for some reason,
>>>>>> >> > >>>>> which forces you to use two clusters, increasing cost and latency.
>>>>>> >> > >>>>>
>>>>>> >> > >>>>> On Sat, 5 Apr 2025 at 23:03, Sandy Ryza <sa...@apache.org> wrote:
>>>>>> >> > >>>>>
>>>>>> >> > >>>>>> Hi all – starting a discussion thread for a SPIP that I've been
>>>>>> >> > >>>>>> working on with Chao Sun, Kent Yao, Yuming Wang, and Jie Yang: [JIRA
>>>>>> >> > >>>>>> <https://issues.apache.org/jira/browse/SPARK-51727>] [Doc
>>>>>> >> > >>>>>> <https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0>].
>>>>>> >> > >>>>>>
>>>>>> >> > >>>>>> The SPIP proposes extending Spark's lazy, declarative execution model
>>>>>> >> > >>>>>> beyond single queries, to pipelines that keep multiple datasets up to date.
>>>>>> >> > >>>>>> It introduces the ability to compose multiple transformations into a single
>>>>>> >> > >>>>>> declarative dataflow graph.
>>>>>> >> > >>>>>>
>>>>>> >> > >>>>>> Declarative pipelines aim to simplify the development and management
>>>>>> >> > >>>>>> of data pipelines, by removing the need for manual orchestration of
>>>>>> >> > >>>>>> dependencies and making it possible to catch many errors before any
>>>>>> >> > >>>>>> execution steps are launched.
>>>>>> >> > >>>>>>
>>>>>> >> > >>>>>> Declarative pipelines can include both batch and streaming
>>>>>> >> > >>>>>> computations, leveraging Structured Streaming for stream processing and new
>>>>>> >> > >>>>>> materialized view syntax for batch processing. Tight integration with Spark
>>>>>> >> > >>>>>> SQL's analyzer enables deeper analysis and earlier error detection than is
>>>>>> >> > >>>>>> achievable with more generic frameworks.
>>>>>> >> > >>>>>>
>>>>>> >> > >>>>>> Let us know what you think!
>>>>>> >> ---------------------------------------------------------------------
>>>>>> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org