+1 (non-binding)
On April 9, 2025 7:29:40 AM GMT+02:00, Rishab Joshi <rishab99...@gmail.com> wrote:
>+1 Exciting.
>Rishab Joshi
>
>On Tue, Apr 8, 2025, 10:04 PM Ruifeng Zheng <ruife...@apache.org> wrote:
>
>> +1
>>
>> On Wed, Apr 9, 2025 at 12:57 PM Denny Lee <denny.g....@gmail.com> wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Tue, Apr 8, 2025 at 9:53 PM Yuming Wang <yumw...@apache.org> wrote:
>>>
>>>> +1
>>>>
>>>> On Wed, Apr 9, 2025 at 10:47 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> +1 looking forward to seeing this make progress!
>>>>>
>>>>> On Wed, Apr 9, 2025 at 11:32 AM Yang Jie <yangji...@apache.org> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On 2025/04/09 01:07:57 Hyukjin Kwon wrote:
>>>>>> > +1
>>>>>> >
>>>>>> > I am actually pretty excited to have this. Happy to see this being proposed.
>>>>>> >
>>>>>> > On Wed, 9 Apr 2025 at 01:55, Chao Sun <sunc...@apache.org> wrote:
>>>>>> >
>>>>>> > > +1. Super excited about this effort!
>>>>>> > >
>>>>>> > > On Tue, Apr 8, 2025 at 9:47 AM huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>>>> > >
>>>>>> > >> +1 I support this SPIP because it simplifies data pipeline management and
>>>>>> > >> enhances error detection.
>>>>>> > >>
>>>>>> > >>
>>>>>> > >> On Tue, Apr 8, 2025 at 9:33 AM Dilip Biswal <dkbis...@gmail.com> wrote:
>>>>>> > >>
>>>>>> > >>> Excited to see this heading toward open source — materialized views and
>>>>>> > >>> other features will bring a lot of value.
>>>>>> > >>> +1 (non-binding)
>>>>>> > >>>
>>>>>> > >>> On Mon, Apr 7, 2025 at 10:37 AM Sandy Ryza <sa...@apache.org> wrote:
>>>>>> > >>>
>>>>>> > >>>> Hi Khalid – the CLI in the current proposal will need to be built on
>>>>>> > >>>> top of internal APIs for constructing and launching pipeline executions.
>>>>>> > >>>> We'll have the option to expose these in the future.
>>>>>> > >>>>
>>>>>> > >>>> It would be worthwhile to understand the use cases in more depth before
>>>>>> > >>>> exposing these, because APIs are one-way doors and can be costly to
>>>>>> > >>>> maintain.
>>>>>> > >>>>
>>>>>> > >>>> On Sat, Apr 5, 2025 at 11:59 PM Khalid Mammadov <khalidmammad...@gmail.com> wrote:
>>>>>> > >>>>
>>>>>> > >>>>> Looks great!
>>>>>> > >>>>> QQ: will users be able to run this pipeline from normal code? I.e. can I
>>>>>> > >>>>> trigger a pipeline from *driver* code based on some condition etc., or
>>>>>> > >>>>> must it be executed via a separate shell command?
>>>>>> > >>>>> As background, Databricks imposes a similar limitation: you
>>>>>> > >>>>> cannot run normal Spark code and DLT on the same cluster for some reason,
>>>>>> > >>>>> which forces the use of two clusters and increases cost and latency.
>>>>>> > >>>>>
>>>>>> > >>>>> On Sat, 5 Apr 2025 at 23:03, Sandy Ryza <sa...@apache.org> wrote:
>>>>>> > >>>>>
>>>>>> > >>>>>> Hi all – starting a discussion thread for a SPIP that I've been
>>>>>> > >>>>>> working on with Chao Sun, Kent Yao, Yuming Wang, and Jie Yang: [JIRA
>>>>>> > >>>>>> <https://issues.apache.org/jira/browse/SPARK-51727>] [Doc
>>>>>> > >>>>>> <https://docs.google.com/document/d/1PsSTngFuRVEOvUGzp_25CQL1yfzFHFr02XdMfQ7jOM4/edit?tab=t.0>].
>>>>>> > >>>>>>
>>>>>> > >>>>>> The SPIP proposes extending Spark's lazy, declarative execution model
>>>>>> > >>>>>> beyond single queries, to pipelines that keep multiple datasets up to date.
>>>>>> > >>>>>> It introduces the ability to compose multiple transformations into a single
>>>>>> > >>>>>> declarative dataflow graph.
>>>>>> > >>>>>>
>>>>>> > >>>>>> Declarative pipelines aim to simplify the development and management
>>>>>> > >>>>>> of data pipelines, by removing the need for manual orchestration of
>>>>>> > >>>>>> dependencies and making it possible to catch many errors before any
>>>>>> > >>>>>> execution steps are launched.
>>>>>> > >>>>>>
>>>>>> > >>>>>> Declarative pipelines can include both batch and streaming
>>>>>> > >>>>>> computations, leveraging Structured Streaming for stream processing and new
>>>>>> > >>>>>> materialized view syntax for batch processing. Tight integration with Spark
>>>>>> > >>>>>> SQL's analyzer enables deeper analysis and earlier error detection than is
>>>>>> > >>>>>> achievable with more generic frameworks.
>>>>>> > >>>>>>
>>>>>> > >>>>>> Let us know what you think!