Hello, sorry for the delay.

I agree, I think that works for most workflows. The only caveat would be
CUDA-based ML workflows: you can't bundle CUDA into a dependency bundle.
Overall, it works in application mode. It would just be awesome to use
session clusters for batch / ephemeral test streaming jobs.

Ryan van Huuksloot
Sr. Production Engineer | Streaming Platform
[image: Shopify]
<https://www.shopify.com/?utm_medium=salessignatures&utm_source=hs_email>

On Thu, Dec 5, 2024 at 2:56 AM Dian Fu <dian0511...@gmail.com> wrote:

> Hi Ryan,
>
> PyFlink supports configuring Python dependencies per job, so as I
> understand it, "dynamically provide dependencies in Python" should
> already be supported. It also supports specifying Python dependencies
> that are located in distributed file systems. Managing the Python
> dependencies in a distributed file system would be a good approach,
> since each job could choose and configure which Python dependencies
> to use.
>
> Regards,
> Dian
>
> On Thu, Dec 5, 2024 at 3:28 PM Shengkai Fang <fskm...@gmail.com> wrote:
> >
> > Hi Ryan.
> >
> > Thanks for your input. I think it's better to load user Python
> > dependencies dynamically rather than use different images, because
> > images are not flexible and are hard to test:
> > * we need to build an image and push it to Docker Hub for testing...
> > * it takes a lot of time to build images...
> >
> > Best,
> > Shengkai
> >
> > On Thu, Dec 5, 2024 at 12:46, Ryan van Huuksloot
> > <ryan.vanhuuksl...@shopify.com.invalid> wrote:
> >
> > > Hi Shengkai,
> > >
> > > re: (1)
> > > That is how we currently handle image management.
> > >
> > > re: (2)
> > > The current proposed use case is that MLEs provide different
> > > PyFlink jobs, which can have different dependency/version
> > > requirements, and these packages can be quite large (GBs).
> > > In the Java world, you'd provide a different uber jar with the
> > > dependencies and that should work.
> > > In Python, as far as I know, you can't provide the same bundled
> > > dependencies.
> > > This means that we need to preload the image with all of the
> > > dependencies, but those dependencies would be static, fixed by the
> > > pre-defined image. And different workloads on this session cluster
> > > may require different dependencies / versions.
> > >
> > > Maybe it is simpler to provide a way to dynamically provide
> > > dependencies in Python - similar to Java?
> > >
> > > (I haven't used the jar submission in Java)
> > >
> > > Thanks,
> > > Ryan van Huuksloot
> > >
> > > On Tue, Dec 3, 2024 at 9:11 PM Shengkai Fang <fskm...@gmail.com>
> > > wrote:
> > >
> > > > Hi Ryan.
> > > >
> > > > Thanks for your input. I am not a k8s expert, but I know that
> > > > Flink k8s deployments support launching TaskManagers with a
> > > > specified pod template [1], which allows specifying the image.
> > > > @Junrui may provide more detailed information about this topic.
> > > >
> > > > If different TaskManagers have different workloads, it means the
> > > > slots in the different TaskManagers have different profiles.
> > > > Otherwise, the scheduler doesn't know the difference among the
> > > > slots and may choose the wrong slot to run the task. I am just
> > > > curious what the difference is between the ETL job and the ML
> > > > job.
> > > >
> > > > Best,
> > > > Shengkai
> > > >
> > > > [1]
> > > > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#pod-template
> > > >
> > > > On Tue, Dec 3, 2024 at 22:11, Ryan van Huuksloot
> > > > <ryan.vanhuuksl...@shopify.com.invalid> wrote:
> > > >
> > > > > Hi Shengkai,
> > > > >
> > > > > We currently use application mode. It is an option and may be
> > > > > the recommendation.
> > > > >
> > > > > Specifically for batch jobs, we have machine learning
> > > > > pipelines that are ephemeral; however, they contain very
> > > > > different dependencies depending on the workload.
> > > > > From my perspective, batch jobs work well on session clusters.
> > > > > However, due to the differing images, you cannot run different
> > > > > workloads on the same session cluster, which makes the session
> > > > > cluster essentially useless.
> > > > >
> > > > > Ryan van Huuksloot
> > > > >
> > > > > On Tue, Dec 3, 2024 at 1:20 AM Shengkai Fang
> > > > > <fskm...@gmail.com> wrote:
> > > > >
> > > > > > Hi.
> > > > > >
> > > > > > Why do you need a different image for each TaskManager? Do
> > > > > > you mean different operators require different resources?
> > > > > >
> > > > > > As far as I know, the JM supports managing TaskManagers with
> > > > > > different profiles. For example, a cluster may consist of
> > > > > > two TaskManagers with the following profiles:
> > > > > > * TM1 contains 4 slots; every slot has 2 cores and 4 GB of memory
> > > > > > * TM2 contains 4 slots; every slot has 1 core and 2 GB of memory
> > > > > >
> > > > > > > the scheduler would need some level of job isolation
> > > > > >
> > > > > > You can use application mode to run the job. In application
> > > > > > mode, the cluster is dedicated to the job.
> > > > > >
> > > > > > Best,
> > > > > > Shengkai
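
As a footnote to Dian's point in the thread about per-job Python dependencies, here is a minimal sketch of how two jobs on the same session cluster might each carry their own dependency settings. The option keys (`python.requirements`, `python.archives`, `python.files`, `python.executable`) are PyFlink configuration options; the helper name `python_dependency_options`, the HDFS paths, and the archive layout are hypothetical and only there for illustration.

```python
from typing import Dict, List, Optional


def python_dependency_options(
    requirements: Optional[str] = None,
    archives: Optional[Dict[str, str]] = None,
    files: Optional[List[str]] = None,
    executable: Optional[str] = None,
) -> Dict[str, str]:
    """Build per-job PyFlink dependency options as plain key/value pairs.

    Hypothetical helper: it only assembles configuration strings and does
    not talk to a Flink cluster.
    """
    options: Dict[str, str] = {}
    if requirements:
        options["python.requirements"] = requirements
    if archives:
        # Each entry has the form "<archive path>#<target directory>";
        # multiple entries are comma-separated.
        options["python.archives"] = ",".join(
            f"{path}#{target}" for path, target in archives.items()
        )
    if files:
        options["python.files"] = ",".join(files)
    if executable:
        options["python.executable"] = executable
    return options


# Hypothetical ETL job: small dependency set resolved from a requirements file.
etl_job = python_dependency_options(
    requirements="hdfs:///deps/etl/requirements.txt",
    files=["hdfs:///deps/etl/udfs.py"],
)

# Hypothetical ML job: ships a packed virtual environment instead.
# Assumes the archive's top level contains bin/python.
ml_job = python_dependency_options(
    archives={"hdfs:///deps/ml/venv.zip": "venv"},
    executable="venv/bin/python",
)
```

Each dict could then be applied to its job's configuration at submission time, so the image itself stays generic. This does not address the CUDA caveat, which would presumably still need to live in the image.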