+1. It has been a matter of fact for a while now that we no longer port new features to the Mesos deployment.

Thank you~

Xintong Song

On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <trohrm...@apache.org> wrote:

> +1 for officially deprecating this component for the 1.13 release.
>
> Cheers,
> Till
>
> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kna...@apache.org> wrote:
>
>> Hi Matthias,
>>
>> Thank you for following up on this. +1 to officially deprecate Mesos in
>> the code and documentation, too. It will be confusing for users if this
>> diverges from the roadmap.
>>
>> Cheers,
>>
>> Konstantin
>>
>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <matth...@ververica.com> wrote:
>>
>>> Hi everyone,
>>> considering the upcoming release of Flink 1.13, I wanted to revive the
>>> discussion about the Mesos support once more. Mesos is also already listed
>>> as deprecated in Flink's overall roadmap [1]. Maybe it's time to align the
>>> documentation accordingly to make it more explicit?
>>>
>>> What do you think?
>>>
>>> Best,
>>> Matthias
>>>
>>> [1] https://flink.apache.org/roadmap.html#feature-radar
>>>
>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <trohrm...@apache.org> wrote:
>>>
>>> > Hi Oleksandr,
>>> >
>>> > yes you are right. The biggest problem at the moment is the lack of test
>>> > coverage and thereby confidence to make changes. We have some e2e tests
>>> > which you can find here [1]. These tests are, however, quite coarse-grained
>>> > and are missing a lot of cases. One idea would be to add a Mesos e2e test
>>> > based on Flink's end-to-end test framework [2]. I think what needs to be
>>> > done there is to add a Mesos resource and a way to submit jobs to a Mesos
>>> > cluster to write e2e tests.
>>> >
>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
>>> > [2] https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>>> >
>>> > Cheers,
>>> > Till
>>> >
>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <o.nitavs...@criteo.com> wrote:
>>> >
>>> >> Hello Xintong,
>>> >>
>>> >> Thanks for the insights and support.
>>> >>
>>> >> We browsed the Mesos backlog and didn't identify anything critical that
>>> >> is left there.
>>> >>
>>> >> I see that there were quite a lot of contributions to Flink Mesos
>>> >> in the recent version:
>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
>>> >> We plan to validate the current Flink master (or release 1.12 branch) on our
>>> >> Mesos setup. In case of any issues, we will try to propose changes.
>>> >> My feeling is that our test results shouldn't affect the Flink 1.12
>>> >> release cycle. And if any potential commits land in 1.12.1, it
>>> >> should be totally fine.
>>> >>
>>> >> In the future, we would be glad to help you guys with any
>>> >> maintenance-related questions. One of the highest priorities around this
>>> >> component seems to be the development of the full e2e test.
>>> >>
>>> >> Kind Regards
>>> >> Oleksandr Nitavskyi
>>> >> ________________________________
>>> >> From: Xintong Song <tonysong...@gmail.com>
>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
>>> >> To: dev <dev@flink.apache.org>; user <u...@flink.apache.org>
>>> >> Cc: Piyush Narang <p.nar...@criteo.com>
>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>>> >>
>>> >> Hi Piyush,
>>> >>
>>> >> Thanks a lot for sharing the information. It would be a great relief that
>>> >> you are good with Flink on Mesos as is.
>>> >>
>>> >> As for the jira issues, I believe the most essential ones should have
>>> >> already been resolved.
>>> >> You may find some remaining open issues here [1],
>>> >> but not all of them are necessary if we decide to keep Flink on Mesos as is.
>>> >>
>>> >> At the moment and in the near future, I think help is mostly needed on
>>> >> testing the upcoming release 1.12 with Mesos use cases. The community is
>>> >> currently actively preparing the new release, and hopefully we could come
>>> >> up with a release candidate early next month. It would be greatly
>>> >> appreciated if you folks, as experienced Flink on Mesos users, can help with
>>> >> verifying the release candidates.
>>> >>
>>> >>
>>> >> Thank you~
>>> >>
>>> >> Xintong Song
>>> >>
>>> >> [1] https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>>> >>
>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.nar...@criteo.com> wrote:
>>> >>
>>> >> Hi Xintong,
>>> >>
>>> >> Do you have any jiras that cover any of the items on 1 or 2? I can reach
>>> >> out to folks internally and see if I can get some folks to commit to
>>> >> helping out.
>>> >>
>>> >> To cover the other qs:
>>> >>
>>> >> * Yes, we’ve not got a plan at the moment to get off Mesos. We use
>>> >>   Yarn for some of our Flink workloads when we can.
>>> >>   Mesos is only used when we need streaming capabilities in our
>>> >>   WW dcs (as our Yarn is centralized in one DC).
>>> >> * We’re currently on Flink 1.9 (old planner). We have a plan to bump
>>> >>   to 1.11 / 1.12 this quarter.
>>> >> * We typically upgrade once every 6 months to a year (not every
>>> >>   release). We’d like to speed up the cadence but we’re not there yet.
>>> >> * We’d largely be good with keeping Flink on Mesos as-is and
>>> >>   functional while missing out on some of the newer features. We understand
>>> >>   the pain on the community's side and we can take on the work, if we see some
>>> >>   fancy improvement in Flink on Yarn / K8s that we want in Mesos, to put in
>>> >>   the request to port it over.
>>> >>
>>> >> Thanks,
>>> >>
>>> >> -- Piyush
>>> >>
>>> >>
>>> >> From: Xintong Song <tonysong...@gmail.com>
>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
>>> >> To: dev <dev@flink.apache.org>, user <u...@flink.apache.org>
>>> >> Cc: Lasse Nedergaard <lassenedergaardfl...@gmail.com>, <p.nar...@criteo.com>
>>> >> Subject: Re: [SURVEY] Remove Mesos support
>>> >>
>>> >> Thanks for sharing the information with us, Piyush and Lasse.
>>> >>
>>> >> @Piyush
>>> >>
>>> >> Thanks for offering the help. IMO, there are currently several problems
>>> >> that make supporting Flink on Mesos challenging for us.
>>> >>
>>> >> 1. Lack of Mesos experts. AFAIK, there are very few people (if any)
>>> >>    among the active contributors in this community who are familiar
>>> >>    with Mesos and can help with development on this component.
>>> >> 2. Absence of tests.
>>> >>    Mesos does not provide a testing cluster, like
>>> >>    `MiniYARNCluster`, making it hard to test interactions between Flink and
>>> >>    Mesos. We have only a few very simple e2e tests running on Mesos deployed
>>> >>    in Docker, covering the most fundamental workflows. We are not sure how
>>> >>    well those tests work, especially against some potential corner cases.
>>> >> 3. Divergence from other deployments. Because of 1 and 2, the new
>>> >>    efforts (features, maintenance, refactors) tend to exclude Mesos if
>>> >>    possible. When the new efforts have to touch the Mesos related components
>>> >>    (e.g., changes to the common resource manager interfaces), we have to be
>>> >>    very careful and make as few changes as possible, to avoid accidentally
>>> >>    breaking anything that we are not familiar with. As a result, the component
>>> >>    diverges a lot from other deployment components (K8s/Yarn), which makes it
>>> >>    harder to maintain.
>>> >>
>>> >> It would be greatly appreciated if you can help with any of the above
>>> >> issues.
>>> >>
>>> >> Additionally, I have a few questions concerning your use cases at Criteo.
>>> >> IIUC, you are going to stay on Mesos in the foreseeable future, while
>>> >> keeping the Flink version up-to-date? What Flink version are you currently
>>> >> using? How often do you upgrade (e.g., every release)? Would you be good
>>> >> with keeping the Flink on Mesos component as it is (meaning that deployment
>>> >> and resource management improvements may not be ported to Mesos), while
>>> >> keeping other components up-to-date (e.g., improvements from programming
>>> >> APIs, operators, state backends, etc.)?
>>> >>
>>> >> Thank you~
>>> >>
>>> >> Xintong Song
>>> >>
>>> >>
>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <lassenedergaardfl...@gmail.com> wrote:
>>> >>
>>> >> Hi
>>> >>
>>> >> At Trackunit we have been using Mesos for a long time but have now moved to
>>> >> k8s.
>>> >>
>>> >> Best regards
>>> >>
>>> >> Lasse Nedergaard
>>> >>
>>> >>
>>> >> On Oct 23, 2020, at 17:01, Robert Metzger <rmetz...@apache.org> wrote:
>>> >>
>>> >> Hey Piyush,
>>> >>
>>> >> thanks a lot for raising this concern. I believe we should keep Mesos in
>>> >> Flink then for the foreseeable future.
>>> >>
>>> >> Your offer to help is much appreciated. We'll let you know once there is
>>> >> something.
>>> >>
>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.nar...@criteo.com> wrote:
>>> >>
>>> >> Thanks Kostas. If there are items we can help with, I'm sure we'd be able
>>> >> to find folks who would be excited to contribute / help in any way.
>>> >>
>>> >> -- Piyush
>>> >>
>>> >>
>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kklou...@gmail.com> wrote:
>>> >>
>>> >> Thanks Piyush for the message.
>>> >> After this, I revoke my +1. I agree with the previous opinions that we
>>> >> cannot drop code that is actively used by users, especially if it is
>>> >> something as deep in the stack as support for a cluster management
>>> >> framework.
>>> >>
>>> >> Cheers,
>>> >> Kostas
>>> >>
>>> >> On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.nar...@criteo.com> wrote:
>>> >> >
>>> >> > Hi folks,
>>> >> >
>>> >> > We at Criteo are active users of the Flink on Mesos resource
>>> >> > management component.
>>> >> > We are pretty heavy users of Mesos for scheduling
>>> >> > workloads on our edge datacenters and we do want to continue to be able to
>>> >> > run some of our Flink topologies (to compute machine learning short-term
>>> >> > features) on those DCs. If possible, our vote would be not to drop Mesos
>>> >> > support, as that would tie us to an old release or force us to maintain a
>>> >> > fork, since we’re not planning to migrate off Mesos anytime soon. Is the
>>> >> > burden something that can be helped with by the community? (Or are you
>>> >> > referring to having to ensure PRs handle the Mesos piece as well when they
>>> >> > touch the resource managers?)
>>> >> >
>>> >> > Thanks,
>>> >> >
>>> >> > -- Piyush
>>> >> >
>>> >> >
>>> >> > From: Till Rohrmann <trohrm...@apache.org>
>>> >> > Date: Friday, October 23, 2020 at 8:19 AM
>>> >> > To: Xintong Song <tonysong...@gmail.com>
>>> >> > Cc: dev <dev@flink.apache.org>, user <u...@flink.apache.org>
>>> >> > Subject: Re: [SURVEY] Remove Mesos support
>>> >> >
>>> >> > Thanks for starting this survey Robert! I second Konstantin and
>>> >> > Xintong in the sense that our Mesos users' opinions should matter most
>>> >> > here. If our community is no longer using the Mesos integration, then I
>>> >> > would be +1 for removing it in order to decrease the maintenance burden.
>>> >> >
>>> >> > Cheers,
>>> >> >
>>> >> > Till
>>> >> >
>>> >> > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <tonysong...@gmail.com> wrote:
>>> >> >
>>> >> > +1 for adding a warning in 1.12 about planning to remove Mesos
>>> >> > support.
>>> >> >
>>> >> > With my developer hat on, removing the Mesos support would
>>> >> > definitely reduce the maintenance overhead for the deployment and resource
>>> >> > management related components. On the other hand, the Flink on Mesos users'
>>> >> > voices definitely matter a lot for this community. Either way, it would be
>>> >> > good to draw users' attention to this discussion early.
>>> >> >
>>> >> > Thank you~
>>> >> >
>>> >> > Xintong Song
>>> >> >
>>> >> >
>>> >> > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kna...@apache.org> wrote:
>>> >> >
>>> >> > Hi Robert,
>>> >> >
>>> >> > +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
>>> >> > would still support it in Flink 1.12- with bug fixes for some time so that
>>> >> > users have time to move on.
>>> >> >
>>> >> > It would certainly be very interesting to hear from current Flink on Mesos
>>> >> > users, on how they see the evolution of this part of the ecosystem.
>>> >> >
>>> >> > Best,
>>> >> >
>>> >> > Konstantin
>>> >>
>>> >
>>>
>>
>> --
>>
>> Konstantin Knauf
>>
>> https://twitter.com/snntrable
>>
>> https://github.com/knaufk
>>
>