Okay. It seems that we've agreed on using different repositories for each engine.
Luciano, can you create a "bahir-flink" git repository with GitHub integration?
I'll soon open the first pull request moving an existing connector from Flink
to Bahir. Also, there's an incoming contribution that I would probably
redirect to Bahir as well.

On Tue, Aug 16, 2016 at 2:30 PM, Ufuk Celebi <[email protected]> wrote:
> Hey all,
>
> great to see this discussion. I'm part of the Flink PMC and would love
> to see some of Flink's connectors added to Bahir. I can also help
> Robert with maintenance on the Flink side of things.
>
> +1 to multiple repo approach
>
> Best,
>
> Ufuk
>
> On Tue, Aug 16, 2016 at 2:27 PM, <[email protected]> wrote:
> >
> > dev Digest of: thread.362
> >
> > [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >   362 by: Robert Metzger
> >   363 by: Steve Loughran
> >   370 by: Luciano Resende
> >   371 by: Robert Metzger
> >   374 by: Luciano Resende
> >   376 by: Ted Yu
> >   377 by: Robert Metzger
> >   380 by: Steve Loughran
> >   381 by: Luciano Resende
> >   382 by: Luciano Resende
> >   384 by: Robert Metzger
> >
> > ---------- Forwarded message ----------
> > From: Robert Metzger <[email protected]>
> > Date: Thu, 11 Aug 2016 10:54:17 +0200
> > Subject: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > Hello Bahir community,
> >
> > The Apache Flink community is currently discussing how to handle incoming
> > (streaming) connector contributions [1].
> > The Flink community wants to limit the maintained connectors to the most
> > popular ones, but we don't want to reject valuable code contributions
> > without offering a good alternative.
> > Among the options we are currently discussing is also Apache Bahir.
> > From the Bahir announcement, I got the impression that the project is
> > also open to connectors from projects other than Apache Spark.
> >
> > Initially, we would move some of our current connectors here (redis,
> > flume, nifi), and there are also some pending contributions in Flink
> > that we would redirect to Bahir as well.
> >
> > So what's your opinion on this?
> >
> > Regards,
> > Robert
> >
> > [1]
> > http://mail-archives.apache.org/mod_mbox/flink-dev/201608.mbox/%3CCAGr9p8CAN8KQTM6%2B3%2B%3DNv8M3ggYEE9gSqdKaKLQiWsWsKzZ21Q%40mail.gmail.com%3E
> >
> > ---------- Forwarded message ----------
> > From: Steve Loughran <[email protected]>
> > Date: Thu, 11 Aug 2016 11:04:26 +0200
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > I can see benefits from this — provided we get some help from the Flink
> > people in maintaining and testing the stuff.
> >
> > ---------- Forwarded message ----------
> > From: Luciano Resende <[email protected]>
> > Date: Thu, 11 Aug 2016 04:50:12 -0700
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > On Thu, Aug 11, 2016 at 2:04 AM, Steve Loughran <[email protected]> wrote:
> >
> >> I can see benefits from this — provided we get some help from the Flink
> >> people in maintaining and testing the stuff.
> >
> > +1, let me know when you guys are ready and I can create a bahir-flink
> > git repository.
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
> > ---------- Forwarded message ----------
> > From: Robert Metzger <[email protected]>
> > Date: Thu, 11 Aug 2016 14:42:33 +0200
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > @Steve: The plan is that Flink committers also help out here with
> > reviewing, releasing and other community activities (but I suspect the
> > activity will be much lower; otherwise, we would not be discussing
> > removing some of the connectors from Flink).
> >
> > @Luciano: So the idea is to have separate repositories for each project
> > contributing connectors?
> > I'm wondering if it makes sense to keep the code in the same repository
> > to have some synergies (like the release scripts, CI, documentation, a
> > common parent pom with rat etc.). Otherwise, it would maybe make more
> > sense to create a Bahir-style project for Flink, to avoid maintaining
> > completely disjunct codebases in the same JIRA, ML, ...
> >
> > ---------- Forwarded message ----------
> > From: Luciano Resende <[email protected]>
> > Date: Thu, 11 Aug 2016 09:03:39 -0700
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > On Thu, Aug 11, 2016 at 5:42 AM, Robert Metzger <[email protected]> wrote:
> >
> >> @Luciano: So the idea is to have separate repositories for each project
> >> contributing connectors?
> >> I'm wondering if it makes sense to keep the code in the same repository
> >> to have some synergies (like the release scripts, CI, documentation, a
> >> common parent pom with rat etc.).
> >> Otherwise, it would maybe make more sense to create a Bahir-style
> >> project for Flink, to avoid maintaining completely disjunct codebases
> >> in the same JIRA, ML, ...
> >
> > But we most likely would have very different release schedules for the
> > different sets of extensions, where Spark extensions will tend to follow
> > the Spark release cycles, and Flink extensions the Flink release cycles.
> > As for the overhead, I believe the release scripts might be the one
> > piece that would be replicated, but I can volunteer to handle the
> > infrastructure overhead for now. All the rest, such as JIRA, ML, etc.,
> > will be common. But, anyway, I don't want to make this an issue for
> > Flink bringing the extensions here, so if you have a strong preference
> > for having everything in the same repo, we could start with that.
> >
> > Thoughts?
> >
> > ---------- Forwarded message ----------
> > From: Ted Yu <[email protected]>
> > Date: Thu, 11 Aug 2016 09:13:24 -0700
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > Having Flink connectors in the same repo seems to make more sense at the
> > moment.
> >
> > Certain artifacts can be shared between the two types of connectors.
> >
> > Flink seems to have more frequent releases recently. But Bahir doesn't
> > have to follow each Flink patch release.
> >
> > Just my two cents.
> >
> > ---------- Forwarded message ----------
> > From: Robert Metzger <[email protected]>
> > Date: Thu, 11 Aug 2016 20:41:00 +0200
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > Thank you for your responses.
> >
> > @Luciano: I don't have a strong preference for one of the two options,
> > but I would like to understand the implications of the two before we
> > start setting up the infrastructure.
> > Regarding the release cycle: for the Flink connectors, I would actually
> > try to make the release cycle dependent on the connectors, not so much
> > on Flink itself. In my experience, connectors could benefit from a more
> > frequent release schedule. For example, Kafka seems to release new
> > versions quite frequently (recently), or at least the release cycles of
> > Kafka and Flink are not aligned ;)
> > So maybe it would make sense for Bahir to release independently of the
> > engine projects, on a monthly or two-monthly schedule, with an
> > independent versioning scheme.
> >
> > @Ted: Flink has bugfix releases quite frequently, but major releases are
> > at an okay level (3-4 months in between).
> > Since 1.0.0, Flink provides interface stability, so there should not be
> > an issue with independent connector releases.
> >
> > ---------- Forwarded message ----------
> > From: Steve Loughran <[email protected]>
> > Date: Thu, 11 Aug 2016 23:18:32 +0200
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > Thinking some more.
> >
> > To an extent, Bahir is currently mostly a home for some connectors and
> > things which were orphaned by the main Spark team, giving them an ASF
> > home. Luciano has been putting in lots of work getting a release out in
> > sync with the Spark release.
> >
> > I have some plans to contribute some other things related to Spark in
> > there, so again, an ASF home and a test & release process (some YARN
> > driver plugins for ATS integration, and another I plan to write for YARN
> > registry binding). Again, some stuff unloved by the core Spark team.
> >
> > Ideally, Flink should be growing its user/dev base, recruiting everyone
> > who wants to get patches in and getting them to work on those JIRAs.
> > That's the community growth part of an ASF project.
> > Having some orphan stuff isn't ideal; it's the perennial "contrib"
> > problem of projects. (*)
> >
> > Hadoop had a big purge of contrib stuff in the move to Hadoop 2 & Maven,
> > though we've been adding stuff in hadoop-tools, especially related to
> > object stores and things. There's now a fairly harsh-but-needed policy
> > there: no contributions which can't be tested during a release. It's a
> > PITA, as for some code changes I need to test against AWS S3, Azure, 2x
> > OpenStack endpoints and soon a Chinese one. We could have been harsh and
> > said "stay on github", but having it in offers some benefits:
> > - a synchronized release schedule (good for Hadoop; bad if the
> >   contributors want to release more frequently)
> > - the Hadoop team gets some control over what's going on there
> > - the code review process lets us improve quality; we're getting metrics
> >   in &c.
> > - it works well with my plan to have an explicit object store API,
> >   extending FileSystem with specific and efficient blobstore ops
> >   (put(), list(prefix), ...)
> > - it enables us to do refactorings across all object stores
> >
> > One thing we do have there which handles object stores/filesystems even
> > outside Hadoop is a set of public compliance tests and a fairly strict
> > specification of what a filesystem is meant to do; it means we can
> > handle a big contrib by getting the authors to have those tests working
> > and regression tests going. But... the bindings do need active
> > engagement to keep alive; OpenStack has suffered a bit there, and
> > there's now some fork in OpenStack itself: code follows maintenance; use
> > drives maintenance.
> >
> > Anyway, I digress.
> >
> > I've thought about this some more, and here are some points:
> >
> > - if there's mutual code and/or tests related to the Flink connectors
> >   and the Spark ones, there's a very strong case for putting the code
> >   into Bahir
> > - if it's more that you need a home for things, I'd recommend you start
> >   with Apache Flink, and if there are big contributions that suffer
> >   neglect then it'll be time to look for a home
> >
> > In the meantime, maybe Bahir artifacts should explicitly indicate that
> > they are for Spark, e.g. bahir-spark, so as to leave the option of
> > having, say, a bahir-flink artifact at some point in the future.
> >
> > ---------- Forwarded message ----------
> > From: Luciano Resende <[email protected]>
> > Date: Fri, 12 Aug 2016 11:28:36 -0700
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > On Thu, Aug 11, 2016 at 2:18 PM, Steve Loughran <[email protected]> wrote:
> >
> >> To an extent, Bahir is currently mostly a home for some connectors and
> >> things which were orphaned by the main Spark team, giving them an ASF
> >> home. Luciano has been putting in lots of work getting a release out in
> >> sync with the Spark release.
> >
> > This was what originated Bahir, but we are already starting to see
> > original extensions being built by the Bahir community.
> > What we see today is a few distributed analytic platforms that focus on
> > building the runtime and maybe a few reference implementation
> > extensions, while extensions are mostly built by individuals in their
> > own GitHub repositories. Bahir enables these extensions to build a
> > community around them and follow Apache governance, and it is open to
> > non-Spark extensions.
> >
> >> I have some plans to contribute some other things related to Spark in
> >> there, so again, an ASF home and a test & release process (some YARN
> >> driver plugins for ATS integration, and another I plan to write for
> >> YARN registry binding). Again, some stuff unloved by the core Spark
> >> team.
> >>
> >> Ideally, Flink should be growing its user/dev base, recruiting everyone
> >> who wants to get patches in and getting them to work on those JIRAs.
> >> That's the community growth part of an ASF project. Having some orphan
> >> stuff isn't ideal; it's the perennial "contrib" problem of projects. (*)
> >
> > I don't think that collaborating around Flink extensions in Bahir
> > implies that these extensions are orphans. Bahir can give a lot of
> > flexibility to these extensions. One is release flexibility, where the
> > extensions could follow the extension source release cycle (e.g. the
> > Kafka release cycle) or the platform release cycle (e.g. Flink) or both,
> > which is more complicated when they are collocated within the platform
> > code. Another benefit is the sharing of domain expertise: Kafka experts,
> > for example, could collaborate across extensions on different platforms,
> > etc.
> >
> >> I've thought about this some more, and here are some points:
> >>
> >> - if there's mutual code and/or tests related to the Flink connectors
> >>   and the Spark ones, there's a very strong case for putting the code
> >>   into Bahir
> >
> > IMHO, even if there isn't, I believe there are still benefits, some of
> > which I have described above.
> >
> >> - if it's more that you need a home for things, I'd recommend you start
> >>   with Apache Flink, and if there are big contributions that suffer
> >>   neglect then it'll be time to look for a home
> >
> > Well, I would say: if you need a more flexible place to host these
> > extensions, Bahir would welcome you.
> >
> > Having said that, we are expecting that the Flink community would be
> > responsible for maintaining these extensions with help from the Bahir
> > community. Note that we have also defined some guidelines for retiring
> > extensions: http://bahir.apache.org/contributing-extensions/, which will
> > be used in case of orphaned code.
> >
> >> In the meantime, maybe Bahir artifacts should explicitly indicate that
> >> they are for Spark, e.g. bahir-spark, so as to leave the option of
> >> having, say, a bahir-flink artifact at some point in the future.
> >
> > Currently, all artifact ids are prefixed by spark:
> > <artifactId>spark-streaming-akka_2.11</artifactId>
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
> > ---------- Forwarded message ----------
> > From: Luciano Resende <[email protected]>
> > Date: Fri, 12 Aug 2016 11:34:25 -0700
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > I have thought more about the question of one combined repository versus
> > separate repositories per platform (e.g. Spark, Flink), and the more I
> > think about it, the more I believe two repositories will be best. Think
> > about some of the benefits listed below:
> >
> > Multiple Repositories:
> > - Enable smaller and faster builds, as you don't have to wait on the
> >   other platform's extensions
> > - Simplify dependency management when different platforms use different
> >   levels of dependencies
> > - Enable more flexibility on releases, permitting disruptive changes in
> >   one platform without affecting others
> > - Enable a better versioning schema for different platforms (e.g. Spark
> >   following the Spark release version schema, while Flink has its own
> >   schema)
> > - etc.
> >
> > One Repository:
> > - Enable sharing common components (which in my view will be mostly
> >   infrastructure pieces that, once created, are somewhat stable)
> >
> > Thoughts?
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
> > ---------- Forwarded message ----------
> > From: Robert Metzger <[email protected]>
> > Date: Mon, 15 Aug 2016 14:04:09 +0200
> > Subject: Re: [DISCUSS] Adding streaming connectors from Apache Flink to Bahir
> >
> > Hi,
> >
> > @stevel: Flink is still experiencing a lot of community growth.
> > Initially, we accepted all contributions in an acceptable state. Then,
> > we introduced various models of "staging" and "contrib" modules, but by
> > now, the amount of incoming contributions is just too high for the core
> > project.
> > Also, it's a bit out of scope compared to the core engine we are
> > building. That's why we started looking at Bahir (and other approaches).
> >
> > @Luciano, I'll answer the multiple vs. one repo discussion inline below.
> >
> > On Fri, Aug 12, 2016 at 8:34 PM, Luciano Resende <[email protected]> wrote:
> >
> >> Multiple Repositories:
> >> - Enable smaller and faster builds, as you don't have to wait on the
> >>   other platform's extensions
> >
> > True, build time is an argument for multiple repos.
> >
> >> - Simplify dependency management when different platforms use different
> >>   levels of dependencies
> >
> > I don't think that the dependencies influence each other much.
> > For the one-repository approach, the structure would probably be like
> > this:
> >
> >   bahir-parent
> >     - bahir-spark
> >       - spark-streaming-akka
> >       - ...
> >     - bahir-flink
> >       - flink-connector-redis
> >       - ...
> >
> > In "bahir-parent", we could define all release-related plugins, apache
> > rat, checkstyle?, general project information and all the other stuff
> > that makes a bahir project "bahir" ;)
> > In the "bahir-<system>" parent, we could define all platform-specific
> > dependencies and settings.
> >
> >> - Enable more flexibility on releases, permitting disruptive changes in
> >>   one platform without affecting others
> >
> > With the structure proposed above, I guess we could actually have
> > independent versioning / releasing for the "bahir-<system>" parent tree.
> >
> >> - Enable a better versioning schema for different platforms (e.g. Spark
> >>   following the Spark release version schema, while Flink has its own
> >>   schema)
> >> - etc.
> >>
> >> One Repository:
> >> - Enable sharing common components (which in my view will be mostly
> >>   infrastructure pieces that, once created, are somewhat stable)
> >
> > Since you are the project PMC chair, I propose to go for the "multiple
> > repositories" approach if nobody objects within 24 hours.
> >
> > Once we have concluded our discussion here, I'll send a summary to the
> > Flink dev@ list and see what they think about it.
> > I expect them to agree with our proposal, since the "bahir approach" is
> > our favorite.
> >
> > Regards,
> > Robert
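For reference, here is a rough sketch of what the top-level pom of a
standalone bahir-flink repository could look like, pulling together the
pieces discussed above: the ASF parent pom for the shared release plumbing,
apache-rat for license checks, an independent version scheme, and the redis
connector as the first module. The coordinates, versions, and artifact names
below are illustrative assumptions rather than anything agreed in this
thread; only flink-connector-redis was actually named above.

  <?xml version="1.0" encoding="UTF-8"?>
  <!-- Hypothetical parent pom for a separate bahir-flink repository. -->
  <project xmlns="http://maven.apache.org/POM/4.0.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <!-- Inherit the common ASF release conventions (signing, apache-release profile). -->
    <parent>
      <groupId>org.apache</groupId>
      <artifactId>apache</artifactId>
      <version>18</version>
    </parent>

    <groupId>org.apache.bahir</groupId>
    <artifactId>bahir-flink-parent</artifactId>
    <!-- Version scheme independent of both the Flink and the Spark release cycles. -->
    <version>1.0.0-SNAPSHOT</version>
    <packaging>pom</packaging>

    <modules>
      <!-- Connectors moved over from Flink; redis is the one named in this thread. -->
      <module>flink-connector-redis</module>
    </modules>

    <build>
      <plugins>
        <!-- License header checks shared by all modules, mirroring the rat setup
             Robert mentions for a common bahir parent. -->
        <plugin>
          <groupId>org.apache.rat</groupId>
          <artifactId>apache-rat-plugin</artifactId>
          <version>0.12</version>
        </plugin>
      </plugins>
    </build>
  </project>

A checkstyle configuration and the platform-specific Flink dependencies could
sit at the same level, playing the role Robert sketches for a "bahir-<system>"
parent in the single-repository layout.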
