Tomek: I started the discussion here just so that everyone is aware of it, even if they are not watching GH issues. I have now created the GH issue https://github.com/apache/airflow/issues/12282 so that I can gather people who are interested, and I think it's best to continue the discussion there.

What I plan to do within the next few days is to start a design document and design discussion. I would like to start by defining the actual users of Breeze, the use cases it should serve, its purpose, and the set of assumptions it should make. Only after we hash it all out would I like to define the scope and decide whether we want one or many different tools for different users, how much of it is common, and whether we can remove some of it completely or simplify it.

I think we've gathered enormous experience from developers at various levels while using Breeze, and it's a perfect moment to discuss (with those various users) what is useful, for whom, what makes sense, and how to provide the best interface. I see the current Breeze as a learning platform on what is useful and what is not, and I would love, this time, for decisions about it to be made by the actual users (of various kinds).

And I would love to lead it - not as a developer this time, but as a "product manager" - listening to various voices and trying to make the best of it, reaching some consensus and working with others to implement it. I think this is the best use of the experience we have had with Breeze and the "crowd-wisdom" of Airflow developers of different kinds and with different experience.

J.

On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <andrewharmon...@gmail.com> wrote:

> I would agree. As an end user, I’m not really sure what Breeze does. Is it for CI, or is it a way to quickly spin up a containerized env for local development? I do think it would be great to have something similar to Puckel that uses official airflow images: very easy to quickly get started with to give airflow a try, but also a jumping-off point for organizations to customize it to their needs. If this is docker-compose or something else, that’s fine. We use a customized version of puckel for all the engineers to do local dag development.
> It would be great if this was more “official” Airflow. I agree that python would make it easier for others to contribute. Finally, very clear documentation on the Airflow site would be very helpful too.
>
> Thanks,
> Andrew Harmon
>
> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <turbas...@apache.org> wrote:
>
> +1 for using python.
>
> > I would also say: make breeze do less. Right now it is three major things:
> > * A local development environment
> > * CI runner
> > * It's recently grown the ability to run airflow for developing dags.
>
> My first thought was similar - breeze does too much now. However, I think the problem is not the amount of functionality but the technology used - bash. Using python or any other language will let us create a nice and clear structure for the project that will be easy to onboard to, reason about and manage.
>
> Structuring breeze may allow us to use separate docker images and docker-compose files for different purposes (CI, DAG dev, Airflow dev). I like the way in which breeze is a "layer over docker" and I think this gives a nice experience. However, breeze has grown so big that I'm not sure I use even half of the functions it has.
>
> *Note:* where should we continue the discussion? The official place is the devlist, but we have a GH issue. Which one should we use to avoid two separate discussions?
>
> Tomek
>
> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>
>> I also created an issue for it: https://github.com/apache/airflow/issues/12282
>>
>> Anyone interested in taking part - please comment there!
>>
>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>
>>> You screamed (among many others) and I listened :). And I think the time to act is now.
>>>
>>> I believe the scope of "Breeze 2" should be part of the design discussion, where we will hear others' opinions (especially those of first-time or fresh contributors).
>>>
>>> For now, my vision is quite a bit different from yours, Ash :). But I do not want to start a design discussion just yet; I want to leave breathing space for others to chime in.
>>>
>>> I would love to hear the many voices and interests of people before we dive deep into what "Breeze 2" might look like.
>>>
>>> What I am interested in is whether:
>>>
>>> a) it's the right time
>>> b) python is the right choice
>>> c) there are several people who would like to join and offer both help in designing the vision for it and their time to implement it.
>>>
>>> I think it is crucial that the people who will be implementing it are the main people who make design decisions about it, as I would love to have a strong group of people who would like to take part not only in developing it but also in maintaining it in the future.
>>>
>>> J.
>>>
>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>>>
>>>> Omg yes. I have been screaming out for this for months.
>>>>
>>>> $ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
>>>> 6911
>>>>
>>>> That's entirely too much bash for my liking by about an order of magnitude ;)
>>>>
>>>> I would also say: make breeze do less. Right now it is three major things:
>>>>
>>>> * A local development environment
>>>> * CI runner
>>>> * It's recently grown the ability to run airflow for developing dags.
>>>>
>>>> That is too much. Yes there is overlap, but it's just too much in one tool, and too complex as a result. Some of this should just be replaced with a docker-compose file (that uses published release images, not floating master/nightly) and users told to run that.
>>>>
>>>> Make it simpler, fitting a core purpose - running CI consistently should be its only goal.
>>>>
>>>> -ash
>>>>
>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>
>>>> Hello Everyone,
>>>>
>>>> TL;DR: I have been thinking about this for quite a while and I think this is the right time to raise the subject. It's been asked several times why Breeze is not written in something other than Bash, since it is "that big" or, as some people said, "monstrous" :). I think it's the right time to start a "rewrite" project with wide community involvement, and Python seems to be the best choice :).
>>>>
>>>> While I opposed this when we were focusing on Airflow 2.0, and there are some good reasons why I initially started Breeze in Bash, I think that with the current state of Airflow 2.0 betas, with Airflow 2.0 fully based on Python 3.6, and with some "stability" and a "good set of features" we have in Breeze and a good level of modularisation we achieved - it's the right time to think about a rewrite.
>>>>
>>>> I did not raise this subject to add a distraction on top of what is already a lot of work for 2.0, but I think having Breeze rewritten in Python could be the "one more thing" that we could do as a community to make the 2.0 experience even better, and one that can bring the community even closer.
>>>>
>>>> I was thinking that Breeze is perfect to be split into separate smaller pieces, with some assumptions described for its use, and turned into a true community effort where a lot of people will contribute and where we will be able to simplify some of the stuff, and - most importantly - make more people from the community know how our CI and development environment works and be able to solve any problems there.
>>>>
>>>> Breeze (and the underlying bash libraries) are crucial to getting our CI working, and I am mostly the single point of contact (and failure!)
>>>> when it comes to that - I would love to not be one :) and I think that, with most of the core committers busy with 2.0, this is also an opportunity for more of the contributors to take their part in it (and eventually earn their rank to become committers!). For the core committers, this is an extra opportunity to learn how the system works, influence its design, and possibly simplify some parts of it - even if they will be mostly focused on 2.0.
>>>>
>>>> I would like to do it well - write some assumptions in a design doc, plan the work and split it into separate issues, and lead the effort - but I would love it if most of the work were done by others, who would then become familiar with the whole of it.
>>>>
>>>> WDYT? Do you think it is a good idea? Do you think it is the right time? Are there some people in the community who would like to take part in it?
>>>>
>>>> J.
>>>>
>>>> --
>>>> Jarek Potiuk
>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>> M: +48 660 796 129 <+48660796129>