I think we should wait until 2.0 is out before discussing or even gathering feedback, as I am sure any feedback will trigger a discussion.
On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> Andrew,
>
> Thanks for chiming in - just to answer your questions and clarify the scope of the discussion:
>
> Breeze is for developing Airflow itself; its purpose is not to develop and run DAGs. It was never intended to be used by the "users" of Airflow, or for DAG development or testing the DAGs. And while we were pondering that thought recently, I think it never will be this; it is simply not fit for the purpose.
>
> Even the "start-airflow" command is there mainly for the developers of Airflow, not for the users of it. For example, it can be quickly used to test if a new release candidate for Apache Airflow "works" - thanks to it, in a few minutes I can run a released version of Airflow in several combinations of python/backend and see that it generally "works".
>
> So for the docker-compose user production image - sure, it is needed, but this is a different issue, different users, and a completely different use-case (even if the "docker-compose" name is there too). Those two are completely different use-cases, starting from the fact that even the docker image used there is different. Maybe this is what both you and Ash are talking about. In which case I fully agree it's needed, but I believe we are not talking about it here.
>
> If you want to have the kind of approach you are talking about, you can take a look at the issue here: https://github.com/apache/airflow/issues/8605. Nobody works on it actively now, but I would love someone to take the lead on it and complete it. I am happy to help and review it as much as I can. But maybe you would like to take the lead on it, Andrew, since you have some experience and a real use case behind it? I think we need people there who are actual users of Airflow - which, sadly, I am mostly not :)
>
> But let's not mix the two please :).
> I'd love to keep this thread focused on *"Breeze, the development environment for Airflow itself"*. Even the tagline of Breeze is "*It's a Breeze to develop Airflow*" rather than "It's a Breeze to develop DAGs".
>
> J.
>
> On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>
>> Tomek:
>>
>> I started the discussion here so that everyone is aware of it even if they are not watching GH issues. I have now created the GH issue https://github.com/apache/airflow/issues/12282 so that I can gather together people with some interest, and I think it's best to continue the discussion there.
>>
>> What I plan to do within the next few days is to start a design document and design discussion. I would like to start with defining the actual users of Breeze, the use-cases it should serve, the purpose, and the set of assumptions that it should have. And only after we hash it all out, I would like to define the scope, decide whether we want to have one or many different tools for different users, how much of it is common and whether we can remove some of it completely or simplify it.
>>
>> I think we've gathered enormous experience from various levels of developers while using Breeze and it's a perfect moment to discuss (with those various users) what is useful, for whom, what makes sense, and how to provide the best interface. I see the current Breeze as a learning platform on what is useful and what is not, and I would love - this time - for decisions in it to be made by the actual users (of various kinds). And I would love to lead it - not as a developer this time, but as a "product manager" - listening to various voices and trying to make the best of it, reaching some consensus and working with others to implement it.
>> I think this is the best use of the experience we had with Breeze and the "crowd-wisdom" of the developers of Airflow of different kinds and with different experience.
>>
>> J.
>>
>> On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <andrewharmon...@gmail.com> wrote:
>>
>>> I would agree; as an end user, I’m not really sure what Breeze does. Is it for CI, or is it a way to quickly spin up a containerized env for local development? I do think it would be great to have something similar to Puckel that uses official airflow images: very easy to quickly get started with to give airflow a try, but also a jumping-off point for organizations to customize it to their needs. If this is docker-compose or something else, that’s fine. We use a customized version of puckel for all the engineers to do local dag development. It would be great if this was more “official” Airflow. I agree that python would make it easier for others to contribute. Finally, very clear documentation on the Airflow site would be very helpful too.
>>>
>>> Thanks,
>>> Andrew Harmon
>>>
>>> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <turbas...@apache.org> wrote:
>>>
>>> +1 for using python.
>>>
>>> > I would also say: make breeze do less. Right now it is three major things:
>>> > * A local development environment
>>> > * CI runner
>>> > * It's recently grown the ability to run airflow for developing dags.
>>>
>>> My first thought was similar - breeze does too much now. However, I think the problem is not in the amount of functionality but in the technology used - bash. Using python or any other language will let us create a nice and clear structure for the project that will be easy to onboard to, reason about and manage.
>>>
>>> Structuring breeze may allow us to leverage separate docker images and docker-compose files for different purposes (CI, DAG dev, Airflow dev).
>>> I like the way in which breeze is a "layer over docker" and I think this gives a nice experience. However, breeze has grown so big that I'm not sure I use even half of the functions it has.
>>>
>>> *Note:* where should we continue the discussion? The official place is the devlist, but we have a GH issue. Which one should we use to avoid two separate discussions?
>>>
>>> Tomek
>>>
>>> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>
>>>> I also created an issue for it: https://github.com/apache/airflow/issues/12282
>>>>
>>>> Anyone interested in taking part - please comment there!
>>>>
>>>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>
>>>>> You screamed (among many others) and I listened :). And I think the time is now to act.
>>>>>
>>>>> I believe the scope of "Breeze 2" should be part of the design discussion, where we will hear others' opinions (especially those of first-time or fresh contributors).
>>>>>
>>>>> For now, my vision is quite a bit different than yours, Ash :). But I do not want to start a design discussion just yet; I want to make breathing space for others to chime in.
>>>>>
>>>>> I would love to hear many voices and interests of people before we deep dive into what "Breeze 2" might look like.
>>>>>
>>>>> What I am interested in is whether:
>>>>>
>>>>> a) it's the right time
>>>>> b) python is the right choice
>>>>> c) there are several people who would like to join and offer both help in designing the vision for it and their time to implement it.
>>>>>
>>>>> I think it is crucial that the people who will be implementing it will be the main people who make design decisions about it, as I would love to have a strong group of people who would like to not only take part in developing it but also in maintaining it in the future.
>>>>>
>>>>> J.
>>>>>
>>>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>>>>>
>>>>>> Omg yes. I have been screaming out for this for months.
>>>>>>
>>>>>> $ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
>>>>>> 6911
>>>>>>
>>>>>> That's entirely too much bash for my liking by about an order of magnitude ;)
>>>>>>
>>>>>> I would also say: make breeze do less. Right now it is three major things:
>>>>>>
>>>>>> * A local development environment
>>>>>> * CI runner
>>>>>> * It's recently grown the ability to run airflow for developing dags.
>>>>>>
>>>>>> That is too much. Yes there is overlap, but it's just too much in one tool, and too complex as a result. Some of this should just be replaced with a docker-compose file (that uses published release images, not floating master/nightly) and users told to run that.
>>>>>>
>>>>>> Make it simpler, fitting a core purpose - running CI consistently should be its only goal.
>>>>>>
>>>>>> -ash
>>>>>>
>>>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>>>
>>>>>> Hello Everyone,
>>>>>>
>>>>>> TL;DR: I have been thinking about this for quite a while and I think this is the right time to raise the subject. It's been asked several times why Breeze is not written in something other than Bash, since it is "that big" or, as some people said, "monstrous" :). I think it's the right time to start a "rewrite" project with wide community involvement, and Python seems to be the best choice :).
>>>>>>
>>>>>> While I was opposing this while we were focusing on Airflow 2.0, and there are some good reasons why I initially started Breeze in Bash, I think with the current state of Airflow 2.0 betas, with Airflow 2.0 fully based on Python 3.6, and with some "stability" and a "good set of features" we have in Breeze and a good level of modularisation we achieved - it's the right time to think about a rewrite.
>>>>>>
>>>>>> I did not raise this subject to add a distraction on top of what is already a lot of work for 2.0, but I think having Breeze rewritten in Python could be the "one more thing" that we could do - as a community - to make the 2.0 experience even better, and one that can bring the community even closer.
>>>>>>
>>>>>> I was thinking that Breeze is perfect to be split into separate smaller pieces, with some assumptions described for its use, and turned into a true community effort where a lot of people will contribute and where we will be able to simplify some of the stuff, and - most importantly - make more people from the community know how our CI and development environment works and be able to solve any problems there.
>>>>>>
>>>>>> Breeze (and the underlying bash libraries) are crucial to get our CI working, and I am mostly the single point of contact (and failure!) when it comes to that - I would love to not be one :) and I think with most of the core committers busy with 2.0, this is also an opportunity for more of the contributors to take their part in it (and eventually earn their rank to become committers!). For the core committers, this is an extra opportunity to learn how the system works, influence its design, and possibly simplify some parts of it - even if they will be mostly focused on 2.0.
>>>>>>
>>>>>> I would like to do it well - write some assumptions in a design doc, plan the work and split it into separate issues, and lead the effort - but I would love it if most of the work is done by others, who would then become familiar with the whole of it.
>>>>>>
>>>>>> WDYT? Do you think it is a good idea? Do you think it is the right time? Are there some people in the community who would like to take part in it?
>>>>>>
>>>>>> J.
>>>>>>
>>>>>> --
>>>>>> Jarek Potiuk
>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>> M: +48 660 796 129