BTW. I've also learned recently about this project https://github.com/earthly/earthly.
For those who are interested - they have a really nice description of their motivations, where it fits, what problem it solves, and what "niche" in the development process it fills.

From what I see - the needs they address are very, very close to what Breeze does. It might even be that we will propose Earthly as a foundation for the new Breeze2. We might not necessarily have to write all of it from scratch - but rather "stand on the shoulders of giants".

I think all options are open as long as we focus on the "needs", "users" and "use cases" that we want to address.

J.

On Thu, Nov 12, 2020 at 11:52 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:

> My intention is not to rewrite it now, but to start doing it when we get a stable 2.0 release, to know what we want to achieve and plan it, and to have a team aligned on it - so that we can actually start doing it whenever we feel 2.0 is "stable" and there is nothing of higher priority.
>
> But I will start a discussion and a doc on "scope", "use cases" and "users" - so that we know what we DO and what we DO NOT do with Breeze.
>
> My goal is simple: "It's a Breeze to *develop* Airflow". It's not about "using Airflow", it's not about "trying out Airflow", it's not about "writing and testing DAGs" - if there is a need for that, this should be a different tool/project.
>
> The "users" of Breeze are only contributors. Full stop. For "Airflow users" - if they are not contributors, Breeze will be useless for them. And that's intended.
>
> I would like to clarify that goal and the assumptions soon, so I am preparing a short doc where I put my assumptions about that, but within its scope I want to keep the focus on "developing Airflow" only.
>
> This is my primary concern - that there are some ideas on what to do with Breeze that go far beyond that primary goal. But I would like to keep Breeze within those boundaries only.
>
> And I am happy to help with other initiatives to answer other needs, but those should be separate IMHO.
>
> J.
>
> On Thu, Nov 12, 2020 at 1:22 AM Daniel Imberman <daniel.imber...@gmail.com> wrote:
>
>> I am all for rewriting breeze, but I think waiting until after 2.0 makes the most sense. Python could work, but let’s be intentional about the decision before we choose.
>>
>> On Wed, Nov 11, 2020 at 3:12 PM, Deng Xiaodong <xd.den...@gmail.com> wrote:
>>
>> I agree with Kaxil’s point (or even a bit later, say when 2.0 gets relatively more “stable”).
>>
>> My point is more about concentrating development/community focus.
>>
>> XD
>>
>> On Thu, Nov 12, 2020 at 00:05 Kaxil Naik <kaxiln...@gmail.com> wrote:
>>
>>> I think we should wait until 2.0 is out before discussing or even gathering feedback, as I am sure any feedback will trigger a discussion.
>>>
>>> On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>
>>>> Andrew,
>>>>
>>>> Thanks for chiming in - just to answer your questions and clarify the scope of the discussion:
>>>>
>>>> Breeze is for developing Airflow itself; its purpose is not to develop and run DAGs. It was never intended to be used by the "users" of Airflow, or for DAG development or testing DAGs. And while we were pondering that thought recently, I think it never will be; it is simply not fit for that purpose.
>>>>
>>>> Even the "start-airflow" command is there mainly for the developers of Airflow, not for the users of it. For example, it can be used to quickly test whether a new release candidate of Apache Airflow "works" - thanks to it, in a few minutes I can run a released version of Airflow in several combinations of python/backend and see that it generally "works".
>>>>
>>>> So for the "docker-compose user production image" - sure, it is needed, but this is a different issue, with different users and a completely different use-case (even if the "docker-compose" name is there too). Those two are completely different use-cases, starting from the fact that even the docker image used there is different. Maybe this is what both you and Ash are talking about. In which case I fully agree it's needed, but I believe we are not talking about it here.
>>>>
>>>> If you want the kind of approach you are talking about, you can take a look at the issue here: https://github.com/apache/airflow/issues/8605. Nobody is actively working on it now, but I would love for someone to take the lead on it and complete it. I am happy to help and review it as much as I can. But maybe you would like to take the lead on it, Andrew, since you have some experience and a real use case behind it? I think we need people there who are actual users of Airflow - which, sadly, I am mostly not :)
>>>>
>>>> But let's not mix the two please :). I'd love to keep this thread focused on *"Breeze, the development environment for Airflow itself"*. Even the tagline of Breeze is "*It's a Breeze to develop Airflow*" rather than "It's a Breeze to develop DAGs".
>>>>
>>>> J.
>>>>
>>>> On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>
>>>>> Tomek:
>>>>>
>>>>> I started the discussion here so that everyone is aware of it even if they are not watching GH issues. I have now created the GH issue https://github.com/apache/airflow/issues/12282 so that I can gather together people with some interest, and I think it's best to continue the discussion there.
>>>>>
>>>>> What I plan to do within the next few days is to start a design document and design discussion.
>>>>> I would like to start with defining the actual users of Breeze, the use-cases it should serve, the purpose, and the set of assumptions that it should have. And only after we hash it all out, I would like to define the scope, decide whether we want to have one or many different tools for different users, how much of it is common, and whether we can remove some of it completely or simplify it.
>>>>>
>>>>> I think we've gathered enormous experience from various levels of developers while using Breeze, and it's a perfect moment to discuss (with those various users) what is useful, for whom, what makes sense, and how to provide the best interface. I see the current Breeze as a learning platform on what is useful and what is not, and I would love - this time - for the decisions in it to be made by the actual users (of various kinds). And I would love to lead it - not as a developer this time, but as a "product manager" - listening to various voices and trying to make the best of it, reaching some consensus and working with others to implement it. I think this is the best use of the experience we had with Breeze and the "crowd-wisdom" of the developers of Airflow, of different kinds and with different experience.
>>>>>
>>>>> J.
>>>>>
>>>>> On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <andrewharmon...@gmail.com> wrote:
>>>>>
>>>>>> I would agree: as an end user, I’m not really sure what Breeze does. Is it for CI, or is it a way to quickly spin up a containerized env for local development? I do think it would be great to have something similar to Puckel that uses official airflow images. Very easy to quickly get started with to give airflow a try, but also a jumping off point for organizations to customize it to their needs. If this is docker-compose or something else, that’s fine.
>>>>>> We use a customized version of puckel for all the engineers to do local dag development. It would be great if this was more “official” Airflow. I agree that python would make it easier for others to contribute. Finally, very clear documentation on the Airflow site would be very helpful too.
>>>>>>
>>>>>> Thanks,
>>>>>> Andrew Harmon
>>>>>>
>>>>>> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <turbas...@apache.org> wrote:
>>>>>>
>>>>>> +1 for using python.
>>>>>>
>>>>>> > I would also say: make breeze do less. Right now it is three major things:
>>>>>> > * A local development environment
>>>>>> > * CI runner
>>>>>> > * It's recently grown the ability to run airflow for developing dags.
>>>>>>
>>>>>> My first thought was similar - breeze does too much now. However, I think the problem is not the amount of functionality but the technology used - bash. Using python or any other language will let us create a nice and clear structure for the project that will be easy to onboard to, reason about and manage.
>>>>>>
>>>>>> Structuring breeze may allow us to use separate docker images and docker-compose files for different purposes (CI, DAG dev, Airflow dev). I like the way in which breeze is a "layer over docker" and I think this gives a nice experience. However, breeze has grown so big that I'm not even sure I use half of the functions it has.
>>>>>>
>>>>>> *Note:* where should we continue the discussion? The official place is the devlist, but we have a GH issue. Which one should we use to avoid two separate discussions?
>>>>>>
>>>>>> Tomek
>>>>>>
>>>>>> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>>>
>>>>>>> I also created an issue for it: https://github.com/apache/airflow/issues/12282
>>>>>>>
>>>>>>> Anyone interested in taking part - please comment there!
>>>>>>>
>>>>>>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>>>>
>>>>>>>> You screamed (among many others) and I listened :). And I think the time to act is now.
>>>>>>>>
>>>>>>>> I believe the scope of "Breeze 2" should be part of the design discussion, where we will hear others' opinions (especially those of first-time or fresh contributors).
>>>>>>>>
>>>>>>>> For now, my vision is quite a bit different than yours, Ash :). But I do not want to start a design discussion just yet; I want to make breathing space for others to chime in.
>>>>>>>>
>>>>>>>> I would love to hear many voices and interests of people before we deep dive into what "Breeze 2" might look like.
>>>>>>>>
>>>>>>>> What I am interested in is whether:
>>>>>>>>
>>>>>>>> a) it's the right time
>>>>>>>> b) python is the right choice
>>>>>>>> c) there are several people who would like to join and offer both help in designing the vision for it and their time to implement it.
>>>>>>>>
>>>>>>>> I think it is crucial that the people who will be implementing it are the main people who make design decisions about it, as I would love to have a strong group of people who would like to take part not only in developing it but also in maintaining it in the future.
>>>>>>>>
>>>>>>>> J.
>>>>>>>>
>>>>>>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Omg yes. I have been screaming out for this for months.
>>>>>>>>>
>>>>>>>>> $ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
>>>>>>>>> 6911
>>>>>>>>>
>>>>>>>>> That's entirely too much bash for my liking by about an order of magnitude ;)
>>>>>>>>>
>>>>>>>>> I would also say: make breeze do less.
>>>>>>>>> Right now it is three major things:
>>>>>>>>>
>>>>>>>>> * A local development environment
>>>>>>>>> * CI runner
>>>>>>>>> * It's recently grown the ability to run airflow for developing dags.
>>>>>>>>>
>>>>>>>>> That is too much. Yes there is overlap, but it's just too much in one tool, and too complex as a result. Some of this should just be replaced with a docker-compose file (that uses published release images, not floating master/nightly) and users told to run that.
>>>>>>>>>
>>>>>>>>> Make it simpler, fitting a core purpose - running CI consistently should be its only goal.
>>>>>>>>>
>>>>>>>>> -ash
>>>>>>>>>
>>>>>>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>>>>>>
>>>>>>>>> Hello Everyone,
>>>>>>>>>
>>>>>>>>> TL;DR: I have been thinking about this for quite a while and I think this is the right time to raise the subject. It's been asked several times why Breeze is not written in something other than Bash, since it is "that big" or, as some people said, "monstrous" :). I think it's the right time to start a "rewrite" project with wide community involvement, and Python seems to be the best choice :).
>>>>>>>>>
>>>>>>>>> I was opposing this while we were focusing on Airflow 2.0, and there are some good reasons why I initially started Breeze in Bash. But I think that with the current state of the Airflow 2.0 betas, with Airflow 2.0 fully based on Python 3.6, and with the "stability", "good set of features" and good level of modularisation we have achieved in Breeze - it's the right time to think about a rewrite.
>>>>>>>>>
>>>>>>>>> I did not raise this subject to add a distraction on top of what is already a lot of work for 2.0, but I think having Breeze rewritten in Python could be the "one more thing" that we could do as a community to make the 2.0 experience even better, and one that can bring the community even closer.
>>>>>>>>>
>>>>>>>>> I was thinking that Breeze is perfect to split into separate smaller pieces, to describe some assumptions that we will have for its use, and to turn it into a true community effort where a lot of people will contribute and where we will be able to simplify some of the stuff, and - most importantly - have more people from the community know how our CI and development environment works and be able to solve any problems there.
>>>>>>>>>
>>>>>>>>> Breeze (and the underlying bash libraries) are crucial to getting our CI working, and I am mostly the single point of contact (and failure!) when it comes to that - I would love to not be one :). I think that with most of the core committers busy with 2.0, this is also an opportunity for more of the contributors to take their part in it (and eventually earn their rank to become committers!). For the core committers, this is an extra opportunity to learn how the system works, influence its design, and possibly simplify some parts of it - even if they will be mostly focused on 2.0.
>>>>>>>>>
>>>>>>>>> I would like to do it well - write some assumptions in a design doc, plan the work and split it into separate issues, and lead the effort - but I would love it if most of the work were done by others, who would then become familiar with the whole of it.
>>>>>>>>>
>>>>>>>>> WDYT?
>>>>>>>>> Do you think it is a good idea? Do you think it is the right time? Are there some people in the community who would like to take part in it?
>>>>>>>>>
>>>>>>>>> J.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jarek Potiuk
>>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>>>>> M: +48 660 796 129 <+48660796129>