@Tomek - Thanks for the link to the Airflow talk! Checking it out now.
 
@Jarek - It sounds like git-sync is, or rather should be, the default way users add/modify DAGs. With that said, have you had any experience with customers syncing their DAGs to other forms of DAG storage (S3 etc.), and what were the outcomes?
 
I spoke with Luciano and we're both available anytime after noon on Thursday and Friday to chat about this effort.
@Ry - Looking at your calendar Thurs@12:30 and Fri@1pm both look open, which one would fit your schedule best?
 
Look forward to chatting.
 
--
Alan Chin
CODAIT, San Francisco
Email - [email protected]
 
----- Original message -----
From: Tomasz Urbaszek <[email protected]>
To: [email protected]
Cc: "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
Subject: [EXTERNAL] Re: A Visual Editor for Airflow pipelines
Date: Tue, Nov 3, 2020 3:29 AM
 
I think the visual DAG editor would be a thing!

Not sure if you are aware of this Airflow Summit talk about a visual DAG editor and the integration between Airflow and CWL:
https://www.youtube.com/watch?v=I4nFCqEnOJc&list=PLGudixcDaxY3RGLSlWoN_cEEXhIT1OPmj&index=19

Cheers,
Tomek
 
On Tue, Nov 3, 2020 at 11:42 AM Jarek Potiuk <[email protected]> wrote:
Agree - maybe 2.1 or 2.2 :). 
 
After some experience with big customer deployments, I personally think GitSync is, at least for now, the best approach out there. It requires a git repo + authorization, but this has all the added benefits of code change tracking, it is a very standard interface, most git repos provide some way of manual review if needed, and most have some kind of integration with CI/automated code analysis.
 
I personally think it should be the default for any serious deployment, as it provides so many benefits with very limited extra overhead. You just need an extra "box" - a git repo (which is pretty much a given in any organization). It uses a standard interface that is highly customizable (branches/folder structures, whatnot) and we already have git-sync container support in the helm chart.
 
J.
 
 
On Tue, Nov 3, 2020 at 11:27 AM Ash Berlin-Taylor <[email protected]> wrote:
Wishful thinking at the moment, Gerard -- the task execution still needs files on disk to run the tasks.
 
This was always in my long-term plan for DAG serialization, but we aren't there yet. And custom operators make this a non-straightforward problem to solve.
 
-ash
 
On Nov 3 2020, at 12:18 am, Gerard Casas Saez <[email protected]> wrote:
Would also be interested to know possible ways to do what Luciano described. Hopefully with the serialized DAG and the new API we can start just pushing the DAG to the DB (wishful thinking)?
 
Gerard Casas Saez
Twitter | Cortex | @casassaez
 
On Mon, Nov 2, 2020 at 2:06 PM Jarek Potiuk <[email protected]> wrote:
Cool! I also think it's an interesting one :). But it would be great to have such integration possible from Elyra :). Let us know what comes out of it :).
 
J.
 
 
On Mon, Nov 2, 2020 at 10:02 PM Ry Walker <[email protected]> wrote:
Hi Luciano -
 
Elyra looks like an interesting project — we'd love to connect and talk through the opportunity.
 
You can compare your cal to mine and grab a slot here: https://calendly.com/ryw/60min — and I'll be sure to get a few of the Airflow PMC members to join as well.
 
-Ry
 
Ry Walker
Founder/CTO of Astronomer + Airflow Committer
 
 
On Mon, Nov 2, 2020 at 12:00 AM Luciano Resende <[email protected]> wrote:
Hi All,
 
As mentioned on the user list [1], we are working on a visual editor
for pipelines and adding Airflow as one of the supported backends.
   
As you are the Airflow devs, we would like to invite you to help us
implement the best integration possible, in two steps:
 
1) Getting a solid integration for building and running pipelines with
Python scripts and Jupyter notebooks
 
2) Expanding the list of available component types and enabling more generic operators
 
One of the questions raised in the original e-mail is how best to
submit the pipeline DAG to be executed by the Airflow runtime. We have
tried a few different options, starting with the experimental REST API
and S3 bucket syncs, and these do not seem to be the ideal solution. We
will be looking into git-sync next, but would really appreciate some
suggestions on the best options, particularly if someone has already
done a similar external integration.
 
Feel free to create issues for discussion and/or more details,
or use this thread for suggestions.
   
--
Luciano Resende
 
 
--

Jarek Potiuk
Polidea | Principal Software Engineer

 
 