Sounds reasonable to me, +1. We have been meaning to refactor the AWS operators anyway, so we could take advantage of this to reorganize the repo a little bit.
That makes me think that we have not yet decided on a workflow for working towards, say, 2.0 vs 1.8. We might want to start thinking about this.

Best,
Arthur

On Fri, Oct 14, 2016 at 9:58 AM, Chris Riccomini wrote:
>
> > So I vote for this, but it will have to be done gently to avoid breaking
> > the existing GCP ones.
>
> Same.
>
> On Fri, Oct 14, 2016 at 8:51 AM, Jeremiah Lowin wrote:
> > One reason I do like the idea is that especially in contrib, Operators are
> > essentially self-documenting and the first clue is just the file name
> > ('my_gcp_operators.py'). Since we no longer greedily import anything, you
> > have to know exactly what file to import to get the functionality you want.
> > Grouping them provides a gentler way to figure out what file does what
> > ('GCP/storage_operators.py' vs 'GCP/bigquery_operators.py' vs
> > 'docker_operators.py'). Sure, you could do this by enforcing a common name
> > standard ('GCP_storage_operators.py') but submodules mean you can
> > additionally take advantage of the common infrastructure that Alex
> > referenced. I think if we knew how many contrib modules we would have today,
> > we would have done this at the outset (even though it would have looked like
> > major overkill). Also, the previous import mechanism made importing from
> > submodules really hard; we don't have that issue anymore.
> >
> > So I vote for this, but it will have to be done gently to avoid breaking the
> > existing GCP ones.
> >
> > On Fri, Oct 14, 2016 at 11:29 AM Alex Van Boxel wrote:
> >>
> >> Talking about AWS, it would only make sense if other people would step up
> >> to do it for AWS, and even Azure (or don't we have Azure operators?).
> >>
> >> On Fri, Oct 14, 2016 at 5:25 PM Chris Riccomini wrote:
> >>
> >> > What do others think? I know Sid is a big AWS user.
> >> >
> >> > On Fri, Oct 14, 2016 at 8:24 AM, Chris Riccomini wrote:
> >> > > Ya, if we go the deprecation route, and let them float around for a
> >> > > release or two, I'm OK with that (or until we bump major to 2.0).
> >> > >
> >> > > Other than that, it sounds like a good opportunity to clean things up.
> >> > > :) I do notice a lot of AWS/GCP code (e.g. the S3 Redshift operator).
> >> > >
> >> > > On Fri, Oct 14, 2016 at 8:16 AM, Alex Van Boxel wrote:
> >> > >> Well, I wouldn't touch the ones that exist (maybe we could mark them
> >> > >> deprecated, but that's all). But I would move (copy) them together and
> >> > >> make them consistent (for example, let them all use the same default
> >> > >> connection_id, ...). For a new user it's quite confusing, I think, for
> >> > >> different reasons (style, etc.). You know we have an old ticket: making
> >> > >> GCP consistent (I just don't want to start on this one, for fear of
> >> > >> breaking something).
> >> > >>
> >> > >> On Fri, Oct 14, 2016 at 4:59 PM Chris Riccomini wrote:
> >> > >>
> >> > >> Hmm. What advantages would this provide? I'm a little nervous about
> >> > >> breaking compatibility. We have a bunch of DAGs which import all kinds
> >> > >> of GCP hooks and operators. Wouldn't want those to move.
> >> > >>
> >> > >> On Fri, Oct 14, 2016 at 7:54 AM, Alex Van Boxel wrote:
> >> > >>> Hi all,
> >> > >>>
> >> > >>> I'm starting to write some very exotic Operators that would be a bit
> >> > >>> strange to add to contrib. Examples of this are:
> >> > >>>
> >> > >>> + See if a Compute snapshot of a disk is created
> >> > >>> + See if a string appears on the serial port of a Compute instance
> >> > >>>
> >> > >>> but they would be a nice addition if we had a Google Compute plugin (or
> >> > >>> any other cloud provider, AWS, Azure, ...). I'm not talking about getting
> >> > >>> cloud support out of the main source tree. No, I'm talking about grouping
> >> > >>> them together in a consistent part. We can even start adding macros etc.
> >> > >>> This would be a good opportunity to move all the GCP operators together,
> >> > >>> making them consistent without breaking the existing operators that
> >> > >>> exist in *contrib*.
> >> > >>>
> >> > >>> Here are a few requirements that I think of:
> >> > >>>
> >> > >>> - separate folder (example <airflow>/integration/googlecloud,
> >> > >>>   <airflow>/integration/aws, <airflow>/integration/azure)
> >> > >>> - enable in config (don't want to load integrations I don't use)
> >> > >>> - based on Plugin (same interface)
> >> > >>>
> >> > >>> Thoughts?
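
To make the "enable in config" requirement from the original proposal a bit more concrete, here is a rough sketch of how only the listed integrations might get loaded. Everything in it is an assumption for illustration: the [integrations] section, the "enabled" key, the registry of loaders and the operator names are hypothetical and do not exist in Airflow.

```python
import configparser

# Hypothetical config fragment; the [integrations] section and the
# "enabled" key are made up for this sketch, not real Airflow options.
EXAMPLE_CFG = """
[integrations]
enabled = googlecloud, aws
"""


def enabled_integrations(config_text):
    """Return the integration names listed in the config, in order."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    raw = parser.get("integrations", "enabled", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


def load_integrations(names, registry):
    """Call the loader for each enabled integration and skip the rest."""
    loaded = {}
    for name in names:
        if name not in registry:
            raise KeyError("unknown integration: {}".format(name))
        loaded[name] = registry[name]()
    return loaded


# Stand-in loaders with hypothetical operator names. A real version would
# import the package under <airflow>/integration/<name> instead.
REGISTRY = {
    "googlecloud": lambda: ["ComputeSnapshotSensor"],
    "aws": lambda: ["S3ToRedshiftOperator"],
    "azure": lambda: ["AzurePlaceholderOperator"],
}

loaded = load_integrations(enabled_integrations(EXAMPLE_CFG), REGISTRY)
print(sorted(loaded))  # azure is not listed in the config, so it is never loaded
```

The point of going through a registry of callables is that nothing for an integration runs until its name shows up in the config, which matches the "don't want to load integrations I don't use" requirement.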

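
For the deprecation route discussed in the thread (leave the existing contrib operators importable for a release or two, but point people at the new location), one possible shape is a thin alias that warns on use. This is only a sketch under stated assumptions: deprecated_alias, StorageDownloadOperator, the module paths and the default conn_id are hypothetical names chosen for illustration, not existing Airflow code.

```python
import warnings


def deprecated_alias(new_cls, old_path, new_path):
    """Return a subclass of new_cls that warns whenever it is instantiated."""

    class _Deprecated(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                "{} is deprecated; import from {} instead".format(old_path, new_path),
                DeprecationWarning,
                stacklevel=2,
            )
            super().__init__(*args, **kwargs)

    # Keep the familiar class name so reprs and logs look unchanged.
    _Deprecated.__name__ = new_cls.__name__
    return _Deprecated


# Hypothetical operator living in its new, grouped location.
class StorageDownloadOperator:
    def __init__(self, bucket, conn_id="google_cloud_default"):
        self.bucket = bucket
        self.conn_id = conn_id


# What the old contrib module could keep exporting during the transition.
LegacyStorageDownloadOperator = deprecated_alias(
    StorageDownloadOperator,
    old_path="contrib.operators.gcs_download_operator",   # hypothetical path
    new_path="integration.googlecloud.storage_operators",  # hypothetical path
)

# Existing DAG code keeps working, but now emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    op = LegacyStorageDownloadOperator(bucket="my-bucket")
```

Existing DAGs that import from the old path would keep running unchanged, and the alias can be dropped at the next major version bump.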