Are you guys familiar with Beam <https://beam.apache.org>? Especially if you
are not doing transforms, it might be rather straightforward to rely on the
ecosystem of connectors in that Apache project as the foundation for a
generic transfer operator.
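
To make the shape of such an operator concrete, here is a minimal sketch in
plain Python. It uses neither Beam nor Airflow APIs: Source, Sink,
InMemorySource, InMemorySink and transfer are hypothetical names made up for
illustration, not anything taken from the POC doc. A real version would
presumably wrap an existing connector or hook behind read() and write().

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Protocol

# A record is assumed to be a plain dict; the thread leaves the data format open.
Record = Dict[str, Any]


class Source(Protocol):
    """Hypothetical interface: anything that can yield records."""

    def read(self) -> Iterable[Record]: ...


class Sink(Protocol):
    """Hypothetical interface: anything that can accept records."""

    def write(self, records: Iterable[Record]) -> int: ...


class InMemorySource:
    """Stand-in for a real connector (file, database, storage, ...)."""

    def __init__(self, rows: Iterable[Record]) -> None:
        self.rows: List[Record] = list(rows)

    def read(self) -> Iterable[Record]:
        return iter(self.rows)


class InMemorySink:
    """Stand-in for a real target; keeps the written records in a list."""

    def __init__(self) -> None:
        self.rows: List[Record] = []

    def write(self, records: Iterable[Record]) -> int:
        before = len(self.rows)
        self.rows.extend(records)
        # Report how many records this call wrote.
        return len(self.rows) - before


def transfer(
    source: Source,
    sink: Sink,
    transform: Optional[Callable[[Record], Record]] = None,
) -> int:
    """Move records from source to sink, optionally transforming each one."""
    records = source.read()
    if transform is not None:
        # Lazy generator: records are transformed as the sink consumes them.
        records = (transform(record) for record in records)
    return sink.write(records)
```

With this shape, supporting a new system only means writing one more class
with a read() or write() method; the transfer function itself stays unchanged.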

On Tue, Sep 1, 2020 at 11:05 AM Jarek Potiuk <[email protected]>
wrote:

> +1
>
> On Tue, Sep 1, 2020 at 1:35 PM Kamil Olszewski <
> [email protected]>
> wrote:
>
> > Hello all,
> > since there have been no new comments in the POC doc
> > <https://docs.google.com/document/d/1o7Ph7RRNqLWkTbe7xkWjb100eFaK1Apjv27LaqHgNkE/edit>
> > for a couple of days, I will proceed with creating an AIP for this
> > feature, if that is OK with everybody.
> > Best regards,
> > Kamil
> > On Thu, Aug 27, 2020 at 10:50 AM Tomasz Urbaszek <[email protected]>
> > wrote:
> >
> > > I like the approach as it introduces another interesting standardization
> > > of the operators' interface. It would be awesome to hear more opinions :)
> > >
> > > Cheers,
> > > Tomek
> > >
> > > On Wed, Aug 19, 2020 at 8:10 PM Jarek Potiuk <[email protected]>
> > > wrote:
> > >
> > > > I like the idea a lot. Similar things have been discussed before, but
> > > > the proposal is, I think, rather pragmatic and solves a real problem
> > > > (and it does not seem to be too complex to implement).
> > > >
> > > > There is some discussion about it already in the document (those
> > > > interested, please chime in), but here are a few points on why I like it:
> > > >
> > > > - performance and optimization are not a focus here. For generic stuff
> > > > it is usual to write an "optimal" solution, but once you admit you are
> > > > not going to focus on optimisation, you come up with simpler and
> > > > easier-to-use solutions
> > > >
> > > > - on the other hand, it takes a very "Python'y" approach, using
> > > > Airflow's familiar concepts (connection, transfer), and has the
> > > > potential to plug easily into the 100s of hooks we already have,
> > > > leveraging all the "providers" richness of Airflow.
> > > >
> > > > - it aims to make a "quick start" easy: if you have a number of
> > > > different sources/targets and, as a data scientist, you would like to
> > > > quickly start transferring data between them, you can do it easily with
> > > > only basic Python knowledge and a simple DAG structure.
> > > >
> > > > - it should be possible to plug it into our new functional approach as
> > > > well as into future lineage discussions, as it makes the connection
> > > > between sources and targets
> > > >
> > > > - it opens up possibilities of adding simple and flexible data
> > > > transformation on transfer. This is not a replacement for any of the
> > > > external services that Airflow should use (Airflow is an orchestrator,
> > > > not a data processing solution), but for the kind of quick-start
> > > > scenarios where I foresee it being most useful, letting a data
> > > > scientist apply simple data transformations on the fly might be a big
> > > > plus.
> > > >
> > > > Suggestion: a pandas DataFrame as the format of the "data" component
> > > >
> > > > Kamil - you should have access now.
> > > >
> > > > J.
> > > >
> > > >
> > > > On Tue, Aug 18, 2020 at 6:53 PM Kamil Olszewski <
> > > > [email protected]>
> > > > wrote:
> > > >
> > > > > Hello all,
> > > > > at Polidea we have come up with an idea for a generic transfer
> > > > > operator that would be able to transport data between two
> > > > > destinations of various types (file, database, storage, etc.) -
> > > > > please find a short doc with the POC at
> > > > > <https://docs.google.com/document/d/1o7Ph7RRNqLWkTbe7xkWjb100eFaK1Apjv27LaqHgNkE/edit?usp=sharing>
> > > > > where we can discuss the design initially. Once we come to an
> > > > > initial conclusion I can create an AIP on cWiki - can I ask for
> > > > > permission to do so (my id is 'kamil.olszewski')? I believe that
> > > > > during the discussion we should definitely aim for this feature to
> > > > > be released only after Airflow 2.0 is out.
> > > > >
> > > > > What do you think about this idea? Would you find such an operator
> > > > > helpful in your pipelines? Maybe you already use a similar solution
> > > > > or know packages that could be used to implement it?
> > > > >
> > > > > Best regards,
> > > > > --
> > > > >
> > > > > Kamil Olszewski
> > > > > Polidea <https://www.polidea.com> | Software Engineer
> > > > >
> > > > > M: +48 503 361 783
> > > > > E: [email protected]
> > > > >
> > > > > Unique Tech
> > > > > Check out our projects! <https://www.polidea.com/our-work>
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Jarek Potiuk
> > > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > >
> > > > M: +48 660 796 129 <+48660796129>
> > > > [image: Polidea] <https://www.polidea.com/>
> > > >
> > >
> >
> >
>
>
