Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Sheng Wu
+1 binding

Sheng Wu 吴晟
Twitter, wusheng1108


Byung-Gon Chun wrote on Sat, Dec 12, 2020 at 5:59 AM:

> +1 (binding)
>
> -Gon
>
> On Sat, Dec 12, 2020 at 2:35 AM Furkan KAMACI 
> wrote:
>
> > Hi,
> >
> > +1 (binding)
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On 11 Dec 2020 Fri at 20:04 Daniel B. Widdis  wrote:
> >
> > > +1 (non-binding).  I'm interested in getting involved in this project!
> > >
> > > On Fri, Dec 11, 2020 at 8:33 AM Christofer Dutz <
> > christofer.d...@c-ware.de
> > > >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > following up the [DISCUSS] thread on Wayang (
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E
> > > )
> > > > I would like to call a VOTE to accept Wayang (aka Rheem) into the
> Apache
> > > > Incubator.
> > > >
> > > > Please cast your vote:
> > > >
> > > >   [ ] +1, bring Wayang into the Incubator
> > > >   [ ] +0, I don't care either way
> > > >   [ ] -1, do not bring Wayang into the Incubator, because...
> > > >
> > > > The vote will be open for at least 72 hours and only votes from the
> > > Incubator
> > > > PMC are binding, but votes from everyone are welcome.
> > > >
> > > > Chris
> > > >
> > > > -
> > > >
> > > > Wayang Proposal (
> > > > https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal
> )
> > > >
> > > > == Abstract ==
> > > >
> > > > Wayang is a cross-platform data processing system that aims at
> > decoupling
> > > > the business logic of data analytics applications from concrete data
> > > > processing platforms, such as Apache Flink or Apache Spark. Hence, it
> > > tames
> > > > the complexity that arises from the "Cambrian explosion" of novel
> data
> > > > processing platforms that we currently witness.
> > > >
> > > > Note that the Wayang project is the Rheem project, but we have renamed
> the
> > > > project because of trademark issues.
> > > >
> > > > You can find the project web page at:
> > https://rheem-ecosystem.github.io/
> > > >
> > > > = Proposal =
> > > >
> > > > Wayang is a cross-platform system that provides an abstraction over
> > data
> > > > processing platforms to free users from the burdens of (i) performing
> > > > tedious and costly data migration and integration tasks to run their
> > > > applications, and (ii) choosing the right data processing platforms
> for
> > > > their applications. To achieve this, Wayang: (1) provides an
> > abstraction
> > > on
> > > > top of existing data processing platforms that allows users to
> specify
> > > > their data analytics tasks in the form of a DAG of operators; (2) comes
> > > with
> > > > a cross-platform optimizer for automating the selection of
> > > > suitable/efficient platforms; and (3) finally takes care of
> > executing
> > > > the optimized plan, including communication across platforms. In
> > summary,
> > > > Wayang has the following salient features:
> > > >
> > > > - Flexible Data Model - It considers a flexible and simple data model
> > > > based on data quanta. A data quantum is an atomic processing unit in
> > the
> > > > system, that can represent a large spectrum of data formats, such as
> > data
> > > > points for a machine learning application, tuples for a database
> > > > application, or RDF triples. Hence, Wayang is able to express a wide
> > > range
> > > > of data analytics tasks.
> > > > - Platform independence - It provides a simple interface (currently
> > Java
> > > > and Scala) that is inspired by established programming models, such
> as
> > > that
> > > > of Apache Spark and Apache Flink. Users represent their data analytic
> > > tasks
> > > > as a DAG (Wayang plan), where vertices correspond to Wayang operators
> > and
> > > > edges represent data flows (data quanta flowing) among these
> > operators. A
> > > > Wayang operator defines a particular kind of data transformation over
> > an
> > > > input data quantum, ranging from basic functionality (e.g.,
> > > > transformations, filters, joins) to complex, extensible tasks (e.g.,
> > > > PageRank).
> > > > - Cross-platform execution - Besides running a data analytic task on
> > any
> > > > data processing platform, it also comes with an optimizer that can
> > decide
> > > > to execute a single data analytic task using multiple data processing
> > > > platforms. This allows for exploiting the capabilities of different
> > data
> > > > processing platforms to perform complex data analytic tasks more
> > > > efficiently.
> > > > - Self-tuning UDF-based cost model - Its optimizer uses a cost model
> > fully
> > > > based on UDFs. This not only enables Wayang to learn the cost
> functions
> > > of
> > > > newly added data processing platforms, but also allows developers to
> > tune
> > > > the optimizer at will.
> > > > - Extensibility - It treats data processing platforms as plugins to
> > allow
> > > > users (developers) to easily incorporate new data processing
> platforms
> > > into
> > > > the system. 

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Byung-Gon Chun
+1 (binding)

-Gon

On Sat, Dec 12, 2020 at 2:35 AM Furkan KAMACI 
wrote:

> Hi,
>
> +1 (binding)
>
> Kind Regards,
> Furkan KAMACI
>
> On 11 Dec 2020 Fri at 20:04 Daniel B. Widdis  wrote:
>
> > +1 (non-binding).  I'm interested in getting involved in this project!
> >
> > On Fri, Dec 11, 2020 at 8:33 AM Christofer Dutz <
> christofer.d...@c-ware.de
> > >
> > wrote:
> >
> > > Hi all,
> > >
> > > following up the [DISCUSS] thread on Wayang (
> > >
> >
> https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E
> > )
> > > I would like to call a VOTE to accept Wayang (aka Rheem) into the Apache
> > > Incubator.
> > >
> > > Please cast your vote:
> > >
> > >   [ ] +1, bring Wayang into the Incubator
> > >   [ ] +0, I don't care either way
> > >   [ ] -1, do not bring Wayang into the Incubator, because...
> > >
> > > The vote will be open for at least 72 hours and only votes from the
> > Incubator
> > > PMC are binding, but votes from everyone are welcome.
> > >
> > > Chris
> > >
> > > -
> > >
> > > Wayang Proposal (
> > > https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal)
> > >
> > > == Abstract ==
> > >
> > > Wayang is a cross-platform data processing system that aims at
> decoupling
> > > the business logic of data analytics applications from concrete data
> > > processing platforms, such as Apache Flink or Apache Spark. Hence, it
> > tames
> > > the complexity that arises from the "Cambrian explosion" of novel data
> > > processing platforms that we currently witness.
> > >
> > > Note that the Wayang project is the Rheem project, but we have renamed the
> > > project because of trademark issues.
> > >
> > > You can find the project web page at:
> https://rheem-ecosystem.github.io/
> > >
> > > = Proposal =
> > >
> > > Wayang is a cross-platform system that provides an abstraction over
> data
> > > processing platforms to free users from the burdens of (i) performing
> > > tedious and costly data migration and integration tasks to run their
> > > applications, and (ii) choosing the right data processing platforms for
> > > their applications. To achieve this, Wayang: (1) provides an
> abstraction
> > on
> > > top of existing data processing platforms that allows users to specify
> > > their data analytics tasks in the form of a DAG of operators; (2) comes
> > with
> > > a cross-platform optimizer for automating the selection of
> > > suitable/efficient platforms; and (3) finally takes care of
> executing
> > > the optimized plan, including communication across platforms. In
> summary,
> > > Wayang has the following salient features:
> > >
> > > - Flexible Data Model - It considers a flexible and simple data model
> > > based on data quanta. A data quantum is an atomic processing unit in
> the
> > > system, that can represent a large spectrum of data formats, such as
> data
> > > points for a machine learning application, tuples for a database
> > > application, or RDF triples. Hence, Wayang is able to express a wide
> > range
> > > of data analytics tasks.
> > > - Platform independence - It provides a simple interface (currently
> Java
> > > and Scala) that is inspired by established programming models, such as
> > that
> > > of Apache Spark and Apache Flink. Users represent their data analytic
> > tasks
> > > as a DAG (Wayang plan), where vertices correspond to Wayang operators
> and
> > > edges represent data flows (data quanta flowing) among these
> operators. A
> > > Wayang operator defines a particular kind of data transformation over
> an
> > > input data quantum, ranging from basic functionality (e.g.,
> > > transformations, filters, joins) to complex, extensible tasks (e.g.,
> > > PageRank).
> > > - Cross-platform execution - Besides running a data analytic task on
> any
> > > data processing platform, it also comes with an optimizer that can
> decide
> > > to execute a single data analytic task using multiple data processing
> > > platforms. This allows for exploiting the capabilities of different
> data
> > > processing platforms to perform complex data analytic tasks more
> > > efficiently.
> > > - Self-tuning UDF-based cost model - Its optimizer uses a cost model
> fully
> > > based on UDFs. This not only enables Wayang to learn the cost functions
> > of
> > > newly added data processing platforms, but also allows developers to
> tune
> > > the optimizer at will.
> > > - Extensibility - It treats data processing platforms as plugins to
> allow
> > > users (developers) to easily incorporate new data processing platforms
> > into
> > > the system. This is achieved by exposing the functionalities of data
> > > processing platforms as operators (execution operators). The same
> > approach
> > > is followed at the Wayang interface, where users can also extend Wayang
> > > capabilities, i.e., the operators, easily.
> > >
> > > We plan to work on the stability of all these features as well as
> > > extending Wayang 
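
To make the plan model described in the proposal concrete, here is a minimal
sketch of a DAG of operators over data quanta. It is not Wayang's API: the
Operator class and the source, map_op, filter_op, join_op and execute helpers
are hypothetical names chosen for illustration, and everything runs in plain
Python on a single machine.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Operator:
    """One vertex of the plan: a named transformation over data quanta."""
    name: str
    fn: Callable[[List[list]], list]  # receives one list per incoming edge
    inputs: List["Operator"] = field(default_factory=list)  # incoming edges


def source(name: str, data: Iterable) -> Operator:
    items = list(data)
    return Operator(name, lambda _inputs: items)


def map_op(name: str, f: Callable, upstream: Operator) -> Operator:
    return Operator(name, lambda ins: [f(q) for q in ins[0]], [upstream])


def filter_op(name: str, pred: Callable, upstream: Operator) -> Operator:
    return Operator(name, lambda ins: [q for q in ins[0] if pred(q)], [upstream])


def join_op(name: str, key: Callable, left: Operator, right: Operator) -> Operator:
    def run(ins):
        # Build a hash index on the right input, then probe it with the left.
        index = {}
        for q in ins[1]:
            index.setdefault(key(q), []).append(q)
        return [(l, r) for l in ins[0] for r in index.get(key(l), [])]
    return Operator(name, run, [left, right])


def execute(sink: Operator, cache=None) -> list:
    """Evaluate the DAG from the sink upwards, running each operator once."""
    cache = {} if cache is None else cache
    if id(sink) not in cache:
        cache[id(sink)] = sink.fn([execute(up, cache) for up in sink.inputs])
    return cache[id(sink)]


# A small plan: two sources, a filter, a join and a map.
users = source("users", [(1, "ann"), (2, "bob"), (3, "cy")])
orders = source("orders", [(1, 30), (1, 5), (3, 12), (4, 99)])
big = filter_op("big-orders", lambda o: o[1] >= 10, orders)
joined = join_op("join", lambda q: q[0], users, big)
names = map_op("names", lambda pair: pair[0][1], joined)
result = execute(names)
```

The plan only describes what to compute; in this toy version execute() is the
single "platform". A cross-platform system would instead map each vertex to an
execution operator on some backend.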

Re: [DISCUSS] Wayang Proposal

2020-12-11 Thread Jean-Baptiste Onofre
Thanks ;)

I’m starting the deep dive review ;)

Regards
JB

> On 11 Dec 2020, at 18:58, Alexander Alten-Lorenz wrote:
> 
> Hey JB,
> 
> On behalf of the Wayang team, we are happy to have you as a mentor and we
> have already updated the proposal (thanks Chris!). Thanks JB :)
> 
> best,
> --alex
> 
> On Fri, Dec 11, 2020 at 4:50 PM Rodrigo Pardo Meza 
> wrote:
> 
>> Hi Jean,
>> 
>> Nice to meet you. Thanks for supporting the Wayang proposal.
>> 
>> You get my +1 to join in the mentoring team.
>> 
>> Best regards!
>> 
>> On Fri, 11 Dec 2020 at 12:44, bertty contreras (<
>> berttycontre...@gmail.com>) wrote:
>> 
>>> Hi JB,
>>> 
>>> It would be so nice to have you join Wayang; you have my +1
>>> 
>>> Best Regards,
>>> Bertty
>>> 
>>> On Wed, 9 Dec 2020 at 11:02, Alexander Alten-Lorenz (<a...@scalytics.io>)
>>> wrote:
>>> 
>>>> Resent to the ML since Exim does not accept MIME Content-Type
>>>> 'text/html' (#5.2.3)
>>>> 
>>>> Hey JB,
>>>> 
>>>> Wayang is an intelligent scheduler that leverages multiple frameworks
>>>> and handles spills / tasks of an RDD by itself when this speeds
>>>> up the processing. We see ourselves in the middle of Apache Spark,
>>>> Apache Beam and Apache Nemo. Personally I see us closer to Apache
>>>> Flink, but with major advantages as described before. Our focus is
>>>> more on AI / ML and the productionizing of models, helping data
>>>> engineers get the job done with the right tool in less time, instead
>>>> of always using the same one.
>>>> 
>>>> For mentoring, you get my +1. Would be great to have you in the
>>>> mentoring team!
>>>> 
>>>> Best,
>>>> --alex
>>>> 
>>>> 
>>>> On Wed, Dec 9, 2020 at 2:58 PM Alexander Alten-Lorenz <a...@scalytics.io>
>>>> wrote:
> 
> Hey JB,
> 
> 
> 
> Wayang is an intelligent scheduler that leverages multiple frameworks
> and handles spills / tasks of an RDD by itself when this speeds up the
> processing. We see ourselves in the middle of Apache Spark, Apache Beam
> and Apache Nemo. Personally I see us closer to Apache Flink, but with
> major advantages as described before. Our focus is more on AI / ML and
> the productionizing of models, helping data engineers get the job done
> with the right tool in less time, instead of always using the same one.
> 
> 
> 
> For mentoring, you get my +1. Would be great to have you in the
> mentoring team!
> 
> 
> 
> Best,
> 
> --alex
> 
> 
> 
> --
> 
> Alexander Alten - Lorenz
> 
> Chief Technology Officer
> 
> Scalytics - Code once. Deploy on any cloud
> 
> 
> 
> m:   alexan...@scalytics.io
> 
> ln:   www.linkedin.com/in/alexanderalten/
> 
> m:   +49 151 156 81561
> 
> 
> 
> Scalytics, Inc.
> 
> Web: scalytics.io
> 
> 
> 
> 
> 
> From: Jean-Baptiste Onofre
> Sent: Wednesday, December 9, 2020 2:48 PM
> To: general@incubator.apache.org
> Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza
> Subject: Re: [DISCUSS] Wayang Proposal
> 
> 
> 
> Hi,
> 
> 
> 
> As it seems a bit close to Apache Beam, I would be interested in
> following and/or mentoring the project.
> 
> 
> 
> Please let me know if you want an additional mentor on this podling.
> 
> 
> 
> By the way, what is Wayang's position regarding Apache Beam and Apache
> Nemo?
> 
> 
> 
> Regards
> 
> JB
> 
> 
> 
>> On 6 Dec 2020, at 12:13, Christofer Dutz wrote:
> 
>> 
> 
>> Hi and welcome on this new communication channel.
> 
>> 
> 
>> I'd like to add, that I really liked how the project approached me
>>> and
 how they reacted when I told them the old project name Rheem would
>>> probably
 not go down well and that I would suggest changing the name.
> 
>> 
> 
>> I gave them some tips what things are important for Apache and what
>>> we
 lay emphasis on and the team started brainstorming and doing all the
 trademark checks, taking options off the list where problems could be
 expected.
> 
>> 
> 
>> So, upon acceptance here, the project would do the renaming
>> together
 with the groupId and package changing, which would be required anyway.
> 
>> 
> 
>> Right now, there are 3 people signed up as mentors, however knowing
 that not always are all mentors available at every given time, it would
>>> be
 cool if we could find 1-2 more to help out.
> 
>> 
> 
>> 
> 
>> Chris
> 
>> 
> 
>> 
> 
>> 
> 
>> -Original Message-
> 
>> From: Bertty Contreras 
> 
>> Sent: Friday, 4 December 2020 17:16
> 
>> To: general@incubator.apache.org
> 
>> Cc: Alexander 

Re: [DISCUSS] Wayang Proposal

2020-12-11 Thread Lars George
Same here from me, +1 to have JB join the team!

On Fri, Dec 11, 2020 at 6:59 PM Alexander Alten-Lorenz 
wrote:

> Hey JB,
>
> On behalf of the Wayang team, we are happy to have you as a mentor and we
> have already updated the proposal (thanks Chris!). Thanks JB :)
>
> best,
>  --alex
>
> On Fri, Dec 11, 2020 at 4:50 PM Rodrigo Pardo Meza <
> ro.pardo.m...@gmail.com>
> wrote:
>
> > Hi Jean,
> >
> > Nice to meet you. Thanks for supporting the Wayang proposal.
> >
> > You get my +1 to join in the mentoring team.
> >
> > Best regards!
> >
> > On Fri, 11 Dec 2020 at 12:44, bertty contreras (<
> > berttycontre...@gmail.com>) wrote:
> >
> > > Hi JB,
> > >
> > > It would be so nice to have you join Wayang; you have my +1
> > >
> > > Best Regards,
> > > Bertty
> > >
> > > On Wed, 9 Dec 2020 at 11:02, Alexander Alten-Lorenz (<a...@scalytics.io>)
> > > wrote:
> > >
> > > > Resent to the ML since Exim does not accept MIME Content-Type
> > > > 'text/html' (#5.2.3)
> > > >
> > > > Hey JB,
> > > >
> > > > Wayang is an intelligent scheduler that leverages multiple frameworks
> > > > and handles spills / tasks of an RDD by itself when this speeds
> > > > up the processing. We see ourselves in the middle of Apache Spark,
> > > > Apache Beam and Apache Nemo. Personally I see us closer to Apache
> > > > Flink, but with major advantages as described before. Our focus is
> > > > more on AI / ML and the productionizing of models, helping data
> > > > engineers get the job done with the right tool in less time, instead
> > > > of always using the same one.
> > > >
> > > > For mentoring, you get my +1. Would be great to have you in the
> > > > mentoring team!
> > > >
> > > > Best,
> > > > --alex
> > > >
> > > >
> > > > On Wed, Dec 9, 2020 at 2:58 PM Alexander Alten-Lorenz <a...@scalytics.io>
> > > > wrote:
> > > > >
> > > > > Hey JB,
> > > > >
> > > > >
> > > > >
> > > > > Wayang is an intelligent scheduler that leverages multiple
> > > > > frameworks and handles spills / tasks of an RDD by itself when this
> > > > > speeds up the processing. We see ourselves in the middle of Apache
> > > > > Spark, Apache Beam and Apache Nemo. Personally I see us closer to
> > > > > Apache Flink, but with major advantages as described before. Our
> > > > > focus is more on AI / ML and the productionizing of models, helping
> > > > > data engineers get the job done with the right tool in less time,
> > > > > instead of always using the same one.
> > > > >
> > > > >
> > > > >
> > > > > For mentoring, you get my +1. Would be great to have you in the
> > > > > mentoring team!
> > > > >
> > > > >
> > > > >
> > > > > Best,
> > > > >
> > > > > --alex
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Alexander Alten - Lorenz
> > > > >
> > > > > Chief Technology Officer
> > > > >
> > > > > Scalytics - Code once. Deploy on any cloud
> > > > >
> > > > >
> > > > >
> > > > > m:   alexan...@scalytics.io
> > > > >
> > > > > ln:   www.linkedin.com/in/alexanderalten/
> > > > >
> > > > > m:   +49 151 156 81561
> > > > >
> > > > >
> > > > >
> > > > > Scalytics, Inc.
> > > > >
> > > > > Web: scalytics.io
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > From: Jean-Baptiste Onofre
> > > > > Sent: Wednesday, December 9, 2020 2:48 PM
> > > > > To: general@incubator.apache.org
> > > > > Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza
> > > > > Subject: Re: [DISCUSS] Wayang Proposal
> > > > >
> > > > >
> > > > >
> > > > > Hi,
> > > > >
> > > > >
> > > > >
> > > > > As it seems a bit close to Apache Beam, I would be interested in
> > > > > following and/or mentoring the project.
> > > > >
> > > > >
> > > > >
> > > > > Please let me know if you want an additional mentor on this
> > > > > podling.
> > > > >
> > > > >
> > > > >
> > > > > By the way, what is Wayang's position regarding Apache Beam and
> > > > > Apache Nemo?
> > > > >
> > > > >
> > > > >
> > > > > Regards
> > > > >
> > > > > JB
> > > > >
> > > > >
> > > > >
> > > > > > On 6 Dec 2020, at 12:13, Christofer Dutz <christofer.d...@c-ware.de>
> > > > > > wrote:
> > > > >
> > > > > >
> > > > >
> > > > > > Hi and welcome on this new communication channel.
> > > > >
> > > > > >
> > > > >
> > > > > > I'd like to add, that I really liked how the project approached
> me
> > > and
> > > > how they reacted when I told them the old project name Rheem would
> > > probably
> > > > not go down well and that I would suggest changing the name.
> > > > >
> > > > > >
> > > > >
> > > > > > I gave them some tips what things are important for Apache and
> what
> > > we
> > > > lay emphasis on and the team started brainstorming and doing all the
> > > > trademark checks, taking options off the list where problems could be
> > > > expected.
> > > > >
> > > > > >
> > > > >
> > > > > > So, upon acceptance here, the project would do the renaming
> > together
> > > > with the groupId and package changing, which would be required
> anyway.
> > > > >
> > > > > 

Re: [DISCUSS] Wayang Proposal

2020-12-11 Thread Alexander Alten-Lorenz
Hey JB,

On behalf of the Wayang team, we are happy to have you as a mentor and we
have already updated the proposal (thanks Chris!). Thanks JB :)

best,
 --alex

On Fri, Dec 11, 2020 at 4:50 PM Rodrigo Pardo Meza 
wrote:

> Hi Jean,
>
> Nice to meet you. Thanks for supporting the Wayang proposal.
>
> You get my +1 to join in the mentoring team.
>
> Best regards!
>
> On Fri, 11 Dec 2020 at 12:44, bertty contreras (<
> berttycontre...@gmail.com>) wrote:
>
> > Hi JB,
> >
> > It would be so nice to have you join Wayang; you have my +1
> >
> > Best Regards,
> > Bertty
> >
> > On Wed, 9 Dec 2020 at 11:02, Alexander Alten-Lorenz (<a...@scalytics.io>)
> > wrote:
> >
> > > Resent to the ML since Exim does not accept MIME Content-Type
> > > 'text/html' (#5.2.3)
> > >
> > > Hey JB,
> > >
> > > Wayang is an intelligent scheduler that leverages multiple frameworks
> > > and handles spills / tasks of an RDD by itself when this speeds
> > > up the processing. We see ourselves in the middle of Apache Spark,
> > > Apache Beam and Apache Nemo. Personally I see us closer to Apache
> > > Flink, but with major advantages as described before. Our focus is
> > > more on AI / ML and the productionizing of models, helping data
> > > engineers get the job done with the right tool in less time, instead
> > > of always using the same one.
> > >
> > > For mentoring, you get my +1. Would be great to have you in the
> > > mentoring team!
> > >
> > > Best,
> > > --alex
> > >
> > >
> > > On Wed, Dec 9, 2020 at 2:58 PM Alexander Alten-Lorenz <a...@scalytics.io>
> > > wrote:
> > > >
> > > > Hey JB,
> > > >
> > > >
> > > >
> > > > Wayang is an intelligent scheduler that leverages multiple
> > > > frameworks and handles spills / tasks of an RDD by itself when this
> > > > speeds up the processing. We see ourselves in the middle of Apache
> > > > Spark, Apache Beam and Apache Nemo. Personally I see us closer to
> > > > Apache Flink, but with major advantages as described before. Our
> > > > focus is more on AI / ML and the productionizing of models, helping
> > > > data engineers get the job done with the right tool in less time,
> > > > instead of always using the same one.
> > > >
> > > >
> > > >
> > > > For mentoring, you get my +1. Would be great to have you in the
> > > > mentoring team!
> > > >
> > > >
> > > >
> > > > Best,
> > > >
> > > > --alex
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Alexander Alten - Lorenz
> > > >
> > > > Chief Technology Officer
> > > >
> > > > Scalytics - Code once. Deploy on any cloud
> > > >
> > > >
> > > >
> > > > m:   alexan...@scalytics.io
> > > >
> > > > ln:   www.linkedin.com/in/alexanderalten/
> > > >
> > > > m:   +49 151 156 81561
> > > >
> > > >
> > > >
> > > > Scalytics, Inc.
> > > >
> > > > Web: scalytics.io
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > From: Jean-Baptiste Onofre
> > > > Sent: Wednesday, December 9, 2020 2:48 PM
> > > > To: general@incubator.apache.org
> > > > Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza
> > > > Subject: Re: [DISCUSS] Wayang Proposal
> > > >
> > > >
> > > >
> > > > Hi,
> > > >
> > > >
> > > >
> > > > As it seems a bit close to Apache Beam, I would be interested in
> > > > following and/or mentoring the project.
> > > >
> > > >
> > > >
> > > > Please let me know if you want an additional mentor on this podling.
> > > >
> > > >
> > > >
> > > > By the way, what is Wayang's position regarding Apache Beam and
> > > > Apache Nemo?
> > > >
> > > >
> > > >
> > > > Regards
> > > >
> > > > JB
> > > >
> > > >
> > > >
> > > > > On 6 Dec 2020, at 12:13, Christofer Dutz wrote:
> > > >
> > > > >
> > > >
> > > > > Hi and welcome on this new communication channel.
> > > >
> > > > >
> > > >
> > > > > I'd like to add, that I really liked how the project approached me
> > and
> > > how they reacted when I told them the old project name Rheem would
> > probably
> > > not go down well and that I would suggest changing the name.
> > > >
> > > > >
> > > >
> > > > > I gave them some tips what things are important for Apache and what
> > we
> > > lay emphasis on and the team started brainstorming and doing all the
> > > trademark checks, taking options off the list where problems could be
> > > expected.
> > > >
> > > > >
> > > >
> > > > > So, upon acceptance here, the project would do the renaming
> together
> > > with the groupId and package changing, which would be required anyway.
> > > >
> > > > >
> > > >
> > > > > Right now, there are 3 people signed up as mentors, however knowing
> > > that not always are all mentors available at every given time, it would
> > be
> > > cool if we could find 1-2 more to help out.
> > > >
> > > > >
> > > >
> > > > >
> > > >
> > > > > Chris
> > > >
> > > > >
> > > >
> > > > >
> > > >
> > > > >
> > > >
> > > > > -Original Message-
> > > >
> > > > > From: Bertty Contreras 
> > > >
> > > > > Sent: Friday, 4 December 2020 17:16
> > > >
> > > > > To: general@incubator.apache.org
> > 

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Furkan KAMACI
Hi,

+1 (binding)

Kind Regards,
Furkan KAMACI

On 11 Dec 2020 Fri at 20:04 Daniel B. Widdis  wrote:

> +1 (non-binding).  I'm interested in getting involved in this project!
>
> On Fri, Dec 11, 2020 at 8:33 AM Christofer Dutz
> wrote:
>
> > Hi all,
> >
> > following up the [DISCUSS] thread on Wayang (
> >
> https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E
> )
> > I would like to call a VOTE to accept Wayang (aka Rheem) into the Apache
> > Incubator.
> >
> > Please cast your vote:
> >
> >   [ ] +1, bring Wayang into the Incubator
> >   [ ] +0, I don't care either way
> >   [ ] -1, do not bring Wayang into the Incubator, because...
> >
> > The vote will be open for at least 72 hours and only votes from the
> Incubator
> > PMC are binding, but votes from everyone are welcome.
> >
> > Chris
> >
> > -
> >
> > Wayang Proposal (
> > https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal)
> >
> > == Abstract ==
> >
> > Wayang is a cross-platform data processing system that aims at decoupling
> > the business logic of data analytics applications from concrete data
> > processing platforms, such as Apache Flink or Apache Spark. Hence, it
> tames
> > the complexity that arises from the "Cambrian explosion" of novel data
> > processing platforms that we currently witness.
> >
> > Note that the Wayang project is the Rheem project, but we have renamed the
> > project because of trademark issues.
> >
> > You can find the project web page at: https://rheem-ecosystem.github.io/
> >
> > = Proposal =
> >
> > Wayang is a cross-platform system that provides an abstraction over data
> > processing platforms to free users from the burdens of (i) performing
> > tedious and costly data migration and integration tasks to run their
> > applications, and (ii) choosing the right data processing platforms for
> > their applications. To achieve this, Wayang: (1) provides an abstraction
> on
> > top of existing data processing platforms that allows users to specify
> > their data analytics tasks in the form of a DAG of operators; (2) comes
> with
> > a cross-platform optimizer for automating the selection of
> > suitable/efficient platforms; and (3) finally takes care of executing
> > the optimized plan, including communication across platforms. In summary,
> > Wayang has the following salient features:
> >
> > - Flexible Data Model - It considers a flexible and simple data model
> > based on data quanta. A data quantum is an atomic processing unit in the
> > system, that can represent a large spectrum of data formats, such as data
> > points for a machine learning application, tuples for a database
> > application, or RDF triples. Hence, Wayang is able to express a wide
> range
> > of data analytics tasks.
> > - Platform independence - It provides a simple interface (currently Java
> > and Scala) that is inspired by established programming models, such as
> that
> > of Apache Spark and Apache Flink. Users represent their data analytic
> tasks
> > as a DAG (Wayang plan), where vertices correspond to Wayang operators and
> > edges represent data flows (data quanta flowing) among these operators. A
> > Wayang operator defines a particular kind of data transformation over an
> > input data quantum, ranging from basic functionality (e.g.,
> > transformations, filters, joins) to complex, extensible tasks (e.g.,
> > PageRank).
> > - Cross-platform execution - Besides running a data analytic task on any
> > data processing platform, it also comes with an optimizer that can decide
> > to execute a single data analytic task using multiple data processing
> > platforms. This allows for exploiting the capabilities of different data
> > processing platforms to perform complex data analytic tasks more
> > efficiently.
> > - Self-tuning UDF-based cost model - Its optimizer uses a cost model fully
> > based on UDFs. This not only enables Wayang to learn the cost functions
> of
> > newly added data processing platforms, but also allows developers to tune
> > the optimizer at will.
> > - Extensibility - It treats data processing platforms as plugins to allow
> > users (developers) to easily incorporate new data processing platforms
> into
> > the system. This is achieved by exposing the functionalities of data
> > processing platforms as operators (execution operators). The same
> approach
> > is followed at the Wayang interface, where users can also extend Wayang
> > capabilities, i.e., the operators, easily.
> >
> > We plan to work on the stability of all these features as well as
> > extending Wayang with more advanced features. Furthermore, Wayang
> currently
> > supports Apache Spark, Standalone Java, GraphChi, relational databases
> (via
> > JDBC). We plan to incorporate more data processing platforms, such as
> > Apache Flink and Apache Hive.
> >
> > === Background ===
> >
> > Many organizations and companies collect or produce large variety of data
> 
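
The cross-platform optimizer and UDF-based cost model described in the
proposal can be illustrated with a small sketch. This is not Wayang's
optimizer: pick_platforms, the "java" and "spark" cost functions and the
move_cost function are hypothetical, the plan is a simple chain rather than a
general DAG, and the cardinality is assumed to stay constant along the chain.
The sketch only shows how per-platform cost functions plus a data movement
cost can lead to a mixed-platform plan.

```python
from typing import Callable, Dict, List, Tuple

CostFn = Callable[[int], int]  # input cardinality -> estimated cost
Plan = List[Tuple[str, Dict[str, CostFn]]]


def pick_platforms(plan: Plan, cardinality: int,
                   move_cost: CostFn) -> Tuple[int, List[str]]:
    """Choose one platform per operator of a chain, minimizing total cost.

    Dynamic programming over the chain: for every platform we keep the
    cheapest assignment that ends on that platform.
    """
    best: Dict[str, Tuple[int, List[str]]] = {}
    for _name, options in plan:
        step: Dict[str, Tuple[int, List[str]]] = {}
        for platform, cost_fn in options.items():
            own = cost_fn(cardinality)
            if not best:  # first operator: nothing to move yet
                step[platform] = (own, [platform])
                continue
            # Cheapest way to arrive here, paying for a platform switch.
            prev_cost, prev_path = min(
                ((c + (0 if p == platform else move_cost(cardinality)), path)
                 for p, (c, path) in best.items()),
                key=lambda t: t[0],
            )
            step[platform] = (prev_cost + own, prev_path + [platform])
        best = step
    return min(best.values(), key=lambda t: t[0])


# Hypothetical cost UDFs: "java" has no startup cost but scales poorly,
# "spark" pays a fixed startup cost per operator but scales well.
plan: Plan = [
    ("read",     {"java": lambda n: n,      "spark": lambda n: 500 + n // 10}),
    ("filter",   {"java": lambda n: n,      "spark": lambda n: 500 + n // 10}),
    ("pagerank", {"java": lambda n: 10 * n, "spark": lambda n: 500 + n}),
]
move = lambda n: 2 * n  # cost of shipping n data quanta between platforms

small = pick_platforms(plan, 10, move)
medium = pick_platforms(plan, 100, move)
large = pick_platforms(plan, 100_000, move)
```

With these made-up numbers the small input stays on one platform, the large
input goes entirely to the other, and the medium input is split across both,
which is the kind of decision the proposal attributes to the optimizer.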

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Daniel B. Widdis
+1 (non-binding).  I'm interested in getting involved in this project!

On Fri, Dec 11, 2020 at 8:33 AM Christofer Dutz 
wrote:

> Hi all,
>
> following up the [DISCUSS] thread on Wayang (
> https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E)
> I would like to call a VOTE to accept Wayang (aka Rheem) into the Apache
> Incubator.
>
> Please cast your vote:
>
>   [ ] +1, bring Wayang into the Incubator
>   [ ] +0, I don't care either way
>   [ ] -1, do not bring Wayang into the Incubator, because...
>
> The vote will be open for at least 72 hours and only votes from the Incubator
> PMC are binding, but votes from everyone are welcome.
>
> Chris
>
> -
>
> Wayang Proposal (
> https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal)
>
> == Abstract ==
>
> Wayang is a cross-platform data processing system that aims at decoupling
> the business logic of data analytics applications from concrete data
> processing platforms, such as Apache Flink or Apache Spark. Hence, it tames
> the complexity that arises from the "Cambrian explosion" of novel data
> processing platforms that we currently witness.
>
> Note that the Wayang project is the Rheem project, but we have renamed the
> project because of trademark issues.
>
> You can find the project web page at: https://rheem-ecosystem.github.io/
>
> = Proposal =
>
> Wayang is a cross-platform system that provides an abstraction over data
> processing platforms to free users from the burdens of (i) performing
> tedious and costly data migration and integration tasks to run their
> applications, and (ii) choosing the right data processing platforms for
> their applications. To achieve this, Wayang: (1) provides an abstraction on
> top of existing data processing platforms that allows users to specify
> their data analytics tasks in the form of a DAG of operators; (2) comes with
> a cross-platform optimizer for automating the selection of
> suitable/efficient platforms; and (3) finally takes care of executing
> the optimized plan, including communication across platforms. In summary,
> Wayang has the following salient features:
>
> - Flexible Data Model - It considers a flexible and simple data model
> based on data quanta. A data quantum is an atomic processing unit in the
> system, that can represent a large spectrum of data formats, such as data
> points for a machine learning application, tuples for a database
> application, or RDF triples. Hence, Wayang is able to express a wide range
> of data analytics tasks.
> - Platform independence - It provides a simple interface (currently Java
> and Scala) that is inspired by established programming models, such as that
> of Apache Spark and Apache Flink. Users represent their data analytic tasks
> as a DAG (Wayang plan), where vertices correspond to Wayang operators and
> edges represent data flows (data quanta flowing) among these operators. A
> Wayang operator defines a particular kind of data transformation over an
> input data quantum, ranging from basic functionality (e.g.,
> transformations, filters, joins) to complex, extensible tasks (e.g.,
> PageRank).
> - Cross-platform execution - Besides running a data analytic task on any
> data processing platform, it also comes with an optimizer that can decide
> to execute a single data analytic task using multiple data processing
> platforms. This allows for exploiting the capabilities of different data
> processing platforms to perform complex data analytic tasks more
> efficiently.
> Self-tuning UDF-based cost model - Its optimizer uses a cost model fully
> based on UDFs. This not only enables Wayang to learn the cost functions of
> newly added data processing platforms, but also allows developers to tune
> the optimizer at will.
> - Extensibility - It treats data processing platforms as plugins to allow
> users (developers) to easily incorporate new data processing platforms into
> the system. This is achieved by exposing the functionalities of data
> processing platforms as operators (execution operators). The same approach
> is followed at the Wayang interface, where users can also extend Wayang
> capabilities, i.e., the operators, easily.
>
> We plan to work on the stability of all these features as well as
> extending Wayang with more advanced features. Furthermore, Wayang currently
> supports Apache Spark, Standalone Java, GraphChi, relational databases (via
> JDBC). We plan to incorporate more data processing platforms, such as
> Apache Flink and Apache Hive.
>
> === Background ===
>
> Many organizations and companies collect or produce large variety of data
> to apply data analytics over them. This is because insights from data
> rapidly allow them to make better decisions. Thus, the pursuit for
> efficient and scalable data analytics as well as the
> one-size-does-not-fit-all philosophy has given rise to a plethora of data
> processing platforms. Examples of these specialized 

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Jean-Baptiste Onofre
+1 (binding)

Regards
JB

> On 11 Dec 2020, at 17:33, Christofer Dutz wrote:
> 
> Hi all,
> 
> following up the [DISCUSS] thread on Wayang 
> (https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E)
>  I would like to call a VOTE to accept Wayang Aka Rheem into the Apache 
> Incubator.
> 
> Please cast your vote:
> 
>  [ ] +1, bring Wayang into the Incubator
>  [ ] +0, I don't care either way
>  [ ] -1, do not bring Wayang into the Incubator, because...
> 
> The vote will open at least for 72 hours and only votes from the Incubator 
> PMC are binding, but votes from everyone are welcome.
> 
> Chris

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Alexander Alten-Lorenz
+1 (non-binding)

On Fri, Dec 11, 2020 at 5:40 PM Kevin Ratnasekera
 wrote:
>
> +1 (binding)
>
> On Fri, Dec 11, 2020 at 10:05 PM Dave Fisher  wrote:
>
> > +1 (binding)
> >
> > Sent from my iPhone
> >
> > > On Dec 11, 2020, at 8:33 AM, Christofer Dutz 
> > wrote:
> > >
> > > Hi all,
> > >
> > > following up the [DISCUSS] thread on Wayang (
> > https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E)
> > I would like to call a VOTE to accept Wayang Aka Rheem into the Apache
> > Incubator.
> > >
> > > Please cast your vote:
> > >
> > >  [ ] +1, bring Wayang into the Incubator
> > >  [ ] +0, I don't care either way
> > >  [ ] -1, do not bring Wayang into the Incubator, because...
> > >
> > > The vote will open at least for 72 hours and only votes from the
> > Incubator PMC are binding, but votes from everyone are welcome.
> > >
> > > Chris
> > >
> > > -
> > >
> > > Wayang Proposal (
> > https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal)
> > >
> > > == Abstract ==
> > >
> > > Wayang is a cross-platform data processing system that aims at
> > decoupling the business logic of data analytics applications from concrete
> > data processing platforms, such as Apache Flink or Apache Spark. Hence, it
> > tames the complexity that arises from the "Cambrian explosion" of novel
> > data processing platforms that we currently witness.
> > >
> > > Note that Wayang project is the Rheem project, but we have renamed the
> > project because of trademark issues.
> > >
> > > You can find the project web page at: https://rheem-ecosystem.github.io/
> > >
> > > = Proposal =
> > >
> > > Wayang is a cross-platform system that provides an abstraction over data
> > processing platforms to free users from the burdens of (i) performing
> > tedious and costly data migration and integration tasks to run their
> > applications, and (ii) choosing the right data processing platforms for
> > their applications. To achieve this, Wayang: (1) provides an abstraction on
> > top of existing data processing platforms that allows users to specify
> > their data analytics tasks in a form of a DAG of operators; (2) comes with
> > a cross-platform optimizer for automating the selection of
> > suitable/efficient platforms; and (3) and finally takes care of executing
> > the optimized plan, including communication across platforms. In summary,
> > Wayang has the following salient features:
> > >
> > > - Flexible Data Model - It considers a flexible and simple data model
> > based on data quanta. A data quantum is an atomic processing unit in the
> > system, that can represent a large spectrum of data formats, such as data
> > points for a machine learning application, tuples for a database
> > application, or RDF triples. Hence, Wayang is able to express a wide range
> > of data analytics tasks.
> > > - Platform independence - It provides a simple interface (currently Java
> > and Scala) that is inspired by established programming models, such as that
> > of Apache Spark and Apache Flink. Users represent their data analytic tasks
> > as a DAG (Wayang plan), where vertices correspond to Wayang operators and
> > edges represent data flows (data quanta flowing) among these operators. A
> > Wayang operator defines a particular kind of data transformation over an
> > input data quantum, ranging from basic functionality (e.g.,
> > transformations, filters, joins) to complex, extensible tasks (e.g.,
> > PageRank).
> > > - Cross-platform execution - Besides running a data analytic task on any
> > data processing platform, it also comes with an optimizer that can decide
> > to execute a single data analytic task using multiple data processing
> > platforms. This allows for exploiting the capabilities of different data
> > processing platforms to perform complex data analytic tasks more
> > efficiently.
> > > Self-tuning UDF-based cost model - Its optimizer uses a cost model fully
> > based on UDFs. This not only enables Wayang to learn the cost functions of
> > newly added data processing platforms, but also allows developers to tune
> > the optimizer at will.
> > > - Extensibility - It treats data processing platforms as plugins to
> > allow users (developers) to easily incorporate new data processing
> > platforms into the system. This is achieved by exposing the functionalities
> > of data processing platforms as operators (execution operators). The same
> > approach is followed at the Wayang interface, where users can also extend
> > Wayang capabilities, i.e., the operators, easily.
> > >
> > > We plan to work on the stability of all these features as well as
> > extending Wayang with more advanced features. Furthermore, Wayang currently
> > supports Apache Spark, Standalone Java, GraphChi, relational databases (via
> > JDBC). We plan to incorporate more data processing platforms, such as
> > Apache Flink and Apache Hive.
> > >
> > > === Background ===
> > >
> > 

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Kevin Ratnasekera
+1 (binding)

On Fri, Dec 11, 2020 at 10:05 PM Dave Fisher  wrote:

> +1 (binding)
>
> Sent from my iPhone
>
> > On Dec 11, 2020, at 8:33 AM, Christofer Dutz 
> wrote:
> >
> > Hi all,
> >
> > following up the [DISCUSS] thread on Wayang (
> https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E)
> I would like to call a VOTE to accept Wayang Aka Rheem into the Apache
> Incubator.
> >
> > Please cast your vote:
> >
> >  [ ] +1, bring Wayang into the Incubator
> >  [ ] +0, I don't care either way
> >  [ ] -1, do not bring Wayang into the Incubator, because...
> >
> > The vote will open at least for 72 hours and only votes from the
> Incubator PMC are binding, but votes from everyone are welcome.
> >
> > Chris
> >
> > -
> >
> > Wayang Proposal (
> https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal)
> >
> > == Abstract ==
> >
> > Wayang is a cross-platform data processing system that aims at
> decoupling the business logic of data analytics applications from concrete
> data processing platforms, such as Apache Flink or Apache Spark. Hence, it
> tames the complexity that arises from the "Cambrian explosion" of novel
> data processing platforms that we currently witness.
> >
> > Note that Wayang project is the Rheem project, but we have renamed the
> project because of trademark issues.
> >
> > You can find the project web page at: https://rheem-ecosystem.github.io/
> >
> > = Proposal =
> >
> > Wayang is a cross-platform system that provides an abstraction over data
> processing platforms to free users from the burdens of (i) performing
> tedious and costly data migration and integration tasks to run their
> applications, and (ii) choosing the right data processing platforms for
> their applications. To achieve this, Wayang: (1) provides an abstraction on
> top of existing data processing platforms that allows users to specify
> their data analytics tasks in a form of a DAG of operators; (2) comes with
> a cross-platform optimizer for automating the selection of
> suitable/efficient platforms; and (3) and finally takes care of executing
> the optimized plan, including communication across platforms. In summary,
> Wayang has the following salient features:
> >
> > - Flexible Data Model - It considers a flexible and simple data model
> based on data quanta. A data quantum is an atomic processing unit in the
> system, that can represent a large spectrum of data formats, such as data
> points for a machine learning application, tuples for a database
> application, or RDF triples. Hence, Wayang is able to express a wide range
> of data analytics tasks.
> > - Platform independence - It provides a simple interface (currently Java
> and Scala) that is inspired by established programming models, such as that
> of Apache Spark and Apache Flink. Users represent their data analytic tasks
> as a DAG (Wayang plan), where vertices correspond to Wayang operators and
> edges represent data flows (data quanta flowing) among these operators. A
> Wayang operator defines a particular kind of data transformation over an
> input data quantum, ranging from basic functionality (e.g.,
> transformations, filters, joins) to complex, extensible tasks (e.g.,
> PageRank).
> > - Cross-platform execution - Besides running a data analytic task on any
> data processing platform, it also comes with an optimizer that can decide
> to execute a single data analytic task using multiple data processing
> platforms. This allows for exploiting the capabilities of different data
> processing platforms to perform complex data analytic tasks more
> efficiently.
> > Self-tuning UDF-based cost model - Its optimizer uses a cost model fully
> based on UDFs. This not only enables Wayang to learn the cost functions of
> newly added data processing platforms, but also allows developers to tune
> the optimizer at will.
> > - Extensibility - It treats data processing platforms as plugins to
> allow users (developers) to easily incorporate new data processing
> platforms into the system. This is achieved by exposing the functionalities
> of data processing platforms as operators (execution operators). The same
> approach is followed at the Wayang interface, where users can also extend
> Wayang capabilities, i.e., the operators, easily.
> >
> > We plan to work on the stability of all these features as well as
> extending Wayang with more advanced features. Furthermore, Wayang currently
> supports Apache Spark, Standalone Java, GraphChi, relational databases (via
> JDBC). We plan to incorporate more data processing platforms, such as
> Apache Flink and Apache Hive.
> >
> > === Background ===
> >
> > Many organizations and companies collect or produce large variety of
> data to apply data analytics over them. This is because insights from data
> rapidly allow them to make better decisions. Thus, the pursuit for
> efficient and scalable data analytics as well as the
> 

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Dave Fisher
+1 (binding)

Sent from my iPhone

> On Dec 11, 2020, at 8:33 AM, Christofer Dutz  
> wrote:
> 
> Hi all,
> 
> following up the [DISCUSS] thread on Wayang 
> (https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E)
>  I would like to call a VOTE to accept Wayang Aka Rheem into the Apache 
> Incubator.
> 
> Please cast your vote:
> 
>  [ ] +1, bring Wayang into the Incubator
>  [ ] +0, I don't care either way
>  [ ] -1, do not bring Wayang into the Incubator, because...
> 
> The vote will open at least for 72 hours and only votes from the Incubator 
> PMC are binding, but votes from everyone are welcome.
> 
> Chris

[VOTE] Accept Wayang into the Apache Incubator

2020-12-11 Thread Christofer Dutz
Hi all,

Following up on the [DISCUSS] thread on Wayang 
(https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E),
I would like to call a VOTE to accept Wayang (aka Rheem) into the Apache 
Incubator.

Please cast your vote:

  [ ] +1, bring Wayang into the Incubator
  [ ] +0, I don't care either way
  [ ] -1, do not bring Wayang into the Incubator, because...

The vote will be open for at least 72 hours. Only votes from the Incubator PMC 
are binding, but votes from everyone are welcome.

Chris

-

Wayang Proposal 
(https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal)

== Abstract ==

Wayang is a cross-platform data processing system that aims at decoupling the 
business logic of data analytics applications from concrete data processing 
platforms, such as Apache Flink or Apache Spark. Hence, it tames the complexity 
that arises from the "Cambrian explosion" of novel data processing platforms 
that we currently witness.

Note that the Wayang project is the Rheem project; we have renamed the project 
because of trademark issues.

You can find the project web page at: https://rheem-ecosystem.github.io/

= Proposal =

Wayang is a cross-platform system that provides an abstraction over data 
processing platforms to free users from the burdens of (i) performing tedious 
and costly data migration and integration tasks to run their applications, and 
(ii) choosing the right data processing platforms for their applications. To 
achieve this, Wayang: (1) provides an abstraction on top of existing data 
processing platforms that allows users to specify their data analytics tasks in 
the form of a DAG of operators; (2) comes with a cross-platform optimizer that 
automates the selection of suitable/efficient platforms; and (3) takes care of 
executing the optimized plan, including communication across platforms. In 
summary, Wayang has the following salient features:

- Flexible Data Model - It uses a flexible and simple data model based on 
data quanta. A data quantum is the atomic processing unit in the system and 
can represent a large spectrum of data formats, such as data points for a 
machine learning application, tuples for a database application, or RDF 
triples. Hence, Wayang is able to express a wide range of data analytics tasks.
- Platform independence - It provides a simple interface (currently Java and 
Scala) that is inspired by established programming models, such as those of 
Apache Spark and Apache Flink. Users represent their data analytics tasks as a 
DAG (Wayang plan), where vertices correspond to Wayang operators and edges 
represent data flows (data quanta flowing) among these operators. A Wayang 
operator defines a particular kind of data transformation over an input data 
quantum, ranging from basic functionality (e.g., transformations, filters, 
joins) to complex, extensible tasks (e.g., PageRank).
- Cross-platform execution - Besides running a data analytics task on any data 
processing platform, it also comes with an optimizer that can decide to execute 
a single data analytics task using multiple data processing platforms. This 
allows for exploiting the capabilities of different data processing platforms 
to perform complex data analytics tasks more efficiently.
- Self-tuning UDF-based cost model - Its optimizer uses a cost model fully based 
on UDFs. This not only enables Wayang to learn the cost functions of newly 
added data processing platforms, but also allows developers to tune the 
optimizer at will.
- Extensibility - It treats data processing platforms as plugins, allowing users 
(developers) to easily incorporate new data processing platforms into the 
system. This is achieved by exposing the functionality of data processing 
platforms as operators (execution operators). The same approach is followed at 
the Wayang interface, where users can also easily extend Wayang's capabilities, 
i.e., the operators.
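
To make the feature list above more concrete, here is a minimal Python sketch of the two central ideas: a plan expressed as a chain of operators (the simplest DAG) and a per-operator platform choice driven by user-defined cost functions. All names (Operator, choose_platform, execute, COSTS) and the cost formulas are assumptions made up for this illustration; they are not the Wayang API. The sketch also picks platforms at run time from actual cardinalities, whereas an optimizer would normally work from estimates before execution.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical illustration only: none of these names come from Wayang's API.


@dataclass
class Operator:
    name: str
    fn: Callable[[list], list]                 # transformation over data quanta
    costs: Dict[str, Callable[[int], float]]   # platform -> cost UDF(cardinality)
    upstream: Optional["Operator"] = None      # edge of the (here linear) plan


def choose_platform(op: Operator, cardinality: int) -> str:
    """Return the platform whose cost UDF gives the lowest estimate."""
    return min(op.costs, key=lambda platform: op.costs[platform](cardinality))


def execute(op: Operator, source: list, trace: List[Tuple[str, str]]) -> list:
    """Evaluate the plan ending at `op`; log (operator, platform) in `trace`."""
    data = source if op.upstream is None else execute(op.upstream, source, trace)
    trace.append((op.name, choose_platform(op, len(data))))
    return op.fn(data)  # everything runs locally; the platform is only recorded


# Assumed cost UDFs: "java" has no start-up cost, "spark" is cheaper per quantum.
COSTS = {"java": lambda n: 1.0 * n, "spark": lambda n: 50.0 + 0.1 * n}

keep_hundreds = Operator("filter", lambda xs: [x for x in xs if x % 100 == 0], COSTS)
square = Operator("map", lambda xs: [x * x for x in xs], COSTS, upstream=keep_hundreds)

trace: List[Tuple[str, str]] = []
result = execute(square, list(range(1000)), trace)
print(trace)
```

With these assumed cost functions, the filter over 1000 quanta is assigned to "spark" and the map over the 10 surviving quanta to "java", which is the kind of mixed, per-operator assignment the cross-platform execution bullet describes. Swapping in different cost UDFs changes the assignment without touching the plan.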

We plan to work on the stability of all these features as well as on extending 
Wayang with more advanced features. Furthermore, Wayang currently supports 
Apache Spark, Standalone Java, GraphChi, and relational databases (via JDBC). 
We plan to incorporate more data processing platforms, such as Apache Flink 
and Apache Hive.

=== Background ===

Many organizations and companies collect or produce a large variety of data 
and apply data analytics to it. This is because insights from data allow them 
to make better decisions rapidly. Thus, the pursuit of efficient and scalable 
data analytics, together with the one-size-does-not-fit-all philosophy, has 
given rise to a plethora of data processing platforms. Examples of these 
specialized processing platforms range from DBMSs to MapReduce-like platforms.

However, today's data analytics are moving beyond the limits of a single data 
processing platform. More and more applications need to perform complex data 
analytics over several data 

Re: [DISCUSS] Wayang Proposal

2020-12-11 Thread Rodrigo Pardo Meza
Hi Jean,

Nice to meet you. Thanks for supporting the Wayang proposal.

You have my +1 to join the mentoring team.

Best regards!

On Fri, 11 Dec 2020 at 12:44, bertty contreras (<berttycontre...@gmail.com>) wrote:

> Hi JB,
>
> It would be great to have you join Wayang; you have my +1.
>
> Best Regards,
> Bertty
>
> On Wed, 9 Dec 2020 at 11:02, Alexander Alten-Lorenz wrote:
>
> > resent to the ML since Exim does not accept MIME Content-Type
> > 'text/html' (#5.2.3)
> >
> > Hey JB,
> >
> > Wayang is an intelligent scheduler that leverages multiple frameworks
> > and performs spills / tasks of an RRD on its own when this speeds
> > up the processing. We see ourselves in the middle of Apache Spark, Apache
> > Beam and Apache Nemo. Personally, I see us closer to Apache Flink, but
> > with the major advantages described before. Our focus is more on AI /
> > ML and the productionizing of models, helping data engineers get
> > the job done with the right tool in less time, instead of always using
> > the same one.
> >
> > For mentoring, you get my +1. It would be great to have you in the
> > mentoring team!
> >
> > Best,
> > --alex
> >
> >
> > > From: Jean-Baptiste Onofre
> > > Sent: Wednesday, December 9, 2020 2:48 PM
> > > To: general@incubator.apache.org
> > > Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza
> > > Subject: Re: [DISCUSS] Wayang Proposal
> > >
> > > Hi,
> > >
> > > As it seems a bit close to Apache Beam, I would be interested in
> > > following and/or mentoring the project.
> > >
> > > Please let me know if you want an additional mentor on this podling.
> > >
> > > By the way, what is Wayang's position regarding Apache Beam and Apache
> > > Nemo?
> > >
> > > Regards
> > > JB
> > >
> > >
> > >
> > > > Le 6 déc. 2020 à 12:13, Christofer Dutz 
> a
> > écrit :
> > >
> > > >
> > >
> > > > Hi and welcome on this new communication channel.
> > >
> > > >
> > >
> > > > I'd like to add, that I really liked how the project approached me
> and
> > how they reacted when I told them the old project name Rheem would
> probably
> > not go down well and that I would suggest changing the name.
> > >
> > > >
> > >
> > > > I gave them some tips what things are important for Apache and what
> we
> > lay emphasis on and the team started brainstorming and doing all the
> > trademark checks, taking options off the list where problems could be
> > expected.
> > >
> > > >
> > >
> > > > So, upon acceptance here, the project would do the renaming together
> > with the groupId and package changing, which would be required anyway.
> > >
> > > >
> > >
> > > > > Right now, there are 3 people signed up as mentors; however, knowing
> > > > > that not all mentors are available at any given time, it would be
> > > > > cool if we could find 1-2 more to help out.
> > >
> > > >
> > >
> > > >
> > >
> > > > Chris
> > >
> > > >
> > >
> > > >
> > >
> > > >
> > >
> > > > > -Original Message-
> > > >
> > > > > From: Bertty Contreras
> > > >
> > > > > Sent: Friday, 4 December 2020 17:16
> > > >
> > > > > To: general@incubator.apache.org
> > > >
> > > > > Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza <rodr...@scalytics.io>
> > > >
> > > > > Subject: [DISCUSS] Wayang Proposal
> > >
> > > >
> > >
> > > > Dear all,
> > >
> > > >
> > >
> > > > > Rodrigo and Bertty, Senior Software Engineers at Scalytics, are two
> > > > > of the main developers of the Wayang system. We are writing to bring
> > > > > to your attention our Proposal for Incubation in the Apache
> > > > > foundation at
> > > > > https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal
> > >
> > > >
> > >
> > > > Please note that Wayang was named Rheem before, but we had to rename
> > the project because of 

Re: [DISCUSS] Wayang Proposal

2020-12-11 Thread bertty contreras
Hi JB,

It would be great to have you join Wayang; you have my +1.

Best Regards,
Bertty

On Wed, 9 Dec 2020 at 11:02, Alexander Alten-Lorenz wrote:

> resent to the ML since Exim does not accept MIME Content-Type
> 'text/html' (#5.2.3)
>
> Hey JB,
>
> Wayang is an intelligent scheduler that leverages multiple frameworks
> and handles spills / tasks of an RDD by itself when this speeds up
> the processing. We see ourselves in the middle of Apache Spark, Apache
> Beam and Apache Nemo. Personally, I see us closer to Apache Flink, but
> with major advantages as described before. Our focus is more on AI /
> ML and the productionizing of models, helping data engineers get the
> job done in less time with the right tool instead of always using the
> same one.
>
> For mentoring, you get my +1. Would be great to have you on the
> mentoring team!
>
> Best,
> --alex
>
>
> On Wed, Dec 9, 2020 at 2:58 PM Alexander Alten-Lorenz 
> wrote:
> >
> > Hey JB,
> >
> >
> >
> > Wayang is an intelligent scheduler that leverages multiple frameworks
> > and handles spills / tasks of an RDD by itself when this speeds up
> > the processing. We see ourselves in the middle of Apache Spark, Apache
> > Beam and Apache Nemo. Personally, I see us closer to Apache Flink, but
> > with major advantages as described before. Our focus is more on AI /
> > ML and the productionizing of models, helping data engineers get the
> > job done in less time with the right tool instead of always using the
> > same one.
> >
> >
> >
> > For mentoring, you get my +1. Would be great to have you on the
> > mentoring team!
> >
> >
> >
> > Best,
> >
> > --alex
> >
> >
> >
> > --
> >
> > Alexander Alten - Lorenz
> >
> > Chief Technology Officer
> >
> > Scalytics - Code once. Deploy on any cloud
> >
> >
> >
> > m:   alexan...@scalytics.io
> >
> > ln:   www.linkedin.com/in/alexanderalten/
> >
> > m:   +49 151 156 81561
> >
> >
> >
> > Scalytics, Inc.
> >
> > Web: scalytics.io
> >
> >
> >
> >
> >
> > From: Jean-Baptiste Onofre
> > Sent: Wednesday, December 9, 2020 2:48 PM
> > To: general@incubator.apache.org
> > Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza
> > Subject: Re: [DISCUSS] Wayang Proposal
> >
> >
> >
> > Hi,
> >
> >
> >
> > As it seems a bit close to Apache Beam, I would be interested in
> > following and/or mentoring the project.
> >
> >
> >
> > Please let me know if you want an additional mentor on this podling.
> >
> >
> >
> > By the way, what’s Wayang’s position regarding Apache Beam and
> > Apache Nemo?
> >
> >
> >
> > Regards
> >
> > JB
> >
> >
> >
> > > On 6 Dec 2020, at 12:13, Christofer Dutz wrote:
> >
> > >
> >
> > > Hi and welcome on this new communication channel.
> >
> > >
> >
> > > I'd like to add that I really liked how the project approached me
> > > and how they reacted when I told them the old project name Rheem
> > > would probably not go down well and that I would suggest changing
> > > the name.
> >
> > >
> >
> > > I gave them some tips on what is important for Apache and what we
> > > emphasize, and the team started brainstorming and doing all the
> > > trademark checks, taking options off the list where problems could
> > > be expected.
> >
> > >
> >
> > > So, upon acceptance here, the project would do the renaming together
> > > with the groupId and package changes, which would be required anyway.
> >
> > >
> >
> > > Right now, there are 3 people signed up as mentors; however, knowing
> > > that not all mentors are available at any given time, it would be
> > > cool if we could find 1-2 more to help out.
> >
> > >
> >
> > >
> >
> > > Chris
> >
> > >
> >
> > >
> >
> > >
> >
> > > -Original Message-
> >
> > > From: Bertty Contreras
> >
> > > Sent: Friday, 4 December 2020 17:16
> >
> > > To: general@incubator.apache.org
> >
> > > Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza <rodr...@scalytics.io>
> >
> > > Subject: [DISCUSS] Wayang Proposal
> >
> > >
> >
> > > Dear all,
> >
> > >
> >
> > > Rodrigo and Bertty, Senior Software Engineers at Scalytics, are two
> > > of the main developers of the Wayang system. We are writing to bring
> > > to your attention our Proposal for Incubation in the Apache
> > > foundation at
> > > https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal
> >
> > >
> >
> > > Please note that Wayang was previously named Rheem, but we had to
> > > rename the project because of trademark issues.
> >
> > > You can find the original project web page at:
> >
> > > https://rheem-ecosystem.github.io/
> >
> > >
> >
> > > Best,
> >
> > > Bertty
> >
> > >
> >
> > > -
> >
> > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >
> > > For additional commands, e-mail: general-h...@incubator.apache.org
> >
> >
> >
> >
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: 

Re: [DISCUSS] Wayang Proposal

2020-12-11 Thread jorge Arnulfo Quiané Ruiz
Hello Jean,

Nice to e-meet you.
It is great to hear that you are interested in the Wayang proposal.
Absolutely, we would be more than happy to have you as a mentor +1

Best,
Jorge

On Wed, Dec 9, 2020 at 3:03 PM Jean-Baptiste Onofre  wrote:

> Hi Alex,
>
> Thanks for your update. It makes sense, and Wayang actually looks like a
> great "extension" to Spark/Beam/Nemo.
>
> Happy to be a mentor if you want me! ;)
>
> Thanks again,
> Regards
> JB
>
> > On 9 Dec 2020, at 14:58, Alexander Alten-Lorenz wrote:
> >
> > Hey JB,
> >
> > Wayang is an intelligent scheduler that leverages multiple frameworks
> > and handles spills / tasks of an RDD by itself when this speeds up
> > the processing. We see ourselves in the middle of Apache Spark, Apache
> > Beam and Apache Nemo. Personally, I see us closer to Apache Flink, but
> > with major advantages as described before. Our focus is more on AI /
> > ML and the productionizing of models, helping data engineers get the
> > job done in less time with the right tool instead of always using the
> > same one.
> >
> > For mentoring, you get my +1. Would be great to have you on the
> > mentoring team!
> >
> > Best,
> > --alex
> >
> > --
> > Alexander Alten - Lorenz
> > Chief Technology Officer
> > Scalytics - Code once. Deploy on any cloud
> >
> > m:   alexan...@scalytics.io 
> > ln:   www.linkedin.com/in/alexanderalten/
> > m:   +49 151 156 81561
> >
> > Scalytics, Inc.
> > Web: scalytics.io 
> >
> >
> > From: Jean-Baptiste Onofre 
> > Sent: Wednesday, December 9, 2020 2:48 PM
> > To: general@incubator.apache.org 
> > Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza
> > Subject: Re: [DISCUSS] Wayang Proposal
> >
> > Hi,
> >
> > As it seems a bit close to Apache Beam, I would be interested in
> > following and/or mentoring the project.
> >
> > Please let me know if you want an additional mentor on this podling.
> >
> > By the way, what’s Wayang’s position regarding Apache Beam and
> > Apache Nemo?
> >
> > Regards
> > JB
> >
> > > On 6 Dec 2020, at 12:13, Christofer Dutz wrote:
> > >
> > > Hi and welcome on this new communication channel.
> > >
> > > I'd like to add that I really liked how the project approached me
> > > and how they reacted when I told them the old project name Rheem
> > > would probably not go down well and that I would suggest changing
> > > the name.
> > >
> > > I gave them some tips on what is important for Apache and what we
> > > emphasize, and the team started brainstorming and doing all the
> > > trademark checks, taking options off the list where problems could
> > > be expected.
> > >
> > > So, upon acceptance here, the project would do the renaming together
> > > with the groupId and package changes, which would be required anyway.
> > >
> > > Right now, there are 3 people signed up as mentors; however, knowing
> > > that not all mentors are available at any given time, it would be
> > > cool if we could find 1-2 more to help out.
> > >
> > >
> > > Chris
> > >
> > >
> > >
> > > -Original Message-
> > > From: Bertty Contreras
> > > Sent: Friday, 4 December 2020 17:16
> > > To: general@incubator.apache.org
> > > Cc: Alexander Alten-Lorenz; Rodrigo Pardo Meza <rodr...@scalytics.io>
> > > Subject: [DISCUSS] Wayang Proposal
> > >
> > > Dear all,
> > >
> > > Rodrigo and Bertty, Senior Software Engineers at Scalytics, are two
> > > of the main developers of the Wayang system. We are writing to bring
> > > to your attention our Proposal for Incubation in the Apache
> > > foundation at
> > > https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal
> > >
> > > Please note that Wayang was previously named Rheem, but we had to
> > > rename the project because of trademark issues.
> > > You can find the original project web page at:
> > > https://rheem-ecosystem.github.io/
> > >
> > > Best,
> > > Bertty
> > >
> > > -
> > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > For additional commands, e-mail: general-h...@incubator.apache.org
>
>