Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-16 Thread Aditya Sharma
+1 (non-binding)

Thanks and regards,
Aditya Sharma


On Fri, 11 Dec 2020 at 22:03, Christofer Dutz 
wrote:

> Hi all,
>
> following up the [DISCUSS] thread on Wayang (
> https://lists.apache.org/thread.html/r5fc03ae014f44c7c31a509a6db4ac07faedb2e1c6245cd917b744826%40%3Cgeneral.incubator.apache.org%3E)
> I would like to call a VOTE to accept Wayang (aka Rheem) into the Apache
> Incubator.
>
> Please cast your vote:
>
>   [ ] +1, bring Wayang into the Incubator
>   [ ] +0, I don't care either way
>   [ ] -1, do not bring Wayang into the Incubator, because...
>
> The vote will be open for at least 72 hours. Only votes from the Incubator
> PMC are binding, but votes from everyone are welcome.
>
> Chris
>
> -
>
> Wayang Proposal (
> https://cwiki.apache.org/confluence/display/INCUBATOR/WayangProposal)
>
> == Abstract ==
>
> Wayang is a cross-platform data processing system that aims at decoupling
> the business logic of data analytics applications from concrete data
> processing platforms, such as Apache Flink or Apache Spark. Hence, it tames
> the complexity that arises from the "Cambrian explosion" of novel data
> processing platforms that we currently witness.
>
> Note that the Wayang project is the Rheem project; we have renamed the
> project because of trademark issues.
>
> You can find the project web page at: https://rheem-ecosystem.github.io/
>
> = Proposal =
>
> Wayang is a cross-platform system that provides an abstraction over data
> processing platforms to free users from the burdens of (i) performing
> tedious and costly data migration and integration tasks to run their
> applications, and (ii) choosing the right data processing platforms for
> their applications. To achieve this, Wayang: (1) provides an abstraction on
> top of existing data processing platforms that allows users to specify
> their data analytics tasks in the form of a DAG of operators; (2) comes with
> a cross-platform optimizer for automating the selection of
> suitable/efficient platforms; and (3) finally takes care of executing
> the optimized plan, including communication across platforms. In summary,
> Wayang has the following salient features:
>
> - Flexible Data Model - It considers a flexible and simple data model
> based on data quanta. A data quantum is an atomic processing unit in the
> system, that can represent a large spectrum of data formats, such as data
> points for a machine learning application, tuples for a database
> application, or RDF triples. Hence, Wayang is able to express a wide range
> of data analytics tasks.
> - Platform independence - It provides a simple interface (currently Java
> and Scala) that is inspired by established programming models, such as that
> of Apache Spark and Apache Flink. Users represent their data analytic tasks
> as a DAG (Wayang plan), where vertices correspond to Wayang operators and
> edges represent data flows (data quanta flowing) among these operators. A
> Wayang operator defines a particular kind of data transformation over an
> input data quantum, ranging from basic functionality (e.g.,
> transformations, filters, joins) to complex, extensible tasks (e.g.,
> PageRank).
> - Cross-platform execution - Besides running a data analytic task on any
> data processing platform, it also comes with an optimizer that can decide
> to execute a single data analytic task using multiple data processing
> platforms. This allows for exploiting the capabilities of different data
> processing platforms to perform complex data analytic tasks more
> efficiently.
> - Self-tuning UDF-based cost model - Its optimizer uses a cost model fully
> based on UDFs. This not only enables Wayang to learn the cost functions of
> newly added data processing platforms, but also allows developers to tune
> the optimizer at will.
> - Extensibility - It treats data processing platforms as plugins to allow
> users (developers) to easily incorporate new data processing platforms into
> the system. This is achieved by exposing the functionalities of data
> processing platforms as operators (execution operators). The same approach
> is followed at the Wayang interface, where users can also extend Wayang
> capabilities, i.e., the operators, easily.
>
> We plan to work on the stability of all these features as well as
> extending Wayang with more advanced features. Furthermore, Wayang currently
> supports Apache Spark, Standalone Java, GraphChi, and relational databases
> (via JDBC). We plan to incorporate more data processing platforms, such as
> Apache Flink and Apache Hive.
>
> === Background ===
>
> Many organizations and companies collect or produce a large variety of data
> to apply data analytics over it. This is because insights from data
> allow them to rapidly make better decisions. Thus, the pursuit of
> efficient and scalable data analytics as well as the
> one-size-does-not-fit-all philosophy has given rise to a plethora of data
> processing platforms. Examples of these specialized processing 

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-16 Thread Paul King
+1 (binding)




[RESULT] [VOTE] Accept Wayang into the Apache Incubator

2020-12-16 Thread Christofer Dutz
So the vote passes:

8 binding +1 votes (all nominated mentors voted)
3 non-binding +1 votes

No 0 or -1 votes

https://lists.apache.org/thread.html/ra8e804fadca573af0f053df56686c98dbeaf1e8d1cba7228b16a637e%40%3Cgeneral.incubator.apache.org%3E

Binding:
Dave Fisher
Kevin Ratnasekera
Furkan KAMACI
Byung-Gon Chun
Sheng Wu
Lars George
Jean-Baptiste Onofre
Bernd Fondermann

Non-Binding:
Alexander Alten-Lorenz
Daniel B. Widdis
Xiangdong Huang


-Original Message-
From: Xiangdong Huang 
Sent: Monday, 14 December 2020 07:28
To: general@incubator.apache.org
Subject: Re: [VOTE] Accept Wayang into the Apache Incubator

+1 (non-binding)
---
Xiangdong Huang
School of Software, Tsinghua University


On Sat, Dec 12, 2020 at 8:19 PM Lars George wrote:

> +1 binding
>
> On Sat, Dec 12, 2020 at 2:24 AM Sheng Wu 
> wrote:
>
> > +1 binding
> >
> > Sheng Wu 吴晟
> > Twitter, wusheng1108
> >
> >
> > On Sat, Dec 12, 2020 at 5:59 AM Byung-Gon Chun wrote:
> >
> > > +1 (binding)
> > >
> > > -Gon
> > >
> > > On Sat, Dec 12, 2020 at 2:35 AM Furkan KAMACI wrote:
> > >
> > > > Hi,
> > > >
> > > > +1 (binding)
> > > >
> > > > Kind Regards,
> > > > Furkan KAMACI
> > > >
> > > > On Fri, 11 Dec 2020 at 20:04, Daniel B. Widdis wrote:
> > > >
> > > > > +1 (non-binding).  I'm interested in getting involved in this project!

Re: [VOTE] Accept Wayang into the Apache Incubator

2020-12-16 Thread Bernd Fondermann
+1

  Bernd
