Re: [VOTE] Superset Proposal for Apache Incubator
+1 (non-binding). Thanks Naresh Agarwal On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <ted.dunn...@gmail.com> wrote: > +1 (binding) > > > > On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <joe.w...@gmail.com> wrote: > > > +1 (binding) > > > > On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey > > <jiten...@hortonworks.com> wrote: > > > +1 (binding) > > > > > > On 4/25/17, 1:27 PM, "Julian Hyde" <jh...@apache.org> wrote: > > > > > > +1 binding > > > > > > > On Apr 25, 2017, at 12:48 PM, moon soo Lee <m...@apache.org> > > wrote: > > > > > > > > +1 (non-binding) > > > > > > > > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan < > > hashut...@apache.org> > > > > wrote: > > > > > > > >> +1 (binding) > > > >> > > > >> Thanks, > > > >> Ashutosh > > > >> > > > >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <luke...@gmail.com> > > wrote: > > > >> > > > >>> +1 binding > > > >>> > > > >>> Love to see Superset to be new incubator project. > > > >>> > > > >>> > > > >>> Best Regards! > > > >>> - > > > >>> > > > >>> Luke Han > > > >>> > > > >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng < > jeff.f...@gmail.com> > > wrote: > > > >>> > > > >>>> Dear Apache Incubator Community, > > > >>>> > > > >>>> We have updated the Superset proposal > > > >>>> <https://wiki.apache.org/incubator/SupersetProposal> (copied > > below) for > > > >>>> > > > >>>> Apache Incubation with an additional mentor (Luke Han - > > > >>>> luke@apache.org), > > > >>>> and would like to start a vote thread for acceptance into the > > incubator. > > > >>>> > > > >>>> Our team is excited to share Superset with the Apache > community > > and we > > > >>>> hope > > > >>>> for the your continued support! > > > >>>> > > > >>>> Cheers, > > > >>>> Jeff & the Superset Team > > > >>>> > > > >>>> > > > >>>> > > > >>>> > > > >>>> = Superset = > > > >>>> > > > >>>> == Abstract == > > > >>>> Superset is an enterprise-ready web application for data > > exploration, > > > >> data > > > >>>> visualization and dashboarding. 
> > > >>>> > > > >>>> == Proposal == > > > >>>> Superset is business intelligence (BI) software that helps > > modern > > > >>>> organizations visualize and interact with their data. Superset > > enables > > > >>>> users explore data from a variety of databases, assemble > > beautiful > > > >>>> dashboards and share their findings. Superset works neatly > > with all > > > >>>> modern > > > >>>> SQL-speaking databases, and integrates with Druid.io to > provide > > > >> real-time, > > > >>>> interactive, blazing fast data access to large datasets. > > > >>>> > > > >>>> == Background == > > > >>>> Data is mission critical. To succeed in this era, > organizations > > need to > > > >>>> provide low-friction, intuitive and interactive access to > data. > > It is > > > >>>> paramount for knowledge workers to be capable of answering > > their own > > > >>>> questions by querying, exploring and visualizing data. > > > >>>> > > > >>>> The entire business intelligence industry has pivoted from a > > model of > > > >>>> centralized top-down platforms driven by IT organizations to > > > >
Re: [VOTE] Gobblin to enter Apache Incubator
+1 (non-binding). We have used Gobblin in the past and found it quite powerful. Looking forward to an exciting roadmap ahead for Gobblin.

Thanks
Naresh

On Wed, Feb 22, 2017 at 12:31 AM, Abhishek Tiwari <abhishektiwari.bt...@gmail.com> wrote:

> +1 (non-binding)
>
> Abhishek
>
> On Sun, Feb 19, 2017 at 3:01 PM, Olivier Lamy wrote:
>
> > Hi Craig
> >
> > On 20 February 2017 at 00:23, Craig Russell wrote:
> >
> > > Hi Olivier,
> > >
> > > Can you also post the link to this proposal on
> > > https://wiki.apache.org/incubator/ProjectProposals ?
> > >
> >
> > Sure. done.
> >
> > Olivier
> >
> > > Thanks,
> > >
> > > Craig
> > >
> > > > On Feb 16, 2017, at 9:33 PM, Olivier Lamy wrote:
> > > >
> > > > Hello Everyone,
> > > > I would like to call a vote for accepting "Gobblin" for incubation in the
> > > > Apache Incubator.
> > > > The full proposal is available below, and is also available in the wiki:
> > > > https://wiki.apache.org/incubator/GobblinProposal
> > >
> > > Craig L Russell
> > > Secretary, Apache Software Foundation
> > > c...@apache.org http://db.apache.org/jdo
> > >
> > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > For additional commands, e-mail: general-h...@incubator.apache.org
> >
> > --
> > Olivier Lamy
> > http://twitter.com/olamy | http://linkedin.com/in/olamy
Re: [VOTE] Accept DistributedLog into the Apache Incubator
+1 (non-binding) On Tue, Jun 21, 2016 at 11:26 AM, Jia Zhai wrote: > +1 > > From: Sijie Guo > > Date: Mon, Jun 20, 2016 at 10:11 PM > > Subject: [VOTE] Accept DistributedLog into the Apache Incubator > > To: general@incubator.apache.org > > > > > > Hello All, > > > > Following the discussion thread, I would like to call a VOTE on accepting > > DistributedLog into the Apache Incubator. > > > > [] +1 Accept DistributedLog into the Apache Incubator > > [] +0 Abstain. > > [] -1 Do not accept DistributedLog into the Apache Incubator because ... > > > > This vote will be open for at least 72 hours. > > > > The proposal follows, you can also access the wiki page: > > https://wiki.apache.org/incubator/DistributedLogProposal > > > > Here is my +1. > > > > Thanks, > > Sijie > > > > = Abstract = > > DistributedLog is a high-performance replicated log service. It offers > > durability, replication and strong consistency, which provides a > > fundamental building block for building reliable distributed systems, e.g. > > replicated-state-machines, general pub/sub systems, distributed databases, > > distributed queues, etc. > > > > See “Building Distributedlog - Twitter’s high performance replicated log > > service” for details: > > https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service > > > > = Proposal = > > We propose to contribute DistributedLog codebase and associated artifacts > > (e.g. documentation, web-site content etc.) to the Apache Software > > Foundation with the intent of forming a productive, meritocratic and open > > community around DistributedLog’s continued development, according to the > > ‘Apache Way’. > > > > = Background = > > Engineers at Twitter began developing DistributedLog in early 2013. > > DistributedLog is described in a Twitter engineering blog post and > > presented at the Messaging Meetup in Sep 2015. It was released as an > > Apache-licensed open-source project on GitHub in May 2016.
> > > > DistributedLog is a high-performance replicated log service, which > > provides simple stream-oriented abstractions over log-segments and offers > > durability, replication and strong consistency for building reliable > > distributed systems. The features offered by DistributedLog includes: > > > > * Simple high-level, stream oriented interface > > * Naming and metadata scheme for managing streams and other entities > > * Log data management policies, include data segmentation and data > > retention > > * Fast write pipeline leveraging batching and compression > > * Fast read mechanism leveraging long-poll and read-ahead caching > > * Service tiers supporting writer fan-in and reader fan-out > > * Geo-replicated logs > > > > DistributedLog’s most important benefit is high-performance with a strong > > durability guarantee, making it extremely appropriate for running > different > > workloads from distributed database journaling to real-time stream > > computing. Its modern, layered architecture makes it easy to run the > > service tiers in multi-tenant datacenter environments such as Apache > Mesos > > or cloud environments such as EC2. > > > > = Rationale = > > DistributedLog is designed to provide core fundamental features like > > high-performance, durability and strong consistency to anyone who is > > building reliable distributed systems, in a simple and efficient way. > > > > We believe that the ASF is the right venue to foster an open-source > > community around DistributedLog’s development. We expect that > > DistributedLog will benefit from collaboration with related Apache > > projects, and under the auspices of the ASF will attract talented > > contributors who will push DistributedLog’s development forward at a > faster > > pace. 
> > > > We believe that the timing is right for DistributedLog’s development to > > move to the ASF: DistributedLog has already run in production at Twitter > > for 3 years and served various workloads including a distributed database > > journal, reliable cross datacenter replication, search ingestion, > > and general pub/sub messaging. The project is stable. We are excited to see > > where an ASF-based community can take DistributedLog. > > > > = Current Status = > > DistributedLog is a stable project that has been used in production at > > Twitter for 3 years. The source code is public at github.com/twitter, > > which will seed the Apache git repository. > > > > = Meritocracy = > > We understand the central importance of meritocracy to the Apache Way. We > > will work to establish a welcoming, fair and meritocratic community. > > Several companies have already expressed interest in this project, and we > > intend to invite additional developers to participate. We look forward to > > growing a rich user and developer community. > > > > = Community = > > There is a large need for a performant replicated log service for > > applications such as
Re: [VOTE] Accept Tephra into the Apache Incubator
+1 (non-binding) Looking forward to this project. Thanks Naresh Agarwal On 7 Mar 2016 02:09, <la...@apache.org> wrote: > +1 (binding) > Exciting! > > From: Poorna Chandra <poo...@apache.org> > To: general@incubator.apache.org > Sent: Thursday, March 3, 2016 5:29 PM > Subject: [VOTE] Accept Tephra into the Apache Incubator > > Hi All, > > Tephra proposal was sent out for discussion last week. The proposal is > available at https://wiki.apache.org/incubator/TephraProposal > > Please vote to accept Tephra into the Apache Incubator. The vote will be > open for the next 72 hours. > > [ ] +1 Accept Tephra as an Apache Incubator podling. > [ ] +0 Abstain. > [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ... > > Thanks, > Poorna. > > -- > > = Abstract = > > Tephra is a system for providing globally consistent transactions on > top of Apache HBase and other storage engines. > > = Proposal = > > Tephra is a transaction engine for distributed data stores like Apache > HBase. > It provides ACID semantics for concurrent data operations that span over > region > boundaries in HBase using Optimistic Concurrency Control. > > = Background = > > HBase provides strong consistency with row- or region-level ACID > operations. However, it sacrifices cross-region and cross-table > consistency in favor of scalability. This trade-off requires application > developers to handle the complexity of ensuring consistency when their > modifications span region boundaries. By providing support for global > transactions that span regions, tables, or multiple RPCs, > Tephra simplifies application development on top of HBase, without a > significant impact on performance or scalability for many workloads. > > Tephra leverages HBase’s native data versioning to provide multi-versioned > concurrency control (MVCC) for transactional reads and writes. 
> With MVCC capability, each transaction sees its own consistent “snapshot” > of > data, providing snapshot isolation of concurrent transactions. > MVCC along with conflict detection and handling enables Optimistic > Concurrency > Control. > > Tephra consists of three main components: > * Transaction Server – maintains global view of transaction state, assigns > new transaction IDs and performs conflict detection; > * Transaction Client – coordinates start, commit, and rollback of > transactions; and > * Transaction Processor Coprocessor – applies filtering to the data read > (based > on a given transaction’s state) and cleans up any data from old > (no longer visible) transactions. > > Although Tephra only supports HBase now, it can be extended to support > transactions on any store that has multi-versioning and rollback > support. The transactions > can span over multiple stores and storage paradigms. > > = Rationale = > > Tephra has simple abstractions which can be used by an application to > add transaction support over HBase. By abstracting away transaction > handling using Tephra, the application is freed of > transaction logic, and the application developer can focus on the use case. > Also, Tephra can be extended to support transactions on data sources other > than HBase. > > By making Tephra an Apache open source project, we believe that there will > be wider adoption and more opportunities for Tephra to be integrated > into other Apache projects. > > = Current Status = > > Tephra was built at Cask Data Inc. initially as part of > open-source framework Cask Data Application Platform (CDAP) > [[http://cdap.io/]]. > It was later converted into an independent open source project with > Apache 2.0 License [[https://github.com/caskdata/tephra]]. > > Tephra is used in CDAP as the transaction engine. As part of CDAP, Tephra > has been deployed at multiple companies. > > Apache Phoenix is using Tephra as transaction engine in the next release. 
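For illustration only, the snapshot isolation and conflict detection described in the Tephra proposal above can be sketched as a toy in-memory model in Python. Every name here (`ToyTransactionManager`, `ConflictError`, `start_seq`) is hypothetical and is not Tephra's API. The sketch also makes a simplifying assumption: writes are buffered in the transaction until commit, whereas the proposal implies that versions are written to the store and rolled back on abort.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when a concurrent, already-committed transaction wrote the same key."""


@dataclass
class Transaction:
    start_seq: int                              # last commit visible to this snapshot
    writes: dict = field(default_factory=dict)  # buffered writes (simplification)


class ToyTransactionManager:
    """Hypothetical in-memory stand-in for a transaction server."""

    def __init__(self):
        self.commit_seq = 0   # sequence number of the latest commit
        self.versions = {}    # key -> list of (commit_seq, value), oldest first

    def start(self):
        # The snapshot is defined by the commits that exist right now.
        return Transaction(start_seq=self.commit_seq)

    def read(self, tx, key):
        # A transaction sees its own writes first, then its snapshot.
        if key in tx.writes:
            return tx.writes[key]
        for seq, value in reversed(self.versions.get(key, [])):
            if seq <= tx.start_seq:
                return value
        return None

    def write(self, tx, key, value):
        tx.writes[key] = value

    def commit(self, tx):
        # Optimistic check: abort if any written key gained a newer
        # committed version after this transaction's snapshot was taken.
        for key in tx.writes:
            history = self.versions.get(key, [])
            if history and history[-1][0] > tx.start_seq:
                raise ConflictError(f"write-write conflict on {key!r}")
        self.commit_seq += 1
        for key, value in tx.writes.items():
            self.versions.setdefault(key, []).append((self.commit_seq, value))
```

In this sketch two transactions that start concurrently and write the same key cannot both commit: the first commit wins and the second raises `ConflictError`, while readers never block because each one reads from its own snapshot.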
> > == Meritocracy == > > Our intent with this incubator proposal is to start building a diverse > developer community around Tephra following the Apache meritocracy model. > Since Tephra was initially developed in early 2013, we have had fast > adoption and contributions within Cask Data. We are looking forward to > new contributors. We wish to build a community based on Apache's > meritocracy principles, working with those who contribute significantly to > the project and welcoming them to be committers both during the incubation > process and beyond. > > == Community == > > Core developers of Tephra are at Cask Data. Recently the developer > community > has expanded to include folks from Apache Phoenix. We hope to extend our > contributor base significantly
Re: [VOTE] Accept Beam into the Apache Incubator
+1 (non-binding) Thanks Naresh On 29 Jan 2016 06:18, "Hadrian Zbarcea"wrote: > +1 (binding) > > Man, congrats on a job fantastically well done. This is ASF incubator > participation at its best. > > Expectations are high now. I am looking forward to exemplary governance > and speedy graduation. > > Best of luck, > Hadrian > > On 01/28/2016 09:28 AM, Jean-Baptiste Onofré wrote: > >> Hi, >> >> the Beam proposal (initially Dataflow) was proposed last week. >> >> The complete discussion thread is available here: >> >> >> http://mail-archives.apache.org/mod_mbox/incubator-general/201601.mbox/%3CCA%2B%3DKJmvj4wyosNTXVpnsH8PhS7jEyzkZngc682rGgZ3p28L42Q%40mail.gmail.com%3E >> >> >> As reminder the BeamProposal is here: >> >> https://wiki.apache.org/incubator/BeamProposal >> >> Regarding all the great feedbacks we received on the mailing list, we >> think it's time to call a vote to accept Beam into the Incubator. >> >> Please cast your vote to: >> [] +1 - accept Apache Beam as a new incubating project >> [] 0 - not sure >> [] -1 - do not accept the Apache Beam project (because: ...) >> >> Thanks, >> Regards >> JB >> >> ## page was renamed from DataflowProposal >> = Apache Beam = >> >> == Abstract == >> >> Apache Beam is an open source, unified model and set of >> language-specific SDKs for defining and executing data processing >> workflows, and also data ingestion and integration flows, supporting >> Enterprise Integration Patterns (EIPs) and Domain Specific Languages >> (DSLs). Dataflow pipelines simplify the mechanics of large-scale batch >> and streaming data processing and can run on a number of runtimes like >> Apache Flink, Apache Spark, and Google Cloud Dataflow (a cloud service). >> Beam also brings DSL in different languages, allowing users to easily >> implement their data integration processes. >> >> == Proposal == >> >> Beam is a simple, flexible, and powerful system for distributed data >> processing at any scale. 
Beam provides a unified programming model, a >> software development kit to define and construct data processing >> pipelines, and runners to execute Beam pipelines in several runtime >> engines, like Apache Spark, Apache Flink, or Google Cloud Dataflow. Beam >> can be used for a variety of streaming or batch data processing goals >> including ETL, stream analysis, and aggregate computation. The >> underlying programming model for Beam provides MapReduce-like >> parallelism, combined with support for powerful data windowing, and >> fine-grained correctness control. >> >> == Background == >> >> Beam started as a set of Google projects (Google Cloud Dataflow) focused >> on making data processing easier, faster, and less costly. The Beam >> model is a successor to MapReduce, FlumeJava, and Millwheel inside >> Google and is focused on providing a unified solution for batch and >> stream processing. These projects on which Beam is based have been >> published in several papers made available to the public: >> >> * MapReduce - http://research.google.com/archive/mapreduce.html >> * Dataflow model - http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf >> * FlumeJava - http://research.google.com/pubs/pub35650.html >> * MillWheel - http://research.google.com/pubs/pub41378.html >> >> Beam was designed from the start to provide a portable programming >> layer. When you define a data processing pipeline with the Beam model, >> you are creating a job which is capable of being processed by any number >> of Beam processing engines. Several engines have been developed to run >> Beam pipelines in other open source runtimes, including a Beam runner >> for Apache Flink and Apache Spark. There is also a “direct runner”, for >> execution on the developer machine (mainly for dev/debug purposes). >> Another runner allows a Beam program to run on a managed service, Google >> Cloud Dataflow, in Google Cloud Platform. 
The Dataflow Java SDK is >> already available on GitHub, and independent from the Google Cloud >> Dataflow service. Another Python SDK is currently in active development. >> >> In this proposal, the Beam SDKs, model, and a set of runners will be >> submitted as an OSS project under the ASF. The runners which are a part >> of this proposal include those for Spark (from Cloudera), Flink (from >> data Artisans), and local development (from Google); the Google Cloud >> Dataflow service runner is not included in this proposal. Further >> references to Beam will refer to the Dataflow model, SDKs, and runners >> which are a part of this proposal (Apache Beam) only. The initial >> submission will contain the already-released Java SDK; Google intends to >> submit the Python SDK later in the incubation process. The Google Cloud >> Dataflow service will continue to be one of many runners for Beam, built >> on Google Cloud Platform, to run Beam pipelines. Necessarily, Cloud >> Dataflow will develop against the Apache project additions, updates, and >> changes. Google Cloud
Re: [VOTE] Accept the iota project into the Apache Incubator
+1 (non-binding) Thanks Naresh Agarwal On Mon, Jan 18, 2016 at 12:00 PM, Bruno Mahé <br...@bmahe.net> wrote: > +1 (non-binding) > > On 01/16/2016 12:12 PM, Hadrian Zbarcea wrote: > >> Hi, >> >> The iota proposal [1] (initially Tempo) was proposed about 6 weeks ago. >> >> Because of the naming conflict that would have likely required to change >> name at graduation, the project name was changed to "Apache iota" (the >> greek letter), which resonates better with the IoT field the project >> targets and passed a summary podling name search. >> >> The code was made available in December for our review and answers on the >> general@ list have been answered. >> >> We think it's time to move to the next step, a formal vote. Therefore... >> >> Please cast your vote to: >> [] +1 - accept Apache iota as a new incubating project >> [] 0 - not sure >> [] -1 - do not accept the Apache iota project (because: ...) >> >> Thanks, >> Hadrian >> >> >> [1] https://wiki.apache.org/incubator/IotaProposal >> [2] https://en.wikipedia.org/wiki/Iota >> >> >> - >> >> iota Proposal >> >> Abstract >> >> The Apache Foundation has been very successful in bringing together key >> software components that have enabled people to interact with each other >> via a variety of content platforms and it will no doubt continue to do so. >> At the same time modern society is becoming increasingly dependent on >> devices that interact with each other and with people. The amount of data >> that will be produced by devices will be orders of magnitude greater than >> what has been produced by humans in the past. In addition, the >> orchestration of devices and people will be an important area of growth for >> the foreseeable future. This new dynamic will eventually become manifest in >> a growing number of Apache projects that enable this to occur. Our wish is >> to contribute to this movement by contributing the iota system to the Open >> Source Community via the Apache Foundation. 
Apache iota is an open platform >> to interconnect any and all devices, sensors, people, and applications, >> henceforth referred to as points, through a scalable, secure, and modular >> architecture, enabling applications to generate analysis, create actions >> and/or add intelligence to their behaviors and patterns. >> >> Proposal >> >> Perhaps you are a homeowner configuring the interaction between your >> family and all the smart devices in your home. Or you might be a global >> company orchestrating millions of devices and people across different >> continents. Either way you face the same fundamental problem; namely, how >> do you manage many points in a secure robust and meaningful manner? Apache >> iota is an open source software system that enables homeowners and global >> companies to download a software system that provides secure and robust >> orchestration. >> >> The iota system consists of a variety of components: >> >> A basic but extensible desktop >> An extensible mechanism for capturing data from a variety of sources >> A set of translators that feed the data capture mechanism and a framework >> for the development of additional translators >> A secure means of moving data using digital envelopes based on symmetric >> and asymmetric encryption and decryption via Apache Kafka >> Optionally maintaining data encrypted in a datastore >> Support for a variety of data repositories >> Authentication and authorization using OAuth2 >> Secure APIs for access to data and the system information >> User management >> Device management >> Automated software upgrades via Salt >> Configuration management >> Robust basic infrastructure based on Apache Mesos that enables scalability >> Dockerized applications >> Background >> >> We are in the midst of a revolution in which the Internet of Things (IoT) >> is poised to impact the development of our society in ways we can not even >> begin to imagine. 
Unfortunately, we know of no coherent OSS (Open Source >> Software) solution that can harness the potentialities of this increasingly >> important trend. Manufacturers of IoT devices, both in the consumer and >> industrial spaces, continue to develop proprietary systems. Apache iota is >> an open source IoT system that creates an open source solution enabling the >> orchestration of IoT devices that brings the benefits of OSS to this space. >> Apache iota was initially developed by Litbit and is d
Re: [VOTE] Graduate Apache Kylin from the Apache Incubator
+1 (non-binding) Thanks Naresh On Thu, Oct 22, 2015 at 6:35 AM, Cai, Eddy wrote: > +1 > > -Original Message- > From: Chun, Chad [mailto:wac...@ebay.com] > Sent: Wednesday, October 21, 2015 9:17 AM > To: general@incubator.apache.org > Subject: Re: [VOTE] Graduate Apache Kylin from the Apache Incubator > > +1 > > Chad Chun > wp_c...@hotmail.com > > On 10/16/15, 4:22 PM, "蒋旭" wrote: > > >+1 > >Jiang Xu > >-- Original Message -- > >From: "Henry Saputra" > >Sent: Friday, October 16, 2015, 9:16 AM > >To: "general@incubator.apache.org" > > > >Subject: Re: [VOTE] Graduate Apache Kylin from the Apache Incubator > > > > > > > >+1 (binding) > > > >On Thursday, October 15, 2015, Luke Han wrote: > > > >> The Apache Kylin community and project made significant advances > >>during incubation (since Nov 2014) and believes it is ready to > >>graduate as a top-level project. > >> > >> The Apache Kylin community is very active. The PPMC doubled in size (added 6 > >>committers and 2 mentors) and > >> increased diversity in the past year. Released 3 versions in the past > >>6 months. There were presentations about Apache Kylin at most of the > >>big conferences of the world (including Strata+Hadoop World London, > >>Hadoop Summit San Jose, ApacheCon EU, Big Data Technology China, > >>Database Technology Conference > >> China) and some meetups (Bay Area, > >> Beijing and one is coming this weekend in Shanghai), and many > >>talks around the world. > >> The dev mailing list is growing every month, about 500+ topics per > >>month now. > >> The community created 1000+ JIRA tickets, many patches from > >>contributors/committers have been merged into code base. > >> > >> A vote passed unanimously on the dev@ list (27 +1 votes).
Please find > >> below references to the graduation preparation artifacts: > >> * discussion on dev list [1] > >> * vote thread [2] > >> * podling name search [3] > >> * incubation status [4] > >> * proposed resolution below > >> > >> We believe Apache Kylin is ready to become a top-level project and if > >>the IPMC agree we will move to a formal vote. > >> There are a few more items to be updated on the project status page > >>and others during the next couple of days. > >> > >> > >> Many thanks to the mentors and the IPMC for the support, Luke Han (on > >> behalf of the Apache Kylin PPMC) > >> > >> [1] http://s.apache.org/KylinDisGraduate > >> [2] http://s.apache.org/KylinGraduateVote > >> [3] https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-86 > >> [4] http://incubator.apache.org/projects/kylin.html > >> > >> > >> > >> Apache Kylin top-level project resolution: > >> === > >> > >>WHEREAS, the Board of Directors deems it to be in the best > >>interests of the Foundation and consistent with the > >>Foundation's purpose to establish a Project Management > >>Committee charged with the creation and maintenance of > >>open-source software, for distribution at no charge to the > >>public, relative to distributed and scalable OLAP engine > >> > >>NOW, THEREFORE, BE IT RESOLVED, that a Project Management > >>Committee (PMC), to be known as the "Apache Kylin Project", > >>be and hereby is established pursuant to Bylaws of the > >>Foundation; and be it further > >> > >>RESOLVED, that the Apache Kylin Project be and hereby is > >>responsible for the creation and maintenance of open-source > >>software related to distributed and scalable OLAP engine; > >>and be it further > >> > >>RESOLVED, that the office of "Vice President, Kylin" be and > >>hereby is created, the person holding such office to serve at > >>the direction of the Board of Directors as the chair of the > >>Apache Kylin Project, and to have primary responsibility for > >>management of the projects within 
the scope of responsibility > >>of the Apache Kylin Project; and be it further > >> > >>RESOLVED, that the persons listed immediately below be and > >>hereby are appointed to serve as the initial members of the > >>Apache Kylin Project: > >> > >> * Dayue Gao > >> * Jason Zhong > >> * Julian Hyde > >> * Luke Han > >> * Henry Saputra > >> * Hongbin Ma > >> * Hua Huang > >> * Owen O'Malley > >> * P. Taylor Goetz > >> * Qianhao Zhou > >> * Shaofeng Shi > >> * Song Yi > >> * Ted Dunning > >> * Xu Jiang > >> * Yang Li > >> * Yerui Sun < sunyerui at apache dot org> > >> > >> > >>NOW, THEREFORE, BE IT FURTHER RESOLVED, that Luke Han > >>be appointed to the office of Vice President,Apache Kylin, to > >>serve > >>in accordance with and subject to the
Re: [VOTE] Accept Apex into the Apache Incubator
+1 (non-binding)

Thanks
Naresh

On Fri, Aug 14, 2015 at 11:14 AM, Justin Mclean jus...@classsoftware.com wrote:

+1 binding

--
The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
Re: [VOTE] Accept Myriad into the Apache Incubator
+1

Thanks
Naresh

On 22 Feb 2015 11:14, Mohit Soni mohitsoni1...@gmail.com wrote:

Definitely a +1

Thanks
Mohit

On Sat, Feb 21, 2015 at 9:34 PM, Adam Bordelon a...@mesosphere.io wrote:

Hello friends,

After receiving a positive response on the discussion thread, and even a new Mentor (Luciano), I would like to call a VOTE to accept Myriad into the Apache Incubator. I will end the vote after 7 days.

https://wiki.apache.org/incubator/MyriadProposal?action=recallrev=7

[ ] +1 Accept Myriad into the Incubator
[ ] +0 Don’t care.
[ ] -1 Don’t accept Myriad into the Incubator because..

I am clearly a +1.

Thanks,
-Adam-
me@apache
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Looks interesting. Looking forward to this. Thanks Naresh On Wed, Feb 18, 2015 at 11:08 AM, Henry Saputra henry.sapu...@gmail.com wrote: I love this project and the idea. Tried to hack it couple years ago could not make it work. Looking forward seeing it in ASF incubator for sure. @Adam and @Ted, like any new incubator projects coming we always check if you need user@ so early in the process? Would probably better to have all discussion in dev@ early in incubation. - Henry On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote: Hello friends, The Myriad team and I would like to propose the Myriad project for inclusion in the Apache Incubator. Full text of the proposal is below. I can add it to the incubator wiki as well, if desired. Please review and discuss. If there are no major concerns, I will call for a Vote after a week. Cheers, -Adam- me@apache == Apache Myriad Proposal * Abstract Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. * Proposal The vision of Myriad is to provide a comprehensive framework to ensure Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes on either side and prevent the static fragmentation of data center resources. * Background Project Myriad is the first resource management framework that allows big data developers to run YARN-based Hadoop jobs alongside other applications and services in production. ebay Inc., MapR, and Mesosphere jointly built Myriad (available on Github at https://github.com/mesos/myriad) with the vision of freeing big data jobs from siloed clusters and consolidating infrastructure into a single pool of resources for greater utilization and operational efficiency. Several companies including Twitter have expressed interest in Myriad and have begun testing it. 
* Rationale Many Hadoop users are building larger clusters (data lake/data hub architectures) that support multiple workloads - made possible by the advent of Apache Hadoop YARN. As the clusters grow in size and importance, they become an important application within the broader datacenter. At the same time, Apache Mesos enables efficient resource isolation and sharing across distributed applications for the broader data center, for instance MPI, Spark, long running web services, build/test infrastructure, traditional Linux applications/scripts, and others (including arbitrary Docker images). Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos on the same physical data center resources, reducing fragmentation of data center resources. * Project Goals ** Initial Goals - Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow policy-based allocation of data center resources across Apache Hadoop and other distributed applications - Ensure YARN-based execution frameworks work without any changes when running alongside Myriad. YARN Applications will continue to interact and run on top of YARN and can choose to be unaware of Myriad. - Ensure Mesos-based execution frameworks work without any changes when running alongside Myriad. Mesos applications will continue to interact and run on Mesos and can choose to be unaware of Myriad. - Provide isolation for multi-tenancy. - Use Linux cgroups (and optionally Docker-like technologies to ease packaging, deployment and broader isolation) so that multiple YARN clusters can run in their own space and are isolated from each other. YARN’s RM and NMs are dockerized. - Myriad should be able to manage full YARN lifecycle: - Bring up YARN (RM, NM) - Scale Up/Down YARN - Release resources and shut down YARN ** Longer Term Goals - Allow fine-grained dynamic allocation of resources to Hadoop including the ability to scale up and scale down the cluster.
- Provide different policies to allow downsizing running applications on Hadoop when resources are taken away from it. - Provide a framework so the downsizing policy is pluggable and users can write their own implementations. - Allow multiple versions of Apache Hadoop to run on the same physical infrastructure - Allow workload portability - ability to migrate YARN workloads across various cloud infrastructures seamlessly (e.g. GCE, AWS, etc) - Security: - Authentication Requirements: - Support basic CRAM-MD5 password authentication between Myriad and Mesos. Additional authentication mechanisms may be supported in the future. - Traditional user authentication with Hadoop’s HTTP web-consoles should work as usual. - Authorization: - Only authorized users are allowed to launch YARN
Re: [VOTE] Accept Zeppelin into the Apache Incubator
+1 (non-binding) Thanks Naresh On Fri, Dec 19, 2014 at 4:32 PM, Fabian Hueske fhue...@apache.org wrote: +1 (non-binding) 2014-12-19 7:24 GMT+01:00 Jaideep Dhok jaideep.d...@inmobi.com: +1 (non-binding) Thanks, Jaideep On Fri, Dec 19, 2014 at 11:50 AM, Hyunsik Choi hyun...@apache.org wrote: +1 (binding) On Friday, December 19, 2014, Roman Shaposhnik r...@apache.org wrote: Following the discussion earlier: http://s.apache.org/kTp I would like to call a VOTE for accepting Zeppelin as a new Incubator project. The proposal is available at: https://wiki.apache.org/incubator/ZeppelinProposal and is also attached to the end of this email. Vote is open until at least Sunday, 21st December 2014, 23:59:00 PST [ ] +1 Accept Zeppelin into the Incubator [ ] ±0 Indifferent to the acceptance of Zeppelin [ ] -1 Do not accept Zeppelin because ... Thanks, Roman. == Abstract == Zeppelin is a collaborative data analytics and visualization tool for distributed, general-purpose data processing systems such as Apache Spark, Apache Flink, etc. == Proposal == Zeppelin is a modern web-based tool for data scientists to collaborate over large-scale data exploration and visualization projects. It is a notebook-style interpreter that enables sharing of collaborative analysis sessions between users. Zeppelin is independent of the execution framework itself. The current version runs on top of Apache Spark but it has pluggable interpreter APIs to support other data processing systems. More execution frameworks could be added at a later date, e.g. Apache Flink, Crunch as well as SQL-like backends such as Hive, Tajo, MRQL. We have a strong preference for the project to be called Zeppelin. In case that may not be feasible, alternative names could be: “Mir”, “Yuga” or “Sora”. == Background == A large-scale data analysis workflow includes multiple steps like data acquisition, pre-processing, visualization, etc., and may include inter-operation of multiple different tools and technologies.
With the widespread adoption of open source general-purpose data processing systems like Spark, there is a lack of open source, modern, user-friendly tools that combine the strengths of an interpreted language for data analysis with new in-browser visualization libraries and collaborative capabilities. Zeppelin initially started as a GUI tool for a diverse set of SQL-over-Hadoop systems like Hive, Presto, Shark, etc. It has been open source since its inception in Sep 2013. Later, it became clear that there was a need for a greater web-based tool for data scientists to collaborate on data exploration over large-scale projects, not limited to SQL. So Zeppelin integrated full support of Apache Spark while adding a collaborative environment with the ability to run and share interpreter sessions in-browser. == Rationale == There are no open source alternatives for a collaborative notebook-based interpreter with support of multiple distributed data processing systems. As the number of companies adopting and contributing back to Zeppelin is growing, we think that having a long-term home at the Apache foundation would be a great fit for the project, ensuring that processes and procedures are in place to keep the project and community “healthy” and free of any commercial, political or legal faults. == Initial Goals == The initial goals will be to move the existing codebase to Apache and integrate with the Apache development process. This includes moving all infrastructure that we currently maintain, such as: a website, a mailing list, an issue tracker and a Jenkins CI, as mentioned in the “Required Resources” section of the current proposal. Once this is accomplished, we plan for incremental development and releases that follow the Apache guidelines. To increase adoption, the major goal for the project would be to provide integration with as many projects from the Apache data ecosystem as possible, including new interpreters for Apache Hive and Apache Drill, and adding a Zeppelin distribution to Apache Bigtop.
On the community building side the main goal is to attract a diverse set of contributors by promoting Zeppelin to a wide variety of engineers, starting Zeppelin user groups around the globe and engaging with other existing Apache project communities online. == Current Status == Currently, Zeppelin has 4 released versions and is used in production at a number of companies across the globe mentioned in the Affiliation section. Current implementation status is pre-release with the public API not being finalized yet. The current main and default backend processing engine is Apache Spark with consistent support of
Re: [VOTE] accept SAMOA into incubator
+1 (non-binding) Thanks Naresh On Fri, Dec 12, 2014 at 7:22 AM, John D. Ament john.d.am...@gmail.com wrote: +1 binding On Thu Dec 11 2014 at 5:10:50 PM Konstantin Boudnik c...@apache.org wrote: +1 (binding). A small comment: we don't do users@ list of podlings, do we? If so samoa-users@googlegroups -- us...@samoa.incubator.apache.org will need to be converged into dev@. Not all podlings use a users@, but they can if they like. Usually if it's coming from an established community there will be one. Cos On Thu, Dec 11, 2014 at 10:02AM, Daniel Dai wrote: Following the discussion earlier, I'm calling a vote to accept SAMOA as a new Incubator project. [ ] +1 Accept SAMOA into the Incubator [ ] +0 Indifferent to the acceptance of SAMOA [ ] -1 Do not accept SAMOA because ... The vote will be open for at least 72h and closes at the earliest on Dec 14 19:00 GMT. https://wiki.apache.org/incubator/SAMOAProposal Thanks, Daniel = SAMOA = == Abstract == SAMOA is an open-source platform for mining big data streams. == Proposal == SAMOA provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms that run on top of distributed stream processing engines (DSPEs). It features a pluggable architecture that allows it to run on several DSPEs such as Apache Storm, Apache S4, and Apache Samza. == Background == Hadoop and its ecosystem have changed the way data are processed by allowing algorithms to be pushed to unprecedented scale. As an example, Mahout allows running data mining and machine learning algorithms on very large datasets. However, Hadoop and Mahout are not suited to handle streaming data. Simply put, the goal of SAMOA is to provide a streaming counterpart to Mahout. == Rationale == SAMOA aims to fill the current gap in tools for mining large scale streams.
Many organizations can benefit from a scalable stream mining platform such as SAMOA. SAMOA is a natural fit for the Apache Software Foundation. It is licensed under the ASL v2.0. It already interoperates with several existing Apache projects such as Storm, S4, and Samza. Furthermore, it is complementary to existing Apache projects such as Mahout. The initial committers are familiar with the Apache process and subscribe to the Apache mission. Indeed, the team includes multiple Apache committers. Finally, joining Apache will help coordinate the development effort of the growing number of organizations which contribute to SAMOA. == Initial Goals == * Move the existing codebase to Apache * Integrate with the Apache development process * Incremental development and releases per Apache guidelines == Current Status == SAMOA started as a research project at Yahoo Labs in 2013 and was open-sourced in October the same year. It has been under development on Yahoo's public GitHub repository since being open-sourced. It has undergone two releases (0.1, 0.2). === Meritocracy === The SAMOA project already operates on meritocratic principles. Today, SAMOA has several developers and has accepted multiple patches from outside of Yahoo Labs. However, our intent with this incubator proposal is to start building a more diverse developer community around SAMOA that follows the Apache meritocracy model. We will identify all committers and PPMC members for the project operating under the ASF meritocratic principles. We plan to continue support for new contributors and work with those who contribute significantly to the project to make them committers. === Community === SAMOA is currently being used internally at Yahoo. Acceptance into the Apache foundation would bolster the existing user and developer community around SAMOA. That community includes contributors from several institutions, active mostly on GitHub's pages.
SAMOA has been starred more than 300 times and forked more than 50 times on GitHub as of November 2014. === Core Developers === The core developers are a diverse group, many of whom are already very experienced with open source. There are two existing Apache committers, along with people from various companies and universities. === Alignment === The ASF is the natural choice to host SAMOA. First, its goal of encouraging community-driven open-source projects fits with our vision for SAMOA. Additionally, many other projects that SAMOA is based on, such as Apache Storm, S4, Samza, and HDFS, are hosted by the ASF. Close proximity of SAMOA to these projects within the ASF will provide mutual benefit. == Known Risks == === Orphaned Products === Given the current
Re: [VOTE] Accept HTrace into the Apache Incubator
+1 (non-binding) Thanks Naresh On Thu, Nov 6, 2014 at 12:26 PM, Seetharam Venkatesh venkat...@innerzeal.com wrote: +1 (non-binding) On Wed, Nov 5, 2014 at 1:34 PM, Billie Rinaldi bil...@apache.org wrote: +1 On Wed, Nov 5, 2014 at 11:37 AM, Roman Shaposhnik r...@apache.org wrote: Following the discussion earlier in the thread: http://s.apache.org/Dk7 I would like to call a VOTE for accepting HTrace as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/HTraceProposal Vote is open until at least Sunday, 9th November 2014, 23:59:00 UTC [ ] +1 accept Lens in the Incubator [ ] ±0 [ ] -1 because... -- Regards, Venkatesh “Perfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away.” - Antoine de Saint-Exupéry
Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator
Just curious whether HTrace is aimed only at Hadoop infrastructure/Hadoop-based applications, or whether it can be used in any Java-based system? Thanks Naresh On Mon, Nov 3, 2014 at 1:34 AM, Andrew Purtell apurt...@apache.org wrote: Really great to see an incubation proposal for HTrace. If you need another mentor, please consider me. I don't think you need to list "HTrace is not the primary focus of any of the current list of contributors" as a risk. One can say that about many (perhaps the majority) of contributors to Apache projects. We would hope the incubation process develops a healthy community that sustains a level of contribution that keeps the project moving forward, as we would hope for all incubation candidates. On Fri, Oct 31, 2014 at 4:06 PM, Roman Shaposhnik r...@apache.org wrote: Hi! I would like to propose HTrace to be considered for Apache Incubator. The proposal is attached and is also available on the wiki: https://wiki.apache.org/incubator/HTraceProposal Please let me know what you guys think and also don't hesitate to massage the proposal on the wiki based on the feedback from this thread. Thanks, Roman. == Abstract == HTrace is a tracing framework intended for use with distributed systems written in Java. == Proposal == HTrace is an aid for understanding system behavior and for reasoning about performance issues in distributed systems. HTrace is primarily a low impedance library that a Java distributed system can incorporate to generate ‘breadcrumbs’ or ‘traces’ along the path of execution, even as it crosses processes and machines. HTrace also includes various tools and glue for collecting, processing and ‘visualizing’ captured execution traces for analysis ex post facto of where time was spent and what resources were consumed. == Background == Distributed systems are made up of multiple software components running on multiple computers connected by networks.
Debugging or profiling operations run over non-trivial distributed systems -- figuring out execution paths and what services, machines, and libraries participated in the processing of a request -- can be involved. == Rationale == Rather than have each distributed system build its own custom ‘tracing’ libraries, ideally all would use a single project that provides the necessary primitives and saves each project from building its own visualizations and processing tools anew. Google described “...[a] large-scale distributed systems tracing infrastructure” in Dapper, a Large-Scale Distributed Systems Tracing Infrastructure. The paper tells a compelling story of what is possible when disparate systems standardize on a single tracing library and cooperate, ‘passing the baton’, filling out trace context as executions cross systems. HTrace aims to provide a rough equivalent in open source of the described core Dapper tools and library. As it is adopted by more projects, there will be a ‘network effect’ as HTrace will provide a more comprehensive view of activity on the cluster. For example, as HDFS gets HTrace support, we can connect this with the HTrace support in HBase to follow HBase requests as they enter HDFS. Given that the success of HTrace depends on its being integrated by many projects, HTrace should be perceived as unhampered, free of any commercial, political, or legal ‘taint’. Being an Apache project would help in this regard. == Initial Goals == HTrace is a small project of narrow scope but with a grand vision: * Move the HTrace source and repository to Apache, a vendor-neutral location. Currently HTrace resides at a Cloudera-hosted repository. * Add past contributors as committers and institute Apache governance. * Evangelize and encourage HTrace diffusion. Initially we will continue a focus on the Hadoop space since that is where most of the initial contributors work and it is where HTrace has been initially deployed.
* Build out the standalone visualization tool that ships with HTrace. * Build more community and add more committers == Current Status == Currently HTrace has a viable Java trace library that can be interpolated to create ‘traces’. The work that needs to be done on this library is mostly bug fixes, ease-of-use improvements, and performance tweaks. In the future, we may add libraries for other languages besides Java. HTrace has means of dumping traces to the filesystem, Twitter’s Zipkin (a tracing sink and visualization system developed by Twitter https://github.com/twitter/zipkin), or Apache HBase. Executions can be viewed either in Zipkin or in pygraph (https://code.google.com/p/python-graph/). Since the initial sprint in the summer of 2012 which saw HTrace patches proposed for Apache HDFS and committed to Apache HBase, development has been sporadic; mostly a single developer or two