
Re: [PROPOSAL] New blockchain project: Cava

2019-02-05 Thread Debo Dutta (dedutta)

I would love to contribute in some way.

thx
debo

On 2/5/19, 3:36 PM, "Dave Fisher"  wrote:

Hi -

I would also be happy to volunteer as a Mentor to this podling.

Regards,
Dave

> On Feb 5, 2019, at 3:28 PM, Kenneth Knowles  wrote:
> 
> +1
> 
> I would like to volunteer as an additional mentor.
> 
> Kenn
> 
> On Tue, Feb 5, 2019 at 2:57 PM Justin Mclean 
> wrote:
> 
>> Hi,
>> 
>> Nice proposal. Just a couple of comments.
>> 
>>> The project is well established and counts 2 active committers. Some
>> contributions were made from the community.
>> 
>> Being that small may be a concern; the ASF prefers projects with a
>> community around them, but it’s not always a barrier to entry. See, for
>> example, Apache PLC4X.
>> 
>>> Blockchain protocol developers organize well in communities, and some
>> lively discussions take place over Twitter, Gitter, Telegram.
>> 
>> At the ASF all decisions need to be made on the mailing list not on any
>> external platform mentioned above. Would the project be OK with moving
>> discussions to a mailing list and having them in an asynchronous manner?
>> 
>>> We will remain in incubation for a period of no less than a year so we
>> can properly invest and build a community of users, contributors and
>> committers around our goals.
>> 
>> Most projects take more than a year to graduate, two years is usual but
>> some projects do take longer. Is the project OK with that and will your
>> mentors stay around for that long?
>> 
>>> '''Initial Committers'''
>>> 
>>> Antoine Toulme (toulmean at apache dot org) *
>> 
>> Seems odd to list only one initial committer. All initial committers are
>> on the PPMC, and you would need a minimum of 3 active people. The IPMC
>> generally likes to see 5.
>> 
>> Thanks,
>> Justin
>> -
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org
>> 
>> 



Re: [PROPOSAL] DLab for Apache Incubator

2018-08-07 Thread Debo Dutta (dedutta)

I am happy to help (either as a mentor or a volunteer). This is a good idea. I have
helped out in Apache projects before.

Debo 

Sent from my iPhone

> On Aug 7, 2018, at 6:08 PM, P. Taylor Goetz  wrote:
> 
> Henry Saputra (hsaputra) has been added to the mentor list.
> 
> We are still interested in proposal feedback and mentor volunteers.
> 
> -Taylor
> 
>> On Aug 6, 2018, at 10:47 AM, P. Taylor Goetz  wrote:
>> 
>> I would like to propose DLab as an Apache Incubator project.
>> 
>> The text of the proposal can be found below as well as on the Incubator wiki:
>> 
>> https://wiki.apache.org/incubator/DLabProposal
>> 
>> We are seeking additional mentors and would welcome anyone who would like to 
>> volunteer.
>> 
>> -Taylor
>> 
>> 
>> = DLab Proposal =
>> 
>> == Abstract ==
>> DLab is a platform for creating self-service, exploratory data science 
>> environments in the cloud using best-of-breed data science tools.
>> 
>> DLab includes a self-service web console, used to create and manage 
>> exploratory environments. It allows teams to spin up analytical environments 
>> with just a single click of a mouse. Once established, the environment can 
>> be managed by an analytical team itself, leveraging a simple and easy-to-use
>> web-based interface.
>> 
>> == Proposal ==
>> In order to work effectively, data scientists rely on a varying suite of 
>> analytics tools that are readily available. However, many of those tools are 
>> non-trivial to set up in terms of hardware provisioning, software 
>> installation, configuration, and deployment. Setting up a collaborative, 
>> multi-tenant development environment for data scientists consumes 
>> substantial IT and DevOps resources, as well as time. These factors often 
>> combine to hinder the agility and effectiveness of data science teams within 
>> an organization. Current solutions are largely closed source and/or 
>> proprietary, and committing to a given solution introduces the potential for 
>> vendor lock-in.
>> 
>> EPAM Systems developed DLab in response to the lack of open source,
>> permissively licensed solutions to better enable data science workflows. The
>> ALv2 was selected to encourage open development and user adoption. DLab was 
>> open sourced on Dec 29, 2016 and is under active development with support 
>> from EPAM Systems.
>> 
>> We believe DLab is a unique solution with no current open source equivalent. 
>> Our primary goals of incubation are to grow and diversify the DLab community 
>> to ensure its long-term sustainability.
>> 
>> == Rationale ==
>> DLab is a platform that provides data scientists with the ability to 
>> self-provision, without IT support, exploratory and production environments 
>> with their preferred set of tools installed and pre-configured. Tool options 
>> include, but are not limited to:
>> 
>> * Apache Spark
>> * Apache Flink (planned)
>> * Apache Zeppelin
>> * Jupyter
>> * TensorFlow + Jupyter
>> * Deep Learning + Jupyter
>> 
>> DLab leverages cloud computing providers for virtual hardware provisioning 
>> and currently supports the following:
>> 
>> * Amazon Web Services (AWS)
>> * Microsoft Azure
>> * Google Cloud Platform (GCP) (under development)
>> 
>> DLab offers git-based collaboration tools for data scientists and developers 
>> and integrates with the following git service providers:
>> 
>> * GitHub
>> * GitLab
>> * BitBucket
>> 
>> Additionally, DLab includes the option to configure the UnGit tool in an 
>> environment to facilitate collaboration.
>> Finally, DLab integrates closely with many security and SSO offerings, 
>> including:
>> 
>> * LDAP
>> * Microsoft Active Directory
>> * AWS Identity Access Management service
>> 
>> DLab was designed from the ground up to be a highly configurable, flexible,
>> and extensible platform. We believe these qualities will encourage community
>> growth by enabling contributors to easily add new integrations and 
>> extensions.
>> 
>> == Initial Goals ==
>> The initial goal will be to move the existing codebase to Apache and 
>> integrate with the Apache development process and infrastructure. A primary 
>> goal of incubation will be to grow and diversify the DLab PPMC. We are well 
>> aware that the project community is comprised of individuals from a single 
>> company. We aim to change that during incubation.
>> 
>> == Current Status ==
>> As previously mentioned, DLab is under active development at EPAM Systems, 
>> and is being used in a number of production deployments:
>> 
>> * [An investment company] is using DLab as an AWS-based analytics platform 
>> for their data scientists to provide a convenient way to perform 
>> multi-tenant data analytics. This enables data scientists to easily 
>> provision work environments with integrated data sources based on 
>> Elasticsearch, Apache HBase, and Neo4j, and utilizing Apache Spark. This 
>> enabled a “one click”, self service option for users to provision an 
>> environment with the necessary tools and data.

Re: [VOTE] Accept Crail into the Apache Incubator

2017-10-26 Thread Debo Dutta (dedutta)

+1

On 10/26/17, 9:30 AM, "Gang(Gary) Wang"  wrote:

+1


On Thu, Oct 26, 2017 at 9:25 AM, Clebert Suconic 
wrote:

> +1
>
> On Thu, Oct 26, 2017 at 12:01 PM, Luciano Resende 
> wrote:
> > Of course, my +1
> >
> > On Thu, Oct 26, 2017 at 12:31 PM, Luciano Resende 
> > wrote:
> >
> >> Now that the discussion thread on the Crail proposal has ended, please
> >> vote on accepting Crail into the Apache Incubator.
> >>
> >> The ASF voting rules are described at:
> >>http://www.apache.org/foundation/voting.html
> >>
> >> A vote for accepting a new Apache Incubator podling is a majority vote
> >> for which only Incubator PMC member votes are binding.
> >>
> >> Votes from other people are also welcome as an indication of people's
> >> enthusiasm (or lack thereof).
> >>
> >> Please do not use this VOTE thread for discussions.
> >> If needed, start a new thread instead.
> >>
> >> This vote will run for at least 72 hours. Please VOTE as follows
> >> [] +1 Accept Crail into the Apache Incubator
> >> [] +0 Abstain.
> >> [] -1 Do not accept Crail into the Apache Incubator because ...
> >>
> >> The proposal below is also on the wiki:
> >> https://wiki.apache.org/incubator/CrailProposal
> >>
> >> ===
> >>
> >> Abstract
> >>
> >> Crail is a storage platform for sharing performance critical data in
> >> distributed data processing jobs at very high speed. Crail is built
> >> entirely upon principles of user-level I/O and specifically targets data
> >> center deployments with fast network and storage hardware (e.g., 100Gbps
> >> RDMA, plenty of DRAM, NVMe flash, etc.) as well as new modes of operation
> >> such as resource disaggregation or serverless computing. Crail is written in
> >> Java and integrates seamlessly with the Apache data processing ecosystem.
> >> It can be used as a backbone to accelerate high-level data operations such
> >> as shuffle or broadcast, or as a cache to store hot data that is queried
> >> repeatedly, or as a storage platform for sharing inter-job data in complex
> >> multi-job pipelines, etc.
> >>
> >> Proposal
> >>
> >> Crail enables Apache data processing frameworks to run efficiently in next
> >> generation data centers using fast storage and network hardware in
> >> combination with resource (e.g., DRAM, Flash) disaggregation.
> >>
> >> Background
> >>
> >> Crail started as a research project at the IBM Zurich Research Laboratory
> >> around 2014 aiming to integrate high-speed I/O hardware effectively into
> >> large scale data processing systems.
> >>
> >> Rationale
> >>
> >> During the last decade, I/O hardware has undergone rapid performance
> >> improvements, typically by orders of magnitude. Modern day networking
> >> and storage hardware can deliver 100+ Gbps (10+ GBps) bandwidth with a few
> >> microseconds of access latency. However, despite such progress in raw I/O
> >> performance, effectively leveraging modern hardware in data processing
> >> frameworks remains challenging. In most cases, upgrading to high-end
> >> networking or storage hardware has very little effect on the performance of
> >> analytics workloads. The problem comes from heavily layered software
> >> imposing overheads such as deep call stacks, unnecessary data copies,
> >> thread contention, etc. These problems have already been addressed at the
> >> operating system level with new I/O APIs such as RDMA verbs, NVMe, etc.,
> >> allowing applications to bypass software layers during I/O operations.
> >> Distributed data processing frameworks, on the other hand, are typically
> >> implemented on legacy I/O interfaces such as sockets or block
> >> storage. These interfaces have been shown to be insufficient to deliver the
> >> full hardware performance. Yet, to the best of our knowledge, there are no
> >> active and systematic efforts to integrate these new user-level I/O APIs
> >> into Apache software frameworks. This problem affects all end-users and
> >> organizations that use Apache software. We expect them to see
> >> unsatisfactorily small performance gains when upgrading their networking and
> >> storage hardware.
> >>
> >> Crail solves this problem by providing an efficient storage platform built
> >> upon user-level I/O, thus bypassing layers such as JVM and OS during I/O
> >> operations. Moreover, Crail directly leverages the specific hardware
> >> features of RDMA and NVMe to provide a better integration with

Re: [DISCUSS] Storage-class memory ecosystem program

2017-10-24 Thread Debo Dutta (dedutta)

BTW should we name this group “Apache Durable Computing Initiative”?

debo

On 10/24/17, 11:18 AM, "Debo Dutta (dedutta)" <dedu...@cisco.com> wrote:

Yes, the moment we have the workgroup mailer set up, I will send out a DISCUSS
thread.

debo

On 10/24/17, 10:14 AM, "Gang(Gary) Wang" <ga...@apache.org> wrote:

It is a great idea if we could have a common benchmark and APIs for
storage-class memory oriented library/framework/application, please go
ahead to propose one for discussion in our workgroup. Thanks!

On Mon, Oct 23, 2017 at 4:27 PM, Debojyoti Dutta <ddu...@gmail.com> wrote:

> Would love to help out in any way including working towards common
> benchmarks, APIs etc.
>
> Debo
>
> Sent from my iPhone
>
> > On Oct 23, 2017, at 4:06 PM, Gang(Gary) Wang <ga...@apache.org> wrote:
> >
> > There are suggested initial goals for our workgroup
> >
> >   - Sharing idea and good practice
> >   - Identifying common opportunities
> >   - Promoting storage-class memory application
> >   - Delivering solid solution
> >   - Avoiding reinventing the wheel
> >   - Integrating one another
> >   - Coordinating the progress
> >
> >
> >> On Mon, Oct 23, 2017 at 4:02 PM, Gang(Gary) Wang <ga...@apache.org> wrote:
> >>
> >> Add ORC
> >>
> >>   - *Ignite* represented by Denis Magda
> >>   - *Arrow *represented by Wes McKinney
> >>   - *Hbase *represented by Anoop John
> >>   - *Crail* represented by *Patrick Stuedi*
> >>   - *ORC *represented by* Owen O'Malley*
> >>   - *Mnemonic *represented by* Gary*
> >>
> >> With above projects, we could cover Storage-class memory oriented *Distributed
> >> Database, KV Store, Columnar Structured Dataset, Distributed Data Store,
> >> Columnar Storage, **Durable Object Model, Durable Computing Model* for
> >> new generation high-performance applications, e.g. data querying,
> >> processing, and analytics.
> >>
> >>
> >> On Mon, Oct 23, 2017 at 1:21 PM, Owen O'Malley <owen.omal...@gmail.com> wrote:
> >>
> >>> I can represent ORC within the group.
> >>>
> >>> .. Owen
> >>>
> >>>> On Oct 19, 2017, at 11:55 AM, Gang(Gary) Wang <ga...@apache.org> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> We can expect more and more projects will take the huge potential
> >>>> advantages of storage-class memory for data processing and analytics
> >>>> because silicon companies are able to produce high capacity non-volatile
> >>>> memory on a large scale, this hardware technology will fundamentally change
> >>>> the way to construct high performance applications similar to what happened
> >>>> when replacing tape with disk technology since the 1980s. so if possible, I
> >>>> advocate establishing an Apache working group to enhance the collaboration
> >>>> and synergies mentioned by Patrick Stuedi for storage-class memory
> >>>> technology-oriented projects.
> >>>>
> >>>> Best.
> >>>> Gary.
> >>>
> >>>
> >>
>
>
>






Re: [DISCUSS] Storage-class memory ecosystem program

2017-10-24 Thread Debo Dutta (dedutta)

Yes, the moment we have the workgroup mailer set up, I will send out a DISCUSS
thread.

debo

On 10/24/17, 10:14 AM, "Gang(Gary) Wang"  wrote:

It is a great idea if we could have a common benchmark and APIs for
storage-class memory oriented library/framework/application, please go
ahead to propose one for discussion in our workgroup. Thanks!

On Mon, Oct 23, 2017 at 4:27 PM, Debojyoti Dutta  wrote:

> Would love to help out in any way including working towards common
> benchmarks, APIs etc.
>
> Debo
>
> Sent from my iPhone
>
> > On Oct 23, 2017, at 4:06 PM, Gang(Gary) Wang  wrote:
> >
> > There are suggested initial goals for our workgroup
> >
> >   - Sharing idea and good practice
> >   - Identifying common opportunities
> >   - Promoting storage-class memory application
> >   - Delivering solid solution
> >   - Avoiding reinventing the wheel
> >   - Integrating one another
> >   - Coordinating the progress
> >
> >
> >> On Mon, Oct 23, 2017 at 4:02 PM, Gang(Gary) Wang wrote:
> >>
> >> Add ORC
> >>
> >>   - *Ignite* represented by Denis Magda
> >>   - *Arrow *represented by Wes McKinney
> >>   - *Hbase *represented by Anoop John
> >>   - *Crail* represented by *Patrick Stuedi*
> >>   - *ORC *represented by* Owen O'Malley*
> >>   - *Mnemonic *represented by* Gary*
> >>
> >> With above projects, we could cover Storage-class memory oriented *Distributed
> >> Database, KV Store, Columnar Structured Dataset, Distributed Data Store,
> >> Columnar Storage, **Durable Object Model, Durable Computing Model* for
> >> new generation high-performance applications, e.g. data querying,
> >> processing, and analytics.
> >>
> >>
> >> On Mon, Oct 23, 2017 at 1:21 PM, Owen O'Malley 
> >> wrote:
> >>
> >>> I can represent ORC within the group.
> >>>
> >>> .. Owen
> >>>
> >>>> On Oct 19, 2017, at 11:55 AM, Gang(Gary) Wang wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> We can expect more and more projects will take the huge potential
> >>>> advantages of storage-class memory for data processing and analytics
> >>>> because silicon companies are able to produce high capacity non-volatile
> >>>> memory on a large scale, this hardware technology will fundamentally change
> >>>> the way to construct high performance applications similar to what happened
> >>>> when replacing tape with disk technology since the 1980s. so if possible, I
> >>>> advocate establishing an Apache working group to enhance the collaboration
> >>>> and synergies mentioned by Patrick Stuedi for storage-class memory
> >>>> technology-oriented projects.
> >>>>
> >>>> Best.
> >>>> Gary.
> >>>
> >>>
> >>
>
>
>

Re: [DISCUSS] Storage-class memory ecosystem program

2017-10-20 Thread Debo Dutta (dedutta)

Hi Patrick

Am happy to help w Crail.

Thx
Debo

Sent from my iPhone

On Oct 20, 2017, at 10:14 AM, patrick stuedi wrote:

I'd be happy to be involved in the discussions as well. This matches
well with the focus we have in the Crail project (www.crail.io) where
we expose new storage technologies, including storage class memory, to
distributed data processing frameworks over high-speed network
fabrics. We are currently working on applying for Apache incubator
with Crail and we're looking for additional mentors :-)

On Fri, Oct 20, 2017 at 6:54 PM, Gang(Gary) Wang wrote:
It is wonderful, as of now, we have the following projects and their
communities.

  - *Ignite* represented by Denis Magda
  - *Arrow *represented by Wes McKinney
  - *Hbase *represented by Anoop John
  - *Mnemonic *represented by* Gary*

With above projects, we can cover Storage-class memory oriented Database,
KV Store, Columnar Structured Dataset, Durable Object Model, Durable
Computing Model for new generation high-performance applications, e.g. data
querying, processing, and analytics.

On Thu, Oct 19, 2017 at 11:01 PM, Atri Sharma wrote:

+1

On Fri, Oct 20, 2017 at 10:29 AM, Anoop John wrote:
+1..Am there to represent Apache HBase.

-Anoop-

On Fri, Oct 20, 2017 at 8:14 AM, Wes McKinney wrote:
I'm happy to represent Apache Arrow on this. This is very much in line
with our focus on zero-copy / memory-mapped data access of structured
data sets and cache-efficient memory layout.

- Wes

On Thu, Oct 19, 2017 at 9:48 PM, Debojyoti Dutta wrote:
This is a great idea. Storage class memory will have a big impact on
many of the Apache projects.

Debo

Sent from my iPhone

On Oct 19, 2017, at 11:55 AM, Gang(Gary) Wang wrote:

Hi all,

We can expect more and more projects will take the huge potential
advantages of storage-class memory for data processing and analytics
because silicon companies are able to produce high capacity non-volatile
memory on a large scale, this hardware technology will fundamentally change
the way to construct high performance applications similar to what happened
when replacing tape with disk technology since the 1980s. so if possible, I
advocate establishing an Apache working group to enhance the collaboration
and synergies mentioned by Patrick Stuedi for storage-class memory
technology-oriented projects.

Best.
Gary.


--
Regards,

Atri
l'apprenant


Re: [VOTE] Graduate Apache Mnemonic project from Incubator

2017-09-19 Thread Debo Dutta (dedutta)

+1

Sent from my iPhone

> On Sep 19, 2017, at 11:58 AM, Johnu George  wrote:
> 
> +1
> 
> On Tue, Sep 19, 2017 at 10:23 AM, Gangumalla, Uma 
> wrote:
> 
>> +1 (binding)
>> 
>> Regards,
>> Uma
>> 
>>> On 9/19/17, 10:16 AM, "Gang(Gary) Wang"  wrote:
>>> 
>>> Hello IPMC and everyone,
>>> 
>>> The Mnemonic community has voted on its Dev list to graduate. The vote
>>> passed with
>>> 14 +1s (including 9 from the PPMC) and 0 -1s.
>>> 
>>> Here is the vote result thread in the Dev list:
>>> https://lists.apache.org/thread.html/bbc187108b73d57fddec0d6a6c294527b626c7c439a7cdab991ea84e@%3Cdev.mnemonic.apache.org%3E
>>> 
>>> and the vote thread:
>>> https://lists.apache.org/thread.html/a49e82d507bb00839413e90b05cb8b9448ea242aeb021622f5deb323@%3Cdev.mnemonic.apache.org%3E
>>> 
>>> With the discussion having settled down, I would now like to call for
>>> a recommendation VOTE to present the ASF board with the following
>>> resolution
>>> to graduate from incubation and establish Apache Mnemonic
>>> as a top-level project (TLP).
>>> https://lists.apache.org/thread.html/94664579041db58bfe2893af6e9d54957652653b278d4007e914c672@%3Cgeneral.incubator.apache.org%3E
>>> 
>>> Apache Mnemonic entered incubation in March 2016. Since then there have
>>> been nine releases and four committers and two PMC candidate members
>>> have been added to the project. For each release, source artifacts
>>> have been made available. Based on the completed maturity evaluation,
>>> we believe that the project is ready to graduate from the incubator.
>>> For more checklist info about graduation, please refer to
>>> https://cwiki.apache.org/confluence/display/MNEMONIC/Maturity+Evaluation
>>> 
>>> Please vote on whether to graduate Mnemonic from incubator and
>>> recommend the following graduation resolution to the ASF Board.
>>> 
>>> [ ] +1 Graduate Apache Mnemonic from the Incubator
>>> [ ] +0 Don't care
>>> [ ] -1 Don't graduate Apache Mnemonic from the Incubator because...
>>> 
>>> This VOTE will be open for at least 72 hours.
>>> Thanks to all Mentors and Apache Mnemonic Project members
>>> for their support and contributions again.
>>> 
>>> The full text of the resolution is below.
>>> If approved by the Apache Incubator PMC members,
>>> the proposed resolution will be submitted to
>>> the Board of Directors for their consideration.
>>> --
>>> Establish the Apache Mnemonic Project
>>> 
>>> WHEREAS, the Board of Directors deems it to be in the best interests of
>>> the Foundation and consistent with the Foundation's purpose to establish
>>> a Project Management Committee charged with the creation and maintenance
>>> of open-source software, for distribution at no charge to the public,
>>> related to a transparent nonvolatile hybrid memory oriented library for
>>> Big data, High-performance computing, and Analytics.
>>> 
>>> NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee
>>> (PMC), to be known as the "Apache Mnemonic Project", be and hereby is
>>> established pursuant to Bylaws of the Foundation, and be it further
>>> 
>>> RESOLVED, that the Apache Mnemonic Project be and hereby is responsible
>>> for the creation and maintenance of software related to a transparent
>>> nonvolatile hybrid memory oriented library for Big data,
>>> High-performance computing and Analytics; and be it further
>>> 
>>> RESOLVED, that the office of "Vice President, Apache Mnemonic" be and
>>> hereby is created, the person holding such office to serve at the
>>> direction of the Board of Directors as the chair of the Apache Mnemonic
>>> Project, and to have primary responsibility for management of the
>>> projects within the scope of responsibility of the Apache Mnemonic
>>> Project, and be it further
>>> 
>>> RESOLVED, that the persons listed immediately below be and hereby are
>>> appointed to serve as the initial members of the Apache Mnemonic
>>> Project:
>>> 
>>> * Andrew Kyle Purtell  
>>> * Debojyoti Dutta  
>>> * Gang Wang
>>> * Hao Cheng
>>> * James R. Taylor  
>>> * Johnu George 
>>> * Kai Zheng
>>> * Patrick D. Hunt  
>>> * Rakesh Radhakrishnan 
>>> * Uma Maheswara Rao G  
>>> * Yanping Wang 
>>> 
>>> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Gang Wang be appointed to
>>> the office of Vice President, Apache Mnemonic, to serve in accordance
>>> with and subject to the direction of the Board of Directors and the
>>> Bylaws of the Foundation until death, resignation, retirement, removal
>>> or disqualification, or until a successor is appointed, and be it
>>> further
>>> 
>>> RESOLVED, that the initial Apache

Re: [VOTE] Graduate Apache Streams project from Incubator

2017-07-10 Thread Debo Dutta (dedutta)

+1

On 7/10/17, 10:27 AM, "P. Taylor Goetz"  wrote:

+1 (binding)

-Taylor

> On Jul 10, 2017, at 11:09 AM, sblackmon  wrote:
> 
>  
> In concert with the discussion started last week [1], please vote on the draft resolution which establishes Apache Streams as a new top-level project at the Apache Software Foundation, as follows:
> 
> [ ] +1, Graduate Apache Streams from the Incubator.
> [ ] +0, Don't care.
> [ ] -1, Don't graduate Apache Streams from the Incubator (provide details)
> 
> The full text of the resolution is below.
> 
> If approved by the Apache Incubator PMC members, the proposed resolution will be submitted to the Board of Directors for their consideration.
> 
> Thanks!
> 
> [1] https://lists.apache.org/thread.html/60d676ad7b190323f8479bccff8ae996f98bf5fcd0d0ff4d54b71006@
> 
> Establish the Apache Streams Project
> 
> WHEREAS, the Board of Directors deems it to be in the best interests of the Foundation and consistent with the Foundation's purpose to establish a Project Management Committee charged with the creation and maintenance of open-source software, for distribution at no charge to the public, related to interoperability of online profiles and activity feeds.
> 
> NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee (PMC), to be known as the "Apache Streams Project", be and hereby is established pursuant to Bylaws of the Foundation; and be it further
> 
> RESOLVED, that the Apache Streams Project be and hereby is responsible for the creation and maintenance of software related to interoperability of online profiles and activity feeds; and be it further
> 
> RESOLVED, that the office of "Vice President, Apache Streams" be and hereby is created, the person holding such office to serve at the direction of the Board of Directors as the chair of the Apache Streams Project, and to have primary responsibility for management of the projects within the scope of responsibility of the Apache Streams Project; and be it further
> 
> RESOLVED, that the persons listed immediately below be and hereby are appointed to serve as the initial members of the Apache Streams Project:
> 
>  * Stephen D Blackmon
>  * Robert Baker Douglas
>  * Ate Douma
>  * Ryan Edward Ebanks
>  * Matt Franklin
>  * Joey Frazee
>  * Trevor Grant
>  * Suneel Marthi
> 
> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Stephen D Blackmon be appointed to the office of Vice President, Apache Streams, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed; and be it further
> 
> RESOLVED, that the initial Apache Streams PMC be and hereby is tasked with the creation of a set of bylaws intended to encourage open development and increased participation in the Apache Streams Project; and be it further
> 
> RESOLVED, that the Apache Streams Project be and hereby is tasked with the migration and rationalization of the Apache Incubator Streams podling; and be it further
> 
> RESOLVED, that all responsibilities pertaining to the Apache Incubator Streams podling encumbered upon the Apache Incubator PMC are hereafter discharged.
> 
> 

Re: [VOTE] Heron to enter Apache Incubator

2017-06-23 Thread Debo Dutta (dedutta)

+1 to Ted’s comment.

As a user, I would love to pick one system and reuse the storm topologies. 
Ideally pick one converged solution.

+1 to the incubation since it will eventually lead to better options within
Apache.

debo

On 6/23/17, 10:08 AM, "Ted Dunning"  wrote:

Anybody who worries about you serving as mentor needs a dose of reality.
They can't get anybody better.

On Jun 22, 2017 12:21 PM, "P. Taylor Goetz"  wrote:

if there are ongoing concerns from either the Storm PMC or the Heron PPMC
about me acting as a mentor, I would be willing to step down.

+1 (binding)

-Taylor




Re: [VOTE] Release of Apache Mnemonic-0.8.0-incubating [rc1]

2017-06-21 Thread Debo Dutta (dedutta)

+1, great feedback. We will get this sorted very soon!

debo

On 6/21/17, 10:07 AM, "Wang, Gang1"  wrote:

Hi Josh,

Thank you for the review and suggestion. Yes, it makes sense to construct a
set of pure Java memory services as a fallback to improve the experience of
first-time users. We will add this feature to Mnemonic later. Thanks!

Best Regards
Gary
  (323)801-6286 (WhatsApp)

-----Original Message-----
From: Josh Elser [mailto:els...@apache.org] 
Sent: Tuesday, June 20, 2017 2:37 PM
To: general@incubator.apache.org
Subject: Re: [VOTE] Release of Apache Mnemonic-0.8.0-incubating [rc1]

+1 (binding)

- xsums/sigs OK
- L&N look ok
- KEYS is good
- Could sort of build from source. On OSX, I can only get so far into the 
build until I start running into native compilation failures. The docs are a 
little sparse on this -- it would be awesome if there was some non-native 
fallback mechanism to improve the first-time user experience.
- apache-rat:check looks good
- Scoped out the rest of the files included

On 6/16/17 2:26 AM, Debojyoti Dutta wrote:
> Hello incubator PMCs,
> 
> The Apache Mnemonic community PPMCs and developers have voted and 
> approved the proposal to release Apache Mnemonic 0.8.0 (incubating).
> 
> Apache Mnemonic is an advanced hybrid memory storage oriented library.
> It consists of a non-volatile/durable Java object model and a durable 
> computing model that significantly improves the performance of massive 
> real-time data processing and analytics pipelines, built on top of 
> JVMs. Developers can also use this library to design cache-less and 
> SerDe-less high performance applications, thus leveraging the core 
> benefits of non-volatile memory technologies.
> 
> [VOTE] thread:
> https://lists.apache.org/thread.html/1a04f89e9bc8f4bd588a017184c1bd2f99922506215a66ed37881fd7@%3Cdev.mnemonic.apache.org%3E
> 
> [VOTE RESULT] thread:
> https://lists.apache.org/thread.html/aaa59cc08283f085608e37f07c3f2178b19b08d260436c370428e3d3@%3Cdev.mnemonic.apache.org%3E
> and an update (Patrick Hunt's vote was binding but was included as
> non-binding):
> https://lists.apache.org/thread.html/a11aa5fe89e4784e610873548c26c7a0857b2216a3ac1400f45d5b27@%3Cdev.mnemonic.apache.org%3E
> 
> We now kindly request the Incubator PMC members review and vote on 
> this incubator release. The Apache Mnemonic-0.8.0-incubating release 
> candidate is now available with the following artifacts for a project
> vote:
> 
> The source tarball, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/incubator/mnemonic/0.8.0-incubating-rc1/src/
> 
> The tag to be voted upon is v0.8.0-incubating:
> https://git-wip-us.apache.org/repos/asf?p=incubator-mnemonic.git;a=tag;h=refs/tags/v0.8.0-incubating
> 
> The release hash is 2941b87:
> https://git-wip-us.apache.org/repos/asf?p=incubator-mnemonic.git;a=commit;h=2941b87f7e28c898cbecb35c845e5b4cd90bc197
> 
> Release artifacts are signed with the following key:
> https://dist.apache.org/repos/dist/dev/incubator/mnemonic/KEYS
> 
> KEYS file available:
> https://dist.apache.org/repos/dist/dev/incubator/mnemonic/KEYS
> 
> For information about the contents of this release, see:
> https://dist.apache.org/repos/dist/dev/incubator/mnemonic/0.8.0-incubating-rc1/CHANGES.txt
> 
> The vote will be open for ~72 hours. Please download the release 
> candidate and evaluate the necessary items including checking hashes, 
> signatures, build from source, and test.
> 
> please vote:
> [ ] +1 Release this package as apache-mnemonic--incubating
> [ ] +0 no opinion
> [ ] -1 Do not release this package because...
> 
> Thanks,
> Debo Dutta, on behalf of the Apache Mnemonic (incubating) team
> 




Re: [VOTE] Heron to enter Apache Incubator

2017-06-16 Thread Debo Dutta (dedutta)

On 6/16/17, 1:41 PM, "Bill Graham" wrote:

Hi,

Based on the discussion on the incubator mailing list[1] I would like to
call a vote to add Heron to the Apache Incubator.

The full proposal is available below, and is also available on the Apache
Incubator wiki at:
https://wiki.apache.org/incubator/HeronProposal

Please vote:
[ ] +1, bring Heron into Incubator
[ ] -1, do not bring Heron into Incubator, because...

The vote will be open for 7 days until Friday June 23 at 14:00 PT.

Thank you

[1] https://lists.apache.org/thread.html/fb91f527ef479bb5df45bf2c9d93b7786c3fa6cdbfeba3128599df79@%3Cgeneral.incubator.apache.org%3E

= Heron Proposal =

= Abstract =
Heron is a real-time, distributed, fault-tolerant stream processing engine
initially developed by Twitter.

= Proposal =

Heron is a real-time stream processing engine built for high performance,
ease of manageability, performance predictability and developer
productivity[1]. We wish to develop a community around Heron to increase
contributions and see Heron thrive in an open forum.

= Background =

Heron provides the ability for developers to compose directed acyclic
graphs (DAGs) of real-time query execution logic (i.e. a topology) and
submit the topology to execute on a pluggable job scheduling system (e.g.,
Apache Aurora, YARN, Marathon, etc). Users can employ either the native
Heron API or the Apache Storm API to develop the topology. Heron supports
the Storm API for ease of migration, but beyond that Heron’s architecture
differs considerably from Storm’s.

Users submit a topology to the scheduler using the Heron client, which uses
the Heron binary libraries to deploy all daemons required to run and manage
the topology. The topology therefore has no reliance on centrally managed
Heron services, only on a generic job scheduling system, which lends itself
well to be run on top of Apache Aurora/Mesos or Apache Hadoop/YARN (among
others).

The scheduler runs each topology as a job consisting of multiple
containers. One of the containers runs the topology master, responsible for
managing the topology. The remaining containers each run a stream manager
responsible for data routing, a metrics manager that collects and reports
various metrics and a number of processes called Heron instances which run
the user-defined logic on the stream of tuples. Parallelism is achieved via
process-based isolation of Heron instances, which provides predictable
performance while simplifying debugging. The containers are allocated and
managed by the scheduler framework based on resource availability of nodes
in the cluster. The metadata for the topology, such as the physical plan
and execution details, are stored in the pluggable Heron State Manager
(e.g. Apache ZooKeeper).
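
To make the "topology as a DAG" idea above concrete, here is a minimal sketch in plain Python. It does not use the Heron or Storm APIs; the component names and the `topological_order` helper are hypothetical, and the only thing it illustrates is that a topology is a directed acyclic graph whose components can be ordered from sources to sinks.

```python
from collections import deque


def topological_order(edges: dict[str, list[str]]) -> list[str]:
    """Return components in upstream-to-downstream order (Kahn's algorithm).

    `edges` maps each component to the components that consume its output.
    Raises ValueError if the graph has a cycle, i.e. is not a valid DAG.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    indegree = {node: 0 for node in nodes}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1

    # Start from components with no upstream input (the stream sources).
    ready = deque(sorted(node for node in nodes if indegree[node] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in edges.get(node, []):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    if len(order) != len(nodes):
        raise ValueError("topology contains a cycle")
    return order


# Hypothetical word-count topology: one source feeding two processing stages.
word_count = {"sentence_source": ["splitter"], "splitter": ["counter"]}
print(topological_order(word_count))
```

Rejecting cycles at submission time is one plausible reason to model the topology explicitly before handing it to a scheduler; whether Heron performs this check in this way is not stated in the proposal.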

= Rationale =

Heron is a general-purpose, modular and extensible platform that can be
leveraged to support common, real-time analytics use cases. There is an
increasing demand for open-source, scalable real-time analytics systems. We
believe that Heron can be leveraged by other organizations to build
streaming applications that can benefit from its robustness, high
performance, adaptability to cloud environments and ease of use. Moreover,
we hope that open-sourcing Heron will help to further evolve the technology
as the project attracts contributors with diverse backgrounds and areas of
expertise.

We believe the Apache foundation is a great fit as the long-term home for
Heron, as it provides an established process for community-driven
development and decision making by consensus. This is exactly the model we
want for future Heron development.

= Initial Goals =

* Move the existing codebase, website, documentation, and mailing lists to
Apache-hosted infrastructure.
* Integrate with the Apache development process.
* Ensure all dependencies are compliant with Apache License version 2.0.
* Incrementally develop and release per Apache guidelines.

= Current Status =

Heron is a stable project used in production at Twitter since 2014 and open
sourced under the ASL v2 license in 2016. The Heron source code is
currently hosted at github.com (https://github.com/twitter/heron), which
will seed the Apache git repository.

= Meritocracy =

By submitting this incubator proposal, we’re expressing our intent to build
a diverse developer community around Heron that will conduct itself
according to The Apache Way and use a meritocratic means of building its
committer base. Several companies and universities have already expressed
interest in and contributed to Heron. Our

Re: [PROPOSAL] Heron

2017-06-15 Thread Debo Dutta (dedutta)

Am happy to help too!

Thx 
Debo 

Sent from my iPhone

> On Jun 14, 2017, at 8:05 PM, William Markito Oliveira wrote:
> 
> Howdy!
> 
> If Heron is looking for some help around the incubation process, I'd love to
> help while the Geode experience is still fresh in my mind and given that it's a
> project/space that I do have an interest in. Since I'm not an ASF member, I don't
> think I can offer to be a mentor, but can probably still help and
> participate in the process.
> 
> Thanks!
> 
>> On Wed, Jun 14, 2017 at 7:54 PM, P. Taylor Goetz  wrote:
>> 
>> Hi Bill/Supun,
>> 
>> Sorry for not being a little more clear. I was asking more about how the
>> Heron community would seek to engage with Storm community at the
>> *community* level as opposed to the technical level (i.e. “Community over
>> Code”).
>> 
>> I’ve been asked by many why this has never happened, and have always
>> struggled to answer. Maybe you could help answer that question as well as
>> if and how that might change if Heron were to incubate.
>> 
>> Another quick question: The proposal mentions Heron being used in
>> production at Google, but some Google employees I recently spoke to seemed
>> to contradict that. Could you explain? Note that’s nothing that would
>> preclude the project from incubating, I’m just curious.
>> 
>> -Taylor
>> 
>>> On Jun 14, 2017, at 7:35 AM, Supun Kamburugamuve wrote:
>>> 
>>> Hi Taylor,
>>> 
>>> For me, one of the interesting differences between Heron and Storm is the
>>> execution model. Storm uses a shared memory model while Heron uses a
>>> process based model. It will be interesting to see how these two evolve.
>>> 
>>> Thanks,
>>> Supun..
>>> 
>>> On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham wrote:
>>>
>>>> Hi Taylor,
>>>>
>>>> Thanks for the mentor offer, we'd be glad to have your help.
>>>>
>>>> I think the best place for collaboration would be around the evolution of
>>>> the API. In addition we plan to look more into DSL solutions which we could
>>>> potentially collaborate on. This could be Trident, or Beam or something
>>>> else, but there could be synergies for future development here.
>>>>
>>>> thanks,
>>>> Bill
>>>>
>>>> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz wrote:
>>>>
>>>>> Hi Bill,
>>>>>
>>>>> Could you comment on how/if the Heron community would be willing to work
>>>>> with the Storm community? I've seen a number of new features in Storm being
>>>>> ported to Heron, but I have yet to see any attempt by the Heron community
>>>>> to engage with the Apache Storm community.
>>>>>
>>>>> I don't think it would be too far off to say that the relationship between
>>>>> Heron and Apache Storm has been somewhat adversarial. The pre- and
>>>>> post-open sourcing marketing around Heron seemed, at least to me, somewhat
>>>>> aggressively negative toward Storm.
>>>>>
>>>>> As a peer to Apache Storm, how would the proposed "Apache Heron" community
>>>>> work to collaborate with the Storm community? If Heron is adopting API
>>>>> changes in Storm, then it seems there is an opportunity for collaboration.
>>>>>
>>>>> Don't take any of this as an objection to incubating the project. I would
>>>>> support it. I would also be willing to be a mentor, if you would consider
>>>>> taking on another.
>>>>>
>>>>> -Taylor
>>>>>
>>>>>> On Jun 8, 2017, at 1:23 PM, Bill Graham wrote:
>>>>>>
>>>>>> Dear Apache Incubator Community,
>>>>>>
>>>>>> We are excited to share our proposal for discussion and feedback
>>>>>> for entering Apache Incubation. Heron is a real-time, distributed,
>>>>>> fault-tolerant stream processing engine.
>>>>>>
>>>>>> Our proposal can be found at https://wiki.apache.org/incubator/HeronProposal
>>>>>> and is included below.
>>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Bill Graham on behalf of the Heron developers
>>>>>>
>>>>>>
>>>>>> # Heron Proposal
>>>>>>
>>>>>> ## Abstract
>>>>>> Heron is a real-time, distributed, fault-tolerant stream processing engine
>>>>>> initially developed by Twitter.
>>>>>>
>>>>>> ## Proposal
>>>>>>
>>>>>> Heron is a real-time stream processing engine built for high performance,
>>>>>> ease of manageability, performance predictability and developer
>>>>>> productivity[1]. We wish to develop a community around Heron to increase
>>>>>> contributions and see Heron thrive in an open forum.
>>>>>>
>>>>>> ## Background
>>>>>>
>>>>>> Heron provides the ability for developers to compose directed acyclic
>>>>>> graphs (DAGs) of real-time query execution logic (i.e. a topology) and
>>>>>> submit the topology to execute on a pluggable job scheduling system (e.g.,
>>>>>> Apache Aurora, YARN, Marathon, etc). Users can employ either the native
>>>>>> Heron API or the Apache Storm API to develop the topology. Heron supports
>>>>>> the Storm API

Re: [PROPOSAL] Superset Proposal for Apache Incubator

2017-04-13 Thread Debo Dutta (dedutta)

happy to help

debo

On 4/13/17, 9:59 AM, "Maxime Beauchemin"  wrote:

Hi Jean-Baptiste,

We are indeed looking for more mentors.

Should I update the wiki and replace all references to PMC by PPMC?

Thanks,

Max

On Wed, Apr 12, 2017 at 12:51 PM, Jean-Baptiste Onofré 
wrote:

> Hi Maxime,
>
> The proposal looks interesting.
>
> Just a note,  it's PPMC (not PMC) during incubation.
>
> Are you seeking other mentors (I see you only have one mentor and one
> champion for now)?
>
> Regards
> JB
>
>
> On 04/12/2017 09:41 PM, Maxime Beauchemin wrote:
>
>> Hi all,
>>
>> We would love feedback on the proposal. Do the veterans on this mailing
>> list think that the proposal is ready for a vote!?
>>
>> Thanks,
>>
>> Max
>>
>> On Tue, Apr 4, 2017 at 5:26 PM, Luke Han  wrote:
>>
>>> Hi Jeff,
>>> This is a great project which has been mentioned many times in the
>>> community. It looks cool and fun for data work.
>>>
>>> Thanks for proposing Superset as an Apache Incubator project; please let
>>> me know if there's anything I could help with.
>>>
>>> Thanks.
>>> Luke
>>>
>>>
>>> Best Regards!
>>> -
>>>
>>> Luke Han
>>>
>>> On Sun, Apr 2, 2017 at 7:45 AM, Jeff Feng 
>>> wrote:
>>>
>>>> Dear Apache Incubator Community,
>>>>
>>>> We are excited to share our proposal for discussion and feedback for
>>>> entering Apache Incubation.  Superset is an enterprise-ready web
>>>> application for data exploration, data visualization and dashboarding.
>>>>
>>>> Our Incubation proposal is at the following Wiki as well as copied in the
>>>> email below:
>>>>
>>>> https://wiki.apache.org/incubator/SupersetProposal
>>>>
>>>> We have an active Superset community including 400+ members and nearly 200
>>>> topics.  The Google Group can be found below.  We plan to move the
>>>> discussion to the ASF:
>>>>
>>>> https://groups.google.com/forum/#!forum/airbnb_superset
>>>>
>>>> Thank you and look forward to the discussion!
>>>>
>>>> Jeff, Max & Alanna
>>>>
>>>> = Superset =
>>>>
>>>> == Abstract ==
>>>>
>>>> Superset is an enterprise-ready web application for data exploration, data
>>>> visualization and dashboarding.
>>>>
>>>> == Proposal ==
>>>>
>>>> Superset is business intelligence (BI) software that helps modern
>>>> organizations visualize and interact with their data. Superset enables
>>>> users to explore data from a variety of databases, assemble beautiful
>>>> dashboards and share their findings.  Superset works neatly with all modern
>>>> SQL-speaking databases, and integrates with Druid.io to provide real-time,
>>>> interactive, blazing fast data access to large datasets.
>>>>
>>>> == Background ==
>>>>
>>>> Data is mission critical. To succeed in this era, organizations need to
>>>> provide low-friction, intuitive and interactive access to data. It is
>>>> paramount for knowledge workers to be capable of answering their own
>>>> questions by querying, exploring and visualizing data.
>>>>
>>>> The entire business intelligence industry has pivoted from a model of
>>>> centralized top-down platforms driven by IT organizations to self-service
>>>> analytics and agile workflows by any user.  This shift unblocks centralized
>>>> service bottlenecks for creating data visualizations while also creating an
>>>> environment that is iterative and fast-moving.  This means that business
>>>> intelligence software must also be easy and delightful to use.
>>>> Self-service analytics doesn’t mean that admin and governance features are
>>>> not needed.
>>>>
>>>> Modern BI tools provide fine-grain access controls and auditing
>>>> capabilities to understand how data is being used.  Superset is a solution
>>>> that delivers on all of these vectors.
>>>>
>>>> The technology stack is also constantly morphing - vendors are struggling
>>>> to provide cheap, quick and easy solutions to access data.  Business
>>>> intelligence users are finding existing solutions lacking as these software
>>>> products either disregard or react slowly to recent game-changing
>>>> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
>>>> React.js and iPython’s Jupyter for

Re: [VOTE] Apache Metron podling Graduation

2017-03-29 Thread Debo Dutta (dedutta)

+1

Sent from my iPhone

> On Mar 29, 2017, at 7:53 PM, Julian Hyde  wrote:
> 
> +1 (binding)
> 
>> On Mar 29, 2017, at 6:55 PM, John D. Ament  wrote:
>> 
>> +1
>> 
>> When you submit the resolution to the board, the breakdown of PMC by
>> affiliation (and make sure you use PMC not PPMC) can just be replaced with
>> the proposed PMC members.   My question asking for it is a pure sanity
>> check to make sure there is no dominance.
>> 
>> John
>> 
>>> On Wed, Mar 29, 2017 at 10:39 AM Casey Stella  wrote:
>>> 
>>> Hi Everyone,
>>> 
>>> I propose that we graduate Apache Metron (incubating) from the incubator.
>>> The full text of the proposal is below, with requisite modifications
>>> applied from the discussion thread.
>>> 
>>> The discuss thread can be found at
>>> 
>>> https://lists.apache.org/thread.html/e5d106456b28562bdc947624c6f33e3281297dfd3803aab3d171bbad@%3Cgeneral.incubator.apache.org%3E
>>> 
>>> 
>>> Best,
>>> 
>>> Casey
>>> 
>>> Resolution:
>>> 
>>> Establish the Apache Metron Project
>>> 
>>> WHEREAS, the Board of Directors deems it to be in the best
>>> interests of the Foundation and consistent with the
>>> Foundation's purpose to establish a Project Management
>>> Committee charged with the creation and maintenance of
>>> open-source software, for distribution at no charge to the
>>> public, related to a security analytics platform for big data use cases.
>>> 
>>> NOW, THEREFORE, BE IT RESOLVED, that a Project Management
>>> Committee (PMC), to be known as the "Apache Metron Project",
>>> be and hereby is established pursuant to Bylaws of the
>>> Foundation; and be it further
>>> 
>>> RESOLVED, that the Apache Metron Project be and hereby is
>>> responsible for the creation and maintenance of software
>>> related to:
>>> (a) A mechanism to capture, store, and normalize any type of security
>>> telemetry at extremely high rates.
>>> (b) Real time processing and application of enrichments
>>> (c) Efficient information storage
>>> (d) An interface that gives a security investigator a centralized view
>>> of data and alerts passed through the system.
>>> 
>>> RESOLVED, that the office of "Vice President, Apache Metron" be
>>> and hereby is created, the person holding such office to
>>> serve at the direction of the Board of Directors as the chair
>>> of the Apache Metron Project, and to have primary responsibility
>>> for management of the projects within the scope of
>>> responsibility of the Apache Metron Project; and be it further
>>> 
>>> RESOLVED, that the persons listed immediately below be and
>>> hereby are appointed to serve as the initial members of the
>>> Apache Metron Project:
>>> 
>>> 
>>> PPMC by Affiliation:
>>> 
>>> Hortonworks:
>>> Sheetal Dolas (sheetal_dolas)
>>> Ryan Merriman (rmerriman)
>>> Larry McCay (lmccay)
>>> P. Taylor Goetz (ptgoetz)
>>> Nick Allen (nickallen)
>>> David Lyle (lyle)
>>> George Vetticaden (gvetticaden)
>>> James Sirota (jsirota)
>>> Casey Stella (cstella)
>>> Michael Perez (mperez)
>>> Kiran Komaravolu (kirankom)
>>> Vinod Kumar Vavilapalli (vinodkv)
>>> 
>>> Cisco:
>>> Debo Dutta (ddutta)
>>> Discovery Gerdes (discovery)
>>> 
>>> Rackspace:
>>> Oskar Zabik (smogg)
>>> Andrew Hartnett (dev_warlord)
>>> Paul Kehrer (reaperhulk)
>>> Sean Schulte (sirsean)
>>> 
>>> B23:
>>> Mark Bittmann (mbittmann)
>>> Dave Hirko (dbhirko)
>>> Brad Kolarov (bjkolly)
>>> 
>>> Mantech:
>>> Charles Porter (cporter)
>>> Ray Urciuoli (rurcioli)
>>> 
>>> Fogbeam Labs:
>>> Phillip Rhodes (prhodes)
>>> 
>>> 
>>> 
>>> 
>>> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Casey Stella
>>> be appointed to the office of Vice President, Apache Metron, to
>>> serve in accordance with and subject to the direction of the
>>> Board of Directors and the Bylaws of the Foundation until
>>> death, resignation, retirement, removal or disqualification,
>>> or until a successor is appointed; and be it further
>>> 
>>> RESOLVED, that the initial Apache Metron PMC be and hereby is
>>> tasked with the creation of a set of bylaws intended to
>>> encourage open development and increased participation in the
>>> Apache Metron Project; and be it further
>>> 
>>> RESOLVED, that the Apache Metron Project be and hereby
>>> is tasked with the migration and rationalization of the Apache
>>> Incubator Metron podling; and be it further
>>> 
>>> RESOLVED, that all responsibilities pertaining to the Apache
>>> Incubator Metron podling encumbered upon the Apache Incubator
>>> Project are hereafter discharged.
>>> 
> 
> 


Re: [DISCUSS] Apache Metron podling Graduation

2017-03-26 Thread Debo Dutta (dedutta)

There are other folks from companies like Cisco who are also interested in the 
project. In fact Metron grew out of OpenSOC efforts from Cisco.

debo

On 3/26/17, 11:59 AM, "Dave Fisher"  wrote:



Sent from my iPhone

> On Mar 24, 2017, at 9:03 PM, John D. Ament  wrote:
> 
>> On Fri, Mar 24, 2017 at 11:52 PM P. Taylor Goetz wrote:
>> 
>> To be honest, I don't know, but I don't see it as a problem.
>> 
>> I will admit upfront that I am a Hortonworks employee, so there is
>> potential for bias. But I always try to wear the right hat, even when my
>> position may not please my employer.
>> 
>> As a mentor (to this and other podlings), I try to instill that ethic
>> during incubation and thereafter.
>> 
>> In the Metron community I see some that share that ethic. They selected a
>> PMC Chair that gets it and I trust will serve the project well in that role.
>> 
>> Yes, the project is Hortonworks-heavy. But I think they are doing things
>> right in terms of the Apache Way. I also intend to participate in the PMC
>> to help ensure it remains that way.
>> 
>> 
> Agreed - from what I have seen, Hortonworks as a company understands the
> Apache Way and is able to ensure that it is held in open source projects.
> I see no concerns with Metron graduating.

Sure, but the question is not about whether or not HortonWorks understands 
the Apache Way. It is what happens to Metron if HortonWorks is no longer 
interested in the project?

Given the experience with OpenOffice I think we need to always ask this.

Regards,
Dave

> 
> 
>> -Taylor
>> 
>> 
>>> On Mar 24, 2017, at 8:15 PM, Dave Fisher  wrote:
>>> 
>>> This is a list of 24 of which 12 are Hortonworks.
>>> 
>>> Can we assume that the other 12 of which 6 are PPMC are unaffiliated?
>>> 
>>> What percentage of the commits are coming from Hortonworks affiliated
>>> contributors?
>>> 
>>> Thanks,
>>> Dave
>>> 
>>>> On Mar 23, 2017, at 8:39 PM, Casey Stella wrote:
>>>>
>>>> Of course, very fair question.  Also, yes, we have 36 committers as all
>>>> PPMC members are committers.
>>>>
>>>> The affiliations are as follows:
>>>>
>>>> Hortonworks:
>>>> * Sheetal Dolas (sheetal_dolas)
>>>> * Larry McCay (lmccay)
>>>> * P. Taylor Goetz (ptgoetz)
>>>> * Ryan Merriman (rmerriman)
>>>> * James Sirota (jsirota)
>>>> * Casey Stella (cstella)
>>>> * David Lyle (lyle)
>>>> * Nick Allen (nickallen)
>>>> * George Vetticaden (gvetticaden)
>>>> * Vinod Kumar Vavilapalli (vinodkv)
>>>> * Kiran Komaravolu (kirankom)
>>>> * Michael Perez (mperez)
>>>>
>>>> Cisco:
>>>> * Debo Dutta (ddutta)
>>>> * Discovery Gerdes (discovery)
>>>>
>>>> Rackspace:
>>>> * Oskar Zabik (smogg)
>>>> * Andrew Hartnett (dev_warlord)
>>>> * Paul Kehrer (reaperhulk)
>>>> * Sean Schulte (sirsean)
>>>>
>>>> B23:
>>>> * Mark Bittmann (mbittmann)
>>>> * Dave Hirko (dbhirko)
>>>> * Brad Kolarov (bjkolly)
>>>>
>>>> Mantech:
>>>> * Charles Porter (cporter)
>>>> * Ray Urciuoli (rurcioli)
>>>>
>>>> Fogbeam Labs:
>>>> * Phillip Rhodes (prhodes)
>>>>
>>>>> On Thu, Mar 23, 2017 at 9:34 PM, P. Taylor Goetz wrote:
>>>>>
>>>>>> On Mar 23, 2017, at 7:38 PM, John D. Ament wrote:
>>>>>>
>>>>>>> On Thu, Mar 23, 2017 at 3:10 PM P. Taylor Goetz wrote:
>>>>>>>
>>>>>>> As a mentor, I fully support Metron's graduation. The community has come a
>>>>>>> long way, learned to make solid releases, build a sustainable community,
>>>>>>> and follow the Apache Way.
>>>>>>>
>>>>>>> One minor nit: The paddling status page is missing a few Apache IDs, but I
>>>>>>> see them in the resolution, so it's an easy fix.
>>>>>>>
>>>>>> Paddlings are always out of sync :-P
>>>>>>
>>>>>> One other nit - I'm not sure why you only have 6 committers.  I think you
>>>>>> mean 30 committers (all PMC members are committers).  Can you provide the
>>>>>> associations of PMC members and employers?
>>>>>
>>>>> Metron PPMC,
>>>>>
>>>>> Can you respond? I feel this is a valid question.
>>>>>
>>>>> -Taylor
>>>>>

Re: [DISCUSS] Apache Metron podling Graduation

2017-03-23 Thread Debo Dutta (dedutta)

+1

On 3/23/17, 11:32 AM, "Casey Stella"  wrote:

Hi Everyone,

The incubating Apache Metron community believes it is time to graduate
to TLP.

Apache Metron entered incubation in December of 2015. Since then, we've
overcome technical challenges to remove Category X dependencies, and
made 3 releases. Our most recent release contains binary convenience
artifacts. We are a very helpful and engaged community, ready to answer
all questions and feedback directed to us via the user list. Through our
time in incubation we've added a number of committers and promoted some
of them to PPMC membership. We are actively pursuing others. While we do
still have issues to address raised by means of the maturity model, all
projects are ongoing processes, and we believe we no longer need the
incubator to continue addressing these issues.

To inform the discussion, here is some basic project information:

Project status:
  http://incubator.apache.org/projects/metron.html

Project website:
  https://metron.incubator.apache.org/

Project documentation:
   https://cwiki.apache.org/confluence/display/METRON/Documentation

Maturity assessment:


https://cwiki.apache.org/confluence/display/METRON/Apache+Project+Maturity+Model

Community Vote to Graduate:


https://lists.apache.org/thread.html/540378e2773b1b2ce2af498e56a6435e547c04948a6b496b9a4ffc71@%3Cdev.metron.apache.org%3E

DRAFT of the board resolution is at the bottom of this email

Proposed PMC size: 24 members

Total number of committers: 6 members


516 commits on develop
34 contributors across all branches

dev list averaged ~650 msgs/month for the last 3 months


Resolution:

Establish the Apache Metron Project

WHEREAS, the Board of Directors deems it to be in the best
interests of the Foundation and consistent with the
Foundation's purpose to establish a Project Management
Committee charged with the creation and maintenance of
open-source software, for distribution at no charge to the
public, related to a security analytics platform for big data use cases.

NOW, THEREFORE, BE IT RESOLVED, that a Project Management
Committee (PMC), to be known as the "Apache Metron Project",
be and hereby is established pursuant to Bylaws of the
Foundation; and be it further

RESOLVED, that the Apache Metron Project be and hereby is
responsible for the creation and maintenance of software
related to:
(a) A mechanism to capture, store, and normalize any type of security
telemetry at extremely high rates.
(b) Real time processing and application of enrichments
(c) Efficient information storage
(d) An interface that gives a security investigator a centralized view
of data and alerts passed through the system.

RESOLVED, that the office of "Vice President, Apache Metron" be
and hereby is created, the person holding such office to
serve at the direction of the Board of Directors as the chair
of the Apache Metron Project, and to have primary responsibility
for management of the projects within the scope of
responsibility of the Apache Metron Project; and be it further

RESOLVED, that the persons listed immediately below be and
hereby are appointed to serve as the initial members of the
Apache Metron Project:


PPMC:
Mark Bittmann (mbittmann)
Sheetal Dolas (sheetal_dolas)
Debo Dutta (ddutta)
Discovery Gerdes (discovery)
Andrew Hartnett (dev_warlord)
Dave Hirko (dbhirko)
Paul Kehrer (reaperhulk)
Brad Kolarov (bjkolly)
Kiran Komaravolu (kirankom)
Larry McCay (lmccay)
P. Taylor Goetz (ptgoetz)
Ryan Merriman (rmerriman)
Michael Perez (mperez)
Charles Porter (cporter)
Phillip Rhodes (prhodes)
Sean Schulte (sirsean)
James Sirota (jsirota)
Casey Stella (cstella)
Ray Urciuoli (rurcioli)
Vinod Kumar Vavilapalli (vinodkv)
George Vetticaden (gvetticaden)
Oskar Zabik (smogg)
David Lyle (lyle)
Nick Allen (nickallen)



NOW, THEREFORE, BE IT FURTHER RESOLVED, that Casey Stella
be appointed to the office of Vice President, Apache Metron, to
serve in accordance with and subject to the direction of the
Board of Directors and the Bylaws of the Foundation until
death, resignation, retirement, removal or disqualification,
or until a successor is appointed; and be it further

RESOLVED, that the initial Apache Metron PMC be and hereby is
tasked with the creation of a set of bylaws intended to
encourage open development and increased participation in the
Apache Metron Project; and be it further

RESOLVED, that the Apache Metron Project be and hereby
is tasked with the migration and

Re: [VOTE] Accept OpenWhisk into the Apache Incubator

2016-11-17 Thread Debo Dutta (dedutta)

+1




On 11/17/16, 7:22 AM, "sa3r...@gmail.com on behalf of Sam Ruby" 
 wrote:

>Now that the discussion thread on the OpenWhisk Proposal has died
>down, please take a moment to vote on accepting OpenWhisk into the
>Apache Incubator.
>
>The ASF voting rules are described at:
>   http://www.apache.org/foundation/voting.html
>
>A vote for accepting a new Apache Incubator podling is a majority vote
>for which only Incubator PMC member votes are binding.
>
>Votes from other people are also welcome as an indication of people's
>enthusiasm (or lack thereof).
>
>Please do not use this VOTE thread for discussions.
>If needed, start a new thread instead.
>
>This vote will run for at least 72 hours. Please VOTE as follows
>[] +1 Accept OpenWhisk into the Apache Incubator
>[] +0 Abstain.
>[] -1 Do not accept OpenWhisk into the Apache Incubator because ...
>
>The proposal is listed below, but you can also access it on the wiki:
>   https://wiki.apache.org/incubator/OpenWhiskProposal
>
>- Sam Ruby
>
>= OpenWhisk Proposal =
>
>OpenWhisk is an open source, distributed Serverless computing platform
>able to execute application logic (Actions) in response to events
>(Triggers) from external sources (Feeds) or HTTP requests governed by
>conditional logic (Rules). It provides a programming environment
>supported by a REST API-based Command Line Interface (CLI) along with
>tooling to support packaging and catalog services.
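
The relationship between Triggers, Rules and Actions described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `RuleTable`, `add_rule`, `fire` and the `greet` action are hypothetical names invented for this example and are not part of the OpenWhisk API or CLI.

```python
from typing import Any, Callable

Event = dict[str, Any]
Action = Callable[[Event], Any]


class RuleTable:
    """Toy model: a rule binds a named trigger to an action."""

    def __init__(self) -> None:
        self._rules: dict[str, list[Action]] = {}

    def add_rule(self, trigger: str, action: Action) -> None:
        """Associate `action` with `trigger`; a trigger may have many actions."""
        self._rules.setdefault(trigger, []).append(action)

    def fire(self, trigger: str, event: Event) -> list[Any]:
        """Run every action bound to `trigger` and collect the results."""
        return [action(event) for action in self._rules.get(trigger, [])]


def greet(event: Event) -> str:
    """A stateless, single-purpose action: it depends only on its input event."""
    return f"hello, {event.get('name', 'world')}"


rules = RuleTable()
rules.add_rule("user_signed_up", greet)
print(rules.fire("user_signed_up", {"name": "whisk"}))
```

In this sketch a feed would simply be whatever external source calls `fire`; a trigger with no matching rule runs nothing and returns an empty list.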
>
>Champion: Sam Ruby, IBM
>
>Mentors:
> * Felix Meschberger, Adobe
> * Isabel Drost-Fromm, Elasticsearch GmbH
> * Sergio Fernández, Redlink GmbH
>
>== Background ==
>
>Serverless computing is the evolutionary next stage in Cloud computing
>carrying further the abstraction offered to software developers using
>Container-based operating system virtualization. The Serverless
>paradigm enables programmers to just “write” functional code and not
>worry about having to configure any aspect of a server needed for
>execution. Such Serverless functions are single-purpose and stateless;
>they respond to event-driven data sources and can be scaled on demand.
>
>The OpenWhisk project offers a truly open, highly scalable, performant
>distributed Serverless platform leveraging other open technologies
>along with a robust programming model, catalog of service and event
>provider integrations and developer tooling.
>Specifically, every architectural component service of the OpenWhisk
>platform (e.g., Controller, Invokers, Messaging, Router, Catalog, API
>Gateway, etc.) is designed to be run and scaled as a Docker
>container. In addition, OpenWhisk uniquely leverages aspects of Docker
>engine to manage, load balance and scale supported OpenWhisk runtime
>environments (e.g., JavaScript, Python, Swift, Java, etc.), that run
>Serverless functional code within Invoker compute instances, using
>Docker containers.
>
>OpenWhisk's containerized design tenets not only allow it to be
>hosted in various IaaS and PaaS Cloud platforms that support Docker
>containers, but also achieve the high expectation of the Serverless
>computing experience by masking all aspects of traditional resource
>specification and configuration from the end user simplifying and
>accelerating Cloud application development.
>In order to enable HTTP requests as a source of events, and thus the
>creation of Serverless microservices that expose REST APIs, OpenWhisk
>includes an API Gateway that performs tasks like security, request
>routing, throttling, and logging.
>
>== Rationale ==
>
>Serverless computing is in the very early stages of the technology
>adoption curve and has great promise in enabling new paradigms in
>event-driven application development, but current implementation
>efforts are fractured as most are tied to specific Cloud platforms and
>services. Having an open implementation of a Serverless platform, such
>as OpenWhisk, available and governed by an open community like Apache
>could accelerate growth of this technology, as well as encourage
>dialog and interoperability.
>
>Having the ASF accept and incubate OpenWhisk would provide a clear
>signal to developers interested in Serverless and its future that they
>are welcome to participate and contribute in its development, growth
>and governance.
>
>In addition, there are numerous projects already at the ASF that would
>provide a natural fit to the API-centric, event-driven programming
>model that OpenWhisk sees as integral to a Serverless future. In fact,
>any project that includes a service that can produce or consume
>actionable events could become an integration point with
>OpenWhisk-enabled functions. Apache projects that manage programming
>languages and (micro) service runtimes could become part of the
>OpenWhisk set of supported runtime environments for functions. Device
>and API gateways would provide natural event sources that could
>utilize OpenWhisk functions to process, store and analyze vast amounts
>of information
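
Note on the programming model in the OpenWhisk proposal quoted above: Actions
run in response to Triggers, and Rules decide which Action a Trigger invokes.
The following is only a minimal in-process sketch of that relationship. The
names Rule, Dispatcher and greet are hypothetical and are not part of the
OpenWhisk API; the real platform runs Actions in Docker containers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Event = Dict[str, object]
Action = Callable[[Event], object]  # a stateless, single-purpose function


@dataclass
class Rule:
    """Binds a named trigger to an action, with an optional condition."""
    trigger: str
    action: Action
    condition: Optional[Callable[[Event], bool]] = None


@dataclass
class Dispatcher:
    """Toy stand-in for the platform (hypothetical, not OpenWhisk itself)."""
    rules: List[Rule] = field(default_factory=list)

    def fire(self, trigger: str, event: Event) -> List[object]:
        """Run every action whose rule matches the fired trigger."""
        results = []
        for rule in self.rules:
            if rule.trigger != trigger:
                continue
            if rule.condition is not None and not rule.condition(event):
                continue
            results.append(rule.action(event))
        return results


def greet(event: Event) -> str:
    """Example action: builds a greeting from the event payload."""
    return f"hello {event['name']}"


dispatcher = Dispatcher()
dispatcher.rules.append(
    Rule("user-created", greet, condition=lambda e: bool(e.get("name")))
)
results = dispatcher.fire("user-created", {"name": "ada"})
```

Keeping the Rule separate from the Action means the same function can be
bound to several event sources without changing its code.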

Re: [VOTE] Accept RocketMQ into the Apache Incubator

2016-11-10 Thread Debo Dutta (dedutta)

+1 non binding 




On 11/10/16, 10:07 AM, "Myrle Krantz"  wrote:

>+1 non binding 
>
>-Myrle
>
>> On 10 Nov 2016, at 19:04, John D. Ament  wrote:
>> 
>> +1
>> 
>>> On Nov 10, 2016 11:41, "Bruce Snyder"  wrote:
>>> 
>>> Subsequent to the discussion on RocketMQ, I would like to call a vote on
>>> accepting RocketMQ into the Apache Incubator.
>>> 
>>> [ ] +1 Accept RocketMQ into the Apache Incubator
>>> [ ] +0 Abstain.
>>> [ ] -1 Do not accept RocketMQ into the Apache Incubator because...
>>> 
>>> The proposal is pasted below and also available in the wiki here:
>>>    https://wiki.apache.org/incubator/RocketMQProposal
>>> 
>>> Also, the ASF voting guidelines are available here:
>>>    http://www.apache.org/foundation/voting.html
>>> 
>>> Thanks,
>>> 
>>> Bruce
>>> 
>>> 
>>> = RocketMQ Proposal =
>>> 
>>> == Abstract ==
>>> 
>>> RocketMQ is a fast, low latency, reliable, scalable, distributed, easy to
>>> use message-oriented middleware, especially for processing large amounts of
>>> streaming data.
>>> 
>>> == Proposal ==
>>> 
>>> RocketMQ provides a message model including both pub/sub and P2P and it
>>> supports both reliable FIFO and strict sequential message queues. It also
>>> has the ability to accumulate a billion messages in a single queue, and
>>> provides mobile, internet-friendly protocols such as MQTT and HTTP.
>>> RocketMQ also supports the ability to load data into Apache Hadoop for
>>> offline storage or to handle stream processing for Apache Storm.
>>> 
>>> == Background ==
>>> 
>>> RocketMQ was developed at Alibaba in 2011 and has been used in production
>>> there since that time. It can process the large amounts of events generated
>>> by various systems and provides a common repository for many types of
>>> consumers to access and process those events. RocketMQ also handles dozens
>>> of types of events including trade order process, search, social network
>>> activity stream and data pipeline. Every day at Alibaba, RocketMQ clusters
>>> process more than 500 billion events. The Alibaba Group also uses RocketMQ
>>> to provide message services for more than 3000 core applications.
>>> 
>>> RocketMQ was developed to meet Alibaba's particular use cases to provide
>>> low latency message delivery and high throughput message sending. Alibaba
>>> has also created its cornerstone product derived from RocketMQ, a Platform
>>> as a Service (PaaS) product named the Alibaba Cloud Platform (
>>> https://intl.aliyun.com/).  More than 100 companies use the RocketMQ open
>>> source version today. We believe RocketMQ can benefit more people, so we
>>> would like to share it via the ASF and begin developing a community of
>>> developers and users via The Apache Way.
>>> 
>>> 
>>> == Rationale ==
>>> 
>>> As described in the background, many organizations can benefit from a low
>>> latency, reliable, high throughput, distributed platform. Its usage is
>>> varied and we expect many new use cases to emerge. RocketMQ provides many
>>> features to support many use cases from enterprise application integration,
>>> to web applications to the flourishing of IoT applications.
>>> 
>>> == Current Status ==
>>> 
>>> === Meritocracy ===
>>> 
>>> The intent of this proposal is to start building a diverse developer and
>>> user community around RocketMQ following the ASF meritocracy model. Since
>>> RocketMQ was open sourced, we have solicited contributions via the website
>>> and presentations given to user groups and technical audiences and have
>>> received positive feedback and contributions including clients for C++ and
>>> .NET. We plan to continue this support for new contributors and work with
>>> those who contribute significantly to the project to encourage them to
>>> become committers.
>>> 
>>> === Community ===
>>> 
>>> RocketMQ is currently being developed by engineers working for Alibaba
>>> where it is highly used in a production environment. We also have active
>>> users in or have received contributions from a diverse set of companies
>>> including CMBC (China Minsheng Bank), Schneider Electric (
>>> http://www.schneider-electric.com/), the China Railway Ministry official
>>> ticketing website, China Union, Sina, Umei (http://sh.jumei.com), Chinese
>>> Academy of Sciences and many more. We hope to grow the base of contributors
>>> by inviting all those who offer significant contributions and excel through
>>> the use of The Apache Way. Contributions from outside of Alibaba are now
>>> being received by the RocketMQ project, including a dashboard, the
>>> flume-rocketmq module, the storm-rocketmq module and more.
>>> 
>>> To further this goal, the project currently makes use of GitHub project
>>> features as well as a public mailing list via Google Groups.
>>> 
>>> 
>>> === Core Developers ===
>>> 
>>> RocketMQ is currently being developed by engineers from Alibaba and
>>> Yeahmobi: Xiaorui Wang, Von Gosling, Jiangwei Jiang, Xinyu Zhou, Zhanhui
>>> Li.
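
Note on the message model in the RocketMQ proposal quoted above: it mentions
pub/sub delivery and FIFO queues. The sketch below shows those two ideas only,
under the assumption of a single process and in-memory storage. The Topic
class and its methods are hypothetical and do not reflect the RocketMQ client
API.

```python
from collections import deque
from typing import Deque, Dict, List


class Topic:
    """Toy pub/sub topic: every subscriber gets its own FIFO queue."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = {}

    def subscribe(self, name: str) -> None:
        """Register a subscriber; it only sees messages published afterwards."""
        self._queues.setdefault(name, deque())

    def publish(self, message: str) -> None:
        """Fan out: each subscriber receives a copy, in publish order."""
        for queue in self._queues.values():
            queue.append(message)

    def poll(self, name: str) -> List[str]:
        """Drain the subscriber's queue, oldest message first."""
        queue = self._queues[name]
        drained = list(queue)
        queue.clear()
        return drained


topic = Topic()
topic.subscribe("billing")
topic.subscribe("search")
topic.publish("order-1")
topic.publish("order-2")
billing_messages = topic.poll("billing")
search_messages = topic.poll("search")
```

A point-to-point (P2P) queue would instead hand each message to exactly one
consumer; here every subscriber gets its own copy.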

Re: [VOTE] Accept Spot into the Apache Incubator

2016-09-22 Thread Debo Dutta (dedutta)

+1

Sent from my iPhone

> On Sep 22, 2016, at 12:57 AM, Tom White  wrote:
> 
> +1
> 
> Tom
> 
>> On Tue, Sep 20, 2016 at 7:15 PM, Doug Cutting  wrote:
>> Following the discussion thread, I would like to call a vote on
>> accepting Spot into the Apache Incubator.
>> 
>> [] +1 Accept Spot into the Apache Incubator
>> [] +0 Abstain.
>> [] -1 Do not accept Spot into the Apache Incubator because ...
>> 
>> This vote will run for the usual 72 hours.
>> 
>> The proposal is attached, but you can also access it on the wiki:
>>   https://wiki.apache.org/incubator/SpotProposal
>> 
>> Thanks,
>> 
>> Doug
>> 
>> = SpotProposal =
>> 
>> == Abstract ==
>> 
>> Spot is an open source platform for network telemetry (packet, flow,
>> and proxy at the moment) built on an open data model and Apache
>> Hadoop.
>> 
>> == Proposal ==
>> 
>> Spot (formerly Open Network Insight, or ONI) is an open source
>> solution for network telemetry (packet, flow, and proxy at the moment)
>> built on an open data model and Apache Hadoop. It provides ingestion
>> and transformation of binary data, scalable machine learning, and
>> interactive visualization for identifying threats in network flows and
>> DNS packets.
>> 
>> Spot has a pluggable architecture that can accommodate multiple open
>> data models. Although cybersecurity/network-intrusion analysis is the
>> initial use case for Spot, we are actively encouraging the
>> contribution of new models that will enable other adjacent
>> applications, such as fraud detection or IT-operational analytics such
>> as performance and health monitoring. Because these models are open,
>> users maintain control of their own data.
>> 
>> More information on Spot can be found at the existing project website
>> at http://open-network-insight.org/.
>> 
>> == Background ==
>> 
>> It almost goes without saying that cybersecurity is an acute and
>> paramount concern globally, for organizations of all types and
>> sizes. Fortunately, thanks to the availability of massively scalable
>> (in the PBs) data infrastructure, security professionals can now make
>> authentically data-driven decisions about how they protect their
>> assets. For example, records of network traffic, captured as network
>> flows, are often stored and analyzed for use in network management,
>> and this same information can provide valuable insights into network
>> vulnerabilities.
>> 
>> Cybersecurity is just one example, however: There are other examples
>> of adjacent use cases, such as user fraud detection or IT-operations
>> analytics, that would benefit from the combination of Spot
>> functionality and PB-scale data sets for analysis.
>> 
>> == Rationale ==
>> 
>> Although cybersecurity is its initial use case/data model, Spot is
>> intended to more generally tackle the dual challenges of facilitating
>> the development of big data-driven analytic solutions, while helping
>> vendors avoid having to create one-off infrastructure for each use
>> case. Spot will eliminate issues related to vendor data models that
>> create silos between solutions, and that make it difficult for users
>> to consume these innovations from multiple vendors. In summary, Spot
>> will accelerate the development of new massively scalable analytic
>> applications that give users more flexibility, and more choices.
>> 
>> As an initial effort, we are now seeking to build an ecosystem of
>> developers, data scientists, and security professionals to make Spot
>> the open, community-driven, cybersecurity platform standard it needs
>> to become. By bringing Spot to Apache, we hope to galvanize these
>> groups to cooperate in this highly matrixed effort, and to build a
>> global, and diverse, Spot community.
>> 
>> == Initial Goals ==
>> 
>> * Move the existing codebase, website, documentation, and mailing lists
>>   to Apache-hosted infrastructure
>> * Work with the infrastructure team to implement and approve our build
>>   and testing workflows in the context of the ASF
>> * Incremental development and releases per Apache guidelines
>> 
>> == Current Status ==
>> 
>> === Releases ===
>> 
>> Spot has undergone one public release (1.0). This initial release was
>> not performed in the typical ASF fashion; we will adopt the ASF source
>> release process upon joining the incubator.
>> 
>> === Source ===
>> 
>> Spot’s source, including core platform and associated submodules, is
>> currently hosted in several GitHub repositories under the indicated
>> licenses:
>> 
>> * Core (Apache License 2.0)
>> * Oni-ingest (Apache License 2.0)
>> * Oni-ml (Apache License 2.0)
>> * Oni-oa (BSD & MIT)
>> * Oni-setup (Apache License 2.0)
>> * Oni-nfdump (BSD)
>> * Oni-lda-c (GNU General Public License version 2)
>> 
>> The repositories will be transitioned to Apache’s git hosting during
>> incubation.  Issues related to GPL code will be resolved during
>> incubation.
>> 
>> 
>> === Issue Tracking ===
>> 
>> Spot’s bug and feature tracking is hosted on

Re: [VOTE] Releasing Apache Metron 0.2.0BETA-RC3

2016-07-27 Thread Debo Dutta (dedutta)

+1

Sent from my iPhone

> On Jul 27, 2016, at 9:32 AM, James Sirota  wrote:
> 
> This release is exactly the same as RC2, but the Mozilla licensed file was 
> removed so it doesn’t cause problems for us on the incubator general boards. 
> We no longer use it so we just removed it.
> 
> This is a call to vote on releasing Apache Metron 0.2.0BETA-RC3 incubating
> 
> Full list of changes in this release:
> 
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.2.0BETA-RC3-incubating/CHANGES
> 
> The tag/commit to be voted upon is Metron_0.2.0BETA_rc3:
> 
> https://git-wip-us.apache.org/repos/asf?p=incubator-metron.git;a=commit;h=75642001803396e8884385b0fc297a2312ead3eb
> 
> The source archive being voted upon can be found here:
> 
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.2.0BETA-RC3-incubating/apache-metron-0.2.0BETA-RC3-incubating.tar.gz
> 
> Other release files, signatures and digests can be found here:
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.2.0BETA-RC3-incubating/
> 
> The release artifacts are signed with the following key:
> 
> https://git-wip-us.apache.org/repos/asf?p=incubator-metron.git;a=blob;f=KEYS;h=c11bcb9b7385b4d155501aa097afd890f1070a18;hb=75642001803396e8884385b0fc297a2312ead3eb
> 
> 
> Please vote on releasing this package as Apache Metron 0.2.0BETA-RC3 
> incubating
> 
> When voting, please list the actions taken to verify the release.
> Recommended build validation and verification instructions are posted here:
> https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> 
> This vote will be open for at least 72 hours.
> 
> [ ] +1 Release this package as Apache Metron 0.2.0BETA-RC3 incubating
> [ ] 0 No opinion
> [ ] -1 Do not release this package because...
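
Note on verifying the release: the vote above asks voters to list the actions
taken to verify the artifacts and points to published signatures and digests.
One such check, comparing a downloaded archive against a published digest,
could be sketched as follows. The use of SHA-512 and the function names are
assumptions; the thread does not say which digest algorithm was published,
and signature verification (for example with GPG and the KEYS file) is a
separate step not shown here.

```python
import hashlib
import hmac
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path: Path, published_hex: str) -> bool:
    """Compare a file's digest with a published hex string.

    Whitespace and letter case in the published value are ignored, since
    digest files are often wrapped or upper-cased.
    """
    normalized = "".join(published_hex.split()).lower()
    return hmac.compare_digest(sha512_of(path), normalized)
```

A voter would call digest_matches with the local path of the downloaded
source archive and the contents of the published digest file.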


Re: [VOTE] Accept DistributedLog into the Apache Incubator

2016-06-21 Thread Debo Dutta (dedutta)

+1




On 6/20/16, 10:11 PM, "Sijie Guo"  wrote:

>Hello All,
>
>Following the discussion thread, I would like to call a VOTE on accepting
>DistributedLog into the Apache Incubator.
>
>[] +1 Accept DistributedLog into the Apache Incubator
>[] +0 Abstain.
>[] -1 Do not accept DistributedLog into the Apache Incubator because ...
>
>This vote will be open for at least 72 hours.
>
>The proposal follows, you can also access the wiki page:
>https://wiki.apache.org/incubator/DistributedLogProposal
>
>Here is my +1.
>
>Thanks,
>Sijie
>
>= Abstract =
>DistributedLog is a high-performance replicated log service. It offers
>durability, replication and strong consistency, which provides a
>fundamental building block for building reliable distributed systems, e.g.
>replicated-state-machines, general pub/sub systems, distributed databases,
>distributed queues, etc.
>
>See “Building Distributedlog - Twitter’s high performance replicated log
>service” for details:
>https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service
>
>= Proposal =
>We propose to contribute DistributedLog codebase and associated artifacts
>(e.g. documentation, web-site content etc.) to the Apache Software
>Foundation with the intent of forming a productive, meritocratic and open
>community around DistributedLog’s continued development, according to the
>‘Apache Way’.
>
>= Background =
>Engineers at Twitter began developing DistributedLog in early 2013.
>DistributedLog is described in a Twitter engineering blog post and
>presented at the Messaging Meetup in Sep 2015. It was released as an
>Apache-licensed open-source project on GitHub in May 2016.
>
>DistributedLog is a high-performance replicated log service, which provides
>simple stream-oriented abstractions over log-segments and offers
>durability, replication and strong consistency for building reliable
>distributed systems. The features offered by DistributedLog include:
>
> * Simple high-level, stream oriented interface
> * Naming and metadata scheme for managing streams and other entities
> * Log data management policies, including data segmentation and data
>retention
> * Fast write pipeline leveraging batching and compression
> * Fast read mechanism leveraging long-poll and read-ahead caching
> * Service tiers supporting writer fan-in and reader fan-out
> * Geo-replicated logs
>
>DistributedLog’s most important benefit is high performance with a strong
>durability guarantee, making it extremely appropriate for running different
>workloads from distributed database journaling to real-time stream
>computing. Its modern, layered architecture makes it easy to run the
>service tiers in multi-tenant datacenter environments such as Apache Mesos
>or cloud environments such as EC2.
>
>= Rationale =
>DistributedLog is designed to provide core fundamental features like
>high-performance, durability and strong consistency to anyone who is
>building reliable distributed systems, in a simple and efficient way.
>
>We believe that the ASF is the right venue to foster an open-source
>community around DistributedLog’s development. We expect that
>DistributedLog will benefit from collaboration with related Apache
>projects, and under the auspices of the ASF will attract talented
>contributors who will push DistributedLog’s development forward at a faster
>pace.
>
>We believe that the timing is right for DistributedLog’s development to
>move to the ASF: DistributedLog has already run in production at Twitter
>for 3 years and served various workloads including a distributed database
>journal, reliable cross datacenter replication, search ingestion,
>and general pub/sub messaging. The project is stable. We are excited to see
>where an ASF-based community can take DistributedLog.
>
>= Current Status =
>DistributedLog is a stable project that has been used in production at
>Twitter for 3 years. The source code is public at github.com/twitter, which
>will seed the Apache git repository.
>
>= Meritocracy =
>We understand the central importance of meritocracy to the Apache Way. We
>will work to establish a welcoming, fair and meritocratic community.
>Several companies have already expressed interest in this project, and we
>intend to invite additional developers to participate. We look forward to
>growing a rich user and developer community.
>
>= Community =
>There is a large need for a performant replicated log service for
>applications such as distributed databases, distributed transactional
>systems, replicated-state-machines and pub/sub messaging/queuing. We want
>to attract more developers to the project, and we believe that the ASF’s
>open and meritocratic philosophy will help us with this. We note the
>success of other similar projects already part of the ASF, like Kafka.
>
>= Core Developers =
>DistributedLog is actively developed within Twitter. Most of the developers
>are from Twitter. Many of them are committers or PMC members of Apache

Re: [VOTE] Accept PredictionIO into the Apache Incubator

2016-05-23 Thread Debo Dutta (dedutta)

+1




On 5/23/16, 3:22 PM, "Andrew Purtell"  wrote:

>Since discussion on the matter of PredictionIO has died down, I would like
>to call a VOTE
>on accepting PredictionIO into the Apache Incubator.
>
>Proposal: https://wiki.apache.org/incubator/PredictionIO
>
>[ ] +1 Accept PredictionIO into the Apache Incubator
>[ ] +0 Abstain
>[ ] -1 Do not accept PredictionIO into the Apache Incubator, because ...
>
>This vote will be open for at least 72 hours.
>
>My vote is +1 (binding)
>
>--
>
>PredictionIO Proposal
>
>Abstract
>
>PredictionIO is an open source Machine Learning Server built on top of a
>state-of-the-art open source stack that enables developers to manage and
>deploy production-ready predictive services for various kinds of machine
>learning tasks.
>
>Proposal
>
>The PredictionIO platform consists of the following components:
>
>   * PredictionIO framework - provides the machine learning stack for
> building, evaluating and deploying engines with machine learning
> algorithms. It uses Apache Spark for processing.
>
>   * Event Server - the machine learning analytics layer for unifying events
> from multiple platforms. It can use Apache HBase or any JDBC backends
> as its data store.
>
>The PredictionIO community also maintains a Template Gallery, a place to
>publish and download (free or proprietary) engine templates for different
>types of machine learning applications, and is a complementary part of the
>project. At this point we exclude the Template Gallery from the proposal,
>as it has a separate set of contributors and we’re not familiar with an
>Apache approved mechanism to maintain such a gallery.
>
>Background
>
>PredictionIO was started with a mission to democratize and bring machine
>learning to the masses.
>
>Machine learning has traditionally been a luxury for big companies like
>Google, Facebook, and Netflix. There are ML libraries and tools lying
>around the internet but the effort of putting them all together as a
>production-ready infrastructure is a very resource-intensive task that is
>barely within reach of individuals or small businesses.
>
>PredictionIO is a production-ready, full stack machine learning system that
>allows organizations of any scale to quickly deploy machine learning
>capabilities. It comes with official and community-contributed machine
>learning engine templates that are easy to customize.
>
>Rationale
>
>As usage and the number of contributors to PredictionIO have grown bigger and
>more diverse, we have sought an independent framework for the project
>to keep thriving. We believe the Apache foundation is a great fit. Joining
>Apache would ensure that tried and true processes and procedures are in
>place for the growing number of organizations interested in contributing
>to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
>PredictionIO was built on top of several Apache projects (HBase, Spark,
>Hadoop). We are familiar with the Apache process and believe that the
>democratic and meritocratic nature of the foundation aligns with the
>project goals.
>
>Initial Goals
>
>The initial milestones will be to move the existing codebase to Apache and
>integrate with the Apache development process. Once this is accomplished,
>we plan for incremental development and releases that follow the Apache
>guidelines, as well as growing our developer and user communities.
>
>Current Status
>
>PredictionIO has undergone nine minor releases and many patches.
>PredictionIO is being used in production by Salesforce.com as well as many
>other organizations and apps. The PredictionIO codebase is currently
>hosted at GitHub, which will form the basis of the Apache git repository.
>
>Meritocracy
>
>We plan to invest in supporting a meritocracy. We will discuss the
>requirements in an open forum. We intend to invite additional developers
>to participate. We will encourage and monitor community participation so
>that privileges can be extended to those that contribute.
>
>Community
>
>Acceptance into the Apache foundation would bolster the already strong
>user and developer community around PredictionIO. That community includes
>many contributors from various other companies, and an active mailing list
>composed of hundreds of users.
>
>Core Developers
>
>The core developers of our project are listed in our contributors and
>initial PPMC below. Though many are employed at Salesforce.com, there are
>also engineers from ActionML, and independent developers.
>
>Alignment
>
>The ASF is the natural choice to host the PredictionIO project as its goal
>is democratizing Machine Learning by making it more easily accessible to
>every user/developer. PredictionIO is built on top of several top level
>Apache projects as outlined above.
>
>Known Risks
>
>Orphaned Products
>
>PredictionIO has a solid and growing community. It is deployed on
>production environments by companies of all sizes to run various kinds of
>predictive engines.
>
>In

Re: [DISCUSS] PredictionIO incubation proposal

2016-05-17 Thread Debo Dutta (dedutta)

Thx a lot Henry. Would love to. 

Sent from my iPhone

> On May 17, 2016, at 2:19 PM, Henry Saputra  wrote:
> 
> You are welcome, and great to have you as one of the mentors for the
> PredictionIO podling.
> 
> Should be a fun project to be part of =)
> 
> - Henry
> 
>> On Tue, May 17, 2016 at 2:14 PM, Suneel Marthi  wrote:
>> 
>> Thanks Henry
>> 
>> On Tue, May 17, 2016 at 5:11 PM, Henry Saputra 
>> wrote:
>> 
>>> As mentor, you will have karma to commit to the source repository.
>>> 
>>> As you probably know, the initial committers and mentors will form the
>>> initial PPMCs for the podling.
>>> Hopefully for day to day operations you should not need to have
>>> distinction of committer vs mentors anymore.
>>> 
>>> You do not have to be listed as committer for the proposal.
>>> 
>>> - Henry
>>> 
>>>> On Tue, May 17, 2016 at 1:57 PM, Suneel Marthi wrote:
>>>>
>>>> Thanks for having me as a mentor for PIO.  I would like to be added to the
>>>> initial list of committers and am looking to actively participate in the
>>>> development too. I am not sure if my being a mentor automatically grants me
>>>> the 'commit' karma.
>>>>
>>>> It's already been suggested earlier in this thread by Roman and
>>>> Jean-Baptiste that the project needs to be decoupled from Spark and
>>>> integrated with Beam.  It would be good to reduce the reliance on
>>>> Spark-Submit from what I have seen of the project so far. But let's not
>>>> talk architecture and design here when the project's not in incubator yet.
>>>> :)
>>>>
>>>> On Tue, May 17, 2016 at 4:09 PM, Henry Saputra <henry.sapu...@gmail.com>
>>>> wrote:
>>>>
>>>>> Cool, this will make code grant process to be easier =)
>>>>>
>>>>> The initial committers and mentors look great.
>>>>> I am sure more will come as contributions start pouring in to the project.
>>>>>
>>>>> Looking forward for the VOTE thread soon.
>>>>>
>>>>> - Henry
>>>>>
>>>>>> On Mon, May 16, 2016 at 12:07 PM, Simon Chan wrote:
>>>>>>
>>>>>> Yes, it includes everyone who previously contributed code from
>>>>>> PredictionIO before the acquisition and still want to be involved in
>>>>>> the project.
>>>>>>
>>>>>> We may have missed "Alex Merritt", going to add him to the list soon.
>>>>>>
>>>>>> Simon
>>>>>>
>>>>>> On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <smar...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> I do have a question about the proposed list of committers.
>>>>>>>
>>>>>>> Does the list also include all of those folks who were with
>>>>>>> PredictionIO (and had contributed to the project) and then chose to
>>>>>>> leave when PIO was acquired by Salesforce?
>>>>>>>
>>>>>>> On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré
>>>>>>> <j...@nanthrax.net> wrote:
>>>>>>>
>>>>>>>> By the way, we have some discussion about integrating Zeppelin with
>>>>>>>> Beam ;)
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> JB
>>>>>>>>
>>>>>>>>> On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
>>>>>>>>>
>>>>>>>>> Super excited to see this proposal! This will finally allow us to
>>>>>>>>> have an ASF managed backend for next generation data-driven apps
>>>>>>>>> that I see emerging quite rapidly.
>>>>>>>>>
>>>>>>>>> The proposal looks great to me (although I'd recommend calling Scala
>>>>>>>>> as an implementation language more prominently since it may attract
>>>>>>>>> additional developers with affinity to it).
>>>>>>>>>
>>>>>>>>> I do have two questions about technology:
>>>>>>>>>    1. do you think it would be possible to leverage Apache Beam
>>>>>>>>>       (incubating) for abstracting away dependency on execution
>>>>>>>>>       frameworks? My understanding is that PredictionIO currently
>>>>>>>>>       only runs on Spark.
>>>>>>>>>    2. is there a potential integration with Apache Zeppelin possible?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman.
>>>>>>>>>
>>>>>>>>> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <apurt...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> It is my pleasure to propose the PredictionIO project for
>>>>>>>>>> incubation at the Apache Software Foundation.
>>>>>>>>>>
>>>>>>>>>> PredictionIO is a popular open source Machine Learning Server built
>>>>>>>>>> on top of a state-of-the-art open source stack, including several
>>>>>>>>>> Apache technologies, that enables developers to manage and deploy
>>>>>>>>>> production-ready predictive services for various kinds of machine
>>>>>>>>>> learning tasks, with more than 400 production deployments around
>>>>>>>>>> the world and a growing contributor
Re: [DISCUSS] PredictionIO incubation proposal

2016-05-17 Thread Debo Dutta (dedutta)

Also some of us have built something similar and would be happy to help
https://github.com/CiscoSystems/cognitive 

debo




On 5/17/16, 12:58 PM, "Nick Pentreath"  wrote:

>Hi there
>
>I'm glad to see the proposal to incubate PredictionIO. In my previous life
>as a startup co-founder, I kept a close eye on the project, and it would be
>fantastic to see it become an Apache incubating project!
>
>The folks working on Apache Spark and Apache SystemML (incubating) here at
>IBM are excited about the possibilities for integrating PredictionIO and
>SystemML (Mike Dusenberry is a committer on that project), as well
>as further improving Spark integration (I'm a PMC member on that project).
>
>Mike and I, together with Luciano (who is a mentor on this proposal) would
>like to volunteer our services as initial committers, if that is agreeable.
>
>Kind regards
>Nick
>mln...@apache.org
>
>
>
>>
>> -- Forwarded message --
>> From: Andrew Purtell 
>> To: "general@incubator.apache.org" 
>> Cc:
>> Date: Fri, 13 May 2016 13:41:38 -0700
>> Subject: [DISCUSS] PredictionIO incubation proposal
>> Greetings,
>>
>> It is my pleasure to propose the PredictionIO project for incubation at
>> the Apache Software Foundation.
>>
>> PredictionIO is a popular open source Machine Learning Server built on top
>> of a state-of-the-art open source stack, including several Apache
>> technologies, that enables developers to manage and deploy production-ready
>> predictive services for various kinds of machine learning tasks, with more
>> than 400 production deployments around the world and a growing contributor
>> community.
>>
>>
>> The text of the proposal is included below and is also available at
>> https://wiki.apache.org/incubator/PredictionIO
>>
>> Best regards,
>> Andrew Purtell
>>
>>
>> = PredictionIO Proposal =
>>
>> === Abstract ===
>> PredictionIO is an open source Machine Learning Server built on top of a
>> state-of-the-art open source stack that enables developers to manage and
>> deploy production-ready predictive services for various kinds of machine
>> learning tasks.
>>
>> === Proposal ===
>> The PredictionIO platform consists of the following components:
>>
>>  * PredictionIO framework - provides the machine learning stack for
>>  building, evaluating and deploying engines with machine learning
>>  algorithms. It uses Apache Spark for processing.
>>
>>  * Event Server - the machine learning analytics layer for unifying events
>>  from multiple platforms. It can use Apache HBase or any JDBC backends
>>  as its data store.
>>
>> The PredictionIO community also maintains a Template Gallery, a place to
>> publish and download (free or proprietary) engine templates for different
>> types of machine learning applications, and is a complementary part of the
>> project. At this point we exclude the Template Gallery from the proposal,
>> as it has a separate set of contributors and we’re not familiar with an
>> Apache approved mechanism to maintain such a gallery.
>>
>> You can find the Template Gallery at https://templates.prediction.io/
>>
>> === Background ===
>> PredictionIO was started with a mission to democratize and bring machine
>> learning to the masses.
>>
>> Machine learning has traditionally been a luxury for big companies like
>> Google, Facebook, and Netflix. There are ML libraries and tools lying
>> around the internet but the effort of putting them all together as a
>> production-ready infrastructure is a very resource-intensive task that is
>> barely within reach of individuals or small businesses.
>>
>> PredictionIO is a production-ready, full stack machine learning system that
>> allows organizations of any scale to quickly deploy machine learning
>> capabilities. It comes with official and community-contributed machine
>> learning engine templates that are easy to customize.
>>
>> === Rationale ===
>> As the usage of and number of contributors to PredictionIO have grown
>> bigger and more diverse, we have sought an independent framework for the
>> project to keep thriving. We believe the Apache foundation is a great fit. Joining
>> Apache would ensure that tried and true processes and procedures are in
>> place for the growing number of organizations interested in contributing
>> to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
>> PredictionIO was built on top of several Apache projects (HBase, Spark,
>> Hadoop). We are familiar with the Apache process and believe that the
>> democratic and meritocratic nature of the foundation aligns with the
>> project goals.
>>
>> === Initial Goals ===
>> The initial milestones will be to move the existing codebase to Apache and
>> integrate with the Apache development process. Once this is accomplished,
>> we plan for incremental development and releases that follow the Apache
>> guidelines, as well as

Re: [VOTE] Accept Gossip into the Apache Incubator

2016-04-25 Thread Debo Dutta (dedutta)

+1 non binding

Sent from my iPhone

On Apr 25, 2016, at 11:14 AM, P. Taylor Goetz wrote:

Following the discussion thread [1], I would like to call a VOTE to accept
Gossip into the Apache Incubator.

The Gossip proposal can be found here [2] and is also listed below.

[ ] +1 Accept Gossip into the Apache Incubator
[ ] +0 Abstain.
[ ] -1 Do not accept Gossip into the Apache Incubator because…

This vote will be open for at least 72 hours.

Obviously I am +1 (binding).

-Taylor

[1] https://s.apache.org/gossip-discuss
[2] https://wiki.apache.org/incubator/GossipProposal

= Abstract =

Apache Gossip will be an implementation of the Gossip Protocol based on code
available here: https://github.com/edwardcapriolo/gossip/ which is already
licenced using the glorious Apache V2 License.

= Proposal =

Apache Gossip aims to provide a gossip based consensus protocol written in Java
for peer-to-peer communication to the Apache Incubator
(http://incubator.apache.org/). This implementation will effectively scale from
one to one-thousand node clusters. In addition to the code implementation, the
project should produce specifications of the wire protocol, features, and
expected behavior of the system such that compatible implementations can
communicate.

= Background =

The gossip protocol has been implemented to varying levels of rigor by a number
of entities. In particular, Apache Cassandra uses an implementation of gossip
to locate peers and transmit up/down state. Apache Spark leverages tooling in
Akka which provides peer-to-peer node discovery capabilities.

* http://highscalability.com/blog/2011/11/14/using-gossip-protocols-for-failure-detection-monitoring-mess.html

* https://en.wikipedia.org/wiki/Gossip_protocol
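
To make the up/down state exchange described above more concrete, here is a
minimal sketch of heartbeat-based gossip membership. It is not taken from the
edwardcapriolo/gossip code; the names (Member, merge_views, live_members) and
the timeout value are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """One entry in a node's local membership view (hypothetical shape)."""
    heartbeat: int      # counter incremented by the owning node
    last_seen: float    # local time when this heartbeat value was first observed


def merge_views(local, remote, now):
    """Merge a remote view (node id -> heartbeat) into the local view.

    An entry is refreshed only when the remote heartbeat is strictly newer,
    so stale gossip cannot revive a node that has stopped beating.
    """
    for node_id, heartbeat in remote.items():
        known = local.get(node_id)
        if known is None or heartbeat > known.heartbeat:
            local[node_id] = Member(heartbeat=heartbeat, last_seen=now)
    return local


def live_members(view, now, timeout):
    """Return ids whose heartbeat advanced within the last `timeout` seconds."""
    return sorted(n for n, m in view.items() if now - m.last_seen <= timeout)


if __name__ == "__main__":
    view = {"a": Member(heartbeat=3, last_seen=0.0)}
    merge_views(view, {"a": 3, "b": 1}, now=5.0)   # "a" unchanged, "b" learned
    merge_views(view, {"b": 2}, now=12.0)          # "b" is still beating
    print(live_members(view, now=12.0, timeout=10.0))
```

In this sketch a node is considered down once its heartbeat has not advanced
for longer than the timeout; a real implementation would also have to pick
gossip targets and send views over the network.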

= Rationale =

With distributed computing becoming extremely widespread, and the growth of the
buzz-factor of ‘the-internet-of-things’ it is increasingly important that
networks of IP addressable devices can form a peer-to-peer network.
Applications of peer-to-peer networks include generating crypto currency,
managing hardware such as solar power micro-grids, and more traditional roles
like grid/High Performance Computing and distributed storage systems. Different
implementations of gossip based consensus protocols have been implemented in
numerous languages or as part of more complex software stacks. The Apache
Software Foundation should lead the effort of producing a purpose built tool
that can be used by downstream projects to form peer-to-peer networks.

= Initial Goals =

* Migration of current code https://github.com/edwardcapriolo/gossip and
existing community to the Apache Software Foundation infrastructure
* Secure communications
* Transport security using a pre-shared key
* Public Key Infrastructure
* Introduce a cluster name to wire protocol to avoid misconfigurations
* Effectively operate when systems have multiple network interfaces by
controlling IP binding settings
* Effectively operate when systems have Network Address Translation devices
between them using a broadcast IP setting
* Develop advanced integration testing from cluster sizes of 1-1000 nodes
* Test convergence times
* Demonstrate the tradeoffs of different settings in regard to
bandwidth/cpu/convergence time/accuracy
* Gossip data other than cluster state such as application/user data
* Provide detailed specifications such that others can implement the protocol
in other programming languages
* Explore HTTP transport as an alternative to UDP
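
One of the goals above, carrying a cluster name in the wire protocol, could
look roughly like the following sketch. The JSON message layout and the
function names are hypothetical; the proposal does not specify a wire format.

```python
import json


def encode_message(cluster, sender, heartbeats):
    """Serialize a gossip message that carries the sender's cluster name."""
    payload = {"cluster": cluster, "sender": sender, "heartbeats": heartbeats}
    return json.dumps(payload).encode("utf-8")


def accept_message(raw, expected_cluster):
    """Decode a message, returning None when it belongs to another cluster."""
    payload = json.loads(raw.decode("utf-8"))
    if payload.get("cluster") != expected_cluster:
        return None  # misconfigured or foreign peer: ignore its state
    return payload
```

With a check like this, a node that is accidentally pointed at the seed list
of a different cluster has its messages dropped instead of merged.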

= Current Status =

The current code has been around for some time. Previously it was a Google code
project. Since the fork in January 2015 there have been 55 commits and 4
releases.

== Meritocracy ==

We believe in meritocracy. All suggestions are taken seriously. We enjoy
helping new people become part of the process. For other projects available on our
Github, once a user shows enough activity we grant them collaborator status.

== Community ==

In a relatively short amount of time, with a small amount of promotion on
twitter and through blogging, we have 50+ followers on Github and several forks
of the project. With the Apache brand we should be able to attract more. Once
we have entered the incubator we believe it will be easier to attempt to unify
with other similar implementations.

== Core Developers ==

The code was forked on Jan 9th 2015, since then there have been 4 releases and
55 commits. Since that period, the majority of the work was undertaken by
Edward Capriolo. Several people are interested in the features of this proposal
and have indicated they will volunteer their time.

== Alignment ==

Apache is the perfect organization to take on the Gossip project. Besides
benefiting a number of projects directly, the active development and outreach
will increase adoption of Gossip with the aim of it becoming a leader in the
space.

= Known Risks =

Re: [Incubator Wiki] Update of "GossipProposal" by P. Taylor Goetz

2016-04-22 Thread Debo Dutta (dedutta)

+1 for me …..ddu...@apache.org




On 4/22/16, 1:17 PM, "Suneel Marthi"  wrote:

>Please add me to the list of contributors
>
>
>* Suneel Marthi - smar...@apache.org - Red hat Inc.
>
>
>
>On Fri, Apr 22, 2016 at 4:07 PM, Apache Wiki  wrote:
>
>> Dear Wiki user,
>>
>> You have subscribed to a wiki page or wiki category on "Incubator Wiki"
>> for change notification.
>>
>> The "GossipProposal" page has been changed by P. Taylor Goetz:
>> https://wiki.apache.org/incubator/GossipProposal?action=diff=3=4
>>
>>   == Issue Tracking ==
>>   JIRA tracker: GOSSIP
>>   = Initial Committers =
>> -  * Edward Capriolo (Hive Committer, PMC)
>> -  * P. Taylor Goetz (Storm PMC)
>> -  * Gary Dusbabek (Cassandra Committer, PMC)
>> -  * Dorian Ellerbe (requires CLA)
>> +  * Edward Capriolo (ecapriolo at apache dot org)
>> +  * P. Taylor Goetz (ptgoetz at apache dot org)
>> +  * Gary Dusbabek (gdusbabek at apache dot org)
>> +  * Dorian Ellerbe (Doellerbe06 at gmail dot com)(requires CLA)
>>    * Sathish Dhinakaran (requires CLA)
>>   = Affiliations =
>> - With diverse contributors the project will be able to make balanced
>> decisions best for the future of the project.
>> +  * Edward Capriolo - The Huffington Post
>> +  * P. Taylor Goetz - Hortonworks
>> +  * Gary Dusbabek - Silicon Valley Data Science
>> +  * Dorian Ellerbe - Dstillery
>> +  * Sathish Dhinakaran - Dstillery
>> +  * Sean Busbey - Cloudera
>> +  * Josh Elser - Hortonworks
>> +
>>   = Additional Interested Contributors =
>>
>>   Those interested in getting involved with the project as it starts are
>> encouraged to list themselves here.
>>
>> -
>> To unsubscribe, e-mail: cvs-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: cvs-h...@incubator.apache.org
>>
>>

Re: [VOTE] Accept Mnemonic into the Apache Incubator

2016-02-29 Thread Debo Dutta (dedutta)

tive processing. Moreover, Spark applications can
>leverage Mnemonic to perform data transforming in persistent or
>non-persistent memory without SerDes.
>
>For Apache Hadoop®, we are integrating HDFS Caching with Mnemonic
>instead of mmap. This will take advantage of persistent memory related
>features. We also plan to evaluate integrating NameNode Editlog and
>FSImage persistent data into the Mnemonic persistent memory area.
>
>For Apache HBase, we are using Mnemonic for BucketCache and
>evaluating performance improvements.
>
>We expect Mnemonic will be further developed and integrated into many
>Apache BigData projects and so on, to enhance memory management
>solutions for much improved performance and reliability.
>
>==== An Excessive Fascination with the Apache Brand ====
>While we expect the Apache brand to help attract more contributors, our
>interest in starting this project is based on the factors mentioned
>in the Rationale section.
>
>We would like Mnemonic to become an Apache project to further foster a
>healthy community of contributors and consumers in BigData technology
>R&D areas. Since Mnemonic can directly benefit many Apache projects
>and solves major performance problems, we expect the Apache Software
>Foundation to increase interaction with the larger community as well.
>
>=== Documentation ===
>The documentation is currently available at Intel and will be posted
>under: https://mnemonic.incubator.apache.org/docs
>
>=== Initial Source ===
>Initial source code is temporarily hosted on GitHub for general viewing:
>https://github.com/NonVolatileComputing/Mnemonic.git
>It will be moved to Apache http://git.apache.org/ after podling.
>
>The initial Source is written in Java code (88%) and mixed with JNI C
>code (11%) and shell script (1%) for underlying native allocation
>libraries.
>
>=== Source and Intellectual Property Submission Plan ===
>As soon as Mnemonic is approved to join the Incubator, the source code
>will be transitioned via the Software Grant Agreement onto ASF
>infrastructure and in turn made available under the Apache License,
>version 2.0.
>
>=== External Dependencies ===
>The required external dependencies are all Apache licenses or other
>compatible Licenses
>Note: The runtime dependent licenses of Mnemonic are all declared as
>Apache 2.0, the GNU licensed components are used for Mnemonic build
>and deployment. The Mnemonic JNI libraries are built using the GNU
>tools.
>
>maven and its plugins (http://maven.apache.org/ ) [Apache 2.0]
>JDK8 or OpenJDK 8 (http://java.com/) [Oracle or Openjdk JDK License]
>Nvml (http://pmem.io ) [optional] [Open Source]
>PMalloc (https://github.com/bigdata-memory/pmalloc ) [optional] [Apache
>2.0]
>
>Build and test dependencies:
>org.testng.testng v6.8.17  (http://testng.org) [Apache 2.0]
>org.flowcomputing.commons.commons-resgc v0.8.7 [Apache 2.0]
>org.flowcomputing.commons.commons-primitives v.0.6.0 [Apache 2.0]
>com.squareup.javapoet v1.3.1-SNAPSHOT [Apache 2.0]
>JDK8 or OpenJDK 8 (http://java.com/) [Oracle or Openjdk JDK License]
>
>=== Cryptography ===
>Project Mnemonic does not use cryptography itself, however, Hadoop
>projects use standard APIs and tools for SSH and SSL communication
>where necessary.
>
>=== Required Resources ===
>We request that following resources be created for the project to use
>
>==== Mailing lists ====
>priv...@mnemonic.incubator.apache.org (moderated subscriptions)
>comm...@mnemonic.incubator.apache.org
>d...@mnemonic.incubator.apache.org
>
>==== Git repository ====
>https://github.com/apache/incubator-mnemonic
>
>==== Documentation ====
>https://mnemonic.incubator.apache.org/docs/
>
>==== JIRA instance ====
>https://issues.apache.org/jira/browse/mnemonic
>
>=== Initial Committers ===
>* Gang (Gary) Wang (gang1 dot wang at intel dot com)
>
>* Yanping Wang (yanping dot wang at intel dot com)
>
>* Uma Maheswara Rao G (umamahesh at apache dot org)
>
>* Kai Zheng (drankye at apache dot org)
>
>* Rakesh Radhakrishnan Potty  (rakeshr at apache dot org)
>
>* Sean Zhong  (seanzhong at apache dot org)
>
>* Henry Saputra  (hsaputra at apache dot org)
>
>* Hao Cheng (hao dot cheng at intel dot com)
>
>=== Additional Interested Contributors ===
>* Debo Dutta (dedutta at cisco dot com)
>
>* Liang Chen (chenliang613 at Huawei dot com)
>
>=== Affiliations ===
>* Gang (Gary) Wang, Intel
>
>* Yanping Wang, Intel
>
>* Uma Maheswara Rao G, Intel
>
>* Kai Zheng, Intel
>
>* Rakesh Radhakrishnan Potty, Intel
>
>* Sean Zhong, Intel
>
>* Henry Saputra, Independent
>
>* Hao Cheng, Intel
>
>=== Sponsors ===
>==== Champion ====
>Patrick Hunt
>
>==== Nominated Mentors ====
>* Patrick Hunt  - Apache IPMC member
>
>* Andrew Purtell  - Apache IPMC member
>
>* James Taylor  - Apache IPMC member
>
>* Henry Saputra  - Apache IPMC member
>
>==== Sponsoring Entity ====
>Apache Incubator PMC
>

Re: [DISCUSS] Mnemonic incubator proposal

2016-02-21 Thread Debo Dutta (dedutta)

Hi Yanping

This is very interesting and timely. Would love to contribute, participate
etc. 

thx
debo

On 2/21/16, 11:47 AM, "Wang, Yanping"  wrote:

>Hi all 
>
>We'd like to start a discussion regarding a proposal to submit Mnemonic
>to the Apache Incubator.
>
>The proposal text is available on the Wiki here:
>https://wiki.apache.org/incubator/MnemonicProposal
>
>and pasted below for convenience.
>
>We are excited to make this proposal, and look forward to the community's
>input!
>
>Best,
>Yanping
>
>
>= Mnemonic Proposal =
>=== Abstract ===
>Mnemonic is a Java based non-volatile memory library for in-place
>structured data processing and computing. It is a solution for generic
>object and block persistence on heterogeneous block and byte-addressable
>devices, such as DRAM, persistent memory, NVMe, SSD, and cloud network
>storage.
>
>=== Proposal ===
>Mnemonic is a structured data persistence in-memory in-place library for
>Java-based applications and frameworks. It provides unified interfaces
>for data manipulation on heterogeneous block/byte-addressable devices,
>such as DRAM, persistent memory, NVMe, SSD, and cloud network devices.
>
>The design motivation for this project is to create a non-volatile
>programming paradigm for in-memory data object persistence, in-memory
>data objects caching, and JNI-less IPC.
>Mnemonic simplifies the usage of data object caching, persistence, and
>JNI-less IPC for massive object oriented structural datasets.
>
>Mnemonic defines Non-Volatile Java objects that store data fields in
>persistent memory and storage. During program runtime, only methods
>and volatile fields are instantiated in the Java heap; Non-Volatile data
>fields are directly accessed via GET/SET operations to and from
>persistent memory and storage. Mnemonic avoids SerDes and significantly
>reduces the amount of garbage in the Java heap.
>
>Major features of Mnemonic:
>* Provides an abstract level of viewpoint to utilize heterogeneous
>block/byte-addressable devices as a whole (e.g., DRAM, persistent memory,
>NVMe, SSD, HD, cloud network storage).
>* Provides seamless support for object-oriented design and programming
>without the burden of transferring object data to a different form.
>* Avoids object data serialization/de-serialization for data
>retrieval, caching and storage.
>* Reduces the consumption of on-heap memory, which in turn reduces and
>stabilizes Java Garbage Collection (GC) pauses for latency-sensitive
>applications.
>* Overcomes current limitations of Java GC to manage much larger memory
>resources for massive dataset processing and computing.
>* Supports migrating the data usage model from traditional NVMe/SSD/HD to
>non-volatile memory with ease.
>* Uses a lazy loading mechanism to avoid unnecessary memory consumption
>when some data is not needed for computing immediately.
>* Bypasses JNI calls for the interaction between a Java runtime
>application and its native code.
>* Provides an allocation-aware auto-reclaim mechanism to prevent external
>memory resource leaks.
>
>
>=== Background ===
>Big Data and Cloud applications increasingly require both high throughput
>and low latency processing. Java-based applications targeting the Big
>Data and Cloud space should be tuned for better throughput, lower
>latency, and more predictable response time.
>Typically, there are some issues that impact BigData applications'
>performance and scalability:
>
>1) The Complexity of Data Transformation/Organization: In most cases,
>during data processing, applications use their own complicated data
>caching mechanisms to SerDes data objects, spill to different storage,
>and evict large amounts of data. Some data objects contain complex
>values and structures that make data organization much more difficult.
>Loading and then parsing/decoding datasets from storage consumes
>significant system resources and computation power.
>
>2) Lack of Caching, Burst Temporary Object Creation/Destruction Causes
>Frequent Long GC Pauses: Big Data computing/syntax generates a large
>number of temporary objects during processing, e.g. lambdas, SerDes,
>copying, etc. This triggers frequent long Java GC pauses to scan
>references, update reference lists, and blindly copy live objects from
>one memory location to another.
>
>3) The Unpredictable GC Pause: Latency-sensitive applications, such as
>databases, search engines, web queries, and real-time/streaming
>computing, need to keep request-response latency under control, but the
>current Java GC does not provide predictable GC activity with large
>on-heap memory management.
>
>4) High JNI Invocation Cost: JNI calls are expensive, yet high
>performance applications usually try to leverage native code to improve
>performance, and JNI calls need to convert Java objects into
>something that C/C++ can understand. In addition, some comprehensive
>native code needs to communicate with Java-based applications, which
>causes frequent JNI calls along with

Re: [DISCUSS] Apache Dataflow Incubator Proposal

2016-01-20 Thread Debo Dutta (dedutta)

Hi JB

Would love to join now.

regards
debo

On 1/20/16, 9:31 AM, "Jean-Baptiste Onofré" <j...@nanthrax.net> wrote:

>Hi Debo,
>
>Awesome: do you want to join now (in the initial committer list) or
>once we are in incubation?
>
>Let me know, I can update the proposal.
>
>Regards
>JB
>
>On 01/20/2016 06:23 PM, Debo Dutta (dedutta) wrote:
>> +1
>>
>> Proposal looks good. Also a small section on relationships with Apache
>> Storm and Apache Samza would be great.
>>
>> I would like to sign up, to help/contribute.
>>
>> debo
>>
>> On 1/20/16, 8:55 AM, "Sean Busbey" <bus...@cloudera.com> wrote:
>>
>>> Great proposal. I like that your proposal includes a well presented
>>> roadmap, but I don't see any goals that directly address building a
>>> larger community. Y'all have any ideas around outreach that will help
>>> with adoption?
>>>
>>> As a start, I recommend y'all add a section to the proposal on the wiki
>>> page for "Additional Interested Contributors" so that folks who want to
>>> sign up to participate in the project can do so without requesting
>>> additions to the initial committer list.
>>>
>>> On Wed, Jan 20, 2016 at 10:32 AM, James Malone <
>>> jamesmal...@google.com.invalid> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> Attached to this message is a proposed new project - Apache Dataflow,
>>>> a unified programming model for data processing and integration.
>>>>
>>>> The text of the proposal is included below. Additionally, the proposal
>>>> is in draft form on the wiki where we will make any required changes:
>>>>
>>>> https://wiki.apache.org/incubator/DataflowProposal
>>>>
>>>> We look forward to your feedback and input.
>>>>
>>>> Best,
>>>>
>>>> James
>>>>
>>>> 
>>>>
>>>> = Apache Dataflow =
>>>>
>>>> == Abstract ==
>>>>
>>>> Dataflow is an open source, unified model and set of language-specific
>>>> SDKs for defining and executing data processing workflows, and also
>>>> data ingestion and integration flows, supporting Enterprise
>>>> Integration Patterns (EIPs) and Domain Specific Languages (DSLs).
>>>> Dataflow pipelines simplify the mechanics of large-scale batch and
>>>> streaming data processing and can run on a number of runtimes like
>>>> Apache Flink, Apache Spark, and Google Cloud Dataflow (a cloud
>>>> service). Dataflow also brings DSL in different languages, allowing
>>>> users to easily implement their data integration processes.
>>>>
>>>> == Proposal ==
>>>>
>>>> Dataflow is a simple, flexible, and powerful system for distributed
>>>> data processing at any scale. Dataflow provides a unified programming
>>>> model, a software development kit to define and construct data
>>>> processing pipelines, and runners to execute Dataflow pipelines in
>>>> several runtime engines, like Apache Spark, Apache Flink, or Google
>>>> Cloud Dataflow. Dataflow can be used for a variety of streaming or
>>>> batch data processing goals including ETL, stream analysis, and
>>>> aggregate computation. The underlying programming model for Dataflow
>>>> provides MapReduce-like parallelism, combined with support for
>>>> powerful data windowing, and fine-grained correctness control.
>>>>
>>>> == Background ==
>>>>
>>>> Dataflow started as a set of Google projects focused on making data
>>>> processing easier, faster, and less costly. The Dataflow model is a
>>>> successor to MapReduce, FlumeJava, and Millwheel inside Google and is
>>>> focused on providing a unified solution for batch and stream
>>>> processing. These projects on which Dataflow is based have been
>>>> published in several papers made available to the public:
>>>>
>>>> * MapReduce - http://research.google.com/archive/mapreduce.html
>>>>
>>>> * Dataflow model  - http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf

Re: [DISCUSS] Apache Dataflow Incubator Proposal

2016-01-20 Thread Debo Dutta (dedutta)

+1

Proposal looks good. Also a small section on relationships with Apache
Storm and Apache Samza would be great.

I would like to sign up, to help/contribute.

debo

On 1/20/16, 8:55 AM, "Sean Busbey"  wrote:

>Great proposal. I like that your proposal includes a well presented
>roadmap, but I don't see any goals that directly address building a larger
>community. Y'all have any ideas around outreach that will help with
>adoption?
>
>As a start, I recommend y'all add a section to the proposal on the wiki
>page for "Additional Interested Contributors" so that folks who want to
>sign up to participate in the project can do so without requesting
>additions to the initial committer list.
>
>On Wed, Jan 20, 2016 at 10:32 AM, James Malone <
>jamesmal...@google.com.invalid> wrote:
>
>> Hello everyone,
>>
>> Attached to this message is a proposed new project - Apache Dataflow, a
>> unified programming model for data processing and integration.
>>
>> The text of the proposal is included below. Additionally, the proposal
>> is in draft form on the wiki where we will make any required changes:
>>
>> https://wiki.apache.org/incubator/DataflowProposal
>>
>> We look forward to your feedback and input.
>>
>> Best,
>>
>> James
>>
>> 
>>
>> = Apache Dataflow =
>>
>> == Abstract ==
>>
>> Dataflow is an open source, unified model and set of language-specific
>> SDKs for defining and executing data processing workflows, and also data
>> ingestion and integration flows, supporting Enterprise Integration
>> Patterns (EIPs) and Domain Specific Languages (DSLs). Dataflow pipelines
>> simplify the mechanics of large-scale batch and streaming data processing
>> and can run on a number of runtimes like Apache Flink, Apache Spark, and
>> Google Cloud Dataflow (a cloud service). Dataflow also brings DSL in
>> different languages, allowing users to easily implement their data
>> integration processes.
>>
>> == Proposal ==
>>
>> Dataflow is a simple, flexible, and powerful system for distributed data
>> processing at any scale. Dataflow provides a unified programming model,
>> a software development kit to define and construct data processing
>> pipelines, and runners to execute Dataflow pipelines in several runtime
>> engines, like Apache Spark, Apache Flink, or Google Cloud Dataflow.
>> Dataflow can be used for a variety of streaming or batch data processing
>> goals including ETL, stream analysis, and aggregate computation. The
>> underlying programming model for Dataflow provides MapReduce-like
>> parallelism, combined with support for powerful data windowing, and
>> fine-grained correctness control.
>>
>> == Background ==
>>
>> Dataflow started as a set of Google projects focused on making data
>> processing easier, faster, and less costly. The Dataflow model is a
>> successor to MapReduce, FlumeJava, and Millwheel inside Google and is
>> focused on providing a unified solution for batch and stream processing.
>> These projects on which Dataflow is based have been published in several
>> papers made available to the public:
>>
>> * MapReduce - http://research.google.com/archive/mapreduce.html
>>
>> * Dataflow model  - http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf
>>
>> * FlumeJava - http://notes.stephenholiday.com/FlumeJava.pdf
>>
>> * MillWheel - http://research.google.com/pubs/pub41378.html
>>
>> Dataflow was designed from the start to provide a portable programming
>> layer. When you define a data processing pipeline with the Dataflow
>> model, you are creating a job which is capable of being processed by any
>> number of Dataflow processing engines. Several engines have been
>> developed to run Dataflow pipelines in other open source runtimes,
>> including a Dataflow runner for Apache Flink and Apache Spark. There is
>> also a "direct runner", for execution on the developer machine (mainly
>> for dev/debug purposes). Another runner allows a Dataflow program to run
>> on a managed service, Google Cloud Dataflow, in Google Cloud Platform.
>> The Dataflow Java SDK is already available on GitHub, and independent
>> from the Google Cloud Dataflow service. Another Python SDK is currently
>> in active development.
>>
>> In this proposal, the Dataflow SDKs, model, and a set of runners will be
>> submitted as an OSS project under the ASF. The runners which are a part
>> of this proposal include those for Spark (from Cloudera), Flink (from
>> data Artisans), and local development (from Google); the Google Cloud
>> Dataflow service runner is not included in this proposal. Further
>> references to Dataflow will refer to the Dataflow model, SDKs, and
>> runners which are a part of this proposal (Apache Dataflow) only. The
>> initial submission will contain the already-released Java SDK; Google
>> intends to submit the Python SDK later in the incubation process. The
>> Google Cloud Dataflow service will continue to be one of many runners
>> for Dataflow, built on Google Cloud
>>

Re: [RESULT][VOTE] Accept Metron into Apache Incubator

2015-12-08 Thread Debo Dutta (dedutta)

Thanks Marvin for the clarification.

An annotation would *really* help while I keep pushing the SGA.

debo

On 12/6/15, 7:46 PM, "Marvin Humphrey" <mar...@rectangular.com> wrote:

>On Sun, Dec 6, 2015 at 5:43 PM, Debo Dutta (dedutta) <dedu...@cisco.com>
>wrote:
>> Hi Owen
>>
>> This is good.
>>
>> The SGA stuff is still in progress. Cisco is very interested in Metron
>> and would like to be involved (in fact the opensrc folks at Cisco seem
>> to be unaware of the Metron proposal - till we talked about the SGA).
>> Would it be possible to put an addendum to the original Metron proposal
>> that Cisco is interested - to set the facts straight!
>
>Cisco's involvement would of course be welcome and it's great to hear
>about the positive internal reactions.
>
>The Metron proposal text was the subject of an IPMC VOTE. Any
>"addendum" should take the form of annotations that do not appear to
>change the meaning of the VOTE ex post facto. But that doesn't mean we
>can't be accommodating -- we have plenty of flexibility.
>
>Marvin Humphrey
>

Re: [RESULT][VOTE] Accept Metron into Apache Incubator

2015-12-06 Thread Debo Dutta (dedutta)

Hi Owen 

This is good. 

The SGA stuff is still in progress. Cisco is very interested in Metron and
would like to be involved (in fact the opensrc folks at Cisco seem to be
unaware of the Metron proposal - till we talked about the SGA). Would it
be possible to put an addendum to the original Metron proposal that Cisco
is interested - to set the facts straight!

debo

On 12/6/15, 12:48 PM, "Owen O'Malley" wrote:

>With 10 binding +1's and 10 non-binding +1's and no -1's, the vote to
>accept Metron in the Apache Incubator passes. Thank you everyone.
>
>Binding +1's:
>  Owen O'Malley
>  Chris Nauroth
>  Chris Mattmann
>  Seetharan Venkatesh
>  P. Taylor Goetz
>  Vinod Kumar Vavilapalli
>  Brock Noland
>  Billie Rinaldi
>  Jacques Nadeau
>  Julian Hyde
>
>Non-binding +1's:
>  Joe Witt
>  Larry McCay
>  Debo Dutta
>  Phillip Rhodes
>  Dave Hirko
>  Brad Kolarov
>  James Sirota
>  Ryan Merriman
>  Amol Kekre
>  Balaji Ganesan
>
>.. Owen


Re: [VOTE] Accept Metron into Apache Incubator

2015-12-04 Thread Debo Dutta (dedutta)

Yes … still on it.

debo

On 12/3/15, 1:34 PM, "P. Taylor Goetz"  wrote:

>+1 (binding)
>
>I think it would be prudent to at least try to get an SGA from Cisco, but
>it looks like Debo is trying to help out with that.
>
>-Taylor
>
>> On Dec 3, 2015, at 12:33 PM, Owen O'Malley  wrote:
>> 
>> The [DISCUSS] thread has wound down, so I'd like to start a VOTE on
>> whether Apache Incubator should accept Metron as a podling. The
>> proposal is pasted below and is available on the wiki as well.
>> 
>> https://wiki.apache.org/incubator/MetronProposal
>> 
>> We've added a paragraph in the background section discussing how Apache
>> avoids hostile forks of projects, because we don't want to fork
>> communities. We've also added Larry McCay, P. Taylor Goetz, and Phillip
>> Rhodes to the proposal.
>> 
>> The vote will run until 12pm PST on Sunday.
>> 
>> Thanks,
>>   Owen
>> 
>> = Apache Metron Proposal =
>> 
>> 
>> /!\ '''FINAL''' /!\
>> 
>> This proposal is now complete and has been submitted for a VOTE.
>> 
>> 
>> == Abstract ==
>> 
>> The Metron project is an open source project dedicated to providing an
>> extensible and scalable advanced security analytics tool. It has strong
>> foundations in the Apache Hadoop ecosystem.
>> 
>> == Proposal ==
>> 
>> Metron integrates a variety of open source big data technologies in
>> order to offer a centralized tool for security monitoring and analysis.
>> Metron provides capabilities for log aggregation, full packet capture
>> indexing, storage, advanced behavioral analytics and data enrichment,
>> while applying the most current threat-intelligence information to
>> security telemetry within a single platform.
>> 
>> Metron can be divided into 4 areas:
>> 
>>  1. '''A mechanism to capture, store, and normalize any type of security
>> telemetry at extremely high rates.''' Because security telemetry is
>> constantly being generated, it requires a method for ingesting the data
>>at
>> high speeds and pushing it to various processing units for advanced
>> computation and analytics.
>>  1. '''Real time processing and application of enrichments''' such as
>> threat intelligence, geolocation, and DNS information to telemetry being
>> collected. The immediate application of this information to incoming
>> telemetry provides the context and situational awareness, as well as the
>> "who" and "where" information that is critical for investigation.
>>  1. '''Efficient information storage''' based on how the information
>>will
>> be used:
>>a. Logs and telemetry are stored such that they can be efficiently
>> mined and analyzed for concise security visibility
>>a. The ability to extract and reconstruct full packets helps an
>>analyst
>> answer questions such as who the true attacker was, what data was
>>leaked,
>> and where that data was sent
>>a. Long-term storage not only increases visibility over time, but
>>also
>> enables advanced analytics such as machine learning techniques to be
>>used
>> to create models on the information. Incoming data can then be scored
>> against these stored models for advanced anomaly detection.
>>  1. '''An interface that gives a security investigator a centralized
>>view
>> of data and alerts passed through the system.''' Metron's interface
>> presents alert summaries with threat intelligence and enrichment data
>> specific to that alert on one single page. Furthermore, advanced search
>> capabilities and full packet extraction tools are presented to the
>>analyst
>> for investigation without the need to pivot into additional tools.
>> 
>> Big data is a natural fit for powerful security analytics. The Metron
>> framework integrates a number of elements from the Hadoop ecosystem to
>> provide a scalable platform for security analytics, incorporating such
>> functionality as full-packet capture, stream processing, batch
>>processing,
>> real-time search, and telemetry aggregation. With Metron, our goal is to
>> tie big data into security analytics and drive towards an extensible
>> centralized platform to effectively enable rapid detection and rapid
>> response for advanced security threats.
>> 
>> == Background ==
>> 
>> OpenSOC was developed by Cisco over the last two years and pushed out to
>> Github (https://github.com/OpenSOC/opensoc) under the ALv2. However, the
>> development was mostly closed and has largely stopped. As evidence of
>>the
>> inactivity, users have complained that pull requests are not answered
>>for a
>> while
>> 
>>https://groups.google.com/d/msg/opensoc-support/R2W-ZFux8Vk/Y-5tL-EmAAAJ.
>> Finally, no public releases of OpenSOC have been made. From an Apache
>>point
>> of view, the current community is not viable.
>> 
>> However, some of the developers of the project have left Cisco and have
>> found interest from several others that would like to work together to
>>form
>> an active and open community at Apache starting from the current OpenSOC
>> code base. A

Re: [VOTE] Accept Metron into Apache Incubator

2015-12-03 Thread Debo Dutta (dedutta)

+1 (non binding)

Would be interested in contributing ... (from Cisco)

debo

On 12/3/15, 9:38 AM, "Owen O'Malley"  wrote:

>+1 (binding)
>
>On Thu, Dec 3, 2015 at 9:33 AM, Owen O'Malley  wrote:
>
>> The [DISCUSS] thread has wound down, so I'd like to start a VOTE on
>> whether Apache Incubator should accept Metron as a podling. The
>>proposal is
>> pasted below and is available on the wiki as well.
>>
>> https://wiki.apache.org/incubator/MetronProposal
>>
>> We've added a paragraph in the background section discussing how Apache
>> avoids hostile forks of projects, because we don't want to fork
>> communities. We've also added Larry McCay, P. Taylor Goetz, and Phillip
>> Rhodes to the proposal.
>>
>> The vote will run until 12pm PST on Sunday.
>>
>> Thanks,
>>Owen
>>
>> = Apache Metron Proposal =
>>
>> 
>> /!\ '''FINAL''' /!\
>>
>> This proposal is now complete and has been submitted for a VOTE.
>> 
>>
>> == Abstract ==
>>
>> The Metron project is an open source project dedicated to providing an
>> extensible and scalable advanced security analytics tool. It has strong
>> foundations in the Apache Hadoop ecosystem.
>>
>> == Proposal ==
>>
>> Metron integrates a variety of open source big data technologies in
>>order
>> to offer a centralized tool for security monitoring and analysis. Metron
>> provides capabilities for log aggregation, full packet capture indexing,
>> storage, advanced behavioral analytics and data enrichment, while
>>applying
>> the most current threat-intelligence information to security telemetry
>> within a single platform.
>>
>> Metron can be divided into 4 areas:
>>
>>   1. '''A mechanism to capture, store, and normalize any type of
>>security
>> telemetry at extremely high rates.''' Because security telemetry is
>> constantly being generated, it requires a method for ingesting the data
>>at
>> high speeds and pushing it to various processing units for advanced
>> computation and analytics.
>>   1. '''Real time processing and application of enrichments''' such as
>> threat intelligence, geolocation, and DNS information to telemetry being
>> collected. The immediate application of this information to incoming
>> telemetry provides the context and situational awareness, as well as the
>> "who" and "where" information that is critical for investigation.
>>   1. '''Efficient information storage''' based on how the information
>>will
>> be used:
>> a. Logs and telemetry are stored such that they can be efficiently
>> mined and analyzed for concise security visibility
>> a. The ability to extract and reconstruct full packets helps an
>> analyst answer questions such as who the true attacker was, what data
>>was
>> leaked, and where that data was sent
>> a. Long-term storage not only increases visibility over time, but
>>also
>> enables advanced analytics such as machine learning techniques to be
>>used
>> to create models on the information. Incoming data can then be scored
>> against these stored models for advanced anomaly detection.
>>   1. '''An interface that gives a security investigator a centralized
>>view
>> of data and alerts passed through the system.''' Metron's interface
>> presents alert summaries with threat intelligence and enrichment data
>> specific to that alert on one single page. Furthermore, advanced search
>> capabilities and full packet extraction tools are presented to the
>>analyst
>> for investigation without the need to pivot into additional tools.
>>
>> Big data is a natural fit for powerful security analytics. The Metron
>> framework integrates a number of elements from the Hadoop ecosystem to
>> provide a scalable platform for security analytics, incorporating such
>> functionality as full-packet capture, stream processing, batch
>>processing,
>> real-time search, and telemetry aggregation. With Metron, our goal is to
>> tie big data into security analytics and drive towards an extensible
>> centralized platform to effectively enable rapid detection and rapid
>> response for advanced security threats.
>>
>> == Background ==
>>
>> OpenSOC was developed by Cisco over the last two years and pushed out to
>> Github (https://github.com/OpenSOC/opensoc) under the ALv2. However, the
>> development was mostly closed and has largely stopped. As evidence of
>>the
>> inactivity, users have complained that pull requests are not answered
>>for a
>> while
>> 
>>https://groups.google.com/d/msg/opensoc-support/R2W-ZFux8Vk/Y-5tL-EmAAAJ.
>> Finally, no public releases of OpenSOC have been made. From an Apache
>>point
>> of view, the current community is not viable.
>>
>> However, some of the developers of the project have left Cisco and have
>> found interest from several others that would like to work together to
>>form
>> an active and open community at Apache starting from the current OpenSOC
>> code base. A message to the current support group proposing moving to
>> Apache got a single positive response.
>>

Re: [VOTE] Accept Metron into Apache Incubator

2015-12-03 Thread Debo Dutta (dedutta)

Will find out and get back.

debo

On 12/3/15, 10:26 AM, "Alex Harui" <aha...@adobe.com> wrote:

>Are any of the GitHub contributors to OpenSOC still at Cisco?  That might
>help.
>
>-Alex
>
>On 12/3/15, 10:18 AM, "Owen O'Malley" <omal...@apache.org> wrote:
>
>>On Thu, Dec 3, 2015 at 10:04 AM, Debo Dutta (dedutta) <dedu...@cisco.com>
>>wrote:
>>
>>> Would like to know who in Cisco was asked actually. I am from Cisco and
>>> can help.
>>
>>
>>Debo,
>>   If you can help get an SGA signed that would be great. I don't have
>>any
>>contacts in Cisco, so I didn't have anywhere to ask.
>>
>>.. Owen
>
>

Re: [VOTE] Accept Metron into Apache Incubator

2015-12-03 Thread Debo Dutta (dedutta)

Would like to know who in Cisco was asked actually. I am from Cisco and
can help. 

debo

On 12/3/15, 9:39 AM, "Ted Dunning"  wrote:

>I think that there was a still pending question of why not ask Cisco for
>an
>SGA for cleanliness.
>
>The only answer that I heard was that corporations don't like to do things
>like that, and so Cisco hadn't been asked.
>
>That seems like an insufficient answer.
>
>
>
>On Fri, Dec 4, 2015 at 4:33 AM, Owen O'Malley  wrote:
>
>> The [DISCUSS] thread has wound down, so I'd like to start a VOTE on
>>whether
>> Apache Incubator should accept Metron as a podling. The proposal is
>>pasted
>> below and is available on the wiki as well.
>>
>> https://wiki.apache.org/incubator/MetronProposal
>>
>> We've added a paragraph in the background section discussing how Apache
>> avoids hostile forks of projects, because we don't want to fork
>> communities. We've also added Larry McCay, P. Taylor Goetz, and Phillip
>> Rhodes to the proposal.
>>
>> The vote will run until 12pm PST on Sunday.
>>
>> Thanks,
>>Owen
>>
>> = Apache Metron Proposal =
>>
>> 
>> /!\ '''FINAL''' /!\
>>
>> This proposal is now complete and has been submitted for a VOTE.
>> 
>>
>> == Abstract ==
>>
>> The Metron project is an open source project dedicated to providing an
>> extensible and scalable advanced security analytics tool. It has strong
>> foundations in the Apache Hadoop ecosystem.
>>
>> == Proposal ==
>>
>> Metron integrates a variety of open source big data technologies in
>>order
>> to offer a centralized tool for security monitoring and analysis. Metron
>> provides capabilities for log aggregation, full packet capture indexing,
>> storage, advanced behavioral analytics and data enrichment, while
>>applying
>> the most current threat-intelligence information to security telemetry
>> within a single platform.
>>
>> Metron can be divided into 4 areas:
>>
>>   1. '''A mechanism to capture, store, and normalize any type of
>>security
>> telemetry at extremely high rates.''' Because security telemetry is
>> constantly being generated, it requires a method for ingesting the data
>>at
>> high speeds and pushing it to various processing units for advanced
>> computation and analytics.
>>   1. '''Real time processing and application of enrichments''' such as
>> threat intelligence, geolocation, and DNS information to telemetry being
>> collected. The immediate application of this information to incoming
>> telemetry provides the context and situational awareness, as well as the
>> "who" and "where" information that is critical for investigation.
>>   1. '''Efficient information storage''' based on how the information
>>will
>> be used:
>> a. Logs and telemetry are stored such that they can be efficiently
>> mined and analyzed for concise security visibility
>> a. The ability to extract and reconstruct full packets helps an
>>analyst
>> answer questions such as who the true attacker was, what data was
>>leaked,
>> and where that data was sent
>> a. Long-term storage not only increases visibility over time, but
>>also
>> enables advanced analytics such as machine learning techniques to be
>>used
>> to create models on the information. Incoming data can then be scored
>> against these stored models for advanced anomaly detection.
>>   1. '''An interface that gives a security investigator a centralized
>>view
>> of data and alerts passed through the system.''' Metron's interface
>> presents alert summaries with threat intelligence and enrichment data
>> specific to that alert on one single page. Furthermore, advanced search
>> capabilities and full packet extraction tools are presented to the
>>analyst
>> for investigation without the need to pivot into additional tools.
>>
>> Big data is a natural fit for powerful security analytics. The Metron
>> framework integrates a number of elements from the Hadoop ecosystem to
>> provide a scalable platform for security analytics, incorporating such
>> functionality as full-packet capture, stream processing, batch
>>processing,
>> real-time search, and telemetry aggregation. With Metron, our goal is to
>> tie big data into security analytics and drive towards an extensible
>> centralized platform to effectively enable rapid detection and rapid
>> response for advanced security threats.
>>
>> == Background ==
>>
>> OpenSOC was developed by Cisco over the last two years and pushed out to
>> Github (https://github.com/OpenSOC/opensoc) under the ALv2. However, the
>> development was mostly closed and has largely stopped. As evidence of
>>the
>> inactivity, users have complained that pull requests are not answered
>>for a
>> while
>> 
>>https://groups.google.com/d/msg/opensoc-support/R2W-ZFux8Vk/Y-5tL-EmAAAJ.
>> Finally, no public releases of OpenSOC have been made. From an Apache
>>point
>> of view, the current community is not viable.
>>
>> However, some of the developers of the project have left Cisco and have
>> found

Re: [VOTE] Graduation of Apache Spark from the Incubator

2014-02-11 Thread Debo Dutta (dedutta)

+1

On 2/11/14, 5:08 AM, Ted Dunning ted.dunn...@gmail.com wrote:

+1 (binding)




On Tue, Feb 11, 2014 at 3:52 AM, Alex Karasulu akaras...@apache.org
wrote:

 +1 (binding)


 On Tue, Feb 11, 2014 at 6:50 AM, Mosharaf Chowdhury 
 mosharafka...@gmail.com
  wrote:

  +1
 
  --
  Mosharaf Chowdhury
  http://www.mosharaf.com/
 
 
  On Mon, Feb 10, 2014 at 8:45 PM, Matei Zaharia
matei.zaha...@gmail.com
  wrote:
 
   +1
  
   On Feb 10, 2014, at 8:27 PM, Chris Mattmann mattm...@apache.org
 wrote:
  
Hi Everyone,
   
This is a new VOTE to decide if Apache Spark should graduate
from the Incubator. Please VOTE on the resolution pasted below
the ballot. I'll leave this VOTE open for at least 72 hours.
   
Thanks!
   
[ ] +1 Graduate Apache Spark from the Incubator.
[ ] +0 Don't care.
[ ] -1 Don't graduate Apache Spark from the Incubator because..
   
Here is my +1 binding for graduation.
   
Cheers,
Chris
   
 snip
   
WHEREAS, the Board of Directors deems it to be in the best
interests of the Foundation and consistent with the
Foundation's purpose to establish a Project Management
Committee charged with the creation and maintenance of
open-source software, for distribution at no charge to the
public, related to fast and flexible large-scale data analysis
on clusters.
   
NOW, THEREFORE, BE IT RESOLVED, that a Project Management
Committee (PMC), to be known as the Apache Spark Project, be
and hereby is established pursuant to Bylaws of the Foundation;
and be it further
   
RESOLVED, that the Apache Spark Project be and hereby is
responsible for the creation and maintenance of software
related to fast and flexible large-scale data analysis
on clusters; and be it further

RESOLVED, that the office of Vice President, Apache Spark be
and hereby is created, the person holding such office to serve
at the direction of the Board of Directors as the chair of the
Apache Spark Project, and to have primary responsibility for
management of the projects within the scope of responsibility
of the Apache Spark Project; and be it further

RESOLVED, that the persons listed immediately below be and
hereby are appointed to serve as the initial members of the
Apache Spark Project:
   
* Mosharaf Chowdhury mosha...@apache.org
* Jason Dai jason...@apache.org
* Tathagata Das t...@apache.org
* Ankur Dave ankurd...@apache.org
* Aaron Davidson a...@apache.org
* Thomas Dudziak to...@apache.org
* Robert Evans bo...@apache.org
* Thomas Graves tgra...@apache.org
* Andy Konwinski and...@apache.org
* Stephen Haberman steph...@apache.org
* Mark Hamstra markhams...@apache.org
* Shane Huang shane_hu...@apache.org
* Ryan LeCompte ryanlecom...@apache.org
* Haoyuan Li haoy...@apache.org
* Sean McNamara mcnam...@apache.org
* Mridul Muralidharam mridul...@apache.org
* Kay Ousterhout kayousterh...@apache.org
* Nick Pentreath mln...@apache.org
* Imran Rashid iras...@apache.org
* Charles Reiss wog...@apache.org
* Josh Rosen joshro...@apache.org
* Prashant Sharma prash...@apache.org
* Ram Sriharsha har...@apache.org
* Shivaram Venkataraman shiva...@apache.org
* Patrick Wendell pwend...@apache.org
* Andrew Xia xiajunl...@apache.org
* Reynold Xin r...@apache.org
* Matei Zaharia ma...@apache.org
   
NOW, THEREFORE, BE IT FURTHER RESOLVED, that Matei Zaharia be
appointed to the office of Vice President, Apache Spark, to
serve in accordance with and subject to the direction of the
Board of Directors and the Bylaws of the Foundation until
death, resignation, retirement, removal or disqualification, or
until a successor is appointed; and be it further
   
RESOLVED, that the Apache Spark Project be and hereby is
tasked with the migration and rationalization of the Apache
Incubator Spark podling; and be it further
   
RESOLVED, that all responsibilities pertaining to the Apache
Incubator Spark podling encumbered upon the Apache Incubator
Project are hereafter discharged.
   

   
   
   
  
  
 



 --
 Best Regards,
 -- Alex




Re: Please add me to the Contributors Group on the Incubator Wiki

2013-09-25 Thread Debo Dutta (dedutta)

Mine is DeboDutta

Thx
debo

On 9/25/13 1:36 PM, Dave snoopd...@gmail.com wrote:

My username is DaveJohnson

Thanks!
- Dave

see also: http://wiki.apache.org/incubator/ContributorsGroup



Re: [VOTE] Accept Storm into the Incubator

2013-09-12 Thread Debo Dutta (dedutta)

+1

On 9/12/13 12:19 PM, Doug Cutting cutt...@apache.org wrote:

Discussion about the Storm proposal has subsided, issues raised now
seemingly resolved.

I'd like to call a vote to accept Storm as a new Incubator podling.

The proposal is included below and is also at:

  https://wiki.apache.org/incubator/StormProposal

Let's keep the vote open for four working days, until 18 September.

[ ] +1 Accept Storm into the Incubator
[ ] +0 Don't care.
[ ] -1 Don't accept Storm because...

Doug


= Storm Proposal =

== Abstract ==

Storm is a distributed, fault-tolerant, and high-performance realtime
computation system that provides strong guarantees on the processing
of data.

== Proposal ==

Storm is a distributed real-time computation system. Similar to how
Hadoop provides a set of general primitives for doing batch
processing, Storm provides a set of general primitives for doing
real-time computation. Its use cases span stream processing,
distributed RPC, continuous computation, and more. Storm has become a
preferred technology for near-realtime big-data processing by many
organizations worldwide (see a partial list at
https://github.com/nathanmarz/storm/wiki/Powered-By). As an open
source project, Storm's developer community has grown rapidly to 46
members.

== Background ==

The past decade has seen a revolution in data processing. MapReduce,
Hadoop, and related technologies have made it possible to store and
process data at scales previously unthinkable. Unfortunately, these
data processing technologies are not realtime systems, nor are they
meant to be. The lack of a "Hadoop of realtime" has become the biggest
hole in the data processing ecosystem. Storm fills that hole.

Storm was initially developed and deployed at BackType in 2011. After
7 months of development BackType was acquired by Twitter in July 2011.
Storm was open sourced in September 2011.

Storm has been under continuous development on its Github repository
since being open-sourced. It has undergone four major releases (0.5,
0.6, 0.7, 0.8) and many minor ones.


== Rationale ==

Storm is a general platform for low-latency big-data processing. It is
complementary to the existing Apache projects, such as Hadoop. Many
applications are actually exploring using both Hadoop and Storm for
big-data processing. Bringing Storm into Apache is very beneficial to
both the Apache community and the Storm community.

The rapid growth of the Storm community is empowered by open source. We
believe the Apache foundation is a great fit as the long-term home for
Storm, as it provides an established process for community-driven
development and decision making by consensus. This is exactly the
model we want for future Storm development.

== Initial Goals ==

   * Move the existing codebase to Apache
   * Integrate with the Apache development process
   * Ensure all dependencies are compliant with Apache License version 2.0
   * Incremental development and releases per Apache guidelines

== Current Status ==

Storm has undergone four major releases (0.5, 0.6, 0.7, 0.8) and many
minor ones. Storm 0.9 is about to be released. Storm is being used in
production by over 50 organizations. The Storm codebase is currently
hosted at github.com, which will seed the Apache git repository.

=== Meritocracy ===

We plan to invest in supporting a meritocracy. We will discuss the
requirements in an open forum. Several companies have already
expressed interest in this project, and we intend to invite additional
developers to participate. We will encourage and monitor community
participation so that privileges can be extended to those that
contribute.

=== Community ===

The need for a low-latency big-data processing platform in the open
source is tremendous. Storm is currently being used by at least 50
organizations worldwide (see
https://github.com/nathanmarz/storm/wiki/Powered-By), and is the most
starred Java project on Github. By bringing Storm into Apache, we
believe that the community will grow even bigger.

=== Core Developers ===

Storm was started by Nathan Marz at BackType, and now has developers
from Yahoo!, Microsoft, Alibaba, Infochimps, and many other companies.

=== Alignment ===

In the big-data processing ecosystem, Storm is a very popular
low-latency platform, while Hadoop is the primary platform for batch
processing. We believe that it will help the further growth of
big-data community by having Hadoop and Storm aligned within Apache
foundation. The alignment is also beneficial to other Apache
communities (such as Zookeeper, Thrift, Mesos). We could include
additional sub-projects, Storm-on-YARN and Storm-on-Mesos, in the near
future.

== Known Risks ==

=== Orphaned Products ===

The risk of the Storm project being abandoned is minimal. At least 50
organizations (Twitter, Yahoo!, Microsoft, Groupon, Baidu, Alibaba,
Alipay, Taobao, PARC, RocketFuel, etc.) are highly incentivized
to continue development. Many of these organizations have built
critical business applications

Re: [PROPOSAL] Storm for Apache Incubator

2013-09-04 Thread Debo Dutta (dedutta)

+1 This would be great.

On 9/4/13 1:07 AM, Nathan Marz nat...@nathanmarz.com wrote:

Hi everyone,

I'd like to propose Storm to be an Apache Incubator project. After much
thought I believe this is the right next step for the project, and I look
forward to hearing everyone's thoughts and feedback!

Here's a link to the proposal:
https://wiki.apache.org/incubator/StormProposal

The proposal is also pasted below.

-Nathan


= Storm Proposal =

== Abstract ==

Storm is a distributed, fault-tolerant, and high-performance realtime
computation system that provides strong guarantees on the processing of
data.

== Proposal ==

Storm is a distributed real-time computation system. Similar to how Hadoop
provides a set of general primitives for doing batch processing, Storm
provides a set of general primitives for doing real-time computation. Its
use cases span stream processing, distributed RPC, continuous computation,
and more. Storm has become a preferred technology for near-realtime
big-data processing by many organizations worldwide (see a partial list at
https://github.com/nathanmarz/storm/wiki/Powered-By). As an open source
project, Storm's developer community has grown rapidly to 46 members.

== Background ==

The past decade has seen a revolution in data processing. MapReduce,
Hadoop, and related technologies have made it possible to store and process
data at scales previously unthinkable. Unfortunately, these data processing
technologies are not realtime systems, nor are they meant to be. The lack
of a "Hadoop of realtime" has become the biggest hole in the data
processing ecosystem. Storm fills that hole.

Storm was initially developed and deployed at BackType in 2011. After 7
months of development BackType was acquired by Twitter in July 2011. Storm
was open sourced in September 2011.

Storm has been under continuous development on its Github repository since
being open-sourced. It has undergone four major releases (0.5, 0.6, 0.7,
0.8) and many minor ones.

== Rationale ==

Storm is a general platform for low-latency big-data processing. It is
complementary to the existing Apache projects, such as Hadoop. Many
applications are actually exploring using both Hadoop and Storm for
big-data processing. Bringing Storm into Apache is very beneficial to both
the Apache community and the Storm community.

The rapid growth of the Storm community is empowered by open source. We
believe the Apache foundation is a great fit as the long-term home for
Storm, as it provides an established process for community-driven
development and decision making by consensus. This is exactly the model we
want for future Storm development.

== Initial Goals ==

  * Move the existing codebase to Apache
  * Integrate with the Apache development process
  * Ensure all dependencies are compliant with Apache License version 2.0
  * Incremental development and releases per Apache guidelines

== Current Status ==

Storm has undergone four major releases (0.5, 0.6, 0.7, 0.8) and many
minor ones. Storm 0.9 is about to be released. Storm is being used in
production by over 50 organizations. The Storm codebase is currently hosted
at github.com, which will seed the Apache git repository.

=== Meritocracy ===

We plan to invest in supporting a meritocracy. We will discuss the
requirements in an open forum. Several companies have already expressed
interest in this project, and we intend to invite additional developers to
participate. We will encourage and monitor community participation so that
privileges can be extended to those that contribute.

=== Community ===

The need for a low-latency big-data processing platform in the open source
is tremendous. Storm is currently being used by at least 50 organizations
worldwide (see https://github.com/nathanmarz/storm/wiki/Powered-By), and is
the most starred Java project on Github. By bringing Storm into Apache, we
believe that the community will grow even bigger.

=== Core Developers ===

Storm was started by Nathan Marz at BackType, and now has developers from
Yahoo!, Microsoft, Alibaba, Infochimps, and many other companies.

=== Alignment ===

In the big-data processing ecosystem, Storm is a very popular low-latency
platform, while Hadoop is the primary platform for batch processing. We
believe that it will help the further growth of big-data community by
having Hadoop and Storm aligned within Apache foundation. The alignment is
also beneficial to other Apache communities (such as Zookeeper, Thrift,
Mesos). We could include additional sub-projects, Storm-on-YARN and
Storm-on-Mesos, in the near future.

== Known Risks ==

=== Orphaned Products ===

The risk of the Storm project being abandoned is minimal. At least 50
organizations (Twitter, Yahoo!, Microsoft, Groupon, Baidu, Alibaba,
Alipay, Taobao, PARC, RocketFuel, etc.) are highly incentivized to
continue development. Many of these organizations have built critical
business applications upon Storm, and have devoted significant internal
infrastructure

Re: [VOTE] Accept Samza into the Incubator

2013-07-28 Thread Debo Dutta (dedutta)

+1

On 7/28/13 6:27 AM, Chip Childers chip.child...@sungard.com wrote:

On Fri, Jul 26, 2013 at 12:52:49PM -0700, Jakob Homan wrote:
 Incubator-
 
 Following the discussion earlier this week, I'm calling a vote to accept
 Samza as a new Incubator project.
 
 The proposal draft is available at:
 https://wiki.apache.org/incubator/SamzaProposal,
 and is also included below. It is identical as what was proposed in the
 discussion except for removing the user list, per Marvin's suggestion.
 
 Vote is open for at least 96h and closes at the earliest on 30 July 13:00
 PDT.  I'm letting the vote run an extra day as we're bookending the weekend
 and I want to give everybody a reasonable workweek margin.
 
 [ ] +1 accept Samza in the Incubator
 [ ] +/-0
 [ ] -1 because...
 
 Here's my binding +1
 
 -Jakob

+1 (binding)


Re: [PROPOSAL] Samza Proposal

2013-07-23 Thread Debo Dutta (dedutta)

Also add storm to the mix. Storm also allows you to do back edges.

debo

On 7/23/13 6:48 PM, Henry Saputra henry.sapu...@gmail.com wrote:

Looks like this is similar to S4 (http://incubator.apache.org/s4/), which
allows stream and real-time data processing via a DAG?


- Henry


On Tue, Jul 23, 2013 at 10:47 AM, Chris Ricco
criccomini@gmail.com wrote:

 Hey All,

 Sending along an incubator proposal for Samza.

 Thanks!
 Chris

 https://wiki.apache.org/incubator/SamzaProposal

 

 == Abstract ==

 Samza is a stream processing system for running continuous computation on
 infinite streams of data.

 == Proposal ==

 Samza provides a system for processing stream data from publish-subscribe
 systems such as Apache Kafka. The developer writes a stream processing
 task, and executes it as a Samza job. Samza then routes messages between
 stream processing tasks and the publish-subscribe systems that the messages
 are addressed to.

 == Background ==

 Samza was developed at LinkedIn to enable easier processing of streaming
 data on top of Apache Kafka. Current use cases include content processing
 pipelines, aggregating operational log data, data ingestion into
 distributed database infrastructure, and measuring user activity across
 different aggregation types.

 Samza is focused on providing an easy to use framework to process streams.
 It uses Apache YARN to provide a mechanism for deploying stream processing
 tasks in a distributed cluster. Samza also takes advantage of YARN to make
 decisions about stream processor locality, co-partition of streams, and
 provide security. Apache Kafka is also leveraged to provide a mechanism to
 pass messages from one stream processor to the next. Apache Kafka is also
 used to help manage a stream processor's state, so that it can be recovered
 in the event of a failure.

 Samza is written in Scala. It was developed internally at LinkedIn to meet
 our particular use cases, but will be useful to many organizations facing a
 similar need to reliably process large amounts of streaming data.
 Therefore, we would like to share it with the ASF and begin developing a
 community of developers and users within Apache.

 == Rationale ==

 Many organizations can benefit from a reliable stream processing system
 such as Samza. While our use case of processing events from a large website
 like LinkedIn has driven the design of Samza, its uses are varied and we
 expect many new use cases to emerge. Samza provides a generic API to
 process messages from streaming infrastructure and will appeal to many
 users.

 == Current Status ==

 === Meritocracy ===

 Our intent with this incubator proposal is to start building a diverse
 developer community around Samza following the Apache meritocracy model.
 Since Samza was initially developed in late 2011, we have had fast adoption
 and contributions by multiple teams at LinkedIn. We plan to continue
 support for new contributors and work with those who contribute
 significantly to the project to make them committers.

 === Community ===

 Samza is currently being used internally at LinkedIn. We hope to extend our
 contributor base significantly and invite all those who are interested in
 building large-scale distributed systems to participate.

 === Core Developers ===

 Samza is currently being developed by four engineers at LinkedIn: Jay
 Kreps, Jakob Homan, Sriram Subramanian, and Chris Riccomini. Jakob is an
 ASF Member, Incubator PMC member and PMC member on Apache Hadoop, Kafka and
 Giraph. Jay is a member of the Apache Kafka PMC and contributor to various
 Apache projects. Chris has been an active contributor to several projects
 including Apache Kafka and Apache YARN. Sriram has contributed to Samza, as
 well as Apache Kafka.

 === Alignment ===

 The ASF is the natural choice to host the Samza project as its goal of
 encouraging community-driven open-source projects fits with our vision for
 Samza. Additionally, many other projects with which we are familiar
 and expect Samza to integrate, such as Apache ZooKeeper, YARN, HDFS
 and log4j, are hosted by the ASF, and we will benefit and provide benefit by
 close proximity to them.

 == Known Risks ==

 === Orphaned Products ===

 The core developers plan to work full time on the project. There is very
 little risk of Samza being abandoned as it is part of LinkedIn's internal
 infrastructure.

 === Inexperience with Open Source ===

 All of the core developers have experience with open source development.
 Jay and Chris have been involved with several open source projects released
 by LinkedIn, and Jay is a committer on Apache Kafka. Jakob has been
 actively involved with the ASF as a full-time Hadoop committer and PMC
 member. Sriram is a contributor to Apache Kafka.

 === Homogeneous Developers ===

 The current core developers are all from LinkedIn. However, we hope to
 establish a developer community that includes contributors from several

Re: Stratos proposal: is it possible to add another initial committer?

2013-06-17 Thread Debo Dutta (dedutta)

Thanks a lot Sanjiva, Ross, Afkham!

debo

From: Afkham Azeez afk...@gmail.com
Date: Tue, 18 Jun 2013 00:05:14 +0530
To: general@incubator.apache.org
Cc: Debo~ Dutta dedu...@cisco.com
Subject: Re: Stratos proposal: is it possible to add another initial committer?

Added Debo to the initial committer list.

Azeez

On Mon, Jun 17, 2013 at 11:30 PM, Ross Gardler 
rgard...@opendirective.com wrote:
I have not closed the vote yet because it ran over the weekend. I did
state I would leave it running into this week.

As champion I have no objection to you adding Debo. Ultimately it
reduces unnecessary traffic on this list since we won't have to
formally vote him in.

Ross

On 17 June 2013 18:37, Sanjiva Weerawarana 
sanj...@wso2.com wrote:
 Debo Dutta, cc'ed, from Cisco, will be joining the project and it took a
 bit of time to get it sorted.

 I realize this is a late request as the VOTE is already running .. is it ok
 to add him now? ;-)

 If not we will bring him after the project starts.

 Cheers,

 Sanjiva.
 --
 Sanjiva Weerawarana, Ph.D.
 Founder, Chairman & CEO; WSO2, Inc.; http://wso2.com/
 email: sanj...@wso2.com; phone: +94 11 763 9614;
 cell: +94 77 787 6880 | +1 650 265 8311
 blog: http://sanjiva.weerawarana.org/

 Lean . Enterprise . Middleware

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org

--
Afkham Azeez
Director of Architecture; WSO2, Inc.; http://wso2.com,
Member; Apache Software Foundation; http://www.apache.org/

email: az...@wso2.com cell: +94 77 3320919
blog: http://blog.afkham.org
twitter: http://twitter.com/afkham_azeez
linked-in: http://lk.linkedin.com/in/afkhamazeez

Lean . Enterprise . Middleware
