Re: [DISCUSS] Apache Dataflow Incubator Proposal
We are developing parallel machine learning algorithms for a research project and are very interested in DataFlow. I would like to contribute to this project as well. It would be great if you could add me.

Thanks,
Supun...

On Thu, Jan 21, 2016 at 6:29 PM, Mayank Bansal wrote:
> Hi Jean,
>
> Nice Proposal.
>
> I wanted to contribute to this project. Can you please add me too?
>
> Thanks a lot for the help
>
> Thanks,
> Mayank
>
> On Thu, Jan 21, 2016 at 8:07 AM, Jean-Baptiste Onofré wrote:
> > Hey Alex,
> >
> > awesome: I added you on the proposal.
> >
> > Thanks,
> > Regards
> > JB
> >
> > On 01/21/2016 05:03 PM, Alexander Bezzubov wrote:
> >> Hi,
> >>
> >> it's great to see DataFlow becoming part of the Apache ecosystem, thank you for bringing it in.
> >> I would be happy to get involved and help.
> >>
> >> --
> >> Alex
> >>
> >> On Thu, Jan 21, 2016 at 8:42 PM, Jean-Baptiste Onofré wrote:
> >>> Perfect: done, you are on the proposal.
> >>>
> >>> Thanks !
> >>> Regards
> >>> JB
> >>>
> >>> On 01/21/2016 11:55 AM, chatz wrote:
> >>>> Charitha Elvitigala
> >>>>
> >>>> On 21 January 2016 at 16:17, Jean-Baptiste Onofré wrote:
> >>>>> Hi Chatz,
> >>>>>
> >>>>> sure, what name should I use on the proposal, Charitha ?
> >>>>>
> >>>>> Regards
> >>>>> JB
> >>>>>
> >>>>> On 01/21/2016 11:32 AM, chatz wrote:
> >>>>>> Hi Jean,
> >>>>>>
> >>>>>> I’d be interested in contributing as well.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Chatz
> >>>>>>
> >>>>>> On 21 January 2016 at 14:22, Jean-Baptiste Onofré wrote:
> >>>>>>> Sweet: you are on the proposal ;)
> >>>>>>>
> >>>>>>> Thanks !
> >>>>>>> Regards
> >>>>>>> JB
> >>>>>>>
> >>>>>>> On 01/21/2016 08:55 AM, Byung-Gon Chun wrote:
> >>>>>>>> This looks very interesting. I'm interested in contributing.
> >>>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>> -Gon
> >>>>>>>>
> >>>>>>>> ---
> >>>>>>>> Byung-Gon Chun
> >>>>>>>>
> >>>>>>>> On Thu, Jan 21, 2016 at 1:32 AM, James Malone <jamesmal...@google.com.invalid> wrote:
> >>>>>>>>> Hello everyone,
> >>>>>>>>>
> >>>>>>>>> Attached to this message is a proposed new project - Apache Dataflow, a unified programming model for data processing and integration.
> >>>>>>>>>
> >>>>>>>>> The text of the proposal is included below. Additionally, the proposal is in draft form on the wiki where we will make any required changes:
> >>>>>>>>>
> >>>>>>>>> https://wiki.apache.org/incubator/DataflowProposal
> >>>>>>>>>
> >>>>>>>>> We look forward to your feedback and input.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>>
> >>>>>>>>> James
> >>>>>>>>>
> >>>>>>>>> = Apache Dataflow =
> >>>>>>>>>
> >>>>>>>>> == Abstract ==
> >>>>>>>>>
> >>>>>>>>> Dataflow is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs). Dataflow pipelines simplify the mechanics of large-scale batch and streaming data processing and can run on a number of runtimes like Apache Flink, Apache Spark, and Google Cloud Dataflow (a cloud service). Dataflow also brings DSLs in different languages, allowing users to easily implement their data integration processes.
> >>>>>>>>>
> >>>>>>>>> == Proposal ==
> >>>>>>>>>
> >>>>>>>>> Dataflow is a simple, flexible, and powerful system for distributed data processing at any scale. Dataflow provides a unified programming model, a software development kit to define and construct data processing pipelines, and runners to execute Dataflow pipelines in several runtime engines, like Apache Spark, Apache Flink, or Google Cloud Dataflow. Dataflow can be used for a variety of streaming or batch data processing goals including ETL, stream analysis, and aggregate computation. The underlying programming model for Dataflow provides MapReduce-like parallelism, combined with support for powerful data windowing, and fine-grained correctness control.
> >>>>>>>>>
> >>>>>>>>> == Background ==
> >>>>>>>>>
> >>>>>>>>> Dataflow started as a set of Google projects focused on making data processing easier, faster, and less costly. The Dataflow model is a successor to MapReduce, FlumeJava, and MillWheel inside Google and is focused on
Re: [PROPOSAL] Heron
> > > logic on the stream of tuples. Parallelism is achieved via process-based isolation of Heron instances, which provides predictable performance while simplifying debugging. The containers are allocated and managed by the scheduler framework based on resource availability of nodes in the cluster. The metadata for the topology, such as the physical plan and execution details, are stored in the pluggable Heron State Manager (e.g. Apache ZooKeeper).
> > >
> > > ## Rationale
> > >
> > > Heron is a general-purpose, modular and extensible platform that can be leveraged to support common real-time analytics use cases. There is an increasing demand for open-source, scalable real-time analytics systems. We believe that Heron can be leveraged by other organizations to build streaming applications that can benefit from its robustness, high performance, adaptability to cloud environments and ease of use. Moreover, we hope that open-sourcing Heron will help to further evolve the technology as the project attracts contributors with diverse backgrounds and areas of expertise.
> > >
> > > We believe the Apache Software Foundation is a great fit as the long-term home for Heron, as it provides an established process for community-driven development and decision making by consensus. This is exactly the model we want for future Heron development.
> > >
> > > ## Initial Goals
> > >
> > > * Move the existing codebase, website, documentation, and mailing lists to Apache-hosted infrastructure.
> > > * Integrate with the Apache development process.
> > > * Ensure all dependencies are compliant with Apache License version 2.0.
> > > * Incrementally develop and release per Apache guidelines.
> > >
> > > ## Current Status
> > >
> > > Heron is a stable project used in production at Twitter since 2014 and open sourced under the ASL v2 license in 2016. The Heron source code is currently hosted on GitHub (https://github.com/twitter/heron), which will seed the Apache git repository.
> > >
> > > ### Meritocracy
> > >
> > > By submitting this incubator proposal, we’re expressing our intent to build a diverse developer community around Heron that will conduct itself according to The Apache Way and use a meritocratic means of building its committer base. Several companies and universities have already expressed interest in and contributed to Heron. Our goal is to grow the Heron community by encouraging open communication, contribution and participation of all types, and ensuring that contributors are recognized appropriately.
> > >
> > > ### Community
> > >
> > > Heron is currently being used by Twitter, Google, Machine Zone and ndustrial.io and has received significant contributions from Microsoft and Streamlio. By bringing Heron into the Apache ecosystem, we believe we can attract even more developers who are interested in creating real-time systems to build the project's contributor base.
> > >
> > > ### Core Developers
> > >
> > > Current core developers are engineers from Twitter, Google, Microsoft and Streamlio.
> > >
> > > ### Alignment
> > >
> > > Heron utilizes a number of Apache technologies. Heron leverages Apache ZooKeeper for coordination and has scheduler implementations to integrate with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via Apache REEF) as well as spout implementations to integrate with Apache Kafka and metrics implementations to integrate with Scribe. Heron also implements the Apache Storm user-level API, which allows topologies written against Storm to run in Heron. We believe that having Heron at Apache will help further the growth of the streaming compute community, as well as encourage cooperation and developer cross-pollination with other Apache projects.
> > >
> > > ## Known Risks
> > >
> > > ### Orphaned Products
> > >
> > > The risk of the
Re: [PROPOSAL] Heron
Thank you, William, for offering to help with the incubation process. It will be really helpful.

Supun..

On Wed, Jun 14, 2017 at 11:04 PM, William Markito Oliveira <william.mark...@gmail.com> wrote:
> Howdy!
>
> If Heron is looking for some help with the incubation process, I'd love to help while the Geode experience is still fresh in my mind, and given that it's a project/space that I have an interest in. Since I'm not an ASF member, I don't think I can offer to be a mentor, but I can probably still help and participate in the process.
>
> Thanks!
>
> On Wed, Jun 14, 2017 at 7:54 PM, P. Taylor Goetz wrote:
> > Hi Bill/Supun,
> >
> > Sorry for not being a little more clear. I was asking more about how the Heron community would seek to engage with the Storm community at the *community* level as opposed to the technical level (i.e. “Community over Code”).
> >
> > I’ve been asked by many why this has never happened, and have always struggled to answer. Maybe you could help answer that question as well as if and how that might change if Heron were to incubate.
> >
> > Another quick question: The proposal mentions Heron being used in production at Google, but some Google employees I recently spoke to seemed to contradict that. Could you explain? Note that’s nothing that would preclude the project from incubating, I’m just curious.
> >
> > -Taylor
> >
> > On Jun 14, 2017, at 7:35 AM, Supun Kamburugamuve wrote:
> > > Hi Taylor,
> > >
> > > For me, one of the interesting differences between Heron and Storm is the execution model. Storm uses a shared memory model while Heron uses a process-based model. It will be interesting to see how these two evolve.
> > >
> > > Thanks,
> > > Supun..
> > >
> > > On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham wrote:
> > >> Hi Taylor,
> > >>
> > >> Thanks for the mentor offer, we'd be glad to have your help.
> > >>
> > >> I think the best place for collaboration would be around the evolution of the API. In addition we plan to look more into DSL solutions which we could potentially collaborate on. This could be Trident, or Beam or something else, but there could be synergies for future development here.
> > >>
> > >> thanks,
> > >> Bill
> > >>
> > >> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz wrote:
> > >>> Hi Bill,
> > >>>
> > >>> Could you comment on how/if the Heron community would be willing to work with the Storm community? I've seen a number of new features in Storm being ported to Heron, but I have yet to see any attempt by the Heron community to engage with the Apache Storm community.
> > >>>
> > >>> I don't think it would be too far off to say that the relationship between Heron and Apache Storm has been somewhat adversarial. The pre- and post-open sourcing marketing around Heron seemed, at least to me, somewhat aggressively negative toward Storm.
> > >>>
> > >>> As a peer to Apache Storm, how would the proposed "Apache Heron" community work to collaborate with the Storm community? If Heron is adopting API changes in Storm, then it seems there is an opportunity for collaboration.
> > >>>
> > >>> Don't take any of this as an objection to incubating the project. I would support it. I would also be willing to be a mentor, if you would consider taking on another.
> > >>>
> > >>> -Taylor
> > >>>
> > >>> On Jun 8, 2017, at 1:23 PM, Bill Graham wrote:
> > >>>> Dear Apache Incubator Community,
> > >>>>
> > >>>> We are excited to share our proposal for discussion and feedback for entering Apache Incubation. Heron is a real-time, distributed, fault-tolerant stream processing engine.
> > >>>>
> > >>>> Our pro
Re: Showcase your project at ApacheCON at a Podling's Shark Tank
We are thinking about proposing a project to the incubator. Would this be open to such a project?

Best,
Supun..

On Thu, Aug 15, 2019 at 10:05 AM Antoine Toulme wrote:
> Hello Roman,
>
> The Tuweni podling is interested. I am attending ApacheCon EU and would like a chance to present.
>
> Cheers,
>
> Antoine
>
> On Aug 14, 2019, at 1:45 PM, Roman Shaposhnik wrote:
> > Hi Podlings!
> >
> > in less than a month we're going to have our first ApacheCON this year -- the one in Las Vegas. In about two months there will be one more in Berlin.
> >
> > These are not your regular ApacheCONs -- these are 20th Anniversary of ASF ApacheCONs! In other words, these are not to be missed!
> >
> > And even if your talk didn't get accepted -- you still get an opportunity to highlight your project to what's likely going to be the biggest audience attending.
> >
> > Here's how: if you (or any community member who's passionate about your project) are going to be at either of those ApacheCONs consider signing up for Podling's Shark Tank events:
> >    https://www.apachecon.com/acna19/s/#/scheduledEvent/1038
> >    https://aceu19.apachecon.com/session/podlings-shark-tank
> >
> > Each project presenting will get ~10 min for the pitch and ~5 min of panel grilling them on all sorts of things. Kind of like this ;-)
> > https://www.youtube.com/watch?v=wmenN7NEdBc
> >
> > You've got nothing to lose (in fact, the opposite: you're likely to get a prize!) and you will get a chance to receive feedback that might actually help you grow your community and ultimately graduate to the TLP status. And! Given our awesome panel of judges:
> >    * Myrle Krantz
> >    * Justin Mclean
> >    * Craig Russell
> >    * Shane Curcuru
> > We guarantee this to be a fun and useful event for your community!
> >
> > We will be tracking signups over here:
> >    https://wiki.apache.org/apachecon/ACNA19PodlingSharkTank
> >    https://wiki.apache.org/apachecon/ACEU19PodlingSharkTank
> > but for now:
> >
> > SIMPLY REPLY TO THIS EMAIL if you're interested.
> >
> > It is first come, first served -- so don't delay -- sign up today!
> >
> > Thanks,
> > Roman.
> >
> > -----
> > To unsubscribe, e-mail: dev-unsubscr...@tuweni.apache.org
> > For additional commands, e-mail: dev-h...@tuweni.apache.org
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org

--
Supun Kamburugamuve, PhD
Digital Science Center, Indiana University
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun@apache.org; Mobile: +1 812 219 2563