Thanks, Julien --I can include that, yes. Does this work for you?
<snip>
Catch Apache Parquet in action at the Hadoop Summit, 9-11 June 2015 in San Jose, California. The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/
</snip>

Warmest regards,
Sally

________________________________
From: Julien Le Dem <[email protected]>
To: "[email protected]" <[email protected]>; Sally Khudairi <[email protected]>
Cc: Sally Khudairi <[email protected]>; Daniel Weeks <[email protected]>; Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
Sent: Sunday, 26 April 2015, 19:14
Subject: Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Did you want to mention the parquet talks at the Hadoop summit in June?
Otherwise this looks good to me.

On Sunday, April 26, 2015, Sally Khudairi <[email protected]> wrote:

>Hi everyone --I haven't received any other feedback, so I think we're all set
>to announce tomorrow.
>I'd like to issue the press release at 7AM ET. I'll confirm when we're live.
>If there are any showstoppers, please let me know ASAP.
>Thanks so much,
>Sally
>
>From: Sally Khudairi <[email protected]>
>To: Sally Khudairi <[email protected]>; Daniel Weeks <[email protected]>; "[email protected]" <[email protected]>
>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
>Sent: Friday, 24 April 2015, 16:17
>Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]
>
>Hello again, everyone --below is the latest draft.
>
>Please review and forward any changes/additions no later than 5PM ET on Sunday
>in order for us to announce on Monday morning.
>I was aiming to go live by 7AM ET if that works for you.
>
>Kindly confirm.
>
>Thanks in advance,
>Sally
>
>= = =
>
>DRAFT :: NOT FOR DISTRIBUTION
>
>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level
>Project
>
>Open Source storage format for the Apache™ Hadoop® ecosystem in use at
>Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations
>
>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the
>all-volunteer developers, stewards, and incubators of more than 350 Open
>Source projects and initiatives, announced today that Apache™ Parquet™ has
>graduated from the Apache Incubator to become a Top-Level Project (TLP),
>signifying that the project's community and products have been well-governed
>under the ASF's meritocratic process and principles.
>
>"The incubation process at Apache has been fantastic and really the last step
>of making Parquet a community driven standard fully integrated within the
>greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache
>Parquet.
>
>Apache Parquet is an Open Source columnar storage format for the Apache™
>Hadoop® ecosystem, built to work across programming languages and much more:
>
> - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading,
>   Crunch, Kite)
> - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
> - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache
>   Pig, Presto, Apache Spark SQL)
>
>"At Twitter, Parquet has helped us scale our big data usage by in some cases
>reducing storage requirements by one third on large datasets as well as scan
>and deserialization time. This translated into hardware savings as well as
>reduced latency for accessing the data. Furthermore, Parquet being integrated
>with so many tools creates opportunities and flexibility regarding query
>engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's
>just fantastic to see it graduate to a top-level project and we look forward
>to further collaborating with the Apache Parquet community to continually
>improve performance."
>
>"Parquet’s integration with other object models, like Avro and Thrift, has
>been a key feature for our customers," said Ryan Blue, Software Engineer at
>Cloudera. "They can take advantage of columnar storage without changing the
>classes they already use in their production applications."
>
>"At Netflix, Parquet is the primary storage format for data warehousing. More
>than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that
>we query across a wide range of tools including Apache Hive, Apache Pig,
>Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit of
>columnar projection and statistics is a game changer for our big data
>platform," said Daniel Weeks, Software Engineer at Netflix. "We look forward
>to working with the Apache community to advance the state of big data storage
>with Parquet and are excited to see the project graduate to full Apache
>status."
>
>"Stripe's data warehouse has been built on Parquet from the beginning," said
>Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from
>data import to machine learning to adhoc SQL analysis, uses Apache Parquet as
>the common interchange format."
>
>"I was extremely happy to see Parquet arrive as an Incubator project," said
>Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect,
>Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory.
>"After talking with some in its community there was a real match with this
>columnar data format technology and its community with the way that we do
>things here at the ASF. Parquet has had an exemplar Incubation, and the
>project has big things ahead of it. I am encouraging my Data Science Team at
>NASA to evaluate it for data representation especially as it relates to our
>science holdings in Earth, planetary and space sciences, and astrophysics."
>
>The Apache Parquet project welcomes contributions and community participation
>through mailing lists, face-to-face MeetUps, and user events. For more
>information, visit http://parquet.apache.org/community/
>
>Availability and Oversight
>Apache Parquet software is released under the Apache License v2.0 and is
>overseen by a self-selected team of active contributors to the project. A
>Project Management Committee (PMC) guides the Project's day-to-day operations,
>including community development and product releases. For downloads,
>documentation, and ways to become involved with Apache Parquet, visit
>http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>
>About the Apache Incubator
>The Apache Incubator is the entry path for projects and codebases wishing to
>become part of the efforts at The Apache Software Foundation. All code
>donations from external organizations and existing external projects wishing
>to join the ASF enter through the Incubator to: 1) ensure all donations are in
>accordance with the ASF legal standards; and 2) develop new communities that
>adhere to our guiding principles. Incubation is required of all newly accepted
>projects until a further review indicates that the infrastructure,
>communications, and decision making process have stabilized in a manner
>consistent with other successful ASF projects. While incubation status is not
>necessarily a reflection of the completeness or stability of the code, it does
>indicate that the project has yet to be fully endorsed by the ASF. For more
>information, visit http://incubator.apache.org/.
>
>About The Apache Software Foundation (ASF)
>Established in 1999, the all-volunteer Foundation oversees more than 350
>leading Open Source projects, including Apache HTTP Server --the world's most
>popular Web server software. Through the ASF's meritocratic process known as
>"The Apache Way," more than 500 individual Members and 4,500 Committers
>successfully collaborate to develop freely available enterprise-grade
>software, benefiting millions of users worldwide: thousands of software
>solutions are distributed under the Apache License; and the community actively
>participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the
>Foundation's official user conference, trainings, and expo. The ASF is a US
>501(c)(3) charitable organization, funded by individual donations and
>corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix,
>Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting,
>iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For
>more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>
>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill",
>"Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig",
>"Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift",
>"Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of
>the Apache Software Foundation in the United States and/or other countries.
>All other brands and trademarks are the property of their respective owners.
>
># # #
>
>[MEDIA CONTACT:SALLY]
>________________________________
>
>From: Sally Khudairi <[email protected]>
>To: Sally Khudairi <[email protected]>; Daniel Weeks <[email protected]>; "[email protected]" <[email protected]>
>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
>Sent: Friday, 24 April 2015, 13:56
>Subject: Re: Graduation blog post?
>
>Done.
>
>ALL: can you please let me know if there are any events that Parquet will be
>at? Presenting? Hosting? etc.
>
>Thank you!
>
>-Sally
>
>________________________________
>From: Sally Khudairi <[email protected]>
>To: Daniel Weeks <[email protected]>; "[email protected]" <[email protected]>
>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
>Sent: Friday, 24 April 2015, 13:40
>Subject: Re: Graduation blog post?
>
>Of course --I'll fix that now!
>
>Sorry about that, Daniel.
>
>-Sally
>
>________________________________
>From: Daniel Weeks <[email protected]>
>To: [email protected]; Sally Khudairi <[email protected]>
>Cc: Chris Aniszczyk <[email protected]>; Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
>Sent: Friday, 24 April 2015, 13:38
>Subject: Re: Graduation blog post?
>
>Sally,
>
>Just wanted to comment that my last name is misspelled in the Netflix
>testimonial. Can someone fix that? (it's Weeks, not Week)
>
>Thanks,
>Dan
>
>On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi <[email protected]> wrote:
>
>>Hi everyone --there's been the addition of a quote from Stripe:
>>
>>"Stripe's data warehouse has been built on Parquet from the beginning," said
>>Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline,
>>from data import to machine learning to adhoc SQL analysis, uses Apache
>>Parquet as the common interchange format."
>>
>>--please note that I added "Apache" to "Parquet" in the second sentence.
>>Stripe has also been added to the sub-head.
>>
>>Are we waiting for quotes from anyone else? If not, I can add a closing
>>sentence and forward the final copy later today.
>>
>>Thanks so much,
>>Sally
>>
>>----- Original Message -----
>>
>>From: Sally Khudairi <[email protected]>
>>To: Chris Aniszczyk <[email protected]>; "[email protected]" <[email protected]>
>>Cc: Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
>>Sent: Thursday, 23 April 2015, 15:25
>>Subject: Re: Graduation blog post?
>>
>>Hello everyone --below is the draft thus far.
>>
>>I was aiming to announce on Monday by 7AM ET, but noticed that we're waiting
>>for additional quotes.
>>
>>Also, should we get a closing quote from Julien? Perhaps something that
>>invites additional community participation?
>>
>>Please let me know your thoughts.
>>
>>Thanks so much,
>>Sally
>>
>>= = =
>>
>>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level
>>Project
>>
>>Open Source storage format for the Apache™ Hadoop® ecosystem in use at
>>Cloudera, NASA, Netflix, and Twitter, among other organizations
>>
>>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the
>>all-volunteer developers, stewards, and incubators of more than 350 Open
>>Source projects and initiatives, announced today that Apache™ Parquet™ has
>>graduated from the Apache Incubator to become a Top-Level Project (TLP),
>>signifying that the project's community and products have been well-governed
>>under the ASF's meritocratic process and principles.
>>
>>"The incubation process at Apache has been fantastic and really the last step
>>of making Parquet a community driven standard fully integrated within the
>>greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache
>>Parquet.
>>
>>Apache Parquet is an Open Source columnar storage format for the Apache™
>>Hadoop® ecosystem, built to work across programming languages and much more:
>>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading,
>>Crunch, Kite)
>>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache
>>Pig, Presto, Apache Spark SQL)
>>
>>"At Twitter, Parquet has helped us scale our big data usage by in some cases
>>reducing storage requirements by one third on large datasets as well as scan
>>and deserialization time. This translated into hardware savings as well as
>>reduced latency for accessing the data. Furthermore, Parquet being integrated
>>with so many tools creates opportunities and flexibility regarding query
>>engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally,
>>it's just fantastic to see it graduate to a top-level project and we look
>>forward to further collaborating with the Apache Parquet community to
>>continually improve performance."
>>
>>"Parquet’s integration with other object models, like Avro and Thrift, has
>>been a key feature for our customers," said Ryan Blue, Software Engineer at
>>Cloudera. "They can take advantage of columnar storage without changing the
>>classes they already use in their production applications."
>>
>>"At Netflix, Parquet is the primary storage format for data warehousing. More
>>than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that
>>we query across a wide range of tools including Apache Hive, Apache Pig,
>>Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit
>>of columnar projection and statistics is a game changer for our big data
>>platform," said Daniel Week, Software Engineer at Netflix. "We look forward
>>to working with the Apache community to advance the state of big data storage
>>with Parquet and are excited to see the project graduate to full Apache
>>status."
>>
>>"I was extremely happy to see Parquet arrive as an Incubator project," said
>>Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect,
>>Instrument and Science Data Systems Section at NASA Jet Propulsion
>>Laboratory. "After talking with some in its community there was a real match
>>with this columnar data format technology and its community with the way that
>>we do things here at the ASF. Parquet has had an exemplar Incubation, and the
>>project has big things ahead of it. I am encouraging my Data Science Team at
>>NASA to evaluate it for data representation especially as it relates to our
>>science holdings in Earth, planetary and space sciences, and astrophysics."
>>
>>Stripe? @cra reached out to Avi, said he would get something by Monday
>>Criteo?
>>
>>@@CLOSING QUOTE FROM JULIEN?
>>
>>Availability and Oversight
>>Apache Parquet software is released under the Apache License v2.0 and is
>>overseen by a self-selected team of active contributors to the project. A
>>Project Management Committee (PMC) guides the Project's day-to-day
>>operations, including community development and product releases. For
>>downloads, documentation, and ways to become involved with Apache Parquet,
>>visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>>
>>About the Apache Incubator
>>The Apache Incubator is the entry path for projects and codebases wishing to
>>become part of the efforts at The Apache Software Foundation. All code
>>donations from external organizations and existing external projects wishing
>>to join the ASF enter through the Incubator to: 1) ensure all donations are
>>in accordance with the ASF legal standards; and 2) develop new communities
>>that adhere to our guiding principles. Incubation is required of all newly
>>accepted projects until a further review indicates that the infrastructure,
>>communications, and decision making process have stabilized in a manner
>>consistent with other successful ASF projects. While incubation status is not
>>necessarily a reflection of the completeness or stability of the code, it
>>does indicate that the project has yet to be fully endorsed by the ASF. For
>>more information, visit http://incubator.apache.org/.
>>
>>About The Apache Software Foundation (ASF)
>>Established in 1999, the all-volunteer Foundation oversees more than 350
>>leading Open Source projects, including Apache HTTP Server --the world's most
>>popular Web server software. Through the ASF's meritocratic process known as
>>"The Apache Way," more than 500 individual Members and 4,500 Committers
>>successfully collaborate to develop freely available enterprise-grade
>>software, benefiting millions of users worldwide: thousands of software
>>solutions are distributed under the Apache License; and the community
>>actively participates in ASF mailing lists, mentoring initiatives, and
>>ApacheCon, the Foundation's official user conference, trainings, and expo.
>>The ASF is a US 501(c)(3) charitable organization, funded by individual
>>donations and corporate sponsors including Bloomberg, Budget Direct, Cerner,
>>Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion
>>Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and
>>Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF
>>on Twitter.
>>
>>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill",
>>"Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet",
>>"Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo",
>>"Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or
>>trademarks of the Apache Software Foundation in the United States and/or
>>other countries. All other brands and trademarks are the property of their
>>respective owners.
>>
>># # #
>>
>>________________________________
>>
>>From: Chris Aniszczyk <[email protected]>
>>To: "[email protected]" <[email protected]>
>>Cc: Sally Khudairi <[email protected]>; Ryan Blue <[email protected]>; "[email protected]" <[email protected]>; "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>
>>Sent: Wednesday, 22 April 2015, 14:51
>>Subject: Re: Graduation blog post?
>>
>>Thanks Daniel, I added your quote.
>>
>>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <[email protected]> wrote:
>>
>>>Netflix Testimonial:
>>>
>>>At Netflix, Parquet is the primary storage format for data warehousing.
>>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
>>>data that we query across a wide range of tools including Apache Hive,
>>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The
>>>performance benefit of columnar projection and statistics is a game changer
>>>for our big data platform. We look forward to working with the Apache
>>>community to advance the state of big data storage with Parquet and are
>>>excited to see the project graduate to full Apache status.
>>>
>>>Daniel Weeks
>>>Engineering Manager - Big Data Compute
>>>Netflix
>>>
>>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <[email protected]> wrote:
>>>
>>>> Thanks for the draft thus far, Ryan.
>>>> Can we please include at least one more industry testimonial?
>>>> Also, if you can please provide edit access to my account at
>>>> [email protected], that would be great.
>>>> Thanks in advance for this!
>>>> -Sally
>>>>
>>>> From: Ryan Blue <[email protected]>
>>>> To: [email protected]; Sally Khudairi <[email protected]>
>>>> Cc: "Mattmann, Chris A (3980)" <[email protected]>; "[email protected]" <[email protected]>; "[email protected]" <[email protected]>
>>>> Sent: Monday, 20 April 2015, 15:48
>>>> Subject: Re: Graduation blog post?
>>>>
>>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
>>>> > Hey Sally
>>>> > I've got root@ karma and will take care of the infra side of things for
>>>> > us once the board has successfully voted on our resolution
>>>> >
>>>> > -Jake
>>>>
>>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
>>>> with this news so they don't worry about it.
>>>>
>>>> rb
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Cloudera, Inc.
>>>>
>>
>>--
>>
>>Cheers,
>>
>>Chris Aniszczyk
>>http://aniszczyk.org
>>+1 512 961 6719
