My answers are inline. Feel free to edit or add to them!
rb

On 04/27/2015 01:09 PM, Sally Khudairi wrote:
Hello Julien and Parquet PMC --per below, the SD Times is looking to cover Parquet for a story tomorrow morning and needs the following questions answered. If you can please forward your responses, I'll be happy to coordinate with Rob.

Thanks in advance,
Sally

[From the mobile; please excuse top-posting, spelling/spacing errors, and brevity]

----- Forwarded message -----
From: "Rob Marvin" <[email protected]>
To: "Sally Khudairi" <[email protected]>
Subject: SD Times story on Apache Parquet graduating to TLP
Date: Mon, Apr 27, 2015 15:41

Hi Sally,

I hope you're well. I'm reaching out because I'm putting together a brief SD Times story on Apache Parquet's elevation to Top-Level Project, and I'd like to get an original quote or two to accompany the story. Can you ping Julien Le Dem or another ASF member on the Apache Parquet team for a brief comment or two? We're looking to run the story by tomorrow morning at the latest.

Here are a couple questions to guide the comments:

-What is it that makes Apache Parquet unique in what the columnar storage format brings to the Hadoop ecosystem and the many companies using the project in production?
Bring your own object model: Lots of applications are based on existing row-oriented formats, like Avro and Thrift, that come with objects to represent the data. A great feature of Parquet is that it is built to work natively with those existing classes, so you don't have to change the application to go from a row-oriented to a column-oriented format. Parquet can read directly to Avro records, Spark data frames, Hive's internal writables, and others.
-What does Parquet's elevation to TLP signify for its development going forward, and what can developers expect in terms of the future growth and evolution of the project?
Graduation from the Incubator to become a TLP shows that the Parquet project has a healthy Apache community. I think that's one of the best votes of confidence you could have in an open source project: people care about it, put time into it, and know how to work together.
That's an asset to future growth and we can see it in the ongoing development efforts. For example, experts from the Drill, Presto, and Hive projects are collaborating on a vectorized API for accessing Parquet data. It's great that we can work together on Parquet as a community standard across those projects.
In more practical terms, we've finished a lot of the migration work to become part of the Apache Software Foundation and we're looking forward to a more regular release cadence again.
That's it! Quick and easy. Let me know if you have any questions and when I can expect the quotes. Thanks in advance for your help!

Best,
Rob

--
Rob Marvin <http://sdt.bz/about/RobMarvin>
Online & Social Media Editor
BZ Media LLC, SD Times
O: (631) 421-4158 x131
C: (516) 987-9926
[email protected] <mailto:[email protected]>
--
Ryan Blue
Software Engineer
Cloudera, Inc.
