[this announcement is available online at https://s.apache.org/HK21 ]

Pioneering Open Source distributed enterprise framework powers US$166B Big Data 
ecosystem

Wakefield, MA —23 January 2019— The Apache Software Foundation (ASF), the 
all-volunteer developers, stewards, and incubators of more than 350 Open Source 
projects and initiatives, today announced Apache® Hadoop® v3.2.0, the latest 
version of the Open Source software framework for reliable, scalable, 
distributed computing.

Now in its 11th year, Apache Hadoop underpins the US$166B Big Data ecosystem 
(source: IDC), enabling data applications to run and be managed on large 
hardware clusters in a distributed computing environment. "Apache Hadoop 
has been at the center of this big data transformation, providing an ecosystem 
with tools for businesses to store and process data on a scale that was unheard 
of several years ago," according to Accenture Technology Labs.

"This latest release unlocks the powerful feature set the Apache Hadoop 
community has been working on for more than nine months," said Vinod Kumar 
Vavilapalli, Vice President of Apache Hadoop. "It further diversifies the 
platform by building on the cloud connector enhancements from Apache Hadoop 
3.0.0 and opening it up for deep learning use-cases and long-running apps."

Apache Hadoop 3.2.0 highlights include:

 - ABFS Filesystem connector: supports the latest Azure Data Lake Storage Gen2;
 - Enhanced S3A connector: includes better resilience to throttled AWS S3 and 
DynamoDB I/O;
 - Node Attributes Support in YARN: lets operators tag nodes with multiple 
labels based on their attributes, and supports placing containers based on 
expressions over those labels;
 - Storage Policy Satisfier: lets HDFS (Hadoop Distributed File System) 
applications move blocks between storage types as they set storage policies 
on files and directories;
 - Hadoop Submarine: enables data engineers to easily develop, train, and 
deploy deep learning models (in TensorFlow) on the very same Hadoop YARN 
cluster;
 - C++ HDFS client: enables asynchronous I/O to HDFS, which benefits downstream 
projects such as Apache ORC;
 - Upgrades for long-running services: supports seamless in-place upgrades of 
long-running containers via the YARN Native Service API (application 
programming interface) and CLI (command-line interface).

"This is one of the biggest releases in Apache Hadoop 3.x line which brings 
many new features and over 1,000 changes," said Sunil Govindan, Apache Hadoop 
3.2.0 release manager. "We are pleased to announce that Apache Hadoop 3.2.0 is 
available to take your data management requirements to the next level. Thanks 
to all our contributors who helped to make this release happen."

Apache Hadoop is widely deployed at numerous enterprises and institutions 
worldwide, such as Adobe, Alibaba, Amazon Web Services, AOL, Apple, Capital 
One, Cloudera, Cornell University, eBay, ESA Calvalus satellite mission, 
Facebook, foursquare, Google, Hortonworks, HP, Huawei, Hulu, IBM, Intel, 
LinkedIn, Microsoft, Netflix, The New York Times, Rackspace, Rakuten, SAP, 
Tencent, Teradata, Tesla Motors, Twitter, Uber, and Yahoo. The project 
maintains a list of educational and production users, as well as companies that 
offer Hadoop-related services at https://wiki.apache.org/hadoop/PoweredBy

Global Knowledge hails, "...the open-source Apache Hadoop platform changes the 
economics and dynamics of large-scale data analytics due to its scalability, 
cost effectiveness, flexibility, and built-in fault tolerance. It makes 
possible the massive parallel computing that today's data analysis requires."

Hadoop is proven at scale: Netflix captures 500+B daily events using Apache 
Hadoop. Twitter uses Apache Hadoop to handle 5B+ sessions a day in real time. 
Twitter’s 10,000+ node cluster processes and analyzes more than a zettabyte of 
raw data through 200B+ tweets per year. Facebook’s cluster of 4,000+ machines 
stores 300+ petabytes and grows by 4 petabytes of newly generated data each 
day. Microsoft uses Apache Hadoop YARN to run the internal Cosmos data 
lake, which operates over hundreds of thousands of nodes and manages billions 
of containers per day.

Transparency Market Research recently reported that the global Hadoop market is 
anticipated to grow at a staggering 29% CAGR, reaching a market valuation of 
US$37.7B by the end of 2023.

Apache Hadoop remains one of the most active projects at the ASF: it ranks #1 
for Apache project repositories by code commits, and is the #5 repository by 
size (3,881,797 lines of code).

"The Apache Hadoop community continues to go from strength to strength in 
further driving innovation in Big Data," added Vavilapalli. "We hope that 
developers, operators and users leverage our latest release in fulfilling their 
data management needs."

Catch Apache Hadoop in action at the Strata conference, 25-28 March 2019 in San 
Francisco, and at dozens of Hadoop MeetUps held around the world, including one 
on 30 January 2019 at LinkedIn in Sunnyvale, California.

Availability and Oversight
Apache Hadoop software is released under the Apache License v2.0 and is 
overseen by a self-selected team of active contributors to the project. A 
Project Management Committee (PMC) guides the Project's day-to-day operations, 
including community development and product releases. For downloads, 
documentation, and ways to become involved with Apache Hadoop, visit 
http://hadoop.apache.org/ and https://twitter.com/hadoop

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 
leading Open Source projects, including Apache HTTP Server --the world's most 
popular Web server software. Through the ASF's meritocratic process known as 
"The Apache Way," more than 730 individual Members and 7,000 Committers across 
six continents successfully collaborate to develop freely available 
enterprise-grade software, benefiting millions of users worldwide: thousands of 
software solutions are distributed under the Apache License; and the community 
actively participates in ASF mailing lists, mentoring initiatives, and 
ApacheCon, the Foundation's official global conference series. The ASF is a US 
501(c)(3) charitable organization, funded by individual donations and corporate 
sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, 
Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, 
Google, Handshake, Hortonworks, Huawei, IBM, Indeed, Inspur, LeaseWeb, 
Microsoft, Oath, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red 
Hat, Target, Tencent, and Union Investment. For more information, visit 
http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", and 
"ApacheCon" are registered trademarks or trademarks of the Apache Software 
Foundation in the United States and/or other countries. All other brands and 
trademarks are the property of their respective owners.

# # #

