[this announcement is available online at https://s.apache.org/NI73 ]

Newest addition to Apache Big Data ecosystem used for continual,
incremental processing of data at petabyte scale

Forest Hill, MD –26 July 2017– The Apache Software Foundation (ASF), the
all-volunteer developers, stewards, and incubators of more than 350 Open
Source projects and initiatives, announced today that Apache® Fluo™ has
graduated from the Apache Incubator to become a Top-Level Project (TLP),
signifying that the project's community and products have been
well-governed under the ASF's meritocratic process and principles.

Apache Fluo is a distributed system for incrementally processing large
data sets stored in Apache Accumulo (the sorted, distributed key/value
store based on Google's Bigtable, built on top of Apache Hadoop, Apache
ZooKeeper, and Apache Thrift). With Fluo, users can continuously join
new data into large existing data sets without reprocessing all data.
Unlike batch and streaming frameworks, Fluo offers much lower latency
and can operate on extremely large data sets.

"I am very excited to see Apache Fluo graduate and I would like to thank
our mentors for all their help, the Apache Incubator Project Management
Committee for its advice and guidance, everyone in the Fluo community,
and Google for publishing the research upon which Fluo is based," said
Keith Turner, Vice President of Apache Fluo. "As a result of
collaboration within the community, we are graduating with a beautifully
designed piece of software."

Based on Percolator (built on top of Bigtable to support incremental
updates to the search index at Google), Fluo makes it possible to
continually update the results of a large-scale computation, index, or
analytic as new data is discovered.

"Apache Fluo is a very clever piece of software, elegantly supplementing
Apache Accumulo's ability to store and maintain very large indexes,"
said Christopher Tubbs, ASF Member and Committer on Apache Accumulo and
Apache Fluo. "Its support of transactions enables Accumulo to solve a
whole new set of big data problems, and its observer framework makes
designing ingest workflows fun."

One example use case for Fluo is counting phrases in unique documents.
This could be accomplished with two MapReduce jobs: one job to produce a
unique set of documents, followed by a second job to count phrases.
Where petabytes of documents are concerned, running both jobs for a
small amount of new data is inefficient. Apache Fluo enables continuous,
quick computations of these two joins as new data arrives, constantly
emitting deltas of phrase counts. Anything could consume the emitted
deltas. For example, a query system could be continuously updated using
them.
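
The phrase-counting idea above can be sketched in a few lines of Python.
This is only an in-memory illustration under stated assumptions: it does
not use the Fluo or Accumulo APIs, and the class name
IncrementalPhraseCounter, its methods, and the choice of two-word phrases
are hypothetical names introduced for this example. It shows the general
shape of the technique: each new document is deduplicated, counted once,
and turned into a small set of deltas instead of triggering a full
recount.

```python
from collections import Counter
import hashlib


class IncrementalPhraseCounter:
    """In-memory stand-in for the two steps: dedupe documents, then count phrases.

    Hypothetical illustration only; not the Fluo API.
    """

    def __init__(self, phrase_len=2):
        self.phrase_len = phrase_len
        self.seen_docs = set()   # content hashes of documents already counted
        self.totals = Counter()  # running phrase counts over unique documents

    def _phrases(self, text):
        # Split into lowercase words and slide a window of phrase_len words.
        words = text.lower().split()
        n = self.phrase_len
        return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

    def add_document(self, text):
        """Process one document and return the phrase-count deltas it causes."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in self.seen_docs:
            return {}            # duplicate content: no counts change
        self.seen_docs.add(digest)
        deltas = Counter(self._phrases(text))
        self.totals.update(deltas)  # only the new document is processed
        return dict(deltas)


counter = IncrementalPhraseCounter()
first = counter.add_document("big data is big data")
repeat = counter.add_document("big data is big data")
second = counter.add_document("big data wins")
```

In this sketch, the repeated document yields an empty set of deltas, and
the third call returns only the counts contributed by the new document.
A downstream consumer, such as a query index, could apply each returned
dictionary as it arrives.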

"We are excited that Fluo is becoming a Top-Level Project at the Apache
Software Foundation," said Dr. Adina Crainiceanu, Apache Rya
(incubating) Committer and Associate Professor, Computer Science
Department, United States Naval Academy. "Heartfelt congratulations to
the Fluo community for achieving this important milestone. The Apache
Rya project uses the observer framework in Fluo to cache and maintain
answers to complex SPARQL queries for large RDF datasets. Using cached
answers greatly improves Rya's performance for complex queries. Fluo
complements Rya by allowing the incremental and continuous update of the
cached answers. Fluo is particularly useful because it allows updates to
happen as new data is ingested, reduces update latency, avoids stale
results, and circumvents the periodic reprocessing of the entire
dataset. We are confident that Apache Fluo will become one of the
important frameworks for updating indexing results in a dynamic
data-acquiring context."

"Fluo fulfills an important role in the Apache Hadoop ecosystem,
significantly expanding existing capabilities for working with large
data sets," said Billie Rinaldi, ASF Member and former Vice President of
Apache Accumulo. "I was excited to see this project come to the Apache
Incubator, and am even more pleased to see it graduate to a top-level
Apache project."

"We welcome new users and contributors to Apache Fluo," added Turner.
"If you are interested in trying Fluo, check out the Fluo Tour on the
project Website. Join our mailing lists to discuss how Fluo may be a
good solution for your problem, as well as for help with debugging and
finding starter issues."

Catch Apache Fluo in action and meet members of the Fluo community at
Accumulo Summit, 16 October 2017 in Columbia, MD.
http://accumulosummit.com/

Availability and Oversight
Apache Fluo software is released under the Apache License v2.0 and is
overseen by a self-selected team of active contributors to the project.
A Project Management Committee (PMC) guides the Project's day-to-day
operations, including community development and product releases. For
downloads, documentation, and ways to become involved with Apache Fluo,
visit http://fluo.apache.org/ and https://twitter.com/ApacheFluo

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases
wishing to become part of the efforts at The Apache Software Foundation.
All code donations from external organizations and existing external
projects wishing to join the ASF enter through the Incubator to: 1)
ensure all donations are in accordance with the ASF legal standards; and
2) develop new communities that adhere to our guiding principles.
Incubation is required of all newly accepted projects until a further
review indicates that the infrastructure, communications, and decision
making process have stabilized in a manner consistent with other
successful ASF projects. While incubation status is not necessarily a
reflection of the completeness or stability of the code, it does
indicate that the project has yet to be fully endorsed by the ASF. For
more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350
leading Open Source projects, including Apache HTTP Server --the world's
most popular Web server software. Through the ASF's meritocratic process
known as "The Apache Way," more than 620 individual Members and 6,000
Committers successfully collaborate to develop freely available
enterprise-grade software, benefiting millions of users worldwide:
thousands of software solutions are distributed under the Apache
License; and the community actively participates in ASF mailing lists,
mentoring initiatives, and ApacheCon, the Foundation's official user
conference, trainings, and expo. The ASF is a US 501(c)(3) charitable
organization, funded by individual donations and corporate sponsors
including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct,
Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook,
Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma,
LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access,
Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For
more information, visit https://www.apache.org/ and
https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Fluo", "Apache Fluo",
"Accumulo", "Apache Accumulo", "Rya", "Apache Rya", and "ApacheCon" are
registered trademarks or trademarks of the Apache Software Foundation in
the United States and/or other countries. All other brands and
trademarks are the property of their respective owners.

# # #

