This is an automated email from the ASF dual-hosted git repository.

mmiller pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/accumulo-website.git
The following commit(s) were added to refs/heads/master by this push:
     new 072984c  Blog post for upgrading to 2.0 (#188)
072984c is described below

commit 072984cb692f6c09eb51a79f4ee327a4b3a2d975
Author: Mike Miller <mmil...@apache.org>
AuthorDate: Mon Aug 12 15:45:58 2019 -0400

    Blog post for upgrading to 2.0 (#188)
---
 _posts/blog/2019-08-12-why-upgrade.md | 138 ++++++++++++++++++++++++++++++++++
 1 file changed, 138 insertions(+)

diff --git a/_posts/blog/2019-08-12-why-upgrade.md b/_posts/blog/2019-08-12-why-upgrade.md
new file mode 100644
index 0000000..12c30f1
--- /dev/null
+++ b/_posts/blog/2019-08-12-why-upgrade.md
@@ -0,0 +1,138 @@
+---
+title: "Top 10 Reasons to Upgrade"
+author: Mike Miller
+reviewers: Keith Turner, Christopher Tubbs
+---
+
+Accumulo 2.0 has been in development for quite some time now and is packed with new features, bug
+fixes, performance improvements and redesigned components. All of these changes bring challenges
+when upgrading your production cluster, so you may be wondering... why should I upgrade?
+
+Here are my top 10 reasons to upgrade. For all changes, see the [release notes][rel].
+
+* [Summaries](#summaries)
+* [New Bulk Import](#new-bulk-import)
+* [Simplified Scripts and Config](#simplified-scripts-and-config)
+* [New Monitor](#new-monitor)
+* [New APIs](#new-apis)
+* [Offline creation](#offline-creation)
+* [Search Documentation](#search-documentation)
+* [On-disk encryption](#new-crypto)
+* [Zstandard Compression](#zstandard-compression)
+* [New Scan Executors](#new-scan-executors)
+
+## Summaries
+
+This feature allows detailed stats about Tables to be written directly into Accumulo files (R-Files).
+Summaries can be used to make precise decisions about your data. Once configured, summaries become a
+part of your Tables, so they won't impact ingest or query performance of your cluster.
+
+Here are some example use cases:
+
+* A compaction could automatically run if deletes make up more than 25% of the data
+* An admin could optimize compactions by configuring specific age off of data
+* An admin could analyze R-File summaries for better performance tuning of a cluster
+
+For more info, check out the [summary docs for 2.0][summary].
+
+## New Bulk Import
+
+Bulk Ingest was completely redone for 2.0. Previously, Bulk Ingest relied on expensive inspections of
+R-Files across multiple Tablet Servers. With enough data, an old Bulk Ingest operation could easily
+hold up simpler Table operations and critical compactions of files.
+
+The new Bulk Ingest gives the user control over the R-File inspection, allows for offline bulk
+ingesting and provides performance [improvements][new-bulk].
+
+## Simplified Scripts and Config
+
+Many improvements were made to the scripts and configuration. See Mike's description of the [improvements][scripts].
+
+## New Monitor
+
+The Monitor has been rewritten using REST, JavaScript and more modern Web Tech. It is faster,
+cleaner and more maintainable than the previous version. Here is a screenshot:
+
+<img src="{{ site.baseurl }}/images/accumulo-monitor-1.png" width="50%"/>
+
+## New APIs
+
+Connecting to Accumulo is now easier with a single point of entry for clients. It can now be done with
+a fluent API, 2 imports and minimal code:
+
+```java
+import org.apache.accumulo.core.client.Accumulo;
+import org.apache.accumulo.core.client.AccumuloClient;
+
+try (AccumuloClient client = Accumulo.newClient()
+        .to("instance", "zk")
+        .as("user", "pass").build()) {
+  // use the client
+  client.tableOperations().create("newTable");
+}
+```
+
+As you can see, the client is also closable, which gives developers more control over resources.
+See the [Accumulo entry point javadoc][client].
+
+Key and Mutation have new fluent APIs, which now allow mixing of ```String``` and ```byte[]``` types.
+
+```java
+Key newKey = Key.builder().row("foo").family("bar").build();
+
+Mutation m = new Mutation("row0017");
+m.at().family("001").qualifier(new byte[] {0,1}).put("v99");
+m.at().family("002").qualifier(new byte[] {0,1}).delete();
+```
+
+More examples for [Key] and [Mutation].
+
+## Offline creation
+
+Tables can now be created with splits offline. This frees up online resources to perform other critical operations.
+See the [GitHub issue][offline].
+
+## Search Documentation
+
+The user manual was completely redone for 2.0. Check it out [here][manual].
+Users can now also quickly [search] the website across all 2.x documentation.
+
+## New Crypto
+
+On-disk encryption was redone to be more secure and flexible. For an in-depth description of how Accumulo
+does on-disk encryption, see the [user manual][crypto]. NOTE: This is currently an experimental feature.
+An experimental feature is considered a work in progress or incomplete and could change.
+
+## Zstandard compression
+
+Support for Zstandard compression was added in 2.0. It has been measured to perform better than
+gzip (better compression ratio and speed) and snappy (better compression ratio). Check out Facebook's [GitHub][zstd] for Zstandard and
+the [table.file.compress.type][z-config] property for configuring Accumulo.
+
+## New Scan Executors
+
+Users now have more control over scans with the new scan executors. Tables can be configured to utilize these
+powerful new mechanisms using just a few properties, giving users control over things like scan prioritization and
+better cluster resource utilization.
+
+For example, a cluster has a bunch of long-running scans and one really fast scan. The long-running scans will eat up
+a majority of the server resources, causing the one really fast scan to be delayed. Scan executors allow an admin
+to configure the cluster in a way that allows the one fast scan to be prioritized and not have to wait.
+
+Check out some examples in the [user guide][scans].
+
+[FATE]: {% dlink /administration/fate %}
+[new-bulk]: https://accumulo.apache.org/release/accumulo-2.0.0/#new-bulk-import-api
+[scripts]: https://accumulo.apache.org/blog/2016/11/16/simpler-scripts-and-config.html
+[summary]: {% dlink /development/summaries %}
+[client]: {% jurl org.apache.accumulo.core.client.Accumulo %}
+[Key]: https://github.com/apache/accumulo/blob/master/core/src/test/java/org/apache/accumulo/core/data/KeyBuilderTest.java
+[Mutation]: https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.0.0/org/apache/accumulo/core/data/Mutation.html#at()
+[offline]: {% ghi 573 %}
+[manual]: {% dlink /getting-started/quickstart %}
+[search]: https://accumulo.apache.org/search/
+[crypto]: {% dlink /security/on-disk-encryption %}
+[rel]: https://accumulo.apache.org/release/accumulo-2.0.0/
+[zstd]: https://facebook.github.io/zstd/
+[z-config]: {% dlink /configuration/server-properties %}
+[scans]: {% dlink /administration/scan-executors %}