Jekyll build from gh-pages:358b7b4 Merge branch 'release-posts' into gh-pages
Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/c0655661 Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/c0655661 Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/c0655661 Branch: refs/heads/asf-site Commit: c0655661f4a02bee49ec3e255a1064f1e5c80614 Parents: 82bf944 Author: Mike Walch <mwa...@apache.org> Authored: Thu Nov 10 16:37:42 2016 -0500 Committer: Mike Walch <mwa...@apache.org> Committed: Thu Nov 10 16:37:42 2016 -0500 ---------------------------------------------------------------------- 1.3/user_manual/Accumulo_Design.html | 275 ++++ 1.3/user_manual/Accumulo_Shell.html | 314 ++++ 1.3/user_manual/Administration.html | 344 ++++ 1.3/user_manual/Analytics.html | 330 ++++ 1.3/user_manual/Contents.html | 369 +++++ 1.3/user_manual/High_Speed_Ingest.html | 262 +++ 1.3/user_manual/Introduction.html | 198 +++ 1.3/user_manual/Security.html | 284 ++++ 1.3/user_manual/Shell_Commands.html | 827 ++++++++++ 1.3/user_manual/Table_Configuration.html | 507 ++++++ 1.3/user_manual/Table_Design.html | 367 +++++ 1.3/user_manual/Writing_Accumulo_Clients.html | 303 ++++ 1.3/user_manual/accumulo_user_manual.html | 214 +++ 1.3/user_manual/data_distribution.png | Bin 0 -> 86936 bytes 1.3/user_manual/examples.html | 10 + 1.3/user_manual/examples/aggregation.html | 222 +++ 1.3/user_manual/examples/batch.html | 227 +++ 1.3/user_manual/examples/bloom.html | 313 ++++ 1.3/user_manual/examples/bulkIngest.html | 206 +++ 1.3/user_manual/examples/constraints.html | 220 +++ 1.3/user_manual/examples/dirlist.html | 239 +++ 1.3/user_manual/examples/filter.html | 283 ++++ 1.3/user_manual/examples/helloworld.html | 238 +++ 1.3/user_manual/examples/index.html | 230 +++ 1.3/user_manual/examples/mapred.html | 263 +++ 1.3/user_manual/examples/shard.html | 248 +++ 1.3/user_manual/failure_handling.png | Bin 0 -> 48904 bytes 1.3/user_manual/img1.png | Bin 0 -> 2977 bytes 1.3/user_manual/img2.png | Bin 0 -> 
4121 bytes 1.3/user_manual/img3.png | Bin 0 -> 6520 bytes 1.3/user_manual/img4.png | Bin 0 -> 16325 bytes 1.3/user_manual/img5.png | Bin 0 -> 3974 bytes 1.3/user_manual/index.html | 214 +++ 1.4/examples/batch.html | 32 +- 1.4/examples/bloom.html | 32 +- 1.4/examples/bulkIngest.html | 32 +- 1.4/examples/combiner.html | 32 +- 1.4/examples/constraints.html | 32 +- 1.4/examples/dirlist.html | 32 +- 1.4/examples/filedata.html | 32 +- 1.4/examples/filter.html | 32 +- 1.4/examples/helloworld.html | 32 +- 1.4/examples/index.html | 32 +- 1.4/examples/isolation.html | 32 +- 1.4/examples/mapred.html | 32 +- 1.4/examples/shard.html | 32 +- 1.4/examples/visibility.html | 32 +- 1.4/user_manual/Accumulo_Design.html | 32 +- 1.4/user_manual/Accumulo_Shell.html | 32 +- 1.4/user_manual/Administration.html | 32 +- 1.4/user_manual/Analytics.html | 32 +- 1.4/user_manual/Contents.html | 32 +- 1.4/user_manual/Development_Clients.html | 32 +- 1.4/user_manual/High_Speed_Ingest.html | 32 +- 1.4/user_manual/Introduction.html | 32 +- 1.4/user_manual/Security.html | 32 +- 1.4/user_manual/Shell_Commands.html | 32 +- 1.4/user_manual/Table_Configuration.html | 32 +- 1.4/user_manual/Table_Design.html | 32 +- 1.4/user_manual/Writing_Accumulo_Clients.html | 32 +- 1.4/user_manual/accumulo_user_manual.html | 32 +- 1.4/user_manual/index.html | 32 +- 1.5/examples/batch.html | 32 +- 1.5/examples/bloom.html | 32 +- 1.5/examples/bulkIngest.html | 32 +- 1.5/examples/classpath.html | 32 +- 1.5/examples/client.html | 32 +- 1.5/examples/combiner.html | 32 +- 1.5/examples/constraints.html | 32 +- 1.5/examples/dirlist.html | 32 +- 1.5/examples/export.html | 32 +- 1.5/examples/filedata.html | 32 +- 1.5/examples/filter.html | 32 +- 1.5/examples/helloworld.html | 32 +- 1.5/examples/index.html | 32 +- 1.5/examples/isolation.html | 32 +- 1.5/examples/mapred.html | 32 +- 1.5/examples/maxmutation.html | 32 +- 1.5/examples/regex.html | 32 +- 1.5/examples/rowhash.html | 32 +- 1.5/examples/shard.html | 32 +- 
1.5/examples/tabletofile.html | 32 +- 1.5/examples/terasort.html | 32 +- 1.5/examples/visibility.html | 32 +- 1.6/examples/batch.html | 32 +- 1.6/examples/bloom.html | 32 +- 1.6/examples/bulkIngest.html | 32 +- 1.6/examples/classpath.html | 32 +- 1.6/examples/client.html | 32 +- 1.6/examples/combiner.html | 32 +- 1.6/examples/constraints.html | 32 +- 1.6/examples/dirlist.html | 32 +- 1.6/examples/export.html | 32 +- 1.6/examples/filedata.html | 32 +- 1.6/examples/filter.html | 32 +- 1.6/examples/helloworld.html | 32 +- 1.6/examples/index.html | 32 +- 1.6/examples/isolation.html | 32 +- 1.6/examples/mapred.html | 32 +- 1.6/examples/maxmutation.html | 32 +- 1.6/examples/regex.html | 32 +- 1.6/examples/reservations.html | 32 +- 1.6/examples/rowhash.html | 32 +- 1.6/examples/shard.html | 32 +- 1.6/examples/tabletofile.html | 32 +- 1.6/examples/terasort.html | 32 +- 1.6/examples/visibility.html | 32 +- 1.7/examples/batch.html | 32 +- 1.7/examples/bloom.html | 32 +- 1.7/examples/bulkIngest.html | 32 +- 1.7/examples/classpath.html | 32 +- 1.7/examples/client.html | 32 +- 1.7/examples/combiner.html | 32 +- 1.7/examples/constraints.html | 32 +- 1.7/examples/dirlist.html | 32 +- 1.7/examples/export.html | 32 +- 1.7/examples/filedata.html | 32 +- 1.7/examples/filter.html | 32 +- 1.7/examples/helloworld.html | 32 +- 1.7/examples/index.html | 32 +- 1.7/examples/isolation.html | 32 +- 1.7/examples/mapred.html | 32 +- 1.7/examples/maxmutation.html | 32 +- 1.7/examples/regex.html | 32 +- 1.7/examples/reservations.html | 32 +- 1.7/examples/rowhash.html | 32 +- 1.7/examples/shard.html | 32 +- 1.7/examples/tabletofile.html | 32 +- 1.7/examples/terasort.html | 32 +- 1.7/examples/visibility.html | 32 +- 1.8/examples/batch.html | 32 +- 1.8/examples/bloom.html | 32 +- 1.8/examples/bulkIngest.html | 32 +- 1.8/examples/classpath.html | 32 +- 1.8/examples/client.html | 32 +- 1.8/examples/combiner.html | 32 +- 1.8/examples/constraints.html | 32 +- 1.8/examples/dirlist.html | 32 +- 
1.8/examples/export.html | 32 +- 1.8/examples/filedata.html | 32 +- 1.8/examples/filter.html | 32 +- 1.8/examples/helloworld.html | 32 +- 1.8/examples/index.html | 32 +- 1.8/examples/isolation.html | 32 +- 1.8/examples/mapred.html | 32 +- 1.8/examples/maxmutation.html | 32 +- 1.8/examples/regex.html | 32 +- 1.8/examples/reservations.html | 32 +- 1.8/examples/rgbalancer.html | 32 +- 1.8/examples/rowhash.html | 32 +- 1.8/examples/sample.html | 32 +- 1.8/examples/shard.html | 32 +- 1.8/examples/tabletofile.html | 32 +- 1.8/examples/terasort.html | 32 +- 1.8/examples/visibility.html | 32 +- blog/2014/05/03/accumulo-classloader.html | 32 +- .../27/getting-started-with-accumulo-1.6.0.html | 32 +- ...aling-accumulo-with-multivolume-support.html | 32 +- .../07/09/functional-reads-over-accumulo.html | 32 +- ...tores-for-configuring-accumulo-with-ssl.html | 32 +- .../2015/03/20/balancing-groups-of-tablets.html | 32 +- ...plicating-data-across-accumulo-clusters.html | 32 +- blog/2016/11/02/durability-performance.html | 38 +- bylaws.html | 32 +- contrib.html | 32 +- downloads/index.html | 38 +- example/wikisearch.html | 32 +- examples/index.html | 195 +++ feed.xml | 1541 +++++++++++++----- get_involved.html | 32 +- git.html | 32 +- glossary.html | 32 +- governance/consensusBuilding.html | 32 +- governance/lazyConsensus.html | 32 +- governance/releasing.html | 32 +- governance/voting.html | 32 +- index.html | 60 +- javadocs/index.html | 195 +++ mailing_list.html | 32 +- news.html | 205 ++- notable_features.html | 34 +- old_documentation.html | 225 --- pages/old_archive.html | 235 +++ papers/index.html | 32 +- people.html | 44 +- projects.html | 32 +- rb.html | 32 +- release/accumulo-1.5.1/index.html | 408 +++++ release/accumulo-1.5.2/index.html | 365 +++++ release/accumulo-1.5.3/index.html | 314 ++++ release/accumulo-1.5.4/index.html | 276 ++++ release/accumulo-1.6.0/index.html | 569 +++++++ release/accumulo-1.6.1/index.html | 375 +++++ release/accumulo-1.6.2/index.html | 380 
+++++ release/accumulo-1.6.3/index.html | 324 ++++ release/accumulo-1.6.4/index.html | 265 +++ release/accumulo-1.6.5/index.html | 321 ++++ release/accumulo-1.6.6/index.html | 336 ++++ release/accumulo-1.7.0/index.html | 606 +++++++ release/accumulo-1.7.1/index.html | 356 ++++ release/accumulo-1.7.2/index.html | 292 ++++ release/accumulo-1.8.0/index.html | 394 +++++ release/index.html | 319 ++++ release_notes.html | 10 + release_notes/1.5.1.html | 415 +---- release_notes/1.5.2.html | 372 +---- release_notes/1.5.3.html | 321 +--- release_notes/1.5.4.html | 283 +--- release_notes/1.6.0.html | 576 +------ release_notes/1.6.1.html | 382 +---- release_notes/1.6.2.html | 387 +---- release_notes/1.6.3.html | 331 +--- release_notes/1.6.4.html | 272 +--- release_notes/1.6.5.html | 328 +--- release_notes/1.6.6.html | 335 +--- release_notes/1.7.0.html | 613 +------ release_notes/1.7.1.html | 363 +---- release_notes/1.7.2.html | 291 +--- release_notes/1.8.0.html | 393 +---- release_notes/index.html | 247 +-- releasing.html | 32 +- screenshots.html | 32 +- source.html | 32 +- thanks.html | 32 +- user-manual/index.html | 196 +++ user_manual_1.3-incubating/Accumulo_Design.html | 283 ---- user_manual_1.3-incubating/Accumulo_Shell.html | 322 ---- user_manual_1.3-incubating/Administration.html | 352 ---- user_manual_1.3-incubating/Analytics.html | 338 ---- user_manual_1.3-incubating/Contents.html | 377 ----- .../High_Speed_Ingest.html | 270 --- user_manual_1.3-incubating/Introduction.html | 206 --- user_manual_1.3-incubating/Security.html | 292 ---- user_manual_1.3-incubating/Shell_Commands.html | 835 ---------- .../Table_Configuration.html | 515 ------ user_manual_1.3-incubating/Table_Design.html | 375 ----- .../Writing_Accumulo_Clients.html | 311 ---- .../accumulo_user_manual.html | 222 --- .../data_distribution.png | Bin 86936 -> 0 bytes user_manual_1.3-incubating/examples.html | 10 - .../examples/aggregation.html | 230 --- user_manual_1.3-incubating/examples/batch.html | 235 --- 
user_manual_1.3-incubating/examples/bloom.html | 321 ---- .../examples/bulkIngest.html | 214 --- .../examples/constraints.html | 228 --- .../examples/dirlist.html | 247 --- user_manual_1.3-incubating/examples/filter.html | 291 ---- .../examples/helloworld.html | 246 --- user_manual_1.3-incubating/examples/index.html | 238 --- user_manual_1.3-incubating/examples/mapred.html | 271 --- user_manual_1.3-incubating/examples/shard.html | 256 --- user_manual_1.3-incubating/failure_handling.png | Bin 48904 -> 0 bytes user_manual_1.3-incubating/img1.png | Bin 2977 -> 0 bytes user_manual_1.3-incubating/img2.png | Bin 4121 -> 0 bytes user_manual_1.3-incubating/img3.png | Bin 6520 -> 0 bytes user_manual_1.3-incubating/img4.png | Bin 16325 -> 0 bytes user_manual_1.3-incubating/img5.png | Bin 3974 -> 0 bytes user_manual_1.3-incubating/index.html | 226 +-- verifying_releases.html | 32 +- versioning.html | 32 +- 260 files changed, 17533 insertions(+), 17278 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/accumulo/blob/c0655661/1.3/user_manual/Accumulo_Design.html ---------------------------------------------------------------------- diff --git a/1.3/user_manual/Accumulo_Design.html b/1.3/user_manual/Accumulo_Design.html new file mode 100644 index 0000000..746add6 --- /dev/null +++ b/1.3/user_manual/Accumulo_Design.html @@ -0,0 +1,275 @@ +<!DOCTYPE html> +<html lang="en"> +<head> +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +<meta charset="utf-8"> +<meta http-equiv="X-UA-Compatible" content="IE=edge"> +<meta name="viewport" content="width=device-width, initial-scale=1"> +<link href="https://maxcdn.bootstrapcdn.com/bootswatch/3.3.7/paper/bootstrap.min.css" rel="stylesheet" integrity="sha384-awusxf8AUojygHf2+joICySzB780jVvQaVCAt1clU3QsyAitLGul28Qxb2r1e5g+" crossorigin="anonymous"> +<link href="//netdna.bootstrapcdn.com/font-awesome/4.0.3/css/font-awesome.css" rel="stylesheet"> +<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.css"> +<link href="/css/accumulo.css" rel="stylesheet" type="text/css"> + +<title>User Manual: Accumulo Design</title> + +<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script> +<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script> +<script type="text/javascript" src="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.js"></script> +<script> + // show location of canonical site if not currently on the canonical site + $(function() { + var host = window.location.host; + if (typeof host !== 'undefined' && host !== 'accumulo.apache.org') { + $('#non-canonical').show(); + } + }); + + $(function() { + // decorate section headers with anchors + return $("h2, h3, h4, h5, h6").each(function(i, el) { + var $el, icon, id; + $el = $(el); + id = $el.attr('id'); + icon = '<i class="fa fa-link"></i>'; + if (id) { + return 
$el.append($("<a />").addClass("header-link").attr("href", "#" + id).html(icon)); + } + }); + }); + + // configure Google Analytics + (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ + (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), + m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) + })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); + + if (ga.hasOwnProperty('loaded') && ga.loaded === true) { + ga('create', 'UA-50934829-1', 'apache.org'); + ga('send', 'pageview'); + } +</script> + +</head> +<body style="padding-top: 100px"> + + <nav class="navbar navbar-default navbar-fixed-top"> + <div class="container"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-items"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a href="/"><img id="nav-logo" alt="Apache Accumulo" class="img-responsive" src="/images/accumulo-logo.png" width="200"/></a> + </div> + <div class="collapse navbar-collapse" id="navbar-items"> + <ul class="nav navbar-nav"> + <li class="nav-link"><a href="/downloads">Download</a></li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Releases<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/release/accumulo-1.8.0/">1.8.0 (Latest)</a></li> + <li><a href="/release/accumulo-1.7.2/">1.7.2</a></li> + <li><a href="/release/accumulo-1.6.6/">1.6.6</a></li> + <li><a href="/release/">Archive</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Documentation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/user-manual/">User Manuals</a></li> + <li><a href="/javadocs/">Javadocs</a></li> + <li><a href="/examples/">Examples</a></li> + 
<li><a href="/notable_features">Features</a></li> + <li><a href="/screenshots">Screenshots</a></li> + <li><a href="/papers">Papers & Presentations</a></li> + <li><a href="/glossary">Glossary</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Community<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/get_involved">Get Involved</a></li> + <li><a href="/mailing_list">Mailing Lists</a></li> + <li><a href="/people">People</a></li> + <li><a href="/news">News Archive</a></li> + <li><a href="/projects">Community Projects</a></li> + <li><a href="/thanks">Thanks</a></li> + <li class="divider"></li> + <li class="dropdown-header">Governance</li> + <li><a href="/bylaws">Bylaws</a></li> + <li><a href="/governance/consensusBuilding">Consensus Building</a></li> + <li><a href="/governance/lazyConsensus">Lazy Consensus</a></li> + <li><a href="/governance/releasing">Releasing</a></li> + <li><a href="/governance/voting">Voting</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Development<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://issues.apache.org/jira/browse/ACCUMULO">Issue Tracker <i class="fa fa-external-link"></i></a></li> + <li><a href="https://github.com/apache/accumulo/pulls">Pull Requests <i class="fa fa-external-link"></i></a></li> + <li><a href="https://builds.apache.org/view/A/view/Accumulo">Jenkins Builds <i class="fa fa-external-link"></i></a></li> + <li><a href="https://travis-ci.org/apache/accumulo">TravisCI Builds <i class="fa fa-external-link"></i></a></li> + <li class="divider"></li> + <li class="dropdown-header">Guides</li> + <li><a href="/source">Source & Guide</a></li> + <li><a href="/git">Git Workflow</a></li> + <li><a href="/versioning">Versioning</a></li> + <li><a href="/contrib">Contrib Projects</a></li> + <li><a href="/rb">Review Board</a></li> + <li><a 
href="/releasing">Making Releases</a></li> + <li><a href="/verifying_releases">Verifying Releases</a></li> + </ul> + </li> + </ul> + <ul class="nav navbar-nav navbar-right"> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Apache Software Foundation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://www.apache.org">Apache Homepage <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/licenses/LICENSE-2.0">License <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/sponsorship">Sponsorship <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/security">Security <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/thanks">Thanks <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/policies/conduct">Code of Conduct <i class="fa fa-external-link"></i></a></li> + </ul> + </li> + </ul> + </div> + </div> +</nav> + + + <div class="container"> + <div class="row"> + <div class="col-md-12"> + + <div id="non-canonical" style="display: none; background-color: #F0E68C; padding-left: 1em;"> + Visit the official site at: <a href="https://accumulo.apache.org">https://accumulo.apache.org</a> + </div> + <div id="content"> + + <h1 class="title">User Manual: Accumulo Design</h1> + + <p>** Next:** <a href="Accumulo_Shell.html">Accumulo Shell</a> ** Up:** <a href="accumulo_user_manual.html">Apache Accumulo User Manual Version 1.3</a> ** Previous:** <a href="Introduction.html">Introduction</a> ** <a href="Contents.html">Contents</a>**</p> + +<p><a id="CHILD_LINKS"></a><strong>Subsections</strong></p> + +<ul> + <li><a href="Accumulo_Design.html#Data_Model">Data Model</a></li> + <li><a href="Accumulo_Design.html#Architecture">Architecture</a></li> + <li><a href="Accumulo_Design.html#Components">Components</a></li> + <li><a 
href="Accumulo_Design.html#Data_Management">Data Management</a></li> + <li><a href="Accumulo_Design.html#Tablet_Service">Tablet Service</a></li> + <li><a href="Accumulo_Design.html#Compactions">Compactions</a></li> + <li><a href="Accumulo_Design.html#Fault-Tolerance">Fault-Tolerance</a></li> +</ul> + +<hr /> + +<h2 id="a-idaccumulodesigna-accumulo-design"><a id="Accumulo_Design"></a> Accumulo Design</h2> + +<h2 id="a-iddatamodela-data-model"><a id="Data_Model"></a> Data Model</h2> + +<p>Accumulo provides a richer data model than simple key-value stores, but is not a fully relational database. Data is represented as key-value pairs, where the key and value are comprised of the following elements:</p> + +<p><img src="img1.png" alt="converted table" /></p> + +<p>All elements of the Key and the Value are represented as byte arrays except for Timestamp, which is a Long. Accumulo sorts keys by element and lexicographically in ascending order. Timestamps are sorted in descending order so that later versions of the same Key appear first in a sequential scan. Tables consist of a set of sorted key-value pairs.</p> + +<h2 id="a-idarchitecturea-architecture"><a id="Architecture"></a> Architecture</h2> + +<p>Accumulo is a distributed data storage and retrieval system and as such consists of several architectural components, some of which run on many individual servers. Much of the work Accumulo does involves maintaining certain properties of the data, such as organization, availability, and integrity, across many commodity-class machines.</p> + +<h2 id="a-idcomponentsa-components"><a id="Components"></a> Components</h2> + +<p>An instance of Accumulo includes many TabletServers, write-ahead Logger servers, one Garbage Collector process, one Master server and many Clients.</p> + +<h3 id="a-idtabletservera-tablet-server"><a id="Tablet_Server"></a> Tablet Server</h3> + +<p>The TabletServer manages some subset of all the tablets (partitions of tables). 
This includes receiving writes from clients, persisting writes to a write-ahead log, sorting new key-value pairs in memory, periodically flushing sorted key-value pairs to new files in HDFS, and responding to reads from clients, forming a merge-sorted view of all keys and values from all the files it has created and the sorted in-memory store.</p> + +<p>TabletServers also perform recovery of a tablet that was previously on a server that failed, reapplying any writes found in the write-ahead log to the tablet.</p> + +<h3 id="a-idloggersa-loggers"><a id="Loggers"></a> Loggers</h3> + +<p>The Loggers accept updates to Tablet servers and write them to local on-disk storage. Each tablet server will write its updates to multiple loggers to preserve data in case of hardware failure.</p> + +<h3 id="a-idgarbagecollectora-garbage-collector"><a id="Garbage_Collector"></a> Garbage Collector</h3> + +<p>Accumulo processes will share files stored in HDFS. Periodically, the Garbage Collector will identify files that are no longer needed by any process, and delete them.</p> + +<h3 id="a-idmastera-master"><a id="Master"></a> Master</h3> + +<p>The Accumulo Master is responsible for detecting and responding to TabletServer failure. It tries to balance the load across TabletServers by assigning tablets carefully and instructing TabletServers to migrate tablets when necessary. The Master ensures all tablets are assigned to one TabletServer each, and handles table creation, alteration, and deletion requests from clients. The Master also coordinates startup, graceful shutdown and recovery of changes in write-ahead logs when Tablet servers fail.</p> + +<h3 id="a-idclienta-client"><a id="Client"></a> Client</h3> + +<p>Accumulo includes a client library that is linked to every application. 
The client library contains logic for finding servers managing a particular tablet, and communicating with TabletServers to write and retrieve key-value pairs.</p> + +<h2 id="a-iddatamanagementa-data-management"><a id="Data_Management"></a> Data Management</h2> + +<p>Accumulo stores data in tables, which are partitioned into tablets. Tablets are partitioned on row boundaries so that all of the columns and values for a particular row are found together within the same tablet. The Master assigns Tablets to one TabletServer at a time. This enables row-level transactions to take place without using distributed locking or some other complicated synchronization mechanism. As clients insert and query data, and as machines are added and removed from the cluster, the Master migrates tablets to ensure they remain available and that the ingest and query load is balanced across the cluster.</p> + +<p><img src="./data_distribution.png" alt="Image data_distribution" /></p> + +<h2 id="a-idtabletservicea-tablet-service"><a id="Tablet_Service"></a> Tablet Service</h2> + +<p>When a write arrives at a TabletServer it is written to a Write-Ahead Log and then inserted into a sorted data structure in memory called a MemTable. When the MemTable reaches a certain size the TabletServer writes out the sorted key-value pairs to a file in HDFS called an Indexed Sequential Access Method (ISAM) file. This process is called a minor compaction. A new MemTable is then created and the fact of the compaction is recorded in the Write-Ahead Log.</p> + +<p>When a request to read data arrives at a TabletServer, the TabletServer does a binary search across the MemTable as well as the in-memory indexes associated with each ISAM file to find the relevant values. 
If clients are performing a scan, several key-value pairs are returned to the client in order from the MemTable and the set of ISAM files by performing a merge-sort as they are read.</p> + +<h2 id="a-idcompactionsa-compactions"><a id="Compactions"></a> Compactions</h2> + +<p>In order to manage the number of files per tablet, periodically the TabletServer performs Major Compactions of files within a tablet, in which some set of ISAM files are combined into one file. The previous files will eventually be removed by the Garbage Collector. This also provides an opportunity to permanently remove deleted key-value pairs by omitting key-value pairs suppressed by a delete entry when the new file is created.</p> + +<h2 id="a-idfault-tolerancea-fault-tolerance"><a id="Fault-Tolerance"></a> Fault-Tolerance</h2> + +<p>If a TabletServer fails, the Master detects it and automatically reassigns the tablets assigned from the failed server to other servers. Any key-value pairs that were in memory at the time the TabletServer failed are automatically reapplied from the Write-Ahead Log to prevent any loss of data.</p> + +<p>The Master will coordinate the copying of write-ahead logs to HDFS so the logs are available to all tablet servers. To make recovery efficient, the updates within a log are grouped by tablet. The sorting process can be performed by Hadoop's MapReduce or the Logger server. 
TabletServers can quickly apply the mutations from the sorted logs that are destined for the tablets they have now been assigned.</p> + +<p>TabletServer failures are noted on the Master's monitor page, accessible via <br /> +http://master-address:50095/monitor.</p> + +<p><img src="./failure_handling.png" alt="Image failure_handling" /></p> + +<hr /> + +<p>** Next:** <a href="Accumulo_Shell.html">Accumulo Shell</a> ** Up:** <a href="accumulo_user_manual.html">Apache Accumulo User Manual Version 1.3</a> ** Previous:** <a href="Introduction.html">Introduction</a> ** <a href="Contents.html">Contents</a>**</p> + + + </div> + + +<footer> + + <p><a href="https://www.apache.org"><img src="/images/feather-small.gif" alt="Apache Software Foundation" id="asf-logo" height="100" /></a></p> + + <p>Copyright © 2011-2016 The Apache Software Foundation. Licensed under the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p> + +</footer> + + + </div> + </div> + </div> +</body> +</html> http://git-wip-us.apache.org/repos/asf/accumulo/blob/c0655661/1.3/user_manual/Accumulo_Shell.html ---------------------------------------------------------------------- diff --git a/1.3/user_manual/Accumulo_Shell.html b/1.3/user_manual/Accumulo_Shell.html new file mode 100644 index 0000000..d432192 --- /dev/null +++ b/1.3/user_manual/Accumulo_Shell.html @@ -0,0 +1,314 @@ +<!DOCTYPE html> +<html lang="en"> +<head> +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +<meta charset="utf-8"> +<meta http-equiv="X-UA-Compatible" content="IE=edge"> +<meta name="viewport" content="width=device-width, initial-scale=1"> +<link href="https://maxcdn.bootstrapcdn.com/bootswatch/3.3.7/paper/bootstrap.min.css" rel="stylesheet" integrity="sha384-awusxf8AUojygHf2+joICySzB780jVvQaVCAt1clU3QsyAitLGul28Qxb2r1e5g+" crossorigin="anonymous"> +<link href="//netdna.bootstrapcdn.com/font-awesome/4.0.3/css/font-awesome.css" rel="stylesheet"> +<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.css"> +<link href="/css/accumulo.css" rel="stylesheet" type="text/css"> + +<title>User Manual: Accumulo Shell</title> + +<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script> +<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script> +<script type="text/javascript" src="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.js"></script> +<script> + // show location of canonical site if not currently on the canonical site + $(function() { + var host = window.location.host; + if (typeof host !== 'undefined' && host !== 'accumulo.apache.org') { + $('#non-canonical').show(); + } + }); + + $(function() { + // decorate section headers with anchors + return $("h2, h3, h4, h5, h6").each(function(i, el) { + var $el, icon, id; + $el = $(el); + id = $el.attr('id'); + icon = '<i class="fa fa-link"></i>'; + if (id) { + return 
$el.append($("<a />").addClass("header-link").attr("href", "#" + id).html(icon)); + } + }); + }); + + // configure Google Analytics + (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ + (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), + m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) + })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); + + if (ga.hasOwnProperty('loaded') && ga.loaded === true) { + ga('create', 'UA-50934829-1', 'apache.org'); + ga('send', 'pageview'); + } +</script> + +</head> +<body style="padding-top: 100px"> + + <nav class="navbar navbar-default navbar-fixed-top"> + <div class="container"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-items"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a href="/"><img id="nav-logo" alt="Apache Accumulo" class="img-responsive" src="/images/accumulo-logo.png" width="200"/></a> + </div> + <div class="collapse navbar-collapse" id="navbar-items"> + <ul class="nav navbar-nav"> + <li class="nav-link"><a href="/downloads">Download</a></li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Releases<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/release/accumulo-1.8.0/">1.8.0 (Latest)</a></li> + <li><a href="/release/accumulo-1.7.2/">1.7.2</a></li> + <li><a href="/release/accumulo-1.6.6/">1.6.6</a></li> + <li><a href="/release/">Archive</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Documentation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/user-manual/">User Manuals</a></li> + <li><a href="/javadocs/">Javadocs</a></li> + <li><a href="/examples/">Examples</a></li> + 
<li><a href="/notable_features">Features</a></li> + <li><a href="/screenshots">Screenshots</a></li> + <li><a href="/papers">Papers & Presentations</a></li> + <li><a href="/glossary">Glossary</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Community<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/get_involved">Get Involved</a></li> + <li><a href="/mailing_list">Mailing Lists</a></li> + <li><a href="/people">People</a></li> + <li><a href="/news">News Archive</a></li> + <li><a href="/projects">Community Projects</a></li> + <li><a href="/thanks">Thanks</a></li> + <li class="divider"></li> + <li class="dropdown-header">Governance</li> + <li><a href="/bylaws">Bylaws</a></li> + <li><a href="/governance/consensusBuilding">Consensus Building</a></li> + <li><a href="/governance/lazyConsensus">Lazy Consensus</a></li> + <li><a href="/governance/releasing">Releasing</a></li> + <li><a href="/governance/voting">Voting</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Development<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://issues.apache.org/jira/browse/ACCUMULO">Issue Tracker <i class="fa fa-external-link"></i></a></li> + <li><a href="https://github.com/apache/accumulo/pulls">Pull Requests <i class="fa fa-external-link"></i></a></li> + <li><a href="https://builds.apache.org/view/A/view/Accumulo">Jenkins Builds <i class="fa fa-external-link"></i></a></li> + <li><a href="https://travis-ci.org/apache/accumulo">TravisCI Builds <i class="fa fa-external-link"></i></a></li> + <li class="divider"></li> + <li class="dropdown-header">Guides</li> + <li><a href="/source">Source & Guide</a></li> + <li><a href="/git">Git Workflow</a></li> + <li><a href="/versioning">Versioning</a></li> + <li><a href="/contrib">Contrib Projects</a></li> + <li><a href="/rb">Review Board</a></li> + <li><a 
href="/releasing">Making Releases</a></li> + <li><a href="/verifying_releases">Verifying Releases</a></li> + </ul> + </li> + </ul> + <ul class="nav navbar-nav navbar-right"> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Apache Software Foundation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://www.apache.org">Apache Homepage <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/licenses/LICENSE-2.0">License <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/sponsorship">Sponsorship <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/security">Security <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/thanks">Thanks <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/policies/conduct">Code of Conduct <i class="fa fa-external-link"></i></a></li> + </ul> + </li> + </ul> + </div> + </div> +</nav> + + + <div class="container"> + <div class="row"> + <div class="col-md-12"> + + <div id="non-canonical" style="display: none; background-color: #F0E68C; padding-left: 1em;"> + Visit the official site at: <a href="https://accumulo.apache.org">https://accumulo.apache.org</a> + </div> + <div id="content"> + + <h1 class="title">User Manual: Accumulo Shell</h1> + + <p>** Next:** <a href="Writing_Accumulo_Clients.html">Writing Accumulo Clients</a> ** Up:** <a href="accumulo_user_manual.html">Apache Accumulo User Manual Version 1.3</a> ** Previous:** <a href="Accumulo_Design.html">Accumulo Design</a> ** <a href="Contents.html">Contents</a>**</p> + +<p><a id="CHILD_LINKS"></a><strong>Subsections</strong></p> + +<ul> + <li><a href="Accumulo_Shell.html#Basic_Administration">Basic Administration</a></li> + <li><a href="Accumulo_Shell.html#Table_Maintenance">Table Maintenance</a></li> + <li><a 
href="Accumulo_Shell.html#User_Administration">User Administration</a></li> +</ul> + +<hr /> + +<h2 id="a-idaccumuloshella-accumulo-shell"><a id="Accumulo_Shell"></a> Accumulo Shell</h2> + +<p>Accumulo provides a simple shell that can be used to examine the contents and configuration settings of tables, apply individual mutations, and change configuration settings.</p> + +<p>The shell can be started by the following command:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>$ACCUMULO_HOME/bin/accumulo shell -u [username] +</code></pre> +</div> + +<p>The shell will prompt for the password corresponding to the specified username and then display the following prompt:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>Shell - Apache Accumulo Interactive Shell +- +- version 1.3 +- instance name: myinstance +- instance id: 00000000-0000-0000-0000-000000000000 +- +- type 'help' for a list of available commands +- +</code></pre> +</div> + +<h2 id="a-idbasicadministrationa-basic-administration"><a id="Basic_Administration"></a> Basic Administration</h2> + +<p>The Accumulo shell can be used to create and delete tables, as well as to configure table-specific and instance-specific options.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>root@myinstance> tables +!METADATA + +root@myinstance> createtable mytable + +root@myinstance mytable> + +root@myinstance mytable> tables +!METADATA +mytable + +root@myinstance mytable> createtable testtable + +root@myinstance testtable> + +root@myinstance testtable> deletetable testtable + +root@myinstance> +</code></pre> +</div> + +<p>The Shell can also be used to insert updates and scan tables. 
This is useful for inspecting tables.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>root@myinstance mytable> scan + +root@myinstance mytable> insert row1 colf colq value1 +insert successful + +root@myinstance mytable> scan +row1 colf:colq [] value1 +</code></pre> +</div> + +<h2 id="a-idtablemaintenancea-table-maintenance"><a id="Table_Maintenance"></a> Table Maintenance</h2> + +<p>The <strong>compact</strong> command instructs Accumulo to schedule a compaction of the table during which files are consolidated and deleted entries are removed.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>root@myinstance mytable> compact -t mytable +07 16:13:53,201 [shell.Shell] INFO : Compaction of table mytable +scheduled for 20100707161353EDT +</code></pre> +</div> + +<p>The <strong>flush</strong> command instructs Accumulo to write all entries currently in memory for a given table to disk.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>root@myinstance mytable> flush -t mytable +07 16:14:19,351 [shell.Shell] INFO : Flush of table mytable +initiated... 
+</code></pre> +</div> + +<h2 id="a-iduseradministrationa-user-administration"><a id="User_Administration"></a> User Administration</h2> + +<p>The Shell can be used to add, remove, and grant privileges to users.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>root@myinstance mytable> createuser bob +Enter new password for 'bob': ********* +Please confirm new password for 'bob': ********* + +root@myinstance mytable> authenticate bob +Enter current password for 'bob': ********* +Valid + +root@myinstance mytable> grant System.CREATE_TABLE -s -u bob + +root@myinstance mytable> user bob +Enter current password for 'bob': ********* + +bob@myinstance mytable> userpermissions +System permissions: System.CREATE_TABLE +Table permissions (!METADATA): Table.READ +Table permissions (mytable): NONE + +bob@myinstance mytable> createtable bobstable +bob@myinstance bobstable> + +bob@myinstance bobstable> user root +Enter current password for 'root': ********* + +root@myinstance bobstable> revoke System.CREATE_TABLE -s -u bob +</code></pre> +</div> + +<hr /> + +<p>** Next:** <a href="Writing_Accumulo_Clients.html">Writing Accumulo Clients</a> ** Up:** <a href="accumulo_user_manual.html">Apache Accumulo User Manual Version 1.3</a> ** Previous:** <a href="Accumulo_Design.html">Accumulo Design</a> ** <a href="Contents.html">Contents</a>**</p> + + + </div> + + +<footer> + + <p><a href="https://www.apache.org"><img src="/images/feather-small.gif" alt="Apache Software Foundation" id="asf-logo" height="100" /></a></p> + + <p>Copyright © 2011-2016 The Apache Software Foundation. 
Licensed under the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p> + +</footer> + + + </div> + </div> + </div> +</body> +</html> http://git-wip-us.apache.org/repos/asf/accumulo/blob/c0655661/1.3/user_manual/Administration.html ---------------------------------------------------------------------- diff --git a/1.3/user_manual/Administration.html b/1.3/user_manual/Administration.html new file mode 100644 index 0000000..cdad7aa --- /dev/null +++ b/1.3/user_manual/Administration.html @@ -0,0 +1,344 @@ +<!DOCTYPE html> +<html lang="en"> +<head> +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--> +<meta charset="utf-8"> +<meta http-equiv="X-UA-Compatible" content="IE=edge"> +<meta name="viewport" content="width=device-width, initial-scale=1"> +<link href="https://maxcdn.bootstrapcdn.com/bootswatch/3.3.7/paper/bootstrap.min.css" rel="stylesheet" integrity="sha384-awusxf8AUojygHf2+joICySzB780jVvQaVCAt1clU3QsyAitLGul28Qxb2r1e5g+" crossorigin="anonymous"> +<link href="//netdna.bootstrapcdn.com/font-awesome/4.0.3/css/font-awesome.css" rel="stylesheet"> +<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.css"> +<link href="/css/accumulo.css" rel="stylesheet" type="text/css"> + +<title>User Manual: Administration</title> + +<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script> +<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script> +<script type="text/javascript" src="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.js"></script> +<script> + // show location of canonical site if not currently on the canonical site + $(function() { + var host = window.location.host; + if (typeof host !== 'undefined' && host !== 'accumulo.apache.org') { + $('#non-canonical').show(); + } + }); + + $(function() { + // decorate section headers with anchors + return $("h2, h3, h4, h5, h6").each(function(i, el) { + var $el, icon, id; + $el = $(el); + id = $el.attr('id'); + icon = '<i class="fa fa-link"></i>'; + if (id) { + return $el.append($("<a />").addClass("header-link").attr("href", "#" + id).html(icon)); + } + }); + }); + + // configure Google Analytics + (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ + (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), + m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) + 
})(window,document,'script','//www.google-analytics.com/analytics.js','ga'); + + if (ga.hasOwnProperty('loaded') && ga.loaded === true) { + ga('create', 'UA-50934829-1', 'apache.org'); + ga('send', 'pageview'); + } +</script> + +</head> +<body style="padding-top: 100px"> + + <nav class="navbar navbar-default navbar-fixed-top"> + <div class="container"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-items"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a href="/"><img id="nav-logo" alt="Apache Accumulo" class="img-responsive" src="/images/accumulo-logo.png" width="200"/></a> + </div> + <div class="collapse navbar-collapse" id="navbar-items"> + <ul class="nav navbar-nav"> + <li class="nav-link"><a href="/downloads">Download</a></li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Releases<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/release/accumulo-1.8.0/">1.8.0 (Latest)</a></li> + <li><a href="/release/accumulo-1.7.2/">1.7.2</a></li> + <li><a href="/release/accumulo-1.6.6/">1.6.6</a></li> + <li><a href="/release/">Archive</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Documentation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/user-manual/">User Manuals</a></li> + <li><a href="/javadocs/">Javadocs</a></li> + <li><a href="/examples/">Examples</a></li> + <li><a href="/notable_features">Features</a></li> + <li><a href="/screenshots">Screenshots</a></li> + <li><a href="/papers">Papers & Presentations</a></li> + <li><a href="/glossary">Glossary</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Community<span class="caret"></span></a> + <ul class="dropdown-menu"> + 
<li><a href="/get_involved">Get Involved</a></li> + <li><a href="/mailing_list">Mailing Lists</a></li> + <li><a href="/people">People</a></li> + <li><a href="/news">News Archive</a></li> + <li><a href="/projects">Community Projects</a></li> + <li><a href="/thanks">Thanks</a></li> + <li class="divider"></li> + <li class="dropdown-header">Governance</li> + <li><a href="/bylaws">Bylaws</a></li> + <li><a href="/governance/consensusBuilding">Consensus Building</a></li> + <li><a href="/governance/lazyConsensus">Lazy Consensus</a></li> + <li><a href="/governance/releasing">Releasing</a></li> + <li><a href="/governance/voting">Voting</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Development<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://issues.apache.org/jira/browse/ACCUMULO">Issue Tracker <i class="fa fa-external-link"></i></a></li> + <li><a href="https://github.com/apache/accumulo/pulls">Pull Requests <i class="fa fa-external-link"></i></a></li> + <li><a href="https://builds.apache.org/view/A/view/Accumulo">Jenkins Builds <i class="fa fa-external-link"></i></a></li> + <li><a href="https://travis-ci.org/apache/accumulo">TravisCI Builds <i class="fa fa-external-link"></i></a></li> + <li class="divider"></li> + <li class="dropdown-header">Guides</li> + <li><a href="/source">Source & Guide</a></li> + <li><a href="/git">Git Workflow</a></li> + <li><a href="/versioning">Versioning</a></li> + <li><a href="/contrib">Contrib Projects</a></li> + <li><a href="/rb">Review Board</a></li> + <li><a href="/releasing">Making Releases</a></li> + <li><a href="/verifying_releases">Verifying Releases</a></li> + </ul> + </li> + </ul> + <ul class="nav navbar-nav navbar-right"> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Apache Software Foundation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://www.apache.org">Apache 
Homepage <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/licenses/LICENSE-2.0">License <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/sponsorship">Sponsorship <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/security">Security <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/thanks">Thanks <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/policies/conduct">Code of Conduct <i class="fa fa-external-link"></i></a></li> + </ul> + </li> + </ul> + </div> + </div> +</nav> + + + <div class="container"> + <div class="row"> + <div class="col-md-12"> + + <div id="non-canonical" style="display: none; background-color: #F0E68C; padding-left: 1em;"> + Visit the official site at: <a href="https://accumulo.apache.org">https://accumulo.apache.org</a> + </div> + <div id="content"> + + <h1 class="title">User Manual: Administration</h1> + + <p>** Next:** <a href="Shell_Commands.html">Shell Commands</a> ** Up:** <a href="accumulo_user_manual.html">Apache Accumulo User Manual Version 1.3</a> ** Previous:** <a href="Security.html">Security</a> ** <a href="Contents.html">Contents</a>**</p> + +<p><a id="CHILD_LINKS"></a><strong>Subsections</strong></p> + +<ul> + <li><a href="Administration.html#Hardware">Hardware</a></li> + <li><a href="Administration.html#Network">Network</a></li> + <li><a href="Administration.html#Installation">Installation</a></li> + <li><a href="Administration.html#Dependencies">Dependencies</a></li> + <li><a href="Administration.html#Configuration">Configuration</a></li> + <li><a href="Administration.html#Initialization">Initialization</a></li> + <li><a href="Administration.html#Running">Running</a></li> + <li><a href="Administration.html#Monitoring">Monitoring</a></li> + <li><a href="Administration.html#Logging">Logging</a></li> + <li><a 
href="Administration.html#Recovery">Recovery</a></li> +</ul> + +<hr /> + +<h2 id="a-idadministrationa-administration"><a id="Administration"></a> Administration</h2> + +<h2 id="a-idhardwarea-hardware"><a id="Hardware"></a> Hardware</h2> + +<p>Because essentially two or three systems (HDFS, Accumulo and MapReduce) run simultaneously, layered across the cluster, hardware typically consists of 4 to 8 cores and 8 to 32 GB of RAM. This allows each running process to have at least one core and 2 to 4 GB of memory.</p> + +<p>One core running HDFS can typically keep 2 to 4 disks busy, so each machine may have as few as 2 x 300GB disks and as many as 4 x 1TB or 2TB disks.</p> + +<p>It is possible to make do with less than this, such as 1U servers with 2 cores and 4GB each, but in this case it is recommended to run at most two processes per machine, i.e. DataNode and TabletServer, or DataNode and MapReduce worker, but not all three. The constraint here is having enough available heap space for all the processes on a machine.</p> + +<h2 id="a-idnetworka-network"><a id="Network"></a> Network</h2> + +<p>Accumulo communicates via remote procedure calls over TCP/IP for passing both data and control messages. In addition, Accumulo uses HDFS clients to communicate with HDFS. To achieve good ingest and query performance, sufficient network bandwidth must be available between any two machines.</p> + +<h2 id="a-idinstallationa-installation"><a id="Installation"></a> Installation</h2> + +<p>Choose a directory for the Accumulo installation. This directory will be referenced by the environment variable $ACCUMULO_HOME. Run the following:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>$ tar xzf $ACCUMULO_HOME/accumulo.tar.gz +</code></pre> +</div> + +<p>Repeat this step at each machine within the cluster. 
Usually all machines have the same $ACCUMULO_HOME.</p> + +<h2 id="a-iddependenciesa-dependencies"><a id="Dependencies"></a> Dependencies</h2> + +<p>Accumulo requires HDFS and ZooKeeper to be configured and running before starting. Password-less SSH should be configured between at least the Accumulo master and TabletServer machines. It is also a good idea to run Network Time Protocol (NTP) within the cluster to ensure nodes' clocks don't get too out of sync, which can cause problems with automatically timestamped data. Accumulo will remove from the set of TabletServers those machines whose times differ too much from the master's.</p> + +<h2 id="a-idconfigurationa-configuration"><a id="Configuration"></a> Configuration</h2> + +<p>Accumulo is configured by editing several Shell and XML files found in $ACCUMULO_HOME/conf. The structure closely resembles Hadoop's configuration files.</p> + +<h3 id="a-ideditconfaccumulo-envsha-edit-confaccumulo-envsh"><a id="Edit_conf/accumulo-env.sh"></a> Edit conf/accumulo-env.sh</h3> + +<p>Accumulo needs to know where to find the software it depends on. Edit accumulo-env.sh and specify the following:</p> + +<ol> + <li>Enter the location of the installation directory of Accumulo for $ACCUMULO_HOME</li> + <li>Enter your system's Java home for $JAVA_HOME</li> + <li>Enter the location of Hadoop for $HADOOP_HOME</li> + <li>Choose a location for Accumulo logs and enter it for $ACCUMULO_LOG_DIR</li> + <li>Enter the location of ZooKeeper for $ZOOKEEPER_HOME</li> +</ol> + +<p>By default Accumulo TabletServers are set to use 1GB of memory. You may change this by altering the value of $ACCUMULO_TSERVER_OPTS. Note the syntax is that of the Java JVM command line options. This value should be less than the physical memory of the machines running TabletServers.</p> + +<p>There are similar options for the master's memory usage and the garbage collector process. 
Reduce these if they exceed the physical RAM of your hardware and increase them, within the bounds of the physical RAM, if a process fails because of insufficient memory.</p> + +<p>Note that you will be specifying the Java heap space in accumulo-env.sh. You should make sure that the total heap space used for the Accumulo tserver and the Hadoop DataNode and TaskTracker is less than the available memory on each slave node in the cluster. On large clusters, it is recommended that the Accumulo master, Hadoop NameNode, secondary NameNode, and Hadoop JobTracker all be run on separate machines to allow them to use more heap space. If you are running these on the same machine on a small cluster, likewise make sure their heap space settings fit within the available memory.</p> + +<h3 id="a-idclusterspecificationa-cluster-specification"><a id="Cluster_Specification"></a> Cluster Specification</h3> + +<p>On the machine that will serve as the Accumulo master:</p> + +<ol> + <li>Write the IP address or domain name of the Accumulo Master to the <br /> +$ACCUMULO_HOME/conf/masters file.</li> + <li>Write the IP addresses or domain names of the machines that will be TabletServers in <br /> +$ACCUMULO_HOME/conf/slaves, one per line.</li> +</ol> + +<p>Note that if using domain names rather than IP addresses, DNS must be configured properly for all machines participating in the cluster. 
DNS can be a confusing source of errors.</p> + +<h3 id="a-idaccumulosettingsa-accumulo-settings"><a id="Accumulo_Settings"></a> Accumulo Settings</h3> + +<p>Specify appropriate values for the following settings in <br /> +$ACCUMULO_HOME/conf/accumulo-site.xml :</p> + +<div class="highlighter-rouge"><pre class="highlight"><code><property> + <name>zookeeper</name> + <value>zooserver-one:2181,zooserver-two:2181</value> + <description>list of zookeeper servers</description> +</property> +<property> + <name>walog</name> + <value>/var/accumulo/walogs</value> + <description>local directory for write ahead logs</description> +</property> +</code></pre> +</div> + +<p>The 'zookeeper' setting enables Accumulo to find ZooKeeper. Accumulo uses ZooKeeper to coordinate settings between processes and to help finalize TabletServer failure.</p> + +<p>Accumulo records all changes to tables to a write-ahead log before committing them to the table. The 'walog' setting specifies the local directory on each machine to which write-ahead logs are written. This directory should exist on all machines acting as TabletServers.</p> + +<p>Some settings can be modified via the Accumulo shell and take effect immediately. However, any settings that should be persisted across system restarts must be recorded in the accumulo-site.xml file.</p> + +<h3 id="a-iddeployconfigurationa-deploy-configuration"><a id="Deploy_Configuration"></a> Deploy Configuration</h3> + +<p>Copy the masters, slaves, accumulo-env.sh, and if necessary, accumulo-site.xml from the <br /> +$ACCUMULO_HOME/conf/ directory on the master to all the machines specified in the slaves file.</p> + +<h2 id="a-idinitializationa-initialization"><a id="Initialization"></a> Initialization</h2> + +<p>Accumulo must be initialized to create the structures it uses internally to locate data across the cluster. 
HDFS is required to be configured and running before Accumulo can be initialized.</p> + +<p>Once HDFS is started, initialization can be performed by executing <br /> +$ACCUMULO_HOME/bin/accumulo init . This script will prompt for a name for this instance of Accumulo. The instance name is used to identify a set of tables and instance-specific settings. The script will then write some information into HDFS so Accumulo can start properly.</p> + +<p>The initialization script will prompt you to set a root password. Once Accumulo is initialized it can be started.</p> + +<h2 id="a-idrunninga-running"><a id="Running"></a> Running</h2> + +<h3 id="a-idstartingaccumuloa-starting-accumulo"><a id="Starting_Accumulo"></a> Starting Accumulo</h3> + +<p>Make sure Hadoop is configured on all of the machines in the cluster, including access to a shared HDFS instance. Make sure HDFS and ZooKeeper are running. Make sure ZooKeeper is configured and running on at least one machine in the cluster. Start Accumulo using the bin/start-all.sh script.</p> + +<p>To verify that Accumulo is running, check the Status page as described under <em>Monitoring</em>. In addition, the Shell can provide some information about the status of tables by reading the !METADATA table.</p> + +<h3 id="a-idstoppingaccumuloa-stopping-accumulo"><a id="Stopping_Accumulo"></a> Stopping Accumulo</h3> + +<p>To shut down cleanly, run bin/stop-all.sh and the master will orchestrate the shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may take some time for particular configurations.</p> + +<h2 id="a-idmonitoringa-monitoring"><a id="Monitoring"></a> Monitoring</h2> + +<p>The Accumulo Master provides an interface for monitoring the status and health of Accumulo components. 
This interface can be accessed by pointing a web browser to <br /> +http://accumulomaster:50095/status</p> + +<h2 id="a-idlogginga-logging"><a id="Logging"></a> Logging</h2> + +<p>Accumulo processes each write to a set of log files. By default these are found under <br /> +$ACCUMULO_HOME/logs/.</p> + +<h2 id="a-idrecoverya-recovery"><a id="Recovery"></a> Recovery</h2> + +<p>In the event of TabletServer failure or an error while shutting Accumulo down, some mutations may not have been minor compacted to HDFS properly. In this case, Accumulo will automatically reapply such mutations from the write-ahead log, either when the tablets from the failed server are reassigned by the Master (in the case of a single TabletServer failure) or the next time Accumulo starts (in the event of failure during shutdown).</p> + +<p>Recovery is performed by asking the loggers to copy their write-ahead logs into HDFS. As the logs are copied, they are also sorted, so that tablets can easily find their missing updates. The copy/sort status of each file is displayed on the Accumulo monitor status page. Once the recovery is complete, any tablets involved should return to an "online" state. Until then those tablets will be unavailable to clients.</p> + +<p>The Accumulo client library is configured to retry failed mutations and in many cases clients will be able to continue processing after the recovery process without throwing an exception.</p> + +<p>Note that because Accumulo uses timestamps to order mutations, any mutations that are applied as part of the recovery process should appear to have been applied when they originally arrived at the TabletServer that failed. 
This makes the ordering of mutations consistent in the presence of failure.</p> + +<hr /> + +<p>** Next:** <a href="Shell_Commands.html">Shell Commands</a> ** Up:** <a href="accumulo_user_manual.html">Apache Accumulo User Manual Version 1.3</a> ** Previous:** <a href="Security.html">Security</a> ** <a href="Contents.html">Contents</a>**</p> + + + </div> + + +<footer> + + <p><a href="https://www.apache.org"><img src="/images/feather-small.gif" alt="Apache Software Foundation" id="asf-logo" height="100" /></a></p> + + <p>Copyright © 2011-2016 The Apache Software Foundation. Licensed under the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p> + +</footer> + + + </div> + </div> + </div> +</body> +</html> http://git-wip-us.apache.org/repos/asf/accumulo/blob/c0655661/1.3/user_manual/Analytics.html ---------------------------------------------------------------------- diff --git a/1.3/user_manual/Analytics.html b/1.3/user_manual/Analytics.html new file mode 100644 index 0000000..e8cf830 --- /dev/null +++ b/1.3/user_manual/Analytics.html @@ -0,0 +1,330 @@ +<!DOCTYPE html> +<html lang="en"> +<head> +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--> +<meta charset="utf-8"> +<meta http-equiv="X-UA-Compatible" content="IE=edge"> +<meta name="viewport" content="width=device-width, initial-scale=1"> +<link href="https://maxcdn.bootstrapcdn.com/bootswatch/3.3.7/paper/bootstrap.min.css" rel="stylesheet" integrity="sha384-awusxf8AUojygHf2+joICySzB780jVvQaVCAt1clU3QsyAitLGul28Qxb2r1e5g+" crossorigin="anonymous"> +<link href="//netdna.bootstrapcdn.com/font-awesome/4.0.3/css/font-awesome.css" rel="stylesheet"> +<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.css"> +<link href="/css/accumulo.css" rel="stylesheet" type="text/css"> + +<title>User Manual: Analytics</title> + +<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script> +<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script> +<script type="text/javascript" src="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.js"></script> +<script> + // show location of canonical site if not currently on the canonical site + $(function() { + var host = window.location.host; + if (typeof host !== 'undefined' && host !== 'accumulo.apache.org') { + $('#non-canonical').show(); + } + }); + + $(function() { + // decorate section headers with anchors + return $("h2, h3, h4, h5, h6").each(function(i, el) { + var $el, icon, id; + $el = $(el); + id = $el.attr('id'); + icon = '<i class="fa fa-link"></i>'; + if (id) { + return $el.append($("<a />").addClass("header-link").attr("href", "#" + id).html(icon)); + } + }); + }); + + // configure Google Analytics + (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ + (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), + m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) + 
})(window,document,'script','//www.google-analytics.com/analytics.js','ga'); + + if (ga.hasOwnProperty('loaded') && ga.loaded === true) { + ga('create', 'UA-50934829-1', 'apache.org'); + ga('send', 'pageview'); + } +</script> + +</head> +<body style="padding-top: 100px"> + + <nav class="navbar navbar-default navbar-fixed-top"> + <div class="container"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-items"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a href="/"><img id="nav-logo" alt="Apache Accumulo" class="img-responsive" src="/images/accumulo-logo.png" width="200"/></a> + </div> + <div class="collapse navbar-collapse" id="navbar-items"> + <ul class="nav navbar-nav"> + <li class="nav-link"><a href="/downloads">Download</a></li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Releases<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/release/accumulo-1.8.0/">1.8.0 (Latest)</a></li> + <li><a href="/release/accumulo-1.7.2/">1.7.2</a></li> + <li><a href="/release/accumulo-1.6.6/">1.6.6</a></li> + <li><a href="/release/">Archive</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Documentation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="/user-manual/">User Manuals</a></li> + <li><a href="/javadocs/">Javadocs</a></li> + <li><a href="/examples/">Examples</a></li> + <li><a href="/notable_features">Features</a></li> + <li><a href="/screenshots">Screenshots</a></li> + <li><a href="/papers">Papers & Presentations</a></li> + <li><a href="/glossary">Glossary</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Community<span class="caret"></span></a> + <ul class="dropdown-menu"> + 
<li><a href="/get_involved">Get Involved</a></li> + <li><a href="/mailing_list">Mailing Lists</a></li> + <li><a href="/people">People</a></li> + <li><a href="/news">News Archive</a></li> + <li><a href="/projects">Community Projects</a></li> + <li><a href="/thanks">Thanks</a></li> + <li class="divider"></li> + <li class="dropdown-header">Governance</li> + <li><a href="/bylaws">Bylaws</a></li> + <li><a href="/governance/consensusBuilding">Consensus Building</a></li> + <li><a href="/governance/lazyConsensus">Lazy Consensus</a></li> + <li><a href="/governance/releasing">Releasing</a></li> + <li><a href="/governance/voting">Voting</a></li> + </ul> + </li> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Development<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://issues.apache.org/jira/browse/ACCUMULO">Issue Tracker <i class="fa fa-external-link"></i></a></li> + <li><a href="https://github.com/apache/accumulo/pulls">Pull Requests <i class="fa fa-external-link"></i></a></li> + <li><a href="https://builds.apache.org/view/A/view/Accumulo">Jenkins Builds <i class="fa fa-external-link"></i></a></li> + <li><a href="https://travis-ci.org/apache/accumulo">TravisCI Builds <i class="fa fa-external-link"></i></a></li> + <li class="divider"></li> + <li class="dropdown-header">Guides</li> + <li><a href="/source">Source & Guide</a></li> + <li><a href="/git">Git Workflow</a></li> + <li><a href="/versioning">Versioning</a></li> + <li><a href="/contrib">Contrib Projects</a></li> + <li><a href="/rb">Review Board</a></li> + <li><a href="/releasing">Making Releases</a></li> + <li><a href="/verifying_releases">Verifying Releases</a></li> + </ul> + </li> + </ul> + <ul class="nav navbar-nav navbar-right"> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Apache Software Foundation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://www.apache.org">Apache 
Homepage <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/licenses/LICENSE-2.0">License <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/sponsorship">Sponsorship <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/security">Security <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/thanks">Thanks <i class="fa fa-external-link"></i></a></li> + <li><a href="https://www.apache.org/foundation/policies/conduct">Code of Conduct <i class="fa fa-external-link"></i></a></li> + </ul> + </li> + </ul> + </div> + </div> +</nav> + + + <div class="container"> + <div class="row"> + <div class="col-md-12"> + + <div id="non-canonical" style="display: none; background-color: #F0E68C; padding-left: 1em;"> + Visit the official site at: <a href="https://accumulo.apache.org">https://accumulo.apache.org</a> + </div> + <div id="content"> + + <h1 class="title">User Manual: Analytics</h1> + + <p>** Next:** <a href="Security.html">Security</a> ** Up:** <a href="accumulo_user_manual.html">Apache Accumulo User Manual Version 1.3</a> ** Previous:** <a href="High_Speed_Ingest.html">High-Speed Ingest</a> ** <a href="Contents.html">Contents</a>**</p> + +<p><a id="CHILD_LINKS"></a><strong>Subsections</strong></p> + +<ul> + <li><a href="Analytics.html#MapReduce">MapReduce</a></li> + <li><a href="Analytics.html#Aggregating_Iterators">Aggregating Iterators</a></li> + <li><a href="Analytics.html#Statistical_Modeling">Statistical Modeling</a></li> +</ul> + +<hr /> + +<h2 id="a-idanalyticsa-analytics"><a id="Analytics"></a> Analytics</h2> + +<p>Accumulo supports more advanced data processing than simply keeping keys sorted and performing efficient lookups. 
Analytics can be developed by using MapReduce and Iterators in conjunction with Accumulo tables.</p> + +<h2 id="a-idmapreducea-mapreduce"><a id="MapReduce"></a> MapReduce</h2> + +<p>Accumulo tables can be used as the source and destination of MapReduce jobs. To use an Accumulo table with a MapReduce job (specifically with the new Hadoop API as of version 0.20), configure the job parameters to use the AccumuloInputFormat and AccumuloOutputFormat. Accumulo-specific parameters can be set via these two format classes to do the following:</p> + +<ul> + <li>Authenticate and provide user credentials for the input</li> + <li>Restrict the scan to a range of rows</li> + <li>Restrict the input to a subset of available columns</li> +</ul> + +<h3 id="a-idmapperandreducerclassesa-mapper-and-reducer-classes"><a id="Mapper_and_Reducer_classes"></a> Mapper and Reducer classes</h3> + +<p>To read from an Accumulo table, create a Mapper with the following class parameterization and be sure to configure the AccumuloInputFormat.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>class MyMapper extends Mapper<Key,Value,WritableComparable,Writable> { + @Override + public void map(Key k, Value v, Context c) { + // transform key and value data here + } +} +</code></pre> +</div> + +<p>To write to an Accumulo table, create a Reducer with the following class parameterization and be sure to configure the AccumuloOutputFormat. The key emitted from the Reducer identifies the table to which the mutation is sent. This allows a single Reducer to write to more than one table if desired. 
A default table can be configured using the AccumuloOutputFormat, in which case the output table name does not have to be passed to the Context object within the Reducer.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>class MyReducer extends Reducer<WritableComparable, Writable, Text, Mutation> { + + @Override + public void reduce(WritableComparable key, Iterable<Writable> values, Context c) + throws IOException, InterruptedException { + + // create the mutation based on input key and value; + // using the input key as the row ID is only an illustration + Mutation m = new Mutation(new Text(key.toString())); + + c.write(new Text("output-table"), m); + } +} +</code></pre> +</div> + +<p>The Text object passed as the output key should contain the name of the table to which this mutation should be applied. The Text can be null, in which case the mutation is applied to the default table specified in the AccumuloOutputFormat options.</p> + +<h3 id="a-idaccumuloinputformatoptionsa-accumuloinputformat-options"><a id="AccumuloInputFormat_options"></a> AccumuloInputFormat options</h3> + +<div class="highlighter-rouge"><pre class="highlight"><code>Job job = new Job(getConf()); +AccumuloInputFormat.setInputInfo(job, + "user", + "passwd".getBytes(), + "table", + new Authorizations()); + +AccumuloInputFormat.setZooKeeperInstance(job, "myinstance", + "zooserver-one,zooserver-two"); +</code></pre> +</div> + +<p><strong>Optional settings:</strong></p> + +<p>To restrict Accumulo to a set of row ranges:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>ArrayList<Range> ranges = new ArrayList<Range>(); +// populate array list of row ranges ... 
+AccumuloInputFormat.setRanges(job, ranges); +</code></pre> +</div> + +<p>To restrict Accumulo to a list of columns:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>ArrayList<Pair<Text,Text>> columns = new ArrayList<Pair<Text,Text>>(); +// populate list of columns +AccumuloInputFormat.fetchColumns(job, columns); +</code></pre> +</div> + +<p>To use a regular expression to match row IDs:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>AccumuloInputFormat.setRegex(job, RegexType.ROW, "^.*"); +</code></pre> +</div> + +<h3 id="a-idaccumulooutputformatoptionsa-accumulooutputformat-options"><a id="AccumuloOutputFormat_options"></a> AccumuloOutputFormat options</h3> + +<div class="highlighter-rouge"><pre class="highlight"><code>boolean createTables = true; +String defaultTable = "mytable"; + +AccumuloOutputFormat.setOutputInfo(job, + "user", + "passwd".getBytes(), + createTables, + defaultTable); + +AccumuloOutputFormat.setZooKeeperInstance(job, "myinstance", + "zooserver-one,zooserver-two"); +</code></pre> +</div> + +<p><strong>Optional settings:</strong></p> + +<div class="highlighter-rouge"><pre class="highlight"><code>AccumuloOutputFormat.setMaxLatency(job, 300); // milliseconds +AccumuloOutputFormat.setMaxMutationBufferSize(job, 5000000); // bytes +</code></pre> +</div> + +<p>An example of using MapReduce with Accumulo can be found at <br /> +accumulo/docs/examples/README.mapred</p> + +<h2 id="a-idaggregatingiteratorsa-aggregating-iterators"><a id="Aggregating_Iterators"></a> Aggregating Iterators</h2> + +<p>Many applications can benefit from the ability to aggregate values across common keys. This can be done via aggregating iterators and is similar to the Reduce step in MapReduce. 
This provides the ability to define online, incrementally updated analytics without the overhead or latency associated with batch-oriented MapReduce jobs.</p> + +<p>All that is needed to aggregate values of a table is to identify the fields over which values will be grouped, insert mutations with those fields as the key, and configure the table with an aggregating iterator that supports the summarization operation desired.</p> + +<p>The only restriction on an aggregating iterator is that the aggregator developer should not assume that all values for a given key have been seen, since new mutations can be inserted at any time. This precludes aggregations that depend on the total number of values, such as calculating an average.</p> + +<h3 id="a-idfeaturevectorsa-feature-vectors"><a id="Feature_Vectors"></a> Feature Vectors</h3> + +<p>An interesting use of aggregating iterators within an Accumulo table is to store feature vectors for use in machine learning algorithms. For example, many algorithms such as k-means clustering, support vector machines, anomaly detection, etc. use the concept of a feature vector and the calculation of distance metrics to learn a particular model. The columns in an Accumulo table can be used to efficiently store sparse features and their weights to be incrementally updated via the use of an aggregating iterator.</p> + +<h2 id="a-idstatisticalmodelinga-statistical-modeling"><a id="Statistical_Modeling"></a> Statistical Modeling</h2> + +<p>Statistical models that need to be updated by many machines in parallel could be similarly stored within an Accumulo table. 
For example, a MapReduce job that is iteratively updating a global statistical model could have each map or reduce worker reference the parts of the model to be read and updated through an embedded Accumulo client.</p> + +<p>Using Accumulo this way enables efficient and fast lookups and updates of small pieces of information in a random access pattern, which is complementary to MapReduce's sequential access model.</p> + +<hr /> + +<p>** Next:** <a href="Security.html">Security</a> ** Up:** <a href="accumulo_user_manual.html">Apache Accumulo User Manual Version 1.3</a> ** Previous:** <a href="High_Speed_Ingest.html">High-Speed Ingest</a> ** <a href="Contents.html">Contents</a>**</p> + + + </div> + + +<footer> + + <p><a href="https://www.apache.org"><img src="/images/feather-small.gif" alt="Apache Software Foundation" id="asf-logo" height="100" /></a></p> + + <p>Copyright © 2011-2016 The Apache Software Foundation. Licensed under the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p> + +</footer> + + + </div> + </div> + </div> +</body> +</html>