This is an automated email from the ASF dual-hosted git repository.
mmiller pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/accumulo-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
new bdb541d Jekyll build from master:b9b8aea
bdb541d is described below
commit bdb541d0fdefc1bcd28c52d1d34c411e8aa32365
Author: Mike Miller <[email protected]>
AuthorDate: Thu Oct 17 11:08:21 2019 -0400
Jekyll build from master:b9b8aea
Blog post to configure Accumulo with Azure Data Lake Gen2 Storage (#198)
---
1.3/user_manual/Writing_Accumulo_Clients.html | 2 +-
1.4/user_manual/Writing_Accumulo_Clients.html | 2 +-
1.5/accumulo_user_manual.html | 2 +-
blog/2019/10/15/accumulo-adlsgen2-notes.html | 293 ++++++++++++++++++++++++++
feed.xml | 235 ++++++++++++---------
index.html | 20 +-
news/index.html | 7 +
redirects.json | 2 +-
search_data.json | 8 +
9 files changed, 456 insertions(+), 115 deletions(-)
diff --git a/1.3/user_manual/Writing_Accumulo_Clients.html
b/1.3/user_manual/Writing_Accumulo_Clients.html
index c9150ab..ba975ef 100644
--- a/1.3/user_manual/Writing_Accumulo_Clients.html
+++ b/1.3/user_manual/Writing_Accumulo_Clients.html
@@ -163,7 +163,7 @@ Connector conn = new Connector(inst,
"user","passwd".getBytes());
<h2 id="-writing-data"><a id="Writing_Data"></a> Writing Data</h2>
-<p>Data are written to Accumulo by creating Mutation objects that represent
all the changes to the columns of a single row. The changes are made atomically
in the TabletServer. Clients then add Mutations to a BatchWriter which submits
them to the appropriate TabletServers.</p>
+<p>Data is written to Accumulo by creating Mutation objects that represent all
the changes to the columns of a single row. The changes are made atomically in
the TabletServer. Clients then add Mutations to a BatchWriter which submits
them to the appropriate TabletServers.</p>
<p>Mutations can be created thus:</p>
diff --git a/1.4/user_manual/Writing_Accumulo_Clients.html
b/1.4/user_manual/Writing_Accumulo_Clients.html
index e74f289..ecf74fa 100644
--- a/1.4/user_manual/Writing_Accumulo_Clients.html
+++ b/1.4/user_manual/Writing_Accumulo_Clients.html
@@ -187,7 +187,7 @@ Connector conn = inst.getConnector("user", "passwd");
<h2 id="-writing-data"><a id="Writing_Data"></a> Writing Data</h2>
-<p>Data are written to Accumulo by creating Mutation objects that represent
all the changes to the columns of a single row. The changes are made atomically
in the TabletServer. Clients then add Mutations to a BatchWriter which submits
them to the appropriate TabletServers.</p>
+<p>Data is written to Accumulo by creating Mutation objects that represent all
the changes to the columns of a single row. The changes are made atomically in
the TabletServer. Clients then add Mutations to a BatchWriter which submits
them to the appropriate TabletServers.</p>
<p>Mutations can be created thus:</p>
diff --git a/1.5/accumulo_user_manual.html b/1.5/accumulo_user_manual.html
index 0916317..14941ef 100644
--- a/1.5/accumulo_user_manual.html
+++ b/1.5/accumulo_user_manual.html
@@ -1246,7 +1246,7 @@ http://www.gnu.org/software/src-highlite -->
</div>
<div class="sect2">
<h3 id="_writing_data">4.3. Writing Data</h3>
-<div class="paragraph"><p>Data are written to Accumulo by creating Mutation
objects that represent all the
+<div class="paragraph"><p>Data is written to Accumulo by creating Mutation
objects that represent all the
changes to the columns of a single row. The changes are made atomically in the
TabletServer. Clients then add Mutations to a BatchWriter which submits them to
the appropriate TabletServers.</p></div>
diff --git a/blog/2019/10/15/accumulo-adlsgen2-notes.html
b/blog/2019/10/15/accumulo-adlsgen2-notes.html
new file mode 100644
index 0000000..2557887
--- /dev/null
+++ b/blog/2019/10/15/accumulo-adlsgen2-notes.html
@@ -0,0 +1,293 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<meta charset="utf-8">
+<meta http-equiv="X-UA-Compatible" content="IE=edge">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<link
href="https://maxcdn.bootstrapcdn.com/bootswatch/3.3.7/paper/bootstrap.min.css"
rel="stylesheet"
integrity="sha384-awusxf8AUojygHf2+joICySzB780jVvQaVCAt1clU3QsyAitLGul28Qxb2r1e5g+"
crossorigin="anonymous">
+<link href="//netdna.bootstrapcdn.com/font-awesome/4.0.3/css/font-awesome.css"
rel="stylesheet">
+<link rel="stylesheet" type="text/css"
href="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.css">
+<link href="/css/accumulo.css" rel="stylesheet" type="text/css">
+
+<title>Using Azure Data Lake Gen2 storage as a data store for Accumulo</title>
+
+<script
src="https://cdnjs.cloudflare.com/ajax/libs/jquery/2.2.4/jquery.min.js"
integrity="sha256-BbhdlvQf/xTY9gja0Dq3HiwQF8LaCRTXxZKRutelT44="
crossorigin="anonymous"></script>
+<script
src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js"
integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa"
crossorigin="anonymous"></script>
+<script type="text/javascript"
src="https://cdn.datatables.net/v/bs/jq-2.2.3/dt-1.10.12/datatables.min.js"></script>
+<script>
+ // show location of canonical site if not currently on the canonical site
+ $(function() {
+ var host = window.location.host;
+ if (typeof host !== 'undefined' && host !== 'accumulo.apache.org') {
+ $('#non-canonical').show();
+ }
+ });
+
+ $(function() {
+ // decorate section headers with anchors
+ return $("h2, h3, h4, h5, h6").each(function(i, el) {
+ var $el, icon, id;
+ $el = $(el);
+ id = $el.attr('id');
+ icon = '<i class="fa fa-link"></i>';
+ if (id) {
+ return $el.append($("<a />").addClass("header-link").attr("href", "#"
+ id).html(icon));
+ }
+ });
+ });
+
+ // fix sidebar width in documentation
+ $(function() {
+ var $affixElement = $('div[data-spy="affix"]');
+ $affixElement.width($affixElement.parent().width());
+ });
+</script>
+
+</head>
+<body style="padding-top: 100px">
+
+ <nav class="navbar navbar-default navbar-fixed-top">
+ <div class="container">
+ <div class="navbar-header">
+ <button type="button" class="navbar-toggle" data-toggle="collapse"
data-target="#navbar-items">
+ <span class="sr-only">Toggle navigation</span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
+ <a href="/"><img id="nav-logo" alt="Apache Accumulo"
class="img-responsive" src="/images/accumulo-logo.png" width="200"
+ /></a>
+ </div>
+ <div class="collapse navbar-collapse" id="navbar-items">
+ <ul class="nav navbar-nav">
+ <li class="nav-link"><a href="/downloads">Download</a></li>
+ <li class="nav-link"><a href="/tour">Tour</a></li>
+ <li class="dropdown">
+ <a class="dropdown-toggle" data-toggle="dropdown"
href="#">Releases<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a href="/release/accumulo-2.0.0/">2.0.0 (Latest)</a></li>
+ <li><a href="/release/accumulo-1.9.3/">1.9.3</a></li>
+ <li><a href="/release/">Archive</a></li>
+ </ul>
+ </li>
+ <li class="dropdown">
+ <a class="dropdown-toggle" data-toggle="dropdown"
href="#">Documentation<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a href="/docs/2.x">User Manual (2.x)</a></li>
+ <li><a href="/quickstart-1.x">Quickstart (1.x)</a></li>
+ <li><a href="/accumulo2-maven-plugin">Accumulo Maven
Plugin</a></li>
+ <li><a href="/1.9/accumulo_user_manual.html">User Manual
(1.9)</a></li>
+ <li><a href="/1.9/apidocs">Javadocs (1.9)</a></li>
+ <li><a href="/external-docs">External Docs</a></li>
+ <li><a href="/docs-archive/">Archive</a></li>
+ </ul>
+ </li>
+ <li class="dropdown">
+ <a class="dropdown-toggle" data-toggle="dropdown"
href="#">Community<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a href="/contact-us">Contact Us</a></li>
+ <li><a href="/how-to-contribute">How To Contribute</a></li>
+ <li><a href="/people">People</a></li>
+ <li><a href="/related-projects">Related Projects</a></li>
+ </ul>
+ </li>
+ <li class="nav-link"><a href="/search">Search</a></li>
+ </ul>
+ <ul class="nav navbar-nav navbar-right">
+ <li class="dropdown">
+ <a class="dropdown-toggle" data-toggle="dropdown" href="#"><img
alt="Apache Software Foundation"
src="https://www.apache.org/foundation/press/kit/feather.svg" width="15"/><span
class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a href="https://www.apache.org">Apache Homepage <i class="fa
fa-external-link"></i></a></li>
+ <li><a href="https://www.apache.org/licenses/">License <i
class="fa fa-external-link"></i></a></li>
+ <li><a
href="https://www.apache.org/foundation/sponsorship">Sponsorship <i class="fa
fa-external-link"></i></a></li>
+ <li><a href="https://www.apache.org/security">Security <i
class="fa fa-external-link"></i></a></li>
+ <li><a href="https://www.apache.org/foundation/thanks">Thanks <i
class="fa fa-external-link"></i></a></li>
+ <li><a
href="https://www.apache.org/foundation/policies/conduct">Code of Conduct <i
class="fa fa-external-link"></i></a></li>
+ <li><a
href="https://www.apache.org/events/current-event.html">Current Event <i
class="fa fa-external-link"></i></a></li>
+ </ul>
+ </li>
+ </ul>
+ </div>
+ </div>
+</nav>
+
+
+ <div class="container">
+ <div class="row">
+ <div class="col-md-12">
+
+ <div id="non-canonical" style="display: none; background-color:
#F0E68C; padding-left: 1em;">
+ Visit the official site at: <a
href="https://accumulo.apache.org">https://accumulo.apache.org</a>
+ </div>
+ <div id="content">
+
+ <h1 class="title">Using Azure Data Lake Gen2 storage as a data store
for Accumulo</h1>
+
+ <p>
+<b>Author: </b> Karthick Narendran<br>
+<b>Date: </b> 15 Oct 2019<br>
+
+</p>
+
+<p>Accumulo can store its files in <a
href="https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction">Azure
Data Lake Storage Gen2</a>
+using the <a
href="https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-abfs-driver">ABFS
(Azure Blob File System)</a> driver.
+As in the <a
href="https://accumulo.apache.org/blog/2019/09/10/accumulo-S3-notes.html">S3
blog post</a>,
+the write ahead logs and Accumulo metadata can be stored in HDFS and
everything else on Gen2 storage
+using the volume chooser feature introduced in Accumulo 2.0. The
configurations referred to in this post
+are specific to Accumulo 2.0 and Hadoop 3.2.0.</p>
+
+<h2 id="hadoop-setup">Hadoop setup</h2>
+
+<p>For the ABFS client to talk to Gen2 storage, it requires one of the
authentication mechanisms listed <a
href="https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html#Authentication">here</a>.
+This post covers <a
href="https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview">Azure
Managed Identity</a>,
+formerly known as Managed Service Identity or MSI. This feature provides Azure
services with an
+automatically managed identity in <a
href="https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-whatis">Azure
AD</a>
+and avoids the need to store credentials or other sensitive information
in code
+or configs/JCEKS. Plus, it comes free with Azure AD.</p>
+
+<p>At least the following should be added to Hadoop’s <code
class="highlighter-rouge">core-site.xml</code> on each node.</p>
+
+<div class="language-xml highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="nt">&lt;property&gt;</span>
+ <span class="nt">&lt;name&gt;</span>fs.azure.account.auth.type<span
class="nt">&lt;/name&gt;</span>
+ <span class="nt">&lt;value&gt;</span>OAuth<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+<span class="nt">&lt;property&gt;</span>
+ <span
class="nt">&lt;name&gt;</span>fs.azure.account.oauth.provider.type<span
class="nt">&lt;/name&gt;</span>
+ <span
class="nt">&lt;value&gt;</span>org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+<span class="nt">&lt;property&gt;</span>
+ <span class="nt">&lt;name&gt;</span>fs.azure.account.oauth2.msi.tenant<span
class="nt">&lt;/name&gt;</span>
+ <span class="nt">&lt;value&gt;</span>TenantID<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+<span class="nt">&lt;property&gt;</span>
+ <span class="nt">&lt;name&gt;</span>fs.azure.account.oauth2.client.id<span
class="nt">&lt;/name&gt;</span>
+ <span class="nt">&lt;value&gt;</span>ClientID<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+</code></pre></div></div>
+
+<p>See <a
href="https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html">ABFS
doc</a>
+for more information on Hadoop Azure support.</p>
+
+<p>To get the hadoop command to work with ADLS Gen2, set the
+following entries in <code class="highlighter-rouge">hadoop-env.sh</code>.
Because Gen2 storage is TLS enabled by default,
+it is important to use the native OpenSSL implementation of TLS.</p>
+
+<div class="language-bash highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="nb">export </span><span
class="nv">HADOOP_OPTIONAL_TOOLS</span><span class="o">=</span><span
class="s2">"hadoop-azure"</span>
+<span class="nb">export </span><span class="nv">HADOOP_OPTS</span><span
class="o">=</span><span
class="s2">"-Dorg.wildfly.openssl.path=&lt;path/to/OpenSSL/libraries&gt;
</span><span class="k">${</span><span class="nv">HADOOP_OPTS</span><span
class="k">}</span><span class="s2">"</span>
+</code></pre></div></div>
+
+<p>To verify the location of the OpenSSL libraries, run the <code
class="highlighter-rouge">whereis libssl</code> command
+on the host.</p>
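+
+<p>Once these entries are in place, a quick sanity check is to list the Gen2
file system with the hadoop command. The sketch below is only an illustration:
+the helper name <code class="highlighter-rouge">abfs_uri</code> and its
arguments are hypothetical, and it merely assembles an
+<code class="highlighter-rouge">abfss://</code> URI in the same form used in
the Accumulo configuration later in this post.</p>

```shell
# Hypothetical helper (not part of Hadoop or Accumulo): builds an ABFS URI
# from a file system (container) name, a storage account name, and an
# optional path inside the file system.
abfs_uri() {
  printf 'abfss://%s@%s.dfs.core.windows.net/%s\n' "$1" "$2" "${3:-}"
}

# Example, to be run on a node where core-site.xml and hadoop-env.sh are
# already configured (the names below are placeholders):
#   hadoop fs -ls "$(abfs_uri myfilesystem mystorageaccount accumulo)"
```

+<p>If the listing fails with a TLS or authentication error, the OpenSSL path
and the managed identity settings above are the first things to re-check.</p>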
+
+<h2 id="accumulo-setup">Accumulo setup</h2>
+
+<p>For each node in the cluster, modify <code
class="highlighter-rouge">accumulo-env.sh</code> to add Azure storage jars to
the
+classpath. Your versions may differ depending on your Hadoop version;
+the following versions were included with Hadoop 3.2.0.</p>
+
+<div class="language-bash highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="nv">CLASSPATH</span><span
class="o">=</span><span class="s2">"</span><span class="k">${</span><span
class="nv">conf</span><span class="k">}</span><span class="s2">:</span><span
class="k">${</span><span class="nv">lib</span><span class="k">}</span><span
class="s2">/*:</span><span class="k">${</span><span
class="nv">HADOOP_CONF_DIR</span><span class="k">}</span><span class="s2">:</
[...]
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/tools/lib/azure-data-lake-store-sdk-2.2.9.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/tools/lib/azure-keyvault-core-1.0.0.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/tools/lib/hadoop-azure-3.2.0.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/tools/lib/wildfly-openssl-1.0.4.Final.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/common/lib/jaxb-api-2.2.11.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/common/lib/commons-lang3-3.7.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/common/lib/httpclient-4.5.2.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar"</span>
+<span class="nv">CLASSPATH</span><span class="o">=</span><span
class="s2">"</span><span class="k">${</span><span
class="nv">CLASSPATH</span><span class="k">}</span><span
class="s2">:</span><span class="k">${</span><span
class="nv">HADOOP_HOME</span><span class="k">}</span><span
class="s2">/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar"</span>
+<span class="nb">export </span>CLASSPATH
+</code></pre></div></div>
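+
+<p>Because the jar versions vary between Hadoop releases, it can help to
confirm that each jar named on the classpath actually exists before starting
Accumulo.
+The following is a minimal sketch, assuming a POSIX shell; the function name
<code class="highlighter-rouge">missing_jars</code> is hypothetical and not
part of Accumulo or Hadoop.</p>

```shell
# Hypothetical helper: prints each expected jar that is missing from a
# directory and returns non-zero if at least one is missing.
missing_jars() {
  dir="$1"
  shift
  rc=0
  for jar in "$@"; do
    if [ ! -f "${dir}/${jar}" ]; then
      echo "missing: ${dir}/${jar}"
      rc=1
    fi
  done
  return "$rc"
}

# Example, using jar names from the Hadoop 3.2.0 list above:
#   missing_jars "${HADOOP_HOME}/share/hadoop/tools/lib" \
#     hadoop-azure-3.2.0.jar wildfly-openssl-1.0.4.Final.jar
```

+<p>Any path it reports should be corrected in <code
class="highlighter-rouge">accumulo-env.sh</code> to match the versions shipped
with your Hadoop install.</p>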
+
+<p>We tried adding <code
class="highlighter-rouge">-Dorg.wildfly.openssl.path</code> to <code
class="highlighter-rouge">JAVA_OPTS</code> in <code
class="highlighter-rouge">accumulo-env.sh</code>, but it
+did not appear to work; this needs further investigation.</p>
+
+<p>Set the following in <code
class="highlighter-rouge">accumulo.properties</code> and then run <code
class="highlighter-rouge">accumulo init</code>, but don’t start Accumulo.</p>
+
+<div class="language-ini highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="py">instance.volumes</span><span
class="p">=</span><span class="s">hdfs://&lt;namenode&gt;/accumulo</span>
+</code></pre></div></div>
+
+<p>After running <code class="highlighter-rouge">accumulo init</code>, we need
to configure Accumulo to store write ahead logs in
+HDFS. Set the following in <code
class="highlighter-rouge">accumulo.properties</code>.</p>
+
+<div class="language-ini highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="py">instance.volumes</span><span
class="p">=</span><span
class="s">hdfs://&lt;namenode&gt;/accumulo,abfss://&lt;file_system&gt;@&lt;storage_account_name&gt;.dfs.core.windows.net/accumulo</span>
+<span class="py">general.volume.chooser</span><span class="p">=</span><span
class="s">org.apache.accumulo.server.fs.PreferredVolumeChooser</span>
+<span class="py">general.custom.volume.preferred.default</span><span
class="p">=</span><span
class="s">abfss://&lt;file_system&gt;@&lt;storage_account_name&gt;.dfs.core.windows.net/accumulo</span>
+<span class="py">general.custom.volume.preferred.logger</span><span
class="p">=</span><span class="s">hdfs://&lt;namenode&gt;/accumulo</span>
+</code></pre></div></div>
+
+<p>Run <code class="highlighter-rouge">accumulo init --add-volumes</code> to
initialize the ADLS Gen2 volume. Doing this
+in two steps avoids putting any Accumulo metadata files in Gen2 during init.
+Copy <code class="highlighter-rouge">accumulo.properties</code> to all nodes
and start Accumulo.</p>
+
+<p>Individual tables can be configured to store their files in HDFS by setting
the
+table property <code
class="highlighter-rouge">table.custom.volume.preferred</code>. This should be
set for the
+metadata table, in case it splits, using the following Accumulo shell
command.</p>
+
+<div class="highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>config -t accumulo.metadata -s
table.custom.volume.preferred=hdfs://&lt;namenode&gt;/accumulo
+</code></pre></div></div>
+
+<h2 id="accumulo-example">Accumulo example</h2>
+
+<p>The following Accumulo shell session shows an example of writing data to
Gen2 and
+reading it back. It also shows scanning the metadata table to verify the data
+is stored in Gen2.</p>
+
+<div class="highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>root@muchos&gt; createtable gen2test
+root@muchos gen2test&gt; insert r1 f1 q1 v1
+root@muchos gen2test&gt; insert r1 f1 q2 v2
+root@muchos gen2test&gt; flush -w
+2019-10-16 08:01:00,564 [shell.Shell] INFO : Flush of table gen2test
completed.
+root@muchos gen2test&gt; scan
+r1 f1:q1 [] v1
+r1 f1:q2 [] v2
+root@muchos gen2test&gt; scan -t accumulo.metadata -c file
+4&lt;
file:abfss://&lt;file_system&gt;@&lt;storage_account_name&gt;.dfs.core.windows.net/accumulo/tables/4/default_tablet/F00000gj.rf
[] 234,2
+</code></pre></div></div>
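+
+<p>The same check can be scripted by counting the file entries in the metadata
scan output that point at a given URI scheme.
+This is a sketch under stated assumptions: the function name <code
class="highlighter-rouge">count_files_on</code> is hypothetical, and it
assumes the scan output
+has the <code class="highlighter-rouge">file:abfss://...</code> form shown in
the session above.</p>

```shell
# Hypothetical helper: reads metadata scan output on stdin and prints the
# number of lines whose file entry uses the given URI scheme
# (for example abfss or hdfs).
count_files_on() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c "file:$1://" || true
}

# Example (assumes the Accumulo shell's -u and -e options and an
# interactive password prompt):
#   accumulo shell -u root -e "scan -t accumulo.metadata -c file" \
#     | count_files_on abfss
```

+<p>A non-zero count for <code class="highlighter-rouge">abfss</code> suggests
that table files are being written to Gen2.</p>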
+
+<p>These instructions will help to configure Accumulo to use Azure’s Data Lake
Gen2 Storage along with HDFS. With this setup,
+we are able to successfully run the continuous ingest test. Going forward,
we’ll experiment more in this space
+with ADLS Gen2 and add to or update this blog as we go along.</p>
+
+
+
+<p><strong>View all posts in the <a href="/news">news archive</a></strong></p>
+
+ </div>
+
+
+<footer>
+
+ <p><a href="https://www.apache.org/foundation/contributing"><img
src="https://www.apache.org/images/SupportApache-small.png" alt="Support the
ASF" id="asf-logo" height="100" /></a></p>
+
+ <p>Copyright © 2011-2019 <a href="https://www.apache.org">The Apache
Software Foundation</a>.
+Licensed under the <a href="https://www.apache.org/licenses/">Apache License,
Version 2.0</a>.</p>
+
+ <p>Apache®, the names of Apache projects and their logos, and the multicolor
feather
+logo are registered trademarks or trademarks of The Apache Software Foundation
+in the United States and/or other countries.</p>
+
+</footer>
+
+
+ </div>
+ </div>
+ </div>
+</body>
+</html>
diff --git a/feed.xml b/feed.xml
index b7f0f21..2353fcb 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,12 +6,144 @@
</description>
<link>https://accumulo.apache.org/</link>
<atom:link href="https://accumulo.apache.org/feed.xml" rel="self"
type="application/rss+xml"/>
- <pubDate>Tue, 08 Oct 2019 17:43:33 -0400</pubDate>
- <lastBuildDate>Tue, 08 Oct 2019 17:43:33 -0400</lastBuildDate>
+ <pubDate>Thu, 17 Oct 2019 11:08:16 -0400</pubDate>
+ <lastBuildDate>Thu, 17 Oct 2019 11:08:16 -0400</lastBuildDate>
<generator>Jekyll v4.0.0</generator>
<item>
+ <title>Using Azure Data Lake Gen2 storage as a data store for
Accumulo</title>
+ <description><p>Accumulo can store its files in <a
href="https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction">Azure
Data Lake Storage Gen2</a>
+using the <a
href="https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-abfs-driver">ABFS
(Azure Blob File System)</a> driver.
+As in the <a
href="https://accumulo.apache.org/blog/2019/09/10/accumulo-S3-notes.html">S3
blog post</a>,
+the write ahead logs and Accumulo metadata can be stored in HDFS and
everything else on Gen2 storage
+using the volume chooser feature introduced in Accumulo 2.0. The
configurations referred to in this post
+are specific to Accumulo 2.0 and Hadoop 3.2.0.</p>
+
+<h2 id="hadoop-setup">Hadoop setup</h2>
+
+<p>For the ABFS client to talk to Gen2 storage, it requires one of the
authentication mechanisms listed <a
href="https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html#Authentication">here</a>.
+This post covers <a
href="https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview">Azure
Managed Identity</a>,
+formerly known as Managed Service Identity or MSI. This feature provides Azure
services with an
+automatically managed identity in <a
href="https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-whatis">Azure
AD</a>
+and avoids the need to store credentials or other sensitive information
in code
+or configs/JCEKS. Plus, it comes free with Azure AD.</p>
+
+<p>At least the following should be added to Hadoop’s <code
class="highlighter-rouge">core-site.xml</code> on each
node.</p>
+
+<div class="language-xml highlighter-rouge"><div
class="highlight"><pre
class="highlight"><code><span
class="nt">&lt;property&gt;</span>
+ <span
class="nt">&lt;name&gt;</span>fs.azure.account.auth.type<span
class="nt">&lt;/name&gt;</span>
+ <span
class="nt">&lt;value&gt;</span>OAuth<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+<span class="nt">&lt;property&gt;</span>
+ <span
class="nt">&lt;name&gt;</span>fs.azure.account.oauth.provider.type<span
class="nt">&lt;/name&gt;</span>
+ <span
class="nt">&lt;value&gt;</span>org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+<span class="nt">&lt;property&gt;</span>
+ <span
class="nt">&lt;name&gt;</span>fs.azure.account.oauth2.msi.tenant<span
class="nt">&lt;/name&gt;</span>
+ <span
class="nt">&lt;value&gt;</span>TenantID<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+<span class="nt">&lt;property&gt;</span>
+ <span
class="nt">&lt;name&gt;</span>fs.azure.account.oauth2.client.id<span
class="nt">&lt;/name&gt;</span>
+ <span
class="nt">&lt;value&gt;</span>ClientID<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+</code></pre></div></div>
+
+<p>See <a
href="https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html">ABFS
doc</a>
+for more information on Hadoop Azure support.</p>
+
+<p>To get the hadoop command to work with ADLS Gen2, set the
+following entries in <code
class="highlighter-rouge">hadoop-env.sh</code>. Because Gen2
storage is TLS enabled by default,
+it is important to use the native OpenSSL implementation of TLS.</p>
+
+<div class="language-bash highlighter-rouge"><div
class="highlight"><pre
class="highlight"><code><span
class="nb">export </span><span
class="nv">HADOOP_OPTIONAL_TOOLS</span><span
class="o">=</span><span
class="s2">"hadoop-azure"</span>
+<span class="nb">export </span><span
class="nv">HADOOP_OPTS</span><span
class="o">=</span><span
class="s2">"-Dorg.wildfly.openssl.path=&lt;path/to/OpenSSL/libraries&gt;
</span><span class="k">${</span><span
class="nv">HADOOP_OPTS</span><span
class="k">}</span><span
class="s2">"</span>
+</code></pre></div></div>
+
+<p>To verify the location of the OpenSSL libraries, run the <code
class="highlighter-rouge">whereis libssl</code> command
+on the host.</p>
+
+<h2 id="accumulo-setup">Accumulo setup</h2>
+
+<p>For each node in the cluster, modify <code
class="highlighter-rouge">accumulo-env.sh</code> to add
Azure storage jars to the
+classpath. Your versions may differ depending on your Hadoop version;
+the following versions were included with Hadoop 3.2.0.</p>
+
+<div class="language-bash highlighter-rouge"><div
class="highlight"><pre
class="highlight"><code><span
class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">conf</span><span
class="k">}</span><span
class="s2">:</span&g [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nv">CLASSPATH</span><span
class="o">=</span><span
class="s2">"</span><span
class="k">${</span><span
class="nv">CLASSPATH</span><span
class="k">}</span><span
class="s2">:</span><span
class="k">${</span><span
class="nv">HADOOP_HOME</span><span
class="k">}</sp [...]
+<span class="nb">export </span>CLASSPATH
+</code></pre></div></div>
+
+<p>We tried adding <code
class="highlighter-rouge">-Dorg.wildfly.openssl.path</code>
to <code class="highlighter-rouge">JAVA_OPTS</code> in
<code class="highlighter-rouge">accumulo-env.sh</code>,
but it
+did not appear to work; this needs further investigation.</p>
+
+<p>Set the following in <code
class="highlighter-rouge">accumulo.properties</code> and
then run <code class="highlighter-rouge">accumulo
init</code>, but don’t start Accumulo.</p>
+
+<div class="language-ini highlighter-rouge"><div
class="highlight"><pre
class="highlight"><code><span
class="py">instance.volumes</span><span
class="p">=</span><span
class="s">hdfs://&lt;namenode&gt;/accumulo</span>
+</code></pre></div></div>
+
+<p>After running <code class="highlighter-rouge">accumulo init</code>, we need
to configure Accumulo to store write ahead logs in
+HDFS. Set the following in <code
class="highlighter-rouge">accumulo.properties</code>.</p>
+
+<div class="language-ini highlighter-rouge"><div
class="highlight"><pre
class="highlight"><code><span
class="py">instance.volumes</span><span
class="p">=</span><span
class="s">hdfs://&lt;namenode&gt;/accumulo,abfss://&lt;file_system&gt;@&lt;storage_account_name&gt;.dfs.core.windows.net/accumulo</span>
+<span class="py">general.volume.chooser</span><span
class="p">=</span><span
class="s">org.apache.accumulo.server.fs.PreferredVolumeChooser</span>
+<span
class="py">general.custom.volume.preferred.default</span><span
class="p">=</span><span
class="s">abfss://&lt;file_system&gt;@&lt;storage_account_name&gt;.dfs.core.windows.net/accumulo</span>
+<span
class="py">general.custom.volume.preferred.logger</span><span
class="p">=</span><span
class="s">hdfs://&lt;namenode&gt;/accumulo</span>
+</code></pre></div></div>
+
+<p>Run <code class="highlighter-rouge">accumulo init
--add-volumes</code> to initialize the ADLS Gen2 volume. Doing this
+in two steps avoids putting any Accumulo metadata files in Gen2 during init.
+Copy <code
class="highlighter-rouge">accumulo.properties</code> to all
nodes and start Accumulo.</p>
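+
+<p>As a quick sanity check before starting Accumulo, listing the new volume with the Hadoop
+client should succeed if the volume was initialized. This is only a sketch: it assumes the Hadoop
+client on the node already has ABFS authentication configured, and it reuses the placeholders
+from <code class="highlighter-rouge">accumulo.properties</code> above.</p>
+
+<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Assumption: ABFS credentials are already configured for this Hadoop client.
+hadoop fs -ls abfss://&lt;file_system&gt;@&lt;storage_account_name&gt;.dfs.core.windows.net/accumulo
+</code></pre></div></div>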
+
+<p>Individual tables can be configured to store their files in HDFS by
setting the
+table property <code
class="highlighter-rouge">table.custom.volume.preferred</code>.
Because the metadata table may split, set this property for
+it using the following Accumulo shell
command.</p>
+
+<div class="highlighter-rouge"><div
class="highlight"><pre
class="highlight"><code>config -t accumulo.metadata -s
table.custom.volume.preferred=hdfs://&lt;namenode&gt;/accumulo
+</code></pre></div></div>
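+
+<p>To confirm the setting took effect, the shell’s <code class="highlighter-rouge">config</code> command
+can filter the displayed properties by name with <code class="highlighter-rouge">-f</code>. A minimal check,
+assuming the same shell session as above:</p>
+
+<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>config -t accumulo.metadata -f table.custom.volume.preferred
+</code></pre></div></div>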
+
+<h2 id="accumulo-example">Accumulo example</h2>
+
+<p>The following Accumulo shell session shows an example of writing data
to Gen2 and
+reading it back. It also shows scanning the metadata table to verify the data
+is stored in Gen2.</p>
+
+<div class="highlighter-rouge"><div
class="highlight"><pre
class="highlight"><code>root@muchos&gt; createtable
gen2test
+root@muchos gen2test&gt; insert r1 f1 q1 v1
+root@muchos gen2test&gt; insert r1 f1 q2 v2
+root@muchos gen2test&gt; flush -w
+2019-10-16 08:01:00,564 [shell.Shell] INFO : Flush of table gen2test
completed.
+root@muchos gen2test&gt; scan
+r1 f1:q1 [] v1
+r1 f1:q2 [] v2
+root@muchos gen2test&gt; scan -t accumulo.metadata -c file
+4&lt;
file:abfss://&lt;file_system&gt;@&lt;storage_account_name&gt;.dfs.core.windows.net/accumulo/tables/4/default_tablet/F00000gj.rf
[] 234,2
+</code></pre></div></div>
+
+<p>These instructions should help you configure Accumulo to use Azure’s
Data Lake Gen2 Storage along with HDFS. With this setup,
+we were able to run the continuous ingest test successfully. Going forward,
we’ll experiment more in this space
+with ADLS Gen2 and update this blog as we go.</p>
+
+</description>
+ <pubDate>Tue, 15 Oct 2019 00:00:00 -0400</pubDate>
+
<link>https://accumulo.apache.org/blog/2019/10/15/accumulo-adlsgen2-notes.html</link>
+ <guid
isPermaLink="true">https://accumulo.apache.org/blog/2019/10/15/accumulo-adlsgen2-notes.html</guid>
+
+
+ <category>blog</category>
+
+ </item>
+
+ <item>
<title>Using HDFS Erasure Coding with Accumulo</title>
<description><p>HDFS normally stores multiple copies of each
file for both performance and durability reasons.
The number of copies is controlled via HDFS replication settings, and by
default is set to 3. Hadoop 3,
@@ -1125,104 +1257,5 @@ complete in this alpha release.</li>
</item>
- <item>
- <title>Apache Accumulo 1.9.2</title>
- <description><p>Apache Accumulo 1.9.2 contains fixes for
critical write-ahead log bugs.
-Users of any previous version of 1.8 or 1.9 are encouraged to upgrade
-immediately to avoid those issues.</p>
-
-<ul>
- <li><a href="/1.9/accumulo_user_manual.html">User
Manual</a> - In-depth developer and administrator documentation</li>
- <li><a href="/1.9/apidocs/">Javadocs</a> -
Accumulo 1.9 API</li>
- <li><a href="/1.9/examples/">Examples</a> - Code
with corresponding readme files that give step by
-step instructions for running example code</li>
-</ul>
-
-<h2 id="notable-changes">Notable Changes</h2>
-
-<h3
id="fixes-for-critical-wal-bugs-affects-versions-180-191">Fixes
for Critical WAL Bugs (affects versions 1.8.0-1.9.1)</h3>
-
-<p>Multiple bugs were fixed in 1.9.2 which affects the behavior of the
write-ahead
-log mechanism. These vary in significance, ranging from moderate to
critical.</p>
-
-<ul>
- <li><a
href="https://github.com/apache/accumulo/issues/537">#537</a>
- (Critical) Since 1.8.0, a bug existed which could cause some
-write-ahead logs to be removed (garbage collected) before Accumulo was
-finished with them. These removed logs could have contained important state
-tracking information. Without the state contained in these logs, some data
-in the remaining logs could have been replayed into a table when not needed.
-This could have reintroduced deleted data, or introduced duplicate data
-(which can interfere with combiners).</li>
- <li><a
href="https://github.com/apache/accumulo/issues/538">#538</a>
- (Moderate) A bug was introduced in 1.9.1 which resulted in some
-false positive IllegalStateExceptions to occur, preventing log
recovery.</li>
- <li><a
href="https://github.com/apache/accumulo/issues/539">#539</a>
- (Moderate) Since 1.8.0, a race condition existed which could cause a log
-file which contains data to be recovered to not be recorded, thus making it
-invisible to recovery, if a tserver died within a very small window. <a
href="https://github.com/apache/accumulo/issues/559">#559</a>
- fixes this issue and may also fix a 1.9.1 deadlock caused by the fix for
<a
href="https://github.com/apache/accumulo/issues/441">#441</a>.</li>
-</ul>
-
-<p>Even if you primarily use bulk ingest, Accumulo’s own metadata tables
can be
-affected by these bugs, causing unexpected behavior after an otherwise routine
-and recoverable server failure. As such, these bugs should be a concern to all
-users.</p>
-
-<h3
id="fixes-for-concurrency-bugs-gathering-table-information-affects-180-191">Fixes
for concurrency bugs gathering table information (affects
1.8.0-1.9.1)</h3>
-
-<p>Bugs were found with the <code
class="highlighter-rouge">master.status.threadpool.size</code>
property. If this
-property were set to a value other than <code
class="highlighter-rouge">1</code>, it could cause 100% CPU,
hanging,
-or <code
class="highlighter-rouge">ConcurrentModificationException</code>s.</p>
-
-<p>These bugs were fixed in <a
href="https://github.com/apache/accumulo/issues/546">#546</a>.</p>
-
-<h3 id="caching-of-file-lengths">Caching of file
lengths</h3>
-
-<p>RFiles stores metadata at the end of file. When opening a rfile
Accumulo
-seeks to the end and reads metadata. To do this seek the file length is
needed.
-Before opening a file its length is requested from the namenode. This can
-add more load to a busy namenode. To alleviate this, a small cache of file
lengths was
-added in <a
href="https://github.com/apache/accumulo/issues/467">#467</a>.</p>
-
-<h3 id="monitor-time-unit-bug">Monitor time unit bug</h3>
-
-<p>A bug was found in the monitor which displayed time durations (for
example,
-those pertaining to bulk imports) in incorrect time units.</p>
-
-<p>This bug was fixed in <a
href="https://github.com/apache/accumulo/issues/553">#553</a>.</p>
-
-<h2 id="other-changes">Other Changes</h2>
-
-<ul>
- <li><a
href="https://github.com/apache/accumulo/issues?q=project%3Aapache%2Faccumulo%2F6">GitHub</a>
- List of issues tracked on GitHub corresponding to this release</li>
- <li><a href="/release/accumulo-1.9.1/">1.9.1 release
notes</a> - Release notes showing changes in the previous
release</li>
-</ul>
-
-<h2 id="upgrading">Upgrading</h2>
-
-<p>View the <a
href="/docs/2.x/administration/upgrading">Upgrading Accumulo
documentation</a> for guidance.</p>
-
-<h2 id="testing">Testing</h2>
-
-<ul>
- <li>All ITs passed with Hadoop 3.0.0 (hadoop.profile=3)</li>
- <li>All ITs passed with Hadoop 2.6.4 (hadoop.profile=2)</li>
- <li>Ran 3 continuous ingesters successfully for 24 hours on a 10 node
cluster
-with agitation and pausing. Verification for all 3 tests was
successful.</li>
- <li>Ran continuous ingest for 24 hours and verified without agitation
on a 10
-node cluster.</li>
- <li>Tested <a href="https://fluo.apache.org">Apache
Fluo</a> build and ITs passed against this version.</li>
- <li>Ran a single-node cluster with <a
href="https://github.com/apache/fluo-uno">Uno</a> and
created a table, ingested data,
-flushed, compacted, scanned, and deleted the table.</li>
-</ul>
-
-</description>
- <pubDate>Thu, 19 Jul 2018 00:00:00 -0400</pubDate>
- <link>https://accumulo.apache.org/release/accumulo-1.9.2/</link>
- <guid
isPermaLink="true">https://accumulo.apache.org/release/accumulo-1.9.2/</guid>
-
-
- <category>release</category>
-
- </item>
-
</channel>
</rss>
diff --git a/index.html b/index.html
index 303d3ec..9baac91 100644
--- a/index.html
+++ b/index.html
@@ -150,9 +150,9 @@
<h3>Major Features</h3>
<div class="row">
- <div class="col-md-6">
+ <div class="col-md-6">
<h4>Server-side programming</h4>
- <p>Accumulo has a programing meachinism (called <a
href="/docs/2.x/development/iterators">Iterators</a>) that can modify key/value
pairs at various points in the data management process.</p>
+ <p>Accumulo has a programming mechanism (called <a
href="/docs/2.x/development/iterators">Iterators</a>) that can modify key/value
pairs at various points in the data management process.</p>
</div>
<div class="col-md-6">
<h4>Cell-based access control</h4>
@@ -160,7 +160,7 @@
</div>
</div>
<div class="row">
- <div class="col-md-6">
+ <div class="col-md-6">
<h4>Designed to scale</h4>
<p>Accumulo runs on a cluster using <a
href="/docs/2.x/administration/multivolume">one or more HDFS instances</a>.
Nodes can be added or removed as the amount of data stored in Accumulo
changes.</p>
</div>
@@ -178,6 +178,13 @@
<div class="row latest-news-item">
<div class="col-sm-12" style="margin-bottom: 5px">
+ <span style="font-size: 12px; margin-right: 5px;">Oct 2019</span>
+ <a href="/blog/2019/10/15/accumulo-adlsgen2-notes.html">Using Azure
Data Lake Gen2 storage as a data store for Accumulo</a>
+ </div>
+ </div>
+
+ <div class="row latest-news-item">
+ <div class="col-sm-12" style="margin-bottom: 5px">
<span style="font-size: 12px; margin-right: 5px;">Sep 2019</span>
<a href="/blog/2019/09/17/erasure-coding.html">Using HDFS Erasure
Coding with Accumulo</a>
</div>
@@ -204,13 +211,6 @@
</div>
</div>
- <div class="row latest-news-item">
- <div class="col-sm-12" style="margin-bottom: 5px">
- <span style="font-size: 12px; margin-right: 5px;">Apr 2019</span>
- <a href="/blog/2019/04/24/using-spark-with-accumulo.html">Using
Apache Spark with Accumulo</a>
- </div>
- </div>
-
<div id="news-archive-link">
<p>View all posts in the <a href="/news">news archive</a></p>
</div>
diff --git a/news/index.html b/news/index.html
index f63e7b1..c6207dd 100644
--- a/news/index.html
+++ b/news/index.html
@@ -147,6 +147,13 @@
<div class="row" style="margin-top: 15px">
+ <div class="col-md-1">Oct 15</div>
+ <div class="col-md-10"><a
href="/blog/2019/10/15/accumulo-adlsgen2-notes.html">Using Azure Data Lake Gen2
storage as a data store for Accumulo</a></div>
+ </div>
+
+
+
+ <div class="row" style="margin-top: 15px">
<div class="col-md-1">Sep 17</div>
<div class="col-md-10"><a
href="/blog/2019/09/17/erasure-coding.html">Using HDFS Erasure Coding with
Accumulo</a></div>
</div>
diff --git a/redirects.json b/redirects.json
index 9c19363..6e10199 100644
--- a/redirects.json
+++ b/redirects.json
@@ -1 +1 @@
-{"/release_notes/1.5.1.html":"https://accumulo.apache.org/release/accumulo-1.5.1/","/release_notes/1.6.0.html":"https://accumulo.apache.org/release/accumulo-1.6.0/","/release_notes/1.6.1.html":"https://accumulo.apache.org/release/accumulo-1.6.1/","/release_notes/1.6.2.html":"https://accumulo.apache.org/release/accumulo-1.6.2/","/release_notes/1.7.0.html":"https://accumulo.apache.org/release/accumulo-1.7.0/","/release_notes/1.5.3.html":"https://accumulo.apache.org/release/accumulo-1.5.3/"
[...]
\ No newline at end of file
+{"/release_notes/1.5.1.html":"https://accumulo.apache.org/release/accumulo-1.5.1/","/release_notes/1.6.0.html":"https://accumulo.apache.org/release/accumulo-1.6.0/","/release_notes/1.6.1.html":"https://accumulo.apache.org/release/accumulo-1.6.1/","/release_notes/1.6.2.html":"https://accumulo.apache.org/release/accumulo-1.6.2/","/release_notes/1.7.0.html":"https://accumulo.apache.org/release/accumulo-1.7.0/","/release_notes/1.5.3.html":"https://accumulo.apache.org/release/accumulo-1.5.3/"
[...]
\ No newline at end of file
diff --git a/search_data.json b/search_data.json
index 90480bd..f559396 100644
--- a/search_data.json
+++ b/search_data.json
@@ -295,6 +295,14 @@
},
+ "blog-2019-10-15-accumulo-adlsgen2-notes-html": {
+ "title": "Using Azure Data Lake Gen2 storage as a data store for
Accumulo",
+ "content" : "Accumulo can store its files in Azure Data Lake
Storage Gen2 using the ABFS (Azure Blob File System) driver. Similar to S3 blog,
the write ahead logs &amp; Accumulo metadata can be stored in HDFS and
everything else on Gen2 storage using the volume chooser feature introduced in
Accumulo 2.0. The configurations referred on this blog are specific to Accumulo
2.0 and Hadoop 3.2.0. Hadoop setup For ABFS client to talk to Gen2 storage, it
requires one of the Authentication m [...]
+ "url": " /blog/2019/10/15/accumulo-adlsgen2-notes.html",
+ "categories": "blog"
+ }
+ ,
+
"blog-2019-09-17-erasure-coding-html": {
"title": "Using HDFS Erasure Coding with Accumulo",
"content" : "HDFS normally stores multiple copies of each file
for both performance and durability reasons. The number of copies is controlled
via HDFS replication settings, and by default is set to 3. Hadoop 3, introduced
the use of erasure coding (EC), which improves durability while decreasing
overhead.Since Accumulo 2.0 now supports Hadoop 3, it’s time to take a look at
whether usingEC with Accumulo makes sense. EC Intro EC Performance Accumulo
Performance with ECEC IntroBy [...]