[26/36] accumulo git commit: Jekyll build from gh-pages:358b7b4

mwalch Thu, 10 Nov 2016 13:38:23 -0800

http://git-wip-us.apache.org/repos/asf/accumulo/blob/c0655661/feed.xml
----------------------------------------------------------------------
diff --git a/feed.xml b/feed.xml
index c133379..f7e080d 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
 </description>
     <link>https://accumulo.apache.org/</link>
     <atom:link href="https://accumulo.apache.org/feed.xml" rel="self" 
type="application/rss+xml"/>
-    <pubDate>Thu, 03 Nov 2016 12:05:50 -0400</pubDate>
-    <lastBuildDate>Thu, 03 Nov 2016 12:05:50 -0400</lastBuildDate>
+    <pubDate>Thu, 10 Nov 2016 16:36:52 -0500</pubDate>
+    <lastBuildDate>Thu, 10 Nov 2016 16:36:52 -0500</lastBuildDate>
     <generator>Jekyll v3.2.1</generator>
     
       <item>
@@ -78,9 +78,9 @@ and checksums were stored in-line, then 1 sync could be done 
instead of 4.&lt;/p
 
 &lt;h2 id=&quot;configuring-wal-flushsync-in-accumulo-16&quot;&gt;Configuring 
WAL flush/sync in Accumulo 1.6&lt;/h2&gt;
 
-&lt;p&gt;Accumulo 1.6.0 only supported &lt;code 
class=&quot;highlighter-rouge&quot;&gt;hsync&lt;/code&gt; and this caused &lt;a 
href=&quot;/release_notes/1.6.0#slower-writes-than-previous-accumulo-versions&quot;&gt;performance
+&lt;p&gt;Accumulo 1.6.0 only supported &lt;code 
class=&quot;highlighter-rouge&quot;&gt;hsync&lt;/code&gt; and this caused &lt;a 
href=&quot;/release/accumulo-1.6.0#slower-writes-than-previous-accumulo-versions&quot;&gt;performance
 problems&lt;/a&gt;.  In order to offer better performance, the option to
-configure &lt;code class=&quot;highlighter-rouge&quot;&gt;hflush&lt;/code&gt; 
was &lt;a 
href=&quot;/release_notes/1.6.1#write-ahead-log-sync-implementation&quot;&gt;added
 in 1.6.1&lt;/a&gt;.  The
+configure &lt;code class=&quot;highlighter-rouge&quot;&gt;hflush&lt;/code&gt; 
was &lt;a 
href=&quot;/release/accumulo-1.6.1#write-ahead-log-sync-implementation&quot;&gt;added
 in 1.6.1&lt;/a&gt;.  The
 &lt;a 
href=&quot;/1.6/accumulo_user_manual#_tserver_wal_sync_method&quot;&gt;tserver.wal.sync.method&lt;/a&gt;
 configuration option was added to support
 this feature.  This was a tablet server wide option that applied to everything
 written to any table.&lt;/p&gt;
@@ -161,7 +161,7 @@ config -t accumulo.root -d table.durability
 
 &lt;p&gt;Even with these settings adjusted, minor compactions could still 
force &lt;code class=&quot;highlighter-rouge&quot;&gt;hsync&lt;/code&gt;
 to be called in 1.7.0 and 1.7.1.  This was fixed in 1.7.2 and 1.8.0.  See the
-&lt;a 
href=&quot;/release_notes/1.7.2#minor-performance-improvements&quot;&gt;1.7.2 
release notes&lt;/a&gt; and &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4112&quot;&gt;ACCUMULO-4112&lt;/a&gt;
 for more details.&lt;/p&gt;
+&lt;a 
href=&quot;/release/accumulo-1.7.2#minor-performance-improvements&quot;&gt;1.7.2
 release notes&lt;/a&gt; and &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4112&quot;&gt;ACCUMULO-4112&lt;/a&gt;
 for more details.&lt;/p&gt;
 
 &lt;p&gt;In addition to the per table durability setting, a per batch writer 
durability
 setting was also added in 1.7.0.  See
@@ -186,585 +186,1302 @@ problems with Per-durability write ahead 
logs.&lt;/p&gt;
       </item>
     
       <item>
-        <title>Replicating data across Accumulo clusters</title>
-        <description>&lt;p&gt;Originally posted at &lt;a 
href=&quot;https://blogs.apache.org/accumulo/entry/data_center_replication&quot;&gt;https://blogs.apache.org/accumulo/entry/data_center_replication&lt;/a&gt;&lt;/p&gt;
+        <title>Apache Accumulo 1.6.6</title>
+        <description>&lt;p&gt;Apache Accumulo 1.6.6 is a maintenance release 
on the 1.6 version branch. This
+release contains changes from more than 40 issues, comprised of bug-fixes,
+performance improvements, build quality improvements, and more. See
+&lt;a 
href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12312121&amp;amp;version=12334846&quot;&gt;JIRA&lt;/a&gt;
 for a complete list.&lt;/p&gt;
 
-&lt;p&gt;Traditionally, Apache Accumulo can only operate within the confines 
of a single physical location. The primary reason for this restriction is that 
Accumulo relies heavily on Apache ZooKeeper for distributed lock management and 
some distributed state. Due to the consistent nature of ZooKeeper and its 
protocol, it doesn’t handle wide-area networks (WAN) well. As such, Accumulo 
suffers the same problems operating over a WAN.&lt;/p&gt;
+&lt;p&gt;Below are resources for this release:&lt;/p&gt;
 
-&lt;p&gt;Data-Center Replication is a new feature, to be included in the 
upcoming Apache Accumulo 1.7.0, which aims to address the limitation of 
Accumulo to one local-area network (LAN). The implementation makes a number of 
decisions with respect to consistency and availability which aim to avoid the 
normal “local” operations of the primary Accumulo instance. That is to say, 
replication was designed in such a way that enabling the feature on an instance 
should not affect the performance of that system. However, this comes at a cost 
of consistency across all replicas. Replication from one instance to others is 
performed lazily. Succinctly, replication in Accumulo can be described as an 
eventually-consistent system and not a strongly-consistent system (an Accumulo 
instance is strongly-consistent).&lt;/p&gt;
+&lt;ul&gt;
+  &lt;li&gt;&lt;a href=&quot;/1.6/accumulo_user_manual.html&quot;&gt;User 
Manual&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;/1.6/apidocs&quot;&gt;Javadocs&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;/1.6/examples&quot;&gt;Examples&lt;/a&gt;&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;p&gt;Users of any previous 1.6.x release are strongly encouraged to update 
as soon
+as possible to benefit from the improvements with very little concern in change
+of underlying functionality.&lt;/p&gt;
+
+&lt;p&gt;As of this release, active development has ceased for the 1.6 release 
line, so
+users should consider upgrading to a newer, actively maintained version when
+they can. While the developers may release another 1.6 version to address a
+severe issue, there’s a strong possibility that this will be the last 1.6
+release. That would also mean that this will be the last Accumulo version to
+support Java 6 and Hadoop 1.&lt;/p&gt;
 
-&lt;p&gt;Because replication is performed lazily, this implies that the data 
to replicate must be persisted in some shape until the actual replication takes 
place. This is done using Accumulo’s write-ahead log (WAL) files for this 
purpose. The append-only nature of these files make them obvious candidates for 
reuse without the need to persist the data in another form for replication. The 
only necessary changes internally to Accumulo to support this is changing the 
conditions that the Accumulo garbage collector will delete WAL files. Using WAL 
files also has the benefit of making HDFS capacity the limiting factor in how 
“lazy” replication can be. This means that the amount of time replication 
can be offline or stalled is only limited by the amount of extra HDFS space 
available which is typically ample.&lt;/p&gt;
+&lt;h2 id=&quot;highlights&quot;&gt;Highlights&lt;/h2&gt;
 
-&lt;p&gt;&lt;img 
src=&quot;/images/blog/201504_replication/replication1.png&quot; 
alt=&quot;image1&quot; /&gt;&lt;/p&gt;
+&lt;h3 
id=&quot;write-ahead-logs-can-be-prematurely-deleted&quot;&gt;Write-Ahead Logs 
can be prematurely deleted&lt;/h3&gt;
 
-&lt;h2 id=&quot;terminology&quot;&gt;Terminology&lt;/h2&gt;
+&lt;p&gt;There were cases where the Accumulo Garbage Collector could 
inadvertently delete
+a WAL for a tablet server that it has erroneously determined to be down,
+causing data loss. This has been corrected. See &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4157&quot;&gt;ACCUMULO-4157&lt;/a&gt;
+for additional detail.&lt;/p&gt;
 
-&lt;p&gt;Before getting into details on the feature, it will help to define 
some basic terminology. Data in Accumulo is replicated from a “primary” 
Accumulo instance to a “peer” Accumulo instance. Each instance here is a 
normal Accumulo instance – each instance is only differentiated by a few new 
configuration values. Users ingest data into the primary instance, and that 
data will eventually be replicated to a peer. Each instance requires a unique 
name to identify itself among all Accumulo instances replicating with each 
other. Replication from a primary to a peer is defined on a per-table basis – 
that is, the configuration states that tableA on the primary will be replicated 
to tableB on the peer. A primary can have multiple peers defined, e.g. tableA 
on the primary will be replicated to tableB on peer1 and tableC on peer2.
- Overview&lt;/p&gt;
+&lt;h3 id=&quot;upgrade-to-commons-vfs-21&quot;&gt;Upgrade to Commons-VFS 
2.1&lt;/h3&gt;
 
-&lt;p&gt;Internally, replication is comprised of a few components to make up 
the user-facing feature: the management of data ingested on the primary which 
needs to be replicated, the assignment of replication work within the primary, 
the execution of that work within the primary to send the data to a peer, and 
the application of the data to the appropriate table within the peer.&lt;/p&gt;
+&lt;p&gt;Upgrading to Apache Commons VFS 2.1 fixes several issues with 
classloading out
+of HDFS. For further detail see &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4146&quot;&gt;ACCUMULO-4146&lt;/a&gt;.
 Additional
+fixes to a potential HDFS class loading deadlock situation were made in
+&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4341&quot;&gt;ACCUMULO-4341&lt;/a&gt;.&lt;/p&gt;
 
-&lt;h3 id=&quot;state-management-on-primary&quot;&gt;State Management on 
Primary&lt;/h3&gt;
+&lt;h3 
id=&quot;native-map-failed-to-increment-mutation-count-properly&quot;&gt;Native 
Map failed to increment mutation count properly&lt;/h3&gt;
 
-&lt;p&gt;The most important state to manage for replication is tracking 
the data that was ingested in the primary. This is what ensures that all of the 
data will be eventually replicated to the necessary peer(s). This state is kept 
in both the Accumulo metadata table and a new table in the accumulo namespace: 
replication. Through the use of an Accumulo Combiner on these tables, updates 
to the replication state are simple updates to the replication table. This 
makes management of the state machine across all of the nodes within the 
Accumulo instance extremely simple. For example, TabletServers reporting that 
data was ingested into a write-ahead log, the Master preparing data to be 
replicated and the TabletServer reporting that data has been replicated to the 
peer are all updates to the replication table.&lt;/p&gt;
+&lt;p&gt;There was a bug (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4148&quot;&gt;ACCUMULO-4148&lt;/a&gt;)
 where multiple put calls with
+identical keys and no timestamp would exhibit different behaviour depending on
+whether native maps were enabled or not. This behaviour would result in hidden
+mutations with native maps, and has been corrected.&lt;/p&gt;
 
-&lt;p&gt;To “seed” the state machine, TabletServers first write to the 
metadata table at the end of a minor compaction. The Master will read records 
from the metadata table and add them to the replication table. Each Key-Value 
pair in the replication table represents a WAL’s current state within the 
replication “state machine” with different column families representing 
different states. For example, one column family represents the status of a WAL 
file being replicated to a specific peer while a different column family 
represents the status of a WAL file being replicated to all necessary 
peers.&lt;/p&gt;
+&lt;h3 
id=&quot;open-wal-files-could-prevent-datanode-decomission&quot;&gt;Open WAL 
files could prevent DataNode decommission&lt;/h3&gt;
 
-&lt;p&gt;The Master is the primary driver of this state machine, reading the 
replication table and making the necessary updates repeatedly. This allows the 
Master to maintain a constant amount of memory with respect to the amount of 
data that needs to be replicated. The only limitation on persisted state for 
replication is the size of the replication table itself and the amount of space 
the WAL files on HDFS consume.&lt;/p&gt;
+&lt;p&gt;An improvement was introduced to allow a max age before WAL files 
would be
+automatically rolled. Without a max age, they could stay open for writing
+indefinitely, blocking the Hadoop DataNode decommissioning process. For more
+information, see &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4004&quot;&gt;ACCUMULO-4004&lt;/a&gt;.&lt;/p&gt;
 
-&lt;h3 id=&quot;rpc-from-primary-to-peer&quot;&gt;RPC from primary to 
peer&lt;/h3&gt;
+&lt;h3 
id=&quot;remove-unnecessary-copy-of-cached-rfile-index-blocks&quot;&gt;Remove 
unnecessary copy of cached RFile index blocks&lt;/h3&gt;
 
-&lt;p&gt;Like the other remote procedure calls in Accumulo, Apache Thrift is 
used to make RPCs from the primary Accumulo instance to a peer instance. The 
purpose of these methods is to send the relevant data from a WAL file to the 
peer. The Master advertises units of replication work, a WAL file that needs to 
be replicated to a single peer, and all TabletServers in the primary instance 
will try to reserve, and then perform, that work. ZooKeeper provides this 
feature to us with very little code in Accumulo.&lt;/p&gt;
+&lt;p&gt;Accumulo maintains an in-memory cache for file blocks as a performance
+optimization. This can be done safely because Accumulo RFiles are immutable,
+thus their blocks are also immutable. There are two types of these blocks:
+index and data blocks. Index blocks refer to the b-tree style index inside of
+each Accumulo RFile, while data blocks contain the sorted Key-Value pairs. In
+previous versions, when Accumulo extracted an Index block from the in-memory
+cache, it would copy the data. &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4164&quot;&gt;ACCUMULO-4164&lt;/a&gt;
 removes this
+unnecessary copy as the contents are immutable and can be passed by reference.
+Ensuring that the Index blocks are not copied when accessed from the cache is a
+big performance gain at the file-access level.&lt;/p&gt;
 
-&lt;p&gt;Once a TabletServer obtains the work, it will read through the WAL 
file extracting updates only for the table in this unit of work and send the 
updates across the wire to a TabletServer in the peer. The TabletServer on the 
primary asks the active Master in the peer for a TabletServer to communicate 
with. As such, ignoring some very quick interactions with the Master, RPC for 
replication is primarily a TabletServer to TabletServer operation which means 
that replication should scale in performance with respect to the number of 
available TabletServers on the primary and peer.&lt;/p&gt;
+&lt;h3 
id=&quot;analyze-key-length-to-avoid-choosing-large-keys-for-rfile-index-blocks&quot;&gt;Analyze
 Key-length to avoid choosing large Keys for RFile Index blocks&lt;/h3&gt;
 
-&lt;p&gt;The amount of data read from a WAL and sent to the peer per RPC is a 
configurable parameter defaulting to 50MB. Increasing the amount of data read 
at a time will have a large impact on the amount of memory consumed by a 
TabletServer when using replication, so take care when altering this property. 
It is also important to note that the Thrift server used for the purposes of 
replication is completely separate from the thrift server used by clients. 
Replication and the client service servers will not compete against one another 
for RPC resources.&lt;/p&gt;
+&lt;p&gt;Accumulo’s RFile index blocks are made up of a Key which exists in 
the file and
+points to that specific location in the corresponding RFile data block. Thus,
+the size of the RFile index blocks is largely dominated by the size of the Keys
+which are used by the index. &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4314&quot;&gt;ACCUMULO-4314&lt;/a&gt;
 is an improvement
+which uses statistics on the length of the Keys in the RFile to avoid choosing
+Keys for the index whose length is greater than three standard deviations for
+the RFile. By choosing smaller Keys for the index, Accumulo can access the
+RFile index faster and keep more Index blocks cached in memory. Initial tests
+showed that with this change, the RFile index size was nearly cut in 
half.&lt;/p&gt;
 
-&lt;h3 id=&quot;replay-of-data-on-peer&quot;&gt;Replay of data on 
peer&lt;/h3&gt;
+&lt;h3 id=&quot;gson-version-bump&quot;&gt;Gson version bump&lt;/h3&gt;
 
-&lt;p&gt;After a TabletServer on the primary invokes an RPC to a TabletServer 
on the peer, but before that RPC completes, the TabletServer on the peer must 
apply the updates it received to the local table. The TabletServer on the peer 
constructs a BatchWriter and simply applies the updates to the table. In the 
event of an error in writing the data, the RPC will return in error and it will 
be retried by a TabletServer on the primary. As such, in these failure 
conditions, it is possible that data will be applied on the peer multiple 
times. The use of Accumulo Combiners on tables being replicated is nearly 
always a bad idea which will result in inconsistencies between the primary and 
replica.&lt;/p&gt;
+&lt;p&gt;Due to an &lt;a 
href=&quot;https://github.com/google/gson/issues/362&quot;&gt;upstream bug with 
Gson 2.2.2&lt;/a&gt;, we’ve bumped our bundled
+dependency (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4345&quot;&gt;ACCUMULO-4345&lt;/a&gt;)
 to version 2.2.4. Please take note
+of this when you upgrade, if you were using the version shipped with Accumulo,
+and were relying on the buggy behavior in the previous version in your own
+code.&lt;/p&gt;
 
-&lt;p&gt;Because there are many TabletServers, each with their own 
BatchWriter, potential throughput for replication on the peer should be 
equivalent to the ingest throughput observed by clients normally ingesting data 
uniformly into Accumulo.&lt;/p&gt;
+&lt;h3 id=&quot;minor-performance-improvements&quot;&gt;Minor performance 
improvements.&lt;/h3&gt;
 
-&lt;p&gt;&lt;img 
src=&quot;/images/blog/201504_replication/replication2.png&quot; 
alt=&quot;image2&quot; /&gt;&lt;/p&gt;
+&lt;p&gt;A performance issue was identified and corrected
+(&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-1755&quot;&gt;ACCUMULO-1755&lt;/a&gt;)
 where the BatchWriter would block calls to
+addMutation while looking up destination tablet server metadata. The writer has
+been fixed to allow both operations in parallel.&lt;/p&gt;
+
+&lt;h2 id=&quot;other-notable-changes&quot;&gt;Other Notable Changes&lt;/h2&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4155&quot;&gt;ACCUMULO-4155&lt;/a&gt;
 No longer publish javadoc for non-public API
+to website. (Still available in javadoc jars in maven)&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4334&quot;&gt;ACCUMULO-4334&lt;/a&gt;
 Ingest rates reported through JMX did not
+match rates reported by Monitor.&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4335&quot;&gt;ACCUMULO-4335&lt;/a&gt;
 Error conditions that result in a Halt should
+ensure non-zero process exit code.&lt;/li&gt;
+&lt;/ul&gt;
 
-&lt;h2 id=&quot;complex-replication-configurations&quot;&gt;Complex 
replication configurations&lt;/h2&gt;
+&lt;h2 id=&quot;testing&quot;&gt;Testing&lt;/h2&gt;
+
+&lt;p&gt;Each unit and functional test only runs on a single node, while the 
RandomWalk
+and Continuous Ingest tests run on any number of nodes. 
&lt;em&gt;Agitation&lt;/em&gt; refers to
+randomly restarting Accumulo processes and Hadoop Datanode processes, and, in
+HDFS High-Availability instances, forcing NameNode failover.&lt;/p&gt;
+
+&lt;table id=&quot;release_notes_testing&quot; class=&quot;table&quot;&gt;
+  &lt;thead&gt;
+    &lt;tr&gt;
+      &lt;th&gt;OS/Environment&lt;/th&gt;
+      &lt;th&gt;Hadoop&lt;/th&gt;
+      &lt;th&gt;Nodes&lt;/th&gt;
+      &lt;th&gt;ZooKeeper&lt;/th&gt;
+      &lt;th&gt;HDFS HA&lt;/th&gt;
+      &lt;th&gt;Tests&lt;/th&gt;
+    &lt;/tr&gt;
+  &lt;/thead&gt;
+  &lt;tbody&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS 7&lt;/td&gt;
+      &lt;td&gt;1.2.1&lt;/td&gt;
+      &lt;td&gt;1&lt;/td&gt;
+      &lt;td&gt;3.3.6&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;Unit tests and Integration Tests&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS 7&lt;/td&gt;
+      &lt;td&gt;2.2.0&lt;/td&gt;
+      &lt;td&gt;1&lt;/td&gt;
+      &lt;td&gt;3.3.6&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;Unit tests and Integration Tests&lt;/td&gt;
+    &lt;/tr&gt;
+  &lt;/tbody&gt;
+&lt;/table&gt;
+
+</description>
+        <pubDate>Sun, 18 Sep 2016 00:00:00 -0400</pubDate>
+        <link>https://accumulo.apache.org/release/accumulo-1.6.6/</link>
+        <guid 
isPermaLink="true">https://accumulo.apache.org/release/accumulo-1.6.6/</guid>
+        
+        
+        <category>release</category>
+        
+      </item>
+    
+      <item>
+        <title>Apache Accumulo 1.8.0</title>
+        <description>&lt;p&gt;Apache Accumulo 1.8.0 is a significant release 
that includes many important
+milestone features which expand the functionality of Accumulo. These include
+features related to security, availability, and extensibility. Over
+350 JIRA issues were resolved in this version. This includes over
+200 bug fixes, 71 improvements, and 4 new features. See &lt;a 
href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12312121&amp;amp;version=12329879&quot;&gt;JIRA&lt;/a&gt;
+for the complete list.&lt;/p&gt;
 
-&lt;p&gt;So far, we’ve only touched on configurations which have a single 
primary and one to many peers; however, the feature allows multiple primary 
instances in addition to multiple peers. This primary-primary configuration 
allows data to be replicated in both directions instead of just one. This can 
be extended even further to allow replication between a trio of instances: 
primaryA replicates to primaryB which replicates to primaryC which replicates 
to primaryA. This aspect is supported by including provenance of which systems 
an update was seen inside of each Mutation. In “cyclic” replication setups, 
this prevents updates from being replicated indefinitely.&lt;/p&gt;
+&lt;p&gt;Below are resources for this release:&lt;/p&gt;
 
-&lt;p&gt;Supporting these cycles allows for different collections of users to 
access physically separated instances and eventually see the changes made by 
other groups. For example, consider two instances of Accumulo, one in New York 
City and another in San Francisco. Users on the west coast can use the San 
Francisco instance while users on the east coast can use the instance in New 
York. With the two instances configured to replicate to each other, data 
created by east coast users will eventually be seen by west coast users and 
vice versa.&lt;/p&gt;
+&lt;ul&gt;
+  &lt;li&gt;&lt;a href=&quot;/1.8/accumulo_user_manual.html&quot;&gt;User 
Manual&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;/1.8/apidocs&quot;&gt;Javadocs&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;/1.8/examples&quot;&gt;Examples&lt;/a&gt;&lt;/li&gt;
+&lt;/ul&gt;
 
-&lt;p&gt;&lt;img 
src=&quot;/images/blog/201504_replication/replication3.png&quot; 
alt=&quot;image3&quot; /&gt;&lt;/p&gt;
+&lt;p&gt;In the context of Accumulo’s &lt;a 
href=&quot;http://semver.org&quot;&gt;Semantic Versioning&lt;/a&gt; &lt;a 
href=&quot;https://github.com/apache/accumulo/blob/1.8/README.md#api&quot;&gt;guidelines&lt;/a&gt;,
+this is a “minor version”. This means that new APIs have been created, some
+deprecations may have been added, but no deprecated APIs have been removed.
+Code written against 1.7.x should work against 1.8.0 – binary compatibility
+has been preserved with one exception of an already-deprecated Mock Accumulo
+utility class. As always, the Accumulo developers take API compatibility
+very seriously and have invested much time to ensure that we meet the promises 
set forth to our users.&lt;/p&gt;
+
+&lt;h2 id=&quot;major-changes&quot;&gt;Major Changes&lt;/h2&gt;
+
+&lt;h3 id=&quot;speed-up-wal-roll-overs&quot;&gt;Speed up WAL roll 
overs&lt;/h3&gt;
+
+&lt;p&gt;Performance of writing mutations is improved by refactoring the
+bookkeeping required for Write-Ahead Log (WAL) files and by creating a
+standby WAL for faster switching when the log is full. This was a
+substantial refactor of the way WALs work, but it smooths overall
+ingest performance and provides an increase in write speed,
+as shown by the simple test below. The top graph is before
+&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3423&quot;&gt;ACCUMULO-3423&lt;/a&gt;
 and the bottom graph is after the
+refactor.&lt;/p&gt;
+
+&lt;p&gt;&lt;img 
src=&quot;https://issues.apache.org/jira/secure/attachment/12705402/WAL-slowdown-graphs.jpg&quot;
 alt=&quot;Graph of WAL speed up after ACCUMULO-3423&quot; title=&quot;Graph of 
WAL speed up after ACCUMULO-3423&quot; /&gt;&lt;/p&gt;
+
+&lt;h3 id=&quot;user-level-api-for-rfile&quot;&gt;User level API for 
RFile&lt;/h3&gt;
+
+&lt;p&gt;Previously the only public API available to write RFiles was via the 
AccumuloFileOutputFormat. There was no way to read RFiles in the public
+API. &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4165&quot;&gt;ACCUMULO-4165&lt;/a&gt;
 exposes a brand new public &lt;a 
href=&quot;../1.8/apidocs/org/apache/accumulo/core/client/rfile/RFile.html&quot;&gt;API&lt;/a&gt;
 for reading and writing RFiles as well as cleans up some of the internal 
APIs.&lt;/p&gt;
+
+&lt;h3 
id=&quot;suspend-tablet-assignment-for-rolling-restarts&quot;&gt;Suspend Tablet 
assignment for rolling restarts&lt;/h3&gt;
+
+&lt;p&gt;Previously, when a tablet server died, Accumulo always attempted to reassign the tablets 
as quickly as possible to maintain availability.
+A new configuration property &lt;code 
class=&quot;highlighter-rouge&quot;&gt;table.suspend.duration&lt;/code&gt; 
(with a default of zero seconds) now controls how long to wait before 
reassigning
+a tablet from a dead tserver. The property is configurable via the
+Accumulo shell, so you can set it, do a rolling restart, and then
+set it back to 0. A new state was introduced, TableState.SUSPENDED, to support 
this feature. By default, metadata tablet
+reassignment is not suspended, but that can also be changed with the &lt;code 
class=&quot;highlighter-rouge&quot;&gt;master.metadata.suspendable&lt;/code&gt; 
property that is false by
+default. Root tablet assignment cannot be suspended. See &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4353&quot;&gt;ACCUMULO-4353&lt;/a&gt;
 for more info.&lt;/p&gt;
+
+&lt;h3 id=&quot;run-multiple-tablet-servers-on-one-node&quot;&gt;Run multiple 
Tablet Servers on one node&lt;/h3&gt;
+
+&lt;p&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4328&quot;&gt;ACCUMULO-4328&lt;/a&gt;
 introduces the capability of running multiple tservers on a single node. This 
is intended for nodes with large
+amounts of memory and/or disk. This feature is disabled by default. There are 
several related tickets: &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4072&quot;&gt;ACCUMULO-4072&lt;/a&gt;,
 &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4331&quot;&gt;ACCUMULO-4331&lt;/a&gt;
+and &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4406&quot;&gt;ACCUMULO-4406&lt;/a&gt;.
 Note that when this is enabled, the names of the log files change. Previous 
log file names were defined in the
+generic_logger.xml as &lt;code 
class=&quot;highlighter-rouge&quot;&gt;${org.apache.accumulo.core.application}_{org.apache.accumulo.core.ip.localhost.hostname}.log&lt;/code&gt;.
+The files will now include the instance id after the application with
+&lt;code 
class=&quot;highlighter-rouge&quot;&gt;${org.apache.accumulo.core.application}_${instance}_${org.apache.accumulo.core.ip.localhost.hostname}.log&lt;/code&gt;.&lt;/p&gt;
+
+&lt;p&gt;For example: tserver_host.domain.com.log will become 
tserver_1_host.domain.com.log when multiple TabletServers
+are run per host. The same change also applies to the debug logs provided in 
the example configurations. The log
+names do not change if this feature is not used.&lt;/p&gt;
+
+&lt;h3 id=&quot;rate-limiting-major-compactions&quot;&gt;Rate limiting Major 
Compactions&lt;/h3&gt;
+
+&lt;p&gt;Major Compactions can significantly increase the amount of load on
+TabletServers. &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4187&quot;&gt;ACCUMULO-4187&lt;/a&gt;
 restricts the rate at which data is
+read and written when performing major compactions. This has a direct
+effect on the IO load caused by major compactions with a similar
+effect on the CPU utilization. This behavior is controlled by a new
+property &lt;code 
class=&quot;highlighter-rouge&quot;&gt;tserver.compaction.major.throughput&lt;/code&gt;
 with a default of 0B,
+which disables the rate limiting.&lt;/p&gt;
+
+&lt;h3 id=&quot;table-sampling&quot;&gt;Table Sampling&lt;/h3&gt;
+
+&lt;p&gt;Queryable sample data was added by &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3913&quot;&gt;ACCUMULO-3913&lt;/a&gt;.
  This allows users to configure a pluggable
+function to generate sample data.  At scan time, the sample data can 
optionally be scanned.
+Iterators also have access to sample data.  Iterators can access all data and 
sample data; this
+allows an iterator to use sample data for query optimizations.  The new user 
level RFile API
+supports writing RFiles with sample data for bulk import.&lt;/p&gt;
+
+&lt;p&gt;A simple configurable sampler function is included with Accumulo.  
This sampler uses hashing and
+can be configured to use a subset of Key fields.  For example if it was 
desired to have entire rows
+in the sample, then this sampler would be configured to hash+mod the row.   
Then when a row is
+selected for the sample, all of its columns and all of its updates will be in 
the sample data.
+Another scenario is one in which a document id is in the column qualifier.  In 
this scenario, one
+would either want all data related to a document in the sample data or none.  
To achieve this, the
sampler could be configured to hash+mod on the column qualifier.  See the 
sample &lt;a href=&quot;../1.8/examples/sample&quot;&gt;Readme
+example&lt;/a&gt; and javadocs on the new APIs for more information.&lt;/p&gt;
+
+&lt;p&gt;For sampling to work, all tablets scanned must have pre-generated 
sample data that was generated in
+the same way.  If this is not the case then scans will fail.  For existing 
tables, samples can be
+generated by configuring sampling on the table and compacting the 
table.&lt;/p&gt;
+
+&lt;h3 id=&quot;upgrade-to-apache-thrift-093&quot;&gt;Upgrade to Apache Thrift 
0.9.3&lt;/h3&gt;
+
+&lt;p&gt;Accumulo relies on Apache Thrift to implement remote procedure calls
+between Accumulo services. Ticket &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4077&quot;&gt;ACCUMULO-4077&lt;/a&gt;
+updates our dependency to 0.9.3. See the
+&lt;a 
href=&quot;https://github.com/apache/thrift/blob/0.9.3/CHANGES&quot;&gt;Apache 
Thrift 0.9.3 Release Notes&lt;/a&gt; for details on
+the changes to Thrift.  &lt;strong&gt;NOTE:&lt;/strong&gt; The Thrift 0.9.3 
Java library is not
+compatible with other versions of Thrift. Applications running against Accumulo
+1.8 must use Thrift 0.9.3. Different versions of Thrift on the classpath
+will not work.&lt;/p&gt;
+
+&lt;h3 id=&quot;iterator-test-harness&quot;&gt;Iterator Test Harness&lt;/h3&gt;
+
+&lt;p&gt;Users often write a new iterator without fully understanding its 
limits and lifetime. Previously, Accumulo did
+not provide any means by which a user could test iterators to catch common 
issues that only become apparent
+in multi-node production deployments. &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-626&quot;&gt;ACCUMULO-626&lt;/a&gt;
 provides a framework and a collection of initial tests
+which can be used to simulate common issues with Iterators that only appear in 
production deployments. This test
+harness can be used directly by users as a supplemental tool to unit tests and 
integration tests with MiniAccumuloCluster.&lt;/p&gt;
+
+&lt;p&gt;Please see the &lt;a 
href=&quot;../1.8/accumulo_user_manual.html#_iterator_testing&quot;&gt;Accumulo 
User Manual chapter on Iterator Testing&lt;/a&gt; for more information.&lt;/p&gt;
+
+&lt;h3 id=&quot;default-port-for-monitor-changed-to-9995&quot;&gt;Default port 
for Monitor changed to 9995&lt;/h3&gt;
+
+&lt;p&gt;Previously, the default port for the monitor was 50095. You will need 
to update your links to point to port 9995. The default
+port for the GC process was also changed from 50091 to 9998, although this is an 
RPC port used internally and automatically discovered.
+These default ports were changed because the previous defaults fell in the 
Linux ephemeral port range. This means that the operating
+system could allocate any unused port in this range for 
dynamic network communication. This has the side effect of
+intermittent bind failures when trying to start these services (as the operating
+system might have already allocated their ports elsewhere). By moving these
+defaults out of the ephemeral range, we can guarantee that the Monitor and GC
+will reliably start. These values are still configurable by setting
+&lt;code 
class=&quot;highlighter-rouge&quot;&gt;monitor.port.client&lt;/code&gt; and 
&lt;code class=&quot;highlighter-rouge&quot;&gt;gc.port.client&lt;/code&gt; in 
the accumulo-site.xml.&lt;/p&gt;
+
+&lt;h2 id=&quot;other-notable-changes&quot;&gt;Other Notable Changes&lt;/h2&gt;
 
-&lt;h2 id=&quot;conclusion-and-future-work&quot;&gt;Conclusion and future 
work&lt;/h2&gt;
+&lt;ul&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-1055&quot;&gt;ACCUMULO-1055&lt;/a&gt;
 Configurable maximum file size for merging minor compactions&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-1124&quot;&gt;ACCUMULO-1124&lt;/a&gt;
 Optimization of RFile index&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-2883&quot;&gt;ACCUMULO-2883&lt;/a&gt;
 API to fetch current tablet assignments&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3871&quot;&gt;ACCUMULO-3871&lt;/a&gt;
 Support for running integration tests in MapReduce&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3920&quot;&gt;ACCUMULO-3920&lt;/a&gt;
 Deprecate the MockAccumulo class and remove usage in our tests&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4339&quot;&gt;ACCUMULO-4339&lt;/a&gt;
 Make hadoop-minicluster an optional dependency of accumulo-minicluster&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4318&quot;&gt;ACCUMULO-4318&lt;/a&gt;
 BatchWriter, ConditionalWriter, and ScannerBase now extend 
AutoCloseable&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4326&quot;&gt;ACCUMULO-4326&lt;/a&gt;
 Value constructor now accepts Strings (and CharSequences)&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4354&quot;&gt;ACCUMULO-4354&lt;/a&gt;
 Bump dependency versions to include gson, jetty, and slf4j&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3735&quot;&gt;ACCUMULO-3735&lt;/a&gt;
 Bulk Import status page on the monitor&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4066&quot;&gt;ACCUMULO-4066&lt;/a&gt;
 Reduced time to process conditional mutations.&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4164&quot;&gt;ACCUMULO-4164&lt;/a&gt;
 Reduced seek time for cached data.&lt;/li&gt;
+&lt;/ul&gt;
 
-&lt;p&gt;The addition of the replication feature fills a large gap in the 
architecture of Accumulo where the system does not easily operate across WANs. 
While strong consistency between a primary and a peer is sacrificed, the common 
case of using replication for disaster recovery favors availability of the 
system over strong consistency and has the added benefit of not significantly 
impacting the ingest performance on the primary instance. Replication provides 
active backup support while enabling Accumulo to automatically share data 
between instances across large physical distances.&lt;/p&gt;
+&lt;h2 id=&quot;testing&quot;&gt;Testing&lt;/h2&gt;
+
+&lt;p&gt;Each unit and functional test only runs on a single node, while the 
RandomWalk
+and Continuous Ingest tests run on any number of nodes. 
&lt;em&gt;Agitation&lt;/em&gt; refers to
+randomly restarting Accumulo processes and Hadoop Datanode processes, and, in
+HDFS High-Availability instances, forcing NameNode failover.&lt;/p&gt;
+
+&lt;table id=&quot;release_notes_testing&quot; class=&quot;table&quot;&gt;
+  &lt;thead&gt;
+    &lt;tr&gt;
+      &lt;th&gt;OS/Environment&lt;/th&gt;
+      &lt;th&gt;Hadoop&lt;/th&gt;
+      &lt;th&gt;Nodes&lt;/th&gt;
+      &lt;th&gt;ZooKeeper&lt;/th&gt;
+      &lt;th&gt;HDFS HA&lt;/th&gt;
+      &lt;th&gt;Tests&lt;/th&gt;
+    &lt;/tr&gt;
+  &lt;/thead&gt;
+  &lt;tbody&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS7/openJDK7/EC2; 3 m3.xlarge leaders, 8 d2.xlarge 
workers&lt;/td&gt;
+      &lt;td&gt;2.6.4&lt;/td&gt;
+      &lt;td&gt;11&lt;/td&gt;
+      &lt;td&gt;3.4.8&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;24 HR Continuous Ingest without Agitation.&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS7/openJDK7/EC2; 3 m3.xlarge leaders, 8 d2.xlarge 
workers&lt;/td&gt;
+      &lt;td&gt;2.6.4&lt;/td&gt;
+      &lt;td&gt;11&lt;/td&gt;
+      &lt;td&gt;3.4.8&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;16 HR Continuous Ingest with Agitation.&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS7/openJDK7/OpenStack VMs (16G RAM 2cores 2disk3; 1 
leader, 5 workers&lt;/td&gt;
+      &lt;td&gt;HDP 2.5 (Hadoop 2.7)&lt;/td&gt;
+      &lt;td&gt;7&lt;/td&gt;
+      &lt;td&gt;HDP 2.5 (ZK 3.4)&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;24 HR Continuous Ingest without Agitation.&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS7/openJDK7/OpenStack VMs (16G RAM 2cores 2disk3; 1 
leader, 5 workers&lt;/td&gt;
+      &lt;td&gt;HDP 2.5 (Hadoop 2.7)&lt;/td&gt;
+      &lt;td&gt;7&lt;/td&gt;
+      &lt;td&gt;HDP 2.5 (ZK 3.4)&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;24 HR Continuous Ingest with Agitation.&lt;/td&gt;
+    &lt;/tr&gt;
+  &lt;/tbody&gt;
+&lt;/table&gt;
 
-&lt;p&gt;One interesting detail about the implementation of this feature is 
that the code which performs replication between two Accumulo instances, the 
AccumuloReplicaSystem, is pluggable via the ReplicaSystem interface. It is 
reasonable to consider other implementations which can automatically replicate 
data from Accumulo to other systems for purposes of backup or additional query 
functionality through other data management systems. For example, Accumulo 
could be used to automatically replicate data to other indexing systems such as 
Lucene or even relational databases for advanced query functionality. Certain 
implementations of the ReplicaSystem could perform special filtering to limit 
the set of columns replicated to certain systems resulting in a subset of the 
complete dataset stored in one Accumulo instance without forcing clients to 
write the data to multiple systems. Each of these considerations are only 
theoretical at this point; however, the potential for advancement is 
definitely worth investigating.&lt;/p&gt;
 </description>
-        <pubDate>Mon, 06 Apr 2015 13:00:00 -0400</pubDate>
-        
<link>https://accumulo.apache.org/blog/2015/04/06/replicating-data-across-accumulo-clusters.html</link>
-        <guid 
isPermaLink="true">https://accumulo.apache.org/blog/2015/04/06/replicating-data-across-accumulo-clusters.html</guid>
+        <pubDate>Tue, 06 Sep 2016 00:00:00 -0400</pubDate>
+        <link>https://accumulo.apache.org/release/accumulo-1.8.0/</link>
+        <guid 
isPermaLink="true">https://accumulo.apache.org/release/accumulo-1.8.0/</guid>
         
         
-        <category>blog</category>
+        <category>release</category>
         
       </item>
     
       <item>
-        <title>Balancing Groups of Tablets</title>
-        <description>&lt;p&gt;Originally posted at &lt;a 
href=&quot;https://blogs.apache.org/accumulo/entry/balancing_groups_of_tablets&quot;&gt;https://blogs.apache.org/accumulo/entry/balancing_groups_of_tablets&lt;/a&gt;&lt;/p&gt;
+        <title>Apache Accumulo 1.7.2</title>
+        <description>&lt;p&gt;Apache Accumulo 1.7.2 is a maintenance release 
on the 1.7 version branch. This
+release contains changes from more than 150 issues, comprised of bug-fixes,
+performance improvements, build quality improvements, and more. See
+&lt;a 
href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12312121&amp;amp;version=12333776&quot;&gt;JIRA&lt;/a&gt;
 for a complete list.&lt;/p&gt;
+
+&lt;p&gt;Below are resources for this release:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;&lt;a href=&quot;/1.7/accumulo_user_manual.html&quot;&gt;User 
Manual&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;/1.7/apidocs&quot;&gt;Javadocs&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;/1.7/examples&quot;&gt;Examples&lt;/a&gt;&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;p&gt;Users of any previous 1.7.x release are strongly encouraged to update 
as soon
+as possible to benefit from the improvements with very little concern in change
+of underlying functionality. Users of 1.6 or earlier that are seeking to
+upgrade to 1.7 should consider 1.7.2 as a starting point.&lt;/p&gt;
+
+&lt;h2 id=&quot;highlights&quot;&gt;Highlights&lt;/h2&gt;
+
+&lt;h3 
id=&quot;write-ahead-logs-can-be-prematurely-deleted&quot;&gt;Write-Ahead Logs 
can be prematurely deleted&lt;/h3&gt;
+
+&lt;p&gt;There were cases where the Accumulo Garbage Collector could 
inadvertently delete a WAL for a tablet server that it has erroneously 
determined to be down, causing data loss. This has been corrected. See &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4157&quot;&gt;ACCUMULO-4157&lt;/a&gt;
 for additional detail.&lt;/p&gt;
+
+&lt;h3 id=&quot;upgrade-to-commons-vfs-21&quot;&gt;Upgrade to Commons-VFS 
2.1&lt;/h3&gt;
 
-&lt;p&gt;Accumulo has a pluggable tablet balancer that decides where tablets 
should be placed. Accumulo’s default configuration spreads each table’s 
tablets evenly and randomly across the tablet servers. Each table can configure 
a custom balancer that does something different.&lt;/p&gt;
+&lt;p&gt;Upgrading to Apache Commons VFS 2.1 fixes several issues with 
classloading out of HDFS. For further detail see &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4146&quot;&gt;ACCUMULO-4146&lt;/a&gt;.
 Additional fixes to a potential HDFS class loading deadlock situation were 
made in &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4341&quot;&gt;ACCUMULO-4341&lt;/a&gt;.&lt;/p&gt;
 
-&lt;p&gt;For some applications to perform optimally, sub-ranges of a table 
need to be spread evenly across the cluster. Over the years I have run into 
multiple use cases for this situation. The latest use case was &lt;a 
href=&quot;https://github.com/fluo-io/fluo/issues/361&quot;&gt;bad 
performance&lt;/a&gt; on the &lt;a 
href=&quot;http://fluo.io/&quot;&gt;Fluo&lt;/a&gt; &lt;a 
href=&quot;https://github.com/fluo-io/fluo-stress&quot;&gt;Stress 
Test&lt;/a&gt;. This test stores a tree in an Accumulo table and creates 
multiple tablets for each level in the tree. In parallel, the test reads data 
from one level and writes it up to the next level. Figure 1 below shows an 
example of tablet servers hosting tablets for different levels of the tree. 
Under this scenario if many threads are reading data from level 2 and writing 
up to level 1, only Tserver 1 and Tserver 2 will be utilized. So in this 
scenario 50% of the tablet servers are idle.&lt;/p&gt;
+&lt;h3 
id=&quot;native-map-failed-to-increment-mutation-count-properly&quot;&gt;Native 
Map failed to increment mutation count properly&lt;/h3&gt;
 
-&lt;p&gt;&lt;img src=&quot;/images/blog/201503_balancer/figure1.png&quot; 
alt=&quot;figure1&quot; /&gt;
-&lt;em&gt;Figure 1&lt;/em&gt;&lt;/p&gt;
+&lt;p&gt;There was a bug (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4148&quot;&gt;ACCUMULO-4148&lt;/a&gt;)
 where multiple put calls with identical keys and no timestamp would exhibit 
different behaviour depending on whether native maps were enabled or not. This 
behaviour would result in hidden mutations with native maps, and has been 
corrected.&lt;/p&gt;
 
-&lt;p&gt;[ACCUMULO-3439][accumulo-3949] remedied this situation with the 
introduction of the &lt;a 
href=&quot;https://git-wip-us.apache.org/repos/asf?p=accumulo.git;a=blob;f=server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java;hb=b0815affade66ab04ca27b6fc3abaac400097469&quot;&gt;GroupBalancer&lt;/a&gt;
 and &lt;a 
href=&quot;https://git-wip-us.apache.org/repos/asf?p=accumulo.git;a=blob;f=server/base/src/main/java/org/apache/accumulo/server/master/balancer/RegexGroupBalancer.java;hb=51fbfaf0a52dc89e8294c86c30164fb94c9f644c&quot;&gt;RegexGroupBalancer&lt;/a&gt;
 which will be available in Accumulo 1.7.0. These balancers allow a user to 
arbitrarily group tablets. Each group defined by the user will be evenly spread 
across the tablet servers. Also, the total number of groups on each tablet 
server is minimized. As tablets are added or removed from the table, the 
balancer will migrate tablets to satisfy these goals.  Much of the complexity 
in the GroupBalancer 
code comes from trying to minimize the number of migrations needed to 
reach a good state.&lt;/p&gt;
+&lt;h3 
id=&quot;open-wal-files-could-prevent-datanode-decomission&quot;&gt;Open WAL 
files could prevent DataNode decommission&lt;/h3&gt;
 
-&lt;p&gt;A GroupBalancer could be configured for the table in figure 1 in such 
a way that it grouped tablets by level. If this were done, the result may look 
like Figure 2 below. With this tablet to tablet server mapping, many threads 
reading from level 2 and writing data up to level 1 would utilize all of the 
tablet servers yielding better performance.&lt;/p&gt;
+&lt;p&gt;An improvement was introduced to allow a max age before WAL files 
would be automatically rolled. Without a max age, they could stay open for 
writing indefinitely, blocking the Hadoop DataNode decommissioning process. For 
more information, see &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4004&quot;&gt;ACCUMULO-4004&lt;/a&gt;.&lt;/p&gt;
 
-&lt;p&gt;&lt;img src=&quot;/images/blog/201503_balancer/figure2.png&quot; 
alt=&quot;figure2&quot; /&gt;
-&lt;em&gt;Figure 2&lt;/em&gt;&lt;/p&gt;
+&lt;h3 
id=&quot;remove-unnecessary-copy-of-cached-rfile-index-blocks&quot;&gt;Remove 
unnecessary copy of cached RFile index blocks&lt;/h3&gt;
 
-&lt;p&gt;&lt;a 
href=&quot;https://git-wip-us.apache.org/repos/asf?p=accumulo.git;a=blob;f=docs/src/main/resources/examples/README.rgbalancer;hb=51fbfaf0a52dc89e8294c86c30164fb94c9f644c&quot;&gt;README.rgbalancer&lt;/a&gt;
 provides a good example of configuring and using the RegexGroupBalancer. If a 
regular expression can not accomplish the needed grouping, then a grouping 
function can be written in Java. Extend GroupBalancer to write a grouping 
function in java. RegexGroupBalancer provides a good example of how to do 
this.&lt;/p&gt;
+&lt;p&gt;Accumulo maintains an in-memory cache for file blocks as a 
performance optimization. This can be done safely because Accumulo RFiles are 
immutable, thus their blocks are also immutable. There are two types of these 
blocks: index and data blocks. Index blocks refer to the b-tree style index 
inside of each Accumulo RFile, while data blocks contain the sorted Key-Value 
pairs. In previous versions, when Accumulo extracted an Index block from the 
in-memory cache, it would copy the data. &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4164&quot;&gt;ACCUMULO-4164&lt;/a&gt;
 removes this unnecessary copy as the contents are immutable and can be passed 
by reference. Ensuring that the Index blocks are not copied when accessed from 
the cache is a big performance gain at the file-access level.&lt;/p&gt;
 
-&lt;p&gt;When using a GroupBalancer, how Accumulo automatically splits tablets 
must be kept in mind. When Accumulo decides to split a tablet, it chooses the 
shortest possible row prefix from the tablet data that yields a good split 
point. Therefore it’s possible that a split point that is shorter than what is 
expected by a GroupBalancer could be chosen. The best way to avoid this 
situation is to pre-split the table such that it precludes this 
possibility.&lt;/p&gt;
+&lt;h3 
id=&quot;analyze-key-length-to-avoid-choosing-large-keys-for-rfile-index-blocks&quot;&gt;Analyze
 Key-length to avoid choosing large Keys for RFile Index blocks&lt;/h3&gt;
 
-&lt;p&gt;The Fluo Stress test is a very abstract use case. A more concrete use 
case for the group balancer would be using it to ensure tablets storing 
geographic data were spread out evenly. For example consider &lt;a 
href=&quot;https://ngageoint.github.io/geowave/&quot;&gt;GeoWave’s&lt;/a&gt; 
Accumulo &lt;a 
href=&quot;http://ngageoint.github.io/geowave/documentation.html#architecture-accumulo&quot;&gt;Persistence
 Model&lt;/a&gt;. Tablets could be balanced such that bins related to different 
regions are spread out evenly. For example tablets related to each continent 
could be assigned a group ensuring data related to each continent is evenly 
spread across the cluster. Alternatively, each Tier could spread evenly across 
the cluster.&lt;/p&gt;
+&lt;p&gt;Accumulo’s RFile index blocks are made up of a Key which exists in 
the file and points to that specific location in the corresponding RFile data 
block. Thus, the size of the RFile index blocks is largely dominated by the 
size of the Keys which are used by the index. &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4314&quot;&gt;ACCUMULO-4314&lt;/a&gt;
 is an improvement which uses statistics on the length of the Keys in the RFile 
to avoid choosing Keys for the index whose length is greater than three 
standard deviations for the RFile. By choosing smaller Keys for the index, 
Accumulo can access the RFile index faster and keep more Index blocks cached in 
memory. Initial tests showed that with this change, the RFile index size was 
nearly cut in half.&lt;/p&gt;
+
+&lt;h3 id=&quot;minor-performance-improvements&quot;&gt;Minor performance 
improvements&lt;/h3&gt;
+
+&lt;p&gt;Tablet servers would previously always hsync at the start of a minor 
compaction, causing delays in the write pipeline. These additional syncs were 
determined to provide no additional durability guarantees and have been 
removed. See &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4112&quot;&gt;ACCUMULO-4112&lt;/a&gt;
 for additional detail.&lt;/p&gt;
+
+&lt;p&gt;A performance issue was identified and corrected (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-1755&quot;&gt;ACCUMULO-1755&lt;/a&gt;)
 where the BatchWriter would block calls to addMutation while looking up 
destination tablet server metadata. The writer has been fixed to allow both 
operations in parallel.&lt;/p&gt;
+
+&lt;h2 id=&quot;other-notable-changes&quot;&gt;Other Notable Changes&lt;/h2&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3923&quot;&gt;ACCUMULO-3923&lt;/a&gt;
 bootstrap_hdfs.sh script would copy incorrect jars to hdfs.&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4164&quot;&gt;ACCUMULO-4164&lt;/a&gt;
 Avoid copy of RFile Index Blocks when already in cache.&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4155&quot;&gt;ACCUMULO-4155&lt;/a&gt;
 No longer publish javadoc for non-public API to website. (Still available in 
javadoc jars in maven)&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4173&quot;&gt;ACCUMULO-4173&lt;/a&gt;
 Provide balancer to balance table within subset of hosts.&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4334&quot;&gt;ACCUMULO-4334&lt;/a&gt;
 Ingest rates reported through JMX did not match rates reported by 
Monitor.&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4335&quot;&gt;ACCUMULO-4335&lt;/a&gt;
 Error conditions that result in a Halt should ensure non-zero process exit 
code.&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;h2 id=&quot;testing&quot;&gt;Testing&lt;/h2&gt;
+
+&lt;p&gt;Each unit and functional test only runs on a single node, while the 
RandomWalk
+and Continuous Ingest tests run on any number of nodes. 
&lt;em&gt;Agitation&lt;/em&gt; refers to
+randomly restarting Accumulo processes and Hadoop Datanode processes, and, in
+HDFS High-Availability instances, forcing NameNode failover.&lt;/p&gt;
+
+&lt;table id=&quot;release_notes_testing&quot; class=&quot;table&quot;&gt;
+  &lt;thead&gt;
+    &lt;tr&gt;
+      &lt;th&gt;OS/Environment&lt;/th&gt;
+      &lt;th&gt;Hadoop&lt;/th&gt;
+      &lt;th&gt;Nodes&lt;/th&gt;
+      &lt;th&gt;ZooKeeper&lt;/th&gt;
+      &lt;th&gt;HDFS HA&lt;/th&gt;
+      &lt;th&gt;Tests&lt;/th&gt;
+    &lt;/tr&gt;
+  &lt;/thead&gt;
+  &lt;tbody&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS 7; EC2 m3.xlarge, d2.xlarge workers&lt;/td&gt;
+      &lt;td&gt;2.6.3&lt;/td&gt;
+      &lt;td&gt;9&lt;/td&gt;
+      &lt;td&gt;3.4.8&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;24 HR Continuous Ingest with and without Agitation.&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS 6: EC2 m3.2xlarge&lt;/td&gt;
+      &lt;td&gt;2.6.1&lt;/td&gt;
+      &lt;td&gt;1&lt;/td&gt;
+      &lt;td&gt;3.4.5&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;Unit tests and Integration Tests&lt;/td&gt;
+    &lt;/tr&gt;
+  &lt;/tbody&gt;
+&lt;/table&gt;
 
 </description>
-        <pubDate>Fri, 20 Mar 2015 13:00:00 -0400</pubDate>
-        
<link>https://accumulo.apache.org/blog/2015/03/20/balancing-groups-of-tablets.html</link>
-        <guid 
isPermaLink="true">https://accumulo.apache.org/blog/2015/03/20/balancing-groups-of-tablets.html</guid>
+        <pubDate>Wed, 22 Jun 2016 00:00:00 -0400</pubDate>
+        <link>https://accumulo.apache.org/release/accumulo-1.7.2/</link>
+        <guid 
isPermaLink="true">https://accumulo.apache.org/release/accumulo-1.7.2/</guid>
         
         
-        <category>blog</category>
+        <category>release</category>
         
       </item>
     
       <item>
-        <title>Generating Keystores for configuring Accumulo with SSL</title>
-        <description>&lt;p&gt;Originally posted at &lt;a 
href=&quot;https://blogs.apache.org/accumulo/entry/generating_keystores_for_configuring_accumulo&quot;&gt;https://blogs.apache.org/accumulo/entry/generating_keystores_for_configuring_accumulo&lt;/a&gt;&lt;/p&gt;
+        <title>Apache Accumulo 1.7.1</title>
+        <description>&lt;p&gt;Apache Accumulo 1.7.1 is a maintenance release 
on the 1.7 version branch. This
+release contains changes from more than 150 issues, comprised of bug-fixes,
+performance improvements, build quality improvements, and more. See
+&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO/fixforversion/12329940&quot;&gt;JIRA&lt;/a&gt;
 for a complete list.&lt;/p&gt;
 
-&lt;p&gt;One of the major features added in Accumulo 1.6.0 was the ability to 
configure Accumulo so that the Thrift communications will run over SSL. &lt;a 
href=&quot;http://thrift.apache.org/&quot;&gt;Apache Thrift&lt;/a&gt; is the 
remote procedure call library that is leveraged for both intra-server 
communication and client communication with Accumulo. Issuing these calls over 
a secure socket ensures that unwanted actors cannot inspect the traffic sent 
across the wire. Given the sometimes sensitive nature of data stored in 
Accumulo and the authentication details for users, ensuring that no prying eyes 
have access to these communications is critical.&lt;/p&gt;
+&lt;p&gt;Users of any previous 1.7.x release are strongly encouraged to update 
as soon
+as possible to benefit from the improvements with very little concern in change
+of underlying functionality. Users of 1.6 or earlier that are seeking to
+upgrade to 1.7 should consider 1.7.1 as a starting point.&lt;/p&gt;
 
-&lt;p&gt;Due to the complex and deployment specific nature of the security 
model for some systems, Accumulo expects users to provide their own 
certificates, guaranteeing that they are, in fact, secure. However, for those 
who want to get security who do not already operate within the confines of an 
established security infrastructure, OpenSSL and the Java keytool command can 
be used to generate the necessary components to enable wire 
encryption.&lt;/p&gt;
+&lt;h2 id=&quot;highlights&quot;&gt;Highlights&lt;/h2&gt;
 
-&lt;p&gt;To enable SSL with Accumulo, it is necessary to generate a 
certificate authority and certificates which are signed by that authority. 
Typically, each client and server has its own certificate which provides the 
finest level of control over a secure cluster when the certificates are 
properly secured.&lt;/p&gt;
+&lt;h3 id=&quot;silent-data-loss-via-bulk-imported-files&quot;&gt;Silent 
data-loss via bulk imported files&lt;/h3&gt;
 
-&lt;h2 id=&quot;generate-a-certificate-authority&quot;&gt;Generate a 
Certificate Authority&lt;/h2&gt;
+&lt;p&gt;A user recently reported that a simple bulk-import application would
+occasionally lose some records. Through investigation, it was found that when
+bulk imports into a table failed the initial assignment, the logic that
+automatically retries the imports was incorrectly choosing the tablets to
+import the files into. &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3967&quot;&gt;ACCUMULO-3967&lt;/a&gt;
 contains more information
+on the cause and identification of the bug. The data-loss condition would only
+affect entire files. If records from a file exist in Accumulo, it is still
+guaranteed that all records within that imported file were 
successfully imported.&lt;/p&gt;
 
-&lt;p&gt;The certificate authority (CA) is what controls what certificates can 
be used to authenticate with each other. To create a secure connection with two 
certificates, each certificate must be signed by a certificate authority in the 
“truststore” (A Java KeyStore which contains at least one Certificate 
Authority’s public key). When creating your own certificate authority, a 
single CA is typically sufficient (and would result in a single public key in 
the truststore). Alternatively, a third party can also act as a certificate 
authority (to add an additional layer of security); however, these are 
typically not a free service.&lt;/p&gt;
+&lt;p&gt;As such, users who have bulk import applications using previous 
versions of
+Accumulo should verify that all of their data was correctly ingested into
+Accumulo and immediately update to Accumulo 1.7.1. (This is the same bug that
+was fixed in 1.6.4, so you won’t be affected if you’re running 1.6.4 or 
newer.)&lt;/p&gt;
 
-&lt;p&gt;The below is an example of creating a certificate authority and 
adding its public key to a Java KeyStore to provide to Accumulo.&lt;/p&gt;
+&lt;h3 id=&quot;queued-compactions-not-running&quot;&gt;Queued Compactions Not 
Running&lt;/h3&gt;
 
-&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre 
class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# 
Create a private key&lt;/span&gt;
-openssl genrsa -des3 -out root.key 4096
+&lt;p&gt;Found and fixed a bug (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4016&quot;&gt;ACCUMULO-4016&lt;/a&gt;)
 in which some queued
+compactions would never run if the number of files changed while the tablet was
+queued.&lt;/p&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Create a self-signed root certificate using the 
private key&lt;/span&gt;
-openssl req -x509 -new -key root.key -days 365 -out root.pem
+&lt;h3 id=&quot;kerberos-ticket-renewals&quot;&gt;Kerberos Ticket 
Renewals&lt;/h3&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Generate a DER-encoded (binary) version of the PEM 
just created&lt;/span&gt;
-openssl x509 -outform der -in root.pem -out root.der
+&lt;p&gt;A bug was fixed which caused Accumulo clients and services to fail to 
check and
+(if necessary) renew their Kerberos credentials. This would eventually lead to
+these components failing to properly authenticate until they were restarted.
+(&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4069&quot;&gt;ACCUMULO-4069&lt;/a&gt;)&lt;/p&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Import the key into a Java 
KeyStore&lt;/span&gt;
-keytool -import -alias root-key -keystore truststore.jks -file root.der
+&lt;h3 id=&quot;updated-commons-collection&quot;&gt;Updated 
commons-collection&lt;/h3&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Remove the DER formatted key file (as we 
don't need it anymore)&lt;/span&gt;
-rm root.der
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
+&lt;p&gt;The bundled commons-collection library was updated from version 3.2.1 
to 3.2.2
+because of a reported vulnerability in that library.
+(&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4056&quot;&gt;ACCUMULO-4056&lt;/a&gt;)&lt;/p&gt;
 
-&lt;p&gt;Remember to protect root.key and never distribute it as the private 
key is the basis for your circle of trust. The keytool command will prompt you 
about whether or not the certificate should be trusted: enter “yes”. The 
truststore.jks file, a “truststore”, is meant to be shared with all parties 
communicating with one another. The password provided to the truststore 
verifies that the contents of the truststore have not been tampered 
with.&lt;/p&gt;
+&lt;h3 id=&quot;faster-processing-of-conditional-mutations&quot;&gt;Faster 
Processing of Conditional Mutations&lt;/h3&gt;
 
-&lt;h2 id=&quot;generate-a-certificatekeystore-per-host&quot;&gt;Generate a 
certificate/keystore per host&lt;/h2&gt;
+&lt;p&gt;Improved ConditionalMutation processing time by a factor of 3.
+(&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4066&quot;&gt;ACCUMULO-4066&lt;/a&gt;)&lt;/p&gt;
 
-&lt;p&gt;For each host in the system, it’s desirable to generate a 
certificate. Typically, this corresponds to a certificate per host. 
Additionally, each client connecting to the Accumulo instance running with SSL 
should be issued their own certificate. By issuing individual certificates to 
each entity, it gives proper control to revoke/reissue certificates to clients 
as necessary, without widespread interruption.&lt;/p&gt;
+&lt;h3 id=&quot;slow-gc-while-bulk-importing&quot;&gt;Slow GC While Bulk 
Importing&lt;/h3&gt;
 
-&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;pre 
class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# 
Create the private key for our server&lt;/span&gt;
-openssl genrsa -out server.key 4096
+&lt;p&gt;Found and worked around an issue where lots of bulk imports creating 
many new
+files would significantly impair the Accumulo GC service, and possibly prevent
+it from running to completion entirely. (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4021&quot;&gt;ACCUMULO-4021&lt;/a&gt;)&lt;/p&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Generate a certificate signing request (CSR) 
with our private key&lt;/span&gt;
-openssl req -new -key server.key -out server.csr
+&lt;h3 id=&quot;unnoticed-per-table-configuration-updates&quot;&gt;Unnoticed 
Per-table Configuration Updates&lt;/h3&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Use the CSR and the CA to create a 
certificate for the server (a reply to the CSR)&lt;/span&gt;
-openssl x509 -req -in server.csr -CA root.pem -CAkey root.key -CAcreateserial 
-out server.crt -days 365
+&lt;p&gt;Fixed a bug which caused tablet servers to not notice changes to the 
per-table
+constraints, under some circumstances. (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3859&quot;&gt;ACCUMULO-3859&lt;/a&gt;)&lt;/p&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Use the certificate and the private key for 
our server to create PKCS12 file&lt;/span&gt;
-openssl pkcs12 -export -in server.crt -inkey server.key -certfile server.crt 
-name &lt;span class=&quot;s1&quot;&gt;'server-key'&lt;/span&gt; -out server.p12
+&lt;h3 
id=&quot;tabletservers-kill-themselves-on-centos7&quot;&gt;TabletServers kill 
themselves on CentOS7&lt;/h3&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Create a Java KeyStore for the server using 
the PKCS12 file (private key)&lt;/span&gt;
-keytool -importkeystore -srckeystore server.p12 -srcstoretype pkcs12 
-destkeystore server.jks -deststoretype JKS
+&lt;p&gt;Reduced the aggressiveness with which Accumulo Tablet Servers 
preemptively
+killed themselves when a local filesystem switched to read-only (indicating a
+possible failure). To reduce false positives, such as those which can occur
+with systemd’s extra cgroup mounts in CentOS7, an additional check was added 
to
+ensure that tablet servers would only kill themselves if an ext- or
+xfs-formatted disk switched to read-only. (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4080&quot;&gt;ACCUMULO-4080&lt;/a&gt;)&lt;/p&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Remove the PKCS12 file as we don't need 
it&lt;/span&gt;
-rm server.p12
+&lt;h3 
id=&quot;improvements-in-locating-client-configuration-file&quot;&gt;Improvements
 in Locating Client Configuration File&lt;/h3&gt;
 
-&lt;span class=&quot;c&quot;&gt;# Import the CA-signed certificate to the 
keystore&lt;/span&gt;
-keytool -import -trustcacerts -alias server-crt -file server.crt -keystore 
server.jks
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
+&lt;p&gt;Fixed some unexpected error messages related to setting
+ACCUMULO_CLIENT_CONF_PATH, and improved the detection of the client.conf file 
if
+ACCUMULO_CLIENT_CONF_PATH was set to a directory containing client.conf.
+(&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4026&quot;&gt;ACCUMULO-4026&lt;/a&gt;,&lt;a
 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4027&quot;&gt;ACCUMULO-4027&lt;/a&gt;)&lt;/p&gt;
 
-&lt;p&gt;These commands created a private key for the server, generated a 
certificate signing request from that private key, used the certificate 
authority to generate the certificate from the signing request, and then 
created a Java KeyStore with the certificate and the private key for our 
server. This, paired with the truststore, provides what is needed to configure 
Accumulo servers to run over SSL. The private key (server.key), the 
certificate signed by the CA (server.pem), and the keystore (server.jks) should 
all be restricted so that only the user running Accumulo on the host they were 
generated for can access them. Use chown and chmod to protect the files and do 
not distribute them over insecure networks.&lt;/p&gt;
-
-&lt;h2 id=&quot;configure-accumulo-servers&quot;&gt;Configure Accumulo 
Servers&lt;/h2&gt;
-
-&lt;p&gt;Now that the Java KeyStores have been created with the necessary 
information, the Accumulo configuration must be updated so that Accumulo 
creates the Thrift server over SSL instead of a normal socket. In 
accumulo-site.xml, configure the following:&lt;/p&gt;
-
-&lt;div class=&quot;language-xml highlighter-rouge&quot;&gt;&lt;pre 
class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span 
class=&quot;nt&quot;&gt;&amp;lt;property&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;name&amp;gt;&lt;/span&gt;rpc.javax.net.ssl.keyStore&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;value&amp;gt;&lt;/span&gt;/path/to/server.jks&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/value&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/property&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;property&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;name&amp;gt;&lt;/span&gt;rpc.javax.net.ssl.keyStorePassword&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;value&amp;gt;&lt;/span&gt;server_password&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/value&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/property&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;property&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;name&amp;gt;&lt;/span&gt;rpc.javax.net.ssl.trustStore&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;value&amp;gt;&lt;/span&gt;/path/to/truststore.jks&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/value&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/property&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;property&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;name&amp;gt;&lt;/span&gt;rpc.javax.net.ssl.trustStorePassword&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;value&amp;gt;&lt;/span&gt;truststore_password&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/value&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/property&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;property&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;name&amp;gt;&lt;/span&gt;instance.rpc.ssl.enabled&lt;span
 class=&quot;nt&quot;&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
-  &lt;span 
class=&quot;nt&quot;&gt;&amp;lt;value&amp;gt;&lt;/span&gt;true&lt;span 
class=&quot;nt&quot;&gt;&amp;lt;/value&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/property&amp;gt;&lt;/span&gt;
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
+&lt;h3 
id=&quot;transient-zookeeper-disconnect-causes-fate-threads-to-exit&quot;&gt;Transient
 ZooKeeper disconnect causes FATE threads to exit&lt;/h3&gt;
 
-&lt;p&gt;The keystore and truststore paths are both absolute paths on the 
local filesystem (not HDFS). Remember that the server keystore should only be 
readable by the user running Accumulo and, if you place plaintext passwords in 
accumulo-site.xml, make sure that accumulo-site.xml is also not globally 
readable. To keep these passwords out of accumulo-site.xml, consider 
configuring your system with the new Hadoop CredentialProvider class, which 
will be available in Accumulo 1.6.1; see &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-2464&quot;&gt;ACCUMULO-2464&lt;/a&gt;
 for more information.&lt;/p&gt;
+&lt;p&gt;ZooKeeper clients are expected to handle the situation where they 
become
+disconnected from the ZooKeeper server and must wait to be reconnected
+before continuing ZooKeeper operations.&lt;/p&gt;
 
-&lt;p&gt;Also, be aware that if unique passwords are used for each server when 
generating the certificate, this will result in different accumulo-site.xml 
files for each host. Unique configuration files per host will add much 
complexity to the configuration management of your instance. A 
CredentialProvider (a feature from Hadoop which allows for acquisition of 
passwords from alternate systems) can be used to avoid the unique 
accumulo-site.xml files on each host. A Java KeyStore can be created using the 
CredentialProvider tools, which removes the need to store passwords in 
accumulo-site.xml; the configuration can instead point to a CredentialProvider 
URI which is consistent across hosts.&lt;/p&gt;
+&lt;p&gt;The dedicated threads running inside the Accumulo Master process for 
FATE
+actions had the potential to exit unexpectedly in this disconnected state.
+This caused a scenario where all future FATE-based operations would
+be blocked until the Accumulo Master process was restarted. (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4060&quot;&gt;ACCUMULO-4060&lt;/a&gt;)&lt;/p&gt;
 
-&lt;h2 id=&quot;configure-accumulo-clients&quot;&gt;Configure Accumulo 
Clients&lt;/h2&gt;
+&lt;h3 
id=&quot;incorrect-management-of-certain-apache-thrift-rpcs&quot;&gt;Incorrect 
management of certain Apache Thrift RPCs&lt;/h3&gt;
 
-&lt;p&gt;To configure Accumulo clients, use $HOME/.accumulo/config. This is a 
simple &lt;a href=&quot;http://en.wikipedia.org/wiki/.properties&quot;&gt;Java 
properties file&lt;/a&gt;: each line is a configuration, key and value can be 
separated by a space, and lines beginning with a # symbol are ignored. For 
example, if we generated a certificate and placed it in a keystore (as 
described above), we would generate the following file for the Accumulo 
client.&lt;/p&gt;
+&lt;p&gt;Accumulo relies on Apache Thrift to implement remote procedure calls 
between
+Accumulo services. Accumulo’s use of Thrift uncovered an unfortunate 
situation
+where a special RPC (a “oneway” call) would leave unwanted data on the 
underlying
+Thrift connection. After this extra data was left on the connection, all 
subsequent RPCs
+re-using that connection would fail with “out of sequence response” error 
messages.
+Accumulo would be left in a bad state until the mishandled connections were 
released
+or Accumulo services were restarted. (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4065&quot;&gt;ACCUMULO-4065&lt;/a&gt;)&lt;/p&gt;
 
-&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre 
class=&quot;highlight&quot;&gt;&lt;code&gt;instance.rpc.ssl.enabled true
-rpc.javax.net.ssl.keyStore  /path/to/client-keystore.jks
-rpc.javax.net.ssl.keyStorePassword  client-password
-rpc.javax.net.ssl.trustStore  /path/to/truststore.jks
-rpc.javax.net.ssl.trustStorePassword  truststore-password
-&lt;/code&gt;&lt;/pre&gt;
-&lt;/div&gt;
-&lt;p&gt;When creating a ZooKeeperInstance, the implementation will 
automatically look for this file and set up a connection with the methods 
defined in this configuration file. The ClientConfiguration class also contains 
methods that can be used instead of a configuration file on the filesystem. 
Again, the paths to the keystore and truststore are on the local filesystem, 
not HDFS.&lt;/p&gt;
+&lt;h2 id=&quot;other-notable-changes&quot;&gt;Other Notable Changes&lt;/h2&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3509&quot;&gt;ACCUMULO-3509&lt;/a&gt;
 Fixed some lock contention in TabletServer, preventing resource 
cleanup&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3734&quot;&gt;ACCUMULO-3734&lt;/a&gt;
 Fixed quote-escaping bug in VisibilityConstraint&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4025&quot;&gt;ACCUMULO-4025&lt;/a&gt;
 Fixed cleanup of bulk load fate transactions&lt;/li&gt;
+  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4098&quot;&gt;ACCUMULO-4098&lt;/a&gt;,&lt;a
 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4113&quot;&gt;ACCUMULO-4113&lt;/a&gt;
 Fixed widespread misuse of ByteBuffer&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;h2 id=&quot;testing&quot;&gt;Testing&lt;/h2&gt;
+
+&lt;p&gt;Each unit and functional test only runs on a single node, while the 
RandomWalk
+and Continuous Ingest tests run on any number of nodes. 
&lt;em&gt;Agitation&lt;/em&gt; refers to
+randomly restarting Accumulo processes and Hadoop Datanode processes, and, in
+HDFS High-Availability instances, forcing NameNode failover.&lt;/p&gt;
+
+&lt;table id=&quot;release_notes_testing&quot; class=&quot;table&quot;&gt;
+  &lt;thead&gt;
+    &lt;tr&gt;
+      &lt;th&gt;OS/Environment&lt;/th&gt;
+      &lt;th&gt;Hadoop&lt;/th&gt;
+      &lt;th&gt;Nodes&lt;/th&gt;
+      &lt;th&gt;ZooKeeper&lt;/th&gt;
+      &lt;th&gt;HDFS HA&lt;/th&gt;
+      &lt;th&gt;Tests&lt;/th&gt;
+    &lt;/tr&gt;
+  &lt;/thead&gt;
+  &lt;tbody&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS 7.1 w/Oracle JDK8 on EC2 (1 m3.xlarge, 8 
d2.xlarge)&lt;/td&gt;
+      &lt;td&gt;2.6.3&lt;/td&gt;
+      &lt;td&gt;9&lt;/td&gt;
+      &lt;td&gt;3.4.6&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;Random walk (All.xml) 24-hour run, saw &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-3794&quot;&gt;ACCUMULO-3794&lt;/a&gt;
 and &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4151&quot;&gt;ACCUMULO-4151&lt;/a&gt;.&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS 7.1 w/Oracle JDK8 on EC2 (1 m3.xlarge, 8 
d2.xlarge)&lt;/td&gt;
+      &lt;td&gt;2.6.3&lt;/td&gt;
+      &lt;td&gt;9&lt;/td&gt;
+      &lt;td&gt;3.4.6&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;21 hr run of CI w/ agitation, 23.1B entries 
verified.&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS 7.1 w/Oracle JDK8 on EC2 (1 m3.xlarge, 8 
d2.xlarge)&lt;/td&gt;
+      &lt;td&gt;2.6.3&lt;/td&gt;
+      &lt;td&gt;9&lt;/td&gt;
+      &lt;td&gt;3.4.6&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;24 hr run of CI w/o agitation, 23.0B entries verified; saw 
performance issues outlined in comment on &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4146&quot;&gt;ACCUMULO-4146&lt;/a&gt;.&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td&gt;CentOS 6.7 (OpenJDK 7), Fedora 23 (OpenJDK 8), and CentOS 7.2 
(OpenJDK 7)&lt;/td&gt;
+      &lt;td&gt;2.6.1&lt;/td&gt;
+      &lt;td&gt;1&lt;/td&gt;
+      &lt;td&gt;3.4.6&lt;/td&gt;
+      &lt;td&gt;No&lt;/td&gt;
+      &lt;td&gt;All unit tests and ITs pass with -Dhadoop.version=2.6.1; 
Kerberos ITs had a problem with earlier versions of Hadoop&lt;/td&gt;
+    &lt;/tr&gt;
+  &lt;/tbody&gt;
+&lt;/table&gt;
 
 </description>
-        <pubDate>Tue, 02 Sep 2014 13:00:00 -0400</pubDate>
-        
<link>https://accumulo.apache.org/blog/2014/09/02/generating-keystores-for-configuring-accumulo-with-ssl.html</link>
-        <guid 
isPermaLink="true">https://accumulo.apache.org/blog/2014/09/02/generating-keystores-for-configuring-accumulo-with-ssl.html</guid>
+        <pubDate>Fri, 26 Feb 2016 00:00:00 -0500</pubDate>
+        <link>https://accumulo.apache.org/release/accumulo-1.7.1/</link>
+        <guid 
isPermaLink="true">https://accumulo.apache.org/release/accumulo-1.7.1/</guid>
         
         
-        <category>blog</category>
+        <category>release</category>
         
       </item>
     
       <item>
-        <title>Functional reads over Accumulo</title>
-        <description>&lt;p&gt;Originally posted at &lt;a 
href=&quot;https://blogs.apache.org/accumulo/entry/thinking_about_reads_over_accumulo&quot;&gt;https://blogs.apache.org/accumulo/entry/thinking_about_reads_over_accumulo&lt;/a&gt;&lt;/p&gt;
+        <title>Apache Accumulo 1.6.5</title>
+        <description>&lt;p&gt;Apache Accumulo 1.6.5 is a maintenance release 
on the 1.6 version branch. This
+release contains changes from 55 issues, comprised of bug-fixes, performance
+improvements, build quality improvements, and more. See &lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO/fixforversion/12333674&quot;&gt;JIRA&lt;/a&gt;
 for a
+complete list.&lt;/p&gt;
 
-&lt;p&gt;Table structure is a common area of discussion between all types of 
Accumulo users. In the relational database realm, there was often a 
straightforward way that most users could agree upon that would be ideal to 
store and query some dataset. Data was identified by its schema, some fixed set 
of columns where each value within that column had some given characteristic. 
One of the big pushes behind the “NoSQL” movement was a growing pain in 
representing evolving data within a static schema. Applications like Accumulo 
removed that notion for a more flexible layout where the columns vary per row, 
but this flexibility often sparks debates about how data is “best” stored 
that often end without a clear-cut winner.&lt;/p&gt;
+&lt;p&gt;Users of any previous 1.6.x release are strongly encouraged to update 
as soon as
+possible to benefit from the improvements with very little concern in change of
+underlying functionality. Users of 1.4 or 1.5 that are seeking to upgrade to 
1.6
+should consider 1.6.5 as a starting point.&lt;/p&gt;
 
-&lt;p&gt;In general, I’ve found that, with new users to Accumulo, it’s 
difficult to move beyond the basic concept of GETs and PUTs of some value for a 
key. Rightfully so, it’s analogous to a spreadsheet: get or update the cell 
in the given row and column. However, there’s a big difference in that the 
spreadsheet is running on your local desktop, instead of running across many 
machines. In the same way, while a local spreadsheet application has some 
similar functionality to Accumulo, it doesn’t really make sense to think 
about using Accumulo as you would a spreadsheet application. Personally, I’ve 
developed a functional-programming-inspired model which I tend to follow when 
implementing applications against Accumulo. The model encourages simple, 
efficient and easily testable code, mainly as a product of modeling the client 
interactions against Accumulo’s APIs.&lt;/p&gt;
+&lt;h2 id=&quot;outstanding-known-issues&quot;&gt;Outstanding Known 
Issues&lt;/h2&gt;
 
-&lt;h3 id=&quot;read-apis&quot;&gt;Read APIs&lt;/h3&gt;
+&lt;p&gt;Be aware that a small documentation bug exists with the compact 
command in the
+shell (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4138&quot;&gt;ACCUMULO-4138&lt;/a&gt;).
 The documentation for the begin row and
+end row should be described as exclusive and inclusive, respectively, rather
+than the incorrect description of both being inclusive.&lt;/p&gt;
 
-&lt;p&gt;Accumulo has two main classes for reading data from an Accumulo 
table: the Scanner and BatchScanner. Both accept Range(s) which limit the data 
read from the table based on a start and stop Key. Only data from the table 
that falls within those start and stop keys will be returned to the client. The 
reason that we have two “types” of classes to read data is that a Scanner 
will return data from a single Range in sorted order whereas the BatchScanner 
accepts multiple Ranges and returns the data unordered. In terms of Java 
language specifics, both the Scanner and BatchScanner are also Iterables, which 
return a Java Iterator that can be easily passed to some other function, 
transformation or for-loop.&lt;/p&gt;
+&lt;h2 id=&quot;highlights&quot;&gt;Highlights&lt;/h2&gt;
 
-&lt;p&gt;Having both a sorted, synchronous stream and an unsorted stream of 
Key-Value pairs from many servers in parallel allows for a variety of 
algorithms to be implemented against Accumulo. Both constructs allow for the 
transparency in where the data came from and encourage light-weight processing 
of those results on the client.&lt;/p&gt;
+&lt;h3 id=&quot;queued-compactions-not-running&quot;&gt;Queued Compactions Not 
Running&lt;/h3&gt;
 
-&lt;h3 id=&quot;accumulo-iterators&quot;&gt;Accumulo Iterators&lt;/h3&gt;
+&lt;p&gt;Found and fixed a bug (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4016&quot;&gt;ACCUMULO-4016&lt;/a&gt;)
 in which some queued
+compactions would never run if the number of files changed while the tablet was
+queued.&lt;/p&gt;
 
-&lt;p&gt;One notable feature of Accumulo is the SortedKeyValueIterator 
interface, or, more succinctly, Accumulo Iterators. Typically, these iterators 
run inside of the TabletServer process and perform much of the heavy lifting. 
Iterators are used to implement a breadth of internal features such as merged 
file reads, visibility label filtering, versioning, and more. However, users 
also have the ability to leverage this server-side processing mechanism to 
deploy their own custom code.&lt;/p&gt;
+&lt;h3 id=&quot;faster-processing-of-conditional-mutations&quot;&gt;Faster 
Processing of Conditional Mutations&lt;/h3&gt;
 
-&lt;p&gt;One interesting detail about these iterators is that they each have 
an implicit source which provides them data to operate on. This source is also 
a SortedKeyValueIterator which means that the “local” 
SortedKeyValueIterator can use its own API on its data source. With this 
implicit hierarchy, Iterators act in concert with each other in some fixed 
order - they are stackable. The order in which Iterators are constructed, 
controlled by an Iterator’s priority, determines the order of the stack. An 
Iterator uses its “source” Iterator to read data, performs some operation, 
and then passes it on (the next element could be a client or another Iterator). 
The design behind iterators deserves its own blog post; however, the concept to 
see here is that iterators are best designed as stateless as possible 
(transformations, filters, or aggregations that always net the same results 
given the same input).&lt;/p&gt;
+&lt;p&gt;Improved ConditionalMutation processing time by a factor of 3.
+(&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4066&quot;&gt;ACCUMULO-4066&lt;/a&gt;)&lt;/p&gt;
 
-&lt;h3 id=&quot;functional-influences&quot;&gt;Functional Influences&lt;/h3&gt;
+&lt;h3 id=&quot;slow-gc-while-bulk-importing&quot;&gt;Slow GC While Bulk 
Importing&lt;/h3&gt;
 
-&lt;p&gt;In practice, these two concepts mesh very well with each other. Data 
read from a table can be thought of as a “stream” which came from some 
number of operations on the server. For a Scanner, this stream of data is 
backed by one tablet at a time to preserve sorted-order of the table. In the 
case of the BatchScanner, this is happening in parallel across many tablets 
from many tabletservers, with the client receiving data from many distinct 
hosts at one time. Likewise, the Scanner and BatchScanner APIs also encourage 
stateless processing of this data by presenting the data as a Java Iterator. 
Exposing explicit batches of Key-Value pairs would encourage blocking 
processing of each batch, which would be counter-intuitive to the server-side 
processing model. It creates a more seamless implementation paradigm on both 
the client and the server.&lt;/p&gt;
+&lt;p&gt;Found and worked around an issue where lots of bulk imports creating 
many new
+files would significantly impair the Accumulo GC service, and possibly prevent
+it from running to completion entirely. (&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4021&quot;&gt;ACCUMULO-4021&lt;/a&gt;)&lt;/p&gt;
 
-&lt;p&gt;When we take a step back from Object-Oriented Java and start to think 
about applications in a Functional mindset, it becomes clear how these APIs 
encourage functional-esque code. We are less concerned about mutability and 
encapsulation, and more concerned about stateless operations over some 
immutable data. Modeling our client code like this helps encourage parallelism, 
as applying it in a multi-threaded environment is much simpler.&lt;/p&gt;
+&lt;h3 
id=&quot;improvements-in-locating-client-configuration-file&quot;&gt;Improvements
 in Locating Client Configuration File&lt;/h3&gt;
 
-&lt;h3 id=&quot;practical-application&quot;&gt;Practical Application&lt;/h3&gt;
+&lt;p&gt;Fixed some unexpected error messages related to setting
+ACCUMULO_CLIENT_CONF_PATH, and improved the detection of the client.conf file 
if
+ACCUMULO_CLIENT_CONF_PATH was set to a directory containing client.conf.
+(&lt;a 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4026&quot;&gt;ACCUMULO-4026&lt;/a&gt;,&lt;a
 
href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-4027&quot;&gt;ACCUMULO-4027&lt;/a&gt;)&lt;/p&gt;
 
-&lt;p&gt;I started out talking about schemas and table layouts which might 
seem a bit unrelated to this discussion on the functional influences in the 
Accumulo API. Any decisions made on a table structure must take query 
requirements with respect to the underlying data into account. As a practical 
application of what might otherwise seem like pontification, let’s consider a 
hypothetical system that processes clickstream data using Accumulo.&lt;/p&gt;
+&lt;h3 
id=&quot;transient-zookeeper-disconnect-causes-fate-threads-to-exit&quot;&gt;Transient
 ZooKeeper disconnect causes FATE threads to exit&lt;/h3&gt;
 
-&lt;p&gt;Clickstream data refers to logging users who visit a website, 
typically for the purpose of understanding usage patterns. If a website is 
thought of as a directed graph, where an anchor on one page which links to 
another page is an edge in that graph, a user’s actions on that website can 
be thought of as a “walk” over that graph. In managing a website, it’s 
typically very useful to understand usage patterns of your site: what page is 
most common? which links are most commonly clicked? what changes to a page make 
users act differently?&lt;/p&gt;
+&lt;p&gt;ZooKeeper clients are expected to handle the situation where they 
become
+disconnected from the ZooKeeper server and must wait to be reconnected
+before continuing ZooKeeper operations.&lt;/p&gt;
 
-&lt;p&gt;Now, let’s abstractly consider that we store this clickstream data 
in Accumulo. Let’s not go into specifics, but say we retain the typical 
row-with-columns idea: each row represents some user visiting a page on your 
website using a globally unique identifier. Each column would contain some 
information about that visit: the user who is visiting the website, the page 
they’re visiting, the page they came from, the web-browser user-agent string, 
etc. Say you’re the owner of this website, and you recently made a 
modification to your website which added a prominent link to some new content on 
the front-page. You want to know how many people are visiting your new content


<TRUNCATED>
