http://git-wip-us.apache.org/repos/asf/hbase-site/blob/90048f99/book.html
----------------------------------------------------------------------
diff --git a/book.html b/book.html
index 9d47b7b..f8359fc 100644
--- a/book.html
+++ b/book.html
@@ -55,287 +55,288 @@
 <li><a href="#hbase.shell.noninteractive">16. HBase Shell in OS Scripts</a></li>
 <li><a href="#_read_hbase_shell_commands_from_a_command_file">17. Read HBase Shell Commands from a Command File</a></li>
 <li><a href="#_passing_vm_options_to_the_shell">18. Passing VM Options to the Shell</a></li>
-<li><a href="#_shell_tricks">19. Shell Tricks</a></li>
+<li><a href="#_overriding_configuration_starting_the_hbase_shell">19. Overriding configuration starting the HBase Shell</a></li>
+<li><a href="#_shell_tricks">20. Shell Tricks</a></li>
 </ul>
 </li>
 <li><a href="#datamodel">Data Model</a>
 <ul class="sectlevel1">
-<li><a href="#conceptual.view">20. Conceptual View</a></li>
-<li><a href="#physical.view">21. Physical View</a></li>
-<li><a href="#_namespace">22. Namespace</a></li>
-<li><a href="#_table">23. Table</a></li>
-<li><a href="#_row">24. Row</a></li>
-<li><a href="#columnfamily">25. Column Family</a></li>
-<li><a href="#_cells">26. Cells</a></li>
-<li><a href="#_data_model_operations">27. Data Model Operations</a></li>
-<li><a href="#versions">28. Versions</a></li>
-<li><a href="#dm.sort">29. Sort Order</a></li>
-<li><a href="#dm.column.metadata">30. Column Metadata</a></li>
-<li><a href="#joins">31. Joins</a></li>
-<li><a href="#_acid">32. ACID</a></li>
+<li><a href="#conceptual.view">21. Conceptual View</a></li>
+<li><a href="#physical.view">22. Physical View</a></li>
+<li><a href="#_namespace">23. Namespace</a></li>
+<li><a href="#_table">24. Table</a></li>
+<li><a href="#_row">25. Row</a></li>
+<li><a href="#columnfamily">26. Column Family</a></li>
+<li><a href="#_cells">27. Cells</a></li>
+<li><a href="#_data_model_operations">28. Data Model Operations</a></li>
+<li><a href="#versions">29. Versions</a></li>
+<li><a href="#dm.sort">30. Sort Order</a></li>
+<li><a href="#dm.column.metadata">31. Column Metadata</a></li>
+<li><a href="#joins">32. Joins</a></li>
+<li><a href="#_acid">33. ACID</a></li>
 </ul>
 </li>
 <li><a href="#schema">HBase and Schema Design</a>
 <ul class="sectlevel1">
-<li><a href="#schema.creation">33. Schema Creation</a></li>
-<li><a href="#table_schema_rules_of_thumb">34. Table Schema Rules Of Thumb</a></li>
+<li><a href="#schema.creation">34. Schema Creation</a></li>
+<li><a href="#table_schema_rules_of_thumb">35. Table Schema Rules Of Thumb</a></li>
 </ul>
 </li>
 <li><a href="#regionserver_sizing_rules_of_thumb">RegionServer Sizing Rules of Thumb</a>
 <ul class="sectlevel1">
-<li><a href="#number.of.cfs">35. On the number of column families</a></li>
-<li><a href="#rowkey.design">36. Rowkey Design</a></li>
-<li><a href="#schema.versions">37. Number of Versions</a></li>
-<li><a href="#supported.datatypes">38. Supported Datatypes</a></li>
-<li><a href="#schema.joins">39. Joins</a></li>
-<li><a href="#ttl">40. Time To Live (TTL)</a></li>
-<li><a href="#cf.keep.deleted">41. Keeping Deleted Cells</a></li>
-<li><a href="#secondary.indexes">42. Secondary Indexes and Alternate Query Paths</a></li>
-<li><a href="#_constraints">43. Constraints</a></li>
-<li><a href="#schema.casestudies">44. Schema Design Case Studies</a></li>
-<li><a href="#schema.ops">45. Operational and Performance Configuration Options</a></li>
-<li><a href="#_special_cases">46. Special Cases</a></li>
+<li><a href="#number.of.cfs">36. On the number of column families</a></li>
+<li><a href="#rowkey.design">37. Rowkey Design</a></li>
+<li><a href="#schema.versions">38. Number of Versions</a></li>
+<li><a href="#supported.datatypes">39. Supported Datatypes</a></li>
+<li><a href="#schema.joins">40. Joins</a></li>
+<li><a href="#ttl">41. Time To Live (TTL)</a></li>
+<li><a href="#cf.keep.deleted">42. Keeping Deleted Cells</a></li>
+<li><a href="#secondary.indexes">43. Secondary Indexes and Alternate Query Paths</a></li>
+<li><a href="#_constraints">44. Constraints</a></li>
+<li><a href="#schema.casestudies">45. Schema Design Case Studies</a></li>
+<li><a href="#schema.ops">46. Operational and Performance Configuration Options</a></li>
+<li><a href="#_special_cases">47. Special Cases</a></li>
 </ul>
 </li>
 <li><a href="#mapreduce">HBase and MapReduce</a>
 <ul class="sectlevel1">
-<li><a href="#hbase.mapreduce.classpath">47. HBase, MapReduce, and the CLASSPATH</a></li>
-<li><a href="#_mapreduce_scan_caching">48. MapReduce Scan Caching</a></li>
-<li><a href="#_bundled_hbase_mapreduce_jobs">49. Bundled HBase MapReduce Jobs</a></li>
-<li><a href="#_hbase_as_a_mapreduce_job_data_source_and_data_sink">50. HBase as a MapReduce Job Data Source and Data Sink</a></li>
-<li><a href="#_writing_hfiles_directly_during_bulk_import">51. Writing HFiles Directly During Bulk Import</a></li>
-<li><a href="#_rowcounter_example">52. RowCounter Example</a></li>
-<li><a href="#splitter">53. Map-Task Splitting</a></li>
-<li><a href="#mapreduce.example">54. HBase MapReduce Examples</a></li>
-<li><a href="#mapreduce.htable.access">55. Accessing Other HBase Tables in a MapReduce Job</a></li>
-<li><a href="#mapreduce.specex">56. Speculative Execution</a></li>
-<li><a href="#cascading">57. Cascading</a></li>
+<li><a href="#hbase.mapreduce.classpath">48. HBase, MapReduce, and the CLASSPATH</a></li>
+<li><a href="#_mapreduce_scan_caching">49. MapReduce Scan Caching</a></li>
+<li><a href="#_bundled_hbase_mapreduce_jobs">50. Bundled HBase MapReduce Jobs</a></li>
+<li><a href="#_hbase_as_a_mapreduce_job_data_source_and_data_sink">51. HBase as a MapReduce Job Data Source and Data Sink</a></li>
+<li><a href="#_writing_hfiles_directly_during_bulk_import">52. Writing HFiles Directly During Bulk Import</a></li>
+<li><a href="#_rowcounter_example">53. RowCounter Example</a></li>
+<li><a href="#splitter">54. Map-Task Splitting</a></li>
+<li><a href="#mapreduce.example">55. HBase MapReduce Examples</a></li>
+<li><a href="#mapreduce.htable.access">56. Accessing Other HBase Tables in a MapReduce Job</a></li>
+<li><a href="#mapreduce.specex">57. Speculative Execution</a></li>
+<li><a href="#cascading">58. Cascading</a></li>
 </ul>
 </li>
 <li><a href="#security">Securing Apache HBase</a>
 <ul class="sectlevel1">
-<li><a href="#_using_secure_http_https_for_the_web_ui">58. Using Secure HTTP (HTTPS) for the Web UI</a></li>
-<li><a href="#hbase.secure.spnego.ui">59. Using SPNEGO for Kerberos authentication with Web UIs</a></li>
-<li><a href="#hbase.secure.configuration">60. Secure Client Access to Apache HBase</a></li>
-<li><a href="#hbase.secure.simpleconfiguration">61. Simple User Access to Apache HBase</a></li>
-<li><a href="#_securing_access_to_hdfs_and_zookeeper">62. Securing Access to HDFS and ZooKeeper</a></li>
-<li><a href="#_securing_access_to_your_data">63. Securing Access To Your Data</a></li>
-<li><a href="#security.example.config">64. Security Configuration Example</a></li>
+<li><a href="#_using_secure_http_https_for_the_web_ui">59. Using Secure HTTP (HTTPS) for the Web UI</a></li>
+<li><a href="#hbase.secure.spnego.ui">60. Using SPNEGO for Kerberos authentication with Web UIs</a></li>
+<li><a href="#hbase.secure.configuration">61. Secure Client Access to Apache HBase</a></li>
+<li><a href="#hbase.secure.simpleconfiguration">62. Simple User Access to Apache HBase</a></li>
+<li><a href="#_securing_access_to_hdfs_and_zookeeper">63. Securing Access to HDFS and ZooKeeper</a></li>
+<li><a href="#_securing_access_to_your_data">64. Securing Access To Your Data</a></li>
+<li><a href="#security.example.config">65. Security Configuration Example</a></li>
 </ul>
 </li>
 <li><a href="#_architecture">Architecture</a>
 <ul class="sectlevel1">
-<li><a href="#arch.overview">65. Overview</a></li>
-<li><a href="#arch.catalog">66. Catalog Tables</a></li>
-<li><a href="#architecture.client">67. Client</a></li>
-<li><a href="#client.filter">68. Client Request Filters</a></li>
-<li><a href="#architecture.master">69. Master</a></li>
-<li><a href="#regionserver.arch">70. RegionServer</a></li>
-<li><a href="#regions.arch">71. Regions</a></li>
-<li><a href="#arch.bulk.load">72. Bulk Loading</a></li>
-<li><a href="#arch.hdfs">73. HDFS</a></li>
-<li><a href="#arch.timelineconsistent.reads">74. Timeline-consistent High Available Reads</a></li>
-<li><a href="#hbase_mob">75. Storing Medium-sized Objects (MOB)</a></li>
+<li><a href="#arch.overview">66. Overview</a></li>
+<li><a href="#arch.catalog">67. Catalog Tables</a></li>
+<li><a href="#architecture.client">68. Client</a></li>
+<li><a href="#client.filter">69. Client Request Filters</a></li>
+<li><a href="#architecture.master">70. Master</a></li>
+<li><a href="#regionserver.arch">71. RegionServer</a></li>
+<li><a href="#regions.arch">72. Regions</a></li>
+<li><a href="#arch.bulk.load">73. Bulk Loading</a></li>
+<li><a href="#arch.hdfs">74. HDFS</a></li>
+<li><a href="#arch.timelineconsistent.reads">75. Timeline-consistent High Available Reads</a></li>
+<li><a href="#hbase_mob">76. Storing Medium-sized Objects (MOB)</a></li>
 </ul>
 </li>
 <li><a href="#inmemory_compaction">In-memory Compaction</a>
 <ul class="sectlevel1">
-<li><a href="#imc.overview">76. Overview</a></li>
-<li><a href="#_enabling">77. Enabling</a></li>
+<li><a href="#imc.overview">77. Overview</a></li>
+<li><a href="#_enabling">78. Enabling</a></li>
 </ul>
 </li>
 <li><a href="#backuprestore">Backup and Restore</a>
 <ul class="sectlevel1">
-<li><a href="#br.overview">78. Overview</a></li>
-<li><a href="#br.terminology">79. Terminology</a></li>
-<li><a href="#br.planning">80. Planning</a></li>
-<li><a href="#br.initial.setup">81. First-time configuration steps</a></li>
-<li><a href="#_backup_and_restore_commands">82. Backup and Restore commands</a></li>
-<li><a href="#br.administration">83. Administration of Backup Images</a></li>
-<li><a href="#br.backup.configuration">84. Configuration keys</a></li>
-<li><a href="#br.best.practices">85. Best Practices</a></li>
-<li><a href="#br.s3.backup.scenario">86. Scenario: Safeguarding Application Datasets on Amazon S3</a></li>
-<li><a href="#br.data.security">87. Security of Backup Data</a></li>
-<li><a href="#br.technical.details">88. Technical Details of Incremental Backup and Restore</a></li>
-<li><a href="#br.filesystem.growth.warning">89. A Warning on File System Growth</a></li>
-<li><a href="#br.backup.capacity.planning">90. Capacity Planning</a></li>
-<li><a href="#br.limitations">91. Limitations of the Backup and Restore Utility</a></li>
+<li><a href="#br.overview">79. Overview</a></li>
+<li><a href="#br.terminology">80. Terminology</a></li>
+<li><a href="#br.planning">81. Planning</a></li>
+<li><a href="#br.initial.setup">82. First-time configuration steps</a></li>
+<li><a href="#_backup_and_restore_commands">83. Backup and Restore commands</a></li>
+<li><a href="#br.administration">84. Administration of Backup Images</a></li>
+<li><a href="#br.backup.configuration">85. Configuration keys</a></li>
+<li><a href="#br.best.practices">86. Best Practices</a></li>
+<li><a href="#br.s3.backup.scenario">87. Scenario: Safeguarding Application Datasets on Amazon S3</a></li>
+<li><a href="#br.data.security">88. Security of Backup Data</a></li>
+<li><a href="#br.technical.details">89. Technical Details of Incremental Backup and Restore</a></li>
+<li><a href="#br.filesystem.growth.warning">90. A Warning on File System Growth</a></li>
+<li><a href="#br.backup.capacity.planning">91. Capacity Planning</a></li>
+<li><a href="#br.limitations">92. Limitations of the Backup and Restore Utility</a></li>
 </ul>
 </li>
 <li><a href="#syncreplication">Synchronous Replication</a>
 <ul class="sectlevel1">
-<li><a href="#_background">92. Background</a></li>
-<li><a href="#_design">93. Design</a></li>
-<li><a href="#_operation_and_maintenance">94. Operation and maintenance</a></li>
+<li><a href="#_background">93. Background</a></li>
+<li><a href="#_design">94. Design</a></li>
+<li><a href="#_operation_and_maintenance">95. Operation and maintenance</a></li>
 </ul>
 </li>
 <li><a href="#hbase_apis">Apache HBase APIs</a>
 <ul class="sectlevel1">
-<li><a href="#_examples">95. Examples</a></li>
+<li><a href="#_examples">96. Examples</a></li>
 </ul>
 </li>
 <li><a href="#external_apis">Apache HBase External APIs</a>
 <ul class="sectlevel1">
-<li><a href="#_rest">96. REST</a></li>
-<li><a href="#_thrift">97. Thrift</a></li>
-<li><a href="#c">98. C/C++ Apache HBase Client</a></li>
-<li><a href="#jdo">99. Using Java Data Objects (JDO) with HBase</a></li>
-<li><a href="#scala">100. Scala</a></li>
-<li><a href="#jython">101. Jython</a></li>
+<li><a href="#_rest">97. REST</a></li>
+<li><a href="#_thrift">98. Thrift</a></li>
+<li><a href="#c">99. C/C++ Apache HBase Client</a></li>
+<li><a href="#jdo">100. Using Java Data Objects (JDO) with HBase</a></li>
+<li><a href="#scala">101. Scala</a></li>
+<li><a href="#jython">102. Jython</a></li>
 </ul>
 </li>
 <li><a href="#thrift">Thrift API and Filter Language</a>
 <ul class="sectlevel1">
-<li><a href="#thrift.filter_language">102. Filter Language</a></li>
+<li><a href="#thrift.filter_language">103. Filter Language</a></li>
 </ul>
 </li>
 <li><a href="#spark">HBase and Spark</a>
 <ul class="sectlevel1">
-<li><a href="#_basic_spark">103. Basic Spark</a></li>
-<li><a href="#_spark_streaming">104. Spark Streaming</a></li>
-<li><a href="#_bulk_load">105. Bulk Load</a></li>
-<li><a href="#_sparksql_dataframes">106. SparkSQL/DataFrames</a></li>
+<li><a href="#_basic_spark">104. Basic Spark</a></li>
+<li><a href="#_spark_streaming">105. Spark Streaming</a></li>
+<li><a href="#_bulk_load">106. Bulk Load</a></li>
+<li><a href="#_sparksql_dataframes">107. SparkSQL/DataFrames</a></li>
 </ul>
 </li>
 <li><a href="#cp">Apache HBase Coprocessors</a>
 <ul class="sectlevel1">
-<li><a href="#_coprocessor_overview">107. Coprocessor Overview</a></li>
-<li><a href="#_types_of_coprocessors">108. Types of Coprocessors</a></li>
-<li><a href="#cp_loading">109. Loading Coprocessors</a></li>
-<li><a href="#cp_example">110. Examples</a></li>
-<li><a href="#_guidelines_for_deploying_a_coprocessor">111. Guidelines For Deploying A Coprocessor</a></li>
-<li><a href="#_restricting_coprocessor_usage">112. Restricting Coprocessor Usage</a></li>
+<li><a href="#_coprocessor_overview">108. Coprocessor Overview</a></li>
+<li><a href="#_types_of_coprocessors">109. Types of Coprocessors</a></li>
+<li><a href="#cp_loading">110. Loading Coprocessors</a></li>
+<li><a href="#cp_example">111. Examples</a></li>
+<li><a href="#_guidelines_for_deploying_a_coprocessor">112. Guidelines For Deploying A Coprocessor</a></li>
+<li><a href="#_restricting_coprocessor_usage">113. Restricting Coprocessor Usage</a></li>
 </ul>
 </li>
 <li><a href="#performance">Apache HBase Performance Tuning</a>
 <ul class="sectlevel1">
-<li><a href="#perf.os">113. Operating System</a></li>
-<li><a href="#perf.network">114. Network</a></li>
-<li><a href="#jvm">115. Java</a></li>
-<li><a href="#perf.configurations">116. HBase Configurations</a></li>
-<li><a href="#perf.zookeeper">117. ZooKeeper</a></li>
-<li><a href="#perf.schema">118. Schema Design</a></li>
-<li><a href="#perf.general">119. HBase General Patterns</a></li>
-<li><a href="#perf.writing">120. Writing to HBase</a></li>
-<li><a href="#perf.reading">121. Reading from HBase</a></li>
-<li><a href="#perf.deleting">122. Deleting from HBase</a></li>
-<li><a href="#perf.hdfs">123. HDFS</a></li>
-<li><a href="#perf.ec2">124. Amazon EC2</a></li>
-<li><a href="#perf.hbase.mr.cluster">125. Collocating HBase and MapReduce</a></li>
-<li><a href="#perf.casestudy">126. Case Studies</a></li>
+<li><a href="#perf.os">114. Operating System</a></li>
+<li><a href="#perf.network">115. Network</a></li>
+<li><a href="#jvm">116. Java</a></li>
+<li><a href="#perf.configurations">117. HBase Configurations</a></li>
+<li><a href="#perf.zookeeper">118. ZooKeeper</a></li>
+<li><a href="#perf.schema">119. Schema Design</a></li>
+<li><a href="#perf.general">120. HBase General Patterns</a></li>
+<li><a href="#perf.writing">121. Writing to HBase</a></li>
+<li><a href="#perf.reading">122. Reading from HBase</a></li>
+<li><a href="#perf.deleting">123. Deleting from HBase</a></li>
+<li><a href="#perf.hdfs">124. HDFS</a></li>
+<li><a href="#perf.ec2">125. Amazon EC2</a></li>
+<li><a href="#perf.hbase.mr.cluster">126. Collocating HBase and MapReduce</a></li>
+<li><a href="#perf.casestudy">127. Case Studies</a></li>
 </ul>
 </li>
 <li><a href="#trouble">Troubleshooting and Debugging Apache HBase</a>
 <ul class="sectlevel1">
-<li><a href="#trouble.general">127. General Guidelines</a></li>
-<li><a href="#trouble.log">128. Logs</a></li>
-<li><a href="#trouble.resources">129. Resources</a></li>
-<li><a href="#trouble.tools">130. Tools</a></li>
-<li><a href="#trouble.client">131. Client</a></li>
-<li><a href="#trouble.mapreduce">132. MapReduce</a></li>
-<li><a href="#trouble.namenode">133. NameNode</a></li>
-<li><a href="#trouble.network">134. Network</a></li>
-<li><a href="#trouble.rs">135. RegionServer</a></li>
-<li><a href="#trouble.master">136. Master</a></li>
-<li><a href="#trouble.zookeeper">137. ZooKeeper</a></li>
-<li><a href="#trouble.ec2">138. Amazon EC2</a></li>
-<li><a href="#trouble.versions">139. HBase and Hadoop version issues</a></li>
-<li><a href="#_hbase_and_hdfs">140. HBase and HDFS</a></li>
-<li><a href="#trouble.tests">141. Running unit or integration tests</a></li>
-<li><a href="#trouble.casestudy">142. Case Studies</a></li>
-<li><a href="#trouble.crypto">143. Cryptographic Features</a></li>
-<li><a href="#_operating_system_specific_issues">144. Operating System Specific Issues</a></li>
-<li><a href="#_jdk_issues">145. JDK Issues</a></li>
+<li><a href="#trouble.general">128. General Guidelines</a></li>
+<li><a href="#trouble.log">129. Logs</a></li>
+<li><a href="#trouble.resources">130. Resources</a></li>
+<li><a href="#trouble.tools">131. Tools</a></li>
+<li><a href="#trouble.client">132. Client</a></li>
+<li><a href="#trouble.mapreduce">133. MapReduce</a></li>
+<li><a href="#trouble.namenode">134. NameNode</a></li>
+<li><a href="#trouble.network">135. Network</a></li>
+<li><a href="#trouble.rs">136. RegionServer</a></li>
+<li><a href="#trouble.master">137. Master</a></li>
+<li><a href="#trouble.zookeeper">138. ZooKeeper</a></li>
+<li><a href="#trouble.ec2">139. Amazon EC2</a></li>
+<li><a href="#trouble.versions">140. HBase and Hadoop version issues</a></li>
+<li><a href="#_hbase_and_hdfs">141. HBase and HDFS</a></li>
+<li><a href="#trouble.tests">142. Running unit or integration tests</a></li>
+<li><a href="#trouble.casestudy">143. Case Studies</a></li>
+<li><a href="#trouble.crypto">144. Cryptographic Features</a></li>
+<li><a href="#_operating_system_specific_issues">145. Operating System Specific Issues</a></li>
+<li><a href="#_jdk_issues">146. JDK Issues</a></li>
 </ul>
 </li>
 <li><a href="#casestudies">Apache HBase Case Studies</a>
 <ul class="sectlevel1">
-<li><a href="#casestudies.overview">146. Overview</a></li>
-<li><a href="#casestudies.schema">147. Schema Design</a></li>
-<li><a href="#casestudies.perftroub">148. Performance/Troubleshooting</a></li>
+<li><a href="#casestudies.overview">147. Overview</a></li>
+<li><a href="#casestudies.schema">148. Schema Design</a></li>
+<li><a href="#casestudies.perftroub">149. Performance/Troubleshooting</a></li>
 </ul>
 </li>
 <li><a href="#ops_mgt">Apache HBase Operational Management</a>
 <ul class="sectlevel1">
-<li><a href="#tools">149. HBase Tools and Utilities</a></li>
-<li><a href="#ops.regionmgt">150. Region Management</a></li>
-<li><a href="#node.management">151. Node Management</a></li>
-<li><a href="#hbase_metrics">152. HBase Metrics</a></li>
-<li><a href="#ops.monitoring">153. HBase Monitoring</a></li>
-<li><a href="#_cluster_replication">154. Cluster Replication</a></li>
-<li><a href="#_running_multiple_workloads_on_a_single_cluster">155. Running Multiple Workloads On a Single Cluster</a></li>
-<li><a href="#ops.backup">156. HBase Backup</a></li>
-<li><a href="#ops.snapshots">157. HBase Snapshots</a></li>
-<li><a href="#snapshots_azure">158. Storing Snapshots in Microsoft Azure Blob Storage</a></li>
-<li><a href="#ops.capacity">159. Capacity Planning and Region Sizing</a></li>
-<li><a href="#table.rename">160. Table Rename</a></li>
-<li><a href="#rsgroup">161. RegionServer Grouping</a></li>
-<li><a href="#normalizer">162. Region Normalizer</a></li>
+<li><a href="#tools">150. HBase Tools and Utilities</a></li>
+<li><a href="#ops.regionmgt">151. Region Management</a></li>
+<li><a href="#node.management">152. Node Management</a></li>
+<li><a href="#hbase_metrics">153. HBase Metrics</a></li>
+<li><a href="#ops.monitoring">154. HBase Monitoring</a></li>
+<li><a href="#_cluster_replication">155. Cluster Replication</a></li>
+<li><a href="#_running_multiple_workloads_on_a_single_cluster">156. Running Multiple Workloads On a Single Cluster</a></li>
+<li><a href="#ops.backup">157. HBase Backup</a></li>
+<li><a href="#ops.snapshots">158. HBase Snapshots</a></li>
+<li><a href="#snapshots_azure">159. Storing Snapshots in Microsoft Azure Blob Storage</a></li>
+<li><a href="#ops.capacity">160. Capacity Planning and Region Sizing</a></li>
+<li><a href="#table.rename">161. Table Rename</a></li>
+<li><a href="#rsgroup">162. RegionServer Grouping</a></li>
+<li><a href="#normalizer">163. Region Normalizer</a></li>
 </ul>
 </li>
 <li><a href="#developer">Building and Developing Apache HBase</a>
 <ul class="sectlevel1">
-<li><a href="#getting.involved">163. Getting Involved</a></li>
-<li><a href="#repos">164. Apache HBase Repositories</a></li>
-<li><a href="#_ides">165. IDEs</a></li>
-<li><a href="#build">166. Building Apache HBase</a></li>
-<li><a href="#releasing">167. Releasing Apache HBase</a></li>
-<li><a href="#hbase.rc.voting">168. Voting on Release Candidates</a></li>
-<li><a href="#hbase.release.announcement">169. Announcing Releases</a></li>
-<li><a href="#documentation">170. Generating the HBase Reference Guide</a></li>
-<li><a href="#hbase.org">171. Updating <a href="https://hbase.apache.org">hbase.apache.org</a></a></li>
-<li><a href="#hbase.tests">172. Tests</a></li>
-<li><a href="#developing">173. Developer Guidelines</a></li>
+<li><a href="#getting.involved">164. Getting Involved</a></li>
+<li><a href="#repos">165. Apache HBase Repositories</a></li>
+<li><a href="#_ides">166. IDEs</a></li>
+<li><a href="#build">167. Building Apache HBase</a></li>
+<li><a href="#releasing">168. Releasing Apache HBase</a></li>
+<li><a href="#hbase.rc.voting">169. Voting on Release Candidates</a></li>
+<li><a href="#hbase.release.announcement">170. Announcing Releases</a></li>
+<li><a href="#documentation">171. Generating the HBase Reference Guide</a></li>
+<li><a href="#hbase.org">172. Updating <a href="https://hbase.apache.org">hbase.apache.org</a></a></li>
+<li><a href="#hbase.tests">173. Tests</a></li>
+<li><a href="#developing">174. Developer Guidelines</a></li>
 </ul>
 </li>
 <li><a href="#unit.tests">Unit Testing HBase Applications</a>
 <ul class="sectlevel1">
-<li><a href="#_junit">174. JUnit</a></li>
-<li><a href="#mockito">175. Mockito</a></li>
-<li><a href="#_mrunit">176. MRUnit</a></li>
-<li><a href="#_integration_testing_with_an_hbase_mini_cluster">177. Integration Testing with an HBase Mini-Cluster</a></li>
+<li><a href="#_junit">175. JUnit</a></li>
+<li><a href="#mockito">176. Mockito</a></li>
+<li><a href="#_mrunit">177. MRUnit</a></li>
+<li><a href="#_integration_testing_with_an_hbase_mini_cluster">178. Integration Testing with an HBase Mini-Cluster</a></li>
 </ul>
 </li>
 <li><a href="#protobuf">Protobuf in HBase</a>
 <ul class="sectlevel1">
-<li><a href="#_protobuf">178. Protobuf</a></li>
+<li><a href="#_protobuf">179. Protobuf</a></li>
 </ul>
 </li>
 <li><a href="#pv2">Procedure Framework (Pv2): <a href="https://issues.apache.org/jira/browse/HBASE-12439">HBASE-12439</a></a>
 <ul class="sectlevel1">
-<li><a href="#_procedures">179. Procedures</a></li>
-<li><a href="#_subprocedures">180. Subprocedures</a></li>
-<li><a href="#_procedureexecutor">181. ProcedureExecutor</a></li>
-<li><a href="#_nonces">182. Nonces</a></li>
-<li><a href="#_wait_wake_suspend_yield">183. Wait/Wake/Suspend/Yield</a></li>
-<li><a href="#_locking">184. Locking</a></li>
-<li><a href="#_procedure_types">185. Procedure Types</a></li>
-<li><a href="#_references">186. References</a></li>
+<li><a href="#_procedures">180. Procedures</a></li>
+<li><a href="#_subprocedures">181. Subprocedures</a></li>
+<li><a href="#_procedureexecutor">182. ProcedureExecutor</a></li>
+<li><a href="#_nonces">183. Nonces</a></li>
+<li><a href="#_wait_wake_suspend_yield">184. Wait/Wake/Suspend/Yield</a></li>
+<li><a href="#_locking">185. Locking</a></li>
+<li><a href="#_procedure_types">186. Procedure Types</a></li>
+<li><a href="#_references">187. References</a></li>
 </ul>
 </li>
 <li><a href="#amv2">AMv2 Description for Devs</a>
 <ul class="sectlevel1">
-<li><a href="#_background_2">187. Background</a></li>
-<li><a href="#_new_system">188. New System</a></li>
-<li><a href="#_procedures_detail">189. Procedures Detail</a></li>
-<li><a href="#_ui">190. UI</a></li>
-<li><a href="#_logging">191. Logging</a></li>
-<li><a href="#_implementation_notes">192. Implementation Notes</a></li>
-<li><a href="#_new_configs">193. New Configs</a></li>
-<li><a href="#_tools">194. Tools</a></li>
+<li><a href="#_background_2">188. Background</a></li>
+<li><a href="#_new_system">189. New System</a></li>
+<li><a href="#_procedures_detail">190. Procedures Detail</a></li>
+<li><a href="#_ui">191. UI</a></li>
+<li><a href="#_logging">192. Logging</a></li>
+<li><a href="#_implementation_notes">193. Implementation Notes</a></li>
+<li><a href="#_new_configs">194. New Configs</a></li>
+<li><a href="#_tools">195. Tools</a></li>
 </ul>
 </li>
 <li><a href="#zookeeper">ZooKeeper</a>
 <ul class="sectlevel1">
-<li><a href="#_using_existing_zookeeper_ensemble">195. Using existing ZooKeeper ensemble</a></li>
-<li><a href="#zk.sasl.auth">196. SASL Authentication with ZooKeeper</a></li>
+<li><a href="#_using_existing_zookeeper_ensemble">196. Using existing ZooKeeper ensemble</a></li>
+<li><a href="#zk.sasl.auth">197. SASL Authentication with ZooKeeper</a></li>
 </ul>
 </li>
 <li><a href="#community">Community</a>
 <ul class="sectlevel1">
-<li><a href="#_decisions">197. Decisions</a></li>
-<li><a href="#community.roles">198. Community Roles</a></li>
-<li><a href="#hbase.commit.msg.format">199. Commit Message format</a></li>
+<li><a href="#_decisions">198. Decisions</a></li>
+<li><a href="#community.roles">199. Community Roles</a></li>
+<li><a href="#hbase.commit.msg.format">200. Commit Message format</a></li>
 </ul>
 </li>
 <li><a href="#_appendix">Appendix</a>
@@ -352,11 +353,11 @@
 <li><a href="#asf">Appendix J: HBase and the Apache Software Foundation</a></li>
 <li><a href="#orca">Appendix K: Apache HBase Orca</a></li>
 <li><a href="#tracing">Appendix L: Enabling Dapper-like Tracing in HBase</a></li>
-<li><a href="#tracing.client.modifications">200. Client Modifications</a></li>
-<li><a href="#tracing.client.shell">201. Tracing from HBase Shell</a></li>
+<li><a href="#tracing.client.modifications">201. Client Modifications</a></li>
+<li><a href="#tracing.client.shell">202. Tracing from HBase Shell</a></li>
 <li><a href="#hbase.rpc">Appendix M: 0.95 RPC Specification</a></li>
 <li><a href="#_known_incompatibilities_among_hbase_versions">Appendix N: Known Incompatibilities Among HBase Versions</a></li>
-<li><a href="#_hbase_2_0_incompatible_changes">202. HBase 2.0 Incompatible Changes</a></li>
+<li><a href="#_hbase_2_0_incompatible_changes">203. HBase 2.0 Incompatible Changes</a></li>
 </ul>
 </li>
 </ul>
@@ -7702,10 +7703,30 @@ The command should be run all on a single line, but is broken by the <code>\</co
 </div>
 </div>
 <div class="sect1">
-<h2 id="_shell_tricks"><a class="anchor" href="#_shell_tricks"></a>19. Shell Tricks</h2>
+<h2 id="_overriding_configuration_starting_the_hbase_shell"><a class="anchor" href="#_overriding_configuration_starting_the_hbase_shell"></a>19. Overriding configuration starting the HBase Shell</h2>
+<div class="sectionbody">
+<div class="paragraph">
+<p>As of hbase-2.0.5/hbase-2.1.3/hbase-2.2.0/hbase-1.4.10/hbase-1.5.0, you can
+pass or override HBase configuration properties that would otherwise be set in <code>hbase-*.xml</code>
+by passing key/value pairs prefixed with <code>-D</code> on the command line, as follows:</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="bash">$ ./bin/hbase shell -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org -Draining=false
+...
+hbase(main):001:0&gt; @shell.hbase.configuration.get(&quot;hbase.zookeeper.quorum&quot;)
+=&gt; &quot;ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org&quot;
+hbase(main):002:0&gt; @shell.hbase.configuration.get(&quot;raining&quot;)
+=&gt; &quot;false&quot;</code></pre>
+</div>
+</div>
+</div>
+</div>
+<div class="sect1">
+<h2 id="_shell_tricks"><a class="anchor" href="#_shell_tricks"></a>20. Shell Tricks</h2>
 <div class="sectionbody">
 <div class="sect2">
-<h3 id="_table_variables"><a class="anchor" href="#_table_variables"></a>19.1. Table variables</h3>
+<h3 id="_table_variables"><a class="anchor" href="#_table_variables"></a>20.1. Table variables</h3>
 <div class="paragraph">
 <p>HBase 0.95 adds shell commands that provide JRuby-style object-oriented references for tables.
 Previously, all of the shell commands that act upon a table had a procedural style that always took the name of the table as an argument.
@@ -7816,7 +7837,7 @@ hbase(main):018:0&gt;</pre>
 </div>
 </div>
 <div class="sect2">
-<h3 id="irbrc"><a class="anchor" href="#irbrc"></a>19.2. <em>irbrc</em></h3>
+<h3 id="irbrc"><a class="anchor" href="#irbrc"></a>20.2. <em>irbrc</em></h3>
 <div class="paragraph">
 <p>Create an <em>.irbrc</em> file for yourself in your home directory.
 Add customizations.
@@ -7843,7 +7864,7 @@ IRB.conf[:HISTORY_FILE] = &quot;#{ENV['HOME']}/.irb-save-history&quot;</code></p
 </div>
 </div>
 <div class="sect2">
-<h3 id="_log_data_to_timestamp"><a class="anchor" href="#_log_data_to_timestamp"></a>19.3. LOG data to timestamp</h3>
+<h3 id="_log_data_to_timestamp"><a class="anchor" href="#_log_data_to_timestamp"></a>20.3. LOG data to timestamp</h3>
 <div class="paragraph">
 <p>To convert the date '08/08/16 20:56:29' from an hbase log into a timestamp, do:</p>
 </div>
@@ -7868,7 +7889,7 @@ hbase(main):022:0&gt; Date.new(1218920189000).toString() =&gt; "Sat Aug 16 20:56
 </div>
 </div>
 <div class="sect2">
-<h3 id="_query_shell_configuration"><a class="anchor" href="#_query_shell_configuration"></a>19.4. Query Shell Configuration</h3>
+<h3 id="_query_shell_configuration"><a class="anchor" href="#_query_shell_configuration"></a>20.4. Query Shell Configuration</h3>
 <div class="listingblock">
 <div class="content">
 <pre>hbase(main):001:0&gt; @shell.hbase.configuration.get("hbase.rpc.timeout")
@@ -7887,7 +7908,7 @@ hbase(main):006:0&gt; @shell.hbase.configuration.get("hbase.rpc.timeout")
 </div>
 </div>
 <div class="sect2">
-<h3 id="tricks.pre-split"><a class="anchor" href="#tricks.pre-split"></a>19.5. Pre-splitting tables with the HBase Shell</h3>
+<h3 id="tricks.pre-split"><a class="anchor" href="#tricks.pre-split"></a>20.5. Pre-splitting tables with the HBase Shell</h3>
 <div class="paragraph">
 <p>You can use a variety of options to pre-split tables when creating them via the HBase Shell <code>create</code> command.</p>
 </div>
@@ -7958,9 +7979,9 @@ If you need to truncate a pre-split table, you must drop and recreate the table
 </div>
 </div>
 <div class="sect2">
-<h3 id="_debug"><a class="anchor" href="#_debug"></a>19.6. Debug</h3>
+<h3 id="_debug"><a class="anchor" href="#_debug"></a>20.6. Debug</h3>
 <div class="sect3">
-<h4 id="_shell_debug_switch"><a class="anchor" 
href="#_shell_debug_switch"></a>19.6.1. Shell debug switch</h4>
+<h4 id="_shell_debug_switch"><a class="anchor" 
href="#_shell_debug_switch"></a>20.6.1. Shell debug switch</h4>
 <div class="paragraph">
 <p>You can set a debug switch in the shell to see more 
output&#8201;&#8212;&#8201;e.g.
 more of the stack trace on exception&#8201;&#8212;&#8201;when you run a 
command:</p>
@@ -7972,7 +7993,7 @@ more of the stack trace on 
exception&#8201;&#8212;&#8201;when you run a command:
 </div>
 </div>
 <div class="sect3">
-<h4 id="_debug_log_level"><a class="anchor" 
href="#_debug_log_level"></a>19.6.2. DEBUG log level</h4>
+<h4 id="_debug_log_level"><a class="anchor" 
href="#_debug_log_level"></a>20.6.2. DEBUG log level</h4>
 <div class="paragraph">
 <p>To enable DEBUG level logging in the shell, launch it with the 
<code>-d</code> option.</p>
 </div>
@@ -7984,9 +8005,9 @@ more of the stack trace on 
exception&#8201;&#8212;&#8201;when you run a command:
 </div>
 </div>
 <div class="sect2">
-<h3 id="_commands"><a class="anchor" href="#_commands"></a>19.7. Commands</h3>
+<h3 id="_commands"><a class="anchor" href="#_commands"></a>20.7. Commands</h3>
 <div class="sect3">
-<h4 id="_count"><a class="anchor" href="#_count"></a>19.7.1. count</h4>
+<h4 id="_count"><a class="anchor" href="#_count"></a>20.7.1. count</h4>
 <div class="paragraph">
 <p>Count command returns the number of rows in a table.
 It&#8217;s quite fast when configured with the right CACHE</p>
@@ -8059,7 +8080,7 @@ By default, the timestamp represents the time on the 
RegionServer when the data
 </div>
 </div>
 <div class="sect1">
-<h2 id="conceptual.view"><a class="anchor" href="#conceptual.view"></a>20. 
Conceptual View</h2>
+<h2 id="conceptual.view"><a class="anchor" href="#conceptual.view"></a>21. 
Conceptual View</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>You can read a very understandable explanation of the HBase data model in the blog post <a href="http://jimbojw.com/#understanding%20hbase">Understanding HBase and BigTable</a> by Jim R. Wilson.
@@ -8193,7 +8214,7 @@ This is only a mock-up for illustrative purposes and may 
not be strictly accurat
 </div>
 </div>
 <div class="sect1">
-<h2 id="physical.view"><a class="anchor" href="#physical.view"></a>21. 
Physical View</h2>
+<h2 id="physical.view"><a class="anchor" href="#physical.view"></a>22. 
Physical View</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>Although at a conceptual level tables may be viewed as a sparse set of 
rows, they are physically stored by column family.
@@ -8272,7 +8293,7 @@ Thus a request for the values of all columns in the row 
<code>com.cnn.www</code>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_namespace"><a class="anchor" href="#_namespace"></a>22. Namespace</h2>
+<h2 id="_namespace"><a class="anchor" href="#_namespace"></a>23. Namespace</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>A namespace is a logical grouping of tables analogous to a database in relational database systems.
@@ -8292,7 +8313,7 @@ This abstraction lays the groundwork for upcoming 
multi-tenancy related features
 </ul>
 </div>
 <div class="sect2">
-<h3 id="namespace_creation"><a class="anchor" 
href="#namespace_creation"></a>22.1. Namespace management</h3>
+<h3 id="namespace_creation"><a class="anchor" 
href="#namespace_creation"></a>23.1. Namespace management</h3>
 <div class="paragraph">
 <p>A namespace can be created, removed or altered.
 Namespace membership is determined during table creation by specifying a 
fully-qualified table name of the form:</p>
@@ -8333,7 +8354,7 @@ alter_namespace 'my_ns', {METHOD =&gt; 'set', 
'PROPERTY_NAME' =&gt; 'PROPERTY_VA
 </div>
 </div>
 <div class="sect2">
-<h3 id="namespace_special"><a class="anchor" 
href="#namespace_special"></a>22.2. Predefined namespaces</h3>
+<h3 id="namespace_special"><a class="anchor" 
href="#namespace_special"></a>23.2. Predefined namespaces</h3>
 <div class="paragraph">
 <p>There are two predefined special namespaces:</p>
 </div>
@@ -8365,7 +8386,7 @@ create 'bar', 'fam'</code></pre>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_table"><a class="anchor" href="#_table"></a>23. Table</h2>
+<h2 id="_table"><a class="anchor" href="#_table"></a>24. Table</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>Tables are declared up front at schema definition time.</p>
@@ -8373,7 +8394,7 @@ create 'bar', 'fam'</code></pre>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_row"><a class="anchor" href="#_row"></a>24. Row</h2>
+<h2 id="_row"><a class="anchor" href="#_row"></a>25. Row</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>Row keys are uninterpreted bytes.
@@ -8383,7 +8404,7 @@ The empty byte array is used to denote both the start and 
end of a tables' names
 </div>
 </div>
 <div class="sect1">
-<h2 id="columnfamily"><a class="anchor" href="#columnfamily"></a>25. Column 
Family</h2>
+<h2 id="columnfamily"><a class="anchor" href="#columnfamily"></a>26. Column 
Family</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>Columns in Apache HBase are grouped into <em>column families</em>.
@@ -8401,7 +8422,7 @@ Because tunings and storage specifications are done at 
the column family level,
 </div>
 </div>
 <div class="sect1">
-<h2 id="_cells"><a class="anchor" href="#_cells"></a>26. Cells</h2>
+<h2 id="_cells"><a class="anchor" href="#_cells"></a>27. Cells</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>A <em>{row, column, version}</em> tuple exactly specifies a 
<code>cell</code> in HBase.
@@ -8410,27 +8431,27 @@ Cell content is uninterpreted bytes</p>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_data_model_operations"><a class="anchor" 
href="#_data_model_operations"></a>27. Data Model Operations</h2>
+<h2 id="_data_model_operations"><a class="anchor" 
href="#_data_model_operations"></a>28. Data Model Operations</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>The four primary data model operations are Get, Put, Scan, and Delete.
 Operations are applied via <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html">Table</a> instances.</p>
 </div>
 <div class="sect2">
-<h3 id="_get"><a class="anchor" href="#_get"></a>27.1. Get</h3>
+<h3 id="_get"><a class="anchor" href="#_get"></a>28.1. Get</h3>
 <div class="paragraph">
 <p><a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</a> returns attributes for a specified row.
 Gets are executed via <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#get-org.apache.hadoop.hbase.client.Get-">Table.get</a>.</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="_put"><a class="anchor" href="#_put"></a>27.2. Put</h3>
+<h3 id="_put"><a class="anchor" href="#_put"></a>28.2. Put</h3>
 <div class="paragraph">
 <p><a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</a> either adds new rows to a table (if the key is new) or updates existing rows (if the key already exists). Puts are executed via <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put-org.apache.hadoop.hbase.client.Put-">Table.put</a> (non-writeBuffer) or <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch-java.util.List-java.lang.Object:A-">Table.batch</a> (non-writeBuffer).</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="scan"><a class="anchor" href="#scan"></a>27.3. Scans</h3>
+<h3 id="scan"><a class="anchor" href="#scan"></a>28.3. Scans</h3>
 <div class="paragraph">
 <p><a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</a> allows iteration over multiple rows for specified attributes.</p>
 </div>
@@ -8464,7 +8485,7 @@ ResultScanner rs = table.getScanner(scan);
 </div>
 </div>
 <div class="sect2">
-<h3 id="_delete"><a class="anchor" href="#_delete"></a>27.4. Delete</h3>
+<h3 id="_delete"><a class="anchor" href="#_delete"></a>28.4. Delete</h3>
 <div class="paragraph">
 <p><a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Delete.html">Delete</a> removes a row from a table.
 Deletes are executed via <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete-org.apache.hadoop.hbase.client.Delete-">Table.delete</a>.</p>
@@ -8480,7 +8501,7 @@ These tombstones, along with the dead values, are cleaned 
up on major compaction
 </div>
 </div>
 <div class="sect1">
-<h2 id="versions"><a class="anchor" href="#versions"></a>28. Versions</h2>
+<h2 id="versions"><a class="anchor" href="#versions"></a>29. Versions</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>A <em>{row, column, version}</em> tuple exactly specifies a 
<code>cell</code> in HBase.
@@ -8517,7 +8538,7 @@ It has more detail on versioning than is provided 
here.</p>
 This section is basically a synopsis of this article by Bruno Dumon.</p>
 </div>
 <div class="sect2">
-<h3 id="specify.number.of.versions"><a class="anchor" 
href="#specify.number.of.versions"></a>28.1. Specifying the Number of Versions 
to Store</h3>
+<h3 id="specify.number.of.versions"><a class="anchor" 
href="#specify.number.of.versions"></a>29.1. Specifying the Number of Versions 
to Store</h3>
 <div class="paragraph">
 <p>The maximum number of versions to store for a given column is part of the 
column schema and is specified at table creation, or via an <code>alter</code> 
command, via <code>HColumnDescriptor.DEFAULT_VERSIONS</code>.
 Prior to HBase 0.96, the default number of versions kept was <code>3</code>, 
but in 0.96 and newer has been changed to <code>1</code>.</p>
@@ -8558,12 +8579,12 @@ See <a 
href="#hbase.column.max.version">hbase.column.max.version</a>.</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="versions.ops"><a class="anchor" href="#versions.ops"></a>28.2. 
Versions and HBase Operations</h3>
+<h3 id="versions.ops"><a class="anchor" href="#versions.ops"></a>29.2. 
Versions and HBase Operations</h3>
 <div class="paragraph">
 <p>In this section we look at the behavior of the version dimension for each 
of the core HBase operations.</p>
 </div>
 <div class="sect3">
-<h4 id="_get_scan"><a class="anchor" href="#_get_scan"></a>28.2.1. 
Get/Scan</h4>
+<h4 id="_get_scan"><a class="anchor" href="#_get_scan"></a>29.2.1. 
Get/Scan</h4>
 <div class="paragraph">
 <p>Gets are implemented on top of Scans.
 The below discussion of <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</a> applies equally to <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scans</a>.</p>
@@ -8586,7 +8607,7 @@ The below discussion of <a 
href="https://hbase.apache.org/apidocs/org/apache/had
 </div>
 </div>
 <div class="sect3">
-<h4 id="_default_get_example"><a class="anchor" 
href="#_default_get_example"></a>28.2.2. Default Get Example</h4>
+<h4 id="_default_get_example"><a class="anchor" 
href="#_default_get_example"></a>29.2.2. Default Get Example</h4>
 <div class="paragraph">
 <p>The following Get will only retrieve the current version of the row</p>
 </div>
@@ -8602,7 +8623,7 @@ Get get = <span class="keyword">new</span> 
Get(Bytes.toBytes(<span class="string
 </div>
 </div>
 <div class="sect3">
-<h4 id="_versioned_get_example"><a class="anchor" 
href="#_versioned_get_example"></a>28.2.3. Versioned Get Example</h4>
+<h4 id="_versioned_get_example"><a class="anchor" 
href="#_versioned_get_example"></a>29.2.3. Versioned Get Example</h4>
 <div class="paragraph">
 <p>The following Get will return the last 3 versions of the row.</p>
 </div>
@@ -8620,7 +8641,7 @@ get.setMaxVersions(<span class="integer">3</span>);  
<span class="comment">// wi
 </div>
 </div>
 <div class="sect3">
-<h4 id="_put_2"><a class="anchor" href="#_put_2"></a>28.2.4. Put</h4>
+<h4 id="_put_2"><a class="anchor" href="#_put_2"></a>29.2.4. Put</h4>
 <div class="paragraph">
 <p>Doing a put always creates a new version of a <code>cell</code>, at a 
certain timestamp.
 By default the system uses the server&#8217;s <code>currentTimeMillis</code>, 
but you can specify the version (= the long integer) yourself, on a per-column 
level.
@@ -8669,7 +8690,7 @@ Prefer using a separate timestamp attribute of the row, 
or have the timestamp as
 </div>
 </div>
 <div class="sect3">
-<h4 id="version.delete"><a class="anchor" href="#version.delete"></a>28.2.5. 
Delete</h4>
+<h4 id="version.delete"><a class="anchor" href="#version.delete"></a>29.2.5. 
Delete</h4>
 <div class="paragraph">
 <p>There are three different types of internal delete markers.
 See Lars Hofhansl&#8217;s blog for discussion of his attempt at adding another, <a href="http://hadoop-hbase.blogspot.com/2012/01/scanning-in-hbase.html">Scanning in HBase: Prefix Delete Marker</a>.</p>
@@ -8728,7 +8749,7 @@ The change has been backported to HBase 0.94 and newer 
branches.
 </div>
 </div>
 <div class="sect2">
-<h3 id="new.version.behavior"><a class="anchor" 
href="#new.version.behavior"></a>28.3. Optional New Version and Delete behavior 
in HBase-2.0.0</h3>
+<h3 id="new.version.behavior"><a class="anchor" 
href="#new.version.behavior"></a>29.3. Optional New Version and Delete behavior 
in HBase-2.0.0</h3>
 <div class="paragraph">
 <p>In <code>hbase-2.0.0</code>, the operator can specify an alternate version 
and
 delete treatment by setting the column descriptor property
@@ -8761,13 +8782,13 @@ the order in which Mutations arrive is a factor.</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="_current_limitations"><a class="anchor" 
href="#_current_limitations"></a>28.4. Current Limitations</h3>
+<h3 id="_current_limitations"><a class="anchor" 
href="#_current_limitations"></a>29.4. Current Limitations</h3>
 <div class="paragraph">
 <p>The below limitations are addressed in hbase-2.0.0. See
 the section above, <a href="#new.version.behavior">Optional New Version and 
Delete behavior in HBase-2.0.0</a>.</p>
 </div>
 <div class="sect3">
-<h4 id="_deletes_mask_puts"><a class="anchor" 
href="#_deletes_mask_puts"></a>28.4.1. Deletes mask Puts</h4>
+<h4 id="_deletes_mask_puts"><a class="anchor" 
href="#_deletes_mask_puts"></a>29.4.1. Deletes mask Puts</h4>
 <div class="paragraph">
 <p>Deletes mask puts, even puts that happened after the delete was entered.
 See <a href="https://issues.apache.org/jira/browse/HBASE-2256">HBASE-2256</a>.
@@ -8782,7 +8803,7 @@ But they can occur even if you do not care about time: 
just do delete and put im
 </div>
 </div>
 <div class="sect3">
-<h4 id="major.compactions.change.query.results"><a class="anchor" 
href="#major.compactions.change.query.results"></a>28.4.2. Major compactions 
change query results</h4>
+<h4 id="major.compactions.change.query.results"><a class="anchor" 
href="#major.compactions.change.query.results"></a>29.4.2. Major compactions 
change query results</h4>
 <div class="paragraph">
 <p><em>&#8230;&#8203;create three cell versions at t1, t2 and t3, with a 
maximum-versions
     setting of 2. So when getting all versions, only the values at t2 and t3 
will be
@@ -8795,7 +8816,7 @@ But they can occur even if you do not care about time: 
just do delete and put im
 </div>
 </div>
 <div class="sect1">
-<h2 id="dm.sort"><a class="anchor" href="#dm.sort"></a>29. Sort Order</h2>
+<h2 id="dm.sort"><a class="anchor" href="#dm.sort"></a>30. Sort Order</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>All data model operations in HBase return data in sorted order.
@@ -8804,7 +8825,7 @@ First by row, then by ColumnFamily, followed by column 
qualifier, and finally ti
 </div>
 </div>
 <div class="sect1">
-<h2 id="dm.column.metadata"><a class="anchor" 
href="#dm.column.metadata"></a>30. Column Metadata</h2>
+<h2 id="dm.column.metadata"><a class="anchor" 
href="#dm.column.metadata"></a>31. Column Metadata</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>There is no store of column metadata outside of the internal KeyValue 
instances for a ColumnFamily.
@@ -8817,7 +8838,7 @@ For more information about how HBase stores data 
internally, see <a href="#keyva
 </div>
 </div>
 <div class="sect1">
-<h2 id="joins"><a class="anchor" href="#joins"></a>31. Joins</h2>
+<h2 id="joins"><a class="anchor" href="#joins"></a>32. Joins</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>Whether HBase supports joins is a common question on the dist-list, and there is a simple answer: it doesn&#8217;t, at least not in the way that RDBMSs support them (e.g., with equi-joins or outer-joins in SQL). As has been illustrated in this chapter, the read data model operations in HBase are Get and Scan.</p>
@@ -8830,7 +8851,7 @@ hash-joins). So which is the best approach? It depends on 
what you are trying to
 </div>
 </div>
 <div class="sect1">
-<h2 id="_acid"><a class="anchor" href="#_acid"></a>32. ACID</h2>
+<h2 id="_acid"><a class="anchor" href="#_acid"></a>33. ACID</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>See <a href="/acid-semantics.html">ACID Semantics</a>.
@@ -8863,7 +8884,7 @@ modeling on HBase.</p>
 </div>
 </div>
 <div class="sect1">
-<h2 id="schema.creation"><a class="anchor" href="#schema.creation"></a>33. 
Schema Creation</h2>
+<h2 id="schema.creation"><a class="anchor" href="#schema.creation"></a>34. 
Schema Creation</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>HBase schemas can be created or updated using <a href="#shell">The Apache HBase Shell</a> or by using <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html">Admin</a> in the Java API.</p>
@@ -8903,7 +8924,7 @@ online schema changes are supported in the 0.92.x 
codebase, but the 0.90.x codeb
 </table>
 </div>
 <div class="sect2">
-<h3 id="schema.updates"><a class="anchor" href="#schema.updates"></a>33.1. 
Schema Updates</h3>
+<h3 id="schema.updates"><a class="anchor" href="#schema.updates"></a>34.1. 
Schema Updates</h3>
 <div class="paragraph">
 <p>When changes are made to either Tables or ColumnFamilies (e.g. region size, 
block size), these changes take effect the next time there is a major 
compaction and the StoreFiles get re-written.</p>
 </div>
@@ -8914,7 +8935,7 @@ online schema changes are supported in the 0.92.x 
codebase, but the 0.90.x codeb
 </div>
 </div>
 <div class="sect1">
-<h2 id="table_schema_rules_of_thumb"><a class="anchor" 
href="#table_schema_rules_of_thumb"></a>34. Table Schema Rules Of Thumb</h2>
+<h2 id="table_schema_rules_of_thumb"><a class="anchor" 
href="#table_schema_rules_of_thumb"></a>35. Table Schema Rules Of Thumb</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>There are many different data sets, with different access patterns and 
service-level
@@ -8987,7 +9008,7 @@ defaults).</p>
 </div>
 </div>
 <div class="sect1">
-<h2 id="number.of.cfs"><a class="anchor" href="#number.of.cfs"></a>35. On the 
number of column families</h2>
+<h2 id="number.of.cfs"><a class="anchor" href="#number.of.cfs"></a>36. On the 
number of column families</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>HBase currently does not do well with anything above two or three column 
families so keep the number of column families in your schema low.
@@ -9000,7 +9021,7 @@ Only introduce a second and third column family in the 
case where data access is
 you query one column family or the other but usually not both at the same time.</p>
 </div>
 <div class="sect2">
-<h3 id="number.of.cfs.card"><a class="anchor" 
href="#number.of.cfs.card"></a>35.1. Cardinality of ColumnFamilies</h3>
+<h3 id="number.of.cfs.card"><a class="anchor" 
href="#number.of.cfs.card"></a>36.1. Cardinality of ColumnFamilies</h3>
 <div class="paragraph">
 <p>Where multiple ColumnFamilies exist in a single table, be aware of the 
cardinality (i.e., number of rows). If ColumnFamilyA has 1 million rows and 
ColumnFamilyB has 1 billion rows, ColumnFamilyA&#8217;s data will likely be 
spread across many, many regions (and RegionServers). This makes mass scans for 
ColumnFamilyA less efficient.</p>
 </div>
@@ -9008,10 +9029,10 @@ you query one column family or the other but usually 
not both at the one time.</
 </div>
 </div>
 <div class="sect1">
-<h2 id="rowkey.design"><a class="anchor" href="#rowkey.design"></a>36. Rowkey 
Design</h2>
+<h2 id="rowkey.design"><a class="anchor" href="#rowkey.design"></a>37. Rowkey 
Design</h2>
 <div class="sectionbody">
 <div class="sect2">
-<h3 id="_hotspotting"><a class="anchor" href="#_hotspotting"></a>36.1. 
Hotspotting</h3>
+<h3 id="_hotspotting"><a class="anchor" href="#_hotspotting"></a>37.1. 
Hotspotting</h3>
 <div class="paragraph">
 <p>Rows in HBase are sorted lexicographically by row key.
 This design optimizes for scans, allowing you to store related rows, or rows 
that will be read together, near each other.
@@ -9107,7 +9128,7 @@ This effectively randomizes row keys, but sacrifices row 
ordering properties.</p
 </div>
 </div>
 <div class="sect2">
-<h3 id="timeseries"><a class="anchor" href="#timeseries"></a>36.2. 
Monotonically Increasing Row Keys/Timeseries Data</h3>
+<h3 id="timeseries"><a class="anchor" href="#timeseries"></a>37.2. 
Monotonically Increasing Row Keys/Timeseries Data</h3>
 <div class="paragraph">
 <p>In the HBase chapter of Tom White&#8217;s book <a href="http://oreilly.com/catalog/9780596521981">Hadoop: The Definitive Guide</a> (O&#8217;Reilly) there is an optimization note on watching out for a phenomenon where an import process walks in lock-step with all clients in concert pounding one of the table&#8217;s regions (and thus, a single node), then moving onto the next region, etc.
 With monotonically increasing row-keys (i.e., using a timestamp), this will 
happen.
@@ -9126,7 +9147,7 @@ Thus, even with a continual stream of input data with a 
mix of metric types, the
 </div>
 </div>
 <div class="sect2">
-<h3 id="keysize"><a class="anchor" href="#keysize"></a>36.3. Try to minimize 
row and column sizes</h3>
+<h3 id="keysize"><a class="anchor" href="#keysize"></a>37.3. Try to minimize 
row and column sizes</h3>
 <div class="paragraph">
 <p>In HBase, values are always freighted with their coordinates; as a cell 
value passes through the system, it&#8217;ll be accompanied by its row, column 
name, and timestamp - always.
 If your rows and column names are large, especially compared to the size of 
the cell value, then you may run up against some interesting scenarios.
@@ -9143,7 +9164,7 @@ Whatever patterns are selected for ColumnFamilies, 
attributes, and rowkeys they
 <p>See <a href="#keyvalue">keyvalue</a> for more information on how HBase stores data internally to see why this is important.</p>
 </div>
 <div class="sect3">
-<h4 id="keysize.cf"><a class="anchor" href="#keysize.cf"></a>36.3.1. Column 
Families</h4>
+<h4 id="keysize.cf"><a class="anchor" href="#keysize.cf"></a>37.3.1. Column 
Families</h4>
 <div class="paragraph">
 <p>Try to keep the ColumnFamily names as small as possible, preferably one 
character (e.g. "d" for data/default).</p>
 </div>
@@ -9152,7 +9173,7 @@ Whatever patterns are selected for ColumnFamilies, 
attributes, and rowkeys they
 </div>
 </div>
 <div class="sect3">
-<h4 id="keysize.attributes"><a class="anchor" 
href="#keysize.attributes"></a>36.3.2. Attributes</h4>
+<h4 id="keysize.attributes"><a class="anchor" 
href="#keysize.attributes"></a>37.3.2. Attributes</h4>
 <div class="paragraph">
 <p>Although verbose attribute names (e.g., "myVeryImportantAttribute") are 
easier to read, prefer shorter attribute names (e.g., "via") to store in 
HBase.</p>
 </div>
@@ -9161,7 +9182,7 @@ Whatever patterns are selected for ColumnFamilies, 
attributes, and rowkeys they
 </div>
 </div>
 <div class="sect3">
-<h4 id="keysize.row"><a class="anchor" href="#keysize.row"></a>36.3.3. Rowkey 
Length</h4>
+<h4 id="keysize.row"><a class="anchor" href="#keysize.row"></a>37.3.3. Rowkey 
Length</h4>
 <div class="paragraph">
 <p>Keep them as short as is reasonable such that they can still be useful for 
required data access (e.g. Get vs.
 Scan). A short key that is useless for data access is not better than a longer 
key with better get/scan properties.
@@ -9169,7 +9190,7 @@ Expect tradeoffs when designing rowkeys.</p>
 </div>
 </div>
 <div class="sect3">
-<h4 id="keysize.patterns"><a class="anchor" 
href="#keysize.patterns"></a>36.3.4. Byte Patterns</h4>
+<h4 id="keysize.patterns"><a class="anchor" 
href="#keysize.patterns"></a>37.3.4. Byte Patterns</h4>
 <div class="paragraph">
 <p>A long is 8 bytes.
 You can store an unsigned number up to 18,446,744,073,709,551,615 in those 
eight bytes.
@@ -9225,7 +9246,7 @@ This is the main trade-off.</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="reverse.timestamp"><a class="anchor" 
href="#reverse.timestamp"></a>36.4. Reverse Timestamps</h3>
+<h3 id="reverse.timestamp"><a class="anchor" 
href="#reverse.timestamp"></a>37.4. Reverse Timestamps</h3>
 <div class="admonitionblock note">
 <table>
 <tr>
@@ -9257,14 +9278,14 @@ Since HBase keys are in sorted order, this key sorts 
before any older row-keys f
 </div>
 </div>
 <div class="sect2">
-<h3 id="rowkey.scope"><a class="anchor" href="#rowkey.scope"></a>36.5. Rowkeys 
and ColumnFamilies</h3>
+<h3 id="rowkey.scope"><a class="anchor" href="#rowkey.scope"></a>37.5. Rowkeys 
and ColumnFamilies</h3>
 <div class="paragraph">
 <p>Rowkeys are scoped to ColumnFamilies.
 Thus, the same rowkey could exist in each ColumnFamily that exists in a table 
without collision.</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="changing.rowkeys"><a class="anchor" href="#changing.rowkeys"></a>36.6. 
Immutability of Rowkeys</h3>
+<h3 id="changing.rowkeys"><a class="anchor" href="#changing.rowkeys"></a>37.6. 
Immutability of Rowkeys</h3>
 <div class="paragraph">
 <p>Rowkeys cannot be changed.
 The only way they can be "changed" in a table is if the row is deleted and 
then re-inserted.
@@ -9272,7 +9293,7 @@ This is a fairly common question on the HBase dist-list 
so it pays to get the ro
 </div>
 </div>
 <div class="sect2">
-<h3 id="rowkey.regionsplits"><a class="anchor" 
href="#rowkey.regionsplits"></a>36.7. Relationship Between RowKeys and Region 
Splits</h3>
+<h3 id="rowkey.regionsplits"><a class="anchor" 
href="#rowkey.regionsplits"></a>37.7. Relationship Between RowKeys and Region 
Splits</h3>
 <div class="paragraph">
 <p>If you pre-split your table, it is <em>critical</em> to understand how your 
rowkey will be distributed across the region boundaries.
 As an example of why this is important, consider the example of using displayable hex characters as the lead position of the key (e.g., "0000000000000000" to "ffffffffffffffff"). Running those key ranges through <code>Bytes.split</code> (which is the split strategy used when creating regions in <code>Admin.createTable(byte[] startKey, byte[] endKey, numRegions)</code>) for 10 regions will generate the following splits&#8230;&#8203;</p>
@@ -9344,10 +9365,10 @@ Know your data.</p>
 </div>
 </div>
 <div class="sect1">
-<h2 id="schema.versions"><a class="anchor" href="#schema.versions"></a>37. 
Number of Versions</h2>
+<h2 id="schema.versions"><a class="anchor" href="#schema.versions"></a>38. 
Number of Versions</h2>
 <div class="sectionbody">
 <div class="sect2">
-<h3 id="schema.versions.max"><a class="anchor" 
href="#schema.versions.max"></a>37.1. Maximum Number of Versions</h3>
+<h3 id="schema.versions.max"><a class="anchor" 
href="#schema.versions.max"></a>38.1. Maximum Number of Versions</h3>
 <div class="paragraph">
 <p>The maximum number of row versions to store is configured per column family via <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</a>.
 The default for max versions is 1.
@@ -9359,7 +9380,7 @@ The number of max versions may need to be increased or 
decreased depending on ap
 </div>
 </div>
 <div class="sect2">
-<h3 id="schema.minversions"><a class="anchor" 
href="#schema.minversions"></a>37.2. Minimum Number of Versions</h3>
+<h3 id="schema.minversions"><a class="anchor" 
href="#schema.minversions"></a>38.2. Minimum Number of Versions</h3>
 <div class="paragraph">
 <p>Like maximum number of row versions, the minimum number of row versions to keep is configured per column family via <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</a>.
 The default for min versions is 0, which means the feature is disabled.
@@ -9369,7 +9390,7 @@ The minimum number of row versions parameter is used 
together with the time-to-l
 </div>
 </div>
 <div class="sect1">
-<h2 id="supported.datatypes"><a class="anchor" 
href="#supported.datatypes"></a>38. Supported Datatypes</h2>
+<h2 id="supported.datatypes"><a class="anchor" 
href="#supported.datatypes"></a>39. Supported Datatypes</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>HBase supports a "bytes-in/bytes-out" interface via <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</a> and <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Result.html">Result</a>, so anything that can be converted to an array of bytes can be stored as a value.
@@ -9381,7 +9402,7 @@ All rows in HBase conform to the <a 
href="#datamodel">Data Model</a>, and that i
 Take that into consideration when making your design, as well as block size 
for the ColumnFamily.</p>
 </div>
 <div class="sect2">
-<h3 id="_counters"><a class="anchor" href="#_counters"></a>38.1. Counters</h3>
+<h3 id="_counters"><a class="anchor" href="#_counters"></a>39.1. Counters</h3>
 <div class="paragraph">
 <p>One supported datatype that deserves special mention is "counters" (i.e., the ability to do atomic increments of numbers). See <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#increment%28org.apache.hadoop.hbase.client.Increment%29">Increment</a> in <code>Table</code>.</p>
 </div>
@@ -9392,7 +9413,7 @@ Take that into consideration when making your design, as 
well as block size for
 </div>
 </div>
 <div class="sect1">
-<h2 id="schema.joins"><a class="anchor" href="#schema.joins"></a>39. Joins</h2>
+<h2 id="schema.joins"><a class="anchor" href="#schema.joins"></a>40. Joins</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>If you have multiple tables, don&#8217;t forget to factor in the potential 
for <a href="#joins">Joins</a> into the schema design.</p>
@@ -9400,7 +9421,7 @@ Take that into consideration when making your design, as 
well as block size for
 </div>
 </div>
 <div class="sect1">
-<h2 id="ttl"><a class="anchor" href="#ttl"></a>40. Time To Live (TTL)</h2>
+<h2 id="ttl"><a class="anchor" href="#ttl"></a>41. Time To Live (TTL)</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>ColumnFamilies can set a TTL length in seconds, and HBase will 
automatically delete rows once the expiration time is reached.
@@ -9435,7 +9456,7 @@ There are two notable differences between cell TTL 
handling and ColumnFamily TTL
 </div>
 </div>
 <div class="sect1">
-<h2 id="cf.keep.deleted"><a class="anchor" href="#cf.keep.deleted"></a>41. 
Keeping Deleted Cells</h2>
+<h2 id="cf.keep.deleted"><a class="anchor" href="#cf.keep.deleted"></a>42. 
Keeping Deleted Cells</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>By default, delete markers extend back to the beginning of time.
@@ -9572,7 +9593,7 @@ So with KEEP_DELETED_CELLS enabled deleted cells would 
get removed if either you
 </div>
 </div>
 <div class="sect1">
-<h2 id="secondary.indexes"><a class="anchor" href="#secondary.indexes"></a>42. 
Secondary Indexes and Alternate Query Paths</h2>
+<h2 id="secondary.indexes"><a class="anchor" href="#secondary.indexes"></a>43. 
Secondary Indexes and Alternate Query Paths</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>This section could also be titled "what if my table rowkey looks like 
<em>this</em> but I also want to query my table like <em>that</em>." A common 
example on the dist-list is where a row-key is of the format "user-timestamp" 
but there are reporting requirements on activity across users for certain time 
ranges.
@@ -9615,7 +9636,7 @@ However, HBase scales better at larger data volumes, so 
this is a feature trade-
 <p>Additionally, see the David Butler response in this dist-list thread <a 
href="http://search-hadoop.com/m/nvbiBp2TDP/Stargate%252Bhbase&amp;subj=Stargate+hbase">HBase,
 mail # user - Stargate+hbase</a></p>
 </div>
 <div class="sect2">
-<h3 id="secondary.indexes.filter"><a class="anchor" 
href="#secondary.indexes.filter"></a>42.1. Filter Query</h3>
+<h3 id="secondary.indexes.filter"><a class="anchor" 
href="#secondary.indexes.filter"></a>43.1. Filter Query</h3>
 <div class="paragraph">
 <p>Depending on the case, it may be appropriate to use <a 
href="#client.filter">Client Request Filters</a>.
 In this case, no secondary index is created.
@@ -9623,7 +9644,7 @@ However, don&#8217;t try a full-scan on a large table 
like this from an applicat
 </div>
 </div>
 <div class="sect2">
-<h3 id="secondary.indexes.periodic"><a class="anchor" 
href="#secondary.indexes.periodic"></a>42.2. Periodic-Update Secondary 
Index</h3>
+<h3 id="secondary.indexes.periodic"><a class="anchor" 
href="#secondary.indexes.periodic"></a>43.2. Periodic-Update Secondary 
Index</h3>
 <div class="paragraph">
 <p>A secondary index could be created in another table which is periodically 
updated via a MapReduce job.
 The job could be executed intra-day, but depending on load-strategy it could 
still potentially be out of sync with the main data table.</p>
@@ -9633,13 +9654,13 @@ The job could be executed intra-day, but depending on 
load-strategy it could sti
 </div>
 </div>
 <div class="sect2">
-<h3 id="secondary.indexes.dualwrite"><a class="anchor" 
href="#secondary.indexes.dualwrite"></a>42.3. Dual-Write Secondary Index</h3>
+<h3 id="secondary.indexes.dualwrite"><a class="anchor" 
href="#secondary.indexes.dualwrite"></a>43.3. Dual-Write Secondary Index</h3>
 <div class="paragraph">
 <p>Another strategy is to build the secondary index while publishing data to 
the cluster (e.g., write to data table, write to index table). If this 
approach is taken after a data table already exists, then bootstrapping will be 
needed for the secondary index with a MapReduce job (see <a 
href="#secondary.indexes.periodic">secondary.indexes.periodic</a>).</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="secondary.indexes.summary"><a class="anchor" 
href="#secondary.indexes.summary"></a>42.4. Summary Tables</h3>
+<h3 id="secondary.indexes.summary"><a class="anchor" 
href="#secondary.indexes.summary"></a>43.4. Summary Tables</h3>
 <div class="paragraph">
 <p>Where time-ranges are very wide (e.g., year-long report) and where the data 
is voluminous, summary tables are a common approach.
 These would be generated with MapReduce jobs into another table.</p>
@@ -9649,7 +9670,7 @@ These would be generated with MapReduce jobs into another 
table.</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="secondary.indexes.coproc"><a class="anchor" 
href="#secondary.indexes.coproc"></a>42.5. Coprocessor Secondary Index</h3>
+<h3 id="secondary.indexes.coproc"><a class="anchor" 
href="#secondary.indexes.coproc"></a>43.5. Coprocessor Secondary Index</h3>
 <div class="paragraph">
 <p>Coprocessors act like RDBMS triggers. These were added in 0.92.
 For more information, see <a href="#cp">coprocessors</a></p>
@@ -9658,7 +9679,7 @@ For more information, see <a 
href="#cp">coprocessors</a></p>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_constraints"><a class="anchor" href="#_constraints"></a>43. 
Constraints</h2>
+<h2 id="_constraints"><a class="anchor" href="#_constraints"></a>44. 
Constraints</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>HBase currently supports 'constraints' in traditional (SQL) database 
parlance.
@@ -9673,7 +9694,7 @@ since version 0.94.</p>
 </div>
 </div>
 <div class="sect1">
-<h2 id="schema.casestudies"><a class="anchor" 
href="#schema.casestudies"></a>44. Schema Design Case Studies</h2>
+<h2 id="schema.casestudies"><a class="anchor" 
href="#schema.casestudies"></a>45. Schema Design Case Studies</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>The following will describe some typical data ingestion use-cases with 
HBase, and how the rowkey design and construction can be approached.
@@ -9706,7 +9727,7 @@ Know your data, and know your processing requirements.</p>
 </ul>
 </div>
 <div class="sect2">
-<h3 id="schema.casestudies.log_timeseries"><a class="anchor" 
href="#schema.casestudies.log_timeseries"></a>44.1. Case Study - Log Data and 
Timeseries Data</h3>
+<h3 id="schema.casestudies.log_timeseries"><a class="anchor" 
href="#schema.casestudies.log_timeseries"></a>45.1. Case Study - Log Data and 
Timeseries Data</h3>
 <div class="paragraph">
 <p>Assume that the following data elements are being collected.</p>
 </div>
@@ -9730,7 +9751,7 @@ Know your data, and know your processing requirements.</p>
 <p>We can store them in an HBase table called LOG_DATA, but what will the 
rowkey be? From these attributes the rowkey will be some combination of 
hostname, timestamp, and log-event - but what specifically?</p>
 </div>
 <div class="sect3">
-<h4 id="schema.casestudies.log_timeseries.tslead"><a class="anchor" 
href="#schema.casestudies.log_timeseries.tslead"></a>44.1.1. Timestamp In The 
Rowkey Lead Position</h4>
+<h4 id="schema.casestudies.log_timeseries.tslead"><a class="anchor" 
href="#schema.casestudies.log_timeseries.tslead"></a>45.1.1. Timestamp In The 
Rowkey Lead Position</h4>
 <div class="paragraph">
 <p>The rowkey <code>[timestamp][hostname][log-event]</code> suffers from the 
monotonically increasing rowkey problem described in <a 
href="#timeseries">Monotonically Increasing Row Keys/Timeseries Data</a>.</p>
 </div>
@@ -9758,14 +9779,14 @@ Attention must be paid to the number of buckets, 
because this will require the s
 </div>
 </div>
 <div class="sect3">
-<h4 id="schema.casestudies.log_timeseries.hostlead"><a class="anchor" 
href="#schema.casestudies.log_timeseries.hostlead"></a>44.1.2. Host In The 
Rowkey Lead Position</h4>
+<h4 id="schema.casestudies.log_timeseries.hostlead"><a class="anchor" 
href="#schema.casestudies.log_timeseries.hostlead"></a>45.1.2. Host In The 
Rowkey Lead Position</h4>
 <div class="paragraph">
 <p>The rowkey <code>[hostname][log-event][timestamp]</code> is a candidate if 
there is a large-ish number of hosts to spread the writes and reads across the 
keyspace.
 This approach would be useful if scanning by hostname was a priority.</p>
 </div>
 </div>
 <div class="sect3">
-<h4 id="schema.casestudies.log_timeseries.revts"><a class="anchor" 
href="#schema.casestudies.log_timeseries.revts"></a>44.1.3. Timestamp, or 
Reverse Timestamp?</h4>
+<h4 id="schema.casestudies.log_timeseries.revts"><a class="anchor" 
href="#schema.casestudies.log_timeseries.revts"></a>45.1.3. Timestamp, or 
Reverse Timestamp?</h4>
 <div class="paragraph">
 <p>If the most important access path is to pull the most recent events, then 
storing the timestamps as reverse-timestamps (e.g., <code>timestamp = 
Long.MAX_VALUE - timestamp</code>) will create the property of being able to 
do a Scan on <code>[hostname][log-event]</code> to obtain the most recently 
captured events.</p>
 </div>
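 <div class="paragraph">
 <p>The reverse-timestamp idea above can be sketched as follows. This is a minimal illustration, not HBase code: the class <code>ReverseTimestampKey</code> and its methods <code>rowKey</code> and <code>compareKeys</code> are hypothetical names, and the key is assumed to be a plain concatenation of UTF-8 hostname, UTF-8 log-event, and an 8-byte big-endian reversed timestamp with non-negative millisecond timestamps. It only demonstrates that, under unsigned byte-wise ordering, a newer event sorts before an older one for the same hostname and log-event.</p>
 </div>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReverseTimestampKey {

    // Builds [hostname][log-event][Long.MAX_VALUE - timestamp] as raw bytes.
    // Hypothetical helper for illustration only; it is not part of the HBase API.
    public static byte[] rowKey(String hostname, String logEvent, long timestampMillis) {
        byte[] host = hostname.getBytes(StandardCharsets.UTF_8);
        byte[] event = logEvent.getBytes(StandardCharsets.UTF_8);
        // Assumes a non-negative timestamp, so the reversed value is also non-negative.
        long reversed = Long.MAX_VALUE - timestampMillis;
        return ByteBuffer.allocate(host.length + event.length + Long.BYTES)
                .put(host)
                .put(event)
                .putLong(reversed) // big-endian, so byte order follows numeric order
                .array();
    }

    // Unsigned lexicographic byte comparison, used here to model rowkey sort order.
    public static int compareKeys(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        byte[] older = rowKey("host1", "e1", 1_000L);
        byte[] newer = rowKey("host1", "e1", 2_000L);
        // A negative result means the newer event sorts ahead of the older one.
        System.out.println(compareKeys(newer, older) < 0);
    }
}
```

 <div class="paragraph">
 <p>Because the newest event has the smallest reversed value, a scan that starts at the <code>[hostname][log-event]</code> prefix would reach it first. A real schema would also need to settle how variable-length hostnames and event types are delimited or padded, which this sketch does not address.</p>
 </div>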
@@ -9791,7 +9812,7 @@ See <a 
href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Sca
 </div>
 </div>
 <div class="sect3">
-<h4 id="schema.casestudies.log_timeseries.varkeys"><a class="anchor" 
href="#schema.casestudies.log_timeseries.varkeys"></a>44.1.4. Variable Length 
or Fixed Length Rowkeys?</h4>
+<h4 id="schema.casestudies.log_timeseries.varkeys"><a class="anchor" 
href="#schema.casestudies.log_timeseries.varkeys"></a>45.1.4. Variable Length 
or Fixed Length Rowkeys?</h4>
 <div class="paragraph">
 <p>It is critical to remember that rowkeys are stamped on every column in 
HBase.
 If the hostname is <code>a</code> and the event type is <code>e1</code> then 
the resulting rowkey would be quite small.
@@ -9861,7 +9882,7 @@ by using an <a 
href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/cl
 </div>
 </div>
 <div class="sect2">
-<h3 id="schema.casestudies.log_steroids"><a class="anchor" 
href="#schema.casestudies.log_steroids"></a>44.2. Case Study - Log Data and 
Timeseries Data on Steroids</h3>
+<h3 id="schema.casestudies.log_steroids"><a class="anchor" 
href="#schema.casestudies.log_steroids"></a>45.2. Case Study - Log Data and 
Timeseries Data on Steroids</h3>
 <div class="paragraph">
 <p>This effectively is the OpenTSDB approach.
 What OpenTSDB does is re-write data and pack rows into columns for certain 
time-periods.
@@ -9892,7 +9913,7 @@ from HBaseCon2012.</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="schema.casestudies.custorder"><a class="anchor" 
href="#schema.casestudies.custorder"></a>44.3. Case Study - Customer/Order</h3>
+<h3 id="schema.casestudies.custorder"><a class="anchor" 
href="#schema.casestudies.custorder"></a>45.3. Case Study - Customer/Order</h3>
 <div class="paragraph">
 <p>Assume that HBase is used to store customer and order information.
 There are two core record-types being ingested: a Customer record type, and 
Order record type.</p>
@@ -9978,7 +9999,7 @@ What is the keyspace of the customer number, and what is 
the format (e.g., numer
 </ul>
 </div>
 <div class="sect3">
-<h4 id="schema.casestudies.custorder.tables"><a class="anchor" 
href="#schema.casestudies.custorder.tables"></a>44.3.1. Single Table? Multiple 
Tables?</h4>
+<h4 id="schema.casestudies.custorder.tables"><a class="anchor" 
href="#schema.casestudies.custorder.tables"></a>45.3.1. Single Table? Multiple 
Tables?</h4>
 <div class="paragraph">
 <p>A traditional design approach would have separate tables for CUSTOMER and 
SALES.
 Another option is to pack multiple record types into a single table (e.g., 
CUSTOMER++).</p>
@@ -10017,7 +10038,7 @@ Another option is to pack multiple record types into a 
single table (e.g., CUSTO
 </div>
 </div>
 <div class="sect3">
-<h4 id="schema.casestudies.custorder.obj"><a class="anchor" 
href="#schema.casestudies.custorder.obj"></a>44.3.2. Order Object Design</h4>
+<h4 id="schema.casestudies.custorder.obj"><a class="anchor" 
href="#schema.casestudies.custorder.obj"></a>45.3.2. Order Object Design</h4>
 <div class="paragraph">
 <p>Now we need to address how to model the Order object.
 Assume that the class structure is as follows:</p>
@@ -10223,13 +10244,13 @@ Care should be taken with this approach to ensure 
backward compatibility in case
 </div>
 </div>
 <div class="sect2">
-<h3 id="schema.smackdown"><a class="anchor" href="#schema.smackdown"></a>44.4. 
Case Study - "Tall/Wide/Middle" Schema Design Smackdown</h3>
+<h3 id="schema.smackdown"><a class="anchor" href="#schema.smackdown"></a>45.4. 
Case Study - "Tall/Wide/Middle" Schema Design Smackdown</h3>
 <div class="paragraph">
 <p>This section will describe additional schema design questions that appear 
on the dist-list, specifically about tall and wide tables.
 These are general guidelines and not laws - each application must consider its 
own needs.</p>
 </div>
 <div class="sect3">
-<h4 id="schema.smackdown.rowsversions"><a class="anchor" 
href="#schema.smackdown.rowsversions"></a>44.4.1. Rows vs. Versions</h4>
+<h4 id="schema.smackdown.rowsversions"><a class="anchor" 
href="#schema.smackdown.rowsversions"></a>45.4.1. Rows vs. Versions</h4>
 <div class="paragraph">
 <p>A common question is whether one should prefer rows or HBase&#8217;s 
built-in-versioning.
 The context is typically where there are "a lot" of versions of a row to be 
retained (e.g., where it is significantly above the HBase default of 1 max 
versions). The rows-approach would require storing a timestamp in some portion 
of the rowkey so that they would not overwrite with each successive update.</p>
@@ -10239,7 +10260,7 @@ The context is typically where there are "a lot" of 
versions of a row to be reta
 </div>
 </div>
 <div class="sect3">
-<h4 id="schema.smackdown.rowscols"><a class="anchor" 
href="#schema.smackdown.rowscols"></a>44.4.2. Rows vs. Columns</h4>
+<h4 id="schema.smackdown.rowscols"><a class="anchor" 
href="#schema.smackdown.rowscols"></a>45.4.2. Rows vs. Columns</h4>
 <div class="paragraph">
 <p>Another common question is whether one should prefer rows or columns.
 The context is typically in extreme cases of wide tables, such as having 1 row 
with 1 million attributes, or 1 million rows with 1 column apiece.</p>
@@ -10250,7 +10271,7 @@ But there is also a middle path between these two 
options, and that is "Rows as
 </div>
 </div>
 <div class="sect3">
-<h4 id="schema.smackdown.rowsascols"><a class="anchor" 
href="#schema.smackdown.rowsascols"></a>44.4.3. Rows as Columns</h4>
+<h4 id="schema.smackdown.rowsascols"><a class="anchor" 
href="#schema.smackdown.rowsascols"></a>45.4.3. Rows as Columns</h4>
 <div class="paragraph">
 <p>The middle path between Rows vs.
 Columns is packing data that would be a separate row into columns, for certain 
rows.
@@ -10261,7 +10282,7 @@ For an overview of this approach, see <a 
href="#schema.casestudies.log_steroids"
 </div>
 </div>
 <div class="sect2">
-<h3 id="casestudies.schema.listdata"><a class="anchor" 
href="#casestudies.schema.listdata"></a>44.5. Case Study - List Data</h3>
+<h3 id="casestudies.schema.listdata"><a class="anchor" 
href="#casestudies.schema.listdata"></a>45.5. Case Study - List Data</h3>
 <div class="paragraph">
 <p>The following is an exchange from the user dist-list regarding a fairly 
common question: how to handle per-user list data in Apache HBase.</p>
 </div>
@@ -10376,10 +10397,10 @@ If you don&#8217;t have time to build it both ways 
and compare, my advice would
 </div>
 </div>
 <div class="sect1">
-<h2 id="schema.ops"><a class="anchor" href="#schema.ops"></a>45. Operational 
and Performance Configuration Options</h2>
+<h2 id="schema.ops"><a class="anchor" href="#schema.ops"></a>46. Operational 
and Performance Configuration Options</h2>
 <div class="sectionbody">
 <div class="sect2">
-<h3 id="_tune_hbase_server_rpc_handling"><a class="anchor" 
href="#_tune_hbase_server_rpc_handling"></a>45.1. Tune HBase Server RPC 
Handling</h3>
+<h3 id="_tune_hbase_server_rpc_handling"><a class="anchor" 
href="#_tune_hbase_server_rpc_handling"></a>46.1. Tune HBase Server RPC 
Handling</h3>
 <div class="ulist">
 <ul>
 <li>
@@ -10437,7 +10458,7 @@ If you don&#8217;t have time to build it both ways and 
compare, my advice would
 </div>
 </div>
 <div class="sect2">
-<h3 id="_disable_nagle_for_rpc"><a class="anchor" 
href="#_disable_nagle_for_rpc"></a>45.2. Disable Nagle for RPC</h3>
+<h3 id="_disable_nagle_for_rpc"><a class="anchor" 
href="#_disable_nagle_for_rpc"></a>46.2. Disable Nagle for RPC</h3>
 <div class="paragraph">
 <p>Disable Nagle’s algorithm. Delayed ACKs can add up to ~200ms to RPC round 
trip time. Set the following parameters:</p>
 </div>
@@ -10473,7 +10494,7 @@ If you don&#8217;t have time to build it both ways and 
compare, my advice would
 </div>
 </div>
 <div class="sect2">
-<h3 id="_limit_server_failure_impact"><a class="anchor" 
href="#_limit_server_failure_impact"></a>45.3. Limit Server Failure Impact</h3>
+<h3 id="_limit_server_failure_impact"><a class="anchor" 
href="#_limit_server_failure_impact"></a>46.3. Limit Server Failure Impact</h3>
 <div class="paragraph">
 <p>Detect regionserver failure as fast as reasonable. Set the following 
parameters:</p>
 </div>
@@ -10499,7 +10520,7 @@ If you don&#8217;t have time to build it both ways and 
compare, my advice would
 </div>
 </div>
 <div class="sect2">
-<h3 id="shortcircuit.reads"><a class="anchor" 
href="#shortcircuit.reads"></a>45.4. Optimize on the Server Side for Low 
Latency</h3>
+<h3 id="shortcircuit.reads"><a class="anchor" 
href="#shortcircuit.reads"></a>46.4. Optimize on the Server Side for Low 
Latency</h3>
 <div class="paragraph">
 <p>Skip the network for local blocks when the RegionServer goes to read from 
HDFS by exploiting HDFS&#8217;s
 <a 
href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short-Circuit
 Local Reads</a> facility.
@@ -10568,9 +10589,9 @@ interesting read showing the HDFS community at its best 
(caveat a few comments).
 </div>
 </div>
 <div class="sect2">
-<h3 id="_jvm_tuning"><a class="anchor" href="#_jvm_tuning"></a>45.5. JVM 
Tuning</h3>
+<h3 id="_jvm_tuning"><a class="anchor" href="#_jvm_tuning"></a>46.5. JVM 
Tuning</h3>
 <div class="sect3">
-<h4 id="_tune_jvm_gc_for_low_collection_latencies"><a class="anchor" 
href="#_tune_jvm_gc_for_low_collection_latencies"></a>45.5.1. Tune JVM GC for 
low collection latencies</h4>
+<h4 id="_tune_jvm_gc_for_low_collection_latencies"><a class="anchor" 
href="#_tune_jvm_gc_for_low_collection_latencies"></a>46.5.1. Tune JVM GC for 
low collection latencies</h4>
 <div class="ulist">
 <ul>
 <li>
@@ -10603,7 +10624,7 @@ interesting read showing the HDFS community at its best 
(caveat a few comments).
 </div>
 </div>
 <div class="sect3">
-<h4 id="_os_level_tuning"><a class="anchor" 
href="#_os_level_tuning"></a>45.5.2. OS-Level Tuning</h4>
+<h4 id="_os_level_tuning"><a class="anchor" 
href="#_os_level_tuning"></a>46.5.2. OS-Level Tuning</h4>
 <div class="ulist">
 <ul>
 <li>
@@ -10631,10 +10652,10 @@ echo never &gt; 
/sys/kernel/mm/transparent_hugepage/defrag</pre>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_special_cases"><a class="anchor" href="#_special_cases"></a>46. 
Special Cases</h2>
+<h2 id="_special_cases"><a class="anchor" href="#_special_cases"></a>47. 
Special Cases</h2>
 <div class="sectionbody">
 <div class="sect2">
-<h3 id="_for_applications_where_failing_quickly_is_better_than_waiting"><a 
class="anchor" 
href="#_for_applications_where_failing_quickly_is_better_than_waiting"></a>46.1.
 For applications where failing quickly is better than waiting</h3>
+<h3 id="_for_applications_where_failing_quickly_is_better_than_waiting"><a 
class="anchor" 
href="#_for_applications_where_failing_quickly_is_better_than_waiting"></a>47.1.
 For applications where failing quickly is better than waiting</h3>
 <div class="ulist">
 <ul>
 <li>
@@ -10663,7 +10684,7 @@ echo never &gt; 
/sys/kernel/mm/transparent_hugepage/defrag</pre>
 </div>
 </div>
 <div class="sect2">
-<h3 
id="_for_applications_that_can_tolerate_slightly_out_of_date_information"><a 
class="anchor" 
href="#_for_applications_that_can_tolerate_slightly_out_of_date_information"></a>46.2.
 For applications that can tolerate slightly out of date information</h3>
+<h3 
id="_for_applications_that_can_tolerate_slightly_out_of_date_information"><a 
class="anchor" 
href="#_for_applications_that_can_tolerate_slightly_out_of_date_information"></a>47.2.
 For applications that can tolerate slightly out of date information</h3>
 <div class="paragraph">
 <p><strong>HBase timeline consistency (HBASE-10070) </strong>
 With read replicas enabled, read-only copies of regions (replicas) are 
distributed over the cluster. One RegionServer services the default or primary 
replica, which is the only replica that can service writes. Other RegionServers 
serve the secondary replicas, follow the primary RegionServer, and only see 
committed updates. The secondary replicas are read-only, but can serve reads 
immediately while the primary is failing over, cutting read availability blips 
from seconds to milliseconds. Phoenix supports timeline consistency as of 4.4.0
@@ -10694,7 +10715,7 @@ Tips:</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="_more_information"><a class="anchor" 
href="#_more_information"></a>46.3. More Information</h3>
+<h3 id="_more_information"><a class="anchor" 
href="#_more_information"></a>47.3. More Information</h3>
 <div class="paragraph">
 <p>See the Performance section <a href="#perf.schema">perf.schema</a> for more 
information about operational and performance schema design options, such as 
Bloom Filters, Table-configured regionsizes, compression, and blocksizes.</p>
 </div>
@@ -10740,7 +10761,7 @@ In the notes below, we refer to 
<em>o.a.h.h.mapreduce</em> but replace with
 </div>
 </div>
 <div class="sect1">
-<h2 id="hbase.mapreduce.classpath"><a class="anchor" 
href="#hbase.mapreduce.classpath"></a>47. HBase, MapReduce, and the 
CLASSPATH</h2>
+<h2 id="hbase.mapreduce.classpath"><a class="anchor" 
href="#hbase.mapreduce.classpath"></a>48. HBase, MapReduce, and the 
CLASSPATH</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>By default, MapReduce jobs deployed to a MapReduce cluster do not have 
access to
@@ -10941,7 +10962,7 @@ $ HADOOP_CLASSPATH=$(hbase classpath) hadoop jar 
MyJob.jar MyJobMainClass</code>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_mapreduce_scan_caching"><a class="anchor" 
href="#_mapreduce_scan_caching"></a>48. MapReduce Scan Caching</h2>
+<h2 id="_mapreduce_scan_caching"><a class="anchor" 
href="#_mapreduce_scan_caching"></a>49. MapReduce Scan Caching</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>TableMapReduceUtil now restores the option to set scanner caching (the 
number of rows which are cached before returning the result to the client) on 
the Scan object that is passed in.
@@ -10976,7 +10997,7 @@ If you think of the scan as a shovel, a bigger cache 
setting is analogous to a b
 </div>
 </div>
 <div class="sect1">
-<h2 id="_bundled_hbase_mapreduce_jobs"><a class="anchor" 
href="#_bundled_hbase_mapreduce_jobs"></a>49. Bundled HBase MapReduce Jobs</h2>
+<h2 id="_bundled_hbase_mapreduce_jobs"><a class="anchor" 
href="#_bundled_hbase_mapreduce_jobs"></a>50. Bundled HBase MapReduce Jobs</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>The HBase JAR also serves as a Driver for some bundled MapReduce jobs.
@@ -11007,7 +11028,7 @@ To run one of the jobs, model your command after the 
following example.</p>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_hbase_as_a_mapreduce_job_data_source_and_data_sink"><a class="anchor" 
href="#_hbase_as_a_mapreduce_job_data_source_and_data_sink"></a>50. HBase as a 
MapReduce Job Data Source and Data Sink</h2>
+<h2 id="_hbase_as_a_mapreduce_job_data_source_and_data_sink"><a class="anchor" 
href="#_hbase_as_a_mapreduce_job_data_source_and_data_sink"></a>51. HBase as a 
MapReduce Job Data Source and Data Sink</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>HBase can be used as a data source, <a 
href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html">TableInputFormat</a>,
 and data sink, <a 
href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</a>
 or <a 
href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/MultiTableOutputFormat.html">MultiTableOutputFormat</a>,
 for MapReduce jobs.
@@ -11036,7 +11057,7 @@ Otherwise use the default partitioner.</p>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_writing_hfiles_directly_during_bulk_import"><a class="anchor" 
href="#_writing_hfiles_directly_during_bulk_import"></a>51. Writing HFiles 
Directly During Bulk Import</h2>
+<h2 id="_writing_hfiles_directly_during_bulk_import"><a class="anchor" 
href="#_writing_hfiles_directly_during_bulk_import"></a>52. Writing HFiles 
Directly During Bulk Import</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>If you are importing into a new table, you can bypass the HBase API and 
write your content directly to the filesystem, formatted into HBase data files 
(HFiles). Your import will run faster, perhaps an order of magnitude faster.
@@ -11045,7 +11066,7 @@ For more on how this mechanism works, see <a 
href="#arch.bulk.load">Bulk Loading
 </div>
 </div>
 <div class="sect1">
-<h2 id="_rowcounter_example"><a class="anchor" 
href="#_rowcounter_example"></a>52. RowCounter Example</h2>
+<h2 id="_rowcounter_example"><a class="anchor" 
href="#_rowcounter_example"></a>53. RowCounter Example</h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>The included <a 
href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</a>
 MapReduce job uses <code>TableInputFormat</code> and does a count of all rows 
in the specified table.
@@ -11066,17 +11087,17 @@ If you have classpath errors, see <a 
href="#hbase.mapreduce.classpath">HBase, Ma
 </div>
 </div>
 <div class="sect1">
-<h2 id="splitter"><a class="anchor" href="#splitter"></a>53. Map-Task 
Splitting</h2>
+<h2 id="splitter"><a class="anchor" href="#splitter"></a>54. Map-Task 
Splitting</h2>
 <div class="sectionbody">
 <div class="sect2">
-<h3 id="splitter.default"><a class="anchor" href="#splitter.default"></a>53.1. 
The Default HBase MapReduce Splitter</h3>
+<h3 id="splitter.default"><a class="anchor" href="#splitter.default"></a>54.1. 
The Default HBase MapReduce Splitter</h3>
 <div class="paragraph">
 <p>When <a 
href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html">TableInputFormat</a>
 is used to source an HBase table in a MapReduce job, its splitter will make a 
map task for each region of the table.
 Thus, if there are 100 regions in the table, there will be 100 map-tasks for 
the job - regardless of how many column families are selected in the Scan.</p>
 </div>
 </div>
 <div class="sect2">
-<h3 id="splitter.custom"><a class="anchor" href="#splitter.custom"></a>53.2. 
Custom Splitters</h3>
+<h3 id="splitter.custom"><a class="anchor" href="#splitter.custom"></a>54.2. 
Custom Splitters</h3>
 <div class="paragraph">
 <p>For those interested in implementing custom splitters, see the method 
<code>getSplits</code> in <a 
href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.html">TableInputFormatBase</a>.
 That is where the logic for map-task assignment resides.</p>
@@ -11085,10 

<TRUNCATED>
