Author: stack
Date: Tue Apr 20 23:10:06 2010
New Revision: 936110

URL: http://svn.apache.org/viewvc?rev=936110&view=rev
Log:
HBASE-2294 Enumerate ACID properties of HBase in a well defined spec

Added:
    hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/acid-semantics.xml
Modified:
    hadoop/hbase/branches/0.20/CHANGES.txt
    hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/site.xml

Modified: hadoop/hbase/branches/0.20/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/hbase/branches/0.20/CHANGES.txt?rev=936110&r1=936109&r2=936110&view=diff
==============================================================================
--- hadoop/hbase/branches/0.20/CHANGES.txt (original)
+++ hadoop/hbase/branches/0.20/CHANGES.txt Tue Apr 20 23:10:06 2010
@@ -11,7 +11,9 @@ Release 0.20.4 - Unreleased
    HBASE-2165 Improve fragmentation display and implementation
    HBASE-2448 Remove 'indexed' contrib
    HBASE-2248 Provide new non-copy mechanism to assure atomic reads in
-              get and scan
+              get and scan
+   HBASE-2294 Enumerate ACID properties of HBase in a well defined spec
+              (Todd Lipcon via Stack)
 
   BUG FIXES
    HBASE-2173 New idx javadoc not included with the rest

Added: hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/acid-semantics.xml
URL: http://svn.apache.org/viewvc/hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/acid-semantics.xml?rev=936110&view=auto
==============================================================================
--- hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/acid-semantics.xml (added)
+++ hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/acid-semantics.xml Tue Apr 20 23:10:06 2010
@@ -0,0 +1,227 @@
+<?xml version="1.0"?>
+<!--
+  Copyright 2002-2008 The Apache Software Foundation
+
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
+          "http://forrest.apache.org/dtd/document-v20.dtd">
+
+
+<document>
+
+  <header>
+    <title>
+      HBase ACID Properties
+    </title>
+  </header>
+
+  <body>
+    <section>
+      <title>About this Document</title>
+      <p>HBase is not an ACID compliant database. However, it does guarantee certain specific
+      properties.</p>
+      <p>This specification enumerates the ACID properties of HBase.</p>
+    </section>
+    <section>
+      <title>Definitions</title>
+      <p>For the sake of common vocabulary, we define the following terms:</p>
+      <dl>
+        <dt>Atomicity</dt>
+        <dd>an operation is atomic if it either completes entirely or not at all</dd>
+
+        <dt>Consistency</dt>
+        <dd>
+          all actions cause the table to transition from one valid state directly to another
+          (eg a row will not disappear during an update, etc)
+        </dd>
+
+        <dt>Isolation</dt>
+        <dd>
+          an operation is isolated if it appears to complete independently of any other concurrent transaction
+        </dd>
+
+        <dt>Durability</dt>
+        <dd>any update that reports "successful" to the client will not be lost</dd>
+
+        <dt>Visibility</dt>
+        <dd>an update is considered visible if any subsequent read will see the update as having been committed</dd>
+      </dl>
+      <p>
+        The terms <em>must</em> and <em>may</em> are used as specified by RFC 2119.
+        In short, the word "must" implies that, if some case exists where the statement
+        is not true, it is a bug. The word "may" implies that, even if the guarantee
+        is provided in a current release, users should not rely on it.
+      </p>
+    </section>
+    <section>
+      <title>APIs to consider</title>
+      <ul>
+        <li>Read APIs
+          <ul>
+            <li>get</li>
+            <li>scan</li>
+          </ul>
+        </li>
+        <li>Write APIs</li>
+        <ul>
+          <li>put</li>
+          <li>batch put</li>
+          <li>delete</li>
+        </ul>
+        <li>Combination (read-modify-write) APIs</li>
+        <ul>
+          <li>incrementColumnValue</li>
+          <li>checkAndPut</li>
+        </ul>
+      </ul>
+    </section>
+
+    <section>
+      <title>Guarantees Provided</title>
+
+      <section>
+        <title>Atomicity</title>
+
+        <ol>
+          <li>All mutations are atomic within a row. Any put will either wholely succeed or wholely fail.</li>
+          <ol>
+            <li>An operation that returns a "success" code has completely succeeded.</li>
+            <li>An operation that returns a "failure" code has completely failed.</li>
+            <li>An operation that times out may have succeeded and may have failed. However,
+            it will not have partially succeeded or failed.</li>
+          </ol>
+          <li> This is true even if the mutation crosses multiple column families within a row.</li>
+          <li> APIs that mutate several rows will _not_ be atomic across the multiple rows.
+          For example, a multiput that operates on rows 'a','b', and 'c' may return having
+          mutated some but not all of the rows. In such cases, these APIs will return a list
+          of success codes, each of which may be succeeded, failed, or timed out as described above.</li>
+          <li> The checkAndPut API happens atomically like the typical compareAndSet (CAS) operation
+          found in many hardware architectures.</li>
+          <li> The order of mutations is seen to happen in a well-defined order for each row, with no
+          interleaving.
For example, if one writer issues the mutation "a=1,b=1,c=1" and
+          another writer issues the mutation "a=2,b=2,c=2", the row must either
+          be "a=1,b=1,c=1" or "a=2,b=2,c=2" and must <em>not</em> be something
+          like "a=1,b=2,c=1".</li>
+          <ol>
+            <li>Please note that this is not true _across rows_ for multirow batch mutations.</li>
+          </ol>
+        </ol>
+      </section>
+      <section>
+        <title>Consistency and Isolation</title>
+        <ol>
+          <li>All rows returned via any access API will consist of a complete row that existed at
+          some point in the table's history.</li>
+          <li>This is true across column families - i.e a get of a full row that occurs concurrent
+          with some mutations 1,2,3,4,5 will return a complete row that existed at some point in time
+          between mutation i and i+1 for some i between 1 and 5.</li>
+          <li>The state of a row will only move forward through the history of edits to it.</li>
+        </ol>
+
+        <section><title>Consistency of Scans</title>
+          <p>
+            A scan is <strong>not</strong> a consistent view of a table. Scans do
+            <strong>not</strong> exhibit <em>snapshot isolation</em>.
+          </p>
+          <p>
+            Rather, scans have the following properties:
+          </p>
+
+          <ol>
+            <li>
+              Any row returned by the scan will be a consistent view (i.e. that version
+              of the complete row existed at some point in time)
+            </li>
+            <li>
+              A scan will always reflect a view of the data <em>at least as new as</em>
+              the beginning of the scan.
This satisfies the visibility guarantees
+              enumerated below.</li>
+            <ol>
+              <li>For example, if client A writes data X and then communicates via a side
+              channel to client B, any scans started by client B will contain data at least
+              as new as X.</li>
+              <li>A scan _must_ reflect all mutations committed prior to the construction
+              of the scanner, and _may_ reflect some mutations committed subsequent to the
+              construction of the scanner.</li>
+              <li>Scans must include <em>all</em> data written prior to the scan (except in
+              the case where data is subsequently mutated, in which case it _may_ reflect
+              the mutation)</li>
+            </ol>
+          </ol>
+          <p>
+            Those familiar with relational databases will recognize this isolation level as "read committed".
+          </p>
+          <p>
+            Please note that the guarantees listed above regarding scanner consistency
+            are referring to "transaction commit time", not the "timestamp"
+            field of each cell. That is to say, a scanner started at time <em>t</em> may see edits
+            with a timestamp value greater than <em>t</em>, if those edits were committed with a
+            "forward dated" timestamp before the scanner was constructed.
+          </p>
+        </section>
+      </section>
+      <section>
+        <title>Visibility</title>
+        <ol>
+          <li> When a client receives a "success" response for any mutation, that
+          mutation is immediately visible to both that client and any client with whom it
+          later communicates through side channels.</li>
+          <li> A row must never exhibit so-called "time-travel" properties.
That
+          is to say, if a series of mutations moves a row sequentially through a series of
+          states, any sequence of concurrent reads will return a subsequence of those states.</li>
+          <ol>
+            <li>For example, if a row's cells are mutated using the "incrementColumnValue"
+            API, a client must never see the value of any cell decrease.</li>
+            <li>This is true regardless of which read API is used to read back the mutation.</li>
+          </ol>
+          <li> Any version of a cell that has been returned to a read operation is guaranteed to
+          be durably stored.</li>
+        </ol>
+
+      </section>
+      <section>
+        <title>Durability</title>
+        <ol>
+          <li> All visible data is also durable data. That is to say, a read will never return
+          data that has not been made durable on disk[1]</li>
+          <li> Any operation that returns a "success" code (eg does not throw an exception)
+          will be made durable.</li>
+          <li> Any operation that returns a "failure" code will not be made durable
+          (subject to the Atomicity guarantees above)</li>
+          <li> All reasonable failure scenarios will not affect any of the guarantees of this document.</li>
+
+        </ol>
+      </section>
+      <section>
+        <title>Tunability</title>
+        <p>All of the above guarantees must be possible within HBase. For users who would like to trade
+        off some guarantees for performance, HBase may offer several tuning options. For example:</p>
+        <ul>
+          <li>Visibility may be tuned on a per-read basis to allow stale reads or time travel.</li>
+          <li>Durability may be tuned to only flush data to disk on a periodic basis</li>
+        </ul>
+      </section>
+    </section>
+    <section>
+      <title>Footnotes</title>
+
+      <p>[1] In the context of HBase, "durably on disk" implies an hflush() call on the transaction
+      log. This does not actually imply an fsync() to magnetic media, but rather just that the data has been
+      written to the OS cache on all replicas of the log.
In the case of a full datacenter power loss, it is
+      possible that the edits are not truly durable.</p>
+    </section>
+
+  </body>
+</document>

Modified: hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/site.xml
URL: http://svn.apache.org/viewvc/hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/site.xml?rev=936110&r1=936109&r2=936110&view=diff
==============================================================================
--- hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/site.xml (original)
+++ hadoop/hbase/branches/0.20/src/docs/src/documentation/content/xdocs/site.xml Tue Apr 20 23:10:06 2010
@@ -36,6 +36,7 @@ See http://forrest.apache.org/docs/linki
     <started label="Getting Started" href="ext:api/started" />
     <api label="API Docs" href="ext:api/index" />
     <api label="HBase Metrics" href="metrics.html" />
+    <api label="HBase Semantics" href="acid-semantics.html" />
     <api label="HBase Default Configuration" href="hbase-conf.html" />
     <api label="HBase on Windows" href="cygwin.html" />
     <wiki label="Wiki" href="ext:wiki" />