Author: lewismc
Date: Wed Feb 12 18:12:38 2014
New Revision: 1567707
URL: http://svn.apache.org/r1567707
Log:
CMS commit to gora by lewismc
Modified:
gora/site/trunk/content/current/gora-hbase.md
Modified: gora/site/trunk/content/current/gora-hbase.md
URL:
http://svn.apache.org/viewvc/gora/site/trunk/content/current/gora-hbase.md?rev=1567707&r1=1567706&r2=1567707&view=diff
==============================================================================
--- gora/site/trunk/content/current/gora-hbase.md (original)
+++ gora/site/trunk/content/current/gora-hbase.md Wed Feb 12 18:12:38 2014
@@ -30,23 +30,25 @@ Say we wished to map some Employee data
Here you can see that we require the definition of two child elements within
the
<code>gora-orm</code> mapping configuration, namely;
-1. The table element; where we specify:
- * the HBase table name e.g. <b>Employee</b>,
- * the type and definition of families we wish to create within HBase. In
this case we create one family which could have a combination of any of the
following characteristics;
- 1. <b>name</b>: family name e.g. info
- 2. <b>compression</b>: the compression option to use in HBase. Please see <a
href="http://hbase.apache.org/book/compression.html">HBase documentation</a>.
- 3. <b>blockCache</b>: an LRU cache that contains three levels of block
priority to allow for scan-resistance and in-memory ColumnFamilies. Please see
<a
href="https://hbase.apache.org/book/regionserver.arch.html#block.cache">HBase
documentation</a>.
- 4. <b>blockSize</b>: The blocksize can be configured for each ColumnFamily
in a table, and this defaults to 64k. Larger cell values require larger
blocksizes. There is an inverse relationship between blocksize and the
resulting StoreFile indexes (i.e., if the blocksize is doubled then the
resulting indexes should be roughly halved). Please see <a
href="http://hbase.apache.org/book/perf.schema.html#schema.cf.blocksize">HBase
documentation</a>.
- 5. <b>bloomFilter</b>: Bloom Filters can be enabled per-ColumnFamily. We use
<code>HColumnDescriptor.setBloomFilterType(NONE | ROW | ROWCOL)</code> to
enable blooms per Column Family. Default = NONE for no bloom filters. If ROW,
the hash of the row will be added to the bloom on each insert. If ROWCOL, the
hash of the row + column family name + column family qualifier will be added to
the bloom on each key insert. Please see <a
href="http://hbase.apache.org/book/perf.schema.html#schema.bloom">HBase
documentation</a>.
- 6. <b>maxVersions</b>: The maximum number of row versions to store is
configured per column family via <code>HColumnDescriptor</code>. The default
for max versions is <b>3</b>. This is an important parameter because HBase does
not overwrite row values, but rather stores different values per row by time
(and qualifier). Excess versions are removed during major compaction's. The
number of max versions may need to be increased or decreased depending on
application needs. Please see <a
href="http://hbase.apache.org/book/schema.versions.html">HBase
documentation</a>.
- 7. <b>timeToLive</b>: ColumnFamilies can set a TTL length in seconds, and
HBase will automatically delete rows once the expiration time is reached. This
applies to all versions of a row - even the current one. The TTL time encoded
in the HBase for the row is specified in UTC. Please see <a
href="https://hbase.apache.org/book/ttl.html">HBase documentation</a>.
- 8. <b>inMemory</b>: ColumnFamilies can optionally be defined as in-memory.
Data is still persisted to disk, just like any other ColumnFamily. In-memory
blocks have the highest priority in the Block Cache, but it is not a guarantee
that the entire table will be in memory. Please see <a
href="http://hbase.apache.org/book/perf.schema.html#cf.in.memory">HBase
documentation</a>.
-2. Specification of persistent fields which values should map to;
- * the Persistent class name e.g.
<b>org.apache.gora.examples.generated.Employee</b>,
- * the keyClass e.g. <b>java.lang.String</b> which specifies the keys which
map to the field
-values,
- * the Table e.g. <b>Employee</b> which matches to the above Table definition,
- * finally fields which are to be persisted into HBase need to be configured
such that they
-receive a <b>name</b> e.g. (name, dateOfBirth, ssn and salary respectively),
the column <b>family</b>
-to which they belong e.g. (all info in this case) and an additional
<b>qualifier</b>, which enables
-more granular control over the data to be persisted into HBase.
+The table element; where we specify:
+* a parameter relating to the HBase table name e.g. name=<b>"Employee"</b>,
+* a nested element containing the type and definition of families we wish to
create within HBase. In this case we create one family <b>info</b> which could
have a combination of any of the following parameters;
+
+ <b>name</b>: family name e.g. info
+ <b>compression</b>: the compression option to use in HBase. Please see <a
href="http://hbase.apache.org/book/compression.html">HBase documentation</a>.
+ <b>blockCache</b>: an LRU cache that contains three levels of block
priority to allow for scan-resistance and in-memory ColumnFamilies. Please see
<a
href="https://hbase.apache.org/book/regionserver.arch.html#block.cache">HBase
documentation</a>.
+ <b>blockSize</b>: The blocksize can be configured for each ColumnFamily in
a table, and this defaults to 64k. Larger cell values require larger
blocksizes. There is an inverse relationship between blocksize and the
resulting StoreFile indexes (i.e., if the blocksize is doubled then the
resulting indexes should be roughly halved). Please see <a
href="http://hbase.apache.org/book/perf.schema.html#schema.cf.blocksize">HBase
documentation</a>.
+ <b>bloomFilter</b>: Bloom Filters can be enabled per-ColumnFamily. We use
<code>HColumnDescriptor.setBloomFilterType(NONE | ROW | ROWCOL)</code> to
enable blooms per Column Family. Default = NONE for no bloom filters. If ROW,
the hash of the row will be added to the bloom on each insert. If ROWCOL, the
hash of the row + column family name + column family qualifier will be added to
the bloom on each key insert. Please see <a
href="http://hbase.apache.org/book/perf.schema.html#schema.bloom">HBase
documentation</a>.
+ <b>maxVersions</b>: The maximum number of row versions to store is
configured per column family via <code>HColumnDescriptor</code>. The default
for max versions is <b>3</b>. This is an important parameter because HBase does
not overwrite row values, but rather stores different values per row by time
(and qualifier). Excess versions are removed during major compactions. The
number of max versions may need to be increased or decreased depending on
application needs. Please see <a
href="http://hbase.apache.org/book/schema.versions.html">HBase
documentation</a>.
+ <b>timeToLive</b>: ColumnFamilies can set a TTL length in seconds, and
HBase will automatically delete rows once the expiration time is reached. This
applies to all versions of a row - even the current one. The TTL time encoded
in the HBase for the row is specified in UTC. Please see <a
href="https://hbase.apache.org/book/ttl.html">HBase documentation</a>.
+ <b>inMemory</b>: ColumnFamilies can optionally be defined as in-memory.
Data is still persisted to disk, just like any other ColumnFamily. In-memory
blocks have the highest priority in the Block Cache, but it is not a guarantee
that the entire table will be in memory. Please see <a
href="http://hbase.apache.org/book/perf.schema.html#cf.in.memory">HBase
documentation</a>.
+
+The class element, where we specify the persistent fields to which values should
map. This contains:
+* a parameter containing the Persistent class name e.g.
<b>org.apache.gora.examples.generated.Employee</b>,
+* a parameter containing the keyClass e.g. <b>java.lang.String</b> which
specifies the keys which map to the field values,
+* a parameter containing the Table name e.g. <b>Employee</b> which matches
the above Table definition,
+* finally nested child element(s) mapping fields which are to be persisted
into HBase. These fields need to be configured such that they receive;
+
+ a parameter containing the <b>name</b> e.g. (name, dateOfBirth, ssn and
salary respectively),
+ a parameter containing the column <b>family</b> to which they belong e.g.
(all info in this case),
+ an optional parameter <b>qualifier</b>, which enables more granular
control over the data to be persisted into HBase.
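
For orientation, the mapping described in the added text could look roughly like the sketch below. It is not taken from this commit. The element and attribute names follow the description above (table, family, class, keyClass, field, qualifier), while the qualifier values are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<gora-orm>
  <!-- Table element: the HBase table name plus nested family definitions. -->
  <table name="Employee">
    <!-- Only the family name is set here. The other family parameters
         described above (compression, blockCache, blockSize, bloomFilter,
         maxVersions, timeToLive, inMemory) are omitted. -->
    <family name="info"/>
  </table>

  <!-- Class element: Persistent class, key class, and the table it maps to. -->
  <class name="org.apache.gora.examples.generated.Employee"
         keyClass="java.lang.String"
         table="Employee">
    <!-- Each field names its column family; the qualifier is optional.
         The qualifier values below are hypothetical. -->
    <field name="name" family="info" qualifier="nm"/>
    <field name="dateOfBirth" family="info" qualifier="db"/>
    <field name="ssn" family="info" qualifier="sn"/>
    <field name="salary" family="info" qualifier="sl"/>
  </class>
</gora-orm>
```

All four fields share the single info family, matching the "all info in this case" note above. A field that omits qualifier would rely on whatever default the datastore applies, which this sketch does not assume.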