[ 
https://issues.apache.org/jira/browse/HBASE-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089851#comment-14089851
 ] 

Jonathan Hsieh commented on HBASE-11476:
----------------------------------------

Let's give this another go -- here's some feedback for the next rev.

{quote}
+          <para>A row in HBase consists of a row key and one or more column 
families. If you think
+            of a row as a key-value pair, the column families are the value. 
</para>
{quote}

Might be simpler to say a "row consists of a rowkey and one or more columns 
with values associated with them."


{quote}
+          <para>A column family loosely corresponds to a type of data. Each 
row in a table has the
+            same column families, though a given row might not store anything 
in a given column
+            family. If an HBase table is a multi-dimensional map, the column 
family is a second
+            dimension.</para>
{quote}

How about something more like this:

Column families physically colocate a set of columns and their values, often 
for performance reasons.  Each column family has a set of storage properties 
(in-memory caching, compression, data block encoding, etc.).
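
To make the "row key plus columns grouped into families" wording concrete, the doc could carry a small sketch along these lines. It is only an in-memory model built from plain Java maps; the class name ConceptualTable and its put/get methods are made up for illustration and are not the HBase client API. Storage properties and versions are left out on purpose.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the conceptual view; NOT the HBase client API.
class ConceptualTable {
    // row key -> column family -> column qualifier -> value, each level sorted by key
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    // Store a value; the row and the family entry are created on first use.
    void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(qualifier, value);
    }

    // Return the stored value, or null if the row, family, or column is absent.
    String get(String rowKey, String family, String qualifier) {
        Map<String, Map<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return null;
        }
        Map<String, String> columns = families.get(family);
        return columns == null ? null : columns.get(qualifier);
    }

    public static void main(String[] args) {
        ConceptualTable table = new ConceptualTable();
        table.put("row1", "contents", "html", "<html>...</html>");
        // A row may store nothing at all in a given family.
        System.out.println(table.get("row1", "contents", "html"));
        System.out.println(table.get("row1", "anchor", "example.com"));
    }
}
```

The point of the sketch is that a row with nothing stored in a family simply has no entry at that level, which is the behavior the quoted paragraph tries to describe.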

{quote}
+          <para>A timestamp is written alongside each value, and is the 
identifier for a given
+            version of a value. By default, the timestamp represents the time 
on the RegionServer
+            when the data was written, but you can specify a different 
timestamp value when you put
+            data into the cell.</para>
{quote}

We should probably say that timestamps are an advanced feature, exposed only 
for special cases that are deeply aware of and integrated with HBase.  
Direct use of these is discouraged -- encoding a timestamp at the application 
level is generally preferred.
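
As one possible illustration of "encoding a timestamp at the application level", the event time could be folded into the row key instead of being passed as the cell timestamp. The sketch below is an assumption on my part, not something the patch proposes: the class name EventRowKey, the '#' separator, and the reversed, zero-padded layout are all hypothetical, and it only handles non-negative millisecond timestamps.

```java
import java.util.Locale;

// Hypothetical helper; the key layout is an assumption, not an HBase convention.
class EventRowKey {
    // Builds "<entityId>#<19-digit reversed timestamp>" so that, for one entity,
    // newer events sort before older ones in lexicographic row key order.
    static String rowKey(String entityId, long eventTimeMillis) {
        if (eventTimeMillis < 0) {
            throw new IllegalArgumentException("timestamp must be non-negative");
        }
        long reversed = Long.MAX_VALUE - eventTimeMillis;
        // Long.MAX_VALUE has 19 digits, so zero-padding to 19 keeps keys equal length.
        return String.format(Locale.ROOT, "%s#%019d", entityId, reversed);
    }

    public static void main(String[] args) {
        System.out.println(rowKey("sensor1", 1000L));
        System.out.println(rowKey("sensor1", 2000L));
    }
}
```

With a layout like this the application owns the time dimension, and the cell timestamp can stay at its default.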

I'll do another pass that looks at the tables/examples.



> Expand 'Conceptual View' section of Data Model chapter 
> -------------------------------------------------------
>
>                 Key: HBASE-11476
>                 URL: https://issues.apache.org/jira/browse/HBASE-11476
>             Project: HBase
>          Issue Type: Bug
>          Components: documentation
>            Reporter: Misty Stanley-Jones
>            Assignee: Misty Stanley-Jones
>         Attachments: HBASE-11476.patch
>
>
> Could use some updating and expansion to emphasize the differences between 
> HBase and an RDBMS. I found 
> http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable 
> which is just excellent and we should link to it.


