[
https://issues.apache.org/jira/browse/HBASE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064533#comment-15064533
]
Hudson commented on HBASE-11985:
--------------------------------
FAILURE: Integrated in HBase-Trunk_matrix #567 (See
[https://builds.apache.org/job/HBase-Trunk_matrix/567/])
HBASE-11985 Document sizing rules of thumb (mstanleyjones: rev
7a4590dfdbda1250f8203e30f6ba1ad0c8094928)
* src/main/asciidoc/_chapters/schema_design.adoc
> Document sizing rules of thumb
> ------------------------------
>
> Key: HBASE-11985
> URL: https://issues.apache.org/jira/browse/HBASE-11985
> Project: HBase
> Issue Type: Task
> Components: documentation
> Affects Versions: 2.0.0
> Reporter: Misty Stanley-Jones
> Assignee: Misty Stanley-Jones
> Fix For: 2.0.0
>
> Attachments: HBASE-11985.patch
>
>
> I'm looking for tuning/sizing rules of thumb to put in the Ref Guide.
> Info I have gleaned so far:
> A reasonable region size is between 10 GB and 50 GB.
> A reasonable maximum cell size is 1 MB to 10 MB. If your cells are larger
> than 10 MB, consider storing the cell contents in HDFS and storing a
> reference to the location in HBase. MOB work, still pending, targets the
> 10 MB to 64 MB window.
> When you size your regions and cells, keep in mind that a region cannot split
> across a row. If your row size is too large, or your region size is too
> small, you can end up with a single row per region, which is not a good
> pattern. It is also possible for one large column to trigger splits while
> the other columns stay tiny, which is probably not desirable either.
> A large number of columns probably indicates a schema design problem.
> Column names should be short because they are stored with every value
> (barring encoding). They do not need to be self-documenting as in an RDBMS.
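
The rules of thumb quoted above can be turned into a rough back-of-the-envelope check. The following is only a sketch, not part of the patch or the Ref Guide: the function names are hypothetical, the thresholds are the ones listed in the description (guidelines, not hard limits), MB and GB are assumed to be binary units, and the column-name estimate assumes no data block encoding or compression.

```python
# Assumption: binary units; the description does not say which convention it uses.
MB = 1024 ** 2
GB = 1024 ** 3

# Thresholds copied from the rules of thumb in the issue description.
REGION_MIN, REGION_MAX = 10 * GB, 50 * GB
CELL_MAX = 10 * MB


def region_size_ok(region_bytes: int) -> bool:
    """True if the region size falls inside the suggested 10 GB to 50 GB range."""
    return REGION_MIN <= region_bytes <= REGION_MAX


def cell_storage_hint(cell_bytes: int) -> str:
    """Suggest where a cell of this size could live (hypothetical labels)."""
    if cell_bytes <= CELL_MAX:
        return "hbase"
    # Above 10 MB: store the contents in HDFS and keep a reference in HBase.
    return "hdfs-reference"


def qualifier_overhead_bytes(qualifier: str, num_cells: int) -> int:
    """Bytes spent repeating the column name, assuming it is stored once
    per value with no encoding or compression applied."""
    return len(qualifier.encode("utf-8")) * num_cells


if __name__ == "__main__":
    cells = 1_000_000_000
    for name in ("customer_email_address", "e"):
        print(name, qualifier_overhead_bytes(name, cells) / GB, "GB")
```

Under these assumptions, the gap between a long and a one-character column name grows linearly with the number of cells, which is the reason for keeping names short.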