[ 
https://issues.apache.org/jira/browse/HBASE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838398#comment-13838398
 ] 

Nick Dimiduk commented on HBASE-10074:
--------------------------------------

I don't read DocBook, so I cannot comment on the markup. However, the 
content is great! Here are some nits.

bq. <para>The master as is is allergic to tons of regions

Mind adding some JIRA references here?

bq. tons of regions on a few RS can cause the store file index to rise raising 
heap usage and...

"store file index to rise raising"? This sentence confuses me.

bq. Keeping 5 regions per RS would be too low for a job, whereas 1000 will 
generate too many maps.

How about "Hosting only 5 regions per RS will not provide enough task splits 
for a MapReduce job, while 1000 regions will generate far too many map tasks."

In section {{<section xml:id="ops.capacity.regions"><title>Determining region 
count and size</title>}} you suggest "20-200 regions per RS" but previously you 
said "20-100".

+1

> consolidate and improve capacity/sizing documentation
> -----------------------------------------------------
>
>                 Key: HBASE-10074
>                 URL: https://issues.apache.org/jira/browse/HBASE-10074
>             Project: HBase
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-10074.patch
>
>
> Region count description is in config section; region size description is in 
> architecture sections; both of these have a lot of good technical details, 
> but imho we could do better in terms of admin-centric advice. 
> Currently, there's a nearly-empty capacity section; I'd like to rewrite it to 
> consolidate capacity planning/sizing/region sizing information, and some 
> basic configuration pertaining to it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
