[ 
https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izaak Rubin updated HBASE-745:
------------------------------

    Attachment: hbase-745-for-0.2.patch

I've been looking over the issue, and I (and Stack) agree with LN and the 
changes proposed in his patch.  However, as Jim noted, we want to focus on 0.2 
rather than 0.1.3.  I've taken LN's patch and modified it slightly to fit 
into trunk (hbase-745-for-0.2.patch).  I've also added several assertions 
to TestCompaction to account for the changes.

All HBase tests passed successfully.  However, this patch SHOULD NOT be applied 
until after HBASE-720 is resolved and its patch (hbase-720.patch) is applied.  
Both of these patches modify the same two files (HStore, TestCompaction), and 
they must be committed in the correct order (first 720, then 745).

> scaling of one regionserver, improving memory and cpu usage
> -----------------------------------------------------------
>
>                 Key: HBASE-745
>                 URL: https://issues.apache.org/jira/browse/HBASE-745
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.1.3, 0.2.0
>         Environment: hadoop 0.17.1
>            Reporter: LN
>            Priority: Minor
>         Attachments: hbase-745-for-0.2.patch, HBASE-745.compact.patch
>
>
> after weeks of testing hbase 0.1.3 and hadoop (0.16.4, 0.17.1), i found there 
> is much work to do before a single regionserver can handle about 100G of 
> data, or even more. i'd like to share my opinions here with stack and other 
> developers.
> first, the easiest way to improve the scalability of a regionserver is to 
> upgrade hardware: use a 64bit os and 8G memory for the regionserver process, 
> and speed up disk io.
> besides hardware, the following are software bottlenecks i found in the 
> regionserver:
> 1. as data increases, compaction eats cpu (and io) time; the total 
> compaction time is basically linear in the whole data size, or even worse, 
> sometimes quadratic in that size.
> 2. memory usage depends on open mapfiles
> 3. network connections depend on open mapfiles, see HADOOP-2341 and 
> HBASE-24.
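
To illustrate point 1 of the description, here is a rough toy model of how total compaction work can grow quadratically with data size. It is only a sketch and not HStore's actual compaction policy: it assumes fixed-size flushes and that every compaction rewrites the whole store into a single file once a file-count threshold is reached. The class name, method name, and parameters are all made up for illustration.

```java
public final class CompactionCostSketch {

    /**
     * Toy model (hypothetical, not HBase code): counts the bytes rewritten by
     * compactions after a number of equal-sized flushes, assuming each
     * compaction merges ALL store files into one.
     */
    static long totalBytesRewritten(int flushes, long flushBytes, int threshold) {
        long rewritten = 0;       // total bytes written by all compactions so far
        long compactedBytes = 0;  // size of the single compacted file (0 if none yet)
        int looseFiles = 0;       // flush files not yet merged

        for (int i = 0; i < flushes; i++) {
            looseFiles++;
            int fileCount = looseFiles + (compactedBytes > 0 ? 1 : 0);
            if (fileCount >= threshold) {
                // The merge rewrites the old compacted file plus every loose file.
                compactedBytes += looseFiles * flushBytes;
                rewritten += compactedBytes;
                looseFiles = 0;
            }
        }
        return rewritten;
    }

    public static void main(String[] args) {
        // Doubling the number of flushes roughly quadruples the rewritten bytes.
        for (int flushes = 100; flushes <= 800; flushes *= 2) {
            System.out.println(flushes + " flushes -> "
                    + totalBytesRewritten(flushes, 1L, 3) + " units rewritten");
        }
    }
}
```

Under these assumptions each compaction costs as much as the data accumulated so far, so the sum over all compactions grows with the square of the data size. A policy that merges only a bounded subset of files per compaction would behave differently.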

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
