Hi,

I have now tested this on a larger cluster, and the initial wait will
definitely cause frustration. I will go with plan B and load the
fragmentation asynchronously. Any preferred way to do that? A small
thread class like the other workers, or is there a better way? Is that
OK? It means one more thread in the master, which is idle most of the
time compared to the RS's.
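
To make the question concrete, here is a minimal sketch of the kind of small
thread class I mean. It is not existing HBase code: the class name
FragmentationLoader, the Supplier-based scan callback, and the refresh period
are all assumptions for illustration, and it assumes Java 8 or later. The
thread recomputes the fragmentation map in the background and publishes it
through a volatile field, so the UI only ever reads the cached value.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not an HBase class: refreshes a cached fragmentation
// map in the background so the UI never waits on the scan itself.
class FragmentationLoader extends Thread {
  // The slow computation (table name -> fragmentation in percent); assumed API.
  private final Supplier<Map<String, Integer>> scan;
  private final long periodMillis;
  // Stays null until the first scan finishes; the UI can show "loading" meanwhile.
  private volatile Map<String, Integer> latest = null;
  private volatile boolean running = true;

  FragmentationLoader(Supplier<Map<String, Integer>> scan, long periodMillis) {
    super("fragmentation-loader");
    this.scan = scan;
    this.periodMillis = periodMillis;
    setDaemon(true); // do not keep the JVM alive for this thread
  }

  @Override
  public void run() {
    while (running) {
      try {
        latest = Collections.unmodifiableMap(scan.get());
      } catch (RuntimeException e) {
        // Keep the previous value; a failed scan should not kill the thread.
      }
      try {
        Thread.sleep(periodMillis);
      } catch (InterruptedException e) {
        return; // interrupted by shutdown()
      }
    }
  }

  // Non-blocking read for the UI; may return null before the first scan completes.
  Map<String, Integer> getLatest() {
    return latest;
  }

  void shutdown() {
    running = false;
    interrupt();
  }
}
```

The UI would call getLatest() and render a placeholder while it is still
null, which removes the initial wait at the cost of one mostly idle thread.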

Lars

On Tue, Jan 5, 2010 at 4:14 PM, Lars George (JIRA) <j...@apache.org> wrote:
>
>    [ 
> https://issues.apache.org/jira/browse/HBASE-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796693#action_12796693
>  ]
>
> Lars George commented on HBASE-2021:
> ------------------------------------
>
> I added those "title" attributes to the headers with some "soothing" 
> explanation. Do you think that is not good enough?
>
>> Add compaction details to master UI
>> -----------------------------------
>>
>>                 Key: HBASE-2021
>>                 URL: https://issues.apache.org/jira/browse/HBASE-2021
>>             Project: Hadoop HBase
>>          Issue Type: Improvement
>>            Reporter: Lars George
>>            Assignee: Lars George
>>            Priority: Minor
>>             Fix For: 0.20.3, 0.21.0
>>
>>         Attachments: HBASE-2021-0.20-v2.patch, HBASE-2021-0.20-v3.patch, 
>> HBASE-2021-0.20.patch, HBASE-2021.patch
>>
>>
>> There are two issues with this, first to detect that there is a compaction 
>> needed. You can currently use the little helper util that checks if a table 
>> has at least one colfam with more than one store file. I thought about 
>> scanning all tables and all colfams in each and then computing the 
>> "fragmentation" ratio as the percentage of colfams with more than one store 
>> file out of the total number of colfams. That gives a "Table xyz is 33% 
>> fragmented" output. While minor percentages are normal under insert 
>> operations, it is still important to know how bad the fragmentation is overall.
>> Another idea is to weigh the number of files per store too, so that if you 
>> have two per colfam it is considered "low" and if you have more, for example 
>> 6-8 it is considered "high". Not sure how that can be done yet but noting 
>> the idea down here.
>> Of course seeing the .META. fragmentation is useful to quickly debug 
>> performance issues (as JD told me on IRC).
>> The other issue is that when you have started a compaction you have no idea 
>> how far it is and if it is still in progress. One indication of course is 
>> the above value. If it is 0% then all is done. But if you are at say 23%, is 
>> it still compacting? We could have a simple status that compactions are 
>> still in progress.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
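
For reference, the fragmentation ratio described in the quoted issue could be
computed roughly as below. This is only a sketch under stated assumptions: the
class name Fragmentation and the input map (column family name to number of
store files) are hypothetical and do not reflect how the attached patches
gather the data.

```java
import java.util.Map;

// Hypothetical helper, not taken from the HBASE-2021 patches.
class Fragmentation {
  // storeFilesPerColfam: column family name -> number of store files (assumed input).
  // Returns the percentage of column families that have more than one store file.
  static int percent(Map<String, Integer> storeFilesPerColfam) {
    if (storeFilesPerColfam.isEmpty()) {
      return 0; // nothing to compact, so report no fragmentation
    }
    int fragmented = 0;
    for (int files : storeFilesPerColfam.values()) {
      if (files > 1) {
        fragmented++;
      }
    }
    return Math.round(100f * fragmented / storeFilesPerColfam.size());
  }
}
```

With three column families of which one has several store files, this yields
the "33% fragmented" figure from the issue description.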
