Re: Compaction throttling and per-region compaction automation

Jeremy Carroll Tue, 06 Nov 2012 11:26:20 -0800

Wrong issue. https://issues.apache.org/jira/browse/HBASE-5920



On Tue, Nov 6, 2012 at 11:24 AM, Jeremy Carroll <[email protected]> wrote:

> We ran into the throttleSize here at Klout in this issue (
> https://issues.apache.org/jira/browse/HBASE-592). Everything was promoted
> to the major compaction threads as a result. It was an error on our part
> since we were bulk loading files, and not using puts / flush sizes for its
> compaction logic.
>
> The patches are not IO rate limiting, but a way to determine how many
> compaction threads are available to run (Queues), and when items are
> promoted from the small (minor) queue to the large (major) queue.
>
> I would welcome any real IO throttling on a per server basis. ;)
>
>
> On Tue, Nov 6, 2012 at 11:06 AM, Otis Gospodnetic <
> [email protected]> wrote:
>
>> Thanks Jeremy (I see you everywhere I turn!)
>>
>> https://issues.apache.org/jira/browse/HBASE-5867 sounds like there is
>> compaction throttling in 0.96.0, no?
>>
>> Lucene faces very similar problems as HBase, I think.
>> * An index has multiple segments.
>> * Files are added, not modified.
>> * Deletions are markers/tombstones.
>> * Optimization process purges deletes and rewrites segments.
>>
>> But there are some other options like:
>> * Optimize only partially, not all the way down to just 1 segment, but
>>   down to N
>> * Only expunge deletes, don't actually merge segments and rewrite them to
>>   disk
>> * Pick segments with most deletes first
>> * Throttle IO
>>    See
>>    http://search-lucene.com/c/Lucene:core/src/java/org/apache/lucene/store/RateLimiter.java
>>    http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/store/RateLimiter.html
>>    http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/index/TieredMergePolicy.html
>>    http://search-lucene.com/?q=throttle+merge&fc_project=Lucene
>>
>> Maybe some of the above is "borrowable" if throttling has not been
>> implemented yet.
>>
>> Otis
>> --
>> Search Analytics - http://sematext.com/search-analytics/index.html
>> Performance Monitoring - http://sematext.com/spm/index.html
>>
>>
>> On Tue, Nov 6, 2012 at 1:59 PM, Jeremy Carroll <[email protected]>
>> wrote:
>>
>> > To date I have used the major / minor compaction threads to control how
>> > many compactions are allowed to exist at one time on a per RegionServer
>> > basis. Then I compact a table, and have the threads control how many
>> > regions can compact at once. Take care if minors are upgraded to majors,
>> > as there is no throttling on disk IO for majors, which can be very
>> > impactful.
>> >
>> > Aravind created an offline compaction script which pre-dated the
>> > threading implementation which you may find useful.
>> >
>> > https://github.com/aravind/hbase_compact
>> >
>> > On Tue, Nov 6, 2012 at 10:38 AM, Otis Gospodnetic <
>> > [email protected]> wrote:
>> >
>> > > Hi,
>> > >
>> > > Major compactions are..... you know... :)
>> > > So I saw there is https://issues.apache.org/jira/browse/HBASE-3743 for
>> > > throttling them.
>> > >
>> > > We are about to try the per-region compaction, so I was wondering if
>> > > anyone has written a tool/script to automate that a bit?
>> > >
>> > > Thanks,
>> > > Otis
>> > > --
>> > > Search Analytics - http://sematext.com/search-analytics/index.html
>> > > Performance Monitoring - http://sematext.com/spm/index.html
>> > >
>> >
>>
>
>
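
The thread distinguishes queue/thread limits (what the HBase patches provide) from real IO rate limiting (what Lucene's RateLimiter offers). As an illustration of the latter only, here is a minimal sketch of a pause-based byte-rate limiter in Python. It is not HBase or Lucene code: the class name SimpleRateLimiter, its pause() method, and the mb_per_sec, clock and sleep parameters are all hypothetical names chosen for this example.

```python
import time


class SimpleRateLimiter:
    """Caps average write throughput by pausing the caller.

    Hypothetical helper for illustration; not an HBase or Lucene API.
    """

    def __init__(self, mb_per_sec, clock=time.monotonic, sleep=time.sleep):
        if mb_per_sec <= 0:
            raise ValueError("mb_per_sec must be positive")
        self.bytes_per_sec = mb_per_sec * 1024 * 1024
        self._clock = clock
        self._sleep = sleep
        # Time up to which already-written bytes have "used" the budget.
        self._last = clock()

    def pause(self, num_bytes):
        """Account for num_bytes just written and sleep if ahead of budget.

        Returns the number of seconds slept (0.0 if no pause was needed).
        """
        target = self._last + num_bytes / self.bytes_per_sec
        now = self._clock()
        if target <= now:
            # Writer is behind the budget (e.g. it was idle); do not bank
            # the idle time as credit for a later burst.
            self._last = now
            return 0.0
        self._last = target
        delay = target - now
        self._sleep(delay)
        return delay
```

A writer (for example, a loop copying compaction output in chunks) would call pause(len(chunk)) after each write; with mb_per_sec=20 the loop would average at most roughly 20 MB/s. The clock and sleep arguments exist only so the limiter can be exercised without real waiting.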
