Re: ideas to improve throughput of the base writing

Vidhyashankar Venkataraman Mon, 07 Jun 2010 17:09:05 -0700

A related question then:

Going by the response to a question I posed a few days ago, the value of 
hbase.hstore.blockingStoreFiles seems to depend on the compactionThreshold 
and the number of column families (that is, the number of Stores).


Reaching compactionThreshold triggers a compaction in a Store for a 
particular region, during which flushes are disabled (which means no more 
store files for this Store until the compaction finishes). Writes to the other 
Stores in the same region continue until each of them reaches 
compactionThreshold, after which they trigger compactions themselves.

Which means that, for low values of compactionThreshold and a small number of 
column families, the value of blockingStoreFiles may never be reached, which 
makes it a redundant parameter.

Is this correct?

Vidhya

On 6/7/10 3:58 PM, "Ryan Rawson" <[email protected]> wrote:

Try setting this config value:

<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>15</value>
</property>

and see if that helps.

The thing about the single compaction thread is that the scarce resource
being preserved in this case is cluster IO.  People have had issues with
compaction IO being too heavy.

In your case, this setting lets the regionserver build up more
store files without pausing your import.

-ryan

On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu <[email protected]> wrote:
> Hi there,
>  While saving lots of data to hbase, I noticed that the regionserver CPU
> went to more than 100%. Examination shows that the hbase CompactSplit thread
> is spending all of its time compacting/splitting hbase store files. The
> machine I have is an 8-core machine. Because there is only one compact/split
> thread in hbase, only one core is fully used.
>  I continue to submit map/reduce jobs to insert records into hbase. Most of
> the time, a job runs very fast, around 1-5 minutes. But occasionally, it
> can take 2 hours. That is very bad for me. I strongly suspect that the
> occasional slow insertion is related to the insufficient speed of the
> compactsplit thread.
>  I am thinking that I should parallelize the compactsplit thread. The code
> below has the loop "for (Store store: stores.values())", which could be
> parallelized via Java 5's thread pool, so that multiple cores are used
> instead of only one. I wonder if this will help to increase the throughput.
>
>  Somebody mentioned that I can increase the region size so that I don't do
> so many compactions under heavy writing.
> Does anybody have experience showing that it helps?
>
> Jimmy.
>
>
>
>  byte [] compactStores(final boolean majorCompaction)
>  throws IOException {
>   if (this.closing.get() || this.closed.get()) {
>     LOG.debug("Skipping compaction on " + this + " because closing/closed");
>     return null;
>   }
>   splitsAndClosesLock.readLock().lock();
>   try {
>     byte [] splitRow = null;
>     if (this.closed.get()) {
>       return splitRow;
>     }
>     try {
>       synchronized (writestate) {
>         if (!writestate.compacting && writestate.writesEnabled) {
>           writestate.compacting = true;
>         } else {
>           LOG.info("NOT compacting region " + this +
>               ": compacting=" + writestate.compacting + ", writesEnabled=" +
>               writestate.writesEnabled);
>           return splitRow;
>         }
>       }
>       LOG.info("Starting" + (majorCompaction? " major " : " ") +
>           "compaction on region " + this);
>       long startTime = System.currentTimeMillis();
>       doRegionCompactionPrep();
>       long maxSize = -1;
>       for (Store store: stores.values()) {
>         final Store.StoreSize ss = store.compact(majorCompaction);
>         if (ss != null && ss.getSize() > maxSize) {
>           maxSize = ss.getSize();
>           splitRow = ss.getSplitRow();
>         }
>       }
>       doRegionCompactionCleanup();
>       String timeTaken = StringUtils.formatTimeDiff(System.currentTimeMillis(),
>           startTime);
>       LOG.info("compaction completed on region " + this + " in " + timeTaken);
>     } finally {
>       synchronized (writestate) {
>         writestate.compacting = false;
>         writestate.notifyAll();
>       }
>     }
>     return splitRow;
>   } finally {
>     splitsAndClosesLock.readLock().unlock();
>   }
>  }
>
>
>
>
>
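
For illustration only, here is a minimal sketch of what parallelizing the per-store loop quoted above with a java.util.concurrent thread pool might look like. It is not HBase code: the Store and StoreSize types below are hypothetical stand-ins that model only the compact() call and its size/split-row result, and the class name, method name, and thread-count parameter are assumptions. The locking, writestate handling, and prep/cleanup steps of the real method are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCompactSketch {

  /** Hypothetical stand-in for Store.StoreSize: a size plus a candidate split row. */
  public static final class StoreSize {
    final long size;
    final byte[] splitRow;
    public StoreSize(long size, byte[] splitRow) {
      this.size = size;
      this.splitRow = splitRow;
    }
  }

  /** Hypothetical stand-in for Store; only the compact call is modeled. */
  public interface Store {
    StoreSize compact(boolean majorCompaction) throws Exception;
  }

  /**
   * Compacts every store on a fixed-size pool and returns the split row of the
   * largest resulting store, or null if no store reported a size.
   */
  public static byte[] compactInParallel(List<Store> stores,
      final boolean majorCompaction, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // One task per store; each task just delegates to store.compact().
      List<Callable<StoreSize>> tasks = new ArrayList<Callable<StoreSize>>();
      for (final Store store : stores) {
        tasks.add(new Callable<StoreSize>() {
          public StoreSize call() throws Exception {
            return store.compact(majorCompaction);
          }
        });
      }
      long maxSize = -1;
      byte[] splitRow = null;
      // invokeAll blocks until all tasks have completed; get() rethrows a
      // task failure wrapped in ExecutionException.
      for (Future<StoreSize> f : pool.invokeAll(tasks)) {
        StoreSize ss = f.get();
        if (ss != null && ss.size > maxSize) {
          maxSize = ss.size;
          splitRow = ss.splitRow;
        }
      }
      return splitRow;
    } finally {
      pool.shutdown();
    }
  }
}
```

The selection of maxSize and splitRow stays on the calling thread, so only the compact() calls run concurrently. Whether this would raise throughput is a separate question: as noted earlier in the thread, the resource the single thread protects is cluster IO, so extra cores may not help if compaction is IO-bound.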
