Re: ideas to improve throughput of the base writting

Jinsong Hu Wed, 09 Jun 2010 11:53:06 -0700


I made this change:
<property>
 <name>hbase.hstore.blockingStoreFiles</name>
 <value>15</value>
</property>


The system is still slow.

Here are the most recent values for the region:

stores=21, storefiles=186, storefileSizeMB=9681, memstoreSizeMB=128, storefileIndexSizeMB=12
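
To put those region numbers in perspective, here is a rough sketch of the arithmetic in Java. The class and method names are made up for illustration and are not part of HBase. It only computes averages, which say nothing about the largest store; if the blockingStoreFiles limit is checked per store (an assumption here, not something confirmed in this thread), a single store with many files would be enough to trigger the warning.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper only; not an HBase class.
public class RegionStats {
    // Parse comma-separated "key=value" pairs into an ordered map.
    static Map<String, Long> parse(String line) {
        Map<String, Long> stats = new LinkedHashMap<>();
        for (String pair : line.split(",")) {
            String[] kv = pair.trim().split("=");
            stats.put(kv[0].trim(), Long.parseLong(kv[1].trim()));
        }
        return stats;
    }

    // Average number of store files per store in the region.
    static double avgFilesPerStore(Map<String, Long> s) {
        return (double) s.get("storefiles") / s.get("stores");
    }

    // Average size of one store file, in MB.
    static double avgFileSizeMB(Map<String, Long> s) {
        return (double) s.get("storefileSizeMB") / s.get("storefiles");
    }

    public static void main(String[] args) {
        Map<String, Long> s = parse("stores=21, storefiles=186, storefileSizeMB=9681, "
            + "memstoreSizeMB=128, storefileIndexSizeMB=12");
        System.out.printf("avg store files per store: %.1f%n", avgFilesPerStore(s));
        System.out.printf("avg store file size (MB): %.1f%n", avgFileSizeMB(s));
    }
}
```

For the values above this works out to about 8.9 store files per store and about 52 MB per store file.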



And the same log message still appears:

2010-06-09 18:36:40,577 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region SOME_ABCEventTable,2010-06-09 09:56:56\x093dc01b4d2c4872963717d80d8b5c74b1,1276107447570 has too many storefiles, putting it back at the end of the flush queue.

One idea that I have now is to further increase hbase.hstore.blockingStoreFiles to a very high number, such as 1000. What is the negative impact of this change?


Jimmy


--------------------------------------------------
From: "Ryan Rawson" <[email protected]>
Sent: Monday, June 07, 2010 3:58 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the base writting

Try setting this config value:

<property>
 <name>hbase.hstore.blockingStoreFiles</name>
 <value>15</value>
</property>

and see if that helps.

The thing about the single compaction thread is that the scarce
resource being preserved in this case is cluster IO.  People have had
issues with compaction IO being too heavy.

In your case, this setting can let the regionserver build up more
store files without pausing your import.
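
As a rough illustration of that trade-off, here is a toy model in Java. It is not HBase's actual logic: the class name, the one-flush-per-tick schedule and the rule that a compaction merges everything into one file every N ticks are all assumptions made for the sketch. It counts how many flush attempts get deferred and how many store files pile up for a given limit.

```java
// Toy model only; names and schedule are assumptions, not HBase internals.
public class BlockingStoreFilesSketch {
    /**
     * Returns {deferredFlushes, peakStoreFiles}.
     * One flush is attempted per tick. A flush is deferred while the store
     * file count exceeds the limit. Every compactEveryTicks ticks a
     * compaction merges all existing files into a single file.
     */
    static int[] simulate(int blockingStoreFiles, int ticks, int compactEveryTicks) {
        int storeFiles = 0;
        int deferred = 0;
        int peak = 0;
        for (int t = 1; t <= ticks; t++) {
            if (storeFiles > blockingStoreFiles) {
                deferred++;                       // too many files: flush is put back on the queue
            } else {
                storeFiles++;                     // flush succeeds and adds one store file
                peak = Math.max(peak, storeFiles);
            }
            if (t % compactEveryTicks == 0) {
                storeFiles = Math.min(storeFiles, 1); // compaction leaves at most one file
            }
        }
        return new int[] {deferred, peak};
    }

    public static void main(String[] args) {
        for (int limit : new int[] {7, 15, 1000}) {
            int[] r = simulate(limit, 100, 20);
            System.out.println("limit=" + limit + " deferred=" + r[0] + " peak=" + r[1]);
        }
    }
}
```

With these made-up parameters (100 ticks, a compaction every 20 ticks), a limit of 7 defers 64 flush attempts, a limit of 15 defers 24, and a limit of 1000 defers none, while the peak store file count rises from 8 to 16 to 21. A higher limit trades fewer pauses for more files waiting to be compacted.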

-ryan

On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu <[email protected]> wrote:

Hi there,

While saving lots of data to HBase, I noticed that the regionserver CPU
went to more than 100%. Examination shows that the HBase CompactSplit
thread is spending all of its time compacting/splitting HBase store
files. The machine I have is an 8-core machine. Because there is only
one compact/split thread in HBase, only one core is fully used.
I continue to submit map/reduce jobs to insert records into HBase. Most
of the time a job runs very fast, around 1-5 minutes. But occasionally
it can take 2 hours. That is very bad for me. I highly suspect that the
occasional slow insertion is related to the insufficient speed of the
compact/split thread.

I am thinking that I should parallelize the compact/split thread. In
the code below, the for loop "for (Store store: stores.values())" can be
parallelized via Java 5's thread pool, so that multiple cores are used
instead of only one. I wonder if this will help to increase the
throughput.
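
A minimal sketch of that idea, assuming each store can be compacted independently of the others. Store and StoreSize here are simplified stand-ins written for this example, not the real HBase classes, and compactAll is a hypothetical name. The sketch submits one task per store to a fixed-size pool, waits for all of them, and then picks the split row of the largest store, as the sequential loop does. Whether this is safe inside the region's locking, and whether the extra IO is acceptable, is not something the sketch answers.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCompactSketch {
    /** Simplified stand-in for the result of compacting one store. */
    static final class StoreSize {
        final long size;
        final byte[] splitRow;
        StoreSize(long size, byte[] splitRow) {
            this.size = size;
            this.splitRow = splitRow;
        }
    }

    /** Simplified stand-in for a store; compact() may return null. */
    interface Store {
        StoreSize compact(boolean majorCompaction) throws Exception;
    }

    /** Compacts all stores on a pool and returns the split row of the largest one. */
    static byte[] compactAll(Collection<Store> stores, boolean majorCompaction, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<StoreSize>> tasks = new ArrayList<>();
            for (Store store : stores) {
                tasks.add(() -> store.compact(majorCompaction));
            }
            long maxSize = -1;
            byte[] splitRow = null;
            // invokeAll blocks until every task has completed or failed.
            for (Future<StoreSize> f : pool.invokeAll(tasks)) {
                StoreSize ss = f.get(); // rethrows a task failure as ExecutionException
                if (ss != null && ss.size > maxSize) {
                    maxSize = ss.size;
                    splitRow = ss.splitRow;
                }
            }
            return splitRow;
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size caps how many stores compact at once, so it could be kept well below the core count if compaction IO turns out to be the limiting resource.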

Somebody mentioned that I can increase the region size so that I don't
do so many compactions under heavy write load. Does anybody have
experience showing that it helps?

Jimmy.



byte [] compactStores(final boolean majorCompaction)
throws IOException {
  if (this.closing.get() || this.closed.get()) {
    LOG.debug("Skipping compaction on " + this + " because closing/closed");
    return null;
  }
  splitsAndClosesLock.readLock().lock();
  try {
    byte [] splitRow = null;
    if (this.closed.get()) {
      return splitRow;
    }
    try {
      // Only one compaction per region may run at a time.
      synchronized (writestate) {
        if (!writestate.compacting && writestate.writesEnabled) {
          writestate.compacting = true;
        } else {
          LOG.info("NOT compacting region " + this +
              ": compacting=" + writestate.compacting + ",writesEnabled=" +
              writestate.writesEnabled);
          return splitRow;
        }
      }
      LOG.info("Starting" + (majorCompaction? " major " : " ") +
          "compaction on region " + this);
      long startTime = System.currentTimeMillis();
      doRegionCompactionPrep();
      long maxSize = -1;
      // Stores are compacted one after another on the calling thread;
      // the split row of the largest store is kept.
      for (Store store: stores.values()) {
        final Store.StoreSize ss = store.compact(majorCompaction);
        if (ss != null && ss.getSize() > maxSize) {
          maxSize = ss.getSize();
          splitRow = ss.getSplitRow();
        }
      }
      doRegionCompactionCleanup();
      String timeTaken = StringUtils.formatTimeDiff(System.currentTimeMillis(),
          startTime);
      LOG.info("compaction completed on region " + this + " in " + timeTaken);
    } finally {
      // Always clear the flag and wake up anyone waiting on writestate.
      synchronized (writestate) {
        writestate.compacting = false;
        writestate.notifyAll();
      }
    }
    return splitRow;
  } finally {
    splitsAndClosesLock.readLock().unlock();
  }
}
