Re: ideas to improve throughput of the base writting

Jinsong Hu Wed, 09 Jun 2010 13:34:46 -0700

I checked the log; there are lots of

2010-06-09 17:26:36,736 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262: memstoresize 128.1m is >= than blocking 128.0m size


then after that there are lots of

2010-06-09 17:26:36,800 DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://namenodes1.cloud.ppops.net:8020/hbase/Spam_MsgEventTable/376337880/message_compound_terms/7606939244559826252, entries=30869, sequenceid=8350447892, memsize=7.2m, filesize=3.4m to Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6



then lots of

2010-06-09 17:26:39,005 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262 'IPC Server handler 8 on 60020'

This cycle happens again and again in the log. What can I do in this case to speed up writing? Right now the writing speed is really slow, close to 4 rows/second for a regionserver.
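For reference, the 128.0m blocking size in these log lines can be reproduced with a little arithmetic. This is only a sketch, not HBase code: it assumes the blocking size is the memstore flush size times a block multiplier, and it assumes the defaults of 64 MB for hbase.hregion.memstore.flush.size and 2 for hbase.hregion.memstore.block.multiplier. The class and method names are made up for illustration.

```java
// Back-of-the-envelope check, not HBase code. The two defaults below are assumptions.
class MemstoreBlockingSketch {
    // Assumed default of hbase.hregion.memstore.flush.size (64 MB).
    static final long ASSUMED_FLUSH_SIZE_BYTES = 64L * 1024 * 1024;
    // Assumed default of hbase.hregion.memstore.block.multiplier.
    static final int ASSUMED_BLOCK_MULTIPLIER = 2;

    /** Memstore size at which updates to a region would be blocked. */
    static long blockingSizeBytes(long flushSizeBytes, int blockMultiplier) {
        return flushSizeBytes * blockMultiplier;
    }

    /** Mirrors the ">=" comparison reported in the log line. */
    static boolean updatesBlocked(long memstoreBytes, long blockingBytes) {
        return memstoreBytes >= blockingBytes;
    }

    public static void main(String[] args) {
        long blocking = blockingSizeBytes(ASSUMED_FLUSH_SIZE_BYTES, ASSUMED_BLOCK_MULTIPLIER);
        // 128.1m is the memstore size reported in the log above.
        long observed = (long) (128.1 * 1024 * 1024);
        System.out.println("blocking size (MB): " + blocking / (1024 * 1024));
        System.out.println("updates blocked: " + updatesBlocked(observed, blocking));
    }
}
```

If those assumptions hold, the region blocks as soon as its memstore reaches twice the flush size, which matches a flush that cannot keep up with incoming writes.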

I checked the code to try to find out why there are so many store files, and I noticed that each second, when the regionserver reports to the master, it calls the memstore flush and writes a store file.

The parameter hbase.regionserver.msginterval has a default value of 1 second. I am thinking of changing it to 10 seconds. Can that help? I am also thinking of changing hbase.hstore.blockingStoreFiles to 1000. I noticed that there is a parameter hbase.hstore.blockingWaitTime with a default value of 1.5 minutes; once the 1.5 minutes is reached, the compaction is executed. I am fine with running compaction every 1.5 minutes, but running compaction every second and keeping CPU consistently above 100% is not wanted.

Any suggestion what kind of parameters to change to improve my writing speed?


Jimmy




--------------------------------------------------
From: "Ryan Rawson" <[email protected]>
Sent: Wednesday, June 09, 2010 1:01 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the base writting

The log will say something like "blocking updates to..." when you hit
a limit.  That log you indicate is just the regionserver attempting to
compact a region, but shouldn't prevent updates.

what else does your logfile say?  Search for the string (case
insensitive) "blocking updates"...

-ryan

On Wed, Jun 9, 2010 at 11:52 AM, Jinsong Hu <[email protected]> wrote:


I made this change
<property>
 <name>hbase.hstore.blockingStoreFiles</name>
 <value>15</value>
</property>

the system is still slow.

Here is the most recent value for the region :
stores=21, storefiles=186, storefileSizeMB=9681, memstoreSizeMB=128,
storefileIndexSizeMB=12


And the same log still happens:

2010-06-09 18:36:40,577 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region SOME_ABCEventTable,2010-06-09 09:56:56\x093dc01b4d2c4872963717d80d8b5c74b1,1276107447570 has too many store files, putting it back at the end of the flush queue.

One idea that I have now is to further increase
hbase.hstore.blockingStoreFiles to a very high
number, such as 1000.  What is the negative impact of this change?


Jimmy


--------------------------------------------------
From: "Ryan Rawson" <[email protected]>
Sent: Monday, June 07, 2010 3:58 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the base writting

Try setting this config value:

<property>
 <name>hbase.hstore.blockingStoreFiles</name>
 <value>15</value>
</property>

and see if that helps.

The thing about the 1 compact thread is that the scarce resource being
preserved in this case is cluster IO.  People have had issues with
compaction IO being too heavy.

in your case, this setting can let the regionserver build up more
store files without pausing your import.
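To illustrate what this setting gates, here is a simplified model of the check behind the "too many store files" warning. It is not the HBase source. It assumes the flusher defers a flush while any one store in the region holds more files than hbase.hstore.blockingStoreFiles, and stops deferring once hbase.hstore.blockingWaitTime has elapsed. The strict ">" comparison and all names here are assumptions.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model only; names and exact comparisons are assumptions, not HBase code.
class FlushGateSketch {
    /** True if any single store in the region holds more files than the limit. */
    static boolean tooManyStoreFiles(List<Integer> storeFileCounts, int blockingStoreFiles) {
        for (int count : storeFileCounts) {
            if (count > blockingStoreFiles) {
                return true;
            }
        }
        return false;
    }

    /** A flush is put back on the queue only while over the limit and inside the wait window. */
    static boolean deferFlush(List<Integer> storeFileCounts, int blockingStoreFiles,
                              long waitedMs, long blockingWaitTimeMs) {
        return tooManyStoreFiles(storeFileCounts, blockingStoreFiles)
            && waitedMs < blockingWaitTimeMs;
    }

    public static void main(String[] args) {
        // Hypothetical per-store file counts for one region.
        List<Integer> counts = Arrays.asList(3, 16, 9);
        System.out.println("defer with limit 15: " + deferFlush(counts, 15, 1000L, 90000L));
        System.out.println("defer with limit 1000: " + deferFlush(counts, 1000, 1000L, 90000L));
    }
}
```

Under this model a higher limit means fewer deferred flushes, so the memstore fills and blocks less often during an import, while store files accumulate until compaction catches up.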

-ryan

On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu <[email protected]> wrote:


Hi,  There:

While saving lots of data to hbase, I noticed that the regionserver CPU
went to more than 100%. Examination shows that the hbase CompactSplit
thread is spending full time compacting/splitting hbase store files.
The machine I have is an 8-core machine. Because there is only one
compact/split thread in hbase, only one core is fully used.

I continue to submit map/reduce jobs to insert records into hbase. Most of
the time, the job runs very fast, around 1-5 minutes. But occasionally it
can take 2 hours. That is very bad for me. I highly suspect that the
occasional slow insertion is related to the insufficient speed of the
compactsplit thread.

I am thinking that I should parallelize the compactsplit thread. The code
has this: the for loop "for (Store store: stores.values())" can be
parallelized via Java 5's thread pool, so that multiple cores are used
instead of only one. I wonder if this will help to increase the
throughput.

Somebody mentioned that I can increase the region size so that I don't do
so many compactions under heavy writing.
Does anybody have experience showing it helps?

Jimmy.



byte [] compactStores(final boolean majorCompaction)
throws IOException {
  if (this.closing.get() || this.closed.get()) {
    LOG.debug("Skipping compaction on " + this + " because closing/closed");
    return null;
  }
  splitsAndClosesLock.readLock().lock();
  try {
    byte [] splitRow = null;
    if (this.closed.get()) {
      return splitRow;
    }
    try {
      synchronized (writestate) {
        if (!writestate.compacting && writestate.writesEnabled) {
          writestate.compacting = true;
        } else {
          LOG.info("NOT compacting region " + this +
              ": compacting=" + writestate.compacting + ", writesEnabled=" +
              writestate.writesEnabled);
          return splitRow;
        }
      }
      LOG.info("Starting" + (majorCompaction? " major " : " ") +
          "compaction on region " + this);
      long startTime = System.currentTimeMillis();
      doRegionCompactionPrep();
      long maxSize = -1;
      for (Store store: stores.values()) {
        final Store.StoreSize ss = store.compact(majorCompaction);
        if (ss != null && ss.getSize() > maxSize) {
          maxSize = ss.getSize();
          splitRow = ss.getSplitRow();
        }
      }
      doRegionCompactionCleanup();
      String timeTaken = StringUtils.formatTimeDiff(System.currentTimeMillis(),
          startTime);
      LOG.info("compaction completed on region " + this + " in " + timeTaken);
    } finally {
      synchronized (writestate) {
        writestate.compacting = false;
        writestate.notifyAll();
      }
    }
    return splitRow;
  } finally {
    splitsAndClosesLock.readLock().unlock();
  }
}
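
The thread-pool idea from the original message could look roughly like the sketch below. It is a standalone illustration under stated assumptions, not a patch against HRegion: Store and StoreSize here are hypothetical stand-ins for the HBase classes, and the locking, writestate handling and logging from the listing above are left out. Only the per-store loop is replaced, using java.util.concurrent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelCompactSketch {
    /** Hypothetical stand-in for Store.StoreSize. */
    static final class StoreSize {
        final long size;
        final byte[] splitRow;
        StoreSize(long size, byte[] splitRow) { this.size = size; this.splitRow = splitRow; }
    }

    /** Hypothetical stand-in for Store; compact() may return null. */
    interface Store {
        StoreSize compact(boolean majorCompaction) throws Exception;
    }

    /** Test helper: a fake store that "compacts" instantly to a fixed size and split row. */
    static Store fixedStore(final long size, final String row) {
        return new Store() {
            public StoreSize compact(boolean majorCompaction) {
                return new StoreSize(size, row.getBytes());
            }
        };
    }

    /** Compacts every store on a fixed-size pool; returns the split row of the largest result. */
    static byte[] compactStoresInParallel(List<Store> stores, final boolean majorCompaction,
                                          int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<StoreSize>> tasks = new ArrayList<Callable<StoreSize>>();
            for (final Store store : stores) {
                tasks.add(new Callable<StoreSize>() {
                    public StoreSize call() throws Exception {
                        return store.compact(majorCompaction);
                    }
                });
            }
            long maxSize = -1;
            byte[] splitRow = null;
            // invokeAll blocks until every compaction task has completed.
            for (Future<StoreSize> result : pool.invokeAll(tasks)) {
                StoreSize ss = result.get();
                if (ss != null && ss.size > maxSize) {
                    maxSize = ss.size;
                    splitRow = ss.splitRow;
                }
            }
            return splitRow;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Store> stores = new ArrayList<Store>();
        stores.add(fixedStore(10, "rowA"));
        stores.add(fixedStore(30, "rowB"));
        byte[] splitRow = compactStoresInParallel(stores, false, 4);
        System.out.println("split row: " + new String(splitRow));
    }
}
```

Whether this helps in practice is a separate question. As noted earlier in the thread, the single compaction thread exists to protect cluster IO, so running several compactions at once may simply move the bottleneck from CPU to disk and network.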
