blockingStoreFiles is a protection mechanism and ideally should never be hit.
As for multithreaded compactions, this is currently under development and, until further notice, targeted at 0.21. See HBASE-1476. Also related to this is HBASE-2375.

JG

> -----Original Message-----
> From: Vidhyashankar Venkataraman [mailto:[email protected]]
> Sent: Monday, June 07, 2010 5:08 PM
> To: [email protected]
> Subject: Re: ideas to improve throughput of the base writting
>
> A related question then:
>
> Going by the response to a previous question that I had posed a few
> days back, the value of hbase.hstore.blockingStoreFiles seems to depend
> on the compactionThreshold and the number of column families (number of
> Stores).
>
> Reaching compactionThreshold triggers a compaction in a Store for a
> particular region, during which flushes are disabled (which means no
> more store files for this Store until the compaction finishes). Writes
> on other Stores in the same region continue until each of them reaches
> compactionThreshold, after which they trigger compactions themselves.
>
> This means that, for low values of compactionThreshold and few column
> families, the value of blockingStoreFiles may never be reached, which
> makes it a redundant parameter.
>
> Is this correct?
>
> Vidhya
>
> On 6/7/10 3:58 PM, "Ryan Rawson" <[email protected]> wrote:
>
> > Try setting this config value:
> >
> > <property>
> >   <name>hbase.hstore.blockingStoreFiles</name>
> >   <value>15</value>
> > </property>
> >
> > and see if that helps.
> >
> > The thing about the single compaction thread is that the scarce
> > resource being preserved in this case is cluster IO. People have had
> > issues with compaction IO being too heavy.
> >
> > In your case, this setting can let the regionserver build up more
> > store files without pausing your import.
> >
> > -ryan
> >
> > On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu <[email protected]> wrote:
> > > Hi, There:
> > >   While saving lots of data to hbase, I noticed that the
> > > regionserver CPU went to more than 100%. Examination shows that the
> > > hbase CompactSplit thread is spending all its time
> > > compacting/splitting hbase store files. The machine I have is an
> > > 8-core machine. Because there is only one compact/split thread in
> > > hbase, only one core is fully used.
> > >   I continue to submit map/reduce jobs to insert records into
> > > hbase. Most of the time a job runs very fast, around 1-5 minutes,
> > > but occasionally it can take 2 hours. That is very bad for me. I
> > > highly suspect that the occasional slow insertion is related to the
> > > insufficient speed of the CompactSplit thread.
> > >   I am thinking that I should parallelize the compact/split work.
> > > The code has this for loop, "for (Store store: stores.values())",
> > > which could be parallelized via Java 5's thread pool, so that
> > > multiple cores are used instead of only one. I wonder if this will
> > > help to increase the throughput.
> > >
> > >   Somebody mentioned that I can increase the region size so that I
> > > don't do so many compactions under a heavy write load. Does anybody
> > > have experience showing that it helps?
> > >
> > > Jimmy.
> > >
> > > byte [] compactStores(final boolean majorCompaction)
> > > throws IOException {
> > >   if (this.closing.get() || this.closed.get()) {
> > >     LOG.debug("Skipping compaction on " + this + " because closing/closed");
> > >     return null;
> > >   }
> > >   splitsAndClosesLock.readLock().lock();
> > >   try {
> > >     byte [] splitRow = null;
> > >     if (this.closed.get()) {
> > >       return splitRow;
> > >     }
> > >     try {
> > >       synchronized (writestate) {
> > >         if (!writestate.compacting && writestate.writesEnabled) {
> > >           writestate.compacting = true;
> > >         } else {
> > >           LOG.info("NOT compacting region " + this +
> > >             ": compacting=" + writestate.compacting +
> > >             ", writesEnabled=" + writestate.writesEnabled);
> > >           return splitRow;
> > >         }
> > >       }
> > >       LOG.info("Starting" + (majorCompaction ? " major " : " ") +
> > >         "compaction on region " + this);
> > >       long startTime = System.currentTimeMillis();
> > >       doRegionCompactionPrep();
> > >       long maxSize = -1;
> > >       for (Store store : stores.values()) {
> > >         final Store.StoreSize ss = store.compact(majorCompaction);
> > >         if (ss != null && ss.getSize() > maxSize) {
> > >           maxSize = ss.getSize();
> > >           splitRow = ss.getSplitRow();
> > >         }
> > >       }
> > >       doRegionCompactionCleanup();
> > >       String timeTaken = StringUtils.formatTimeDiff(
> > >         System.currentTimeMillis(), startTime);
> > >       LOG.info("compaction completed on region " + this + " in " + timeTaken);
> > >     } finally {
> > >       synchronized (writestate) {
> > >         writestate.compacting = false;
> > >         writestate.notifyAll();
> > >       }
> > >     }
> > >     return splitRow;
> > >   } finally {
> > >     splitsAndClosesLock.readLock().unlock();
> > >   }
> > > }
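Jinsong's thread-pool idea — parallelizing the per-Store loop in compactStores() — can be sketched roughly as below. This is a minimal, illustrative sketch only, not HBase code: the `Store` and `StoreSize` classes here are simplified stand-ins for the real HBase types, and it omits the writestate/lock handling of the real method as well as the compaction-IO throttling concern Ryan raises.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for HBase's Store.StoreSize: a compaction result.
class StoreSize {
  final long size;
  final byte[] splitRow;
  StoreSize(long size, byte[] splitRow) { this.size = size; this.splitRow = splitRow; }
}

// Hypothetical stand-in for HBase's Store: compact() just reports its size here.
class Store {
  private final long size;
  Store(long size) { this.size = size; }
  StoreSize compact(boolean majorCompaction) {
    return new StoreSize(size, ("row-" + size).getBytes());
  }
}

public class ParallelCompactSketch {
  // Compacts all stores concurrently, then keeps the split row of the largest
  // result, mirroring the max-tracking loop inside compactStores().
  static byte[] compactStoresInParallel(List<Store> stores, boolean major)
      throws Exception {
    ExecutorService pool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    try {
      // Submit one compaction task per store; each may run on a separate core.
      List<Future<StoreSize>> results = new ArrayList<>();
      for (final Store store : stores) {
        results.add(pool.submit(new Callable<StoreSize>() {
          public StoreSize call() { return store.compact(major); }
        }));
      }
      // Collect results; Future.get() propagates any compaction failure.
      long maxSize = -1;
      byte[] splitRow = null;
      for (Future<StoreSize> f : results) {
        StoreSize ss = f.get();
        if (ss != null && ss.size > maxSize) {
          maxSize = ss.size;
          splitRow = ss.splitRow;
        }
      }
      return splitRow;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<Store> stores = new ArrayList<>();
    stores.add(new Store(100));
    stores.add(new Store(300));
    stores.add(new Store(200));
    System.out.println(new String(compactStoresInParallel(stores, false)));
    // prints row-300: the largest store wins the split-row selection
  }
}
```

Note this only parallelizes CPU work within one region's compaction; as Ryan points out, the binding resource is often cluster IO, so more concurrent compactions may simply compete harder for disk bandwidth.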
