You should also consider doing straight-to-HDFS imports with HFileOutputFormat / loadtable.rb.

There's some documentation in the comments of the files to get you started.

This is one way to do bulk imports that can more easily/feasibly approach the 
IO limits of your hardware.
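
For illustration, here is a rough sketch of what such a job driver can look like.
The class names, the column family "f" and qualifier "q" are made up, and the exact
wiring (in particular producing totally ordered, per-row-sorted KeyValues, and the
right HBaseConfiguration call for your HBase version) should be taken from the
comments in HFileOutputFormat and loadtable.rb rather than from this sketch:

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HFileBulkLoad {

  // Hypothetical mapper: turns "rowkey<TAB>value" text lines into KeyValues
  // for a single made-up column family "f", qualifier "q".
  public static class LineToKeyValueMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] parts = line.toString().split("\t", 2);
      byte[] row = Bytes.toBytes(parts[0]);
      KeyValue kv = new KeyValue(row, Bytes.toBytes("f"), Bytes.toBytes("q"),
          Bytes.toBytes(parts[1]));
      ctx.write(new ImmutableBytesWritable(row), kv);
    }
  }

  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    Job job = new Job(conf, "hfile-bulk-load");
    job.setJarByClass(HFileBulkLoad.class);
    job.setMapperClass(LineToKeyValueMapper.class);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(KeyValue.class);
    // HFileOutputFormat writes store files straight to HDFS; loadtable.rb is
    // then pointed at the output directory to attach them to the table.
    // NOTE: a real job must ensure the KeyValues reaching the output format
    // are totally ordered across reducers -- see the class comments.
    job.setOutputFormatClass(HFileOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}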

> -----Original Message-----
> From: Ryan Rawson [mailto:[email protected]]
> Sent: Wednesday, June 09, 2010 5:20 PM
> To: [email protected]
> Subject: Re: ideas to improve throughput of the base writting
>
> I am not familiar with that exception; I have not seen it before...
> perhaps someone else has?
>
> And my 200k rows/sec is over 19 machines.  It is the average over many
> hours.  My calculation of row size might not match how much data was
> flowing to disk, but I think it isn't too far off.
>
> Unfortunately, comparing against raw disk speed in a trivial benchmark (such
> as hdparm -t) doesn't tell us what the absolute speed of HBase should be.
> This is because HBase does much more work than a raw disk write benchmark --
> it has to maintain structure and sorting.  We can only say that
> 'faster disks = faster HBase performance'.
>
> From the log lines you have pasted it sounds like the regionserver's
> flush ability is not keeping up with your rate of data input.  How big
> are your records?  What is your target input speed?  Have you done
> anything on this page:
> http://wiki.apache.org/hadoop/PerformanceTuning
>
>
>
> On Wed, Jun 9, 2010 at 4:58 PM, Jinsong Hu <[email protected]>
> wrote:
> > My hardware has 2 disks. I did a file copy on the machine and found that I
> > can get 300 Mbyte/second.
> >
> > At this time, I see my insertion rate is less than 1k rows/second, and my
> > row size is about 10K bytes, so in terms of disk writing my record
> > insertion rate is far below the hardware limit.
> >
> > If on your i7-based servers you are doing 200k rows/sec and each row is 200
> > bytes, then you are doing 40 Mbyte/sec.
> >
> > In my case, if it behaves normally, I can get 100 rows/sec * 10K bytes = 1
> > Mbyte/sec, which is far from the disk speed. Occasionally I see 1k
> > rows/second, which is more reasonable in my case, but I rarely get that
> > performance.
> >
> > Even worse, with the change in place, I now see lots of compaction
> > failures:
> >
> > 2010-06-09 23:40:55,117 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed for region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527
> > java.lang.RuntimeException: java.io.IOException: Could not find target position 65588
> >       at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:61)
> >       at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:79)
> >       at org.apache.hadoop.hbase.regionserver.MinorCompactingStoreScanner.next(MinorCompactingStoreScanner.java:96)
> >       at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:920)
> >       at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:764)
> >       at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:832)
> >       at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:785)
> >       at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:93)
> > Caused by: java.io.IOException: Could not find target position 65588
> >       at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockAt(DFSClient.java:1556)
> >       at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1666)
> >       at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1780)
> >       at java.io.DataInputStream.read(DataInputStream.java:132)
> >       at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:105)
> >       at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100)
> >       at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1018)
> >       at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966)
> >       at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1159)
> >       at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
> >       ... 7 more
> >
> > I can't stop this unless I restart the regionserver. After the restart I
> > truncated the table, and when I list the tables again in the shell, it
> > appears 2 times. Now I can't even disable the table and drop it.
> >
> > I will restart the whole hbase cluster and report the progress.
> >
> > Jimmy
> >
> >
> >
> >
> >
> > --------------------------------------------------
> > From: "Ryan Rawson" <[email protected]>
> > Sent: Wednesday, June 09, 2010 4:16 PM
> > To: <[email protected]>
> > Subject: Re: ideas to improve throughput of the base writting
> >
> >> Hey,
> >>
> >> Sounds like you are hitting the limits of your hardware... I don't think
> >> you mentioned the hardware spec you are running in this thread...
> >>
> >> What you are seeing is essentially the limits of HDFS's ability to
> >> take writes.  The errors might be due to various HDFS setup problems
> >> (eg: xceiver count, file handle count, all outlined in various HBase
> >> "startup" docs)... But the overall performance might be limited by
> >> your hardware.
> >>
> >> For example, I use i7-based servers with 4 disks.  This gives reasonable
> >> IO bandwidth and can cope with high rates of inserts (up to 100-200k
> >> rows/sec, each row ~ 100-300 bytes).  If you are running a 1 or 2 disk
> >> system, it is possible you are hitting the limits of what your hardware
> >> can do.
> >>
> >> Also note that the write-pipeline performance is ultimately defined in
> >> bytes/sec, not just 'rows/sec'... thus my rows were small, and if yours
> >> are big then you might be hitting a lower 'rows/sec' limit even though
> >> the amount of bytes you are writing is higher than what I might have
> >> been doing.
> >>
> >>
> >>
> >> On Wed, Jun 9, 2010 at 3:59 PM, Jinsong Hu <[email protected]>
> wrote:
> >>>
> >>> I still get lots of repetition of
> >>>
> >>> 2010-06-09 22:54:38,428 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527 has too many store files, putting it back at the end of the flush queue.
> >>> 2010-06-09 22:54:38,428 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527/1537478401 because: regionserver/10.110.8.88:60020.cacheFlusher
> >>>
> >>>
> >>> I also saw lots of
> >>>
> >>> 2010-06-09 22:50:12,527 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527: memstore size 512.0m is >= than blocking 512.0m size
> >>> 2010-06-09 22:50:12,598 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 5 on 60020' on region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527: memstore size 512.0m is >= than blocking 512.0m size
> >>>
> >>> even with the changed config. The regionserver has 4G of RAM.  What else
> >>> can be wrong?
> >>>
> >>> The insertion rate is still not good.
> >>>
> >>> Jimmy.
> >>>
> >>>
> >>> --------------------------------------------------
> >>> From: "Jinsong Hu" <[email protected]>
> >>> Sent: Wednesday, June 09, 2010 1:59 PM
> >>> To: <[email protected]>
> >>> Subject: Re: ideas to improve throughput of the base writting
> >>>
> >>>> Thanks. I will make this change:
> >>>>
> >>>> <property>
> >>>>  <name>hbase.hregion.memstore.block.multiplier</name>
> >>>>  <value>8</value>
> >>>> </property>
> >>>>
> >>>> <property>
> >>>>  <name>hbase.regionserver.msginterval</name>
> >>>>  <value>10000</value>
> >>>> </property>
> >>>>
> >>>>  <property>
> >>>>  <name>hbase.hstore.compactionThreshold</name>
> >>>>  <value>6</value>
> >>>> </property>
> >>>>
> >>>>
> >>>> <property>
> >>>>  <name>hbase.hstore.blockingStoreFiles</name>
> >>>>  <value>18</value>
> >>>> </property>
> >>>>
> >>>>
> >>>> and see how it goes.
> >>>>
> >>>>
> >>>> Jimmy.
> >>>>
> >>>> --------------------------------------------------
> >>>> From: "Ryan Rawson" <[email protected]>
> >>>> Sent: Wednesday, June 09, 2010 1:49 PM
> >>>> To: <[email protected]>
> >>>> Subject: Re: ideas to improve throughput of the base writting
> >>>>
> >>>>> More background here... you are running into a situation where the
> >>>>> regionserver cannot flush fast enough and the size of the region's
> >>>>> memstore has climbed too high, and thus you get that error message.
> >>>>> HBase attempts to protect itself by holding up clients (thus causing
> >>>>> the low performance you see).  By expanding how big a memstore can get
> >>>>> during times of stress you can improve performance, at the cost of
> >>>>> memory usage. That is what that setting is about.
> >>>>>
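For reference: the blocking threshold in those "Blocking updates" log lines is
hbase.hregion.memstore.flush.size multiplied by this setting. Assuming the stock
64m flush size, the default multiplier of 2 gives the 128.0m limit in the log
lines quoted further down in the thread, and a multiplier of 8 gives the 512.0m
limit quoted above. A sketch of the pair of settings in hbase-site.xml:

  <!-- Sketch only: updates to a region are blocked once its memstore reaches
       flush.size * block.multiplier (here 64m * 8 = 512m). -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>67108864</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>8</value>
  </property>
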
> >>>>> As for the 1.5 minute setting, that is the maximal amount of time a
> >>>>> handler thread will block for.  You shouldn't need to tweak that
> >>>>> value, and reducing it could cause issues.
> >>>>>
> >>>>> Now, as for compacting, HBase will compact small files into larger
> >>>>> files, and on a massive upload you can expect to see this happen
> >>>>> constantly, thus tying up 1 cpu's worth on your regionserver.  You
> >>>>> could potentially reduce that by increasing this value:
> >>>>>
> >>>>>  <property>
> >>>>>  <name>hbase.hstore.compactionThreshold</name>
> >>>>>  <value>3</value>
> >>>>>  </property>
> >>>>>
> >>>>> The value is interpreted as "if there are more than 3 files for a
> >>>>> region then run the compaction check".  By raising this limit you can
> >>>>> accumulate more files before compacting them, thus reducing the
> >>>>> frequency of compactions but also potentially reducing the
> >>>>> performance of reads (more files to read = more seeks = slower).  I'd
> >>>>> consider setting it to 5-7 or so, in concert with setting
> >>>>> "hbase.hstore.blockingStoreFiles" to a value at least 2x that.
> >>>>>
> >>>>> All of these settings increase the amount of ram your regionserver may
> >>>>> need, so you will want to ensure you have at least 4000m of ram set in
> >>>>> hbase-env.sh.  This is why they are set so conservatively in the
> >>>>> default shipping config.
> >>>>>
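The heap itself is set in conf/hbase-env.sh. A minimal sketch, with the value
in megabytes:

  # hbase-env.sh -- give the HBase daemons roughly 4 GB of heap (value in MB)
  export HBASE_HEAPSIZE=4000
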
> >>>>> These are the 3 important settings that control how often
> compactions
> >>>>> occur and how RPC threads get blocked.  Try tweaking all of them
> and
> >>>>> let me know if you are doing better.
> >>>>>
> >>>>> -ryan
> >>>>>
> >>>>>
> >>>>> On Wed, Jun 9, 2010 at 1:37 PM, Ryan Rawson <[email protected]>
> wrote:
> >>>>>>
> >>>>>> you also want this config:
> >>>>>>
> >>>>>> <property>
> >>>>>>  <name>hbase.hregion.memstore.block.multiplier</name>
> >>>>>>  <value>8</value>
> >>>>>> </property>
> >>>>>>
> >>>>>>
> >>>>>> that should hopefully clear things up.
> >>>>>>
> >>>>>> -ryan
> >>>>>>
> >>>>>> On Wed, Jun 9, 2010 at 1:34 PM, Jinsong Hu
> <[email protected]>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> I checked the log, there are lots of
> >>>>>>>
> >>>>>>> ... memstore size 128.1m is >= than blocking 128.0m size
> >>>>>>> 2010-06-09 17:26:36,736 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262: memstore size 128.1m is >= than blocking 128.0m size
> >>>>>>>
> >>>>>>> then after that there are lots of
> >>>>>>>
> >>>>>>> 2010-06-09 17:26:36,800 DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://namenodes1.cloud.ppops.net:8020/hbase/Spam_MsgEventTable/376337880/message_compound_terms/7606939244559826252, entries=30869, sequenceid=8350447892, memsize=7.2m, filesize=3.4m to Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6
> >>>>>>>
> >>>>>>>
> >>>>>>> then lots of
> >>>>>>>
> >>>>>>> 2010-06-09 17:26:39,005 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262 'IPC Server handler 8 on 60020'
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> This cycle happens again and again in the log.  What can I do in this
> >>>>>>> case to speed up writing?  Right now the writing speed is really slow,
> >>>>>>> close to 4 rows/second for a regionserver.
> >>>>>>>
> >>>>>>> I checked the code to try to find out why there are so many store
> >>>>>>> files, and I noticed that each second, when the regionserver reports
> >>>>>>> to the master, it triggers a memstore flush and writes a store file.
> >>>>>>>
> >>>>>>> The parameter hbase.regionserver.msginterval has a default value of 1
> >>>>>>> second; I am thinking of changing it to 10 seconds.  Can that help?
> >>>>>>> I am also thinking of changing hbase.hstore.blockingStoreFiles to
> >>>>>>> 1000.  I noticed that there is a parameter
> >>>>>>> hbase.hstore.blockingWaitTime with a default value of 1.5 minutes;
> >>>>>>> once the 1.5 minutes is reached, the compaction is executed.  I am
> >>>>>>> fine with running a compaction every 1.5 minutes, but running a
> >>>>>>> compaction every second and keeping the CPU consistently above 100%
> >>>>>>> is not wanted.
> >>>>>>>
> >>>>>>> Any suggestions on what parameters to change to improve my writing
> >>>>>>> speed?
> >>>>>>>
> >>>>>>> Jimmy
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --------------------------------------------------
> >>>>>>> From: "Ryan Rawson" <[email protected]>
> >>>>>>> Sent: Wednesday, June 09, 2010 1:01 PM
> >>>>>>> To: <[email protected]>
> >>>>>>> Subject: Re: ideas to improve throughput of the base writting
> >>>>>>>
> >>>>>>>> The log will say something like "blocking updates to..." when you hit
> >>>>>>>> a limit.  That log you indicate is just the regionserver attempting to
> >>>>>>>> compact a region, but shouldn't prevent updates.
> >>>>>>>>
> >>>>>>>> what else does your logfile say?  Search for the string (case
> >>>>>>>> insensitive) "blocking updates"...
> >>>>>>>>
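A quick way to do that check, assuming the default log location under
$HBASE_HOME/logs (the path and file name pattern here are illustrative):

  grep -i "blocking updates" $HBASE_HOME/logs/hbase-*-regionserver-*.log
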
> >>>>>>>> -ryan
> >>>>>>>>
> >>>>>>>> On Wed, Jun 9, 2010 at 11:52 AM, Jinsong Hu
> <[email protected]>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> I made this change:
> >>>>>>>>> <property>
> >>>>>>>>>  <name>hbase.hstore.blockingStoreFiles</name>
> >>>>>>>>>  <value>15</value>
> >>>>>>>>> </property>
> >>>>>>>>>
> >>>>>>>>> The system is still slow.
> >>>>>>>>>
> >>>>>>>>> Here is the most recent value for the region:
> >>>>>>>>> stores=21, storefiles=186, storefileSizeMB=9681, memstoreSizeMB=128,
> >>>>>>>>> storefileIndexSizeMB=12
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> And the same log still happens:
> >>>>>>>>>
> >>>>>>>>> 2010-06-09 18:36:40,577 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region SOME_ABCEventTable,2010-06-09 09:56:56\x093dc01b4d2c4872963717d80d8b5c74b1,1276107447570 has too many store files, putting it back at the end of the flush queue.
> >>>>>>>>>
> >>>>>>>>> One idea I have now is to further increase
> >>>>>>>>> hbase.hstore.blockingStoreFiles to a very high number, such as 1000.
> >>>>>>>>> What is the negative impact of this change?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Jimmy
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --------------------------------------------------
> >>>>>>>>> From: "Ryan Rawson" <[email protected]>
> >>>>>>>>> Sent: Monday, June 07, 2010 3:58 PM
> >>>>>>>>> To: <[email protected]>
> >>>>>>>>> Subject: Re: ideas to improve throughput of the base writting
> >>>>>>>>>
> >>>>>>>>>> Try setting this config value:
> >>>>>>>>>>
> >>>>>>>>>> <property>
> >>>>>>>>>>  <name>hbase.hstore.blockingStoreFiles</name>
> >>>>>>>>>>  <value>15</value>
> >>>>>>>>>> </property>
> >>>>>>>>>>
> >>>>>>>>>> and see if that helps.
> >>>>>>>>>>
> >>>>>>>>>> The thing about the single compaction thread is that the scarce
> >>>>>>>>>> resource being preserved in this case is cluster IO.  People have
> >>>>>>>>>> had issues with compaction IO being too heavy.
> >>>>>>>>>>
> >>>>>>>>>> In your case, this setting can let the regionserver build up more
> >>>>>>>>>> store files without pausing your import.
> >>>>>>>>>> -ryan
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu
> >>>>>>>>>> <[email protected]>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi, there:
> >>>>>>>>>>>  While saving lots of data to HBase, I noticed that the regionserver
> >>>>>>>>>>> CPU went to more than 100%. Examination shows that the HBase
> >>>>>>>>>>> CompactSplit thread is spending all its time compacting/splitting
> >>>>>>>>>>> HBase store files.  The machine I have is an 8-core machine; because
> >>>>>>>>>>> there is only one compact/split thread in HBase, only one core is
> >>>>>>>>>>> fully used.
> >>>>>>>>>>>  I continually submit map/reduce jobs to insert records into HBase.
> >>>>>>>>>>> Most of the time a job runs very fast, around 1-5 minutes, but
> >>>>>>>>>>> occasionally it can take 2 hours. That is very bad for me. I highly
> >>>>>>>>>>> suspect that the occasional slow insertion is related to the
> >>>>>>>>>>> insufficient speed of the compactsplit thread.
> >>>>>>>>>>>  I am thinking that I should parallelize the compactsplit thread;
> >>>>>>>>>>> the code has this: the for loop "for (Store store: stores.values())"
> >>>>>>>>>>> can be parallelized via a Java 5 thread pool, so that multiple cores
> >>>>>>>>>>> are used instead of only one. I wonder if this will help to increase
> >>>>>>>>>>> the throughput.
> >>>>>>>>>>>
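For illustration, a rough sketch of the idea being floated here: farming the
per-Store compactions in compactStores() (the serial for-loop quoted below) out
to a thread pool. The helper class and method names are hypothetical, this is
not how HBase implements it, and the writestate bookkeeping and error handling
of the real method are omitted. Note also that, as pointed out further down in
the thread, the single compaction thread exists to limit cluster IO, so parallel
compactions trade extra IO for CPU parallelism:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: compact each Store on its own pool thread and return
// the split row of the largest resulting store, mirroring the serial loop in
// HRegion.compactStores().
public class ParallelStoreCompactor {

  public static byte[] compactStoresInParallel(Collection<Store> stores,
      final boolean majorCompaction, int threads)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Store.StoreSize>> results =
          new ArrayList<Future<Store.StoreSize>>();
      for (final Store store : stores) {
        results.add(pool.submit(new Callable<Store.StoreSize>() {
          public Store.StoreSize call() throws IOException {
            return store.compact(majorCompaction);  // one compaction per thread
          }
        }));
      }
      long maxSize = -1;
      byte[] splitRow = null;
      for (Future<Store.StoreSize> f : results) {
        try {
          Store.StoreSize ss = f.get();
          if (ss != null && ss.getSize() > maxSize) {
            maxSize = ss.getSize();
            splitRow = ss.getSplitRow();
          }
        } catch (ExecutionException e) {
          throw new IOException("compaction failed: " + e.getCause());
        }
      }
      return splitRow;
    } finally {
      pool.shutdown();
    }
  }
}
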
> >>>>>>>>>>>  Somebody mentioned that I can increase the region size so that I
> >>>>>>>>>>> don't do so many compactions under a heavy writing situation.
> >>>>>>>>>>> Does anybody have experience showing that it helps?
> >>>>>>>>>>>
> >>>>>>>>>>> Jimmy.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>  byte [] compactStores(final boolean majorCompaction)
> >>>>>>>>>>>  throws IOException {
> >>>>>>>>>>>    if (this.closing.get() || this.closed.get()) {
> >>>>>>>>>>>      LOG.debug("Skipping compaction on " + this + " because closing/closed");
> >>>>>>>>>>>      return null;
> >>>>>>>>>>>    }
> >>>>>>>>>>>    splitsAndClosesLock.readLock().lock();
> >>>>>>>>>>>    try {
> >>>>>>>>>>>      byte [] splitRow = null;
> >>>>>>>>>>>      if (this.closed.get()) {
> >>>>>>>>>>>        return splitRow;
> >>>>>>>>>>>      }
> >>>>>>>>>>>      try {
> >>>>>>>>>>>        synchronized (writestate) {
> >>>>>>>>>>>          if (!writestate.compacting && writestate.writesEnabled) {
> >>>>>>>>>>>            writestate.compacting = true;
> >>>>>>>>>>>          } else {
> >>>>>>>>>>>            LOG.info("NOT compacting region " + this +
> >>>>>>>>>>>                ": compacting=" + writestate.compacting + ", writesEnabled=" +
> >>>>>>>>>>>                writestate.writesEnabled);
> >>>>>>>>>>>            return splitRow;
> >>>>>>>>>>>          }
> >>>>>>>>>>>        }
> >>>>>>>>>>>        LOG.info("Starting" + (majorCompaction? " major " : " ") +
> >>>>>>>>>>>            "compaction on region " + this);
> >>>>>>>>>>>        long startTime = System.currentTimeMillis();
> >>>>>>>>>>>        doRegionCompactionPrep();
> >>>>>>>>>>>        long maxSize = -1;
> >>>>>>>>>>>        for (Store store: stores.values()) {
> >>>>>>>>>>>          final Store.StoreSize ss = store.compact(majorCompaction);
> >>>>>>>>>>>          if (ss != null && ss.getSize() > maxSize) {
> >>>>>>>>>>>            maxSize = ss.getSize();
> >>>>>>>>>>>            splitRow = ss.getSplitRow();
> >>>>>>>>>>>          }
> >>>>>>>>>>>        }
> >>>>>>>>>>>        doRegionCompactionCleanup();
> >>>>>>>>>>>        String timeTaken = StringUtils.formatTimeDiff(System.currentTimeMillis(),
> >>>>>>>>>>>            startTime);
> >>>>>>>>>>>        LOG.info("compaction completed on region " + this + " in " + timeTaken);
> >>>>>>>>>>>      } finally {
> >>>>>>>>>>>        synchronized (writestate) {
> >>>>>>>>>>>          writestate.compacting = false;
> >>>>>>>>>>>          writestate.notifyAll();
> >>>>>>>>>>>        }
> >>>>>>>>>>>      }
> >>>>>>>>>>>      return splitRow;
> >>>>>>>>>>>    } finally {
> >>>>>>>>>>>      splitsAndClosesLock.readLock().unlock();
> >>>>>>>>>>>    }
> >>>>>>>>>>>  }
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
