On Wed, Jun 9, 2010 at 10:27 PM, Ted Yu <[email protected]> wrote:
> Is it possible for one or more of the parameters to be dynamic?
> Meaning embedding the tuning heuristics in HBase code.
This should be the goal for sure. Ideally, HBase would adjust as the loading characteristics changed.
St.Ack

On Wed, Jun 9, 2010 at 6:08 PM, Ryan Rawson <[email protected]> wrote:

One issue you may run into is that 0.20 doesn't have
https://issues.apache.org/jira/browse/HBASE-2066

A dev preview of 0.21, which does include that and does improve performance, should be available soon.

On Wed, Jun 9, 2010 at 5:55 PM, Jinsong Hu <[email protected]> wrote:

Yes, I have done all the suggestions of
http://wiki.apache.org/hadoop/PerformanceTuning.

I just restarted the hbase cluster and recreated the table. The data insertion looks fine for now and I am getting about 1k records/second. I consider that to be reasonable given that my records are about 10k bytes each. But this is the beginning of the writing, and I notice that when the table is small, hbase works fine; when there are already lots of records in the table, problems begin to happen. I will report back and see how it goes after some more time.

Jimmy.

--------------------------------------------------
From: "Ryan Rawson" <[email protected]>
Sent: Wednesday, June 09, 2010 5:20 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the hbase writing

I am not familiar with that exception, I have not seen it before... perhaps someone else has?

And my 200k rows/sec is over 19 machines. It is the average over many hours. My calculation of row size might not match how much data was flowing to disk, but I think it isn't too far off.

Unfortunately, comparing raw disk speed in a trivial benchmark (such as hdparm -t) doesn't tell us how fast HBase should perform in absolute terms. This is because HBase does much more work than a raw disk write benchmark -- it has to maintain structure and sorting. We can only say that faster disks = faster HBase performance.

From the log lines you have pasted it sounds like the regionserver's flush ability is not keeping up with your rate of data input. How big are your records? What is your target input speed? Have you done anything on this page:
http://wiki.apache.org/hadoop/PerformanceTuning

On Wed, Jun 9, 2010 at 4:58 PM, Jinsong Hu <[email protected]> wrote:

My hardware has 2 disks. I did a file copy on the machine and found that I can get 300 Mbyte/second.

At this time I see my insertion rate is less than 1k rows/second, and my row size is about 10K bytes, so in terms of disk writing my record insertion rate is far less than the hardware limit.

If on your i7-based servers you are doing 200k rows/sec and each row is 200 bytes, then you are doing 40 Mbyte/sec.

In my case, if it behaves normally, I can get 100 rows/sec * 10K bytes = 1 Mbyte/sec, which is far from the disk speed. Occasionally I see 1k rows/second, which is more reasonable in my case, but I rarely get that performance.
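As an aside on the raw numbers: 200k rows/sec at ~200 bytes/row is ~40 Mbyte/sec spread over 19 machines, i.e. roughly 2 Mbyte/sec per node, so in bytes/sec per node the two setups are closer than the rows/sec figures suggest. On the client side, the batching that the PerformanceTuning page recommends is the usual first lever; below is a minimal sketch against the 0.20-era client API -- the family/qualifier names, the loop, and the 12 MB buffer are only illustrative, not anything taken from this thread.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedPutsSketch {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "Spam_MsgEventTable");

    table.setAutoFlush(false);                  // buffer puts client-side instead of one RPC per row
    table.setWriteBufferSize(12 * 1024 * 1024); // ~12 MB write buffer, illustrative

    byte[] family = Bytes.toBytes("f");         // illustrative family/qualifier
    byte[] qualifier = Bytes.toBytes("q");
    for (int i = 0; i < 100000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(family, qualifier, new byte[10 * 1024]); // ~10 KB value, like the rows discussed here
      table.put(put);                           // queued locally, flushed when the buffer fills
    }
    table.flushCommits();                       // push whatever is still sitting in the buffer
  }
}

Whether this helps depends on whether the bottleneck is client round-trips or, as the log lines further down suggest, the regionserver blocking on flushes and compactions.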
Even worse, with the change done, now I have seen lots of compaction failures:

2010-06-09 23:40:55,117 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed for region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527
java.lang.RuntimeException: java.io.IOException: Could not find target position 65588
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:61)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:79)
        at org.apache.hadoop.hbase.regionserver.MinorCompactingStoreScanner.next(MinorCompactingStoreScanner.java:96)
        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:920)
        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:764)
        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:832)
        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:785)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:93)
Caused by: java.io.IOException: Could not find target position 65588
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockAt(DFSClient.java:1556)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1666)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1780)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:105)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1018)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1159)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
        ... 7 more

I can't stop this unless I restart the regionserver. After the restart I truncated the table, and when I list the table again in the shell, it appears twice. Now I can't even disable the table and drop it.

I will restart the whole hbase cluster and report the progress.

Jimmy.

--------------------------------------------------
From: "Ryan Rawson" <[email protected]>
Sent: Wednesday, June 09, 2010 4:16 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the hbase writing

Hey,

Sounds like you are hitting the limits of your hardware... I don't think you mentioned the hardware spec you are running in this thread...

What you are seeing is essentially the limits of HDFS's ability to take writes. The errors might be due to various HDFS setup problems (e.g. xceiver count, file handle count, all outlined in the various HBase "startup" docs)... But the overall performance might be limited by your hardware.

For example, I use i7-based servers with 4 disks.
This gives reasonable IO bandwidth, and can cope with high rates of inserts, up to 100-200k rows/sec (each row is ~100-300 bytes). If you are running a 1 or 2 disk system it is possible you are hitting the limits of what your hardware can do.

Also note that the write-pipeline performance is ultimately defined in bytes/sec, not just 'rows/sec'... thus my rows were small, and if yours are big then you might be hitting a lower 'rows/sec' limit even though the amount of bytes you are writing is higher than what I might have been doing.

On Wed, Jun 9, 2010 at 3:59 PM, Jinsong Hu <[email protected]> wrote:

I still get lots of repetition of

2010-06-09 22:54:38,428 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527 has too many store files, putting it back at the end of the flush queue.
2010-06-09 22:54:38,428 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527/1537478401 because: regionserver/10.110.8.88:60020.cacheFlusher

I also saw lots of

2010-06-09 22:50:12,527 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527: memstore size 512.0m is >= than blocking 512.0m size
2010-06-09 22:50:12,598 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 5 on 60020' on region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527: memstore size 512.0m is >= than blocking 512.0m size

even with the changed config. The regionserver has 4G of RAM. What else can be wrong?

The insertion rate is still not good.

Jimmy.

--------------------------------------------------
From: "Jinsong Hu" <[email protected]>
Sent: Wednesday, June 09, 2010 1:59 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the hbase writing

Thanks. I will make this change:

<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>8</value>
</property>

<property>
  <name>hbase.regionserver.msginterval</name>
  <value>10000</value>
</property>

<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>6</value>
</property>

<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>18</value>
</property>

and see how it goes.

Jimmy.
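For context on the numbers in those log lines: the point at which a region blocks updates works out to hbase.hregion.memstore.flush.size multiplied by hbase.hregion.memstore.block.multiplier. Assuming the default 64 MB flush size, the default multiplier of 2 gives the 128.0m ceiling that shows up in the older messages further down the thread, while the multiplier of 8 being set here gives 8 x 64 MB = 512 MB, matching the "512.0m is >= than blocking 512.0m size" lines above.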
--------------------------------------------------
From: "Ryan Rawson" <[email protected]>
Sent: Wednesday, June 09, 2010 1:49 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the hbase writing

More background here... you are running into a situation where the regionserver cannot flush fast enough, the size of the region's memstore has climbed too high, and thus you get that error message. HBase attempts to protect itself by holding up clients (thus causing the low performance you see). By expanding how big a memstore can get during times of stress you can improve performance, at the cost of memory usage. That is what that setting is about.

As for the 1.5 minute setting, that is the maximal amount of time a handler thread will block for. You shouldn't need to tweak that value, and reducing it could cause issues.

Now, as for compacting, HBase will compact small files into larger files, and on a massive upload you can expect to see this happen constantly, thus tying up one CPU's worth on your regionserver. You could potentially reduce that by increasing the value:

<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>3</value>
</property>

The value is interpreted as "if there are more than 3 files for a region then run the compaction check". By raising this limit you can accumulate more files before compacting them, thus reducing the frequency of compactions but also potentially reducing the performance of reads (more files to read = more seeks = slower). I'd consider setting it to 5-7 or so, in concert with setting "hbase.hstore.blockingStoreFiles" to a value at least 2x that.

All of these settings increase the amount of RAM your regionserver may need, so you will want to ensure you have at least 4000m of RAM set in hbase-env.sh. This is why they are set so conservatively in the default shipping config.

These are the 3 important settings that control how often compactions occur and how RPC threads get blocked. Try tweaking all of them and let me know if you are doing better.

-ryan

On Wed, Jun 9, 2010 at 1:37 PM, Ryan Rawson <[email protected]> wrote:

you also want this config:

<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>8</value>
</property>

that should hopefully clear things up.

-ryan
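(The regionserver heap Ryan refers to is set in conf/hbase-env.sh through the HBASE_HEAPSIZE variable, in megabytes -- e.g. export HBASE_HEAPSIZE=4000 for the 4000m figure above. The value is the JVM heap, so the machine needs RAM headroom beyond it.)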
On Wed, Jun 9, 2010 at 1:34 PM, Jinsong Hu <[email protected]> wrote:

I checked the log; there are lots of

2010-06-09 17:26:36,736 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262: memstore size 128.1m is >= than blocking 128.0m size

then after that there are lots of

2010-06-09 17:26:36,800 DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://namenodes1.cloud.ppops.net:8020/hbase/Spam_MsgEventTable/376337880/message_compound_terms/7606939244559826252, entries=30869, sequenceid=8350447892, memsize=7.2m, filesize=3.4m to Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6

then lots of

2010-06-09 17:26:39,005 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262 'IPC Server handler 8 on 60020'

This cycle happens again and again in the log. What can I do in this case to speed up writing? Right now the writing speed is really slow, close to 4 rows/second for a regionserver.

I checked the code to try to find out why there are so many store files, and I noticed that each second, when the regionserver reports to the master, it calls the memstore flush and writes a store file.

The parameter hbase.regionserver.msginterval has a default value of 1 second; I am thinking of changing it to 10 seconds. Can that help? I am also thinking of changing hbase.hstore.blockingStoreFiles to 1000. I noticed that there is a parameter hbase.hstore.blockingWaitTime with a default value of 1.5 minutes; as soon as the 1.5 minutes is reached, the compaction is executed. I am fine with running compaction every 1.5 minutes, but running compaction every second and causing CPU consistently higher than 100% is not wanted.

Any suggestion what kind of parameters to change to improve my writing speed?

Jimmy

--------------------------------------------------
From: "Ryan Rawson" <[email protected]>
Sent: Wednesday, June 09, 2010 1:01 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the hbase writing

The log will say something like "blocking updates to..." when you hit a limit.
That log you indicate is just the regionserver attempting to compact a region, but shouldn't prevent updates.

What else does your logfile say? Search for the string (case insensitive) "blocking updates"...

-ryan

On Wed, Jun 9, 2010 at 11:52 AM, Jinsong Hu <[email protected]> wrote:

I made this change:

<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>15</value>
</property>

The system is still slow.

Here is the most recent value for the region:
stores=21, storefiles=186, storefileSizeMB=9681, memstoreSizeMB=128, storefileIndexSizeMB=12

And the same log still happens:

2010-06-09 18:36:40,577 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region SOME_ABCEventTable,2010-06-09 09:56:56\x093dc01b4d2c4872963717d80d8b5c74b1,1276107447570 has too many store files, putting it back at the end of the flush queue.

One idea that I have now is to further increase hbase.hstore.blockingStoreFiles to a very high number, such as 1000. What is the negative impact of this change?

Jimmy

--------------------------------------------------
From: "Ryan Rawson" <[email protected]>
Sent: Monday, June 07, 2010 3:58 PM
To: <[email protected]>
Subject: Re: ideas to improve throughput of the hbase writing

Try setting this config value:

<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>15</value>
</property>

and see if that helps.

The thing about the single compaction thread is that the scarce resource being preserved in this case is cluster IO. People have had issues with compaction IO being too heavy.

In your case, this setting can let the regionserver build up more store files without pausing your import.

-ryan

On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu <[email protected]> wrote:

Hi, there:

While saving lots of data to hbase, I noticed that the regionserver CPU went to more than 100%. Examination shows that the hbase CompactSplit thread is spending all its time compacting/splitting hbase store files. The machine I have is an 8-core machine; because there is only one compact/split thread in hbase, only one core is fully used.
I continue to submit map/reduce jobs to insert records into hbase. Most of the time, a job runs very fast, around 1-5 minutes. But occasionally it can take 2 hours. That is very bad for me. I highly suspect that the occasional slow insertion is related to the insufficient speed of the compact/split thread.

I am thinking that I should parallelize the compact/split work. The code has this: the for loop "for (Store store: stores.values())" can be parallelized via Java 5's thread pool, so that multiple cores are used instead of only one. I wonder if this will help to increase the throughput.

Somebody mentioned that I can increase the region size so that I don't do so many compactions under a heavy-writing situation. Does anybody have experience showing that it helps?

Jimmy.

byte [] compactStores(final boolean majorCompaction) throws IOException {
  if (this.closing.get() || this.closed.get()) {
    LOG.debug("Skipping compaction on " + this + " because closing/closed");
    return null;
  }
  splitsAndClosesLock.readLock().lock();
  try {
    byte [] splitRow = null;
    if (this.closed.get()) {
      return splitRow;
    }
    try {
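For what it's worth, the parallelization Jimmy sketches above would look roughly like the code below. This is not the actual HRegion code: Store here is a stand-in interface, and compactStoresInParallel plus the pool size are made up for illustration; it only shows the shape of fanning the per-store loop out over a Java 5 thread pool.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCompactionSketch {

  /** Stand-in for the real HBase Store class; only the one call this sketch needs. */
  interface Store {
    void compact() throws Exception;
  }

  /**
   * Fan the per-store compactions of one region out over a small pool
   * instead of running them serially on the single CompactSplit thread.
   */
  static void compactStoresInParallel(List<Store> stores, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Void>> results = new ArrayList<Future<Void>>();
      for (final Store store : stores) {
        results.add(pool.submit(new Callable<Void>() {
          public Void call() throws Exception {
            store.compact();
            return null;
          }
        }));
      }
      for (Future<Void> f : results) {
        f.get(); // rethrows any compaction failure as ExecutionException
      }
    } finally {
      pool.shutdown();
    }
  }
}

Ryan's point above still applies, though: the single thread is there to cap compaction IO, so spreading the work over more cores trades cluster IO for the CPU bottleneck rather than removing the cost.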
