On Tue, Aug 16, 2016 at 9:27 AM, Sterfield <[email protected]> wrote:
> > > > ...
> > >
> > > On the corresponding RS, at the same time, there's a message about a big
> > > flush, but not with so much memory in the memstore. Also, I don't see any
> > > warning that could explain why the memstore grew so large (nothing about
> > > the fact that there's too many hfiles to compact, for example).
> > >
> >
> > HBase keeps writing the memstore till it trips a lower limit. It then moves
> > to try and flush the region, taking writes all the time, until it hits an
> > upper limit, at which time it stops taking on writes until the flush
> > completes, to prevent running out of memory. If the RS is under load, it may
> > take a while to flush out the memstore, causing us to hit the upper bound.
> > Is that what is going on here?
> >
>
> What I saw from the documentation + research on the Internet is:
>
> - the memstore grows up to the memstore size limit (here 256MB)
> - it can flush earlier if:
>   - the total amount of space taken by all the memstores hits the lower
>     limit (0.35 by default) or the upper limit (0.40 by default). If it
>     reaches the upper limit, writes are blocked
>

Did you hit the 'blocked' state in this case? (You can see it in the logs if
that is the case.)

>   - under "memory pressure", meaning that there's no more memory available
>   - the "multiplier" allows a memstore to grow up to a certain point (here
>     x4), apparently in various cases:
>     - when there are too many hfiles generated and a compaction must happen
>     - apparently under lots of load
>

We will block too if there are too many storefiles. Again, the logs should
report if this is why it is holding up writes.

<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>10</value>
  <description>If more than this number of StoreFiles exist in any one Store
  (one StoreFile is written per flush of MemStore), updates are blocked for
  this region until a compaction is completed, or until
  hbase.hstore.blockingWaitTime has been exceeded.</description>
</property>

(I hate this config -- smile.)

> Back to your question, yes, the server is being load-tested at the moment,
> so I'm constantly writing 150k rows through OpenTSDB as we speak.
>

Ok. Can you let the memstores run higher than 256M?

> > > 2016-08-16 12:04:57,752 INFO [MemStoreFlusher.0] regionserver.HRegion:
> > > Finished memstore flush of ~821.25 MB/861146920, currentsize=226.67
> > > MB/237676040 for region
> > > tsdb,\x00\x03\xD9W\xAD\x82\x00\x00\x00\x01\x00\x00T\x00\x00\x0A\x00\x008\x00\x00\x0B\x00\x009\x00\x00\x0C\x00\x005,1471090649103.b833cb8fdceff5cd21887aa9ff11e7bc.
> > > in 13449ms, sequenceid=11332624, compaction requested=true
> > >
> > > So, what could explain this amount of memory taken by the memstore, and
> > > how could I handle such a situation?
> > >
> >
> > You are taking on a lot of writes? The server is under load? Are lots of
> > regions concurrently flushing? You could up the flushing thread count from
> > the default of 1. You could up the multiplier so there is more runway for
> > the flush to complete within (x6 instead of x4), etc.
> >
>
> Thanks for the flushing thread count, never heard about that!
> I could indeed increase the multiplier. I'll try that as well.
>

<property>
  <name>hbase.hstore.flusher.count</name>
  <value>2</value>
  <description>The number of flush threads. With fewer threads, the MemStore
  flushes will be queued. With more threads, the flushes will be executed in
  parallel, increasing the load on HDFS, and potentially causing more
  compactions.</description>
</property>
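To make that concrete, here is a hbase-site.xml sketch pulling together the
flush knobs discussed above -- the values are placeholders to experiment with
against your own heap size and write load, not recommendations:

<!-- Sketch only: illustrative values for the flush-related settings
     mentioned in this thread; tune them to your own heap and load. -->
<property>
  <!-- Per-region memstore flush threshold (268435456 = 256MB, the size in use here). -->
  <name>hbase.hregion.memstore.flush.size</name>
  <value>268435456</value>
</property>
<property>
  <!-- Writes to a region block once its memstore reaches multiplier * flush size;
       x6 gives more runway than the x4 default for a flush to complete. -->
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>6</value>
</property>
<property>
  <!-- More flush threads drain memstores in parallel, at the cost of more
       load on HDFS and potentially more compactions. -->
  <name>hbase.hstore.flusher.count</name>
  <value>4</value>
</property>

Note that raising the multiplier only buys time for a flush to finish; if
flushes persistently fall behind, more flusher threads or a larger per-region
flush size address the cause more directly.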
See if it helps. If you want to put up more log from a RS for us to look at,
w/ a bit of a note on what configs you have, we can take a look.

St.Ack

> Guillaume
>
