We saw this as well at Splice Machine, and it led us to run compactions in Spark. Once we did, the compaction effects went away almost entirely.
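In rough outline, the idea looks like the sketch below. This is a hand sketch only: listStoresToCompact() and rewriteStore() are stand-ins for the real read-merge-write logic, which lives in the class linked below.

    import java.util.Collections;
    import java.util.List;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkCompactionSketch {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("olap-compaction");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // Stand-in: enumerate the stores (region + column family)
                // that are due for compaction.
                List<String> stores = listStoresToCompact();
                // One Spark task per store: the merge/rewrite burns CPU on
                // the Spark executors instead of on the RegionServers, which
                // is why the hourly compaction spikes stopped hurting us.
                sc.parallelize(stores, Math.max(1, stores.size()))
                  .foreach(SparkCompactionSketch::rewriteStore);
            }
        }

        // Stand-in: read the store's HFiles and write one merged file back.
        private static void rewriteStore(String storeDir) { }

        private static List<String> listStoresToCompact() {
            return Collections.emptyList(); // stand-in
        }
    }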
Here is a link to our code:
https://github.com/splicemachine/spliceengine/blob/73640a81972ef5831c1ea834ac9ac22f5b3428db/hbase_sql/src/main/java/com/splicemachine/olap/CompactionJob.java

We have a TODO to get this back into the community.

Regards,
John Leach

> On Aug 11, 2016, at 8:03 AM, Sterfield <sterfi...@gmail.com> wrote:
>
> And it's gone [1]. No more spikes in the writes/reads, no more OpenTSDB
> errors. So I think it's safe to assume that OpenTSDB compaction generates
> additional load that HBase does not handle very well, and that this is
> what causes the issues I'm mentioning.
>
> It seems the MVCC errors are also gone (to be checked).
>
> I don't know how to manage HBase so that it can handle this compaction
> without any issues, but at least I know where it comes from.
>
> [1] :
> https://www.dropbox.com/s/d6l2lngr6mpizh9/Without%20OpenTSDB%20compaction.png?dl=0
>
> 2016-08-11 13:18 GMT+02:00 Sterfield <sterfi...@gmail.com>:
>
>> Hello,
>>
>>>> Hi,
>>>>
>>>> Thanks for your answer.
>>>>
>>>> I'm currently testing OpenTSDB + HBase, so I'm generating thousands of
>>>> HTTP POSTs on OpenTSDB in order to write data points (currently up to
>>>> 300k/s). OpenTSDB is only doing increment / append (AFAIK).
>>>
>>> How many nodes, or is that 300k/s on a single machine?
>>
>> 1 master node, 4 slaves, colocated HDFS + RS.
>>
>> Master : m4.2xlarge (8 CPU, 32 GB RAM)
>> Slave : d2.2xlarge (8 CPU, 61 GB RAM, 6x2 TB disks)
>>
>>>> If I have understood your answer correctly, some write ops are queued,
>>>> and some younger ops in the queue are "done" while some older ones are
>>>> not.
>>>
>>> What Anoop said, plus: we'll see the STUCK notice when it is taking a
>>> long time for the MVCC read point to come up to the write point of the
>>> currently ongoing transaction. We hold the updating thread until the
>>> read point is equal to or greater than the current transaction's write
>>> point. We do this to ensure a client can read its own writes. The MVCC
>>> is region wide. If there are many ongoing updates, a slightly slower
>>> one may drag down the completion of other outstanding transactions.
>>> Does the STUCK message go away after some time? Does it happen
>>> frequently? A thread dump while this is going on would be interesting
>>> if possible, or what else is going on on the server around this time
>>> (see in logs?).
>>
>> Yes, the STUCK messages go on for quite some time (a dozen minutes or
>> so), and it happens every hour.
>>
>>>> Few additional questions :
>>>>
>>>> - Is it a problem regarding the data, or is it "safe"? In other words,
>>>> will the old data that has not been written yet be dropped, or will it
>>>> be written correctly, just later?
>>>
>>> No data is going to be dropped. The STUCK message is just flagging that
>>> a write is taking a while to complete while we wait on MVCC. Are you
>>> backed up on disk or another resource, or have a bunch of writers all
>>> happened to arrive at one particular region (MVCC is by region)?
>>
>> I've pre-split my "tsdb" table so its regions are spread across all 4
>> servers, so I think I'm OK on this side. All data is stored locally on
>> the EC2 disks.
>>
>>>> - How can I debug this and, if possible, fix it?
>>>
>>> See above. Your writes are well distributed. Disks are healthy?
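For reference, the pre-split mentioned above can be done at table-creation time with the HBase 1.x admin API. A minimal sketch, assuming OpenTSDB's default "t" column family and an even 16-way split on the first row-key byte (the start of the metric UID); OpenTSDB's own table-creation scripts are the authoritative way to do this:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class PreSplitTsdb {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                HTableDescriptor desc =
                    new HTableDescriptor(TableName.valueOf("tsdb"));
                desc.addFamily(new HColumnDescriptor("t"));
                // 15 split points = 16 regions, evenly spaced over the first
                // row-key byte, so writes are spread across all RegionServers
                // from the first insert.
                byte[][] splits = new byte[15][];
                for (int i = 1; i <= 15; i++) {
                    splits[i - 1] = new byte[] { (byte) (i * 16) };
                }
                admin.createTable(desc, splits);
            }
        }
    }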
>> So here's the result of my investigation + my assumptions :
>>
>>    - Every hour, my RSes have a peak of load / CPU. I was watching an RS
>>    when it happened (that's easy, it's at the beginning of each hour),
>>    and the RS Java process was taking all the CPU available on the
>>    machine, hence the load. You can see the load of all my servers in
>>    [1] and [2].
>>    - Disks are fine IMO. Write I/O is OK on average, peaking at 300-400
>>    IOPS, within range for a decent mechanical drive. I don't see any
>>    particular IOPS load at that time.
>>    - However, you can see that every hour (see [3]) :
>>       - Calls are queued
>>       - Writes are impacted (from 160k/s down to 150-140k/s)
>>       - Read RPCs increase (I suppose the RS is not answering, so reads
>>       are queued and then suddenly released in a batch)
>>       - GC count spikes a bit
>>       - Flush and compaction queues stay fine
>>    - Also, heap usage is rather fine (I've set -Xmx27136m). At first I
>>    thought I was reaching the maximum heap size, but you can see on the
>>    graph that 10.200.0.4 is far from the maximum.
>>    - OpenTSDB is also showing that it cannot write or compact rows.
>>    Here's an error :
>>
>> 10:19:02.415 ERROR [CompactionQueue.call] - Failed to write a row to re-compact
>> org.hbase.async.RemoteException: org.apache.hadoop.hbase.RegionTooBusyException:
>> Above memstore limit, regionName=tsdb,\x00\x07\xE4W\xA9\xE1\xE0\x00\x00\x01\x00\x00\x86\x00\x00\x0B\x00\x004\x00\x00\x0C\x00\x00I,1470758107028.5482c665b5843179116be7643cf1fa67.,
>> server=ip-10-200-0-6.eu-west-1.campaign.aws,16020,1470888079169,
>> memstoreSize=537335912, blockingMemStoreSize=536870912
>>
>> followed by a Java stack trace.
>>
>> I'm wondering if the issue is caused by OpenTSDB's "compactions" [4]. I
>> don't fully understand what is done when OpenTSDB runs those
>> compactions, but it happens "when the base hour has passed", which could
>> explain the hourly spikes. I'll try to deactivate this feature and see
>> if the spikes are still there.
>>
>> [1] : https://www.dropbox.com/s/x8t56le3cxf3qci/Host%20detail%20%231.png?dl=0
>> [2] : https://www.dropbox.com/s/kmtg0dnedyphptr/Host%20detail%20%232.png?dl=0
>> [3] : https://www.dropbox.com/s/yb6uq08xb5u7v0a/GC.png?dl=0
>> [4] : http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html#compactions
>>
>> 2016-08-10 19:08 GMT+02:00 Stack <st...@duboce.net>:
>>
>>> On Wed, Aug 10, 2016 at 1:13 AM, Sterfield <sterfi...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Thanks for your answer.
>>>>
>>>> I'm currently testing OpenTSDB + HBase, so I'm generating thousands of
>>>> HTTP POSTs on OpenTSDB in order to write data points (currently up to
>>>> 300k/s). OpenTSDB is only doing increment / append (AFAIK).
>>>
>>> How many nodes, or is that 300k/s on a single machine?
>>>
>>>> - If it's not a problem right now, and if the situation persists, can
>>>> it become a problem later? (for example, HBase cannot keep up the pace
>>>> and finally drops some data)
>>>
>>> HBase will not drop data. It will not be a problem later. It is a
>>> WARN-level message saying writes are stalling some.
>>>
>>>> I have to mention that OpenTSDB uses its own HBase client, asynchbase
>>>> (https://github.com/OpenTSDB/asynchbase). I don't know if it's related
>>>> or not, but just FYI.
>>>
>>> This is serverside, internal to the Region. Nothing to do with
>>> asynchbase (other than asynchbase tends to run with less friction, so
>>> it may be writing faster than our default client).
>>>
>>> St.Ack
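The numbers in the "Above memstore limit" error above match the HBase 1.x defaults exactly: blockingMemStoreSize = hbase.hregion.memstore.flush.size x hbase.hregion.memstore.block.multiplier = 134217728 x 4 = 536870912 bytes. If hourly write bursts are expected, one knob to experiment with (a sketch, not tuning advice for this particular cluster) is the multiplier, in hbase-site.xml:

    <property>
      <!-- Default is 4. Raising it lets a memstore grow further past the
           flush size during a burst before HBase blocks updates with
           RegionTooBusyException; the cost is more heap at risk per
           region. -->
      <name>hbase.hregion.memstore.block.multiplier</name>
      <value>8</value>
    </property>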
>>>> Thanks,
>>>>
>>>> Guillaume
>>>>
>>>> 2016-08-10 8:10 GMT+02:00 Anoop John <anoop.hb...@gmail.com>:
>>>>
>>>>> Ya, it comes with a write workload, not with concurrent reads. Once
>>>>> the write is done (the memstore write and the WAL write), we mark
>>>>> the MVCC operation corresponding to it as complete and wait for the
>>>>> global read point to advance to at least this point. (Every write op
>>>>> has a number corresponding to its MVCC op.)
>>>>> As the MVCC op begins, a write entry (carrying this number) is added
>>>>> to a queue. Once the write is over and we complete the MVCC op, we
>>>>> mark the entry corresponding to it (now in the queue) as
>>>>> completed/done. There are other entries in the queue corresponding
>>>>> to other parallel writes. Now we try to move the global read point
>>>>> at least up to this mark, which means we have to make sure all the
>>>>> write ops with a lesser number than this get completed. Until then,
>>>>> there is a wait. So what is happening here is that a later write op
>>>>> has completed but is still waiting for an older write to get
>>>>> completed. What sort of write ops do you have? Batch puts and/or
>>>>> increments/appends?
>>>>>
>>>>> -Anoop-
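Anoop's description maps onto a small amount of code. Below is a toy model, illustrative only and not HBase's actual MultiVersionConcurrencyControl, though readPoint and writePoint here play the same roles as the fields printed in the STUCK warning:

    import java.util.ArrayDeque;
    import java.util.Deque;

    class ToyMvcc {
        static final class WriteEntry {
            final long writeNumber;
            boolean completed;
            WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
        }

        private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();
        private long readPoint = 0;   // highest write visible to readers
        private long writePoint = 0;  // highest write number handed out

        // A write begins: it takes the next write number, joins the queue.
        synchronized WriteEntry begin() {
            WriteEntry e = new WriteEntry(++writePoint);
            writeQueue.addLast(e);
            return e;
        }

        // Memstore + WAL work finished: mark the entry done, advance the
        // read point past the completed prefix of the queue, then wait
        // until every OLDER write has finished too. This last wait is
        // where a handler thread can sit long enough to log STUCK.
        synchronized void complete(WriteEntry e) throws InterruptedException {
            e.completed = true;
            while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
                readPoint = writeQueue.pollFirst().writeNumber;
            }
            notifyAll();
            while (readPoint < e.writeNumber) {
                wait(); // blocked on older, still-running writes
            }
        }
    }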
>>>>> On Tue, Aug 9, 2016 at 6:45 PM, Sterfield <sterfi...@gmail.com> wrote:
>>>>>
>>>>>> 2016-08-05 17:52 GMT+02:00 Bryan Beaudreault <bbeaudrea...@hubspot.com>:
>>>>>>
>>>>>>> I'm also interested in an answer here. We see this from time to
>>>>>>> time in our production HBase clusters (non-OpenTSDB). It seems to
>>>>>>> be related to contention under heavy reads or heavy writes, but
>>>>>>> it's not clear what the impact is.
>>>>>>>
>>>>>>> On Fri, Aug 5, 2016 at 5:14 AM Sterfield <sterfi...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm currently testing HBase 1.2.1 + OpenTSDB. For that, I'm
>>>>>>>> generating a high load of HTTP PUTs on OpenTSDB, which then writes
>>>>>>>> to HBase. Currently I'm able to feed 300k data points per second,
>>>>>>>> and I'm trying to achieve higher speeds.
>>>>>>>>
>>>>>>>> I have also activated JMX on both the master and region servers in
>>>>>>>> order to grab a bunch of metrics from HBase.
>>>>>>>>
>>>>>>>> At some point, I noticed a bunch of warnings in a region server
>>>>>>>> log file :
>>>>>>>>
>>>>>>>> 2016-08-04 05:12:06,537 WARN [B.defaultRpcServer.handler=21,queue=3,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,538 WARN [B.defaultRpcServer.handler=19,queue=1,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,538 WARN [B.defaultRpcServer.handler=40,queue=4,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,548 WARN [B.defaultRpcServer.handler=0,queue=0,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,549 WARN [B.defaultRpcServer.handler=35,queue=5,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,556 WARN [B.defaultRpcServer.handler=7,queue=1,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,558 WARN [B.defaultRpcServer.handler=20,queue=2,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,559 WARN [B.defaultRpcServer.handler=14,queue=2,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,609 WARN [B.defaultRpcServer.handler=42,queue=0,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,609 WARN [B.defaultRpcServer.handler=32,queue=2,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,619 WARN [B.defaultRpcServer.handler=2,queue=2,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,619 WARN [B.defaultRpcServer.handler=48,queue=0,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>>
>>>>>>>> Can someone shed some light on this? I've googled and searched the
>>>>>>>> documentation, but found no clue on this subject.
>>>>>>>>
>>>>>>>> Is it a problem? If so, how can I fix it?
>>>>>>>>
>>>>>>>> Thanks a lot for your help,
>>>>>>>>
>>>>>>>> Guillaume
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Sorry to bump this thread, but does no one know anything about this
>>>>>> issue?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Guillaume
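The hourly OpenTSDB row compaction that turned out to be the trigger (see the top of the thread) can be switched off with a single setting, assuming OpenTSDB 2.x configuration naming:

    # opentsdb.conf: default is true
    tsd.storage.enable_compaction = false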