Hi John,

So does that mean you were not able to handle the additional compaction load by tuning HBase?
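On my side, before touching OpenTSDB itself, the first HBase knobs I plan to experiment with are the memstore blocking thresholds, since the RegionTooBusyException quoted below complains about "Above memstore limit" and, if I've read the docs right, that blocking size is hbase.hregion.memstore.flush.size multiplied by hbase.hregion.memstore.block.multiplier (128 MB x 4 = 512 MB by default, which matches the blockingMemStoreSize in the error). A rough sketch of what I have in mind for hbase-site.xml -- the values are only illustrative, not a tested recommendation:

  <property>
    <!-- default is 134217728 (128 MB); larger flushes = fewer, bigger HFiles -->
    <name>hbase.hregion.memstore.flush.size</name>
    <value>268435456</value>
  </property>
  <property>
    <!-- default is 4; combined with the value above this raises the per-region blocking size to 2 GB -->
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>8</value>
  </property>

Of course this only raises the ceiling and is still bounded by hbase.regionserver.global.memstore.size, so it may just move the problem around; I'd rather understand first what the OpenTSDB compaction is actually doing.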
I've reached out to the OpenTSDB guys to find out exactly what is done during a compaction. With that information, it may be possible to tune HBase to handle the load correctly. To me, it seems that there is a big scan, then lots of writes.

I'll continue to work on this, but I welcome any comments / advice on this subject!

Thanks,

Guillaume

2016-08-11 17:13 GMT+02:00 John Leach <[email protected]>:

> We saw this as well at Splice Machine. This led us to run compactions in Spark. Once we did this, we saw the compaction effects go away almost entirely.
>
> Here is a link to our code:
>
> https://github.com/splicemachine/spliceengine/blob/73640a81972ef5831c1ea834ac9ac22f5b3428db/hbase_sql/src/main/java/com/splicemachine/olap/CompactionJob.java
>
> We have a todo to get this back in the community.
>
> Regards,
> John Leach
>
>
> On Aug 11, 2016, at 8:03 AM, Sterfield <[email protected]> wrote:
>
> > And it's gone [1]. No more spikes in writes / reads, no more OpenTSDB errors. So I think it's safe to assume that the OpenTSDB compaction is generating some additional load that is not handled very well by HBase, and is therefore generating the issues I'm mentioning.
> >
> > It also seems that the MVCC errors are gone (to be checked).
> >
> > I don't know how to manage HBase so that it can handle compaction without any issues, but at least I know where it comes from.
> >
> > [1] : https://www.dropbox.com/s/d6l2lngr6mpizh9/Without%20OpenTSDB%20compaction.png?dl=0
> >
> > 2016-08-11 13:18 GMT+02:00 Sterfield <[email protected]>:
> >
> >> Hello,
> >>
> >>>> Hi,
> >>>>
> >>>> Thanks for your answer.
> >>>>
> >>>> I'm currently testing OpenTSDB + HBase, so I'm generating thousands of HTTP POST on OpenTSDB in order to write data points (currently up to 300k/s). OpenTSDB is only doing increment / append (AFAIK).
> >>>>
> >>>> How many nodes or is that 300k/s on a single machine?
> >>
> >>
> >> 1 master node, 4 slaves, colo HDFS + RS.
> >>
> >> Master : m4.2xlarge (8 CPU, 32 GB RAM)
> >> Slave : d2.2xlarge (8 CPU, 61 GB RAM, 6x2T disk)
> >>
> >>
> >>>> If I have understood your answer correctly, some write ops are queued, and some younger ops in the queue are "done" while some older are not.
> >>>>
> >>>>
> >>> What Anoop said, plus: we'll see the STUCK notice when it is taking a long time for the MVCC read point to come up to the write point of the currently ongoing transaction. We will hold the updating thread until the read point is equal to or greater than the current transaction's write point. We do this to ensure a client can read its own writes. The MVCC is region wide. If there are many ongoing updates, a slightly slower one may drag down other outstanding transactions completing. The STUCK message goes away after some time? It happens frequently? A thread dump while this is going on would be interesting if possible, or what else is going on on the server around this time (see in logs?)
> >>
> >>
> >> Yes, the STUCK message happens for quite some time (a dozen minutes, each hour). It happens every hour.
> >>
> >>> Few additional questions:
> >>>>
> >>>> - Is it a problem regarding the data or is it "safe"? In other words, will the old data that has not been written yet be dropped, or will it be written correctly, just later?
> >>>>
> >>> No data is going to be dropped. The STUCK message is just flagging you that a write is taking a while to complete while we wait on MVCC. Are you backed up on disk or another resource, or have a bunch of writers all happened to arrive at one particular region (MVCC is by region)?
> >>
> >>
> >> I've pre-split my "tsdb" table to be managed by all 4 servers, so I think I'm OK on this side. All data is stored locally on the EC2 disks.
> >>
> >>
> >>>> - How can I debug this and, if possible, fix it?
> >>>>
> >>>>
> >>> See above. Your writes are well distributed. Disks are healthy?
> >>
> >>
> >> So here's the result of my investigation + my assumptions:
> >>
> >> - Every hour, my RS have a peak of load / CPU. I was looking at a RS when it happened (that's easy, it's at the beginning of each hour), and the RS Java process was taking all the CPU available on the machine, hence the load. You can see the load of all my servers in these images, see [1] and [2].
> >> - Disks are fine IMO. Write IO is OK on average, peaking at 300 / 400 IOPS, in range for a correct mechanical drive. I don't see a particular IOPS load at that time.
> >> - However, you can see that every hour (see [3]):
> >>     - Calls are queued
> >>     - Writes are impacted (from 160k/s, down to 150 - 140k/s)
> >>     - Read RPCs are increased (I suppose the RS is not answering, so they are queued and suddenly released in a batch)
> >>     - GC count is spiking a bit
> >>     - You can see that the flush and compaction queues are fine
> >> - Also, the heap usage is rather fine (I've set -Xmx27136m). At first, I thought I was reaching the maximum heap size, but you can see on the graph that 10.200.0.4 is far from the maximum.
> >> - OpenTSDB is also showing that it cannot write or compact rows. Here's an error:
> >>
> >> 10:19:02.415 ERROR [CompactionQueue.call] - Failed to write a row to re-compact
> >> org.hbase.async.RemoteException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, regionName=tsdb,\x00\x07\xE4W\xA9\xE1\xE0\x00\x00\x01\x00\x00\x86\x00\x00\x0B\x00\x004\x00\x00\x0C\x00\x00I,1470758107028.5482c665b5843179116be7643cf1fa67., server=ip-10-200-0-6.eu-west-1.campaign.aws,16020,1470888079169, memstoreSize=537335912, blockingMemStoreSize=536870912
> >>
> >> Then a Java stack trace.
> >>
> >> I'm wondering if the issue is not caused by the OpenTSDB "compaction" [4]. I don't fully understand what's being done when OpenTSDB is doing those compactions, but it's done "when the base hour has passed", so it could explain those hourly spikes. I'll try to deactivate this feature and see if those spikes are still there.
> >>
> >> [1] : https://www.dropbox.com/s/x8t56le3cxf3qci/Host%20detail%20%231.png?dl=0
> >> [2] : https://www.dropbox.com/s/kmtg0dnedyphptr/Host%20detail%20%232.png?dl=0
> >> [3] : https://www.dropbox.com/s/yb6uq08xb5u7v0a/GC.png?dl=0
> >> [4] : http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html#compactions
> >>
> >> 2016-08-10 19:08 GMT+02:00 Stack <[email protected]>:
> >>
> >>> On Wed, Aug 10, 2016 at 1:13 AM, Sterfield <[email protected]> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Thanks for your answer.
> >>>>
> >>>> I'm currently testing OpenTSDB + HBase, so I'm generating thousands of HTTP POST on OpenTSDB in order to write data points (currently up to 300k/s). OpenTSDB is only doing increment / append (AFAIK).
> >>>>
> >>>> How many nodes or is that 300k/s on a single machine?
> >>>
> >>>
> >>>> - If it's not a problem right now, and if the situation persists, can it be a problem later? (For example, HBase cannot keep up the pace and then finally drops some data.)
> >>>>
> >>> HBase will not drop data. It will not be a problem later. It is a WARN-level message saying writes are stalling some.
> >>>
> >>>
> >>>> I have to mention that OpenTSDB is using its own HBase client called asynchbase (https://github.com/OpenTSDB/asynchbase). I don't know if it's related or not, but just FYI.
> >>>>
> >>>>
> >>> This is server-side, internal to the Region. Nothing to do with asynchbase (other than asynchbase tends to run with less friction so may be writing faster than our default client).
> >>>
> >>> St.Ack
> >>>
> >>>
> >>>> Thanks,
> >>>>
> >>>> Guillaume
> >>>>
> >>>> 2016-08-10 8:10 GMT+02:00 Anoop John <[email protected]>:
> >>>>
> >>>>> Ya, it comes with a write workload, not so much with concurrent reads. Once the write is done (memstore write and WAL write), we mark the MVCC operation corresponding to it as complete and wait for a global read point to advance to at least this point. (Every write op has a number corresponding to its MVCC op.) As the MVCC op begins, a write entry (carrying this number) is added to a queue. Once the write is over and we complete the MVCC op, we mark the entry corresponding to it (it is in the queue now) as completed/done. There are other entries in the queue as well, corresponding to other parallel writes. Now we try to move the global read point at least up to this mark, which means we have to make sure all the write ops with a lesser number than this are completed. Until then, there is a wait. So what is happening here is that a later write op has completed and is still waiting for an older write to complete. What sort of write ops do you have? Batch put and/or increment/append?
> >>>>>
> >>>>> -Anoop-
> >>>>>
> >>>>>
> >>>>> On Tue, Aug 9, 2016 at 6:45 PM, Sterfield <[email protected]> wrote:
> >>>>>> 2016-08-05 17:52 GMT+02:00 Bryan Beaudreault <[email protected]>:
> >>>>>>
> >>>>>>> I'm also interested in an answer here. We see this from time to time in our production HBase clusters (non-OpenTSDB). It seems to be related to contention under heavy reads or heavy writes. But it's not clear what the impact is here.
> >>>>>>>
> >>>>>>> On Fri, Aug 5, 2016 at 5:14 AM Sterfield <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I'm currently testing HBase 1.2.1 + OpenTSDB. For that, I'm generating a high load of HTTP PUT on OpenTSDB, which then writes into HBase. Currently, I'm able to feed 300k data points per second, and I'm trying to achieve higher speed.
> >>>>>>>>
> >>>>>>>> I have also activated JMX on both the Master and Region servers in order to grab a bunch of metrics from HBase.
> >>>>>>>>
> >>>>>>>> At some point, I'm noticing a bunch of warnings in a region server log file:
> >>>>>>>>
> >>>>>>>> 2016-08-04 05:12:06,537 WARN [B.defaultRpcServer.handler=21,queue=3,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> >>>>>>>> 2016-08-04 05:12:06,538 WARN [B.defaultRpcServer.handler=19,queue=1,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> >>>>>>>> 2016-08-04 05:12:06,538 WARN [B.defaultRpcServer.handler=40,queue=4,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> >>>>>>>> 2016-08-04 05:12:06,548 WARN [B.defaultRpcServer.handler=0,queue=0,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> >>>>>>>> 2016-08-04 05:12:06,549 WARN [B.defaultRpcServer.handler=35,queue=5,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> >>>>>>>> 2016-08-04 05:12:06,556 WARN [B.defaultRpcServer.handler=7,queue=1,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> >>>>>>>> 2016-08-04 05:12:06,558 WARN [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> >>>>>>>> 2016-08-04 05:12:06,559 WARN [B.defaultRpcServer.handler=14,queue=2,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> >>>>>>>> 2016-08-04 05:12:06,609 WARN [B.defaultRpcServer.handler=42,queue=0,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> >>>>>>>> 2016-08-04 05:12:06,609 WARN [B.defaultRpcServer.handler=32,queue=2,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> >>>>>>>> 2016-08-04 05:12:06,619 WARN [B.defaultRpcServer.handler=2,queue=2,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> >>>>>>>> 2016-08-04 05:12:06,619 WARN [B.defaultRpcServer.handler=48,queue=0,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> >>>>>>>>
> >>>>>>>> Can someone shed some light on this? I've been googling and searching the documentation, but found no clue on this subject.
> >>>>>>>>
> >>>>>>>> Is it a problem? If so, how can I fix it?
> >>>>>>>>
> >>>>>>>> Thanks a lot for your help,
> >>>>>>>>
> >>>>>>>> Guillaume
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> Sorry to bump this thread, but does no one know anything about this issue?
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Guillaume
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
