We saw this as well at Splice Machine, and it led us to run compactions in Spark. Once we did, the compaction effects went away almost entirely.
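In rough outline, the idea looks like the sketch below. This is a hand sketch only: listStoresToCompact() and rewriteStore() are stand-ins for the real read-merge-write logic, which lives in the class linked below.

    import java.util.Collections;
    import java.util.List;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkCompactionSketch {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("olap-compaction");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // Stand-in: enumerate the stores (region + column family)
                // that are due for compaction.
                List<String> stores = listStoresToCompact();
                // One Spark task per store: the merge/rewrite burns CPU on
                // the Spark executors instead of on the RegionServers, which
                // is why the hourly compaction spikes stopped hurting us.
                sc.parallelize(stores, Math.max(1, stores.size()))
                  .foreach(SparkCompactionSketch::rewriteStore);
            }
        }

        // Stand-in: read the store's HFiles and write one merged file back.
        private static void rewriteStore(String storeDir) { }

        private static List<String> listStoresToCompact() {
            return Collections.emptyList(); // stand-in
        }
    }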
Here is a link to our code:
https://github.com/splicemachine/spliceengine/blob/73640a81972ef5831c1ea834ac9ac22f5b3428db/hbase_sql/src/main/java/com/splicemachine/olap/CompactionJob.java

We have a TODO to get this back into the community.

Regards,
John Leach

> On Aug 11, 2016, at 8:03 AM, Sterfield <sterfi...@gmail.com> wrote:
>
> And it's gone [1]. No more spikes in the writes/reads, no more OpenTSDB
> errors. So I think it's safe to assume that OpenTSDB compaction generates
> additional load that HBase does not handle very well, and that this is
> what causes the issues I'm mentioning.
>
> It seems the MVCC errors are also gone (to be checked).
>
> I don't know how to manage HBase so that it can handle this compaction
> without any issues, but at least I know where it comes from.
>
> [1] :
> https://www.dropbox.com/s/d6l2lngr6mpizh9/Without%20OpenTSDB%20compaction.png?dl=0
>
> 2016-08-11 13:18 GMT+02:00 Sterfield <sterfi...@gmail.com>:
>
>> Hello,
>>
>>>> Hi,
>>>>
>>>> Thanks for your answer.
>>>>
>>>> I'm currently testing OpenTSDB + HBase, so I'm generating thousands of
>>>> HTTP POSTs on OpenTSDB in order to write data points (currently up to
>>>> 300k/s). OpenTSDB is only doing increment / append (AFAIK).
>>>
>>> How many nodes, or is that 300k/s on a single machine?
>>
>> 1 master node, 4 slaves, colocated HDFS + RS.
>>
>> Master : m4.2xlarge (8 CPU, 32 GB RAM)
>> Slave : d2.2xlarge (8 CPU, 61 GB RAM, 6x2 TB disks)
>>
>>>> If I have understood your answer correctly, some write ops are queued,
>>>> and some younger ops in the queue are "done" while some older ones are
>>>> not.
>>>
>>> What Anoop said, plus: we'll see the STUCK notice when it is taking a
>>> long time for the MVCC read point to come up to the write point of the
>>> currently ongoing transaction. We hold the updating thread until the
>>> read point is equal to or greater than the current transaction's write
>>> point. We do this to ensure a client can read its own writes. The MVCC
>>> is region wide. If there are many ongoing updates, a slightly slower
>>> one may drag down the completion of other outstanding transactions.
>>> Does the STUCK message go away after some time? Does it happen
>>> frequently? A thread dump while this is going on would be interesting
>>> if possible, or what else is going on on the server around this time
>>> (see in logs?).
>>
>> Yes, the STUCK messages go on for quite some time (a dozen minutes or
>> so), and it happens every hour.
>>
>>>> Few additional questions :
>>>>
>>>> - Is it a problem regarding the data, or is it "safe"? In other words,
>>>> will the old data that has not been written yet be dropped, or will it
>>>> be written correctly, just later?
>>>
>>> No data is going to be dropped. The STUCK message is just flagging that
>>> a write is taking a while to complete while we wait on MVCC. Are you
>>> backed up on disk or another resource, or have a bunch of writers all
>>> happened to arrive at one particular region (MVCC is by region)?
>>
>> I've pre-split my "tsdb" table so its regions are spread across all 4
>> servers, so I think I'm OK on this side. All data is stored locally on
>> the EC2 disks.
>>
>>>> - How can I debug this and, if possible, fix it?
>>>
>>> See above. Your writes are well distributed. Disks are healthy?
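For reference, the pre-split mentioned above can be done at table-creation time with the HBase 1.x admin API. A minimal sketch, assuming OpenTSDB's default "t" column family and an even 16-way split on the first row-key byte (the start of the metric UID); OpenTSDB's own table-creation scripts are the authoritative way to do this:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class PreSplitTsdb {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                HTableDescriptor desc =
                    new HTableDescriptor(TableName.valueOf("tsdb"));
                desc.addFamily(new HColumnDescriptor("t"));
                // 15 split points = 16 regions, evenly spaced over the first
                // row-key byte, so writes are spread across all RegionServers
                // from the first insert.
                byte[][] splits = new byte[15][];
                for (int i = 1; i <= 15; i++) {
                    splits[i - 1] = new byte[] { (byte) (i * 16) };
                }
                admin.createTable(desc, splits);
            }
        }
    }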
>> So here's the result of my investigation + my assumptions :
>>
>>    - Every hour, my RSes have a peak of load / CPU. I was watching an RS
>>    when it happened (that's easy, it's at the beginning of each hour),
>>    and the RS Java process was taking all the CPU available on the
>>    machine, hence the load. You can see the load of all my servers in
>>    [1] and [2].
>>    - Disks are fine IMO. Write I/O is OK on average, peaking at 300-400
>>    IOPS, within range for a decent mechanical drive. I don't see any
>>    particular IOPS load at that time.
>>    - However, you can see that every hour (see [3]) :
>>       - Calls are queued
>>       - Writes are impacted (from 160k/s down to 150-140k/s)
>>       - Read RPCs increase (I suppose the RS is not answering, so reads
>>       are queued and then suddenly released in a batch)
>>       - GC count spikes a bit
>>       - Flush and compaction queues stay fine
>>    - Also, heap usage is rather fine (I've set -Xmx27136m). At first I
>>    thought I was reaching the maximum heap size, but you can see on the
>>    graph that 10.200.0.4 is far from the maximum.
>>    - OpenTSDB is also showing that it cannot write or compact rows.
>>    Here's an error :
>>
>> 10:19:02.415 ERROR [CompactionQueue.call] - Failed to write a row to re-compact
>> org.hbase.async.RemoteException: org.apache.hadoop.hbase.RegionTooBusyException:
>> Above memstore limit, regionName=tsdb,\x00\x07\xE4W\xA9\xE1\xE0\x00\x00\x01\x00\x00\x86\x00\x00\x0B\x00\x004\x00\x00\x0C\x00\x00I,1470758107028.5482c665b5843179116be7643cf1fa67.,
>> server=ip-10-200-0-6.eu-west-1.campaign.aws,16020,1470888079169,
>> memstoreSize=537335912, blockingMemStoreSize=536870912
>>
>> followed by a Java stack trace.
>>
>> I'm wondering if the issue is caused by OpenTSDB's "compactions" [4]. I
>> don't fully understand what is done when OpenTSDB runs those
>> compactions, but it happens "when the base hour has passed", which could
>> explain the hourly spikes. I'll try to deactivate this feature and see
>> if the spikes are still there.
>>
>> [1] : https://www.dropbox.com/s/x8t56le3cxf3qci/Host%20detail%20%231.png?dl=0
>> [2] : https://www.dropbox.com/s/kmtg0dnedyphptr/Host%20detail%20%232.png?dl=0
>> [3] : https://www.dropbox.com/s/yb6uq08xb5u7v0a/GC.png?dl=0
>> [4] : http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html#compactions
>>
>> 2016-08-10 19:08 GMT+02:00 Stack <st...@duboce.net>:
>>
>>> On Wed, Aug 10, 2016 at 1:13 AM, Sterfield <sterfi...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Thanks for your answer.
>>>>
>>>> I'm currently testing OpenTSDB + HBase, so I'm generating thousands of
>>>> HTTP POSTs on OpenTSDB in order to write data points (currently up to
>>>> 300k/s). OpenTSDB is only doing increment / append (AFAIK).
>>>
>>> How many nodes, or is that 300k/s on a single machine?
>>>
>>>> - If it's not a problem right now, and if the situation persists, can
>>>> it become a problem later? (for example, HBase cannot keep up the pace
>>>> and finally drops some data)
>>>
>>> HBase will not drop data. It will not be a problem later. It is a
>>> WARN-level message saying writes are stalling some.
>>>
>>>> I have to mention that OpenTSDB uses its own HBase client, asynchbase
>>>> (https://github.com/OpenTSDB/asynchbase). I don't know if it's related
>>>> or not, but just FYI.
>>>
>>> This is serverside, internal to the Region. Nothing to do with
>>> asynchbase (other than asynchbase tends to run with less friction, so
>>> it may be writing faster than our default client).
>>>
>>> St.Ack
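The numbers in the "Above memstore limit" error above match the HBase 1.x defaults exactly: blockingMemStoreSize = hbase.hregion.memstore.flush.size x hbase.hregion.memstore.block.multiplier = 134217728 x 4 = 536870912 bytes. If hourly write bursts are expected, one knob to experiment with (a sketch, not tuning advice for this particular cluster) is the multiplier, in hbase-site.xml:

    <property>
      <!-- Default is 4. Raising it lets a memstore grow further past the
           flush size during a burst before HBase blocks updates with
           RegionTooBusyException; the cost is more heap at risk per
           region. -->
      <name>hbase.hregion.memstore.block.multiplier</name>
      <value>8</value>
    </property>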
>>>> Thanks,
>>>>
>>>> Guillaume
>>>>
>>>> 2016-08-10 8:10 GMT+02:00 Anoop John <anoop.hb...@gmail.com>:
>>>>
>>>>> Ya, it comes with a write workload, not with concurrent reads. Once
>>>>> the write is done (the memstore write and the WAL write), we mark
>>>>> the MVCC operation corresponding to it as complete and wait for the
>>>>> global read point to advance to at least this point. (Every write op
>>>>> has a number corresponding to its MVCC op.)
>>>>> As the MVCC op begins, a write entry (carrying this number) is added
>>>>> to a queue. Once the write is over and we complete the MVCC op, we
>>>>> mark the entry corresponding to it (now in the queue) as
>>>>> completed/done. There are other entries in the queue corresponding
>>>>> to other parallel writes. Now we try to move the global read point
>>>>> at least up to this mark, which means we have to make sure all the
>>>>> write ops with a lesser number than this get completed. Until then,
>>>>> there is a wait. So what is happening here is that a later write op
>>>>> has completed but is still waiting for an older write to get
>>>>> completed. What sort of write ops do you have? Batch puts and/or
>>>>> increments/appends?
>>>>>
>>>>> -Anoop-
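Anoop's description maps onto a small amount of code. Below is a toy model, illustrative only and not HBase's actual MultiVersionConcurrencyControl, though readPoint and writePoint here play the same roles as the fields printed in the STUCK warning:

    import java.util.ArrayDeque;
    import java.util.Deque;

    class ToyMvcc {
        static final class WriteEntry {
            final long writeNumber;
            boolean completed;
            WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
        }

        private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();
        private long readPoint = 0;   // highest write visible to readers
        private long writePoint = 0;  // highest write number handed out

        // A write begins: it takes the next write number, joins the queue.
        synchronized WriteEntry begin() {
            WriteEntry e = new WriteEntry(++writePoint);
            writeQueue.addLast(e);
            return e;
        }

        // Memstore + WAL work finished: mark the entry done, advance the
        // read point past the completed prefix of the queue, then wait
        // until every OLDER write has finished too. This last wait is
        // where a handler thread can sit long enough to log STUCK.
        synchronized void complete(WriteEntry e) throws InterruptedException {
            e.completed = true;
            while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
                readPoint = writeQueue.pollFirst().writeNumber;
            }
            notifyAll();
            while (readPoint < e.writeNumber) {
                wait(); // blocked on older, still-running writes
            }
        }
    }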
>>>>> On Tue, Aug 9, 2016 at 6:45 PM, Sterfield <sterfi...@gmail.com> wrote:
>>>>>
>>>>>> 2016-08-05 17:52 GMT+02:00 Bryan Beaudreault <bbeaudrea...@hubspot.com>:
>>>>>>
>>>>>>> I'm also interested in an answer here. We see this from time to
>>>>>>> time in our production HBase clusters (non-OpenTSDB). It seems to
>>>>>>> be related to contention under heavy reads or heavy writes, but
>>>>>>> it's not clear what the impact is.
>>>>>>>
>>>>>>> On Fri, Aug 5, 2016 at 5:14 AM Sterfield <sterfi...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm currently testing HBase 1.2.1 + OpenTSDB. For that, I'm
>>>>>>>> generating a high load of HTTP PUTs on OpenTSDB, which then writes
>>>>>>>> to HBase. Currently I'm able to feed 300k data points per second,
>>>>>>>> and I'm trying to achieve higher speeds.
>>>>>>>>
>>>>>>>> I have also activated JMX on both the master and region servers in
>>>>>>>> order to grab a bunch of metrics from HBase.
>>>>>>>>
>>>>>>>> At some point, I noticed a bunch of warnings in a region server
>>>>>>>> log file :
>>>>>>>>
>>>>>>>> 2016-08-04 05:12:06,537 WARN [B.defaultRpcServer.handler=21,queue=3,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,538 WARN [B.defaultRpcServer.handler=19,queue=1,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,538 WARN [B.defaultRpcServer.handler=40,queue=4,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,548 WARN [B.defaultRpcServer.handler=0,queue=0,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,549 WARN [B.defaultRpcServer.handler=35,queue=5,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
>>>>>>>> 2016-08-04 05:12:06,556 WARN [B.defaultRpcServer.handler=7,queue=1,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,558 WARN [B.defaultRpcServer.handler=20,queue=2,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,559 WARN [B.defaultRpcServer.handler=14,queue=2,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,609 WARN [B.defaultRpcServer.handler=42,queue=0,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,609 WARN [B.defaultRpcServer.handler=32,queue=2,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,619 WARN [B.defaultRpcServer.handler=2,queue=2,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>> 2016-08-04 05:12:06,619 WARN [B.defaultRpcServer.handler=48,queue=0,port=16020]
>>>>>>>> regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
>>>>>>>>
>>>>>>>> Can someone shed some light on this? I've googled and searched the
>>>>>>>> documentation, but found no clue on this subject.
>>>>>>>>
>>>>>>>> Is it a problem? If so, how can I fix it?
>>>>>>>>
>>>>>>>> Thanks a lot for your help,
>>>>>>>>
>>>>>>>> Guillaume
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Sorry to bump this thread, but does no one know anything about this
>>>>>> issue?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Guillaume
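The hourly OpenTSDB row compaction that turned out to be the trigger (see the top of the thread) can be switched off with a single setting, assuming OpenTSDB 2.x configuration naming:

    # opentsdb.conf: default is true
    tsd.storage.enable_compaction = false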