Here's what I did:

0. Launch minicluster out of dev tree with ./bin/hbase master start
1. Load 10M rows using YCSB
2. Snapshot
3. Shutdown and restart
4. Clone a test table from the snapshot
5. Run YCSB workload A for 2M operations

The profile was captured at step 5. HTH
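For anyone who wants to script steps 2 and 4, here is a minimal sketch using the Java client Admin API. It assumes the YCSB default target table "usertable"; the snapshot and clone table names are made up for illustration.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class SnapshotCloneSteps {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
          // Step 2: snapshot the table YCSB loaded ("usertable" is the YCSB default)
          admin.snapshot("usertable-snap", TableName.valueOf("usertable"));
          // Step 4 (after the restart): clone a test table from that snapshot
          admin.cloneSnapshot("usertable-snap", TableName.valueOf("usertable_clone"));
        }
      }
    }

The same two steps can of course be done from the HBase shell; the sketch just shows the equivalent API calls.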
> On Nov 5, 2015, at 10:40 AM, Matteo Bertozzi <[email protected]> wrote:
>
> What operation do you have going on the HMaster?
> I did a simple test where there are no operations going on, and poll doesn't
> get out of the waitCond.await(), so unless you have operations on the master
> you'll never call queue.poll().
>
> Or maybe there's a bug and the previous ops were hanging and triggered a
> loop. What did you execute on the master before? Do you have a script or
> something that I can use to reproduce this?
>
> Matteo
>
> On Thu, Nov 5, 2015 at 10:25 AM, Matteo Bertozzi <[email protected]> wrote:
>
>> Let me take a look, but branch-1 and master should be the same on the proc
>> stuff.
>>
>> Matteo
>>
>>> On Thu, Nov 5, 2015 at 10:23 AM, Stack <[email protected]> wrote:
>>>
>>> You are looking at the master process only?
>>>
>>> I've been looking at branch-1 only of late.
>>>
>>> St.Ack
>>>
>>> On Thu, Nov 5, 2015 at 12:06 AM, Andrew Purtell <[email protected]> wrote:
>>>
>>>> I just profiled trunk running YCSB workload A. I was looking for something
>>>> else but found completely unexpected results.
>>>>
>>>> I see 60% of all allocation pressure in TLABs coming from the
>>>> ProcedureExecutors, allocating 16 GB of
>>>> java.util.AbstractMap$SimpleImmutableEntry in this code:
>>>>
>>>> Stack Trace                                                               TLABs   Total TLAB Size (bytes)  Pressure (%)
>>>> java.util.concurrent.ConcurrentSkipListMap$Node.createSnapshot()         11,603  18,402,380,120           100
>>>> java.util.concurrent.ConcurrentSkipListMap.firstEntry()                  11,599  18,401,678,776           99.996
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.nextObject()   11,599  18,401,678,776           99.996
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.poll()         11,599  18,401,678,776           99.996
>>>> org.apache.hadoop.hbase.master.procedure.MasterProcedureQueue.poll()     11,599  18,401,678,776           99.996
>>>> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop()          11,599  18,401,678,776           99.996
>>>>
>>>> The three ProcedureExecutor threads also consume about 60% of all CPU time,
>>>> each about 20%.
>>>> Here are the top hot code arcs from one of them:
>>>>
>>>> Stack Trace                                                                      Sample Count  Percentage (%)
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.poll()                 1,521         39.191
>>>> org.apache.hadoop.hbase.master.procedure.MasterProcedureQueue.poll()             1,521         39.191
>>>> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop()                  1,521         39.191
>>>> java.lang.String$CaseInsensitiveComparator.compare(String, String)               614           15.821
>>>> java.lang.String$CaseInsensitiveComparator.compare(Object, Object)               614           15.821
>>>> java.lang.String.compareToIgnoreCase(String)                                     614           15.821
>>>> org.apache.hadoop.hbase.ServerName.compareTo(ServerName)                         614           15.821
>>>> org.apache.hadoop.hbase.ServerName.compareTo(Object)                             614           15.821
>>>> java.util.concurrent.ConcurrentSkipListMap.cpr(Comparator, Object, Object)       614           15.821
>>>> java.util.concurrent.ConcurrentSkipListMap.findNear(Object, int, Comparator)     614           15.821
>>>> java.util.concurrent.ConcurrentSkipListMap.getNear(Object, int)                  614           15.821
>>>> java.util.concurrent.ConcurrentSkipListMap.higherEntry(Object)                   614           15.821
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.nextObject()           614           15.821
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.poll()                 614           15.821
>>>> org.apache.hadoop.hbase.master.procedure.MasterProcedureQueue.poll()             614           15.821
>>>> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop()                  614           15.821
>>>> java.util.concurrent.ConcurrentSkipListMap$Node.createSnapshot()                 468           12.059
>>>> java.util.concurrent.ConcurrentSkipListMap.getNear(Object, int)                  468           12.059
>>>> java.util.concurrent.ConcurrentSkipListMap.higherEntry(Object)                   468           12.059
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.nextObject()           468           12.059
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.poll()                 468           12.059
>>>> org.apache.hadoop.hbase.master.procedure.MasterProcedureQueue.poll()             468           12.059
>>>> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop()                  468           12.059
>>>> java.util.concurrent.ConcurrentSkipListMap.cpr(Comparator, Object, Object)       402           10.358
>>>> java.util.concurrent.ConcurrentSkipListMap.findNear(Object, int, Comparator)     280           7.215
>>>> java.util.concurrent.ConcurrentSkipListMap.getNear(Object, int)                  280           7.215
>>>> java.util.concurrent.ConcurrentSkipListMap.higherEntry(Object)                   280           7.215
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.nextObject()           280           7.215
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.poll()                 280           7.215
>>>> org.apache.hadoop.hbase.master.procedure.MasterProcedureQueue.poll()             280           7.215
>>>> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop()                  280           7.215
>>>> java.util.concurrent.ConcurrentSkipListMap.findPredecessor(Object, Comparator)   122           3.144
>>>> java.util.concurrent.ConcurrentSkipListMap.findNear(Object, int, Comparator)     122           3.144
>>>> java.util.concurrent.ConcurrentSkipListMap.getNear(Object, int)                  122           3.144
>>>> java.util.concurrent.ConcurrentSkipListMap.higherEntry(Object)                   122           3.144
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.nextObject()           122           3.144
>>>> org.apache.hadoop.hbase.procedure2.ProcedureFairRunQueues.poll()                 122           3.144
>>>> org.apache.hadoop.hbase.master.procedure.MasterProcedureQueue.poll()             122           3.144
>>>> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop()                  122           3.144
>>>>
>>>> Just me? Someone working on ProcedureV2 should have a look at WTH the
>>>> ProcedureExecutors are doing. They seem to be polling the run queue
>>>> constantly and burning up CPU. I'll try running more workloads at the next
>>>> opportunity and see if I can repeat this or gain more insight.
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>>    - Andy
>>>>
>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>> (via Tom White)
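To make the suspected symptom concrete: the profile above looks like an executor loop that keeps re-entering poll() even when nothing is runnable, rather than parking on the condition Matteo mentions. The following is an illustrative, self-contained Java sketch only; the names runQueue and waitCond are borrowed from the thread, but this is not the actual ProcedureExecutor or MasterProcedureQueue code.

    import java.util.ArrayDeque;
    import java.util.Queue;
    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.ReentrantLock;

    public class ExecLoopSketch {
      private final ReentrantLock lock = new ReentrantLock();
      private final Condition waitCond = lock.newCondition();
      private final Queue<Runnable> runQueue = new ArrayDeque<>();

      // Pattern the profile suggests: every iteration calls poll(), which walks
      // the queue structures (and allocates) even when there is nothing to run.
      void busyExecLoop() {
        while (true) {
          Runnable proc = poll();
          if (proc != null) {
            proc.run();
          }
          // no blocking here, so an empty queue means a hot spin
        }
      }

      // Alternative: park on the condition until something is enqueued, so an
      // idle executor thread costs neither CPU nor allocation.
      void awaitingExecLoop() throws InterruptedException {
        while (true) {
          lock.lock();
          try {
            while (runQueue.isEmpty()) {
              waitCond.await();
            }
          } finally {
            lock.unlock();
          }
          Runnable proc = poll();
          if (proc != null) {
            proc.run();
          }
        }
      }

      void add(Runnable proc) {
        lock.lock();
        try {
          runQueue.add(proc);
          waitCond.signal();
        } finally {
          lock.unlock();
        }
      }

      Runnable poll() {
        lock.lock();
        try {
          return runQueue.poll();
        } finally {
          lock.unlock();
        }
      }
    }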

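On the allocation side, the 16 GB of java.util.AbstractMap$SimpleImmutableEntry is consistent with how ConcurrentSkipListMap hands back entries: firstEntry() and higherEntry() build a fresh immutable snapshot of the matching node on every call, so a poll loop that navigates the map this way allocates at least one entry object per iteration. A small stand-alone demonstration; the map contents and key names here are invented, not the real ProcedureFairRunQueues internals.

    import java.util.Map;
    import java.util.concurrent.ConcurrentSkipListMap;

    public class SkipListSnapshotAllocation {
      public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> fairQueues = new ConcurrentSkipListMap<>();
        fairQueues.put("table-a", "queue-a");
        fairQueues.put("table-b", "queue-b");

        // firstEntry()/higherEntry() return a new AbstractMap.SimpleImmutableEntry
        // snapshot each time they are called.
        Map.Entry<String, String> a = fairQueues.firstEntry();
        Map.Entry<String, String> b = fairQueues.firstEntry();
        System.out.println(a != b);                              // true: distinct snapshot objects
        System.out.println(fairQueues.higherEntry(a.getKey()));  // another fresh snapshot: table-b=queue-b
      }
    }

The key-returning variants (firstKey(), higherKey()) avoid building these entry snapshots, which is part of why the entry-returning calls stand out so sharply in an allocation profile.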