Does Phoenix expose metrics about its code execution? Network time, coprocessor time, client time, etc.?
On Sunday, December 22, 2013, lars hofhansl wrote:

> You would have to measure the incoming/outgoing traffic on the affected
> machine.
>
> The easiest is to periodically check the output of ifconfig. If all data
> is local and the query just returns a count I would not expect much (any?)
> network traffic even after you ran the query multiple times.
>
> Beyond that I can't think of anything further. You said you checked the
> machines, they are configured the same, etc.
> If you run any local benchmarks on the boxes (any benchmark will do), do
> they really perform all the same?
>
> Lastly, I assume all the regions are of the same size, again, check on the
> regionserver UI pages (or maybe Hannibal tells you?)
>
> -- Lars
>
> ________________________________
> From: Kristoffer Sjögren <[email protected]>
> To: [email protected]; lars hofhansl <[email protected]>
> Sent: Saturday, December 21, 2013 3:17 PM
> Subject: Re: Performance tuning
>
> There are quite a lot of established and time wait connections between the
> RS on port 50010, but i dont know a good way of monitoring how much data is
> going through each connection (if that's what you meant)?
>
> On Sun, Dec 22, 2013 at 12:00 AM, Kristoffer Sjögren <[email protected]> wrote:
>
> Scans on RS 19 and 23, which have 5 regions instead of 4, stands out more
> than scans on RS 20, 21, 22. But scans on RS 7 and 18, that also have 5
> regions are doing fine, not best, but still in the mid-range.
>
> >On Sat, Dec 21, 2013 at 11:51 PM, Kristoffer Sjögren <[email protected]> wrote:
> >
> >Yeah, im doing a count(*) query on the 96 region table. Do you mean to
> >check network traffic between RS?
> >>
> >>From debugging phoenix code I can see that there are 96 scans sent and
> >>each response returned back to the client contain only the sum of rows,
> >>which are then aggregated and returned.
> >>So the traffic between client and each RS is very small.
> >>
> >>On Sat, Dec 21, 2013 at 11:35 PM, lars hofhansl <[email protected]> wrote:
> >>
> >>Thanks Kristoffer,
> >>>
> >>>yeah, that's the right metric. I would put my bet on the slower network.
> >>>But you're also doing a select count(*) query in Phoenix, right? So
> >>>nothing should really be sent across the network.
> >>>
> >>>When you do the queries, can you check whether there is any network
> >>>traffic?
> >>>
> >>>-- Lars
> >>>
> >>>________________________________
> >>> From: Kristoffer Sjögren <[email protected]>
> >>>To: [email protected]; lars hofhansl <[email protected]>
> >>>Sent: Saturday, December 21, 2013 1:28 PM
> >>>Subject: Re: Performance tuning
> >>>
> >>>@pradeep scanner caching should not be an issue since data transferred to
> >>>the client is tiny.
> >>>
> >>>@lars Yes, the data might be small for this particular case :-)
> >>>
> >>>I have checked everything I can think of on RS (CPU, network, Hbase
> >>>console, uptime etc) and nothing stands out, except for the pings (network
> >>>pings).
> >>>There are 5 regions on 7, 18, 19, and 23 the others have 4.
> >>>hdfsBlocksLocalityIndex=100 on all RS (was that the correct metric?)
> >>>
> >>>-Kristoffer
> >>>
> >>>On Sat, Dec 21, 2013 at 9:44 PM, lars hofhansl <[email protected]> wrote:
> >>>
> >>>> Hi Kristoffer,
> >>>> For this particular problem. Are many regions on the same RegionServers?
> >>>> Did you profile those RegionServers? Anything weird on that box?
> >>>> Pings slower might well be an issue. How's the data locality? (You can
> >>>> check on a RegionServer's overview page).
> >>>> If needed, you can issue a major compaction to reestablish local data on
> >>>> all RegionServers.
> >>>>
> >>>> 32 cores matched with only 4G of RAM is a bit weird, but with your tiny
> >>>> dataset it doesn't matter anyway.
> >>>>
> >>>> 10m rows across 96 regions is just about 100k rows per region. You won't
> >>>> see many of the nice properties for HBase.
> >>>> Try with 100m (or better 1bn rows). Then we're talking. For anything below
> >>>> this you wouldn't want to use HBase anyway.
> >>>> (100k rows I could scan on my phone with a Perl script in less than 1s)
> >>>>
> >>>> With "ping" you mean an actual network ping, or some operation on top of
> >>>> HBase?
> >>>>
> >>>> -- Lars
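The suggestion quoted above, periodically checking the interface counters while the query runs, could be scripted roughly as follows. This is only a sketch under assumptions: it assumes a Linux RegionServer host and reads /proc/net/dev for per-interface byte counters instead of parsing ifconfig output, and the function names (parse_proc_net_dev, delta, sample) are made up for illustration and are not part of Phoenix or HBase.

```python
import time


def parse_proc_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # not a regular counter line
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def delta(before, after):
    """Bytes received/sent per interface between two counter snapshots."""
    return {
        name: (after[name][0] - before[name][0],
               after[name][1] - before[name][1])
        for name in after
        if name in before
    }


def sample(interval_s=5.0, path="/proc/net/dev"):
    """Take two snapshots interval_s seconds apart and return the difference."""
    with open(path) as f:
        before = parse_proc_net_dev(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_proc_net_dev(f.read())
    return delta(before, after)
```

Calling sample() on each RegionServer while the count(*) query is running, and once more while the cluster is idle, would show whether the slower servers move noticeably more bytes than the others.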
