Zack,
Thanks for reporting this and for the detailed description. Here are a bunch
of questions and some things you can try, in addition to what Andrew
suggested:
1) Is this reproducible in a test environment (perhaps through Pherf:
https://phoenix.apache.org/pherf.html) so you can experiment more?
2) Do you get a sense of whether the bottleneck is on the client or the
server? CPU, IO, or network? How many clients are you running and have you
tried increasing this? Do you think your network is saturated by the data
being returned?
3) From your description, it sounds like you're querying the data as you're
ingesting. When it gets slow, have you tried running a major compaction to
see if that helps? Perhaps queries are getting slower because of the number
of HFiles that need to be merged.
4) If you bounce your cluster when it gets slow, does this have any impact?
5) What kinds of queries are running? Aggregation? Joins? Or just plain
single table selects? Any ORDER BY clauses? Are you using secondary
indexes, and if so, what kind?
6) Are you seeing GC pauses on the server during times of slowness
(correlate time of slowness with your server logs)?
7) Sounds like your queries are returning a lot of data. On the
client-side, Phoenix will keep phoenix.query.spoolThresholdBytes in memory
and then spool to disk as parallel execution happens. Are you seeing many
spool files on the client side in your /tmp directory (this is where Phoenix
puts these by default, with a name of ResultSpoolerXXX.bin)? Try increasing
this spool threshold if that's the case.
8) For the data ingest, are you using UPSERT VALUES? How big of batches are
you committing? That's one thing to tune, especially if you're using
secondary indexing.
9) Have you tried tuning the level of parallelization that Phoenix is doing
for queries? This is controlled by the
server-side phoenix.stats.guidepost.width parameter (assuming you haven't
set the phoenix.stats.guidepost.per.region parameter) and defaults to
300MB. Try increasing it (you'll need to run a major compaction for this to
take effect, and there's a 15-minute lag before the client sees it).
10) If you're doing aggregation or join queries, try increasing
the phoenix.query.maxGlobalMemorySize property on the server side. Both
hash joins and aggregation are done in memory, up to this % limit. If the
limit is reached, then on the server side, Phoenix will
wait phoenix.query.maxGlobalMemoryWaitMs time for the usage to go below the
limit (and then throw an exception if it doesn't). You can try tuning this
wait time down to see if it has an impact.
11) There are a bunch of client-side metrics you can collect (but little
documentation yet - keep your eye on PHOENIX-2486) that might help you
diagnose this. See PhoenixRuntime.getGlobalPhoenixClientMetrics(),
PhoenixRuntime.getOverAllReadRequestMetrics(), and other methods with
Metrics in the name.
12) There's also tracing, which is end-to-end client/server, but it's still a
bit on the raw side: https://phoenix.apache.org/tracing.html
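
On the batch size question in item 8, here is a minimal Java sketch of committing in fixed-size batches. The class name, the two small interfaces, and the method are hypothetical helpers invented for illustration, not part of Phoenix. It assumes auto-commit is off on the connection, so that upserts are buffered on the client until commit() is called.

```java
import java.sql.SQLException;
import java.util.List;

/** Sketch only: applies an upsert per row and commits every batchSize rows. */
final class BatchedUpsert {

    /** One row-level action, e.g. bind parameters and call executeUpdate(). */
    interface RowAction<T> { void apply(T row) throws SQLException; }

    /** A commit action, e.g. connection::commit. */
    interface CommitAction { void commit() throws SQLException; }

    /**
     * Runs upsert for every row, committing after each full batch and once
     * more for a trailing partial batch. Returns the number of commits issued.
     */
    static <T> int writeInBatches(List<T> rows, int batchSize,
                                  RowAction<T> upsert, CommitAction commit)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;   // rows upserted since the last commit
        int commits = 0;
        for (T row : rows) {
            upsert.apply(row);
            if (++pending == batchSize) {
                commit.commit();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) { // flush the final partial batch
            commit.commit();
            commits++;
        }
        return commits;
    }
}
```

With a real connection, the row action would bind values on a PreparedStatement for your UPSERT VALUES statement and call executeUpdate(), and the commit action would be conn::commit. The batch size is the number to experiment with; no particular value is implied here.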

There's more information on these tuning parameters here:
https://phoenix.apache.org/tuning.html and you should take a look at
Andrew's excellent tuning presentation here:
https://phoenix.apache.org/resources.html.
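
For the client-side settings (the spool threshold in item 7, plus the three connection properties from the original report), a rough sketch of passing them as JDBC connection properties might look like the following. The class and method names are made up for illustration. Whether a given property is honored when passed this way, rather than through the client-side hbase-site.xml, is worth verifying for your Phoenix version.

```java
import java.util.Properties;

/** Sketch only: collects client-side Phoenix settings into Properties. */
final class PhoenixClientProps {

    /** Builds connection properties with the given spool threshold in bytes. */
    static Properties build(long spoolThresholdBytes) {
        Properties props = new Properties();
        // Amount of result data kept in memory before spooling to disk.
        props.setProperty("phoenix.query.spoolThresholdBytes",
                Long.toString(spoolThresholdBytes));
        // Values taken from the original report; not recommendations.
        props.setProperty("phoenix.query.timeoutMs", "90000");
        props.setProperty("phoenix.query.keepAliveMs", "90000");
        props.setProperty("phoenix.query.threadPoolSize", "256");
        return props;
    }
}
```

These would be handed to DriverManager.getConnection(url, props), where url is your own jdbc:phoenix:... connection string.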

Thanks,
James


On Fri, Dec 4, 2015 at 8:28 AM, Andrew Purtell <apurt...@apache.org> wrote:

> Kumar - I believe you mentioned you are seeing this in a cluster of ~20
> regionservers.
>
> Zack - Yours is smaller yet, at 9.
>
> These clusters are small enough to make getting stack dumps through the
> HBase debug servlet during periods of unusually slow response possible.
> Perhaps you can write a script that queries all of the debug servlets (can
> use curl) and dumps the received output into per-regionserver files? Scrape
> every 10 or so seconds during the observed periods of slowness? Then
> compress them and make them available for Phoenix devs up on S3? Consider
> it a poor man's sampler. I don't know what we might find, but this could
> prove very helpful.
>
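
A rough Java sketch of the sampler described above (curl in a shell loop would do equally well). The class name, the output layout, the info-server port 16030, and the /stacks path are all assumptions; older HBase releases use a different default info port, so check what your region servers actually expose. It needs Java 11 or later for java.net.http.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Sketch only: periodically saves each region server's stack dump to a file. */
final class StackSampler {
    // Assumed defaults; adjust for your HBase version and configuration.
    static final int INFO_PORT = 16030;
    static final String STACKS_PATH = "/stacks";

    static URI stackUri(String host) {
        return URI.create("http://" + host + ":" + INFO_PORT + STACKS_PATH);
    }

    /** One file per region server per sample: dir/host/stacks-<epochSeconds>.txt */
    static Path outputFile(Path dir, String host, long epochSeconds) {
        return dir.resolve(host).resolve("stacks-" + epochSeconds + ".txt");
    }

    /** Fetches one dump from every host; a failing host is logged and skipped. */
    static void sampleOnce(HttpClient client, List<String> hosts, Path dir)
            throws InterruptedException {
        long now = System.currentTimeMillis() / 1000;
        for (String host : hosts) {
            try {
                Path out = outputFile(dir, host, now);
                Files.createDirectories(out.getParent());
                HttpRequest req = HttpRequest.newBuilder(stackUri(host)).GET().build();
                client.send(req, HttpResponse.BodyHandlers.ofFile(out));
            } catch (IOException e) {
                System.err.println("skipping " + host + ": " + e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> hosts = List.of(args);      // region server hostnames
        HttpClient client = HttpClient.newHttpClient();
        Path dir = Paths.get("rs-dumps");        // hypothetical output directory
        while (true) {                           // stop with Ctrl-C
            sampleOnce(client, hosts, dir);
            Thread.sleep(10_000);                // roughly every 10 seconds
        }
    }
}
```

Run it with the region server hostnames as arguments during a slow period, then compress the output directory before sharing it.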
>
> On Fri, Dec 4, 2015 at 8:11 AM, Kumar Palaniappan <
> kpalaniap...@marinsoftware.com> wrote:
>
>> I'm in the same exact position as Zack described. Appreciate your
>> feedback.
>>
>> So far we tried the call queue and the handlers; no luck. We planned to try
>> the off-heap cache.
>>
>> Kumar Palaniappan <http://about.me/kumar.palaniappan>
>>
>> On Dec 4, 2015, at 6:45 AM, Riesland, Zack <zack.riesl...@sensus.com>
>> wrote:
>>
>> Thanks Satish,
>>
>>
>>
>> To clarify: I’m not looking up single rows. I’m looking up the history of
>> each widget, which returns hundreds-to-thousands of results per widget (per
>> query).
>>
>>
>>
>> Each query is a range scan, it’s just that I’m performing thousands of
>> them.
>>
>>
>>
>> *From:* Satish Iyengar [mailto:sat...@gmail.com <sat...@gmail.com>]
>> *Sent:* Friday, December 04, 2015 9:43 AM
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: Help tuning for bursts of high traffic?
>>
>>
>>
>> Hi Zack,
>>
>>
>>
>> Did you consider avoiding hitting HBase for every single row by doing
>> that step in an offline mode? I was thinking you could have some kind of
>> daily export of the HBase table and then use Pig to perform the join (co-group
>> perhaps) to do the same. Obviously this would work only when your HBase
>> table is not maintained by a stream-based system. HBase is really good at
>> range scans and may not be ideal for a large number of single-row lookups.
>>
>>
>>
>> Thanks,
>>
>> Satish
>>
>> On Fri, Dec 4, 2015 at 9:09 AM, Riesland, Zack <zack.riesl...@sensus.com>
>> wrote:
>>
>> SHORT EXPLANATION: a much higher percentage of queries to Phoenix return
>> exceptionally slowly after querying very heavily for several minutes.
>>
>>
>>
>> LONGER EXPLANATION:
>>
>>
>>
>> I’ve been using Phoenix for about a year as a data store for web-based
>> reporting tools and it works well.
>>
>>
>>
>> Now, I’m trying to use the data in a different (much more
>> request-intensive) way and encountering some issues.
>>
>>
>>
>> The scenario is basically this:
>>
>>
>>
>> Daily, ingest very large CSV files with data for widgets.
>>
>>
>>
>> Each input file has hundreds of rows of data for each widget, and tens of
>> thousands of unique widgets.
>>
>>
>>
>> As a first step, I want to de-duplicate this data against my
>> Phoenix-based DB (I can’t rely on just upserting the data for de-dup
>> because it will go through several ETL steps before being stored into
>> Phoenix/HBase).
>>
>>
>>
>> So, per-widget, I perform a query against Phoenix (the table is keyed
>> against the unique widget ID + sample point). I get all the data for a
>> given widget id, within a certain period of time, and then I only ingest
>> rows for that widget that are new to me.
>>
>>
>>
>> I’m doing this in Java in a single step: I loop through my input file and
>> perform one query per widget, using the same Connection object to Phoenix.
>>
>>
>>
>> THE ISSUE:
>>
>>
>>
>> What I’m finding is that for the first several thousand queries, I almost
>> always get a very fast (less than 10 ms) response (good).
>>
>>
>>
>> But after 15-20 thousand queries, the response starts to get MUCH slower.
>> Some queries respond as expected, but many take as many as 2-3 minutes,
>> pushing the total time to prime the data structure into the 12-15 hour
>> range, when it would only take 2-3 hours if all the queries were fast.
>>
>>
>>
>> The same exact queries, when run manually and not part of this bulk
>> process, return in the (expected) < 10 ms.
>>
>>
>>
>> So it SEEMS like the burst of queries puts Phoenix into some sort of busy
>> state that causes it to respond far too slowly.
>>
>>
>>
>> The connection properties I’m setting are:
>>
>>
>>
>> phoenix.query.timeoutMs: 90000
>>
>> phoenix.query.keepAliveMs: 90000
>>
>> phoenix.query.threadPoolSize: 256
>>
>>
>>
>> Our cluster is 9 (beefy) region servers and the table I’m referencing is
>> 511 regions. We went through a lot of pain to get the data split extremely
>> well, and I don’t think schema design is the issue here.
>>
>>
>>
>> Can anyone help me understand how to make this better? Is there a better
>> approach I could take? A better set of configuration parameters? Is our
>> cluster just too small for this?
>>
>>
>>
>>
>>
>> Thanks!
>>
>> --
>>
>> Satish Iyengar
>>
>> "Anyone who has never made a mistake has never tried anything new."
>> Albert Einstein
>>
>>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
