[ 
https://issues.apache.org/jira/browse/PHOENIX-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832692#comment-16832692
 ] 

John Phillips commented on PHOENIX-5239:
----------------------------------------

{quote}The problem is that after running Q1, RS1 has PQ1 cached, but RS2 does 
not. Thus, when running Q2, the query fails/errors/is-slow because you have to 
cache PQ1 on RS2. Is this correct?
{quote}
Correct.
{quote}Why, when Q2 starts, wouldn't it make sure that it's cached on RS2?
{quote}
It does check first, and if the cache isn't found on all of RS2, it rebuilds 
the subquery and sends it to all of them. The problem is that this leads to 
unpredictable response times.

Ideally, we're looking for a way to "warm" a subquery in an offline process 
and then, once it's cached everywhere, have live queries served from it. 
Without a good way to cache it on all the regionservers (forcing a full table 
scan in the outer join would "work", but would be very inefficient), some of 
the live queries would miss the cache, resulting in unpredictable latency and 
potential cache stampedes.

Also, FWIW, the cluster we're using this on has 60 regionservers and we haven't 
noticed any issues from this change. And yes, there are tradeoffs, but I would 
see this as the lesser of two evils when it's anticipated the subquery will end 
up being used on most of the regionservers. What's your opinion of adding a 
config option to toggle this behavior?
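
As a rough illustration of what such a toggle could look like, here is a 
self-contained sketch. It is not actual Phoenix code: {{Region}}, 
{{selectServers}}, and the {{sendToAllEnabled}} flag are hypothetical names, 
and integer keys stand in for row-key ranges. Only {{keyRangesIntersect}} and 
{{usePersistentCache}} mirror the names in the proposed change.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CacheTargetSketch {

    /** Hypothetical stand-in for a region: its hosting server and [startKey, endKey) range. */
    static final class Region {
        final String server;
        final int startKey;
        final int endKey;

        Region(String server, int startKey, int endKey) {
            this.server = server;
            this.startKey = startKey;
            this.endKey = endKey;
        }

        boolean contains(int key) {
            return key >= startKey && key < endKey;
        }
    }

    /**
     * Picks the servers that should receive the subquery cache.
     * sendToAllEnabled is an assumed config toggle; when it is off, only
     * servers whose regions cover lookupKey are selected.
     */
    static Set<String> selectServers(List<Region> regions, int lookupKey,
                                     boolean usePersistentCache,
                                     boolean sendToAllEnabled) {
        Set<String> servers = new LinkedHashSet<>();
        boolean forceSend = usePersistentCache && sendToAllEnabled;
        for (Region region : regions) {
            boolean keyRangesIntersect = region.contains(lookupKey);
            if (!servers.contains(region.server) && (keyRangesIntersect || forceSend)) {
                servers.add(region.server);
            }
        }
        return servers;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("rs1", 0, 100));
        regions.add(new Region("rs2", 100, 200));
        regions.add(new Region("rs3", 200, 300));

        // Toggle off: only the server hosting key 42 gets the cache.
        System.out.println(selectServers(regions, 42, true, false)); // [rs1]
        // Toggle on: a persistent cache is sent to every server.
        System.out.println(selectServers(regions, 42, true, true));  // [rs1, rs2, rs3]
    }
}
```

With the toggle off, the behavior matches the existing key-range check, so 
anyone who doesn't opt in would see no change.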

Other potential options I see:
 - Adding a new query hint that forces the subquery to be cached everywhere
 - Copying the caches to some shared location after computation (HDFS?) where 
regionservers could later pull them from on demand
 - Adding some mechanism for PQS to copy caches between regionservers

However, at least the last two would add a lot of complexity and bring in a 
whole new set of problems.

> Send persistent subquery cache to all regionservers
> ---------------------------------------------------
>
>                 Key: PHOENIX-5239
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5239
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: John Phillips
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> PHOENIX-4666 introduced a persistent subquery cache that allows Phoenix to 
> cache the results from an expensive subquery (enabled with a 
> {{USE_PERSISTENT_CACHE}} query hint) to speed up subsequent queries.
> More context is available on the PHOENIX-4666 ticket, but a quick example 
> would be a query like:
> {code:sql}
> SELECT /*+ USE_PERSISTENT_CACHE */ *
>     FROM table1
>     JOIN (SELECT id_2 FROM large_table WHERE x = 10) expensive_result
>     ON table1.id_1 = expensive_result.id_2
> WHERE table1.id_1 = [some_id]
> {code}
> Where lots of queries are run, differing only by {{some_id}}. Our usage 
> involves first running one query through Phoenix to warm the cache (which 
> takes ~20 seconds), then, once that completes, allowing the live queries to 
> run, which use the persistent subquery cache (~100ms).
> However, we noticed that when Phoenix sends the cache to the regionservers, 
> it looks at {{some_id}} in the outer query to figure out which regionservers 
> might contain {{table1.id_1 = [some_id]}} ([code 
> here|https://github.com/apache/phoenix/blob/2084a6c/phoenix-core/src/main/java/org/apache/phoenix/cache/ServerCacheClient.java#L282-L283]).
>  This means that when we first start running the query, we'll inconsistently 
> hit the cache until it ends up being propagated to all the regionservers.
> Basically, we'd like to have some way to warm the subquery cache and ensure 
> it's on all the regionservers so subsequent queries will always find the 
> cache. I think the simplest solution might be updating the [if statement in 
> ServerCacheClient#addServerCache|https://github.com/apache/phoenix/blob/2084a6c/phoenix-core/src/main/java/org/apache/phoenix/cache/ServerCacheClient.java#L282-L283]
>  to simply always send the cache to all the regionservers if it's a 
> persistent subquery:
> {code:java}
> - if ( ! servers.contains(entry) &&
> -         keyRanges.intersectRegion(regionStartKey, regionEndKey,
> -                 cacheUsingTable.getIndexType() == IndexType.LOCAL)) {
> + boolean keyRangesIntersect = keyRanges.intersectRegion(regionStartKey, regionEndKey,
> +         cacheUsingTable.getIndexType() == IndexType.LOCAL);
> + if (!servers.contains(entry) && (keyRangesIntersect || usePersistentCache)) {
> {code}
> I tested this out, and it seems to work as expected. If it sounds like an 
> acceptable solution, I'd be happy to make an actual PR. Or, if anyone has any 
> other suggestions on better ways to handle this, it would be much appreciated.
> FYI [~jamestaylor], [~elserj], and [~maryannxue] since it looks like you 
> three handled most of the review on the [original persistent cache 
> PR|https://github.com/apache/phoenix/pull/298]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
