[ https://issues.apache.org/jira/browse/PHOENIX-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478166#comment-16478166 ]
ASF GitHub Bot commented on PHOENIX-4666: ----------------------------------------- Github user ortutay commented on a diff in the pull request: https://github.com/apache/phoenix/pull/298#discussion_r188783472 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java --- @@ -383,7 +384,7 @@ protected QueryPlan compileJoinQuery(JoinCompiler.Strategy strategy, StatementCo new PTable[]{lhsTable}, new int[]{fieldPosition}, postJoinFilterExpression, QueryUtil.getOffsetLimit(limit, offset)); Pair<Expression, Expression> keyRangeExpressions = new Pair<Expression, Expression>(null, null); getKeyExpressionCombinations(keyRangeExpressions, context, joinTable.getStatement(), rhsTableRef, type, joinExpressions, hashExpressions); - return HashJoinPlan.create(joinTable.getStatement(), rhsPlan, joinInfo, new HashSubPlan[]{new HashSubPlan(0, lhsPlan, hashExpressions, false, keyRangeExpressions.getFirst(), keyRangeExpressions.getSecond())}); + return HashJoinPlan.create(joinTable.getStatement(), rhsPlan, joinInfo, new HashSubPlan[]{new HashSubPlan(0, lhsPlan, hashExpressions, false, false, keyRangeExpressions.getFirst(), keyRangeExpressions.getSecond())}); --- End diff -- @maryannxue I've changed the code here to use the `usePersistentCache` flag, but I'm not sure how to force a `HASH_BUILD_LEFT` strategy. The test I've put in the integration test file does not actually use that strategy. Do you know of an easy way to force a left table hash join? > Add a subquery cache that persists beyond the life of a query > ------------------------------------------------------------- > > Key: PHOENIX-4666 > URL: https://issues.apache.org/jira/browse/PHOENIX-4666 > Project: Phoenix > Issue Type: Improvement > Reporter: Marcell Ortutay > Assignee: Marcell Ortutay > Priority: Major > > The user list thread for additional context is here: > [https://lists.apache.org/thread.html/e62a6f5d79bdf7cd238ea79aed8886816d21224d12b0f1fe9b6bb075@%3Cuser.phoenix.apache.org%3E] > ---- > A Phoenix query may contain expensive subqueries, and moreover those > expensive subqueries may be used across multiple different queries. While > whole result caching is possible at the application level, it is not possible > to cache subresults in the application. This can cause bad performance for > queries in which the subquery is the most expensive part of the query, and > the application is powerless to do anything at the query level. It would be > good if Phoenix provided a way to cache subquery results, as it would provide > a significant performance gain. > An illustrative example: > SELECT * FROM table1 JOIN (SELECT id_1 FROM large_table WHERE x = 10) > expensive_result ON table1.id_1 = expensive_result.id_2 AND table1.id_1 = > \{id} > In this case, the subquery "expensive_result" is expensive to compute, but it > doesn't change between queries. The rest of the query does because of the > \{id} parameter. This means the application can't cache it, but it would be > good if there was a way to cache expensive_result. > Note that there is currently a coprocessor based "server cache", but the data > in this "cache" is not persisted across queries. It is deleted after a TTL > expires (30sec by default), or when the query completes. > This is issue is fairly high priority for us at 23andMe and we'd be happy to > provide a patch with some guidance from Phoenix maintainers. We are currently > putting together a design document for a solution, and we'll post it to this > Jira ticket for review in a few days. -- This message was sent by Atlassian JIRA (v7.6.3#76005)